[2023-10-14 01:01:28,451][31953] Saving configuration to ./train_atari/atari_pong_APPO/config.json... [2023-10-14 01:01:28,768][31953] Rollout worker 0 uses device cpu [2023-10-14 01:01:28,769][31953] Rollout worker 1 uses device cpu [2023-10-14 01:01:28,769][31953] Rollout worker 2 uses device cpu [2023-10-14 01:01:28,770][31953] Rollout worker 3 uses device cpu [2023-10-14 01:01:28,770][31953] Rollout worker 4 uses device cpu [2023-10-14 01:01:28,771][31953] Rollout worker 5 uses device cpu [2023-10-14 01:01:28,771][31953] Rollout worker 6 uses device cpu [2023-10-14 01:01:28,772][31953] Rollout worker 7 uses device cpu [2023-10-14 01:01:28,772][31953] Rollout worker 8 uses device cpu [2023-10-14 01:01:28,773][31953] Rollout worker 9 uses device cpu [2023-10-14 01:01:28,773][31953] Rollout worker 10 uses device cpu [2023-10-14 01:01:28,773][31953] Rollout worker 11 uses device cpu [2023-10-14 01:01:28,774][31953] Rollout worker 12 uses device cpu [2023-10-14 01:01:28,774][31953] Rollout worker 13 uses device cpu [2023-10-14 01:01:28,775][31953] Rollout worker 14 uses device cpu [2023-10-14 01:01:28,775][31953] Rollout worker 15 uses device cpu [2023-10-14 01:01:29,060][31953] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-14 01:01:29,061][31953] InferenceWorker_p0-w0: min num requests: 2 [2023-10-14 01:01:29,064][31953] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-14 01:01:29,064][31953] InferenceWorker_p1-w0: min num requests: 2 [2023-10-14 01:01:29,110][31953] Starting all processes... 
[2023-10-14 01:01:29,110][31953] Starting process learner_proc0 [2023-10-14 01:01:30,762][31953] Starting process learner_proc1 [2023-10-14 01:01:30,766][32837] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-14 01:01:30,766][32837] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-10-14 01:01:30,784][32837] Num visible devices: 1 [2023-10-14 01:01:30,803][32837] Setting fixed seed 1234 [2023-10-14 01:01:30,804][32837] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-14 01:01:30,804][32837] Initializing actor-critic model on device cuda:0 [2023-10-14 01:01:30,805][32837] RunningMeanStd input shape: (4, 84, 84) [2023-10-14 01:01:30,805][32837] RunningMeanStd input shape: (1,) [2023-10-14 01:01:30,816][32837] ConvEncoder: input_channels=4 [2023-10-14 01:01:30,993][32837] Conv encoder output size: 512 [2023-10-14 01:01:30,995][32837] Created Actor Critic model with architecture: [2023-10-14 01:01:30,996][32837] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) 
(critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=6, bias=True) ) ) [2023-10-14 01:01:31,575][32837] Using optimizer [2023-10-14 01:01:31,576][32837] No checkpoints found [2023-10-14 01:01:31,576][32837] Did not load from checkpoint, starting from scratch! [2023-10-14 01:01:31,576][32837] Initialized policy 0 weights for model version 0 [2023-10-14 01:01:31,577][32837] LearnerWorker_p0 finished initialization! [2023-10-14 01:01:31,578][32837] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-14 01:01:32,513][31953] Starting all processes... [2023-10-14 01:01:32,516][32895] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-14 01:01:32,516][32895] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 [2023-10-14 01:01:32,522][31953] Starting process inference_proc0-0 [2023-10-14 01:01:32,523][31953] Starting process inference_proc1-0 [2023-10-14 01:01:32,523][31953] Starting process rollout_proc0 [2023-10-14 01:01:32,536][32895] Num visible devices: 1 [2023-10-14 01:01:32,523][31953] Starting process rollout_proc1 [2023-10-14 01:01:32,523][31953] Starting process rollout_proc2 [2023-10-14 01:01:32,524][31953] Starting process rollout_proc3 [2023-10-14 01:01:32,524][31953] Starting process rollout_proc4 [2023-10-14 01:01:32,565][32895] Setting fixed seed 1234 [2023-10-14 01:01:32,566][32895] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-10-14 01:01:32,566][32895] Initializing actor-critic model on device cuda:0 [2023-10-14 01:01:32,567][32895] RunningMeanStd input shape: (4, 84, 84) [2023-10-14 01:01:32,567][32895] RunningMeanStd input shape: (1,) [2023-10-14 01:01:32,527][31953] Starting process rollout_proc5 [2023-10-14 01:01:32,529][31953] Starting process rollout_proc6 [2023-10-14 01:01:32,530][31953] Starting process rollout_proc7 [2023-10-14 
01:01:32,530][31953] Starting process rollout_proc8 [2023-10-14 01:01:32,539][31953] Starting process rollout_proc9 [2023-10-14 01:01:32,579][32895] ConvEncoder: input_channels=4 [2023-10-14 01:01:32,539][31953] Starting process rollout_proc10 [2023-10-14 01:01:32,551][31953] Starting process rollout_proc11 [2023-10-14 01:01:32,551][31953] Starting process rollout_proc12 [2023-10-14 01:01:32,551][31953] Starting process rollout_proc13 [2023-10-14 01:01:33,041][32895] Conv encoder output size: 512 [2023-10-14 01:01:33,043][32895] Created Actor Critic model with architecture: [2023-10-14 01:01:33,043][32895] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=6, bias=True) ) ) [2023-10-14 01:01:33,677][32895] Using optimizer [2023-10-14 01:01:33,677][32895] No checkpoints found [2023-10-14 01:01:33,678][32895] Did not load from checkpoint, starting from scratch! 
[2023-10-14 01:01:33,678][32895] Initialized policy 1 weights for model version 0 [2023-10-14 01:01:33,679][32895] LearnerWorker_p1 finished initialization! [2023-10-14 01:01:33,680][32895] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-10-14 01:01:34,742][31953] Starting process rollout_proc14 [2023-10-14 01:01:34,747][33234] Worker 0 uses CPU cores [0, 1] [2023-10-14 01:01:34,758][31953] Starting process rollout_proc15 [2023-10-14 01:01:34,763][33248] Worker 13 uses CPU cores [26, 27] [2023-10-14 01:01:34,765][33242] Worker 6 uses CPU cores [12, 13] [2023-10-14 01:01:34,778][33240] Worker 4 uses CPU cores [8, 9] [2023-10-14 01:01:34,815][33244] Worker 8 uses CPU cores [16, 17] [2023-10-14 01:01:34,832][33246] Worker 10 uses CPU cores [20, 21] [2023-10-14 01:01:34,835][33239] Worker 2 uses CPU cores [4, 5] [2023-10-14 01:01:34,891][33243] Worker 7 uses CPU cores [14, 15] [2023-10-14 01:01:34,955][33241] Worker 5 uses CPU cores [10, 11] [2023-10-14 01:01:35,004][33249] Worker 12 uses CPU cores [24, 25] [2023-10-14 01:01:35,018][33235] Worker 1 uses CPU cores [2, 3] [2023-10-14 01:01:35,068][33226] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-10-14 01:01:35,069][33226] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 [2023-10-14 01:01:35,087][33226] Num visible devices: 1 [2023-10-14 01:01:35,171][33245] Worker 9 uses CPU cores [18, 19] [2023-10-14 01:01:35,219][33201] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-10-14 01:01:35,219][33201] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-10-14 01:01:35,238][33201] Num visible devices: 1 [2023-10-14 01:01:35,404][33238] Worker 3 uses CPU cores [6, 7] [2023-10-14 01:01:35,423][33247] Worker 11 uses CPU cores [22, 23] [2023-10-14 01:01:35,767][33226] RunningMeanStd input shape: (4, 84, 84) [2023-10-14 01:01:35,767][33226] RunningMeanStd input shape: (1,) [2023-10-14 
01:01:35,779][33226] ConvEncoder: input_channels=4 [2023-10-14 01:01:35,883][33226] Conv encoder output size: 512 [2023-10-14 01:01:35,901][33201] RunningMeanStd input shape: (4, 84, 84) [2023-10-14 01:01:35,902][33201] RunningMeanStd input shape: (1,) [2023-10-14 01:01:35,913][33201] ConvEncoder: input_channels=4 [2023-10-14 01:01:36,014][33201] Conv encoder output size: 512 [2023-10-14 01:01:36,644][33850] Worker 15 uses CPU cores [30, 31] [2023-10-14 01:01:36,646][31953] Inference worker 1-0 is ready! [2023-10-14 01:01:36,647][33813] Worker 14 uses CPU cores [28, 29] [2023-10-14 01:01:36,648][31953] Inference worker 0-0 is ready! [2023-10-14 01:01:36,648][31953] All inference workers are ready! Signal rollout workers to start! [2023-10-14 01:01:36,649][33248] EnvRunner 13-0 uses policy 1 [2023-10-14 01:01:36,650][33243] EnvRunner 7-0 uses policy 1 [2023-10-14 01:01:36,650][33244] EnvRunner 8-0 uses policy 0 [2023-10-14 01:01:36,650][33238] EnvRunner 3-0 uses policy 1 [2023-10-14 01:01:36,650][33245] EnvRunner 9-0 uses policy 1 [2023-10-14 01:01:36,650][33240] EnvRunner 4-0 uses policy 0 [2023-10-14 01:01:36,650][33246] EnvRunner 10-0 uses policy 0 [2023-10-14 01:01:36,650][33247] EnvRunner 11-0 uses policy 1 [2023-10-14 01:01:36,650][33249] EnvRunner 12-0 uses policy 0 [2023-10-14 01:01:36,650][33241] EnvRunner 5-0 uses policy 1 [2023-10-14 01:01:36,650][31953] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-14 01:01:36,651][33234] EnvRunner 0-0 uses policy 0 [2023-10-14 01:01:36,668][33242] EnvRunner 6-0 uses policy 0 [2023-10-14 01:01:36,668][33235] EnvRunner 1-0 uses policy 1 [2023-10-14 01:01:36,668][33239] EnvRunner 2-0 uses policy 0 [2023-10-14 01:01:36,789][33813] EnvRunner 14-0 uses policy 0 [2023-10-14 01:01:36,801][33850] EnvRunner 15-0 uses policy 1 [2023-10-14 01:01:39,048][31953] Heartbeat connected on Batcher_0 [2023-10-14 01:01:39,051][31953] Heartbeat connected on LearnerWorker_p0 [2023-10-14 01:01:39,054][31953] Heartbeat connected on Batcher_1 [2023-10-14 01:01:39,057][31953] Heartbeat connected on LearnerWorker_p1 [2023-10-14 01:01:39,064][31953] Heartbeat connected on InferenceWorker_p0-w0 [2023-10-14 01:01:39,068][31953] Heartbeat connected on InferenceWorker_p1-w0 [2023-10-14 01:01:39,070][31953] Heartbeat connected on RolloutWorker_w0 [2023-10-14 01:01:39,073][31953] Heartbeat connected on RolloutWorker_w1 [2023-10-14 01:01:39,073][31953] Heartbeat connected on RolloutWorker_w2 [2023-10-14 01:01:39,077][31953] Heartbeat connected on RolloutWorker_w3 [2023-10-14 01:01:39,079][31953] Heartbeat connected on RolloutWorker_w4 [2023-10-14 01:01:39,081][31953] Heartbeat connected on RolloutWorker_w5 [2023-10-14 01:01:39,085][31953] Heartbeat connected on RolloutWorker_w6 [2023-10-14 01:01:39,087][31953] Heartbeat connected on RolloutWorker_w7 [2023-10-14 01:01:39,090][31953] Heartbeat connected on RolloutWorker_w8 [2023-10-14 01:01:39,095][31953] Heartbeat connected on RolloutWorker_w9 [2023-10-14 01:01:39,095][31953] Heartbeat connected on RolloutWorker_w10 [2023-10-14 01:01:39,098][31953] Heartbeat connected on RolloutWorker_w11 [2023-10-14 01:01:39,100][31953] Heartbeat connected on RolloutWorker_w12 [2023-10-14 01:01:39,103][31953] Heartbeat connected on RolloutWorker_w13 [2023-10-14 01:01:39,107][31953] Heartbeat connected on RolloutWorker_w14 [2023-10-14 01:01:39,111][31953] Heartbeat 
connected on RolloutWorker_w15 [2023-10-14 01:01:39,557][31953] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 637.1, 1: 507.0. Samples: 3326. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-14 01:01:44,557][31953] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 1029.7, 1: 974.8. Samples: 15850. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-10-14 01:01:44,558][31953] Avg episode reward: [(0, '-21.000')] [2023-10-14 01:01:46,682][33201] Updated weights for policy 0, policy_version 10 (0.0007) [2023-10-14 01:01:46,690][33226] Updated weights for policy 1, policy_version 10 (0.0009) [2023-10-14 01:01:47,044][33201] Updated weights for policy 0, policy_version 20 (0.0009) [2023-10-14 01:01:47,054][33226] Updated weights for policy 1, policy_version 20 (0.0008) [2023-10-14 01:01:47,420][33226] Updated weights for policy 1, policy_version 30 (0.0008) [2023-10-14 01:01:47,423][33201] Updated weights for policy 0, policy_version 30 (0.0008) [2023-10-14 01:01:49,526][33226] Updated weights for policy 1, policy_version 40 (0.0007) [2023-10-14 01:01:49,557][31953] Fps is (10 sec: 6553.6, 60 sec: 5077.5, 300 sec: 5077.5). Total num frames: 65536. Throughput: 0: 1275.4, 1: 1260.2. Samples: 32728. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:01:49,558][31953] Avg episode reward: [(0, '-20.688'), (1, '-20.500')] [2023-10-14 01:01:49,746][33201] Updated weights for policy 0, policy_version 40 (0.0008) [2023-10-14 01:01:49,890][33226] Updated weights for policy 1, policy_version 50 (0.0008) [2023-10-14 01:01:50,115][33201] Updated weights for policy 0, policy_version 50 (0.0007) [2023-10-14 01:01:50,254][33226] Updated weights for policy 1, policy_version 60 (0.0008) [2023-10-14 01:01:50,485][33201] Updated weights for policy 0, policy_version 60 (0.0007) [2023-10-14 01:01:53,586][33226] Updated weights for policy 1, policy_version 70 (0.0007) [2023-10-14 01:01:53,757][33201] Updated weights for policy 0, policy_version 70 (0.0007) [2023-10-14 01:01:53,949][33226] Updated weights for policy 1, policy_version 80 (0.0009) [2023-10-14 01:01:54,127][33201] Updated weights for policy 0, policy_version 80 (0.0007) [2023-10-14 01:01:54,320][33226] Updated weights for policy 1, policy_version 90 (0.0008) [2023-10-14 01:01:54,492][33201] Updated weights for policy 0, policy_version 90 (0.0008) [2023-10-14 01:01:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 9149.4, 300 sec: 9149.4). Total num frames: 163840. Throughput: 0: 1495.9, 1: 1475.2. Samples: 53204. 
Policy #0 lag: (min: 33.0, avg: 33.0, max: 33.0) [2023-10-14 01:01:54,558][31953] Avg episode reward: [(0, '-20.762'), (1, '-20.500')] [2023-10-14 01:01:57,938][33226] Updated weights for policy 1, policy_version 100 (0.0008) [2023-10-14 01:01:58,084][33201] Updated weights for policy 0, policy_version 100 (0.0008) [2023-10-14 01:01:58,294][33226] Updated weights for policy 1, policy_version 110 (0.0008) [2023-10-14 01:01:58,451][33201] Updated weights for policy 0, policy_version 110 (0.0007) [2023-10-14 01:01:58,656][33226] Updated weights for policy 1, policy_version 120 (0.0008) [2023-10-14 01:01:58,827][33201] Updated weights for policy 0, policy_version 120 (0.0008) [2023-10-14 01:01:59,557][31953] Fps is (10 sec: 19660.4, 60 sec: 11443.7, 300 sec: 11443.7). Total num frames: 262144. Throughput: 0: 1403.6, 1: 1385.7. Samples: 63894. Policy #0 lag: (min: 22.0, avg: 37.0, max: 54.0) [2023-10-14 01:01:59,558][31953] Avg episode reward: [(0, '-20.406'), (1, '-20.281')] [2023-10-14 01:01:59,560][32837] Saving new best policy, reward=-20.406! [2023-10-14 01:01:59,560][32895] Saving new best policy, reward=-20.281! [2023-10-14 01:02:02,523][33226] Updated weights for policy 1, policy_version 130 (0.0007) [2023-10-14 01:02:02,764][33201] Updated weights for policy 0, policy_version 130 (0.0009) [2023-10-14 01:02:02,891][33226] Updated weights for policy 1, policy_version 140 (0.0009) [2023-10-14 01:02:03,140][33201] Updated weights for policy 0, policy_version 140 (0.0008) [2023-10-14 01:02:03,259][33226] Updated weights for policy 1, policy_version 150 (0.0009) [2023-10-14 01:02:03,501][33201] Updated weights for policy 0, policy_version 150 (0.0008) [2023-10-14 01:02:03,634][33226] Updated weights for policy 1, policy_version 160 (0.0007) [2023-10-14 01:02:03,873][33201] Updated weights for policy 0, policy_version 160 (0.0009) [2023-10-14 01:02:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 11741.8, 300 sec: 11741.8). Total num frames: 327680. 
Throughput: 0: 1529.9, 1: 1512.2. Samples: 84894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:02:04,558][31953] Avg episode reward: [(0, '-20.366'), (1, '-20.342')] [2023-10-14 01:02:04,559][32837] Saving new best policy, reward=-20.366! [2023-10-14 01:02:07,586][33226] Updated weights for policy 1, policy_version 170 (0.0008) [2023-10-14 01:02:07,646][33201] Updated weights for policy 0, policy_version 170 (0.0008) [2023-10-14 01:02:07,947][33226] Updated weights for policy 1, policy_version 180 (0.0007) [2023-10-14 01:02:08,020][33201] Updated weights for policy 0, policy_version 180 (0.0008) [2023-10-14 01:02:08,310][33226] Updated weights for policy 1, policy_version 190 (0.0008) [2023-10-14 01:02:08,394][33201] Updated weights for policy 0, policy_version 190 (0.0008) [2023-10-14 01:02:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 11949.2, 300 sec: 11949.2). Total num frames: 393216. Throughput: 0: 1608.6, 1: 1582.9. Samples: 105024. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:02:09,558][31953] Avg episode reward: [(0, '-20.286'), (1, '-20.333')] [2023-10-14 01:02:09,563][32837] Saving new best policy, reward=-20.286! [2023-10-14 01:02:12,375][33201] Updated weights for policy 0, policy_version 200 (0.0008) [2023-10-14 01:02:12,383][33226] Updated weights for policy 1, policy_version 200 (0.0007) [2023-10-14 01:02:12,746][33201] Updated weights for policy 0, policy_version 210 (0.0009) [2023-10-14 01:02:12,753][33226] Updated weights for policy 1, policy_version 210 (0.0008) [2023-10-14 01:02:13,109][33226] Updated weights for policy 1, policy_version 220 (0.0008) [2023-10-14 01:02:13,117][33201] Updated weights for policy 0, policy_version 220 (0.0008) [2023-10-14 01:02:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 12102.0, 300 sec: 12102.0). Total num frames: 458752. Throughput: 0: 1556.9, 1: 1530.8. Samples: 117046. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:02:14,558][31953] Avg episode reward: [(0, '-20.259'), (1, '-20.246')] [2023-10-14 01:02:14,558][32837] Saving new best policy, reward=-20.259! [2023-10-14 01:02:14,558][32895] Saving new best policy, reward=-20.246! [2023-10-14 01:02:16,891][33226] Updated weights for policy 1, policy_version 230 (0.0010) [2023-10-14 01:02:16,911][33201] Updated weights for policy 0, policy_version 230 (0.0008) [2023-10-14 01:02:17,263][33226] Updated weights for policy 1, policy_version 240 (0.0008) [2023-10-14 01:02:17,275][33201] Updated weights for policy 0, policy_version 240 (0.0008) [2023-10-14 01:02:17,632][33226] Updated weights for policy 1, policy_version 250 (0.0009) [2023-10-14 01:02:17,648][33201] Updated weights for policy 0, policy_version 250 (0.0007) [2023-10-14 01:02:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 12219.1, 300 sec: 12219.1). Total num frames: 524288. Throughput: 0: 1597.5, 1: 1574.1. Samples: 136084. Policy #0 lag: (min: 4.0, avg: 16.3, max: 36.0) [2023-10-14 01:02:19,558][31953] Avg episode reward: [(0, '-20.167'), (1, '-20.136')] [2023-10-14 01:02:19,560][32837] Saving new best policy, reward=-20.167! [2023-10-14 01:02:19,560][32895] Saving new best policy, reward=-20.136! [2023-10-14 01:02:21,435][33201] Updated weights for policy 0, policy_version 260 (0.0008) [2023-10-14 01:02:21,580][33226] Updated weights for policy 1, policy_version 260 (0.0007) [2023-10-14 01:02:21,798][33201] Updated weights for policy 0, policy_version 270 (0.0007) [2023-10-14 01:02:21,938][33226] Updated weights for policy 1, policy_version 270 (0.0007) [2023-10-14 01:02:22,176][33201] Updated weights for policy 0, policy_version 280 (0.0008) [2023-10-14 01:02:22,302][33226] Updated weights for policy 1, policy_version 280 (0.0008) [2023-10-14 01:02:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 12311.8, 300 sec: 12311.8). Total num frames: 589824. Throughput: 0: 1725.4, 1: 1713.8. Samples: 158088. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 01:02:24,558][31953] Avg episode reward: [(0, '-20.052'), (1, '-20.147')] [2023-10-14 01:02:24,562][32837] Saving new best policy, reward=-20.052! [2023-10-14 01:02:26,070][33201] Updated weights for policy 0, policy_version 290 (0.0009) [2023-10-14 01:02:26,124][33226] Updated weights for policy 1, policy_version 290 (0.0007) [2023-10-14 01:02:26,467][33201] Updated weights for policy 0, policy_version 300 (0.0008) [2023-10-14 01:02:26,488][33226] Updated weights for policy 1, policy_version 300 (0.0007) [2023-10-14 01:02:26,834][33201] Updated weights for policy 0, policy_version 310 (0.0008) [2023-10-14 01:02:26,855][33226] Updated weights for policy 1, policy_version 310 (0.0010) [2023-10-14 01:02:27,209][33201] Updated weights for policy 0, policy_version 320 (0.0008) [2023-10-14 01:02:27,218][33226] Updated weights for policy 1, policy_version 320 (0.0009) [2023-10-14 01:02:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 12387.0, 300 sec: 12387.0). Total num frames: 655360. Throughput: 0: 1691.3, 1: 1690.9. Samples: 168050. Policy #0 lag: (min: 31.0, avg: 46.1, max: 63.0) [2023-10-14 01:02:29,558][31953] Avg episode reward: [(0, '-20.000'), (1, '-20.110')] [2023-10-14 01:02:29,559][32837] Saving new best policy, reward=-20.000! [2023-10-14 01:02:29,559][32895] Saving new best policy, reward=-20.110! 
[2023-10-14 01:02:31,087][33226] Updated weights for policy 1, policy_version 330 (0.0008) [2023-10-14 01:02:31,107][33201] Updated weights for policy 0, policy_version 330 (0.0008) [2023-10-14 01:02:31,445][33226] Updated weights for policy 1, policy_version 340 (0.0009) [2023-10-14 01:02:31,484][33201] Updated weights for policy 0, policy_version 340 (0.0008) [2023-10-14 01:02:31,809][33226] Updated weights for policy 1, policy_version 350 (0.0007) [2023-10-14 01:02:31,854][33201] Updated weights for policy 0, policy_version 350 (0.0009) [2023-10-14 01:02:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 12449.2, 300 sec: 12449.2). Total num frames: 720896. Throughput: 0: 1740.5, 1: 1736.4. Samples: 189186. Policy #0 lag: (min: 26.0, avg: 31.5, max: 58.0) [2023-10-14 01:02:34,558][31953] Avg episode reward: [(0, '-19.946'), (1, '-20.022')] [2023-10-14 01:02:34,559][32895] Saving new best policy, reward=-20.022! [2023-10-14 01:02:34,559][32837] Saving new best policy, reward=-19.946! [2023-10-14 01:02:35,603][33226] Updated weights for policy 1, policy_version 360 (0.0010) [2023-10-14 01:02:35,726][33201] Updated weights for policy 0, policy_version 360 (0.0009) [2023-10-14 01:02:35,971][33226] Updated weights for policy 1, policy_version 370 (0.0009) [2023-10-14 01:02:36,094][33201] Updated weights for policy 0, policy_version 370 (0.0009) [2023-10-14 01:02:36,347][33226] Updated weights for policy 1, policy_version 380 (0.0007) [2023-10-14 01:02:36,468][33201] Updated weights for policy 0, policy_version 380 (0.0008) [2023-10-14 01:02:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 12501.4). Total num frames: 786432. Throughput: 0: 1753.8, 1: 1753.1. Samples: 211014. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:02:39,558][31953] Avg episode reward: [(0, '-19.899'), (1, '-19.929')] [2023-10-14 01:02:39,562][32837] Saving new best policy, reward=-19.899! [2023-10-14 01:02:39,562][32895] Saving new best policy, reward=-19.929! 
[2023-10-14 01:02:40,196][33201] Updated weights for policy 0, policy_version 390 (0.0007) [2023-10-14 01:02:40,245][33226] Updated weights for policy 1, policy_version 390 (0.0008) [2023-10-14 01:02:40,567][33201] Updated weights for policy 0, policy_version 400 (0.0007) [2023-10-14 01:02:40,612][33226] Updated weights for policy 1, policy_version 400 (0.0009) [2023-10-14 01:02:40,932][33201] Updated weights for policy 0, policy_version 410 (0.0007) [2023-10-14 01:02:40,981][33226] Updated weights for policy 1, policy_version 410 (0.0008) [2023-10-14 01:02:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 12546.1). Total num frames: 851968. Throughput: 0: 1743.4, 1: 1738.3. Samples: 220570. Policy #0 lag: (min: 22.0, avg: 27.8, max: 54.0) [2023-10-14 01:02:44,558][31953] Avg episode reward: [(0, '-19.780'), (1, '-19.790')] [2023-10-14 01:02:44,694][33201] Updated weights for policy 0, policy_version 420 (0.0007) [2023-10-14 01:02:44,732][33226] Updated weights for policy 1, policy_version 420 (0.0008) [2023-10-14 01:02:45,059][33201] Updated weights for policy 0, policy_version 430 (0.0007) [2023-10-14 01:02:45,094][33226] Updated weights for policy 1, policy_version 430 (0.0008) [2023-10-14 01:02:45,426][33201] Updated weights for policy 0, policy_version 440 (0.0008) [2023-10-14 01:02:45,459][33226] Updated weights for policy 1, policy_version 440 (0.0008) [2023-10-14 01:02:45,723][32837] Saving new best policy, reward=-19.780! [2023-10-14 01:02:45,750][32895] Saving new best policy, reward=-19.790! [2023-10-14 01:02:49,321][33226] Updated weights for policy 1, policy_version 450 (0.0008) [2023-10-14 01:02:49,393][33201] Updated weights for policy 0, policy_version 450 (0.0007) [2023-10-14 01:02:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 12584.5). Total num frames: 917504. Throughput: 0: 1756.7, 1: 1756.8. Samples: 243002. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 57.0) [2023-10-14 01:02:49,558][31953] Avg episode reward: [(0, '-19.710'), (1, '-19.690')] [2023-10-14 01:02:49,678][33226] Updated weights for policy 1, policy_version 460 (0.0008) [2023-10-14 01:02:49,770][33201] Updated weights for policy 0, policy_version 460 (0.0009) [2023-10-14 01:02:50,048][33226] Updated weights for policy 1, policy_version 470 (0.0008) [2023-10-14 01:02:50,142][33201] Updated weights for policy 0, policy_version 470 (0.0008) [2023-10-14 01:02:50,405][32895] Saving new best policy, reward=-19.690! [2023-10-14 01:02:50,405][33226] Updated weights for policy 1, policy_version 480 (0.0010) [2023-10-14 01:02:50,511][32837] Saving new best policy, reward=-19.710! [2023-10-14 01:02:50,516][33201] Updated weights for policy 0, policy_version 480 (0.0008) [2023-10-14 01:02:54,130][33226] Updated weights for policy 1, policy_version 490 (0.0008) [2023-10-14 01:02:54,503][33226] Updated weights for policy 1, policy_version 500 (0.0008) [2023-10-14 01:02:54,507][33201] Updated weights for policy 0, policy_version 490 (0.0008) [2023-10-14 01:02:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 12618.1). Total num frames: 983040. Throughput: 0: 1765.6, 1: 1777.1. Samples: 264442. Policy #0 lag: (min: 1.0, avg: 1.2, max: 11.0) [2023-10-14 01:02:54,557][31953] Avg episode reward: [(0, '-19.580'), (1, '-19.700')] [2023-10-14 01:02:54,871][33201] Updated weights for policy 0, policy_version 500 (0.0008) [2023-10-14 01:02:54,871][33226] Updated weights for policy 1, policy_version 510 (0.0008) [2023-10-14 01:02:55,241][33201] Updated weights for policy 0, policy_version 510 (0.0007) [2023-10-14 01:02:55,314][32837] Saving new best policy, reward=-19.580! 
[2023-10-14 01:02:58,839][33226] Updated weights for policy 1, policy_version 520 (0.0009) [2023-10-14 01:02:58,993][33201] Updated weights for policy 0, policy_version 520 (0.0009) [2023-10-14 01:02:59,202][33226] Updated weights for policy 1, policy_version 530 (0.0008) [2023-10-14 01:02:59,356][33201] Updated weights for policy 0, policy_version 530 (0.0009) [2023-10-14 01:02:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12647.6). Total num frames: 1048576. Throughput: 0: 1734.0, 1: 1754.6. Samples: 274032. Policy #0 lag: (min: 26.0, avg: 27.5, max: 51.0) [2023-10-14 01:02:59,558][31953] Avg episode reward: [(0, '-19.530'), (1, '-19.430')] [2023-10-14 01:02:59,570][33226] Updated weights for policy 1, policy_version 540 (0.0007) [2023-10-14 01:02:59,710][32895] Saving new best policy, reward=-19.430! [2023-10-14 01:02:59,728][33201] Updated weights for policy 0, policy_version 540 (0.0009) [2023-10-14 01:02:59,873][32837] Saving new best policy, reward=-19.530! [2023-10-14 01:03:03,357][33226] Updated weights for policy 1, policy_version 550 (0.0008) [2023-10-14 01:03:03,464][33201] Updated weights for policy 0, policy_version 550 (0.0010) [2023-10-14 01:03:03,719][33226] Updated weights for policy 1, policy_version 560 (0.0009) [2023-10-14 01:03:03,830][33201] Updated weights for policy 0, policy_version 560 (0.0009) [2023-10-14 01:03:04,083][33226] Updated weights for policy 1, policy_version 570 (0.0008) [2023-10-14 01:03:04,201][33201] Updated weights for policy 0, policy_version 570 (0.0008) [2023-10-14 01:03:04,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14199.4, 300 sec: 13419.2). Total num frames: 1179648. Throughput: 0: 1767.5, 1: 1788.6. Samples: 296106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:03:04,558][31953] Avg episode reward: [(0, '-19.460'), (1, '-19.450')] [2023-10-14 01:03:04,559][32837] Saving new best policy, reward=-19.460! 
[2023-10-14 01:03:07,939][33226] Updated weights for policy 1, policy_version 580 (0.0009) [2023-10-14 01:03:08,098][33201] Updated weights for policy 0, policy_version 580 (0.0008) [2023-10-14 01:03:08,306][33226] Updated weights for policy 1, policy_version 590 (0.0009) [2023-10-14 01:03:08,477][33201] Updated weights for policy 0, policy_version 590 (0.0007) [2023-10-14 01:03:08,670][33226] Updated weights for policy 1, policy_version 600 (0.0008) [2023-10-14 01:03:08,840][33201] Updated weights for policy 0, policy_version 600 (0.0009) [2023-10-14 01:03:09,557][31953] Fps is (10 sec: 19660.2, 60 sec: 14199.4, 300 sec: 13402.4). Total num frames: 1245184. Throughput: 0: 1737.0, 1: 1754.6. Samples: 315210. Policy #0 lag: (min: 17.0, avg: 17.0, max: 18.0) [2023-10-14 01:03:09,558][31953] Avg episode reward: [(0, '-19.360'), (1, '-19.280')] [2023-10-14 01:03:09,572][32837] Saving new best policy, reward=-19.360! [2023-10-14 01:03:09,572][32895] Saving new best policy, reward=-19.280! [2023-10-14 01:03:12,648][33226] Updated weights for policy 1, policy_version 610 (0.0008) [2023-10-14 01:03:12,837][33201] Updated weights for policy 0, policy_version 610 (0.0008) [2023-10-14 01:03:13,020][33226] Updated weights for policy 1, policy_version 620 (0.0007) [2023-10-14 01:03:13,224][33201] Updated weights for policy 0, policy_version 620 (0.0007) [2023-10-14 01:03:13,379][33226] Updated weights for policy 1, policy_version 630 (0.0007) [2023-10-14 01:03:13,595][33201] Updated weights for policy 0, policy_version 630 (0.0009) [2023-10-14 01:03:13,740][33226] Updated weights for policy 1, policy_version 640 (0.0008) [2023-10-14 01:03:13,959][33201] Updated weights for policy 0, policy_version 640 (0.0010) [2023-10-14 01:03:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13387.4). Total num frames: 1310720. Throughput: 0: 1764.5, 1: 1771.3. Samples: 327162. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:03:14,558][31953] Avg episode reward: [(0, '-19.250'), (1, '-19.090')] [2023-10-14 01:03:14,559][32837] Saving new best policy, reward=-19.250! [2023-10-14 01:03:14,559][32895] Saving new best policy, reward=-19.090! [2023-10-14 01:03:17,674][33226] Updated weights for policy 1, policy_version 650 (0.0007) [2023-10-14 01:03:17,852][33201] Updated weights for policy 0, policy_version 650 (0.0007) [2023-10-14 01:03:18,038][33226] Updated weights for policy 1, policy_version 660 (0.0008) [2023-10-14 01:03:18,220][33201] Updated weights for policy 0, policy_version 660 (0.0007) [2023-10-14 01:03:18,409][33226] Updated weights for policy 1, policy_version 670 (0.0008) [2023-10-14 01:03:18,591][33201] Updated weights for policy 0, policy_version 670 (0.0007) [2023-10-14 01:03:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13373.7). Total num frames: 1376256. Throughput: 0: 1755.1, 1: 1759.7. Samples: 347352. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:03:19,558][31953] Avg episode reward: [(0, '-19.050'), (1, '-19.020')] [2023-10-14 01:03:19,560][32837] Saving new best policy, reward=-19.050! [2023-10-14 01:03:19,560][32895] Saving new best policy, reward=-19.020! [2023-10-14 01:03:22,208][33226] Updated weights for policy 1, policy_version 680 (0.0008) [2023-10-14 01:03:22,424][33201] Updated weights for policy 0, policy_version 680 (0.0008) [2023-10-14 01:03:22,567][33226] Updated weights for policy 1, policy_version 690 (0.0007) [2023-10-14 01:03:22,788][33201] Updated weights for policy 0, policy_version 690 (0.0009) [2023-10-14 01:03:22,935][33226] Updated weights for policy 1, policy_version 700 (0.0008) [2023-10-14 01:03:23,161][33201] Updated weights for policy 0, policy_version 700 (0.0008) [2023-10-14 01:03:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 13361.4). Total num frames: 1441792. Throughput: 0: 1738.4, 1: 1749.6. Samples: 367978. 
Policy #0 lag: (min: 15.0, avg: 15.4, max: 29.0) [2023-10-14 01:03:24,558][31953] Avg episode reward: [(0, '-18.930'), (1, '-18.790')] [2023-10-14 01:03:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000000704_720896.pth... [2023-10-14 01:03:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000000704_720896.pth... [2023-10-14 01:03:24,599][32837] Saving new best policy, reward=-18.930! [2023-10-14 01:03:24,603][32895] Saving new best policy, reward=-18.790! [2023-10-14 01:03:26,780][33226] Updated weights for policy 1, policy_version 710 (0.0008) [2023-10-14 01:03:26,993][33201] Updated weights for policy 0, policy_version 710 (0.0007) [2023-10-14 01:03:27,133][33226] Updated weights for policy 1, policy_version 720 (0.0009) [2023-10-14 01:03:27,363][33201] Updated weights for policy 0, policy_version 720 (0.0007) [2023-10-14 01:03:27,498][33226] Updated weights for policy 1, policy_version 730 (0.0007) [2023-10-14 01:03:27,741][33201] Updated weights for policy 0, policy_version 730 (0.0008) [2023-10-14 01:03:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13350.1). Total num frames: 1507328. Throughput: 0: 1760.3, 1: 1771.8. Samples: 379514. Policy #0 lag: (min: 4.0, avg: 7.1, max: 36.0) [2023-10-14 01:03:29,558][31953] Avg episode reward: [(0, '-18.860'), (1, '-18.700')] [2023-10-14 01:03:29,559][32837] Saving new best policy, reward=-18.860! [2023-10-14 01:03:29,560][32895] Saving new best policy, reward=-18.700! 
[2023-10-14 01:03:31,373][33226] Updated weights for policy 1, policy_version 740 (0.0008) [2023-10-14 01:03:31,600][33201] Updated weights for policy 0, policy_version 740 (0.0008) [2023-10-14 01:03:31,736][33226] Updated weights for policy 1, policy_version 750 (0.0008) [2023-10-14 01:03:31,965][33201] Updated weights for policy 0, policy_version 750 (0.0008) [2023-10-14 01:03:32,105][33226] Updated weights for policy 1, policy_version 760 (0.0008) [2023-10-14 01:03:32,343][33201] Updated weights for policy 0, policy_version 760 (0.0008) [2023-10-14 01:03:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13339.8). Total num frames: 1572864. Throughput: 0: 1730.2, 1: 1748.2. Samples: 399532. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:03:34,558][31953] Avg episode reward: [(0, '-18.770'), (1, '-18.690')] [2023-10-14 01:03:34,560][32837] Saving new best policy, reward=-18.770! [2023-10-14 01:03:34,560][32895] Saving new best policy, reward=-18.690! [2023-10-14 01:03:35,907][33226] Updated weights for policy 1, policy_version 770 (0.0008) [2023-10-14 01:03:36,193][33201] Updated weights for policy 0, policy_version 770 (0.0008) [2023-10-14 01:03:36,272][33226] Updated weights for policy 1, policy_version 780 (0.0008) [2023-10-14 01:03:36,559][33201] Updated weights for policy 0, policy_version 780 (0.0007) [2023-10-14 01:03:36,642][33226] Updated weights for policy 1, policy_version 790 (0.0009) [2023-10-14 01:03:36,922][33201] Updated weights for policy 0, policy_version 790 (0.0010) [2023-10-14 01:03:37,002][33226] Updated weights for policy 1, policy_version 800 (0.0007) [2023-10-14 01:03:37,295][33201] Updated weights for policy 0, policy_version 800 (0.0007) [2023-10-14 01:03:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13330.4). Total num frames: 1638400. Throughput: 0: 1743.8, 1: 1751.0. Samples: 421708. 
Policy #0 lag: (min: 15.0, avg: 17.1, max: 47.0) [2023-10-14 01:03:39,558][31953] Avg episode reward: [(0, '-18.640'), (1, '-18.580')] [2023-10-14 01:03:39,568][32895] Saving new best policy, reward=-18.580! [2023-10-14 01:03:39,568][32837] Saving new best policy, reward=-18.640! [2023-10-14 01:03:40,943][33226] Updated weights for policy 1, policy_version 810 (0.0008) [2023-10-14 01:03:41,197][33201] Updated weights for policy 0, policy_version 810 (0.0008) [2023-10-14 01:03:41,313][33226] Updated weights for policy 1, policy_version 820 (0.0007) [2023-10-14 01:03:41,559][33201] Updated weights for policy 0, policy_version 820 (0.0008) [2023-10-14 01:03:41,682][33226] Updated weights for policy 1, policy_version 830 (0.0008) [2023-10-14 01:03:41,932][33201] Updated weights for policy 0, policy_version 830 (0.0008) [2023-10-14 01:03:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13321.7). Total num frames: 1703936. Throughput: 0: 1745.0, 1: 1749.5. Samples: 431284. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:03:44,558][31953] Avg episode reward: [(0, '-18.520'), (1, '-18.320')] [2023-10-14 01:03:44,559][32837] Saving new best policy, reward=-18.520! [2023-10-14 01:03:44,559][32895] Saving new best policy, reward=-18.320! [2023-10-14 01:03:45,488][33226] Updated weights for policy 1, policy_version 840 (0.0008) [2023-10-14 01:03:45,492][33201] Updated weights for policy 0, policy_version 840 (0.0007) [2023-10-14 01:03:45,856][33201] Updated weights for policy 0, policy_version 850 (0.0007) [2023-10-14 01:03:45,858][33226] Updated weights for policy 1, policy_version 850 (0.0009) [2023-10-14 01:03:46,220][33201] Updated weights for policy 0, policy_version 860 (0.0007) [2023-10-14 01:03:46,225][33226] Updated weights for policy 1, policy_version 860 (0.0009) [2023-10-14 01:03:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13313.6). Total num frames: 1769472. Throughput: 0: 1747.7, 1: 1752.1. Samples: 453596. 
Policy #0 lag: (min: 8.0, avg: 20.8, max: 40.0) [2023-10-14 01:03:49,558][31953] Avg episode reward: [(0, '-18.240'), (1, '-18.260')] [2023-10-14 01:03:49,559][32837] Saving new best policy, reward=-18.240! [2023-10-14 01:03:49,559][32895] Saving new best policy, reward=-18.260! [2023-10-14 01:03:50,105][33226] Updated weights for policy 1, policy_version 870 (0.0008) [2023-10-14 01:03:50,107][33201] Updated weights for policy 0, policy_version 870 (0.0010) [2023-10-14 01:03:50,470][33226] Updated weights for policy 1, policy_version 880 (0.0008) [2023-10-14 01:03:50,479][33201] Updated weights for policy 0, policy_version 880 (0.0008) [2023-10-14 01:03:50,840][33226] Updated weights for policy 1, policy_version 890 (0.0008) [2023-10-14 01:03:50,849][33201] Updated weights for policy 0, policy_version 890 (0.0009) [2023-10-14 01:03:54,436][33226] Updated weights for policy 1, policy_version 900 (0.0008) [2023-10-14 01:03:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13306.1). Total num frames: 1835008. Throughput: 0: 1775.8, 1: 1789.2. Samples: 475634. Policy #0 lag: (min: 16.0, avg: 34.5, max: 48.0) [2023-10-14 01:03:54,558][31953] Avg episode reward: [(0, '-18.090'), (1, '-18.160')] [2023-10-14 01:03:54,720][33201] Updated weights for policy 0, policy_version 900 (0.0010) [2023-10-14 01:03:54,801][33226] Updated weights for policy 1, policy_version 910 (0.0010) [2023-10-14 01:03:55,085][33201] Updated weights for policy 0, policy_version 910 (0.0008) [2023-10-14 01:03:55,164][33226] Updated weights for policy 1, policy_version 920 (0.0007) [2023-10-14 01:03:55,452][33201] Updated weights for policy 0, policy_version 920 (0.0009) [2023-10-14 01:03:55,458][32895] Saving new best policy, reward=-18.160! [2023-10-14 01:03:55,759][32837] Saving new best policy, reward=-18.090! 
[2023-10-14 01:03:58,886][33226] Updated weights for policy 1, policy_version 930 (0.0007) [2023-10-14 01:03:59,258][33226] Updated weights for policy 1, policy_version 940 (0.0007) [2023-10-14 01:03:59,387][33201] Updated weights for policy 0, policy_version 930 (0.0010) [2023-10-14 01:03:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13299.2). Total num frames: 1900544. Throughput: 0: 1747.2, 1: 1766.4. Samples: 485274. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:03:59,557][31953] Avg episode reward: [(0, '-17.980'), (1, '-17.950')] [2023-10-14 01:03:59,629][33226] Updated weights for policy 1, policy_version 950 (0.0007) [2023-10-14 01:03:59,784][33201] Updated weights for policy 0, policy_version 940 (0.0008) [2023-10-14 01:03:59,996][32895] Saving new best policy, reward=-17.950! [2023-10-14 01:03:59,997][33226] Updated weights for policy 1, policy_version 960 (0.0007) [2023-10-14 01:04:00,169][33201] Updated weights for policy 0, policy_version 950 (0.0008) [2023-10-14 01:04:00,534][32837] Saving new best policy, reward=-17.980! [2023-10-14 01:04:00,534][33201] Updated weights for policy 0, policy_version 960 (0.0008) [2023-10-14 01:04:03,971][33226] Updated weights for policy 1, policy_version 970 (0.0008) [2023-10-14 01:04:04,247][33201] Updated weights for policy 0, policy_version 970 (0.0008) [2023-10-14 01:04:04,343][33226] Updated weights for policy 1, policy_version 980 (0.0008) [2023-10-14 01:04:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13292.7). Total num frames: 1966080. Throughput: 0: 1764.6, 1: 1789.1. Samples: 507266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:04:04,557][31953] Avg episode reward: [(0, '-17.930'), (1, '-17.940')] [2023-10-14 01:04:04,618][33201] Updated weights for policy 0, policy_version 980 (0.0009) [2023-10-14 01:04:04,715][33226] Updated weights for policy 1, policy_version 990 (0.0007) [2023-10-14 01:04:04,785][32895] Saving new best policy, reward=-17.940! [2023-10-14 01:04:04,981][33201] Updated weights for policy 0, policy_version 990 (0.0008) [2023-10-14 01:04:05,051][32837] Saving new best policy, reward=-17.930! [2023-10-14 01:04:08,519][33226] Updated weights for policy 1, policy_version 1000 (0.0009) [2023-10-14 01:04:08,798][33201] Updated weights for policy 0, policy_version 1000 (0.0008) [2023-10-14 01:04:08,880][33226] Updated weights for policy 1, policy_version 1010 (0.0008) [2023-10-14 01:04:09,175][33201] Updated weights for policy 0, policy_version 1010 (0.0010) [2023-10-14 01:04:09,252][33226] Updated weights for policy 1, policy_version 1020 (0.0007) [2023-10-14 01:04:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 13500.9). Total num frames: 2064384. Throughput: 0: 1769.1, 1: 1775.0. Samples: 527462. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:04:09,558][31953] Avg episode reward: [(0, '-17.600'), (1, '-17.820')] [2023-10-14 01:04:09,566][33201] Updated weights for policy 0, policy_version 1020 (0.0010) [2023-10-14 01:04:09,567][32895] Saving new best policy, reward=-17.820! [2023-10-14 01:04:09,707][32837] Saving new best policy, reward=-17.600! 
[2023-10-14 01:04:13,107][33226] Updated weights for policy 1, policy_version 1030 (0.0008) [2023-10-14 01:04:13,379][33201] Updated weights for policy 0, policy_version 1030 (0.0007) [2023-10-14 01:04:13,469][33226] Updated weights for policy 1, policy_version 1040 (0.0007) [2023-10-14 01:04:13,756][33201] Updated weights for policy 0, policy_version 1040 (0.0007) [2023-10-14 01:04:13,827][33226] Updated weights for policy 1, policy_version 1050 (0.0007) [2023-10-14 01:04:14,122][33201] Updated weights for policy 0, policy_version 1050 (0.0007) [2023-10-14 01:04:14,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 13695.9). Total num frames: 2162688. Throughput: 0: 1757.8, 1: 1772.7. Samples: 538384. Policy #0 lag: (min: 3.0, avg: 6.4, max: 35.0) [2023-10-14 01:04:14,558][31953] Avg episode reward: [(0, '-17.390'), (1, '-17.750')] [2023-10-14 01:04:14,559][32895] Saving new best policy, reward=-17.750! [2023-10-14 01:04:14,559][32837] Saving new best policy, reward=-17.390! [2023-10-14 01:04:17,592][33226] Updated weights for policy 1, policy_version 1060 (0.0007) [2023-10-14 01:04:17,957][33226] Updated weights for policy 1, policy_version 1070 (0.0008) [2023-10-14 01:04:18,012][33201] Updated weights for policy 0, policy_version 1060 (0.0008) [2023-10-14 01:04:18,321][33226] Updated weights for policy 1, policy_version 1080 (0.0008) [2023-10-14 01:04:18,389][33201] Updated weights for policy 0, policy_version 1070 (0.0008) [2023-10-14 01:04:18,758][33201] Updated weights for policy 0, policy_version 1080 (0.0007) [2023-10-14 01:04:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 13677.9). Total num frames: 2228224. Throughput: 0: 1780.3, 1: 1778.6. Samples: 559680. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 01:04:19,558][31953] Avg episode reward: [(0, '-17.190'), (1, '-17.440')] [2023-10-14 01:04:19,559][32837] Saving new best policy, reward=-17.190! 
[2023-10-14 01:04:19,560][32895] Saving new best policy, reward=-17.440! [2023-10-14 01:04:22,165][33226] Updated weights for policy 1, policy_version 1090 (0.0008) [2023-10-14 01:04:22,539][33226] Updated weights for policy 1, policy_version 1100 (0.0008) [2023-10-14 01:04:22,623][33201] Updated weights for policy 0, policy_version 1090 (0.0008) [2023-10-14 01:04:22,895][33226] Updated weights for policy 1, policy_version 1110 (0.0007) [2023-10-14 01:04:22,993][33201] Updated weights for policy 0, policy_version 1100 (0.0007) [2023-10-14 01:04:23,258][33226] Updated weights for policy 1, policy_version 1120 (0.0008) [2023-10-14 01:04:23,369][33201] Updated weights for policy 0, policy_version 1110 (0.0009) [2023-10-14 01:04:23,737][33201] Updated weights for policy 0, policy_version 1120 (0.0010) [2023-10-14 01:04:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13660.9). Total num frames: 2293760. Throughput: 0: 1745.1, 1: 1764.0. Samples: 579616. Policy #0 lag: (min: 6.0, avg: 13.3, max: 38.0) [2023-10-14 01:04:24,558][31953] Avg episode reward: [(0, '-16.990'), (1, '-17.220')] [2023-10-14 01:04:24,566][32837] Saving new best policy, reward=-16.990! [2023-10-14 01:04:24,566][32895] Saving new best policy, reward=-17.220! [2023-10-14 01:04:26,997][33226] Updated weights for policy 1, policy_version 1130 (0.0009) [2023-10-14 01:04:27,367][33226] Updated weights for policy 1, policy_version 1140 (0.0008) [2023-10-14 01:04:27,735][33226] Updated weights for policy 1, policy_version 1150 (0.0008) [2023-10-14 01:04:27,779][33201] Updated weights for policy 0, policy_version 1130 (0.0009) [2023-10-14 01:04:28,157][33201] Updated weights for policy 0, policy_version 1140 (0.0008) [2023-10-14 01:04:28,525][33201] Updated weights for policy 0, policy_version 1150 (0.0010) [2023-10-14 01:04:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13644.9). Total num frames: 2359296. Throughput: 0: 1775.6, 1: 1791.5. Samples: 591804. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 01:04:29,558][31953] Avg episode reward: [(0, '-16.460'), (1, '-17.050')] [2023-10-14 01:04:29,558][32837] Saving new best policy, reward=-16.460! [2023-10-14 01:04:29,559][32895] Saving new best policy, reward=-17.050! [2023-10-14 01:04:31,545][33226] Updated weights for policy 1, policy_version 1160 (0.0009) [2023-10-14 01:04:31,919][33226] Updated weights for policy 1, policy_version 1170 (0.0008) [2023-10-14 01:04:32,278][33226] Updated weights for policy 1, policy_version 1180 (0.0009) [2023-10-14 01:04:32,289][33201] Updated weights for policy 0, policy_version 1160 (0.0009) [2023-10-14 01:04:32,664][33201] Updated weights for policy 0, policy_version 1170 (0.0009) [2023-10-14 01:04:33,038][33201] Updated weights for policy 0, policy_version 1180 (0.0010) [2023-10-14 01:04:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13629.8). Total num frames: 2424832. Throughput: 0: 1746.2, 1: 1761.3. Samples: 611436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:04:34,558][31953] Avg episode reward: [(0, '-16.350'), (1, '-16.930')] [2023-10-14 01:04:34,559][32837] Saving new best policy, reward=-16.350! [2023-10-14 01:04:34,559][32895] Saving new best policy, reward=-16.930! [2023-10-14 01:04:36,056][33226] Updated weights for policy 1, policy_version 1190 (0.0010) [2023-10-14 01:04:36,417][33226] Updated weights for policy 1, policy_version 1200 (0.0011) [2023-10-14 01:04:36,721][33201] Updated weights for policy 0, policy_version 1190 (0.0008) [2023-10-14 01:04:36,774][33226] Updated weights for policy 1, policy_version 1210 (0.0008) [2023-10-14 01:04:37,092][33201] Updated weights for policy 0, policy_version 1200 (0.0010) [2023-10-14 01:04:37,454][33201] Updated weights for policy 0, policy_version 1210 (0.0009) [2023-10-14 01:04:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13615.5). Total num frames: 2490368. Throughput: 0: 1743.2, 1: 1762.3. 
Samples: 633384. Policy #0 lag: (min: 31.0, avg: 45.0, max: 63.0) [2023-10-14 01:04:39,558][31953] Avg episode reward: [(0, '-15.970'), (1, '-16.850')] [2023-10-14 01:04:39,570][32837] Saving new best policy, reward=-15.970! [2023-10-14 01:04:39,570][32895] Saving new best policy, reward=-16.850! [2023-10-14 01:04:40,615][33226] Updated weights for policy 1, policy_version 1220 (0.0008) [2023-10-14 01:04:40,982][33226] Updated weights for policy 1, policy_version 1230 (0.0008) [2023-10-14 01:04:41,339][33201] Updated weights for policy 0, policy_version 1220 (0.0010) [2023-10-14 01:04:41,348][33226] Updated weights for policy 1, policy_version 1240 (0.0009) [2023-10-14 01:04:41,705][33201] Updated weights for policy 0, policy_version 1230 (0.0008) [2023-10-14 01:04:42,065][33201] Updated weights for policy 0, policy_version 1240 (0.0010) [2023-10-14 01:04:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13601.9). Total num frames: 2555904. Throughput: 0: 1753.6, 1: 1757.5. Samples: 643276. Policy #0 lag: (min: 31.0, avg: 31.7, max: 49.0) [2023-10-14 01:04:44,558][31953] Avg episode reward: [(0, '-15.790'), (1, '-16.380')] [2023-10-14 01:04:44,560][32895] Saving new best policy, reward=-16.380! [2023-10-14 01:04:44,560][32837] Saving new best policy, reward=-15.790! 
[2023-10-14 01:04:45,273][33226] Updated weights for policy 1, policy_version 1250 (0.0009) [2023-10-14 01:04:45,644][33226] Updated weights for policy 1, policy_version 1260 (0.0007) [2023-10-14 01:04:46,011][33226] Updated weights for policy 1, policy_version 1270 (0.0009) [2023-10-14 01:04:46,160][33201] Updated weights for policy 0, policy_version 1250 (0.0009) [2023-10-14 01:04:46,385][33226] Updated weights for policy 1, policy_version 1280 (0.0007) [2023-10-14 01:04:46,564][33201] Updated weights for policy 0, policy_version 1260 (0.0007) [2023-10-14 01:04:46,941][33201] Updated weights for policy 0, policy_version 1270 (0.0008) [2023-10-14 01:04:47,318][33201] Updated weights for policy 0, policy_version 1280 (0.0007) [2023-10-14 01:04:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13589.1). Total num frames: 2621440. Throughput: 0: 1740.7, 1: 1752.5. Samples: 664460. Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-14 01:04:49,558][31953] Avg episode reward: [(0, '-15.480'), (1, '-16.190')] [2023-10-14 01:04:49,560][32895] Saving new best policy, reward=-16.190! [2023-10-14 01:04:49,560][32837] Saving new best policy, reward=-15.480! [2023-10-14 01:04:50,322][33226] Updated weights for policy 1, policy_version 1290 (0.0007) [2023-10-14 01:04:50,697][33226] Updated weights for policy 1, policy_version 1300 (0.0009) [2023-10-14 01:04:51,065][33226] Updated weights for policy 1, policy_version 1310 (0.0008) [2023-10-14 01:04:51,160][33201] Updated weights for policy 0, policy_version 1290 (0.0009) [2023-10-14 01:04:51,529][33201] Updated weights for policy 0, policy_version 1300 (0.0008) [2023-10-14 01:04:51,899][33201] Updated weights for policy 0, policy_version 1310 (0.0008) [2023-10-14 01:04:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13576.9). Total num frames: 2686976. Throughput: 0: 1748.1, 1: 1779.1. Samples: 686186. 
Policy #0 lag: (min: 26.0, avg: 34.4, max: 58.0) [2023-10-14 01:04:54,558][31953] Avg episode reward: [(0, '-15.020'), (1, '-15.940')] [2023-10-14 01:04:54,568][32837] Saving new best policy, reward=-15.020! [2023-10-14 01:04:54,807][33226] Updated weights for policy 1, policy_version 1320 (0.0007) [2023-10-14 01:04:55,176][33226] Updated weights for policy 1, policy_version 1330 (0.0007) [2023-10-14 01:04:55,544][33226] Updated weights for policy 1, policy_version 1340 (0.0010) [2023-10-14 01:04:55,691][32895] Saving new best policy, reward=-15.940! [2023-10-14 01:04:55,921][33201] Updated weights for policy 0, policy_version 1320 (0.0008) [2023-10-14 01:04:56,290][33201] Updated weights for policy 0, policy_version 1330 (0.0010) [2023-10-14 01:04:56,664][33201] Updated weights for policy 0, policy_version 1340 (0.0007) [2023-10-14 01:04:59,328][33226] Updated weights for policy 1, policy_version 1350 (0.0009) [2023-10-14 01:04:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13565.4). Total num frames: 2752512. Throughput: 0: 1734.9, 1: 1766.1. Samples: 695930. Policy #0 lag: (min: 28.0, avg: 28.1, max: 35.0) [2023-10-14 01:04:59,558][31953] Avg episode reward: [(0, '-15.020'), (1, '-15.840')] [2023-10-14 01:04:59,685][33226] Updated weights for policy 1, policy_version 1360 (0.0008) [2023-10-14 01:05:00,051][33226] Updated weights for policy 1, policy_version 1370 (0.0007) [2023-10-14 01:05:00,272][32895] Saving new best policy, reward=-15.840! 
[2023-10-14 01:05:00,497][33201] Updated weights for policy 0, policy_version 1350 (0.0008) [2023-10-14 01:05:00,868][33201] Updated weights for policy 0, policy_version 1360 (0.0010) [2023-10-14 01:05:01,232][33201] Updated weights for policy 0, policy_version 1370 (0.0008) [2023-10-14 01:05:03,865][33226] Updated weights for policy 1, policy_version 1380 (0.0010) [2023-10-14 01:05:04,226][33226] Updated weights for policy 1, policy_version 1390 (0.0008) [2023-10-14 01:05:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13554.4). Total num frames: 2818048. Throughput: 0: 1740.8, 1: 1773.5. Samples: 717824. Policy #0 lag: (min: 10.0, avg: 14.1, max: 42.0) [2023-10-14 01:05:04,558][31953] Avg episode reward: [(0, '-14.160'), (1, '-15.450')] [2023-10-14 01:05:04,558][32837] Saving new best policy, reward=-14.160! [2023-10-14 01:05:04,596][33226] Updated weights for policy 1, policy_version 1400 (0.0008) [2023-10-14 01:05:04,893][32895] Saving new best policy, reward=-15.450! [2023-10-14 01:05:05,084][33201] Updated weights for policy 0, policy_version 1380 (0.0008) [2023-10-14 01:05:05,447][33201] Updated weights for policy 0, policy_version 1390 (0.0009) [2023-10-14 01:05:05,824][33201] Updated weights for policy 0, policy_version 1400 (0.0009) [2023-10-14 01:05:08,472][33226] Updated weights for policy 1, policy_version 1410 (0.0007) [2023-10-14 01:05:08,840][33226] Updated weights for policy 1, policy_version 1420 (0.0007) [2023-10-14 01:05:09,201][33226] Updated weights for policy 1, policy_version 1430 (0.0008) [2023-10-14 01:05:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 13543.9). Total num frames: 2883584. Throughput: 0: 1769.7, 1: 1778.5. Samples: 739284. Policy #0 lag: (min: 10.0, avg: 19.5, max: 42.0) [2023-10-14 01:05:09,557][31953] Avg episode reward: [(0, '-13.520'), (1, '-15.120')] [2023-10-14 01:05:09,566][32895] Saving new best policy, reward=-15.120! 
[2023-10-14 01:05:09,569][33226] Updated weights for policy 1, policy_version 1440 (0.0007) [2023-10-14 01:05:09,582][33201] Updated weights for policy 0, policy_version 1410 (0.0007) [2023-10-14 01:05:09,950][33201] Updated weights for policy 0, policy_version 1420 (0.0008) [2023-10-14 01:05:10,325][33201] Updated weights for policy 0, policy_version 1430 (0.0008) [2023-10-14 01:05:10,697][32837] Saving new best policy, reward=-13.520! [2023-10-14 01:05:10,700][33201] Updated weights for policy 0, policy_version 1440 (0.0007) [2023-10-14 01:05:13,348][33226] Updated weights for policy 1, policy_version 1450 (0.0008) [2023-10-14 01:05:13,711][33226] Updated weights for policy 1, policy_version 1460 (0.0007) [2023-10-14 01:05:14,073][33226] Updated weights for policy 1, policy_version 1470 (0.0008) [2023-10-14 01:05:14,450][33201] Updated weights for policy 0, policy_version 1450 (0.0010) [2023-10-14 01:05:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 13684.2). Total num frames: 2981888. Throughput: 0: 1740.3, 1: 1764.8. Samples: 749536. Policy #0 lag: (min: 15.0, avg: 19.1, max: 47.0) [2023-10-14 01:05:14,558][31953] Avg episode reward: [(0, '-12.840'), (1, '-14.430')] [2023-10-14 01:05:14,559][32895] Saving new best policy, reward=-14.430! [2023-10-14 01:05:14,825][33201] Updated weights for policy 0, policy_version 1460 (0.0009) [2023-10-14 01:05:15,204][33201] Updated weights for policy 0, policy_version 1470 (0.0008) [2023-10-14 01:05:15,271][32837] Saving new best policy, reward=-12.840! 
[2023-10-14 01:05:17,880][33226] Updated weights for policy 1, policy_version 1480 (0.0007) [2023-10-14 01:05:18,257][33226] Updated weights for policy 1, policy_version 1490 (0.0007) [2023-10-14 01:05:18,628][33226] Updated weights for policy 1, policy_version 1500 (0.0008) [2023-10-14 01:05:18,900][33201] Updated weights for policy 0, policy_version 1480 (0.0008) [2023-10-14 01:05:19,279][33201] Updated weights for policy 0, policy_version 1490 (0.0008) [2023-10-14 01:05:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 13671.3). Total num frames: 3047424. Throughput: 0: 1771.6, 1: 1783.1. Samples: 771400. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:05:19,558][31953] Avg episode reward: [(0, '-11.990'), (1, '-14.080')] [2023-10-14 01:05:19,558][32895] Saving new best policy, reward=-14.080! [2023-10-14 01:05:19,642][33201] Updated weights for policy 0, policy_version 1500 (0.0011) [2023-10-14 01:05:19,791][32837] Saving new best policy, reward=-11.990! [2023-10-14 01:05:22,287][33226] Updated weights for policy 1, policy_version 1510 (0.0007) [2023-10-14 01:05:22,661][33226] Updated weights for policy 1, policy_version 1520 (0.0007) [2023-10-14 01:05:23,039][33226] Updated weights for policy 1, policy_version 1530 (0.0008) [2023-10-14 01:05:23,396][33201] Updated weights for policy 0, policy_version 1510 (0.0009) [2023-10-14 01:05:23,767][33201] Updated weights for policy 0, policy_version 1520 (0.0011) [2023-10-14 01:05:24,135][33201] Updated weights for policy 0, policy_version 1530 (0.0009) [2023-10-14 01:05:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 13802.7). Total num frames: 3145728. Throughput: 0: 1758.7, 1: 1761.2. Samples: 791780. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 01:05:24,558][31953] Avg episode reward: [(0, '-10.690'), (1, '-13.600')] [2023-10-14 01:05:24,564][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000001536_1572864.pth... 
[2023-10-14 01:05:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000001536_1572864.pth... [2023-10-14 01:05:24,600][32895] Saving new best policy, reward=-13.600! [2023-10-14 01:05:24,605][32837] Saving new best policy, reward=-10.690! [2023-10-14 01:05:26,867][33226] Updated weights for policy 1, policy_version 1540 (0.0008) [2023-10-14 01:05:27,232][33226] Updated weights for policy 1, policy_version 1550 (0.0009) [2023-10-14 01:05:27,599][33226] Updated weights for policy 1, policy_version 1560 (0.0007) [2023-10-14 01:05:27,939][33201] Updated weights for policy 0, policy_version 1540 (0.0010) [2023-10-14 01:05:28,313][33201] Updated weights for policy 0, policy_version 1550 (0.0009) [2023-10-14 01:05:28,684][33201] Updated weights for policy 0, policy_version 1560 (0.0007) [2023-10-14 01:05:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 13787.7). Total num frames: 3211264. Throughput: 0: 1774.2, 1: 1784.0. Samples: 803398. Policy #0 lag: (min: 17.0, avg: 22.3, max: 49.0) [2023-10-14 01:05:29,558][31953] Avg episode reward: [(0, '-10.430'), (1, '-13.600')] [2023-10-14 01:05:29,560][32837] Saving new best policy, reward=-10.430! 
[2023-10-14 01:05:31,463][33226] Updated weights for policy 1, policy_version 1570 (0.0008) [2023-10-14 01:05:31,821][33226] Updated weights for policy 1, policy_version 1580 (0.0010) [2023-10-14 01:05:32,194][33226] Updated weights for policy 1, policy_version 1590 (0.0011) [2023-10-14 01:05:32,557][33226] Updated weights for policy 1, policy_version 1600 (0.0009) [2023-10-14 01:05:32,598][33201] Updated weights for policy 0, policy_version 1570 (0.0009) [2023-10-14 01:05:32,990][33201] Updated weights for policy 0, policy_version 1580 (0.0007) [2023-10-14 01:05:33,354][33201] Updated weights for policy 0, policy_version 1590 (0.0008) [2023-10-14 01:05:33,731][33201] Updated weights for policy 0, policy_version 1600 (0.0008) [2023-10-14 01:05:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13773.4). Total num frames: 3276800. Throughput: 0: 1771.2, 1: 1764.5. Samples: 823564. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 01:05:34,558][31953] Avg episode reward: [(0, '-8.640'), (1, '-12.940')] [2023-10-14 01:05:34,558][32895] Saving new best policy, reward=-12.940! [2023-10-14 01:05:34,558][32837] Saving new best policy, reward=-8.640! [2023-10-14 01:05:36,548][33226] Updated weights for policy 1, policy_version 1610 (0.0010) [2023-10-14 01:05:36,916][33226] Updated weights for policy 1, policy_version 1620 (0.0010) [2023-10-14 01:05:37,286][33226] Updated weights for policy 1, policy_version 1630 (0.0009) [2023-10-14 01:05:37,594][33201] Updated weights for policy 0, policy_version 1610 (0.0007) [2023-10-14 01:05:37,966][33201] Updated weights for policy 0, policy_version 1620 (0.0007) [2023-10-14 01:05:38,342][33201] Updated weights for policy 0, policy_version 1630 (0.0007) [2023-10-14 01:05:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13759.7). Total num frames: 3342336. Throughput: 0: 1757.4, 1: 1759.8. Samples: 844462. 
Policy #0 lag: (min: 13.0, avg: 19.6, max: 45.0) [2023-10-14 01:05:39,558][31953] Avg episode reward: [(0, '-7.110'), (1, '-11.880')] [2023-10-14 01:05:39,567][32895] Saving new best policy, reward=-11.880! [2023-10-14 01:05:39,567][32837] Saving new best policy, reward=-7.110! [2023-10-14 01:05:41,052][33226] Updated weights for policy 1, policy_version 1640 (0.0009) [2023-10-14 01:05:41,420][33226] Updated weights for policy 1, policy_version 1650 (0.0008) [2023-10-14 01:05:41,788][33226] Updated weights for policy 1, policy_version 1660 (0.0008) [2023-10-14 01:05:42,224][33201] Updated weights for policy 0, policy_version 1640 (0.0010) [2023-10-14 01:05:42,599][33201] Updated weights for policy 0, policy_version 1650 (0.0010) [2023-10-14 01:05:42,965][33201] Updated weights for policy 0, policy_version 1660 (0.0008) [2023-10-14 01:05:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13746.6). Total num frames: 3407872. Throughput: 0: 1782.9, 1: 1760.1. Samples: 855368. Policy #0 lag: (min: 4.0, avg: 4.8, max: 24.0) [2023-10-14 01:05:44,558][31953] Avg episode reward: [(0, '-5.800'), (1, '-10.380')] [2023-10-14 01:05:44,560][32837] Saving new best policy, reward=-5.800! [2023-10-14 01:05:44,560][32895] Saving new best policy, reward=-10.380! [2023-10-14 01:05:45,615][33226] Updated weights for policy 1, policy_version 1670 (0.0008) [2023-10-14 01:05:45,988][33226] Updated weights for policy 1, policy_version 1680 (0.0007) [2023-10-14 01:05:46,356][33226] Updated weights for policy 1, policy_version 1690 (0.0008) [2023-10-14 01:05:46,670][33201] Updated weights for policy 0, policy_version 1670 (0.0008) [2023-10-14 01:05:47,041][33201] Updated weights for policy 0, policy_version 1680 (0.0009) [2023-10-14 01:05:47,406][33201] Updated weights for policy 0, policy_version 1690 (0.0008) [2023-10-14 01:05:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13733.9). Total num frames: 3473408. Throughput: 0: 1757.4, 1: 1765.8. 
Samples: 876368. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 01:05:49,558][31953] Avg episode reward: [(0, '-3.820'), (1, '-9.890')] [2023-10-14 01:05:49,558][32837] Saving new best policy, reward=-3.820! [2023-10-14 01:05:49,559][32895] Saving new best policy, reward=-9.890! [2023-10-14 01:05:50,285][33226] Updated weights for policy 1, policy_version 1700 (0.0007) [2023-10-14 01:05:50,654][33226] Updated weights for policy 1, policy_version 1710 (0.0007) [2023-10-14 01:05:51,026][33226] Updated weights for policy 1, policy_version 1720 (0.0008) [2023-10-14 01:05:51,296][33201] Updated weights for policy 0, policy_version 1700 (0.0009) [2023-10-14 01:05:51,670][33201] Updated weights for policy 0, policy_version 1710 (0.0009) [2023-10-14 01:05:52,031][33201] Updated weights for policy 0, policy_version 1720 (0.0008) [2023-10-14 01:05:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13721.8). Total num frames: 3538944. Throughput: 0: 1760.0, 1: 1778.4. Samples: 898514. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 01:05:54,558][31953] Avg episode reward: [(0, '-3.230'), (1, '-9.260')] [2023-10-14 01:05:54,567][32895] Saving new best policy, reward=-9.260! [2023-10-14 01:05:54,567][32837] Saving new best policy, reward=-3.230! 
[2023-10-14 01:05:54,786][33226] Updated weights for policy 1, policy_version 1730 (0.0007) [2023-10-14 01:05:55,150][33226] Updated weights for policy 1, policy_version 1740 (0.0007) [2023-10-14 01:05:55,510][33226] Updated weights for policy 1, policy_version 1750 (0.0007) [2023-10-14 01:05:55,836][33201] Updated weights for policy 0, policy_version 1730 (0.0009) [2023-10-14 01:05:55,885][33226] Updated weights for policy 1, policy_version 1760 (0.0007) [2023-10-14 01:05:56,205][33201] Updated weights for policy 0, policy_version 1740 (0.0007) [2023-10-14 01:05:56,564][33201] Updated weights for policy 0, policy_version 1750 (0.0008) [2023-10-14 01:05:56,932][33201] Updated weights for policy 0, policy_version 1760 (0.0008) [2023-10-14 01:05:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13710.1). Total num frames: 3604480. Throughput: 0: 1764.3, 1: 1765.2. Samples: 908360. Policy #0 lag: (min: 31.0, avg: 38.8, max: 63.0) [2023-10-14 01:05:59,558][31953] Avg episode reward: [(0, '-0.940'), (1, '-8.360')] [2023-10-14 01:05:59,558][32837] Saving new best policy, reward=-0.940! [2023-10-14 01:05:59,799][33226] Updated weights for policy 1, policy_version 1770 (0.0009) [2023-10-14 01:06:00,173][33226] Updated weights for policy 1, policy_version 1780 (0.0009) [2023-10-14 01:06:00,544][33226] Updated weights for policy 1, policy_version 1790 (0.0008) [2023-10-14 01:06:00,610][32895] Saving new best policy, reward=-8.360! [2023-10-14 01:06:00,977][33201] Updated weights for policy 0, policy_version 1770 (0.0007) [2023-10-14 01:06:01,347][33201] Updated weights for policy 0, policy_version 1780 (0.0008) [2023-10-14 01:06:01,724][33201] Updated weights for policy 0, policy_version 1790 (0.0008) [2023-10-14 01:06:04,226][33226] Updated weights for policy 1, policy_version 1800 (0.0009) [2023-10-14 01:06:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13698.8). Total num frames: 3670016. Throughput: 0: 1749.1, 1: 1772.8. 
Samples: 929884. Policy #0 lag: (min: 8.0, avg: 29.6, max: 40.0) [2023-10-14 01:06:04,558][31953] Avg episode reward: [(0, '0.480'), (1, '-5.660')] [2023-10-14 01:06:04,558][32837] Saving new best policy, reward=0.480! [2023-10-14 01:06:04,599][33226] Updated weights for policy 1, policy_version 1810 (0.0008) [2023-10-14 01:06:04,978][33226] Updated weights for policy 1, policy_version 1820 (0.0008) [2023-10-14 01:06:05,125][32895] Saving new best policy, reward=-5.660! [2023-10-14 01:06:05,555][33201] Updated weights for policy 0, policy_version 1800 (0.0008) [2023-10-14 01:06:05,919][33201] Updated weights for policy 0, policy_version 1810 (0.0009) [2023-10-14 01:06:06,299][33201] Updated weights for policy 0, policy_version 1820 (0.0010) [2023-10-14 01:06:08,590][33226] Updated weights for policy 1, policy_version 1830 (0.0009) [2023-10-14 01:06:08,966][33226] Updated weights for policy 1, policy_version 1840 (0.0008) [2023-10-14 01:06:09,336][33226] Updated weights for policy 1, policy_version 1850 (0.0008) [2023-10-14 01:06:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13688.0). Total num frames: 3735552. Throughput: 0: 1762.2, 1: 1778.4. Samples: 951108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:06:09,558][31953] Avg episode reward: [(0, '2.190'), (1, '-5.410')] [2023-10-14 01:06:09,566][32837] Saving new best policy, reward=2.190! [2023-10-14 01:06:09,566][32895] Saving new best policy, reward=-5.410! 
[2023-10-14 01:06:10,091][33201] Updated weights for policy 0, policy_version 1830 (0.0007) [2023-10-14 01:06:10,456][33201] Updated weights for policy 0, policy_version 1840 (0.0009) [2023-10-14 01:06:10,832][33201] Updated weights for policy 0, policy_version 1850 (0.0007) [2023-10-14 01:06:13,219][33226] Updated weights for policy 1, policy_version 1860 (0.0009) [2023-10-14 01:06:13,597][33226] Updated weights for policy 1, policy_version 1870 (0.0008) [2023-10-14 01:06:13,967][33226] Updated weights for policy 1, policy_version 1880 (0.0011) [2023-10-14 01:06:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 13795.5). Total num frames: 3833856. Throughput: 0: 1737.7, 1: 1772.3. Samples: 961350. Policy #0 lag: (min: 31.0, avg: 36.9, max: 63.0) [2023-10-14 01:06:14,558][31953] Avg episode reward: [(0, '4.540'), (1, '-3.750')] [2023-10-14 01:06:14,559][32895] Saving new best policy, reward=-3.750! [2023-10-14 01:06:14,734][33201] Updated weights for policy 0, policy_version 1860 (0.0008) [2023-10-14 01:06:15,103][33201] Updated weights for policy 0, policy_version 1870 (0.0009) [2023-10-14 01:06:15,475][33201] Updated weights for policy 0, policy_version 1880 (0.0009) [2023-10-14 01:06:15,763][32837] Saving new best policy, reward=4.540! [2023-10-14 01:06:17,685][33226] Updated weights for policy 1, policy_version 1890 (0.0009) [2023-10-14 01:06:18,057][33226] Updated weights for policy 1, policy_version 1900 (0.0008) [2023-10-14 01:06:18,424][33226] Updated weights for policy 1, policy_version 1910 (0.0009) [2023-10-14 01:06:18,786][33226] Updated weights for policy 1, policy_version 1920 (0.0008) [2023-10-14 01:06:19,554][33201] Updated weights for policy 0, policy_version 1890 (0.0010) [2023-10-14 01:06:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 13783.3). Total num frames: 3899392. Throughput: 0: 1754.1, 1: 1790.4. Samples: 983068. 
Policy #0 lag: (min: 26.0, avg: 36.2, max: 58.0) [2023-10-14 01:06:19,557][31953] Avg episode reward: [(0, '4.540'), (1, '-3.410')] [2023-10-14 01:06:19,558][32895] Saving new best policy, reward=-3.410! [2023-10-14 01:06:19,970][33201] Updated weights for policy 0, policy_version 1900 (0.0008) [2023-10-14 01:06:20,343][33201] Updated weights for policy 0, policy_version 1910 (0.0007) [2023-10-14 01:06:20,710][33201] Updated weights for policy 0, policy_version 1920 (0.0008) [2023-10-14 01:06:22,627][33226] Updated weights for policy 1, policy_version 1930 (0.0010) [2023-10-14 01:06:22,994][33226] Updated weights for policy 1, policy_version 1940 (0.0009) [2023-10-14 01:06:23,367][33226] Updated weights for policy 1, policy_version 1950 (0.0008) [2023-10-14 01:06:24,369][33201] Updated weights for policy 0, policy_version 1930 (0.0007) [2023-10-14 01:06:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13771.6). Total num frames: 3964928. Throughput: 0: 1771.7, 1: 1770.5. Samples: 1003862. Policy #0 lag: (min: 31.0, avg: 38.6, max: 63.0) [2023-10-14 01:06:24,558][31953] Avg episode reward: [(0, '6.410'), (1, '-1.060')] [2023-10-14 01:06:24,566][32895] Saving new best policy, reward=-1.060! [2023-10-14 01:06:24,736][33201] Updated weights for policy 0, policy_version 1940 (0.0010) [2023-10-14 01:06:25,110][33201] Updated weights for policy 0, policy_version 1950 (0.0007) [2023-10-14 01:06:25,188][32837] Saving new best policy, reward=6.410! 
[2023-10-14 01:06:26,905][33226] Updated weights for policy 1, policy_version 1960 (0.0007) [2023-10-14 01:06:27,271][33226] Updated weights for policy 1, policy_version 1970 (0.0008) [2023-10-14 01:06:27,649][33226] Updated weights for policy 1, policy_version 1980 (0.0008) [2023-10-14 01:06:28,993][33201] Updated weights for policy 0, policy_version 1960 (0.0008) [2023-10-14 01:06:29,366][33201] Updated weights for policy 0, policy_version 1970 (0.0009) [2023-10-14 01:06:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 13760.2). Total num frames: 4030464. Throughput: 0: 1748.8, 1: 1791.1. Samples: 1014664. Policy #0 lag: (min: 31.0, avg: 41.5, max: 63.0) [2023-10-14 01:06:29,558][31953] Avg episode reward: [(0, '9.030'), (1, '0.530')] [2023-10-14 01:06:29,560][32895] Saving new best policy, reward=0.530! [2023-10-14 01:06:29,740][33201] Updated weights for policy 0, policy_version 1980 (0.0008) [2023-10-14 01:06:29,895][32837] Saving new best policy, reward=9.030! [2023-10-14 01:06:31,597][33226] Updated weights for policy 1, policy_version 1990 (0.0009) [2023-10-14 01:06:31,956][33226] Updated weights for policy 1, policy_version 2000 (0.0011) [2023-10-14 01:06:32,332][33226] Updated weights for policy 1, policy_version 2010 (0.0009) [2023-10-14 01:06:33,473][33201] Updated weights for policy 0, policy_version 1990 (0.0008) [2023-10-14 01:06:33,848][33201] Updated weights for policy 0, policy_version 2000 (0.0008) [2023-10-14 01:06:34,224][33201] Updated weights for policy 0, policy_version 2010 (0.0008) [2023-10-14 01:06:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 4128768. Throughput: 0: 1773.2, 1: 1766.1. Samples: 1035638. Policy #0 lag: (min: 1.0, avg: 11.2, max: 33.0) [2023-10-14 01:06:34,558][31953] Avg episode reward: [(0, '9.540'), (1, '0.890')] [2023-10-14 01:06:34,560][32837] Saving new best policy, reward=9.540! [2023-10-14 01:06:34,560][32895] Saving new best policy, reward=0.890! 
[2023-10-14 01:06:36,141][33226] Updated weights for policy 1, policy_version 2020 (0.0010) [2023-10-14 01:06:36,518][33226] Updated weights for policy 1, policy_version 2030 (0.0007) [2023-10-14 01:06:36,887][33226] Updated weights for policy 1, policy_version 2040 (0.0007) [2023-10-14 01:06:38,025][33201] Updated weights for policy 0, policy_version 2020 (0.0008) [2023-10-14 01:06:38,406][33201] Updated weights for policy 0, policy_version 2030 (0.0009) [2023-10-14 01:06:38,773][33201] Updated weights for policy 0, policy_version 2040 (0.0010) [2023-10-14 01:06:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 4194304. Throughput: 0: 1736.6, 1: 1774.4. Samples: 1056510. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:06:39,558][31953] Avg episode reward: [(0, '11.440'), (1, '2.460')] [2023-10-14 01:06:39,570][32895] Saving new best policy, reward=2.460! [2023-10-14 01:06:39,570][32837] Saving new best policy, reward=11.440! [2023-10-14 01:06:40,610][33226] Updated weights for policy 1, policy_version 2050 (0.0009) [2023-10-14 01:06:40,982][33226] Updated weights for policy 1, policy_version 2060 (0.0009) [2023-10-14 01:06:41,359][33226] Updated weights for policy 1, policy_version 2070 (0.0007) [2023-10-14 01:06:41,725][33226] Updated weights for policy 1, policy_version 2080 (0.0007) [2023-10-14 01:06:42,573][33201] Updated weights for policy 0, policy_version 2050 (0.0007) [2023-10-14 01:06:42,943][33201] Updated weights for policy 0, policy_version 2060 (0.0008) [2023-10-14 01:06:43,325][33201] Updated weights for policy 0, policy_version 2070 (0.0009) [2023-10-14 01:06:43,696][33201] Updated weights for policy 0, policy_version 2080 (0.0007) [2023-10-14 01:06:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 4259840. Throughput: 0: 1763.6, 1: 1771.1. Samples: 1067422. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 01:06:44,558][31953] Avg episode reward: [(0, '13.250'), (1, '3.800')] [2023-10-14 01:06:44,558][32837] Saving new best policy, reward=13.250! [2023-10-14 01:06:44,559][32895] Saving new best policy, reward=3.800! [2023-10-14 01:06:45,470][33226] Updated weights for policy 1, policy_version 2090 (0.0009) [2023-10-14 01:06:45,843][33226] Updated weights for policy 1, policy_version 2100 (0.0007) [2023-10-14 01:06:46,214][33226] Updated weights for policy 1, policy_version 2110 (0.0008) [2023-10-14 01:06:47,537][33201] Updated weights for policy 0, policy_version 2090 (0.0010) [2023-10-14 01:06:47,909][33201] Updated weights for policy 0, policy_version 2100 (0.0010) [2023-10-14 01:06:48,293][33201] Updated weights for policy 0, policy_version 2110 (0.0010) [2023-10-14 01:06:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 4325376. Throughput: 0: 1748.4, 1: 1775.3. Samples: 1088452. Policy #0 lag: (min: 26.0, avg: 31.8, max: 58.0) [2023-10-14 01:06:49,558][31953] Avg episode reward: [(0, '14.130'), (1, '5.430')] [2023-10-14 01:06:49,560][32837] Saving new best policy, reward=14.130! [2023-10-14 01:06:49,560][32895] Saving new best policy, reward=5.430! [2023-10-14 01:06:50,061][33226] Updated weights for policy 1, policy_version 2120 (0.0007) [2023-10-14 01:06:50,431][33226] Updated weights for policy 1, policy_version 2130 (0.0007) [2023-10-14 01:06:50,793][33226] Updated weights for policy 1, policy_version 2140 (0.0010) [2023-10-14 01:06:52,178][33201] Updated weights for policy 0, policy_version 2120 (0.0008) [2023-10-14 01:06:52,553][33201] Updated weights for policy 0, policy_version 2130 (0.0007) [2023-10-14 01:06:52,924][33201] Updated weights for policy 0, policy_version 2140 (0.0007) [2023-10-14 01:06:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 4390912. Throughput: 0: 1753.6, 1: 1792.9. Samples: 1110702. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:06:54,557][31953] Avg episode reward: [(0, '15.550'), (1, '6.090')] [2023-10-14 01:06:54,568][32837] Saving new best policy, reward=15.550! [2023-10-14 01:06:54,575][33226] Updated weights for policy 1, policy_version 2150 (0.0008) [2023-10-14 01:06:54,942][33226] Updated weights for policy 1, policy_version 2160 (0.0009) [2023-10-14 01:06:55,310][33226] Updated weights for policy 1, policy_version 2170 (0.0008) [2023-10-14 01:06:55,527][32895] Saving new best policy, reward=6.090! [2023-10-14 01:06:56,813][33201] Updated weights for policy 0, policy_version 2150 (0.0008) [2023-10-14 01:06:57,186][33201] Updated weights for policy 0, policy_version 2160 (0.0009) [2023-10-14 01:06:57,549][33201] Updated weights for policy 0, policy_version 2170 (0.0008) [2023-10-14 01:06:59,178][33226] Updated weights for policy 1, policy_version 2180 (0.0008) [2023-10-14 01:06:59,541][33226] Updated weights for policy 1, policy_version 2190 (0.0009) [2023-10-14 01:06:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 4456448. Throughput: 0: 1774.1, 1: 1778.3. Samples: 1121208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:06:59,558][31953] Avg episode reward: [(0, '16.450'), (1, '7.380')] [2023-10-14 01:06:59,559][32837] Saving new best policy, reward=16.450! [2023-10-14 01:06:59,908][33226] Updated weights for policy 1, policy_version 2200 (0.0010) [2023-10-14 01:07:00,202][32895] Saving new best policy, reward=7.380! 
[2023-10-14 01:07:01,336][33201] Updated weights for policy 0, policy_version 2180 (0.0007) [2023-10-14 01:07:01,695][33201] Updated weights for policy 0, policy_version 2190 (0.0007) [2023-10-14 01:07:02,073][33201] Updated weights for policy 0, policy_version 2200 (0.0009) [2023-10-14 01:07:03,653][33226] Updated weights for policy 1, policy_version 2210 (0.0007) [2023-10-14 01:07:04,031][33226] Updated weights for policy 1, policy_version 2220 (0.0008) [2023-10-14 01:07:04,402][33226] Updated weights for policy 1, policy_version 2230 (0.0008) [2023-10-14 01:07:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 4521984. Throughput: 0: 1760.5, 1: 1786.7. Samples: 1142692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:07:04,557][31953] Avg episode reward: [(0, '17.090'), (1, '9.410')] [2023-10-14 01:07:04,558][32837] Saving new best policy, reward=17.090! [2023-10-14 01:07:04,766][33226] Updated weights for policy 1, policy_version 2240 (0.0007) [2023-10-14 01:07:04,767][32895] Saving new best policy, reward=9.410! [2023-10-14 01:07:05,710][33201] Updated weights for policy 0, policy_version 2210 (0.0008) [2023-10-14 01:07:06,107][33201] Updated weights for policy 0, policy_version 2220 (0.0007) [2023-10-14 01:07:06,483][33201] Updated weights for policy 0, policy_version 2230 (0.0008) [2023-10-14 01:07:06,848][33201] Updated weights for policy 0, policy_version 2240 (0.0011) [2023-10-14 01:07:08,640][33226] Updated weights for policy 1, policy_version 2250 (0.0008) [2023-10-14 01:07:09,018][33226] Updated weights for policy 1, policy_version 2260 (0.0010) [2023-10-14 01:07:09,390][33226] Updated weights for policy 1, policy_version 2270 (0.0010) [2023-10-14 01:07:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 4620288. Throughput: 0: 1764.1, 1: 1794.0. Samples: 1163978. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:07:09,559][31953] Avg episode reward: [(0, '17.870'), (1, '9.970')] [2023-10-14 01:07:09,571][32837] Saving new best policy, reward=17.870! [2023-10-14 01:07:09,571][32895] Saving new best policy, reward=9.970! [2023-10-14 01:07:10,489][33201] Updated weights for policy 0, policy_version 2250 (0.0010) [2023-10-14 01:07:10,868][33201] Updated weights for policy 0, policy_version 2260 (0.0008) [2023-10-14 01:07:11,227][33201] Updated weights for policy 0, policy_version 2270 (0.0007) [2023-10-14 01:07:13,065][33226] Updated weights for policy 1, policy_version 2280 (0.0010) [2023-10-14 01:07:13,436][33226] Updated weights for policy 1, policy_version 2290 (0.0009) [2023-10-14 01:07:13,804][33226] Updated weights for policy 1, policy_version 2300 (0.0008) [2023-10-14 01:07:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 4685824. Throughput: 0: 1763.7, 1: 1789.7. Samples: 1174564. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 01:07:14,558][31953] Avg episode reward: [(0, '18.340'), (1, '10.980')] [2023-10-14 01:07:14,559][32895] Saving new best policy, reward=10.980! [2023-10-14 01:07:14,559][32837] Saving new best policy, reward=18.340! [2023-10-14 01:07:15,088][33201] Updated weights for policy 0, policy_version 2280 (0.0007) [2023-10-14 01:07:15,463][33201] Updated weights for policy 0, policy_version 2290 (0.0007) [2023-10-14 01:07:15,831][33201] Updated weights for policy 0, policy_version 2300 (0.0008) [2023-10-14 01:07:17,579][33226] Updated weights for policy 1, policy_version 2310 (0.0008) [2023-10-14 01:07:17,951][33226] Updated weights for policy 1, policy_version 2320 (0.0007) [2023-10-14 01:07:18,314][33226] Updated weights for policy 1, policy_version 2330 (0.0009) [2023-10-14 01:07:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 4751360. Throughput: 0: 1763.4, 1: 1799.8. 
Samples: 1195980. Policy #0 lag: (min: 9.0, avg: 11.0, max: 34.0) [2023-10-14 01:07:19,558][31953] Avg episode reward: [(0, '18.500'), (1, '12.360')] [2023-10-14 01:07:19,559][32895] Saving new best policy, reward=12.360! [2023-10-14 01:07:19,682][33201] Updated weights for policy 0, policy_version 2310 (0.0008) [2023-10-14 01:07:20,064][33201] Updated weights for policy 0, policy_version 2320 (0.0007) [2023-10-14 01:07:20,432][33201] Updated weights for policy 0, policy_version 2330 (0.0007) [2023-10-14 01:07:20,659][32837] Saving new best policy, reward=18.500! [2023-10-14 01:07:22,123][33226] Updated weights for policy 1, policy_version 2340 (0.0009) [2023-10-14 01:07:22,496][33226] Updated weights for policy 1, policy_version 2350 (0.0011) [2023-10-14 01:07:22,859][33226] Updated weights for policy 1, policy_version 2360 (0.0010) [2023-10-14 01:07:24,244][33201] Updated weights for policy 0, policy_version 2340 (0.0009) [2023-10-14 01:07:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 4816896. Throughput: 0: 1802.8, 1: 1778.6. Samples: 1217672. Policy #0 lag: (min: 31.0, avg: 37.6, max: 63.0) [2023-10-14 01:07:24,558][31953] Avg episode reward: [(0, '18.770'), (1, '13.470')] [2023-10-14 01:07:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000002368_2424832.pth... [2023-10-14 01:07:24,596][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000000704_720896.pth [2023-10-14 01:07:24,599][32895] Saving new best policy, reward=13.470! [2023-10-14 01:07:24,619][33201] Updated weights for policy 0, policy_version 2350 (0.0009) [2023-10-14 01:07:24,982][33201] Updated weights for policy 0, policy_version 2360 (0.0008) [2023-10-14 01:07:25,279][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000002368_2424832.pth... 
[2023-10-14 01:07:25,319][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000000704_720896.pth [2023-10-14 01:07:25,324][32837] Saving new best policy, reward=18.770! [2023-10-14 01:07:26,569][33226] Updated weights for policy 1, policy_version 2370 (0.0007) [2023-10-14 01:07:26,932][33226] Updated weights for policy 1, policy_version 2380 (0.0009) [2023-10-14 01:07:27,302][33226] Updated weights for policy 1, policy_version 2390 (0.0009) [2023-10-14 01:07:27,667][33226] Updated weights for policy 1, policy_version 2400 (0.0008) [2023-10-14 01:07:28,828][33201] Updated weights for policy 0, policy_version 2370 (0.0008) [2023-10-14 01:07:29,210][33201] Updated weights for policy 0, policy_version 2380 (0.0009) [2023-10-14 01:07:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 4882432. Throughput: 0: 1768.3, 1: 1804.2. Samples: 1228188. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 01:07:29,558][31953] Avg episode reward: [(0, '19.020'), (1, '13.810')] [2023-10-14 01:07:29,560][32895] Saving new best policy, reward=13.810! [2023-10-14 01:07:29,585][33201] Updated weights for policy 0, policy_version 2390 (0.0008) [2023-10-14 01:07:29,957][32837] Saving new best policy, reward=19.020! 
[2023-10-14 01:07:29,960][33201] Updated weights for policy 0, policy_version 2400 (0.0008) [2023-10-14 01:07:31,611][33226] Updated weights for policy 1, policy_version 2410 (0.0009) [2023-10-14 01:07:31,980][33226] Updated weights for policy 1, policy_version 2420 (0.0008) [2023-10-14 01:07:32,342][33226] Updated weights for policy 1, policy_version 2430 (0.0010) [2023-10-14 01:07:33,686][33201] Updated weights for policy 0, policy_version 2410 (0.0009) [2023-10-14 01:07:34,062][33201] Updated weights for policy 0, policy_version 2420 (0.0008) [2023-10-14 01:07:34,434][33201] Updated weights for policy 0, policy_version 2430 (0.0008) [2023-10-14 01:07:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 4980736. Throughput: 0: 1789.2, 1: 1780.4. Samples: 1249086. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-14 01:07:34,558][31953] Avg episode reward: [(0, '19.110'), (1, '15.090')] [2023-10-14 01:07:34,560][32837] Saving new best policy, reward=19.110! [2023-10-14 01:07:34,560][32895] Saving new best policy, reward=15.090! [2023-10-14 01:07:36,144][33226] Updated weights for policy 1, policy_version 2440 (0.0009) [2023-10-14 01:07:36,513][33226] Updated weights for policy 1, policy_version 2450 (0.0008) [2023-10-14 01:07:36,879][33226] Updated weights for policy 1, policy_version 2460 (0.0008) [2023-10-14 01:07:38,232][33201] Updated weights for policy 0, policy_version 2440 (0.0011) [2023-10-14 01:07:38,612][33201] Updated weights for policy 0, policy_version 2450 (0.0010) [2023-10-14 01:07:38,982][33201] Updated weights for policy 0, policy_version 2460 (0.0010) [2023-10-14 01:07:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 5046272. Throughput: 0: 1759.8, 1: 1774.3. Samples: 1269740. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:07:39,558][31953] Avg episode reward: [(0, '19.310'), (1, '16.010')] [2023-10-14 01:07:39,571][32895] Saving new best policy, reward=16.010! [2023-10-14 01:07:39,571][32837] Saving new best policy, reward=19.310! [2023-10-14 01:07:40,689][33226] Updated weights for policy 1, policy_version 2470 (0.0010) [2023-10-14 01:07:41,069][33226] Updated weights for policy 1, policy_version 2480 (0.0009) [2023-10-14 01:07:41,452][33226] Updated weights for policy 1, policy_version 2490 (0.0007) [2023-10-14 01:07:42,780][33201] Updated weights for policy 0, policy_version 2470 (0.0008) [2023-10-14 01:07:43,150][33201] Updated weights for policy 0, policy_version 2480 (0.0008) [2023-10-14 01:07:43,524][33201] Updated weights for policy 0, policy_version 2490 (0.0010) [2023-10-14 01:07:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 5111808. Throughput: 0: 1769.5, 1: 1771.2. Samples: 1280538. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:07:44,557][31953] Avg episode reward: [(0, '19.400'), (1, '16.120')] [2023-10-14 01:07:44,558][32837] Saving new best policy, reward=19.400! [2023-10-14 01:07:44,558][32895] Saving new best policy, reward=16.120! [2023-10-14 01:07:45,246][33226] Updated weights for policy 1, policy_version 2500 (0.0009) [2023-10-14 01:07:45,616][33226] Updated weights for policy 1, policy_version 2510 (0.0009) [2023-10-14 01:07:45,990][33226] Updated weights for policy 1, policy_version 2520 (0.0008) [2023-10-14 01:07:47,411][33201] Updated weights for policy 0, policy_version 2500 (0.0011) [2023-10-14 01:07:47,790][33201] Updated weights for policy 0, policy_version 2510 (0.0008) [2023-10-14 01:07:48,160][33201] Updated weights for policy 0, policy_version 2520 (0.0008) [2023-10-14 01:07:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 5177344. Throughput: 0: 1768.3, 1: 1767.5. 
Samples: 1301804. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 01:07:49,558][31953] Avg episode reward: [(0, '19.390'), (1, '16.710')] [2023-10-14 01:07:49,560][32895] Saving new best policy, reward=16.710! [2023-10-14 01:07:49,812][33226] Updated weights for policy 1, policy_version 2530 (0.0007) [2023-10-14 01:07:50,179][33226] Updated weights for policy 1, policy_version 2540 (0.0008) [2023-10-14 01:07:50,546][33226] Updated weights for policy 1, policy_version 2550 (0.0009) [2023-10-14 01:07:50,925][33226] Updated weights for policy 1, policy_version 2560 (0.0007) [2023-10-14 01:07:52,048][33201] Updated weights for policy 0, policy_version 2530 (0.0008) [2023-10-14 01:07:52,463][33201] Updated weights for policy 0, policy_version 2540 (0.0008) [2023-10-14 01:07:52,823][33201] Updated weights for policy 0, policy_version 2550 (0.0007) [2023-10-14 01:07:53,204][33201] Updated weights for policy 0, policy_version 2560 (0.0009) [2023-10-14 01:07:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 5242880. Throughput: 0: 1750.3, 1: 1787.0. Samples: 1323156. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 01:07:54,558][31953] Avg episode reward: [(0, '19.430'), (1, '17.100')] [2023-10-14 01:07:54,564][32837] Saving new best policy, reward=19.430! [2023-10-14 01:07:54,828][33226] Updated weights for policy 1, policy_version 2570 (0.0009) [2023-10-14 01:07:55,210][33226] Updated weights for policy 1, policy_version 2580 (0.0009) [2023-10-14 01:07:55,582][33226] Updated weights for policy 1, policy_version 2590 (0.0007) [2023-10-14 01:07:55,650][32895] Saving new best policy, reward=17.100! 
[2023-10-14 01:07:57,183][33201] Updated weights for policy 0, policy_version 2570 (0.0008) [2023-10-14 01:07:57,561][33201] Updated weights for policy 0, policy_version 2580 (0.0008) [2023-10-14 01:07:57,934][33201] Updated weights for policy 0, policy_version 2590 (0.0007) [2023-10-14 01:07:59,248][33226] Updated weights for policy 1, policy_version 2600 (0.0008) [2023-10-14 01:07:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 5308416. Throughput: 0: 1771.8, 1: 1764.1. Samples: 1333680. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-14 01:07:59,559][31953] Avg episode reward: [(0, '19.560'), (1, '17.400')] [2023-10-14 01:07:59,560][32837] Saving new best policy, reward=19.560! [2023-10-14 01:07:59,618][33226] Updated weights for policy 1, policy_version 2610 (0.0007) [2023-10-14 01:07:59,985][33226] Updated weights for policy 1, policy_version 2620 (0.0008) [2023-10-14 01:08:00,132][32895] Saving new best policy, reward=17.400! [2023-10-14 01:08:01,822][33201] Updated weights for policy 0, policy_version 2600 (0.0009) [2023-10-14 01:08:02,197][33201] Updated weights for policy 0, policy_version 2610 (0.0010) [2023-10-14 01:08:02,556][33201] Updated weights for policy 0, policy_version 2620 (0.0010) [2023-10-14 01:08:03,817][33226] Updated weights for policy 1, policy_version 2630 (0.0008) [2023-10-14 01:08:04,187][33226] Updated weights for policy 1, policy_version 2640 (0.0007) [2023-10-14 01:08:04,551][33226] Updated weights for policy 1, policy_version 2650 (0.0010) [2023-10-14 01:08:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 5373952. Throughput: 0: 1743.7, 1: 1776.0. Samples: 1354364. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:08:04,558][31953] Avg episode reward: [(0, '19.600'), (1, '17.520')] [2023-10-14 01:08:04,558][32837] Saving new best policy, reward=19.600! 
[2023-10-14 01:08:04,773][32895] Saving new best policy, reward=17.520! [2023-10-14 01:08:06,344][33201] Updated weights for policy 0, policy_version 2630 (0.0008) [2023-10-14 01:08:06,710][33201] Updated weights for policy 0, policy_version 2640 (0.0008) [2023-10-14 01:08:07,081][33201] Updated weights for policy 0, policy_version 2650 (0.0009) [2023-10-14 01:08:08,419][33226] Updated weights for policy 1, policy_version 2660 (0.0010) [2023-10-14 01:08:08,797][33226] Updated weights for policy 1, policy_version 2670 (0.0009) [2023-10-14 01:08:09,164][33226] Updated weights for policy 1, policy_version 2680 (0.0009) [2023-10-14 01:08:09,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 5472256. Throughput: 0: 1740.2, 1: 1770.4. Samples: 1375648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:08:09,557][31953] Avg episode reward: [(0, '19.640'), (1, '17.540')] [2023-10-14 01:08:09,565][32895] Saving new best policy, reward=17.540! [2023-10-14 01:08:09,565][32837] Saving new best policy, reward=19.640! [2023-10-14 01:08:10,930][33201] Updated weights for policy 0, policy_version 2660 (0.0007) [2023-10-14 01:08:11,291][33201] Updated weights for policy 0, policy_version 2670 (0.0008) [2023-10-14 01:08:11,663][33201] Updated weights for policy 0, policy_version 2680 (0.0009) [2023-10-14 01:08:12,823][33226] Updated weights for policy 1, policy_version 2690 (0.0010) [2023-10-14 01:08:13,190][33226] Updated weights for policy 1, policy_version 2700 (0.0009) [2023-10-14 01:08:13,556][33226] Updated weights for policy 1, policy_version 2710 (0.0007) [2023-10-14 01:08:13,926][33226] Updated weights for policy 1, policy_version 2720 (0.0007) [2023-10-14 01:08:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 5537792. Throughput: 0: 1743.3, 1: 1765.2. Samples: 1386068. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:08:14,558][31953] Avg episode reward: [(0, '19.670'), (1, '17.810')] [2023-10-14 01:08:14,559][32837] Saving new best policy, reward=19.670! [2023-10-14 01:08:14,559][32895] Saving new best policy, reward=17.810! [2023-10-14 01:08:15,376][33201] Updated weights for policy 0, policy_version 2690 (0.0008) [2023-10-14 01:08:15,742][33201] Updated weights for policy 0, policy_version 2700 (0.0011) [2023-10-14 01:08:16,114][33201] Updated weights for policy 0, policy_version 2710 (0.0009) [2023-10-14 01:08:16,475][33201] Updated weights for policy 0, policy_version 2720 (0.0007) [2023-10-14 01:08:17,450][33226] Updated weights for policy 1, policy_version 2730 (0.0007) [2023-10-14 01:08:17,823][33226] Updated weights for policy 1, policy_version 2740 (0.0007) [2023-10-14 01:08:18,190][33226] Updated weights for policy 1, policy_version 2750 (0.0007) [2023-10-14 01:08:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 5603328. Throughput: 0: 1746.5, 1: 1775.7. Samples: 1407586. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:08:19,558][31953] Avg episode reward: [(0, '19.700'), (1, '17.820')] [2023-10-14 01:08:19,558][32837] Saving new best policy, reward=19.700! [2023-10-14 01:08:19,558][32895] Saving new best policy, reward=17.820! 
[2023-10-14 01:08:20,261][33201] Updated weights for policy 0, policy_version 2730 (0.0008) [2023-10-14 01:08:20,627][33201] Updated weights for policy 0, policy_version 2740 (0.0008) [2023-10-14 01:08:21,001][33201] Updated weights for policy 0, policy_version 2750 (0.0007) [2023-10-14 01:08:21,851][33226] Updated weights for policy 1, policy_version 2760 (0.0009) [2023-10-14 01:08:22,217][33226] Updated weights for policy 1, policy_version 2770 (0.0010) [2023-10-14 01:08:22,589][33226] Updated weights for policy 1, policy_version 2780 (0.0009) [2023-10-14 01:08:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 5668864. Throughput: 0: 1781.8, 1: 1775.8. Samples: 1429834. Policy #0 lag: (min: 15.0, avg: 24.7, max: 47.0) [2023-10-14 01:08:24,558][31953] Avg episode reward: [(0, '19.870'), (1, '17.830')] [2023-10-14 01:08:24,566][32895] Saving new best policy, reward=17.830! [2023-10-14 01:08:24,836][33201] Updated weights for policy 0, policy_version 2760 (0.0007) [2023-10-14 01:08:25,213][33201] Updated weights for policy 0, policy_version 2770 (0.0009) [2023-10-14 01:08:25,578][33201] Updated weights for policy 0, policy_version 2780 (0.0009) [2023-10-14 01:08:25,724][32837] Saving new best policy, reward=19.870! [2023-10-14 01:08:26,305][33226] Updated weights for policy 1, policy_version 2790 (0.0009) [2023-10-14 01:08:26,677][33226] Updated weights for policy 1, policy_version 2800 (0.0009) [2023-10-14 01:08:27,051][33226] Updated weights for policy 1, policy_version 2810 (0.0010) [2023-10-14 01:08:29,336][33201] Updated weights for policy 0, policy_version 2790 (0.0009) [2023-10-14 01:08:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 5734400. Throughput: 0: 1751.6, 1: 1788.8. Samples: 1439858. 
Policy #0 lag: (min: 25.0, avg: 38.4, max: 57.0) [2023-10-14 01:08:29,557][31953] Avg episode reward: [(0, '19.870'), (1, '17.970')] [2023-10-14 01:08:29,558][32895] Saving new best policy, reward=17.970! [2023-10-14 01:08:29,717][33201] Updated weights for policy 0, policy_version 2800 (0.0009) [2023-10-14 01:08:30,095][33201] Updated weights for policy 0, policy_version 2810 (0.0009) [2023-10-14 01:08:30,887][33226] Updated weights for policy 1, policy_version 2820 (0.0008) [2023-10-14 01:08:31,256][33226] Updated weights for policy 1, policy_version 2830 (0.0008) [2023-10-14 01:08:31,619][33226] Updated weights for policy 1, policy_version 2840 (0.0009) [2023-10-14 01:08:33,927][33201] Updated weights for policy 0, policy_version 2820 (0.0009) [2023-10-14 01:08:34,306][33201] Updated weights for policy 0, policy_version 2830 (0.0009) [2023-10-14 01:08:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 5799936. Throughput: 0: 1771.1, 1: 1781.4. Samples: 1461666. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 01:08:34,558][31953] Avg episode reward: [(0, '19.840'), (1, '18.080')] [2023-10-14 01:08:34,559][32895] Saving new best policy, reward=18.080! 
[2023-10-14 01:08:34,682][33201] Updated weights for policy 0, policy_version 2840 (0.0009) [2023-10-14 01:08:35,378][33226] Updated weights for policy 1, policy_version 2850 (0.0010) [2023-10-14 01:08:35,741][33226] Updated weights for policy 1, policy_version 2860 (0.0010) [2023-10-14 01:08:36,110][33226] Updated weights for policy 1, policy_version 2870 (0.0009) [2023-10-14 01:08:36,476][33226] Updated weights for policy 1, policy_version 2880 (0.0008) [2023-10-14 01:08:38,521][33201] Updated weights for policy 0, policy_version 2850 (0.0007) [2023-10-14 01:08:38,932][33201] Updated weights for policy 0, policy_version 2860 (0.0007) [2023-10-14 01:08:39,298][33201] Updated weights for policy 0, policy_version 2870 (0.0008) [2023-10-14 01:08:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 5865472. Throughput: 0: 1771.5, 1: 1785.2. Samples: 1483208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:08:39,558][31953] Avg episode reward: [(0, '19.880'), (1, '18.110')] [2023-10-14 01:08:39,563][32895] Saving new best policy, reward=18.110! [2023-10-14 01:08:39,664][32837] Saving new best policy, reward=19.880! [2023-10-14 01:08:39,666][33201] Updated weights for policy 0, policy_version 2880 (0.0008) [2023-10-14 01:08:40,337][33226] Updated weights for policy 1, policy_version 2890 (0.0009) [2023-10-14 01:08:40,716][33226] Updated weights for policy 1, policy_version 2900 (0.0007) [2023-10-14 01:08:41,090][33226] Updated weights for policy 1, policy_version 2910 (0.0010) [2023-10-14 01:08:43,444][33201] Updated weights for policy 0, policy_version 2890 (0.0007) [2023-10-14 01:08:43,815][33201] Updated weights for policy 0, policy_version 2900 (0.0007) [2023-10-14 01:08:44,190][33201] Updated weights for policy 0, policy_version 2910 (0.0009) [2023-10-14 01:08:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 5963776. Throughput: 0: 1763.5, 1: 1788.1. 
Samples: 1493502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 01:08:44,558][31953] Avg episode reward: [(0, '19.890'), (1, '18.170')] [2023-10-14 01:08:44,560][32837] Saving new best policy, reward=19.890! [2023-10-14 01:08:44,820][33226] Updated weights for policy 1, policy_version 2920 (0.0009) [2023-10-14 01:08:45,191][33226] Updated weights for policy 1, policy_version 2930 (0.0009) [2023-10-14 01:08:45,552][33226] Updated weights for policy 1, policy_version 2940 (0.0012) [2023-10-14 01:08:45,699][32895] Saving new best policy, reward=18.170! [2023-10-14 01:08:48,053][33201] Updated weights for policy 0, policy_version 2920 (0.0009) [2023-10-14 01:08:48,416][33201] Updated weights for policy 0, policy_version 2930 (0.0010) [2023-10-14 01:08:48,786][33201] Updated weights for policy 0, policy_version 2940 (0.0007) [2023-10-14 01:08:49,498][33226] Updated weights for policy 1, policy_version 2950 (0.0010) [2023-10-14 01:08:49,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 6029312. Throughput: 0: 1781.2, 1: 1790.8. Samples: 1515108. Policy #0 lag: (min: 3.0, avg: 4.5, max: 30.0) [2023-10-14 01:08:49,558][31953] Avg episode reward: [(0, '19.920'), (1, '18.230')] [2023-10-14 01:08:49,560][32837] Saving new best policy, reward=19.920! [2023-10-14 01:08:49,868][33226] Updated weights for policy 1, policy_version 2960 (0.0011) [2023-10-14 01:08:50,231][33226] Updated weights for policy 1, policy_version 2970 (0.0011) [2023-10-14 01:08:50,454][32895] Saving new best policy, reward=18.230! 
[2023-10-14 01:08:52,406][33201] Updated weights for policy 0, policy_version 2950 (0.0008) [2023-10-14 01:08:52,783][33201] Updated weights for policy 0, policy_version 2960 (0.0010) [2023-10-14 01:08:53,158][33201] Updated weights for policy 0, policy_version 2970 (0.0007) [2023-10-14 01:08:54,300][33226] Updated weights for policy 1, policy_version 2980 (0.0010) [2023-10-14 01:08:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 6094848. Throughput: 0: 1759.9, 1: 1803.2. Samples: 1535992. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) [2023-10-14 01:08:54,558][31953] Avg episode reward: [(0, '19.930'), (1, '18.530')] [2023-10-14 01:08:54,569][32837] Saving new best policy, reward=19.930! [2023-10-14 01:08:54,667][33226] Updated weights for policy 1, policy_version 2990 (0.0011) [2023-10-14 01:08:55,030][33226] Updated weights for policy 1, policy_version 3000 (0.0007) [2023-10-14 01:08:55,320][32895] Saving new best policy, reward=18.530! [2023-10-14 01:08:57,123][33201] Updated weights for policy 0, policy_version 2980 (0.0008) [2023-10-14 01:08:57,498][33201] Updated weights for policy 0, policy_version 2990 (0.0007) [2023-10-14 01:08:57,866][33201] Updated weights for policy 0, policy_version 3000 (0.0007) [2023-10-14 01:08:58,836][33226] Updated weights for policy 1, policy_version 3010 (0.0008) [2023-10-14 01:08:59,218][33226] Updated weights for policy 1, policy_version 3020 (0.0010) [2023-10-14 01:08:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 6160384. Throughput: 0: 1789.2, 1: 1782.8. Samples: 1546812. Policy #0 lag: (min: 31.0, avg: 31.2, max: 41.0) [2023-10-14 01:08:59,558][31953] Avg episode reward: [(0, '19.940'), (1, '18.580')] [2023-10-14 01:08:59,559][32837] Saving new best policy, reward=19.940! 
[2023-10-14 01:08:59,580][33226] Updated weights for policy 1, policy_version 3030 (0.0010) [2023-10-14 01:08:59,949][33226] Updated weights for policy 1, policy_version 3040 (0.0010) [2023-10-14 01:08:59,949][32895] Saving new best policy, reward=18.580! [2023-10-14 01:09:01,546][33201] Updated weights for policy 0, policy_version 3010 (0.0007) [2023-10-14 01:09:01,913][33201] Updated weights for policy 0, policy_version 3020 (0.0009) [2023-10-14 01:09:02,296][33201] Updated weights for policy 0, policy_version 3030 (0.0010) [2023-10-14 01:09:02,670][33201] Updated weights for policy 0, policy_version 3040 (0.0009) [2023-10-14 01:09:03,856][33226] Updated weights for policy 1, policy_version 3050 (0.0010) [2023-10-14 01:09:04,225][33226] Updated weights for policy 1, policy_version 3060 (0.0007) [2023-10-14 01:09:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 6225920. Throughput: 0: 1767.9, 1: 1795.6. Samples: 1567944. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 01:09:04,558][31953] Avg episode reward: [(0, '19.940'), (1, '18.570')] [2023-10-14 01:09:04,581][33226] Updated weights for policy 1, policy_version 3070 (0.0011) [2023-10-14 01:09:06,481][33201] Updated weights for policy 0, policy_version 3050 (0.0008) [2023-10-14 01:09:06,865][33201] Updated weights for policy 0, policy_version 3060 (0.0009) [2023-10-14 01:09:07,246][33201] Updated weights for policy 0, policy_version 3070 (0.0009) [2023-10-14 01:09:08,421][33226] Updated weights for policy 1, policy_version 3080 (0.0009) [2023-10-14 01:09:08,783][33226] Updated weights for policy 1, policy_version 3090 (0.0008) [2023-10-14 01:09:09,151][33226] Updated weights for policy 1, policy_version 3100 (0.0008) [2023-10-14 01:09:09,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 6324224. Throughput: 0: 1760.8, 1: 1773.3. Samples: 1588872. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 01:09:09,558][31953] Avg episode reward: [(0, '19.950'), (1, '18.850')] [2023-10-14 01:09:09,568][32837] Saving new best policy, reward=19.950! [2023-10-14 01:09:09,568][32895] Saving new best policy, reward=18.850! [2023-10-14 01:09:11,259][33201] Updated weights for policy 0, policy_version 3080 (0.0007) [2023-10-14 01:09:11,630][33201] Updated weights for policy 0, policy_version 3090 (0.0007) [2023-10-14 01:09:12,002][33201] Updated weights for policy 0, policy_version 3100 (0.0008) [2023-10-14 01:09:12,799][33226] Updated weights for policy 1, policy_version 3110 (0.0010) [2023-10-14 01:09:13,163][33226] Updated weights for policy 1, policy_version 3120 (0.0009) [2023-10-14 01:09:13,535][33226] Updated weights for policy 1, policy_version 3130 (0.0008) [2023-10-14 01:09:14,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 6389760. Throughput: 0: 1764.7, 1: 1789.1. Samples: 1599784. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:09:14,558][31953] Avg episode reward: [(0, '19.960'), (1, '19.040')] [2023-10-14 01:09:14,560][32837] Saving new best policy, reward=19.960! [2023-10-14 01:09:14,560][32895] Saving new best policy, reward=19.040! [2023-10-14 01:09:15,746][33201] Updated weights for policy 0, policy_version 3110 (0.0007) [2023-10-14 01:09:16,119][33201] Updated weights for policy 0, policy_version 3120 (0.0007) [2023-10-14 01:09:16,493][33201] Updated weights for policy 0, policy_version 3130 (0.0010) [2023-10-14 01:09:17,302][33226] Updated weights for policy 1, policy_version 3140 (0.0008) [2023-10-14 01:09:17,674][33226] Updated weights for policy 1, policy_version 3150 (0.0008) [2023-10-14 01:09:18,040][33226] Updated weights for policy 1, policy_version 3160 (0.0007) [2023-10-14 01:09:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 6455296. Throughput: 0: 1765.5, 1: 1778.9. 
Samples: 1621166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:09:19,558][31953] Avg episode reward: [(0, '19.950'), (1, '19.030')] [2023-10-14 01:09:20,178][33201] Updated weights for policy 0, policy_version 3140 (0.0010) [2023-10-14 01:09:20,548][33201] Updated weights for policy 0, policy_version 3150 (0.0009) [2023-10-14 01:09:20,918][33201] Updated weights for policy 0, policy_version 3160 (0.0010) [2023-10-14 01:09:21,666][33226] Updated weights for policy 1, policy_version 3170 (0.0008) [2023-10-14 01:09:22,038][33226] Updated weights for policy 1, policy_version 3180 (0.0010) [2023-10-14 01:09:22,409][33226] Updated weights for policy 1, policy_version 3190 (0.0008) [2023-10-14 01:09:22,785][33226] Updated weights for policy 1, policy_version 3200 (0.0008) [2023-10-14 01:09:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 6520832. Throughput: 0: 1776.4, 1: 1768.2. Samples: 1642712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:09:24,558][31953] Avg episode reward: [(0, '20.000'), (1, '19.070')] [2023-10-14 01:09:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000003200_3276800.pth... [2023-10-14 01:09:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000003168_3244032.pth... [2023-10-14 01:09:24,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000001536_1572864.pth [2023-10-14 01:09:24,601][32895] Saving new best policy, reward=19.070! [2023-10-14 01:09:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000001536_1572864.pth [2023-10-14 01:09:24,611][32837] Saving new best policy, reward=20.000! 
[2023-10-14 01:09:24,898][33201] Updated weights for policy 0, policy_version 3170 (0.0010) [2023-10-14 01:09:25,287][33201] Updated weights for policy 0, policy_version 3180 (0.0009) [2023-10-14 01:09:25,656][33201] Updated weights for policy 0, policy_version 3190 (0.0008) [2023-10-14 01:09:26,034][33201] Updated weights for policy 0, policy_version 3200 (0.0010) [2023-10-14 01:09:26,600][33226] Updated weights for policy 1, policy_version 3210 (0.0007) [2023-10-14 01:09:26,967][33226] Updated weights for policy 1, policy_version 3220 (0.0009) [2023-10-14 01:09:27,337][33226] Updated weights for policy 1, policy_version 3230 (0.0009) [2023-10-14 01:09:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 6586368. Throughput: 0: 1756.9, 1: 1784.1. Samples: 1652842. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 01:09:29,557][31953] Avg episode reward: [(0, '20.010'), (1, '19.160')] [2023-10-14 01:09:29,558][32895] Saving new best policy, reward=19.160! [2023-10-14 01:09:29,783][33201] Updated weights for policy 0, policy_version 3210 (0.0007) [2023-10-14 01:09:30,153][33201] Updated weights for policy 0, policy_version 3220 (0.0007) [2023-10-14 01:09:30,527][33201] Updated weights for policy 0, policy_version 3230 (0.0008) [2023-10-14 01:09:30,601][32837] Saving new best policy, reward=20.010! [2023-10-14 01:09:31,119][33226] Updated weights for policy 1, policy_version 3240 (0.0009) [2023-10-14 01:09:31,483][33226] Updated weights for policy 1, policy_version 3250 (0.0008) [2023-10-14 01:09:31,859][33226] Updated weights for policy 1, policy_version 3260 (0.0008) [2023-10-14 01:09:34,442][33201] Updated weights for policy 0, policy_version 3240 (0.0008) [2023-10-14 01:09:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 6651904. Throughput: 0: 1765.6, 1: 1771.5. Samples: 1674274. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 01:09:34,557][31953] Avg episode reward: [(0, '20.030'), (1, '19.140')] [2023-10-14 01:09:34,818][33201] Updated weights for policy 0, policy_version 3250 (0.0007) [2023-10-14 01:09:35,191][33201] Updated weights for policy 0, policy_version 3260 (0.0008) [2023-10-14 01:09:35,338][32837] Saving new best policy, reward=20.030! [2023-10-14 01:09:35,532][33226] Updated weights for policy 1, policy_version 3270 (0.0009) [2023-10-14 01:09:35,901][33226] Updated weights for policy 1, policy_version 3280 (0.0010) [2023-10-14 01:09:36,275][33226] Updated weights for policy 1, policy_version 3290 (0.0008) [2023-10-14 01:09:38,949][33201] Updated weights for policy 0, policy_version 3270 (0.0007) [2023-10-14 01:09:39,329][33201] Updated weights for policy 0, policy_version 3280 (0.0007) [2023-10-14 01:09:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 6717440. Throughput: 0: 1779.2, 1: 1779.4. Samples: 1696128. Policy #0 lag: (min: 19.0, avg: 27.0, max: 51.0) [2023-10-14 01:09:39,558][31953] Avg episode reward: [(0, '20.030'), (1, '19.170')] [2023-10-14 01:09:39,566][32895] Saving new best policy, reward=19.170! [2023-10-14 01:09:39,702][33201] Updated weights for policy 0, policy_version 3290 (0.0007) [2023-10-14 01:09:40,059][33226] Updated weights for policy 1, policy_version 3300 (0.0009) [2023-10-14 01:09:40,422][33226] Updated weights for policy 1, policy_version 3310 (0.0007) [2023-10-14 01:09:40,788][33226] Updated weights for policy 1, policy_version 3320 (0.0008) [2023-10-14 01:09:43,545][33201] Updated weights for policy 0, policy_version 3300 (0.0008) [2023-10-14 01:09:43,916][33201] Updated weights for policy 0, policy_version 3310 (0.0009) [2023-10-14 01:09:44,290][33201] Updated weights for policy 0, policy_version 3320 (0.0008) [2023-10-14 01:09:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 6782976. 
Throughput: 0: 1758.3, 1: 1778.6. Samples: 1705972. Policy #0 lag: (min: 19.0, avg: 27.0, max: 51.0) [2023-10-14 01:09:44,558][31953] Avg episode reward: [(0, '20.040'), (1, '19.230')] [2023-10-14 01:09:44,589][32837] Saving new best policy, reward=20.040! [2023-10-14 01:09:44,607][33226] Updated weights for policy 1, policy_version 3330 (0.0008) [2023-10-14 01:09:44,981][33226] Updated weights for policy 1, policy_version 3340 (0.0008) [2023-10-14 01:09:45,350][33226] Updated weights for policy 1, policy_version 3350 (0.0008) [2023-10-14 01:09:45,720][32895] Saving new best policy, reward=19.230! [2023-10-14 01:09:45,724][33226] Updated weights for policy 1, policy_version 3360 (0.0008) [2023-10-14 01:09:48,039][33201] Updated weights for policy 0, policy_version 3330 (0.0008) [2023-10-14 01:09:48,407][33201] Updated weights for policy 0, policy_version 3340 (0.0008) [2023-10-14 01:09:48,781][33201] Updated weights for policy 0, policy_version 3350 (0.0007) [2023-10-14 01:09:49,153][33201] Updated weights for policy 0, policy_version 3360 (0.0007) [2023-10-14 01:09:49,510][33226] Updated weights for policy 1, policy_version 3370 (0.0008) [2023-10-14 01:09:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 6881280. Throughput: 0: 1778.5, 1: 1777.7. Samples: 1727970. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:09:49,558][31953] Avg episode reward: [(0, '20.040'), (1, '19.230')] [2023-10-14 01:09:49,864][33226] Updated weights for policy 1, policy_version 3380 (0.0007) [2023-10-14 01:09:50,235][33226] Updated weights for policy 1, policy_version 3390 (0.0007) [2023-10-14 01:09:53,023][33201] Updated weights for policy 0, policy_version 3370 (0.0008) [2023-10-14 01:09:53,389][33201] Updated weights for policy 0, policy_version 3380 (0.0008) [2023-10-14 01:09:53,756][33201] Updated weights for policy 0, policy_version 3390 (0.0007) [2023-10-14 01:09:53,994][33226] Updated weights for policy 1, policy_version 3400 (0.0009) [2023-10-14 01:09:54,377][33226] Updated weights for policy 1, policy_version 3410 (0.0009) [2023-10-14 01:09:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 6946816. Throughput: 0: 1751.8, 1: 1802.1. Samples: 1748794. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 01:09:54,558][31953] Avg episode reward: [(0, '20.030'), (1, '19.250')] [2023-10-14 01:09:54,740][33226] Updated weights for policy 1, policy_version 3420 (0.0011) [2023-10-14 01:09:54,890][32895] Saving new best policy, reward=19.250! [2023-10-14 01:09:57,637][33201] Updated weights for policy 0, policy_version 3400 (0.0007) [2023-10-14 01:09:58,001][33201] Updated weights for policy 0, policy_version 3410 (0.0008) [2023-10-14 01:09:58,381][33201] Updated weights for policy 0, policy_version 3420 (0.0008) [2023-10-14 01:09:58,560][33226] Updated weights for policy 1, policy_version 3430 (0.0010) [2023-10-14 01:09:58,923][33226] Updated weights for policy 1, policy_version 3440 (0.0010) [2023-10-14 01:09:59,297][33226] Updated weights for policy 1, policy_version 3450 (0.0007) [2023-10-14 01:09:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 7045120. Throughput: 0: 1779.0, 1: 1781.3. Samples: 1759998. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 01:09:59,557][31953] Avg episode reward: [(0, '20.040'), (1, '19.260')] [2023-10-14 01:09:59,558][32895] Saving new best policy, reward=19.260! [2023-10-14 01:10:02,136][33201] Updated weights for policy 0, policy_version 3430 (0.0008) [2023-10-14 01:10:02,513][33201] Updated weights for policy 0, policy_version 3440 (0.0007) [2023-10-14 01:10:02,889][33201] Updated weights for policy 0, policy_version 3450 (0.0007) [2023-10-14 01:10:03,127][33226] Updated weights for policy 1, policy_version 3460 (0.0008) [2023-10-14 01:10:03,496][33226] Updated weights for policy 1, policy_version 3470 (0.0007) [2023-10-14 01:10:03,872][33226] Updated weights for policy 1, policy_version 3480 (0.0008) [2023-10-14 01:10:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.7, 300 sec: 14329.1). Total num frames: 7110656. Throughput: 0: 1747.6, 1: 1802.8. Samples: 1780936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:04,558][31953] Avg episode reward: [(0, '20.040'), (1, '19.270')] [2023-10-14 01:10:04,559][32895] Saving new best policy, reward=19.270! [2023-10-14 01:10:06,705][33201] Updated weights for policy 0, policy_version 3460 (0.0007) [2023-10-14 01:10:07,081][33201] Updated weights for policy 0, policy_version 3470 (0.0009) [2023-10-14 01:10:07,458][33201] Updated weights for policy 0, policy_version 3480 (0.0007) [2023-10-14 01:10:07,620][33226] Updated weights for policy 1, policy_version 3490 (0.0009) [2023-10-14 01:10:07,988][33226] Updated weights for policy 1, policy_version 3500 (0.0007) [2023-10-14 01:10:08,359][33226] Updated weights for policy 1, policy_version 3510 (0.0008) [2023-10-14 01:10:08,720][33226] Updated weights for policy 1, policy_version 3520 (0.0010) [2023-10-14 01:10:09,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 7176192. Throughput: 0: 1754.4, 1: 1779.2. Samples: 1801728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:09,558][31953] Avg episode reward: [(0, '20.050'), (1, '19.370')] [2023-10-14 01:10:09,568][32837] Saving new best policy, reward=20.050! [2023-10-14 01:10:09,568][32895] Saving new best policy, reward=19.370! [2023-10-14 01:10:11,478][33201] Updated weights for policy 0, policy_version 3490 (0.0007) [2023-10-14 01:10:11,893][33201] Updated weights for policy 0, policy_version 3500 (0.0009) [2023-10-14 01:10:12,257][33201] Updated weights for policy 0, policy_version 3510 (0.0008) [2023-10-14 01:10:12,628][33201] Updated weights for policy 0, policy_version 3520 (0.0009) [2023-10-14 01:10:12,632][33226] Updated weights for policy 1, policy_version 3530 (0.0008) [2023-10-14 01:10:13,013][33226] Updated weights for policy 1, policy_version 3540 (0.0009) [2023-10-14 01:10:13,393][33226] Updated weights for policy 1, policy_version 3550 (0.0007) [2023-10-14 01:10:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 7241728. Throughput: 0: 1768.3, 1: 1796.3. Samples: 1813248. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:14,558][31953] Avg episode reward: [(0, '20.050'), (1, '19.440')] [2023-10-14 01:10:14,559][32895] Saving new best policy, reward=19.440! [2023-10-14 01:10:16,402][33201] Updated weights for policy 0, policy_version 3530 (0.0007) [2023-10-14 01:10:16,775][33201] Updated weights for policy 0, policy_version 3540 (0.0007) [2023-10-14 01:10:17,039][33226] Updated weights for policy 1, policy_version 3560 (0.0008) [2023-10-14 01:10:17,147][33201] Updated weights for policy 0, policy_version 3550 (0.0007) [2023-10-14 01:10:17,408][33226] Updated weights for policy 1, policy_version 3570 (0.0010) [2023-10-14 01:10:17,774][33226] Updated weights for policy 1, policy_version 3580 (0.0011) [2023-10-14 01:10:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 7307264. 
Throughput: 0: 1753.9, 1: 1774.5. Samples: 1833050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:19,557][31953] Avg episode reward: [(0, '20.030'), (1, '19.460')] [2023-10-14 01:10:19,558][32895] Saving new best policy, reward=19.460! [2023-10-14 01:10:21,148][33201] Updated weights for policy 0, policy_version 3560 (0.0011) [2023-10-14 01:10:21,517][33201] Updated weights for policy 0, policy_version 3570 (0.0009) [2023-10-14 01:10:21,694][33226] Updated weights for policy 1, policy_version 3590 (0.0009) [2023-10-14 01:10:21,887][33201] Updated weights for policy 0, policy_version 3580 (0.0008) [2023-10-14 01:10:22,061][33226] Updated weights for policy 1, policy_version 3600 (0.0009) [2023-10-14 01:10:22,427][33226] Updated weights for policy 1, policy_version 3610 (0.0007) [2023-10-14 01:10:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 7372800. Throughput: 0: 1758.6, 1: 1767.7. Samples: 1854812. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:24,558][31953] Avg episode reward: [(0, '20.030'), (1, '19.580')] [2023-10-14 01:10:24,569][32895] Saving new best policy, reward=19.580! [2023-10-14 01:10:25,631][33201] Updated weights for policy 0, policy_version 3590 (0.0009) [2023-10-14 01:10:25,999][33201] Updated weights for policy 0, policy_version 3600 (0.0007) [2023-10-14 01:10:26,219][33226] Updated weights for policy 1, policy_version 3620 (0.0008) [2023-10-14 01:10:26,363][33201] Updated weights for policy 0, policy_version 3610 (0.0007) [2023-10-14 01:10:26,585][33226] Updated weights for policy 1, policy_version 3630 (0.0007) [2023-10-14 01:10:26,959][33226] Updated weights for policy 1, policy_version 3640 (0.0010) [2023-10-14 01:10:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 7438336. Throughput: 0: 1749.3, 1: 1780.1. Samples: 1864798. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:29,557][31953] Avg episode reward: [(0, '20.030'), (1, '19.610')] [2023-10-14 01:10:29,558][32895] Saving new best policy, reward=19.610! [2023-10-14 01:10:30,283][33201] Updated weights for policy 0, policy_version 3620 (0.0008) [2023-10-14 01:10:30,650][33201] Updated weights for policy 0, policy_version 3630 (0.0010) [2023-10-14 01:10:30,780][33226] Updated weights for policy 1, policy_version 3650 (0.0011) [2023-10-14 01:10:31,021][33201] Updated weights for policy 0, policy_version 3640 (0.0008) [2023-10-14 01:10:31,144][33226] Updated weights for policy 1, policy_version 3660 (0.0009) [2023-10-14 01:10:31,511][33226] Updated weights for policy 1, policy_version 3670 (0.0008) [2023-10-14 01:10:31,881][33226] Updated weights for policy 1, policy_version 3680 (0.0010) [2023-10-14 01:10:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 7503872. Throughput: 0: 1753.2, 1: 1770.8. Samples: 1886546. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:34,558][31953] Avg episode reward: [(0, '19.910'), (1, '19.670')] [2023-10-14 01:10:34,558][32895] Saving new best policy, reward=19.670! [2023-10-14 01:10:34,832][33201] Updated weights for policy 0, policy_version 3650 (0.0008) [2023-10-14 01:10:35,202][33201] Updated weights for policy 0, policy_version 3660 (0.0008) [2023-10-14 01:10:35,577][33201] Updated weights for policy 0, policy_version 3670 (0.0008) [2023-10-14 01:10:35,678][33226] Updated weights for policy 1, policy_version 3690 (0.0009) [2023-10-14 01:10:35,941][33201] Updated weights for policy 0, policy_version 3680 (0.0009) [2023-10-14 01:10:36,050][33226] Updated weights for policy 1, policy_version 3700 (0.0010) [2023-10-14 01:10:36,424][33226] Updated weights for policy 1, policy_version 3710 (0.0010) [2023-10-14 01:10:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 7569408. 
Throughput: 0: 1780.6, 1: 1767.4. Samples: 1908454. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:10:39,558][31953] Avg episode reward: [(0, '19.900'), (1, '19.720')] [2023-10-14 01:10:39,565][32895] Saving new best policy, reward=19.720! [2023-10-14 01:10:39,760][33201] Updated weights for policy 0, policy_version 3690 (0.0007) [2023-10-14 01:10:40,133][33201] Updated weights for policy 0, policy_version 3700 (0.0008) [2023-10-14 01:10:40,139][33226] Updated weights for policy 1, policy_version 3720 (0.0008) [2023-10-14 01:10:40,503][33226] Updated weights for policy 1, policy_version 3730 (0.0010) [2023-10-14 01:10:40,508][33201] Updated weights for policy 0, policy_version 3710 (0.0010) [2023-10-14 01:10:40,878][33226] Updated weights for policy 1, policy_version 3740 (0.0010) [2023-10-14 01:10:44,392][33201] Updated weights for policy 0, policy_version 3720 (0.0009) [2023-10-14 01:10:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 7634944. Throughput: 0: 1746.6, 1: 1766.3. Samples: 1918082. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 01:10:44,558][31953] Avg episode reward: [(0, '19.900'), (1, '19.770')] [2023-10-14 01:10:44,604][33226] Updated weights for policy 1, policy_version 3750 (0.0008) [2023-10-14 01:10:44,759][33201] Updated weights for policy 0, policy_version 3730 (0.0007) [2023-10-14 01:10:44,978][33226] Updated weights for policy 1, policy_version 3760 (0.0009) [2023-10-14 01:10:45,132][33201] Updated weights for policy 0, policy_version 3740 (0.0009) [2023-10-14 01:10:45,336][33226] Updated weights for policy 1, policy_version 3770 (0.0008) [2023-10-14 01:10:45,555][32895] Saving new best policy, reward=19.770! 
[2023-10-14 01:10:48,928][33201] Updated weights for policy 0, policy_version 3750 (0.0008) [2023-10-14 01:10:49,150][33226] Updated weights for policy 1, policy_version 3780 (0.0009) [2023-10-14 01:10:49,296][33201] Updated weights for policy 0, policy_version 3760 (0.0007) [2023-10-14 01:10:49,512][33226] Updated weights for policy 1, policy_version 3790 (0.0007) [2023-10-14 01:10:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 7700480. Throughput: 0: 1773.5, 1: 1766.6. Samples: 1940240. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 01:10:49,558][31953] Avg episode reward: [(0, '19.890'), (1, '19.860')] [2023-10-14 01:10:49,661][33201] Updated weights for policy 0, policy_version 3770 (0.0007) [2023-10-14 01:10:49,876][33226] Updated weights for policy 1, policy_version 3800 (0.0008) [2023-10-14 01:10:50,173][32895] Saving new best policy, reward=19.860! [2023-10-14 01:10:53,461][33201] Updated weights for policy 0, policy_version 3780 (0.0008) [2023-10-14 01:10:53,670][33226] Updated weights for policy 1, policy_version 3810 (0.0008) [2023-10-14 01:10:53,819][33201] Updated weights for policy 0, policy_version 3790 (0.0008) [2023-10-14 01:10:54,038][33226] Updated weights for policy 1, policy_version 3820 (0.0007) [2023-10-14 01:10:54,197][33201] Updated weights for policy 0, policy_version 3800 (0.0008) [2023-10-14 01:10:54,400][33226] Updated weights for policy 1, policy_version 3830 (0.0007) [2023-10-14 01:10:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 7798784. Throughput: 0: 1750.5, 1: 1790.1. Samples: 1961056. Policy #0 lag: (min: 33.0, avg: 47.1, max: 48.0) [2023-10-14 01:10:54,557][31953] Avg episode reward: [(0, '19.890'), (1, '19.890')] [2023-10-14 01:10:54,768][32895] Saving new best policy, reward=19.890! 
[2023-10-14 01:10:54,771][33226] Updated weights for policy 1, policy_version 3840 (0.0008) [2023-10-14 01:10:58,040][33201] Updated weights for policy 0, policy_version 3810 (0.0008) [2023-10-14 01:10:58,452][33201] Updated weights for policy 0, policy_version 3820 (0.0007) [2023-10-14 01:10:58,512][33226] Updated weights for policy 1, policy_version 3850 (0.0007) [2023-10-14 01:10:58,822][33201] Updated weights for policy 0, policy_version 3830 (0.0008) [2023-10-14 01:10:58,878][33226] Updated weights for policy 1, policy_version 3860 (0.0007) [2023-10-14 01:10:59,191][33201] Updated weights for policy 0, policy_version 3840 (0.0008) [2023-10-14 01:10:59,241][33226] Updated weights for policy 1, policy_version 3870 (0.0008) [2023-10-14 01:10:59,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 7897088. Throughput: 0: 1759.4, 1: 1770.9. Samples: 1972110. Policy #0 lag: (min: 29.0, avg: 39.2, max: 61.0) [2023-10-14 01:10:59,558][31953] Avg episode reward: [(0, '19.880'), (1, '19.930')] [2023-10-14 01:10:59,558][32895] Saving new best policy, reward=19.930! [2023-10-14 01:11:02,960][33226] Updated weights for policy 1, policy_version 3880 (0.0007) [2023-10-14 01:11:03,024][33201] Updated weights for policy 0, policy_version 3850 (0.0009) [2023-10-14 01:11:03,328][33226] Updated weights for policy 1, policy_version 3890 (0.0008) [2023-10-14 01:11:03,392][33201] Updated weights for policy 0, policy_version 3860 (0.0009) [2023-10-14 01:11:03,691][33226] Updated weights for policy 1, policy_version 3900 (0.0008) [2023-10-14 01:11:03,765][33201] Updated weights for policy 0, policy_version 3870 (0.0009) [2023-10-14 01:11:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 7962624. Throughput: 0: 1764.3, 1: 1798.7. Samples: 1993386. 
Policy #0 lag: (min: 29.0, avg: 39.2, max: 61.0) [2023-10-14 01:11:04,558][31953] Avg episode reward: [(0, '19.870'), (1, '20.000')] [2023-10-14 01:11:04,559][32895] Saving new best policy, reward=20.000! [2023-10-14 01:11:07,511][33226] Updated weights for policy 1, policy_version 3910 (0.0008) [2023-10-14 01:11:07,641][33201] Updated weights for policy 0, policy_version 3880 (0.0010) [2023-10-14 01:11:07,881][33226] Updated weights for policy 1, policy_version 3920 (0.0007) [2023-10-14 01:11:08,003][33201] Updated weights for policy 0, policy_version 3890 (0.0008) [2023-10-14 01:11:08,244][33226] Updated weights for policy 1, policy_version 3930 (0.0008) [2023-10-14 01:11:08,372][33201] Updated weights for policy 0, policy_version 3900 (0.0007) [2023-10-14 01:11:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 8028160. Throughput: 0: 1745.8, 1: 1780.4. Samples: 2013494. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 01:11:09,558][31953] Avg episode reward: [(0, '19.860'), (1, '20.050')] [2023-10-14 01:11:09,570][32895] Saving new best policy, reward=20.050! [2023-10-14 01:11:11,997][33226] Updated weights for policy 1, policy_version 3940 (0.0008) [2023-10-14 01:11:12,113][33201] Updated weights for policy 0, policy_version 3910 (0.0007) [2023-10-14 01:11:12,361][33226] Updated weights for policy 1, policy_version 3950 (0.0008) [2023-10-14 01:11:12,485][33201] Updated weights for policy 0, policy_version 3920 (0.0009) [2023-10-14 01:11:12,734][33226] Updated weights for policy 1, policy_version 3960 (0.0008) [2023-10-14 01:11:12,846][33201] Updated weights for policy 0, policy_version 3930 (0.0008) [2023-10-14 01:11:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 8093696. Throughput: 0: 1776.3, 1: 1801.5. Samples: 2025798. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 01:11:14,558][31953] Avg episode reward: [(0, '19.860'), (1, '20.050')] [2023-10-14 01:11:16,553][33226] Updated weights for policy 1, policy_version 3970 (0.0008) [2023-10-14 01:11:16,747][33201] Updated weights for policy 0, policy_version 3940 (0.0008) [2023-10-14 01:11:16,919][33226] Updated weights for policy 1, policy_version 3980 (0.0008) [2023-10-14 01:11:17,125][33201] Updated weights for policy 0, policy_version 3950 (0.0008) [2023-10-14 01:11:17,293][33226] Updated weights for policy 1, policy_version 3990 (0.0007) [2023-10-14 01:11:17,496][33201] Updated weights for policy 0, policy_version 3960 (0.0007) [2023-10-14 01:11:17,656][33226] Updated weights for policy 1, policy_version 4000 (0.0010) [2023-10-14 01:11:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 8159232. Throughput: 0: 1747.1, 1: 1777.9. Samples: 2045174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:19,558][31953] Avg episode reward: [(0, '19.830'), (1, '20.060')] [2023-10-14 01:11:19,560][32895] Saving new best policy, reward=20.060! [2023-10-14 01:11:21,321][33201] Updated weights for policy 0, policy_version 3970 (0.0008) [2023-10-14 01:11:21,515][33226] Updated weights for policy 1, policy_version 4010 (0.0009) [2023-10-14 01:11:21,679][33201] Updated weights for policy 0, policy_version 3980 (0.0010) [2023-10-14 01:11:21,880][33226] Updated weights for policy 1, policy_version 4020 (0.0009) [2023-10-14 01:11:22,061][33201] Updated weights for policy 0, policy_version 3990 (0.0007) [2023-10-14 01:11:22,251][33226] Updated weights for policy 1, policy_version 4030 (0.0007) [2023-10-14 01:11:22,426][33201] Updated weights for policy 0, policy_version 4000 (0.0009) [2023-10-14 01:11:24,558][31953] Fps is (10 sec: 13106.5, 60 sec: 14199.3, 300 sec: 14218.0). Total num frames: 8224768. Throughput: 0: 1751.7, 1: 1781.6. Samples: 2067452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:24,559][31953] Avg episode reward: [(0, '19.810'), (1, '20.090')] [2023-10-14 01:11:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000004032_4128768.pth... [2023-10-14 01:11:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000004000_4096000.pth... [2023-10-14 01:11:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000002368_2424832.pth [2023-10-14 01:11:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000002368_2424832.pth [2023-10-14 01:11:24,614][32895] Saving new best policy, reward=20.090! [2023-10-14 01:11:25,978][33201] Updated weights for policy 0, policy_version 4010 (0.0008) [2023-10-14 01:11:26,107][33226] Updated weights for policy 1, policy_version 4040 (0.0007) [2023-10-14 01:11:26,348][33201] Updated weights for policy 0, policy_version 4020 (0.0007) [2023-10-14 01:11:26,481][33226] Updated weights for policy 1, policy_version 4050 (0.0008) [2023-10-14 01:11:26,720][33201] Updated weights for policy 0, policy_version 4030 (0.0007) [2023-10-14 01:11:26,845][33226] Updated weights for policy 1, policy_version 4060 (0.0009) [2023-10-14 01:11:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 8290304. Throughput: 0: 1758.3, 1: 1778.1. Samples: 2077218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:29,557][31953] Avg episode reward: [(0, '19.820'), (1, '20.110')] [2023-10-14 01:11:29,558][32895] Saving new best policy, reward=20.110! 
[2023-10-14 01:11:30,525][33201] Updated weights for policy 0, policy_version 4040 (0.0008) [2023-10-14 01:11:30,688][33226] Updated weights for policy 1, policy_version 4070 (0.0007) [2023-10-14 01:11:30,895][33201] Updated weights for policy 0, policy_version 4050 (0.0007) [2023-10-14 01:11:31,052][33226] Updated weights for policy 1, policy_version 4080 (0.0007) [2023-10-14 01:11:31,262][33201] Updated weights for policy 0, policy_version 4060 (0.0007) [2023-10-14 01:11:31,426][33226] Updated weights for policy 1, policy_version 4090 (0.0007) [2023-10-14 01:11:34,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 8355840. Throughput: 0: 1759.3, 1: 1772.5. Samples: 2099172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:34,558][31953] Avg episode reward: [(0, '19.840'), (1, '20.120')] [2023-10-14 01:11:34,560][32895] Saving new best policy, reward=20.120! [2023-10-14 01:11:35,082][33201] Updated weights for policy 0, policy_version 4070 (0.0009) [2023-10-14 01:11:35,204][33226] Updated weights for policy 1, policy_version 4100 (0.0011) [2023-10-14 01:11:35,458][33201] Updated weights for policy 0, policy_version 4080 (0.0008) [2023-10-14 01:11:35,573][33226] Updated weights for policy 1, policy_version 4110 (0.0008) [2023-10-14 01:11:35,832][33201] Updated weights for policy 0, policy_version 4090 (0.0007) [2023-10-14 01:11:35,939][33226] Updated weights for policy 1, policy_version 4120 (0.0010) [2023-10-14 01:11:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 8421376. Throughput: 0: 1784.9, 1: 1774.4. Samples: 2121226. Policy #0 lag: (min: 2.0, avg: 2.8, max: 21.0) [2023-10-14 01:11:39,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.130')] [2023-10-14 01:11:39,566][32895] Saving new best policy, reward=20.130! 
[2023-10-14 01:11:39,736][33201] Updated weights for policy 0, policy_version 4100 (0.0007) [2023-10-14 01:11:39,843][33226] Updated weights for policy 1, policy_version 4130 (0.0008) [2023-10-14 01:11:40,121][33201] Updated weights for policy 0, policy_version 4110 (0.0008) [2023-10-14 01:11:40,203][33226] Updated weights for policy 1, policy_version 4140 (0.0007) [2023-10-14 01:11:40,492][33201] Updated weights for policy 0, policy_version 4120 (0.0008) [2023-10-14 01:11:40,568][33226] Updated weights for policy 1, policy_version 4150 (0.0007) [2023-10-14 01:11:40,939][33226] Updated weights for policy 1, policy_version 4160 (0.0008) [2023-10-14 01:11:44,414][33201] Updated weights for policy 0, policy_version 4130 (0.0010) [2023-10-14 01:11:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 8486912. Throughput: 0: 1764.1, 1: 1758.6. Samples: 2130634. Policy #0 lag: (min: 7.0, avg: 7.2, max: 16.0) [2023-10-14 01:11:44,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.130')] [2023-10-14 01:11:44,825][33201] Updated weights for policy 0, policy_version 4140 (0.0008) [2023-10-14 01:11:44,886][33226] Updated weights for policy 1, policy_version 4170 (0.0007) [2023-10-14 01:11:45,190][33201] Updated weights for policy 0, policy_version 4150 (0.0008) [2023-10-14 01:11:45,261][33226] Updated weights for policy 1, policy_version 4180 (0.0008) [2023-10-14 01:11:45,558][33201] Updated weights for policy 0, policy_version 4160 (0.0007) [2023-10-14 01:11:45,626][33226] Updated weights for policy 1, policy_version 4190 (0.0007) [2023-10-14 01:11:49,403][33201] Updated weights for policy 0, policy_version 4170 (0.0008) [2023-10-14 01:11:49,438][33226] Updated weights for policy 1, policy_version 4200 (0.0008) [2023-10-14 01:11:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 8552448. Throughput: 0: 1769.8, 1: 1763.0. Samples: 2152364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:49,559][31953] Avg episode reward: [(0, '19.870'), (1, '20.120')] [2023-10-14 01:11:49,776][33201] Updated weights for policy 0, policy_version 4180 (0.0009) [2023-10-14 01:11:49,812][33226] Updated weights for policy 1, policy_version 4210 (0.0007) [2023-10-14 01:11:50,151][33201] Updated weights for policy 0, policy_version 4190 (0.0010) [2023-10-14 01:11:50,180][33226] Updated weights for policy 1, policy_version 4220 (0.0007) [2023-10-14 01:11:53,797][33226] Updated weights for policy 1, policy_version 4230 (0.0008) [2023-10-14 01:11:54,023][33201] Updated weights for policy 0, policy_version 4200 (0.0009) [2023-10-14 01:11:54,166][33226] Updated weights for policy 1, policy_version 4240 (0.0008) [2023-10-14 01:11:54,399][33201] Updated weights for policy 0, policy_version 4210 (0.0008) [2023-10-14 01:11:54,529][33226] Updated weights for policy 1, policy_version 4250 (0.0009) [2023-10-14 01:11:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 8617984. Throughput: 0: 1777.9, 1: 1780.6. Samples: 2173626. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:11:54,558][31953] Avg episode reward: [(0, '19.880'), (1, '20.160')] [2023-10-14 01:11:54,745][32895] Saving new best policy, reward=20.160! 
[2023-10-14 01:11:54,767][33201] Updated weights for policy 0, policy_version 4220 (0.0008) [2023-10-14 01:11:58,394][33226] Updated weights for policy 1, policy_version 4260 (0.0008) [2023-10-14 01:11:58,655][33201] Updated weights for policy 0, policy_version 4230 (0.0008) [2023-10-14 01:11:58,758][33226] Updated weights for policy 1, policy_version 4270 (0.0008) [2023-10-14 01:11:59,021][33201] Updated weights for policy 0, policy_version 4240 (0.0008) [2023-10-14 01:11:59,123][33226] Updated weights for policy 1, policy_version 4280 (0.0008) [2023-10-14 01:11:59,396][33201] Updated weights for policy 0, policy_version 4250 (0.0008) [2023-10-14 01:11:59,557][31953] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 8716288. Throughput: 0: 1757.3, 1: 1763.2. Samples: 2184222. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 01:11:59,557][31953] Avg episode reward: [(0, '19.890'), (1, '20.190')] [2023-10-14 01:11:59,558][32895] Saving new best policy, reward=20.190! [2023-10-14 01:12:02,895][33226] Updated weights for policy 1, policy_version 4290 (0.0008) [2023-10-14 01:12:03,143][33201] Updated weights for policy 0, policy_version 4260 (0.0008) [2023-10-14 01:12:03,263][33226] Updated weights for policy 1, policy_version 4300 (0.0007) [2023-10-14 01:12:03,511][33201] Updated weights for policy 0, policy_version 4270 (0.0008) [2023-10-14 01:12:03,628][33226] Updated weights for policy 1, policy_version 4310 (0.0007) [2023-10-14 01:12:03,887][33201] Updated weights for policy 0, policy_version 4280 (0.0008) [2023-10-14 01:12:04,004][33226] Updated weights for policy 1, policy_version 4320 (0.0007) [2023-10-14 01:12:04,557][31953] Fps is (10 sec: 19661.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 8814592. Throughput: 0: 1782.7, 1: 1787.7. Samples: 2205842. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:04,558][31953] Avg episode reward: [(0, '19.910'), (1, '20.220')] [2023-10-14 01:12:04,559][32895] Saving new best policy, reward=20.220! [2023-10-14 01:12:07,573][33201] Updated weights for policy 0, policy_version 4290 (0.0007) [2023-10-14 01:12:07,670][33226] Updated weights for policy 1, policy_version 4330 (0.0010) [2023-10-14 01:12:07,938][33201] Updated weights for policy 0, policy_version 4300 (0.0007) [2023-10-14 01:12:08,042][33226] Updated weights for policy 1, policy_version 4340 (0.0008) [2023-10-14 01:12:08,306][33201] Updated weights for policy 0, policy_version 4310 (0.0008) [2023-10-14 01:12:08,407][33226] Updated weights for policy 1, policy_version 4350 (0.0007) [2023-10-14 01:12:08,676][33201] Updated weights for policy 0, policy_version 4320 (0.0008) [2023-10-14 01:12:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 8880128. Throughput: 0: 1748.7, 1: 1762.2. Samples: 2225440. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:09,558][31953] Avg episode reward: [(0, '19.900'), (1, '20.240')] [2023-10-14 01:12:09,566][32895] Saving new best policy, reward=20.240! [2023-10-14 01:12:12,297][33226] Updated weights for policy 1, policy_version 4360 (0.0008) [2023-10-14 01:12:12,455][33201] Updated weights for policy 0, policy_version 4330 (0.0009) [2023-10-14 01:12:12,669][33226] Updated weights for policy 1, policy_version 4370 (0.0007) [2023-10-14 01:12:12,821][33201] Updated weights for policy 0, policy_version 4340 (0.0009) [2023-10-14 01:12:13,036][33226] Updated weights for policy 1, policy_version 4380 (0.0008) [2023-10-14 01:12:13,186][33201] Updated weights for policy 0, policy_version 4350 (0.0009) [2023-10-14 01:12:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 8945664. Throughput: 0: 1780.9, 1: 1793.6. Samples: 2238070. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:14,558][31953] Avg episode reward: [(0, '20.020'), (1, '20.260')] [2023-10-14 01:12:14,559][32895] Saving new best policy, reward=20.260! [2023-10-14 01:12:16,826][33226] Updated weights for policy 1, policy_version 4390 (0.0008) [2023-10-14 01:12:17,016][33201] Updated weights for policy 0, policy_version 4360 (0.0008) [2023-10-14 01:12:17,198][33226] Updated weights for policy 1, policy_version 4400 (0.0009) [2023-10-14 01:12:17,391][33201] Updated weights for policy 0, policy_version 4370 (0.0008) [2023-10-14 01:12:17,569][33226] Updated weights for policy 1, policy_version 4410 (0.0008) [2023-10-14 01:12:17,769][33201] Updated weights for policy 0, policy_version 4380 (0.0007) [2023-10-14 01:12:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 9011200. Throughput: 0: 1748.9, 1: 1763.9. Samples: 2257246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:19,558][31953] Avg episode reward: [(0, '20.010'), (1, '20.250')] [2023-10-14 01:12:21,450][33226] Updated weights for policy 1, policy_version 4420 (0.0008) [2023-10-14 01:12:21,744][33201] Updated weights for policy 0, policy_version 4390 (0.0010) [2023-10-14 01:12:21,819][33226] Updated weights for policy 1, policy_version 4430 (0.0008) [2023-10-14 01:12:22,122][33201] Updated weights for policy 0, policy_version 4400 (0.0010) [2023-10-14 01:12:22,182][33226] Updated weights for policy 1, policy_version 4440 (0.0008) [2023-10-14 01:12:22,486][33201] Updated weights for policy 0, policy_version 4410 (0.0008) [2023-10-14 01:12:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 9076736. Throughput: 0: 1740.6, 1: 1766.0. Samples: 2279024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:24,558][31953] Avg episode reward: [(0, '20.010'), (1, '20.260')] [2023-10-14 01:12:26,108][33226] Updated weights for policy 1, policy_version 4450 (0.0008) [2023-10-14 01:12:26,440][33201] Updated weights for policy 0, policy_version 4420 (0.0009) [2023-10-14 01:12:26,476][33226] Updated weights for policy 1, policy_version 4460 (0.0007) [2023-10-14 01:12:26,809][33201] Updated weights for policy 0, policy_version 4430 (0.0007) [2023-10-14 01:12:26,847][33226] Updated weights for policy 1, policy_version 4470 (0.0007) [2023-10-14 01:12:27,182][33201] Updated weights for policy 0, policy_version 4440 (0.0008) [2023-10-14 01:12:27,220][33226] Updated weights for policy 1, policy_version 4480 (0.0009) [2023-10-14 01:12:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 9142272. Throughput: 0: 1751.5, 1: 1773.1. Samples: 2289240. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:29,558][31953] Avg episode reward: [(0, '20.050'), (1, '20.270')] [2023-10-14 01:12:29,560][32895] Saving new best policy, reward=20.270! [2023-10-14 01:12:31,068][33226] Updated weights for policy 1, policy_version 4490 (0.0009) [2023-10-14 01:12:31,078][33201] Updated weights for policy 0, policy_version 4450 (0.0010) [2023-10-14 01:12:31,434][33226] Updated weights for policy 1, policy_version 4500 (0.0008) [2023-10-14 01:12:31,446][33201] Updated weights for policy 0, policy_version 4460 (0.0009) [2023-10-14 01:12:31,816][33226] Updated weights for policy 1, policy_version 4510 (0.0008) [2023-10-14 01:12:31,821][33201] Updated weights for policy 0, policy_version 4470 (0.0008) [2023-10-14 01:12:32,202][33201] Updated weights for policy 0, policy_version 4480 (0.0007) [2023-10-14 01:12:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 9207808. Throughput: 0: 1739.8, 1: 1763.7. Samples: 2310022. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:34,558][31953] Avg episode reward: [(0, '20.060'), (1, '20.290')] [2023-10-14 01:12:34,559][32837] Saving new best policy, reward=20.060! [2023-10-14 01:12:34,559][32895] Saving new best policy, reward=20.290! [2023-10-14 01:12:35,685][33226] Updated weights for policy 1, policy_version 4520 (0.0009) [2023-10-14 01:12:36,068][33226] Updated weights for policy 1, policy_version 4530 (0.0008) [2023-10-14 01:12:36,136][33201] Updated weights for policy 0, policy_version 4490 (0.0008) [2023-10-14 01:12:36,434][33226] Updated weights for policy 1, policy_version 4540 (0.0009) [2023-10-14 01:12:36,509][33201] Updated weights for policy 0, policy_version 4500 (0.0009) [2023-10-14 01:12:36,876][33201] Updated weights for policy 0, policy_version 4510 (0.0011) [2023-10-14 01:12:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 9273344. Throughput: 0: 1750.1, 1: 1760.5. Samples: 2331602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:12:39,558][31953] Avg episode reward: [(0, '20.060'), (1, '20.340')] [2023-10-14 01:12:39,566][32895] Saving new best policy, reward=20.340! [2023-10-14 01:12:40,264][33226] Updated weights for policy 1, policy_version 4550 (0.0011) [2023-10-14 01:12:40,638][33226] Updated weights for policy 1, policy_version 4560 (0.0007) [2023-10-14 01:12:40,681][33201] Updated weights for policy 0, policy_version 4520 (0.0009) [2023-10-14 01:12:41,002][33226] Updated weights for policy 1, policy_version 4570 (0.0007) [2023-10-14 01:12:41,058][33201] Updated weights for policy 0, policy_version 4530 (0.0008) [2023-10-14 01:12:41,431][33201] Updated weights for policy 0, policy_version 4540 (0.0009) [2023-10-14 01:12:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 9338880. Throughput: 0: 1740.8, 1: 1749.3. Samples: 2341278. 
Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-14 01:12:44,558][31953] Avg episode reward: [(0, '20.080'), (1, '20.350')] [2023-10-14 01:12:44,558][32837] Saving new best policy, reward=20.080! [2023-10-14 01:12:44,670][33226] Updated weights for policy 1, policy_version 4580 (0.0009) [2023-10-14 01:12:45,042][33226] Updated weights for policy 1, policy_version 4590 (0.0008) [2023-10-14 01:12:45,204][33201] Updated weights for policy 0, policy_version 4550 (0.0009) [2023-10-14 01:12:45,403][33226] Updated weights for policy 1, policy_version 4600 (0.0008) [2023-10-14 01:12:45,582][33201] Updated weights for policy 0, policy_version 4560 (0.0008) [2023-10-14 01:12:45,700][32895] Saving new best policy, reward=20.350! [2023-10-14 01:12:45,945][33201] Updated weights for policy 0, policy_version 4570 (0.0008) [2023-10-14 01:12:49,160][33226] Updated weights for policy 1, policy_version 4610 (0.0009) [2023-10-14 01:12:49,540][33226] Updated weights for policy 1, policy_version 4620 (0.0008) [2023-10-14 01:12:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 9404416. Throughput: 0: 1743.4, 1: 1767.2. Samples: 2363820. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-14 01:12:49,557][31953] Avg episode reward: [(0, '20.090'), (1, '20.350')] [2023-10-14 01:12:49,558][32837] Saving new best policy, reward=20.090! 
[2023-10-14 01:12:49,878][33201] Updated weights for policy 0, policy_version 4580 (0.0009) [2023-10-14 01:12:49,909][33226] Updated weights for policy 1, policy_version 4630 (0.0009) [2023-10-14 01:12:50,249][33201] Updated weights for policy 0, policy_version 4590 (0.0009) [2023-10-14 01:12:50,278][33226] Updated weights for policy 1, policy_version 4640 (0.0008) [2023-10-14 01:12:50,626][33201] Updated weights for policy 0, policy_version 4600 (0.0008) [2023-10-14 01:12:54,035][33226] Updated weights for policy 1, policy_version 4650 (0.0008) [2023-10-14 01:12:54,397][33201] Updated weights for policy 0, policy_version 4610 (0.0009) [2023-10-14 01:12:54,407][33226] Updated weights for policy 1, policy_version 4660 (0.0009) [2023-10-14 01:12:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 9469952. Throughput: 0: 1770.6, 1: 1782.8. Samples: 2385340. Policy #0 lag: (min: 10.0, avg: 17.0, max: 42.0) [2023-10-14 01:12:54,558][31953] Avg episode reward: [(0, '20.100'), (1, '20.360')] [2023-10-14 01:12:54,770][33226] Updated weights for policy 1, policy_version 4670 (0.0008) [2023-10-14 01:12:54,780][33201] Updated weights for policy 0, policy_version 4620 (0.0009) [2023-10-14 01:12:54,846][32895] Saving new best policy, reward=20.360! [2023-10-14 01:12:55,152][33201] Updated weights for policy 0, policy_version 4630 (0.0009) [2023-10-14 01:12:55,513][32837] Saving new best policy, reward=20.100! 
[2023-10-14 01:12:55,520][33201] Updated weights for policy 0, policy_version 4640 (0.0009) [2023-10-14 01:12:58,491][33226] Updated weights for policy 1, policy_version 4680 (0.0007) [2023-10-14 01:12:58,859][33226] Updated weights for policy 1, policy_version 4690 (0.0007) [2023-10-14 01:12:59,231][33226] Updated weights for policy 1, policy_version 4700 (0.0007) [2023-10-14 01:12:59,403][33201] Updated weights for policy 0, policy_version 4650 (0.0008) [2023-10-14 01:12:59,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 9568256. Throughput: 0: 1734.1, 1: 1758.2. Samples: 2395224. Policy #0 lag: (min: 10.0, avg: 17.0, max: 42.0) [2023-10-14 01:12:59,558][31953] Avg episode reward: [(0, '20.140'), (1, '20.340')] [2023-10-14 01:12:59,786][33201] Updated weights for policy 0, policy_version 4660 (0.0008) [2023-10-14 01:13:00,165][33201] Updated weights for policy 0, policy_version 4670 (0.0009) [2023-10-14 01:13:00,237][32837] Saving new best policy, reward=20.140! [2023-10-14 01:13:03,169][33226] Updated weights for policy 1, policy_version 4710 (0.0008) [2023-10-14 01:13:03,546][33226] Updated weights for policy 1, policy_version 4720 (0.0009) [2023-10-14 01:13:03,917][33226] Updated weights for policy 1, policy_version 4730 (0.0007) [2023-10-14 01:13:03,965][33201] Updated weights for policy 0, policy_version 4680 (0.0008) [2023-10-14 01:13:04,350][33201] Updated weights for policy 0, policy_version 4690 (0.0008) [2023-10-14 01:13:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 9633792. Throughput: 0: 1762.9, 1: 1788.2. Samples: 2417046. 
Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-14 01:13:04,558][31953] Avg episode reward: [(0, '20.140'), (1, '20.340')] [2023-10-14 01:13:04,715][33201] Updated weights for policy 0, policy_version 4700 (0.0010) [2023-10-14 01:13:07,681][33226] Updated weights for policy 1, policy_version 4740 (0.0007) [2023-10-14 01:13:08,046][33226] Updated weights for policy 1, policy_version 4750 (0.0008) [2023-10-14 01:13:08,409][33226] Updated weights for policy 1, policy_version 4760 (0.0009) [2023-10-14 01:13:08,741][33201] Updated weights for policy 0, policy_version 4710 (0.0008) [2023-10-14 01:13:09,106][33201] Updated weights for policy 0, policy_version 4720 (0.0010) [2023-10-14 01:13:09,487][33201] Updated weights for policy 0, policy_version 4730 (0.0010) [2023-10-14 01:13:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.2, 300 sec: 14106.9). Total num frames: 9699328. Throughput: 0: 1749.7, 1: 1757.1. Samples: 2436832. Policy #0 lag: (min: 31.0, avg: 40.8, max: 63.0) [2023-10-14 01:13:09,559][31953] Avg episode reward: [(0, '20.150'), (1, '20.330')] [2023-10-14 01:13:09,705][32837] Saving new best policy, reward=20.150! [2023-10-14 01:13:12,276][33226] Updated weights for policy 1, policy_version 4770 (0.0009) [2023-10-14 01:13:12,643][33226] Updated weights for policy 1, policy_version 4780 (0.0011) [2023-10-14 01:13:13,013][33226] Updated weights for policy 1, policy_version 4790 (0.0009) [2023-10-14 01:13:13,375][33226] Updated weights for policy 1, policy_version 4800 (0.0008) [2023-10-14 01:13:13,534][33201] Updated weights for policy 0, policy_version 4740 (0.0010) [2023-10-14 01:13:13,906][33201] Updated weights for policy 0, policy_version 4750 (0.0008) [2023-10-14 01:13:14,270][33201] Updated weights for policy 0, policy_version 4760 (0.0007) [2023-10-14 01:13:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 9764864. Throughput: 0: 1751.5, 1: 1783.7. Samples: 2448324. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:13:14,557][31953] Avg episode reward: [(0, '20.160'), (1, '20.330')] [2023-10-14 01:13:14,564][32837] Saving new best policy, reward=20.160! [2023-10-14 01:13:17,315][33226] Updated weights for policy 1, policy_version 4810 (0.0009) [2023-10-14 01:13:17,687][33226] Updated weights for policy 1, policy_version 4820 (0.0010) [2023-10-14 01:13:18,059][33226] Updated weights for policy 1, policy_version 4830 (0.0009) [2023-10-14 01:13:18,104][33201] Updated weights for policy 0, policy_version 4770 (0.0009) [2023-10-14 01:13:18,477][33201] Updated weights for policy 0, policy_version 4780 (0.0010) [2023-10-14 01:13:18,850][33201] Updated weights for policy 0, policy_version 4790 (0.0008) [2023-10-14 01:13:19,224][33201] Updated weights for policy 0, policy_version 4800 (0.0010) [2023-10-14 01:13:19,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 9863168. Throughput: 0: 1765.3, 1: 1767.8. Samples: 2469014. Policy #0 lag: (min: 7.0, avg: 14.8, max: 39.0) [2023-10-14 01:13:19,559][31953] Avg episode reward: [(0, '20.190'), (1, '20.280')] [2023-10-14 01:13:19,560][32837] Saving new best policy, reward=20.190! [2023-10-14 01:13:22,025][33226] Updated weights for policy 1, policy_version 4840 (0.0009) [2023-10-14 01:13:22,398][33226] Updated weights for policy 1, policy_version 4850 (0.0008) [2023-10-14 01:13:22,773][33226] Updated weights for policy 1, policy_version 4860 (0.0009) [2023-10-14 01:13:23,224][33201] Updated weights for policy 0, policy_version 4810 (0.0008) [2023-10-14 01:13:23,604][33201] Updated weights for policy 0, policy_version 4820 (0.0008) [2023-10-14 01:13:23,980][33201] Updated weights for policy 0, policy_version 4830 (0.0008) [2023-10-14 01:13:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 9928704. Throughput: 0: 1733.3, 1: 1764.6. Samples: 2489008. 
Policy #0 lag: (min: 7.0, avg: 14.8, max: 39.0) [2023-10-14 01:13:24,558][31953] Avg episode reward: [(0, '20.190'), (1, '20.260')] [2023-10-14 01:13:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000004864_4980736.pth... [2023-10-14 01:13:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000004832_4947968.pth... [2023-10-14 01:13:24,604][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000003200_3276800.pth [2023-10-14 01:13:24,606][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000003168_3244032.pth [2023-10-14 01:13:26,335][33226] Updated weights for policy 1, policy_version 4870 (0.0008) [2023-10-14 01:13:26,702][33226] Updated weights for policy 1, policy_version 4880 (0.0012) [2023-10-14 01:13:27,073][33226] Updated weights for policy 1, policy_version 4890 (0.0010) [2023-10-14 01:13:27,791][33201] Updated weights for policy 0, policy_version 4840 (0.0009) [2023-10-14 01:13:28,171][33201] Updated weights for policy 0, policy_version 4850 (0.0009) [2023-10-14 01:13:28,548][33201] Updated weights for policy 0, policy_version 4860 (0.0009) [2023-10-14 01:13:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 9994240. Throughput: 0: 1761.2, 1: 1774.8. Samples: 2500396. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:13:29,558][31953] Avg episode reward: [(0, '20.200'), (1, '20.240')] [2023-10-14 01:13:29,558][32837] Saving new best policy, reward=20.200! 
[2023-10-14 01:13:30,951][33226] Updated weights for policy 1, policy_version 4900 (0.0010) [2023-10-14 01:13:31,314][33226] Updated weights for policy 1, policy_version 4910 (0.0010) [2023-10-14 01:13:31,684][33226] Updated weights for policy 1, policy_version 4920 (0.0010) [2023-10-14 01:13:32,311][33201] Updated weights for policy 0, policy_version 4870 (0.0010) [2023-10-14 01:13:32,681][33201] Updated weights for policy 0, policy_version 4880 (0.0009) [2023-10-14 01:13:33,043][33201] Updated weights for policy 0, policy_version 4890 (0.0007) [2023-10-14 01:13:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 10059776. Throughput: 0: 1736.0, 1: 1754.8. Samples: 2520902. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:13:34,558][31953] Avg episode reward: [(0, '20.230'), (1, '20.280')] [2023-10-14 01:13:34,559][32837] Saving new best policy, reward=20.230! [2023-10-14 01:13:35,403][33226] Updated weights for policy 1, policy_version 4930 (0.0010) [2023-10-14 01:13:35,774][33226] Updated weights for policy 1, policy_version 4940 (0.0009) [2023-10-14 01:13:36,140][33226] Updated weights for policy 1, policy_version 4950 (0.0010) [2023-10-14 01:13:36,514][33226] Updated weights for policy 1, policy_version 4960 (0.0009) [2023-10-14 01:13:36,848][33201] Updated weights for policy 0, policy_version 4900 (0.0008) [2023-10-14 01:13:37,214][33201] Updated weights for policy 0, policy_version 4910 (0.0008) [2023-10-14 01:13:37,591][33201] Updated weights for policy 0, policy_version 4920 (0.0010) [2023-10-14 01:13:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 10125312. Throughput: 0: 1731.4, 1: 1768.5. Samples: 2542838. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:13:39,558][31953] Avg episode reward: [(0, '20.240'), (1, '20.270')] [2023-10-14 01:13:39,567][32837] Saving new best policy, reward=20.240! 
[2023-10-14 01:13:40,386][33226] Updated weights for policy 1, policy_version 4970 (0.0010) [2023-10-14 01:13:40,742][33226] Updated weights for policy 1, policy_version 4980 (0.0009) [2023-10-14 01:13:41,107][33226] Updated weights for policy 1, policy_version 4990 (0.0009) [2023-10-14 01:13:41,414][33201] Updated weights for policy 0, policy_version 4930 (0.0007) [2023-10-14 01:13:41,773][33201] Updated weights for policy 0, policy_version 4940 (0.0010) [2023-10-14 01:13:42,142][33201] Updated weights for policy 0, policy_version 4950 (0.0011) [2023-10-14 01:13:42,518][33201] Updated weights for policy 0, policy_version 4960 (0.0008) [2023-10-14 01:13:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 10190848. Throughput: 0: 1743.3, 1: 1759.1. Samples: 2552834. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:13:44,558][31953] Avg episode reward: [(0, '20.260'), (1, '20.270')] [2023-10-14 01:13:44,560][32837] Saving new best policy, reward=20.260! [2023-10-14 01:13:44,947][33226] Updated weights for policy 1, policy_version 5000 (0.0008) [2023-10-14 01:13:45,317][33226] Updated weights for policy 1, policy_version 5010 (0.0010) [2023-10-14 01:13:45,683][33226] Updated weights for policy 1, policy_version 5020 (0.0009) [2023-10-14 01:13:46,509][33201] Updated weights for policy 0, policy_version 4970 (0.0009) [2023-10-14 01:13:46,878][33201] Updated weights for policy 0, policy_version 4980 (0.0009) [2023-10-14 01:13:47,262][33201] Updated weights for policy 0, policy_version 4990 (0.0009) [2023-10-14 01:13:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 10256384. Throughput: 0: 1728.0, 1: 1758.7. Samples: 2573948. 
Policy #0 lag: (min: 8.0, avg: 24.9, max: 40.0) [2023-10-14 01:13:49,558][31953] Avg episode reward: [(0, '20.260'), (1, '20.250')] [2023-10-14 01:13:49,602][33226] Updated weights for policy 1, policy_version 5030 (0.0009) [2023-10-14 01:13:49,977][33226] Updated weights for policy 1, policy_version 5040 (0.0009) [2023-10-14 01:13:50,339][33226] Updated weights for policy 1, policy_version 5050 (0.0008) [2023-10-14 01:13:51,145][33201] Updated weights for policy 0, policy_version 5000 (0.0009) [2023-10-14 01:13:51,527][33201] Updated weights for policy 0, policy_version 5010 (0.0009) [2023-10-14 01:13:51,888][33201] Updated weights for policy 0, policy_version 5020 (0.0008) [2023-10-14 01:13:54,043][33226] Updated weights for policy 1, policy_version 5060 (0.0008) [2023-10-14 01:13:54,413][33226] Updated weights for policy 1, policy_version 5070 (0.0007) [2023-10-14 01:13:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 10321920. Throughput: 0: 1745.1, 1: 1787.2. Samples: 2595782. Policy #0 lag: (min: 8.0, avg: 24.9, max: 40.0) [2023-10-14 01:13:54,557][31953] Avg episode reward: [(0, '20.230'), (1, '20.290')] [2023-10-14 01:13:54,781][33226] Updated weights for policy 1, policy_version 5080 (0.0007) [2023-10-14 01:13:55,654][33201] Updated weights for policy 0, policy_version 5030 (0.0010) [2023-10-14 01:13:56,024][33201] Updated weights for policy 0, policy_version 5040 (0.0010) [2023-10-14 01:13:56,402][33201] Updated weights for policy 0, policy_version 5050 (0.0009) [2023-10-14 01:13:58,782][33226] Updated weights for policy 1, policy_version 5090 (0.0009) [2023-10-14 01:13:59,146][33226] Updated weights for policy 1, policy_version 5100 (0.0008) [2023-10-14 01:13:59,520][33226] Updated weights for policy 1, policy_version 5110 (0.0011) [2023-10-14 01:13:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 10387456. Throughput: 0: 1734.9, 1: 1759.1. Samples: 2605552. 
Policy #0 lag: (min: 27.0, avg: 31.8, max: 59.0) [2023-10-14 01:13:59,558][31953] Avg episode reward: [(0, '20.240'), (1, '20.290')] [2023-10-14 01:13:59,894][33226] Updated weights for policy 1, policy_version 5120 (0.0011) [2023-10-14 01:14:00,246][33201] Updated weights for policy 0, policy_version 5060 (0.0010) [2023-10-14 01:14:00,615][33201] Updated weights for policy 0, policy_version 5070 (0.0008) [2023-10-14 01:14:00,987][33201] Updated weights for policy 0, policy_version 5080 (0.0007) [2023-10-14 01:14:03,580][33226] Updated weights for policy 1, policy_version 5130 (0.0009) [2023-10-14 01:14:03,935][33226] Updated weights for policy 1, policy_version 5140 (0.0009) [2023-10-14 01:14:04,296][33226] Updated weights for policy 1, policy_version 5150 (0.0009) [2023-10-14 01:14:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 10485760. Throughput: 0: 1736.1, 1: 1789.3. Samples: 2627658. Policy #0 lag: (min: 27.0, avg: 31.8, max: 59.0) [2023-10-14 01:14:04,558][31953] Avg episode reward: [(0, '20.220'), (1, '20.300')] [2023-10-14 01:14:04,765][33201] Updated weights for policy 0, policy_version 5090 (0.0007) [2023-10-14 01:14:05,128][33201] Updated weights for policy 0, policy_version 5100 (0.0010) [2023-10-14 01:14:05,501][33201] Updated weights for policy 0, policy_version 5110 (0.0009) [2023-10-14 01:14:05,870][33201] Updated weights for policy 0, policy_version 5120 (0.0011) [2023-10-14 01:14:08,355][33226] Updated weights for policy 1, policy_version 5160 (0.0008) [2023-10-14 01:14:08,735][33226] Updated weights for policy 1, policy_version 5170 (0.0009) [2023-10-14 01:14:09,106][33226] Updated weights for policy 1, policy_version 5180 (0.0008) [2023-10-14 01:14:09,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 10551296. Throughput: 0: 1768.4, 1: 1771.9. Samples: 2648318. 
Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-14 01:14:09,558][31953] Avg episode reward: [(0, '20.230'), (1, '20.300')] [2023-10-14 01:14:09,890][33201] Updated weights for policy 0, policy_version 5130 (0.0009) [2023-10-14 01:14:10,262][33201] Updated weights for policy 0, policy_version 5140 (0.0007) [2023-10-14 01:14:10,629][33201] Updated weights for policy 0, policy_version 5150 (0.0008) [2023-10-14 01:14:12,740][33226] Updated weights for policy 1, policy_version 5190 (0.0009) [2023-10-14 01:14:13,107][33226] Updated weights for policy 1, policy_version 5200 (0.0009) [2023-10-14 01:14:13,486][33226] Updated weights for policy 1, policy_version 5210 (0.0010) [2023-10-14 01:14:14,443][33201] Updated weights for policy 0, policy_version 5160 (0.0009) [2023-10-14 01:14:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 10616832. Throughput: 0: 1734.8, 1: 1783.3. Samples: 2658708. Policy #0 lag: (min: 31.0, avg: 42.7, max: 63.0) [2023-10-14 01:14:14,558][31953] Avg episode reward: [(0, '20.200'), (1, '20.290')] [2023-10-14 01:14:14,819][33201] Updated weights for policy 0, policy_version 5170 (0.0009) [2023-10-14 01:14:15,193][33201] Updated weights for policy 0, policy_version 5180 (0.0010) [2023-10-14 01:14:17,121][33226] Updated weights for policy 1, policy_version 5220 (0.0009) [2023-10-14 01:14:17,484][33226] Updated weights for policy 1, policy_version 5230 (0.0008) [2023-10-14 01:14:17,853][33226] Updated weights for policy 1, policy_version 5240 (0.0007) [2023-10-14 01:14:19,097][33201] Updated weights for policy 0, policy_version 5190 (0.0008) [2023-10-14 01:14:19,467][33201] Updated weights for policy 0, policy_version 5200 (0.0007) [2023-10-14 01:14:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 10682368. Throughput: 0: 1759.9, 1: 1768.0. Samples: 2679658. 
Policy #0 lag: (min: 26.0, avg: 30.4, max: 58.0) [2023-10-14 01:14:19,558][31953] Avg episode reward: [(0, '20.190'), (1, '20.290')] [2023-10-14 01:14:19,852][33201] Updated weights for policy 0, policy_version 5210 (0.0007) [2023-10-14 01:14:21,621][33226] Updated weights for policy 1, policy_version 5250 (0.0007) [2023-10-14 01:14:21,991][33226] Updated weights for policy 1, policy_version 5260 (0.0011) [2023-10-14 01:14:22,352][33226] Updated weights for policy 1, policy_version 5270 (0.0008) [2023-10-14 01:14:22,720][33226] Updated weights for policy 1, policy_version 5280 (0.0008) [2023-10-14 01:14:23,691][33201] Updated weights for policy 0, policy_version 5220 (0.0008) [2023-10-14 01:14:24,060][33201] Updated weights for policy 0, policy_version 5230 (0.0008) [2023-10-14 01:14:24,441][33201] Updated weights for policy 0, policy_version 5240 (0.0009) [2023-10-14 01:14:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 10747904. Throughput: 0: 1755.6, 1: 1761.2. Samples: 2701094. Policy #0 lag: (min: 26.0, avg: 30.4, max: 58.0) [2023-10-14 01:14:24,557][31953] Avg episode reward: [(0, '20.200'), (1, '20.300')] [2023-10-14 01:14:26,655][33226] Updated weights for policy 1, policy_version 5290 (0.0008) [2023-10-14 01:14:27,029][33226] Updated weights for policy 1, policy_version 5300 (0.0007) [2023-10-14 01:14:27,391][33226] Updated weights for policy 1, policy_version 5310 (0.0010) [2023-10-14 01:14:28,255][33201] Updated weights for policy 0, policy_version 5250 (0.0009) [2023-10-14 01:14:28,628][33201] Updated weights for policy 0, policy_version 5260 (0.0009) [2023-10-14 01:14:29,004][33201] Updated weights for policy 0, policy_version 5270 (0.0009) [2023-10-14 01:14:29,375][33201] Updated weights for policy 0, policy_version 5280 (0.0008) [2023-10-14 01:14:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 10846208. Throughput: 0: 1760.8, 1: 1775.8. Samples: 2711980. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 33.0) [2023-10-14 01:14:29,558][31953] Avg episode reward: [(0, '20.180'), (1, '20.320')] [2023-10-14 01:14:31,099][33226] Updated weights for policy 1, policy_version 5320 (0.0009) [2023-10-14 01:14:31,470][33226] Updated weights for policy 1, policy_version 5330 (0.0007) [2023-10-14 01:14:31,843][33226] Updated weights for policy 1, policy_version 5340 (0.0007) [2023-10-14 01:14:33,198][33201] Updated weights for policy 0, policy_version 5290 (0.0007) [2023-10-14 01:14:33,567][33201] Updated weights for policy 0, policy_version 5300 (0.0007) [2023-10-14 01:14:33,939][33201] Updated weights for policy 0, policy_version 5310 (0.0007) [2023-10-14 01:14:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 10911744. Throughput: 0: 1773.6, 1: 1769.7. Samples: 2733398. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:14:34,558][31953] Avg episode reward: [(0, '20.170'), (1, '20.330')] [2023-10-14 01:14:35,521][33226] Updated weights for policy 1, policy_version 5350 (0.0008) [2023-10-14 01:14:35,888][33226] Updated weights for policy 1, policy_version 5360 (0.0007) [2023-10-14 01:14:36,253][33226] Updated weights for policy 1, policy_version 5370 (0.0008) [2023-10-14 01:14:37,617][33201] Updated weights for policy 0, policy_version 5320 (0.0010) [2023-10-14 01:14:37,989][33201] Updated weights for policy 0, policy_version 5330 (0.0011) [2023-10-14 01:14:38,360][33201] Updated weights for policy 0, policy_version 5340 (0.0010) [2023-10-14 01:14:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 10977280. Throughput: 0: 1753.0, 1: 1779.2. Samples: 2754728. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:14:39,558][31953] Avg episode reward: [(0, '20.170'), (1, '20.310')] [2023-10-14 01:14:40,081][33226] Updated weights for policy 1, policy_version 5380 (0.0008) [2023-10-14 01:14:40,448][33226] Updated weights for policy 1, policy_version 5390 (0.0008) [2023-10-14 01:14:40,818][33226] Updated weights for policy 1, policy_version 5400 (0.0008) [2023-10-14 01:14:42,221][33201] Updated weights for policy 0, policy_version 5350 (0.0009) [2023-10-14 01:14:42,599][33201] Updated weights for policy 0, policy_version 5360 (0.0008) [2023-10-14 01:14:42,963][33201] Updated weights for policy 0, policy_version 5370 (0.0010) [2023-10-14 01:14:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 11042816. Throughput: 0: 1783.2, 1: 1776.9. Samples: 2765758. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 01:14:44,559][31953] Avg episode reward: [(0, '20.180'), (1, '20.310')] [2023-10-14 01:14:44,717][33226] Updated weights for policy 1, policy_version 5410 (0.0007) [2023-10-14 01:14:45,078][33226] Updated weights for policy 1, policy_version 5420 (0.0009) [2023-10-14 01:14:45,451][33226] Updated weights for policy 1, policy_version 5430 (0.0009) [2023-10-14 01:14:45,824][33226] Updated weights for policy 1, policy_version 5440 (0.0008) [2023-10-14 01:14:46,639][33201] Updated weights for policy 0, policy_version 5380 (0.0009) [2023-10-14 01:14:47,018][33201] Updated weights for policy 0, policy_version 5390 (0.0009) [2023-10-14 01:14:47,388][33201] Updated weights for policy 0, policy_version 5400 (0.0007) [2023-10-14 01:14:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 11108352. Throughput: 0: 1753.3, 1: 1770.0. Samples: 2786210. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 01:14:49,557][31953] Avg episode reward: [(0, '20.160'), (1, '20.330')] [2023-10-14 01:14:49,606][33226] Updated weights for policy 1, policy_version 5450 (0.0010) [2023-10-14 01:14:49,979][33226] Updated weights for policy 1, policy_version 5460 (0.0007) [2023-10-14 01:14:50,349][33226] Updated weights for policy 1, policy_version 5470 (0.0007) [2023-10-14 01:14:51,212][33201] Updated weights for policy 0, policy_version 5410 (0.0008) [2023-10-14 01:14:51,584][33201] Updated weights for policy 0, policy_version 5420 (0.0009) [2023-10-14 01:14:51,947][33201] Updated weights for policy 0, policy_version 5430 (0.0009) [2023-10-14 01:14:52,317][33201] Updated weights for policy 0, policy_version 5440 (0.0011) [2023-10-14 01:14:54,085][33226] Updated weights for policy 1, policy_version 5480 (0.0008) [2023-10-14 01:14:54,464][33226] Updated weights for policy 1, policy_version 5490 (0.0007) [2023-10-14 01:14:54,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 11173888. Throughput: 0: 1759.5, 1: 1798.2. Samples: 2808414. Policy #0 lag: (min: 2.0, avg: 2.1, max: 7.0) [2023-10-14 01:14:54,558][31953] Avg episode reward: [(0, '20.160'), (1, '20.320')] [2023-10-14 01:14:54,826][33226] Updated weights for policy 1, policy_version 5500 (0.0011) [2023-10-14 01:14:56,232][33201] Updated weights for policy 0, policy_version 5450 (0.0007) [2023-10-14 01:14:56,603][33201] Updated weights for policy 0, policy_version 5460 (0.0007) [2023-10-14 01:14:56,980][33201] Updated weights for policy 0, policy_version 5470 (0.0007) [2023-10-14 01:14:58,630][33226] Updated weights for policy 1, policy_version 5510 (0.0010) [2023-10-14 01:14:59,008][33226] Updated weights for policy 1, policy_version 5520 (0.0009) [2023-10-14 01:14:59,375][33226] Updated weights for policy 1, policy_version 5530 (0.0007) [2023-10-14 01:14:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13995.8). 
Total num frames: 11239424. Throughput: 0: 1764.4, 1: 1778.1. Samples: 2818118. Policy #0 lag: (min: 2.0, avg: 2.1, max: 7.0) [2023-10-14 01:14:59,558][31953] Avg episode reward: [(0, '20.170'), (1, '20.300')] [2023-10-14 01:15:00,770][33201] Updated weights for policy 0, policy_version 5480 (0.0007) [2023-10-14 01:15:01,139][33201] Updated weights for policy 0, policy_version 5490 (0.0007) [2023-10-14 01:15:01,514][33201] Updated weights for policy 0, policy_version 5500 (0.0008) [2023-10-14 01:15:03,226][33226] Updated weights for policy 1, policy_version 5540 (0.0008) [2023-10-14 01:15:03,598][33226] Updated weights for policy 1, policy_version 5550 (0.0007) [2023-10-14 01:15:03,963][33226] Updated weights for policy 1, policy_version 5560 (0.0008) [2023-10-14 01:15:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 11337728. Throughput: 0: 1764.8, 1: 1802.8. Samples: 2840200. Policy #0 lag: (min: 17.0, avg: 31.0, max: 49.0) [2023-10-14 01:15:04,557][31953] Avg episode reward: [(0, '20.170'), (1, '20.320')] [2023-10-14 01:15:05,216][33201] Updated weights for policy 0, policy_version 5510 (0.0008) [2023-10-14 01:15:05,587][33201] Updated weights for policy 0, policy_version 5520 (0.0009) [2023-10-14 01:15:05,964][33201] Updated weights for policy 0, policy_version 5530 (0.0007) [2023-10-14 01:15:07,820][33226] Updated weights for policy 1, policy_version 5570 (0.0007) [2023-10-14 01:15:08,178][33226] Updated weights for policy 1, policy_version 5580 (0.0007) [2023-10-14 01:15:08,550][33226] Updated weights for policy 1, policy_version 5590 (0.0008) [2023-10-14 01:15:08,926][33226] Updated weights for policy 1, policy_version 5600 (0.0009) [2023-10-14 01:15:09,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 11403264. Throughput: 0: 1781.6, 1: 1776.1. Samples: 2861190. 
Policy #0 lag: (min: 17.0, avg: 31.0, max: 49.0) [2023-10-14 01:15:09,557][31953] Avg episode reward: [(0, '20.170'), (1, '20.340')] [2023-10-14 01:15:09,763][33201] Updated weights for policy 0, policy_version 5540 (0.0008) [2023-10-14 01:15:10,146][33201] Updated weights for policy 0, policy_version 5550 (0.0008) [2023-10-14 01:15:10,518][33201] Updated weights for policy 0, policy_version 5560 (0.0008) [2023-10-14 01:15:12,482][33226] Updated weights for policy 1, policy_version 5610 (0.0007) [2023-10-14 01:15:12,847][33226] Updated weights for policy 1, policy_version 5620 (0.0009) [2023-10-14 01:15:13,216][33226] Updated weights for policy 1, policy_version 5630 (0.0008) [2023-10-14 01:15:14,501][33201] Updated weights for policy 0, policy_version 5570 (0.0009) [2023-10-14 01:15:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 11468800. Throughput: 0: 1759.6, 1: 1800.7. Samples: 2872194. Policy #0 lag: (min: 17.0, avg: 25.5, max: 49.0) [2023-10-14 01:15:14,558][31953] Avg episode reward: [(0, '20.140'), (1, '20.350')] [2023-10-14 01:15:14,868][33201] Updated weights for policy 0, policy_version 5580 (0.0008) [2023-10-14 01:15:15,242][33201] Updated weights for policy 0, policy_version 5590 (0.0008) [2023-10-14 01:15:15,610][33201] Updated weights for policy 0, policy_version 5600 (0.0007) [2023-10-14 01:15:16,858][33226] Updated weights for policy 1, policy_version 5640 (0.0008) [2023-10-14 01:15:17,226][33226] Updated weights for policy 1, policy_version 5650 (0.0008) [2023-10-14 01:15:17,599][33226] Updated weights for policy 1, policy_version 5660 (0.0009) [2023-10-14 01:15:19,424][33201] Updated weights for policy 0, policy_version 5610 (0.0007) [2023-10-14 01:15:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 11534336. Throughput: 0: 1769.6, 1: 1782.0. Samples: 2893218. 
Policy #0 lag: (min: 17.0, avg: 25.5, max: 49.0) [2023-10-14 01:15:19,558][31953] Avg episode reward: [(0, '20.140'), (1, '20.340')] [2023-10-14 01:15:19,795][33201] Updated weights for policy 0, policy_version 5620 (0.0007) [2023-10-14 01:15:20,167][33201] Updated weights for policy 0, policy_version 5630 (0.0008) [2023-10-14 01:15:21,345][33226] Updated weights for policy 1, policy_version 5670 (0.0007) [2023-10-14 01:15:21,721][33226] Updated weights for policy 1, policy_version 5680 (0.0008) [2023-10-14 01:15:22,080][33226] Updated weights for policy 1, policy_version 5690 (0.0009) [2023-10-14 01:15:24,066][33201] Updated weights for policy 0, policy_version 5640 (0.0009) [2023-10-14 01:15:24,438][33201] Updated weights for policy 0, policy_version 5650 (0.0008) [2023-10-14 01:15:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 11599872. Throughput: 0: 1785.5, 1: 1785.4. Samples: 2915416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:24,558][31953] Avg episode reward: [(0, '20.150'), (1, '20.380')] [2023-10-14 01:15:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000005696_5832704.pth... [2023-10-14 01:15:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000004032_4128768.pth [2023-10-14 01:15:24,608][32895] Saving new best policy, reward=20.380! [2023-10-14 01:15:24,816][33201] Updated weights for policy 0, policy_version 5660 (0.0007) [2023-10-14 01:15:24,963][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000005664_5799936.pth... 
[2023-10-14 01:15:25,002][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000004000_4096000.pth [2023-10-14 01:15:25,908][33226] Updated weights for policy 1, policy_version 5700 (0.0009) [2023-10-14 01:15:26,276][33226] Updated weights for policy 1, policy_version 5710 (0.0007) [2023-10-14 01:15:26,650][33226] Updated weights for policy 1, policy_version 5720 (0.0007) [2023-10-14 01:15:28,602][33201] Updated weights for policy 0, policy_version 5670 (0.0010) [2023-10-14 01:15:28,982][33201] Updated weights for policy 0, policy_version 5680 (0.0008) [2023-10-14 01:15:29,358][33201] Updated weights for policy 0, policy_version 5690 (0.0010) [2023-10-14 01:15:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 11665408. Throughput: 0: 1765.3, 1: 1785.3. Samples: 2925534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:29,557][31953] Avg episode reward: [(0, '20.170'), (1, '20.420')] [2023-10-14 01:15:29,558][32895] Saving new best policy, reward=20.420! [2023-10-14 01:15:30,334][33226] Updated weights for policy 1, policy_version 5730 (0.0010) [2023-10-14 01:15:30,707][33226] Updated weights for policy 1, policy_version 5740 (0.0007) [2023-10-14 01:15:31,074][33226] Updated weights for policy 1, policy_version 5750 (0.0008) [2023-10-14 01:15:31,438][33226] Updated weights for policy 1, policy_version 5760 (0.0007) [2023-10-14 01:15:33,104][33201] Updated weights for policy 0, policy_version 5700 (0.0009) [2023-10-14 01:15:33,472][33201] Updated weights for policy 0, policy_version 5710 (0.0010) [2023-10-14 01:15:33,854][33201] Updated weights for policy 0, policy_version 5720 (0.0009) [2023-10-14 01:15:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 11763712. Throughput: 0: 1795.2, 1: 1784.9. Samples: 2947318. 
Policy #0 lag: (min: 31.0, avg: 40.2, max: 63.0) [2023-10-14 01:15:34,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.440')] [2023-10-14 01:15:34,559][32895] Saving new best policy, reward=20.440! [2023-10-14 01:15:35,273][33226] Updated weights for policy 1, policy_version 5770 (0.0008) [2023-10-14 01:15:35,636][33226] Updated weights for policy 1, policy_version 5780 (0.0010) [2023-10-14 01:15:36,007][33226] Updated weights for policy 1, policy_version 5790 (0.0007) [2023-10-14 01:15:37,501][33201] Updated weights for policy 0, policy_version 5730 (0.0009) [2023-10-14 01:15:37,870][33201] Updated weights for policy 0, policy_version 5740 (0.0008) [2023-10-14 01:15:38,244][33201] Updated weights for policy 0, policy_version 5750 (0.0010) [2023-10-14 01:15:38,614][33201] Updated weights for policy 0, policy_version 5760 (0.0010) [2023-10-14 01:15:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 11829248. Throughput: 0: 1766.3, 1: 1788.8. Samples: 2968396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:39,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.420')] [2023-10-14 01:15:39,888][33226] Updated weights for policy 1, policy_version 5800 (0.0008) [2023-10-14 01:15:40,252][33226] Updated weights for policy 1, policy_version 5810 (0.0008) [2023-10-14 01:15:40,630][33226] Updated weights for policy 1, policy_version 5820 (0.0009) [2023-10-14 01:15:42,473][33201] Updated weights for policy 0, policy_version 5770 (0.0010) [2023-10-14 01:15:42,854][33201] Updated weights for policy 0, policy_version 5780 (0.0009) [2023-10-14 01:15:43,223][33201] Updated weights for policy 0, policy_version 5790 (0.0008) [2023-10-14 01:15:44,393][33226] Updated weights for policy 1, policy_version 5830 (0.0007) [2023-10-14 01:15:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 11894784. Throughput: 0: 1796.5, 1: 1779.9. Samples: 2979058. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:44,558][31953] Avg episode reward: [(0, '20.200'), (1, '20.410')] [2023-10-14 01:15:44,772][33226] Updated weights for policy 1, policy_version 5840 (0.0009) [2023-10-14 01:15:45,131][33226] Updated weights for policy 1, policy_version 5850 (0.0008) [2023-10-14 01:15:46,960][33201] Updated weights for policy 0, policy_version 5800 (0.0009) [2023-10-14 01:15:47,328][33201] Updated weights for policy 0, policy_version 5810 (0.0009) [2023-10-14 01:15:47,696][33201] Updated weights for policy 0, policy_version 5820 (0.0010) [2023-10-14 01:15:49,101][33226] Updated weights for policy 1, policy_version 5860 (0.0008) [2023-10-14 01:15:49,467][33226] Updated weights for policy 1, policy_version 5870 (0.0007) [2023-10-14 01:15:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 11960320. Throughput: 0: 1765.6, 1: 1782.2. Samples: 2999852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:49,558][31953] Avg episode reward: [(0, '20.250'), (1, '20.420')] [2023-10-14 01:15:49,841][33226] Updated weights for policy 1, policy_version 5880 (0.0008) [2023-10-14 01:15:51,538][33201] Updated weights for policy 0, policy_version 5830 (0.0007) [2023-10-14 01:15:51,907][33201] Updated weights for policy 0, policy_version 5840 (0.0010) [2023-10-14 01:15:52,278][33201] Updated weights for policy 0, policy_version 5850 (0.0009) [2023-10-14 01:15:53,470][33226] Updated weights for policy 1, policy_version 5890 (0.0008) [2023-10-14 01:15:53,840][33226] Updated weights for policy 1, policy_version 5900 (0.0009) [2023-10-14 01:15:54,197][33226] Updated weights for policy 1, policy_version 5910 (0.0008) [2023-10-14 01:15:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 12025856. Throughput: 0: 1762.7, 1: 1797.2. Samples: 3021384. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:54,558][31953] Avg episode reward: [(0, '20.230'), (1, '20.410')] [2023-10-14 01:15:54,563][33226] Updated weights for policy 1, policy_version 5920 (0.0010) [2023-10-14 01:15:56,056][33201] Updated weights for policy 0, policy_version 5860 (0.0008) [2023-10-14 01:15:56,437][33201] Updated weights for policy 0, policy_version 5870 (0.0007) [2023-10-14 01:15:56,807][33201] Updated weights for policy 0, policy_version 5880 (0.0009) [2023-10-14 01:15:58,369][33226] Updated weights for policy 1, policy_version 5930 (0.0009) [2023-10-14 01:15:58,752][33226] Updated weights for policy 1, policy_version 5940 (0.0008) [2023-10-14 01:15:59,125][33226] Updated weights for policy 1, policy_version 5950 (0.0010) [2023-10-14 01:15:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 12124160. Throughput: 0: 1767.8, 1: 1777.2. Samples: 3031716. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:15:59,558][31953] Avg episode reward: [(0, '20.220'), (1, '20.430')] [2023-10-14 01:16:00,612][33201] Updated weights for policy 0, policy_version 5890 (0.0008) [2023-10-14 01:16:00,998][33201] Updated weights for policy 0, policy_version 5900 (0.0010) [2023-10-14 01:16:01,364][33201] Updated weights for policy 0, policy_version 5910 (0.0009) [2023-10-14 01:16:01,731][33201] Updated weights for policy 0, policy_version 5920 (0.0008) [2023-10-14 01:16:02,916][33226] Updated weights for policy 1, policy_version 5960 (0.0008) [2023-10-14 01:16:03,280][33226] Updated weights for policy 1, policy_version 5970 (0.0007) [2023-10-14 01:16:03,661][33226] Updated weights for policy 1, policy_version 5980 (0.0009) [2023-10-14 01:16:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 12189696. Throughput: 0: 1765.0, 1: 1793.0. Samples: 3053328. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:16:04,558][31953] Avg episode reward: [(0, '20.260'), (1, '20.440')] [2023-10-14 01:16:05,462][33201] Updated weights for policy 0, policy_version 5930 (0.0008) [2023-10-14 01:16:05,836][33201] Updated weights for policy 0, policy_version 5940 (0.0009) [2023-10-14 01:16:06,209][33201] Updated weights for policy 0, policy_version 5950 (0.0009) [2023-10-14 01:16:07,653][33226] Updated weights for policy 1, policy_version 5990 (0.0008) [2023-10-14 01:16:08,017][33226] Updated weights for policy 1, policy_version 6000 (0.0007) [2023-10-14 01:16:08,387][33226] Updated weights for policy 1, policy_version 6010 (0.0009) [2023-10-14 01:16:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 12255232. Throughput: 0: 1767.2, 1: 1760.5. Samples: 3074164. Policy #0 lag: (min: 17.0, avg: 27.5, max: 49.0) [2023-10-14 01:16:09,558][31953] Avg episode reward: [(0, '20.280'), (1, '20.440')] [2023-10-14 01:16:09,565][32837] Saving new best policy, reward=20.280! [2023-10-14 01:16:10,003][33201] Updated weights for policy 0, policy_version 5960 (0.0008) [2023-10-14 01:16:10,384][33201] Updated weights for policy 0, policy_version 5970 (0.0009) [2023-10-14 01:16:10,759][33201] Updated weights for policy 0, policy_version 5980 (0.0010) [2023-10-14 01:16:12,244][33226] Updated weights for policy 1, policy_version 6020 (0.0008) [2023-10-14 01:16:12,618][33226] Updated weights for policy 1, policy_version 6030 (0.0007) [2023-10-14 01:16:12,980][33226] Updated weights for policy 1, policy_version 6040 (0.0008) [2023-10-14 01:16:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 12320768. Throughput: 0: 1755.5, 1: 1795.2. Samples: 3085316. Policy #0 lag: (min: 17.0, avg: 27.5, max: 49.0) [2023-10-14 01:16:14,558][31953] Avg episode reward: [(0, '20.310'), (1, '20.450')] [2023-10-14 01:16:14,559][32895] Saving new best policy, reward=20.450! 
[2023-10-14 01:16:14,587][33201] Updated weights for policy 0, policy_version 5990 (0.0010) [2023-10-14 01:16:14,964][33201] Updated weights for policy 0, policy_version 6000 (0.0008) [2023-10-14 01:16:15,339][33201] Updated weights for policy 0, policy_version 6010 (0.0009) [2023-10-14 01:16:15,557][32837] Saving new best policy, reward=20.310! [2023-10-14 01:16:16,681][33226] Updated weights for policy 1, policy_version 6050 (0.0007) [2023-10-14 01:16:17,055][33226] Updated weights for policy 1, policy_version 6060 (0.0007) [2023-10-14 01:16:17,419][33226] Updated weights for policy 1, policy_version 6070 (0.0009) [2023-10-14 01:16:17,789][33226] Updated weights for policy 1, policy_version 6080 (0.0007) [2023-10-14 01:16:19,233][33201] Updated weights for policy 0, policy_version 6020 (0.0008) [2023-10-14 01:16:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 12386304. Throughput: 0: 1758.7, 1: 1769.9. Samples: 3106108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:16:19,558][31953] Avg episode reward: [(0, '20.330'), (1, '20.460')] [2023-10-14 01:16:19,560][32895] Saving new best policy, reward=20.460! [2023-10-14 01:16:19,620][33201] Updated weights for policy 0, policy_version 6030 (0.0010) [2023-10-14 01:16:19,988][33201] Updated weights for policy 0, policy_version 6040 (0.0007) [2023-10-14 01:16:20,286][32837] Saving new best policy, reward=20.330! [2023-10-14 01:16:21,734][33226] Updated weights for policy 1, policy_version 6090 (0.0009) [2023-10-14 01:16:22,106][33226] Updated weights for policy 1, policy_version 6100 (0.0010) [2023-10-14 01:16:22,471][33226] Updated weights for policy 1, policy_version 6110 (0.0008) [2023-10-14 01:16:24,089][33201] Updated weights for policy 0, policy_version 6050 (0.0009) [2023-10-14 01:16:24,458][33201] Updated weights for policy 0, policy_version 6060 (0.0007) [2023-10-14 01:16:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). 
Total num frames: 12451840. Throughput: 0: 1780.1, 1: 1765.0. Samples: 3127922. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:16:24,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.460')] [2023-10-14 01:16:24,826][33201] Updated weights for policy 0, policy_version 6070 (0.0009) [2023-10-14 01:16:25,196][32837] Saving new best policy, reward=20.350! [2023-10-14 01:16:25,200][33201] Updated weights for policy 0, policy_version 6080 (0.0010) [2023-10-14 01:16:26,232][33226] Updated weights for policy 1, policy_version 6120 (0.0011) [2023-10-14 01:16:26,604][33226] Updated weights for policy 1, policy_version 6130 (0.0009) [2023-10-14 01:16:26,976][33226] Updated weights for policy 1, policy_version 6140 (0.0009) [2023-10-14 01:16:29,081][33201] Updated weights for policy 0, policy_version 6090 (0.0008) [2023-10-14 01:16:29,447][33201] Updated weights for policy 0, policy_version 6100 (0.0007) [2023-10-14 01:16:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 12517376. Throughput: 0: 1755.7, 1: 1775.0. Samples: 3137938. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:16:29,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.460')] [2023-10-14 01:16:29,822][33201] Updated weights for policy 0, policy_version 6110 (0.0009) [2023-10-14 01:16:30,834][33226] Updated weights for policy 1, policy_version 6150 (0.0008) [2023-10-14 01:16:31,195][33226] Updated weights for policy 1, policy_version 6160 (0.0007) [2023-10-14 01:16:31,571][33226] Updated weights for policy 1, policy_version 6170 (0.0009) [2023-10-14 01:16:33,605][33201] Updated weights for policy 0, policy_version 6120 (0.0010) [2023-10-14 01:16:33,986][33201] Updated weights for policy 0, policy_version 6130 (0.0007) [2023-10-14 01:16:34,358][33201] Updated weights for policy 0, policy_version 6140 (0.0007) [2023-10-14 01:16:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). 
Total num frames: 12615680. Throughput: 0: 1780.9, 1: 1764.2. Samples: 3159380. Policy #0 lag: (min: 10.0, avg: 10.3, max: 22.0) [2023-10-14 01:16:34,558][31953] Avg episode reward: [(0, '20.360'), (1, '20.450')] [2023-10-14 01:16:34,558][32837] Saving new best policy, reward=20.360! [2023-10-14 01:16:35,298][33226] Updated weights for policy 1, policy_version 6180 (0.0009) [2023-10-14 01:16:35,668][33226] Updated weights for policy 1, policy_version 6190 (0.0009) [2023-10-14 01:16:36,047][33226] Updated weights for policy 1, policy_version 6200 (0.0007) [2023-10-14 01:16:38,238][33201] Updated weights for policy 0, policy_version 6150 (0.0010) [2023-10-14 01:16:38,611][33201] Updated weights for policy 0, policy_version 6160 (0.0009) [2023-10-14 01:16:38,986][33201] Updated weights for policy 0, policy_version 6170 (0.0007) [2023-10-14 01:16:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 12681216. Throughput: 0: 1751.8, 1: 1774.8. Samples: 3180084. Policy #0 lag: (min: 10.0, avg: 10.3, max: 22.0) [2023-10-14 01:16:39,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.440')] [2023-10-14 01:16:40,000][33226] Updated weights for policy 1, policy_version 6210 (0.0008) [2023-10-14 01:16:40,378][33226] Updated weights for policy 1, policy_version 6220 (0.0010) [2023-10-14 01:16:40,752][33226] Updated weights for policy 1, policy_version 6230 (0.0008) [2023-10-14 01:16:41,118][33226] Updated weights for policy 1, policy_version 6240 (0.0008) [2023-10-14 01:16:42,744][33201] Updated weights for policy 0, policy_version 6180 (0.0008) [2023-10-14 01:16:43,123][33201] Updated weights for policy 0, policy_version 6190 (0.0008) [2023-10-14 01:16:43,494][33201] Updated weights for policy 0, policy_version 6200 (0.0007) [2023-10-14 01:16:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 12746752. Throughput: 0: 1780.9, 1: 1757.2. Samples: 3190930. 
Policy #0 lag: (min: 30.0, avg: 30.6, max: 46.0) [2023-10-14 01:16:44,558][31953] Avg episode reward: [(0, '19.940'), (1, '20.440')] [2023-10-14 01:16:44,914][33226] Updated weights for policy 1, policy_version 6250 (0.0007) [2023-10-14 01:16:45,277][33226] Updated weights for policy 1, policy_version 6260 (0.0007) [2023-10-14 01:16:45,639][33226] Updated weights for policy 1, policy_version 6270 (0.0007) [2023-10-14 01:16:46,460][33201] Updated weights for policy 0, policy_version 6210 (0.0008) [2023-10-14 01:16:46,840][33201] Updated weights for policy 0, policy_version 6220 (0.0010) [2023-10-14 01:16:47,224][33201] Updated weights for policy 0, policy_version 6230 (0.0010) [2023-10-14 01:16:47,589][33201] Updated weights for policy 0, policy_version 6240 (0.0010) [2023-10-14 01:16:48,738][33226] Updated weights for policy 1, policy_version 6280 (0.0008) [2023-10-14 01:16:49,106][33226] Updated weights for policy 1, policy_version 6290 (0.0007) [2023-10-14 01:16:49,467][33226] Updated weights for policy 1, policy_version 6300 (0.0007) [2023-10-14 01:16:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 12812288. Throughput: 0: 1777.7, 1: 1801.2. Samples: 3214376. 
Policy #0 lag: (min: 30.0, avg: 30.6, max: 46.0) [2023-10-14 01:16:49,558][31953] Avg episode reward: [(0, '19.940'), (1, '20.450')] [2023-10-14 01:16:51,410][33201] Updated weights for policy 0, policy_version 6250 (0.0009) [2023-10-14 01:16:51,791][33201] Updated weights for policy 0, policy_version 6260 (0.0010) [2023-10-14 01:16:52,154][33201] Updated weights for policy 0, policy_version 6270 (0.0009) [2023-10-14 01:16:53,229][33226] Updated weights for policy 1, policy_version 6310 (0.0008) [2023-10-14 01:16:53,598][33226] Updated weights for policy 1, policy_version 6320 (0.0008) [2023-10-14 01:16:53,955][33226] Updated weights for policy 1, policy_version 6330 (0.0007) [2023-10-14 01:16:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 12910592. Throughput: 0: 1781.0, 1: 1801.7. Samples: 3235386. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-14 01:16:54,558][31953] Avg episode reward: [(0, '19.910'), (1, '20.450')] [2023-10-14 01:16:56,122][33201] Updated weights for policy 0, policy_version 6280 (0.0008) [2023-10-14 01:16:56,502][33201] Updated weights for policy 0, policy_version 6290 (0.0008) [2023-10-14 01:16:56,884][33201] Updated weights for policy 0, policy_version 6300 (0.0009) [2023-10-14 01:16:57,702][33226] Updated weights for policy 1, policy_version 6340 (0.0008) [2023-10-14 01:16:58,062][33226] Updated weights for policy 1, policy_version 6350 (0.0009) [2023-10-14 01:16:58,436][33226] Updated weights for policy 1, policy_version 6360 (0.0007) [2023-10-14 01:16:59,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 12976128. Throughput: 0: 1780.1, 1: 1792.2. Samples: 3246072. Policy #0 lag: (min: 31.0, avg: 31.3, max: 43.0) [2023-10-14 01:16:59,558][31953] Avg episode reward: [(0, '19.900'), (1, '20.470')] [2023-10-14 01:16:59,560][32895] Saving new best policy, reward=20.470! 
[2023-10-14 01:17:00,702][33201] Updated weights for policy 0, policy_version 6310 (0.0009) [2023-10-14 01:17:01,077][33201] Updated weights for policy 0, policy_version 6320 (0.0007) [2023-10-14 01:17:01,451][33201] Updated weights for policy 0, policy_version 6330 (0.0009) [2023-10-14 01:17:02,389][33226] Updated weights for policy 1, policy_version 6370 (0.0007) [2023-10-14 01:17:02,761][33226] Updated weights for policy 1, policy_version 6380 (0.0009) [2023-10-14 01:17:03,130][33226] Updated weights for policy 1, policy_version 6390 (0.0008) [2023-10-14 01:17:03,498][33226] Updated weights for policy 1, policy_version 6400 (0.0009) [2023-10-14 01:17:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 13041664. Throughput: 0: 1781.4, 1: 1801.5. Samples: 3267340. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:04,558][31953] Avg episode reward: [(0, '19.870'), (1, '20.470')] [2023-10-14 01:17:05,229][33201] Updated weights for policy 0, policy_version 6340 (0.0010) [2023-10-14 01:17:05,599][33201] Updated weights for policy 0, policy_version 6350 (0.0011) [2023-10-14 01:17:05,985][33201] Updated weights for policy 0, policy_version 6360 (0.0007) [2023-10-14 01:17:07,411][33226] Updated weights for policy 1, policy_version 6410 (0.0008) [2023-10-14 01:17:07,778][33226] Updated weights for policy 1, policy_version 6420 (0.0007) [2023-10-14 01:17:08,153][33226] Updated weights for policy 1, policy_version 6430 (0.0011) [2023-10-14 01:17:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 13107200. Throughput: 0: 1785.8, 1: 1788.4. Samples: 3288764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:09,558][31953] Avg episode reward: [(0, '19.880'), (1, '20.490')] [2023-10-14 01:17:09,569][32895] Saving new best policy, reward=20.490! 
[2023-10-14 01:17:09,757][33201] Updated weights for policy 0, policy_version 6370 (0.0009) [2023-10-14 01:17:10,124][33201] Updated weights for policy 0, policy_version 6380 (0.0008) [2023-10-14 01:17:10,503][33201] Updated weights for policy 0, policy_version 6390 (0.0008) [2023-10-14 01:17:10,875][33201] Updated weights for policy 0, policy_version 6400 (0.0009) [2023-10-14 01:17:11,885][33226] Updated weights for policy 1, policy_version 6440 (0.0009) [2023-10-14 01:17:12,250][33226] Updated weights for policy 1, policy_version 6450 (0.0007) [2023-10-14 01:17:12,616][33226] Updated weights for policy 1, policy_version 6460 (0.0008) [2023-10-14 01:17:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 13172736. Throughput: 0: 1780.4, 1: 1808.6. Samples: 3299444. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:14,557][31953] Avg episode reward: [(0, '19.890'), (1, '20.480')] [2023-10-14 01:17:14,815][33201] Updated weights for policy 0, policy_version 6410 (0.0012) [2023-10-14 01:17:15,186][33201] Updated weights for policy 0, policy_version 6420 (0.0010) [2023-10-14 01:17:15,557][33201] Updated weights for policy 0, policy_version 6430 (0.0009) [2023-10-14 01:17:16,071][33226] Updated weights for policy 1, policy_version 6470 (0.0007) [2023-10-14 01:17:16,443][33226] Updated weights for policy 1, policy_version 6480 (0.0008) [2023-10-14 01:17:16,807][33226] Updated weights for policy 1, policy_version 6490 (0.0012) [2023-10-14 01:17:19,103][33201] Updated weights for policy 0, policy_version 6440 (0.0009) [2023-10-14 01:17:19,484][33201] Updated weights for policy 0, policy_version 6450 (0.0008) [2023-10-14 01:17:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 13238272. Throughput: 0: 1784.5, 1: 1796.5. Samples: 3320528. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:19,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.510')] [2023-10-14 01:17:19,559][32895] Saving new best policy, reward=20.510! [2023-10-14 01:17:19,855][33201] Updated weights for policy 0, policy_version 6460 (0.0008) [2023-10-14 01:17:20,655][33226] Updated weights for policy 1, policy_version 6500 (0.0010) [2023-10-14 01:17:21,031][33226] Updated weights for policy 1, policy_version 6510 (0.0009) [2023-10-14 01:17:21,395][33226] Updated weights for policy 1, policy_version 6520 (0.0008) [2023-10-14 01:17:23,730][33201] Updated weights for policy 0, policy_version 6470 (0.0008) [2023-10-14 01:17:24,099][33201] Updated weights for policy 0, policy_version 6480 (0.0011) [2023-10-14 01:17:24,463][33201] Updated weights for policy 0, policy_version 6490 (0.0010) [2023-10-14 01:17:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 13303808. Throughput: 0: 1800.1, 1: 1801.7. Samples: 3342166. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:24,558][31953] Avg episode reward: [(0, '19.840'), (1, '20.500')] [2023-10-14 01:17:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000006528_6684672.pth... [2023-10-14 01:17:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000004864_4980736.pth [2023-10-14 01:17:24,680][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000006496_6651904.pth... 
[2023-10-14 01:17:24,712][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000004832_4947968.pth [2023-10-14 01:17:25,273][33226] Updated weights for policy 1, policy_version 6530 (0.0010) [2023-10-14 01:17:25,634][33226] Updated weights for policy 1, policy_version 6540 (0.0008) [2023-10-14 01:17:26,005][33226] Updated weights for policy 1, policy_version 6550 (0.0009) [2023-10-14 01:17:26,376][33226] Updated weights for policy 1, policy_version 6560 (0.0009) [2023-10-14 01:17:28,355][33201] Updated weights for policy 0, policy_version 6500 (0.0008) [2023-10-14 01:17:28,729][33201] Updated weights for policy 0, policy_version 6510 (0.0007) [2023-10-14 01:17:29,104][33201] Updated weights for policy 0, policy_version 6520 (0.0007) [2023-10-14 01:17:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 13402112. Throughput: 0: 1785.8, 1: 1800.8. Samples: 3352328. Policy #0 lag: (min: 0.0, avg: 14.1, max: 32.0) [2023-10-14 01:17:29,558][31953] Avg episode reward: [(0, '19.880'), (1, '20.530')] [2023-10-14 01:17:29,559][32895] Saving new best policy, reward=20.530! [2023-10-14 01:17:30,157][33226] Updated weights for policy 1, policy_version 6570 (0.0008) [2023-10-14 01:17:30,525][33226] Updated weights for policy 1, policy_version 6580 (0.0008) [2023-10-14 01:17:30,884][33226] Updated weights for policy 1, policy_version 6590 (0.0010) [2023-10-14 01:17:33,053][33201] Updated weights for policy 0, policy_version 6530 (0.0008) [2023-10-14 01:17:33,429][33201] Updated weights for policy 0, policy_version 6540 (0.0008) [2023-10-14 01:17:33,802][33201] Updated weights for policy 0, policy_version 6550 (0.0007) [2023-10-14 01:17:34,170][33201] Updated weights for policy 0, policy_version 6560 (0.0010) [2023-10-14 01:17:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13467648. Throughput: 0: 1782.6, 1: 1768.5. Samples: 3374178. 
Policy #0 lag: (min: 0.0, avg: 14.1, max: 32.0) [2023-10-14 01:17:34,558][31953] Avg episode reward: [(0, '19.870'), (1, '20.530')] [2023-10-14 01:17:34,737][33226] Updated weights for policy 1, policy_version 6600 (0.0008) [2023-10-14 01:17:35,108][33226] Updated weights for policy 1, policy_version 6610 (0.0007) [2023-10-14 01:17:35,475][33226] Updated weights for policy 1, policy_version 6620 (0.0008) [2023-10-14 01:17:38,026][33201] Updated weights for policy 0, policy_version 6570 (0.0007) [2023-10-14 01:17:38,406][33201] Updated weights for policy 0, policy_version 6580 (0.0007) [2023-10-14 01:17:38,778][33201] Updated weights for policy 0, policy_version 6590 (0.0008) [2023-10-14 01:17:39,179][33226] Updated weights for policy 1, policy_version 6630 (0.0010) [2023-10-14 01:17:39,540][33226] Updated weights for policy 1, policy_version 6640 (0.0009) [2023-10-14 01:17:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13533184. Throughput: 0: 1751.3, 1: 1798.3. Samples: 3395116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:39,557][31953] Avg episode reward: [(0, '19.830'), (1, '20.540')] [2023-10-14 01:17:39,905][33226] Updated weights for policy 1, policy_version 6650 (0.0010) [2023-10-14 01:17:40,132][32895] Saving new best policy, reward=20.540! [2023-10-14 01:17:42,758][33201] Updated weights for policy 0, policy_version 6600 (0.0010) [2023-10-14 01:17:43,133][33201] Updated weights for policy 0, policy_version 6610 (0.0010) [2023-10-14 01:17:43,510][33201] Updated weights for policy 0, policy_version 6620 (0.0007) [2023-10-14 01:17:43,687][33226] Updated weights for policy 1, policy_version 6660 (0.0009) [2023-10-14 01:17:44,056][33226] Updated weights for policy 1, policy_version 6670 (0.0009) [2023-10-14 01:17:44,415][33226] Updated weights for policy 1, policy_version 6680 (0.0008) [2023-10-14 01:17:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). 
Total num frames: 13598720. Throughput: 0: 1785.4, 1: 1773.0. Samples: 3406200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:44,558][31953] Avg episode reward: [(0, '19.840'), (1, '20.500')] [2023-10-14 01:17:47,374][33201] Updated weights for policy 0, policy_version 6630 (0.0008) [2023-10-14 01:17:47,746][33201] Updated weights for policy 0, policy_version 6640 (0.0009) [2023-10-14 01:17:48,121][33201] Updated weights for policy 0, policy_version 6650 (0.0011) [2023-10-14 01:17:48,311][33226] Updated weights for policy 1, policy_version 6690 (0.0007) [2023-10-14 01:17:48,680][33226] Updated weights for policy 1, policy_version 6700 (0.0011) [2023-10-14 01:17:49,052][33226] Updated weights for policy 1, policy_version 6710 (0.0009) [2023-10-14 01:17:49,423][33226] Updated weights for policy 1, policy_version 6720 (0.0009) [2023-10-14 01:17:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 13697024. Throughput: 0: 1755.5, 1: 1792.3. Samples: 3426988. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:49,558][31953] Avg episode reward: [(0, '19.830'), (1, '20.550')] [2023-10-14 01:17:49,559][32895] Saving new best policy, reward=20.550! [2023-10-14 01:17:52,124][33201] Updated weights for policy 0, policy_version 6660 (0.0010) [2023-10-14 01:17:52,501][33201] Updated weights for policy 0, policy_version 6670 (0.0007) [2023-10-14 01:17:52,873][33201] Updated weights for policy 0, policy_version 6680 (0.0007) [2023-10-14 01:17:53,208][33226] Updated weights for policy 1, policy_version 6730 (0.0008) [2023-10-14 01:17:53,581][33226] Updated weights for policy 1, policy_version 6740 (0.0007) [2023-10-14 01:17:53,945][33226] Updated weights for policy 1, policy_version 6750 (0.0009) [2023-10-14 01:17:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13762560. Throughput: 0: 1742.8, 1: 1777.0. Samples: 3447154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:54,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.560')] [2023-10-14 01:17:54,567][32895] Saving new best policy, reward=20.560! [2023-10-14 01:17:56,497][33201] Updated weights for policy 0, policy_version 6690 (0.0007) [2023-10-14 01:17:56,872][33201] Updated weights for policy 0, policy_version 6700 (0.0007) [2023-10-14 01:17:57,249][33201] Updated weights for policy 0, policy_version 6710 (0.0010) [2023-10-14 01:17:57,615][33201] Updated weights for policy 0, policy_version 6720 (0.0008) [2023-10-14 01:17:57,972][33226] Updated weights for policy 1, policy_version 6760 (0.0007) [2023-10-14 01:17:58,354][33226] Updated weights for policy 1, policy_version 6770 (0.0007) [2023-10-14 01:17:58,716][33226] Updated weights for policy 1, policy_version 6780 (0.0009) [2023-10-14 01:17:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13828096. Throughput: 0: 1762.3, 1: 1777.6. Samples: 3458742. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:17:59,558][31953] Avg episode reward: [(0, '19.840'), (1, '20.560')] [2023-10-14 01:18:01,245][33201] Updated weights for policy 0, policy_version 6730 (0.0008) [2023-10-14 01:18:01,625][33201] Updated weights for policy 0, policy_version 6740 (0.0008) [2023-10-14 01:18:01,996][33201] Updated weights for policy 0, policy_version 6750 (0.0008) [2023-10-14 01:18:02,503][33226] Updated weights for policy 1, policy_version 6790 (0.0008) [2023-10-14 01:18:02,878][33226] Updated weights for policy 1, policy_version 6800 (0.0007) [2023-10-14 01:18:03,256][33226] Updated weights for policy 1, policy_version 6810 (0.0009) [2023-10-14 01:18:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13893632. Throughput: 0: 1752.4, 1: 1781.4. Samples: 3479552. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:18:04,558][31953] Avg episode reward: [(0, '19.810'), (1, '20.560')] [2023-10-14 01:18:05,929][33201] Updated weights for policy 0, policy_version 6760 (0.0007) [2023-10-14 01:18:06,310][33201] Updated weights for policy 0, policy_version 6770 (0.0007) [2023-10-14 01:18:06,683][33201] Updated weights for policy 0, policy_version 6780 (0.0008) [2023-10-14 01:18:06,973][33226] Updated weights for policy 1, policy_version 6820 (0.0008) [2023-10-14 01:18:07,338][33226] Updated weights for policy 1, policy_version 6830 (0.0007) [2023-10-14 01:18:07,712][33226] Updated weights for policy 1, policy_version 6840 (0.0007) [2023-10-14 01:18:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 13959168. Throughput: 0: 1758.7, 1: 1765.0. Samples: 3500730. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:18:09,557][31953] Avg episode reward: [(0, '19.790'), (1, '20.590')] [2023-10-14 01:18:09,567][32895] Saving new best policy, reward=20.590! [2023-10-14 01:18:10,554][33201] Updated weights for policy 0, policy_version 6790 (0.0010) [2023-10-14 01:18:10,922][33201] Updated weights for policy 0, policy_version 6800 (0.0008) [2023-10-14 01:18:11,299][33201] Updated weights for policy 0, policy_version 6810 (0.0007) [2023-10-14 01:18:11,499][33226] Updated weights for policy 1, policy_version 6850 (0.0009) [2023-10-14 01:18:11,864][33226] Updated weights for policy 1, policy_version 6860 (0.0008) [2023-10-14 01:18:12,228][33226] Updated weights for policy 1, policy_version 6870 (0.0008) [2023-10-14 01:18:12,594][33226] Updated weights for policy 1, policy_version 6880 (0.0008) [2023-10-14 01:18:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 14024704. Throughput: 0: 1746.7, 1: 1785.7. Samples: 3511284. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:18:14,558][31953] Avg episode reward: [(0, '19.630'), (1, '20.600')] [2023-10-14 01:18:14,559][32895] Saving new best policy, reward=20.600! [2023-10-14 01:18:14,955][33201] Updated weights for policy 0, policy_version 6820 (0.0007) [2023-10-14 01:18:15,324][33201] Updated weights for policy 0, policy_version 6830 (0.0009) [2023-10-14 01:18:15,704][33201] Updated weights for policy 0, policy_version 6840 (0.0009) [2023-10-14 01:18:16,422][33226] Updated weights for policy 1, policy_version 6890 (0.0011) [2023-10-14 01:18:16,787][33226] Updated weights for policy 1, policy_version 6900 (0.0010) [2023-10-14 01:18:17,155][33226] Updated weights for policy 1, policy_version 6910 (0.0010) [2023-10-14 01:18:19,448][33201] Updated weights for policy 0, policy_version 6850 (0.0008) [2023-10-14 01:18:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 14090240. Throughput: 0: 1757.2, 1: 1768.3. Samples: 3532826. Policy #0 lag: (min: 14.0, avg: 14.2, max: 23.0) [2023-10-14 01:18:19,557][31953] Avg episode reward: [(0, '20.030'), (1, '20.600')] [2023-10-14 01:18:19,822][33201] Updated weights for policy 0, policy_version 6860 (0.0010) [2023-10-14 01:18:20,194][33201] Updated weights for policy 0, policy_version 6870 (0.0009) [2023-10-14 01:18:20,567][33201] Updated weights for policy 0, policy_version 6880 (0.0008) [2023-10-14 01:18:20,924][33226] Updated weights for policy 1, policy_version 6920 (0.0007) [2023-10-14 01:18:21,292][33226] Updated weights for policy 1, policy_version 6930 (0.0010) [2023-10-14 01:18:21,672][33226] Updated weights for policy 1, policy_version 6940 (0.0008) [2023-10-14 01:18:24,374][33201] Updated weights for policy 0, policy_version 6890 (0.0008) [2023-10-14 01:18:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 14155776. Throughput: 0: 1789.7, 1: 1769.6. Samples: 3555282. 
Policy #0 lag: (min: 14.0, avg: 14.2, max: 23.0) [2023-10-14 01:18:24,558][31953] Avg episode reward: [(0, '20.010'), (1, '20.610')] [2023-10-14 01:18:24,565][32895] Saving new best policy, reward=20.610! [2023-10-14 01:18:24,748][33201] Updated weights for policy 0, policy_version 6900 (0.0007) [2023-10-14 01:18:25,119][33201] Updated weights for policy 0, policy_version 6910 (0.0007) [2023-10-14 01:18:25,375][33226] Updated weights for policy 1, policy_version 6950 (0.0008) [2023-10-14 01:18:25,744][33226] Updated weights for policy 1, policy_version 6960 (0.0007) [2023-10-14 01:18:26,106][33226] Updated weights for policy 1, policy_version 6970 (0.0009) [2023-10-14 01:18:28,906][33201] Updated weights for policy 0, policy_version 6920 (0.0011) [2023-10-14 01:18:29,277][33201] Updated weights for policy 0, policy_version 6930 (0.0007) [2023-10-14 01:18:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 14221312. Throughput: 0: 1761.6, 1: 1769.0. Samples: 3565076. Policy #0 lag: (min: 34.0, avg: 54.7, max: 56.0) [2023-10-14 01:18:29,558][31953] Avg episode reward: [(0, '20.020'), (1, '20.630')] [2023-10-14 01:18:29,559][32895] Saving new best policy, reward=20.630! 
[2023-10-14 01:18:29,659][33201] Updated weights for policy 0, policy_version 6940 (0.0008) [2023-10-14 01:18:29,907][33226] Updated weights for policy 1, policy_version 6980 (0.0010) [2023-10-14 01:18:30,274][33226] Updated weights for policy 1, policy_version 6990 (0.0010) [2023-10-14 01:18:30,640][33226] Updated weights for policy 1, policy_version 7000 (0.0009) [2023-10-14 01:18:33,484][33201] Updated weights for policy 0, policy_version 6950 (0.0009) [2023-10-14 01:18:33,850][33201] Updated weights for policy 0, policy_version 6960 (0.0008) [2023-10-14 01:18:34,228][33201] Updated weights for policy 0, policy_version 6970 (0.0010) [2023-10-14 01:18:34,499][33226] Updated weights for policy 1, policy_version 7010 (0.0007) [2023-10-14 01:18:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 14319616. Throughput: 0: 1788.2, 1: 1769.6. Samples: 3587088. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 01:18:34,558][31953] Avg episode reward: [(0, '20.040'), (1, '20.610')] [2023-10-14 01:18:34,872][33226] Updated weights for policy 1, policy_version 7020 (0.0008) [2023-10-14 01:18:35,239][33226] Updated weights for policy 1, policy_version 7030 (0.0007) [2023-10-14 01:18:35,613][33226] Updated weights for policy 1, policy_version 7040 (0.0007) [2023-10-14 01:18:38,162][33201] Updated weights for policy 0, policy_version 6980 (0.0010) [2023-10-14 01:18:38,544][33201] Updated weights for policy 0, policy_version 6990 (0.0011) [2023-10-14 01:18:38,911][33201] Updated weights for policy 0, policy_version 7000 (0.0008) [2023-10-14 01:18:39,263][33226] Updated weights for policy 1, policy_version 7050 (0.0007) [2023-10-14 01:18:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 14385152. Throughput: 0: 1770.4, 1: 1802.3. Samples: 3607924. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 01:18:39,558][31953] Avg episode reward: [(0, '20.060'), (1, '20.630')] [2023-10-14 01:18:39,631][33226] Updated weights for policy 1, policy_version 7060 (0.0008) [2023-10-14 01:18:39,989][33226] Updated weights for policy 1, policy_version 7070 (0.0008) [2023-10-14 01:18:42,575][33201] Updated weights for policy 0, policy_version 7010 (0.0009) [2023-10-14 01:18:42,950][33201] Updated weights for policy 0, policy_version 7020 (0.0010) [2023-10-14 01:18:43,313][33201] Updated weights for policy 0, policy_version 7030 (0.0007) [2023-10-14 01:18:43,688][33201] Updated weights for policy 0, policy_version 7040 (0.0007) [2023-10-14 01:18:43,850][33226] Updated weights for policy 1, policy_version 7080 (0.0009) [2023-10-14 01:18:44,224][33226] Updated weights for policy 1, policy_version 7090 (0.0009) [2023-10-14 01:18:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 14450688. Throughput: 0: 1782.0, 1: 1777.1. Samples: 3618906. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:18:44,558][31953] Avg episode reward: [(0, '20.080'), (1, '20.610')] [2023-10-14 01:18:44,598][33226] Updated weights for policy 1, policy_version 7100 (0.0008) [2023-10-14 01:18:47,579][33201] Updated weights for policy 0, policy_version 7050 (0.0008) [2023-10-14 01:18:47,957][33201] Updated weights for policy 0, policy_version 7060 (0.0009) [2023-10-14 01:18:48,332][33201] Updated weights for policy 0, policy_version 7070 (0.0009) [2023-10-14 01:18:48,427][33226] Updated weights for policy 1, policy_version 7110 (0.0008) [2023-10-14 01:18:48,789][33226] Updated weights for policy 1, policy_version 7120 (0.0011) [2023-10-14 01:18:49,157][33226] Updated weights for policy 1, policy_version 7130 (0.0008) [2023-10-14 01:18:49,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 14548992. Throughput: 0: 1763.2, 1: 1795.1. Samples: 3639680. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:18:49,558][31953] Avg episode reward: [(0, '20.090'), (1, '20.620')] [2023-10-14 01:18:52,321][33201] Updated weights for policy 0, policy_version 7080 (0.0008) [2023-10-14 01:18:52,693][33201] Updated weights for policy 0, policy_version 7090 (0.0008) [2023-10-14 01:18:52,928][33226] Updated weights for policy 1, policy_version 7140 (0.0008) [2023-10-14 01:18:53,057][33201] Updated weights for policy 0, policy_version 7100 (0.0008) [2023-10-14 01:18:53,295][33226] Updated weights for policy 1, policy_version 7150 (0.0007) [2023-10-14 01:18:53,666][33226] Updated weights for policy 1, policy_version 7160 (0.0008) [2023-10-14 01:18:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 14614528. Throughput: 0: 1756.8, 1: 1778.1. Samples: 3659804. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:18:54,558][31953] Avg episode reward: [(0, '20.190'), (1, '20.610')] [2023-10-14 01:18:56,751][33201] Updated weights for policy 0, policy_version 7110 (0.0009) [2023-10-14 01:18:57,129][33201] Updated weights for policy 0, policy_version 7120 (0.0008) [2023-10-14 01:18:57,421][33226] Updated weights for policy 1, policy_version 7170 (0.0009) [2023-10-14 01:18:57,499][33201] Updated weights for policy 0, policy_version 7130 (0.0008) [2023-10-14 01:18:57,780][33226] Updated weights for policy 1, policy_version 7180 (0.0008) [2023-10-14 01:18:58,159][33226] Updated weights for policy 1, policy_version 7190 (0.0008) [2023-10-14 01:18:58,518][33226] Updated weights for policy 1, policy_version 7200 (0.0008) [2023-10-14 01:18:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 14680064. Throughput: 0: 1773.8, 1: 1791.1. Samples: 3671706. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:18:59,558][31953] Avg episode reward: [(0, '20.160'), (1, '20.590')] [2023-10-14 01:19:01,465][33201] Updated weights for policy 0, policy_version 7140 (0.0007) [2023-10-14 01:19:01,829][33201] Updated weights for policy 0, policy_version 7150 (0.0008) [2023-10-14 01:19:02,201][33201] Updated weights for policy 0, policy_version 7160 (0.0008) [2023-10-14 01:19:02,269][33226] Updated weights for policy 1, policy_version 7210 (0.0007) [2023-10-14 01:19:02,644][33226] Updated weights for policy 1, policy_version 7220 (0.0007) [2023-10-14 01:19:03,016][33226] Updated weights for policy 1, policy_version 7230 (0.0007) [2023-10-14 01:19:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 14745600. Throughput: 0: 1747.5, 1: 1782.8. Samples: 3691692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:04,558][31953] Avg episode reward: [(0, '20.180'), (1, '20.580')] [2023-10-14 01:19:05,971][33201] Updated weights for policy 0, policy_version 7170 (0.0008) [2023-10-14 01:19:06,343][33201] Updated weights for policy 0, policy_version 7180 (0.0009) [2023-10-14 01:19:06,715][33201] Updated weights for policy 0, policy_version 7190 (0.0008) [2023-10-14 01:19:06,878][33226] Updated weights for policy 1, policy_version 7240 (0.0008) [2023-10-14 01:19:07,083][33201] Updated weights for policy 0, policy_version 7200 (0.0010) [2023-10-14 01:19:07,235][33226] Updated weights for policy 1, policy_version 7250 (0.0008) [2023-10-14 01:19:07,609][33226] Updated weights for policy 1, policy_version 7260 (0.0009) [2023-10-14 01:19:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 14811136. Throughput: 0: 1744.3, 1: 1771.3. Samples: 3713486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:09,558][31953] Avg episode reward: [(0, '20.220'), (1, '20.580')] [2023-10-14 01:19:11,082][33201] Updated weights for policy 0, policy_version 7210 (0.0008) [2023-10-14 01:19:11,444][33201] Updated weights for policy 0, policy_version 7220 (0.0008) [2023-10-14 01:19:11,455][33226] Updated weights for policy 1, policy_version 7270 (0.0010) [2023-10-14 01:19:11,815][33201] Updated weights for policy 0, policy_version 7230 (0.0007) [2023-10-14 01:19:11,823][33226] Updated weights for policy 1, policy_version 7280 (0.0008) [2023-10-14 01:19:12,199][33226] Updated weights for policy 1, policy_version 7290 (0.0007) [2023-10-14 01:19:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 14876672. Throughput: 0: 1737.4, 1: 1780.4. Samples: 3723380. Policy #0 lag: (min: 29.0, avg: 35.3, max: 61.0) [2023-10-14 01:19:14,558][31953] Avg episode reward: [(0, '20.220'), (1, '20.600')] [2023-10-14 01:19:15,734][33201] Updated weights for policy 0, policy_version 7240 (0.0009) [2023-10-14 01:19:16,023][33226] Updated weights for policy 1, policy_version 7300 (0.0008) [2023-10-14 01:19:16,106][33201] Updated weights for policy 0, policy_version 7250 (0.0008) [2023-10-14 01:19:16,391][33226] Updated weights for policy 1, policy_version 7310 (0.0009) [2023-10-14 01:19:16,466][33201] Updated weights for policy 0, policy_version 7260 (0.0008) [2023-10-14 01:19:16,762][33226] Updated weights for policy 1, policy_version 7320 (0.0009) [2023-10-14 01:19:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 14942208. Throughput: 0: 1736.7, 1: 1765.4. Samples: 3744680. 
Policy #0 lag: (min: 29.0, avg: 35.3, max: 61.0) [2023-10-14 01:19:19,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.610')] [2023-10-14 01:19:20,412][33201] Updated weights for policy 0, policy_version 7270 (0.0010) [2023-10-14 01:19:20,633][33226] Updated weights for policy 1, policy_version 7330 (0.0008) [2023-10-14 01:19:20,781][33201] Updated weights for policy 0, policy_version 7280 (0.0008) [2023-10-14 01:19:20,994][33226] Updated weights for policy 1, policy_version 7340 (0.0007) [2023-10-14 01:19:21,157][33201] Updated weights for policy 0, policy_version 7290 (0.0007) [2023-10-14 01:19:21,368][33226] Updated weights for policy 1, policy_version 7350 (0.0008) [2023-10-14 01:19:21,743][33226] Updated weights for policy 1, policy_version 7360 (0.0010) [2023-10-14 01:19:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 15007744. Throughput: 0: 1771.4, 1: 1762.9. Samples: 3766966. Policy #0 lag: (min: 1.0, avg: 13.4, max: 33.0) [2023-10-14 01:19:24,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.580')] [2023-10-14 01:19:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000007360_7536640.pth... [2023-10-14 01:19:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000007296_7471104.pth... 
[2023-10-14 01:19:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000005696_5832704.pth [2023-10-14 01:19:24,610][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000005664_5799936.pth [2023-10-14 01:19:24,897][33201] Updated weights for policy 0, policy_version 7300 (0.0009) [2023-10-14 01:19:25,272][33201] Updated weights for policy 0, policy_version 7310 (0.0009) [2023-10-14 01:19:25,495][33226] Updated weights for policy 1, policy_version 7370 (0.0008) [2023-10-14 01:19:25,650][33201] Updated weights for policy 0, policy_version 7320 (0.0008) [2023-10-14 01:19:25,862][33226] Updated weights for policy 1, policy_version 7380 (0.0007) [2023-10-14 01:19:26,233][33226] Updated weights for policy 1, policy_version 7390 (0.0009) [2023-10-14 01:19:29,320][33201] Updated weights for policy 0, policy_version 7330 (0.0008) [2023-10-14 01:19:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 15073280. Throughput: 0: 1742.5, 1: 1763.6. Samples: 3776678. Policy #0 lag: (min: 1.0, avg: 13.4, max: 33.0) [2023-10-14 01:19:29,558][31953] Avg episode reward: [(0, '20.200'), (1, '20.590')] [2023-10-14 01:19:29,703][33201] Updated weights for policy 0, policy_version 7340 (0.0007) [2023-10-14 01:19:30,076][33201] Updated weights for policy 0, policy_version 7350 (0.0008) [2023-10-14 01:19:30,124][33226] Updated weights for policy 1, policy_version 7400 (0.0008) [2023-10-14 01:19:30,444][33201] Updated weights for policy 0, policy_version 7360 (0.0008) [2023-10-14 01:19:30,493][33226] Updated weights for policy 1, policy_version 7410 (0.0007) [2023-10-14 01:19:30,872][33226] Updated weights for policy 1, policy_version 7420 (0.0008) [2023-10-14 01:19:34,360][33201] Updated weights for policy 0, policy_version 7370 (0.0008) [2023-10-14 01:19:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 15138816. Throughput: 0: 1774.1, 1: 1758.8. 
Samples: 3798660. Policy #0 lag: (min: 13.0, avg: 21.7, max: 45.0) [2023-10-14 01:19:34,558][31953] Avg episode reward: [(0, '20.190'), (1, '20.580')] [2023-10-14 01:19:34,695][33226] Updated weights for policy 1, policy_version 7430 (0.0009) [2023-10-14 01:19:34,737][33201] Updated weights for policy 0, policy_version 7380 (0.0008) [2023-10-14 01:19:35,070][33226] Updated weights for policy 1, policy_version 7440 (0.0008) [2023-10-14 01:19:35,115][33201] Updated weights for policy 0, policy_version 7390 (0.0009) [2023-10-14 01:19:35,436][33226] Updated weights for policy 1, policy_version 7450 (0.0008) [2023-10-14 01:19:38,987][33201] Updated weights for policy 0, policy_version 7400 (0.0008) [2023-10-14 01:19:39,240][33226] Updated weights for policy 1, policy_version 7460 (0.0008) [2023-10-14 01:19:39,377][33201] Updated weights for policy 0, policy_version 7410 (0.0008) [2023-10-14 01:19:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 15204352. Throughput: 0: 1769.7, 1: 1790.4. Samples: 3820010. 
Policy #0 lag: (min: 13.0, avg: 21.7, max: 45.0) [2023-10-14 01:19:39,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.550')] [2023-10-14 01:19:39,595][33226] Updated weights for policy 1, policy_version 7470 (0.0008) [2023-10-14 01:19:39,741][33201] Updated weights for policy 0, policy_version 7420 (0.0008) [2023-10-14 01:19:39,959][33226] Updated weights for policy 1, policy_version 7480 (0.0008) [2023-10-14 01:19:43,403][33201] Updated weights for policy 0, policy_version 7430 (0.0008) [2023-10-14 01:19:43,680][33226] Updated weights for policy 1, policy_version 7490 (0.0008) [2023-10-14 01:19:43,772][33201] Updated weights for policy 0, policy_version 7440 (0.0007) [2023-10-14 01:19:44,043][33226] Updated weights for policy 1, policy_version 7500 (0.0007) [2023-10-14 01:19:44,142][33201] Updated weights for policy 0, policy_version 7450 (0.0007) [2023-10-14 01:19:44,407][33226] Updated weights for policy 1, policy_version 7510 (0.0007) [2023-10-14 01:19:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 15302656. Throughput: 0: 1762.8, 1: 1759.2. Samples: 3830196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:44,557][31953] Avg episode reward: [(0, '20.240'), (1, '20.530')] [2023-10-14 01:19:44,778][33226] Updated weights for policy 1, policy_version 7520 (0.0008) [2023-10-14 01:19:47,963][33201] Updated weights for policy 0, policy_version 7460 (0.0008) [2023-10-14 01:19:48,337][33201] Updated weights for policy 0, policy_version 7470 (0.0009) [2023-10-14 01:19:48,699][33201] Updated weights for policy 0, policy_version 7480 (0.0009) [2023-10-14 01:19:48,784][33226] Updated weights for policy 1, policy_version 7530 (0.0008) [2023-10-14 01:19:49,152][33226] Updated weights for policy 1, policy_version 7540 (0.0008) [2023-10-14 01:19:49,522][33226] Updated weights for policy 1, policy_version 7550 (0.0007) [2023-10-14 01:19:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 15368192. Throughput: 0: 1776.9, 1: 1780.9. Samples: 3851790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:49,558][31953] Avg episode reward: [(0, '20.390'), (1, '20.570')] [2023-10-14 01:19:49,558][32837] Saving new best policy, reward=20.390! [2023-10-14 01:19:52,475][33201] Updated weights for policy 0, policy_version 7490 (0.0009) [2023-10-14 01:19:52,857][33201] Updated weights for policy 0, policy_version 7500 (0.0008) [2023-10-14 01:19:53,235][33201] Updated weights for policy 0, policy_version 7510 (0.0008) [2023-10-14 01:19:53,268][33226] Updated weights for policy 1, policy_version 7560 (0.0007) [2023-10-14 01:19:53,608][33201] Updated weights for policy 0, policy_version 7520 (0.0008) [2023-10-14 01:19:53,627][33226] Updated weights for policy 1, policy_version 7570 (0.0008) [2023-10-14 01:19:54,001][33226] Updated weights for policy 1, policy_version 7580 (0.0008) [2023-10-14 01:19:54,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 15466496. Throughput: 0: 1753.8, 1: 1760.7. Samples: 3871640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:54,558][31953] Avg episode reward: [(0, '20.390'), (1, '20.540')] [2023-10-14 01:19:57,412][33201] Updated weights for policy 0, policy_version 7530 (0.0007) [2023-10-14 01:19:57,755][33226] Updated weights for policy 1, policy_version 7590 (0.0007) [2023-10-14 01:19:57,777][33201] Updated weights for policy 0, policy_version 7540 (0.0007) [2023-10-14 01:19:58,129][33226] Updated weights for policy 1, policy_version 7600 (0.0008) [2023-10-14 01:19:58,149][33201] Updated weights for policy 0, policy_version 7550 (0.0007) [2023-10-14 01:19:58,497][33226] Updated weights for policy 1, policy_version 7610 (0.0008) [2023-10-14 01:19:59,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 15532032. Throughput: 0: 1789.2, 1: 1777.7. Samples: 3883892. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:19:59,558][31953] Avg episode reward: [(0, '20.420'), (1, '20.530')] [2023-10-14 01:19:59,559][32837] Saving new best policy, reward=20.420! [2023-10-14 01:20:02,061][33201] Updated weights for policy 0, policy_version 7560 (0.0007) [2023-10-14 01:20:02,400][33226] Updated weights for policy 1, policy_version 7620 (0.0008) [2023-10-14 01:20:02,425][33201] Updated weights for policy 0, policy_version 7570 (0.0007) [2023-10-14 01:20:02,767][33226] Updated weights for policy 1, policy_version 7630 (0.0009) [2023-10-14 01:20:02,791][33201] Updated weights for policy 0, policy_version 7580 (0.0008) [2023-10-14 01:20:03,128][33226] Updated weights for policy 1, policy_version 7640 (0.0009) [2023-10-14 01:20:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 15597568. Throughput: 0: 1763.6, 1: 1772.6. Samples: 3903806. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:04,557][31953] Avg episode reward: [(0, '20.470'), (1, '20.540')] [2023-10-14 01:20:04,558][32837] Saving new best policy, reward=20.470! 
[2023-10-14 01:20:06,666][33201] Updated weights for policy 0, policy_version 7590 (0.0008) [2023-10-14 01:20:06,863][33226] Updated weights for policy 1, policy_version 7650 (0.0009) [2023-10-14 01:20:07,040][33201] Updated weights for policy 0, policy_version 7600 (0.0007) [2023-10-14 01:20:07,233][33226] Updated weights for policy 1, policy_version 7660 (0.0007) [2023-10-14 01:20:07,422][33201] Updated weights for policy 0, policy_version 7610 (0.0007) [2023-10-14 01:20:07,597][33226] Updated weights for policy 1, policy_version 7670 (0.0009) [2023-10-14 01:20:07,971][33226] Updated weights for policy 1, policy_version 7680 (0.0008) [2023-10-14 01:20:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 15663104. Throughput: 0: 1752.1, 1: 1763.8. Samples: 3925182. Policy #0 lag: (min: 28.0, avg: 34.3, max: 60.0) [2023-10-14 01:20:09,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.540')] [2023-10-14 01:20:09,567][32837] Saving new best policy, reward=20.480! [2023-10-14 01:20:11,343][33201] Updated weights for policy 0, policy_version 7620 (0.0008) [2023-10-14 01:20:11,713][33201] Updated weights for policy 0, policy_version 7630 (0.0007) [2023-10-14 01:20:11,843][33226] Updated weights for policy 1, policy_version 7690 (0.0007) [2023-10-14 01:20:12,089][33201] Updated weights for policy 0, policy_version 7640 (0.0007) [2023-10-14 01:20:12,210][33226] Updated weights for policy 1, policy_version 7700 (0.0007) [2023-10-14 01:20:12,571][33226] Updated weights for policy 1, policy_version 7710 (0.0008) [2023-10-14 01:20:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 15728640. Throughput: 0: 1756.9, 1: 1776.9. Samples: 3935698. 
Policy #0 lag: (min: 28.0, avg: 34.3, max: 60.0) [2023-10-14 01:20:14,557][31953] Avg episode reward: [(0, '20.480'), (1, '20.470')] [2023-10-14 01:20:15,946][33201] Updated weights for policy 0, policy_version 7650 (0.0008) [2023-10-14 01:20:16,323][33201] Updated weights for policy 0, policy_version 7660 (0.0007) [2023-10-14 01:20:16,546][33226] Updated weights for policy 1, policy_version 7720 (0.0007) [2023-10-14 01:20:16,691][33201] Updated weights for policy 0, policy_version 7670 (0.0007) [2023-10-14 01:20:16,926][33226] Updated weights for policy 1, policy_version 7730 (0.0008) [2023-10-14 01:20:17,057][33201] Updated weights for policy 0, policy_version 7680 (0.0007) [2023-10-14 01:20:17,290][33226] Updated weights for policy 1, policy_version 7740 (0.0009) [2023-10-14 01:20:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 15794176. Throughput: 0: 1744.6, 1: 1755.6. Samples: 3956168. Policy #0 lag: (min: 25.0, avg: 40.5, max: 57.0) [2023-10-14 01:20:19,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.480')] [2023-10-14 01:20:20,897][33201] Updated weights for policy 0, policy_version 7690 (0.0008) [2023-10-14 01:20:21,148][33226] Updated weights for policy 1, policy_version 7750 (0.0008) [2023-10-14 01:20:21,265][33201] Updated weights for policy 0, policy_version 7700 (0.0008) [2023-10-14 01:20:21,509][33226] Updated weights for policy 1, policy_version 7760 (0.0008) [2023-10-14 01:20:21,633][33201] Updated weights for policy 0, policy_version 7710 (0.0009) [2023-10-14 01:20:21,873][33226] Updated weights for policy 1, policy_version 7770 (0.0007) [2023-10-14 01:20:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 15859712. Throughput: 0: 1761.4, 1: 1752.8. Samples: 3978150. 
Policy #0 lag: (min: 25.0, avg: 40.5, max: 57.0) [2023-10-14 01:20:24,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.480')] [2023-10-14 01:20:24,572][32837] Saving new best policy, reward=20.490! [2023-10-14 01:20:25,495][33201] Updated weights for policy 0, policy_version 7720 (0.0007) [2023-10-14 01:20:25,632][33226] Updated weights for policy 1, policy_version 7780 (0.0009) [2023-10-14 01:20:25,871][33201] Updated weights for policy 0, policy_version 7730 (0.0007) [2023-10-14 01:20:25,999][33226] Updated weights for policy 1, policy_version 7790 (0.0008) [2023-10-14 01:20:26,243][33201] Updated weights for policy 0, policy_version 7740 (0.0007) [2023-10-14 01:20:26,371][33226] Updated weights for policy 1, policy_version 7800 (0.0007) [2023-10-14 01:20:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 15925248. Throughput: 0: 1746.4, 1: 1753.7. Samples: 3987702. Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) [2023-10-14 01:20:29,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.470')] [2023-10-14 01:20:29,968][33201] Updated weights for policy 0, policy_version 7750 (0.0009) [2023-10-14 01:20:30,178][33226] Updated weights for policy 1, policy_version 7810 (0.0010) [2023-10-14 01:20:30,338][33201] Updated weights for policy 0, policy_version 7760 (0.0009) [2023-10-14 01:20:30,548][33226] Updated weights for policy 1, policy_version 7820 (0.0008) [2023-10-14 01:20:30,716][33201] Updated weights for policy 0, policy_version 7770 (0.0007) [2023-10-14 01:20:30,919][33226] Updated weights for policy 1, policy_version 7830 (0.0007) [2023-10-14 01:20:31,282][33226] Updated weights for policy 1, policy_version 7840 (0.0010) [2023-10-14 01:20:34,551][33201] Updated weights for policy 0, policy_version 7780 (0.0009) [2023-10-14 01:20:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 15990784. Throughput: 0: 1754.0, 1: 1758.5. Samples: 4009856. 
Policy #0 lag: (min: 31.0, avg: 32.6, max: 58.0) [2023-10-14 01:20:34,558][31953] Avg episode reward: [(0, '20.510'), (1, '20.480')] [2023-10-14 01:20:34,924][33201] Updated weights for policy 0, policy_version 7790 (0.0009) [2023-10-14 01:20:35,133][33226] Updated weights for policy 1, policy_version 7850 (0.0009) [2023-10-14 01:20:35,293][33201] Updated weights for policy 0, policy_version 7800 (0.0009) [2023-10-14 01:20:35,509][33226] Updated weights for policy 1, policy_version 7860 (0.0008) [2023-10-14 01:20:35,585][32837] Saving new best policy, reward=20.510! [2023-10-14 01:20:35,876][33226] Updated weights for policy 1, policy_version 7870 (0.0008) [2023-10-14 01:20:39,104][33201] Updated weights for policy 0, policy_version 7810 (0.0008) [2023-10-14 01:20:39,468][33201] Updated weights for policy 0, policy_version 7820 (0.0007) [2023-10-14 01:20:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 16056320. Throughput: 0: 1782.3, 1: 1778.6. Samples: 4031878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:39,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.480')] [2023-10-14 01:20:39,818][33226] Updated weights for policy 1, policy_version 7880 (0.0009) [2023-10-14 01:20:39,839][33201] Updated weights for policy 0, policy_version 7830 (0.0008) [2023-10-14 01:20:40,189][33226] Updated weights for policy 1, policy_version 7890 (0.0008) [2023-10-14 01:20:40,202][32837] Saving new best policy, reward=20.520! 
[2023-10-14 01:20:40,203][33201] Updated weights for policy 0, policy_version 7840 (0.0007) [2023-10-14 01:20:40,560][33226] Updated weights for policy 1, policy_version 7900 (0.0008) [2023-10-14 01:20:43,954][33201] Updated weights for policy 0, policy_version 7850 (0.0010) [2023-10-14 01:20:44,253][33226] Updated weights for policy 1, policy_version 7910 (0.0008) [2023-10-14 01:20:44,328][33201] Updated weights for policy 0, policy_version 7860 (0.0007) [2023-10-14 01:20:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 16121856. Throughput: 0: 1749.2, 1: 1753.5. Samples: 4041516. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:44,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.460')] [2023-10-14 01:20:44,623][33226] Updated weights for policy 1, policy_version 7920 (0.0008) [2023-10-14 01:20:44,697][33201] Updated weights for policy 0, policy_version 7870 (0.0007) [2023-10-14 01:20:44,767][32837] Saving new best policy, reward=20.530! [2023-10-14 01:20:44,993][33226] Updated weights for policy 1, policy_version 7930 (0.0009) [2023-10-14 01:20:48,519][33201] Updated weights for policy 0, policy_version 7880 (0.0008) [2023-10-14 01:20:48,895][33201] Updated weights for policy 0, policy_version 7890 (0.0008) [2023-10-14 01:20:48,998][33226] Updated weights for policy 1, policy_version 7940 (0.0010) [2023-10-14 01:20:49,270][33201] Updated weights for policy 0, policy_version 7900 (0.0007) [2023-10-14 01:20:49,362][33226] Updated weights for policy 1, policy_version 7950 (0.0008) [2023-10-14 01:20:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 16220160. Throughput: 0: 1777.2, 1: 1768.9. Samples: 4063378. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:49,557][31953] Avg episode reward: [(0, '20.550'), (1, '20.460')] [2023-10-14 01:20:49,558][32837] Saving new best policy, reward=20.550! 
[2023-10-14 01:20:49,732][33226] Updated weights for policy 1, policy_version 7960 (0.0008) [2023-10-14 01:20:53,055][33201] Updated weights for policy 0, policy_version 7910 (0.0009) [2023-10-14 01:20:53,416][33201] Updated weights for policy 0, policy_version 7920 (0.0010) [2023-10-14 01:20:53,572][33226] Updated weights for policy 1, policy_version 7970 (0.0007) [2023-10-14 01:20:53,788][33201] Updated weights for policy 0, policy_version 7930 (0.0009) [2023-10-14 01:20:53,942][33226] Updated weights for policy 1, policy_version 7980 (0.0008) [2023-10-14 01:20:54,311][33226] Updated weights for policy 1, policy_version 7990 (0.0007) [2023-10-14 01:20:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 16285696. Throughput: 0: 1753.3, 1: 1764.4. Samples: 4083480. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:54,559][31953] Avg episode reward: [(0, '20.610'), (1, '20.470')] [2023-10-14 01:20:54,573][32837] Saving new best policy, reward=20.610! [2023-10-14 01:20:54,674][33226] Updated weights for policy 1, policy_version 8000 (0.0008) [2023-10-14 01:20:57,644][33201] Updated weights for policy 0, policy_version 7940 (0.0009) [2023-10-14 01:20:58,019][33201] Updated weights for policy 0, policy_version 7950 (0.0009) [2023-10-14 01:20:58,357][33226] Updated weights for policy 1, policy_version 8010 (0.0007) [2023-10-14 01:20:58,394][33201] Updated weights for policy 0, policy_version 7960 (0.0007) [2023-10-14 01:20:58,738][33226] Updated weights for policy 1, policy_version 8020 (0.0008) [2023-10-14 01:20:59,109][33226] Updated weights for policy 1, policy_version 8030 (0.0009) [2023-10-14 01:20:59,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 16384000. Throughput: 0: 1776.7, 1: 1764.3. Samples: 4095042. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:20:59,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.450')] [2023-10-14 01:21:02,348][33201] Updated weights for policy 0, policy_version 7970 (0.0007) [2023-10-14 01:21:02,722][33201] Updated weights for policy 0, policy_version 7980 (0.0007) [2023-10-14 01:21:02,840][33226] Updated weights for policy 1, policy_version 8040 (0.0007) [2023-10-14 01:21:03,098][33201] Updated weights for policy 0, policy_version 7990 (0.0007) [2023-10-14 01:21:03,216][33226] Updated weights for policy 1, policy_version 8050 (0.0008) [2023-10-14 01:21:03,472][33201] Updated weights for policy 0, policy_version 8000 (0.0008) [2023-10-14 01:21:03,586][33226] Updated weights for policy 1, policy_version 8060 (0.0009) [2023-10-14 01:21:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 16449536. Throughput: 0: 1761.5, 1: 1779.5. Samples: 4115514. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) [2023-10-14 01:21:04,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.450')] [2023-10-14 01:21:04,560][32837] Saving new best policy, reward=20.620! [2023-10-14 01:21:07,211][33201] Updated weights for policy 0, policy_version 8010 (0.0009) [2023-10-14 01:21:07,378][33226] Updated weights for policy 1, policy_version 8070 (0.0008) [2023-10-14 01:21:07,582][33201] Updated weights for policy 0, policy_version 8020 (0.0007) [2023-10-14 01:21:07,748][33226] Updated weights for policy 1, policy_version 8080 (0.0007) [2023-10-14 01:21:07,967][33201] Updated weights for policy 0, policy_version 8030 (0.0008) [2023-10-14 01:21:08,102][33226] Updated weights for policy 1, policy_version 8090 (0.0008) [2023-10-14 01:21:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 16515072. Throughput: 0: 1754.2, 1: 1760.9. Samples: 4136330. 
Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) [2023-10-14 01:21:09,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.440')] [2023-10-14 01:21:11,897][33226] Updated weights for policy 1, policy_version 8100 (0.0008) [2023-10-14 01:21:12,022][33201] Updated weights for policy 0, policy_version 8040 (0.0008) [2023-10-14 01:21:12,262][33226] Updated weights for policy 1, policy_version 8110 (0.0008) [2023-10-14 01:21:12,406][33201] Updated weights for policy 0, policy_version 8050 (0.0008) [2023-10-14 01:21:12,620][33226] Updated weights for policy 1, policy_version 8120 (0.0007) [2023-10-14 01:21:12,778][33201] Updated weights for policy 0, policy_version 8060 (0.0008) [2023-10-14 01:21:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 16580608. Throughput: 0: 1773.2, 1: 1786.4. Samples: 4147880. Policy #0 lag: (min: 21.0, avg: 28.6, max: 53.0) [2023-10-14 01:21:14,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.450')] [2023-10-14 01:21:16,477][33201] Updated weights for policy 0, policy_version 8070 (0.0008) [2023-10-14 01:21:16,548][33226] Updated weights for policy 1, policy_version 8130 (0.0008) [2023-10-14 01:21:16,855][33201] Updated weights for policy 0, policy_version 8080 (0.0010) [2023-10-14 01:21:16,914][33226] Updated weights for policy 1, policy_version 8140 (0.0009) [2023-10-14 01:21:17,229][33201] Updated weights for policy 0, policy_version 8090 (0.0008) [2023-10-14 01:21:17,282][33226] Updated weights for policy 1, policy_version 8150 (0.0007) [2023-10-14 01:21:17,650][33226] Updated weights for policy 1, policy_version 8160 (0.0008) [2023-10-14 01:21:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 16646144. Throughput: 0: 1751.7, 1: 1754.3. Samples: 4167626. 
Policy #0 lag: (min: 21.0, avg: 28.6, max: 53.0) [2023-10-14 01:21:19,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.460')] [2023-10-14 01:21:20,973][33201] Updated weights for policy 0, policy_version 8100 (0.0009) [2023-10-14 01:21:21,341][33201] Updated weights for policy 0, policy_version 8110 (0.0008) [2023-10-14 01:21:21,350][33226] Updated weights for policy 1, policy_version 8170 (0.0008) [2023-10-14 01:21:21,715][33201] Updated weights for policy 0, policy_version 8120 (0.0009) [2023-10-14 01:21:21,727][33226] Updated weights for policy 1, policy_version 8180 (0.0008) [2023-10-14 01:21:22,097][33226] Updated weights for policy 1, policy_version 8190 (0.0008) [2023-10-14 01:21:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 16711680. Throughput: 0: 1744.0, 1: 1760.0. Samples: 4189562. Policy #0 lag: (min: 2.0, avg: 2.4, max: 13.0) [2023-10-14 01:21:24,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.440')] [2023-10-14 01:21:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000008192_8388608.pth... [2023-10-14 01:21:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000008128_8323072.pth... [2023-10-14 01:21:24,609][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000006528_6684672.pth [2023-10-14 01:21:24,609][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000006496_6651904.pth [2023-10-14 01:21:24,614][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000008192_8388608.pth [2023-10-14 01:21:24,615][32837] Saving new best policy, reward=20.630! 
[2023-10-14 01:21:24,659][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000008128_8323072.pth [2023-10-14 01:21:25,624][33201] Updated weights for policy 0, policy_version 8130 (0.0008) [2023-10-14 01:21:25,809][33226] Updated weights for policy 1, policy_version 8200 (0.0007) [2023-10-14 01:21:25,997][33201] Updated weights for policy 0, policy_version 8140 (0.0008) [2023-10-14 01:21:26,182][33226] Updated weights for policy 1, policy_version 8210 (0.0008) [2023-10-14 01:21:26,366][33201] Updated weights for policy 0, policy_version 8150 (0.0008) [2023-10-14 01:21:26,557][33226] Updated weights for policy 1, policy_version 8220 (0.0009) [2023-10-14 01:21:26,742][33201] Updated weights for policy 0, policy_version 8160 (0.0009) [2023-10-14 01:21:29,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 16777216. Throughput: 0: 1743.5, 1: 1758.7. Samples: 4199116. Policy #0 lag: (min: 2.0, avg: 2.4, max: 13.0) [2023-10-14 01:21:29,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.490')] [2023-10-14 01:21:29,559][32837] Saving new best policy, reward=20.640! [2023-10-14 01:21:30,470][33226] Updated weights for policy 1, policy_version 8230 (0.0009) [2023-10-14 01:21:30,656][33201] Updated weights for policy 0, policy_version 8170 (0.0007) [2023-10-14 01:21:30,839][33226] Updated weights for policy 1, policy_version 8240 (0.0007) [2023-10-14 01:21:31,028][33201] Updated weights for policy 0, policy_version 8180 (0.0009) [2023-10-14 01:21:31,210][33226] Updated weights for policy 1, policy_version 8250 (0.0008) [2023-10-14 01:21:31,396][33201] Updated weights for policy 0, policy_version 8190 (0.0008) [2023-10-14 01:21:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 16842752. Throughput: 0: 1740.9, 1: 1764.0. Samples: 4221100. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 01:21:34,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.490')] [2023-10-14 01:21:34,560][32837] Saving new best policy, reward=20.690! [2023-10-14 01:21:35,052][33226] Updated weights for policy 1, policy_version 8260 (0.0010) [2023-10-14 01:21:35,364][33201] Updated weights for policy 0, policy_version 8200 (0.0007) [2023-10-14 01:21:35,410][33226] Updated weights for policy 1, policy_version 8270 (0.0007) [2023-10-14 01:21:35,734][33201] Updated weights for policy 0, policy_version 8210 (0.0008) [2023-10-14 01:21:35,782][33226] Updated weights for policy 1, policy_version 8280 (0.0007) [2023-10-14 01:21:36,117][33201] Updated weights for policy 0, policy_version 8220 (0.0007) [2023-10-14 01:21:39,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 16908288. Throughput: 0: 1767.1, 1: 1780.0. Samples: 4243100. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 01:21:39,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.490')] [2023-10-14 01:21:39,645][33226] Updated weights for policy 1, policy_version 8290 (0.0008) [2023-10-14 01:21:39,918][33201] Updated weights for policy 0, policy_version 8230 (0.0009) [2023-10-14 01:21:40,008][33226] Updated weights for policy 1, policy_version 8300 (0.0008) [2023-10-14 01:21:40,283][33201] Updated weights for policy 0, policy_version 8240 (0.0007) [2023-10-14 01:21:40,369][33226] Updated weights for policy 1, policy_version 8310 (0.0009) [2023-10-14 01:21:40,659][33201] Updated weights for policy 0, policy_version 8250 (0.0007) [2023-10-14 01:21:40,737][33226] Updated weights for policy 1, policy_version 8320 (0.0009) [2023-10-14 01:21:44,501][33226] Updated weights for policy 1, policy_version 8330 (0.0009) [2023-10-14 01:21:44,503][33201] Updated weights for policy 0, policy_version 8260 (0.0008) [2023-10-14 01:21:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). 
Total num frames: 16973824. Throughput: 0: 1736.7, 1: 1764.4. Samples: 4252588. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 01:21:44,557][31953] Avg episode reward: [(0, '20.690'), (1, '20.530')] [2023-10-14 01:21:44,874][33201] Updated weights for policy 0, policy_version 8270 (0.0007) [2023-10-14 01:21:44,875][33226] Updated weights for policy 1, policy_version 8340 (0.0008) [2023-10-14 01:21:45,243][33226] Updated weights for policy 1, policy_version 8350 (0.0010) [2023-10-14 01:21:45,256][33201] Updated weights for policy 0, policy_version 8280 (0.0007) [2023-10-14 01:21:49,158][33226] Updated weights for policy 1, policy_version 8360 (0.0008) [2023-10-14 01:21:49,234][33201] Updated weights for policy 0, policy_version 8290 (0.0009) [2023-10-14 01:21:49,538][33226] Updated weights for policy 1, policy_version 8370 (0.0009) [2023-10-14 01:21:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 17039360. Throughput: 0: 1758.7, 1: 1776.9. Samples: 4274618. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 01:21:49,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.510')] [2023-10-14 01:21:49,605][33201] Updated weights for policy 0, policy_version 8300 (0.0008) [2023-10-14 01:21:49,906][33226] Updated weights for policy 1, policy_version 8380 (0.0008) [2023-10-14 01:21:49,963][33201] Updated weights for policy 0, policy_version 8310 (0.0008) [2023-10-14 01:21:50,338][33201] Updated weights for policy 0, policy_version 8320 (0.0008) [2023-10-14 01:21:53,743][33226] Updated weights for policy 1, policy_version 8390 (0.0009) [2023-10-14 01:21:54,042][33201] Updated weights for policy 0, policy_version 8330 (0.0008) [2023-10-14 01:21:54,113][33226] Updated weights for policy 1, policy_version 8400 (0.0008) [2023-10-14 01:21:54,421][33201] Updated weights for policy 0, policy_version 8340 (0.0010) [2023-10-14 01:21:54,488][33226] Updated weights for policy 1, policy_version 8410 (0.0007) [2023-10-14 01:21:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 13995.8). Total num frames: 17104896. Throughput: 0: 1752.4, 1: 1779.3. Samples: 4295254. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:21:54,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.520')] [2023-10-14 01:21:54,790][33201] Updated weights for policy 0, policy_version 8350 (0.0009) [2023-10-14 01:21:54,857][32837] Saving new best policy, reward=20.720! 
[2023-10-14 01:21:58,173][33226] Updated weights for policy 1, policy_version 8420 (0.0007) [2023-10-14 01:21:58,545][33226] Updated weights for policy 1, policy_version 8430 (0.0007) [2023-10-14 01:21:58,721][33201] Updated weights for policy 0, policy_version 8360 (0.0009) [2023-10-14 01:21:58,920][33226] Updated weights for policy 1, policy_version 8440 (0.0007) [2023-10-14 01:21:59,105][33201] Updated weights for policy 0, policy_version 8370 (0.0009) [2023-10-14 01:21:59,489][33201] Updated weights for policy 0, policy_version 8380 (0.0008) [2023-10-14 01:21:59,557][31953] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 17203200. Throughput: 0: 1747.4, 1: 1761.7. Samples: 4305790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:21:59,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.540')] [2023-10-14 01:22:02,782][33226] Updated weights for policy 1, policy_version 8450 (0.0007) [2023-10-14 01:22:03,158][33226] Updated weights for policy 1, policy_version 8460 (0.0008) [2023-10-14 01:22:03,515][33201] Updated weights for policy 0, policy_version 8390 (0.0009) [2023-10-14 01:22:03,527][33226] Updated weights for policy 1, policy_version 8470 (0.0007) [2023-10-14 01:22:03,886][33201] Updated weights for policy 0, policy_version 8400 (0.0008) [2023-10-14 01:22:03,898][33226] Updated weights for policy 1, policy_version 8480 (0.0009) [2023-10-14 01:22:04,259][33201] Updated weights for policy 0, policy_version 8410 (0.0008) [2023-10-14 01:22:04,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 17301504. Throughput: 0: 1759.6, 1: 1788.8. Samples: 4327308. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:22:04,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.540')] [2023-10-14 01:22:07,747][33226] Updated weights for policy 1, policy_version 8490 (0.0008) [2023-10-14 01:22:08,111][33226] Updated weights for policy 1, policy_version 8500 (0.0008) [2023-10-14 01:22:08,121][33201] Updated weights for policy 0, policy_version 8420 (0.0009) [2023-10-14 01:22:08,488][33226] Updated weights for policy 1, policy_version 8510 (0.0009) [2023-10-14 01:22:08,494][33201] Updated weights for policy 0, policy_version 8430 (0.0008) [2023-10-14 01:22:08,867][33201] Updated weights for policy 0, policy_version 8440 (0.0009) [2023-10-14 01:22:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 17367040. Throughput: 0: 1732.0, 1: 1760.4. Samples: 4346724. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:22:09,560][31953] Avg episode reward: [(0, '20.710'), (1, '20.540')] [2023-10-14 01:22:12,365][33226] Updated weights for policy 1, policy_version 8520 (0.0008) [2023-10-14 01:22:12,543][33201] Updated weights for policy 0, policy_version 8450 (0.0008) [2023-10-14 01:22:12,736][33226] Updated weights for policy 1, policy_version 8530 (0.0009) [2023-10-14 01:22:12,912][33201] Updated weights for policy 0, policy_version 8460 (0.0007) [2023-10-14 01:22:13,104][33226] Updated weights for policy 1, policy_version 8540 (0.0008) [2023-10-14 01:22:13,289][33201] Updated weights for policy 0, policy_version 8470 (0.0007) [2023-10-14 01:22:13,660][33201] Updated weights for policy 0, policy_version 8480 (0.0009) [2023-10-14 01:22:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 17432576. Throughput: 0: 1765.4, 1: 1790.8. Samples: 4359146. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:22:14,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.590')] [2023-10-14 01:22:16,678][33226] Updated weights for policy 1, policy_version 8550 (0.0009) [2023-10-14 01:22:17,055][33226] Updated weights for policy 1, policy_version 8560 (0.0010) [2023-10-14 01:22:17,425][33226] Updated weights for policy 1, policy_version 8570 (0.0009) [2023-10-14 01:22:17,539][33201] Updated weights for policy 0, policy_version 8490 (0.0007) [2023-10-14 01:22:17,903][33201] Updated weights for policy 0, policy_version 8500 (0.0008) [2023-10-14 01:22:18,283][33201] Updated weights for policy 0, policy_version 8510 (0.0008) [2023-10-14 01:22:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 17498112. Throughput: 0: 1742.4, 1: 1761.9. Samples: 4378792. Policy #0 lag: (min: 26.0, avg: 27.5, max: 52.0) [2023-10-14 01:22:19,559][31953] Avg episode reward: [(0, '20.670'), (1, '20.610')] [2023-10-14 01:22:21,246][33226] Updated weights for policy 1, policy_version 8580 (0.0009) [2023-10-14 01:22:21,608][33226] Updated weights for policy 1, policy_version 8590 (0.0008) [2023-10-14 01:22:21,973][33226] Updated weights for policy 1, policy_version 8600 (0.0008) [2023-10-14 01:22:22,131][33201] Updated weights for policy 0, policy_version 8520 (0.0008) [2023-10-14 01:22:22,503][33201] Updated weights for policy 0, policy_version 8530 (0.0008) [2023-10-14 01:22:22,869][33201] Updated weights for policy 0, policy_version 8540 (0.0008) [2023-10-14 01:22:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 17563648. Throughput: 0: 1737.5, 1: 1760.1. Samples: 4400492. 
Policy #0 lag: (min: 26.0, avg: 27.5, max: 52.0) [2023-10-14 01:22:24,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.600')] [2023-10-14 01:22:25,835][33226] Updated weights for policy 1, policy_version 8610 (0.0007) [2023-10-14 01:22:26,206][33226] Updated weights for policy 1, policy_version 8620 (0.0009) [2023-10-14 01:22:26,563][33226] Updated weights for policy 1, policy_version 8630 (0.0009) [2023-10-14 01:22:26,696][33201] Updated weights for policy 0, policy_version 8550 (0.0007) [2023-10-14 01:22:26,933][33226] Updated weights for policy 1, policy_version 8640 (0.0008) [2023-10-14 01:22:27,070][33201] Updated weights for policy 0, policy_version 8560 (0.0007) [2023-10-14 01:22:27,451][33201] Updated weights for policy 0, policy_version 8570 (0.0011) [2023-10-14 01:22:29,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 17629184. Throughput: 0: 1758.0, 1: 1759.3. Samples: 4410870. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 01:22:29,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.570')] [2023-10-14 01:22:30,952][33226] Updated weights for policy 1, policy_version 8650 (0.0008) [2023-10-14 01:22:31,115][33201] Updated weights for policy 0, policy_version 8580 (0.0010) [2023-10-14 01:22:31,312][33226] Updated weights for policy 1, policy_version 8660 (0.0008) [2023-10-14 01:22:31,483][33201] Updated weights for policy 0, policy_version 8590 (0.0009) [2023-10-14 01:22:31,685][33226] Updated weights for policy 1, policy_version 8670 (0.0008) [2023-10-14 01:22:31,860][33201] Updated weights for policy 0, policy_version 8600 (0.0008) [2023-10-14 01:22:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 17694720. Throughput: 0: 1744.4, 1: 1753.6. Samples: 4432030. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 01:22:34,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.560')] [2023-10-14 01:22:35,627][33226] Updated weights for policy 1, policy_version 8680 (0.0009) [2023-10-14 01:22:35,724][33201] Updated weights for policy 0, policy_version 8610 (0.0007) [2023-10-14 01:22:35,999][33226] Updated weights for policy 1, policy_version 8690 (0.0008) [2023-10-14 01:22:36,091][33201] Updated weights for policy 0, policy_version 8620 (0.0008) [2023-10-14 01:22:36,370][33226] Updated weights for policy 1, policy_version 8700 (0.0007) [2023-10-14 01:22:36,469][33201] Updated weights for policy 0, policy_version 8630 (0.0007) [2023-10-14 01:22:36,837][33201] Updated weights for policy 0, policy_version 8640 (0.0008) [2023-10-14 01:22:39,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 17760256. Throughput: 0: 1755.5, 1: 1774.6. Samples: 4454110. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 01:22:39,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.590')] [2023-10-14 01:22:39,987][33226] Updated weights for policy 1, policy_version 8710 (0.0007) [2023-10-14 01:22:40,354][33226] Updated weights for policy 1, policy_version 8720 (0.0008) [2023-10-14 01:22:40,722][33226] Updated weights for policy 1, policy_version 8730 (0.0008) [2023-10-14 01:22:40,795][33201] Updated weights for policy 0, policy_version 8650 (0.0007) [2023-10-14 01:22:41,177][33201] Updated weights for policy 0, policy_version 8660 (0.0008) [2023-10-14 01:22:41,548][33201] Updated weights for policy 0, policy_version 8670 (0.0008) [2023-10-14 01:22:44,536][33226] Updated weights for policy 1, policy_version 8740 (0.0008) [2023-10-14 01:22:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 17825792. Throughput: 0: 1741.4, 1: 1765.4. Samples: 4463596. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 01:22:44,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.570')] [2023-10-14 01:22:44,911][33226] Updated weights for policy 1, policy_version 8750 (0.0009) [2023-10-14 01:22:45,269][33226] Updated weights for policy 1, policy_version 8760 (0.0007) [2023-10-14 01:22:45,436][33201] Updated weights for policy 0, policy_version 8680 (0.0009) [2023-10-14 01:22:45,810][33201] Updated weights for policy 0, policy_version 8690 (0.0008) [2023-10-14 01:22:46,184][33201] Updated weights for policy 0, policy_version 8700 (0.0010) [2023-10-14 01:22:49,197][33226] Updated weights for policy 1, policy_version 8770 (0.0008) [2023-10-14 01:22:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 17891328. Throughput: 0: 1746.9, 1: 1765.3. Samples: 4485360. Policy #0 lag: (min: 15.0, avg: 20.8, max: 47.0) [2023-10-14 01:22:49,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.580')] [2023-10-14 01:22:49,567][33226] Updated weights for policy 1, policy_version 8780 (0.0007) [2023-10-14 01:22:49,938][33226] Updated weights for policy 1, policy_version 8790 (0.0008) [2023-10-14 01:22:49,975][33201] Updated weights for policy 0, policy_version 8710 (0.0010) [2023-10-14 01:22:50,315][33226] Updated weights for policy 1, policy_version 8800 (0.0008) [2023-10-14 01:22:50,344][33201] Updated weights for policy 0, policy_version 8720 (0.0008) [2023-10-14 01:22:50,727][33201] Updated weights for policy 0, policy_version 8730 (0.0009) [2023-10-14 01:22:54,103][33226] Updated weights for policy 1, policy_version 8810 (0.0009) [2023-10-14 01:22:54,476][33226] Updated weights for policy 1, policy_version 8820 (0.0009) [2023-10-14 01:22:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 17956864. Throughput: 0: 1776.8, 1: 1781.5. Samples: 4506846. 
Policy #0 lag: (min: 15.0, avg: 20.8, max: 47.0) [2023-10-14 01:22:54,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.590')] [2023-10-14 01:22:54,762][33201] Updated weights for policy 0, policy_version 8740 (0.0010) [2023-10-14 01:22:54,852][33226] Updated weights for policy 1, policy_version 8830 (0.0009) [2023-10-14 01:22:55,124][33201] Updated weights for policy 0, policy_version 8750 (0.0009) [2023-10-14 01:22:55,493][33201] Updated weights for policy 0, policy_version 8760 (0.0010) [2023-10-14 01:22:58,574][33226] Updated weights for policy 1, policy_version 8840 (0.0009) [2023-10-14 01:22:58,938][33226] Updated weights for policy 1, policy_version 8850 (0.0009) [2023-10-14 01:22:59,299][33201] Updated weights for policy 0, policy_version 8770 (0.0007) [2023-10-14 01:22:59,315][33226] Updated weights for policy 1, policy_version 8860 (0.0009) [2023-10-14 01:22:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 18055168. Throughput: 0: 1743.4, 1: 1761.5. Samples: 4516866. Policy #0 lag: (min: 9.0, avg: 12.7, max: 41.0) [2023-10-14 01:22:59,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.650')] [2023-10-14 01:22:59,558][32895] Saving new best policy, reward=20.650! [2023-10-14 01:22:59,672][33201] Updated weights for policy 0, policy_version 8780 (0.0008) [2023-10-14 01:23:00,046][33201] Updated weights for policy 0, policy_version 8790 (0.0008) [2023-10-14 01:23:00,418][33201] Updated weights for policy 0, policy_version 8800 (0.0007) [2023-10-14 01:23:03,203][33226] Updated weights for policy 1, policy_version 8870 (0.0008) [2023-10-14 01:23:03,571][33226] Updated weights for policy 1, policy_version 8880 (0.0008) [2023-10-14 01:23:03,941][33226] Updated weights for policy 1, policy_version 8890 (0.0008) [2023-10-14 01:23:04,242][33201] Updated weights for policy 0, policy_version 8810 (0.0008) [2023-10-14 01:23:04,557][31953] Fps is (10 sec: 16384.7, 60 sec: 13653.4, 300 sec: 14106.9). 
Total num frames: 18120704. Throughput: 0: 1768.2, 1: 1788.7. Samples: 4538854. Policy #0 lag: (min: 9.0, avg: 12.7, max: 41.0) [2023-10-14 01:23:04,557][31953] Avg episode reward: [(0, '20.520'), (1, '20.670')] [2023-10-14 01:23:04,558][32895] Saving new best policy, reward=20.670! [2023-10-14 01:23:04,613][33201] Updated weights for policy 0, policy_version 8820 (0.0010) [2023-10-14 01:23:04,983][33201] Updated weights for policy 0, policy_version 8830 (0.0010) [2023-10-14 01:23:07,907][33226] Updated weights for policy 1, policy_version 8900 (0.0007) [2023-10-14 01:23:08,273][33226] Updated weights for policy 1, policy_version 8910 (0.0007) [2023-10-14 01:23:08,648][33226] Updated weights for policy 1, policy_version 8920 (0.0009) [2023-10-14 01:23:08,816][33201] Updated weights for policy 0, policy_version 8840 (0.0007) [2023-10-14 01:23:09,192][33201] Updated weights for policy 0, policy_version 8850 (0.0007) [2023-10-14 01:23:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 18186240. Throughput: 0: 1761.6, 1: 1754.4. Samples: 4558710. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 01:23:09,557][31953] Avg episode reward: [(0, '20.240'), (1, '20.670')] [2023-10-14 01:23:09,569][33201] Updated weights for policy 0, policy_version 8860 (0.0009) [2023-10-14 01:23:12,259][33226] Updated weights for policy 1, policy_version 8930 (0.0007) [2023-10-14 01:23:12,626][33226] Updated weights for policy 1, policy_version 8940 (0.0008) [2023-10-14 01:23:12,997][33226] Updated weights for policy 1, policy_version 8950 (0.0007) [2023-10-14 01:23:13,369][33226] Updated weights for policy 1, policy_version 8960 (0.0008) [2023-10-14 01:23:13,465][33201] Updated weights for policy 0, policy_version 8870 (0.0007) [2023-10-14 01:23:13,833][33201] Updated weights for policy 0, policy_version 8880 (0.0010) [2023-10-14 01:23:14,205][33201] Updated weights for policy 0, policy_version 8890 (0.0011) [2023-10-14 01:23:14,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 18284544. Throughput: 0: 1753.6, 1: 1792.7. Samples: 4570458. Policy #0 lag: (min: 17.0, avg: 27.5, max: 49.0) [2023-10-14 01:23:14,558][31953] Avg episode reward: [(0, '20.230'), (1, '20.660')] [2023-10-14 01:23:17,065][33226] Updated weights for policy 1, policy_version 8970 (0.0008) [2023-10-14 01:23:17,438][33226] Updated weights for policy 1, policy_version 8980 (0.0008) [2023-10-14 01:23:17,806][33226] Updated weights for policy 1, policy_version 8990 (0.0009) [2023-10-14 01:23:18,059][33201] Updated weights for policy 0, policy_version 8900 (0.0010) [2023-10-14 01:23:18,424][33201] Updated weights for policy 0, policy_version 8910 (0.0007) [2023-10-14 01:23:18,792][33201] Updated weights for policy 0, policy_version 8920 (0.0009) [2023-10-14 01:23:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 18350080. Throughput: 0: 1766.7, 1: 1768.5. Samples: 4591112. 
Policy #0 lag: (min: 17.0, avg: 27.5, max: 49.0) [2023-10-14 01:23:19,558][31953] Avg episode reward: [(0, '20.180'), (1, '20.610')] [2023-10-14 01:23:21,641][33226] Updated weights for policy 1, policy_version 9000 (0.0007) [2023-10-14 01:23:22,022][33226] Updated weights for policy 1, policy_version 9010 (0.0007) [2023-10-14 01:23:22,392][33226] Updated weights for policy 1, policy_version 9020 (0.0008) [2023-10-14 01:23:22,639][33201] Updated weights for policy 0, policy_version 8930 (0.0010) [2023-10-14 01:23:23,002][33201] Updated weights for policy 0, policy_version 8940 (0.0011) [2023-10-14 01:23:23,387][33201] Updated weights for policy 0, policy_version 8950 (0.0011) [2023-10-14 01:23:23,754][33201] Updated weights for policy 0, policy_version 8960 (0.0010) [2023-10-14 01:23:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 18415616. Throughput: 0: 1742.0, 1: 1763.1. Samples: 4611840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:24,558][31953] Avg episode reward: [(0, '20.130'), (1, '20.620')] [2023-10-14 01:23:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000008960_9175040.pth... [2023-10-14 01:23:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000009024_9240576.pth... 
[2023-10-14 01:23:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000007296_7471104.pth [2023-10-14 01:23:24,609][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000007360_7536640.pth [2023-10-14 01:23:26,187][33226] Updated weights for policy 1, policy_version 9030 (0.0009) [2023-10-14 01:23:26,561][33226] Updated weights for policy 1, policy_version 9040 (0.0008) [2023-10-14 01:23:26,932][33226] Updated weights for policy 1, policy_version 9050 (0.0007) [2023-10-14 01:23:27,442][33201] Updated weights for policy 0, policy_version 8970 (0.0008) [2023-10-14 01:23:27,813][33201] Updated weights for policy 0, policy_version 8980 (0.0009) [2023-10-14 01:23:28,191][33201] Updated weights for policy 0, policy_version 8990 (0.0009) [2023-10-14 01:23:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 18481152. Throughput: 0: 1781.6, 1: 1773.1. Samples: 4623556. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:29,558][31953] Avg episode reward: [(0, '20.100'), (1, '20.590')] [2023-10-14 01:23:30,658][33226] Updated weights for policy 1, policy_version 9060 (0.0008) [2023-10-14 01:23:31,020][33226] Updated weights for policy 1, policy_version 9070 (0.0011) [2023-10-14 01:23:31,392][33226] Updated weights for policy 1, policy_version 9080 (0.0010) [2023-10-14 01:23:32,257][33201] Updated weights for policy 0, policy_version 9000 (0.0008) [2023-10-14 01:23:32,637][33201] Updated weights for policy 0, policy_version 9010 (0.0008) [2023-10-14 01:23:32,997][33201] Updated weights for policy 0, policy_version 9020 (0.0009) [2023-10-14 01:23:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 18546688. Throughput: 0: 1749.9, 1: 1777.4. Samples: 4644088. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:34,558][31953] Avg episode reward: [(0, '20.020'), (1, '20.590')] [2023-10-14 01:23:35,004][33226] Updated weights for policy 1, policy_version 9090 (0.0009) [2023-10-14 01:23:35,369][33226] Updated weights for policy 1, policy_version 9100 (0.0010) [2023-10-14 01:23:35,731][33226] Updated weights for policy 1, policy_version 9110 (0.0008) [2023-10-14 01:23:36,099][33226] Updated weights for policy 1, policy_version 9120 (0.0008) [2023-10-14 01:23:36,907][33201] Updated weights for policy 0, policy_version 9030 (0.0009) [2023-10-14 01:23:37,275][33201] Updated weights for policy 0, policy_version 9040 (0.0009) [2023-10-14 01:23:37,644][33201] Updated weights for policy 0, policy_version 9050 (0.0007) [2023-10-14 01:23:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 18612224. Throughput: 0: 1745.6, 1: 1789.3. Samples: 4665914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:39,558][31953] Avg episode reward: [(0, '20.020'), (1, '20.590')] [2023-10-14 01:23:39,924][33226] Updated weights for policy 1, policy_version 9130 (0.0007) [2023-10-14 01:23:40,297][33226] Updated weights for policy 1, policy_version 9140 (0.0007) [2023-10-14 01:23:40,679][33226] Updated weights for policy 1, policy_version 9150 (0.0010) [2023-10-14 01:23:41,349][33201] Updated weights for policy 0, policy_version 9060 (0.0008) [2023-10-14 01:23:41,726][33201] Updated weights for policy 0, policy_version 9070 (0.0009) [2023-10-14 01:23:42,096][33201] Updated weights for policy 0, policy_version 9080 (0.0008) [2023-10-14 01:23:44,382][33226] Updated weights for policy 1, policy_version 9160 (0.0009) [2023-10-14 01:23:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 18677760. Throughput: 0: 1756.7, 1: 1779.1. Samples: 4675978. 
Policy #0 lag: (min: 26.0, avg: 30.7, max: 58.0) [2023-10-14 01:23:44,558][31953] Avg episode reward: [(0, '20.010'), (1, '20.590')] [2023-10-14 01:23:44,757][33226] Updated weights for policy 1, policy_version 9170 (0.0007) [2023-10-14 01:23:45,122][33226] Updated weights for policy 1, policy_version 9180 (0.0007) [2023-10-14 01:23:45,855][33201] Updated weights for policy 0, policy_version 9090 (0.0009) [2023-10-14 01:23:46,226][33201] Updated weights for policy 0, policy_version 9100 (0.0008) [2023-10-14 01:23:46,593][33201] Updated weights for policy 0, policy_version 9110 (0.0009) [2023-10-14 01:23:46,974][33201] Updated weights for policy 0, policy_version 9120 (0.0007) [2023-10-14 01:23:49,009][33226] Updated weights for policy 1, policy_version 9190 (0.0007) [2023-10-14 01:23:49,375][33226] Updated weights for policy 1, policy_version 9200 (0.0009) [2023-10-14 01:23:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 18743296. Throughput: 0: 1750.4, 1: 1784.3. Samples: 4697918. Policy #0 lag: (min: 26.0, avg: 30.7, max: 58.0) [2023-10-14 01:23:49,558][31953] Avg episode reward: [(0, '19.980'), (1, '20.600')] [2023-10-14 01:23:49,746][33226] Updated weights for policy 1, policy_version 9210 (0.0007) [2023-10-14 01:23:50,726][33201] Updated weights for policy 0, policy_version 9130 (0.0007) [2023-10-14 01:23:51,097][33201] Updated weights for policy 0, policy_version 9140 (0.0008) [2023-10-14 01:23:51,474][33201] Updated weights for policy 0, policy_version 9150 (0.0009) [2023-10-14 01:23:53,379][33226] Updated weights for policy 1, policy_version 9220 (0.0008) [2023-10-14 01:23:53,757][33226] Updated weights for policy 1, policy_version 9230 (0.0007) [2023-10-14 01:23:54,122][33226] Updated weights for policy 1, policy_version 9240 (0.0007) [2023-10-14 01:23:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.7, 300 sec: 14106.9). Total num frames: 18841600. Throughput: 0: 1763.9, 1: 1803.7. Samples: 4719254. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:54,558][31953] Avg episode reward: [(0, '19.960'), (1, '20.580')] [2023-10-14 01:23:55,382][33201] Updated weights for policy 0, policy_version 9160 (0.0008) [2023-10-14 01:23:55,753][33201] Updated weights for policy 0, policy_version 9170 (0.0008) [2023-10-14 01:23:56,130][33201] Updated weights for policy 0, policy_version 9180 (0.0009) [2023-10-14 01:23:57,858][33226] Updated weights for policy 1, policy_version 9250 (0.0008) [2023-10-14 01:23:58,218][33226] Updated weights for policy 1, policy_version 9260 (0.0008) [2023-10-14 01:23:58,589][33226] Updated weights for policy 1, policy_version 9270 (0.0008) [2023-10-14 01:23:58,960][33226] Updated weights for policy 1, policy_version 9280 (0.0007) [2023-10-14 01:23:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 18907136. Throughput: 0: 1751.2, 1: 1787.6. Samples: 4729702. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:23:59,558][31953] Avg episode reward: [(0, '19.940'), (1, '20.590')] [2023-10-14 01:23:59,833][33201] Updated weights for policy 0, policy_version 9190 (0.0008) [2023-10-14 01:24:00,201][33201] Updated weights for policy 0, policy_version 9200 (0.0010) [2023-10-14 01:24:00,582][33201] Updated weights for policy 0, policy_version 9210 (0.0008) [2023-10-14 01:24:02,667][33226] Updated weights for policy 1, policy_version 9290 (0.0008) [2023-10-14 01:24:03,044][33226] Updated weights for policy 1, policy_version 9300 (0.0010) [2023-10-14 01:24:03,403][33226] Updated weights for policy 1, policy_version 9310 (0.0009) [2023-10-14 01:24:04,512][33201] Updated weights for policy 0, policy_version 9220 (0.0007) [2023-10-14 01:24:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 18972672. Throughput: 0: 1754.0, 1: 1802.4. Samples: 4751152. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:24:04,558][31953] Avg episode reward: [(0, '19.930'), (1, '20.610')] [2023-10-14 01:24:04,881][33201] Updated weights for policy 0, policy_version 9230 (0.0008) [2023-10-14 01:24:05,258][33201] Updated weights for policy 0, policy_version 9240 (0.0010) [2023-10-14 01:24:07,268][33226] Updated weights for policy 1, policy_version 9320 (0.0010) [2023-10-14 01:24:07,645][33226] Updated weights for policy 1, policy_version 9330 (0.0008) [2023-10-14 01:24:08,009][33226] Updated weights for policy 1, policy_version 9340 (0.0009) [2023-10-14 01:24:09,176][33201] Updated weights for policy 0, policy_version 9250 (0.0010) [2023-10-14 01:24:09,556][33201] Updated weights for policy 0, policy_version 9260 (0.0008) [2023-10-14 01:24:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 19038208. Throughput: 0: 1782.8, 1: 1790.0. Samples: 4772618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:24:09,558][31953] Avg episode reward: [(0, '19.930'), (1, '20.630')] [2023-10-14 01:24:09,917][33201] Updated weights for policy 0, policy_version 9270 (0.0009) [2023-10-14 01:24:10,294][33201] Updated weights for policy 0, policy_version 9280 (0.0009) [2023-10-14 01:24:11,721][33226] Updated weights for policy 1, policy_version 9350 (0.0009) [2023-10-14 01:24:12,093][33226] Updated weights for policy 1, policy_version 9360 (0.0009) [2023-10-14 01:24:12,451][33226] Updated weights for policy 1, policy_version 9370 (0.0008) [2023-10-14 01:24:14,152][33201] Updated weights for policy 0, policy_version 9290 (0.0009) [2023-10-14 01:24:14,523][33201] Updated weights for policy 0, policy_version 9300 (0.0009) [2023-10-14 01:24:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 19103744. Throughput: 0: 1744.5, 1: 1803.9. Samples: 4783234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:24:14,558][31953] Avg episode reward: [(0, '19.870'), (1, '20.620')] [2023-10-14 01:24:14,899][33201] Updated weights for policy 0, policy_version 9310 (0.0008) [2023-10-14 01:24:16,205][33226] Updated weights for policy 1, policy_version 9380 (0.0009) [2023-10-14 01:24:16,573][33226] Updated weights for policy 1, policy_version 9390 (0.0007) [2023-10-14 01:24:16,944][33226] Updated weights for policy 1, policy_version 9400 (0.0008) [2023-10-14 01:24:18,864][33201] Updated weights for policy 0, policy_version 9320 (0.0007) [2023-10-14 01:24:19,234][33201] Updated weights for policy 0, policy_version 9330 (0.0008) [2023-10-14 01:24:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 19169280. Throughput: 0: 1777.8, 1: 1784.4. Samples: 4804386. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:24:19,558][31953] Avg episode reward: [(0, '19.880'), (1, '20.610')] [2023-10-14 01:24:19,610][33201] Updated weights for policy 0, policy_version 9340 (0.0008) [2023-10-14 01:24:20,777][33226] Updated weights for policy 1, policy_version 9410 (0.0008) [2023-10-14 01:24:21,145][33226] Updated weights for policy 1, policy_version 9420 (0.0010) [2023-10-14 01:24:21,513][33226] Updated weights for policy 1, policy_version 9430 (0.0008) [2023-10-14 01:24:21,886][33226] Updated weights for policy 1, policy_version 9440 (0.0008) [2023-10-14 01:24:23,426][33201] Updated weights for policy 0, policy_version 9350 (0.0008) [2023-10-14 01:24:23,805][33201] Updated weights for policy 0, policy_version 9360 (0.0008) [2023-10-14 01:24:24,170][33201] Updated weights for policy 0, policy_version 9370 (0.0009) [2023-10-14 01:24:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 19267584. Throughput: 0: 1762.7, 1: 1783.4. Samples: 4825490. 
Policy #0 lag: (min: 31.0, avg: 38.0, max: 63.0) [2023-10-14 01:24:24,558][31953] Avg episode reward: [(0, '19.870'), (1, '20.620')] [2023-10-14 01:24:25,791][33226] Updated weights for policy 1, policy_version 9450 (0.0008) [2023-10-14 01:24:26,157][33226] Updated weights for policy 1, policy_version 9460 (0.0007) [2023-10-14 01:24:26,528][33226] Updated weights for policy 1, policy_version 9470 (0.0007) [2023-10-14 01:24:28,019][33201] Updated weights for policy 0, policy_version 9380 (0.0009) [2023-10-14 01:24:28,393][33201] Updated weights for policy 0, policy_version 9390 (0.0008) [2023-10-14 01:24:28,775][33201] Updated weights for policy 0, policy_version 9400 (0.0007) [2023-10-14 01:24:29,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 19333120. Throughput: 0: 1771.4, 1: 1783.6. Samples: 4835956. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:24:29,558][31953] Avg episode reward: [(0, '19.880'), (1, '20.610')] [2023-10-14 01:24:30,276][33226] Updated weights for policy 1, policy_version 9480 (0.0008) [2023-10-14 01:24:30,648][33226] Updated weights for policy 1, policy_version 9490 (0.0007) [2023-10-14 01:24:31,010][33226] Updated weights for policy 1, policy_version 9500 (0.0007) [2023-10-14 01:24:32,662][33201] Updated weights for policy 0, policy_version 9410 (0.0009) [2023-10-14 01:24:33,042][33201] Updated weights for policy 0, policy_version 9420 (0.0007) [2023-10-14 01:24:33,407][33201] Updated weights for policy 0, policy_version 9430 (0.0008) [2023-10-14 01:24:33,782][33201] Updated weights for policy 0, policy_version 9440 (0.0007) [2023-10-14 01:24:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 19398656. Throughput: 0: 1763.4, 1: 1781.1. Samples: 4857418. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:24:34,558][31953] Avg episode reward: [(0, '19.890'), (1, '20.610')] [2023-10-14 01:24:34,771][33226] Updated weights for policy 1, policy_version 9510 (0.0007) [2023-10-14 01:24:35,141][33226] Updated weights for policy 1, policy_version 9520 (0.0009) [2023-10-14 01:24:35,508][33226] Updated weights for policy 1, policy_version 9530 (0.0010) [2023-10-14 01:24:37,573][33201] Updated weights for policy 0, policy_version 9450 (0.0010) [2023-10-14 01:24:37,950][33201] Updated weights for policy 0, policy_version 9460 (0.0010) [2023-10-14 01:24:38,334][33201] Updated weights for policy 0, policy_version 9470 (0.0009) [2023-10-14 01:24:39,279][33226] Updated weights for policy 1, policy_version 9540 (0.0009) [2023-10-14 01:24:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 19464192. Throughput: 0: 1748.2, 1: 1798.4. Samples: 4878854. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:24:39,558][31953] Avg episode reward: [(0, '19.890'), (1, '20.620')] [2023-10-14 01:24:39,646][33226] Updated weights for policy 1, policy_version 9550 (0.0007) [2023-10-14 01:24:40,013][33226] Updated weights for policy 1, policy_version 9560 (0.0007) [2023-10-14 01:24:42,106][33201] Updated weights for policy 0, policy_version 9480 (0.0008) [2023-10-14 01:24:42,477][33201] Updated weights for policy 0, policy_version 9490 (0.0008) [2023-10-14 01:24:42,856][33201] Updated weights for policy 0, policy_version 9500 (0.0012) [2023-10-14 01:24:43,716][33226] Updated weights for policy 1, policy_version 9570 (0.0007) [2023-10-14 01:24:44,083][33226] Updated weights for policy 1, policy_version 9580 (0.0007) [2023-10-14 01:24:44,446][33226] Updated weights for policy 1, policy_version 9590 (0.0011) [2023-10-14 01:24:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 19529728. Throughput: 0: 1770.9, 1: 1779.0. Samples: 4889446. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:24:44,558][31953] Avg episode reward: [(0, '20.110'), (1, '20.610')] [2023-10-14 01:24:44,815][33226] Updated weights for policy 1, policy_version 9600 (0.0008) [2023-10-14 01:24:46,699][33201] Updated weights for policy 0, policy_version 9510 (0.0009) [2023-10-14 01:24:47,072][33201] Updated weights for policy 0, policy_version 9520 (0.0009) [2023-10-14 01:24:47,443][33201] Updated weights for policy 0, policy_version 9530 (0.0010) [2023-10-14 01:24:48,541][33226] Updated weights for policy 1, policy_version 9610 (0.0009) [2023-10-14 01:24:48,919][33226] Updated weights for policy 1, policy_version 9620 (0.0009) [2023-10-14 01:24:49,280][33226] Updated weights for policy 1, policy_version 9630 (0.0008) [2023-10-14 01:24:49,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 19628032. Throughput: 0: 1743.6, 1: 1797.5. Samples: 4910498. Policy #0 lag: (min: 9.0, avg: 23.4, max: 41.0) [2023-10-14 01:24:49,558][31953] Avg episode reward: [(0, '20.120'), (1, '20.610')] [2023-10-14 01:24:51,316][33201] Updated weights for policy 0, policy_version 9540 (0.0009) [2023-10-14 01:24:51,688][33201] Updated weights for policy 0, policy_version 9550 (0.0008) [2023-10-14 01:24:52,050][33201] Updated weights for policy 0, policy_version 9560 (0.0008) [2023-10-14 01:24:53,014][33226] Updated weights for policy 1, policy_version 9640 (0.0007) [2023-10-14 01:24:53,379][33226] Updated weights for policy 1, policy_version 9650 (0.0007) [2023-10-14 01:24:53,760][33226] Updated weights for policy 1, policy_version 9660 (0.0009) [2023-10-14 01:24:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 19693568. Throughput: 0: 1741.9, 1: 1782.6. Samples: 4931218. 
Policy #0 lag: (min: 9.0, avg: 23.4, max: 41.0) [2023-10-14 01:24:54,558][31953] Avg episode reward: [(0, '20.110'), (1, '20.600')] [2023-10-14 01:24:55,885][33201] Updated weights for policy 0, policy_version 9570 (0.0007) [2023-10-14 01:24:56,251][33201] Updated weights for policy 0, policy_version 9580 (0.0008) [2023-10-14 01:24:56,625][33201] Updated weights for policy 0, policy_version 9590 (0.0010) [2023-10-14 01:24:56,999][33201] Updated weights for policy 0, policy_version 9600 (0.0009) [2023-10-14 01:24:57,485][33226] Updated weights for policy 1, policy_version 9670 (0.0009) [2023-10-14 01:24:57,858][33226] Updated weights for policy 1, policy_version 9680 (0.0007) [2023-10-14 01:24:58,225][33226] Updated weights for policy 1, policy_version 9690 (0.0007) [2023-10-14 01:24:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 19759104. Throughput: 0: 1745.2, 1: 1789.6. Samples: 4942302. Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-14 01:24:59,557][31953] Avg episode reward: [(0, '20.130'), (1, '20.580')] [2023-10-14 01:25:00,764][33201] Updated weights for policy 0, policy_version 9610 (0.0008) [2023-10-14 01:25:01,132][33201] Updated weights for policy 0, policy_version 9620 (0.0009) [2023-10-14 01:25:01,499][33201] Updated weights for policy 0, policy_version 9630 (0.0010) [2023-10-14 01:25:02,021][33226] Updated weights for policy 1, policy_version 9700 (0.0008) [2023-10-14 01:25:02,397][33226] Updated weights for policy 1, policy_version 9710 (0.0010) [2023-10-14 01:25:02,761][33226] Updated weights for policy 1, policy_version 9720 (0.0011) [2023-10-14 01:25:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 19824640. Throughput: 0: 1749.6, 1: 1779.8. Samples: 4963206. 
Policy #0 lag: (min: 31.0, avg: 34.2, max: 63.0) [2023-10-14 01:25:04,558][31953] Avg episode reward: [(0, '20.160'), (1, '20.580')] [2023-10-14 01:25:05,431][33201] Updated weights for policy 0, policy_version 9640 (0.0007) [2023-10-14 01:25:05,801][33201] Updated weights for policy 0, policy_version 9650 (0.0007) [2023-10-14 01:25:06,175][33201] Updated weights for policy 0, policy_version 9660 (0.0009) [2023-10-14 01:25:06,575][33226] Updated weights for policy 1, policy_version 9730 (0.0010) [2023-10-14 01:25:06,942][33226] Updated weights for policy 1, policy_version 9740 (0.0009) [2023-10-14 01:25:07,322][33226] Updated weights for policy 1, policy_version 9750 (0.0008) [2023-10-14 01:25:07,692][33226] Updated weights for policy 1, policy_version 9760 (0.0007) [2023-10-14 01:25:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 19890176. Throughput: 0: 1768.0, 1: 1776.8. Samples: 4985006. Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-14 01:25:09,558][31953] Avg episode reward: [(0, '20.170'), (1, '20.580')] [2023-10-14 01:25:10,013][33201] Updated weights for policy 0, policy_version 9670 (0.0008) [2023-10-14 01:25:10,398][33201] Updated weights for policy 0, policy_version 9680 (0.0009) [2023-10-14 01:25:10,769][33201] Updated weights for policy 0, policy_version 9690 (0.0008) [2023-10-14 01:25:11,330][33226] Updated weights for policy 1, policy_version 9770 (0.0011) [2023-10-14 01:25:11,699][33226] Updated weights for policy 1, policy_version 9780 (0.0008) [2023-10-14 01:25:12,056][33226] Updated weights for policy 1, policy_version 9790 (0.0008) [2023-10-14 01:25:14,557][33201] Updated weights for policy 0, policy_version 9700 (0.0009) [2023-10-14 01:25:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 19955712. Throughput: 0: 1744.6, 1: 1786.4. Samples: 4994854. 
Policy #0 lag: (min: 25.0, avg: 40.9, max: 57.0) [2023-10-14 01:25:14,558][31953] Avg episode reward: [(0, '20.150'), (1, '20.600')] [2023-10-14 01:25:14,933][33201] Updated weights for policy 0, policy_version 9710 (0.0010) [2023-10-14 01:25:15,306][33201] Updated weights for policy 0, policy_version 9720 (0.0009) [2023-10-14 01:25:16,087][33226] Updated weights for policy 1, policy_version 9800 (0.0010) [2023-10-14 01:25:16,453][33226] Updated weights for policy 1, policy_version 9810 (0.0010) [2023-10-14 01:25:16,830][33226] Updated weights for policy 1, policy_version 9820 (0.0009) [2023-10-14 01:25:19,152][33201] Updated weights for policy 0, policy_version 9730 (0.0007) [2023-10-14 01:25:19,528][33201] Updated weights for policy 0, policy_version 9740 (0.0008) [2023-10-14 01:25:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20021248. Throughput: 0: 1757.3, 1: 1776.8. Samples: 5016456. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:25:19,558][31953] Avg episode reward: [(0, '20.140'), (1, '20.630')] [2023-10-14 01:25:19,903][33201] Updated weights for policy 0, policy_version 9750 (0.0007) [2023-10-14 01:25:20,271][33201] Updated weights for policy 0, policy_version 9760 (0.0010) [2023-10-14 01:25:20,612][33226] Updated weights for policy 1, policy_version 9830 (0.0007) [2023-10-14 01:25:20,981][33226] Updated weights for policy 1, policy_version 9840 (0.0007) [2023-10-14 01:25:21,353][33226] Updated weights for policy 1, policy_version 9850 (0.0008) [2023-10-14 01:25:23,929][33201] Updated weights for policy 0, policy_version 9770 (0.0011) [2023-10-14 01:25:24,305][33201] Updated weights for policy 0, policy_version 9780 (0.0010) [2023-10-14 01:25:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 20086784. Throughput: 0: 1764.8, 1: 1779.2. Samples: 5038334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:25:24,558][31953] Avg episode reward: [(0, '20.120'), (1, '20.650')] [2023-10-14 01:25:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000009856_10092544.pth... [2023-10-14 01:25:24,596][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000008192_8388608.pth [2023-10-14 01:25:24,671][33201] Updated weights for policy 0, policy_version 9790 (0.0011) [2023-10-14 01:25:24,749][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000009792_10027008.pth... [2023-10-14 01:25:24,779][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000008128_8323072.pth [2023-10-14 01:25:25,030][33226] Updated weights for policy 1, policy_version 9860 (0.0009) [2023-10-14 01:25:25,396][33226] Updated weights for policy 1, policy_version 9870 (0.0009) [2023-10-14 01:25:25,774][33226] Updated weights for policy 1, policy_version 9880 (0.0008) [2023-10-14 01:25:28,451][33201] Updated weights for policy 0, policy_version 9800 (0.0008) [2023-10-14 01:25:28,830][33201] Updated weights for policy 0, policy_version 9810 (0.0008) [2023-10-14 01:25:29,206][33201] Updated weights for policy 0, policy_version 9820 (0.0008) [2023-10-14 01:25:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 20185088. Throughput: 0: 1758.7, 1: 1778.7. Samples: 5048630. 
Policy #0 lag: (min: 31.0, avg: 33.4, max: 63.0) [2023-10-14 01:25:29,558][31953] Avg episode reward: [(0, '20.120'), (1, '20.670')] [2023-10-14 01:25:29,732][33226] Updated weights for policy 1, policy_version 9890 (0.0008) [2023-10-14 01:25:30,104][33226] Updated weights for policy 1, policy_version 9900 (0.0007) [2023-10-14 01:25:30,470][33226] Updated weights for policy 1, policy_version 9910 (0.0007) [2023-10-14 01:25:30,844][33226] Updated weights for policy 1, policy_version 9920 (0.0007) [2023-10-14 01:25:32,996][33201] Updated weights for policy 0, policy_version 9830 (0.0010) [2023-10-14 01:25:33,364][33201] Updated weights for policy 0, policy_version 9840 (0.0010) [2023-10-14 01:25:33,744][33201] Updated weights for policy 0, policy_version 9850 (0.0009) [2023-10-14 01:25:34,502][33226] Updated weights for policy 1, policy_version 9930 (0.0008) [2023-10-14 01:25:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 20250624. Throughput: 0: 1778.0, 1: 1774.5. Samples: 5070360. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:25:34,558][31953] Avg episode reward: [(0, '20.130'), (1, '20.700')] [2023-10-14 01:25:34,863][33226] Updated weights for policy 1, policy_version 9940 (0.0007) [2023-10-14 01:25:35,239][33226] Updated weights for policy 1, policy_version 9950 (0.0008) [2023-10-14 01:25:35,303][32895] Saving new best policy, reward=20.700! [2023-10-14 01:25:37,604][33201] Updated weights for policy 0, policy_version 9860 (0.0007) [2023-10-14 01:25:37,975][33201] Updated weights for policy 0, policy_version 9870 (0.0007) [2023-10-14 01:25:38,362][33201] Updated weights for policy 0, policy_version 9880 (0.0009) [2023-10-14 01:25:39,269][33226] Updated weights for policy 1, policy_version 9960 (0.0008) [2023-10-14 01:25:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 20316160. Throughput: 0: 1757.2, 1: 1803.7. Samples: 5091458. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:25:39,558][31953] Avg episode reward: [(0, '20.130'), (1, '20.700')] [2023-10-14 01:25:39,654][33226] Updated weights for policy 1, policy_version 9970 (0.0010) [2023-10-14 01:25:40,026][33226] Updated weights for policy 1, policy_version 9980 (0.0009) [2023-10-14 01:25:42,163][33201] Updated weights for policy 0, policy_version 9890 (0.0009) [2023-10-14 01:25:42,534][33201] Updated weights for policy 0, policy_version 9900 (0.0008) [2023-10-14 01:25:42,902][33201] Updated weights for policy 0, policy_version 9910 (0.0008) [2023-10-14 01:25:43,278][33201] Updated weights for policy 0, policy_version 9920 (0.0007) [2023-10-14 01:25:43,560][33226] Updated weights for policy 1, policy_version 9990 (0.0007) [2023-10-14 01:25:43,922][33226] Updated weights for policy 1, policy_version 10000 (0.0009) [2023-10-14 01:25:44,288][33226] Updated weights for policy 1, policy_version 10010 (0.0009) [2023-10-14 01:25:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 20414464. Throughput: 0: 1787.1, 1: 1769.2. Samples: 5102336. Policy #0 lag: (min: 42.0, avg: 55.0, max: 56.0) [2023-10-14 01:25:44,558][31953] Avg episode reward: [(0, '20.130'), (1, '20.700')] [2023-10-14 01:25:47,197][33201] Updated weights for policy 0, policy_version 9930 (0.0010) [2023-10-14 01:25:47,558][33201] Updated weights for policy 0, policy_version 9940 (0.0010) [2023-10-14 01:25:47,927][33201] Updated weights for policy 0, policy_version 9950 (0.0010) [2023-10-14 01:25:48,231][33226] Updated weights for policy 1, policy_version 10020 (0.0008) [2023-10-14 01:25:48,596][33226] Updated weights for policy 1, policy_version 10030 (0.0008) [2023-10-14 01:25:48,956][33226] Updated weights for policy 1, policy_version 10040 (0.0010) [2023-10-14 01:25:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 20480000. Throughput: 0: 1750.4, 1: 1799.2. Samples: 5122940. 
Policy #0 lag: (min: 42.0, avg: 55.0, max: 56.0) [2023-10-14 01:25:49,558][31953] Avg episode reward: [(0, '20.180'), (1, '20.750')] [2023-10-14 01:25:49,559][32895] Saving new best policy, reward=20.750! [2023-10-14 01:25:51,906][33201] Updated weights for policy 0, policy_version 9960 (0.0009) [2023-10-14 01:25:52,297][33201] Updated weights for policy 0, policy_version 9970 (0.0009) [2023-10-14 01:25:52,665][33201] Updated weights for policy 0, policy_version 9980 (0.0007) [2023-10-14 01:25:52,673][33226] Updated weights for policy 1, policy_version 10050 (0.0009) [2023-10-14 01:25:53,043][33226] Updated weights for policy 1, policy_version 10060 (0.0007) [2023-10-14 01:25:53,402][33226] Updated weights for policy 1, policy_version 10070 (0.0008) [2023-10-14 01:25:53,780][33226] Updated weights for policy 1, policy_version 10080 (0.0008) [2023-10-14 01:25:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20545536. Throughput: 0: 1751.9, 1: 1768.2. Samples: 5143410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:25:54,558][31953] Avg episode reward: [(0, '20.190'), (1, '20.760')] [2023-10-14 01:25:54,571][32895] Saving new best policy, reward=20.760! [2023-10-14 01:25:56,376][33201] Updated weights for policy 0, policy_version 9990 (0.0007) [2023-10-14 01:25:56,750][33201] Updated weights for policy 0, policy_version 10000 (0.0007) [2023-10-14 01:25:57,114][33201] Updated weights for policy 0, policy_version 10010 (0.0007) [2023-10-14 01:25:57,737][33226] Updated weights for policy 1, policy_version 10090 (0.0007) [2023-10-14 01:25:58,103][33226] Updated weights for policy 1, policy_version 10100 (0.0008) [2023-10-14 01:25:58,478][33226] Updated weights for policy 1, policy_version 10110 (0.0008) [2023-10-14 01:25:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 20611072. Throughput: 0: 1762.0, 1: 1787.4. Samples: 5154578. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:25:59,558][31953] Avg episode reward: [(0, '20.210'), (1, '20.760')] [2023-10-14 01:26:00,766][33201] Updated weights for policy 0, policy_version 10020 (0.0007) [2023-10-14 01:26:01,139][33201] Updated weights for policy 0, policy_version 10030 (0.0010) [2023-10-14 01:26:01,512][33201] Updated weights for policy 0, policy_version 10040 (0.0008) [2023-10-14 01:26:02,238][33226] Updated weights for policy 1, policy_version 10120 (0.0010) [2023-10-14 01:26:02,608][33226] Updated weights for policy 1, policy_version 10130 (0.0010) [2023-10-14 01:26:02,979][33226] Updated weights for policy 1, policy_version 10140 (0.0011) [2023-10-14 01:26:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 20676608. Throughput: 0: 1762.7, 1: 1770.7. Samples: 5175464. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:26:04,558][31953] Avg episode reward: [(0, '20.250'), (1, '20.760')] [2023-10-14 01:26:05,137][33201] Updated weights for policy 0, policy_version 10050 (0.0007) [2023-10-14 01:26:05,512][33201] Updated weights for policy 0, policy_version 10060 (0.0008) [2023-10-14 01:26:05,890][33201] Updated weights for policy 0, policy_version 10070 (0.0008) [2023-10-14 01:26:06,263][33201] Updated weights for policy 0, policy_version 10080 (0.0010) [2023-10-14 01:26:06,731][33226] Updated weights for policy 1, policy_version 10150 (0.0009) [2023-10-14 01:26:07,096][33226] Updated weights for policy 1, policy_version 10160 (0.0010) [2023-10-14 01:26:07,463][33226] Updated weights for policy 1, policy_version 10170 (0.0009) [2023-10-14 01:26:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20742144. Throughput: 0: 1775.3, 1: 1757.2. Samples: 5197300. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:26:09,558][31953] Avg episode reward: [(0, '20.270'), (1, '20.760')] [2023-10-14 01:26:10,098][33201] Updated weights for policy 0, policy_version 10090 (0.0009) [2023-10-14 01:26:10,480][33201] Updated weights for policy 0, policy_version 10100 (0.0009) [2023-10-14 01:26:10,852][33201] Updated weights for policy 0, policy_version 10110 (0.0009) [2023-10-14 01:26:11,267][33226] Updated weights for policy 1, policy_version 10180 (0.0009) [2023-10-14 01:26:11,636][33226] Updated weights for policy 1, policy_version 10190 (0.0008) [2023-10-14 01:26:12,013][33226] Updated weights for policy 1, policy_version 10200 (0.0010) [2023-10-14 01:26:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20807680. Throughput: 0: 1756.8, 1: 1770.5. Samples: 5207362. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:26:14,558][31953] Avg episode reward: [(0, '20.290'), (1, '20.770')] [2023-10-14 01:26:14,559][32895] Saving new best policy, reward=20.770! [2023-10-14 01:26:14,882][33201] Updated weights for policy 0, policy_version 10120 (0.0007) [2023-10-14 01:26:15,259][33201] Updated weights for policy 0, policy_version 10130 (0.0008) [2023-10-14 01:26:15,633][33201] Updated weights for policy 0, policy_version 10140 (0.0008) [2023-10-14 01:26:15,875][33226] Updated weights for policy 1, policy_version 10210 (0.0007) [2023-10-14 01:26:16,246][33226] Updated weights for policy 1, policy_version 10220 (0.0010) [2023-10-14 01:26:16,611][33226] Updated weights for policy 1, policy_version 10230 (0.0009) [2023-10-14 01:26:16,983][33226] Updated weights for policy 1, policy_version 10240 (0.0008) [2023-10-14 01:26:19,401][33201] Updated weights for policy 0, policy_version 10150 (0.0009) [2023-10-14 01:26:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20873216. Throughput: 0: 1766.9, 1: 1755.2. Samples: 5228854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:26:19,558][31953] Avg episode reward: [(0, '20.360'), (1, '20.780')] [2023-10-14 01:26:19,559][32895] Saving new best policy, reward=20.780! [2023-10-14 01:26:19,775][33201] Updated weights for policy 0, policy_version 10160 (0.0007) [2023-10-14 01:26:20,145][33201] Updated weights for policy 0, policy_version 10170 (0.0007) [2023-10-14 01:26:20,803][33226] Updated weights for policy 1, policy_version 10250 (0.0010) [2023-10-14 01:26:21,168][33226] Updated weights for policy 1, policy_version 10260 (0.0009) [2023-10-14 01:26:21,536][33226] Updated weights for policy 1, policy_version 10270 (0.0011) [2023-10-14 01:26:23,823][33201] Updated weights for policy 0, policy_version 10180 (0.0008) [2023-10-14 01:26:24,205][33201] Updated weights for policy 0, policy_version 10190 (0.0010) [2023-10-14 01:26:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 20938752. Throughput: 0: 1783.8, 1: 1756.1. Samples: 5250754. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:26:24,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.780')] [2023-10-14 01:26:24,579][33201] Updated weights for policy 0, policy_version 10200 (0.0007) [2023-10-14 01:26:25,492][33226] Updated weights for policy 1, policy_version 10280 (0.0008) [2023-10-14 01:26:25,877][33226] Updated weights for policy 1, policy_version 10290 (0.0007) [2023-10-14 01:26:26,239][33226] Updated weights for policy 1, policy_version 10300 (0.0007) [2023-10-14 01:26:28,480][33201] Updated weights for policy 0, policy_version 10210 (0.0010) [2023-10-14 01:26:28,860][33201] Updated weights for policy 0, policy_version 10220 (0.0009) [2023-10-14 01:26:29,234][33201] Updated weights for policy 0, policy_version 10230 (0.0009) [2023-10-14 01:26:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 21004288. Throughput: 0: 1763.8, 1: 1755.8. Samples: 5260718. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:26:29,558][31953] Avg episode reward: [(0, '20.390'), (1, '20.800')] [2023-10-14 01:26:29,559][32895] Saving new best policy, reward=20.800! [2023-10-14 01:26:29,609][33201] Updated weights for policy 0, policy_version 10240 (0.0009) [2023-10-14 01:26:29,958][33226] Updated weights for policy 1, policy_version 10310 (0.0007) [2023-10-14 01:26:30,322][33226] Updated weights for policy 1, policy_version 10320 (0.0008) [2023-10-14 01:26:30,701][33226] Updated weights for policy 1, policy_version 10330 (0.0009) [2023-10-14 01:26:33,465][33201] Updated weights for policy 0, policy_version 10250 (0.0008) [2023-10-14 01:26:33,826][33201] Updated weights for policy 0, policy_version 10260 (0.0010) [2023-10-14 01:26:34,205][33201] Updated weights for policy 0, policy_version 10270 (0.0007) [2023-10-14 01:26:34,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 21102592. Throughput: 0: 1798.1, 1: 1755.7. Samples: 5282860. Policy #0 lag: (min: 5.0, avg: 13.2, max: 37.0) [2023-10-14 01:26:34,559][31953] Avg episode reward: [(0, '20.410'), (1, '20.830')] [2023-10-14 01:26:34,587][33226] Updated weights for policy 1, policy_version 10340 (0.0009) [2023-10-14 01:26:34,951][33226] Updated weights for policy 1, policy_version 10350 (0.0009) [2023-10-14 01:26:35,327][33226] Updated weights for policy 1, policy_version 10360 (0.0009) [2023-10-14 01:26:35,621][32895] Saving new best policy, reward=20.830! 
[2023-10-14 01:26:38,236][33201] Updated weights for policy 0, policy_version 10280 (0.0007) [2023-10-14 01:26:38,615][33201] Updated weights for policy 0, policy_version 10290 (0.0007) [2023-10-14 01:26:38,990][33201] Updated weights for policy 0, policy_version 10300 (0.0007) [2023-10-14 01:26:39,181][33226] Updated weights for policy 1, policy_version 10370 (0.0008) [2023-10-14 01:26:39,541][33226] Updated weights for policy 1, policy_version 10380 (0.0007) [2023-10-14 01:26:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 21168128. Throughput: 0: 1767.6, 1: 1791.3. Samples: 5303558. Policy #0 lag: (min: 19.0, avg: 20.2, max: 42.0) [2023-10-14 01:26:39,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.850')] [2023-10-14 01:26:39,914][33226] Updated weights for policy 1, policy_version 10390 (0.0008) [2023-10-14 01:26:40,277][32895] Saving new best policy, reward=20.850! [2023-10-14 01:26:40,280][33226] Updated weights for policy 1, policy_version 10400 (0.0007) [2023-10-14 01:26:42,661][33201] Updated weights for policy 0, policy_version 10310 (0.0009) [2023-10-14 01:26:43,023][33201] Updated weights for policy 0, policy_version 10320 (0.0009) [2023-10-14 01:26:43,392][33201] Updated weights for policy 0, policy_version 10330 (0.0007) [2023-10-14 01:26:43,891][33226] Updated weights for policy 1, policy_version 10410 (0.0008) [2023-10-14 01:26:44,266][33226] Updated weights for policy 1, policy_version 10420 (0.0008) [2023-10-14 01:26:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 21233664. Throughput: 0: 1789.9, 1: 1765.6. Samples: 5314572. Policy #0 lag: (min: 19.0, avg: 20.2, max: 42.0) [2023-10-14 01:26:44,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.870')] [2023-10-14 01:26:44,630][33226] Updated weights for policy 1, policy_version 10430 (0.0009) [2023-10-14 01:26:44,709][32895] Saving new best policy, reward=20.870! 
[2023-10-14 01:26:47,140][33201] Updated weights for policy 0, policy_version 10340 (0.0007) [2023-10-14 01:26:47,512][33201] Updated weights for policy 0, policy_version 10350 (0.0008) [2023-10-14 01:26:47,883][33201] Updated weights for policy 0, policy_version 10360 (0.0008) [2023-10-14 01:26:48,459][33226] Updated weights for policy 1, policy_version 10440 (0.0009) [2023-10-14 01:26:48,831][33226] Updated weights for policy 1, policy_version 10450 (0.0007) [2023-10-14 01:26:49,190][33226] Updated weights for policy 1, policy_version 10460 (0.0008) [2023-10-14 01:26:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 21331968. Throughput: 0: 1763.1, 1: 1795.3. Samples: 5335592. Policy #0 lag: (min: 22.0, avg: 23.3, max: 47.0) [2023-10-14 01:26:49,558][31953] Avg episode reward: [(0, '20.430'), (1, '20.880')] [2023-10-14 01:26:49,558][32895] Saving new best policy, reward=20.880! [2023-10-14 01:26:51,639][33201] Updated weights for policy 0, policy_version 10370 (0.0010) [2023-10-14 01:26:52,013][33201] Updated weights for policy 0, policy_version 10380 (0.0010) [2023-10-14 01:26:52,381][33201] Updated weights for policy 0, policy_version 10390 (0.0010) [2023-10-14 01:26:52,749][33201] Updated weights for policy 0, policy_version 10400 (0.0009) [2023-10-14 01:26:53,101][33226] Updated weights for policy 1, policy_version 10470 (0.0009) [2023-10-14 01:26:53,459][33226] Updated weights for policy 1, policy_version 10480 (0.0009) [2023-10-14 01:26:53,832][33226] Updated weights for policy 1, policy_version 10490 (0.0010) [2023-10-14 01:26:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 21397504. Throughput: 0: 1757.6, 1: 1775.4. Samples: 5356286. Policy #0 lag: (min: 22.0, avg: 23.3, max: 47.0) [2023-10-14 01:26:54,558][31953] Avg episode reward: [(0, '20.440'), (1, '20.910')] [2023-10-14 01:26:54,568][32895] Saving new best policy, reward=20.910! 
[2023-10-14 01:26:56,542][33201] Updated weights for policy 0, policy_version 10410 (0.0009) [2023-10-14 01:26:56,915][33201] Updated weights for policy 0, policy_version 10420 (0.0007) [2023-10-14 01:26:57,289][33201] Updated weights for policy 0, policy_version 10430 (0.0007) [2023-10-14 01:26:57,399][33226] Updated weights for policy 1, policy_version 10500 (0.0007) [2023-10-14 01:26:57,768][33226] Updated weights for policy 1, policy_version 10510 (0.0007) [2023-10-14 01:26:58,132][33226] Updated weights for policy 1, policy_version 10520 (0.0008) [2023-10-14 01:26:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 21463040. Throughput: 0: 1769.7, 1: 1798.1. Samples: 5367916. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 01:26:59,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.910')] [2023-10-14 01:27:01,081][33201] Updated weights for policy 0, policy_version 10440 (0.0007) [2023-10-14 01:27:01,455][33201] Updated weights for policy 0, policy_version 10450 (0.0010) [2023-10-14 01:27:01,799][33226] Updated weights for policy 1, policy_version 10530 (0.0010) [2023-10-14 01:27:01,834][33201] Updated weights for policy 0, policy_version 10460 (0.0007) [2023-10-14 01:27:02,170][33226] Updated weights for policy 1, policy_version 10540 (0.0008) [2023-10-14 01:27:02,541][33226] Updated weights for policy 1, policy_version 10550 (0.0007) [2023-10-14 01:27:02,913][33226] Updated weights for policy 1, policy_version 10560 (0.0009) [2023-10-14 01:27:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 21528576. Throughput: 0: 1758.8, 1: 1786.0. Samples: 5388374. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 01:27:04,557][31953] Avg episode reward: [(0, '20.510'), (1, '20.920')] [2023-10-14 01:27:04,558][32895] Saving new best policy, reward=20.920! 
[2023-10-14 01:27:05,863][33201] Updated weights for policy 0, policy_version 10470 (0.0008) [2023-10-14 01:27:06,242][33201] Updated weights for policy 0, policy_version 10480 (0.0009) [2023-10-14 01:27:06,622][33201] Updated weights for policy 0, policy_version 10490 (0.0009) [2023-10-14 01:27:06,762][33226] Updated weights for policy 1, policy_version 10570 (0.0008) [2023-10-14 01:27:07,128][33226] Updated weights for policy 1, policy_version 10580 (0.0009) [2023-10-14 01:27:07,507][33226] Updated weights for policy 1, policy_version 10590 (0.0008) [2023-10-14 01:27:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 21594112. Throughput: 0: 1759.6, 1: 1784.9. Samples: 5410260. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:27:09,558][31953] Avg episode reward: [(0, '20.540'), (1, '20.930')] [2023-10-14 01:27:09,568][32895] Saving new best policy, reward=20.930! [2023-10-14 01:27:10,388][33201] Updated weights for policy 0, policy_version 10500 (0.0007) [2023-10-14 01:27:10,767][33201] Updated weights for policy 0, policy_version 10510 (0.0008) [2023-10-14 01:27:11,140][33201] Updated weights for policy 0, policy_version 10520 (0.0007) [2023-10-14 01:27:11,350][33226] Updated weights for policy 1, policy_version 10600 (0.0007) [2023-10-14 01:27:11,734][33226] Updated weights for policy 1, policy_version 10610 (0.0009) [2023-10-14 01:27:12,092][33226] Updated weights for policy 1, policy_version 10620 (0.0007) [2023-10-14 01:27:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 21659648. Throughput: 0: 1750.7, 1: 1795.1. Samples: 5420278. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:27:14,558][31953] Avg episode reward: [(0, '20.510'), (1, '20.920')] [2023-10-14 01:27:14,681][33201] Updated weights for policy 0, policy_version 10530 (0.0008) [2023-10-14 01:27:15,047][33201] Updated weights for policy 0, policy_version 10540 (0.0010) [2023-10-14 01:27:15,423][33201] Updated weights for policy 0, policy_version 10550 (0.0009) [2023-10-14 01:27:15,789][33201] Updated weights for policy 0, policy_version 10560 (0.0010) [2023-10-14 01:27:15,794][33226] Updated weights for policy 1, policy_version 10630 (0.0009) [2023-10-14 01:27:16,163][33226] Updated weights for policy 1, policy_version 10640 (0.0009) [2023-10-14 01:27:16,531][33226] Updated weights for policy 1, policy_version 10650 (0.0009) [2023-10-14 01:27:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 21725184. Throughput: 0: 1752.3, 1: 1786.9. Samples: 5442122. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:27:19,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.930')] [2023-10-14 01:27:19,765][33201] Updated weights for policy 0, policy_version 10570 (0.0009) [2023-10-14 01:27:20,132][33201] Updated weights for policy 0, policy_version 10580 (0.0010) [2023-10-14 01:27:20,438][33226] Updated weights for policy 1, policy_version 10660 (0.0009) [2023-10-14 01:27:20,512][33201] Updated weights for policy 0, policy_version 10590 (0.0007) [2023-10-14 01:27:20,815][33226] Updated weights for policy 1, policy_version 10670 (0.0008) [2023-10-14 01:27:21,187][33226] Updated weights for policy 1, policy_version 10680 (0.0007) [2023-10-14 01:27:24,366][33201] Updated weights for policy 0, policy_version 10600 (0.0008) [2023-10-14 01:27:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 21790720. Throughput: 0: 1781.3, 1: 1784.4. Samples: 5464016. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-14 01:27:24,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.940')] [2023-10-14 01:27:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000010688_10944512.pth... [2023-10-14 01:27:24,604][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000009024_9240576.pth [2023-10-14 01:27:24,608][32895] Saving new best policy, reward=20.940! [2023-10-14 01:27:24,730][33201] Updated weights for policy 0, policy_version 10610 (0.0008) [2023-10-14 01:27:25,106][33201] Updated weights for policy 0, policy_version 10620 (0.0009) [2023-10-14 01:27:25,127][33226] Updated weights for policy 1, policy_version 10690 (0.0008) [2023-10-14 01:27:25,253][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000010624_10878976.pth... [2023-10-14 01:27:25,286][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000008960_9175040.pth [2023-10-14 01:27:25,500][33226] Updated weights for policy 1, policy_version 10700 (0.0009) [2023-10-14 01:27:25,881][33226] Updated weights for policy 1, policy_version 10710 (0.0008) [2023-10-14 01:27:26,251][33226] Updated weights for policy 1, policy_version 10720 (0.0008) [2023-10-14 01:27:28,760][33201] Updated weights for policy 0, policy_version 10630 (0.0008) [2023-10-14 01:27:29,144][33201] Updated weights for policy 0, policy_version 10640 (0.0009) [2023-10-14 01:27:29,518][33201] Updated weights for policy 0, policy_version 10650 (0.0008) [2023-10-14 01:27:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 21856256. Throughput: 0: 1752.9, 1: 1781.3. Samples: 5473612. 
Policy #0 lag: (min: 25.0, avg: 34.4, max: 57.0) [2023-10-14 01:27:29,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.940')] [2023-10-14 01:27:29,919][33226] Updated weights for policy 1, policy_version 10730 (0.0010) [2023-10-14 01:27:30,292][33226] Updated weights for policy 1, policy_version 10740 (0.0009) [2023-10-14 01:27:30,660][33226] Updated weights for policy 1, policy_version 10750 (0.0009) [2023-10-14 01:27:33,284][33201] Updated weights for policy 0, policy_version 10660 (0.0007) [2023-10-14 01:27:33,650][33201] Updated weights for policy 0, policy_version 10670 (0.0007) [2023-10-14 01:27:34,027][33201] Updated weights for policy 0, policy_version 10680 (0.0008) [2023-10-14 01:27:34,489][33226] Updated weights for policy 1, policy_version 10760 (0.0009) [2023-10-14 01:27:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 21954560. Throughput: 0: 1782.5, 1: 1781.4. Samples: 5495970. Policy #0 lag: (min: 19.0, avg: 20.1, max: 41.0) [2023-10-14 01:27:34,557][31953] Avg episode reward: [(0, '20.530'), (1, '20.940')] [2023-10-14 01:27:34,858][33226] Updated weights for policy 1, policy_version 10770 (0.0010) [2023-10-14 01:27:35,229][33226] Updated weights for policy 1, policy_version 10780 (0.0008) [2023-10-14 01:27:37,887][33201] Updated weights for policy 0, policy_version 10690 (0.0009) [2023-10-14 01:27:38,260][33201] Updated weights for policy 0, policy_version 10700 (0.0007) [2023-10-14 01:27:38,630][33201] Updated weights for policy 0, policy_version 10710 (0.0007) [2023-10-14 01:27:39,005][33201] Updated weights for policy 0, policy_version 10720 (0.0007) [2023-10-14 01:27:39,021][33226] Updated weights for policy 1, policy_version 10790 (0.0009) [2023-10-14 01:27:39,392][33226] Updated weights for policy 1, policy_version 10800 (0.0009) [2023-10-14 01:27:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 22020096. Throughput: 0: 1754.6, 1: 1804.8. 
Samples: 5516460. Policy #0 lag: (min: 16.0, avg: 34.4, max: 48.0) [2023-10-14 01:27:39,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.940')] [2023-10-14 01:27:39,756][33226] Updated weights for policy 1, policy_version 10810 (0.0007) [2023-10-14 01:27:42,803][33201] Updated weights for policy 0, policy_version 10730 (0.0007) [2023-10-14 01:27:43,182][33201] Updated weights for policy 0, policy_version 10740 (0.0008) [2023-10-14 01:27:43,491][33226] Updated weights for policy 1, policy_version 10820 (0.0009) [2023-10-14 01:27:43,545][33201] Updated weights for policy 0, policy_version 10750 (0.0007) [2023-10-14 01:27:43,854][33226] Updated weights for policy 1, policy_version 10830 (0.0008) [2023-10-14 01:27:44,229][33226] Updated weights for policy 1, policy_version 10840 (0.0009) [2023-10-14 01:27:44,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 22118400. Throughput: 0: 1776.9, 1: 1772.9. Samples: 5527658. Policy #0 lag: (min: 16.0, avg: 34.4, max: 48.0) [2023-10-14 01:27:44,558][31953] Avg episode reward: [(0, '20.510'), (1, '20.940')] [2023-10-14 01:27:47,529][33201] Updated weights for policy 0, policy_version 10760 (0.0008) [2023-10-14 01:27:47,902][33201] Updated weights for policy 0, policy_version 10770 (0.0011) [2023-10-14 01:27:48,091][33226] Updated weights for policy 1, policy_version 10850 (0.0007) [2023-10-14 01:27:48,265][33201] Updated weights for policy 0, policy_version 10780 (0.0010) [2023-10-14 01:27:48,457][33226] Updated weights for policy 1, policy_version 10860 (0.0009) [2023-10-14 01:27:48,829][33226] Updated weights for policy 1, policy_version 10870 (0.0008) [2023-10-14 01:27:49,194][33226] Updated weights for policy 1, policy_version 10880 (0.0007) [2023-10-14 01:27:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 22183936. Throughput: 0: 1763.2, 1: 1798.4. Samples: 5548646. 
Policy #0 lag: (min: 5.0, avg: 10.6, max: 37.0) [2023-10-14 01:27:49,557][31953] Avg episode reward: [(0, '20.520'), (1, '20.940')] [2023-10-14 01:27:52,168][33201] Updated weights for policy 0, policy_version 10790 (0.0008) [2023-10-14 01:27:52,546][33201] Updated weights for policy 0, policy_version 10800 (0.0007) [2023-10-14 01:27:52,899][33226] Updated weights for policy 1, policy_version 10890 (0.0008) [2023-10-14 01:27:52,904][33201] Updated weights for policy 0, policy_version 10810 (0.0007) [2023-10-14 01:27:53,261][33226] Updated weights for policy 1, policy_version 10900 (0.0009) [2023-10-14 01:27:53,630][33226] Updated weights for policy 1, policy_version 10910 (0.0009) [2023-10-14 01:27:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 22249472. Throughput: 0: 1757.5, 1: 1765.3. Samples: 5568786. Policy #0 lag: (min: 5.0, avg: 10.6, max: 37.0) [2023-10-14 01:27:54,558][31953] Avg episode reward: [(0, '20.510'), (1, '20.930')] [2023-10-14 01:27:56,813][33201] Updated weights for policy 0, policy_version 10820 (0.0008) [2023-10-14 01:27:57,182][33201] Updated weights for policy 0, policy_version 10830 (0.0009) [2023-10-14 01:27:57,480][33226] Updated weights for policy 1, policy_version 10920 (0.0008) [2023-10-14 01:27:57,560][33201] Updated weights for policy 0, policy_version 10840 (0.0007) [2023-10-14 01:27:57,855][33226] Updated weights for policy 1, policy_version 10930 (0.0007) [2023-10-14 01:27:58,222][33226] Updated weights for policy 1, policy_version 10940 (0.0007) [2023-10-14 01:27:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 22315008. Throughput: 0: 1774.9, 1: 1794.1. Samples: 5580884. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-14 01:27:59,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.920')] [2023-10-14 01:28:01,383][33201] Updated weights for policy 0, policy_version 10850 (0.0008) [2023-10-14 01:28:01,752][33201] Updated weights for policy 0, policy_version 10860 (0.0010) [2023-10-14 01:28:02,051][33226] Updated weights for policy 1, policy_version 10950 (0.0007) [2023-10-14 01:28:02,127][33201] Updated weights for policy 0, policy_version 10870 (0.0010) [2023-10-14 01:28:02,414][33226] Updated weights for policy 1, policy_version 10960 (0.0009) [2023-10-14 01:28:02,491][33201] Updated weights for policy 0, policy_version 10880 (0.0009) [2023-10-14 01:28:02,792][33226] Updated weights for policy 1, policy_version 10970 (0.0007) [2023-10-14 01:28:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 22380544. Throughput: 0: 1746.8, 1: 1774.0. Samples: 5600554. Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-14 01:28:04,558][31953] Avg episode reward: [(0, '20.460'), (1, '20.910')] [2023-10-14 01:28:06,437][33201] Updated weights for policy 0, policy_version 10890 (0.0009) [2023-10-14 01:28:06,624][33226] Updated weights for policy 1, policy_version 10980 (0.0010) [2023-10-14 01:28:06,801][33201] Updated weights for policy 0, policy_version 10900 (0.0009) [2023-10-14 01:28:06,999][33226] Updated weights for policy 1, policy_version 10990 (0.0009) [2023-10-14 01:28:07,177][33201] Updated weights for policy 0, policy_version 10910 (0.0009) [2023-10-14 01:28:07,366][33226] Updated weights for policy 1, policy_version 11000 (0.0009) [2023-10-14 01:28:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 22446080. Throughput: 0: 1751.4, 1: 1773.8. Samples: 5622652. 
Policy #0 lag: (min: 31.0, avg: 37.4, max: 63.0) [2023-10-14 01:28:09,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.900')] [2023-10-14 01:28:11,004][33226] Updated weights for policy 1, policy_version 11010 (0.0009) [2023-10-14 01:28:11,183][33201] Updated weights for policy 0, policy_version 10920 (0.0009) [2023-10-14 01:28:11,371][33226] Updated weights for policy 1, policy_version 11020 (0.0008) [2023-10-14 01:28:11,554][33201] Updated weights for policy 0, policy_version 10930 (0.0010) [2023-10-14 01:28:11,739][33226] Updated weights for policy 1, policy_version 11030 (0.0007) [2023-10-14 01:28:11,923][33201] Updated weights for policy 0, policy_version 10940 (0.0009) [2023-10-14 01:28:12,119][33226] Updated weights for policy 1, policy_version 11040 (0.0009) [2023-10-14 01:28:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 22511616. Throughput: 0: 1751.8, 1: 1778.5. Samples: 5632476. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:28:14,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.900')] [2023-10-14 01:28:15,611][33201] Updated weights for policy 0, policy_version 10950 (0.0008) [2023-10-14 01:28:15,915][33226] Updated weights for policy 1, policy_version 11050 (0.0009) [2023-10-14 01:28:15,988][33201] Updated weights for policy 0, policy_version 10960 (0.0009) [2023-10-14 01:28:16,288][33226] Updated weights for policy 1, policy_version 11060 (0.0008) [2023-10-14 01:28:16,358][33201] Updated weights for policy 0, policy_version 10970 (0.0009) [2023-10-14 01:28:16,658][33226] Updated weights for policy 1, policy_version 11070 (0.0009) [2023-10-14 01:28:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 22577152. Throughput: 0: 1747.7, 1: 1768.3. Samples: 5654188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:28:19,558][31953] Avg episode reward: [(0, '20.470'), (1, '20.900')] [2023-10-14 01:28:20,236][33201] Updated weights for policy 0, policy_version 10980 (0.0009) [2023-10-14 01:28:20,400][33226] Updated weights for policy 1, policy_version 11080 (0.0007) [2023-10-14 01:28:20,595][33201] Updated weights for policy 0, policy_version 10990 (0.0008) [2023-10-14 01:28:20,767][33226] Updated weights for policy 1, policy_version 11090 (0.0007) [2023-10-14 01:28:20,962][33201] Updated weights for policy 0, policy_version 11000 (0.0007) [2023-10-14 01:28:21,127][33226] Updated weights for policy 1, policy_version 11100 (0.0008) [2023-10-14 01:28:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 22642688. Throughput: 0: 1779.0, 1: 1774.9. Samples: 5676388. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:28:24,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.890')] [2023-10-14 01:28:24,603][33201] Updated weights for policy 0, policy_version 11010 (0.0008) [2023-10-14 01:28:24,983][33201] Updated weights for policy 0, policy_version 11020 (0.0008) [2023-10-14 01:28:25,016][33226] Updated weights for policy 1, policy_version 11110 (0.0009) [2023-10-14 01:28:25,352][33201] Updated weights for policy 0, policy_version 11030 (0.0008) [2023-10-14 01:28:25,388][33226] Updated weights for policy 1, policy_version 11120 (0.0010) [2023-10-14 01:28:25,722][33201] Updated weights for policy 0, policy_version 11040 (0.0008) [2023-10-14 01:28:25,760][33226] Updated weights for policy 1, policy_version 11130 (0.0009) [2023-10-14 01:28:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 22708224. Throughput: 0: 1746.1, 1: 1769.2. Samples: 5685846. 
Policy #0 lag: (min: 26.0, avg: 28.8, max: 58.0) [2023-10-14 01:28:29,557][31953] Avg episode reward: [(0, '20.460'), (1, '20.900')] [2023-10-14 01:28:29,590][33201] Updated weights for policy 0, policy_version 11050 (0.0007) [2023-10-14 01:28:29,703][33226] Updated weights for policy 1, policy_version 11140 (0.0009) [2023-10-14 01:28:29,964][33201] Updated weights for policy 0, policy_version 11060 (0.0008) [2023-10-14 01:28:30,071][33226] Updated weights for policy 1, policy_version 11150 (0.0008) [2023-10-14 01:28:30,331][33201] Updated weights for policy 0, policy_version 11070 (0.0007) [2023-10-14 01:28:30,433][33226] Updated weights for policy 1, policy_version 11160 (0.0009) [2023-10-14 01:28:34,044][33201] Updated weights for policy 0, policy_version 11080 (0.0008) [2023-10-14 01:28:34,147][33226] Updated weights for policy 1, policy_version 11170 (0.0007) [2023-10-14 01:28:34,414][33201] Updated weights for policy 0, policy_version 11090 (0.0007) [2023-10-14 01:28:34,517][33226] Updated weights for policy 1, policy_version 11180 (0.0009) [2023-10-14 01:28:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 22773760. Throughput: 0: 1772.3, 1: 1771.3. Samples: 5708108. 
Policy #0 lag: (min: 26.0, avg: 28.8, max: 58.0) [2023-10-14 01:28:34,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.900')] [2023-10-14 01:28:34,784][33201] Updated weights for policy 0, policy_version 11100 (0.0009) [2023-10-14 01:28:34,887][33226] Updated weights for policy 1, policy_version 11190 (0.0008) [2023-10-14 01:28:35,253][33226] Updated weights for policy 1, policy_version 11200 (0.0008) [2023-10-14 01:28:38,750][33201] Updated weights for policy 0, policy_version 11110 (0.0008) [2023-10-14 01:28:39,134][33201] Updated weights for policy 0, policy_version 11120 (0.0009) [2023-10-14 01:28:39,141][33226] Updated weights for policy 1, policy_version 11210 (0.0008) [2023-10-14 01:28:39,498][33201] Updated weights for policy 0, policy_version 11130 (0.0008) [2023-10-14 01:28:39,512][33226] Updated weights for policy 1, policy_version 11220 (0.0007) [2023-10-14 01:28:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 22839296. Throughput: 0: 1767.0, 1: 1797.0. Samples: 5729166. Policy #0 lag: (min: 26.0, avg: 28.8, max: 58.0) [2023-10-14 01:28:39,558][31953] Avg episode reward: [(0, '20.500'), (1, '20.900')] [2023-10-14 01:28:39,895][33226] Updated weights for policy 1, policy_version 11230 (0.0009) [2023-10-14 01:28:43,469][33201] Updated weights for policy 0, policy_version 11140 (0.0009) [2023-10-14 01:28:43,832][33201] Updated weights for policy 0, policy_version 11150 (0.0007) [2023-10-14 01:28:43,840][33226] Updated weights for policy 1, policy_version 11240 (0.0008) [2023-10-14 01:28:44,203][33201] Updated weights for policy 0, policy_version 11160 (0.0008) [2023-10-14 01:28:44,215][33226] Updated weights for policy 1, policy_version 11250 (0.0007) [2023-10-14 01:28:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 22937600. Throughput: 0: 1758.0, 1: 1768.8. Samples: 5739590. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 01:28:44,558][31953] Avg episode reward: [(0, '20.320'), (1, '20.900')] [2023-10-14 01:28:44,579][33226] Updated weights for policy 1, policy_version 11260 (0.0009) [2023-10-14 01:28:47,943][33201] Updated weights for policy 0, policy_version 11170 (0.0008) [2023-10-14 01:28:48,318][33201] Updated weights for policy 0, policy_version 11180 (0.0009) [2023-10-14 01:28:48,325][33226] Updated weights for policy 1, policy_version 11270 (0.0010) [2023-10-14 01:28:48,686][33201] Updated weights for policy 0, policy_version 11190 (0.0007) [2023-10-14 01:28:48,688][33226] Updated weights for policy 1, policy_version 11280 (0.0009) [2023-10-14 01:28:49,060][33226] Updated weights for policy 1, policy_version 11290 (0.0010) [2023-10-14 01:28:49,061][33201] Updated weights for policy 0, policy_version 11200 (0.0007) [2023-10-14 01:28:49,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 23035904. Throughput: 0: 1779.5, 1: 1790.3. Samples: 5761194. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 01:28:49,558][31953] Avg episode reward: [(0, '20.300'), (1, '20.900')] [2023-10-14 01:28:52,800][33226] Updated weights for policy 1, policy_version 11300 (0.0008) [2023-10-14 01:28:52,838][33201] Updated weights for policy 0, policy_version 11210 (0.0007) [2023-10-14 01:28:53,176][33226] Updated weights for policy 1, policy_version 11310 (0.0009) [2023-10-14 01:28:53,219][33201] Updated weights for policy 0, policy_version 11220 (0.0008) [2023-10-14 01:28:53,539][33226] Updated weights for policy 1, policy_version 11320 (0.0008) [2023-10-14 01:28:53,590][33201] Updated weights for policy 0, policy_version 11230 (0.0008) [2023-10-14 01:28:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 23101440. Throughput: 0: 1756.3, 1: 1758.7. Samples: 5780826. 
Policy #0 lag: (min: 17.0, avg: 27.3, max: 49.0) [2023-10-14 01:28:54,558][31953] Avg episode reward: [(0, '20.310'), (1, '20.910')] [2023-10-14 01:28:57,305][33226] Updated weights for policy 1, policy_version 11330 (0.0009) [2023-10-14 01:28:57,340][33201] Updated weights for policy 0, policy_version 11240 (0.0007) [2023-10-14 01:28:57,664][33226] Updated weights for policy 1, policy_version 11340 (0.0010) [2023-10-14 01:28:57,713][33201] Updated weights for policy 0, policy_version 11250 (0.0009) [2023-10-14 01:28:58,029][33226] Updated weights for policy 1, policy_version 11350 (0.0008) [2023-10-14 01:28:58,080][33201] Updated weights for policy 0, policy_version 11260 (0.0007) [2023-10-14 01:28:58,393][33226] Updated weights for policy 1, policy_version 11360 (0.0008) [2023-10-14 01:28:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 23166976. Throughput: 0: 1784.5, 1: 1788.6. Samples: 5793264. Policy #0 lag: (min: 17.0, avg: 27.3, max: 49.0) [2023-10-14 01:28:59,558][31953] Avg episode reward: [(0, '20.310'), (1, '20.910')] [2023-10-14 01:29:01,716][33201] Updated weights for policy 0, policy_version 11270 (0.0008) [2023-10-14 01:29:02,085][33201] Updated weights for policy 0, policy_version 11280 (0.0007) [2023-10-14 01:29:02,161][33226] Updated weights for policy 1, policy_version 11370 (0.0009) [2023-10-14 01:29:02,460][33201] Updated weights for policy 0, policy_version 11290 (0.0008) [2023-10-14 01:29:02,521][33226] Updated weights for policy 1, policy_version 11380 (0.0008) [2023-10-14 01:29:02,889][33226] Updated weights for policy 1, policy_version 11390 (0.0010) [2023-10-14 01:29:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 23232512. Throughput: 0: 1761.0, 1: 1765.3. Samples: 5812874. 
Policy #0 lag: (min: 17.0, avg: 27.3, max: 49.0) [2023-10-14 01:29:04,558][31953] Avg episode reward: [(0, '20.330'), (1, '20.910')] [2023-10-14 01:29:06,401][33201] Updated weights for policy 0, policy_version 11300 (0.0009) [2023-10-14 01:29:06,766][33201] Updated weights for policy 0, policy_version 11310 (0.0008) [2023-10-14 01:29:06,789][33226] Updated weights for policy 1, policy_version 11400 (0.0010) [2023-10-14 01:29:07,133][33201] Updated weights for policy 0, policy_version 11320 (0.0008) [2023-10-14 01:29:07,153][33226] Updated weights for policy 1, policy_version 11410 (0.0007) [2023-10-14 01:29:07,512][33226] Updated weights for policy 1, policy_version 11420 (0.0008) [2023-10-14 01:29:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 23298048. Throughput: 0: 1760.4, 1: 1762.0. Samples: 5834898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:09,558][31953] Avg episode reward: [(0, '20.330'), (1, '20.900')] [2023-10-14 01:29:10,821][33201] Updated weights for policy 0, policy_version 11330 (0.0008) [2023-10-14 01:29:11,186][33201] Updated weights for policy 0, policy_version 11340 (0.0007) [2023-10-14 01:29:11,293][33226] Updated weights for policy 1, policy_version 11430 (0.0007) [2023-10-14 01:29:11,551][33201] Updated weights for policy 0, policy_version 11350 (0.0008) [2023-10-14 01:29:11,665][33226] Updated weights for policy 1, policy_version 11440 (0.0009) [2023-10-14 01:29:11,928][33201] Updated weights for policy 0, policy_version 11360 (0.0008) [2023-10-14 01:29:12,023][33226] Updated weights for policy 1, policy_version 11450 (0.0009) [2023-10-14 01:29:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 23363584. Throughput: 0: 1764.0, 1: 1772.8. Samples: 5845000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:14,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.870')] [2023-10-14 01:29:15,850][33226] Updated weights for policy 1, policy_version 11460 (0.0007) [2023-10-14 01:29:15,972][33201] Updated weights for policy 0, policy_version 11370 (0.0008) [2023-10-14 01:29:16,215][33226] Updated weights for policy 1, policy_version 11470 (0.0008) [2023-10-14 01:29:16,339][33201] Updated weights for policy 0, policy_version 11380 (0.0007) [2023-10-14 01:29:16,583][33226] Updated weights for policy 1, policy_version 11480 (0.0009) [2023-10-14 01:29:16,718][33201] Updated weights for policy 0, policy_version 11390 (0.0008) [2023-10-14 01:29:19,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 23429120. Throughput: 0: 1758.4, 1: 1761.4. Samples: 5866502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:19,558][31953] Avg episode reward: [(0, '20.360'), (1, '20.850')] [2023-10-14 01:29:20,442][33226] Updated weights for policy 1, policy_version 11490 (0.0009) [2023-10-14 01:29:20,607][33201] Updated weights for policy 0, policy_version 11400 (0.0008) [2023-10-14 01:29:20,807][33226] Updated weights for policy 1, policy_version 11500 (0.0007) [2023-10-14 01:29:20,986][33201] Updated weights for policy 0, policy_version 11410 (0.0009) [2023-10-14 01:29:21,172][33226] Updated weights for policy 1, policy_version 11510 (0.0007) [2023-10-14 01:29:21,355][33201] Updated weights for policy 0, policy_version 11420 (0.0010) [2023-10-14 01:29:21,537][33226] Updated weights for policy 1, policy_version 11520 (0.0009) [2023-10-14 01:29:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 23494656. Throughput: 0: 1773.3, 1: 1766.5. Samples: 5888456. 
Policy #0 lag: (min: 24.0, avg: 52.3, max: 56.0) [2023-10-14 01:29:24,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.850')] [2023-10-14 01:29:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000011424_11698176.pth... [2023-10-14 01:29:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000011520_11796480.pth... [2023-10-14 01:29:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000009792_10027008.pth [2023-10-14 01:29:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000009856_10092544.pth [2023-10-14 01:29:25,227][33201] Updated weights for policy 0, policy_version 11430 (0.0008) [2023-10-14 01:29:25,379][33226] Updated weights for policy 1, policy_version 11530 (0.0008) [2023-10-14 01:29:25,592][33201] Updated weights for policy 0, policy_version 11440 (0.0009) [2023-10-14 01:29:25,749][33226] Updated weights for policy 1, policy_version 11540 (0.0008) [2023-10-14 01:29:25,971][33201] Updated weights for policy 0, policy_version 11450 (0.0010) [2023-10-14 01:29:26,126][33226] Updated weights for policy 1, policy_version 11550 (0.0008) [2023-10-14 01:29:29,556][33201] Updated weights for policy 0, policy_version 11460 (0.0008) [2023-10-14 01:29:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 23560192. Throughput: 0: 1761.7, 1: 1757.7. Samples: 5897962. 
Policy #0 lag: (min: 24.0, avg: 52.3, max: 56.0) [2023-10-14 01:29:29,558][31953] Avg episode reward: [(0, '20.360'), (1, '20.850')] [2023-10-14 01:29:29,932][33201] Updated weights for policy 0, policy_version 11470 (0.0008) [2023-10-14 01:29:29,967][33226] Updated weights for policy 1, policy_version 11560 (0.0009) [2023-10-14 01:29:30,299][33201] Updated weights for policy 0, policy_version 11480 (0.0009) [2023-10-14 01:29:30,333][33226] Updated weights for policy 1, policy_version 11570 (0.0007) [2023-10-14 01:29:30,701][33226] Updated weights for policy 1, policy_version 11580 (0.0007) [2023-10-14 01:29:34,164][33201] Updated weights for policy 0, policy_version 11490 (0.0009) [2023-10-14 01:29:34,369][33226] Updated weights for policy 1, policy_version 11590 (0.0008) [2023-10-14 01:29:34,535][33201] Updated weights for policy 0, policy_version 11500 (0.0009) [2023-10-14 01:29:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 23625728. Throughput: 0: 1766.6, 1: 1764.5. Samples: 5920096. 
Policy #0 lag: (min: 24.0, avg: 52.3, max: 56.0) [2023-10-14 01:29:34,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.840')] [2023-10-14 01:29:34,740][33226] Updated weights for policy 1, policy_version 11600 (0.0007) [2023-10-14 01:29:34,895][33201] Updated weights for policy 0, policy_version 11510 (0.0007) [2023-10-14 01:29:35,108][33226] Updated weights for policy 1, policy_version 11610 (0.0008) [2023-10-14 01:29:35,267][33201] Updated weights for policy 0, policy_version 11520 (0.0007) [2023-10-14 01:29:38,892][33226] Updated weights for policy 1, policy_version 11620 (0.0008) [2023-10-14 01:29:39,111][33201] Updated weights for policy 0, policy_version 11530 (0.0008) [2023-10-14 01:29:39,257][33226] Updated weights for policy 1, policy_version 11630 (0.0007) [2023-10-14 01:29:39,485][33201] Updated weights for policy 0, policy_version 11540 (0.0009) [2023-10-14 01:29:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 23691264. Throughput: 0: 1773.6, 1: 1794.8. Samples: 5941406. 
Policy #0 lag: (min: 4.0, avg: 4.6, max: 20.0) [2023-10-14 01:29:39,558][31953] Avg episode reward: [(0, '20.360'), (1, '20.840')] [2023-10-14 01:29:39,625][33226] Updated weights for policy 1, policy_version 11640 (0.0008) [2023-10-14 01:29:39,859][33201] Updated weights for policy 0, policy_version 11550 (0.0009) [2023-10-14 01:29:43,386][33226] Updated weights for policy 1, policy_version 11650 (0.0009) [2023-10-14 01:29:43,743][33226] Updated weights for policy 1, policy_version 11660 (0.0009) [2023-10-14 01:29:43,903][33201] Updated weights for policy 0, policy_version 11560 (0.0009) [2023-10-14 01:29:44,113][33226] Updated weights for policy 1, policy_version 11670 (0.0009) [2023-10-14 01:29:44,281][33201] Updated weights for policy 0, policy_version 11570 (0.0009) [2023-10-14 01:29:44,478][33226] Updated weights for policy 1, policy_version 11680 (0.0009) [2023-10-14 01:29:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 23789568. Throughput: 0: 1755.3, 1: 1765.3. Samples: 5951690. Policy #0 lag: (min: 4.0, avg: 4.6, max: 20.0) [2023-10-14 01:29:44,558][31953] Avg episode reward: [(0, '20.410'), (1, '20.830')] [2023-10-14 01:29:44,661][33201] Updated weights for policy 0, policy_version 11580 (0.0008) [2023-10-14 01:29:48,259][33226] Updated weights for policy 1, policy_version 11690 (0.0008) [2023-10-14 01:29:48,473][33201] Updated weights for policy 0, policy_version 11590 (0.0009) [2023-10-14 01:29:48,635][33226] Updated weights for policy 1, policy_version 11700 (0.0009) [2023-10-14 01:29:48,845][33201] Updated weights for policy 0, policy_version 11600 (0.0009) [2023-10-14 01:29:48,996][33226] Updated weights for policy 1, policy_version 11710 (0.0007) [2023-10-14 01:29:49,210][33201] Updated weights for policy 0, policy_version 11610 (0.0009) [2023-10-14 01:29:49,557][31953] Fps is (10 sec: 19661.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 23887872. Throughput: 0: 1771.1, 1: 1790.7. 
Samples: 5973154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:49,557][31953] Avg episode reward: [(0, '20.430'), (1, '20.840')] [2023-10-14 01:29:52,742][33226] Updated weights for policy 1, policy_version 11720 (0.0008) [2023-10-14 01:29:53,105][33226] Updated weights for policy 1, policy_version 11730 (0.0009) [2023-10-14 01:29:53,195][33201] Updated weights for policy 0, policy_version 11620 (0.0008) [2023-10-14 01:29:53,481][33226] Updated weights for policy 1, policy_version 11740 (0.0007) [2023-10-14 01:29:53,566][33201] Updated weights for policy 0, policy_version 11630 (0.0007) [2023-10-14 01:29:53,939][33201] Updated weights for policy 0, policy_version 11640 (0.0009) [2023-10-14 01:29:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 23953408. Throughput: 0: 1743.8, 1: 1766.7. Samples: 5992872. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:54,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.850')] [2023-10-14 01:29:57,191][33226] Updated weights for policy 1, policy_version 11750 (0.0008) [2023-10-14 01:29:57,533][33201] Updated weights for policy 0, policy_version 11650 (0.0008) [2023-10-14 01:29:57,563][33226] Updated weights for policy 1, policy_version 11760 (0.0008) [2023-10-14 01:29:57,903][33201] Updated weights for policy 0, policy_version 11660 (0.0008) [2023-10-14 01:29:57,933][33226] Updated weights for policy 1, policy_version 11770 (0.0007) [2023-10-14 01:29:58,273][33201] Updated weights for policy 0, policy_version 11670 (0.0008) [2023-10-14 01:29:58,645][33201] Updated weights for policy 0, policy_version 11680 (0.0009) [2023-10-14 01:29:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 24018944. Throughput: 0: 1770.2, 1: 1790.8. Samples: 6005246. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:29:59,558][31953] Avg episode reward: [(0, '20.440'), (1, '20.850')] [2023-10-14 01:30:01,725][33226] Updated weights for policy 1, policy_version 11780 (0.0009) [2023-10-14 01:30:02,098][33226] Updated weights for policy 1, policy_version 11790 (0.0008) [2023-10-14 01:30:02,463][33226] Updated weights for policy 1, policy_version 11800 (0.0009) [2023-10-14 01:30:02,499][33201] Updated weights for policy 0, policy_version 11690 (0.0009) [2023-10-14 01:30:02,867][33201] Updated weights for policy 0, policy_version 11700 (0.0007) [2023-10-14 01:30:03,237][33201] Updated weights for policy 0, policy_version 11710 (0.0007) [2023-10-14 01:30:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 24084480. Throughput: 0: 1749.7, 1: 1769.2. Samples: 6024850. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:30:04,558][31953] Avg episode reward: [(0, '20.460'), (1, '20.860')] [2023-10-14 01:30:06,388][33226] Updated weights for policy 1, policy_version 11810 (0.0008) [2023-10-14 01:30:06,751][33226] Updated weights for policy 1, policy_version 11820 (0.0008) [2023-10-14 01:30:07,124][33226] Updated weights for policy 1, policy_version 11830 (0.0010) [2023-10-14 01:30:07,255][33201] Updated weights for policy 0, policy_version 11720 (0.0009) [2023-10-14 01:30:07,496][33226] Updated weights for policy 1, policy_version 11840 (0.0008) [2023-10-14 01:30:07,629][33201] Updated weights for policy 0, policy_version 11730 (0.0009) [2023-10-14 01:30:08,008][33201] Updated weights for policy 0, policy_version 11740 (0.0009) [2023-10-14 01:30:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 24150016. Throughput: 0: 1739.0, 1: 1773.5. Samples: 6046522. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:30:09,558][31953] Avg episode reward: [(0, '20.460'), (1, '20.860')] [2023-10-14 01:30:11,196][33226] Updated weights for policy 1, policy_version 11850 (0.0009) [2023-10-14 01:30:11,568][33226] Updated weights for policy 1, policy_version 11860 (0.0008) [2023-10-14 01:30:11,743][33201] Updated weights for policy 0, policy_version 11750 (0.0008) [2023-10-14 01:30:11,928][33226] Updated weights for policy 1, policy_version 11870 (0.0007) [2023-10-14 01:30:12,121][33201] Updated weights for policy 0, policy_version 11760 (0.0007) [2023-10-14 01:30:12,489][33201] Updated weights for policy 0, policy_version 11770 (0.0007) [2023-10-14 01:30:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 24215552. Throughput: 0: 1759.0, 1: 1778.2. Samples: 6057136. Policy #0 lag: (min: 9.0, avg: 14.2, max: 41.0) [2023-10-14 01:30:14,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.870')] [2023-10-14 01:30:15,662][33226] Updated weights for policy 1, policy_version 11880 (0.0009) [2023-10-14 01:30:16,025][33226] Updated weights for policy 1, policy_version 11890 (0.0008) [2023-10-14 01:30:16,361][33201] Updated weights for policy 0, policy_version 11780 (0.0008) [2023-10-14 01:30:16,388][33226] Updated weights for policy 1, policy_version 11900 (0.0007) [2023-10-14 01:30:16,730][33201] Updated weights for policy 0, policy_version 11790 (0.0009) [2023-10-14 01:30:17,103][33201] Updated weights for policy 0, policy_version 11800 (0.0009) [2023-10-14 01:30:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 24281088. Throughput: 0: 1738.7, 1: 1778.3. Samples: 6078360. 
Policy #0 lag: (min: 9.0, avg: 14.2, max: 41.0) [2023-10-14 01:30:19,558][31953] Avg episode reward: [(0, '20.450'), (1, '20.870')] [2023-10-14 01:30:20,208][33226] Updated weights for policy 1, policy_version 11910 (0.0007) [2023-10-14 01:30:20,581][33226] Updated weights for policy 1, policy_version 11920 (0.0008) [2023-10-14 01:30:20,816][33201] Updated weights for policy 0, policy_version 11810 (0.0008) [2023-10-14 01:30:20,958][33226] Updated weights for policy 1, policy_version 11930 (0.0007) [2023-10-14 01:30:21,192][33201] Updated weights for policy 0, policy_version 11820 (0.0009) [2023-10-14 01:30:21,562][33201] Updated weights for policy 0, policy_version 11830 (0.0008) [2023-10-14 01:30:21,934][33201] Updated weights for policy 0, policy_version 11840 (0.0008) [2023-10-14 01:30:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 24346624. Throughput: 0: 1749.5, 1: 1782.0. Samples: 6100320. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:30:24,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.850')] [2023-10-14 01:30:24,809][33226] Updated weights for policy 1, policy_version 11940 (0.0008) [2023-10-14 01:30:25,187][33226] Updated weights for policy 1, policy_version 11950 (0.0008) [2023-10-14 01:30:25,566][33226] Updated weights for policy 1, policy_version 11960 (0.0009) [2023-10-14 01:30:25,927][33201] Updated weights for policy 0, policy_version 11850 (0.0010) [2023-10-14 01:30:26,303][33201] Updated weights for policy 0, policy_version 11860 (0.0010) [2023-10-14 01:30:26,682][33201] Updated weights for policy 0, policy_version 11870 (0.0009) [2023-10-14 01:30:29,443][33226] Updated weights for policy 1, policy_version 11970 (0.0007) [2023-10-14 01:30:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 24412160. Throughput: 0: 1739.7, 1: 1775.8. Samples: 6109890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:30:29,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.860')] [2023-10-14 01:30:29,822][33226] Updated weights for policy 1, policy_version 11980 (0.0010) [2023-10-14 01:30:30,195][33226] Updated weights for policy 1, policy_version 11990 (0.0010) [2023-10-14 01:30:30,557][33226] Updated weights for policy 1, policy_version 12000 (0.0009) [2023-10-14 01:30:30,722][33201] Updated weights for policy 0, policy_version 11880 (0.0009) [2023-10-14 01:30:31,098][33201] Updated weights for policy 0, policy_version 11890 (0.0008) [2023-10-14 01:30:31,475][33201] Updated weights for policy 0, policy_version 11900 (0.0009) [2023-10-14 01:30:34,430][33226] Updated weights for policy 1, policy_version 12010 (0.0011) [2023-10-14 01:30:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 24477696. Throughput: 0: 1745.0, 1: 1778.8. Samples: 6131726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:30:34,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.860')] [2023-10-14 01:30:34,795][33226] Updated weights for policy 1, policy_version 12020 (0.0009) [2023-10-14 01:30:35,169][33226] Updated weights for policy 1, policy_version 12030 (0.0008) [2023-10-14 01:30:35,429][33201] Updated weights for policy 0, policy_version 11910 (0.0009) [2023-10-14 01:30:35,811][33201] Updated weights for policy 0, policy_version 11920 (0.0008) [2023-10-14 01:30:36,180][33201] Updated weights for policy 0, policy_version 11930 (0.0009) [2023-10-14 01:30:38,988][33226] Updated weights for policy 1, policy_version 12040 (0.0007) [2023-10-14 01:30:39,366][33226] Updated weights for policy 1, policy_version 12050 (0.0007) [2023-10-14 01:30:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 24543232. Throughput: 0: 1765.4, 1: 1798.2. Samples: 6153236. 
Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-14 01:30:39,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.860')] [2023-10-14 01:30:39,732][33226] Updated weights for policy 1, policy_version 12060 (0.0009) [2023-10-14 01:30:39,921][33201] Updated weights for policy 0, policy_version 11940 (0.0009) [2023-10-14 01:30:40,287][33201] Updated weights for policy 0, policy_version 11950 (0.0011) [2023-10-14 01:30:40,651][33201] Updated weights for policy 0, policy_version 11960 (0.0009) [2023-10-14 01:30:43,583][33226] Updated weights for policy 1, policy_version 12070 (0.0008) [2023-10-14 01:30:43,951][33226] Updated weights for policy 1, policy_version 12080 (0.0010) [2023-10-14 01:30:44,318][33226] Updated weights for policy 1, policy_version 12090 (0.0010) [2023-10-14 01:30:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 24641536. Throughput: 0: 1737.0, 1: 1770.8. Samples: 6163098. Policy #0 lag: (min: 1.0, avg: 15.7, max: 33.0) [2023-10-14 01:30:44,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.860')] [2023-10-14 01:30:44,602][33201] Updated weights for policy 0, policy_version 11970 (0.0010) [2023-10-14 01:30:44,981][33201] Updated weights for policy 0, policy_version 11980 (0.0010) [2023-10-14 01:30:45,356][33201] Updated weights for policy 0, policy_version 11990 (0.0007) [2023-10-14 01:30:45,734][33201] Updated weights for policy 0, policy_version 12000 (0.0007) [2023-10-14 01:30:48,015][33226] Updated weights for policy 1, policy_version 12100 (0.0009) [2023-10-14 01:30:48,379][33226] Updated weights for policy 1, policy_version 12110 (0.0008) [2023-10-14 01:30:48,756][33226] Updated weights for policy 1, policy_version 12120 (0.0007) [2023-10-14 01:30:49,509][33201] Updated weights for policy 0, policy_version 12010 (0.0010) [2023-10-14 01:30:49,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.2, 300 sec: 14106.9). Total num frames: 24707072. Throughput: 0: 1764.0, 1: 1801.0. 
Samples: 6185274. Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) [2023-10-14 01:30:49,559][31953] Avg episode reward: [(0, '20.580'), (1, '20.840')] [2023-10-14 01:30:49,886][33201] Updated weights for policy 0, policy_version 12020 (0.0010) [2023-10-14 01:30:50,269][33201] Updated weights for policy 0, policy_version 12030 (0.0010) [2023-10-14 01:30:52,578][33226] Updated weights for policy 1, policy_version 12130 (0.0007) [2023-10-14 01:30:52,954][33226] Updated weights for policy 1, policy_version 12140 (0.0008) [2023-10-14 01:30:53,323][33226] Updated weights for policy 1, policy_version 12150 (0.0009) [2023-10-14 01:30:53,682][33226] Updated weights for policy 1, policy_version 12160 (0.0009) [2023-10-14 01:30:54,057][33201] Updated weights for policy 0, policy_version 12040 (0.0008) [2023-10-14 01:30:54,428][33201] Updated weights for policy 0, policy_version 12050 (0.0007) [2023-10-14 01:30:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 24772608. Throughput: 0: 1767.8, 1: 1769.3. Samples: 6205688. Policy #0 lag: (min: 31.0, avg: 46.3, max: 63.0) [2023-10-14 01:30:54,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.840')] [2023-10-14 01:30:54,795][33201] Updated weights for policy 0, policy_version 12060 (0.0008) [2023-10-14 01:30:57,434][33226] Updated weights for policy 1, policy_version 12170 (0.0008) [2023-10-14 01:30:57,804][33226] Updated weights for policy 1, policy_version 12180 (0.0010) [2023-10-14 01:30:58,174][33226] Updated weights for policy 1, policy_version 12190 (0.0011) [2023-10-14 01:30:58,563][33201] Updated weights for policy 0, policy_version 12070 (0.0009) [2023-10-14 01:30:58,928][33201] Updated weights for policy 0, policy_version 12080 (0.0008) [2023-10-14 01:30:59,303][33201] Updated weights for policy 0, policy_version 12090 (0.0007) [2023-10-14 01:30:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 24870912. 
Throughput: 0: 1758.4, 1: 1796.3. Samples: 6217096. Policy #0 lag: (min: 22.0, avg: 31.8, max: 54.0) [2023-10-14 01:30:59,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.810')] [2023-10-14 01:31:01,937][33226] Updated weights for policy 1, policy_version 12200 (0.0007) [2023-10-14 01:31:02,302][33226] Updated weights for policy 1, policy_version 12210 (0.0008) [2023-10-14 01:31:02,663][33226] Updated weights for policy 1, policy_version 12220 (0.0007) [2023-10-14 01:31:03,081][33201] Updated weights for policy 0, policy_version 12100 (0.0008) [2023-10-14 01:31:03,457][33201] Updated weights for policy 0, policy_version 12110 (0.0007) [2023-10-14 01:31:03,841][33201] Updated weights for policy 0, policy_version 12120 (0.0009) [2023-10-14 01:31:04,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 24936448. Throughput: 0: 1773.6, 1: 1763.1. Samples: 6237512. Policy #0 lag: (min: 22.0, avg: 31.8, max: 54.0) [2023-10-14 01:31:04,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.770')] [2023-10-14 01:31:06,582][33226] Updated weights for policy 1, policy_version 12230 (0.0009) [2023-10-14 01:31:06,959][33226] Updated weights for policy 1, policy_version 12240 (0.0010) [2023-10-14 01:31:07,330][33226] Updated weights for policy 1, policy_version 12250 (0.0009) [2023-10-14 01:31:07,540][33201] Updated weights for policy 0, policy_version 12130 (0.0009) [2023-10-14 01:31:07,916][33201] Updated weights for policy 0, policy_version 12140 (0.0009) [2023-10-14 01:31:08,295][33201] Updated weights for policy 0, policy_version 12150 (0.0008) [2023-10-14 01:31:08,671][33201] Updated weights for policy 0, policy_version 12160 (0.0008) [2023-10-14 01:31:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25001984. Throughput: 0: 1750.8, 1: 1761.9. Samples: 6258392. 
Policy #0 lag: (min: 22.0, avg: 31.8, max: 54.0) [2023-10-14 01:31:09,558][31953] Avg episode reward: [(0, '20.540'), (1, '20.740')] [2023-10-14 01:31:11,034][33226] Updated weights for policy 1, policy_version 12260 (0.0009) [2023-10-14 01:31:11,401][33226] Updated weights for policy 1, policy_version 12270 (0.0010) [2023-10-14 01:31:11,769][33226] Updated weights for policy 1, policy_version 12280 (0.0009) [2023-10-14 01:31:12,458][33201] Updated weights for policy 0, policy_version 12170 (0.0009) [2023-10-14 01:31:12,835][33201] Updated weights for policy 0, policy_version 12180 (0.0009) [2023-10-14 01:31:13,210][33201] Updated weights for policy 0, policy_version 12190 (0.0010) [2023-10-14 01:31:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25067520. Throughput: 0: 1784.4, 1: 1769.0. Samples: 6269794. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-14 01:31:14,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.750')] [2023-10-14 01:31:15,444][33226] Updated weights for policy 1, policy_version 12290 (0.0009) [2023-10-14 01:31:15,819][33226] Updated weights for policy 1, policy_version 12300 (0.0009) [2023-10-14 01:31:16,179][33226] Updated weights for policy 1, policy_version 12310 (0.0009) [2023-10-14 01:31:16,549][33226] Updated weights for policy 1, policy_version 12320 (0.0007) [2023-10-14 01:31:16,854][33201] Updated weights for policy 0, policy_version 12200 (0.0009) [2023-10-14 01:31:17,229][33201] Updated weights for policy 0, policy_version 12210 (0.0009) [2023-10-14 01:31:17,597][33201] Updated weights for policy 0, policy_version 12220 (0.0008) [2023-10-14 01:31:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25133056. Throughput: 0: 1760.5, 1: 1766.4. Samples: 6290438. 
Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-14 01:31:19,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.740')] [2023-10-14 01:31:20,372][33226] Updated weights for policy 1, policy_version 12330 (0.0009) [2023-10-14 01:31:20,746][33226] Updated weights for policy 1, policy_version 12340 (0.0008) [2023-10-14 01:31:21,117][33226] Updated weights for policy 1, policy_version 12350 (0.0008) [2023-10-14 01:31:21,664][33201] Updated weights for policy 0, policy_version 12230 (0.0008) [2023-10-14 01:31:22,038][33201] Updated weights for policy 0, policy_version 12240 (0.0009) [2023-10-14 01:31:22,418][33201] Updated weights for policy 0, policy_version 12250 (0.0009) [2023-10-14 01:31:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25198592. Throughput: 0: 1765.4, 1: 1771.6. Samples: 6312402. Policy #0 lag: (min: 31.0, avg: 39.9, max: 63.0) [2023-10-14 01:31:24,558][31953] Avg episode reward: [(0, '20.540'), (1, '20.760')] [2023-10-14 01:31:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000012352_12648448.pth... [2023-10-14 01:31:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000012256_12550144.pth... 
[2023-10-14 01:31:24,604][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000010624_10878976.pth [2023-10-14 01:31:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000010688_10944512.pth [2023-10-14 01:31:25,003][33226] Updated weights for policy 1, policy_version 12360 (0.0008) [2023-10-14 01:31:25,372][33226] Updated weights for policy 1, policy_version 12370 (0.0008) [2023-10-14 01:31:25,742][33226] Updated weights for policy 1, policy_version 12380 (0.0007) [2023-10-14 01:31:26,041][33201] Updated weights for policy 0, policy_version 12260 (0.0008) [2023-10-14 01:31:26,423][33201] Updated weights for policy 0, policy_version 12270 (0.0010) [2023-10-14 01:31:26,791][33201] Updated weights for policy 0, policy_version 12280 (0.0010) [2023-10-14 01:31:29,486][33226] Updated weights for policy 1, policy_version 12390 (0.0008) [2023-10-14 01:31:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 25264128. Throughput: 0: 1770.7, 1: 1764.5. Samples: 6322184. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-14 01:31:29,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.740')] [2023-10-14 01:31:29,851][33226] Updated weights for policy 1, policy_version 12400 (0.0008) [2023-10-14 01:31:30,210][33226] Updated weights for policy 1, policy_version 12410 (0.0008) [2023-10-14 01:31:30,609][33201] Updated weights for policy 0, policy_version 12290 (0.0009) [2023-10-14 01:31:30,981][33201] Updated weights for policy 0, policy_version 12300 (0.0007) [2023-10-14 01:31:31,356][33201] Updated weights for policy 0, policy_version 12310 (0.0008) [2023-10-14 01:31:31,718][33201] Updated weights for policy 0, policy_version 12320 (0.0009) [2023-10-14 01:31:33,978][33226] Updated weights for policy 1, policy_version 12420 (0.0008) [2023-10-14 01:31:34,355][33226] Updated weights for policy 1, policy_version 12430 (0.0011) [2023-10-14 01:31:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 25329664. Throughput: 0: 1768.1, 1: 1772.3. Samples: 6344590. 
Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-14 01:31:34,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.730')] [2023-10-14 01:31:34,719][33226] Updated weights for policy 1, policy_version 12440 (0.0008) [2023-10-14 01:31:35,524][33201] Updated weights for policy 0, policy_version 12330 (0.0007) [2023-10-14 01:31:35,903][33201] Updated weights for policy 0, policy_version 12340 (0.0008) [2023-10-14 01:31:36,275][33201] Updated weights for policy 0, policy_version 12350 (0.0009) [2023-10-14 01:31:38,382][33226] Updated weights for policy 1, policy_version 12450 (0.0008) [2023-10-14 01:31:38,758][33226] Updated weights for policy 1, policy_version 12460 (0.0009) [2023-10-14 01:31:39,129][33226] Updated weights for policy 1, policy_version 12470 (0.0007) [2023-10-14 01:31:39,500][33226] Updated weights for policy 1, policy_version 12480 (0.0008) [2023-10-14 01:31:39,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 25427968. Throughput: 0: 1775.2, 1: 1790.6. Samples: 6366146. Policy #0 lag: (min: 31.0, avg: 36.1, max: 63.0) [2023-10-14 01:31:39,557][31953] Avg episode reward: [(0, '20.580'), (1, '20.700')] [2023-10-14 01:31:40,087][33201] Updated weights for policy 0, policy_version 12360 (0.0009) [2023-10-14 01:31:40,466][33201] Updated weights for policy 0, policy_version 12370 (0.0010) [2023-10-14 01:31:40,844][33201] Updated weights for policy 0, policy_version 12380 (0.0007) [2023-10-14 01:31:43,063][33226] Updated weights for policy 1, policy_version 12490 (0.0007) [2023-10-14 01:31:43,431][33226] Updated weights for policy 1, policy_version 12500 (0.0007) [2023-10-14 01:31:43,794][33226] Updated weights for policy 1, policy_version 12510 (0.0008) [2023-10-14 01:31:44,446][33201] Updated weights for policy 0, policy_version 12390 (0.0007) [2023-10-14 01:31:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 25493504. Throughput: 0: 1764.1, 1: 1785.6. 
Samples: 6376830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:31:44,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.690')] [2023-10-14 01:31:44,820][33201] Updated weights for policy 0, policy_version 12400 (0.0007) [2023-10-14 01:31:45,187][33201] Updated weights for policy 0, policy_version 12410 (0.0007) [2023-10-14 01:31:47,644][33226] Updated weights for policy 1, policy_version 12520 (0.0008) [2023-10-14 01:31:48,012][33226] Updated weights for policy 1, policy_version 12530 (0.0010) [2023-10-14 01:31:48,381][33226] Updated weights for policy 1, policy_version 12540 (0.0007) [2023-10-14 01:31:49,045][33201] Updated weights for policy 0, policy_version 12420 (0.0007) [2023-10-14 01:31:49,409][33201] Updated weights for policy 0, policy_version 12430 (0.0007) [2023-10-14 01:31:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 25559040. Throughput: 0: 1771.6, 1: 1805.2. Samples: 6398466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:31:49,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.630')] [2023-10-14 01:31:49,780][33201] Updated weights for policy 0, policy_version 12440 (0.0009) [2023-10-14 01:31:52,151][33226] Updated weights for policy 1, policy_version 12550 (0.0009) [2023-10-14 01:31:52,544][33226] Updated weights for policy 1, policy_version 12560 (0.0008) [2023-10-14 01:31:52,911][33226] Updated weights for policy 1, policy_version 12570 (0.0008) [2023-10-14 01:31:53,581][33201] Updated weights for policy 0, policy_version 12450 (0.0008) [2023-10-14 01:31:53,952][33201] Updated weights for policy 0, policy_version 12460 (0.0008) [2023-10-14 01:31:54,320][33201] Updated weights for policy 0, policy_version 12470 (0.0009) [2023-10-14 01:31:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 25624576. Throughput: 0: 1789.2, 1: 1792.8. Samples: 6419584. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:31:54,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.620')] [2023-10-14 01:31:54,695][33201] Updated weights for policy 0, policy_version 12480 (0.0008) [2023-10-14 01:31:56,601][33226] Updated weights for policy 1, policy_version 12580 (0.0008) [2023-10-14 01:31:56,970][33226] Updated weights for policy 1, policy_version 12590 (0.0009) [2023-10-14 01:31:57,333][33226] Updated weights for policy 1, policy_version 12600 (0.0007) [2023-10-14 01:31:58,475][33201] Updated weights for policy 0, policy_version 12490 (0.0010) [2023-10-14 01:31:58,846][33201] Updated weights for policy 0, policy_version 12500 (0.0008) [2023-10-14 01:31:59,223][33201] Updated weights for policy 0, policy_version 12510 (0.0010) [2023-10-14 01:31:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25722880. Throughput: 0: 1772.4, 1: 1802.4. Samples: 6430660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:31:59,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.610')] [2023-10-14 01:32:01,031][33226] Updated weights for policy 1, policy_version 12610 (0.0009) [2023-10-14 01:32:01,396][33226] Updated weights for policy 1, policy_version 12620 (0.0009) [2023-10-14 01:32:01,771][33226] Updated weights for policy 1, policy_version 12630 (0.0008) [2023-10-14 01:32:02,128][33226] Updated weights for policy 1, policy_version 12640 (0.0010) [2023-10-14 01:32:03,208][33201] Updated weights for policy 0, policy_version 12520 (0.0008) [2023-10-14 01:32:03,577][33201] Updated weights for policy 0, policy_version 12530 (0.0007) [2023-10-14 01:32:03,953][33201] Updated weights for policy 0, policy_version 12540 (0.0007) [2023-10-14 01:32:04,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25788416. Throughput: 0: 1800.6, 1: 1791.4. Samples: 6452076. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:04,557][31953] Avg episode reward: [(0, '20.700'), (1, '20.570')] [2023-10-14 01:32:06,072][33226] Updated weights for policy 1, policy_version 12650 (0.0008) [2023-10-14 01:32:06,444][33226] Updated weights for policy 1, policy_version 12660 (0.0009) [2023-10-14 01:32:06,811][33226] Updated weights for policy 1, policy_version 12670 (0.0011) [2023-10-14 01:32:07,801][33201] Updated weights for policy 0, policy_version 12550 (0.0008) [2023-10-14 01:32:08,168][33201] Updated weights for policy 0, policy_version 12560 (0.0007) [2023-10-14 01:32:08,540][33201] Updated weights for policy 0, policy_version 12570 (0.0007) [2023-10-14 01:32:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 25853952. Throughput: 0: 1772.8, 1: 1792.4. Samples: 6472834. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:32:09,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.530')] [2023-10-14 01:32:10,554][33226] Updated weights for policy 1, policy_version 12680 (0.0010) [2023-10-14 01:32:10,928][33226] Updated weights for policy 1, policy_version 12690 (0.0007) [2023-10-14 01:32:11,297][33226] Updated weights for policy 1, policy_version 12700 (0.0009) [2023-10-14 01:32:12,356][33201] Updated weights for policy 0, policy_version 12580 (0.0008) [2023-10-14 01:32:12,735][33201] Updated weights for policy 0, policy_version 12590 (0.0008) [2023-10-14 01:32:13,108][33201] Updated weights for policy 0, policy_version 12600 (0.0009) [2023-10-14 01:32:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 25919488. Throughput: 0: 1799.1, 1: 1793.3. Samples: 6483844. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:32:14,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.510')] [2023-10-14 01:32:15,049][33226] Updated weights for policy 1, policy_version 12710 (0.0008) [2023-10-14 01:32:15,423][33226] Updated weights for policy 1, policy_version 12720 (0.0007) [2023-10-14 01:32:15,791][33226] Updated weights for policy 1, policy_version 12730 (0.0008) [2023-10-14 01:32:16,814][33201] Updated weights for policy 0, policy_version 12610 (0.0008) [2023-10-14 01:32:17,192][33201] Updated weights for policy 0, policy_version 12620 (0.0010) [2023-10-14 01:32:17,555][33201] Updated weights for policy 0, policy_version 12630 (0.0009) [2023-10-14 01:32:17,925][33201] Updated weights for policy 0, policy_version 12640 (0.0007) [2023-10-14 01:32:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 25985024. Throughput: 0: 1766.1, 1: 1794.4. Samples: 6504812. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 01:32:19,559][31953] Avg episode reward: [(0, '20.780'), (1, '20.510')] [2023-10-14 01:32:19,560][32837] Saving new best policy, reward=20.780! 
[2023-10-14 01:32:19,592][33226] Updated weights for policy 1, policy_version 12740 (0.0009) [2023-10-14 01:32:19,961][33226] Updated weights for policy 1, policy_version 12750 (0.0008) [2023-10-14 01:32:20,330][33226] Updated weights for policy 1, policy_version 12760 (0.0009) [2023-10-14 01:32:21,735][33201] Updated weights for policy 0, policy_version 12650 (0.0009) [2023-10-14 01:32:22,108][33201] Updated weights for policy 0, policy_version 12660 (0.0008) [2023-10-14 01:32:22,484][33201] Updated weights for policy 0, policy_version 12670 (0.0008) [2023-10-14 01:32:24,055][33226] Updated weights for policy 1, policy_version 12770 (0.0010) [2023-10-14 01:32:24,416][33226] Updated weights for policy 1, policy_version 12780 (0.0011) [2023-10-14 01:32:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 26050560. Throughput: 0: 1764.4, 1: 1807.9. Samples: 6526896. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:24,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.500')] [2023-10-14 01:32:24,567][32837] Saving new best policy, reward=20.800! [2023-10-14 01:32:24,790][33226] Updated weights for policy 1, policy_version 12790 (0.0008) [2023-10-14 01:32:25,161][33226] Updated weights for policy 1, policy_version 12800 (0.0007) [2023-10-14 01:32:26,320][33201] Updated weights for policy 0, policy_version 12680 (0.0009) [2023-10-14 01:32:26,691][33201] Updated weights for policy 0, policy_version 12690 (0.0009) [2023-10-14 01:32:27,058][33201] Updated weights for policy 0, policy_version 12700 (0.0009) [2023-10-14 01:32:28,930][33226] Updated weights for policy 1, policy_version 12810 (0.0009) [2023-10-14 01:32:29,295][33226] Updated weights for policy 1, policy_version 12820 (0.0008) [2023-10-14 01:32:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 26116096. Throughput: 0: 1770.8, 1: 1781.9. Samples: 6536700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:29,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.460')] [2023-10-14 01:32:29,559][32837] Saving new best policy, reward=20.840! [2023-10-14 01:32:29,667][33226] Updated weights for policy 1, policy_version 12830 (0.0009) [2023-10-14 01:32:30,887][33201] Updated weights for policy 0, policy_version 12710 (0.0010) [2023-10-14 01:32:31,244][33201] Updated weights for policy 0, policy_version 12720 (0.0009) [2023-10-14 01:32:31,619][33201] Updated weights for policy 0, policy_version 12730 (0.0007) [2023-10-14 01:32:33,372][33226] Updated weights for policy 1, policy_version 12840 (0.0009) [2023-10-14 01:32:33,736][33226] Updated weights for policy 1, policy_version 12850 (0.0008) [2023-10-14 01:32:34,107][33226] Updated weights for policy 1, policy_version 12860 (0.0008) [2023-10-14 01:32:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 26214400. Throughput: 0: 1762.3, 1: 1799.3. Samples: 6558738. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.440')] [2023-10-14 01:32:35,447][33201] Updated weights for policy 0, policy_version 12740 (0.0007) [2023-10-14 01:32:35,818][33201] Updated weights for policy 0, policy_version 12750 (0.0007) [2023-10-14 01:32:36,194][33201] Updated weights for policy 0, policy_version 12760 (0.0009) [2023-10-14 01:32:38,028][33226] Updated weights for policy 1, policy_version 12870 (0.0008) [2023-10-14 01:32:38,396][33226] Updated weights for policy 1, policy_version 12880 (0.0010) [2023-10-14 01:32:38,765][33226] Updated weights for policy 1, policy_version 12890 (0.0011) [2023-10-14 01:32:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 26279936. Throughput: 0: 1767.7, 1: 1777.8. Samples: 6579134. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:39,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.420')] [2023-10-14 01:32:40,153][33201] Updated weights for policy 0, policy_version 12770 (0.0010) [2023-10-14 01:32:40,517][33201] Updated weights for policy 0, policy_version 12780 (0.0008) [2023-10-14 01:32:40,886][33201] Updated weights for policy 0, policy_version 12790 (0.0008) [2023-10-14 01:32:41,260][33201] Updated weights for policy 0, policy_version 12800 (0.0008) [2023-10-14 01:32:42,601][33226] Updated weights for policy 1, policy_version 12900 (0.0010) [2023-10-14 01:32:42,975][33226] Updated weights for policy 1, policy_version 12910 (0.0009) [2023-10-14 01:32:43,337][33226] Updated weights for policy 1, policy_version 12920 (0.0007) [2023-10-14 01:32:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 26345472. Throughput: 0: 1747.7, 1: 1790.9. Samples: 6589896. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:44,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.420')] [2023-10-14 01:32:44,559][32837] Saving new best policy, reward=20.850! [2023-10-14 01:32:45,115][33201] Updated weights for policy 0, policy_version 12810 (0.0008) [2023-10-14 01:32:45,489][33201] Updated weights for policy 0, policy_version 12820 (0.0008) [2023-10-14 01:32:45,854][33201] Updated weights for policy 0, policy_version 12830 (0.0008) [2023-10-14 01:32:46,975][33226] Updated weights for policy 1, policy_version 12930 (0.0010) [2023-10-14 01:32:47,348][33226] Updated weights for policy 1, policy_version 12940 (0.0009) [2023-10-14 01:32:47,711][33226] Updated weights for policy 1, policy_version 12950 (0.0010) [2023-10-14 01:32:48,072][33226] Updated weights for policy 1, policy_version 12960 (0.0010) [2023-10-14 01:32:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 26411008. Throughput: 0: 1749.8, 1: 1784.0. Samples: 6611096. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:32:49,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.400')] [2023-10-14 01:32:49,728][33201] Updated weights for policy 0, policy_version 12840 (0.0008) [2023-10-14 01:32:50,093][33201] Updated weights for policy 0, policy_version 12850 (0.0009) [2023-10-14 01:32:50,465][33201] Updated weights for policy 0, policy_version 12860 (0.0010) [2023-10-14 01:32:50,612][32837] Saving new best policy, reward=20.860! [2023-10-14 01:32:51,679][33226] Updated weights for policy 1, policy_version 12970 (0.0008) [2023-10-14 01:32:52,048][33226] Updated weights for policy 1, policy_version 12980 (0.0009) [2023-10-14 01:32:52,420][33226] Updated weights for policy 1, policy_version 12990 (0.0009) [2023-10-14 01:32:54,245][33201] Updated weights for policy 0, policy_version 12870 (0.0010) [2023-10-14 01:32:54,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 26476544. Throughput: 0: 1772.7, 1: 1788.4. Samples: 6633086. Policy #0 lag: (min: 24.0, avg: 43.9, max: 56.0) [2023-10-14 01:32:54,559][31953] Avg episode reward: [(0, '20.870'), (1, '20.450')] [2023-10-14 01:32:54,640][33201] Updated weights for policy 0, policy_version 12880 (0.0009) [2023-10-14 01:32:55,019][33201] Updated weights for policy 0, policy_version 12890 (0.0010) [2023-10-14 01:32:55,239][32837] Saving new best policy, reward=20.870! 
[2023-10-14 01:32:56,171][33226] Updated weights for policy 1, policy_version 13000 (0.0009) [2023-10-14 01:32:56,548][33226] Updated weights for policy 1, policy_version 13010 (0.0009) [2023-10-14 01:32:56,915][33226] Updated weights for policy 1, policy_version 13020 (0.0008) [2023-10-14 01:32:58,781][33201] Updated weights for policy 0, policy_version 12900 (0.0008) [2023-10-14 01:32:59,147][33201] Updated weights for policy 0, policy_version 12910 (0.0008) [2023-10-14 01:32:59,525][33201] Updated weights for policy 0, policy_version 12920 (0.0007) [2023-10-14 01:32:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 26542080. Throughput: 0: 1742.1, 1: 1792.2. Samples: 6642886. Policy #0 lag: (min: 24.0, avg: 43.9, max: 56.0) [2023-10-14 01:32:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.440')] [2023-10-14 01:32:59,824][32837] Saving new best policy, reward=20.880! [2023-10-14 01:33:00,711][33226] Updated weights for policy 1, policy_version 13030 (0.0008) [2023-10-14 01:33:01,086][33226] Updated weights for policy 1, policy_version 13040 (0.0009) [2023-10-14 01:33:01,453][33226] Updated weights for policy 1, policy_version 13050 (0.0008) [2023-10-14 01:33:03,334][33201] Updated weights for policy 0, policy_version 12930 (0.0008) [2023-10-14 01:33:03,698][33201] Updated weights for policy 0, policy_version 12940 (0.0007) [2023-10-14 01:33:04,081][33201] Updated weights for policy 0, policy_version 12950 (0.0010) [2023-10-14 01:33:04,445][33201] Updated weights for policy 0, policy_version 12960 (0.0008) [2023-10-14 01:33:04,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 26640384. Throughput: 0: 1777.3, 1: 1783.6. Samples: 6665048. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) [2023-10-14 01:33:04,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.420')] [2023-10-14 01:33:04,559][32837] Saving new best policy, reward=20.900! 
[2023-10-14 01:33:05,174][33226] Updated weights for policy 1, policy_version 13060 (0.0009) [2023-10-14 01:33:05,559][33226] Updated weights for policy 1, policy_version 13070 (0.0009) [2023-10-14 01:33:05,927][33226] Updated weights for policy 1, policy_version 13080 (0.0008) [2023-10-14 01:33:08,422][33201] Updated weights for policy 0, policy_version 12970 (0.0008) [2023-10-14 01:33:08,793][33201] Updated weights for policy 0, policy_version 12980 (0.0009) [2023-10-14 01:33:09,168][33201] Updated weights for policy 0, policy_version 12990 (0.0009) [2023-10-14 01:33:09,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 26705920. Throughput: 0: 1749.9, 1: 1782.3. Samples: 6685846. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) [2023-10-14 01:33:09,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.390')] [2023-10-14 01:33:09,804][33226] Updated weights for policy 1, policy_version 13090 (0.0009) [2023-10-14 01:33:10,179][33226] Updated weights for policy 1, policy_version 13100 (0.0008) [2023-10-14 01:33:10,550][33226] Updated weights for policy 1, policy_version 13110 (0.0008) [2023-10-14 01:33:10,917][33226] Updated weights for policy 1, policy_version 13120 (0.0008) [2023-10-14 01:33:13,033][33201] Updated weights for policy 0, policy_version 13000 (0.0007) [2023-10-14 01:33:13,406][33201] Updated weights for policy 0, policy_version 13010 (0.0009) [2023-10-14 01:33:13,769][33201] Updated weights for policy 0, policy_version 13020 (0.0007) [2023-10-14 01:33:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 26771456. Throughput: 0: 1768.4, 1: 1786.3. Samples: 6696660. Policy #0 lag: (min: 1.0, avg: 5.4, max: 33.0) [2023-10-14 01:33:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.390')] [2023-10-14 01:33:14,558][32837] Saving new best policy, reward=20.920! 
[2023-10-14 01:33:14,609][33226] Updated weights for policy 1, policy_version 13130 (0.0011) [2023-10-14 01:33:14,977][33226] Updated weights for policy 1, policy_version 13140 (0.0010) [2023-10-14 01:33:15,347][33226] Updated weights for policy 1, policy_version 13150 (0.0009) [2023-10-14 01:33:17,404][33201] Updated weights for policy 0, policy_version 13030 (0.0007) [2023-10-14 01:33:17,782][33201] Updated weights for policy 0, policy_version 13040 (0.0008) [2023-10-14 01:33:18,147][33201] Updated weights for policy 0, policy_version 13050 (0.0007) [2023-10-14 01:33:19,270][33226] Updated weights for policy 1, policy_version 13160 (0.0008) [2023-10-14 01:33:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 26836992. Throughput: 0: 1752.2, 1: 1786.2. Samples: 6717966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:33:19,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.390')] [2023-10-14 01:33:19,633][33226] Updated weights for policy 1, policy_version 13170 (0.0007) [2023-10-14 01:33:20,001][33226] Updated weights for policy 1, policy_version 13180 (0.0008) [2023-10-14 01:33:21,897][33201] Updated weights for policy 0, policy_version 13060 (0.0009) [2023-10-14 01:33:22,278][33201] Updated weights for policy 0, policy_version 13070 (0.0009) [2023-10-14 01:33:22,656][33201] Updated weights for policy 0, policy_version 13080 (0.0008) [2023-10-14 01:33:23,730][33226] Updated weights for policy 1, policy_version 13190 (0.0009) [2023-10-14 01:33:24,099][33226] Updated weights for policy 1, policy_version 13200 (0.0008) [2023-10-14 01:33:24,470][33226] Updated weights for policy 1, policy_version 13210 (0.0007) [2023-10-14 01:33:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 26902528. Throughput: 0: 1751.0, 1: 1811.9. Samples: 6739462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:33:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.390')] [2023-10-14 01:33:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000013088_13402112.pth... [2023-10-14 01:33:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000011424_11698176.pth [2023-10-14 01:33:24,686][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth... [2023-10-14 01:33:24,724][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000011520_11796480.pth [2023-10-14 01:33:26,534][33201] Updated weights for policy 0, policy_version 13090 (0.0009) [2023-10-14 01:33:26,913][33201] Updated weights for policy 0, policy_version 13100 (0.0009) [2023-10-14 01:33:27,286][33201] Updated weights for policy 0, policy_version 13110 (0.0007) [2023-10-14 01:33:27,656][33201] Updated weights for policy 0, policy_version 13120 (0.0008) [2023-10-14 01:33:28,302][33226] Updated weights for policy 1, policy_version 13220 (0.0008) [2023-10-14 01:33:28,674][33226] Updated weights for policy 1, policy_version 13230 (0.0007) [2023-10-14 01:33:29,027][33226] Updated weights for policy 1, policy_version 13240 (0.0008) [2023-10-14 01:33:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14329.0). Total num frames: 27000832. Throughput: 0: 1770.0, 1: 1797.2. Samples: 6750424. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:33:29,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.390')] [2023-10-14 01:33:31,424][33201] Updated weights for policy 0, policy_version 13130 (0.0009) [2023-10-14 01:33:31,794][33201] Updated weights for policy 0, policy_version 13140 (0.0007) [2023-10-14 01:33:32,172][33201] Updated weights for policy 0, policy_version 13150 (0.0007) [2023-10-14 01:33:32,817][33226] Updated weights for policy 1, policy_version 13250 (0.0008) [2023-10-14 01:33:33,187][33226] Updated weights for policy 1, policy_version 13260 (0.0008) [2023-10-14 01:33:33,554][33226] Updated weights for policy 1, policy_version 13270 (0.0007) [2023-10-14 01:33:33,927][33226] Updated weights for policy 1, policy_version 13280 (0.0009) [2023-10-14 01:33:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 27066368. Throughput: 0: 1755.5, 1: 1812.1. Samples: 6771638. Policy #0 lag: (min: 13.0, avg: 16.1, max: 45.0) [2023-10-14 01:33:34,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.390')] [2023-10-14 01:33:35,848][33201] Updated weights for policy 0, policy_version 13160 (0.0008) [2023-10-14 01:33:36,221][33201] Updated weights for policy 0, policy_version 13170 (0.0009) [2023-10-14 01:33:36,591][33201] Updated weights for policy 0, policy_version 13180 (0.0010) [2023-10-14 01:33:37,748][33226] Updated weights for policy 1, policy_version 13290 (0.0008) [2023-10-14 01:33:38,116][33226] Updated weights for policy 1, policy_version 13300 (0.0008) [2023-10-14 01:33:38,496][33226] Updated weights for policy 1, policy_version 13310 (0.0009) [2023-10-14 01:33:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 27131904. Throughput: 0: 1765.8, 1: 1782.5. Samples: 6792758. 
Policy #0 lag: (min: 13.0, avg: 16.1, max: 45.0) [2023-10-14 01:33:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.370')] [2023-10-14 01:33:40,519][33201] Updated weights for policy 0, policy_version 13190 (0.0008) [2023-10-14 01:33:40,896][33201] Updated weights for policy 0, policy_version 13200 (0.0009) [2023-10-14 01:33:41,263][33201] Updated weights for policy 0, policy_version 13210 (0.0010) [2023-10-14 01:33:42,196][33226] Updated weights for policy 1, policy_version 13320 (0.0008) [2023-10-14 01:33:42,571][33226] Updated weights for policy 1, policy_version 13330 (0.0008) [2023-10-14 01:33:42,936][33226] Updated weights for policy 1, policy_version 13340 (0.0007) [2023-10-14 01:33:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 27197440. Throughput: 0: 1764.4, 1: 1811.0. Samples: 6803778. Policy #0 lag: (min: 13.0, avg: 16.1, max: 45.0) [2023-10-14 01:33:44,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.370')] [2023-10-14 01:33:45,084][33201] Updated weights for policy 0, policy_version 13220 (0.0008) [2023-10-14 01:33:45,457][33201] Updated weights for policy 0, policy_version 13230 (0.0010) [2023-10-14 01:33:45,827][33201] Updated weights for policy 0, policy_version 13240 (0.0011) [2023-10-14 01:33:46,648][33226] Updated weights for policy 1, policy_version 13350 (0.0009) [2023-10-14 01:33:47,018][33226] Updated weights for policy 1, policy_version 13360 (0.0007) [2023-10-14 01:33:47,384][33226] Updated weights for policy 1, policy_version 13370 (0.0007) [2023-10-14 01:33:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 27262976. Throughput: 0: 1764.9, 1: 1787.1. Samples: 6824886. 
Policy #0 lag: (min: 10.0, avg: 35.5, max: 40.0) [2023-10-14 01:33:49,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.380')] [2023-10-14 01:33:49,715][33201] Updated weights for policy 0, policy_version 13250 (0.0008) [2023-10-14 01:33:50,093][33201] Updated weights for policy 0, policy_version 13260 (0.0007) [2023-10-14 01:33:50,457][33201] Updated weights for policy 0, policy_version 13270 (0.0007) [2023-10-14 01:33:50,823][33201] Updated weights for policy 0, policy_version 13280 (0.0007) [2023-10-14 01:33:51,206][33226] Updated weights for policy 1, policy_version 13380 (0.0007) [2023-10-14 01:33:51,569][33226] Updated weights for policy 1, policy_version 13390 (0.0009) [2023-10-14 01:33:51,939][33226] Updated weights for policy 1, policy_version 13400 (0.0009) [2023-10-14 01:33:54,498][33201] Updated weights for policy 0, policy_version 13290 (0.0007) [2023-10-14 01:33:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 27328512. Throughput: 0: 1795.7, 1: 1788.1. Samples: 6847118. 
Policy #0 lag: (min: 10.0, avg: 35.5, max: 40.0) [2023-10-14 01:33:54,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.410')] [2023-10-14 01:33:54,870][33201] Updated weights for policy 0, policy_version 13300 (0.0009) [2023-10-14 01:33:55,252][33201] Updated weights for policy 0, policy_version 13310 (0.0009) [2023-10-14 01:33:55,684][33226] Updated weights for policy 1, policy_version 13410 (0.0009) [2023-10-14 01:33:56,062][33226] Updated weights for policy 1, policy_version 13420 (0.0011) [2023-10-14 01:33:56,421][33226] Updated weights for policy 1, policy_version 13430 (0.0010) [2023-10-14 01:33:56,794][33226] Updated weights for policy 1, policy_version 13440 (0.0009) [2023-10-14 01:33:59,016][33201] Updated weights for policy 0, policy_version 13320 (0.0009) [2023-10-14 01:33:59,387][33201] Updated weights for policy 0, policy_version 13330 (0.0009) [2023-10-14 01:33:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 27394048. Throughput: 0: 1773.6, 1: 1784.6. Samples: 6856776. Policy #0 lag: (min: 10.0, avg: 35.5, max: 40.0) [2023-10-14 01:33:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.410')] [2023-10-14 01:33:59,757][33201] Updated weights for policy 0, policy_version 13340 (0.0009) [2023-10-14 01:34:00,683][33226] Updated weights for policy 1, policy_version 13450 (0.0009) [2023-10-14 01:34:01,048][33226] Updated weights for policy 1, policy_version 13460 (0.0010) [2023-10-14 01:34:01,404][33226] Updated weights for policy 1, policy_version 13470 (0.0008) [2023-10-14 01:34:03,542][33201] Updated weights for policy 0, policy_version 13350 (0.0010) [2023-10-14 01:34:03,909][33201] Updated weights for policy 0, policy_version 13360 (0.0007) [2023-10-14 01:34:04,295][33201] Updated weights for policy 0, policy_version 13370 (0.0008) [2023-10-14 01:34:04,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 27492352. Throughput: 0: 1795.5, 1: 1781.9. 
Samples: 6878946. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:34:04,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.390')] [2023-10-14 01:34:05,226][33226] Updated weights for policy 1, policy_version 13480 (0.0009) [2023-10-14 01:34:05,592][33226] Updated weights for policy 1, policy_version 13490 (0.0007) [2023-10-14 01:34:05,958][33226] Updated weights for policy 1, policy_version 13500 (0.0008) [2023-10-14 01:34:08,266][33201] Updated weights for policy 0, policy_version 13380 (0.0011) [2023-10-14 01:34:08,641][33201] Updated weights for policy 0, policy_version 13390 (0.0010) [2023-10-14 01:34:09,014][33201] Updated weights for policy 0, policy_version 13400 (0.0011) [2023-10-14 01:34:09,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 27557888. Throughput: 0: 1770.6, 1: 1794.1. Samples: 6899876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:34:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.400')] [2023-10-14 01:34:09,751][33226] Updated weights for policy 1, policy_version 13510 (0.0008) [2023-10-14 01:34:10,144][33226] Updated weights for policy 1, policy_version 13520 (0.0009) [2023-10-14 01:34:10,510][33226] Updated weights for policy 1, policy_version 13530 (0.0008) [2023-10-14 01:34:12,899][33201] Updated weights for policy 0, policy_version 13410 (0.0009) [2023-10-14 01:34:13,279][33201] Updated weights for policy 0, policy_version 13420 (0.0007) [2023-10-14 01:34:13,640][33201] Updated weights for policy 0, policy_version 13430 (0.0008) [2023-10-14 01:34:14,019][33201] Updated weights for policy 0, policy_version 13440 (0.0007) [2023-10-14 01:34:14,132][33226] Updated weights for policy 1, policy_version 13540 (0.0007) [2023-10-14 01:34:14,517][33226] Updated weights for policy 1, policy_version 13550 (0.0010) [2023-10-14 01:34:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 27623424. 
Throughput: 0: 1778.0, 1: 1776.6. Samples: 6910384. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-14 01:34:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.400')] [2023-10-14 01:34:14,880][33226] Updated weights for policy 1, policy_version 13560 (0.0008) [2023-10-14 01:34:17,694][33201] Updated weights for policy 0, policy_version 13450 (0.0007) [2023-10-14 01:34:18,069][33201] Updated weights for policy 0, policy_version 13460 (0.0009) [2023-10-14 01:34:18,442][33201] Updated weights for policy 0, policy_version 13470 (0.0009) [2023-10-14 01:34:18,656][33226] Updated weights for policy 1, policy_version 13570 (0.0007) [2023-10-14 01:34:19,033][33226] Updated weights for policy 1, policy_version 13580 (0.0007) [2023-10-14 01:34:19,406][33226] Updated weights for policy 1, policy_version 13590 (0.0008) [2023-10-14 01:34:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 27688960. Throughput: 0: 1777.9, 1: 1786.5. Samples: 6932036. Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-14 01:34:19,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.390')] [2023-10-14 01:34:19,774][33226] Updated weights for policy 1, policy_version 13600 (0.0008) [2023-10-14 01:34:22,320][33201] Updated weights for policy 0, policy_version 13480 (0.0009) [2023-10-14 01:34:22,693][33201] Updated weights for policy 0, policy_version 13490 (0.0009) [2023-10-14 01:34:23,079][33201] Updated weights for policy 0, policy_version 13500 (0.0010) [2023-10-14 01:34:23,650][33226] Updated weights for policy 1, policy_version 13610 (0.0010) [2023-10-14 01:34:24,015][33226] Updated weights for policy 1, policy_version 13620 (0.0011) [2023-10-14 01:34:24,395][33226] Updated weights for policy 1, policy_version 13630 (0.0009) [2023-10-14 01:34:24,558][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 27787264. Throughput: 0: 1760.2, 1: 1795.6. Samples: 6952772. 
Policy #0 lag: (min: 31.0, avg: 35.7, max: 63.0) [2023-10-14 01:34:24,559][31953] Avg episode reward: [(0, '20.880'), (1, '20.390')] [2023-10-14 01:34:26,903][33201] Updated weights for policy 0, policy_version 13510 (0.0009) [2023-10-14 01:34:27,289][33201] Updated weights for policy 0, policy_version 13520 (0.0009) [2023-10-14 01:34:27,666][33201] Updated weights for policy 0, policy_version 13530 (0.0009) [2023-10-14 01:34:28,162][33226] Updated weights for policy 1, policy_version 13640 (0.0010) [2023-10-14 01:34:28,530][33226] Updated weights for policy 1, policy_version 13650 (0.0009) [2023-10-14 01:34:28,905][33226] Updated weights for policy 1, policy_version 13660 (0.0008) [2023-10-14 01:34:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 27852800. Throughput: 0: 1783.3, 1: 1782.9. Samples: 6964260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:34:29,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.420')] [2023-10-14 01:34:31,347][33201] Updated weights for policy 0, policy_version 13540 (0.0008) [2023-10-14 01:34:31,726][33201] Updated weights for policy 0, policy_version 13550 (0.0008) [2023-10-14 01:34:32,092][33201] Updated weights for policy 0, policy_version 13560 (0.0008) [2023-10-14 01:34:32,691][33226] Updated weights for policy 1, policy_version 13670 (0.0007) [2023-10-14 01:34:33,062][33226] Updated weights for policy 1, policy_version 13680 (0.0008) [2023-10-14 01:34:33,428][33226] Updated weights for policy 1, policy_version 13690 (0.0009) [2023-10-14 01:34:34,557][31953] Fps is (10 sec: 13107.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 27918336. Throughput: 0: 1758.0, 1: 1798.6. Samples: 6984930. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:34:34,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.430')] [2023-10-14 01:34:36,000][33201] Updated weights for policy 0, policy_version 13570 (0.0011) [2023-10-14 01:34:36,366][33201] Updated weights for policy 0, policy_version 13580 (0.0009) [2023-10-14 01:34:36,738][33201] Updated weights for policy 0, policy_version 13590 (0.0008) [2023-10-14 01:34:37,110][33201] Updated weights for policy 0, policy_version 13600 (0.0007) [2023-10-14 01:34:37,247][33226] Updated weights for policy 1, policy_version 13700 (0.0009) [2023-10-14 01:34:37,613][33226] Updated weights for policy 1, policy_version 13710 (0.0009) [2023-10-14 01:34:37,980][33226] Updated weights for policy 1, policy_version 13720 (0.0007) [2023-10-14 01:34:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 27983872. Throughput: 0: 1754.6, 1: 1777.7. Samples: 7006070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:34:39,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.420')] [2023-10-14 01:34:40,909][33201] Updated weights for policy 0, policy_version 13610 (0.0011) [2023-10-14 01:34:41,275][33201] Updated weights for policy 0, policy_version 13620 (0.0008) [2023-10-14 01:34:41,645][33201] Updated weights for policy 0, policy_version 13630 (0.0011) [2023-10-14 01:34:41,646][33226] Updated weights for policy 1, policy_version 13730 (0.0008) [2023-10-14 01:34:42,021][33226] Updated weights for policy 1, policy_version 13740 (0.0008) [2023-10-14 01:34:42,399][33226] Updated weights for policy 1, policy_version 13750 (0.0009) [2023-10-14 01:34:42,772][33226] Updated weights for policy 1, policy_version 13760 (0.0011) [2023-10-14 01:34:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 28049408. Throughput: 0: 1751.6, 1: 1801.6. Samples: 7016670. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 01:34:44,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.420')] [2023-10-14 01:34:45,392][33201] Updated weights for policy 0, policy_version 13640 (0.0008) [2023-10-14 01:34:45,765][33201] Updated weights for policy 0, policy_version 13650 (0.0009) [2023-10-14 01:34:46,143][33201] Updated weights for policy 0, policy_version 13660 (0.0009) [2023-10-14 01:34:46,436][33226] Updated weights for policy 1, policy_version 13770 (0.0011) [2023-10-14 01:34:46,806][33226] Updated weights for policy 1, policy_version 13780 (0.0009) [2023-10-14 01:34:47,174][33226] Updated weights for policy 1, policy_version 13790 (0.0008) [2023-10-14 01:34:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 28114944. Throughput: 0: 1749.7, 1: 1781.5. Samples: 7037852. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 01:34:49,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.460')] [2023-10-14 01:34:49,968][33201] Updated weights for policy 0, policy_version 13670 (0.0010) [2023-10-14 01:34:50,338][33201] Updated weights for policy 0, policy_version 13680 (0.0009) [2023-10-14 01:34:50,713][33201] Updated weights for policy 0, policy_version 13690 (0.0009) [2023-10-14 01:34:51,001][33226] Updated weights for policy 1, policy_version 13800 (0.0007) [2023-10-14 01:34:51,366][33226] Updated weights for policy 1, policy_version 13810 (0.0009) [2023-10-14 01:34:51,744][33226] Updated weights for policy 1, policy_version 13820 (0.0009) [2023-10-14 01:34:54,552][33201] Updated weights for policy 0, policy_version 13700 (0.0008) [2023-10-14 01:34:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 28180480. Throughput: 0: 1781.8, 1: 1777.9. Samples: 7060060. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 01:34:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.450')] [2023-10-14 01:34:54,920][33201] Updated weights for policy 0, policy_version 13710 (0.0008) [2023-10-14 01:34:55,296][33201] Updated weights for policy 0, policy_version 13720 (0.0008) [2023-10-14 01:34:55,597][33226] Updated weights for policy 1, policy_version 13830 (0.0010) [2023-10-14 01:34:55,986][33226] Updated weights for policy 1, policy_version 13840 (0.0008) [2023-10-14 01:34:56,354][33226] Updated weights for policy 1, policy_version 13850 (0.0007) [2023-10-14 01:34:59,098][33201] Updated weights for policy 0, policy_version 13730 (0.0009) [2023-10-14 01:34:59,474][33201] Updated weights for policy 0, policy_version 13740 (0.0007) [2023-10-14 01:34:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 28246016. Throughput: 0: 1760.4, 1: 1778.9. Samples: 7069652. Policy #0 lag: (min: 2.0, avg: 7.1, max: 34.0) [2023-10-14 01:34:59,557][31953] Avg episode reward: [(0, '20.580'), (1, '20.450')] [2023-10-14 01:34:59,853][33201] Updated weights for policy 0, policy_version 13750 (0.0007) [2023-10-14 01:35:00,078][33226] Updated weights for policy 1, policy_version 13860 (0.0008) [2023-10-14 01:35:00,220][33201] Updated weights for policy 0, policy_version 13760 (0.0007) [2023-10-14 01:35:00,439][33226] Updated weights for policy 1, policy_version 13870 (0.0008) [2023-10-14 01:35:00,807][33226] Updated weights for policy 1, policy_version 13880 (0.0008) [2023-10-14 01:35:04,002][33201] Updated weights for policy 0, policy_version 13770 (0.0008) [2023-10-14 01:35:04,374][33201] Updated weights for policy 0, policy_version 13780 (0.0008) [2023-10-14 01:35:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 28311552. Throughput: 0: 1771.6, 1: 1781.5. Samples: 7091926. 
Policy #0 lag: (min: 2.0, avg: 7.1, max: 34.0) [2023-10-14 01:35:04,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.440')] [2023-10-14 01:35:04,640][33226] Updated weights for policy 1, policy_version 13890 (0.0010) [2023-10-14 01:35:04,735][33201] Updated weights for policy 0, policy_version 13790 (0.0009) [2023-10-14 01:35:05,011][33226] Updated weights for policy 1, policy_version 13900 (0.0010) [2023-10-14 01:35:05,383][33226] Updated weights for policy 1, policy_version 13910 (0.0009) [2023-10-14 01:35:05,753][33226] Updated weights for policy 1, policy_version 13920 (0.0009) [2023-10-14 01:35:08,647][33201] Updated weights for policy 0, policy_version 13800 (0.0007) [2023-10-14 01:35:09,010][33201] Updated weights for policy 0, policy_version 13810 (0.0010) [2023-10-14 01:35:09,387][33201] Updated weights for policy 0, policy_version 13820 (0.0007) [2023-10-14 01:35:09,480][33226] Updated weights for policy 1, policy_version 13930 (0.0008) [2023-10-14 01:35:09,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 28409856. Throughput: 0: 1765.7, 1: 1793.2. Samples: 7112922. 
Policy #0 lag: (min: 11.0, avg: 17.1, max: 43.0) [2023-10-14 01:35:09,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.440')] [2023-10-14 01:35:09,853][33226] Updated weights for policy 1, policy_version 13940 (0.0008) [2023-10-14 01:35:10,220][33226] Updated weights for policy 1, policy_version 13950 (0.0007) [2023-10-14 01:35:13,314][33201] Updated weights for policy 0, policy_version 13830 (0.0008) [2023-10-14 01:35:13,696][33201] Updated weights for policy 0, policy_version 13840 (0.0008) [2023-10-14 01:35:14,073][33201] Updated weights for policy 0, policy_version 13850 (0.0008) [2023-10-14 01:35:14,090][33226] Updated weights for policy 1, policy_version 13960 (0.0009) [2023-10-14 01:35:14,468][33226] Updated weights for policy 1, policy_version 13970 (0.0009) [2023-10-14 01:35:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 28475392. Throughput: 0: 1759.5, 1: 1774.1. Samples: 7123272. Policy #0 lag: (min: 11.0, avg: 17.1, max: 43.0) [2023-10-14 01:35:14,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.440')] [2023-10-14 01:35:14,833][33226] Updated weights for policy 1, policy_version 13980 (0.0008) [2023-10-14 01:35:17,932][33201] Updated weights for policy 0, policy_version 13860 (0.0007) [2023-10-14 01:35:18,297][33201] Updated weights for policy 0, policy_version 13870 (0.0009) [2023-10-14 01:35:18,624][33226] Updated weights for policy 1, policy_version 13990 (0.0009) [2023-10-14 01:35:18,668][33201] Updated weights for policy 0, policy_version 13880 (0.0009) [2023-10-14 01:35:18,982][33226] Updated weights for policy 1, policy_version 14000 (0.0009) [2023-10-14 01:35:19,360][33226] Updated weights for policy 1, policy_version 14010 (0.0007) [2023-10-14 01:35:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 28540928. Throughput: 0: 1770.5, 1: 1780.6. Samples: 7144728. 
Policy #0 lag: (min: 11.0, avg: 17.1, max: 43.0) [2023-10-14 01:35:19,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.440')] [2023-10-14 01:35:22,598][33201] Updated weights for policy 0, policy_version 13890 (0.0008) [2023-10-14 01:35:22,968][33201] Updated weights for policy 0, policy_version 13900 (0.0008) [2023-10-14 01:35:23,075][33226] Updated weights for policy 1, policy_version 14020 (0.0008) [2023-10-14 01:35:23,338][33201] Updated weights for policy 0, policy_version 13910 (0.0007) [2023-10-14 01:35:23,444][33226] Updated weights for policy 1, policy_version 14030 (0.0007) [2023-10-14 01:35:23,710][33201] Updated weights for policy 0, policy_version 13920 (0.0007) [2023-10-14 01:35:23,806][33226] Updated weights for policy 1, policy_version 14040 (0.0008) [2023-10-14 01:35:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 28639232. Throughput: 0: 1750.3, 1: 1780.7. Samples: 7164966. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:35:24,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.480')] [2023-10-14 01:35:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000013920_14254080.pth... [2023-10-14 01:35:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000014048_14385152.pth... 
[2023-10-14 01:35:24,599][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000012256_12550144.pth [2023-10-14 01:35:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000012352_12648448.pth [2023-10-14 01:35:27,565][33226] Updated weights for policy 1, policy_version 14050 (0.0008) [2023-10-14 01:35:27,621][33201] Updated weights for policy 0, policy_version 13930 (0.0007) [2023-10-14 01:35:27,930][33226] Updated weights for policy 1, policy_version 14060 (0.0009) [2023-10-14 01:35:27,996][33201] Updated weights for policy 0, policy_version 13940 (0.0007) [2023-10-14 01:35:28,298][33226] Updated weights for policy 1, policy_version 14070 (0.0007) [2023-10-14 01:35:28,372][33201] Updated weights for policy 0, policy_version 13950 (0.0007) [2023-10-14 01:35:28,665][33226] Updated weights for policy 1, policy_version 14080 (0.0008) [2023-10-14 01:35:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 28704768. Throughput: 0: 1782.3, 1: 1787.7. Samples: 7177318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:35:29,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.470')] [2023-10-14 01:35:32,196][33201] Updated weights for policy 0, policy_version 13960 (0.0008) [2023-10-14 01:35:32,454][33226] Updated weights for policy 1, policy_version 14090 (0.0009) [2023-10-14 01:35:32,562][33201] Updated weights for policy 0, policy_version 13970 (0.0007) [2023-10-14 01:35:32,820][33226] Updated weights for policy 1, policy_version 14100 (0.0007) [2023-10-14 01:35:32,933][33201] Updated weights for policy 0, policy_version 13980 (0.0009) [2023-10-14 01:35:33,189][33226] Updated weights for policy 1, policy_version 14110 (0.0008) [2023-10-14 01:35:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 28770304. Throughput: 0: 1752.0, 1: 1782.5. Samples: 7196902. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:35:34,557][31953] Avg episode reward: [(0, '20.630'), (1, '20.500')] [2023-10-14 01:35:36,637][33201] Updated weights for policy 0, policy_version 13990 (0.0009) [2023-10-14 01:35:36,998][33201] Updated weights for policy 0, policy_version 14000 (0.0008) [2023-10-14 01:35:37,073][33226] Updated weights for policy 1, policy_version 14120 (0.0008) [2023-10-14 01:35:37,382][33201] Updated weights for policy 0, policy_version 14010 (0.0008) [2023-10-14 01:35:37,441][33226] Updated weights for policy 1, policy_version 14130 (0.0008) [2023-10-14 01:35:37,810][33226] Updated weights for policy 1, policy_version 14140 (0.0008) [2023-10-14 01:35:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 28835840. Throughput: 0: 1748.3, 1: 1766.1. Samples: 7218208. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) [2023-10-14 01:35:39,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.470')] [2023-10-14 01:35:41,222][33201] Updated weights for policy 0, policy_version 14020 (0.0008) [2023-10-14 01:35:41,598][33201] Updated weights for policy 0, policy_version 14030 (0.0009) [2023-10-14 01:35:41,713][33226] Updated weights for policy 1, policy_version 14150 (0.0008) [2023-10-14 01:35:41,959][33201] Updated weights for policy 0, policy_version 14040 (0.0007) [2023-10-14 01:35:42,087][33226] Updated weights for policy 1, policy_version 14160 (0.0008) [2023-10-14 01:35:42,461][33226] Updated weights for policy 1, policy_version 14170 (0.0009) [2023-10-14 01:35:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 28901376. Throughput: 0: 1752.5, 1: 1786.8. Samples: 7228922. 
Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) [2023-10-14 01:35:44,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.470')] [2023-10-14 01:35:45,887][33201] Updated weights for policy 0, policy_version 14050 (0.0008) [2023-10-14 01:35:46,259][33201] Updated weights for policy 0, policy_version 14060 (0.0007) [2023-10-14 01:35:46,287][33226] Updated weights for policy 1, policy_version 14180 (0.0008) [2023-10-14 01:35:46,628][33201] Updated weights for policy 0, policy_version 14070 (0.0008) [2023-10-14 01:35:46,653][33226] Updated weights for policy 1, policy_version 14190 (0.0008) [2023-10-14 01:35:46,993][33201] Updated weights for policy 0, policy_version 14080 (0.0009) [2023-10-14 01:35:47,025][33226] Updated weights for policy 1, policy_version 14200 (0.0008) [2023-10-14 01:35:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 28966912. Throughput: 0: 1747.6, 1: 1756.4. Samples: 7249604. Policy #0 lag: (min: 20.0, avg: 28.0, max: 52.0) [2023-10-14 01:35:49,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.480')] [2023-10-14 01:35:50,794][33226] Updated weights for policy 1, policy_version 14210 (0.0008) [2023-10-14 01:35:50,839][33201] Updated weights for policy 0, policy_version 14090 (0.0007) [2023-10-14 01:35:51,171][33226] Updated weights for policy 1, policy_version 14220 (0.0008) [2023-10-14 01:35:51,211][33201] Updated weights for policy 0, policy_version 14100 (0.0007) [2023-10-14 01:35:51,535][33226] Updated weights for policy 1, policy_version 14230 (0.0008) [2023-10-14 01:35:51,591][33201] Updated weights for policy 0, policy_version 14110 (0.0007) [2023-10-14 01:35:51,904][33226] Updated weights for policy 1, policy_version 14240 (0.0009) [2023-10-14 01:35:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 29032448. Throughput: 0: 1768.8, 1: 1762.8. Samples: 7271844. 
Policy #0 lag: (min: 17.0, avg: 31.1, max: 49.0) [2023-10-14 01:35:54,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.460')] [2023-10-14 01:35:55,550][33201] Updated weights for policy 0, policy_version 14120 (0.0007) [2023-10-14 01:35:55,717][33226] Updated weights for policy 1, policy_version 14250 (0.0007) [2023-10-14 01:35:55,913][33201] Updated weights for policy 0, policy_version 14130 (0.0007) [2023-10-14 01:35:56,091][33226] Updated weights for policy 1, policy_version 14260 (0.0007) [2023-10-14 01:35:56,282][33201] Updated weights for policy 0, policy_version 14140 (0.0009) [2023-10-14 01:35:56,452][33226] Updated weights for policy 1, policy_version 14270 (0.0007) [2023-10-14 01:35:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 29097984. Throughput: 0: 1750.9, 1: 1764.1. Samples: 7281446. Policy #0 lag: (min: 17.0, avg: 31.1, max: 49.0) [2023-10-14 01:35:59,557][31953] Avg episode reward: [(0, '20.650'), (1, '20.470')] [2023-10-14 01:36:00,083][33201] Updated weights for policy 0, policy_version 14150 (0.0007) [2023-10-14 01:36:00,285][33226] Updated weights for policy 1, policy_version 14280 (0.0008) [2023-10-14 01:36:00,458][33201] Updated weights for policy 0, policy_version 14160 (0.0007) [2023-10-14 01:36:00,658][33226] Updated weights for policy 1, policy_version 14290 (0.0007) [2023-10-14 01:36:00,837][33201] Updated weights for policy 0, policy_version 14170 (0.0008) [2023-10-14 01:36:01,030][33226] Updated weights for policy 1, policy_version 14300 (0.0007) [2023-10-14 01:36:04,490][33201] Updated weights for policy 0, policy_version 14180 (0.0008) [2023-10-14 01:36:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 29163520. Throughput: 0: 1764.7, 1: 1771.6. Samples: 7303862. 
Policy #0 lag: (min: 17.0, avg: 31.1, max: 49.0) [2023-10-14 01:36:04,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.480')] [2023-10-14 01:36:04,689][33226] Updated weights for policy 1, policy_version 14310 (0.0008) [2023-10-14 01:36:04,876][33201] Updated weights for policy 0, policy_version 14190 (0.0008) [2023-10-14 01:36:05,047][33226] Updated weights for policy 1, policy_version 14320 (0.0008) [2023-10-14 01:36:05,249][33201] Updated weights for policy 0, policy_version 14200 (0.0009) [2023-10-14 01:36:05,413][33226] Updated weights for policy 1, policy_version 14330 (0.0010) [2023-10-14 01:36:09,151][33201] Updated weights for policy 0, policy_version 14210 (0.0008) [2023-10-14 01:36:09,176][33226] Updated weights for policy 1, policy_version 14340 (0.0009) [2023-10-14 01:36:09,525][33201] Updated weights for policy 0, policy_version 14220 (0.0009) [2023-10-14 01:36:09,548][33226] Updated weights for policy 1, policy_version 14350 (0.0008) [2023-10-14 01:36:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 29229056. Throughput: 0: 1776.3, 1: 1793.8. Samples: 7325618. 
Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-14 01:36:09,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.490')] [2023-10-14 01:36:09,891][33201] Updated weights for policy 0, policy_version 14230 (0.0007) [2023-10-14 01:36:09,914][33226] Updated weights for policy 1, policy_version 14360 (0.0008) [2023-10-14 01:36:10,262][33201] Updated weights for policy 0, policy_version 14240 (0.0008) [2023-10-14 01:36:13,842][33226] Updated weights for policy 1, policy_version 14370 (0.0007) [2023-10-14 01:36:14,083][33201] Updated weights for policy 0, policy_version 14250 (0.0008) [2023-10-14 01:36:14,218][33226] Updated weights for policy 1, policy_version 14380 (0.0009) [2023-10-14 01:36:14,455][33201] Updated weights for policy 0, policy_version 14260 (0.0008) [2023-10-14 01:36:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 29294592. Throughput: 0: 1746.6, 1: 1760.3. Samples: 7335126. Policy #0 lag: (min: 31.0, avg: 32.5, max: 58.0) [2023-10-14 01:36:14,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.510')] [2023-10-14 01:36:14,588][33226] Updated weights for policy 1, policy_version 14390 (0.0009) [2023-10-14 01:36:14,816][33201] Updated weights for policy 0, policy_version 14270 (0.0008) [2023-10-14 01:36:14,954][33226] Updated weights for policy 1, policy_version 14400 (0.0008) [2023-10-14 01:36:18,653][33201] Updated weights for policy 0, policy_version 14280 (0.0009) [2023-10-14 01:36:18,654][33226] Updated weights for policy 1, policy_version 14410 (0.0009) [2023-10-14 01:36:19,013][33226] Updated weights for policy 1, policy_version 14420 (0.0008) [2023-10-14 01:36:19,031][33201] Updated weights for policy 0, policy_version 14290 (0.0009) [2023-10-14 01:36:19,384][33226] Updated weights for policy 1, policy_version 14430 (0.0007) [2023-10-14 01:36:19,407][33201] Updated weights for policy 0, policy_version 14300 (0.0007) [2023-10-14 01:36:19,559][31953] Fps is (10 sec: 19656.2, 60 sec: 
14745.0, 300 sec: 14328.9). Total num frames: 29425664. Throughput: 0: 1775.2, 1: 1785.3. Samples: 7357132. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:19,562][31953] Avg episode reward: [(0, '20.680'), (1, '20.500')] [2023-10-14 01:36:23,244][33201] Updated weights for policy 0, policy_version 14310 (0.0009) [2023-10-14 01:36:23,295][33226] Updated weights for policy 1, policy_version 14440 (0.0007) [2023-10-14 01:36:23,616][33201] Updated weights for policy 0, policy_version 14320 (0.0007) [2023-10-14 01:36:23,672][33226] Updated weights for policy 1, policy_version 14450 (0.0007) [2023-10-14 01:36:23,998][33201] Updated weights for policy 0, policy_version 14330 (0.0008) [2023-10-14 01:36:24,033][33226] Updated weights for policy 1, policy_version 14460 (0.0008) [2023-10-14 01:36:24,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 29491200. Throughput: 0: 1748.5, 1: 1773.9. Samples: 7376716. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:24,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.490')] [2023-10-14 01:36:27,799][33226] Updated weights for policy 1, policy_version 14470 (0.0008) [2023-10-14 01:36:27,807][33201] Updated weights for policy 0, policy_version 14340 (0.0008) [2023-10-14 01:36:28,175][33226] Updated weights for policy 1, policy_version 14480 (0.0007) [2023-10-14 01:36:28,181][33201] Updated weights for policy 0, policy_version 14350 (0.0007) [2023-10-14 01:36:28,535][33226] Updated weights for policy 1, policy_version 14490 (0.0008) [2023-10-14 01:36:28,548][33201] Updated weights for policy 0, policy_version 14360 (0.0007) [2023-10-14 01:36:29,557][31953] Fps is (10 sec: 13110.2, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 29556736. Throughput: 0: 1767.2, 1: 1781.2. Samples: 7388600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.500')] [2023-10-14 01:36:29,559][32837] Saving new best policy, reward=20.940! [2023-10-14 01:36:32,349][33201] Updated weights for policy 0, policy_version 14370 (0.0008) [2023-10-14 01:36:32,528][33226] Updated weights for policy 1, policy_version 14500 (0.0009) [2023-10-14 01:36:32,722][33201] Updated weights for policy 0, policy_version 14380 (0.0008) [2023-10-14 01:36:32,899][33226] Updated weights for policy 1, policy_version 14510 (0.0009) [2023-10-14 01:36:33,098][33201] Updated weights for policy 0, policy_version 14390 (0.0010) [2023-10-14 01:36:33,269][33226] Updated weights for policy 1, policy_version 14520 (0.0007) [2023-10-14 01:36:33,474][33201] Updated weights for policy 0, policy_version 14400 (0.0008) [2023-10-14 01:36:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 29622272. Throughput: 0: 1753.7, 1: 1784.8. Samples: 7408838. Policy #0 lag: (min: 8.0, avg: 26.7, max: 40.0) [2023-10-14 01:36:34,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.510')] [2023-10-14 01:36:37,125][33226] Updated weights for policy 1, policy_version 14530 (0.0008) [2023-10-14 01:36:37,293][33201] Updated weights for policy 0, policy_version 14410 (0.0007) [2023-10-14 01:36:37,492][33226] Updated weights for policy 1, policy_version 14540 (0.0008) [2023-10-14 01:36:37,666][33201] Updated weights for policy 0, policy_version 14420 (0.0008) [2023-10-14 01:36:37,857][33226] Updated weights for policy 1, policy_version 14550 (0.0007) [2023-10-14 01:36:38,034][33201] Updated weights for policy 0, policy_version 14430 (0.0009) [2023-10-14 01:36:38,221][33226] Updated weights for policy 1, policy_version 14560 (0.0007) [2023-10-14 01:36:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 29687808. Throughput: 0: 1738.8, 1: 1764.6. Samples: 7429498. 
Policy #0 lag: (min: 8.0, avg: 26.7, max: 40.0) [2023-10-14 01:36:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.530')] [2023-10-14 01:36:41,942][33201] Updated weights for policy 0, policy_version 14440 (0.0007) [2023-10-14 01:36:42,108][33226] Updated weights for policy 1, policy_version 14570 (0.0007) [2023-10-14 01:36:42,311][33201] Updated weights for policy 0, policy_version 14450 (0.0007) [2023-10-14 01:36:42,470][33226] Updated weights for policy 1, policy_version 14580 (0.0008) [2023-10-14 01:36:42,688][33201] Updated weights for policy 0, policy_version 14460 (0.0008) [2023-10-14 01:36:42,841][33226] Updated weights for policy 1, policy_version 14590 (0.0007) [2023-10-14 01:36:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 29753344. Throughput: 0: 1758.5, 1: 1785.9. Samples: 7440944. Policy #0 lag: (min: 8.0, avg: 26.7, max: 40.0) [2023-10-14 01:36:44,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.530')] [2023-10-14 01:36:46,536][33226] Updated weights for policy 1, policy_version 14600 (0.0009) [2023-10-14 01:36:46,720][33201] Updated weights for policy 0, policy_version 14470 (0.0008) [2023-10-14 01:36:46,904][33226] Updated weights for policy 1, policy_version 14610 (0.0008) [2023-10-14 01:36:47,087][33201] Updated weights for policy 0, policy_version 14480 (0.0008) [2023-10-14 01:36:47,272][33226] Updated weights for policy 1, policy_version 14620 (0.0009) [2023-10-14 01:36:47,456][33201] Updated weights for policy 0, policy_version 14490 (0.0009) [2023-10-14 01:36:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 29818880. Throughput: 0: 1732.2, 1: 1758.9. Samples: 7460964. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:49,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.540')] [2023-10-14 01:36:51,162][33226] Updated weights for policy 1, policy_version 14630 (0.0008) [2023-10-14 01:36:51,457][33201] Updated weights for policy 0, policy_version 14500 (0.0009) [2023-10-14 01:36:51,528][33226] Updated weights for policy 1, policy_version 14640 (0.0008) [2023-10-14 01:36:51,848][33201] Updated weights for policy 0, policy_version 14510 (0.0007) [2023-10-14 01:36:51,891][33226] Updated weights for policy 1, policy_version 14650 (0.0007) [2023-10-14 01:36:52,226][33201] Updated weights for policy 0, policy_version 14520 (0.0009) [2023-10-14 01:36:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 29884416. Throughput: 0: 1739.4, 1: 1753.8. Samples: 7482810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.540')] [2023-10-14 01:36:55,625][33226] Updated weights for policy 1, policy_version 14660 (0.0008) [2023-10-14 01:36:55,780][33201] Updated weights for policy 0, policy_version 14530 (0.0007) [2023-10-14 01:36:55,993][33226] Updated weights for policy 1, policy_version 14670 (0.0008) [2023-10-14 01:36:56,143][33201] Updated weights for policy 0, policy_version 14540 (0.0008) [2023-10-14 01:36:56,361][33226] Updated weights for policy 1, policy_version 14680 (0.0007) [2023-10-14 01:36:56,515][33201] Updated weights for policy 0, policy_version 14550 (0.0009) [2023-10-14 01:36:56,883][33201] Updated weights for policy 0, policy_version 14560 (0.0008) [2023-10-14 01:36:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 29949952. Throughput: 0: 1739.8, 1: 1760.2. Samples: 7492626. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:36:59,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.500')] [2023-10-14 01:37:00,227][33226] Updated weights for policy 1, policy_version 14690 (0.0008) [2023-10-14 01:37:00,597][33226] Updated weights for policy 1, policy_version 14700 (0.0008) [2023-10-14 01:37:00,626][33201] Updated weights for policy 0, policy_version 14570 (0.0007) [2023-10-14 01:37:00,968][33226] Updated weights for policy 1, policy_version 14710 (0.0009) [2023-10-14 01:37:00,993][33201] Updated weights for policy 0, policy_version 14580 (0.0008) [2023-10-14 01:37:01,334][33226] Updated weights for policy 1, policy_version 14720 (0.0008) [2023-10-14 01:37:01,361][33201] Updated weights for policy 0, policy_version 14590 (0.0008) [2023-10-14 01:37:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 30015488. Throughput: 0: 1740.4, 1: 1757.8. Samples: 7514542. Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) [2023-10-14 01:37:04,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.500')] [2023-10-14 01:37:05,182][33226] Updated weights for policy 1, policy_version 14730 (0.0007) [2023-10-14 01:37:05,224][33201] Updated weights for policy 0, policy_version 14600 (0.0008) [2023-10-14 01:37:05,550][33226] Updated weights for policy 1, policy_version 14740 (0.0007) [2023-10-14 01:37:05,599][33201] Updated weights for policy 0, policy_version 14610 (0.0009) [2023-10-14 01:37:05,920][33226] Updated weights for policy 1, policy_version 14750 (0.0007) [2023-10-14 01:37:05,977][33201] Updated weights for policy 0, policy_version 14620 (0.0010) [2023-10-14 01:37:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 30081024. Throughput: 0: 1767.4, 1: 1787.6. Samples: 7536690. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) [2023-10-14 01:37:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.450')] [2023-10-14 01:37:09,583][33226] Updated weights for policy 1, policy_version 14760 (0.0008) [2023-10-14 01:37:09,819][33201] Updated weights for policy 0, policy_version 14630 (0.0009) [2023-10-14 01:37:09,959][33226] Updated weights for policy 1, policy_version 14770 (0.0008) [2023-10-14 01:37:10,194][33201] Updated weights for policy 0, policy_version 14640 (0.0008) [2023-10-14 01:37:10,318][33226] Updated weights for policy 1, policy_version 14780 (0.0007) [2023-10-14 01:37:10,570][33201] Updated weights for policy 0, policy_version 14650 (0.0008) [2023-10-14 01:37:14,041][33226] Updated weights for policy 1, policy_version 14790 (0.0008) [2023-10-14 01:37:14,417][33226] Updated weights for policy 1, policy_version 14800 (0.0011) [2023-10-14 01:37:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 30146560. Throughput: 0: 1741.3, 1: 1761.6. Samples: 7546234. 
Policy #0 lag: (min: 23.0, avg: 23.3, max: 35.0) [2023-10-14 01:37:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.430')] [2023-10-14 01:37:14,641][33201] Updated weights for policy 0, policy_version 14660 (0.0009) [2023-10-14 01:37:14,786][33226] Updated weights for policy 1, policy_version 14810 (0.0007) [2023-10-14 01:37:15,002][33201] Updated weights for policy 0, policy_version 14670 (0.0008) [2023-10-14 01:37:15,382][33201] Updated weights for policy 0, policy_version 14680 (0.0007) [2023-10-14 01:37:18,577][33226] Updated weights for policy 1, policy_version 14820 (0.0009) [2023-10-14 01:37:18,950][33226] Updated weights for policy 1, policy_version 14830 (0.0007) [2023-10-14 01:37:19,223][33201] Updated weights for policy 0, policy_version 14690 (0.0007) [2023-10-14 01:37:19,317][33226] Updated weights for policy 1, policy_version 14840 (0.0008) [2023-10-14 01:37:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13107.7, 300 sec: 14106.9). Total num frames: 30212096. Throughput: 0: 1762.7, 1: 1781.9. Samples: 7568344. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 01:37:19,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.420')] [2023-10-14 01:37:19,599][33201] Updated weights for policy 0, policy_version 14700 (0.0007) [2023-10-14 01:37:19,957][33201] Updated weights for policy 0, policy_version 14710 (0.0008) [2023-10-14 01:37:20,329][33201] Updated weights for policy 0, policy_version 14720 (0.0009) [2023-10-14 01:37:22,995][33226] Updated weights for policy 1, policy_version 14850 (0.0007) [2023-10-14 01:37:23,368][33226] Updated weights for policy 1, policy_version 14860 (0.0009) [2023-10-14 01:37:23,738][33226] Updated weights for policy 1, policy_version 14870 (0.0009) [2023-10-14 01:37:24,103][33226] Updated weights for policy 1, policy_version 14880 (0.0007) [2023-10-14 01:37:24,336][33201] Updated weights for policy 0, policy_version 14730 (0.0009) [2023-10-14 01:37:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 30310400. Throughput: 0: 1772.2, 1: 1779.9. Samples: 7589342. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 01:37:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.400')] [2023-10-14 01:37:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000014880_15237120.pth... [2023-10-14 01:37:24,605][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000013216_13533184.pth [2023-10-14 01:37:24,704][33201] Updated weights for policy 0, policy_version 14740 (0.0008) [2023-10-14 01:37:25,069][33201] Updated weights for policy 0, policy_version 14750 (0.0008) [2023-10-14 01:37:25,146][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000014752_15106048.pth... 
[2023-10-14 01:37:25,184][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000013088_13402112.pth [2023-10-14 01:37:27,966][33226] Updated weights for policy 1, policy_version 14890 (0.0009) [2023-10-14 01:37:28,330][33226] Updated weights for policy 1, policy_version 14900 (0.0007) [2023-10-14 01:37:28,688][33226] Updated weights for policy 1, policy_version 14910 (0.0009) [2023-10-14 01:37:28,837][33201] Updated weights for policy 0, policy_version 14760 (0.0008) [2023-10-14 01:37:29,217][33201] Updated weights for policy 0, policy_version 14770 (0.0008) [2023-10-14 01:37:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 30375936. Throughput: 0: 1755.2, 1: 1784.4. Samples: 7600226. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 01:37:29,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.410')] [2023-10-14 01:37:29,599][33201] Updated weights for policy 0, policy_version 14780 (0.0008) [2023-10-14 01:37:32,462][33226] Updated weights for policy 1, policy_version 14920 (0.0007) [2023-10-14 01:37:32,845][33226] Updated weights for policy 1, policy_version 14930 (0.0007) [2023-10-14 01:37:33,212][33226] Updated weights for policy 1, policy_version 14940 (0.0010) [2023-10-14 01:37:33,453][33201] Updated weights for policy 0, policy_version 14790 (0.0009) [2023-10-14 01:37:33,822][33201] Updated weights for policy 0, policy_version 14800 (0.0007) [2023-10-14 01:37:34,196][33201] Updated weights for policy 0, policy_version 14810 (0.0009) [2023-10-14 01:37:34,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 30474240. Throughput: 0: 1777.8, 1: 1785.0. Samples: 7621288. 
Policy #0 lag: (min: 14.0, avg: 16.2, max: 46.0) [2023-10-14 01:37:34,559][31953] Avg episode reward: [(0, '20.930'), (1, '20.410')] [2023-10-14 01:37:36,790][33226] Updated weights for policy 1, policy_version 14950 (0.0008) [2023-10-14 01:37:37,162][33226] Updated weights for policy 1, policy_version 14960 (0.0008) [2023-10-14 01:37:37,535][33226] Updated weights for policy 1, policy_version 14970 (0.0008) [2023-10-14 01:37:38,208][33201] Updated weights for policy 0, policy_version 14820 (0.0008) [2023-10-14 01:37:38,595][33201] Updated weights for policy 0, policy_version 14830 (0.0008) [2023-10-14 01:37:38,972][33201] Updated weights for policy 0, policy_version 14840 (0.0008) [2023-10-14 01:37:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 30539776. Throughput: 0: 1748.7, 1: 1780.8. Samples: 7641640. Policy #0 lag: (min: 14.0, avg: 16.2, max: 46.0) [2023-10-14 01:37:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.390')] [2023-10-14 01:37:41,340][33226] Updated weights for policy 1, policy_version 14980 (0.0011) [2023-10-14 01:37:41,709][33226] Updated weights for policy 1, policy_version 14990 (0.0010) [2023-10-14 01:37:42,071][33226] Updated weights for policy 1, policy_version 15000 (0.0008) [2023-10-14 01:37:42,567][33201] Updated weights for policy 0, policy_version 14850 (0.0010) [2023-10-14 01:37:42,947][33201] Updated weights for policy 0, policy_version 14860 (0.0007) [2023-10-14 01:37:43,316][33201] Updated weights for policy 0, policy_version 14870 (0.0010) [2023-10-14 01:37:43,698][33201] Updated weights for policy 0, policy_version 14880 (0.0011) [2023-10-14 01:37:44,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 30605312. Throughput: 0: 1773.0, 1: 1787.3. Samples: 7652838. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:37:44,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.400')] [2023-10-14 01:37:45,966][33226] Updated weights for policy 1, policy_version 15010 (0.0008) [2023-10-14 01:37:46,333][33226] Updated weights for policy 1, policy_version 15020 (0.0008) [2023-10-14 01:37:46,696][33226] Updated weights for policy 1, policy_version 15030 (0.0008) [2023-10-14 01:37:47,060][33226] Updated weights for policy 1, policy_version 15040 (0.0009) [2023-10-14 01:37:47,471][33201] Updated weights for policy 0, policy_version 14890 (0.0007) [2023-10-14 01:37:47,840][33201] Updated weights for policy 0, policy_version 14900 (0.0009) [2023-10-14 01:37:48,205][33201] Updated weights for policy 0, policy_version 14910 (0.0010) [2023-10-14 01:37:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 30670848. Throughput: 0: 1750.2, 1: 1775.9. Samples: 7673218. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:37:49,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.410')] [2023-10-14 01:37:50,779][33226] Updated weights for policy 1, policy_version 15050 (0.0010) [2023-10-14 01:37:51,136][33226] Updated weights for policy 1, policy_version 15060 (0.0010) [2023-10-14 01:37:51,505][33226] Updated weights for policy 1, policy_version 15070 (0.0009) [2023-10-14 01:37:52,062][33201] Updated weights for policy 0, policy_version 14920 (0.0010) [2023-10-14 01:37:52,425][33201] Updated weights for policy 0, policy_version 14930 (0.0009) [2023-10-14 01:37:52,791][33201] Updated weights for policy 0, policy_version 14940 (0.0008) [2023-10-14 01:37:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 30736384. Throughput: 0: 1750.2, 1: 1772.0. Samples: 7695188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:37:54,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.430')] [2023-10-14 01:37:55,350][33226] Updated weights for policy 1, policy_version 15080 (0.0009) [2023-10-14 01:37:55,717][33226] Updated weights for policy 1, policy_version 15090 (0.0009) [2023-10-14 01:37:56,075][33226] Updated weights for policy 1, policy_version 15100 (0.0010) [2023-10-14 01:37:56,612][33201] Updated weights for policy 0, policy_version 14950 (0.0008) [2023-10-14 01:37:56,980][33201] Updated weights for policy 0, policy_version 14960 (0.0009) [2023-10-14 01:37:57,359][33201] Updated weights for policy 0, policy_version 14970 (0.0009) [2023-10-14 01:37:59,558][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.3, 300 sec: 14106.9). Total num frames: 30801920. Throughput: 0: 1763.9, 1: 1771.6. Samples: 7705332. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 01:37:59,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.410')] [2023-10-14 01:38:00,042][33226] Updated weights for policy 1, policy_version 15110 (0.0011) [2023-10-14 01:38:00,413][33226] Updated weights for policy 1, policy_version 15120 (0.0009) [2023-10-14 01:38:00,776][33226] Updated weights for policy 1, policy_version 15130 (0.0009) [2023-10-14 01:38:01,081][33201] Updated weights for policy 0, policy_version 14980 (0.0009) [2023-10-14 01:38:01,446][33201] Updated weights for policy 0, policy_version 14990 (0.0010) [2023-10-14 01:38:01,814][33201] Updated weights for policy 0, policy_version 15000 (0.0008) [2023-10-14 01:38:04,548][33226] Updated weights for policy 1, policy_version 15140 (0.0009) [2023-10-14 01:38:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 30867456. Throughput: 0: 1748.9, 1: 1769.9. Samples: 7726690. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 01:38:04,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.380')] [2023-10-14 01:38:04,914][33226] Updated weights for policy 1, policy_version 15150 (0.0008) [2023-10-14 01:38:05,290][33226] Updated weights for policy 1, policy_version 15160 (0.0009) [2023-10-14 01:38:05,683][33201] Updated weights for policy 0, policy_version 15010 (0.0007) [2023-10-14 01:38:06,047][33201] Updated weights for policy 0, policy_version 15020 (0.0008) [2023-10-14 01:38:06,423][33201] Updated weights for policy 0, policy_version 15030 (0.0008) [2023-10-14 01:38:06,796][33201] Updated weights for policy 0, policy_version 15040 (0.0008) [2023-10-14 01:38:09,189][33226] Updated weights for policy 1, policy_version 15170 (0.0008) [2023-10-14 01:38:09,557][31953] Fps is (10 sec: 13107.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 30932992. Throughput: 0: 1752.8, 1: 1791.2. Samples: 7748820. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 01:38:09,557][31953] Avg episode reward: [(0, '20.810'), (1, '20.360')] [2023-10-14 01:38:09,567][33226] Updated weights for policy 1, policy_version 15180 (0.0009) [2023-10-14 01:38:09,929][33226] Updated weights for policy 1, policy_version 15190 (0.0007) [2023-10-14 01:38:10,296][33226] Updated weights for policy 1, policy_version 15200 (0.0008) [2023-10-14 01:38:10,636][33201] Updated weights for policy 0, policy_version 15050 (0.0009) [2023-10-14 01:38:11,013][33201] Updated weights for policy 0, policy_version 15060 (0.0008) [2023-10-14 01:38:11,386][33201] Updated weights for policy 0, policy_version 15070 (0.0007) [2023-10-14 01:38:14,032][33226] Updated weights for policy 1, policy_version 15210 (0.0010) [2023-10-14 01:38:14,410][33226] Updated weights for policy 1, policy_version 15220 (0.0007) [2023-10-14 01:38:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 30998528. Throughput: 0: 1748.6, 1: 1766.4. 
Samples: 7758404. Policy #0 lag: (min: 23.0, avg: 26.1, max: 55.0) [2023-10-14 01:38:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.350')] [2023-10-14 01:38:14,780][33226] Updated weights for policy 1, policy_version 15230 (0.0009) [2023-10-14 01:38:15,326][33201] Updated weights for policy 0, policy_version 15080 (0.0010) [2023-10-14 01:38:15,694][33201] Updated weights for policy 0, policy_version 15090 (0.0009) [2023-10-14 01:38:16,067][33201] Updated weights for policy 0, policy_version 15100 (0.0007) [2023-10-14 01:38:18,616][33226] Updated weights for policy 1, policy_version 15240 (0.0009) [2023-10-14 01:38:18,990][33226] Updated weights for policy 1, policy_version 15250 (0.0010) [2023-10-14 01:38:19,366][33226] Updated weights for policy 1, policy_version 15260 (0.0011) [2023-10-14 01:38:19,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 31096832. Throughput: 0: 1750.0, 1: 1784.6. Samples: 7780342. Policy #0 lag: (min: 23.0, avg: 26.1, max: 55.0) [2023-10-14 01:38:19,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.350')] [2023-10-14 01:38:19,936][33201] Updated weights for policy 0, policy_version 15110 (0.0007) [2023-10-14 01:38:20,305][33201] Updated weights for policy 0, policy_version 15120 (0.0009) [2023-10-14 01:38:20,680][33201] Updated weights for policy 0, policy_version 15130 (0.0010) [2023-10-14 01:38:23,118][33226] Updated weights for policy 1, policy_version 15270 (0.0007) [2023-10-14 01:38:23,490][33226] Updated weights for policy 1, policy_version 15280 (0.0010) [2023-10-14 01:38:23,860][33226] Updated weights for policy 1, policy_version 15290 (0.0011) [2023-10-14 01:38:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31162368. Throughput: 0: 1784.5, 1: 1764.4. Samples: 7801338. 
Policy #0 lag: (min: 23.0, avg: 26.1, max: 55.0) [2023-10-14 01:38:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.330')] [2023-10-14 01:38:24,568][33201] Updated weights for policy 0, policy_version 15140 (0.0008) [2023-10-14 01:38:24,961][33201] Updated weights for policy 0, policy_version 15150 (0.0008) [2023-10-14 01:38:25,328][33201] Updated weights for policy 0, policy_version 15160 (0.0011) [2023-10-14 01:38:27,726][33226] Updated weights for policy 1, policy_version 15300 (0.0010) [2023-10-14 01:38:28,085][33226] Updated weights for policy 1, policy_version 15310 (0.0008) [2023-10-14 01:38:28,462][33226] Updated weights for policy 1, policy_version 15320 (0.0008) [2023-10-14 01:38:29,207][33201] Updated weights for policy 0, policy_version 15170 (0.0009) [2023-10-14 01:38:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31227904. Throughput: 0: 1753.8, 1: 1779.8. Samples: 7811848. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 01:38:29,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.310')] [2023-10-14 01:38:29,586][33201] Updated weights for policy 0, policy_version 15180 (0.0008) [2023-10-14 01:38:29,945][33201] Updated weights for policy 0, policy_version 15190 (0.0007) [2023-10-14 01:38:30,320][33201] Updated weights for policy 0, policy_version 15200 (0.0007) [2023-10-14 01:38:32,321][33226] Updated weights for policy 1, policy_version 15330 (0.0009) [2023-10-14 01:38:32,694][33226] Updated weights for policy 1, policy_version 15340 (0.0008) [2023-10-14 01:38:33,055][33226] Updated weights for policy 1, policy_version 15350 (0.0007) [2023-10-14 01:38:33,429][33226] Updated weights for policy 1, policy_version 15360 (0.0009) [2023-10-14 01:38:33,994][33201] Updated weights for policy 0, policy_version 15210 (0.0009) [2023-10-14 01:38:34,379][33201] Updated weights for policy 0, policy_version 15220 (0.0007) [2023-10-14 01:38:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 
13653.4, 300 sec: 14106.9). Total num frames: 31293440. Throughput: 0: 1783.3, 1: 1774.7. Samples: 7833328. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 01:38:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.300')] [2023-10-14 01:38:34,753][33201] Updated weights for policy 0, policy_version 15230 (0.0008) [2023-10-14 01:38:37,218][33226] Updated weights for policy 1, policy_version 15370 (0.0007) [2023-10-14 01:38:37,573][33226] Updated weights for policy 1, policy_version 15380 (0.0008) [2023-10-14 01:38:37,946][33226] Updated weights for policy 1, policy_version 15390 (0.0008) [2023-10-14 01:38:38,587][33201] Updated weights for policy 0, policy_version 15240 (0.0007) [2023-10-14 01:38:38,960][33201] Updated weights for policy 0, policy_version 15250 (0.0008) [2023-10-14 01:38:39,335][33201] Updated weights for policy 0, policy_version 15260 (0.0007) [2023-10-14 01:38:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 31391744. Throughput: 0: 1764.5, 1: 1769.1. Samples: 7854200. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:38:39,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.270')] [2023-10-14 01:38:41,828][33226] Updated weights for policy 1, policy_version 15400 (0.0009) [2023-10-14 01:38:42,203][33226] Updated weights for policy 1, policy_version 15410 (0.0009) [2023-10-14 01:38:42,573][33226] Updated weights for policy 1, policy_version 15420 (0.0008) [2023-10-14 01:38:42,992][33201] Updated weights for policy 0, policy_version 15270 (0.0007) [2023-10-14 01:38:43,360][33201] Updated weights for policy 0, policy_version 15280 (0.0007) [2023-10-14 01:38:43,740][33201] Updated weights for policy 0, policy_version 15290 (0.0010) [2023-10-14 01:38:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 31457280. Throughput: 0: 1773.6, 1: 1789.5. Samples: 7865668. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:38:44,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.300')] [2023-10-14 01:38:46,406][33226] Updated weights for policy 1, policy_version 15430 (0.0009) [2023-10-14 01:38:46,795][33226] Updated weights for policy 1, policy_version 15440 (0.0008) [2023-10-14 01:38:47,167][33226] Updated weights for policy 1, policy_version 15450 (0.0010) [2023-10-14 01:38:47,465][33201] Updated weights for policy 0, policy_version 15300 (0.0009) [2023-10-14 01:38:47,831][33201] Updated weights for policy 0, policy_version 15310 (0.0010) [2023-10-14 01:38:48,209][33201] Updated weights for policy 0, policy_version 15320 (0.0010) [2023-10-14 01:38:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 31522816. Throughput: 0: 1773.6, 1: 1767.7. Samples: 7886050. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 01:38:49,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.370')] [2023-10-14 01:38:50,891][33226] Updated weights for policy 1, policy_version 15460 (0.0008) [2023-10-14 01:38:51,256][33226] Updated weights for policy 1, policy_version 15470 (0.0008) [2023-10-14 01:38:51,619][33226] Updated weights for policy 1, policy_version 15480 (0.0008) [2023-10-14 01:38:52,043][33201] Updated weights for policy 0, policy_version 15330 (0.0011) [2023-10-14 01:38:52,405][33201] Updated weights for policy 0, policy_version 15340 (0.0008) [2023-10-14 01:38:52,775][33201] Updated weights for policy 0, policy_version 15350 (0.0010) [2023-10-14 01:38:53,141][33201] Updated weights for policy 0, policy_version 15360 (0.0010) [2023-10-14 01:38:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 31588352. Throughput: 0: 1756.5, 1: 1770.0. Samples: 7907514. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:38:54,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.390')] [2023-10-14 01:38:55,425][33226] Updated weights for policy 1, policy_version 15490 (0.0009) [2023-10-14 01:38:55,790][33226] Updated weights for policy 1, policy_version 15500 (0.0008) [2023-10-14 01:38:56,163][33226] Updated weights for policy 1, policy_version 15510 (0.0008) [2023-10-14 01:38:56,524][33226] Updated weights for policy 1, policy_version 15520 (0.0008) [2023-10-14 01:38:57,042][33201] Updated weights for policy 0, policy_version 15370 (0.0009) [2023-10-14 01:38:57,412][33201] Updated weights for policy 0, policy_version 15380 (0.0008) [2023-10-14 01:38:57,777][33201] Updated weights for policy 0, policy_version 15390 (0.0009) [2023-10-14 01:38:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31653888. Throughput: 0: 1783.2, 1: 1765.4. Samples: 7918090. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:38:59,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.410')] [2023-10-14 01:39:00,356][33226] Updated weights for policy 1, policy_version 15530 (0.0007) [2023-10-14 01:39:00,733][33226] Updated weights for policy 1, policy_version 15540 (0.0008) [2023-10-14 01:39:01,107][33226] Updated weights for policy 1, policy_version 15550 (0.0009) [2023-10-14 01:39:01,565][33201] Updated weights for policy 0, policy_version 15400 (0.0008) [2023-10-14 01:39:01,933][33201] Updated weights for policy 0, policy_version 15410 (0.0007) [2023-10-14 01:39:02,301][33201] Updated weights for policy 0, policy_version 15420 (0.0010) [2023-10-14 01:39:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31719424. Throughput: 0: 1760.2, 1: 1769.9. Samples: 7939198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:04,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.420')] [2023-10-14 01:39:04,709][33226] Updated weights for policy 1, policy_version 15560 (0.0008) [2023-10-14 01:39:05,084][33226] Updated weights for policy 1, policy_version 15570 (0.0010) [2023-10-14 01:39:05,449][33226] Updated weights for policy 1, policy_version 15580 (0.0007) [2023-10-14 01:39:06,148][33201] Updated weights for policy 0, policy_version 15430 (0.0008) [2023-10-14 01:39:06,514][33201] Updated weights for policy 0, policy_version 15440 (0.0008) [2023-10-14 01:39:06,890][33201] Updated weights for policy 0, policy_version 15450 (0.0007) [2023-10-14 01:39:09,280][33226] Updated weights for policy 1, policy_version 15590 (0.0007) [2023-10-14 01:39:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31784960. Throughput: 0: 1761.9, 1: 1806.0. Samples: 7961890. Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) [2023-10-14 01:39:09,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.430')] [2023-10-14 01:39:09,648][33226] Updated weights for policy 1, policy_version 15600 (0.0008) [2023-10-14 01:39:10,016][33226] Updated weights for policy 1, policy_version 15610 (0.0007) [2023-10-14 01:39:10,804][33201] Updated weights for policy 0, policy_version 15460 (0.0008) [2023-10-14 01:39:11,200][33201] Updated weights for policy 0, policy_version 15470 (0.0008) [2023-10-14 01:39:11,570][33201] Updated weights for policy 0, policy_version 15480 (0.0008) [2023-10-14 01:39:13,567][33226] Updated weights for policy 1, policy_version 15620 (0.0007) [2023-10-14 01:39:13,940][33226] Updated weights for policy 1, policy_version 15630 (0.0009) [2023-10-14 01:39:14,315][33226] Updated weights for policy 1, policy_version 15640 (0.0010) [2023-10-14 01:39:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31850496. Throughput: 0: 1763.0, 1: 1784.0. 
Samples: 7971466. Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) [2023-10-14 01:39:14,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.420')] [2023-10-14 01:39:15,405][33201] Updated weights for policy 0, policy_version 15490 (0.0007) [2023-10-14 01:39:15,770][33201] Updated weights for policy 0, policy_version 15500 (0.0008) [2023-10-14 01:39:16,143][33201] Updated weights for policy 0, policy_version 15510 (0.0007) [2023-10-14 01:39:16,511][33201] Updated weights for policy 0, policy_version 15520 (0.0009) [2023-10-14 01:39:18,120][33226] Updated weights for policy 1, policy_version 15650 (0.0008) [2023-10-14 01:39:18,486][33226] Updated weights for policy 1, policy_version 15660 (0.0007) [2023-10-14 01:39:18,860][33226] Updated weights for policy 1, policy_version 15670 (0.0007) [2023-10-14 01:39:19,229][33226] Updated weights for policy 1, policy_version 15680 (0.0007) [2023-10-14 01:39:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 31948800. Throughput: 0: 1755.9, 1: 1807.9. Samples: 7993698. Policy #0 lag: (min: 30.0, avg: 37.9, max: 62.0) [2023-10-14 01:39:19,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.430')] [2023-10-14 01:39:20,335][33201] Updated weights for policy 0, policy_version 15530 (0.0007) [2023-10-14 01:39:20,716][33201] Updated weights for policy 0, policy_version 15540 (0.0007) [2023-10-14 01:39:21,080][33201] Updated weights for policy 0, policy_version 15550 (0.0007) [2023-10-14 01:39:22,876][33226] Updated weights for policy 1, policy_version 15690 (0.0007) [2023-10-14 01:39:23,248][33226] Updated weights for policy 1, policy_version 15700 (0.0008) [2023-10-14 01:39:23,619][33226] Updated weights for policy 1, policy_version 15710 (0.0007) [2023-10-14 01:39:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32014336. Throughput: 0: 1783.9, 1: 1783.4. Samples: 8014730. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:24,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.420')] [2023-10-14 01:39:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000015712_16089088.pth... [2023-10-14 01:39:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000014048_14385152.pth [2023-10-14 01:39:24,736][33201] Updated weights for policy 0, policy_version 15560 (0.0009) [2023-10-14 01:39:25,115][33201] Updated weights for policy 0, policy_version 15570 (0.0008) [2023-10-14 01:39:25,489][33201] Updated weights for policy 0, policy_version 15580 (0.0009) [2023-10-14 01:39:25,630][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000015584_15958016.pth... [2023-10-14 01:39:25,660][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000013920_14254080.pth [2023-10-14 01:39:27,421][33226] Updated weights for policy 1, policy_version 15720 (0.0009) [2023-10-14 01:39:27,793][33226] Updated weights for policy 1, policy_version 15730 (0.0009) [2023-10-14 01:39:28,164][33226] Updated weights for policy 1, policy_version 15740 (0.0008) [2023-10-14 01:39:29,305][33201] Updated weights for policy 0, policy_version 15590 (0.0010) [2023-10-14 01:39:29,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 32079872. Throughput: 0: 1759.8, 1: 1798.8. Samples: 8025806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:29,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.440')] [2023-10-14 01:39:29,683][33201] Updated weights for policy 0, policy_version 15600 (0.0010) [2023-10-14 01:39:30,064][33201] Updated weights for policy 0, policy_version 15610 (0.0007) [2023-10-14 01:39:31,985][33226] Updated weights for policy 1, policy_version 15750 (0.0010) [2023-10-14 01:39:32,355][33226] Updated weights for policy 1, policy_version 15760 (0.0007) [2023-10-14 01:39:32,735][33226] Updated weights for policy 1, policy_version 15770 (0.0007) [2023-10-14 01:39:33,879][33201] Updated weights for policy 0, policy_version 15620 (0.0007) [2023-10-14 01:39:34,257][33201] Updated weights for policy 0, policy_version 15630 (0.0009) [2023-10-14 01:39:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32145408. Throughput: 0: 1774.9, 1: 1790.7. Samples: 8046502. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.440')] [2023-10-14 01:39:34,634][33201] Updated weights for policy 0, policy_version 15640 (0.0007) [2023-10-14 01:39:36,478][33226] Updated weights for policy 1, policy_version 15780 (0.0008) [2023-10-14 01:39:36,848][33226] Updated weights for policy 1, policy_version 15790 (0.0008) [2023-10-14 01:39:37,217][33226] Updated weights for policy 1, policy_version 15800 (0.0008) [2023-10-14 01:39:38,361][33201] Updated weights for policy 0, policy_version 15650 (0.0008) [2023-10-14 01:39:38,733][33201] Updated weights for policy 0, policy_version 15660 (0.0007) [2023-10-14 01:39:39,101][33201] Updated weights for policy 0, policy_version 15670 (0.0010) [2023-10-14 01:39:39,461][33201] Updated weights for policy 0, policy_version 15680 (0.0007) [2023-10-14 01:39:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 32243712. Throughput: 0: 1774.0, 1: 1791.3. 
Samples: 8067954. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-14 01:39:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.400')] [2023-10-14 01:39:40,985][33226] Updated weights for policy 1, policy_version 15810 (0.0009) [2023-10-14 01:39:41,344][33226] Updated weights for policy 1, policy_version 15820 (0.0008) [2023-10-14 01:39:41,712][33226] Updated weights for policy 1, policy_version 15830 (0.0008) [2023-10-14 01:39:42,083][33226] Updated weights for policy 1, policy_version 15840 (0.0009) [2023-10-14 01:39:43,205][33201] Updated weights for policy 0, policy_version 15690 (0.0010) [2023-10-14 01:39:43,576][33201] Updated weights for policy 0, policy_version 15700 (0.0007) [2023-10-14 01:39:43,944][33201] Updated weights for policy 0, policy_version 15710 (0.0011) [2023-10-14 01:39:44,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 32309248. Throughput: 0: 1774.7, 1: 1798.8. Samples: 8078898. Policy #0 lag: (min: 15.0, avg: 23.0, max: 47.0) [2023-10-14 01:39:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.330')] [2023-10-14 01:39:45,919][33226] Updated weights for policy 1, policy_version 15850 (0.0008) [2023-10-14 01:39:46,297][33226] Updated weights for policy 1, policy_version 15860 (0.0008) [2023-10-14 01:39:46,660][33226] Updated weights for policy 1, policy_version 15870 (0.0008) [2023-10-14 01:39:47,620][33201] Updated weights for policy 0, policy_version 15720 (0.0010) [2023-10-14 01:39:47,997][33201] Updated weights for policy 0, policy_version 15730 (0.0011) [2023-10-14 01:39:48,361][33201] Updated weights for policy 0, policy_version 15740 (0.0009) [2023-10-14 01:39:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 32374784. Throughput: 0: 1779.5, 1: 1797.0. Samples: 8100142. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:49,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.330')] [2023-10-14 01:39:50,460][33226] Updated weights for policy 1, policy_version 15880 (0.0008) [2023-10-14 01:39:50,827][33226] Updated weights for policy 1, policy_version 15890 (0.0009) [2023-10-14 01:39:51,204][33226] Updated weights for policy 1, policy_version 15900 (0.0007) [2023-10-14 01:39:52,285][33201] Updated weights for policy 0, policy_version 15750 (0.0009) [2023-10-14 01:39:52,651][33201] Updated weights for policy 0, policy_version 15760 (0.0007) [2023-10-14 01:39:53,025][33201] Updated weights for policy 0, policy_version 15770 (0.0007) [2023-10-14 01:39:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 32440320. Throughput: 0: 1763.6, 1: 1786.7. Samples: 8121650. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.350')] [2023-10-14 01:39:54,894][33226] Updated weights for policy 1, policy_version 15910 (0.0010) [2023-10-14 01:39:55,263][33226] Updated weights for policy 1, policy_version 15920 (0.0007) [2023-10-14 01:39:55,628][33226] Updated weights for policy 1, policy_version 15930 (0.0008) [2023-10-14 01:39:56,758][33201] Updated weights for policy 0, policy_version 15780 (0.0008) [2023-10-14 01:39:57,152][33201] Updated weights for policy 0, policy_version 15790 (0.0009) [2023-10-14 01:39:57,522][33201] Updated weights for policy 0, policy_version 15800 (0.0008) [2023-10-14 01:39:59,548][33226] Updated weights for policy 1, policy_version 15940 (0.0008) [2023-10-14 01:39:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 32505856. Throughput: 0: 1785.9, 1: 1782.7. Samples: 8132050. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:39:59,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.360')] [2023-10-14 01:39:59,909][33226] Updated weights for policy 1, policy_version 15950 (0.0009) [2023-10-14 01:40:00,281][33226] Updated weights for policy 1, policy_version 15960 (0.0009) [2023-10-14 01:40:01,324][33201] Updated weights for policy 0, policy_version 15810 (0.0008) [2023-10-14 01:40:01,697][33201] Updated weights for policy 0, policy_version 15820 (0.0007) [2023-10-14 01:40:02,073][33201] Updated weights for policy 0, policy_version 15830 (0.0009) [2023-10-14 01:40:02,451][33201] Updated weights for policy 0, policy_version 15840 (0.0008) [2023-10-14 01:40:04,205][33226] Updated weights for policy 1, policy_version 15970 (0.0007) [2023-10-14 01:40:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32571392. Throughput: 0: 1768.6, 1: 1773.5. Samples: 8153094. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:04,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.380')] [2023-10-14 01:40:04,578][33226] Updated weights for policy 1, policy_version 15980 (0.0009) [2023-10-14 01:40:04,951][33226] Updated weights for policy 1, policy_version 15990 (0.0011) [2023-10-14 01:40:05,314][33226] Updated weights for policy 1, policy_version 16000 (0.0008) [2023-10-14 01:40:06,182][33201] Updated weights for policy 0, policy_version 15850 (0.0009) [2023-10-14 01:40:06,556][33201] Updated weights for policy 0, policy_version 15860 (0.0010) [2023-10-14 01:40:06,917][33201] Updated weights for policy 0, policy_version 15870 (0.0011) [2023-10-14 01:40:08,972][33226] Updated weights for policy 1, policy_version 16010 (0.0010) [2023-10-14 01:40:09,345][33226] Updated weights for policy 1, policy_version 16020 (0.0010) [2023-10-14 01:40:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32636928. Throughput: 0: 1764.5, 1: 1797.4. 
Samples: 8175018. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:09,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.370')] [2023-10-14 01:40:09,707][33226] Updated weights for policy 1, policy_version 16030 (0.0008) [2023-10-14 01:40:10,777][33201] Updated weights for policy 0, policy_version 15880 (0.0008) [2023-10-14 01:40:11,149][33201] Updated weights for policy 0, policy_version 15890 (0.0008) [2023-10-14 01:40:11,517][33201] Updated weights for policy 0, policy_version 15900 (0.0009) [2023-10-14 01:40:13,488][33226] Updated weights for policy 1, policy_version 16040 (0.0010) [2023-10-14 01:40:13,852][33226] Updated weights for policy 1, policy_version 16050 (0.0011) [2023-10-14 01:40:14,221][33226] Updated weights for policy 1, policy_version 16060 (0.0008) [2023-10-14 01:40:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 32735232. Throughput: 0: 1767.4, 1: 1771.5. Samples: 8185054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:14,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.360')] [2023-10-14 01:40:15,546][33201] Updated weights for policy 0, policy_version 15910 (0.0010) [2023-10-14 01:40:15,931][33201] Updated weights for policy 0, policy_version 15920 (0.0009) [2023-10-14 01:40:16,304][33201] Updated weights for policy 0, policy_version 15930 (0.0009) [2023-10-14 01:40:17,983][33226] Updated weights for policy 1, policy_version 16070 (0.0009) [2023-10-14 01:40:18,342][33226] Updated weights for policy 1, policy_version 16080 (0.0008) [2023-10-14 01:40:18,712][33226] Updated weights for policy 1, policy_version 16090 (0.0009) [2023-10-14 01:40:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 32800768. Throughput: 0: 1760.9, 1: 1802.9. Samples: 8206872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:19,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.360')] [2023-10-14 01:40:20,166][33201] Updated weights for policy 0, policy_version 15940 (0.0007) [2023-10-14 01:40:20,539][33201] Updated weights for policy 0, policy_version 15950 (0.0008) [2023-10-14 01:40:20,920][33201] Updated weights for policy 0, policy_version 15960 (0.0010) [2023-10-14 01:40:22,656][33226] Updated weights for policy 1, policy_version 16100 (0.0009) [2023-10-14 01:40:23,064][33226] Updated weights for policy 1, policy_version 16110 (0.0009) [2023-10-14 01:40:23,433][33226] Updated weights for policy 1, policy_version 16120 (0.0009) [2023-10-14 01:40:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32866304. Throughput: 0: 1783.5, 1: 1768.1. Samples: 8227774. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.380')] [2023-10-14 01:40:24,841][33201] Updated weights for policy 0, policy_version 15970 (0.0009) [2023-10-14 01:40:25,210][33201] Updated weights for policy 0, policy_version 15980 (0.0010) [2023-10-14 01:40:25,588][33201] Updated weights for policy 0, policy_version 15990 (0.0011) [2023-10-14 01:40:25,957][33201] Updated weights for policy 0, policy_version 16000 (0.0011) [2023-10-14 01:40:27,170][33226] Updated weights for policy 1, policy_version 16130 (0.0009) [2023-10-14 01:40:27,534][33226] Updated weights for policy 1, policy_version 16140 (0.0010) [2023-10-14 01:40:27,909][33226] Updated weights for policy 1, policy_version 16150 (0.0011) [2023-10-14 01:40:28,274][33226] Updated weights for policy 1, policy_version 16160 (0.0010) [2023-10-14 01:40:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32931840. Throughput: 0: 1754.2, 1: 1797.0. Samples: 8238700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:29,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.380')] [2023-10-14 01:40:29,871][33201] Updated weights for policy 0, policy_version 16010 (0.0010) [2023-10-14 01:40:30,248][33201] Updated weights for policy 0, policy_version 16020 (0.0008) [2023-10-14 01:40:30,630][33201] Updated weights for policy 0, policy_version 16030 (0.0008) [2023-10-14 01:40:32,120][33226] Updated weights for policy 1, policy_version 16170 (0.0008) [2023-10-14 01:40:32,488][33226] Updated weights for policy 1, policy_version 16180 (0.0008) [2023-10-14 01:40:32,864][33226] Updated weights for policy 1, policy_version 16190 (0.0009) [2023-10-14 01:40:34,222][33201] Updated weights for policy 0, policy_version 16040 (0.0008) [2023-10-14 01:40:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 32997376. Throughput: 0: 1773.2, 1: 1762.9. Samples: 8259264. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 01:40:34,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.380')] [2023-10-14 01:40:34,603][33201] Updated weights for policy 0, policy_version 16050 (0.0007) [2023-10-14 01:40:34,969][33201] Updated weights for policy 0, policy_version 16060 (0.0008) [2023-10-14 01:40:36,602][33226] Updated weights for policy 1, policy_version 16200 (0.0008) [2023-10-14 01:40:36,971][33226] Updated weights for policy 1, policy_version 16210 (0.0010) [2023-10-14 01:40:37,334][33226] Updated weights for policy 1, policy_version 16220 (0.0007) [2023-10-14 01:40:38,729][33201] Updated weights for policy 0, policy_version 16070 (0.0007) [2023-10-14 01:40:39,101][33201] Updated weights for policy 0, policy_version 16080 (0.0007) [2023-10-14 01:40:39,470][33201] Updated weights for policy 0, policy_version 16090 (0.0008) [2023-10-14 01:40:39,558][31953] Fps is (10 sec: 13106.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 33062912. Throughput: 0: 1770.5, 1: 1764.5. 
Samples: 8280724. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 01:40:39,559][31953] Avg episode reward: [(0, '20.790'), (1, '20.370')] [2023-10-14 01:40:41,112][33226] Updated weights for policy 1, policy_version 16230 (0.0008) [2023-10-14 01:40:41,475][33226] Updated weights for policy 1, policy_version 16240 (0.0010) [2023-10-14 01:40:41,845][33226] Updated weights for policy 1, policy_version 16250 (0.0007) [2023-10-14 01:40:43,449][33201] Updated weights for policy 0, policy_version 16100 (0.0009) [2023-10-14 01:40:43,844][33201] Updated weights for policy 0, policy_version 16110 (0.0008) [2023-10-14 01:40:44,223][33201] Updated weights for policy 0, policy_version 16120 (0.0010) [2023-10-14 01:40:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 33161216. Throughput: 0: 1765.3, 1: 1772.3. Samples: 8291246. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) [2023-10-14 01:40:44,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.380')] [2023-10-14 01:40:45,561][33226] Updated weights for policy 1, policy_version 16260 (0.0010) [2023-10-14 01:40:45,925][33226] Updated weights for policy 1, policy_version 16270 (0.0009) [2023-10-14 01:40:46,304][33226] Updated weights for policy 1, policy_version 16280 (0.0011) [2023-10-14 01:40:48,043][33201] Updated weights for policy 0, policy_version 16130 (0.0007) [2023-10-14 01:40:48,416][33201] Updated weights for policy 0, policy_version 16140 (0.0009) [2023-10-14 01:40:48,787][33201] Updated weights for policy 0, policy_version 16150 (0.0008) [2023-10-14 01:40:49,162][33201] Updated weights for policy 0, policy_version 16160 (0.0008) [2023-10-14 01:40:49,557][31953] Fps is (10 sec: 16384.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 33226752. Throughput: 0: 1780.6, 1: 1770.0. Samples: 8312872. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) [2023-10-14 01:40:49,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.380')] [2023-10-14 01:40:50,232][33226] Updated weights for policy 1, policy_version 16290 (0.0009) [2023-10-14 01:40:50,600][33226] Updated weights for policy 1, policy_version 16300 (0.0009) [2023-10-14 01:40:50,974][33226] Updated weights for policy 1, policy_version 16310 (0.0007) [2023-10-14 01:40:51,344][33226] Updated weights for policy 1, policy_version 16320 (0.0010) [2023-10-14 01:40:52,869][33201] Updated weights for policy 0, policy_version 16170 (0.0009) [2023-10-14 01:40:53,246][33201] Updated weights for policy 0, policy_version 16180 (0.0007) [2023-10-14 01:40:53,617][33201] Updated weights for policy 0, policy_version 16190 (0.0007) [2023-10-14 01:40:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 33292288. Throughput: 0: 1751.0, 1: 1775.9. Samples: 8333726. Policy #0 lag: (min: 31.0, avg: 31.6, max: 48.0) [2023-10-14 01:40:54,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.380')] [2023-10-14 01:40:55,068][33226] Updated weights for policy 1, policy_version 16330 (0.0009) [2023-10-14 01:40:55,443][33226] Updated weights for policy 1, policy_version 16340 (0.0009) [2023-10-14 01:40:55,803][33226] Updated weights for policy 1, policy_version 16350 (0.0008) [2023-10-14 01:40:57,405][33201] Updated weights for policy 0, policy_version 16200 (0.0007) [2023-10-14 01:40:57,781][33201] Updated weights for policy 0, policy_version 16210 (0.0007) [2023-10-14 01:40:58,153][33201] Updated weights for policy 0, policy_version 16220 (0.0007) [2023-10-14 01:40:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 33357824. Throughput: 0: 1784.5, 1: 1763.6. Samples: 8344714. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:40:59,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.390')] [2023-10-14 01:40:59,744][33226] Updated weights for policy 1, policy_version 16360 (0.0009) [2023-10-14 01:41:00,118][33226] Updated weights for policy 1, policy_version 16370 (0.0008) [2023-10-14 01:41:00,486][33226] Updated weights for policy 1, policy_version 16380 (0.0008) [2023-10-14 01:41:02,053][33201] Updated weights for policy 0, policy_version 16230 (0.0008) [2023-10-14 01:41:02,419][33201] Updated weights for policy 0, policy_version 16240 (0.0007) [2023-10-14 01:41:02,789][33201] Updated weights for policy 0, policy_version 16250 (0.0007) [2023-10-14 01:41:04,328][33226] Updated weights for policy 1, policy_version 16390 (0.0008) [2023-10-14 01:41:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 33423360. Throughput: 0: 1753.6, 1: 1762.5. Samples: 8365094. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:04,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.380')] [2023-10-14 01:41:04,691][33226] Updated weights for policy 1, policy_version 16400 (0.0007) [2023-10-14 01:41:05,065][33226] Updated weights for policy 1, policy_version 16410 (0.0009) [2023-10-14 01:41:06,374][33201] Updated weights for policy 0, policy_version 16260 (0.0007) [2023-10-14 01:41:06,737][33201] Updated weights for policy 0, policy_version 16270 (0.0008) [2023-10-14 01:41:07,116][33201] Updated weights for policy 0, policy_version 16280 (0.0010) [2023-10-14 01:41:08,840][33226] Updated weights for policy 1, policy_version 16420 (0.0008) [2023-10-14 01:41:09,252][33226] Updated weights for policy 1, policy_version 16430 (0.0008) [2023-10-14 01:41:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 33488896. Throughput: 0: 1761.0, 1: 1784.6. Samples: 8387326. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:09,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.400')] [2023-10-14 01:41:09,631][33226] Updated weights for policy 1, policy_version 16440 (0.0008) [2023-10-14 01:41:10,919][33201] Updated weights for policy 0, policy_version 16290 (0.0009) [2023-10-14 01:41:11,294][33201] Updated weights for policy 0, policy_version 16300 (0.0008) [2023-10-14 01:41:11,663][33201] Updated weights for policy 0, policy_version 16310 (0.0007) [2023-10-14 01:41:12,038][33201] Updated weights for policy 0, policy_version 16320 (0.0007) [2023-10-14 01:41:13,223][33226] Updated weights for policy 1, policy_version 16450 (0.0010) [2023-10-14 01:41:13,577][33226] Updated weights for policy 1, policy_version 16460 (0.0011) [2023-10-14 01:41:13,950][33226] Updated weights for policy 1, policy_version 16470 (0.0008) [2023-10-14 01:41:14,313][33226] Updated weights for policy 1, policy_version 16480 (0.0010) [2023-10-14 01:41:14,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14107.0). Total num frames: 33587200. Throughput: 0: 1766.8, 1: 1754.9. Samples: 8397178. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:14,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.410')] [2023-10-14 01:41:15,904][33201] Updated weights for policy 0, policy_version 16330 (0.0012) [2023-10-14 01:41:16,283][33201] Updated weights for policy 0, policy_version 16340 (0.0010) [2023-10-14 01:41:16,642][33201] Updated weights for policy 0, policy_version 16350 (0.0007) [2023-10-14 01:41:18,092][33226] Updated weights for policy 1, policy_version 16490 (0.0007) [2023-10-14 01:41:18,465][33226] Updated weights for policy 1, policy_version 16500 (0.0007) [2023-10-14 01:41:18,827][33226] Updated weights for policy 1, policy_version 16510 (0.0009) [2023-10-14 01:41:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 33652736. Throughput: 0: 1768.7, 1: 1793.0. 
Samples: 8419542. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:19,558][31953] Avg episode reward: [(0, '20.780'), (1, '20.420')] [2023-10-14 01:41:20,361][33201] Updated weights for policy 0, policy_version 16360 (0.0009) [2023-10-14 01:41:20,740][33201] Updated weights for policy 0, policy_version 16370 (0.0009) [2023-10-14 01:41:21,116][33201] Updated weights for policy 0, policy_version 16380 (0.0008) [2023-10-14 01:41:22,859][33226] Updated weights for policy 1, policy_version 16520 (0.0008) [2023-10-14 01:41:23,226][33226] Updated weights for policy 1, policy_version 16530 (0.0007) [2023-10-14 01:41:23,595][33226] Updated weights for policy 1, policy_version 16540 (0.0007) [2023-10-14 01:41:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 33718272. Throughput: 0: 1789.1, 1: 1762.6. Samples: 8440550. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:24,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.450')] [2023-10-14 01:41:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000016544_16941056.pth... [2023-10-14 01:41:24,596][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000014880_15237120.pth [2023-10-14 01:41:24,600][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000016544_16941056.pth [2023-10-14 01:41:24,734][33201] Updated weights for policy 0, policy_version 16390 (0.0009) [2023-10-14 01:41:25,100][33201] Updated weights for policy 0, policy_version 16400 (0.0009) [2023-10-14 01:41:25,471][33201] Updated weights for policy 0, policy_version 16410 (0.0008) [2023-10-14 01:41:25,696][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000016416_16809984.pth... 
[2023-10-14 01:41:25,735][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000014752_15106048.pth [2023-10-14 01:41:25,741][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000016416_16809984.pth [2023-10-14 01:41:27,241][33226] Updated weights for policy 1, policy_version 16550 (0.0008) [2023-10-14 01:41:27,617][33226] Updated weights for policy 1, policy_version 16560 (0.0008) [2023-10-14 01:41:27,977][33226] Updated weights for policy 1, policy_version 16570 (0.0009) [2023-10-14 01:41:29,504][33201] Updated weights for policy 0, policy_version 16420 (0.0008) [2023-10-14 01:41:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 33783808. Throughput: 0: 1773.5, 1: 1795.4. Samples: 8451846. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 01:41:29,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.470')] [2023-10-14 01:41:29,882][33201] Updated weights for policy 0, policy_version 16430 (0.0009) [2023-10-14 01:41:30,245][33201] Updated weights for policy 0, policy_version 16440 (0.0007) [2023-10-14 01:41:31,751][33226] Updated weights for policy 1, policy_version 16580 (0.0010) [2023-10-14 01:41:32,117][33226] Updated weights for policy 1, policy_version 16590 (0.0008) [2023-10-14 01:41:32,489][33226] Updated weights for policy 1, policy_version 16600 (0.0007) [2023-10-14 01:41:33,967][33201] Updated weights for policy 0, policy_version 16450 (0.0007) [2023-10-14 01:41:34,346][33201] Updated weights for policy 0, policy_version 16460 (0.0009) [2023-10-14 01:41:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 33849344. Throughput: 0: 1778.5, 1: 1772.1. Samples: 8472648. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 01:41:34,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.460')] [2023-10-14 01:41:34,711][33201] Updated weights for policy 0, policy_version 16470 (0.0009) [2023-10-14 01:41:35,089][33201] Updated weights for policy 0, policy_version 16480 (0.0007) [2023-10-14 01:41:36,210][33226] Updated weights for policy 1, policy_version 16610 (0.0007) [2023-10-14 01:41:36,575][33226] Updated weights for policy 1, policy_version 16620 (0.0008) [2023-10-14 01:41:36,942][33226] Updated weights for policy 1, policy_version 16630 (0.0009) [2023-10-14 01:41:37,313][33226] Updated weights for policy 1, policy_version 16640 (0.0010) [2023-10-14 01:41:38,799][33201] Updated weights for policy 0, policy_version 16490 (0.0008) [2023-10-14 01:41:39,179][33201] Updated weights for policy 0, policy_version 16500 (0.0009) [2023-10-14 01:41:39,547][33201] Updated weights for policy 0, policy_version 16510 (0.0007) [2023-10-14 01:41:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 33914880. Throughput: 0: 1787.1, 1: 1779.7. Samples: 8494234. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 01:41:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.430')] [2023-10-14 01:41:40,846][33226] Updated weights for policy 1, policy_version 16650 (0.0007) [2023-10-14 01:41:41,208][33226] Updated weights for policy 1, policy_version 16660 (0.0008) [2023-10-14 01:41:41,584][33226] Updated weights for policy 1, policy_version 16670 (0.0010) [2023-10-14 01:41:43,295][33201] Updated weights for policy 0, policy_version 16520 (0.0008) [2023-10-14 01:41:43,659][33201] Updated weights for policy 0, policy_version 16530 (0.0007) [2023-10-14 01:41:44,032][33201] Updated weights for policy 0, policy_version 16540 (0.0009) [2023-10-14 01:41:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34013184. Throughput: 0: 1773.3, 1: 1783.1. 
Samples: 8504750. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:44,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.390')] [2023-10-14 01:41:45,557][33226] Updated weights for policy 1, policy_version 16680 (0.0008) [2023-10-14 01:41:45,917][33226] Updated weights for policy 1, policy_version 16690 (0.0008) [2023-10-14 01:41:46,293][33226] Updated weights for policy 1, policy_version 16700 (0.0008) [2023-10-14 01:41:47,909][33201] Updated weights for policy 0, policy_version 16550 (0.0008) [2023-10-14 01:41:48,273][33201] Updated weights for policy 0, policy_version 16560 (0.0009) [2023-10-14 01:41:48,652][33201] Updated weights for policy 0, policy_version 16570 (0.0008) [2023-10-14 01:41:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34078720. Throughput: 0: 1800.5, 1: 1782.8. Samples: 8526340. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.390')] [2023-10-14 01:41:49,998][33226] Updated weights for policy 1, policy_version 16710 (0.0009) [2023-10-14 01:41:50,376][33226] Updated weights for policy 1, policy_version 16720 (0.0012) [2023-10-14 01:41:50,752][33226] Updated weights for policy 1, policy_version 16730 (0.0008) [2023-10-14 01:41:52,476][33201] Updated weights for policy 0, policy_version 16580 (0.0008) [2023-10-14 01:41:52,846][33201] Updated weights for policy 0, policy_version 16590 (0.0007) [2023-10-14 01:41:53,216][33201] Updated weights for policy 0, policy_version 16600 (0.0007) [2023-10-14 01:41:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 34144256. Throughput: 0: 1770.1, 1: 1790.5. Samples: 8547556. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:54,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.350')] [2023-10-14 01:41:54,607][33226] Updated weights for policy 1, policy_version 16740 (0.0008) [2023-10-14 01:41:55,005][33226] Updated weights for policy 1, policy_version 16750 (0.0008) [2023-10-14 01:41:55,364][33226] Updated weights for policy 1, policy_version 16760 (0.0007) [2023-10-14 01:41:56,986][33201] Updated weights for policy 0, policy_version 16610 (0.0008) [2023-10-14 01:41:57,356][33201] Updated weights for policy 0, policy_version 16620 (0.0008) [2023-10-14 01:41:57,733][33201] Updated weights for policy 0, policy_version 16630 (0.0007) [2023-10-14 01:41:58,108][33201] Updated weights for policy 0, policy_version 16640 (0.0009) [2023-10-14 01:41:59,047][33226] Updated weights for policy 1, policy_version 16770 (0.0007) [2023-10-14 01:41:59,410][33226] Updated weights for policy 1, policy_version 16780 (0.0010) [2023-10-14 01:41:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 34209792. Throughput: 0: 1799.2, 1: 1779.6. Samples: 8558222. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:41:59,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.350')] [2023-10-14 01:41:59,773][33226] Updated weights for policy 1, policy_version 16790 (0.0010) [2023-10-14 01:42:00,137][33226] Updated weights for policy 1, policy_version 16800 (0.0011) [2023-10-14 01:42:01,726][33201] Updated weights for policy 0, policy_version 16650 (0.0007) [2023-10-14 01:42:02,096][33201] Updated weights for policy 0, policy_version 16660 (0.0008) [2023-10-14 01:42:02,470][33201] Updated weights for policy 0, policy_version 16670 (0.0007) [2023-10-14 01:42:03,966][33226] Updated weights for policy 1, policy_version 16810 (0.0008) [2023-10-14 01:42:04,327][33226] Updated weights for policy 1, policy_version 16820 (0.0007) [2023-10-14 01:42:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34275328. Throughput: 0: 1773.3, 1: 1787.3. Samples: 8579766. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:42:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.350')] [2023-10-14 01:42:04,699][33226] Updated weights for policy 1, policy_version 16830 (0.0007) [2023-10-14 01:42:06,428][33201] Updated weights for policy 0, policy_version 16680 (0.0009) [2023-10-14 01:42:06,790][33201] Updated weights for policy 0, policy_version 16690 (0.0008) [2023-10-14 01:42:07,168][33201] Updated weights for policy 0, policy_version 16700 (0.0007) [2023-10-14 01:42:08,485][33226] Updated weights for policy 1, policy_version 16840 (0.0009) [2023-10-14 01:42:08,859][33226] Updated weights for policy 1, policy_version 16850 (0.0008) [2023-10-14 01:42:09,217][33226] Updated weights for policy 1, policy_version 16860 (0.0009) [2023-10-14 01:42:09,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 34373632. Throughput: 0: 1763.6, 1: 1802.0. Samples: 8601002. 
Policy #0 lag: (min: 22.0, avg: 33.4, max: 54.0) [2023-10-14 01:42:09,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.350')] [2023-10-14 01:42:11,064][33201] Updated weights for policy 0, policy_version 16710 (0.0010) [2023-10-14 01:42:11,429][33201] Updated weights for policy 0, policy_version 16720 (0.0010) [2023-10-14 01:42:11,797][33201] Updated weights for policy 0, policy_version 16730 (0.0008) [2023-10-14 01:42:13,060][33226] Updated weights for policy 1, policy_version 16870 (0.0010) [2023-10-14 01:42:13,436][33226] Updated weights for policy 1, policy_version 16880 (0.0010) [2023-10-14 01:42:13,798][33226] Updated weights for policy 1, policy_version 16890 (0.0010) [2023-10-14 01:42:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 34439168. Throughput: 0: 1763.6, 1: 1784.3. Samples: 8611500. Policy #0 lag: (min: 22.0, avg: 33.4, max: 54.0) [2023-10-14 01:42:14,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.340')] [2023-10-14 01:42:15,570][33201] Updated weights for policy 0, policy_version 16740 (0.0008) [2023-10-14 01:42:15,963][33201] Updated weights for policy 0, policy_version 16750 (0.0008) [2023-10-14 01:42:16,340][33201] Updated weights for policy 0, policy_version 16760 (0.0008) [2023-10-14 01:42:17,568][33226] Updated weights for policy 1, policy_version 16900 (0.0009) [2023-10-14 01:42:17,931][33226] Updated weights for policy 1, policy_version 16910 (0.0009) [2023-10-14 01:42:18,296][33226] Updated weights for policy 1, policy_version 16920 (0.0011) [2023-10-14 01:42:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34504704. Throughput: 0: 1763.7, 1: 1796.3. Samples: 8632846. 
Policy #0 lag: (min: 22.0, avg: 33.4, max: 54.0) [2023-10-14 01:42:19,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.350')] [2023-10-14 01:42:20,075][33201] Updated weights for policy 0, policy_version 16770 (0.0009) [2023-10-14 01:42:20,454][33201] Updated weights for policy 0, policy_version 16780 (0.0008) [2023-10-14 01:42:20,821][33201] Updated weights for policy 0, policy_version 16790 (0.0007) [2023-10-14 01:42:21,191][33201] Updated weights for policy 0, policy_version 16800 (0.0008) [2023-10-14 01:42:22,109][33226] Updated weights for policy 1, policy_version 16930 (0.0010) [2023-10-14 01:42:22,484][33226] Updated weights for policy 1, policy_version 16940 (0.0010) [2023-10-14 01:42:22,862][33226] Updated weights for policy 1, policy_version 16950 (0.0011) [2023-10-14 01:42:23,232][33226] Updated weights for policy 1, policy_version 16960 (0.0010) [2023-10-14 01:42:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 34570240. Throughput: 0: 1780.7, 1: 1770.8. Samples: 8654052. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 01:42:24,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.340')] [2023-10-14 01:42:25,121][33201] Updated weights for policy 0, policy_version 16810 (0.0008) [2023-10-14 01:42:25,497][33201] Updated weights for policy 0, policy_version 16820 (0.0011) [2023-10-14 01:42:25,886][33201] Updated weights for policy 0, policy_version 16830 (0.0011) [2023-10-14 01:42:26,914][33226] Updated weights for policy 1, policy_version 16970 (0.0007) [2023-10-14 01:42:27,286][33226] Updated weights for policy 1, policy_version 16980 (0.0007) [2023-10-14 01:42:27,651][33226] Updated weights for policy 1, policy_version 16990 (0.0008) [2023-10-14 01:42:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 34635776. Throughput: 0: 1758.9, 1: 1796.4. Samples: 8664740. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 01:42:29,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.320')] [2023-10-14 01:42:29,808][33201] Updated weights for policy 0, policy_version 16840 (0.0008) [2023-10-14 01:42:30,180][33201] Updated weights for policy 0, policy_version 16850 (0.0007) [2023-10-14 01:42:30,544][33201] Updated weights for policy 0, policy_version 16860 (0.0010) [2023-10-14 01:42:31,270][33226] Updated weights for policy 1, policy_version 17000 (0.0010) [2023-10-14 01:42:31,647][33226] Updated weights for policy 1, policy_version 17010 (0.0010) [2023-10-14 01:42:32,020][33226] Updated weights for policy 1, policy_version 17020 (0.0009) [2023-10-14 01:42:34,382][33201] Updated weights for policy 0, policy_version 16870 (0.0009) [2023-10-14 01:42:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 34701312. Throughput: 0: 1768.2, 1: 1780.9. Samples: 8686052. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 01:42:34,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.330')] [2023-10-14 01:42:34,754][33201] Updated weights for policy 0, policy_version 16880 (0.0009) [2023-10-14 01:42:35,135][33201] Updated weights for policy 0, policy_version 16890 (0.0007) [2023-10-14 01:42:35,847][33226] Updated weights for policy 1, policy_version 17030 (0.0008) [2023-10-14 01:42:36,217][33226] Updated weights for policy 1, policy_version 17040 (0.0008) [2023-10-14 01:42:36,589][33226] Updated weights for policy 1, policy_version 17050 (0.0010) [2023-10-14 01:42:38,950][33201] Updated weights for policy 0, policy_version 16900 (0.0008) [2023-10-14 01:42:39,329][33201] Updated weights for policy 0, policy_version 16910 (0.0009) [2023-10-14 01:42:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 34766848. Throughput: 0: 1779.3, 1: 1780.3. Samples: 8707736. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-14 01:42:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.350')] [2023-10-14 01:42:39,693][33201] Updated weights for policy 0, policy_version 16920 (0.0008) [2023-10-14 01:42:40,275][33226] Updated weights for policy 1, policy_version 17060 (0.0009) [2023-10-14 01:42:40,644][33226] Updated weights for policy 1, policy_version 17070 (0.0010) [2023-10-14 01:42:41,014][33226] Updated weights for policy 1, policy_version 17080 (0.0009) [2023-10-14 01:42:43,480][33201] Updated weights for policy 0, policy_version 16930 (0.0008) [2023-10-14 01:42:43,845][33201] Updated weights for policy 0, policy_version 16940 (0.0008) [2023-10-14 01:42:44,218][33201] Updated weights for policy 0, policy_version 16950 (0.0008) [2023-10-14 01:42:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 34832384. Throughput: 0: 1756.1, 1: 1788.3. Samples: 8717720. Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-14 01:42:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.360')] [2023-10-14 01:42:44,593][33201] Updated weights for policy 0, policy_version 16960 (0.0008) [2023-10-14 01:42:44,966][33226] Updated weights for policy 1, policy_version 17090 (0.0011) [2023-10-14 01:42:45,323][33226] Updated weights for policy 1, policy_version 17100 (0.0007) [2023-10-14 01:42:45,693][33226] Updated weights for policy 1, policy_version 17110 (0.0007) [2023-10-14 01:42:46,059][33226] Updated weights for policy 1, policy_version 17120 (0.0010) [2023-10-14 01:42:48,405][33201] Updated weights for policy 0, policy_version 16970 (0.0010) [2023-10-14 01:42:48,780][33201] Updated weights for policy 0, policy_version 16980 (0.0007) [2023-10-14 01:42:49,149][33201] Updated weights for policy 0, policy_version 16990 (0.0010) [2023-10-14 01:42:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34930688. Throughput: 0: 1774.6, 1: 1778.6. 
Samples: 8739660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:42:49,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.370')] [2023-10-14 01:42:49,844][33226] Updated weights for policy 1, policy_version 17130 (0.0007) [2023-10-14 01:42:50,211][33226] Updated weights for policy 1, policy_version 17140 (0.0008) [2023-10-14 01:42:50,583][33226] Updated weights for policy 1, policy_version 17150 (0.0007) [2023-10-14 01:42:53,133][33201] Updated weights for policy 0, policy_version 17000 (0.0007) [2023-10-14 01:42:53,499][33201] Updated weights for policy 0, policy_version 17010 (0.0007) [2023-10-14 01:42:53,871][33201] Updated weights for policy 0, policy_version 17020 (0.0007) [2023-10-14 01:42:54,258][33226] Updated weights for policy 1, policy_version 17160 (0.0009) [2023-10-14 01:42:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 34996224. Throughput: 0: 1743.0, 1: 1796.7. Samples: 8760288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:42:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.380')] [2023-10-14 01:42:54,623][33226] Updated weights for policy 1, policy_version 17170 (0.0009) [2023-10-14 01:42:54,994][33226] Updated weights for policy 1, policy_version 17180 (0.0009) [2023-10-14 01:42:57,695][33201] Updated weights for policy 0, policy_version 17030 (0.0008) [2023-10-14 01:42:58,068][33201] Updated weights for policy 0, policy_version 17040 (0.0009) [2023-10-14 01:42:58,436][33201] Updated weights for policy 0, policy_version 17050 (0.0007) [2023-10-14 01:42:58,783][33226] Updated weights for policy 1, policy_version 17190 (0.0007) [2023-10-14 01:42:59,165][33226] Updated weights for policy 1, policy_version 17200 (0.0007) [2023-10-14 01:42:59,536][33226] Updated weights for policy 1, policy_version 17210 (0.0008) [2023-10-14 01:42:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35061760. 
Throughput: 0: 1784.8, 1: 1774.4. Samples: 8771666. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:42:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.370')] [2023-10-14 01:43:02,259][33201] Updated weights for policy 0, policy_version 17060 (0.0008) [2023-10-14 01:43:02,630][33201] Updated weights for policy 0, policy_version 17070 (0.0009) [2023-10-14 01:43:03,002][33201] Updated weights for policy 0, policy_version 17080 (0.0009) [2023-10-14 01:43:03,385][33226] Updated weights for policy 1, policy_version 17220 (0.0008) [2023-10-14 01:43:03,753][33226] Updated weights for policy 1, policy_version 17230 (0.0007) [2023-10-14 01:43:04,120][33226] Updated weights for policy 1, policy_version 17240 (0.0007) [2023-10-14 01:43:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 35160064. Throughput: 0: 1761.0, 1: 1791.9. Samples: 8792724. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-14 01:43:04,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.320')] [2023-10-14 01:43:06,790][33201] Updated weights for policy 0, policy_version 17090 (0.0010) [2023-10-14 01:43:07,165][33201] Updated weights for policy 0, policy_version 17100 (0.0008) [2023-10-14 01:43:07,530][33201] Updated weights for policy 0, policy_version 17110 (0.0008) [2023-10-14 01:43:07,868][33226] Updated weights for policy 1, policy_version 17250 (0.0008) [2023-10-14 01:43:07,909][33201] Updated weights for policy 0, policy_version 17120 (0.0008) [2023-10-14 01:43:08,246][33226] Updated weights for policy 1, policy_version 17260 (0.0009) [2023-10-14 01:43:08,612][33226] Updated weights for policy 1, policy_version 17270 (0.0009) [2023-10-14 01:43:08,982][33226] Updated weights for policy 1, policy_version 17280 (0.0010) [2023-10-14 01:43:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 35225600. Throughput: 0: 1751.9, 1: 1784.3. Samples: 8813182. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-14 01:43:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.310')] [2023-10-14 01:43:11,806][33201] Updated weights for policy 0, policy_version 17130 (0.0008) [2023-10-14 01:43:12,175][33201] Updated weights for policy 0, policy_version 17140 (0.0009) [2023-10-14 01:43:12,538][33201] Updated weights for policy 0, policy_version 17150 (0.0008) [2023-10-14 01:43:12,757][33226] Updated weights for policy 1, policy_version 17290 (0.0007) [2023-10-14 01:43:13,130][33226] Updated weights for policy 1, policy_version 17300 (0.0010) [2023-10-14 01:43:13,490][33226] Updated weights for policy 1, policy_version 17310 (0.0007) [2023-10-14 01:43:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35291136. Throughput: 0: 1766.1, 1: 1788.0. Samples: 8824672. Policy #0 lag: (min: 31.0, avg: 31.0, max: 34.0) [2023-10-14 01:43:14,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.300')] [2023-10-14 01:43:16,207][33201] Updated weights for policy 0, policy_version 17160 (0.0010) [2023-10-14 01:43:16,580][33201] Updated weights for policy 0, policy_version 17170 (0.0008) [2023-10-14 01:43:16,950][33201] Updated weights for policy 0, policy_version 17180 (0.0010) [2023-10-14 01:43:17,272][33226] Updated weights for policy 1, policy_version 17320 (0.0008) [2023-10-14 01:43:17,635][33226] Updated weights for policy 1, policy_version 17330 (0.0010) [2023-10-14 01:43:17,997][33226] Updated weights for policy 1, policy_version 17340 (0.0011) [2023-10-14 01:43:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35356672. Throughput: 0: 1758.0, 1: 1784.2. Samples: 8845450. 
Policy #0 lag: (min: 3.0, avg: 3.9, max: 23.0) [2023-10-14 01:43:19,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.280')] [2023-10-14 01:43:20,821][33201] Updated weights for policy 0, policy_version 17190 (0.0007) [2023-10-14 01:43:21,190][33201] Updated weights for policy 0, policy_version 17200 (0.0008) [2023-10-14 01:43:21,568][33201] Updated weights for policy 0, policy_version 17210 (0.0010) [2023-10-14 01:43:21,811][33226] Updated weights for policy 1, policy_version 17350 (0.0009) [2023-10-14 01:43:22,174][33226] Updated weights for policy 1, policy_version 17360 (0.0007) [2023-10-14 01:43:22,550][33226] Updated weights for policy 1, policy_version 17370 (0.0009) [2023-10-14 01:43:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35422208. Throughput: 0: 1758.2, 1: 1783.5. Samples: 8867112. Policy #0 lag: (min: 3.0, avg: 3.9, max: 23.0) [2023-10-14 01:43:24,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.330')] [2023-10-14 01:43:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000017376_17793024.pth... [2023-10-14 01:43:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000017216_17629184.pth... 
[2023-10-14 01:43:24,602][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000015712_16089088.pth [2023-10-14 01:43:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000015584_15958016.pth [2023-10-14 01:43:25,471][33201] Updated weights for policy 0, policy_version 17220 (0.0007) [2023-10-14 01:43:25,837][33201] Updated weights for policy 0, policy_version 17230 (0.0007) [2023-10-14 01:43:26,210][33201] Updated weights for policy 0, policy_version 17240 (0.0007) [2023-10-14 01:43:26,384][33226] Updated weights for policy 1, policy_version 17380 (0.0008) [2023-10-14 01:43:26,783][33226] Updated weights for policy 1, policy_version 17390 (0.0007) [2023-10-14 01:43:27,156][33226] Updated weights for policy 1, policy_version 17400 (0.0008) [2023-10-14 01:43:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35487744. Throughput: 0: 1751.0, 1: 1794.1. Samples: 8877248. Policy #0 lag: (min: 3.0, avg: 3.9, max: 23.0) [2023-10-14 01:43:29,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.420')] [2023-10-14 01:43:30,025][33201] Updated weights for policy 0, policy_version 17250 (0.0007) [2023-10-14 01:43:30,393][33201] Updated weights for policy 0, policy_version 17260 (0.0007) [2023-10-14 01:43:30,774][33201] Updated weights for policy 0, policy_version 17270 (0.0009) [2023-10-14 01:43:30,902][33226] Updated weights for policy 1, policy_version 17410 (0.0008) [2023-10-14 01:43:31,145][33201] Updated weights for policy 0, policy_version 17280 (0.0009) [2023-10-14 01:43:31,273][33226] Updated weights for policy 1, policy_version 17420 (0.0009) [2023-10-14 01:43:31,643][33226] Updated weights for policy 1, policy_version 17430 (0.0010) [2023-10-14 01:43:32,004][33226] Updated weights for policy 1, policy_version 17440 (0.0010) [2023-10-14 01:43:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 35553280. 
Throughput: 0: 1754.5, 1: 1779.1. Samples: 8898670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:43:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.380')] [2023-10-14 01:43:34,897][33201] Updated weights for policy 0, policy_version 17290 (0.0009) [2023-10-14 01:43:35,272][33201] Updated weights for policy 0, policy_version 17300 (0.0007) [2023-10-14 01:43:35,642][33201] Updated weights for policy 0, policy_version 17310 (0.0007) [2023-10-14 01:43:35,793][33226] Updated weights for policy 1, policy_version 17450 (0.0008) [2023-10-14 01:43:36,168][33226] Updated weights for policy 1, policy_version 17460 (0.0008) [2023-10-14 01:43:36,535][33226] Updated weights for policy 1, policy_version 17470 (0.0008) [2023-10-14 01:43:39,549][33201] Updated weights for policy 0, policy_version 17320 (0.0009) [2023-10-14 01:43:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 35618816. Throughput: 0: 1787.2, 1: 1779.2. Samples: 8920776. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:43:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.420')] [2023-10-14 01:43:39,927][33201] Updated weights for policy 0, policy_version 17330 (0.0009) [2023-10-14 01:43:40,273][33226] Updated weights for policy 1, policy_version 17480 (0.0008) [2023-10-14 01:43:40,289][33201] Updated weights for policy 0, policy_version 17340 (0.0007) [2023-10-14 01:43:40,638][33226] Updated weights for policy 1, policy_version 17490 (0.0008) [2023-10-14 01:43:41,007][33226] Updated weights for policy 1, policy_version 17500 (0.0008) [2023-10-14 01:43:44,128][33201] Updated weights for policy 0, policy_version 17350 (0.0008) [2023-10-14 01:43:44,503][33201] Updated weights for policy 0, policy_version 17360 (0.0007) [2023-10-14 01:43:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 35684352. Throughput: 0: 1750.5, 1: 1778.3. Samples: 8930462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:43:44,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.450')] [2023-10-14 01:43:44,783][33226] Updated weights for policy 1, policy_version 17510 (0.0009) [2023-10-14 01:43:44,869][33201] Updated weights for policy 0, policy_version 17370 (0.0007) [2023-10-14 01:43:45,153][33226] Updated weights for policy 1, policy_version 17520 (0.0009) [2023-10-14 01:43:45,528][33226] Updated weights for policy 1, policy_version 17530 (0.0007) [2023-10-14 01:43:48,851][33201] Updated weights for policy 0, policy_version 17380 (0.0008) [2023-10-14 01:43:49,251][33201] Updated weights for policy 0, policy_version 17390 (0.0008) [2023-10-14 01:43:49,355][33226] Updated weights for policy 1, policy_version 17540 (0.0007) [2023-10-14 01:43:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 35749888. Throughput: 0: 1773.3, 1: 1780.4. Samples: 8952636. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-14 01:43:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.490')] [2023-10-14 01:43:49,618][33201] Updated weights for policy 0, policy_version 17400 (0.0008) [2023-10-14 01:43:49,717][33226] Updated weights for policy 1, policy_version 17550 (0.0008) [2023-10-14 01:43:49,910][32837] Saving new best policy, reward=20.950! 
[2023-10-14 01:43:50,098][33226] Updated weights for policy 1, policy_version 17560 (0.0009) [2023-10-14 01:43:53,378][33201] Updated weights for policy 0, policy_version 17410 (0.0007) [2023-10-14 01:43:53,749][33201] Updated weights for policy 0, policy_version 17420 (0.0009) [2023-10-14 01:43:53,763][33226] Updated weights for policy 1, policy_version 17570 (0.0009) [2023-10-14 01:43:54,116][33201] Updated weights for policy 0, policy_version 17430 (0.0009) [2023-10-14 01:43:54,124][33226] Updated weights for policy 1, policy_version 17580 (0.0007) [2023-10-14 01:43:54,491][33201] Updated weights for policy 0, policy_version 17440 (0.0008) [2023-10-14 01:43:54,500][33226] Updated weights for policy 1, policy_version 17590 (0.0008) [2023-10-14 01:43:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 35848192. Throughput: 0: 1757.1, 1: 1807.5. Samples: 8973588. Policy #0 lag: (min: 31.0, avg: 37.8, max: 63.0) [2023-10-14 01:43:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.500')] [2023-10-14 01:43:54,569][32837] Saving new best policy, reward=20.980! [2023-10-14 01:43:54,858][33226] Updated weights for policy 1, policy_version 17600 (0.0011) [2023-10-14 01:43:58,257][33201] Updated weights for policy 0, policy_version 17450 (0.0008) [2023-10-14 01:43:58,547][33226] Updated weights for policy 1, policy_version 17610 (0.0008) [2023-10-14 01:43:58,625][33201] Updated weights for policy 0, policy_version 17460 (0.0009) [2023-10-14 01:43:58,924][33226] Updated weights for policy 1, policy_version 17620 (0.0007) [2023-10-14 01:43:58,999][33201] Updated weights for policy 0, policy_version 17470 (0.0007) [2023-10-14 01:43:59,295][33226] Updated weights for policy 1, policy_version 17630 (0.0008) [2023-10-14 01:43:59,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 35946496. Throughput: 0: 1765.1, 1: 1783.6. Samples: 8984366. 
Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) [2023-10-14 01:43:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.490')] [2023-10-14 01:44:02,865][33201] Updated weights for policy 0, policy_version 17480 (0.0011) [2023-10-14 01:44:03,205][33226] Updated weights for policy 1, policy_version 17640 (0.0008) [2023-10-14 01:44:03,243][33201] Updated weights for policy 0, policy_version 17490 (0.0008) [2023-10-14 01:44:03,580][33226] Updated weights for policy 1, policy_version 17650 (0.0007) [2023-10-14 01:44:03,613][33201] Updated weights for policy 0, policy_version 17500 (0.0008) [2023-10-14 01:44:03,942][33226] Updated weights for policy 1, policy_version 17660 (0.0008) [2023-10-14 01:44:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 36012032. Throughput: 0: 1755.6, 1: 1803.0. Samples: 9005586. Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) [2023-10-14 01:44:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.510')] [2023-10-14 01:44:04,559][32837] Saving new best policy, reward=20.990! [2023-10-14 01:44:07,415][33201] Updated weights for policy 0, policy_version 17510 (0.0009) [2023-10-14 01:44:07,791][33201] Updated weights for policy 0, policy_version 17520 (0.0009) [2023-10-14 01:44:07,796][33226] Updated weights for policy 1, policy_version 17670 (0.0007) [2023-10-14 01:44:08,160][33201] Updated weights for policy 0, policy_version 17530 (0.0009) [2023-10-14 01:44:08,169][33226] Updated weights for policy 1, policy_version 17680 (0.0008) [2023-10-14 01:44:08,534][33226] Updated weights for policy 1, policy_version 17690 (0.0007) [2023-10-14 01:44:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 36077568. Throughput: 0: 1740.5, 1: 1776.4. Samples: 9025372. 
Policy #0 lag: (min: 3.0, avg: 11.0, max: 35.0) [2023-10-14 01:44:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.530')] [2023-10-14 01:44:09,568][32837] Saving new best policy, reward=21.000! [2023-10-14 01:44:11,960][33201] Updated weights for policy 0, policy_version 17540 (0.0008) [2023-10-14 01:44:12,327][33201] Updated weights for policy 0, policy_version 17550 (0.0007) [2023-10-14 01:44:12,395][33226] Updated weights for policy 1, policy_version 17700 (0.0007) [2023-10-14 01:44:12,695][33201] Updated weights for policy 0, policy_version 17560 (0.0007) [2023-10-14 01:44:12,763][33226] Updated weights for policy 1, policy_version 17710 (0.0009) [2023-10-14 01:44:13,129][33226] Updated weights for policy 1, policy_version 17720 (0.0008) [2023-10-14 01:44:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 36143104. Throughput: 0: 1765.8, 1: 1796.1. Samples: 9037534. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-14 01:44:14,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.550')] [2023-10-14 01:44:16,560][33201] Updated weights for policy 0, policy_version 17570 (0.0007) [2023-10-14 01:44:16,887][33226] Updated weights for policy 1, policy_version 17730 (0.0009) [2023-10-14 01:44:16,938][33201] Updated weights for policy 0, policy_version 17580 (0.0010) [2023-10-14 01:44:17,249][33226] Updated weights for policy 1, policy_version 17740 (0.0007) [2023-10-14 01:44:17,308][33201] Updated weights for policy 0, policy_version 17590 (0.0007) [2023-10-14 01:44:17,622][33226] Updated weights for policy 1, policy_version 17750 (0.0008) [2023-10-14 01:44:17,669][33201] Updated weights for policy 0, policy_version 17600 (0.0009) [2023-10-14 01:44:17,987][33226] Updated weights for policy 1, policy_version 17760 (0.0009) [2023-10-14 01:44:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 36208640. Throughput: 0: 1746.0, 1: 1784.2. Samples: 9057526. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-14 01:44:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.580')] [2023-10-14 01:44:21,439][33201] Updated weights for policy 0, policy_version 17610 (0.0008) [2023-10-14 01:44:21,815][33201] Updated weights for policy 0, policy_version 17620 (0.0009) [2023-10-14 01:44:21,877][33226] Updated weights for policy 1, policy_version 17770 (0.0009) [2023-10-14 01:44:22,191][33201] Updated weights for policy 0, policy_version 17630 (0.0009) [2023-10-14 01:44:22,246][33226] Updated weights for policy 1, policy_version 17780 (0.0007) [2023-10-14 01:44:22,621][33226] Updated weights for policy 1, policy_version 17790 (0.0009) [2023-10-14 01:44:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 36274176. Throughput: 0: 1750.7, 1: 1771.8. Samples: 9079286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 33.0) [2023-10-14 01:44:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.600')] [2023-10-14 01:44:25,979][33201] Updated weights for policy 0, policy_version 17640 (0.0009) [2023-10-14 01:44:26,350][33201] Updated weights for policy 0, policy_version 17650 (0.0007) [2023-10-14 01:44:26,459][33226] Updated weights for policy 1, policy_version 17800 (0.0009) [2023-10-14 01:44:26,710][33201] Updated weights for policy 0, policy_version 17660 (0.0008) [2023-10-14 01:44:26,820][33226] Updated weights for policy 1, policy_version 17810 (0.0010) [2023-10-14 01:44:27,182][33226] Updated weights for policy 1, policy_version 17820 (0.0011) [2023-10-14 01:44:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 36339712. Throughput: 0: 1745.4, 1: 1783.4. Samples: 9089256. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.600')] [2023-10-14 01:44:30,474][33201] Updated weights for policy 0, policy_version 17670 (0.0008) [2023-10-14 01:44:30,842][33201] Updated weights for policy 0, policy_version 17680 (0.0007) [2023-10-14 01:44:31,003][33226] Updated weights for policy 1, policy_version 17830 (0.0009) [2023-10-14 01:44:31,218][33201] Updated weights for policy 0, policy_version 17690 (0.0007) [2023-10-14 01:44:31,374][33226] Updated weights for policy 1, policy_version 17840 (0.0009) [2023-10-14 01:44:31,738][33226] Updated weights for policy 1, policy_version 17850 (0.0010) [2023-10-14 01:44:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 36405248. Throughput: 0: 1745.8, 1: 1769.6. Samples: 9110830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.610')] [2023-10-14 01:44:35,106][33201] Updated weights for policy 0, policy_version 17700 (0.0009) [2023-10-14 01:44:35,447][33226] Updated weights for policy 1, policy_version 17860 (0.0008) [2023-10-14 01:44:35,488][33201] Updated weights for policy 0, policy_version 17710 (0.0008) [2023-10-14 01:44:35,826][33226] Updated weights for policy 1, policy_version 17870 (0.0007) [2023-10-14 01:44:35,856][33201] Updated weights for policy 0, policy_version 17720 (0.0008) [2023-10-14 01:44:36,189][33226] Updated weights for policy 1, policy_version 17880 (0.0009) [2023-10-14 01:44:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 36470784. Throughput: 0: 1772.0, 1: 1774.0. Samples: 9133158. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:39,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.630')] [2023-10-14 01:44:39,843][33201] Updated weights for policy 0, policy_version 17730 (0.0009) [2023-10-14 01:44:39,939][33226] Updated weights for policy 1, policy_version 17890 (0.0007) [2023-10-14 01:44:40,214][33201] Updated weights for policy 0, policy_version 17740 (0.0007) [2023-10-14 01:44:40,308][33226] Updated weights for policy 1, policy_version 17900 (0.0008) [2023-10-14 01:44:40,590][33201] Updated weights for policy 0, policy_version 17750 (0.0008) [2023-10-14 01:44:40,675][33226] Updated weights for policy 1, policy_version 17910 (0.0008) [2023-10-14 01:44:40,969][33201] Updated weights for policy 0, policy_version 17760 (0.0007) [2023-10-14 01:44:41,044][33226] Updated weights for policy 1, policy_version 17920 (0.0007) [2023-10-14 01:44:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 36536320. Throughput: 0: 1752.0, 1: 1767.7. Samples: 9142754. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 01:44:44,790][33201] Updated weights for policy 0, policy_version 17770 (0.0008) [2023-10-14 01:44:44,965][33226] Updated weights for policy 1, policy_version 17930 (0.0008) [2023-10-14 01:44:45,169][33201] Updated weights for policy 0, policy_version 17780 (0.0007) [2023-10-14 01:44:45,339][33226] Updated weights for policy 1, policy_version 17940 (0.0008) [2023-10-14 01:44:45,527][33201] Updated weights for policy 0, policy_version 17790 (0.0007) [2023-10-14 01:44:45,703][33226] Updated weights for policy 1, policy_version 17950 (0.0008) [2023-10-14 01:44:49,461][33201] Updated weights for policy 0, policy_version 17800 (0.0007) [2023-10-14 01:44:49,483][33226] Updated weights for policy 1, policy_version 17960 (0.0007) [2023-10-14 01:44:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 36601856. Throughput: 0: 1761.8, 1: 1768.3. Samples: 9164438. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.700')] [2023-10-14 01:44:49,826][33201] Updated weights for policy 0, policy_version 17810 (0.0008) [2023-10-14 01:44:49,855][33226] Updated weights for policy 1, policy_version 17970 (0.0009) [2023-10-14 01:44:50,192][33201] Updated weights for policy 0, policy_version 17820 (0.0009) [2023-10-14 01:44:50,221][33226] Updated weights for policy 1, policy_version 17980 (0.0007) [2023-10-14 01:44:54,079][33201] Updated weights for policy 0, policy_version 17830 (0.0007) [2023-10-14 01:44:54,123][33226] Updated weights for policy 1, policy_version 17990 (0.0007) [2023-10-14 01:44:54,459][33201] Updated weights for policy 0, policy_version 17840 (0.0007) [2023-10-14 01:44:54,489][33226] Updated weights for policy 1, policy_version 18000 (0.0007) [2023-10-14 01:44:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 36667392. Throughput: 0: 1776.3, 1: 1794.6. Samples: 9186060. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:44:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.730')] [2023-10-14 01:44:54,815][33201] Updated weights for policy 0, policy_version 17850 (0.0009) [2023-10-14 01:44:54,849][33226] Updated weights for policy 1, policy_version 18010 (0.0009) [2023-10-14 01:44:58,574][33201] Updated weights for policy 0, policy_version 17860 (0.0008) [2023-10-14 01:44:58,625][33226] Updated weights for policy 1, policy_version 18020 (0.0008) [2023-10-14 01:44:58,939][33201] Updated weights for policy 0, policy_version 17870 (0.0009) [2023-10-14 01:44:59,034][33226] Updated weights for policy 1, policy_version 18030 (0.0008) [2023-10-14 01:44:59,314][33201] Updated weights for policy 0, policy_version 17880 (0.0009) [2023-10-14 01:44:59,389][33226] Updated weights for policy 1, policy_version 18040 (0.0009) [2023-10-14 01:44:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 14106.9). Total num frames: 36732928. Throughput: 0: 1756.5, 1: 1765.9. Samples: 9196042. 
Policy #0 lag: (min: 1.0, avg: 7.4, max: 33.0) [2023-10-14 01:44:59,560][31953] Avg episode reward: [(0, '20.950'), (1, '20.740')] [2023-10-14 01:45:02,996][33226] Updated weights for policy 1, policy_version 18050 (0.0008) [2023-10-14 01:45:03,234][33201] Updated weights for policy 0, policy_version 17890 (0.0008) [2023-10-14 01:45:03,361][33226] Updated weights for policy 1, policy_version 18060 (0.0007) [2023-10-14 01:45:03,605][33201] Updated weights for policy 0, policy_version 17900 (0.0008) [2023-10-14 01:45:03,723][33226] Updated weights for policy 1, policy_version 18070 (0.0008) [2023-10-14 01:45:03,986][33201] Updated weights for policy 0, policy_version 17910 (0.0009) [2023-10-14 01:45:04,093][33226] Updated weights for policy 1, policy_version 18080 (0.0008) [2023-10-14 01:45:04,351][33201] Updated weights for policy 0, policy_version 17920 (0.0009) [2023-10-14 01:45:04,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 36864000. Throughput: 0: 1774.3, 1: 1792.5. Samples: 9218036. Policy #0 lag: (min: 1.0, avg: 7.4, max: 33.0) [2023-10-14 01:45:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.790')] [2023-10-14 01:45:07,891][33226] Updated weights for policy 1, policy_version 18090 (0.0009) [2023-10-14 01:45:08,253][33226] Updated weights for policy 1, policy_version 18100 (0.0008) [2023-10-14 01:45:08,301][33201] Updated weights for policy 0, policy_version 17930 (0.0008) [2023-10-14 01:45:08,629][33226] Updated weights for policy 1, policy_version 18110 (0.0007) [2023-10-14 01:45:08,676][33201] Updated weights for policy 0, policy_version 17940 (0.0007) [2023-10-14 01:45:09,048][33201] Updated weights for policy 0, policy_version 17950 (0.0007) [2023-10-14 01:45:09,557][31953] Fps is (10 sec: 19661.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 36929536. Throughput: 0: 1739.1, 1: 1770.5. Samples: 9237220. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 01:45:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.810')] [2023-10-14 01:45:12,516][33226] Updated weights for policy 1, policy_version 18120 (0.0009) [2023-10-14 01:45:12,880][33226] Updated weights for policy 1, policy_version 18130 (0.0008) [2023-10-14 01:45:12,935][33201] Updated weights for policy 0, policy_version 17960 (0.0007) [2023-10-14 01:45:13,246][33226] Updated weights for policy 1, policy_version 18140 (0.0008) [2023-10-14 01:45:13,302][33201] Updated weights for policy 0, policy_version 17970 (0.0009) [2023-10-14 01:45:13,679][33201] Updated weights for policy 0, policy_version 17980 (0.0008) [2023-10-14 01:45:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 36995072. Throughput: 0: 1770.6, 1: 1791.0. Samples: 9249528. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 01:45:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.830')] [2023-10-14 01:45:17,015][33226] Updated weights for policy 1, policy_version 18150 (0.0009) [2023-10-14 01:45:17,381][33226] Updated weights for policy 1, policy_version 18160 (0.0008) [2023-10-14 01:45:17,414][33201] Updated weights for policy 0, policy_version 17990 (0.0008) [2023-10-14 01:45:17,746][33226] Updated weights for policy 1, policy_version 18170 (0.0007) [2023-10-14 01:45:17,789][33201] Updated weights for policy 0, policy_version 18000 (0.0007) [2023-10-14 01:45:18,156][33201] Updated weights for policy 0, policy_version 18010 (0.0007) [2023-10-14 01:45:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 37060608. Throughput: 0: 1753.5, 1: 1773.0. Samples: 9269522. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 01:45:19,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.870')] [2023-10-14 01:45:21,634][33226] Updated weights for policy 1, policy_version 18180 (0.0009) [2023-10-14 01:45:21,972][33201] Updated weights for policy 0, policy_version 18020 (0.0009) [2023-10-14 01:45:22,006][33226] Updated weights for policy 1, policy_version 18190 (0.0009) [2023-10-14 01:45:22,364][33201] Updated weights for policy 0, policy_version 18030 (0.0007) [2023-10-14 01:45:22,380][33226] Updated weights for policy 1, policy_version 18200 (0.0008) [2023-10-14 01:45:22,740][33201] Updated weights for policy 0, policy_version 18040 (0.0009) [2023-10-14 01:45:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37126144. Throughput: 0: 1738.7, 1: 1763.7. Samples: 9290764. Policy #0 lag: (min: 9.0, avg: 17.6, max: 41.0) [2023-10-14 01:45:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 01:45:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000018208_18644992.pth... [2023-10-14 01:45:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000018048_18481152.pth... 
[2023-10-14 01:45:24,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000016544_16941056.pth [2023-10-14 01:45:24,615][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000016416_16809984.pth [2023-10-14 01:45:26,112][33226] Updated weights for policy 1, policy_version 18210 (0.0008) [2023-10-14 01:45:26,457][33201] Updated weights for policy 0, policy_version 18050 (0.0009) [2023-10-14 01:45:26,475][33226] Updated weights for policy 1, policy_version 18220 (0.0009) [2023-10-14 01:45:26,828][33201] Updated weights for policy 0, policy_version 18060 (0.0008) [2023-10-14 01:45:26,853][33226] Updated weights for policy 1, policy_version 18230 (0.0007) [2023-10-14 01:45:27,196][33201] Updated weights for policy 0, policy_version 18070 (0.0009) [2023-10-14 01:45:27,226][33226] Updated weights for policy 1, policy_version 18240 (0.0008) [2023-10-14 01:45:27,568][33201] Updated weights for policy 0, policy_version 18080 (0.0008) [2023-10-14 01:45:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37191680. Throughput: 0: 1756.3, 1: 1770.7. Samples: 9301466. Policy #0 lag: (min: 9.0, avg: 17.6, max: 41.0) [2023-10-14 01:45:29,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 01:45:30,858][33226] Updated weights for policy 1, policy_version 18250 (0.0011) [2023-10-14 01:45:31,233][33226] Updated weights for policy 1, policy_version 18260 (0.0009) [2023-10-14 01:45:31,361][33201] Updated weights for policy 0, policy_version 18090 (0.0008) [2023-10-14 01:45:31,603][33226] Updated weights for policy 1, policy_version 18270 (0.0010) [2023-10-14 01:45:31,730][33201] Updated weights for policy 0, policy_version 18100 (0.0007) [2023-10-14 01:45:32,105][33201] Updated weights for policy 0, policy_version 18110 (0.0009) [2023-10-14 01:45:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37257216. 
Throughput: 0: 1747.2, 1: 1765.6. Samples: 9322516. Policy #0 lag: (min: 9.0, avg: 17.6, max: 41.0) [2023-10-14 01:45:34,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 01:45:35,476][33226] Updated weights for policy 1, policy_version 18280 (0.0008) [2023-10-14 01:45:35,845][33226] Updated weights for policy 1, policy_version 18290 (0.0007) [2023-10-14 01:45:35,973][33201] Updated weights for policy 0, policy_version 18120 (0.0008) [2023-10-14 01:45:36,217][33226] Updated weights for policy 1, policy_version 18300 (0.0007) [2023-10-14 01:45:36,348][33201] Updated weights for policy 0, policy_version 18130 (0.0008) [2023-10-14 01:45:36,716][33201] Updated weights for policy 0, policy_version 18140 (0.0009) [2023-10-14 01:45:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 37322752. Throughput: 0: 1753.9, 1: 1770.0. Samples: 9344638. Policy #0 lag: (min: 31.0, avg: 31.4, max: 46.0) [2023-10-14 01:45:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.910')] [2023-10-14 01:45:39,942][33226] Updated weights for policy 1, policy_version 18310 (0.0009) [2023-10-14 01:45:40,310][33226] Updated weights for policy 1, policy_version 18320 (0.0008) [2023-10-14 01:45:40,561][33201] Updated weights for policy 0, policy_version 18150 (0.0009) [2023-10-14 01:45:40,674][33226] Updated weights for policy 1, policy_version 18330 (0.0009) [2023-10-14 01:45:40,940][33201] Updated weights for policy 0, policy_version 18160 (0.0009) [2023-10-14 01:45:41,309][33201] Updated weights for policy 0, policy_version 18170 (0.0009) [2023-10-14 01:45:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 37388288. Throughput: 0: 1749.9, 1: 1764.3. Samples: 9354182. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 46.0) [2023-10-14 01:45:44,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.910')] [2023-10-14 01:45:44,701][33226] Updated weights for policy 1, policy_version 18340 (0.0009) [2023-10-14 01:45:45,104][33226] Updated weights for policy 1, policy_version 18350 (0.0009) [2023-10-14 01:45:45,277][33201] Updated weights for policy 0, policy_version 18180 (0.0009) [2023-10-14 01:45:45,475][33226] Updated weights for policy 1, policy_version 18360 (0.0009) [2023-10-14 01:45:45,646][33201] Updated weights for policy 0, policy_version 18190 (0.0008) [2023-10-14 01:45:46,013][33201] Updated weights for policy 0, policy_version 18200 (0.0008) [2023-10-14 01:45:49,208][33226] Updated weights for policy 1, policy_version 18370 (0.0009) [2023-10-14 01:45:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 37453824. Throughput: 0: 1746.8, 1: 1759.5. Samples: 9375818. Policy #0 lag: (min: 31.0, avg: 31.4, max: 46.0) [2023-10-14 01:45:49,557][31953] Avg episode reward: [(0, '20.810'), (1, '20.940')] [2023-10-14 01:45:49,575][33226] Updated weights for policy 1, policy_version 18380 (0.0007) [2023-10-14 01:45:49,889][33201] Updated weights for policy 0, policy_version 18210 (0.0009) [2023-10-14 01:45:49,937][33226] Updated weights for policy 1, policy_version 18390 (0.0007) [2023-10-14 01:45:50,257][33201] Updated weights for policy 0, policy_version 18220 (0.0008) [2023-10-14 01:45:50,305][33226] Updated weights for policy 1, policy_version 18400 (0.0008) [2023-10-14 01:45:50,628][33201] Updated weights for policy 0, policy_version 18230 (0.0008) [2023-10-14 01:45:50,993][33201] Updated weights for policy 0, policy_version 18240 (0.0007) [2023-10-14 01:45:54,080][33226] Updated weights for policy 1, policy_version 18410 (0.0008) [2023-10-14 01:45:54,448][33226] Updated weights for policy 1, policy_version 18420 (0.0009) [2023-10-14 01:45:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 
14199.5, 300 sec: 14106.9). Total num frames: 37519360. Throughput: 0: 1780.0, 1: 1783.7. Samples: 9397588. Policy #0 lag: (min: 9.0, avg: 29.8, max: 32.0) [2023-10-14 01:45:54,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.940')] [2023-10-14 01:45:54,747][33201] Updated weights for policy 0, policy_version 18250 (0.0007) [2023-10-14 01:45:54,822][33226] Updated weights for policy 1, policy_version 18430 (0.0007) [2023-10-14 01:45:55,114][33201] Updated weights for policy 0, policy_version 18260 (0.0007) [2023-10-14 01:45:55,488][33201] Updated weights for policy 0, policy_version 18270 (0.0007) [2023-10-14 01:45:58,497][33226] Updated weights for policy 1, policy_version 18440 (0.0008) [2023-10-14 01:45:58,866][33226] Updated weights for policy 1, policy_version 18450 (0.0008) [2023-10-14 01:45:59,238][33226] Updated weights for policy 1, policy_version 18460 (0.0008) [2023-10-14 01:45:59,282][33201] Updated weights for policy 0, policy_version 18280 (0.0007) [2023-10-14 01:45:59,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 37617664. Throughput: 0: 1750.1, 1: 1762.1. Samples: 9407578. 
Policy #0 lag: (min: 9.0, avg: 29.8, max: 32.0) [2023-10-14 01:45:59,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.920')] [2023-10-14 01:45:59,657][33201] Updated weights for policy 0, policy_version 18290 (0.0010) [2023-10-14 01:46:00,033][33201] Updated weights for policy 0, policy_version 18300 (0.0009) [2023-10-14 01:46:03,166][33226] Updated weights for policy 1, policy_version 18470 (0.0008) [2023-10-14 01:46:03,532][33226] Updated weights for policy 1, policy_version 18480 (0.0008) [2023-10-14 01:46:03,902][33226] Updated weights for policy 1, policy_version 18490 (0.0008) [2023-10-14 01:46:03,958][33201] Updated weights for policy 0, policy_version 18310 (0.0007) [2023-10-14 01:46:04,327][33201] Updated weights for policy 0, policy_version 18320 (0.0009) [2023-10-14 01:46:04,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 37683200. Throughput: 0: 1765.1, 1: 1788.9. Samples: 9429450. Policy #0 lag: (min: 9.0, avg: 29.8, max: 32.0) [2023-10-14 01:46:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.920')] [2023-10-14 01:46:04,708][33201] Updated weights for policy 0, policy_version 18330 (0.0007) [2023-10-14 01:46:07,624][33226] Updated weights for policy 1, policy_version 18500 (0.0007) [2023-10-14 01:46:07,987][33226] Updated weights for policy 1, policy_version 18510 (0.0009) [2023-10-14 01:46:08,358][33226] Updated weights for policy 1, policy_version 18520 (0.0007) [2023-10-14 01:46:08,440][33201] Updated weights for policy 0, policy_version 18340 (0.0007) [2023-10-14 01:46:08,835][33201] Updated weights for policy 0, policy_version 18350 (0.0007) [2023-10-14 01:46:09,201][33201] Updated weights for policy 0, policy_version 18360 (0.0007) [2023-10-14 01:46:09,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37781504. Throughput: 0: 1764.5, 1: 1767.9. Samples: 9449720. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:46:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 01:46:12,163][33226] Updated weights for policy 1, policy_version 18530 (0.0008) [2023-10-14 01:46:12,519][33226] Updated weights for policy 1, policy_version 18540 (0.0008) [2023-10-14 01:46:12,895][33226] Updated weights for policy 1, policy_version 18550 (0.0008) [2023-10-14 01:46:12,935][33201] Updated weights for policy 0, policy_version 18370 (0.0009) [2023-10-14 01:46:13,257][33226] Updated weights for policy 1, policy_version 18560 (0.0007) [2023-10-14 01:46:13,303][33201] Updated weights for policy 0, policy_version 18380 (0.0008) [2023-10-14 01:46:13,678][33201] Updated weights for policy 0, policy_version 18390 (0.0007) [2023-10-14 01:46:14,041][33201] Updated weights for policy 0, policy_version 18400 (0.0009) [2023-10-14 01:46:14,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37847040. Throughput: 0: 1766.6, 1: 1792.8. Samples: 9461640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:46:14,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.940')] [2023-10-14 01:46:17,064][33226] Updated weights for policy 1, policy_version 18570 (0.0008) [2023-10-14 01:46:17,439][33226] Updated weights for policy 1, policy_version 18580 (0.0008) [2023-10-14 01:46:17,805][33226] Updated weights for policy 1, policy_version 18590 (0.0007) [2023-10-14 01:46:18,068][33201] Updated weights for policy 0, policy_version 18410 (0.0009) [2023-10-14 01:46:18,434][33201] Updated weights for policy 0, policy_version 18420 (0.0008) [2023-10-14 01:46:18,812][33201] Updated weights for policy 0, policy_version 18430 (0.0009) [2023-10-14 01:46:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 37912576. Throughput: 0: 1772.0, 1: 1766.6. Samples: 9481752. 
Policy #0 lag: (min: 23.0, avg: 23.6, max: 39.0) [2023-10-14 01:46:19,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.940')] [2023-10-14 01:46:21,536][33226] Updated weights for policy 1, policy_version 18600 (0.0008) [2023-10-14 01:46:21,911][33226] Updated weights for policy 1, policy_version 18610 (0.0010) [2023-10-14 01:46:22,286][33226] Updated weights for policy 1, policy_version 18620 (0.0010) [2023-10-14 01:46:22,774][33201] Updated weights for policy 0, policy_version 18440 (0.0009) [2023-10-14 01:46:23,155][33201] Updated weights for policy 0, policy_version 18450 (0.0007) [2023-10-14 01:46:23,531][33201] Updated weights for policy 0, policy_version 18460 (0.0009) [2023-10-14 01:46:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 37978112. Throughput: 0: 1746.2, 1: 1767.7. Samples: 9502766. Policy #0 lag: (min: 23.0, avg: 23.6, max: 39.0) [2023-10-14 01:46:24,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.870')] [2023-10-14 01:46:26,028][33226] Updated weights for policy 1, policy_version 18630 (0.0008) [2023-10-14 01:46:26,407][33226] Updated weights for policy 1, policy_version 18640 (0.0008) [2023-10-14 01:46:26,775][33226] Updated weights for policy 1, policy_version 18650 (0.0008) [2023-10-14 01:46:27,332][33201] Updated weights for policy 0, policy_version 18470 (0.0010) [2023-10-14 01:46:27,698][33201] Updated weights for policy 0, policy_version 18480 (0.0009) [2023-10-14 01:46:28,073][33201] Updated weights for policy 0, policy_version 18490 (0.0008) [2023-10-14 01:46:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 38043648. Throughput: 0: 1778.6, 1: 1773.6. Samples: 9514032. 
Policy #0 lag: (min: 23.0, avg: 23.6, max: 39.0) [2023-10-14 01:46:29,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.870')] [2023-10-14 01:46:30,675][33226] Updated weights for policy 1, policy_version 18660 (0.0009) [2023-10-14 01:46:31,053][33226] Updated weights for policy 1, policy_version 18670 (0.0008) [2023-10-14 01:46:31,408][33226] Updated weights for policy 1, policy_version 18680 (0.0009) [2023-10-14 01:46:31,765][33201] Updated weights for policy 0, policy_version 18500 (0.0007) [2023-10-14 01:46:32,143][33201] Updated weights for policy 0, policy_version 18510 (0.0008) [2023-10-14 01:46:32,512][33201] Updated weights for policy 0, policy_version 18520 (0.0007) [2023-10-14 01:46:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 38109184. Throughput: 0: 1750.9, 1: 1777.1. Samples: 9534578. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) [2023-10-14 01:46:34,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.870')] [2023-10-14 01:46:35,111][33226] Updated weights for policy 1, policy_version 18690 (0.0008) [2023-10-14 01:46:35,476][33226] Updated weights for policy 1, policy_version 18700 (0.0008) [2023-10-14 01:46:35,847][33226] Updated weights for policy 1, policy_version 18710 (0.0009) [2023-10-14 01:46:36,216][33226] Updated weights for policy 1, policy_version 18720 (0.0009) [2023-10-14 01:46:36,419][33201] Updated weights for policy 0, policy_version 18530 (0.0009) [2023-10-14 01:46:36,794][33201] Updated weights for policy 0, policy_version 18540 (0.0010) [2023-10-14 01:46:37,158][33201] Updated weights for policy 0, policy_version 18550 (0.0009) [2023-10-14 01:46:37,530][33201] Updated weights for policy 0, policy_version 18560 (0.0010) [2023-10-14 01:46:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 38174720. Throughput: 0: 1746.8, 1: 1788.4. Samples: 9556672. 
Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) [2023-10-14 01:46:39,557][31953] Avg episode reward: [(0, '20.750'), (1, '20.870')] [2023-10-14 01:46:39,999][33226] Updated weights for policy 1, policy_version 18730 (0.0009) [2023-10-14 01:46:40,364][33226] Updated weights for policy 1, policy_version 18740 (0.0009) [2023-10-14 01:46:40,739][33226] Updated weights for policy 1, policy_version 18750 (0.0008) [2023-10-14 01:46:41,455][33201] Updated weights for policy 0, policy_version 18570 (0.0008) [2023-10-14 01:46:41,829][33201] Updated weights for policy 0, policy_version 18580 (0.0008) [2023-10-14 01:46:42,202][33201] Updated weights for policy 0, policy_version 18590 (0.0007) [2023-10-14 01:46:44,543][33226] Updated weights for policy 1, policy_version 18760 (0.0010) [2023-10-14 01:46:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 38240256. Throughput: 0: 1757.1, 1: 1780.6. Samples: 9566776. Policy #0 lag: (min: 24.0, avg: 48.3, max: 56.0) [2023-10-14 01:46:44,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.870')] [2023-10-14 01:46:44,911][33226] Updated weights for policy 1, policy_version 18770 (0.0011) [2023-10-14 01:46:45,289][33226] Updated weights for policy 1, policy_version 18780 (0.0009) [2023-10-14 01:46:46,001][33201] Updated weights for policy 0, policy_version 18600 (0.0008) [2023-10-14 01:46:46,369][33201] Updated weights for policy 0, policy_version 18610 (0.0009) [2023-10-14 01:46:46,737][33201] Updated weights for policy 0, policy_version 18620 (0.0009) [2023-10-14 01:46:49,013][33226] Updated weights for policy 1, policy_version 18790 (0.0007) [2023-10-14 01:46:49,366][33226] Updated weights for policy 1, policy_version 18800 (0.0009) [2023-10-14 01:46:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 38305792. Throughput: 0: 1759.4, 1: 1781.8. Samples: 9588802. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 01:46:49,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.860')] [2023-10-14 01:46:49,737][33226] Updated weights for policy 1, policy_version 18810 (0.0009) [2023-10-14 01:46:50,441][33201] Updated weights for policy 0, policy_version 18630 (0.0009) [2023-10-14 01:46:50,810][33201] Updated weights for policy 0, policy_version 18640 (0.0008) [2023-10-14 01:46:51,184][33201] Updated weights for policy 0, policy_version 18650 (0.0010) [2023-10-14 01:46:53,597][33226] Updated weights for policy 1, policy_version 18820 (0.0010) [2023-10-14 01:46:53,958][33226] Updated weights for policy 1, policy_version 18830 (0.0011) [2023-10-14 01:46:54,329][33226] Updated weights for policy 1, policy_version 18840 (0.0008) [2023-10-14 01:46:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 38371328. Throughput: 0: 1777.5, 1: 1793.5. Samples: 9610414. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 01:46:54,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.860')] [2023-10-14 01:46:54,922][33201] Updated weights for policy 0, policy_version 18660 (0.0007) [2023-10-14 01:46:55,322][33201] Updated weights for policy 0, policy_version 18670 (0.0007) [2023-10-14 01:46:55,691][33201] Updated weights for policy 0, policy_version 18680 (0.0007) [2023-10-14 01:46:58,096][33226] Updated weights for policy 1, policy_version 18850 (0.0009) [2023-10-14 01:46:58,459][33226] Updated weights for policy 1, policy_version 18860 (0.0009) [2023-10-14 01:46:58,820][33226] Updated weights for policy 1, policy_version 18870 (0.0010) [2023-10-14 01:46:59,190][33226] Updated weights for policy 1, policy_version 18880 (0.0010) [2023-10-14 01:46:59,407][33201] Updated weights for policy 0, policy_version 18690 (0.0007) [2023-10-14 01:46:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 38469632. Throughput: 0: 1758.6, 1: 1777.6. 
Samples: 9620770. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 01:46:59,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.850')] [2023-10-14 01:46:59,788][33201] Updated weights for policy 0, policy_version 18700 (0.0008) [2023-10-14 01:47:00,153][33201] Updated weights for policy 0, policy_version 18710 (0.0009) [2023-10-14 01:47:00,525][33201] Updated weights for policy 0, policy_version 18720 (0.0008) [2023-10-14 01:47:03,131][33226] Updated weights for policy 1, policy_version 18890 (0.0009) [2023-10-14 01:47:03,498][33226] Updated weights for policy 1, policy_version 18900 (0.0008) [2023-10-14 01:47:03,869][33226] Updated weights for policy 1, policy_version 18910 (0.0009) [2023-10-14 01:47:04,316][33201] Updated weights for policy 0, policy_version 18730 (0.0010) [2023-10-14 01:47:04,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 38535168. Throughput: 0: 1771.6, 1: 1801.8. Samples: 9642556. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 01:47:04,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.850')] [2023-10-14 01:47:04,695][33201] Updated weights for policy 0, policy_version 18740 (0.0009) [2023-10-14 01:47:05,073][33201] Updated weights for policy 0, policy_version 18750 (0.0009) [2023-10-14 01:47:07,526][33226] Updated weights for policy 1, policy_version 18920 (0.0010) [2023-10-14 01:47:07,895][33226] Updated weights for policy 1, policy_version 18930 (0.0010) [2023-10-14 01:47:08,266][33226] Updated weights for policy 1, policy_version 18940 (0.0009) [2023-10-14 01:47:08,778][33201] Updated weights for policy 0, policy_version 18760 (0.0009) [2023-10-14 01:47:09,147][33201] Updated weights for policy 0, policy_version 18770 (0.0009) [2023-10-14 01:47:09,513][33201] Updated weights for policy 0, policy_version 18780 (0.0010) [2023-10-14 01:47:09,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 38600704. 
Throughput: 0: 1784.5, 1: 1777.7. Samples: 9663066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:47:09,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.850')] [2023-10-14 01:47:11,951][33226] Updated weights for policy 1, policy_version 18950 (0.0008) [2023-10-14 01:47:12,316][33226] Updated weights for policy 1, policy_version 18960 (0.0008) [2023-10-14 01:47:12,685][33226] Updated weights for policy 1, policy_version 18970 (0.0009) [2023-10-14 01:47:13,355][33201] Updated weights for policy 0, policy_version 18790 (0.0009) [2023-10-14 01:47:13,735][33201] Updated weights for policy 0, policy_version 18800 (0.0009) [2023-10-14 01:47:14,101][33201] Updated weights for policy 0, policy_version 18810 (0.0007) [2023-10-14 01:47:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 38699008. Throughput: 0: 1766.1, 1: 1805.0. Samples: 9674730. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:47:14,557][31953] Avg episode reward: [(0, '20.640'), (1, '20.820')] [2023-10-14 01:47:16,561][33226] Updated weights for policy 1, policy_version 18980 (0.0008) [2023-10-14 01:47:16,963][33226] Updated weights for policy 1, policy_version 18990 (0.0009) [2023-10-14 01:47:17,327][33226] Updated weights for policy 1, policy_version 19000 (0.0007) [2023-10-14 01:47:17,989][33201] Updated weights for policy 0, policy_version 18820 (0.0009) [2023-10-14 01:47:18,365][33201] Updated weights for policy 0, policy_version 18830 (0.0010) [2023-10-14 01:47:18,740][33201] Updated weights for policy 0, policy_version 18840 (0.0007) [2023-10-14 01:47:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 38764544. Throughput: 0: 1792.4, 1: 1777.2. Samples: 9695210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:47:19,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.830')] [2023-10-14 01:47:20,933][33226] Updated weights for policy 1, policy_version 19010 (0.0008) [2023-10-14 01:47:21,306][33226] Updated weights for policy 1, policy_version 19020 (0.0008) [2023-10-14 01:47:21,678][33226] Updated weights for policy 1, policy_version 19030 (0.0007) [2023-10-14 01:47:22,036][33226] Updated weights for policy 1, policy_version 19040 (0.0008) [2023-10-14 01:47:22,674][33201] Updated weights for policy 0, policy_version 18850 (0.0008) [2023-10-14 01:47:23,033][33201] Updated weights for policy 0, policy_version 18860 (0.0007) [2023-10-14 01:47:23,416][33201] Updated weights for policy 0, policy_version 18870 (0.0007) [2023-10-14 01:47:23,784][33201] Updated weights for policy 0, policy_version 18880 (0.0009) [2023-10-14 01:47:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 38830080. Throughput: 0: 1766.9, 1: 1780.3. Samples: 9716294. Policy #0 lag: (min: 23.0, avg: 28.1, max: 55.0) [2023-10-14 01:47:24,559][31953] Avg episode reward: [(0, '20.630'), (1, '20.840')] [2023-10-14 01:47:24,571][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000018880_19333120.pth... [2023-10-14 01:47:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000019040_19496960.pth... 
[2023-10-14 01:47:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000017216_17629184.pth [2023-10-14 01:47:24,609][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000017376_17793024.pth [2023-10-14 01:47:25,825][33226] Updated weights for policy 1, policy_version 19050 (0.0007) [2023-10-14 01:47:26,193][33226] Updated weights for policy 1, policy_version 19060 (0.0008) [2023-10-14 01:47:26,562][33226] Updated weights for policy 1, policy_version 19070 (0.0009) [2023-10-14 01:47:27,661][33201] Updated weights for policy 0, policy_version 18890 (0.0010) [2023-10-14 01:47:28,036][33201] Updated weights for policy 0, policy_version 18900 (0.0007) [2023-10-14 01:47:28,395][33201] Updated weights for policy 0, policy_version 18910 (0.0007) [2023-10-14 01:47:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 38895616. Throughput: 0: 1789.2, 1: 1779.0. Samples: 9727346. Policy #0 lag: (min: 23.0, avg: 28.1, max: 55.0) [2023-10-14 01:47:29,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.840')] [2023-10-14 01:47:30,378][33226] Updated weights for policy 1, policy_version 19080 (0.0009) [2023-10-14 01:47:30,754][33226] Updated weights for policy 1, policy_version 19090 (0.0008) [2023-10-14 01:47:31,119][33226] Updated weights for policy 1, policy_version 19100 (0.0008) [2023-10-14 01:47:32,189][33201] Updated weights for policy 0, policy_version 18920 (0.0009) [2023-10-14 01:47:32,565][33201] Updated weights for policy 0, policy_version 18930 (0.0008) [2023-10-14 01:47:32,935][33201] Updated weights for policy 0, policy_version 18940 (0.0007) [2023-10-14 01:47:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 38961152. Throughput: 0: 1760.2, 1: 1780.7. Samples: 9748142. 
Policy #0 lag: (min: 23.0, avg: 28.1, max: 55.0) [2023-10-14 01:47:34,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.840')] [2023-10-14 01:47:34,931][33226] Updated weights for policy 1, policy_version 19110 (0.0008) [2023-10-14 01:47:35,296][33226] Updated weights for policy 1, policy_version 19120 (0.0008) [2023-10-14 01:47:35,672][33226] Updated weights for policy 1, policy_version 19130 (0.0009) [2023-10-14 01:47:36,664][33201] Updated weights for policy 0, policy_version 18950 (0.0008) [2023-10-14 01:47:37,033][33201] Updated weights for policy 0, policy_version 18960 (0.0010) [2023-10-14 01:47:37,405][33201] Updated weights for policy 0, policy_version 18970 (0.0010) [2023-10-14 01:47:39,428][33226] Updated weights for policy 1, policy_version 19140 (0.0009) [2023-10-14 01:47:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 39026688. Throughput: 0: 1754.7, 1: 1797.5. Samples: 9770262. Policy #0 lag: (min: 23.0, avg: 28.1, max: 55.0) [2023-10-14 01:47:39,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.840')] [2023-10-14 01:47:39,795][33226] Updated weights for policy 1, policy_version 19150 (0.0008) [2023-10-14 01:47:40,162][33226] Updated weights for policy 1, policy_version 19160 (0.0010) [2023-10-14 01:47:41,309][33201] Updated weights for policy 0, policy_version 18980 (0.0009) [2023-10-14 01:47:41,706][33201] Updated weights for policy 0, policy_version 18990 (0.0007) [2023-10-14 01:47:42,070][33201] Updated weights for policy 0, policy_version 19000 (0.0009) [2023-10-14 01:47:43,800][33226] Updated weights for policy 1, policy_version 19170 (0.0009) [2023-10-14 01:47:44,179][33226] Updated weights for policy 1, policy_version 19180 (0.0009) [2023-10-14 01:47:44,539][33226] Updated weights for policy 1, policy_version 19190 (0.0008) [2023-10-14 01:47:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 39092224. Throughput: 0: 1761.1, 1: 1784.3. 
Samples: 9780316. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:47:44,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.880')] [2023-10-14 01:47:44,910][33226] Updated weights for policy 1, policy_version 19200 (0.0009) [2023-10-14 01:47:45,859][33201] Updated weights for policy 0, policy_version 19010 (0.0009) [2023-10-14 01:47:46,224][33201] Updated weights for policy 0, policy_version 19020 (0.0007) [2023-10-14 01:47:46,598][33201] Updated weights for policy 0, policy_version 19030 (0.0008) [2023-10-14 01:47:46,971][33201] Updated weights for policy 0, policy_version 19040 (0.0010) [2023-10-14 01:47:48,693][33226] Updated weights for policy 1, policy_version 19210 (0.0010) [2023-10-14 01:47:49,061][33226] Updated weights for policy 1, policy_version 19220 (0.0010) [2023-10-14 01:47:49,444][33226] Updated weights for policy 1, policy_version 19230 (0.0009) [2023-10-14 01:47:49,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 39190528. Throughput: 0: 1747.2, 1: 1797.0. Samples: 9802046. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:47:49,557][31953] Avg episode reward: [(0, '20.640'), (1, '20.880')] [2023-10-14 01:47:50,681][33201] Updated weights for policy 0, policy_version 19050 (0.0009) [2023-10-14 01:47:51,063][33201] Updated weights for policy 0, policy_version 19060 (0.0008) [2023-10-14 01:47:51,428][33201] Updated weights for policy 0, policy_version 19070 (0.0008) [2023-10-14 01:47:53,266][33226] Updated weights for policy 1, policy_version 19240 (0.0010) [2023-10-14 01:47:53,630][33226] Updated weights for policy 1, policy_version 19250 (0.0008) [2023-10-14 01:47:53,990][33226] Updated weights for policy 1, policy_version 19260 (0.0008) [2023-10-14 01:47:54,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 39256064. Throughput: 0: 1770.9, 1: 1788.7. Samples: 9823248. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:47:54,557][31953] Avg episode reward: [(0, '20.650'), (1, '20.880')] [2023-10-14 01:47:55,237][33201] Updated weights for policy 0, policy_version 19080 (0.0007) [2023-10-14 01:47:55,609][33201] Updated weights for policy 0, policy_version 19090 (0.0009) [2023-10-14 01:47:55,988][33201] Updated weights for policy 0, policy_version 19100 (0.0010) [2023-10-14 01:47:57,887][33226] Updated weights for policy 1, policy_version 19270 (0.0008) [2023-10-14 01:47:58,250][33226] Updated weights for policy 1, policy_version 19280 (0.0009) [2023-10-14 01:47:58,623][33226] Updated weights for policy 1, policy_version 19290 (0.0007) [2023-10-14 01:47:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 39321600. Throughput: 0: 1752.6, 1: 1781.2. Samples: 9833754. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:47:59,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.880')] [2023-10-14 01:47:59,709][33201] Updated weights for policy 0, policy_version 19110 (0.0008) [2023-10-14 01:48:00,076][33201] Updated weights for policy 0, policy_version 19120 (0.0009) [2023-10-14 01:48:00,456][33201] Updated weights for policy 0, policy_version 19130 (0.0007) [2023-10-14 01:48:02,511][33226] Updated weights for policy 1, policy_version 19300 (0.0007) [2023-10-14 01:48:02,922][33226] Updated weights for policy 1, policy_version 19310 (0.0007) [2023-10-14 01:48:03,298][33226] Updated weights for policy 1, policy_version 19320 (0.0007) [2023-10-14 01:48:04,503][33201] Updated weights for policy 0, policy_version 19140 (0.0008) [2023-10-14 01:48:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 39387136. Throughput: 0: 1765.6, 1: 1791.2. Samples: 9855264. 
Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-14 01:48:04,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.880')] [2023-10-14 01:48:04,874][33201] Updated weights for policy 0, policy_version 19150 (0.0007) [2023-10-14 01:48:05,248][33201] Updated weights for policy 0, policy_version 19160 (0.0008) [2023-10-14 01:48:07,031][33226] Updated weights for policy 1, policy_version 19330 (0.0008) [2023-10-14 01:48:07,395][33226] Updated weights for policy 1, policy_version 19340 (0.0010) [2023-10-14 01:48:07,769][33226] Updated weights for policy 1, policy_version 19350 (0.0010) [2023-10-14 01:48:08,140][33226] Updated weights for policy 1, policy_version 19360 (0.0007) [2023-10-14 01:48:08,850][33201] Updated weights for policy 0, policy_version 19170 (0.0009) [2023-10-14 01:48:09,221][33201] Updated weights for policy 0, policy_version 19180 (0.0009) [2023-10-14 01:48:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 39452672. Throughput: 0: 1785.2, 1: 1769.8. Samples: 9876268. 
Policy #0 lag: (min: 18.0, avg: 23.6, max: 50.0) [2023-10-14 01:48:09,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.940')] [2023-10-14 01:48:09,597][33201] Updated weights for policy 0, policy_version 19190 (0.0008) [2023-10-14 01:48:09,963][33201] Updated weights for policy 0, policy_version 19200 (0.0008) [2023-10-14 01:48:11,958][33226] Updated weights for policy 1, policy_version 19370 (0.0010) [2023-10-14 01:48:12,328][33226] Updated weights for policy 1, policy_version 19380 (0.0010) [2023-10-14 01:48:12,697][33226] Updated weights for policy 1, policy_version 19390 (0.0009) [2023-10-14 01:48:13,735][33201] Updated weights for policy 0, policy_version 19210 (0.0009) [2023-10-14 01:48:14,095][33201] Updated weights for policy 0, policy_version 19220 (0.0010) [2023-10-14 01:48:14,467][33201] Updated weights for policy 0, policy_version 19230 (0.0010) [2023-10-14 01:48:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 39550976. Throughput: 0: 1761.8, 1: 1792.2. Samples: 9887278. Policy #0 lag: (min: 26.0, avg: 34.1, max: 58.0) [2023-10-14 01:48:14,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.910')] [2023-10-14 01:48:16,522][33226] Updated weights for policy 1, policy_version 19400 (0.0009) [2023-10-14 01:48:16,886][33226] Updated weights for policy 1, policy_version 19410 (0.0008) [2023-10-14 01:48:17,250][33226] Updated weights for policy 1, policy_version 19420 (0.0009) [2023-10-14 01:48:18,365][33201] Updated weights for policy 0, policy_version 19240 (0.0007) [2023-10-14 01:48:18,734][33201] Updated weights for policy 0, policy_version 19250 (0.0007) [2023-10-14 01:48:19,115][33201] Updated weights for policy 0, policy_version 19260 (0.0008) [2023-10-14 01:48:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39616512. Throughput: 0: 1791.4, 1: 1767.9. Samples: 9908310. 
Policy #0 lag: (min: 26.0, avg: 34.1, max: 58.0) [2023-10-14 01:48:19,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.900')] [2023-10-14 01:48:21,042][33226] Updated weights for policy 1, policy_version 19430 (0.0009) [2023-10-14 01:48:21,410][33226] Updated weights for policy 1, policy_version 19440 (0.0011) [2023-10-14 01:48:21,773][33226] Updated weights for policy 1, policy_version 19450 (0.0011) [2023-10-14 01:48:22,772][33201] Updated weights for policy 0, policy_version 19270 (0.0010) [2023-10-14 01:48:23,145][33201] Updated weights for policy 0, policy_version 19280 (0.0008) [2023-10-14 01:48:23,529][33201] Updated weights for policy 0, policy_version 19290 (0.0008) [2023-10-14 01:48:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39682048. Throughput: 0: 1766.4, 1: 1767.7. Samples: 9929296. Policy #0 lag: (min: 26.0, avg: 34.1, max: 58.0) [2023-10-14 01:48:24,559][31953] Avg episode reward: [(0, '20.600'), (1, '20.900')] [2023-10-14 01:48:25,616][33226] Updated weights for policy 1, policy_version 19460 (0.0008) [2023-10-14 01:48:25,971][33226] Updated weights for policy 1, policy_version 19470 (0.0007) [2023-10-14 01:48:26,342][33226] Updated weights for policy 1, policy_version 19480 (0.0010) [2023-10-14 01:48:27,271][33201] Updated weights for policy 0, policy_version 19300 (0.0008) [2023-10-14 01:48:27,680][33201] Updated weights for policy 0, policy_version 19310 (0.0008) [2023-10-14 01:48:28,062][33201] Updated weights for policy 0, policy_version 19320 (0.0008) [2023-10-14 01:48:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39747584. Throughput: 0: 1792.6, 1: 1764.3. Samples: 9940374. 
Policy #0 lag: (min: 26.0, avg: 34.1, max: 58.0) [2023-10-14 01:48:29,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.890')] [2023-10-14 01:48:30,160][33226] Updated weights for policy 1, policy_version 19490 (0.0008) [2023-10-14 01:48:30,532][33226] Updated weights for policy 1, policy_version 19500 (0.0009) [2023-10-14 01:48:30,911][33226] Updated weights for policy 1, policy_version 19510 (0.0009) [2023-10-14 01:48:31,281][33226] Updated weights for policy 1, policy_version 19520 (0.0009) [2023-10-14 01:48:31,902][33201] Updated weights for policy 0, policy_version 19330 (0.0009) [2023-10-14 01:48:32,283][33201] Updated weights for policy 0, policy_version 19340 (0.0008) [2023-10-14 01:48:32,659][33201] Updated weights for policy 0, policy_version 19350 (0.0008) [2023-10-14 01:48:33,029][33201] Updated weights for policy 0, policy_version 19360 (0.0008) [2023-10-14 01:48:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39813120. Throughput: 0: 1769.5, 1: 1760.5. Samples: 9960898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:48:34,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.880')] [2023-10-14 01:48:35,021][33226] Updated weights for policy 1, policy_version 19530 (0.0009) [2023-10-14 01:48:35,386][33226] Updated weights for policy 1, policy_version 19540 (0.0009) [2023-10-14 01:48:35,760][33226] Updated weights for policy 1, policy_version 19550 (0.0009) [2023-10-14 01:48:37,074][33201] Updated weights for policy 0, policy_version 19370 (0.0007) [2023-10-14 01:48:37,437][33201] Updated weights for policy 0, policy_version 19380 (0.0009) [2023-10-14 01:48:37,809][33201] Updated weights for policy 0, policy_version 19390 (0.0008) [2023-10-14 01:48:39,519][33226] Updated weights for policy 1, policy_version 19560 (0.0008) [2023-10-14 01:48:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39878656. Throughput: 0: 1755.6, 1: 1793.1. 
Samples: 9982936. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:48:39,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.880')] [2023-10-14 01:48:39,888][33226] Updated weights for policy 1, policy_version 19570 (0.0011) [2023-10-14 01:48:40,256][33226] Updated weights for policy 1, policy_version 19580 (0.0007) [2023-10-14 01:48:41,666][33201] Updated weights for policy 0, policy_version 19400 (0.0010) [2023-10-14 01:48:42,032][33201] Updated weights for policy 0, policy_version 19410 (0.0009) [2023-10-14 01:48:42,407][33201] Updated weights for policy 0, policy_version 19420 (0.0008) [2023-10-14 01:48:44,099][33226] Updated weights for policy 1, policy_version 19590 (0.0007) [2023-10-14 01:48:44,472][33226] Updated weights for policy 1, policy_version 19600 (0.0007) [2023-10-14 01:48:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 39944192. Throughput: 0: 1776.0, 1: 1767.5. Samples: 9993216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:48:44,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.890')] [2023-10-14 01:48:44,834][33226] Updated weights for policy 1, policy_version 19610 (0.0007) [2023-10-14 01:48:46,094][33201] Updated weights for policy 0, policy_version 19430 (0.0007) [2023-10-14 01:48:46,472][33201] Updated weights for policy 0, policy_version 19440 (0.0007) [2023-10-14 01:48:46,843][33201] Updated weights for policy 0, policy_version 19450 (0.0008) [2023-10-14 01:48:48,756][33226] Updated weights for policy 1, policy_version 19620 (0.0007) [2023-10-14 01:48:49,158][33226] Updated weights for policy 1, policy_version 19630 (0.0009) [2023-10-14 01:48:49,527][33226] Updated weights for policy 1, policy_version 19640 (0.0008) [2023-10-14 01:48:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 40009728. Throughput: 0: 1761.5, 1: 1788.2. Samples: 10015000. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:48:49,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.890')] [2023-10-14 01:48:50,758][33201] Updated weights for policy 0, policy_version 19460 (0.0009) [2023-10-14 01:48:51,125][33201] Updated weights for policy 0, policy_version 19470 (0.0010) [2023-10-14 01:48:51,490][33201] Updated weights for policy 0, policy_version 19480 (0.0009) [2023-10-14 01:48:53,280][33226] Updated weights for policy 1, policy_version 19650 (0.0009) [2023-10-14 01:48:53,652][33226] Updated weights for policy 1, policy_version 19660 (0.0009) [2023-10-14 01:48:54,031][33226] Updated weights for policy 1, policy_version 19670 (0.0009) [2023-10-14 01:48:54,394][33226] Updated weights for policy 1, policy_version 19680 (0.0008) [2023-10-14 01:48:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 40108032. Throughput: 0: 1767.9, 1: 1774.9. Samples: 10035694. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:48:54,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.870')] [2023-10-14 01:48:55,377][33201] Updated weights for policy 0, policy_version 19490 (0.0009) [2023-10-14 01:48:55,755][33201] Updated weights for policy 0, policy_version 19500 (0.0008) [2023-10-14 01:48:56,132][33201] Updated weights for policy 0, policy_version 19510 (0.0008) [2023-10-14 01:48:56,505][33201] Updated weights for policy 0, policy_version 19520 (0.0007) [2023-10-14 01:48:58,339][33226] Updated weights for policy 1, policy_version 19690 (0.0008) [2023-10-14 01:48:58,706][33226] Updated weights for policy 1, policy_version 19700 (0.0007) [2023-10-14 01:48:59,080][33226] Updated weights for policy 1, policy_version 19710 (0.0007) [2023-10-14 01:48:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 40173568. Throughput: 0: 1759.8, 1: 1769.8. Samples: 10046110. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:48:59,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.890')] [2023-10-14 01:49:00,279][33201] Updated weights for policy 0, policy_version 19530 (0.0009) [2023-10-14 01:49:00,661][33201] Updated weights for policy 0, policy_version 19540 (0.0008) [2023-10-14 01:49:01,036][33201] Updated weights for policy 0, policy_version 19550 (0.0009) [2023-10-14 01:49:02,918][33226] Updated weights for policy 1, policy_version 19720 (0.0009) [2023-10-14 01:49:03,291][33226] Updated weights for policy 1, policy_version 19730 (0.0010) [2023-10-14 01:49:03,652][33226] Updated weights for policy 1, policy_version 19740 (0.0009) [2023-10-14 01:49:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 40239104. Throughput: 0: 1758.7, 1: 1778.3. Samples: 10067474. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:49:04,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.890')] [2023-10-14 01:49:04,842][33201] Updated weights for policy 0, policy_version 19560 (0.0010) [2023-10-14 01:49:05,206][33201] Updated weights for policy 0, policy_version 19570 (0.0007) [2023-10-14 01:49:05,572][33201] Updated weights for policy 0, policy_version 19580 (0.0008) [2023-10-14 01:49:07,476][33226] Updated weights for policy 1, policy_version 19750 (0.0008) [2023-10-14 01:49:07,849][33226] Updated weights for policy 1, policy_version 19760 (0.0009) [2023-10-14 01:49:08,220][33226] Updated weights for policy 1, policy_version 19770 (0.0007) [2023-10-14 01:49:09,236][33201] Updated weights for policy 0, policy_version 19590 (0.0009) [2023-10-14 01:49:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 40304640. Throughput: 0: 1785.1, 1: 1752.0. Samples: 10088464. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 01:49:09,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.890')] [2023-10-14 01:49:09,604][33201] Updated weights for policy 0, policy_version 19600 (0.0009) [2023-10-14 01:49:09,982][33201] Updated weights for policy 0, policy_version 19610 (0.0007) [2023-10-14 01:49:12,000][33226] Updated weights for policy 1, policy_version 19780 (0.0008) [2023-10-14 01:49:12,378][33226] Updated weights for policy 1, policy_version 19790 (0.0011) [2023-10-14 01:49:12,743][33226] Updated weights for policy 1, policy_version 19800 (0.0010) [2023-10-14 01:49:13,781][33201] Updated weights for policy 0, policy_version 19620 (0.0008) [2023-10-14 01:49:14,154][33201] Updated weights for policy 0, policy_version 19630 (0.0010) [2023-10-14 01:49:14,529][33201] Updated weights for policy 0, policy_version 19640 (0.0011) [2023-10-14 01:49:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 40370176. Throughput: 0: 1752.5, 1: 1783.3. Samples: 10099486. 
Policy #0 lag: (min: 24.0, avg: 48.7, max: 56.0) [2023-10-14 01:49:14,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.890')] [2023-10-14 01:49:16,598][33226] Updated weights for policy 1, policy_version 19810 (0.0010) [2023-10-14 01:49:16,968][33226] Updated weights for policy 1, policy_version 19820 (0.0009) [2023-10-14 01:49:17,339][33226] Updated weights for policy 1, policy_version 19830 (0.0008) [2023-10-14 01:49:17,711][33226] Updated weights for policy 1, policy_version 19840 (0.0009) [2023-10-14 01:49:18,390][33201] Updated weights for policy 0, policy_version 19650 (0.0009) [2023-10-14 01:49:18,761][33201] Updated weights for policy 0, policy_version 19660 (0.0010) [2023-10-14 01:49:19,141][33201] Updated weights for policy 0, policy_version 19670 (0.0010) [2023-10-14 01:49:19,505][33201] Updated weights for policy 0, policy_version 19680 (0.0009) [2023-10-14 01:49:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 40468480. Throughput: 0: 1789.2, 1: 1750.8. Samples: 10120198. Policy #0 lag: (min: 24.0, avg: 48.7, max: 56.0) [2023-10-14 01:49:19,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.890')] [2023-10-14 01:49:21,485][33226] Updated weights for policy 1, policy_version 19850 (0.0010) [2023-10-14 01:49:21,842][33226] Updated weights for policy 1, policy_version 19860 (0.0011) [2023-10-14 01:49:22,216][33226] Updated weights for policy 1, policy_version 19870 (0.0009) [2023-10-14 01:49:23,415][33201] Updated weights for policy 0, policy_version 19690 (0.0009) [2023-10-14 01:49:23,790][33201] Updated weights for policy 0, policy_version 19700 (0.0007) [2023-10-14 01:49:24,161][33201] Updated weights for policy 0, policy_version 19710 (0.0008) [2023-10-14 01:49:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 40534016. Throughput: 0: 1764.0, 1: 1756.4. Samples: 10141356. 
Policy #0 lag: (min: 24.0, avg: 48.7, max: 56.0) [2023-10-14 01:49:24,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.890')] [2023-10-14 01:49:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000019712_20185088.pth... [2023-10-14 01:49:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000019872_20348928.pth... [2023-10-14 01:49:24,604][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000018208_18644992.pth [2023-10-14 01:49:24,605][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000018048_18481152.pth [2023-10-14 01:49:25,946][33226] Updated weights for policy 1, policy_version 19880 (0.0009) [2023-10-14 01:49:26,308][33226] Updated weights for policy 1, policy_version 19890 (0.0008) [2023-10-14 01:49:26,683][33226] Updated weights for policy 1, policy_version 19900 (0.0008) [2023-10-14 01:49:27,948][33201] Updated weights for policy 0, policy_version 19720 (0.0009) [2023-10-14 01:49:28,312][33201] Updated weights for policy 0, policy_version 19730 (0.0010) [2023-10-14 01:49:28,683][33201] Updated weights for policy 0, policy_version 19740 (0.0010) [2023-10-14 01:49:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 40599552. Throughput: 0: 1775.9, 1: 1757.8. Samples: 10152232. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:29,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.890')] [2023-10-14 01:49:30,572][33226] Updated weights for policy 1, policy_version 19910 (0.0010) [2023-10-14 01:49:30,938][33226] Updated weights for policy 1, policy_version 19920 (0.0009) [2023-10-14 01:49:31,309][33226] Updated weights for policy 1, policy_version 19930 (0.0009) [2023-10-14 01:49:32,612][33201] Updated weights for policy 0, policy_version 19750 (0.0009) [2023-10-14 01:49:32,975][33201] Updated weights for policy 0, policy_version 19760 (0.0008) [2023-10-14 01:49:33,345][33201] Updated weights for policy 0, policy_version 19770 (0.0008) [2023-10-14 01:49:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 40665088. Throughput: 0: 1762.7, 1: 1755.1. Samples: 10173306. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:34,558][31953] Avg episode reward: [(0, '20.420'), (1, '20.880')] [2023-10-14 01:49:35,180][33226] Updated weights for policy 1, policy_version 19940 (0.0008) [2023-10-14 01:49:35,574][33226] Updated weights for policy 1, policy_version 19950 (0.0010) [2023-10-14 01:49:35,943][33226] Updated weights for policy 1, policy_version 19960 (0.0010) [2023-10-14 01:49:37,171][33201] Updated weights for policy 0, policy_version 19780 (0.0007) [2023-10-14 01:49:37,544][33201] Updated weights for policy 0, policy_version 19790 (0.0008) [2023-10-14 01:49:37,910][33201] Updated weights for policy 0, policy_version 19800 (0.0007) [2023-10-14 01:49:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 40730624. Throughput: 0: 1751.6, 1: 1783.2. Samples: 10194756. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:39,558][31953] Avg episode reward: [(0, '20.420'), (1, '20.880')] [2023-10-14 01:49:39,755][33226] Updated weights for policy 1, policy_version 19970 (0.0008) [2023-10-14 01:49:40,130][33226] Updated weights for policy 1, policy_version 19980 (0.0008) [2023-10-14 01:49:40,497][33226] Updated weights for policy 1, policy_version 19990 (0.0008) [2023-10-14 01:49:40,866][33226] Updated weights for policy 1, policy_version 20000 (0.0008) [2023-10-14 01:49:41,654][33201] Updated weights for policy 0, policy_version 19810 (0.0007) [2023-10-14 01:49:42,024][33201] Updated weights for policy 0, policy_version 19820 (0.0008) [2023-10-14 01:49:42,392][33201] Updated weights for policy 0, policy_version 19830 (0.0007) [2023-10-14 01:49:42,759][33201] Updated weights for policy 0, policy_version 19840 (0.0009) [2023-10-14 01:49:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 40796160. Throughput: 0: 1774.0, 1: 1766.0. Samples: 10205412. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:44,558][31953] Avg episode reward: [(0, '20.430'), (1, '20.880')] [2023-10-14 01:49:44,604][33226] Updated weights for policy 1, policy_version 20010 (0.0009) [2023-10-14 01:49:44,966][33226] Updated weights for policy 1, policy_version 20020 (0.0009) [2023-10-14 01:49:45,330][33226] Updated weights for policy 1, policy_version 20030 (0.0010) [2023-10-14 01:49:46,611][33201] Updated weights for policy 0, policy_version 19850 (0.0009) [2023-10-14 01:49:46,982][33201] Updated weights for policy 0, policy_version 19860 (0.0007) [2023-10-14 01:49:47,356][33201] Updated weights for policy 0, policy_version 19870 (0.0007) [2023-10-14 01:49:49,081][33226] Updated weights for policy 1, policy_version 20040 (0.0009) [2023-10-14 01:49:49,457][33226] Updated weights for policy 1, policy_version 20050 (0.0009) [2023-10-14 01:49:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 40861696. Throughput: 0: 1755.2, 1: 1783.0. Samples: 10226694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:49,558][31953] Avg episode reward: [(0, '20.440'), (1, '20.880')] [2023-10-14 01:49:49,816][33226] Updated weights for policy 1, policy_version 20060 (0.0009) [2023-10-14 01:49:51,291][33201] Updated weights for policy 0, policy_version 19880 (0.0009) [2023-10-14 01:49:51,664][33201] Updated weights for policy 0, policy_version 19890 (0.0010) [2023-10-14 01:49:52,032][33201] Updated weights for policy 0, policy_version 19900 (0.0008) [2023-10-14 01:49:53,444][33226] Updated weights for policy 1, policy_version 20070 (0.0009) [2023-10-14 01:49:53,817][33226] Updated weights for policy 1, policy_version 20080 (0.0008) [2023-10-14 01:49:54,194][33226] Updated weights for policy 1, policy_version 20090 (0.0008) [2023-10-14 01:49:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 40960000. Throughput: 0: 1756.4, 1: 1791.0. 
Samples: 10248096. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:54,559][31953] Avg episode reward: [(0, '20.430'), (1, '20.890')] [2023-10-14 01:49:55,927][33201] Updated weights for policy 0, policy_version 19910 (0.0008) [2023-10-14 01:49:56,296][33201] Updated weights for policy 0, policy_version 19920 (0.0008) [2023-10-14 01:49:56,673][33201] Updated weights for policy 0, policy_version 19930 (0.0010) [2023-10-14 01:49:57,794][33226] Updated weights for policy 1, policy_version 20100 (0.0008) [2023-10-14 01:49:58,161][33226] Updated weights for policy 1, policy_version 20110 (0.0008) [2023-10-14 01:49:58,523][33226] Updated weights for policy 1, policy_version 20120 (0.0009) [2023-10-14 01:49:59,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 41025536. Throughput: 0: 1752.3, 1: 1781.5. Samples: 10258512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:49:59,558][31953] Avg episode reward: [(0, '20.440'), (1, '20.920')] [2023-10-14 01:50:00,295][33201] Updated weights for policy 0, policy_version 19940 (0.0008) [2023-10-14 01:50:00,664][33201] Updated weights for policy 0, policy_version 19950 (0.0010) [2023-10-14 01:50:01,035][33201] Updated weights for policy 0, policy_version 19960 (0.0008) [2023-10-14 01:50:02,289][33226] Updated weights for policy 1, policy_version 20130 (0.0009) [2023-10-14 01:50:02,657][33226] Updated weights for policy 1, policy_version 20140 (0.0009) [2023-10-14 01:50:03,035][33226] Updated weights for policy 1, policy_version 20150 (0.0009) [2023-10-14 01:50:03,413][33226] Updated weights for policy 1, policy_version 20160 (0.0009) [2023-10-14 01:50:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 41091072. Throughput: 0: 1756.4, 1: 1793.3. Samples: 10279934. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:04,558][31953] Avg episode reward: [(0, '20.470'), (1, '20.930')] [2023-10-14 01:50:04,740][33201] Updated weights for policy 0, policy_version 19970 (0.0008) [2023-10-14 01:50:05,112][33201] Updated weights for policy 0, policy_version 19980 (0.0009) [2023-10-14 01:50:05,492][33201] Updated weights for policy 0, policy_version 19990 (0.0007) [2023-10-14 01:50:05,870][33201] Updated weights for policy 0, policy_version 20000 (0.0008) [2023-10-14 01:50:07,180][33226] Updated weights for policy 1, policy_version 20170 (0.0007) [2023-10-14 01:50:07,552][33226] Updated weights for policy 1, policy_version 20180 (0.0009) [2023-10-14 01:50:07,923][33226] Updated weights for policy 1, policy_version 20190 (0.0009) [2023-10-14 01:50:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 41156608. Throughput: 0: 1790.6, 1: 1771.7. Samples: 10301660. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:09,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.910')] [2023-10-14 01:50:09,653][33201] Updated weights for policy 0, policy_version 20010 (0.0009) [2023-10-14 01:50:10,025][33201] Updated weights for policy 0, policy_version 20020 (0.0007) [2023-10-14 01:50:10,401][33201] Updated weights for policy 0, policy_version 20030 (0.0007) [2023-10-14 01:50:11,812][33226] Updated weights for policy 1, policy_version 20200 (0.0007) [2023-10-14 01:50:12,175][33226] Updated weights for policy 1, policy_version 20210 (0.0007) [2023-10-14 01:50:12,546][33226] Updated weights for policy 1, policy_version 20220 (0.0008) [2023-10-14 01:50:14,224][33201] Updated weights for policy 0, policy_version 20040 (0.0009) [2023-10-14 01:50:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 41222144. Throughput: 0: 1760.0, 1: 1792.5. Samples: 10312098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:14,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.920')] [2023-10-14 01:50:14,603][33201] Updated weights for policy 0, policy_version 20050 (0.0008) [2023-10-14 01:50:14,969][33201] Updated weights for policy 0, policy_version 20060 (0.0007) [2023-10-14 01:50:16,434][33226] Updated weights for policy 1, policy_version 20230 (0.0010) [2023-10-14 01:50:16,796][33226] Updated weights for policy 1, policy_version 20240 (0.0009) [2023-10-14 01:50:17,169][33226] Updated weights for policy 1, policy_version 20250 (0.0007) [2023-10-14 01:50:18,764][33201] Updated weights for policy 0, policy_version 20070 (0.0007) [2023-10-14 01:50:19,127][33201] Updated weights for policy 0, policy_version 20080 (0.0007) [2023-10-14 01:50:19,506][33201] Updated weights for policy 0, policy_version 20090 (0.0008) [2023-10-14 01:50:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 41287680. Throughput: 0: 1785.6, 1: 1770.6. Samples: 10333334. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:19,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.940')] [2023-10-14 01:50:20,929][33226] Updated weights for policy 1, policy_version 20260 (0.0009) [2023-10-14 01:50:21,325][33226] Updated weights for policy 1, policy_version 20270 (0.0009) [2023-10-14 01:50:21,695][33226] Updated weights for policy 1, policy_version 20280 (0.0007) [2023-10-14 01:50:23,370][33201] Updated weights for policy 0, policy_version 20100 (0.0010) [2023-10-14 01:50:23,749][33201] Updated weights for policy 0, policy_version 20110 (0.0009) [2023-10-14 01:50:24,117][33201] Updated weights for policy 0, policy_version 20120 (0.0008) [2023-10-14 01:50:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 41385984. Throughput: 0: 1774.6, 1: 1776.1. Samples: 10354540. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) [2023-10-14 01:50:24,559][31953] Avg episode reward: [(0, '20.510'), (1, '20.940')] [2023-10-14 01:50:25,410][33226] Updated weights for policy 1, policy_version 20290 (0.0008) [2023-10-14 01:50:25,788][33226] Updated weights for policy 1, policy_version 20300 (0.0007) [2023-10-14 01:50:26,160][33226] Updated weights for policy 1, policy_version 20310 (0.0008) [2023-10-14 01:50:26,519][33226] Updated weights for policy 1, policy_version 20320 (0.0009) [2023-10-14 01:50:28,025][33201] Updated weights for policy 0, policy_version 20130 (0.0009) [2023-10-14 01:50:28,397][33201] Updated weights for policy 0, policy_version 20140 (0.0010) [2023-10-14 01:50:28,786][33201] Updated weights for policy 0, policy_version 20150 (0.0010) [2023-10-14 01:50:29,160][33201] Updated weights for policy 0, policy_version 20160 (0.0009) [2023-10-14 01:50:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 41451520. Throughput: 0: 1773.6, 1: 1775.3. Samples: 10365112. Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) [2023-10-14 01:50:29,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.940')] [2023-10-14 01:50:30,468][33226] Updated weights for policy 1, policy_version 20330 (0.0008) [2023-10-14 01:50:30,840][33226] Updated weights for policy 1, policy_version 20340 (0.0008) [2023-10-14 01:50:31,203][33226] Updated weights for policy 1, policy_version 20350 (0.0008) [2023-10-14 01:50:33,015][33201] Updated weights for policy 0, policy_version 20170 (0.0008) [2023-10-14 01:50:33,388][33201] Updated weights for policy 0, policy_version 20180 (0.0010) [2023-10-14 01:50:33,759][33201] Updated weights for policy 0, policy_version 20190 (0.0008) [2023-10-14 01:50:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 41517056. Throughput: 0: 1777.8, 1: 1774.3. Samples: 10386538. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) [2023-10-14 01:50:34,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.940')] [2023-10-14 01:50:35,059][33226] Updated weights for policy 1, policy_version 20360 (0.0008) [2023-10-14 01:50:35,421][33226] Updated weights for policy 1, policy_version 20370 (0.0007) [2023-10-14 01:50:35,785][33226] Updated weights for policy 1, policy_version 20380 (0.0007) [2023-10-14 01:50:37,550][33201] Updated weights for policy 0, policy_version 20200 (0.0007) [2023-10-14 01:50:37,919][33201] Updated weights for policy 0, policy_version 20210 (0.0008) [2023-10-14 01:50:38,295][33201] Updated weights for policy 0, policy_version 20220 (0.0008) [2023-10-14 01:50:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 41582592. Throughput: 0: 1759.6, 1: 1787.9. Samples: 10407734. Policy #0 lag: (min: 31.0, avg: 31.5, max: 47.0) [2023-10-14 01:50:39,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.940')] [2023-10-14 01:50:39,708][33226] Updated weights for policy 1, policy_version 20390 (0.0008) [2023-10-14 01:50:40,082][33226] Updated weights for policy 1, policy_version 20400 (0.0008) [2023-10-14 01:50:40,449][33226] Updated weights for policy 1, policy_version 20410 (0.0009) [2023-10-14 01:50:42,063][33201] Updated weights for policy 0, policy_version 20230 (0.0007) [2023-10-14 01:50:42,437][33201] Updated weights for policy 0, policy_version 20240 (0.0007) [2023-10-14 01:50:42,811][33201] Updated weights for policy 0, policy_version 20250 (0.0007) [2023-10-14 01:50:44,182][33226] Updated weights for policy 1, policy_version 20420 (0.0010) [2023-10-14 01:50:44,548][33226] Updated weights for policy 1, policy_version 20430 (0.0009) [2023-10-14 01:50:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 41648128. Throughput: 0: 1788.4, 1: 1764.9. Samples: 10418408. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:44,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.970')] [2023-10-14 01:50:44,915][33226] Updated weights for policy 1, policy_version 20440 (0.0011) [2023-10-14 01:50:45,206][32895] Saving new best policy, reward=20.970! [2023-10-14 01:50:46,481][33201] Updated weights for policy 0, policy_version 20260 (0.0009) [2023-10-14 01:50:46,869][33201] Updated weights for policy 0, policy_version 20270 (0.0008) [2023-10-14 01:50:47,246][33201] Updated weights for policy 0, policy_version 20280 (0.0008) [2023-10-14 01:50:48,959][33226] Updated weights for policy 1, policy_version 20450 (0.0010) [2023-10-14 01:50:49,331][33226] Updated weights for policy 1, policy_version 20460 (0.0008) [2023-10-14 01:50:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 41713664. Throughput: 0: 1758.6, 1: 1784.2. Samples: 10439360. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:49,560][31953] Avg episode reward: [(0, '20.570'), (1, '20.960')] [2023-10-14 01:50:49,690][33226] Updated weights for policy 1, policy_version 20470 (0.0009) [2023-10-14 01:50:50,057][33226] Updated weights for policy 1, policy_version 20480 (0.0011) [2023-10-14 01:50:51,029][33201] Updated weights for policy 0, policy_version 20290 (0.0009) [2023-10-14 01:50:51,396][33201] Updated weights for policy 0, policy_version 20300 (0.0009) [2023-10-14 01:50:51,765][33201] Updated weights for policy 0, policy_version 20310 (0.0009) [2023-10-14 01:50:52,134][33201] Updated weights for policy 0, policy_version 20320 (0.0009) [2023-10-14 01:50:53,809][33226] Updated weights for policy 1, policy_version 20490 (0.0008) [2023-10-14 01:50:54,177][33226] Updated weights for policy 1, policy_version 20500 (0.0011) [2023-10-14 01:50:54,544][33226] Updated weights for policy 1, policy_version 20510 (0.0007) [2023-10-14 01:50:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 
14106.9). Total num frames: 41779200. Throughput: 0: 1755.3, 1: 1782.7. Samples: 10460870. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:54,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.960')] [2023-10-14 01:50:55,925][33201] Updated weights for policy 0, policy_version 20330 (0.0008) [2023-10-14 01:50:56,288][33201] Updated weights for policy 0, policy_version 20340 (0.0009) [2023-10-14 01:50:56,664][33201] Updated weights for policy 0, policy_version 20350 (0.0009) [2023-10-14 01:50:58,236][33226] Updated weights for policy 1, policy_version 20520 (0.0009) [2023-10-14 01:50:58,600][33226] Updated weights for policy 1, policy_version 20530 (0.0009) [2023-10-14 01:50:58,969][33226] Updated weights for policy 1, policy_version 20540 (0.0008) [2023-10-14 01:50:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 41877504. Throughput: 0: 1756.6, 1: 1781.7. Samples: 10471322. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:50:59,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.950')] [2023-10-14 01:51:00,496][33201] Updated weights for policy 0, policy_version 20360 (0.0007) [2023-10-14 01:51:00,873][33201] Updated weights for policy 0, policy_version 20370 (0.0008) [2023-10-14 01:51:01,251][33201] Updated weights for policy 0, policy_version 20380 (0.0008) [2023-10-14 01:51:02,693][33226] Updated weights for policy 1, policy_version 20550 (0.0009) [2023-10-14 01:51:03,067][33226] Updated weights for policy 1, policy_version 20560 (0.0008) [2023-10-14 01:51:03,432][33226] Updated weights for policy 1, policy_version 20570 (0.0008) [2023-10-14 01:51:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 41943040. Throughput: 0: 1753.1, 1: 1789.3. Samples: 10492742. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 01:51:04,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.950')] [2023-10-14 01:51:05,139][33201] Updated weights for policy 0, policy_version 20390 (0.0007) [2023-10-14 01:51:05,512][33201] Updated weights for policy 0, policy_version 20400 (0.0007) [2023-10-14 01:51:05,894][33201] Updated weights for policy 0, policy_version 20410 (0.0007) [2023-10-14 01:51:07,389][33226] Updated weights for policy 1, policy_version 20580 (0.0009) [2023-10-14 01:51:07,784][33226] Updated weights for policy 1, policy_version 20590 (0.0009) [2023-10-14 01:51:08,159][33226] Updated weights for policy 1, policy_version 20600 (0.0010) [2023-10-14 01:51:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 42008576. Throughput: 0: 1780.4, 1: 1761.1. Samples: 10513906. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 01:51:09,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.930')] [2023-10-14 01:51:09,717][33201] Updated weights for policy 0, policy_version 20420 (0.0008) [2023-10-14 01:51:10,091][33201] Updated weights for policy 0, policy_version 20430 (0.0007) [2023-10-14 01:51:10,457][33201] Updated weights for policy 0, policy_version 20440 (0.0008) [2023-10-14 01:51:12,046][33226] Updated weights for policy 1, policy_version 20610 (0.0010) [2023-10-14 01:51:12,410][33226] Updated weights for policy 1, policy_version 20620 (0.0009) [2023-10-14 01:51:12,776][33226] Updated weights for policy 1, policy_version 20630 (0.0009) [2023-10-14 01:51:13,146][33226] Updated weights for policy 1, policy_version 20640 (0.0009) [2023-10-14 01:51:14,360][33201] Updated weights for policy 0, policy_version 20450 (0.0009) [2023-10-14 01:51:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 42074112. Throughput: 0: 1760.3, 1: 1793.2. Samples: 10525018. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 01:51:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.930')] [2023-10-14 01:51:14,727][33201] Updated weights for policy 0, policy_version 20460 (0.0009) [2023-10-14 01:51:15,105][33201] Updated weights for policy 0, policy_version 20470 (0.0009) [2023-10-14 01:51:15,459][33201] Updated weights for policy 0, policy_version 20480 (0.0007) [2023-10-14 01:51:16,886][33226] Updated weights for policy 1, policy_version 20650 (0.0010) [2023-10-14 01:51:17,260][33226] Updated weights for policy 1, policy_version 20660 (0.0011) [2023-10-14 01:51:17,622][33226] Updated weights for policy 1, policy_version 20670 (0.0011) [2023-10-14 01:51:19,188][33201] Updated weights for policy 0, policy_version 20490 (0.0008) [2023-10-14 01:51:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 42139648. Throughput: 0: 1784.9, 1: 1760.0. Samples: 10546056. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 01:51:19,559][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 01:51:19,561][33201] Updated weights for policy 0, policy_version 20500 (0.0009) [2023-10-14 01:51:19,937][33201] Updated weights for policy 0, policy_version 20510 (0.0009) [2023-10-14 01:51:21,325][33226] Updated weights for policy 1, policy_version 20680 (0.0010) [2023-10-14 01:51:21,685][33226] Updated weights for policy 1, policy_version 20690 (0.0007) [2023-10-14 01:51:22,055][33226] Updated weights for policy 1, policy_version 20700 (0.0008) [2023-10-14 01:51:23,745][33201] Updated weights for policy 0, policy_version 20520 (0.0008) [2023-10-14 01:51:24,121][33201] Updated weights for policy 0, policy_version 20530 (0.0009) [2023-10-14 01:51:24,491][33201] Updated weights for policy 0, policy_version 20540 (0.0009) [2023-10-14 01:51:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 42205184. Throughput: 0: 1782.9, 1: 1769.1. 
Samples: 10567578. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 01:51:24,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 01:51:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000020704_21200896.pth... [2023-10-14 01:51:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000019040_19496960.pth [2023-10-14 01:51:24,633][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000020544_21037056.pth... [2023-10-14 01:51:24,662][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000018880_19333120.pth [2023-10-14 01:51:25,746][33226] Updated weights for policy 1, policy_version 20710 (0.0009) [2023-10-14 01:51:26,107][33226] Updated weights for policy 1, policy_version 20720 (0.0009) [2023-10-14 01:51:26,477][33226] Updated weights for policy 1, policy_version 20730 (0.0010) [2023-10-14 01:51:28,085][33201] Updated weights for policy 0, policy_version 20550 (0.0008) [2023-10-14 01:51:28,462][33201] Updated weights for policy 0, policy_version 20560 (0.0007) [2023-10-14 01:51:28,832][33201] Updated weights for policy 0, policy_version 20570 (0.0007) [2023-10-14 01:51:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 42303488. Throughput: 0: 1777.6, 1: 1771.6. Samples: 10578124. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 01:51:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:51:30,340][33226] Updated weights for policy 1, policy_version 20740 (0.0009) [2023-10-14 01:51:30,715][33226] Updated weights for policy 1, policy_version 20750 (0.0009) [2023-10-14 01:51:31,084][33226] Updated weights for policy 1, policy_version 20760 (0.0010) [2023-10-14 01:51:32,740][33201] Updated weights for policy 0, policy_version 20580 (0.0008) [2023-10-14 01:51:33,136][33201] Updated weights for policy 0, policy_version 20590 (0.0007) [2023-10-14 01:51:33,508][33201] Updated weights for policy 0, policy_version 20600 (0.0007) [2023-10-14 01:51:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 42369024. Throughput: 0: 1789.0, 1: 1773.7. Samples: 10599682. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 01:51:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:51:34,979][33226] Updated weights for policy 1, policy_version 20770 (0.0011) [2023-10-14 01:51:35,358][33226] Updated weights for policy 1, policy_version 20780 (0.0008) [2023-10-14 01:51:35,721][33226] Updated weights for policy 1, policy_version 20790 (0.0008) [2023-10-14 01:51:36,083][33226] Updated weights for policy 1, policy_version 20800 (0.0008) [2023-10-14 01:51:37,478][33201] Updated weights for policy 0, policy_version 20610 (0.0011) [2023-10-14 01:51:37,846][33201] Updated weights for policy 0, policy_version 20620 (0.0008) [2023-10-14 01:51:38,218][33201] Updated weights for policy 0, policy_version 20630 (0.0008) [2023-10-14 01:51:38,587][33201] Updated weights for policy 0, policy_version 20640 (0.0007) [2023-10-14 01:51:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 42434560. Throughput: 0: 1760.8, 1: 1791.0. Samples: 10620700. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:51:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:51:39,734][33226] Updated weights for policy 1, policy_version 20810 (0.0008) [2023-10-14 01:51:40,103][33226] Updated weights for policy 1, policy_version 20820 (0.0010) [2023-10-14 01:51:40,478][33226] Updated weights for policy 1, policy_version 20830 (0.0011) [2023-10-14 01:51:42,298][33201] Updated weights for policy 0, policy_version 20650 (0.0009) [2023-10-14 01:51:42,660][33201] Updated weights for policy 0, policy_version 20660 (0.0008) [2023-10-14 01:51:43,034][33201] Updated weights for policy 0, policy_version 20670 (0.0010) [2023-10-14 01:51:44,162][33226] Updated weights for policy 1, policy_version 20840 (0.0010) [2023-10-14 01:51:44,537][33226] Updated weights for policy 1, policy_version 20850 (0.0009) [2023-10-14 01:51:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 42500096. Throughput: 0: 1789.6, 1: 1770.7. Samples: 10631536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:51:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 01:51:44,898][33226] Updated weights for policy 1, policy_version 20860 (0.0008) [2023-10-14 01:51:46,862][33201] Updated weights for policy 0, policy_version 20680 (0.0009) [2023-10-14 01:51:47,235][33201] Updated weights for policy 0, policy_version 20690 (0.0011) [2023-10-14 01:51:47,608][33201] Updated weights for policy 0, policy_version 20700 (0.0008) [2023-10-14 01:51:48,696][33226] Updated weights for policy 1, policy_version 20870 (0.0007) [2023-10-14 01:51:49,072][33226] Updated weights for policy 1, policy_version 20880 (0.0008) [2023-10-14 01:51:49,447][33226] Updated weights for policy 1, policy_version 20890 (0.0009) [2023-10-14 01:51:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 42565632. Throughput: 0: 1761.2, 1: 1787.0. 
Samples: 10652410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:51:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.900')] [2023-10-14 01:51:51,417][33201] Updated weights for policy 0, policy_version 20710 (0.0007) [2023-10-14 01:51:51,790][33201] Updated weights for policy 0, policy_version 20720 (0.0008) [2023-10-14 01:51:52,165][33201] Updated weights for policy 0, policy_version 20730 (0.0010) [2023-10-14 01:51:53,256][33226] Updated weights for policy 1, policy_version 20900 (0.0008) [2023-10-14 01:51:53,671][33226] Updated weights for policy 1, policy_version 20910 (0.0011) [2023-10-14 01:51:54,042][33226] Updated weights for policy 1, policy_version 20920 (0.0009) [2023-10-14 01:51:54,557][31953] Fps is (10 sec: 16383.4, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 42663936. Throughput: 0: 1761.0, 1: 1785.1. Samples: 10673484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:51:54,559][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:51:55,936][33201] Updated weights for policy 0, policy_version 20740 (0.0008) [2023-10-14 01:51:56,306][33201] Updated weights for policy 0, policy_version 20750 (0.0008) [2023-10-14 01:51:56,685][33201] Updated weights for policy 0, policy_version 20760 (0.0008) [2023-10-14 01:51:57,641][33226] Updated weights for policy 1, policy_version 20930 (0.0009) [2023-10-14 01:51:58,006][33226] Updated weights for policy 1, policy_version 20940 (0.0007) [2023-10-14 01:51:58,371][33226] Updated weights for policy 1, policy_version 20950 (0.0008) [2023-10-14 01:51:58,736][33226] Updated weights for policy 1, policy_version 20960 (0.0009) [2023-10-14 01:51:59,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 42729472. Throughput: 0: 1758.6, 1: 1778.3. Samples: 10684176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:51:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:52:00,418][33201] Updated weights for policy 0, policy_version 20770 (0.0009) [2023-10-14 01:52:00,783][33201] Updated weights for policy 0, policy_version 20780 (0.0011) [2023-10-14 01:52:01,149][33201] Updated weights for policy 0, policy_version 20790 (0.0009) [2023-10-14 01:52:01,526][33201] Updated weights for policy 0, policy_version 20800 (0.0011) [2023-10-14 01:52:02,557][33226] Updated weights for policy 1, policy_version 20970 (0.0010) [2023-10-14 01:52:02,929][33226] Updated weights for policy 1, policy_version 20980 (0.0010) [2023-10-14 01:52:03,302][33226] Updated weights for policy 1, policy_version 20990 (0.0011) [2023-10-14 01:52:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 42795008. Throughput: 0: 1751.7, 1: 1796.0. Samples: 10705702. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:52:04,559][31953] Avg episode reward: [(0, '20.890'), (1, '20.900')] [2023-10-14 01:52:05,343][33201] Updated weights for policy 0, policy_version 20810 (0.0007) [2023-10-14 01:52:05,714][33201] Updated weights for policy 0, policy_version 20820 (0.0008) [2023-10-14 01:52:06,083][33201] Updated weights for policy 0, policy_version 20830 (0.0010) [2023-10-14 01:52:07,102][33226] Updated weights for policy 1, policy_version 21000 (0.0010) [2023-10-14 01:52:07,470][33226] Updated weights for policy 1, policy_version 21010 (0.0010) [2023-10-14 01:52:07,849][33226] Updated weights for policy 1, policy_version 21020 (0.0009) [2023-10-14 01:52:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 42860544. Throughput: 0: 1770.1, 1: 1776.8. Samples: 10727186. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:52:09,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.900')] [2023-10-14 01:52:09,954][33201] Updated weights for policy 0, policy_version 20840 (0.0009) [2023-10-14 01:52:10,326][33201] Updated weights for policy 0, policy_version 20850 (0.0007) [2023-10-14 01:52:10,706][33201] Updated weights for policy 0, policy_version 20860 (0.0007) [2023-10-14 01:52:11,725][33226] Updated weights for policy 1, policy_version 21030 (0.0007) [2023-10-14 01:52:12,091][33226] Updated weights for policy 1, policy_version 21040 (0.0010) [2023-10-14 01:52:12,455][33226] Updated weights for policy 1, policy_version 21050 (0.0009) [2023-10-14 01:52:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 42926080. Throughput: 0: 1748.7, 1: 1791.4. Samples: 10737428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:52:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.900')] [2023-10-14 01:52:14,566][33201] Updated weights for policy 0, policy_version 20870 (0.0008) [2023-10-14 01:52:14,929][33201] Updated weights for policy 0, policy_version 20880 (0.0007) [2023-10-14 01:52:15,304][33201] Updated weights for policy 0, policy_version 20890 (0.0011) [2023-10-14 01:52:16,326][33226] Updated weights for policy 1, policy_version 21060 (0.0008) [2023-10-14 01:52:16,692][33226] Updated weights for policy 1, policy_version 21070 (0.0007) [2023-10-14 01:52:17,066][33226] Updated weights for policy 1, policy_version 21080 (0.0007) [2023-10-14 01:52:19,205][33201] Updated weights for policy 0, policy_version 20900 (0.0009) [2023-10-14 01:52:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 42991616. Throughput: 0: 1759.8, 1: 1767.3. Samples: 10758402. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:52:19,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.880')] [2023-10-14 01:52:19,601][33201] Updated weights for policy 0, policy_version 20910 (0.0008) [2023-10-14 01:52:19,971][33201] Updated weights for policy 0, policy_version 20920 (0.0010) [2023-10-14 01:52:20,849][33226] Updated weights for policy 1, policy_version 21090 (0.0007) [2023-10-14 01:52:21,214][33226] Updated weights for policy 1, policy_version 21100 (0.0009) [2023-10-14 01:52:21,583][33226] Updated weights for policy 1, policy_version 21110 (0.0007) [2023-10-14 01:52:21,954][33226] Updated weights for policy 1, policy_version 21120 (0.0009) [2023-10-14 01:52:23,709][33201] Updated weights for policy 0, policy_version 20930 (0.0010) [2023-10-14 01:52:24,075][33201] Updated weights for policy 0, policy_version 20940 (0.0008) [2023-10-14 01:52:24,448][33201] Updated weights for policy 0, policy_version 20950 (0.0007) [2023-10-14 01:52:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 43057152. Throughput: 0: 1772.2, 1: 1768.1. Samples: 10780012. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:52:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.860')] [2023-10-14 01:52:24,826][33201] Updated weights for policy 0, policy_version 20960 (0.0007) [2023-10-14 01:52:25,768][33226] Updated weights for policy 1, policy_version 21130 (0.0010) [2023-10-14 01:52:26,143][33226] Updated weights for policy 1, policy_version 21140 (0.0007) [2023-10-14 01:52:26,520][33226] Updated weights for policy 1, policy_version 21150 (0.0008) [2023-10-14 01:52:28,553][33201] Updated weights for policy 0, policy_version 20970 (0.0009) [2023-10-14 01:52:28,938][33201] Updated weights for policy 0, policy_version 20980 (0.0008) [2023-10-14 01:52:29,315][33201] Updated weights for policy 0, policy_version 20990 (0.0008) [2023-10-14 01:52:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 43155456. Throughput: 0: 1758.8, 1: 1766.8. Samples: 10790188. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 01:52:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.850')] [2023-10-14 01:52:30,141][33226] Updated weights for policy 1, policy_version 21160 (0.0007) [2023-10-14 01:52:30,521][33226] Updated weights for policy 1, policy_version 21170 (0.0009) [2023-10-14 01:52:30,888][33226] Updated weights for policy 1, policy_version 21180 (0.0007) [2023-10-14 01:52:33,215][33201] Updated weights for policy 0, policy_version 21000 (0.0007) [2023-10-14 01:52:33,579][33201] Updated weights for policy 0, policy_version 21010 (0.0008) [2023-10-14 01:52:33,960][33201] Updated weights for policy 0, policy_version 21020 (0.0008) [2023-10-14 01:52:34,515][33226] Updated weights for policy 1, policy_version 21190 (0.0009) [2023-10-14 01:52:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 43220992. Throughput: 0: 1778.0, 1: 1773.4. Samples: 10812224. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:52:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.850')] [2023-10-14 01:52:34,882][33226] Updated weights for policy 1, policy_version 21200 (0.0010) [2023-10-14 01:52:35,245][33226] Updated weights for policy 1, policy_version 21210 (0.0010) [2023-10-14 01:52:38,082][33201] Updated weights for policy 0, policy_version 21030 (0.0008) [2023-10-14 01:52:38,453][33201] Updated weights for policy 0, policy_version 21040 (0.0008) [2023-10-14 01:52:38,820][33201] Updated weights for policy 0, policy_version 21050 (0.0008) [2023-10-14 01:52:39,165][33226] Updated weights for policy 1, policy_version 21220 (0.0008) [2023-10-14 01:52:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 43286528. Throughput: 0: 1742.9, 1: 1802.0. Samples: 10833008. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:52:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.850')] [2023-10-14 01:52:39,566][33226] Updated weights for policy 1, policy_version 21230 (0.0007) [2023-10-14 01:52:39,931][33226] Updated weights for policy 1, policy_version 21240 (0.0009) [2023-10-14 01:52:42,849][33201] Updated weights for policy 0, policy_version 21060 (0.0009) [2023-10-14 01:52:43,232][33201] Updated weights for policy 0, policy_version 21070 (0.0010) [2023-10-14 01:52:43,580][33226] Updated weights for policy 1, policy_version 21250 (0.0009) [2023-10-14 01:52:43,605][33201] Updated weights for policy 0, policy_version 21080 (0.0009) [2023-10-14 01:52:43,944][33226] Updated weights for policy 1, policy_version 21260 (0.0007) [2023-10-14 01:52:44,313][33226] Updated weights for policy 1, policy_version 21270 (0.0007) [2023-10-14 01:52:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 43352064. Throughput: 0: 1770.6, 1: 1771.6. Samples: 10843576. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:52:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.850')] [2023-10-14 01:52:44,688][33226] Updated weights for policy 1, policy_version 21280 (0.0008) [2023-10-14 01:52:47,490][33201] Updated weights for policy 0, policy_version 21090 (0.0010) [2023-10-14 01:52:47,861][33201] Updated weights for policy 0, policy_version 21100 (0.0011) [2023-10-14 01:52:48,235][33201] Updated weights for policy 0, policy_version 21110 (0.0009) [2023-10-14 01:52:48,534][33226] Updated weights for policy 1, policy_version 21290 (0.0009) [2023-10-14 01:52:48,600][33201] Updated weights for policy 0, policy_version 21120 (0.0009) [2023-10-14 01:52:48,911][33226] Updated weights for policy 1, policy_version 21300 (0.0007) [2023-10-14 01:52:49,277][33226] Updated weights for policy 1, policy_version 21310 (0.0007) [2023-10-14 01:52:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 43450368. Throughput: 0: 1748.0, 1: 1787.4. Samples: 10864796. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 01:52:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.850')] [2023-10-14 01:52:52,313][33201] Updated weights for policy 0, policy_version 21130 (0.0007) [2023-10-14 01:52:52,690][33201] Updated weights for policy 0, policy_version 21140 (0.0009) [2023-10-14 01:52:53,052][33226] Updated weights for policy 1, policy_version 21320 (0.0008) [2023-10-14 01:52:53,053][33201] Updated weights for policy 0, policy_version 21150 (0.0007) [2023-10-14 01:52:53,419][33226] Updated weights for policy 1, policy_version 21330 (0.0007) [2023-10-14 01:52:53,787][33226] Updated weights for policy 1, policy_version 21340 (0.0009) [2023-10-14 01:52:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 43515904. Throughput: 0: 1733.6, 1: 1768.6. Samples: 10884788. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 01:52:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.870')] [2023-10-14 01:52:56,951][33201] Updated weights for policy 0, policy_version 21160 (0.0008) [2023-10-14 01:52:57,326][33201] Updated weights for policy 0, policy_version 21170 (0.0008) [2023-10-14 01:52:57,685][33226] Updated weights for policy 1, policy_version 21350 (0.0009) [2023-10-14 01:52:57,687][33201] Updated weights for policy 0, policy_version 21180 (0.0008) [2023-10-14 01:52:58,048][33226] Updated weights for policy 1, policy_version 21360 (0.0010) [2023-10-14 01:52:58,427][33226] Updated weights for policy 1, policy_version 21370 (0.0009) [2023-10-14 01:52:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 43581440. Throughput: 0: 1753.8, 1: 1784.1. Samples: 10896634. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 01:52:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.870')] [2023-10-14 01:53:01,594][33201] Updated weights for policy 0, policy_version 21190 (0.0008) [2023-10-14 01:53:01,972][33201] Updated weights for policy 0, policy_version 21200 (0.0009) [2023-10-14 01:53:02,067][33226] Updated weights for policy 1, policy_version 21380 (0.0008) [2023-10-14 01:53:02,357][33201] Updated weights for policy 0, policy_version 21210 (0.0009) [2023-10-14 01:53:02,428][33226] Updated weights for policy 1, policy_version 21390 (0.0007) [2023-10-14 01:53:02,802][33226] Updated weights for policy 1, policy_version 21400 (0.0008) [2023-10-14 01:53:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 43646976. Throughput: 0: 1729.4, 1: 1783.5. Samples: 10916484. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 01:53:04,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.860')] [2023-10-14 01:53:06,171][33201] Updated weights for policy 0, policy_version 21220 (0.0009) [2023-10-14 01:53:06,558][33201] Updated weights for policy 0, policy_version 21230 (0.0008) [2023-10-14 01:53:06,693][33226] Updated weights for policy 1, policy_version 21410 (0.0009) [2023-10-14 01:53:06,934][33201] Updated weights for policy 0, policy_version 21240 (0.0007) [2023-10-14 01:53:07,055][33226] Updated weights for policy 1, policy_version 21420 (0.0007) [2023-10-14 01:53:07,428][33226] Updated weights for policy 1, policy_version 21430 (0.0009) [2023-10-14 01:53:07,797][33226] Updated weights for policy 1, policy_version 21440 (0.0009) [2023-10-14 01:53:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 43712512. Throughput: 0: 1742.0, 1: 1772.7. Samples: 10938174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 01:53:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.860')] [2023-10-14 01:53:10,736][33201] Updated weights for policy 0, policy_version 21250 (0.0007) [2023-10-14 01:53:11,107][33201] Updated weights for policy 0, policy_version 21260 (0.0010) [2023-10-14 01:53:11,477][33201] Updated weights for policy 0, policy_version 21270 (0.0008) [2023-10-14 01:53:11,526][33226] Updated weights for policy 1, policy_version 21450 (0.0009) [2023-10-14 01:53:11,842][33201] Updated weights for policy 0, policy_version 21280 (0.0007) [2023-10-14 01:53:11,888][33226] Updated weights for policy 1, policy_version 21460 (0.0008) [2023-10-14 01:53:12,261][33226] Updated weights for policy 1, policy_version 21470 (0.0009) [2023-10-14 01:53:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 43778048. Throughput: 0: 1726.3, 1: 1788.5. Samples: 10948356. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:53:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 01:53:15,658][33201] Updated weights for policy 0, policy_version 21290 (0.0008) [2023-10-14 01:53:16,025][33201] Updated weights for policy 0, policy_version 21300 (0.0010) [2023-10-14 01:53:16,125][33226] Updated weights for policy 1, policy_version 21480 (0.0009) [2023-10-14 01:53:16,395][33201] Updated weights for policy 0, policy_version 21310 (0.0008) [2023-10-14 01:53:16,492][33226] Updated weights for policy 1, policy_version 21490 (0.0007) [2023-10-14 01:53:16,861][33226] Updated weights for policy 1, policy_version 21500 (0.0007) [2023-10-14 01:53:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 43843584. Throughput: 0: 1739.0, 1: 1768.3. Samples: 10970052. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:53:19,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 01:53:20,338][33201] Updated weights for policy 0, policy_version 21320 (0.0007) [2023-10-14 01:53:20,682][33226] Updated weights for policy 1, policy_version 21510 (0.0009) [2023-10-14 01:53:20,709][33201] Updated weights for policy 0, policy_version 21330 (0.0009) [2023-10-14 01:53:21,063][33226] Updated weights for policy 1, policy_version 21520 (0.0008) [2023-10-14 01:53:21,090][33201] Updated weights for policy 0, policy_version 21340 (0.0009) [2023-10-14 01:53:21,422][33226] Updated weights for policy 1, policy_version 21530 (0.0008) [2023-10-14 01:53:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 43909120. Throughput: 0: 1769.8, 1: 1765.3. Samples: 10992088. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:53:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 01:53:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000021536_22052864.pth... 
[2023-10-14 01:53:24,597][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000019872_20348928.pth [2023-10-14 01:53:24,766][33201] Updated weights for policy 0, policy_version 21350 (0.0007) [2023-10-14 01:53:25,133][33201] Updated weights for policy 0, policy_version 21360 (0.0011) [2023-10-14 01:53:25,361][33226] Updated weights for policy 1, policy_version 21540 (0.0007) [2023-10-14 01:53:25,514][33201] Updated weights for policy 0, policy_version 21370 (0.0007) [2023-10-14 01:53:25,728][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000021376_21889024.pth... [2023-10-14 01:53:25,762][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000019712_20185088.pth [2023-10-14 01:53:25,767][33226] Updated weights for policy 1, policy_version 21550 (0.0009) [2023-10-14 01:53:26,138][33226] Updated weights for policy 1, policy_version 21560 (0.0009) [2023-10-14 01:53:29,385][33201] Updated weights for policy 0, policy_version 21380 (0.0008) [2023-10-14 01:53:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 43974656. Throughput: 0: 1746.5, 1: 1766.5. Samples: 11001662. 
Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 01:53:29,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.890')] [2023-10-14 01:53:29,762][33201] Updated weights for policy 0, policy_version 21390 (0.0009) [2023-10-14 01:53:29,781][33226] Updated weights for policy 1, policy_version 21570 (0.0007) [2023-10-14 01:53:30,133][33201] Updated weights for policy 0, policy_version 21400 (0.0007) [2023-10-14 01:53:30,147][33226] Updated weights for policy 1, policy_version 21580 (0.0008) [2023-10-14 01:53:30,518][33226] Updated weights for policy 1, policy_version 21590 (0.0009) [2023-10-14 01:53:30,890][33226] Updated weights for policy 1, policy_version 21600 (0.0009) [2023-10-14 01:53:33,864][33201] Updated weights for policy 0, policy_version 21410 (0.0008) [2023-10-14 01:53:34,223][33201] Updated weights for policy 0, policy_version 21420 (0.0007) [2023-10-14 01:53:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 44040192. Throughput: 0: 1769.9, 1: 1766.5. Samples: 11023932. 
Policy #0 lag: (min: 12.0, avg: 15.6, max: 44.0) [2023-10-14 01:53:34,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.890')] [2023-10-14 01:53:34,601][33201] Updated weights for policy 0, policy_version 21430 (0.0007) [2023-10-14 01:53:34,782][33226] Updated weights for policy 1, policy_version 21610 (0.0010) [2023-10-14 01:53:34,964][33201] Updated weights for policy 0, policy_version 21440 (0.0009) [2023-10-14 01:53:35,147][33226] Updated weights for policy 1, policy_version 21620 (0.0010) [2023-10-14 01:53:35,520][33226] Updated weights for policy 1, policy_version 21630 (0.0007) [2023-10-14 01:53:38,759][33201] Updated weights for policy 0, policy_version 21450 (0.0008) [2023-10-14 01:53:39,129][33201] Updated weights for policy 0, policy_version 21460 (0.0007) [2023-10-14 01:53:39,312][33226] Updated weights for policy 1, policy_version 21640 (0.0009) [2023-10-14 01:53:39,498][33201] Updated weights for policy 0, policy_version 21470 (0.0008) [2023-10-14 01:53:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 44105728. Throughput: 0: 1766.3, 1: 1796.3. Samples: 11045106. 
Policy #0 lag: (min: 12.0, avg: 15.6, max: 44.0) [2023-10-14 01:53:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.890')] [2023-10-14 01:53:39,681][33226] Updated weights for policy 1, policy_version 21650 (0.0010) [2023-10-14 01:53:40,041][33226] Updated weights for policy 1, policy_version 21660 (0.0007) [2023-10-14 01:53:43,259][33201] Updated weights for policy 0, policy_version 21480 (0.0008) [2023-10-14 01:53:43,629][33201] Updated weights for policy 0, policy_version 21490 (0.0008) [2023-10-14 01:53:43,831][33226] Updated weights for policy 1, policy_version 21670 (0.0008) [2023-10-14 01:53:44,010][33201] Updated weights for policy 0, policy_version 21500 (0.0008) [2023-10-14 01:53:44,200][33226] Updated weights for policy 1, policy_version 21680 (0.0007) [2023-10-14 01:53:44,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44204032. Throughput: 0: 1766.9, 1: 1765.8. Samples: 11055604. Policy #0 lag: (min: 12.0, avg: 15.6, max: 44.0) [2023-10-14 01:53:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.890')] [2023-10-14 01:53:44,577][33226] Updated weights for policy 1, policy_version 21690 (0.0007) [2023-10-14 01:53:47,890][33201] Updated weights for policy 0, policy_version 21510 (0.0009) [2023-10-14 01:53:48,262][33201] Updated weights for policy 0, policy_version 21520 (0.0009) [2023-10-14 01:53:48,515][33226] Updated weights for policy 1, policy_version 21700 (0.0007) [2023-10-14 01:53:48,643][33201] Updated weights for policy 0, policy_version 21530 (0.0010) [2023-10-14 01:53:48,882][33226] Updated weights for policy 1, policy_version 21710 (0.0007) [2023-10-14 01:53:49,243][33226] Updated weights for policy 1, policy_version 21720 (0.0007) [2023-10-14 01:53:49,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44302336. Throughput: 0: 1782.7, 1: 1788.4. Samples: 11077182. 
Policy #0 lag: (min: 10.0, avg: 10.3, max: 20.0) [2023-10-14 01:53:49,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.900')] [2023-10-14 01:53:52,702][33201] Updated weights for policy 0, policy_version 21540 (0.0009) [2023-10-14 01:53:52,943][33226] Updated weights for policy 1, policy_version 21730 (0.0008) [2023-10-14 01:53:53,059][33201] Updated weights for policy 0, policy_version 21550 (0.0007) [2023-10-14 01:53:53,302][33226] Updated weights for policy 1, policy_version 21740 (0.0009) [2023-10-14 01:53:53,434][33201] Updated weights for policy 0, policy_version 21560 (0.0008) [2023-10-14 01:53:53,671][33226] Updated weights for policy 1, policy_version 21750 (0.0009) [2023-10-14 01:53:54,035][33226] Updated weights for policy 1, policy_version 21760 (0.0010) [2023-10-14 01:53:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44367872. Throughput: 0: 1759.2, 1: 1772.0. Samples: 11097080. Policy #0 lag: (min: 10.0, avg: 10.3, max: 20.0) [2023-10-14 01:53:54,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 01:53:57,344][33201] Updated weights for policy 0, policy_version 21570 (0.0009) [2023-10-14 01:53:57,719][33201] Updated weights for policy 0, policy_version 21580 (0.0008) [2023-10-14 01:53:57,825][33226] Updated weights for policy 1, policy_version 21770 (0.0009) [2023-10-14 01:53:58,088][33201] Updated weights for policy 0, policy_version 21590 (0.0009) [2023-10-14 01:53:58,200][33226] Updated weights for policy 1, policy_version 21780 (0.0007) [2023-10-14 01:53:58,451][33201] Updated weights for policy 0, policy_version 21600 (0.0009) [2023-10-14 01:53:58,561][33226] Updated weights for policy 1, policy_version 21790 (0.0007) [2023-10-14 01:53:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44433408. Throughput: 0: 1785.6, 1: 1788.8. Samples: 11109202. 
Policy #0 lag: (min: 10.0, avg: 10.3, max: 20.0) [2023-10-14 01:53:59,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 01:54:02,240][33226] Updated weights for policy 1, policy_version 21800 (0.0007) [2023-10-14 01:54:02,349][33201] Updated weights for policy 0, policy_version 21610 (0.0008) [2023-10-14 01:54:02,615][33226] Updated weights for policy 1, policy_version 21810 (0.0007) [2023-10-14 01:54:02,725][33201] Updated weights for policy 0, policy_version 21620 (0.0008) [2023-10-14 01:54:02,984][33226] Updated weights for policy 1, policy_version 21820 (0.0009) [2023-10-14 01:54:03,100][33201] Updated weights for policy 0, policy_version 21630 (0.0008) [2023-10-14 01:54:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44498944. Throughput: 0: 1752.8, 1: 1777.3. Samples: 11128908. Policy #0 lag: (min: 10.0, avg: 10.3, max: 20.0) [2023-10-14 01:54:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.890')] [2023-10-14 01:54:06,889][33226] Updated weights for policy 1, policy_version 21830 (0.0008) [2023-10-14 01:54:07,005][33201] Updated weights for policy 0, policy_version 21640 (0.0007) [2023-10-14 01:54:07,254][33226] Updated weights for policy 1, policy_version 21840 (0.0009) [2023-10-14 01:54:07,378][33201] Updated weights for policy 0, policy_version 21650 (0.0008) [2023-10-14 01:54:07,621][33226] Updated weights for policy 1, policy_version 21850 (0.0009) [2023-10-14 01:54:07,744][33201] Updated weights for policy 0, policy_version 21660 (0.0008) [2023-10-14 01:54:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 44564480. Throughput: 0: 1747.5, 1: 1769.7. Samples: 11150362. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:54:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 01:54:11,343][33226] Updated weights for policy 1, policy_version 21860 (0.0009) [2023-10-14 01:54:11,577][33201] Updated weights for policy 0, policy_version 21670 (0.0009) [2023-10-14 01:54:11,715][33226] Updated weights for policy 1, policy_version 21870 (0.0008) [2023-10-14 01:54:11,952][33201] Updated weights for policy 0, policy_version 21680 (0.0008) [2023-10-14 01:54:12,087][33226] Updated weights for policy 1, policy_version 21880 (0.0008) [2023-10-14 01:54:12,322][33201] Updated weights for policy 0, policy_version 21690 (0.0007) [2023-10-14 01:54:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 44630016. Throughput: 0: 1754.6, 1: 1788.0. Samples: 11161078. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:54:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.930')] [2023-10-14 01:54:15,862][33226] Updated weights for policy 1, policy_version 21890 (0.0008) [2023-10-14 01:54:16,196][33201] Updated weights for policy 0, policy_version 21700 (0.0008) [2023-10-14 01:54:16,224][33226] Updated weights for policy 1, policy_version 21900 (0.0008) [2023-10-14 01:54:16,568][33201] Updated weights for policy 0, policy_version 21710 (0.0007) [2023-10-14 01:54:16,593][33226] Updated weights for policy 1, policy_version 21910 (0.0008) [2023-10-14 01:54:16,940][33201] Updated weights for policy 0, policy_version 21720 (0.0007) [2023-10-14 01:54:16,954][33226] Updated weights for policy 1, policy_version 21920 (0.0008) [2023-10-14 01:54:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 44695552. Throughput: 0: 1735.8, 1: 1777.9. Samples: 11182050. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:54:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.940')] [2023-10-14 01:54:20,686][33201] Updated weights for policy 0, policy_version 21730 (0.0008) [2023-10-14 01:54:20,754][33226] Updated weights for policy 1, policy_version 21930 (0.0008) [2023-10-14 01:54:21,057][33201] Updated weights for policy 0, policy_version 21740 (0.0008) [2023-10-14 01:54:21,125][33226] Updated weights for policy 1, policy_version 21940 (0.0008) [2023-10-14 01:54:21,426][33201] Updated weights for policy 0, policy_version 21750 (0.0008) [2023-10-14 01:54:21,492][33226] Updated weights for policy 1, policy_version 21950 (0.0007) [2023-10-14 01:54:21,792][33201] Updated weights for policy 0, policy_version 21760 (0.0008) [2023-10-14 01:54:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 44761088. Throughput: 0: 1756.7, 1: 1776.2. Samples: 11204086. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 01:54:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 01:54:25,465][33226] Updated weights for policy 1, policy_version 21960 (0.0009) [2023-10-14 01:54:25,528][33201] Updated weights for policy 0, policy_version 21770 (0.0007) [2023-10-14 01:54:25,836][33226] Updated weights for policy 1, policy_version 21970 (0.0009) [2023-10-14 01:54:25,890][33201] Updated weights for policy 0, policy_version 21780 (0.0007) [2023-10-14 01:54:26,200][33226] Updated weights for policy 1, policy_version 21980 (0.0010) [2023-10-14 01:54:26,267][33201] Updated weights for policy 0, policy_version 21790 (0.0008) [2023-10-14 01:54:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 44826624. Throughput: 0: 1735.6, 1: 1770.4. Samples: 11213376. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:54:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 01:54:30,014][33226] Updated weights for policy 1, policy_version 21990 (0.0010) [2023-10-14 01:54:30,180][33201] Updated weights for policy 0, policy_version 21800 (0.0007) [2023-10-14 01:54:30,386][33226] Updated weights for policy 1, policy_version 22000 (0.0008) [2023-10-14 01:54:30,540][33201] Updated weights for policy 0, policy_version 21810 (0.0007) [2023-10-14 01:54:30,750][33226] Updated weights for policy 1, policy_version 22010 (0.0007) [2023-10-14 01:54:30,911][33201] Updated weights for policy 0, policy_version 21820 (0.0007) [2023-10-14 01:54:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 44892160. Throughput: 0: 1737.9, 1: 1768.8. Samples: 11234986. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:54:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 01:54:34,603][33226] Updated weights for policy 1, policy_version 22020 (0.0008) [2023-10-14 01:54:34,864][33201] Updated weights for policy 0, policy_version 21830 (0.0008) [2023-10-14 01:54:34,972][33226] Updated weights for policy 1, policy_version 22030 (0.0008) [2023-10-14 01:54:35,239][33201] Updated weights for policy 0, policy_version 21840 (0.0008) [2023-10-14 01:54:35,334][33226] Updated weights for policy 1, policy_version 22040 (0.0007) [2023-10-14 01:54:35,605][33201] Updated weights for policy 0, policy_version 21850 (0.0007) [2023-10-14 01:54:39,344][33226] Updated weights for policy 1, policy_version 22050 (0.0009) [2023-10-14 01:54:39,526][33201] Updated weights for policy 0, policy_version 21860 (0.0007) [2023-10-14 01:54:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 44957696. Throughput: 0: 1764.6, 1: 1786.2. Samples: 11256864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:54:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 01:54:39,710][33226] Updated weights for policy 1, policy_version 22060 (0.0007) [2023-10-14 01:54:39,933][33201] Updated weights for policy 0, policy_version 21870 (0.0008) [2023-10-14 01:54:40,082][33226] Updated weights for policy 1, policy_version 22070 (0.0008) [2023-10-14 01:54:40,306][33201] Updated weights for policy 0, policy_version 21880 (0.0008) [2023-10-14 01:54:40,452][33226] Updated weights for policy 1, policy_version 22080 (0.0008) [2023-10-14 01:54:44,226][33226] Updated weights for policy 1, policy_version 22090 (0.0009) [2023-10-14 01:54:44,324][33201] Updated weights for policy 0, policy_version 21890 (0.0007) [2023-10-14 01:54:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 45023232. Throughput: 0: 1731.8, 1: 1757.6. Samples: 11266228. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:54:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 01:54:44,592][33226] Updated weights for policy 1, policy_version 22100 (0.0009) [2023-10-14 01:54:44,696][33201] Updated weights for policy 0, policy_version 21900 (0.0007) [2023-10-14 01:54:44,957][33226] Updated weights for policy 1, policy_version 22110 (0.0009) [2023-10-14 01:54:45,074][33201] Updated weights for policy 0, policy_version 21910 (0.0007) [2023-10-14 01:54:45,446][33201] Updated weights for policy 0, policy_version 21920 (0.0009) [2023-10-14 01:54:48,812][33226] Updated weights for policy 1, policy_version 22120 (0.0007) [2023-10-14 01:54:49,183][33226] Updated weights for policy 1, policy_version 22130 (0.0010) [2023-10-14 01:54:49,240][33201] Updated weights for policy 0, policy_version 21930 (0.0009) [2023-10-14 01:54:49,551][33226] Updated weights for policy 1, policy_version 22140 (0.0008) [2023-10-14 01:54:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 
13107.2, 300 sec: 13995.8). Total num frames: 45088768. Throughput: 0: 1760.8, 1: 1782.8. Samples: 11288366. Policy #0 lag: (min: 6.0, avg: 12.9, max: 38.0) [2023-10-14 01:54:49,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 01:54:49,612][33201] Updated weights for policy 0, policy_version 21940 (0.0009) [2023-10-14 01:54:49,992][33201] Updated weights for policy 0, policy_version 21950 (0.0008) [2023-10-14 01:54:53,328][33226] Updated weights for policy 1, policy_version 22150 (0.0009) [2023-10-14 01:54:53,705][33226] Updated weights for policy 1, policy_version 22160 (0.0008) [2023-10-14 01:54:53,884][33201] Updated weights for policy 0, policy_version 21960 (0.0007) [2023-10-14 01:54:54,076][33226] Updated weights for policy 1, policy_version 22170 (0.0009) [2023-10-14 01:54:54,257][33201] Updated weights for policy 0, policy_version 21970 (0.0007) [2023-10-14 01:54:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 45187072. Throughput: 0: 1754.8, 1: 1765.2. Samples: 11308764. Policy #0 lag: (min: 6.0, avg: 12.9, max: 38.0) [2023-10-14 01:54:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 01:54:54,566][32895] Saving new best policy, reward=20.980! 
[2023-10-14 01:54:54,622][33201] Updated weights for policy 0, policy_version 21980 (0.0007) [2023-10-14 01:54:58,015][33226] Updated weights for policy 1, policy_version 22180 (0.0009) [2023-10-14 01:54:58,420][33226] Updated weights for policy 1, policy_version 22190 (0.0009) [2023-10-14 01:54:58,506][33201] Updated weights for policy 0, policy_version 21990 (0.0008) [2023-10-14 01:54:58,791][33226] Updated weights for policy 1, policy_version 22200 (0.0009) [2023-10-14 01:54:58,880][33201] Updated weights for policy 0, policy_version 22000 (0.0007) [2023-10-14 01:54:59,254][33201] Updated weights for policy 0, policy_version 22010 (0.0008) [2023-10-14 01:54:59,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 45285376. Throughput: 0: 1755.5, 1: 1772.8. Samples: 11319848. Policy #0 lag: (min: 6.0, avg: 12.9, max: 38.0) [2023-10-14 01:54:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 01:55:02,655][33226] Updated weights for policy 1, policy_version 22210 (0.0008) [2023-10-14 01:55:02,921][33201] Updated weights for policy 0, policy_version 22020 (0.0009) [2023-10-14 01:55:03,023][33226] Updated weights for policy 1, policy_version 22220 (0.0009) [2023-10-14 01:55:03,293][33201] Updated weights for policy 0, policy_version 22030 (0.0009) [2023-10-14 01:55:03,384][33226] Updated weights for policy 1, policy_version 22230 (0.0008) [2023-10-14 01:55:03,666][33201] Updated weights for policy 0, policy_version 22040 (0.0007) [2023-10-14 01:55:03,748][33226] Updated weights for policy 1, policy_version 22240 (0.0008) [2023-10-14 01:55:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 45350912. Throughput: 0: 1765.8, 1: 1763.8. Samples: 11340882. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:55:04,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 01:55:07,524][33201] Updated weights for policy 0, policy_version 22050 (0.0008) [2023-10-14 01:55:07,690][33226] Updated weights for policy 1, policy_version 22250 (0.0008) [2023-10-14 01:55:07,901][33201] Updated weights for policy 0, policy_version 22060 (0.0008) [2023-10-14 01:55:08,064][33226] Updated weights for policy 1, policy_version 22260 (0.0009) [2023-10-14 01:55:08,270][33201] Updated weights for policy 0, policy_version 22070 (0.0007) [2023-10-14 01:55:08,427][33226] Updated weights for policy 1, policy_version 22270 (0.0007) [2023-10-14 01:55:08,637][33201] Updated weights for policy 0, policy_version 22080 (0.0007) [2023-10-14 01:55:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 45416448. Throughput: 0: 1739.8, 1: 1741.9. Samples: 11360762. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:55:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 01:55:12,246][33226] Updated weights for policy 1, policy_version 22280 (0.0008) [2023-10-14 01:55:12,331][33201] Updated weights for policy 0, policy_version 22090 (0.0007) [2023-10-14 01:55:12,617][33226] Updated weights for policy 1, policy_version 22290 (0.0008) [2023-10-14 01:55:12,705][33201] Updated weights for policy 0, policy_version 22100 (0.0007) [2023-10-14 01:55:12,983][33226] Updated weights for policy 1, policy_version 22300 (0.0008) [2023-10-14 01:55:13,063][33201] Updated weights for policy 0, policy_version 22110 (0.0007) [2023-10-14 01:55:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 45481984. Throughput: 0: 1767.9, 1: 1775.1. Samples: 11372808. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:55:14,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 01:55:16,668][33226] Updated weights for policy 1, policy_version 22310 (0.0008) [2023-10-14 01:55:16,811][33201] Updated weights for policy 0, policy_version 22120 (0.0007) [2023-10-14 01:55:17,034][33226] Updated weights for policy 1, policy_version 22320 (0.0007) [2023-10-14 01:55:17,188][33201] Updated weights for policy 0, policy_version 22130 (0.0008) [2023-10-14 01:55:17,401][33226] Updated weights for policy 1, policy_version 22330 (0.0008) [2023-10-14 01:55:17,554][33201] Updated weights for policy 0, policy_version 22140 (0.0008) [2023-10-14 01:55:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 45547520. Throughput: 0: 1750.2, 1: 1747.5. Samples: 11392382. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 01:55:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 01:55:21,328][33201] Updated weights for policy 0, policy_version 22150 (0.0009) [2023-10-14 01:55:21,354][33226] Updated weights for policy 1, policy_version 22340 (0.0008) [2023-10-14 01:55:21,683][33201] Updated weights for policy 0, policy_version 22160 (0.0008) [2023-10-14 01:55:21,718][33226] Updated weights for policy 1, policy_version 22350 (0.0008) [2023-10-14 01:55:22,055][33201] Updated weights for policy 0, policy_version 22170 (0.0009) [2023-10-14 01:55:22,086][33226] Updated weights for policy 1, policy_version 22360 (0.0007) [2023-10-14 01:55:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 45613056. Throughput: 0: 1752.0, 1: 1746.7. Samples: 11414306. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 01:55:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000022368_22904832.pth... 
[2023-10-14 01:55:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000022176_22708224.pth... [2023-10-14 01:55:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000020704_21200896.pth [2023-10-14 01:55:24,609][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000020544_21037056.pth [2023-10-14 01:55:26,003][33201] Updated weights for policy 0, policy_version 22180 (0.0008) [2023-10-14 01:55:26,117][33226] Updated weights for policy 1, policy_version 22370 (0.0009) [2023-10-14 01:55:26,382][33201] Updated weights for policy 0, policy_version 22190 (0.0007) [2023-10-14 01:55:26,483][33226] Updated weights for policy 1, policy_version 22380 (0.0007) [2023-10-14 01:55:26,753][33201] Updated weights for policy 0, policy_version 22200 (0.0007) [2023-10-14 01:55:26,853][33226] Updated weights for policy 1, policy_version 22390 (0.0009) [2023-10-14 01:55:27,221][33226] Updated weights for policy 1, policy_version 22400 (0.0008) [2023-10-14 01:55:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 45678592. Throughput: 0: 1757.1, 1: 1748.9. Samples: 11423996. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 01:55:30,542][33201] Updated weights for policy 0, policy_version 22210 (0.0008) [2023-10-14 01:55:30,908][33201] Updated weights for policy 0, policy_version 22220 (0.0008) [2023-10-14 01:55:31,055][33226] Updated weights for policy 1, policy_version 22410 (0.0007) [2023-10-14 01:55:31,272][33201] Updated weights for policy 0, policy_version 22230 (0.0009) [2023-10-14 01:55:31,418][33226] Updated weights for policy 1, policy_version 22420 (0.0009) [2023-10-14 01:55:31,642][33201] Updated weights for policy 0, policy_version 22240 (0.0008) [2023-10-14 01:55:31,783][33226] Updated weights for policy 1, policy_version 22430 (0.0010) [2023-10-14 01:55:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 45744128. Throughput: 0: 1758.8, 1: 1737.4. Samples: 11445694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 01:55:35,417][33226] Updated weights for policy 1, policy_version 22440 (0.0008) [2023-10-14 01:55:35,564][33201] Updated weights for policy 0, policy_version 22250 (0.0008) [2023-10-14 01:55:35,776][33226] Updated weights for policy 1, policy_version 22450 (0.0008) [2023-10-14 01:55:35,930][33201] Updated weights for policy 0, policy_version 22260 (0.0007) [2023-10-14 01:55:36,143][33226] Updated weights for policy 1, policy_version 22460 (0.0009) [2023-10-14 01:55:36,310][33201] Updated weights for policy 0, policy_version 22270 (0.0009) [2023-10-14 01:55:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 45809664. Throughput: 0: 1769.2, 1: 1765.1. Samples: 11467806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 01:55:39,943][33226] Updated weights for policy 1, policy_version 22470 (0.0008) [2023-10-14 01:55:40,248][33201] Updated weights for policy 0, policy_version 22280 (0.0008) [2023-10-14 01:55:40,306][33226] Updated weights for policy 1, policy_version 22480 (0.0009) [2023-10-14 01:55:40,615][33201] Updated weights for policy 0, policy_version 22290 (0.0007) [2023-10-14 01:55:40,669][33226] Updated weights for policy 1, policy_version 22490 (0.0009) [2023-10-14 01:55:40,983][33201] Updated weights for policy 0, policy_version 22300 (0.0007) [2023-10-14 01:55:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 45875200. Throughput: 0: 1755.5, 1: 1741.5. Samples: 11477212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 01:55:44,561][33226] Updated weights for policy 1, policy_version 22500 (0.0009) [2023-10-14 01:55:44,599][33201] Updated weights for policy 0, policy_version 22310 (0.0008) [2023-10-14 01:55:44,958][33226] Updated weights for policy 1, policy_version 22510 (0.0010) [2023-10-14 01:55:44,966][33201] Updated weights for policy 0, policy_version 22320 (0.0009) [2023-10-14 01:55:45,315][33226] Updated weights for policy 1, policy_version 22520 (0.0007) [2023-10-14 01:55:45,340][33201] Updated weights for policy 0, policy_version 22330 (0.0007) [2023-10-14 01:55:49,138][33201] Updated weights for policy 0, policy_version 22340 (0.0009) [2023-10-14 01:55:49,140][33226] Updated weights for policy 1, policy_version 22530 (0.0008) [2023-10-14 01:55:49,505][33201] Updated weights for policy 0, policy_version 22350 (0.0007) [2023-10-14 01:55:49,510][33226] Updated weights for policy 1, policy_version 22540 (0.0007) [2023-10-14 01:55:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 
14199.4, 300 sec: 14106.9). Total num frames: 45940736. Throughput: 0: 1765.7, 1: 1755.1. Samples: 11499318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:49,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.940')] [2023-10-14 01:55:49,870][33201] Updated weights for policy 0, policy_version 22360 (0.0008) [2023-10-14 01:55:49,885][33226] Updated weights for policy 1, policy_version 22550 (0.0007) [2023-10-14 01:55:50,250][33226] Updated weights for policy 1, policy_version 22560 (0.0007) [2023-10-14 01:55:53,691][33201] Updated weights for policy 0, policy_version 22370 (0.0008) [2023-10-14 01:55:53,944][33226] Updated weights for policy 1, policy_version 22570 (0.0007) [2023-10-14 01:55:54,056][33201] Updated weights for policy 0, policy_version 22380 (0.0008) [2023-10-14 01:55:54,304][33226] Updated weights for policy 1, policy_version 22580 (0.0008) [2023-10-14 01:55:54,428][33201] Updated weights for policy 0, policy_version 22390 (0.0009) [2023-10-14 01:55:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 46006272. Throughput: 0: 1776.3, 1: 1774.2. Samples: 11520534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:55:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 01:55:54,670][33226] Updated weights for policy 1, policy_version 22590 (0.0008) [2023-10-14 01:55:54,800][33201] Updated weights for policy 0, policy_version 22400 (0.0007) [2023-10-14 01:55:58,444][33226] Updated weights for policy 1, policy_version 22600 (0.0010) [2023-10-14 01:55:58,723][33201] Updated weights for policy 0, policy_version 22410 (0.0007) [2023-10-14 01:55:58,815][33226] Updated weights for policy 1, policy_version 22610 (0.0008) [2023-10-14 01:55:59,106][33201] Updated weights for policy 0, policy_version 22420 (0.0008) [2023-10-14 01:55:59,187][33226] Updated weights for policy 1, policy_version 22620 (0.0007) [2023-10-14 01:55:59,473][33201] Updated weights for policy 0, policy_version 22430 (0.0009) [2023-10-14 01:55:59,557][31953] Fps is (10 sec: 19661.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 46137344. Throughput: 0: 1756.7, 1: 1761.7. Samples: 11531136. Policy #0 lag: (min: 23.0, avg: 29.8, max: 55.0) [2023-10-14 01:55:59,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 01:56:02,903][33226] Updated weights for policy 1, policy_version 22630 (0.0007) [2023-10-14 01:56:03,266][33226] Updated weights for policy 1, policy_version 22640 (0.0007) [2023-10-14 01:56:03,293][33201] Updated weights for policy 0, policy_version 22440 (0.0007) [2023-10-14 01:56:03,633][33226] Updated weights for policy 1, policy_version 22650 (0.0007) [2023-10-14 01:56:03,667][33201] Updated weights for policy 0, policy_version 22450 (0.0007) [2023-10-14 01:56:04,025][33201] Updated weights for policy 0, policy_version 22460 (0.0008) [2023-10-14 01:56:04,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 46202880. Throughput: 0: 1780.8, 1: 1785.4. Samples: 11552860. 
Policy #0 lag: (min: 23.0, avg: 29.8, max: 55.0) [2023-10-14 01:56:04,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 01:56:07,376][33226] Updated weights for policy 1, policy_version 22660 (0.0007) [2023-10-14 01:56:07,744][33226] Updated weights for policy 1, policy_version 22670 (0.0008) [2023-10-14 01:56:07,839][33201] Updated weights for policy 0, policy_version 22470 (0.0009) [2023-10-14 01:56:08,102][33226] Updated weights for policy 1, policy_version 22680 (0.0008) [2023-10-14 01:56:08,214][33201] Updated weights for policy 0, policy_version 22480 (0.0009) [2023-10-14 01:56:08,584][33201] Updated weights for policy 0, policy_version 22490 (0.0009) [2023-10-14 01:56:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 46268416. Throughput: 0: 1747.1, 1: 1771.2. Samples: 11572628. Policy #0 lag: (min: 23.0, avg: 29.8, max: 55.0) [2023-10-14 01:56:09,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 01:56:11,899][33226] Updated weights for policy 1, policy_version 22690 (0.0009) [2023-10-14 01:56:12,268][33226] Updated weights for policy 1, policy_version 22700 (0.0012) [2023-10-14 01:56:12,593][33201] Updated weights for policy 0, policy_version 22500 (0.0009) [2023-10-14 01:56:12,638][33226] Updated weights for policy 1, policy_version 22710 (0.0008) [2023-10-14 01:56:12,989][33201] Updated weights for policy 0, policy_version 22510 (0.0008) [2023-10-14 01:56:13,007][33226] Updated weights for policy 1, policy_version 22720 (0.0007) [2023-10-14 01:56:13,356][33201] Updated weights for policy 0, policy_version 22520 (0.0009) [2023-10-14 01:56:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 46333952. Throughput: 0: 1776.8, 1: 1793.0. Samples: 11584638. 
Policy #0 lag: (min: 23.0, avg: 29.8, max: 55.0) [2023-10-14 01:56:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 01:56:16,823][33226] Updated weights for policy 1, policy_version 22730 (0.0008) [2023-10-14 01:56:17,192][33226] Updated weights for policy 1, policy_version 22740 (0.0008) [2023-10-14 01:56:17,426][33201] Updated weights for policy 0, policy_version 22530 (0.0007) [2023-10-14 01:56:17,549][33226] Updated weights for policy 1, policy_version 22750 (0.0008) [2023-10-14 01:56:17,800][33201] Updated weights for policy 0, policy_version 22540 (0.0009) [2023-10-14 01:56:18,171][33201] Updated weights for policy 0, policy_version 22550 (0.0008) [2023-10-14 01:56:18,540][33201] Updated weights for policy 0, policy_version 22560 (0.0009) [2023-10-14 01:56:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 46399488. Throughput: 0: 1745.5, 1: 1776.6. Samples: 11604186. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:19,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 01:56:21,327][33226] Updated weights for policy 1, policy_version 22760 (0.0008) [2023-10-14 01:56:21,698][33226] Updated weights for policy 1, policy_version 22770 (0.0010) [2023-10-14 01:56:22,065][33226] Updated weights for policy 1, policy_version 22780 (0.0010) [2023-10-14 01:56:22,380][33201] Updated weights for policy 0, policy_version 22570 (0.0007) [2023-10-14 01:56:22,752][33201] Updated weights for policy 0, policy_version 22580 (0.0008) [2023-10-14 01:56:23,115][33201] Updated weights for policy 0, policy_version 22590 (0.0010) [2023-10-14 01:56:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 46465024. Throughput: 0: 1741.4, 1: 1779.1. Samples: 11626230. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:24,559][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 01:56:25,799][33226] Updated weights for policy 1, policy_version 22790 (0.0007) [2023-10-14 01:56:26,165][33226] Updated weights for policy 1, policy_version 22800 (0.0010) [2023-10-14 01:56:26,531][33226] Updated weights for policy 1, policy_version 22810 (0.0008) [2023-10-14 01:56:26,786][33201] Updated weights for policy 0, policy_version 22600 (0.0008) [2023-10-14 01:56:27,165][33201] Updated weights for policy 0, policy_version 22610 (0.0009) [2023-10-14 01:56:27,533][33201] Updated weights for policy 0, policy_version 22620 (0.0008) [2023-10-14 01:56:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 46530560. Throughput: 0: 1762.4, 1: 1780.4. Samples: 11636642. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:29,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 01:56:30,306][33226] Updated weights for policy 1, policy_version 22820 (0.0009) [2023-10-14 01:56:30,678][33226] Updated weights for policy 1, policy_version 22830 (0.0008) [2023-10-14 01:56:31,057][33226] Updated weights for policy 1, policy_version 22840 (0.0008) [2023-10-14 01:56:31,339][33201] Updated weights for policy 0, policy_version 22630 (0.0009) [2023-10-14 01:56:31,712][33201] Updated weights for policy 0, policy_version 22640 (0.0008) [2023-10-14 01:56:32,079][33201] Updated weights for policy 0, policy_version 22650 (0.0008) [2023-10-14 01:56:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 46596096. Throughput: 0: 1746.7, 1: 1789.7. Samples: 11658458. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 01:56:34,863][33226] Updated weights for policy 1, policy_version 22850 (0.0008) [2023-10-14 01:56:35,279][33226] Updated weights for policy 1, policy_version 22860 (0.0008) [2023-10-14 01:56:35,645][33226] Updated weights for policy 1, policy_version 22870 (0.0009) [2023-10-14 01:56:35,923][33201] Updated weights for policy 0, policy_version 22660 (0.0008) [2023-10-14 01:56:36,010][33226] Updated weights for policy 1, policy_version 22880 (0.0009) [2023-10-14 01:56:36,289][33201] Updated weights for policy 0, policy_version 22670 (0.0009) [2023-10-14 01:56:36,669][33201] Updated weights for policy 0, policy_version 22680 (0.0009) [2023-10-14 01:56:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 46661632. Throughput: 0: 1763.3, 1: 1793.6. Samples: 11680592. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 01:56:39,607][33226] Updated weights for policy 1, policy_version 22890 (0.0010) [2023-10-14 01:56:39,988][33226] Updated weights for policy 1, policy_version 22900 (0.0010) [2023-10-14 01:56:40,339][33201] Updated weights for policy 0, policy_version 22690 (0.0008) [2023-10-14 01:56:40,359][33226] Updated weights for policy 1, policy_version 22910 (0.0008) [2023-10-14 01:56:40,706][33201] Updated weights for policy 0, policy_version 22700 (0.0010) [2023-10-14 01:56:41,077][33201] Updated weights for policy 0, policy_version 22710 (0.0007) [2023-10-14 01:56:41,446][33201] Updated weights for policy 0, policy_version 22720 (0.0008) [2023-10-14 01:56:44,141][33226] Updated weights for policy 1, policy_version 22920 (0.0009) [2023-10-14 01:56:44,511][33226] Updated weights for policy 1, policy_version 22930 (0.0007) [2023-10-14 01:56:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 
14199.5, 300 sec: 14106.9). Total num frames: 46727168. Throughput: 0: 1754.7, 1: 1781.0. Samples: 11690244. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 01:56:44,879][33226] Updated weights for policy 1, policy_version 22940 (0.0008) [2023-10-14 01:56:45,480][33201] Updated weights for policy 0, policy_version 22730 (0.0008) [2023-10-14 01:56:45,852][33201] Updated weights for policy 0, policy_version 22740 (0.0008) [2023-10-14 01:56:46,222][33201] Updated weights for policy 0, policy_version 22750 (0.0009) [2023-10-14 01:56:48,673][33226] Updated weights for policy 1, policy_version 22950 (0.0008) [2023-10-14 01:56:49,052][33226] Updated weights for policy 1, policy_version 22960 (0.0009) [2023-10-14 01:56:49,416][33226] Updated weights for policy 1, policy_version 22970 (0.0008) [2023-10-14 01:56:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 46792704. Throughput: 0: 1749.8, 1: 1786.5. Samples: 11711994. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:49,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.920')] [2023-10-14 01:56:50,024][33201] Updated weights for policy 0, policy_version 22760 (0.0007) [2023-10-14 01:56:50,394][33201] Updated weights for policy 0, policy_version 22770 (0.0010) [2023-10-14 01:56:50,771][33201] Updated weights for policy 0, policy_version 22780 (0.0010) [2023-10-14 01:56:53,276][33226] Updated weights for policy 1, policy_version 22980 (0.0009) [2023-10-14 01:56:53,643][33226] Updated weights for policy 1, policy_version 22990 (0.0008) [2023-10-14 01:56:54,012][33226] Updated weights for policy 1, policy_version 23000 (0.0009) [2023-10-14 01:56:54,502][33201] Updated weights for policy 0, policy_version 22790 (0.0008) [2023-10-14 01:56:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 46891008. 
Throughput: 0: 1784.5, 1: 1787.0. Samples: 11733344. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 01:56:54,874][33201] Updated weights for policy 0, policy_version 22800 (0.0009) [2023-10-14 01:56:55,247][33201] Updated weights for policy 0, policy_version 22810 (0.0010) [2023-10-14 01:56:57,714][33226] Updated weights for policy 1, policy_version 23010 (0.0009) [2023-10-14 01:56:58,074][33226] Updated weights for policy 1, policy_version 23020 (0.0009) [2023-10-14 01:56:58,445][33226] Updated weights for policy 1, policy_version 23030 (0.0009) [2023-10-14 01:56:58,803][33226] Updated weights for policy 1, policy_version 23040 (0.0009) [2023-10-14 01:56:59,071][33201] Updated weights for policy 0, policy_version 22820 (0.0009) [2023-10-14 01:56:59,469][33201] Updated weights for policy 0, policy_version 22830 (0.0009) [2023-10-14 01:56:59,557][31953] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 46956544. Throughput: 0: 1754.5, 1: 1785.6. Samples: 11743938. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:56:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 01:56:59,829][33201] Updated weights for policy 0, policy_version 22840 (0.0008) [2023-10-14 01:57:02,545][33226] Updated weights for policy 1, policy_version 23050 (0.0008) [2023-10-14 01:57:02,911][33226] Updated weights for policy 1, policy_version 23060 (0.0011) [2023-10-14 01:57:03,285][33226] Updated weights for policy 1, policy_version 23070 (0.0011) [2023-10-14 01:57:03,618][33201] Updated weights for policy 0, policy_version 22850 (0.0007) [2023-10-14 01:57:03,983][33201] Updated weights for policy 0, policy_version 22860 (0.0007) [2023-10-14 01:57:04,364][33201] Updated weights for policy 0, policy_version 22870 (0.0007) [2023-10-14 01:57:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). 
Total num frames: 47022080. Throughput: 0: 1785.7, 1: 1797.2. Samples: 11765416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:04,559][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 01:57:04,733][33201] Updated weights for policy 0, policy_version 22880 (0.0008) [2023-10-14 01:57:07,223][33226] Updated weights for policy 1, policy_version 23080 (0.0009) [2023-10-14 01:57:07,592][33226] Updated weights for policy 1, policy_version 23090 (0.0008) [2023-10-14 01:57:07,955][33226] Updated weights for policy 1, policy_version 23100 (0.0007) [2023-10-14 01:57:08,449][33201] Updated weights for policy 0, policy_version 22890 (0.0009) [2023-10-14 01:57:08,815][33201] Updated weights for policy 0, policy_version 22900 (0.0007) [2023-10-14 01:57:09,184][33201] Updated weights for policy 0, policy_version 22910 (0.0007) [2023-10-14 01:57:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 47120384. Throughput: 0: 1769.6, 1: 1781.4. Samples: 11786024. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:09,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.900')] [2023-10-14 01:57:11,661][33226] Updated weights for policy 1, policy_version 23110 (0.0008) [2023-10-14 01:57:12,032][33226] Updated weights for policy 1, policy_version 23120 (0.0010) [2023-10-14 01:57:12,401][33226] Updated weights for policy 1, policy_version 23130 (0.0007) [2023-10-14 01:57:12,891][33201] Updated weights for policy 0, policy_version 22920 (0.0007) [2023-10-14 01:57:13,253][33201] Updated weights for policy 0, policy_version 22930 (0.0007) [2023-10-14 01:57:13,625][33201] Updated weights for policy 0, policy_version 22940 (0.0010) [2023-10-14 01:57:14,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 47185920. Throughput: 0: 1777.8, 1: 1796.3. Samples: 11797476. 
Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) [2023-10-14 01:57:14,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.900')] [2023-10-14 01:57:16,121][33226] Updated weights for policy 1, policy_version 23140 (0.0009) [2023-10-14 01:57:16,483][33226] Updated weights for policy 1, policy_version 23150 (0.0009) [2023-10-14 01:57:16,853][33226] Updated weights for policy 1, policy_version 23160 (0.0008) [2023-10-14 01:57:17,604][33201] Updated weights for policy 0, policy_version 22950 (0.0011) [2023-10-14 01:57:17,964][33201] Updated weights for policy 0, policy_version 22960 (0.0009) [2023-10-14 01:57:18,323][33201] Updated weights for policy 0, policy_version 22970 (0.0010) [2023-10-14 01:57:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 47251456. Throughput: 0: 1766.8, 1: 1774.1. Samples: 11817796. Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) [2023-10-14 01:57:19,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 01:57:20,721][33226] Updated weights for policy 1, policy_version 23170 (0.0007) [2023-10-14 01:57:21,109][33226] Updated weights for policy 1, policy_version 23180 (0.0011) [2023-10-14 01:57:21,469][33226] Updated weights for policy 1, policy_version 23190 (0.0010) [2023-10-14 01:57:21,845][33226] Updated weights for policy 1, policy_version 23200 (0.0009) [2023-10-14 01:57:22,247][33201] Updated weights for policy 0, policy_version 22980 (0.0009) [2023-10-14 01:57:22,622][33201] Updated weights for policy 0, policy_version 22990 (0.0008) [2023-10-14 01:57:22,999][33201] Updated weights for policy 0, policy_version 23000 (0.0008) [2023-10-14 01:57:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 47316992. Throughput: 0: 1747.6, 1: 1776.6. Samples: 11839178. 
Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) [2023-10-14 01:57:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 01:57:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000023200_23756800.pth... [2023-10-14 01:57:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000023008_23560192.pth... [2023-10-14 01:57:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000021536_22052864.pth [2023-10-14 01:57:24,606][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000021376_21889024.pth [2023-10-14 01:57:25,606][33226] Updated weights for policy 1, policy_version 23210 (0.0009) [2023-10-14 01:57:25,963][33226] Updated weights for policy 1, policy_version 23220 (0.0009) [2023-10-14 01:57:26,330][33226] Updated weights for policy 1, policy_version 23230 (0.0007) [2023-10-14 01:57:26,807][33201] Updated weights for policy 0, policy_version 23010 (0.0007) [2023-10-14 01:57:27,175][33201] Updated weights for policy 0, policy_version 23020 (0.0007) [2023-10-14 01:57:27,554][33201] Updated weights for policy 0, policy_version 23030 (0.0009) [2023-10-14 01:57:27,917][33201] Updated weights for policy 0, policy_version 23040 (0.0008) [2023-10-14 01:57:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 47382528. Throughput: 0: 1774.9, 1: 1776.4. Samples: 11850050. 
Policy #0 lag: (min: 13.0, avg: 17.4, max: 45.0) [2023-10-14 01:57:29,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 01:57:30,027][33226] Updated weights for policy 1, policy_version 23240 (0.0010) [2023-10-14 01:57:30,399][33226] Updated weights for policy 1, policy_version 23250 (0.0012) [2023-10-14 01:57:30,773][33226] Updated weights for policy 1, policy_version 23260 (0.0008) [2023-10-14 01:57:31,587][33201] Updated weights for policy 0, policy_version 23050 (0.0008) [2023-10-14 01:57:31,962][33201] Updated weights for policy 0, policy_version 23060 (0.0010) [2023-10-14 01:57:32,340][33201] Updated weights for policy 0, policy_version 23070 (0.0008) [2023-10-14 01:57:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 47448064. Throughput: 0: 1764.4, 1: 1780.4. Samples: 11871512. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:34,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 01:57:34,620][33226] Updated weights for policy 1, policy_version 23270 (0.0008) [2023-10-14 01:57:34,994][33226] Updated weights for policy 1, policy_version 23280 (0.0008) [2023-10-14 01:57:35,359][33226] Updated weights for policy 1, policy_version 23290 (0.0010) [2023-10-14 01:57:36,155][33201] Updated weights for policy 0, policy_version 23080 (0.0008) [2023-10-14 01:57:36,527][33201] Updated weights for policy 0, policy_version 23090 (0.0008) [2023-10-14 01:57:36,896][33201] Updated weights for policy 0, policy_version 23100 (0.0008) [2023-10-14 01:57:39,060][33226] Updated weights for policy 1, policy_version 23300 (0.0008) [2023-10-14 01:57:39,434][33226] Updated weights for policy 1, policy_version 23310 (0.0007) [2023-10-14 01:57:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 47513600. Throughput: 0: 1763.8, 1: 1801.9. Samples: 11893802. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 01:57:39,796][33226] Updated weights for policy 1, policy_version 23320 (0.0008) [2023-10-14 01:57:40,701][33201] Updated weights for policy 0, policy_version 23110 (0.0008) [2023-10-14 01:57:41,075][33201] Updated weights for policy 0, policy_version 23120 (0.0008) [2023-10-14 01:57:41,446][33201] Updated weights for policy 0, policy_version 23130 (0.0009) [2023-10-14 01:57:43,450][33226] Updated weights for policy 1, policy_version 23330 (0.0008) [2023-10-14 01:57:43,810][33226] Updated weights for policy 1, policy_version 23340 (0.0008) [2023-10-14 01:57:44,176][33226] Updated weights for policy 1, policy_version 23350 (0.0007) [2023-10-14 01:57:44,541][33226] Updated weights for policy 1, policy_version 23360 (0.0007) [2023-10-14 01:57:44,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 47611904. Throughput: 0: 1769.2, 1: 1782.7. Samples: 11903772. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:44,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.950')] [2023-10-14 01:57:45,338][33201] Updated weights for policy 0, policy_version 23140 (0.0007) [2023-10-14 01:57:45,736][33201] Updated weights for policy 0, policy_version 23150 (0.0007) [2023-10-14 01:57:46,102][33201] Updated weights for policy 0, policy_version 23160 (0.0009) [2023-10-14 01:57:48,546][33226] Updated weights for policy 1, policy_version 23370 (0.0009) [2023-10-14 01:57:48,921][33226] Updated weights for policy 1, policy_version 23380 (0.0009) [2023-10-14 01:57:49,292][33226] Updated weights for policy 1, policy_version 23390 (0.0010) [2023-10-14 01:57:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 47677440. Throughput: 0: 1762.5, 1: 1796.9. Samples: 11925590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:57:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.950')] [2023-10-14 01:57:49,706][33201] Updated weights for policy 0, policy_version 23170 (0.0010) [2023-10-14 01:57:50,078][33201] Updated weights for policy 0, policy_version 23180 (0.0008) [2023-10-14 01:57:50,459][33201] Updated weights for policy 0, policy_version 23190 (0.0008) [2023-10-14 01:57:50,822][33201] Updated weights for policy 0, policy_version 23200 (0.0008) [2023-10-14 01:57:53,138][33226] Updated weights for policy 1, policy_version 23400 (0.0008) [2023-10-14 01:57:53,509][33226] Updated weights for policy 1, policy_version 23410 (0.0008) [2023-10-14 01:57:53,869][33226] Updated weights for policy 1, policy_version 23420 (0.0009) [2023-10-14 01:57:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 47742976. Throughput: 0: 1794.4, 1: 1777.6. Samples: 11946768. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:57:54,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 01:57:54,589][33201] Updated weights for policy 0, policy_version 23210 (0.0010) [2023-10-14 01:57:54,954][33201] Updated weights for policy 0, policy_version 23220 (0.0008) [2023-10-14 01:57:55,316][33201] Updated weights for policy 0, policy_version 23230 (0.0007) [2023-10-14 01:57:57,478][33226] Updated weights for policy 1, policy_version 23430 (0.0009) [2023-10-14 01:57:57,849][33226] Updated weights for policy 1, policy_version 23440 (0.0007) [2023-10-14 01:57:58,215][33226] Updated weights for policy 1, policy_version 23450 (0.0009) [2023-10-14 01:57:59,156][33201] Updated weights for policy 0, policy_version 23240 (0.0007) [2023-10-14 01:57:59,521][33201] Updated weights for policy 0, policy_version 23250 (0.0008) [2023-10-14 01:57:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 47808512. Throughput: 0: 1768.2, 1: 1792.5. 
Samples: 11957706. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:57:59,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 01:57:59,889][33201] Updated weights for policy 0, policy_version 23260 (0.0012) [2023-10-14 01:58:01,927][33226] Updated weights for policy 1, policy_version 23460 (0.0009) [2023-10-14 01:58:02,296][33226] Updated weights for policy 1, policy_version 23470 (0.0008) [2023-10-14 01:58:02,659][33226] Updated weights for policy 1, policy_version 23480 (0.0008) [2023-10-14 01:58:04,016][33201] Updated weights for policy 0, policy_version 23270 (0.0009) [2023-10-14 01:58:04,390][33201] Updated weights for policy 0, policy_version 23280 (0.0009) [2023-10-14 01:58:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 47874048. Throughput: 0: 1788.8, 1: 1783.8. Samples: 11978562. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:58:04,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 01:58:04,769][33201] Updated weights for policy 0, policy_version 23290 (0.0008) [2023-10-14 01:58:06,476][33226] Updated weights for policy 1, policy_version 23490 (0.0008) [2023-10-14 01:58:06,894][33226] Updated weights for policy 1, policy_version 23500 (0.0008) [2023-10-14 01:58:07,257][33226] Updated weights for policy 1, policy_version 23510 (0.0009) [2023-10-14 01:58:07,621][33226] Updated weights for policy 1, policy_version 23520 (0.0008) [2023-10-14 01:58:08,732][33201] Updated weights for policy 0, policy_version 23300 (0.0011) [2023-10-14 01:58:09,101][33201] Updated weights for policy 0, policy_version 23310 (0.0008) [2023-10-14 01:58:09,469][33201] Updated weights for policy 0, policy_version 23320 (0.0007) [2023-10-14 01:58:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 47939584. Throughput: 0: 1789.9, 1: 1780.3. Samples: 11999836. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 01:58:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 01:58:11,317][33226] Updated weights for policy 1, policy_version 23530 (0.0008) [2023-10-14 01:58:11,684][33226] Updated weights for policy 1, policy_version 23540 (0.0007) [2023-10-14 01:58:12,047][33226] Updated weights for policy 1, policy_version 23550 (0.0007) [2023-10-14 01:58:13,272][33201] Updated weights for policy 0, policy_version 23330 (0.0008) [2023-10-14 01:58:13,648][33201] Updated weights for policy 0, policy_version 23340 (0.0009) [2023-10-14 01:58:14,016][33201] Updated weights for policy 0, policy_version 23350 (0.0007) [2023-10-14 01:58:14,380][33201] Updated weights for policy 0, policy_version 23360 (0.0008) [2023-10-14 01:58:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 48037888. Throughput: 0: 1779.1, 1: 1788.2. Samples: 12010582. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 01:58:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 01:58:15,754][33226] Updated weights for policy 1, policy_version 23560 (0.0008) [2023-10-14 01:58:16,121][33226] Updated weights for policy 1, policy_version 23570 (0.0007) [2023-10-14 01:58:16,498][33226] Updated weights for policy 1, policy_version 23580 (0.0007) [2023-10-14 01:58:18,156][33201] Updated weights for policy 0, policy_version 23370 (0.0008) [2023-10-14 01:58:18,522][33201] Updated weights for policy 0, policy_version 23380 (0.0007) [2023-10-14 01:58:18,897][33201] Updated weights for policy 0, policy_version 23390 (0.0008) [2023-10-14 01:58:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 48103424. Throughput: 0: 1785.7, 1: 1784.7. Samples: 12032178. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 01:58:19,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 01:58:20,230][33226] Updated weights for policy 1, policy_version 23590 (0.0007) [2023-10-14 01:58:20,588][33226] Updated weights for policy 1, policy_version 23600 (0.0009) [2023-10-14 01:58:20,954][33226] Updated weights for policy 1, policy_version 23610 (0.0008) [2023-10-14 01:58:22,566][33201] Updated weights for policy 0, policy_version 23400 (0.0010) [2023-10-14 01:58:22,936][33201] Updated weights for policy 0, policy_version 23410 (0.0010) [2023-10-14 01:58:23,318][33201] Updated weights for policy 0, policy_version 23420 (0.0009) [2023-10-14 01:58:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 48168960. Throughput: 0: 1759.5, 1: 1788.5. Samples: 12053462. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 01:58:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.970')] [2023-10-14 01:58:24,655][33226] Updated weights for policy 1, policy_version 23620 (0.0011) [2023-10-14 01:58:25,017][33226] Updated weights for policy 1, policy_version 23630 (0.0008) [2023-10-14 01:58:25,393][33226] Updated weights for policy 1, policy_version 23640 (0.0009) [2023-10-14 01:58:27,255][33201] Updated weights for policy 0, policy_version 23430 (0.0007) [2023-10-14 01:58:27,625][33201] Updated weights for policy 0, policy_version 23440 (0.0010) [2023-10-14 01:58:28,004][33201] Updated weights for policy 0, policy_version 23450 (0.0008) [2023-10-14 01:58:29,349][33226] Updated weights for policy 1, policy_version 23650 (0.0009) [2023-10-14 01:58:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 48234496. Throughput: 0: 1786.0, 1: 1780.1. Samples: 12064248. 
Policy #0 lag: (min: 20.0, avg: 35.3, max: 52.0) [2023-10-14 01:58:29,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.980')] [2023-10-14 01:58:29,717][33226] Updated weights for policy 1, policy_version 23660 (0.0009) [2023-10-14 01:58:30,087][33226] Updated weights for policy 1, policy_version 23670 (0.0010) [2023-10-14 01:58:30,448][33226] Updated weights for policy 1, policy_version 23680 (0.0010) [2023-10-14 01:58:31,678][33201] Updated weights for policy 0, policy_version 23460 (0.0008) [2023-10-14 01:58:32,042][33201] Updated weights for policy 0, policy_version 23470 (0.0009) [2023-10-14 01:58:32,414][33201] Updated weights for policy 0, policy_version 23480 (0.0009) [2023-10-14 01:58:34,192][33226] Updated weights for policy 1, policy_version 23690 (0.0008) [2023-10-14 01:58:34,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 48300032. Throughput: 0: 1762.0, 1: 1782.2. Samples: 12085080. Policy #0 lag: (min: 20.0, avg: 35.3, max: 52.0) [2023-10-14 01:58:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.980')] [2023-10-14 01:58:34,561][33226] Updated weights for policy 1, policy_version 23700 (0.0008) [2023-10-14 01:58:34,928][33226] Updated weights for policy 1, policy_version 23710 (0.0008) [2023-10-14 01:58:36,224][33201] Updated weights for policy 0, policy_version 23490 (0.0009) [2023-10-14 01:58:36,627][33201] Updated weights for policy 0, policy_version 23500 (0.0007) [2023-10-14 01:58:36,994][33201] Updated weights for policy 0, policy_version 23510 (0.0007) [2023-10-14 01:58:37,362][33201] Updated weights for policy 0, policy_version 23520 (0.0007) [2023-10-14 01:58:38,727][33226] Updated weights for policy 1, policy_version 23720 (0.0010) [2023-10-14 01:58:39,090][33226] Updated weights for policy 1, policy_version 23730 (0.0009) [2023-10-14 01:58:39,474][33226] Updated weights for policy 1, policy_version 23740 (0.0008) [2023-10-14 01:58:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 
14199.5, 300 sec: 14106.9). Total num frames: 48365568. Throughput: 0: 1753.6, 1: 1798.7. Samples: 12106622. Policy #0 lag: (min: 20.0, avg: 35.3, max: 52.0) [2023-10-14 01:58:39,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 01:58:41,126][33201] Updated weights for policy 0, policy_version 23530 (0.0008) [2023-10-14 01:58:41,497][33201] Updated weights for policy 0, policy_version 23540 (0.0009) [2023-10-14 01:58:41,860][33201] Updated weights for policy 0, policy_version 23550 (0.0008) [2023-10-14 01:58:43,290][33226] Updated weights for policy 1, policy_version 23750 (0.0008) [2023-10-14 01:58:43,658][33226] Updated weights for policy 1, policy_version 23760 (0.0008) [2023-10-14 01:58:44,040][33226] Updated weights for policy 1, policy_version 23770 (0.0007) [2023-10-14 01:58:44,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 48463872. Throughput: 0: 1753.2, 1: 1783.8. Samples: 12116870. Policy #0 lag: (min: 20.0, avg: 35.3, max: 52.0) [2023-10-14 01:58:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.980')] [2023-10-14 01:58:45,701][33201] Updated weights for policy 0, policy_version 23560 (0.0008) [2023-10-14 01:58:46,065][33201] Updated weights for policy 0, policy_version 23570 (0.0007) [2023-10-14 01:58:46,445][33201] Updated weights for policy 0, policy_version 23580 (0.0008) [2023-10-14 01:58:47,952][33226] Updated weights for policy 1, policy_version 23780 (0.0008) [2023-10-14 01:58:48,322][33226] Updated weights for policy 1, policy_version 23790 (0.0009) [2023-10-14 01:58:48,680][33226] Updated weights for policy 1, policy_version 23800 (0.0008) [2023-10-14 01:58:49,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 48529408. Throughput: 0: 1753.4, 1: 1799.1. Samples: 12138428. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) [2023-10-14 01:58:49,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 01:58:49,560][32895] Saving new best policy, reward=20.990! [2023-10-14 01:58:50,390][33201] Updated weights for policy 0, policy_version 23590 (0.0010) [2023-10-14 01:58:50,771][33201] Updated weights for policy 0, policy_version 23600 (0.0009) [2023-10-14 01:58:51,140][33201] Updated weights for policy 0, policy_version 23610 (0.0007) [2023-10-14 01:58:52,453][33226] Updated weights for policy 1, policy_version 23810 (0.0007) [2023-10-14 01:58:52,874][33226] Updated weights for policy 1, policy_version 23820 (0.0007) [2023-10-14 01:58:53,247][33226] Updated weights for policy 1, policy_version 23830 (0.0008) [2023-10-14 01:58:53,620][33226] Updated weights for policy 1, policy_version 23840 (0.0009) [2023-10-14 01:58:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 48594944. Throughput: 0: 1768.5, 1: 1772.0. Samples: 12159162. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) [2023-10-14 01:58:54,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.990')] [2023-10-14 01:58:54,876][33201] Updated weights for policy 0, policy_version 23620 (0.0008) [2023-10-14 01:58:55,246][33201] Updated weights for policy 0, policy_version 23630 (0.0008) [2023-10-14 01:58:55,621][33201] Updated weights for policy 0, policy_version 23640 (0.0010) [2023-10-14 01:58:57,425][33226] Updated weights for policy 1, policy_version 23850 (0.0008) [2023-10-14 01:58:57,792][33226] Updated weights for policy 1, policy_version 23860 (0.0011) [2023-10-14 01:58:58,160][33226] Updated weights for policy 1, policy_version 23870 (0.0007) [2023-10-14 01:58:59,452][33201] Updated weights for policy 0, policy_version 23650 (0.0007) [2023-10-14 01:58:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 48660480. Throughput: 0: 1752.8, 1: 1795.5. Samples: 12170254. 
Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) [2023-10-14 01:58:59,558][31953] Avg episode reward: [(0, '20.840'), (1, '21.000')] [2023-10-14 01:58:59,559][32895] Saving new best policy, reward=21.000! [2023-10-14 01:58:59,820][33201] Updated weights for policy 0, policy_version 23660 (0.0009) [2023-10-14 01:59:00,188][33201] Updated weights for policy 0, policy_version 23670 (0.0007) [2023-10-14 01:59:00,560][33201] Updated weights for policy 0, policy_version 23680 (0.0007) [2023-10-14 01:59:01,991][33226] Updated weights for policy 1, policy_version 23880 (0.0008) [2023-10-14 01:59:02,347][33226] Updated weights for policy 1, policy_version 23890 (0.0009) [2023-10-14 01:59:02,720][33226] Updated weights for policy 1, policy_version 23900 (0.0009) [2023-10-14 01:59:04,310][33201] Updated weights for policy 0, policy_version 23690 (0.0008) [2023-10-14 01:59:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 48726016. Throughput: 0: 1767.7, 1: 1765.0. Samples: 12191150. Policy #0 lag: (min: 26.0, avg: 33.4, max: 58.0) [2023-10-14 01:59:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '21.000')] [2023-10-14 01:59:04,675][33201] Updated weights for policy 0, policy_version 23700 (0.0008) [2023-10-14 01:59:05,046][33201] Updated weights for policy 0, policy_version 23710 (0.0007) [2023-10-14 01:59:06,556][33226] Updated weights for policy 1, policy_version 23910 (0.0008) [2023-10-14 01:59:06,927][33226] Updated weights for policy 1, policy_version 23920 (0.0008) [2023-10-14 01:59:07,299][33226] Updated weights for policy 1, policy_version 23930 (0.0008) [2023-10-14 01:59:08,985][33201] Updated weights for policy 0, policy_version 23720 (0.0007) [2023-10-14 01:59:09,362][33201] Updated weights for policy 0, policy_version 23730 (0.0008) [2023-10-14 01:59:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 48791552. Throughput: 0: 1779.5, 1: 1759.9. Samples: 12212736. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 01:59:09,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.970')] [2023-10-14 01:59:09,737][33201] Updated weights for policy 0, policy_version 23740 (0.0007) [2023-10-14 01:59:11,068][33226] Updated weights for policy 1, policy_version 23940 (0.0008) [2023-10-14 01:59:11,428][33226] Updated weights for policy 1, policy_version 23950 (0.0010) [2023-10-14 01:59:11,792][33226] Updated weights for policy 1, policy_version 23960 (0.0010) [2023-10-14 01:59:13,480][33201] Updated weights for policy 0, policy_version 23750 (0.0007) [2023-10-14 01:59:13,857][33201] Updated weights for policy 0, policy_version 23760 (0.0007) [2023-10-14 01:59:14,224][33201] Updated weights for policy 0, policy_version 23770 (0.0010) [2023-10-14 01:59:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 48889856. Throughput: 0: 1763.6, 1: 1767.7. Samples: 12223154. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 01:59:14,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.950')] [2023-10-14 01:59:15,675][33226] Updated weights for policy 1, policy_version 23970 (0.0009) [2023-10-14 01:59:16,043][33226] Updated weights for policy 1, policy_version 23980 (0.0007) [2023-10-14 01:59:16,409][33226] Updated weights for policy 1, policy_version 23990 (0.0007) [2023-10-14 01:59:16,782][33226] Updated weights for policy 1, policy_version 24000 (0.0010) [2023-10-14 01:59:18,033][33201] Updated weights for policy 0, policy_version 23780 (0.0010) [2023-10-14 01:59:18,404][33201] Updated weights for policy 0, policy_version 23790 (0.0009) [2023-10-14 01:59:18,779][33201] Updated weights for policy 0, policy_version 23800 (0.0010) [2023-10-14 01:59:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 48955392. Throughput: 0: 1786.4, 1: 1760.3. Samples: 12244680. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 01:59:19,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.950')] [2023-10-14 01:59:20,598][33226] Updated weights for policy 1, policy_version 24010 (0.0007) [2023-10-14 01:59:20,965][33226] Updated weights for policy 1, policy_version 24020 (0.0011) [2023-10-14 01:59:21,329][33226] Updated weights for policy 1, policy_version 24030 (0.0010) [2023-10-14 01:59:22,773][33201] Updated weights for policy 0, policy_version 23810 (0.0010) [2023-10-14 01:59:23,187][33201] Updated weights for policy 0, policy_version 23820 (0.0010) [2023-10-14 01:59:23,559][33201] Updated weights for policy 0, policy_version 23830 (0.0011) [2023-10-14 01:59:23,934][33201] Updated weights for policy 0, policy_version 23840 (0.0008) [2023-10-14 01:59:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 49020928. Throughput: 0: 1750.1, 1: 1780.5. Samples: 12265500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:59:24,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.950')] [2023-10-14 01:59:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000023840_24412160.pth... [2023-10-14 01:59:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000024032_24608768.pth... 
[2023-10-14 01:59:24,610][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000022176_22708224.pth [2023-10-14 01:59:24,613][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000022368_22904832.pth [2023-10-14 01:59:25,211][33226] Updated weights for policy 1, policy_version 24040 (0.0008) [2023-10-14 01:59:25,577][33226] Updated weights for policy 1, policy_version 24050 (0.0008) [2023-10-14 01:59:25,942][33226] Updated weights for policy 1, policy_version 24060 (0.0007) [2023-10-14 01:59:27,728][33201] Updated weights for policy 0, policy_version 23850 (0.0008) [2023-10-14 01:59:28,117][33201] Updated weights for policy 0, policy_version 23860 (0.0010) [2023-10-14 01:59:28,482][33201] Updated weights for policy 0, policy_version 23870 (0.0010) [2023-10-14 01:59:29,524][33226] Updated weights for policy 1, policy_version 24070 (0.0008) [2023-10-14 01:59:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 49086464. Throughput: 0: 1783.1, 1: 1765.4. Samples: 12276552. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:59:29,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.950')] [2023-10-14 01:59:29,881][33226] Updated weights for policy 1, policy_version 24080 (0.0011) [2023-10-14 01:59:30,251][33226] Updated weights for policy 1, policy_version 24090 (0.0011) [2023-10-14 01:59:32,343][33201] Updated weights for policy 0, policy_version 23880 (0.0007) [2023-10-14 01:59:32,706][33201] Updated weights for policy 0, policy_version 23890 (0.0008) [2023-10-14 01:59:33,084][33201] Updated weights for policy 0, policy_version 23900 (0.0008) [2023-10-14 01:59:34,201][33226] Updated weights for policy 1, policy_version 24100 (0.0011) [2023-10-14 01:59:34,557][33226] Updated weights for policy 1, policy_version 24110 (0.0007) [2023-10-14 01:59:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 49152000. 
Throughput: 0: 1753.7, 1: 1778.1. Samples: 12297358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:59:34,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.950')] [2023-10-14 01:59:34,928][33226] Updated weights for policy 1, policy_version 24120 (0.0009) [2023-10-14 01:59:36,879][33201] Updated weights for policy 0, policy_version 23910 (0.0009) [2023-10-14 01:59:37,250][33201] Updated weights for policy 0, policy_version 23920 (0.0009) [2023-10-14 01:59:37,628][33201] Updated weights for policy 0, policy_version 23930 (0.0009) [2023-10-14 01:59:38,683][33226] Updated weights for policy 1, policy_version 24130 (0.0008) [2023-10-14 01:59:39,115][33226] Updated weights for policy 1, policy_version 24140 (0.0009) [2023-10-14 01:59:39,473][33226] Updated weights for policy 1, policy_version 24150 (0.0009) [2023-10-14 01:59:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 49217536. Throughput: 0: 1751.9, 1: 1795.0. Samples: 12318770. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 01:59:39,559][31953] Avg episode reward: [(0, '20.750'), (1, '20.950')] [2023-10-14 01:59:39,841][33226] Updated weights for policy 1, policy_version 24160 (0.0009) [2023-10-14 01:59:41,315][33201] Updated weights for policy 0, policy_version 23940 (0.0009) [2023-10-14 01:59:41,686][33201] Updated weights for policy 0, policy_version 23950 (0.0008) [2023-10-14 01:59:42,054][33201] Updated weights for policy 0, policy_version 23960 (0.0009) [2023-10-14 01:59:43,572][33226] Updated weights for policy 1, policy_version 24170 (0.0007) [2023-10-14 01:59:43,941][33226] Updated weights for policy 1, policy_version 24180 (0.0009) [2023-10-14 01:59:44,322][33226] Updated weights for policy 1, policy_version 24190 (0.0008) [2023-10-14 01:59:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 49315840. Throughput: 0: 1760.0, 1: 1769.0. Samples: 12329060. 
Policy #0 lag: (min: 30.0, avg: 42.9, max: 62.0) [2023-10-14 01:59:44,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.930')] [2023-10-14 01:59:45,843][33201] Updated weights for policy 0, policy_version 23970 (0.0008) [2023-10-14 01:59:46,214][33201] Updated weights for policy 0, policy_version 23980 (0.0011) [2023-10-14 01:59:46,589][33201] Updated weights for policy 0, policy_version 23990 (0.0009) [2023-10-14 01:59:46,952][33201] Updated weights for policy 0, policy_version 24000 (0.0007) [2023-10-14 01:59:48,113][33226] Updated weights for policy 1, policy_version 24200 (0.0008) [2023-10-14 01:59:48,482][33226] Updated weights for policy 1, policy_version 24210 (0.0009) [2023-10-14 01:59:48,852][33226] Updated weights for policy 1, policy_version 24220 (0.0007) [2023-10-14 01:59:49,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 49381376. Throughput: 0: 1750.0, 1: 1796.0. Samples: 12350718. Policy #0 lag: (min: 30.0, avg: 42.9, max: 62.0) [2023-10-14 01:59:49,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.930')] [2023-10-14 01:59:50,643][33201] Updated weights for policy 0, policy_version 24010 (0.0008) [2023-10-14 01:59:51,020][33201] Updated weights for policy 0, policy_version 24020 (0.0009) [2023-10-14 01:59:51,387][33201] Updated weights for policy 0, policy_version 24030 (0.0011) [2023-10-14 01:59:52,663][33226] Updated weights for policy 1, policy_version 24230 (0.0008) [2023-10-14 01:59:53,037][33226] Updated weights for policy 1, policy_version 24240 (0.0009) [2023-10-14 01:59:53,402][33226] Updated weights for policy 1, policy_version 24250 (0.0007) [2023-10-14 01:59:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 49446912. Throughput: 0: 1768.3, 1: 1769.6. Samples: 12371944. 
Policy #0 lag: (min: 30.0, avg: 42.9, max: 62.0) [2023-10-14 01:59:54,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.920')] [2023-10-14 01:59:55,096][33201] Updated weights for policy 0, policy_version 24040 (0.0011) [2023-10-14 01:59:55,472][33201] Updated weights for policy 0, policy_version 24050 (0.0007) [2023-10-14 01:59:55,834][33201] Updated weights for policy 0, policy_version 24060 (0.0009) [2023-10-14 01:59:57,373][33226] Updated weights for policy 1, policy_version 24260 (0.0008) [2023-10-14 01:59:57,740][33226] Updated weights for policy 1, policy_version 24270 (0.0007) [2023-10-14 01:59:58,112][33226] Updated weights for policy 1, policy_version 24280 (0.0009) [2023-10-14 01:59:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 49512448. Throughput: 0: 1752.6, 1: 1792.9. Samples: 12382700. Policy #0 lag: (min: 30.0, avg: 42.9, max: 62.0) [2023-10-14 01:59:59,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.920')] [2023-10-14 01:59:59,685][33201] Updated weights for policy 0, policy_version 24070 (0.0008) [2023-10-14 02:00:00,057][33201] Updated weights for policy 0, policy_version 24080 (0.0008) [2023-10-14 02:00:00,427][33201] Updated weights for policy 0, policy_version 24090 (0.0008) [2023-10-14 02:00:01,894][33226] Updated weights for policy 1, policy_version 24290 (0.0009) [2023-10-14 02:00:02,265][33226] Updated weights for policy 1, policy_version 24300 (0.0008) [2023-10-14 02:00:02,625][33226] Updated weights for policy 1, policy_version 24310 (0.0008) [2023-10-14 02:00:02,998][33226] Updated weights for policy 1, policy_version 24320 (0.0009) [2023-10-14 02:00:04,291][33201] Updated weights for policy 0, policy_version 24100 (0.0008) [2023-10-14 02:00:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 49577984. Throughput: 0: 1761.0, 1: 1769.1. Samples: 12403534. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:04,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 02:00:04,666][33201] Updated weights for policy 0, policy_version 24110 (0.0010) [2023-10-14 02:00:05,041][33201] Updated weights for policy 0, policy_version 24120 (0.0010) [2023-10-14 02:00:06,838][33226] Updated weights for policy 1, policy_version 24330 (0.0010) [2023-10-14 02:00:07,194][33226] Updated weights for policy 1, policy_version 24340 (0.0011) [2023-10-14 02:00:07,562][33226] Updated weights for policy 1, policy_version 24350 (0.0010) [2023-10-14 02:00:09,041][33201] Updated weights for policy 0, policy_version 24130 (0.0009) [2023-10-14 02:00:09,437][33201] Updated weights for policy 0, policy_version 24140 (0.0008) [2023-10-14 02:00:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 49643520. Throughput: 0: 1790.5, 1: 1761.5. Samples: 12425342. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:09,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:00:09,810][33201] Updated weights for policy 0, policy_version 24150 (0.0011) [2023-10-14 02:00:10,172][33201] Updated weights for policy 0, policy_version 24160 (0.0009) [2023-10-14 02:00:11,211][33226] Updated weights for policy 1, policy_version 24360 (0.0011) [2023-10-14 02:00:11,578][33226] Updated weights for policy 1, policy_version 24370 (0.0010) [2023-10-14 02:00:11,949][33226] Updated weights for policy 1, policy_version 24380 (0.0009) [2023-10-14 02:00:14,008][33201] Updated weights for policy 0, policy_version 24170 (0.0007) [2023-10-14 02:00:14,386][33201] Updated weights for policy 0, policy_version 24180 (0.0007) [2023-10-14 02:00:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 49709056. Throughput: 0: 1760.6, 1: 1769.5. Samples: 12435404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:00:14,755][33201] Updated weights for policy 0, policy_version 24190 (0.0007) [2023-10-14 02:00:15,857][33226] Updated weights for policy 1, policy_version 24390 (0.0009) [2023-10-14 02:00:16,230][33226] Updated weights for policy 1, policy_version 24400 (0.0008) [2023-10-14 02:00:16,592][33226] Updated weights for policy 1, policy_version 24410 (0.0010) [2023-10-14 02:00:18,599][33201] Updated weights for policy 0, policy_version 24200 (0.0009) [2023-10-14 02:00:18,979][33201] Updated weights for policy 0, policy_version 24210 (0.0009) [2023-10-14 02:00:19,341][33201] Updated weights for policy 0, policy_version 24220 (0.0008) [2023-10-14 02:00:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 49807360. Throughput: 0: 1788.3, 1: 1760.1. Samples: 12457034. Policy #0 lag: (min: 13.0, avg: 29.4, max: 32.0) [2023-10-14 02:00:19,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 02:00:20,510][33226] Updated weights for policy 1, policy_version 24420 (0.0009) [2023-10-14 02:00:20,878][33226] Updated weights for policy 1, policy_version 24430 (0.0009) [2023-10-14 02:00:21,251][33226] Updated weights for policy 1, policy_version 24440 (0.0009) [2023-10-14 02:00:23,291][33201] Updated weights for policy 0, policy_version 24230 (0.0010) [2023-10-14 02:00:23,669][33201] Updated weights for policy 0, policy_version 24240 (0.0007) [2023-10-14 02:00:24,043][33201] Updated weights for policy 0, policy_version 24250 (0.0007) [2023-10-14 02:00:24,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 49872896. Throughput: 0: 1763.1, 1: 1765.4. Samples: 12477552. 
Policy #0 lag: (min: 13.0, avg: 29.4, max: 32.0) [2023-10-14 02:00:24,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.890')] [2023-10-14 02:00:25,245][33226] Updated weights for policy 1, policy_version 24450 (0.0008) [2023-10-14 02:00:25,661][33226] Updated weights for policy 1, policy_version 24460 (0.0007) [2023-10-14 02:00:26,021][33226] Updated weights for policy 1, policy_version 24470 (0.0007) [2023-10-14 02:00:26,391][33226] Updated weights for policy 1, policy_version 24480 (0.0007) [2023-10-14 02:00:27,752][33201] Updated weights for policy 0, policy_version 24260 (0.0008) [2023-10-14 02:00:28,120][33201] Updated weights for policy 0, policy_version 24270 (0.0008) [2023-10-14 02:00:28,496][33201] Updated weights for policy 0, policy_version 24280 (0.0011) [2023-10-14 02:00:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 49938432. Throughput: 0: 1788.0, 1: 1753.6. Samples: 12488436. Policy #0 lag: (min: 13.0, avg: 29.4, max: 32.0) [2023-10-14 02:00:29,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.890')] [2023-10-14 02:00:29,956][33226] Updated weights for policy 1, policy_version 24490 (0.0007) [2023-10-14 02:00:30,328][33226] Updated weights for policy 1, policy_version 24500 (0.0009) [2023-10-14 02:00:30,707][33226] Updated weights for policy 1, policy_version 24510 (0.0011) [2023-10-14 02:00:32,288][33201] Updated weights for policy 0, policy_version 24290 (0.0008) [2023-10-14 02:00:32,653][33201] Updated weights for policy 0, policy_version 24300 (0.0007) [2023-10-14 02:00:33,029][33201] Updated weights for policy 0, policy_version 24310 (0.0007) [2023-10-14 02:00:33,398][33201] Updated weights for policy 0, policy_version 24320 (0.0007) [2023-10-14 02:00:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 50003968. Throughput: 0: 1771.7, 1: 1759.7. Samples: 12509632. 
Policy #0 lag: (min: 13.0, avg: 29.4, max: 32.0) [2023-10-14 02:00:34,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.880')] [2023-10-14 02:00:34,666][33226] Updated weights for policy 1, policy_version 24520 (0.0009) [2023-10-14 02:00:35,027][33226] Updated weights for policy 1, policy_version 24530 (0.0009) [2023-10-14 02:00:35,394][33226] Updated weights for policy 1, policy_version 24540 (0.0008) [2023-10-14 02:00:37,116][33201] Updated weights for policy 0, policy_version 24330 (0.0009) [2023-10-14 02:00:37,487][33201] Updated weights for policy 0, policy_version 24340 (0.0008) [2023-10-14 02:00:37,866][33201] Updated weights for policy 0, policy_version 24350 (0.0007) [2023-10-14 02:00:39,331][33226] Updated weights for policy 1, policy_version 24550 (0.0007) [2023-10-14 02:00:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 50069504. Throughput: 0: 1756.0, 1: 1775.6. Samples: 12530866. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:39,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.880')] [2023-10-14 02:00:39,709][33226] Updated weights for policy 1, policy_version 24560 (0.0008) [2023-10-14 02:00:40,076][33226] Updated weights for policy 1, policy_version 24570 (0.0007) [2023-10-14 02:00:41,553][33201] Updated weights for policy 0, policy_version 24360 (0.0009) [2023-10-14 02:00:41,927][33201] Updated weights for policy 0, policy_version 24370 (0.0007) [2023-10-14 02:00:42,300][33201] Updated weights for policy 0, policy_version 24380 (0.0007) [2023-10-14 02:00:43,851][33226] Updated weights for policy 1, policy_version 24580 (0.0007) [2023-10-14 02:00:44,223][33226] Updated weights for policy 1, policy_version 24590 (0.0009) [2023-10-14 02:00:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 50135040. Throughput: 0: 1769.2, 1: 1752.6. Samples: 12541180. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:44,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.860')] [2023-10-14 02:00:44,588][33226] Updated weights for policy 1, policy_version 24600 (0.0008) [2023-10-14 02:00:46,189][33201] Updated weights for policy 0, policy_version 24390 (0.0008) [2023-10-14 02:00:46,565][33201] Updated weights for policy 0, policy_version 24400 (0.0007) [2023-10-14 02:00:46,924][33201] Updated weights for policy 0, policy_version 24410 (0.0008) [2023-10-14 02:00:48,204][33226] Updated weights for policy 1, policy_version 24610 (0.0008) [2023-10-14 02:00:48,571][33226] Updated weights for policy 1, policy_version 24620 (0.0008) [2023-10-14 02:00:48,936][33226] Updated weights for policy 1, policy_version 24630 (0.0008) [2023-10-14 02:00:49,302][33226] Updated weights for policy 1, policy_version 24640 (0.0008) [2023-10-14 02:00:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 50233344. Throughput: 0: 1758.0, 1: 1792.2. Samples: 12563292. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:49,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.860')] [2023-10-14 02:00:50,836][33201] Updated weights for policy 0, policy_version 24420 (0.0008) [2023-10-14 02:00:51,210][33201] Updated weights for policy 0, policy_version 24430 (0.0011) [2023-10-14 02:00:51,584][33201] Updated weights for policy 0, policy_version 24440 (0.0008) [2023-10-14 02:00:53,110][33226] Updated weights for policy 1, policy_version 24650 (0.0007) [2023-10-14 02:00:53,480][33226] Updated weights for policy 1, policy_version 24660 (0.0007) [2023-10-14 02:00:53,846][33226] Updated weights for policy 1, policy_version 24670 (0.0008) [2023-10-14 02:00:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 50298880. Throughput: 0: 1767.6, 1: 1760.9. Samples: 12584126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:54,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.860')] [2023-10-14 02:00:55,259][33201] Updated weights for policy 0, policy_version 24450 (0.0007) [2023-10-14 02:00:55,635][33201] Updated weights for policy 0, policy_version 24460 (0.0008) [2023-10-14 02:00:56,014][33201] Updated weights for policy 0, policy_version 24470 (0.0008) [2023-10-14 02:00:56,379][33201] Updated weights for policy 0, policy_version 24480 (0.0009) [2023-10-14 02:00:57,636][33226] Updated weights for policy 1, policy_version 24680 (0.0008) [2023-10-14 02:00:57,999][33226] Updated weights for policy 1, policy_version 24690 (0.0009) [2023-10-14 02:00:58,385][33226] Updated weights for policy 1, policy_version 24700 (0.0008) [2023-10-14 02:00:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 50364416. Throughput: 0: 1760.7, 1: 1789.9. Samples: 12595182. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:00:59,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 02:01:00,078][33201] Updated weights for policy 0, policy_version 24490 (0.0007) [2023-10-14 02:01:00,441][33201] Updated weights for policy 0, policy_version 24500 (0.0007) [2023-10-14 02:01:00,815][33201] Updated weights for policy 0, policy_version 24510 (0.0007) [2023-10-14 02:01:02,035][33226] Updated weights for policy 1, policy_version 24710 (0.0007) [2023-10-14 02:01:02,413][33226] Updated weights for policy 1, policy_version 24720 (0.0009) [2023-10-14 02:01:02,793][33226] Updated weights for policy 1, policy_version 24730 (0.0008) [2023-10-14 02:01:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 50429952. Throughput: 0: 1770.3, 1: 1768.9. Samples: 12616300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:01:04,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 02:01:04,659][33201] Updated weights for policy 0, policy_version 24520 (0.0010) [2023-10-14 02:01:05,033][33201] Updated weights for policy 0, policy_version 24530 (0.0010) [2023-10-14 02:01:05,400][33201] Updated weights for policy 0, policy_version 24540 (0.0008) [2023-10-14 02:01:06,529][33226] Updated weights for policy 1, policy_version 24740 (0.0007) [2023-10-14 02:01:06,903][33226] Updated weights for policy 1, policy_version 24750 (0.0010) [2023-10-14 02:01:07,265][33226] Updated weights for policy 1, policy_version 24760 (0.0008) [2023-10-14 02:01:09,361][33201] Updated weights for policy 0, policy_version 24550 (0.0008) [2023-10-14 02:01:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 50495488. Throughput: 0: 1793.6, 1: 1777.3. Samples: 12638242. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:01:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 02:01:09,722][33201] Updated weights for policy 0, policy_version 24560 (0.0010) [2023-10-14 02:01:10,089][33201] Updated weights for policy 0, policy_version 24570 (0.0010) [2023-10-14 02:01:11,104][33226] Updated weights for policy 1, policy_version 24770 (0.0009) [2023-10-14 02:01:11,517][33226] Updated weights for policy 1, policy_version 24780 (0.0009) [2023-10-14 02:01:11,889][33226] Updated weights for policy 1, policy_version 24790 (0.0007) [2023-10-14 02:01:12,254][33226] Updated weights for policy 1, policy_version 24800 (0.0011) [2023-10-14 02:01:13,898][33201] Updated weights for policy 0, policy_version 24580 (0.0009) [2023-10-14 02:01:14,260][33201] Updated weights for policy 0, policy_version 24590 (0.0008) [2023-10-14 02:01:14,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 50561024. Throughput: 0: 1760.0, 1: 1788.6. 
Samples: 12648124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:01:14,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 02:01:14,632][33201] Updated weights for policy 0, policy_version 24600 (0.0007) [2023-10-14 02:01:15,934][33226] Updated weights for policy 1, policy_version 24810 (0.0010) [2023-10-14 02:01:16,303][33226] Updated weights for policy 1, policy_version 24820 (0.0010) [2023-10-14 02:01:16,671][33226] Updated weights for policy 1, policy_version 24830 (0.0008) [2023-10-14 02:01:18,452][33201] Updated weights for policy 0, policy_version 24610 (0.0009) [2023-10-14 02:01:18,833][33201] Updated weights for policy 0, policy_version 24620 (0.0007) [2023-10-14 02:01:19,202][33201] Updated weights for policy 0, policy_version 24630 (0.0008) [2023-10-14 02:01:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 50626560. Throughput: 0: 1784.2, 1: 1778.5. Samples: 12669954. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:01:19,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.900')] [2023-10-14 02:01:19,571][33201] Updated weights for policy 0, policy_version 24640 (0.0008) [2023-10-14 02:01:20,330][33226] Updated weights for policy 1, policy_version 24840 (0.0007) [2023-10-14 02:01:20,693][33226] Updated weights for policy 1, policy_version 24850 (0.0011) [2023-10-14 02:01:21,061][33226] Updated weights for policy 1, policy_version 24860 (0.0008) [2023-10-14 02:01:23,344][33201] Updated weights for policy 0, policy_version 24650 (0.0007) [2023-10-14 02:01:23,712][33201] Updated weights for policy 0, policy_version 24660 (0.0007) [2023-10-14 02:01:24,088][33201] Updated weights for policy 0, policy_version 24670 (0.0008) [2023-10-14 02:01:24,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 50724864. Throughput: 0: 1762.6, 1: 1796.8. Samples: 12691038. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:01:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.900')] [2023-10-14 02:01:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000024672_25264128.pth... [2023-10-14 02:01:24,597][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000023008_23560192.pth [2023-10-14 02:01:24,601][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000024672_25264128.pth [2023-10-14 02:01:24,793][33226] Updated weights for policy 1, policy_version 24870 (0.0009) [2023-10-14 02:01:25,148][33226] Updated weights for policy 1, policy_version 24880 (0.0009) [2023-10-14 02:01:25,528][33226] Updated weights for policy 1, policy_version 24890 (0.0010) [2023-10-14 02:01:25,752][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000024896_25493504.pth... [2023-10-14 02:01:25,791][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000023200_23756800.pth [2023-10-14 02:01:25,797][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000024896_25493504.pth [2023-10-14 02:01:27,801][33201] Updated weights for policy 0, policy_version 24680 (0.0008) [2023-10-14 02:01:28,164][33201] Updated weights for policy 0, policy_version 24690 (0.0007) [2023-10-14 02:01:28,544][33201] Updated weights for policy 0, policy_version 24700 (0.0007) [2023-10-14 02:01:29,397][33226] Updated weights for policy 1, policy_version 24900 (0.0008) [2023-10-14 02:01:29,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 50790400. Throughput: 0: 1784.6, 1: 1789.3. Samples: 12702004. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:01:29,560][31953] Avg episode reward: [(0, '20.920'), (1, '20.900')] [2023-10-14 02:01:29,759][33226] Updated weights for policy 1, policy_version 24910 (0.0008) [2023-10-14 02:01:30,129][33226] Updated weights for policy 1, policy_version 24920 (0.0008) [2023-10-14 02:01:32,292][33201] Updated weights for policy 0, policy_version 24710 (0.0008) [2023-10-14 02:01:32,657][33201] Updated weights for policy 0, policy_version 24720 (0.0007) [2023-10-14 02:01:33,029][33201] Updated weights for policy 0, policy_version 24730 (0.0007) [2023-10-14 02:01:33,842][33226] Updated weights for policy 1, policy_version 24930 (0.0009) [2023-10-14 02:01:34,205][33226] Updated weights for policy 1, policy_version 24940 (0.0009) [2023-10-14 02:01:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 50855936. Throughput: 0: 1769.5, 1: 1779.8. Samples: 12723008. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:01:34,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.920')] [2023-10-14 02:01:34,574][33226] Updated weights for policy 1, policy_version 24950 (0.0009) [2023-10-14 02:01:34,940][33226] Updated weights for policy 1, policy_version 24960 (0.0009) [2023-10-14 02:01:36,859][33201] Updated weights for policy 0, policy_version 24740 (0.0008) [2023-10-14 02:01:37,219][33201] Updated weights for policy 0, policy_version 24750 (0.0007) [2023-10-14 02:01:37,587][33201] Updated weights for policy 0, policy_version 24760 (0.0007) [2023-10-14 02:01:38,789][33226] Updated weights for policy 1, policy_version 24970 (0.0010) [2023-10-14 02:01:39,166][33226] Updated weights for policy 1, policy_version 24980 (0.0008) [2023-10-14 02:01:39,541][33226] Updated weights for policy 1, policy_version 24990 (0.0008) [2023-10-14 02:01:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 50921472. Throughput: 0: 1761.6, 1: 1803.2. 
Samples: 12744542. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:01:39,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.920')] [2023-10-14 02:01:41,597][33201] Updated weights for policy 0, policy_version 24770 (0.0007) [2023-10-14 02:01:41,994][33201] Updated weights for policy 0, policy_version 24780 (0.0008) [2023-10-14 02:01:42,365][33201] Updated weights for policy 0, policy_version 24790 (0.0008) [2023-10-14 02:01:42,734][33201] Updated weights for policy 0, policy_version 24800 (0.0008) [2023-10-14 02:01:43,171][33226] Updated weights for policy 1, policy_version 25000 (0.0010) [2023-10-14 02:01:43,542][33226] Updated weights for policy 1, policy_version 25010 (0.0010) [2023-10-14 02:01:43,908][33226] Updated weights for policy 1, policy_version 25020 (0.0010) [2023-10-14 02:01:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 51019776. Throughput: 0: 1778.6, 1: 1780.6. Samples: 12755348. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:01:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.920')] [2023-10-14 02:01:46,507][33201] Updated weights for policy 0, policy_version 24810 (0.0008) [2023-10-14 02:01:46,883][33201] Updated weights for policy 0, policy_version 24820 (0.0009) [2023-10-14 02:01:47,258][33201] Updated weights for policy 0, policy_version 24830 (0.0009) [2023-10-14 02:01:47,698][33226] Updated weights for policy 1, policy_version 25030 (0.0008) [2023-10-14 02:01:48,065][33226] Updated weights for policy 1, policy_version 25040 (0.0011) [2023-10-14 02:01:48,436][33226] Updated weights for policy 1, policy_version 25050 (0.0010) [2023-10-14 02:01:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51085312. Throughput: 0: 1756.1, 1: 1798.0. Samples: 12776238. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:01:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.920')] [2023-10-14 02:01:51,018][33201] Updated weights for policy 0, policy_version 24840 (0.0008) [2023-10-14 02:01:51,387][33201] Updated weights for policy 0, policy_version 24850 (0.0009) [2023-10-14 02:01:51,762][33201] Updated weights for policy 0, policy_version 24860 (0.0008) [2023-10-14 02:01:52,255][33226] Updated weights for policy 1, policy_version 25060 (0.0011) [2023-10-14 02:01:52,622][33226] Updated weights for policy 1, policy_version 25070 (0.0011) [2023-10-14 02:01:52,988][33226] Updated weights for policy 1, policy_version 25080 (0.0010) [2023-10-14 02:01:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51150848. Throughput: 0: 1761.3, 1: 1774.2. Samples: 12797338. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 02:01:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 02:01:55,600][33201] Updated weights for policy 0, policy_version 24870 (0.0007) [2023-10-14 02:01:55,969][33201] Updated weights for policy 0, policy_version 24880 (0.0008) [2023-10-14 02:01:56,340][33201] Updated weights for policy 0, policy_version 24890 (0.0009) [2023-10-14 02:01:57,009][33226] Updated weights for policy 1, policy_version 25090 (0.0011) [2023-10-14 02:01:57,420][33226] Updated weights for policy 1, policy_version 25100 (0.0009) [2023-10-14 02:01:57,785][33226] Updated weights for policy 1, policy_version 25110 (0.0010) [2023-10-14 02:01:58,150][33226] Updated weights for policy 1, policy_version 25120 (0.0008) [2023-10-14 02:01:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51216384. Throughput: 0: 1759.4, 1: 1798.0. Samples: 12808204. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 02:01:59,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.940')] [2023-10-14 02:02:00,228][33201] Updated weights for policy 0, policy_version 24900 (0.0009) [2023-10-14 02:02:00,601][33201] Updated weights for policy 0, policy_version 24910 (0.0008) [2023-10-14 02:02:00,974][33201] Updated weights for policy 0, policy_version 24920 (0.0008) [2023-10-14 02:02:01,711][33226] Updated weights for policy 1, policy_version 25130 (0.0009) [2023-10-14 02:02:02,074][33226] Updated weights for policy 1, policy_version 25140 (0.0009) [2023-10-14 02:02:02,436][33226] Updated weights for policy 1, policy_version 25150 (0.0012) [2023-10-14 02:02:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 51281920. Throughput: 0: 1757.4, 1: 1783.0. Samples: 12829270. Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 02:02:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.940')] [2023-10-14 02:02:04,899][33201] Updated weights for policy 0, policy_version 24930 (0.0011) [2023-10-14 02:02:05,271][33201] Updated weights for policy 0, policy_version 24940 (0.0010) [2023-10-14 02:02:05,649][33201] Updated weights for policy 0, policy_version 24950 (0.0010) [2023-10-14 02:02:06,020][33201] Updated weights for policy 0, policy_version 24960 (0.0009) [2023-10-14 02:02:06,193][33226] Updated weights for policy 1, policy_version 25160 (0.0008) [2023-10-14 02:02:06,557][33226] Updated weights for policy 1, policy_version 25170 (0.0007) [2023-10-14 02:02:06,923][33226] Updated weights for policy 1, policy_version 25180 (0.0011) [2023-10-14 02:02:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 51347456. Throughput: 0: 1791.6, 1: 1774.2. Samples: 12851496. 
Policy #0 lag: (min: 31.0, avg: 33.9, max: 63.0) [2023-10-14 02:02:09,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:02:09,727][33201] Updated weights for policy 0, policy_version 24970 (0.0009) [2023-10-14 02:02:10,099][33201] Updated weights for policy 0, policy_version 24980 (0.0009) [2023-10-14 02:02:10,465][33201] Updated weights for policy 0, policy_version 24990 (0.0008) [2023-10-14 02:02:10,723][33226] Updated weights for policy 1, policy_version 25190 (0.0009) [2023-10-14 02:02:11,091][33226] Updated weights for policy 1, policy_version 25200 (0.0009) [2023-10-14 02:02:11,458][33226] Updated weights for policy 1, policy_version 25210 (0.0010) [2023-10-14 02:02:14,150][33201] Updated weights for policy 0, policy_version 25000 (0.0010) [2023-10-14 02:02:14,516][33201] Updated weights for policy 0, policy_version 25010 (0.0010) [2023-10-14 02:02:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 51412992. Throughput: 0: 1759.1, 1: 1777.3. Samples: 12861142. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-14 02:02:14,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.930')] [2023-10-14 02:02:14,889][33201] Updated weights for policy 0, policy_version 25020 (0.0010) [2023-10-14 02:02:15,208][33226] Updated weights for policy 1, policy_version 25220 (0.0007) [2023-10-14 02:02:15,573][33226] Updated weights for policy 1, policy_version 25230 (0.0007) [2023-10-14 02:02:15,943][33226] Updated weights for policy 1, policy_version 25240 (0.0007) [2023-10-14 02:02:18,787][33201] Updated weights for policy 0, policy_version 25030 (0.0009) [2023-10-14 02:02:19,153][33201] Updated weights for policy 0, policy_version 25040 (0.0008) [2023-10-14 02:02:19,526][33201] Updated weights for policy 0, policy_version 25050 (0.0009) [2023-10-14 02:02:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 51478528. Throughput: 0: 1784.9, 1: 1782.5. 
Samples: 12883540. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-14 02:02:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:02:19,722][33226] Updated weights for policy 1, policy_version 25250 (0.0007) [2023-10-14 02:02:20,090][33226] Updated weights for policy 1, policy_version 25260 (0.0007) [2023-10-14 02:02:20,454][33226] Updated weights for policy 1, policy_version 25270 (0.0008) [2023-10-14 02:02:20,818][33226] Updated weights for policy 1, policy_version 25280 (0.0009) [2023-10-14 02:02:23,291][33201] Updated weights for policy 0, policy_version 25060 (0.0007) [2023-10-14 02:02:23,659][33201] Updated weights for policy 0, policy_version 25070 (0.0009) [2023-10-14 02:02:24,032][33201] Updated weights for policy 0, policy_version 25080 (0.0010) [2023-10-14 02:02:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51576832. Throughput: 0: 1765.2, 1: 1789.8. Samples: 12904518. Policy #0 lag: (min: 14.0, avg: 22.0, max: 46.0) [2023-10-14 02:02:24,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.930')] [2023-10-14 02:02:24,711][33226] Updated weights for policy 1, policy_version 25290 (0.0008) [2023-10-14 02:02:25,082][33226] Updated weights for policy 1, policy_version 25300 (0.0008) [2023-10-14 02:02:25,442][33226] Updated weights for policy 1, policy_version 25310 (0.0007) [2023-10-14 02:02:27,852][33201] Updated weights for policy 0, policy_version 25090 (0.0007) [2023-10-14 02:02:28,235][33201] Updated weights for policy 0, policy_version 25100 (0.0008) [2023-10-14 02:02:28,613][33201] Updated weights for policy 0, policy_version 25110 (0.0010) [2023-10-14 02:02:28,989][33201] Updated weights for policy 0, policy_version 25120 (0.0009) [2023-10-14 02:02:29,332][33226] Updated weights for policy 1, policy_version 25320 (0.0010) [2023-10-14 02:02:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51642368. 
Throughput: 0: 1778.4, 1: 1774.3. Samples: 12915220. Policy #0 lag: (min: 10.0, avg: 12.6, max: 40.0) [2023-10-14 02:02:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 02:02:29,704][33226] Updated weights for policy 1, policy_version 25330 (0.0008) [2023-10-14 02:02:30,070][33226] Updated weights for policy 1, policy_version 25340 (0.0008) [2023-10-14 02:02:32,856][33201] Updated weights for policy 0, policy_version 25130 (0.0011) [2023-10-14 02:02:33,228][33201] Updated weights for policy 0, policy_version 25140 (0.0007) [2023-10-14 02:02:33,607][33201] Updated weights for policy 0, policy_version 25150 (0.0009) [2023-10-14 02:02:33,926][33226] Updated weights for policy 1, policy_version 25350 (0.0008) [2023-10-14 02:02:34,291][33226] Updated weights for policy 1, policy_version 25360 (0.0009) [2023-10-14 02:02:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 51707904. Throughput: 0: 1773.6, 1: 1783.7. Samples: 12936320. Policy #0 lag: (min: 10.0, avg: 12.6, max: 40.0) [2023-10-14 02:02:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 02:02:34,669][33226] Updated weights for policy 1, policy_version 25370 (0.0009) [2023-10-14 02:02:37,367][33201] Updated weights for policy 0, policy_version 25160 (0.0007) [2023-10-14 02:02:37,736][33201] Updated weights for policy 0, policy_version 25170 (0.0009) [2023-10-14 02:02:38,103][33201] Updated weights for policy 0, policy_version 25180 (0.0009) [2023-10-14 02:02:38,251][33226] Updated weights for policy 1, policy_version 25380 (0.0008) [2023-10-14 02:02:38,621][33226] Updated weights for policy 1, policy_version 25390 (0.0008) [2023-10-14 02:02:38,995][33226] Updated weights for policy 1, policy_version 25400 (0.0010) [2023-10-14 02:02:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.5, 300 sec: 14218.0). Total num frames: 51806208. Throughput: 0: 1756.7, 1: 1790.2. Samples: 12956952. 
Policy #0 lag: (min: 10.0, avg: 12.6, max: 40.0) [2023-10-14 02:02:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 02:02:41,973][33201] Updated weights for policy 0, policy_version 25190 (0.0008) [2023-10-14 02:02:42,346][33201] Updated weights for policy 0, policy_version 25200 (0.0010) [2023-10-14 02:02:42,707][33201] Updated weights for policy 0, policy_version 25210 (0.0008) [2023-10-14 02:02:42,902][33226] Updated weights for policy 1, policy_version 25410 (0.0008) [2023-10-14 02:02:43,276][33226] Updated weights for policy 1, policy_version 25420 (0.0010) [2023-10-14 02:02:43,646][33226] Updated weights for policy 1, policy_version 25430 (0.0011) [2023-10-14 02:02:44,011][33226] Updated weights for policy 1, policy_version 25440 (0.0008) [2023-10-14 02:02:44,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51871744. Throughput: 0: 1783.5, 1: 1781.5. Samples: 12968630. Policy #0 lag: (min: 10.0, avg: 12.6, max: 40.0) [2023-10-14 02:02:44,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 02:02:46,679][33201] Updated weights for policy 0, policy_version 25220 (0.0008) [2023-10-14 02:02:47,050][33201] Updated weights for policy 0, policy_version 25230 (0.0011) [2023-10-14 02:02:47,422][33201] Updated weights for policy 0, policy_version 25240 (0.0010) [2023-10-14 02:02:47,782][33226] Updated weights for policy 1, policy_version 25450 (0.0008) [2023-10-14 02:02:48,146][33226] Updated weights for policy 1, policy_version 25460 (0.0007) [2023-10-14 02:02:48,522][33226] Updated weights for policy 1, policy_version 25470 (0.0008) [2023-10-14 02:02:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 51937280. Throughput: 0: 1756.6, 1: 1791.8. Samples: 12988946. 
Policy #0 lag: (min: 25.0, avg: 41.6, max: 57.0) [2023-10-14 02:02:49,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 02:02:51,371][33201] Updated weights for policy 0, policy_version 25250 (0.0008) [2023-10-14 02:02:51,743][33201] Updated weights for policy 0, policy_version 25260 (0.0009) [2023-10-14 02:02:52,122][33201] Updated weights for policy 0, policy_version 25270 (0.0007) [2023-10-14 02:02:52,278][33226] Updated weights for policy 1, policy_version 25480 (0.0007) [2023-10-14 02:02:52,490][33201] Updated weights for policy 0, policy_version 25280 (0.0009) [2023-10-14 02:02:52,641][33226] Updated weights for policy 1, policy_version 25490 (0.0009) [2023-10-14 02:02:53,020][33226] Updated weights for policy 1, policy_version 25500 (0.0009) [2023-10-14 02:02:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52002816. Throughput: 0: 1749.3, 1: 1772.3. Samples: 13009968. Policy #0 lag: (min: 25.0, avg: 41.6, max: 57.0) [2023-10-14 02:02:54,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 02:02:56,164][33201] Updated weights for policy 0, policy_version 25290 (0.0007) [2023-10-14 02:02:56,543][33201] Updated weights for policy 0, policy_version 25300 (0.0009) [2023-10-14 02:02:56,905][33201] Updated weights for policy 0, policy_version 25310 (0.0009) [2023-10-14 02:02:56,913][33226] Updated weights for policy 1, policy_version 25510 (0.0008) [2023-10-14 02:02:57,280][33226] Updated weights for policy 1, policy_version 25520 (0.0007) [2023-10-14 02:02:57,649][33226] Updated weights for policy 1, policy_version 25530 (0.0007) [2023-10-14 02:02:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52068352. Throughput: 0: 1748.3, 1: 1796.0. Samples: 13020634. 
Policy #0 lag: (min: 25.0, avg: 41.6, max: 57.0) [2023-10-14 02:02:59,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 02:03:00,588][33201] Updated weights for policy 0, policy_version 25320 (0.0008) [2023-10-14 02:03:00,952][33201] Updated weights for policy 0, policy_version 25330 (0.0007) [2023-10-14 02:03:01,331][33201] Updated weights for policy 0, policy_version 25340 (0.0009) [2023-10-14 02:03:01,358][33226] Updated weights for policy 1, policy_version 25540 (0.0008) [2023-10-14 02:03:01,731][33226] Updated weights for policy 1, policy_version 25550 (0.0010) [2023-10-14 02:03:02,101][33226] Updated weights for policy 1, policy_version 25560 (0.0007) [2023-10-14 02:03:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52133888. Throughput: 0: 1749.6, 1: 1771.4. Samples: 13041984. Policy #0 lag: (min: 25.0, avg: 41.6, max: 57.0) [2023-10-14 02:03:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 02:03:05,228][33201] Updated weights for policy 0, policy_version 25350 (0.0008) [2023-10-14 02:03:05,597][33201] Updated weights for policy 0, policy_version 25360 (0.0009) [2023-10-14 02:03:05,854][33226] Updated weights for policy 1, policy_version 25570 (0.0007) [2023-10-14 02:03:05,971][33201] Updated weights for policy 0, policy_version 25370 (0.0007) [2023-10-14 02:03:06,219][33226] Updated weights for policy 1, policy_version 25580 (0.0007) [2023-10-14 02:03:06,588][33226] Updated weights for policy 1, policy_version 25590 (0.0008) [2023-10-14 02:03:06,960][33226] Updated weights for policy 1, policy_version 25600 (0.0007) [2023-10-14 02:03:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 52199424. Throughput: 0: 1773.0, 1: 1773.6. Samples: 13064114. 
Policy #0 lag: (min: 18.0, avg: 18.1, max: 22.0) [2023-10-14 02:03:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 02:03:09,779][33201] Updated weights for policy 0, policy_version 25380 (0.0008) [2023-10-14 02:03:10,156][33201] Updated weights for policy 0, policy_version 25390 (0.0007) [2023-10-14 02:03:10,533][33201] Updated weights for policy 0, policy_version 25400 (0.0008) [2023-10-14 02:03:10,773][33226] Updated weights for policy 1, policy_version 25610 (0.0007) [2023-10-14 02:03:11,143][33226] Updated weights for policy 1, policy_version 25620 (0.0007) [2023-10-14 02:03:11,514][33226] Updated weights for policy 1, policy_version 25630 (0.0007) [2023-10-14 02:03:14,499][33201] Updated weights for policy 0, policy_version 25410 (0.0009) [2023-10-14 02:03:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 52264960. Throughput: 0: 1747.0, 1: 1775.3. Samples: 13073724. Policy #0 lag: (min: 18.0, avg: 18.1, max: 22.0) [2023-10-14 02:03:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 02:03:14,911][33201] Updated weights for policy 0, policy_version 25420 (0.0009) [2023-10-14 02:03:15,098][33226] Updated weights for policy 1, policy_version 25640 (0.0007) [2023-10-14 02:03:15,275][33201] Updated weights for policy 0, policy_version 25430 (0.0007) [2023-10-14 02:03:15,455][33226] Updated weights for policy 1, policy_version 25650 (0.0009) [2023-10-14 02:03:15,640][33201] Updated weights for policy 0, policy_version 25440 (0.0007) [2023-10-14 02:03:15,829][33226] Updated weights for policy 1, policy_version 25660 (0.0008) [2023-10-14 02:03:19,449][33201] Updated weights for policy 0, policy_version 25450 (0.0009) [2023-10-14 02:03:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 52330496. Throughput: 0: 1765.1, 1: 1779.3. Samples: 13095816. 
Policy #0 lag: (min: 18.0, avg: 18.1, max: 22.0) [2023-10-14 02:03:19,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 02:03:19,729][33226] Updated weights for policy 1, policy_version 25670 (0.0008) [2023-10-14 02:03:19,810][33201] Updated weights for policy 0, policy_version 25460 (0.0008) [2023-10-14 02:03:20,088][33226] Updated weights for policy 1, policy_version 25680 (0.0008) [2023-10-14 02:03:20,178][33201] Updated weights for policy 0, policy_version 25470 (0.0008) [2023-10-14 02:03:20,463][33226] Updated weights for policy 1, policy_version 25690 (0.0008) [2023-10-14 02:03:23,841][33201] Updated weights for policy 0, policy_version 25480 (0.0008) [2023-10-14 02:03:24,211][33201] Updated weights for policy 0, policy_version 25490 (0.0009) [2023-10-14 02:03:24,305][33226] Updated weights for policy 1, policy_version 25700 (0.0007) [2023-10-14 02:03:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 52396032. Throughput: 0: 1769.4, 1: 1792.1. Samples: 13117220. Policy #0 lag: (min: 18.0, avg: 18.1, max: 22.0) [2023-10-14 02:03:24,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.920')] [2023-10-14 02:03:24,588][33201] Updated weights for policy 0, policy_version 25500 (0.0008) [2023-10-14 02:03:24,680][33226] Updated weights for policy 1, policy_version 25710 (0.0008) [2023-10-14 02:03:24,742][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000025504_26116096.pth... [2023-10-14 02:03:24,775][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000023840_24412160.pth [2023-10-14 02:03:25,044][33226] Updated weights for policy 1, policy_version 25720 (0.0008) [2023-10-14 02:03:25,339][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000025728_26345472.pth... 
[2023-10-14 02:03:25,367][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000024032_24608768.pth [2023-10-14 02:03:28,560][33201] Updated weights for policy 0, policy_version 25510 (0.0007) [2023-10-14 02:03:28,911][33226] Updated weights for policy 1, policy_version 25730 (0.0009) [2023-10-14 02:03:28,945][33201] Updated weights for policy 0, policy_version 25520 (0.0009) [2023-10-14 02:03:29,280][33226] Updated weights for policy 1, policy_version 25740 (0.0009) [2023-10-14 02:03:29,321][33201] Updated weights for policy 0, policy_version 25530 (0.0007) [2023-10-14 02:03:29,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52494336. Throughput: 0: 1756.7, 1: 1768.7. Samples: 13127274. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:03:29,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.920')] [2023-10-14 02:03:29,652][33226] Updated weights for policy 1, policy_version 25750 (0.0010) [2023-10-14 02:03:30,016][33226] Updated weights for policy 1, policy_version 25760 (0.0007) [2023-10-14 02:03:33,146][33201] Updated weights for policy 0, policy_version 25540 (0.0008) [2023-10-14 02:03:33,519][33201] Updated weights for policy 0, policy_version 25550 (0.0010) [2023-10-14 02:03:33,898][33201] Updated weights for policy 0, policy_version 25560 (0.0008) [2023-10-14 02:03:33,910][33226] Updated weights for policy 1, policy_version 25770 (0.0008) [2023-10-14 02:03:34,272][33226] Updated weights for policy 1, policy_version 25780 (0.0009) [2023-10-14 02:03:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52559872. Throughput: 0: 1779.9, 1: 1777.9. Samples: 13149048. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:03:34,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.920')] [2023-10-14 02:03:34,639][33226] Updated weights for policy 1, policy_version 25790 (0.0009) [2023-10-14 02:03:37,681][33201] Updated weights for policy 0, policy_version 25570 (0.0008) [2023-10-14 02:03:38,044][33201] Updated weights for policy 0, policy_version 25580 (0.0010) [2023-10-14 02:03:38,412][33201] Updated weights for policy 0, policy_version 25590 (0.0007) [2023-10-14 02:03:38,493][33226] Updated weights for policy 1, policy_version 25800 (0.0007) [2023-10-14 02:03:38,789][33201] Updated weights for policy 0, policy_version 25600 (0.0009) [2023-10-14 02:03:38,860][33226] Updated weights for policy 1, policy_version 25810 (0.0010) [2023-10-14 02:03:39,221][33226] Updated weights for policy 1, policy_version 25820 (0.0007) [2023-10-14 02:03:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52658176. Throughput: 0: 1749.1, 1: 1780.6. Samples: 13168802. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:03:39,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 02:03:42,561][33201] Updated weights for policy 0, policy_version 25610 (0.0010) [2023-10-14 02:03:42,934][33201] Updated weights for policy 0, policy_version 25620 (0.0009) [2023-10-14 02:03:42,966][33226] Updated weights for policy 1, policy_version 25830 (0.0008) [2023-10-14 02:03:43,303][33201] Updated weights for policy 0, policy_version 25630 (0.0008) [2023-10-14 02:03:43,334][33226] Updated weights for policy 1, policy_version 25840 (0.0008) [2023-10-14 02:03:43,697][33226] Updated weights for policy 1, policy_version 25850 (0.0007) [2023-10-14 02:03:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 52723712. Throughput: 0: 1783.9, 1: 1775.7. Samples: 13180814. 
Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-14 02:03:44,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 02:03:47,085][33201] Updated weights for policy 0, policy_version 25640 (0.0007) [2023-10-14 02:03:47,454][33201] Updated weights for policy 0, policy_version 25650 (0.0009) [2023-10-14 02:03:47,657][33226] Updated weights for policy 1, policy_version 25860 (0.0011) [2023-10-14 02:03:47,830][33201] Updated weights for policy 0, policy_version 25660 (0.0008) [2023-10-14 02:03:48,023][33226] Updated weights for policy 1, policy_version 25870 (0.0008) [2023-10-14 02:03:48,394][33226] Updated weights for policy 1, policy_version 25880 (0.0008) [2023-10-14 02:03:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52789248. Throughput: 0: 1747.2, 1: 1781.0. Samples: 13200752. Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-14 02:03:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:03:51,717][33201] Updated weights for policy 0, policy_version 25670 (0.0007) [2023-10-14 02:03:52,050][33226] Updated weights for policy 1, policy_version 25890 (0.0007) [2023-10-14 02:03:52,091][33201] Updated weights for policy 0, policy_version 25680 (0.0010) [2023-10-14 02:03:52,414][33226] Updated weights for policy 1, policy_version 25900 (0.0008) [2023-10-14 02:03:52,454][33201] Updated weights for policy 0, policy_version 25690 (0.0007) [2023-10-14 02:03:52,783][33226] Updated weights for policy 1, policy_version 25910 (0.0009) [2023-10-14 02:03:53,148][33226] Updated weights for policy 1, policy_version 25920 (0.0008) [2023-10-14 02:03:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 52854784. Throughput: 0: 1745.6, 1: 1760.7. Samples: 13221898. 
Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-14 02:03:54,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:03:56,434][33201] Updated weights for policy 0, policy_version 25700 (0.0007) [2023-10-14 02:03:56,809][33201] Updated weights for policy 0, policy_version 25710 (0.0009) [2023-10-14 02:03:56,956][33226] Updated weights for policy 1, policy_version 25930 (0.0009) [2023-10-14 02:03:57,185][33201] Updated weights for policy 0, policy_version 25720 (0.0008) [2023-10-14 02:03:57,319][33226] Updated weights for policy 1, policy_version 25940 (0.0009) [2023-10-14 02:03:57,694][33226] Updated weights for policy 1, policy_version 25950 (0.0007) [2023-10-14 02:03:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 52920320. Throughput: 0: 1754.8, 1: 1783.6. Samples: 13232956. Policy #0 lag: (min: 17.0, avg: 33.1, max: 49.0) [2023-10-14 02:03:59,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:04:01,035][33201] Updated weights for policy 0, policy_version 25730 (0.0009) [2023-10-14 02:04:01,407][33201] Updated weights for policy 0, policy_version 25740 (0.0008) [2023-10-14 02:04:01,471][33226] Updated weights for policy 1, policy_version 25960 (0.0009) [2023-10-14 02:04:01,781][33201] Updated weights for policy 0, policy_version 25750 (0.0010) [2023-10-14 02:04:01,840][33226] Updated weights for policy 1, policy_version 25970 (0.0007) [2023-10-14 02:04:02,154][33201] Updated weights for policy 0, policy_version 25760 (0.0009) [2023-10-14 02:04:02,219][33226] Updated weights for policy 1, policy_version 25980 (0.0008) [2023-10-14 02:04:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 52985856. Throughput: 0: 1747.2, 1: 1765.1. Samples: 13253866. 
Policy #0 lag: (min: 17.0, avg: 30.5, max: 49.0) [2023-10-14 02:04:04,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:04:05,862][33201] Updated weights for policy 0, policy_version 25770 (0.0009) [2023-10-14 02:04:06,087][33226] Updated weights for policy 1, policy_version 25990 (0.0008) [2023-10-14 02:04:06,239][33201] Updated weights for policy 0, policy_version 25780 (0.0010) [2023-10-14 02:04:06,453][33226] Updated weights for policy 1, policy_version 26000 (0.0008) [2023-10-14 02:04:06,603][33201] Updated weights for policy 0, policy_version 25790 (0.0009) [2023-10-14 02:04:06,824][33226] Updated weights for policy 1, policy_version 26010 (0.0007) [2023-10-14 02:04:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 53051392. Throughput: 0: 1756.8, 1: 1771.7. Samples: 13276002. Policy #0 lag: (min: 17.0, avg: 30.5, max: 49.0) [2023-10-14 02:04:09,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 02:04:10,503][33201] Updated weights for policy 0, policy_version 25800 (0.0008) [2023-10-14 02:04:10,585][33226] Updated weights for policy 1, policy_version 26020 (0.0007) [2023-10-14 02:04:10,881][33201] Updated weights for policy 0, policy_version 25810 (0.0010) [2023-10-14 02:04:10,948][33226] Updated weights for policy 1, policy_version 26030 (0.0008) [2023-10-14 02:04:11,246][33201] Updated weights for policy 0, policy_version 25820 (0.0009) [2023-10-14 02:04:11,315][33226] Updated weights for policy 1, policy_version 26040 (0.0010) [2023-10-14 02:04:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 53116928. Throughput: 0: 1744.2, 1: 1771.7. Samples: 13285490. 
Policy #0 lag: (min: 17.0, avg: 30.5, max: 49.0) [2023-10-14 02:04:14,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:04:15,056][33226] Updated weights for policy 1, policy_version 26050 (0.0010) [2023-10-14 02:04:15,139][33201] Updated weights for policy 0, policy_version 25830 (0.0007) [2023-10-14 02:04:15,438][33226] Updated weights for policy 1, policy_version 26060 (0.0007) [2023-10-14 02:04:15,513][33201] Updated weights for policy 0, policy_version 25840 (0.0008) [2023-10-14 02:04:15,796][33226] Updated weights for policy 1, policy_version 26070 (0.0007) [2023-10-14 02:04:15,879][33201] Updated weights for policy 0, policy_version 25850 (0.0008) [2023-10-14 02:04:16,159][33226] Updated weights for policy 1, policy_version 26080 (0.0009) [2023-10-14 02:04:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 53182464. Throughput: 0: 1745.4, 1: 1775.8. Samples: 13307502. Policy #0 lag: (min: 17.0, avg: 30.5, max: 49.0) [2023-10-14 02:04:19,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.900')] [2023-10-14 02:04:19,848][33201] Updated weights for policy 0, policy_version 25860 (0.0008) [2023-10-14 02:04:19,894][33226] Updated weights for policy 1, policy_version 26090 (0.0008) [2023-10-14 02:04:20,220][33201] Updated weights for policy 0, policy_version 25870 (0.0007) [2023-10-14 02:04:20,261][33226] Updated weights for policy 1, policy_version 26100 (0.0007) [2023-10-14 02:04:20,593][33201] Updated weights for policy 0, policy_version 25880 (0.0007) [2023-10-14 02:04:20,627][33226] Updated weights for policy 1, policy_version 26110 (0.0007) [2023-10-14 02:04:24,344][33201] Updated weights for policy 0, policy_version 25890 (0.0008) [2023-10-14 02:04:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 53248000. Throughput: 0: 1776.7, 1: 1793.0. Samples: 13329436. 
Policy #0 lag: (min: 20.0, avg: 20.1, max: 27.0) [2023-10-14 02:04:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:04:24,601][33226] Updated weights for policy 1, policy_version 26120 (0.0007) [2023-10-14 02:04:24,713][33201] Updated weights for policy 0, policy_version 25900 (0.0010) [2023-10-14 02:04:24,961][33226] Updated weights for policy 1, policy_version 26130 (0.0009) [2023-10-14 02:04:25,076][33201] Updated weights for policy 0, policy_version 25910 (0.0008) [2023-10-14 02:04:25,329][33226] Updated weights for policy 1, policy_version 26140 (0.0009) [2023-10-14 02:04:25,450][33201] Updated weights for policy 0, policy_version 25920 (0.0008) [2023-10-14 02:04:29,121][33226] Updated weights for policy 1, policy_version 26150 (0.0009) [2023-10-14 02:04:29,262][33201] Updated weights for policy 0, policy_version 25930 (0.0008) [2023-10-14 02:04:29,492][33226] Updated weights for policy 1, policy_version 26160 (0.0008) [2023-10-14 02:04:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 53313536. Throughput: 0: 1743.7, 1: 1771.0. Samples: 13338976. 
Policy #0 lag: (min: 20.0, avg: 20.1, max: 27.0) [2023-10-14 02:04:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:04:29,625][33201] Updated weights for policy 0, policy_version 25940 (0.0007) [2023-10-14 02:04:29,852][33226] Updated weights for policy 1, policy_version 26170 (0.0007) [2023-10-14 02:04:29,996][33201] Updated weights for policy 0, policy_version 25950 (0.0008) [2023-10-14 02:04:33,789][33226] Updated weights for policy 1, policy_version 26180 (0.0009) [2023-10-14 02:04:33,857][33201] Updated weights for policy 0, policy_version 25960 (0.0008) [2023-10-14 02:04:34,162][33226] Updated weights for policy 1, policy_version 26190 (0.0008) [2023-10-14 02:04:34,235][33201] Updated weights for policy 0, policy_version 25970 (0.0008) [2023-10-14 02:04:34,523][33226] Updated weights for policy 1, policy_version 26200 (0.0007) [2023-10-14 02:04:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 53379072. Throughput: 0: 1777.2, 1: 1788.4. Samples: 13361204. 
Policy #0 lag: (min: 20.0, avg: 20.1, max: 27.0) [2023-10-14 02:04:34,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:04:34,600][33201] Updated weights for policy 0, policy_version 25980 (0.0007) [2023-10-14 02:04:38,353][33226] Updated weights for policy 1, policy_version 26210 (0.0008) [2023-10-14 02:04:38,543][33201] Updated weights for policy 0, policy_version 25990 (0.0009) [2023-10-14 02:04:38,718][33226] Updated weights for policy 1, policy_version 26220 (0.0009) [2023-10-14 02:04:38,905][33201] Updated weights for policy 0, policy_version 26000 (0.0007) [2023-10-14 02:04:39,093][33226] Updated weights for policy 1, policy_version 26230 (0.0009) [2023-10-14 02:04:39,273][33201] Updated weights for policy 0, policy_version 26010 (0.0009) [2023-10-14 02:04:39,460][33226] Updated weights for policy 1, policy_version 26240 (0.0009) [2023-10-14 02:04:39,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53510144. Throughput: 0: 1754.6, 1: 1788.5. Samples: 13381340. Policy #0 lag: (min: 1.0, avg: 5.3, max: 33.0) [2023-10-14 02:04:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 02:04:43,118][33201] Updated weights for policy 0, policy_version 26020 (0.0009) [2023-10-14 02:04:43,221][33226] Updated weights for policy 1, policy_version 26250 (0.0007) [2023-10-14 02:04:43,485][33201] Updated weights for policy 0, policy_version 26030 (0.0007) [2023-10-14 02:04:43,577][33226] Updated weights for policy 1, policy_version 26260 (0.0008) [2023-10-14 02:04:43,857][33201] Updated weights for policy 0, policy_version 26040 (0.0008) [2023-10-14 02:04:43,943][33226] Updated weights for policy 1, policy_version 26270 (0.0009) [2023-10-14 02:04:44,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53575680. Throughput: 0: 1766.0, 1: 1781.4. Samples: 13392592. 
Policy #0 lag: (min: 1.0, avg: 5.3, max: 33.0) [2023-10-14 02:04:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 02:04:47,609][33201] Updated weights for policy 0, policy_version 26050 (0.0008) [2023-10-14 02:04:47,846][33226] Updated weights for policy 1, policy_version 26280 (0.0009) [2023-10-14 02:04:47,986][33201] Updated weights for policy 0, policy_version 26060 (0.0009) [2023-10-14 02:04:48,222][33226] Updated weights for policy 1, policy_version 26290 (0.0008) [2023-10-14 02:04:48,352][33201] Updated weights for policy 0, policy_version 26070 (0.0008) [2023-10-14 02:04:48,585][33226] Updated weights for policy 1, policy_version 26300 (0.0009) [2023-10-14 02:04:48,719][33201] Updated weights for policy 0, policy_version 26080 (0.0008) [2023-10-14 02:04:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53641216. Throughput: 0: 1759.2, 1: 1782.8. Samples: 13413260. Policy #0 lag: (min: 1.0, avg: 5.3, max: 33.0) [2023-10-14 02:04:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 02:04:52,334][33226] Updated weights for policy 1, policy_version 26310 (0.0008) [2023-10-14 02:04:52,697][33226] Updated weights for policy 1, policy_version 26320 (0.0008) [2023-10-14 02:04:52,756][33201] Updated weights for policy 0, policy_version 26090 (0.0008) [2023-10-14 02:04:53,072][33226] Updated weights for policy 1, policy_version 26330 (0.0007) [2023-10-14 02:04:53,120][33201] Updated weights for policy 0, policy_version 26100 (0.0008) [2023-10-14 02:04:53,489][33201] Updated weights for policy 0, policy_version 26110 (0.0007) [2023-10-14 02:04:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53706752. Throughput: 0: 1740.2, 1: 1761.3. Samples: 13433568. 
Policy #0 lag: (min: 1.0, avg: 5.3, max: 33.0) [2023-10-14 02:04:54,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.870')] [2023-10-14 02:04:56,649][33226] Updated weights for policy 1, policy_version 26340 (0.0009) [2023-10-14 02:04:57,015][33226] Updated weights for policy 1, policy_version 26350 (0.0010) [2023-10-14 02:04:57,380][33226] Updated weights for policy 1, policy_version 26360 (0.0007) [2023-10-14 02:04:57,408][33201] Updated weights for policy 0, policy_version 26120 (0.0007) [2023-10-14 02:04:57,766][33201] Updated weights for policy 0, policy_version 26130 (0.0008) [2023-10-14 02:04:58,134][33201] Updated weights for policy 0, policy_version 26140 (0.0011) [2023-10-14 02:04:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53772288. Throughput: 0: 1771.7, 1: 1784.3. Samples: 13445510. Policy #0 lag: (min: 28.0, avg: 44.2, max: 60.0) [2023-10-14 02:04:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.870')] [2023-10-14 02:05:01,164][33226] Updated weights for policy 1, policy_version 26370 (0.0007) [2023-10-14 02:05:01,529][33226] Updated weights for policy 1, policy_version 26380 (0.0011) [2023-10-14 02:05:01,896][33226] Updated weights for policy 1, policy_version 26390 (0.0007) [2023-10-14 02:05:02,080][33201] Updated weights for policy 0, policy_version 26150 (0.0009) [2023-10-14 02:05:02,257][33226] Updated weights for policy 1, policy_version 26400 (0.0007) [2023-10-14 02:05:02,462][33201] Updated weights for policy 0, policy_version 26160 (0.0007) [2023-10-14 02:05:02,828][33201] Updated weights for policy 0, policy_version 26170 (0.0009) [2023-10-14 02:05:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 53837824. Throughput: 0: 1739.9, 1: 1765.8. Samples: 13465256. 
Policy #0 lag: (min: 28.0, avg: 44.2, max: 60.0) [2023-10-14 02:05:04,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.880')] [2023-10-14 02:05:06,060][33226] Updated weights for policy 1, policy_version 26410 (0.0010) [2023-10-14 02:05:06,429][33226] Updated weights for policy 1, policy_version 26420 (0.0007) [2023-10-14 02:05:06,573][33201] Updated weights for policy 0, policy_version 26180 (0.0008) [2023-10-14 02:05:06,794][33226] Updated weights for policy 1, policy_version 26430 (0.0009) [2023-10-14 02:05:06,947][33201] Updated weights for policy 0, policy_version 26190 (0.0007) [2023-10-14 02:05:07,319][33201] Updated weights for policy 0, policy_version 26200 (0.0010) [2023-10-14 02:05:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 53903360. Throughput: 0: 1741.1, 1: 1770.1. Samples: 13487440. Policy #0 lag: (min: 28.0, avg: 44.2, max: 60.0) [2023-10-14 02:05:09,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.880')] [2023-10-14 02:05:10,718][33226] Updated weights for policy 1, policy_version 26440 (0.0008) [2023-10-14 02:05:11,082][33226] Updated weights for policy 1, policy_version 26450 (0.0009) [2023-10-14 02:05:11,115][33201] Updated weights for policy 0, policy_version 26210 (0.0010) [2023-10-14 02:05:11,455][33226] Updated weights for policy 1, policy_version 26460 (0.0007) [2023-10-14 02:05:11,482][33201] Updated weights for policy 0, policy_version 26220 (0.0008) [2023-10-14 02:05:11,858][33201] Updated weights for policy 0, policy_version 26230 (0.0007) [2023-10-14 02:05:12,230][33201] Updated weights for policy 0, policy_version 26240 (0.0009) [2023-10-14 02:05:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 53968896. Throughput: 0: 1748.3, 1: 1769.7. Samples: 13497286. 
Policy #0 lag: (min: 28.0, avg: 44.2, max: 60.0) [2023-10-14 02:05:14,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.900')] [2023-10-14 02:05:15,220][33226] Updated weights for policy 1, policy_version 26470 (0.0007) [2023-10-14 02:05:15,588][33226] Updated weights for policy 1, policy_version 26480 (0.0008) [2023-10-14 02:05:15,819][33201] Updated weights for policy 0, policy_version 26250 (0.0007) [2023-10-14 02:05:15,955][33226] Updated weights for policy 1, policy_version 26490 (0.0007) [2023-10-14 02:05:16,194][33201] Updated weights for policy 0, policy_version 26260 (0.0008) [2023-10-14 02:05:16,573][33201] Updated weights for policy 0, policy_version 26270 (0.0007) [2023-10-14 02:05:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 54034432. Throughput: 0: 1748.1, 1: 1766.3. Samples: 13519354. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:05:19,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.900')] [2023-10-14 02:05:19,784][33226] Updated weights for policy 1, policy_version 26500 (0.0008) [2023-10-14 02:05:20,157][33226] Updated weights for policy 1, policy_version 26510 (0.0007) [2023-10-14 02:05:20,399][33201] Updated weights for policy 0, policy_version 26280 (0.0007) [2023-10-14 02:05:20,519][33226] Updated weights for policy 1, policy_version 26520 (0.0008) [2023-10-14 02:05:20,773][33201] Updated weights for policy 0, policy_version 26290 (0.0008) [2023-10-14 02:05:21,158][33201] Updated weights for policy 0, policy_version 26300 (0.0010) [2023-10-14 02:05:24,321][33226] Updated weights for policy 1, policy_version 26530 (0.0009) [2023-10-14 02:05:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 54099968. Throughput: 0: 1774.3, 1: 1787.1. Samples: 13541602. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:05:24,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 02:05:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000026304_26935296.pth... [2023-10-14 02:05:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000024672_25264128.pth [2023-10-14 02:05:24,684][33226] Updated weights for policy 1, policy_version 26540 (0.0010) [2023-10-14 02:05:24,965][33201] Updated weights for policy 0, policy_version 26310 (0.0009) [2023-10-14 02:05:25,061][33226] Updated weights for policy 1, policy_version 26550 (0.0009) [2023-10-14 02:05:25,327][33201] Updated weights for policy 0, policy_version 26320 (0.0007) [2023-10-14 02:05:25,432][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000026560_27197440.pth... [2023-10-14 02:05:25,433][33226] Updated weights for policy 1, policy_version 26560 (0.0008) [2023-10-14 02:05:25,465][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000024896_25493504.pth [2023-10-14 02:05:25,694][33201] Updated weights for policy 0, policy_version 26330 (0.0009) [2023-10-14 02:05:29,134][33226] Updated weights for policy 1, policy_version 26570 (0.0007) [2023-10-14 02:05:29,511][33226] Updated weights for policy 1, policy_version 26580 (0.0008) [2023-10-14 02:05:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 54165504. Throughput: 0: 1748.5, 1: 1768.3. Samples: 13550848. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:05:29,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-14 02:05:29,564][33201] Updated weights for policy 0, policy_version 26340 (0.0009) [2023-10-14 02:05:29,873][33226] Updated weights for policy 1, policy_version 26590 (0.0007) [2023-10-14 02:05:29,936][33201] Updated weights for policy 0, policy_version 26350 (0.0009) [2023-10-14 02:05:30,304][33201] Updated weights for policy 0, policy_version 26360 (0.0010) [2023-10-14 02:05:33,721][33226] Updated weights for policy 1, policy_version 26600 (0.0010) [2023-10-14 02:05:34,085][33226] Updated weights for policy 1, policy_version 26610 (0.0007) [2023-10-14 02:05:34,373][33201] Updated weights for policy 0, policy_version 26370 (0.0007) [2023-10-14 02:05:34,458][33226] Updated weights for policy 1, policy_version 26620 (0.0009) [2023-10-14 02:05:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 54231040. Throughput: 0: 1762.6, 1: 1789.1. Samples: 13573086. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:05:34,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 02:05:34,737][33201] Updated weights for policy 0, policy_version 26380 (0.0008) [2023-10-14 02:05:35,107][33201] Updated weights for policy 0, policy_version 26390 (0.0010) [2023-10-14 02:05:35,479][33201] Updated weights for policy 0, policy_version 26400 (0.0010) [2023-10-14 02:05:38,285][33226] Updated weights for policy 1, policy_version 26630 (0.0007) [2023-10-14 02:05:38,656][33226] Updated weights for policy 1, policy_version 26640 (0.0008) [2023-10-14 02:05:39,029][33226] Updated weights for policy 1, policy_version 26650 (0.0007) [2023-10-14 02:05:39,173][33201] Updated weights for policy 0, policy_version 26410 (0.0007) [2023-10-14 02:05:39,547][33201] Updated weights for policy 0, policy_version 26420 (0.0009) [2023-10-14 02:05:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 54329344. Throughput: 0: 1778.8, 1: 1783.8. Samples: 13593882. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:05:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.920')] [2023-10-14 02:05:39,922][33201] Updated weights for policy 0, policy_version 26430 (0.0008) [2023-10-14 02:05:42,828][33226] Updated weights for policy 1, policy_version 26660 (0.0008) [2023-10-14 02:05:43,195][33226] Updated weights for policy 1, policy_version 26670 (0.0009) [2023-10-14 02:05:43,559][33226] Updated weights for policy 1, policy_version 26680 (0.0008) [2023-10-14 02:05:43,781][33201] Updated weights for policy 0, policy_version 26440 (0.0008) [2023-10-14 02:05:44,146][33201] Updated weights for policy 0, policy_version 26450 (0.0008) [2023-10-14 02:05:44,525][33201] Updated weights for policy 0, policy_version 26460 (0.0009) [2023-10-14 02:05:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 54394880. Throughput: 0: 1753.8, 1: 1784.0. 
Samples: 13604712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:05:44,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 02:05:47,479][33226] Updated weights for policy 1, policy_version 26690 (0.0008) [2023-10-14 02:05:47,875][33226] Updated weights for policy 1, policy_version 26700 (0.0007) [2023-10-14 02:05:48,235][33226] Updated weights for policy 1, policy_version 26710 (0.0007) [2023-10-14 02:05:48,314][33201] Updated weights for policy 0, policy_version 26470 (0.0009) [2023-10-14 02:05:48,605][33226] Updated weights for policy 1, policy_version 26720 (0.0008) [2023-10-14 02:05:48,689][33201] Updated weights for policy 0, policy_version 26480 (0.0007) [2023-10-14 02:05:49,066][33201] Updated weights for policy 0, policy_version 26490 (0.0009) [2023-10-14 02:05:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 54493184. Throughput: 0: 1792.5, 1: 1785.5. Samples: 13626268. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:05:49,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.930')] [2023-10-14 02:05:52,312][33226] Updated weights for policy 1, policy_version 26730 (0.0010) [2023-10-14 02:05:52,683][33226] Updated weights for policy 1, policy_version 26740 (0.0009) [2023-10-14 02:05:52,968][33201] Updated weights for policy 0, policy_version 26500 (0.0009) [2023-10-14 02:05:53,057][33226] Updated weights for policy 1, policy_version 26750 (0.0008) [2023-10-14 02:05:53,338][33201] Updated weights for policy 0, policy_version 26510 (0.0008) [2023-10-14 02:05:53,720][33201] Updated weights for policy 0, policy_version 26520 (0.0011) [2023-10-14 02:05:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 54558720. Throughput: 0: 1762.8, 1: 1763.7. Samples: 13646136. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:05:54,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.930')] [2023-10-14 02:05:56,910][33226] Updated weights for policy 1, policy_version 26760 (0.0008) [2023-10-14 02:05:57,268][33226] Updated weights for policy 1, policy_version 26770 (0.0010) [2023-10-14 02:05:57,586][33201] Updated weights for policy 0, policy_version 26530 (0.0010) [2023-10-14 02:05:57,645][33226] Updated weights for policy 1, policy_version 26780 (0.0011) [2023-10-14 02:05:57,958][33201] Updated weights for policy 0, policy_version 26540 (0.0008) [2023-10-14 02:05:58,328][33201] Updated weights for policy 0, policy_version 26550 (0.0009) [2023-10-14 02:05:58,694][33201] Updated weights for policy 0, policy_version 26560 (0.0007) [2023-10-14 02:05:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 54624256. Throughput: 0: 1780.7, 1: 1787.6. Samples: 13657856. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:05:59,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.910')] [2023-10-14 02:06:01,311][33226] Updated weights for policy 1, policy_version 26790 (0.0010) [2023-10-14 02:06:01,675][33226] Updated weights for policy 1, policy_version 26800 (0.0008) [2023-10-14 02:06:02,038][33226] Updated weights for policy 1, policy_version 26810 (0.0010) [2023-10-14 02:06:02,630][33201] Updated weights for policy 0, policy_version 26570 (0.0009) [2023-10-14 02:06:02,999][33201] Updated weights for policy 0, policy_version 26580 (0.0008) [2023-10-14 02:06:03,365][33201] Updated weights for policy 0, policy_version 26590 (0.0009) [2023-10-14 02:06:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 54689792. Throughput: 0: 1759.6, 1: 1767.7. Samples: 13678086. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:06:04,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.880')] [2023-10-14 02:06:05,807][33226] Updated weights for policy 1, policy_version 26820 (0.0009) [2023-10-14 02:06:06,181][33226] Updated weights for policy 1, policy_version 26830 (0.0007) [2023-10-14 02:06:06,551][33226] Updated weights for policy 1, policy_version 26840 (0.0007) [2023-10-14 02:06:07,157][33201] Updated weights for policy 0, policy_version 26600 (0.0010) [2023-10-14 02:06:07,538][33201] Updated weights for policy 0, policy_version 26610 (0.0009) [2023-10-14 02:06:07,901][33201] Updated weights for policy 0, policy_version 26620 (0.0008) [2023-10-14 02:06:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 54755328. Throughput: 0: 1748.9, 1: 1772.3. Samples: 13700058. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:06:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.880')] [2023-10-14 02:06:10,307][33226] Updated weights for policy 1, policy_version 26850 (0.0008) [2023-10-14 02:06:10,677][33226] Updated weights for policy 1, policy_version 26860 (0.0008) [2023-10-14 02:06:11,045][33226] Updated weights for policy 1, policy_version 26870 (0.0009) [2023-10-14 02:06:11,419][33226] Updated weights for policy 1, policy_version 26880 (0.0010) [2023-10-14 02:06:11,568][33201] Updated weights for policy 0, policy_version 26630 (0.0007) [2023-10-14 02:06:11,943][33201] Updated weights for policy 0, policy_version 26640 (0.0008) [2023-10-14 02:06:12,318][33201] Updated weights for policy 0, policy_version 26650 (0.0009) [2023-10-14 02:06:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 54820864. Throughput: 0: 1769.1, 1: 1774.6. Samples: 13710312. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:06:14,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.880')] [2023-10-14 02:06:15,277][33226] Updated weights for policy 1, policy_version 26890 (0.0010) [2023-10-14 02:06:15,647][33226] Updated weights for policy 1, policy_version 26900 (0.0009) [2023-10-14 02:06:16,019][33226] Updated weights for policy 1, policy_version 26910 (0.0009) [2023-10-14 02:06:16,077][33201] Updated weights for policy 0, policy_version 26660 (0.0008) [2023-10-14 02:06:16,448][33201] Updated weights for policy 0, policy_version 26670 (0.0007) [2023-10-14 02:06:16,813][33201] Updated weights for policy 0, policy_version 26680 (0.0008) [2023-10-14 02:06:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 54886400. Throughput: 0: 1765.7, 1: 1769.8. Samples: 13732184. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:06:19,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.870')] [2023-10-14 02:06:19,719][33226] Updated weights for policy 1, policy_version 26920 (0.0011) [2023-10-14 02:06:20,093][33226] Updated weights for policy 1, policy_version 26930 (0.0010) [2023-10-14 02:06:20,461][33226] Updated weights for policy 1, policy_version 26940 (0.0008) [2023-10-14 02:06:20,473][33201] Updated weights for policy 0, policy_version 26690 (0.0008) [2023-10-14 02:06:20,835][33201] Updated weights for policy 0, policy_version 26700 (0.0008) [2023-10-14 02:06:21,210][33201] Updated weights for policy 0, policy_version 26710 (0.0007) [2023-10-14 02:06:21,580][33201] Updated weights for policy 0, policy_version 26720 (0.0008) [2023-10-14 02:06:24,291][33226] Updated weights for policy 1, policy_version 26950 (0.0008) [2023-10-14 02:06:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 54951936. Throughput: 0: 1775.3, 1: 1793.2. Samples: 13754468. 
Policy #0 lag: (min: 24.0, avg: 46.9, max: 56.0) [2023-10-14 02:06:24,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.880')] [2023-10-14 02:06:24,660][33226] Updated weights for policy 1, policy_version 26960 (0.0008) [2023-10-14 02:06:25,033][33226] Updated weights for policy 1, policy_version 26970 (0.0010) [2023-10-14 02:06:25,506][33201] Updated weights for policy 0, policy_version 26730 (0.0009) [2023-10-14 02:06:25,883][33201] Updated weights for policy 0, policy_version 26740 (0.0009) [2023-10-14 02:06:26,256][33201] Updated weights for policy 0, policy_version 26750 (0.0008) [2023-10-14 02:06:28,918][33226] Updated weights for policy 1, policy_version 26980 (0.0010) [2023-10-14 02:06:29,284][33226] Updated weights for policy 1, policy_version 26990 (0.0007) [2023-10-14 02:06:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 55017472. Throughput: 0: 1764.6, 1: 1770.5. Samples: 13763792. Policy #0 lag: (min: 24.0, avg: 46.9, max: 56.0) [2023-10-14 02:06:29,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.900')] [2023-10-14 02:06:29,647][33226] Updated weights for policy 1, policy_version 27000 (0.0009) [2023-10-14 02:06:29,987][33201] Updated weights for policy 0, policy_version 26760 (0.0008) [2023-10-14 02:06:30,356][33201] Updated weights for policy 0, policy_version 26770 (0.0009) [2023-10-14 02:06:30,721][33201] Updated weights for policy 0, policy_version 26780 (0.0007) [2023-10-14 02:06:33,468][33226] Updated weights for policy 1, policy_version 27010 (0.0008) [2023-10-14 02:06:33,881][33226] Updated weights for policy 1, policy_version 27020 (0.0008) [2023-10-14 02:06:34,242][33226] Updated weights for policy 1, policy_version 27030 (0.0009) [2023-10-14 02:06:34,541][33201] Updated weights for policy 0, policy_version 26790 (0.0007) [2023-10-14 02:06:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 55083008. Throughput: 0: 1763.2, 1: 1789.6. 
Samples: 13786146. Policy #0 lag: (min: 24.0, avg: 46.9, max: 56.0) [2023-10-14 02:06:34,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.900')] [2023-10-14 02:06:34,607][33226] Updated weights for policy 1, policy_version 27040 (0.0009) [2023-10-14 02:06:34,912][33201] Updated weights for policy 0, policy_version 26800 (0.0010) [2023-10-14 02:06:35,296][33201] Updated weights for policy 0, policy_version 26810 (0.0008) [2023-10-14 02:06:38,455][33226] Updated weights for policy 1, policy_version 27050 (0.0008) [2023-10-14 02:06:38,826][33226] Updated weights for policy 1, policy_version 27060 (0.0009) [2023-10-14 02:06:38,989][33201] Updated weights for policy 0, policy_version 26820 (0.0010) [2023-10-14 02:06:39,188][33226] Updated weights for policy 1, policy_version 27070 (0.0009) [2023-10-14 02:06:39,358][33201] Updated weights for policy 0, policy_version 26830 (0.0007) [2023-10-14 02:06:39,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 55181312. Throughput: 0: 1788.4, 1: 1783.2. Samples: 13806860. Policy #0 lag: (min: 24.0, avg: 46.9, max: 56.0) [2023-10-14 02:06:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:06:39,729][33201] Updated weights for policy 0, policy_version 26840 (0.0009) [2023-10-14 02:06:43,079][33226] Updated weights for policy 1, policy_version 27080 (0.0009) [2023-10-14 02:06:43,445][33226] Updated weights for policy 1, policy_version 27090 (0.0010) [2023-10-14 02:06:43,574][33201] Updated weights for policy 0, policy_version 26850 (0.0007) [2023-10-14 02:06:43,806][33226] Updated weights for policy 1, policy_version 27100 (0.0008) [2023-10-14 02:06:43,947][33201] Updated weights for policy 0, policy_version 26860 (0.0007) [2023-10-14 02:06:44,316][33201] Updated weights for policy 0, policy_version 26870 (0.0009) [2023-10-14 02:06:44,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 55246848. 
Throughput: 0: 1766.5, 1: 1783.0. Samples: 13817584. Policy #0 lag: (min: 24.0, avg: 46.9, max: 56.0) [2023-10-14 02:06:44,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.900')] [2023-10-14 02:06:44,688][33201] Updated weights for policy 0, policy_version 26880 (0.0008) [2023-10-14 02:06:47,699][33226] Updated weights for policy 1, policy_version 27110 (0.0010) [2023-10-14 02:06:48,055][33226] Updated weights for policy 1, policy_version 27120 (0.0008) [2023-10-14 02:06:48,422][33226] Updated weights for policy 1, policy_version 27130 (0.0008) [2023-10-14 02:06:48,563][33201] Updated weights for policy 0, policy_version 26890 (0.0008) [2023-10-14 02:06:48,930][33201] Updated weights for policy 0, policy_version 26900 (0.0008) [2023-10-14 02:06:49,313][33201] Updated weights for policy 0, policy_version 26910 (0.0009) [2023-10-14 02:06:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 55345152. Throughput: 0: 1786.2, 1: 1786.0. Samples: 13838836. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 02:06:49,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 02:06:52,241][33226] Updated weights for policy 1, policy_version 27140 (0.0009) [2023-10-14 02:06:52,608][33226] Updated weights for policy 1, policy_version 27150 (0.0010) [2023-10-14 02:06:52,975][33226] Updated weights for policy 1, policy_version 27160 (0.0008) [2023-10-14 02:06:53,341][33201] Updated weights for policy 0, policy_version 26920 (0.0009) [2023-10-14 02:06:53,706][33201] Updated weights for policy 0, policy_version 26930 (0.0008) [2023-10-14 02:06:54,080][33201] Updated weights for policy 0, policy_version 26940 (0.0008) [2023-10-14 02:06:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 55410688. Throughput: 0: 1761.5, 1: 1759.2. Samples: 13858492. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 02:06:54,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:06:56,811][33226] Updated weights for policy 1, policy_version 27170 (0.0008) [2023-10-14 02:06:57,195][33226] Updated weights for policy 1, policy_version 27180 (0.0009) [2023-10-14 02:06:57,559][33226] Updated weights for policy 1, policy_version 27190 (0.0007) [2023-10-14 02:06:57,909][33201] Updated weights for policy 0, policy_version 26950 (0.0009) [2023-10-14 02:06:57,927][33226] Updated weights for policy 1, policy_version 27200 (0.0008) [2023-10-14 02:06:58,281][33201] Updated weights for policy 0, policy_version 26960 (0.0009) [2023-10-14 02:06:58,657][33201] Updated weights for policy 0, policy_version 26970 (0.0008) [2023-10-14 02:06:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 55476224. Throughput: 0: 1771.5, 1: 1785.1. Samples: 13870358. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 02:06:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.900')] [2023-10-14 02:07:01,667][33226] Updated weights for policy 1, policy_version 27210 (0.0007) [2023-10-14 02:07:02,032][33226] Updated weights for policy 1, policy_version 27220 (0.0009) [2023-10-14 02:07:02,393][33226] Updated weights for policy 1, policy_version 27230 (0.0009) [2023-10-14 02:07:02,545][33201] Updated weights for policy 0, policy_version 26980 (0.0009) [2023-10-14 02:07:02,924][33201] Updated weights for policy 0, policy_version 26990 (0.0008) [2023-10-14 02:07:03,294][33201] Updated weights for policy 0, policy_version 27000 (0.0008) [2023-10-14 02:07:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 55541760. Throughput: 0: 1757.1, 1: 1755.1. Samples: 13890230. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:07:04,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.900')] [2023-10-14 02:07:06,169][33226] Updated weights for policy 1, policy_version 27240 (0.0008) [2023-10-14 02:07:06,544][33226] Updated weights for policy 1, policy_version 27250 (0.0010) [2023-10-14 02:07:06,914][33226] Updated weights for policy 1, policy_version 27260 (0.0007) [2023-10-14 02:07:07,150][33201] Updated weights for policy 0, policy_version 27010 (0.0009) [2023-10-14 02:07:07,518][33201] Updated weights for policy 0, policy_version 27020 (0.0009) [2023-10-14 02:07:07,890][33201] Updated weights for policy 0, policy_version 27030 (0.0008) [2023-10-14 02:07:08,268][33201] Updated weights for policy 0, policy_version 27040 (0.0008) [2023-10-14 02:07:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 55607296. Throughput: 0: 1740.1, 1: 1759.4. Samples: 13911946. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:07:09,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.890')] [2023-10-14 02:07:10,594][33226] Updated weights for policy 1, policy_version 27270 (0.0009) [2023-10-14 02:07:10,968][33226] Updated weights for policy 1, policy_version 27280 (0.0007) [2023-10-14 02:07:11,331][33226] Updated weights for policy 1, policy_version 27290 (0.0010) [2023-10-14 02:07:11,942][33201] Updated weights for policy 0, policy_version 27050 (0.0007) [2023-10-14 02:07:12,320][33201] Updated weights for policy 0, policy_version 27060 (0.0007) [2023-10-14 02:07:12,683][33201] Updated weights for policy 0, policy_version 27070 (0.0007) [2023-10-14 02:07:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 55672832. Throughput: 0: 1766.6, 1: 1762.1. Samples: 13922586. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:07:14,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.880')] [2023-10-14 02:07:15,123][33226] Updated weights for policy 1, policy_version 27300 (0.0007) [2023-10-14 02:07:15,487][33226] Updated weights for policy 1, policy_version 27310 (0.0007) [2023-10-14 02:07:15,850][33226] Updated weights for policy 1, policy_version 27320 (0.0008) [2023-10-14 02:07:16,476][33201] Updated weights for policy 0, policy_version 27080 (0.0008) [2023-10-14 02:07:16,845][33201] Updated weights for policy 0, policy_version 27090 (0.0008) [2023-10-14 02:07:17,208][33201] Updated weights for policy 0, policy_version 27100 (0.0007) [2023-10-14 02:07:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 55738368. Throughput: 0: 1746.3, 1: 1760.8. Samples: 13943962. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:07:19,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.870')] [2023-10-14 02:07:19,692][33226] Updated weights for policy 1, policy_version 27330 (0.0010) [2023-10-14 02:07:20,111][33226] Updated weights for policy 1, policy_version 27340 (0.0008) [2023-10-14 02:07:20,478][33226] Updated weights for policy 1, policy_version 27350 (0.0009) [2023-10-14 02:07:20,857][33226] Updated weights for policy 1, policy_version 27360 (0.0009) [2023-10-14 02:07:21,003][33201] Updated weights for policy 0, policy_version 27110 (0.0008) [2023-10-14 02:07:21,373][33201] Updated weights for policy 0, policy_version 27120 (0.0009) [2023-10-14 02:07:21,738][33201] Updated weights for policy 0, policy_version 27130 (0.0009) [2023-10-14 02:07:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 55803904. Throughput: 0: 1753.2, 1: 1782.4. Samples: 13965966. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:07:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 02:07:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000027136_27787264.pth... [2023-10-14 02:07:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000025504_26116096.pth [2023-10-14 02:07:24,684][33226] Updated weights for policy 1, policy_version 27370 (0.0008) [2023-10-14 02:07:25,053][33226] Updated weights for policy 1, policy_version 27380 (0.0008) [2023-10-14 02:07:25,426][33226] Updated weights for policy 1, policy_version 27390 (0.0008) [2023-10-14 02:07:25,496][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000027392_28049408.pth... [2023-10-14 02:07:25,535][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000025728_26345472.pth [2023-10-14 02:07:25,570][33201] Updated weights for policy 0, policy_version 27140 (0.0009) [2023-10-14 02:07:25,939][33201] Updated weights for policy 0, policy_version 27150 (0.0007) [2023-10-14 02:07:26,308][33201] Updated weights for policy 0, policy_version 27160 (0.0008) [2023-10-14 02:07:29,170][33226] Updated weights for policy 1, policy_version 27400 (0.0007) [2023-10-14 02:07:29,531][33226] Updated weights for policy 1, policy_version 27410 (0.0008) [2023-10-14 02:07:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 55869440. Throughput: 0: 1747.8, 1: 1760.4. Samples: 13975452. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:07:29,559][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 02:07:29,896][33226] Updated weights for policy 1, policy_version 27420 (0.0009) [2023-10-14 02:07:30,228][33201] Updated weights for policy 0, policy_version 27170 (0.0010) [2023-10-14 02:07:30,590][33201] Updated weights for policy 0, policy_version 27180 (0.0008) [2023-10-14 02:07:30,961][33201] Updated weights for policy 0, policy_version 27190 (0.0008) [2023-10-14 02:07:31,334][33201] Updated weights for policy 0, policy_version 27200 (0.0010) [2023-10-14 02:07:33,779][33226] Updated weights for policy 1, policy_version 27430 (0.0007) [2023-10-14 02:07:34,144][33226] Updated weights for policy 1, policy_version 27440 (0.0008) [2023-10-14 02:07:34,506][33226] Updated weights for policy 1, policy_version 27450 (0.0009) [2023-10-14 02:07:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 55934976. Throughput: 0: 1755.5, 1: 1773.7. Samples: 13997652. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:07:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 02:07:35,088][33201] Updated weights for policy 0, policy_version 27210 (0.0009) [2023-10-14 02:07:35,448][33201] Updated weights for policy 0, policy_version 27220 (0.0008) [2023-10-14 02:07:35,815][33201] Updated weights for policy 0, policy_version 27230 (0.0007) [2023-10-14 02:07:38,289][33226] Updated weights for policy 1, policy_version 27460 (0.0010) [2023-10-14 02:07:38,653][33226] Updated weights for policy 1, policy_version 27470 (0.0007) [2023-10-14 02:07:39,019][33226] Updated weights for policy 1, policy_version 27480 (0.0008) [2023-10-14 02:07:39,455][33201] Updated weights for policy 0, policy_version 27240 (0.0007) [2023-10-14 02:07:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 56033280. Throughput: 0: 1796.5, 1: 1772.4. 
Samples: 14019092. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:07:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 02:07:39,826][33201] Updated weights for policy 0, policy_version 27250 (0.0009) [2023-10-14 02:07:40,198][33201] Updated weights for policy 0, policy_version 27260 (0.0008) [2023-10-14 02:07:42,965][33226] Updated weights for policy 1, policy_version 27490 (0.0008) [2023-10-14 02:07:43,333][33226] Updated weights for policy 1, policy_version 27500 (0.0009) [2023-10-14 02:07:43,700][33226] Updated weights for policy 1, policy_version 27510 (0.0008) [2023-10-14 02:07:43,993][33201] Updated weights for policy 0, policy_version 27270 (0.0010) [2023-10-14 02:07:44,064][33226] Updated weights for policy 1, policy_version 27520 (0.0009) [2023-10-14 02:07:44,357][33201] Updated weights for policy 0, policy_version 27280 (0.0008) [2023-10-14 02:07:44,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 56098816. Throughput: 0: 1769.3, 1: 1766.9. Samples: 14029486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:07:44,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 02:07:44,722][33201] Updated weights for policy 0, policy_version 27290 (0.0008) [2023-10-14 02:07:48,040][33226] Updated weights for policy 1, policy_version 27530 (0.0007) [2023-10-14 02:07:48,416][33226] Updated weights for policy 1, policy_version 27540 (0.0009) [2023-10-14 02:07:48,557][33201] Updated weights for policy 0, policy_version 27300 (0.0009) [2023-10-14 02:07:48,783][33226] Updated weights for policy 1, policy_version 27550 (0.0009) [2023-10-14 02:07:48,925][33201] Updated weights for policy 0, policy_version 27310 (0.0007) [2023-10-14 02:07:49,292][33201] Updated weights for policy 0, policy_version 27320 (0.0008) [2023-10-14 02:07:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 56164352. 
Throughput: 0: 1791.2, 1: 1785.4. Samples: 14051174. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:07:49,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.870')] [2023-10-14 02:07:52,324][33226] Updated weights for policy 1, policy_version 27560 (0.0007) [2023-10-14 02:07:52,694][33226] Updated weights for policy 1, policy_version 27570 (0.0007) [2023-10-14 02:07:53,058][33226] Updated weights for policy 1, policy_version 27580 (0.0007) [2023-10-14 02:07:53,142][33201] Updated weights for policy 0, policy_version 27330 (0.0009) [2023-10-14 02:07:53,518][33201] Updated weights for policy 0, policy_version 27340 (0.0009) [2023-10-14 02:07:53,882][33201] Updated weights for policy 0, policy_version 27350 (0.0008) [2023-10-14 02:07:54,258][33201] Updated weights for policy 0, policy_version 27360 (0.0009) [2023-10-14 02:07:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 56262656. Throughput: 0: 1778.5, 1: 1765.7. Samples: 14071436. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:07:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.880')] [2023-10-14 02:07:57,076][33226] Updated weights for policy 1, policy_version 27590 (0.0008) [2023-10-14 02:07:57,439][33226] Updated weights for policy 1, policy_version 27600 (0.0007) [2023-10-14 02:07:57,804][33226] Updated weights for policy 1, policy_version 27610 (0.0011) [2023-10-14 02:07:58,153][33201] Updated weights for policy 0, policy_version 27370 (0.0007) [2023-10-14 02:07:58,540][33201] Updated weights for policy 0, policy_version 27380 (0.0009) [2023-10-14 02:07:58,914][33201] Updated weights for policy 0, policy_version 27390 (0.0008) [2023-10-14 02:07:59,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 56328192. Throughput: 0: 1778.4, 1: 1787.3. Samples: 14083044. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:07:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.890')] [2023-10-14 02:08:01,652][33226] Updated weights for policy 1, policy_version 27620 (0.0008) [2023-10-14 02:08:02,007][33226] Updated weights for policy 1, policy_version 27630 (0.0007) [2023-10-14 02:08:02,373][33226] Updated weights for policy 1, policy_version 27640 (0.0007) [2023-10-14 02:08:02,666][33201] Updated weights for policy 0, policy_version 27400 (0.0009) [2023-10-14 02:08:03,046][33201] Updated weights for policy 0, policy_version 27410 (0.0010) [2023-10-14 02:08:03,419][33201] Updated weights for policy 0, policy_version 27420 (0.0009) [2023-10-14 02:08:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 56393728. Throughput: 0: 1777.1, 1: 1754.6. Samples: 14102886. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:08:04,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.870')] [2023-10-14 02:08:06,241][33226] Updated weights for policy 1, policy_version 27650 (0.0008) [2023-10-14 02:08:06,655][33226] Updated weights for policy 1, policy_version 27660 (0.0010) [2023-10-14 02:08:07,020][33226] Updated weights for policy 1, policy_version 27670 (0.0008) [2023-10-14 02:08:07,262][33201] Updated weights for policy 0, policy_version 27430 (0.0007) [2023-10-14 02:08:07,382][33226] Updated weights for policy 1, policy_version 27680 (0.0008) [2023-10-14 02:08:07,638][33201] Updated weights for policy 0, policy_version 27440 (0.0009) [2023-10-14 02:08:08,007][33201] Updated weights for policy 0, policy_version 27450 (0.0009) [2023-10-14 02:08:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 56459264. Throughput: 0: 1763.2, 1: 1753.3. Samples: 14124210. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:08:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.850')] [2023-10-14 02:08:11,186][33226] Updated weights for policy 1, policy_version 27690 (0.0007) [2023-10-14 02:08:11,557][33226] Updated weights for policy 1, policy_version 27700 (0.0008) [2023-10-14 02:08:11,901][33201] Updated weights for policy 0, policy_version 27460 (0.0009) [2023-10-14 02:08:11,925][33226] Updated weights for policy 1, policy_version 27710 (0.0007) [2023-10-14 02:08:12,277][33201] Updated weights for policy 0, policy_version 27470 (0.0010) [2023-10-14 02:08:12,652][33201] Updated weights for policy 0, policy_version 27480 (0.0009) [2023-10-14 02:08:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 56524800. Throughput: 0: 1786.5, 1: 1755.7. Samples: 14134852. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:08:14,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.820')] [2023-10-14 02:08:15,485][33226] Updated weights for policy 1, policy_version 27720 (0.0008) [2023-10-14 02:08:15,860][33226] Updated weights for policy 1, policy_version 27730 (0.0011) [2023-10-14 02:08:16,234][33226] Updated weights for policy 1, policy_version 27740 (0.0009) [2023-10-14 02:08:16,349][33201] Updated weights for policy 0, policy_version 27490 (0.0009) [2023-10-14 02:08:16,718][33201] Updated weights for policy 0, policy_version 27500 (0.0008) [2023-10-14 02:08:17,104][33201] Updated weights for policy 0, policy_version 27510 (0.0007) [2023-10-14 02:08:17,465][33201] Updated weights for policy 0, policy_version 27520 (0.0009) [2023-10-14 02:08:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 56590336. Throughput: 0: 1759.2, 1: 1763.8. Samples: 14156188. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:08:19,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.830')] [2023-10-14 02:08:20,019][33226] Updated weights for policy 1, policy_version 27750 (0.0008) [2023-10-14 02:08:20,379][33226] Updated weights for policy 1, policy_version 27760 (0.0008) [2023-10-14 02:08:20,744][33226] Updated weights for policy 1, policy_version 27770 (0.0008) [2023-10-14 02:08:21,247][33201] Updated weights for policy 0, policy_version 27530 (0.0007) [2023-10-14 02:08:21,617][33201] Updated weights for policy 0, policy_version 27540 (0.0009) [2023-10-14 02:08:21,993][33201] Updated weights for policy 0, policy_version 27550 (0.0008) [2023-10-14 02:08:24,409][33226] Updated weights for policy 1, policy_version 27780 (0.0008) [2023-10-14 02:08:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 56655872. Throughput: 0: 1761.8, 1: 1790.4. Samples: 14178940. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:08:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.820')] [2023-10-14 02:08:24,781][33226] Updated weights for policy 1, policy_version 27790 (0.0008) [2023-10-14 02:08:25,147][33226] Updated weights for policy 1, policy_version 27800 (0.0008) [2023-10-14 02:08:25,734][33201] Updated weights for policy 0, policy_version 27560 (0.0008) [2023-10-14 02:08:26,102][33201] Updated weights for policy 0, policy_version 27570 (0.0011) [2023-10-14 02:08:26,473][33201] Updated weights for policy 0, policy_version 27580 (0.0008) [2023-10-14 02:08:28,955][33226] Updated weights for policy 1, policy_version 27810 (0.0008) [2023-10-14 02:08:29,329][33226] Updated weights for policy 1, policy_version 27820 (0.0009) [2023-10-14 02:08:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 56721408. Throughput: 0: 1764.3, 1: 1769.6. Samples: 14188510. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:08:29,557][31953] Avg episode reward: [(0, '20.880'), (1, '20.820')] [2023-10-14 02:08:29,698][33226] Updated weights for policy 1, policy_version 27830 (0.0010) [2023-10-14 02:08:30,060][33226] Updated weights for policy 1, policy_version 27840 (0.0008) [2023-10-14 02:08:30,305][33201] Updated weights for policy 0, policy_version 27590 (0.0009) [2023-10-14 02:08:30,673][33201] Updated weights for policy 0, policy_version 27600 (0.0007) [2023-10-14 02:08:31,043][33201] Updated weights for policy 0, policy_version 27610 (0.0007) [2023-10-14 02:08:33,800][33226] Updated weights for policy 1, policy_version 27850 (0.0008) [2023-10-14 02:08:34,167][33226] Updated weights for policy 1, policy_version 27860 (0.0008) [2023-10-14 02:08:34,534][33226] Updated weights for policy 1, policy_version 27870 (0.0009) [2023-10-14 02:08:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 56786944. Throughput: 0: 1760.2, 1: 1780.3. Samples: 14210498. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:08:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.820')] [2023-10-14 02:08:34,878][33201] Updated weights for policy 0, policy_version 27620 (0.0008) [2023-10-14 02:08:35,257][33201] Updated weights for policy 0, policy_version 27630 (0.0008) [2023-10-14 02:08:35,621][33201] Updated weights for policy 0, policy_version 27640 (0.0007) [2023-10-14 02:08:38,317][33226] Updated weights for policy 1, policy_version 27880 (0.0009) [2023-10-14 02:08:38,679][33226] Updated weights for policy 1, policy_version 27890 (0.0008) [2023-10-14 02:08:39,052][33226] Updated weights for policy 1, policy_version 27900 (0.0009) [2023-10-14 02:08:39,347][33201] Updated weights for policy 0, policy_version 27650 (0.0008) [2023-10-14 02:08:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 56885248. Throughput: 0: 1789.5, 1: 1771.0. 
Samples: 14231658. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 02:08:39,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.820')] [2023-10-14 02:08:39,718][33201] Updated weights for policy 0, policy_version 27660 (0.0008) [2023-10-14 02:08:40,083][33201] Updated weights for policy 0, policy_version 27670 (0.0008) [2023-10-14 02:08:40,452][33201] Updated weights for policy 0, policy_version 27680 (0.0011) [2023-10-14 02:08:42,724][33226] Updated weights for policy 1, policy_version 27910 (0.0007) [2023-10-14 02:08:43,088][33226] Updated weights for policy 1, policy_version 27920 (0.0007) [2023-10-14 02:08:43,459][33226] Updated weights for policy 1, policy_version 27930 (0.0007) [2023-10-14 02:08:44,311][33201] Updated weights for policy 0, policy_version 27690 (0.0009) [2023-10-14 02:08:44,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 56950784. Throughput: 0: 1767.2, 1: 1774.3. Samples: 14242410. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 02:08:44,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.810')] [2023-10-14 02:08:44,679][33201] Updated weights for policy 0, policy_version 27700 (0.0007) [2023-10-14 02:08:45,050][33201] Updated weights for policy 0, policy_version 27710 (0.0008) [2023-10-14 02:08:47,186][33226] Updated weights for policy 1, policy_version 27940 (0.0007) [2023-10-14 02:08:47,558][33226] Updated weights for policy 1, policy_version 27950 (0.0009) [2023-10-14 02:08:47,918][33226] Updated weights for policy 1, policy_version 27960 (0.0008) [2023-10-14 02:08:48,894][33201] Updated weights for policy 0, policy_version 27720 (0.0008) [2023-10-14 02:08:49,263][33201] Updated weights for policy 0, policy_version 27730 (0.0008) [2023-10-14 02:08:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 57016320. Throughput: 0: 1783.0, 1: 1789.3. Samples: 14263642. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 02:08:49,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.810')] [2023-10-14 02:08:49,629][33201] Updated weights for policy 0, policy_version 27740 (0.0008) [2023-10-14 02:08:51,774][33226] Updated weights for policy 1, policy_version 27970 (0.0008) [2023-10-14 02:08:52,194][33226] Updated weights for policy 1, policy_version 27980 (0.0009) [2023-10-14 02:08:52,560][33226] Updated weights for policy 1, policy_version 27990 (0.0008) [2023-10-14 02:08:52,936][33226] Updated weights for policy 1, policy_version 28000 (0.0008) [2023-10-14 02:08:53,527][33201] Updated weights for policy 0, policy_version 27750 (0.0007) [2023-10-14 02:08:53,900][33201] Updated weights for policy 0, policy_version 27760 (0.0008) [2023-10-14 02:08:54,263][33201] Updated weights for policy 0, policy_version 27770 (0.0009) [2023-10-14 02:08:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 57114624. Throughput: 0: 1773.4, 1: 1783.7. Samples: 14284278. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 02:08:54,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.810')] [2023-10-14 02:08:56,589][33226] Updated weights for policy 1, policy_version 28010 (0.0008) [2023-10-14 02:08:56,958][33226] Updated weights for policy 1, policy_version 28020 (0.0009) [2023-10-14 02:08:57,330][33226] Updated weights for policy 1, policy_version 28030 (0.0009) [2023-10-14 02:08:58,194][33201] Updated weights for policy 0, policy_version 27780 (0.0008) [2023-10-14 02:08:58,553][33201] Updated weights for policy 0, policy_version 27790 (0.0009) [2023-10-14 02:08:58,920][33201] Updated weights for policy 0, policy_version 27800 (0.0008) [2023-10-14 02:08:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 57180160. Throughput: 0: 1766.3, 1: 1799.6. Samples: 14295316. 
Policy #0 lag: (min: 9.0, avg: 16.8, max: 41.0) [2023-10-14 02:08:59,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.810')] [2023-10-14 02:09:01,211][33226] Updated weights for policy 1, policy_version 28040 (0.0008) [2023-10-14 02:09:01,586][33226] Updated weights for policy 1, policy_version 28050 (0.0008) [2023-10-14 02:09:01,950][33226] Updated weights for policy 1, policy_version 28060 (0.0010) [2023-10-14 02:09:02,887][33201] Updated weights for policy 0, policy_version 27810 (0.0008) [2023-10-14 02:09:03,267][33201] Updated weights for policy 0, policy_version 27820 (0.0010) [2023-10-14 02:09:03,632][33201] Updated weights for policy 0, policy_version 27830 (0.0010) [2023-10-14 02:09:04,006][33201] Updated weights for policy 0, policy_version 27840 (0.0010) [2023-10-14 02:09:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 57245696. Throughput: 0: 1776.3, 1: 1784.1. Samples: 14316410. Policy #0 lag: (min: 9.0, avg: 16.8, max: 41.0) [2023-10-14 02:09:04,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.830')] [2023-10-14 02:09:05,776][33226] Updated weights for policy 1, policy_version 28070 (0.0007) [2023-10-14 02:09:06,147][33226] Updated weights for policy 1, policy_version 28080 (0.0009) [2023-10-14 02:09:06,521][33226] Updated weights for policy 1, policy_version 28090 (0.0008) [2023-10-14 02:09:07,748][33201] Updated weights for policy 0, policy_version 27850 (0.0008) [2023-10-14 02:09:08,111][33201] Updated weights for policy 0, policy_version 27860 (0.0008) [2023-10-14 02:09:08,483][33201] Updated weights for policy 0, policy_version 27870 (0.0008) [2023-10-14 02:09:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 57311232. Throughput: 0: 1744.2, 1: 1780.0. Samples: 14337528. 
Policy #0 lag: (min: 9.0, avg: 16.8, max: 41.0) [2023-10-14 02:09:09,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.830')] [2023-10-14 02:09:10,347][33226] Updated weights for policy 1, policy_version 28100 (0.0008) [2023-10-14 02:09:10,710][33226] Updated weights for policy 1, policy_version 28110 (0.0008) [2023-10-14 02:09:11,072][33226] Updated weights for policy 1, policy_version 28120 (0.0008) [2023-10-14 02:09:12,343][33201] Updated weights for policy 0, policy_version 27880 (0.0010) [2023-10-14 02:09:12,715][33201] Updated weights for policy 0, policy_version 27890 (0.0011) [2023-10-14 02:09:13,091][33201] Updated weights for policy 0, policy_version 27900 (0.0009) [2023-10-14 02:09:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 57376768. Throughput: 0: 1774.8, 1: 1778.7. Samples: 14348418. Policy #0 lag: (min: 9.0, avg: 16.8, max: 41.0) [2023-10-14 02:09:14,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.840')] [2023-10-14 02:09:14,857][33226] Updated weights for policy 1, policy_version 28130 (0.0009) [2023-10-14 02:09:15,238][33226] Updated weights for policy 1, policy_version 28140 (0.0009) [2023-10-14 02:09:15,596][33226] Updated weights for policy 1, policy_version 28150 (0.0008) [2023-10-14 02:09:15,968][33226] Updated weights for policy 1, policy_version 28160 (0.0008) [2023-10-14 02:09:16,978][33201] Updated weights for policy 0, policy_version 27910 (0.0010) [2023-10-14 02:09:17,354][33201] Updated weights for policy 0, policy_version 27920 (0.0009) [2023-10-14 02:09:17,713][33201] Updated weights for policy 0, policy_version 27930 (0.0008) [2023-10-14 02:09:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 57442304. Throughput: 0: 1748.3, 1: 1778.6. Samples: 14369210. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:09:19,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.860')] [2023-10-14 02:09:19,792][33226] Updated weights for policy 1, policy_version 28170 (0.0008) [2023-10-14 02:09:20,170][33226] Updated weights for policy 1, policy_version 28180 (0.0008) [2023-10-14 02:09:20,534][33226] Updated weights for policy 1, policy_version 28190 (0.0008) [2023-10-14 02:09:21,544][33201] Updated weights for policy 0, policy_version 27940 (0.0008) [2023-10-14 02:09:21,917][33201] Updated weights for policy 0, policy_version 27950 (0.0008) [2023-10-14 02:09:22,281][33201] Updated weights for policy 0, policy_version 27960 (0.0011) [2023-10-14 02:09:24,364][33226] Updated weights for policy 1, policy_version 28200 (0.0007) [2023-10-14 02:09:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 57507840. Throughput: 0: 1743.4, 1: 1807.0. Samples: 14391426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:09:24,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.870')] [2023-10-14 02:09:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000027968_28639232.pth... [2023-10-14 02:09:24,599][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000026304_26935296.pth [2023-10-14 02:09:24,733][33226] Updated weights for policy 1, policy_version 28210 (0.0009) [2023-10-14 02:09:25,111][33226] Updated weights for policy 1, policy_version 28220 (0.0008) [2023-10-14 02:09:25,255][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000028224_28901376.pth... 
[2023-10-14 02:09:25,294][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000026560_27197440.pth [2023-10-14 02:09:25,908][33201] Updated weights for policy 0, policy_version 27970 (0.0009) [2023-10-14 02:09:26,286][33201] Updated weights for policy 0, policy_version 27980 (0.0008) [2023-10-14 02:09:26,648][33201] Updated weights for policy 0, policy_version 27990 (0.0008) [2023-10-14 02:09:27,020][33201] Updated weights for policy 0, policy_version 28000 (0.0008) [2023-10-14 02:09:28,899][33226] Updated weights for policy 1, policy_version 28230 (0.0007) [2023-10-14 02:09:29,265][33226] Updated weights for policy 1, policy_version 28240 (0.0008) [2023-10-14 02:09:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 57573376. Throughput: 0: 1750.9, 1: 1781.2. Samples: 14401352. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:09:29,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.870')] [2023-10-14 02:09:29,630][33226] Updated weights for policy 1, policy_version 28250 (0.0007) [2023-10-14 02:09:30,628][33201] Updated weights for policy 0, policy_version 28010 (0.0009) [2023-10-14 02:09:30,997][33201] Updated weights for policy 0, policy_version 28020 (0.0008) [2023-10-14 02:09:31,372][33201] Updated weights for policy 0, policy_version 28030 (0.0009) [2023-10-14 02:09:33,480][33226] Updated weights for policy 1, policy_version 28260 (0.0007) [2023-10-14 02:09:33,851][33226] Updated weights for policy 1, policy_version 28270 (0.0009) [2023-10-14 02:09:34,218][33226] Updated weights for policy 1, policy_version 28280 (0.0008) [2023-10-14 02:09:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 57671680. Throughput: 0: 1754.3, 1: 1796.8. Samples: 14423440. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:09:34,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.880')] [2023-10-14 02:09:35,389][33201] Updated weights for policy 0, policy_version 28040 (0.0009) [2023-10-14 02:09:35,765][33201] Updated weights for policy 0, policy_version 28050 (0.0010) [2023-10-14 02:09:36,135][33201] Updated weights for policy 0, policy_version 28060 (0.0009) [2023-10-14 02:09:37,986][33226] Updated weights for policy 1, policy_version 28290 (0.0008) [2023-10-14 02:09:38,411][33226] Updated weights for policy 1, policy_version 28300 (0.0010) [2023-10-14 02:09:38,778][33226] Updated weights for policy 1, policy_version 28310 (0.0007) [2023-10-14 02:09:39,144][33226] Updated weights for policy 1, policy_version 28320 (0.0007) [2023-10-14 02:09:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 57737216. Throughput: 0: 1773.1, 1: 1779.6. Samples: 14444148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:09:39,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.880')] [2023-10-14 02:09:39,949][33201] Updated weights for policy 0, policy_version 28070 (0.0009) [2023-10-14 02:09:40,327][33201] Updated weights for policy 0, policy_version 28080 (0.0007) [2023-10-14 02:09:40,700][33201] Updated weights for policy 0, policy_version 28090 (0.0009) [2023-10-14 02:09:42,892][33226] Updated weights for policy 1, policy_version 28330 (0.0007) [2023-10-14 02:09:43,264][33226] Updated weights for policy 1, policy_version 28340 (0.0008) [2023-10-14 02:09:43,624][33226] Updated weights for policy 1, policy_version 28350 (0.0008) [2023-10-14 02:09:44,543][33201] Updated weights for policy 0, policy_version 28100 (0.0009) [2023-10-14 02:09:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 57802752. Throughput: 0: 1757.2, 1: 1791.5. Samples: 14455006. 
Policy #0 lag: (min: 30.0, avg: 33.7, max: 62.0) [2023-10-14 02:09:44,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.880')] [2023-10-14 02:09:44,910][33201] Updated weights for policy 0, policy_version 28110 (0.0007) [2023-10-14 02:09:45,277][33201] Updated weights for policy 0, policy_version 28120 (0.0007) [2023-10-14 02:09:47,499][33226] Updated weights for policy 1, policy_version 28360 (0.0011) [2023-10-14 02:09:47,871][33226] Updated weights for policy 1, policy_version 28370 (0.0011) [2023-10-14 02:09:48,230][33226] Updated weights for policy 1, policy_version 28380 (0.0011) [2023-10-14 02:09:49,022][33201] Updated weights for policy 0, policy_version 28130 (0.0009) [2023-10-14 02:09:49,398][33201] Updated weights for policy 0, policy_version 28140 (0.0010) [2023-10-14 02:09:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 57868288. Throughput: 0: 1774.1, 1: 1782.6. Samples: 14476464. Policy #0 lag: (min: 30.0, avg: 33.7, max: 62.0) [2023-10-14 02:09:49,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.880')] [2023-10-14 02:09:49,780][33201] Updated weights for policy 0, policy_version 28150 (0.0009) [2023-10-14 02:09:50,147][33201] Updated weights for policy 0, policy_version 28160 (0.0007) [2023-10-14 02:09:52,113][33226] Updated weights for policy 1, policy_version 28390 (0.0008) [2023-10-14 02:09:52,489][33226] Updated weights for policy 1, policy_version 28400 (0.0007) [2023-10-14 02:09:52,853][33226] Updated weights for policy 1, policy_version 28410 (0.0008) [2023-10-14 02:09:53,785][33201] Updated weights for policy 0, policy_version 28170 (0.0010) [2023-10-14 02:09:54,163][33201] Updated weights for policy 0, policy_version 28180 (0.0008) [2023-10-14 02:09:54,538][33201] Updated weights for policy 0, policy_version 28190 (0.0008) [2023-10-14 02:09:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 57933824. Throughput: 0: 1786.5, 1: 1766.4. 
Samples: 14497408. Policy #0 lag: (min: 30.0, avg: 33.7, max: 62.0) [2023-10-14 02:09:54,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:09:56,698][33226] Updated weights for policy 1, policy_version 28420 (0.0009) [2023-10-14 02:09:57,070][33226] Updated weights for policy 1, policy_version 28430 (0.0008) [2023-10-14 02:09:57,432][33226] Updated weights for policy 1, policy_version 28440 (0.0010) [2023-10-14 02:09:58,327][33201] Updated weights for policy 0, policy_version 28200 (0.0008) [2023-10-14 02:09:58,699][33201] Updated weights for policy 0, policy_version 28210 (0.0007) [2023-10-14 02:09:59,070][33201] Updated weights for policy 0, policy_version 28220 (0.0007) [2023-10-14 02:09:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 58032128. Throughput: 0: 1772.2, 1: 1787.6. Samples: 14508612. Policy #0 lag: (min: 30.0, avg: 33.7, max: 62.0) [2023-10-14 02:09:59,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.920')] [2023-10-14 02:10:00,828][33226] Updated weights for policy 1, policy_version 28450 (0.0008) [2023-10-14 02:10:01,193][33226] Updated weights for policy 1, policy_version 28460 (0.0010) [2023-10-14 02:10:01,556][33226] Updated weights for policy 1, policy_version 28470 (0.0007) [2023-10-14 02:10:01,922][33226] Updated weights for policy 1, policy_version 28480 (0.0007) [2023-10-14 02:10:03,062][33201] Updated weights for policy 0, policy_version 28230 (0.0007) [2023-10-14 02:10:03,423][33201] Updated weights for policy 0, policy_version 28240 (0.0008) [2023-10-14 02:10:03,796][33201] Updated weights for policy 0, policy_version 28250 (0.0009) [2023-10-14 02:10:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 58097664. Throughput: 0: 1793.8, 1: 1774.4. Samples: 14529778. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) [2023-10-14 02:10:04,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.910')] [2023-10-14 02:10:05,616][33226] Updated weights for policy 1, policy_version 28490 (0.0007) [2023-10-14 02:10:05,982][33226] Updated weights for policy 1, policy_version 28500 (0.0008) [2023-10-14 02:10:06,352][33226] Updated weights for policy 1, policy_version 28510 (0.0008) [2023-10-14 02:10:07,492][33201] Updated weights for policy 0, policy_version 28260 (0.0010) [2023-10-14 02:10:07,864][33201] Updated weights for policy 0, policy_version 28270 (0.0007) [2023-10-14 02:10:08,234][33201] Updated weights for policy 0, policy_version 28280 (0.0010) [2023-10-14 02:10:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 58163200. Throughput: 0: 1769.2, 1: 1773.5. Samples: 14550852. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) [2023-10-14 02:10:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:10:10,158][33226] Updated weights for policy 1, policy_version 28520 (0.0010) [2023-10-14 02:10:10,532][33226] Updated weights for policy 1, policy_version 28530 (0.0009) [2023-10-14 02:10:10,894][33226] Updated weights for policy 1, policy_version 28540 (0.0009) [2023-10-14 02:10:12,026][33201] Updated weights for policy 0, policy_version 28290 (0.0008) [2023-10-14 02:10:12,384][33201] Updated weights for policy 0, policy_version 28300 (0.0007) [2023-10-14 02:10:12,754][33201] Updated weights for policy 0, policy_version 28310 (0.0008) [2023-10-14 02:10:13,127][33201] Updated weights for policy 0, policy_version 28320 (0.0008) [2023-10-14 02:10:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 58228736. Throughput: 0: 1787.0, 1: 1773.0. Samples: 14561550. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) [2023-10-14 02:10:14,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.900')] [2023-10-14 02:10:14,620][33226] Updated weights for policy 1, policy_version 28550 (0.0010) [2023-10-14 02:10:14,981][33226] Updated weights for policy 1, policy_version 28560 (0.0008) [2023-10-14 02:10:15,346][33226] Updated weights for policy 1, policy_version 28570 (0.0007) [2023-10-14 02:10:16,845][33201] Updated weights for policy 0, policy_version 28330 (0.0008) [2023-10-14 02:10:17,217][33201] Updated weights for policy 0, policy_version 28340 (0.0010) [2023-10-14 02:10:17,586][33201] Updated weights for policy 0, policy_version 28350 (0.0007) [2023-10-14 02:10:19,276][33226] Updated weights for policy 1, policy_version 28580 (0.0009) [2023-10-14 02:10:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 58294272. Throughput: 0: 1760.1, 1: 1779.1. Samples: 14582702. Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) [2023-10-14 02:10:19,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 02:10:19,641][33226] Updated weights for policy 1, policy_version 28590 (0.0010) [2023-10-14 02:10:20,019][33226] Updated weights for policy 1, policy_version 28600 (0.0007) [2023-10-14 02:10:21,698][33201] Updated weights for policy 0, policy_version 28360 (0.0008) [2023-10-14 02:10:22,076][33201] Updated weights for policy 0, policy_version 28370 (0.0007) [2023-10-14 02:10:22,450][33201] Updated weights for policy 0, policy_version 28380 (0.0007) [2023-10-14 02:10:23,924][33226] Updated weights for policy 1, policy_version 28610 (0.0008) [2023-10-14 02:10:24,335][33226] Updated weights for policy 1, policy_version 28620 (0.0009) [2023-10-14 02:10:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 58359808. Throughput: 0: 1758.8, 1: 1803.1. Samples: 14604436. 
Policy #0 lag: (min: 26.0, avg: 29.0, max: 58.0) [2023-10-14 02:10:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 02:10:24,706][33226] Updated weights for policy 1, policy_version 28630 (0.0008) [2023-10-14 02:10:25,074][33226] Updated weights for policy 1, policy_version 28640 (0.0009) [2023-10-14 02:10:26,352][33201] Updated weights for policy 0, policy_version 28390 (0.0009) [2023-10-14 02:10:26,713][33201] Updated weights for policy 0, policy_version 28400 (0.0009) [2023-10-14 02:10:27,090][33201] Updated weights for policy 0, policy_version 28410 (0.0008) [2023-10-14 02:10:28,831][33226] Updated weights for policy 1, policy_version 28650 (0.0010) [2023-10-14 02:10:29,200][33226] Updated weights for policy 1, policy_version 28660 (0.0009) [2023-10-14 02:10:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 58425344. Throughput: 0: 1768.2, 1: 1776.7. Samples: 14614526. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 02:10:29,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 02:10:29,566][33226] Updated weights for policy 1, policy_version 28670 (0.0009) [2023-10-14 02:10:30,909][33201] Updated weights for policy 0, policy_version 28420 (0.0009) [2023-10-14 02:10:31,274][33201] Updated weights for policy 0, policy_version 28430 (0.0010) [2023-10-14 02:10:31,640][33201] Updated weights for policy 0, policy_version 28440 (0.0007) [2023-10-14 02:10:33,324][33226] Updated weights for policy 1, policy_version 28680 (0.0008) [2023-10-14 02:10:33,689][33226] Updated weights for policy 1, policy_version 28690 (0.0011) [2023-10-14 02:10:34,059][33226] Updated weights for policy 1, policy_version 28700 (0.0010) [2023-10-14 02:10:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 58523648. Throughput: 0: 1753.5, 1: 1794.3. Samples: 14636112. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 02:10:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.920')] [2023-10-14 02:10:35,472][33201] Updated weights for policy 0, policy_version 28450 (0.0007) [2023-10-14 02:10:35,842][33201] Updated weights for policy 0, policy_version 28460 (0.0007) [2023-10-14 02:10:36,211][33201] Updated weights for policy 0, policy_version 28470 (0.0008) [2023-10-14 02:10:36,588][33201] Updated weights for policy 0, policy_version 28480 (0.0008) [2023-10-14 02:10:37,751][33226] Updated weights for policy 1, policy_version 28710 (0.0008) [2023-10-14 02:10:38,122][33226] Updated weights for policy 1, policy_version 28720 (0.0008) [2023-10-14 02:10:38,491][33226] Updated weights for policy 1, policy_version 28730 (0.0008) [2023-10-14 02:10:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 58589184. Throughput: 0: 1760.9, 1: 1782.6. Samples: 14656866. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 02:10:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.920')] [2023-10-14 02:10:40,413][33201] Updated weights for policy 0, policy_version 28490 (0.0008) [2023-10-14 02:10:40,783][33201] Updated weights for policy 0, policy_version 28500 (0.0009) [2023-10-14 02:10:41,156][33201] Updated weights for policy 0, policy_version 28510 (0.0007) [2023-10-14 02:10:42,331][33226] Updated weights for policy 1, policy_version 28740 (0.0008) [2023-10-14 02:10:42,706][33226] Updated weights for policy 1, policy_version 28750 (0.0008) [2023-10-14 02:10:43,072][33226] Updated weights for policy 1, policy_version 28760 (0.0008) [2023-10-14 02:10:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 58654720. Throughput: 0: 1743.8, 1: 1797.5. Samples: 14667972. 
Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 02:10:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 02:10:45,035][33201] Updated weights for policy 0, policy_version 28520 (0.0008) [2023-10-14 02:10:45,403][33201] Updated weights for policy 0, policy_version 28530 (0.0007) [2023-10-14 02:10:45,781][33201] Updated weights for policy 0, policy_version 28540 (0.0010) [2023-10-14 02:10:46,949][33226] Updated weights for policy 1, policy_version 28770 (0.0007) [2023-10-14 02:10:47,317][33226] Updated weights for policy 1, policy_version 28780 (0.0009) [2023-10-14 02:10:47,679][33226] Updated weights for policy 1, policy_version 28790 (0.0008) [2023-10-14 02:10:48,048][33226] Updated weights for policy 1, policy_version 28800 (0.0010) [2023-10-14 02:10:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 58720256. Throughput: 0: 1749.0, 1: 1782.8. Samples: 14688710. Policy #0 lag: (min: 31.0, avg: 36.2, max: 63.0) [2023-10-14 02:10:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 02:10:49,793][33201] Updated weights for policy 0, policy_version 28550 (0.0008) [2023-10-14 02:10:50,162][33201] Updated weights for policy 0, policy_version 28560 (0.0008) [2023-10-14 02:10:50,547][33201] Updated weights for policy 0, policy_version 28570 (0.0007) [2023-10-14 02:10:51,795][33226] Updated weights for policy 1, policy_version 28810 (0.0009) [2023-10-14 02:10:52,176][33226] Updated weights for policy 1, policy_version 28820 (0.0009) [2023-10-14 02:10:52,545][33226] Updated weights for policy 1, policy_version 28830 (0.0010) [2023-10-14 02:10:54,297][33201] Updated weights for policy 0, policy_version 28580 (0.0008) [2023-10-14 02:10:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 58785792. Throughput: 0: 1773.2, 1: 1774.1. Samples: 14710484. 
Policy #0 lag: (min: 1.0, avg: 20.7, max: 33.0) [2023-10-14 02:10:54,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-14 02:10:54,667][33201] Updated weights for policy 0, policy_version 28590 (0.0010) [2023-10-14 02:10:55,043][33201] Updated weights for policy 0, policy_version 28600 (0.0011) [2023-10-14 02:10:56,308][33226] Updated weights for policy 1, policy_version 28840 (0.0009) [2023-10-14 02:10:56,681][33226] Updated weights for policy 1, policy_version 28850 (0.0009) [2023-10-14 02:10:57,046][33226] Updated weights for policy 1, policy_version 28860 (0.0007) [2023-10-14 02:10:58,887][33201] Updated weights for policy 0, policy_version 28610 (0.0010) [2023-10-14 02:10:59,258][33201] Updated weights for policy 0, policy_version 28620 (0.0008) [2023-10-14 02:10:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 58851328. Throughput: 0: 1745.8, 1: 1783.0. Samples: 14720344. Policy #0 lag: (min: 1.0, avg: 20.7, max: 33.0) [2023-10-14 02:10:59,559][31953] Avg episode reward: [(0, '20.850'), (1, '20.940')] [2023-10-14 02:10:59,629][33201] Updated weights for policy 0, policy_version 28630 (0.0009) [2023-10-14 02:11:00,005][33201] Updated weights for policy 0, policy_version 28640 (0.0009) [2023-10-14 02:11:00,854][33226] Updated weights for policy 1, policy_version 28870 (0.0007) [2023-10-14 02:11:01,226][33226] Updated weights for policy 1, policy_version 28880 (0.0007) [2023-10-14 02:11:01,599][33226] Updated weights for policy 1, policy_version 28890 (0.0010) [2023-10-14 02:11:03,830][33201] Updated weights for policy 0, policy_version 28650 (0.0009) [2023-10-14 02:11:04,201][33201] Updated weights for policy 0, policy_version 28660 (0.0010) [2023-10-14 02:11:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 58916864. Throughput: 0: 1772.6, 1: 1771.9. Samples: 14742206. 
Policy #0 lag: (min: 1.0, avg: 20.7, max: 33.0) [2023-10-14 02:11:04,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.940')] [2023-10-14 02:11:04,575][33201] Updated weights for policy 0, policy_version 28670 (0.0009) [2023-10-14 02:11:05,386][33226] Updated weights for policy 1, policy_version 28900 (0.0008) [2023-10-14 02:11:05,750][33226] Updated weights for policy 1, policy_version 28910 (0.0011) [2023-10-14 02:11:06,128][33226] Updated weights for policy 1, policy_version 28920 (0.0011) [2023-10-14 02:11:08,494][33201] Updated weights for policy 0, policy_version 28680 (0.0007) [2023-10-14 02:11:08,871][33201] Updated weights for policy 0, policy_version 28690 (0.0009) [2023-10-14 02:11:09,235][33201] Updated weights for policy 0, policy_version 28700 (0.0010) [2023-10-14 02:11:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59015168. Throughput: 0: 1753.6, 1: 1770.4. Samples: 14763014. Policy #0 lag: (min: 1.0, avg: 20.7, max: 33.0) [2023-10-14 02:11:09,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.940')] [2023-10-14 02:11:10,081][33226] Updated weights for policy 1, policy_version 28930 (0.0009) [2023-10-14 02:11:10,500][33226] Updated weights for policy 1, policy_version 28940 (0.0007) [2023-10-14 02:11:10,869][33226] Updated weights for policy 1, policy_version 28950 (0.0007) [2023-10-14 02:11:11,234][33226] Updated weights for policy 1, policy_version 28960 (0.0007) [2023-10-14 02:11:13,042][33201] Updated weights for policy 0, policy_version 28710 (0.0009) [2023-10-14 02:11:13,408][33201] Updated weights for policy 0, policy_version 28720 (0.0007) [2023-10-14 02:11:13,788][33201] Updated weights for policy 0, policy_version 28730 (0.0008) [2023-10-14 02:11:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 59080704. Throughput: 0: 1765.6, 1: 1761.9. Samples: 14773264. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:11:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.940')] [2023-10-14 02:11:14,919][33226] Updated weights for policy 1, policy_version 28970 (0.0010) [2023-10-14 02:11:15,293][33226] Updated weights for policy 1, policy_version 28980 (0.0011) [2023-10-14 02:11:15,669][33226] Updated weights for policy 1, policy_version 28990 (0.0011) [2023-10-14 02:11:17,590][33201] Updated weights for policy 0, policy_version 28740 (0.0010) [2023-10-14 02:11:17,959][33201] Updated weights for policy 0, policy_version 28750 (0.0009) [2023-10-14 02:11:18,333][33201] Updated weights for policy 0, policy_version 28760 (0.0009) [2023-10-14 02:11:19,490][33226] Updated weights for policy 1, policy_version 29000 (0.0009) [2023-10-14 02:11:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59146240. Throughput: 0: 1756.7, 1: 1762.7. Samples: 14794486. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:11:19,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 02:11:19,860][33226] Updated weights for policy 1, policy_version 29010 (0.0010) [2023-10-14 02:11:20,227][33226] Updated weights for policy 1, policy_version 29020 (0.0009) [2023-10-14 02:11:22,335][33201] Updated weights for policy 0, policy_version 28770 (0.0008) [2023-10-14 02:11:22,706][33201] Updated weights for policy 0, policy_version 28780 (0.0009) [2023-10-14 02:11:23,091][33201] Updated weights for policy 0, policy_version 28790 (0.0009) [2023-10-14 02:11:23,467][33201] Updated weights for policy 0, policy_version 28800 (0.0011) [2023-10-14 02:11:24,019][33226] Updated weights for policy 1, policy_version 29030 (0.0008) [2023-10-14 02:11:24,395][33226] Updated weights for policy 1, policy_version 29040 (0.0007) [2023-10-14 02:11:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59211776. Throughput: 0: 1739.8, 1: 1789.0. 
Samples: 14815664. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:11:24,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 02:11:24,564][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000028800_29491200.pth... [2023-10-14 02:11:24,609][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000027136_27787264.pth [2023-10-14 02:11:24,763][33226] Updated weights for policy 1, policy_version 29050 (0.0008) [2023-10-14 02:11:24,977][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000029056_29753344.pth... [2023-10-14 02:11:25,006][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000027392_28049408.pth [2023-10-14 02:11:27,293][33201] Updated weights for policy 0, policy_version 28810 (0.0010) [2023-10-14 02:11:27,667][33201] Updated weights for policy 0, policy_version 28820 (0.0011) [2023-10-14 02:11:28,043][33201] Updated weights for policy 0, policy_version 28830 (0.0010) [2023-10-14 02:11:28,634][33226] Updated weights for policy 1, policy_version 29060 (0.0008) [2023-10-14 02:11:29,004][33226] Updated weights for policy 1, policy_version 29070 (0.0011) [2023-10-14 02:11:29,375][33226] Updated weights for policy 1, policy_version 29080 (0.0007) [2023-10-14 02:11:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59277312. Throughput: 0: 1766.3, 1: 1759.7. Samples: 14826638. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:11:29,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.890')] [2023-10-14 02:11:31,796][33201] Updated weights for policy 0, policy_version 28840 (0.0007) [2023-10-14 02:11:32,175][33201] Updated weights for policy 0, policy_version 28850 (0.0007) [2023-10-14 02:11:32,549][33201] Updated weights for policy 0, policy_version 28860 (0.0007) [2023-10-14 02:11:33,142][33226] Updated weights for policy 1, policy_version 29090 (0.0009) [2023-10-14 02:11:33,511][33226] Updated weights for policy 1, policy_version 29100 (0.0008) [2023-10-14 02:11:33,881][33226] Updated weights for policy 1, policy_version 29110 (0.0008) [2023-10-14 02:11:34,254][33226] Updated weights for policy 1, policy_version 29120 (0.0008) [2023-10-14 02:11:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59375616. Throughput: 0: 1742.4, 1: 1788.0. Samples: 14847578. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:11:34,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.870')] [2023-10-14 02:11:36,208][33201] Updated weights for policy 0, policy_version 28870 (0.0009) [2023-10-14 02:11:36,580][33201] Updated weights for policy 0, policy_version 28880 (0.0009) [2023-10-14 02:11:36,955][33201] Updated weights for policy 0, policy_version 28890 (0.0009) [2023-10-14 02:11:38,028][33226] Updated weights for policy 1, policy_version 29130 (0.0007) [2023-10-14 02:11:38,395][33226] Updated weights for policy 1, policy_version 29140 (0.0009) [2023-10-14 02:11:38,756][33226] Updated weights for policy 1, policy_version 29150 (0.0008) [2023-10-14 02:11:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 59441152. Throughput: 0: 1745.5, 1: 1765.3. Samples: 14868468. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:11:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.870')] [2023-10-14 02:11:40,824][33201] Updated weights for policy 0, policy_version 28900 (0.0007) [2023-10-14 02:11:41,199][33201] Updated weights for policy 0, policy_version 28910 (0.0009) [2023-10-14 02:11:41,563][33201] Updated weights for policy 0, policy_version 28920 (0.0009) [2023-10-14 02:11:42,512][33226] Updated weights for policy 1, policy_version 29160 (0.0008) [2023-10-14 02:11:42,883][33226] Updated weights for policy 1, policy_version 29170 (0.0008) [2023-10-14 02:11:43,242][33226] Updated weights for policy 1, policy_version 29180 (0.0008) [2023-10-14 02:11:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 59506688. Throughput: 0: 1746.7, 1: 1789.1. Samples: 14879454. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:11:44,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.880')] [2023-10-14 02:11:45,396][33201] Updated weights for policy 0, policy_version 28930 (0.0011) [2023-10-14 02:11:45,764][33201] Updated weights for policy 0, policy_version 28940 (0.0008) [2023-10-14 02:11:46,126][33201] Updated weights for policy 0, policy_version 28950 (0.0008) [2023-10-14 02:11:46,494][33201] Updated weights for policy 0, policy_version 28960 (0.0008) [2023-10-14 02:11:47,028][33226] Updated weights for policy 1, policy_version 29190 (0.0009) [2023-10-14 02:11:47,406][33226] Updated weights for policy 1, policy_version 29200 (0.0009) [2023-10-14 02:11:47,773][33226] Updated weights for policy 1, policy_version 29210 (0.0010) [2023-10-14 02:11:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 59572224. Throughput: 0: 1748.4, 1: 1763.5. Samples: 14900242. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:11:49,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.870')] [2023-10-14 02:11:50,249][33201] Updated weights for policy 0, policy_version 28970 (0.0007) [2023-10-14 02:11:50,616][33201] Updated weights for policy 0, policy_version 28980 (0.0007) [2023-10-14 02:11:50,995][33201] Updated weights for policy 0, policy_version 28990 (0.0007) [2023-10-14 02:11:51,657][33226] Updated weights for policy 1, policy_version 29220 (0.0009) [2023-10-14 02:11:52,023][33226] Updated weights for policy 1, policy_version 29230 (0.0009) [2023-10-14 02:11:52,393][33226] Updated weights for policy 1, policy_version 29240 (0.0007) [2023-10-14 02:11:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 59637760. Throughput: 0: 1773.3, 1: 1768.9. Samples: 14922416. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:11:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:11:54,930][33201] Updated weights for policy 0, policy_version 29000 (0.0009) [2023-10-14 02:11:55,317][33201] Updated weights for policy 0, policy_version 29010 (0.0008) [2023-10-14 02:11:55,685][33201] Updated weights for policy 0, policy_version 29020 (0.0009) [2023-10-14 02:11:56,023][33226] Updated weights for policy 1, policy_version 29250 (0.0009) [2023-10-14 02:11:56,437][33226] Updated weights for policy 1, policy_version 29260 (0.0010) [2023-10-14 02:11:56,798][33226] Updated weights for policy 1, policy_version 29270 (0.0008) [2023-10-14 02:11:57,162][33226] Updated weights for policy 1, policy_version 29280 (0.0008) [2023-10-14 02:11:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 59703296. Throughput: 0: 1749.3, 1: 1781.9. Samples: 14932168. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:11:59,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:11:59,607][33201] Updated weights for policy 0, policy_version 29030 (0.0008) [2023-10-14 02:11:59,980][33201] Updated weights for policy 0, policy_version 29040 (0.0007) [2023-10-14 02:12:00,356][33201] Updated weights for policy 0, policy_version 29050 (0.0008) [2023-10-14 02:12:00,902][33226] Updated weights for policy 1, policy_version 29290 (0.0008) [2023-10-14 02:12:01,273][33226] Updated weights for policy 1, policy_version 29300 (0.0007) [2023-10-14 02:12:01,637][33226] Updated weights for policy 1, policy_version 29310 (0.0009) [2023-10-14 02:12:04,114][33201] Updated weights for policy 0, policy_version 29060 (0.0007) [2023-10-14 02:12:04,487][33201] Updated weights for policy 0, policy_version 29070 (0.0008) [2023-10-14 02:12:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 59768832. Throughput: 0: 1768.5, 1: 1774.6. Samples: 14953926. Policy #0 lag: (min: 21.0, avg: 24.6, max: 53.0) [2023-10-14 02:12:04,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:12:04,862][33201] Updated weights for policy 0, policy_version 29080 (0.0007) [2023-10-14 02:12:05,414][33226] Updated weights for policy 1, policy_version 29320 (0.0008) [2023-10-14 02:12:05,785][33226] Updated weights for policy 1, policy_version 29330 (0.0010) [2023-10-14 02:12:06,157][33226] Updated weights for policy 1, policy_version 29340 (0.0010) [2023-10-14 02:12:08,669][33201] Updated weights for policy 0, policy_version 29090 (0.0009) [2023-10-14 02:12:09,038][33201] Updated weights for policy 0, policy_version 29100 (0.0009) [2023-10-14 02:12:09,403][33201] Updated weights for policy 0, policy_version 29110 (0.0007) [2023-10-14 02:12:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 59834368. Throughput: 0: 1775.3, 1: 1776.9. 
Samples: 14975514. Policy #0 lag: (min: 21.0, avg: 24.6, max: 53.0) [2023-10-14 02:12:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:12:09,765][33201] Updated weights for policy 0, policy_version 29120 (0.0010) [2023-10-14 02:12:09,940][33226] Updated weights for policy 1, policy_version 29350 (0.0008) [2023-10-14 02:12:10,307][33226] Updated weights for policy 1, policy_version 29360 (0.0008) [2023-10-14 02:12:10,673][33226] Updated weights for policy 1, policy_version 29370 (0.0007) [2023-10-14 02:12:13,785][33201] Updated weights for policy 0, policy_version 29130 (0.0008) [2023-10-14 02:12:14,156][33201] Updated weights for policy 0, policy_version 29140 (0.0009) [2023-10-14 02:12:14,426][33226] Updated weights for policy 1, policy_version 29380 (0.0008) [2023-10-14 02:12:14,530][33201] Updated weights for policy 0, policy_version 29150 (0.0009) [2023-10-14 02:12:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 59899904. Throughput: 0: 1765.9, 1: 1774.5. Samples: 14985958. Policy #0 lag: (min: 21.0, avg: 24.6, max: 53.0) [2023-10-14 02:12:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:12:14,796][33226] Updated weights for policy 1, policy_version 29390 (0.0008) [2023-10-14 02:12:15,161][33226] Updated weights for policy 1, policy_version 29400 (0.0009) [2023-10-14 02:12:18,550][33201] Updated weights for policy 0, policy_version 29160 (0.0008) [2023-10-14 02:12:18,928][33201] Updated weights for policy 0, policy_version 29170 (0.0008) [2023-10-14 02:12:18,972][33226] Updated weights for policy 1, policy_version 29410 (0.0009) [2023-10-14 02:12:19,294][33201] Updated weights for policy 0, policy_version 29180 (0.0008) [2023-10-14 02:12:19,338][33226] Updated weights for policy 1, policy_version 29420 (0.0008) [2023-10-14 02:12:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 59998208. 
Throughput: 0: 1783.6, 1: 1773.1. Samples: 15007630. Policy #0 lag: (min: 21.0, avg: 24.6, max: 53.0) [2023-10-14 02:12:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.920')] [2023-10-14 02:12:19,702][33226] Updated weights for policy 1, policy_version 29430 (0.0008) [2023-10-14 02:12:20,071][33226] Updated weights for policy 1, policy_version 29440 (0.0008) [2023-10-14 02:12:23,095][33201] Updated weights for policy 0, policy_version 29190 (0.0009) [2023-10-14 02:12:23,462][33201] Updated weights for policy 0, policy_version 29200 (0.0007) [2023-10-14 02:12:23,831][33201] Updated weights for policy 0, policy_version 29210 (0.0008) [2023-10-14 02:12:23,966][33226] Updated weights for policy 1, policy_version 29450 (0.0009) [2023-10-14 02:12:24,334][33226] Updated weights for policy 1, policy_version 29460 (0.0010) [2023-10-14 02:12:24,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 60063744. Throughput: 0: 1746.7, 1: 1791.9. Samples: 15027704. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 02:12:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.910')] [2023-10-14 02:12:24,708][33226] Updated weights for policy 1, policy_version 29470 (0.0010) [2023-10-14 02:12:27,729][33201] Updated weights for policy 0, policy_version 29220 (0.0009) [2023-10-14 02:12:28,093][33201] Updated weights for policy 0, policy_version 29230 (0.0009) [2023-10-14 02:12:28,475][33201] Updated weights for policy 0, policy_version 29240 (0.0008) [2023-10-14 02:12:28,549][33226] Updated weights for policy 1, policy_version 29480 (0.0008) [2023-10-14 02:12:28,931][33226] Updated weights for policy 1, policy_version 29490 (0.0008) [2023-10-14 02:12:29,295][33226] Updated weights for policy 1, policy_version 29500 (0.0008) [2023-10-14 02:12:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 60162048. Throughput: 0: 1781.1, 1: 1765.1. Samples: 15039032. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 02:12:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.910')] [2023-10-14 02:12:32,323][33201] Updated weights for policy 0, policy_version 29250 (0.0008) [2023-10-14 02:12:32,692][33201] Updated weights for policy 0, policy_version 29260 (0.0007) [2023-10-14 02:12:33,064][33201] Updated weights for policy 0, policy_version 29270 (0.0009) [2023-10-14 02:12:33,090][33226] Updated weights for policy 1, policy_version 29510 (0.0007) [2023-10-14 02:12:33,431][33201] Updated weights for policy 0, policy_version 29280 (0.0010) [2023-10-14 02:12:33,460][33226] Updated weights for policy 1, policy_version 29520 (0.0007) [2023-10-14 02:12:33,824][33226] Updated weights for policy 1, policy_version 29530 (0.0007) [2023-10-14 02:12:34,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 60227584. Throughput: 0: 1752.6, 1: 1797.6. Samples: 15060002. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 02:12:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 02:12:37,188][33201] Updated weights for policy 0, policy_version 29290 (0.0008) [2023-10-14 02:12:37,549][33201] Updated weights for policy 0, policy_version 29300 (0.0009) [2023-10-14 02:12:37,702][33226] Updated weights for policy 1, policy_version 29540 (0.0008) [2023-10-14 02:12:37,919][33201] Updated weights for policy 0, policy_version 29310 (0.0007) [2023-10-14 02:12:38,061][33226] Updated weights for policy 1, policy_version 29550 (0.0007) [2023-10-14 02:12:38,425][33226] Updated weights for policy 1, policy_version 29560 (0.0008) [2023-10-14 02:12:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 60293120. Throughput: 0: 1741.1, 1: 1764.9. Samples: 15080186. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 02:12:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.910')] [2023-10-14 02:12:41,786][33201] Updated weights for policy 0, policy_version 29320 (0.0009) [2023-10-14 02:12:42,156][33201] Updated weights for policy 0, policy_version 29330 (0.0007) [2023-10-14 02:12:42,311][33226] Updated weights for policy 1, policy_version 29570 (0.0008) [2023-10-14 02:12:42,522][33201] Updated weights for policy 0, policy_version 29340 (0.0009) [2023-10-14 02:12:42,729][33226] Updated weights for policy 1, policy_version 29580 (0.0007) [2023-10-14 02:12:43,103][33226] Updated weights for policy 1, policy_version 29590 (0.0009) [2023-10-14 02:12:43,465][33226] Updated weights for policy 1, policy_version 29600 (0.0008) [2023-10-14 02:12:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 60358656. Throughput: 0: 1760.3, 1: 1792.0. Samples: 15092022. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 02:12:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.900')] [2023-10-14 02:12:46,397][33201] Updated weights for policy 0, policy_version 29350 (0.0007) [2023-10-14 02:12:46,769][33201] Updated weights for policy 0, policy_version 29360 (0.0009) [2023-10-14 02:12:47,137][33201] Updated weights for policy 0, policy_version 29370 (0.0008) [2023-10-14 02:12:47,276][33226] Updated weights for policy 1, policy_version 29610 (0.0007) [2023-10-14 02:12:47,646][33226] Updated weights for policy 1, policy_version 29620 (0.0007) [2023-10-14 02:12:48,020][33226] Updated weights for policy 1, policy_version 29630 (0.0008) [2023-10-14 02:12:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60424192. Throughput: 0: 1739.7, 1: 1769.3. Samples: 15111830. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-14 02:12:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.890')] [2023-10-14 02:12:50,995][33201] Updated weights for policy 0, policy_version 29380 (0.0007) [2023-10-14 02:12:51,364][33201] Updated weights for policy 0, policy_version 29390 (0.0007) [2023-10-14 02:12:51,731][33201] Updated weights for policy 0, policy_version 29400 (0.0010) [2023-10-14 02:12:51,841][33226] Updated weights for policy 1, policy_version 29640 (0.0008) [2023-10-14 02:12:52,208][33226] Updated weights for policy 1, policy_version 29650 (0.0007) [2023-10-14 02:12:52,579][33226] Updated weights for policy 1, policy_version 29660 (0.0008) [2023-10-14 02:12:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60489728. Throughput: 0: 1754.5, 1: 1765.3. Samples: 15133906. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-14 02:12:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.890')] [2023-10-14 02:12:55,498][33201] Updated weights for policy 0, policy_version 29410 (0.0008) [2023-10-14 02:12:55,862][33201] Updated weights for policy 0, policy_version 29420 (0.0008) [2023-10-14 02:12:56,236][33201] Updated weights for policy 0, policy_version 29430 (0.0008) [2023-10-14 02:12:56,345][33226] Updated weights for policy 1, policy_version 29670 (0.0007) [2023-10-14 02:12:56,605][33201] Updated weights for policy 0, policy_version 29440 (0.0008) [2023-10-14 02:12:56,709][33226] Updated weights for policy 1, policy_version 29680 (0.0011) [2023-10-14 02:12:57,070][33226] Updated weights for policy 1, policy_version 29690 (0.0008) [2023-10-14 02:12:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60555264. Throughput: 0: 1739.6, 1: 1776.7. Samples: 15144192. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-14 02:12:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.860')] [2023-10-14 02:13:00,473][33201] Updated weights for policy 0, policy_version 29450 (0.0008) [2023-10-14 02:13:00,775][33226] Updated weights for policy 1, policy_version 29700 (0.0009) [2023-10-14 02:13:00,845][33201] Updated weights for policy 0, policy_version 29460 (0.0007) [2023-10-14 02:13:01,142][33226] Updated weights for policy 1, policy_version 29710 (0.0009) [2023-10-14 02:13:01,224][33201] Updated weights for policy 0, policy_version 29470 (0.0009) [2023-10-14 02:13:01,504][33226] Updated weights for policy 1, policy_version 29720 (0.0009) [2023-10-14 02:13:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60620800. Throughput: 0: 1752.8, 1: 1767.1. Samples: 15166028. Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-14 02:13:04,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.840')] [2023-10-14 02:13:04,904][33201] Updated weights for policy 0, policy_version 29480 (0.0008) [2023-10-14 02:13:05,283][33201] Updated weights for policy 0, policy_version 29490 (0.0007) [2023-10-14 02:13:05,421][33226] Updated weights for policy 1, policy_version 29730 (0.0008) [2023-10-14 02:13:05,645][33201] Updated weights for policy 0, policy_version 29500 (0.0007) [2023-10-14 02:13:05,800][33226] Updated weights for policy 1, policy_version 29740 (0.0007) [2023-10-14 02:13:06,168][33226] Updated weights for policy 1, policy_version 29750 (0.0009) [2023-10-14 02:13:06,526][33226] Updated weights for policy 1, policy_version 29760 (0.0009) [2023-10-14 02:13:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60686336. Throughput: 0: 1783.3, 1: 1772.5. Samples: 15187712. 
Policy #0 lag: (min: 31.0, avg: 39.1, max: 63.0) [2023-10-14 02:13:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.820')] [2023-10-14 02:13:09,683][33201] Updated weights for policy 0, policy_version 29510 (0.0008) [2023-10-14 02:13:10,061][33201] Updated weights for policy 0, policy_version 29520 (0.0009) [2023-10-14 02:13:10,320][33226] Updated weights for policy 1, policy_version 29770 (0.0008) [2023-10-14 02:13:10,427][33201] Updated weights for policy 0, policy_version 29530 (0.0010) [2023-10-14 02:13:10,683][33226] Updated weights for policy 1, policy_version 29780 (0.0010) [2023-10-14 02:13:11,054][33226] Updated weights for policy 1, policy_version 29790 (0.0007) [2023-10-14 02:13:14,291][33201] Updated weights for policy 0, policy_version 29540 (0.0008) [2023-10-14 02:13:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 60751872. Throughput: 0: 1747.6, 1: 1765.2. Samples: 15197104. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 02:13:14,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:14,660][33201] Updated weights for policy 0, policy_version 29550 (0.0008) [2023-10-14 02:13:14,918][33226] Updated weights for policy 1, policy_version 29800 (0.0009) [2023-10-14 02:13:15,021][33201] Updated weights for policy 0, policy_version 29560 (0.0007) [2023-10-14 02:13:15,280][33226] Updated weights for policy 1, policy_version 29810 (0.0008) [2023-10-14 02:13:15,636][33226] Updated weights for policy 1, policy_version 29820 (0.0011) [2023-10-14 02:13:18,899][33201] Updated weights for policy 0, policy_version 29570 (0.0008) [2023-10-14 02:13:19,272][33201] Updated weights for policy 0, policy_version 29580 (0.0009) [2023-10-14 02:13:19,290][33226] Updated weights for policy 1, policy_version 29830 (0.0008) [2023-10-14 02:13:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 60817408. Throughput: 0: 1768.9, 1: 1770.4. 
Samples: 15219274. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 02:13:19,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:19,636][33201] Updated weights for policy 0, policy_version 29590 (0.0008) [2023-10-14 02:13:19,664][33226] Updated weights for policy 1, policy_version 29840 (0.0007) [2023-10-14 02:13:20,008][33201] Updated weights for policy 0, policy_version 29600 (0.0007) [2023-10-14 02:13:20,027][33226] Updated weights for policy 1, policy_version 29850 (0.0007) [2023-10-14 02:13:23,879][33226] Updated weights for policy 1, policy_version 29860 (0.0009) [2023-10-14 02:13:23,888][33201] Updated weights for policy 0, policy_version 29610 (0.0008) [2023-10-14 02:13:24,239][33226] Updated weights for policy 1, policy_version 29870 (0.0007) [2023-10-14 02:13:24,263][33201] Updated weights for policy 0, policy_version 29620 (0.0009) [2023-10-14 02:13:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 60882944. Throughput: 0: 1764.3, 1: 1795.3. Samples: 15240370. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 02:13:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:13:24,598][33226] Updated weights for policy 1, policy_version 29880 (0.0008) [2023-10-14 02:13:24,634][33201] Updated weights for policy 0, policy_version 29630 (0.0009) [2023-10-14 02:13:24,708][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000029632_30343168.pth... [2023-10-14 02:13:24,744][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000027968_28639232.pth [2023-10-14 02:13:24,891][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000029888_30605312.pth... 
[2023-10-14 02:13:24,921][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000028224_28901376.pth [2023-10-14 02:13:28,393][33201] Updated weights for policy 0, policy_version 29640 (0.0009) [2023-10-14 02:13:28,521][33226] Updated weights for policy 1, policy_version 29890 (0.0009) [2023-10-14 02:13:28,767][33201] Updated weights for policy 0, policy_version 29650 (0.0008) [2023-10-14 02:13:28,936][33226] Updated weights for policy 1, policy_version 29900 (0.0009) [2023-10-14 02:13:29,135][33201] Updated weights for policy 0, policy_version 29660 (0.0009) [2023-10-14 02:13:29,302][33226] Updated weights for policy 1, policy_version 29910 (0.0008) [2023-10-14 02:13:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 60981248. Throughput: 0: 1762.7, 1: 1765.4. Samples: 15250786. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-10-14 02:13:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:13:29,677][33226] Updated weights for policy 1, policy_version 29920 (0.0008) [2023-10-14 02:13:32,927][33201] Updated weights for policy 0, policy_version 29670 (0.0007) [2023-10-14 02:13:33,297][33201] Updated weights for policy 0, policy_version 29680 (0.0007) [2023-10-14 02:13:33,547][33226] Updated weights for policy 1, policy_version 29930 (0.0008) [2023-10-14 02:13:33,673][33201] Updated weights for policy 0, policy_version 29690 (0.0009) [2023-10-14 02:13:33,901][33226] Updated weights for policy 1, policy_version 29940 (0.0008) [2023-10-14 02:13:34,267][33226] Updated weights for policy 1, policy_version 29950 (0.0008) [2023-10-14 02:13:34,557][31953] Fps is (10 sec: 19661.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 61079552. Throughput: 0: 1765.5, 1: 1791.0. Samples: 15271870. 
Policy #0 lag: (min: 5.0, avg: 10.8, max: 37.0) [2023-10-14 02:13:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:13:37,476][33201] Updated weights for policy 0, policy_version 29700 (0.0009) [2023-10-14 02:13:37,858][33201] Updated weights for policy 0, policy_version 29710 (0.0008) [2023-10-14 02:13:37,952][33226] Updated weights for policy 1, policy_version 29960 (0.0009) [2023-10-14 02:13:38,224][33201] Updated weights for policy 0, policy_version 29720 (0.0009) [2023-10-14 02:13:38,327][33226] Updated weights for policy 1, policy_version 29970 (0.0008) [2023-10-14 02:13:38,689][33226] Updated weights for policy 1, policy_version 29980 (0.0008) [2023-10-14 02:13:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 61145088. Throughput: 0: 1741.5, 1: 1762.5. Samples: 15291590. Policy #0 lag: (min: 5.0, avg: 10.8, max: 37.0) [2023-10-14 02:13:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:42,065][33201] Updated weights for policy 0, policy_version 29730 (0.0009) [2023-10-14 02:13:42,370][33226] Updated weights for policy 1, policy_version 29990 (0.0009) [2023-10-14 02:13:42,442][33201] Updated weights for policy 0, policy_version 29740 (0.0007) [2023-10-14 02:13:42,734][33226] Updated weights for policy 1, policy_version 30000 (0.0009) [2023-10-14 02:13:42,808][33201] Updated weights for policy 0, policy_version 29750 (0.0007) [2023-10-14 02:13:43,096][33226] Updated weights for policy 1, policy_version 30010 (0.0009) [2023-10-14 02:13:43,180][33201] Updated weights for policy 0, policy_version 29760 (0.0008) [2023-10-14 02:13:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 61210624. Throughput: 0: 1766.7, 1: 1781.5. Samples: 15303864. 
Policy #0 lag: (min: 5.0, avg: 10.8, max: 37.0) [2023-10-14 02:13:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:47,031][33226] Updated weights for policy 1, policy_version 30020 (0.0008) [2023-10-14 02:13:47,103][33201] Updated weights for policy 0, policy_version 29770 (0.0007) [2023-10-14 02:13:47,396][33226] Updated weights for policy 1, policy_version 30030 (0.0010) [2023-10-14 02:13:47,475][33201] Updated weights for policy 0, policy_version 29780 (0.0008) [2023-10-14 02:13:47,764][33226] Updated weights for policy 1, policy_version 30040 (0.0008) [2023-10-14 02:13:47,842][33201] Updated weights for policy 0, policy_version 29790 (0.0009) [2023-10-14 02:13:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 61276160. Throughput: 0: 1730.7, 1: 1762.8. Samples: 15323238. Policy #0 lag: (min: 5.0, avg: 10.8, max: 37.0) [2023-10-14 02:13:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:51,578][33201] Updated weights for policy 0, policy_version 29800 (0.0007) [2023-10-14 02:13:51,605][33226] Updated weights for policy 1, policy_version 30050 (0.0008) [2023-10-14 02:13:51,956][33201] Updated weights for policy 0, policy_version 29810 (0.0008) [2023-10-14 02:13:51,970][33226] Updated weights for policy 1, policy_version 30060 (0.0007) [2023-10-14 02:13:52,334][33201] Updated weights for policy 0, policy_version 29820 (0.0008) [2023-10-14 02:13:52,339][33226] Updated weights for policy 1, policy_version 30070 (0.0009) [2023-10-14 02:13:52,703][33226] Updated weights for policy 1, policy_version 30080 (0.0010) [2023-10-14 02:13:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 61341696. Throughput: 0: 1734.3, 1: 1762.2. Samples: 15345056. 
Policy #0 lag: (min: 5.0, avg: 10.8, max: 37.0) [2023-10-14 02:13:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:13:56,252][33201] Updated weights for policy 0, policy_version 29830 (0.0009) [2023-10-14 02:13:56,547][33226] Updated weights for policy 1, policy_version 30090 (0.0008) [2023-10-14 02:13:56,624][33201] Updated weights for policy 0, policy_version 29840 (0.0008) [2023-10-14 02:13:56,909][33226] Updated weights for policy 1, policy_version 30100 (0.0009) [2023-10-14 02:13:57,002][33201] Updated weights for policy 0, policy_version 29850 (0.0009) [2023-10-14 02:13:57,279][33226] Updated weights for policy 1, policy_version 30110 (0.0008) [2023-10-14 02:13:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 61407232. Throughput: 0: 1744.4, 1: 1777.5. Samples: 15355586. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 02:13:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.800')] [2023-10-14 02:14:00,980][33201] Updated weights for policy 0, policy_version 29860 (0.0008) [2023-10-14 02:14:01,122][33226] Updated weights for policy 1, policy_version 30120 (0.0008) [2023-10-14 02:14:01,356][33201] Updated weights for policy 0, policy_version 29870 (0.0008) [2023-10-14 02:14:01,493][33226] Updated weights for policy 1, policy_version 30130 (0.0009) [2023-10-14 02:14:01,723][33201] Updated weights for policy 0, policy_version 29880 (0.0007) [2023-10-14 02:14:01,852][33226] Updated weights for policy 1, policy_version 30140 (0.0007) [2023-10-14 02:14:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 61472768. Throughput: 0: 1746.8, 1: 1755.4. Samples: 15376870. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 02:14:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.800')] [2023-10-14 02:14:05,399][33201] Updated weights for policy 0, policy_version 29890 (0.0007) [2023-10-14 02:14:05,695][33226] Updated weights for policy 1, policy_version 30150 (0.0007) [2023-10-14 02:14:05,772][33201] Updated weights for policy 0, policy_version 29900 (0.0007) [2023-10-14 02:14:06,050][33226] Updated weights for policy 1, policy_version 30160 (0.0007) [2023-10-14 02:14:06,141][33201] Updated weights for policy 0, policy_version 29910 (0.0009) [2023-10-14 02:14:06,415][33226] Updated weights for policy 1, policy_version 30170 (0.0007) [2023-10-14 02:14:06,503][33201] Updated weights for policy 0, policy_version 29920 (0.0009) [2023-10-14 02:14:09,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 61538304. Throughput: 0: 1761.6, 1: 1765.9. Samples: 15399108. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 02:14:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.750')] [2023-10-14 02:14:10,087][33226] Updated weights for policy 1, policy_version 30180 (0.0009) [2023-10-14 02:14:10,257][33201] Updated weights for policy 0, policy_version 29930 (0.0008) [2023-10-14 02:14:10,454][33226] Updated weights for policy 1, policy_version 30190 (0.0007) [2023-10-14 02:14:10,629][33201] Updated weights for policy 0, policy_version 29940 (0.0008) [2023-10-14 02:14:10,821][33226] Updated weights for policy 1, policy_version 30200 (0.0007) [2023-10-14 02:14:10,999][33201] Updated weights for policy 0, policy_version 29950 (0.0007) [2023-10-14 02:14:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 61603840. Throughput: 0: 1746.4, 1: 1761.7. Samples: 15408650. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 02:14:14,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.760')] [2023-10-14 02:14:14,700][33226] Updated weights for policy 1, policy_version 30210 (0.0009) [2023-10-14 02:14:15,063][33201] Updated weights for policy 0, policy_version 29960 (0.0009) [2023-10-14 02:14:15,119][33226] Updated weights for policy 1, policy_version 30220 (0.0007) [2023-10-14 02:14:15,435][33201] Updated weights for policy 0, policy_version 29970 (0.0010) [2023-10-14 02:14:15,494][33226] Updated weights for policy 1, policy_version 30230 (0.0007) [2023-10-14 02:14:15,798][33201] Updated weights for policy 0, policy_version 29980 (0.0009) [2023-10-14 02:14:15,856][33226] Updated weights for policy 1, policy_version 30240 (0.0007) [2023-10-14 02:14:19,461][33226] Updated weights for policy 1, policy_version 30250 (0.0009) [2023-10-14 02:14:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 61669376. Throughput: 0: 1759.2, 1: 1765.8. Samples: 15430496. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 02:14:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.760')] [2023-10-14 02:14:19,599][33201] Updated weights for policy 0, policy_version 29990 (0.0007) [2023-10-14 02:14:19,818][33226] Updated weights for policy 1, policy_version 30260 (0.0009) [2023-10-14 02:14:19,972][33201] Updated weights for policy 0, policy_version 30000 (0.0007) [2023-10-14 02:14:20,181][33226] Updated weights for policy 1, policy_version 30270 (0.0007) [2023-10-14 02:14:20,342][33201] Updated weights for policy 0, policy_version 30010 (0.0007) [2023-10-14 02:14:24,036][33226] Updated weights for policy 1, policy_version 30280 (0.0009) [2023-10-14 02:14:24,295][33201] Updated weights for policy 0, policy_version 30020 (0.0008) [2023-10-14 02:14:24,402][33226] Updated weights for policy 1, policy_version 30290 (0.0008) [2023-10-14 02:14:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 61734912. Throughput: 0: 1780.8, 1: 1788.7. Samples: 15452216. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:14:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.750')] [2023-10-14 02:14:24,677][33201] Updated weights for policy 0, policy_version 30030 (0.0008) [2023-10-14 02:14:24,769][33226] Updated weights for policy 1, policy_version 30300 (0.0007) [2023-10-14 02:14:25,049][33201] Updated weights for policy 0, policy_version 30040 (0.0007) [2023-10-14 02:14:28,686][33226] Updated weights for policy 1, policy_version 30310 (0.0010) [2023-10-14 02:14:28,855][33201] Updated weights for policy 0, policy_version 30050 (0.0008) [2023-10-14 02:14:29,062][33226] Updated weights for policy 1, policy_version 30320 (0.0008) [2023-10-14 02:14:29,221][33201] Updated weights for policy 0, policy_version 30060 (0.0008) [2023-10-14 02:14:29,431][33226] Updated weights for policy 1, policy_version 30330 (0.0008) [2023-10-14 02:14:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 61800448. Throughput: 0: 1749.9, 1: 1761.5. Samples: 15461876. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:14:29,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.750')] [2023-10-14 02:14:29,587][33201] Updated weights for policy 0, policy_version 30070 (0.0008) [2023-10-14 02:14:29,961][33201] Updated weights for policy 0, policy_version 30080 (0.0010) [2023-10-14 02:14:33,162][33226] Updated weights for policy 1, policy_version 30340 (0.0008) [2023-10-14 02:14:33,539][33226] Updated weights for policy 1, policy_version 30350 (0.0009) [2023-10-14 02:14:33,812][33201] Updated weights for policy 0, policy_version 30090 (0.0007) [2023-10-14 02:14:33,898][33226] Updated weights for policy 1, policy_version 30360 (0.0009) [2023-10-14 02:14:34,179][33201] Updated weights for policy 0, policy_version 30100 (0.0007) [2023-10-14 02:14:34,553][33201] Updated weights for policy 0, policy_version 30110 (0.0007) [2023-10-14 02:14:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 61898752. Throughput: 0: 1780.0, 1: 1790.8. Samples: 15483928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:14:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.760')] [2023-10-14 02:14:37,725][33226] Updated weights for policy 1, policy_version 30370 (0.0009) [2023-10-14 02:14:38,092][33226] Updated weights for policy 1, policy_version 30380 (0.0010) [2023-10-14 02:14:38,459][33226] Updated weights for policy 1, policy_version 30390 (0.0010) [2023-10-14 02:14:38,459][33201] Updated weights for policy 0, policy_version 30120 (0.0008) [2023-10-14 02:14:38,823][33226] Updated weights for policy 1, policy_version 30400 (0.0009) [2023-10-14 02:14:38,835][33201] Updated weights for policy 0, policy_version 30130 (0.0008) [2023-10-14 02:14:39,217][33201] Updated weights for policy 0, policy_version 30140 (0.0011) [2023-10-14 02:14:39,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 61997056. Throughput: 0: 1756.8, 1: 1761.3. 
Samples: 15503374. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:14:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.770')] [2023-10-14 02:14:42,835][33226] Updated weights for policy 1, policy_version 30410 (0.0007) [2023-10-14 02:14:42,958][33201] Updated weights for policy 0, policy_version 30150 (0.0010) [2023-10-14 02:14:43,198][33226] Updated weights for policy 1, policy_version 30420 (0.0008) [2023-10-14 02:14:43,331][33201] Updated weights for policy 0, policy_version 30160 (0.0008) [2023-10-14 02:14:43,570][33226] Updated weights for policy 1, policy_version 30430 (0.0010) [2023-10-14 02:14:43,704][33201] Updated weights for policy 0, policy_version 30170 (0.0009) [2023-10-14 02:14:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 62062592. Throughput: 0: 1773.1, 1: 1775.9. Samples: 15515290. Policy #0 lag: (min: 5.0, avg: 12.9, max: 37.0) [2023-10-14 02:14:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.780')] [2023-10-14 02:14:47,306][33201] Updated weights for policy 0, policy_version 30180 (0.0009) [2023-10-14 02:14:47,510][33226] Updated weights for policy 1, policy_version 30440 (0.0008) [2023-10-14 02:14:47,668][33201] Updated weights for policy 0, policy_version 30190 (0.0007) [2023-10-14 02:14:47,877][33226] Updated weights for policy 1, policy_version 30450 (0.0008) [2023-10-14 02:14:48,035][33201] Updated weights for policy 0, policy_version 30200 (0.0007) [2023-10-14 02:14:48,246][33226] Updated weights for policy 1, policy_version 30460 (0.0009) [2023-10-14 02:14:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 62128128. Throughput: 0: 1756.0, 1: 1768.9. Samples: 15535492. 
Policy #0 lag: (min: 5.0, avg: 12.9, max: 37.0) [2023-10-14 02:14:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.790')] [2023-10-14 02:14:51,778][33226] Updated weights for policy 1, policy_version 30470 (0.0009) [2023-10-14 02:14:51,996][33201] Updated weights for policy 0, policy_version 30210 (0.0009) [2023-10-14 02:14:52,143][33226] Updated weights for policy 1, policy_version 30480 (0.0007) [2023-10-14 02:14:52,360][33201] Updated weights for policy 0, policy_version 30220 (0.0007) [2023-10-14 02:14:52,511][33226] Updated weights for policy 1, policy_version 30490 (0.0008) [2023-10-14 02:14:52,741][33201] Updated weights for policy 0, policy_version 30230 (0.0008) [2023-10-14 02:14:53,110][33201] Updated weights for policy 0, policy_version 30240 (0.0011) [2023-10-14 02:14:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 62193664. Throughput: 0: 1748.1, 1: 1759.7. Samples: 15556956. Policy #0 lag: (min: 5.0, avg: 12.9, max: 37.0) [2023-10-14 02:14:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.800')] [2023-10-14 02:14:56,294][33226] Updated weights for policy 1, policy_version 30500 (0.0007) [2023-10-14 02:14:56,663][33226] Updated weights for policy 1, policy_version 30510 (0.0009) [2023-10-14 02:14:56,912][33201] Updated weights for policy 0, policy_version 30250 (0.0008) [2023-10-14 02:14:57,023][33226] Updated weights for policy 1, policy_version 30520 (0.0008) [2023-10-14 02:14:57,282][33201] Updated weights for policy 0, policy_version 30260 (0.0009) [2023-10-14 02:14:57,645][33201] Updated weights for policy 0, policy_version 30270 (0.0009) [2023-10-14 02:14:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 62259200. Throughput: 0: 1768.6, 1: 1770.8. Samples: 15567920. 
Policy #0 lag: (min: 5.0, avg: 12.9, max: 37.0) [2023-10-14 02:14:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 02:15:00,868][33226] Updated weights for policy 1, policy_version 30530 (0.0009) [2023-10-14 02:15:01,224][33226] Updated weights for policy 1, policy_version 30540 (0.0009) [2023-10-14 02:15:01,398][33201] Updated weights for policy 0, policy_version 30280 (0.0008) [2023-10-14 02:15:01,594][33226] Updated weights for policy 1, policy_version 30550 (0.0008) [2023-10-14 02:15:01,770][33201] Updated weights for policy 0, policy_version 30290 (0.0009) [2023-10-14 02:15:01,960][33226] Updated weights for policy 1, policy_version 30560 (0.0009) [2023-10-14 02:15:02,140][33201] Updated weights for policy 0, policy_version 30300 (0.0010) [2023-10-14 02:15:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 62324736. Throughput: 0: 1753.6, 1: 1761.4. Samples: 15588674. Policy #0 lag: (min: 5.0, avg: 12.9, max: 37.0) [2023-10-14 02:15:04,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.820')] [2023-10-14 02:15:05,855][33226] Updated weights for policy 1, policy_version 30570 (0.0007) [2023-10-14 02:15:06,146][33201] Updated weights for policy 0, policy_version 30310 (0.0010) [2023-10-14 02:15:06,213][33226] Updated weights for policy 1, policy_version 30580 (0.0008) [2023-10-14 02:15:06,525][33201] Updated weights for policy 0, policy_version 30320 (0.0008) [2023-10-14 02:15:06,588][33226] Updated weights for policy 1, policy_version 30590 (0.0009) [2023-10-14 02:15:06,892][33201] Updated weights for policy 0, policy_version 30330 (0.0009) [2023-10-14 02:15:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 62390272. Throughput: 0: 1750.8, 1: 1765.7. Samples: 15610460. 
Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-14 02:15:09,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.820')] [2023-10-14 02:15:10,436][33226] Updated weights for policy 1, policy_version 30600 (0.0010) [2023-10-14 02:15:10,718][33201] Updated weights for policy 0, policy_version 30340 (0.0010) [2023-10-14 02:15:10,803][33226] Updated weights for policy 1, policy_version 30610 (0.0009) [2023-10-14 02:15:11,080][33201] Updated weights for policy 0, policy_version 30350 (0.0010) [2023-10-14 02:15:11,179][33226] Updated weights for policy 1, policy_version 30620 (0.0011) [2023-10-14 02:15:11,454][33201] Updated weights for policy 0, policy_version 30360 (0.0011) [2023-10-14 02:15:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 62455808. Throughput: 0: 1750.2, 1: 1757.1. Samples: 15619704. Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-14 02:15:14,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.800')] [2023-10-14 02:15:14,910][33226] Updated weights for policy 1, policy_version 30630 (0.0009) [2023-10-14 02:15:15,281][33226] Updated weights for policy 1, policy_version 30640 (0.0007) [2023-10-14 02:15:15,335][33201] Updated weights for policy 0, policy_version 30370 (0.0010) [2023-10-14 02:15:15,650][33226] Updated weights for policy 1, policy_version 30650 (0.0007) [2023-10-14 02:15:15,696][33201] Updated weights for policy 0, policy_version 30380 (0.0008) [2023-10-14 02:15:16,070][33201] Updated weights for policy 0, policy_version 30390 (0.0007) [2023-10-14 02:15:16,449][33201] Updated weights for policy 0, policy_version 30400 (0.0008) [2023-10-14 02:15:19,353][33226] Updated weights for policy 1, policy_version 30660 (0.0007) [2023-10-14 02:15:19,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 62521344. Throughput: 0: 1755.0, 1: 1760.9. Samples: 15642140. 
Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-14 02:15:19,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.780')] [2023-10-14 02:15:19,720][33226] Updated weights for policy 1, policy_version 30670 (0.0008) [2023-10-14 02:15:20,087][33226] Updated weights for policy 1, policy_version 30680 (0.0008) [2023-10-14 02:15:20,140][33201] Updated weights for policy 0, policy_version 30410 (0.0009) [2023-10-14 02:15:20,521][33201] Updated weights for policy 0, policy_version 30420 (0.0007) [2023-10-14 02:15:20,886][33201] Updated weights for policy 0, policy_version 30430 (0.0008) [2023-10-14 02:15:23,770][33226] Updated weights for policy 1, policy_version 30690 (0.0008) [2023-10-14 02:15:24,133][33226] Updated weights for policy 1, policy_version 30700 (0.0007) [2023-10-14 02:15:24,505][33226] Updated weights for policy 1, policy_version 30710 (0.0008) [2023-10-14 02:15:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 62586880. Throughput: 0: 1778.9, 1: 1794.4. Samples: 15664176. Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-14 02:15:24,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.780')] [2023-10-14 02:15:24,734][33201] Updated weights for policy 0, policy_version 30440 (0.0007) [2023-10-14 02:15:24,865][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000030720_31457280.pth... [2023-10-14 02:15:24,868][33226] Updated weights for policy 1, policy_version 30720 (0.0007) [2023-10-14 02:15:24,894][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000029056_29753344.pth [2023-10-14 02:15:25,114][33201] Updated weights for policy 0, policy_version 30450 (0.0009) [2023-10-14 02:15:25,476][33201] Updated weights for policy 0, policy_version 30460 (0.0010) [2023-10-14 02:15:25,626][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000030464_31195136.pth... 
[2023-10-14 02:15:25,655][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000028800_29491200.pth [2023-10-14 02:15:28,853][33226] Updated weights for policy 1, policy_version 30730 (0.0011) [2023-10-14 02:15:29,225][33226] Updated weights for policy 1, policy_version 30740 (0.0010) [2023-10-14 02:15:29,416][33201] Updated weights for policy 0, policy_version 30470 (0.0008) [2023-10-14 02:15:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 62652416. Throughput: 0: 1754.9, 1: 1770.9. Samples: 15673950. Policy #0 lag: (min: 5.0, avg: 13.0, max: 37.0) [2023-10-14 02:15:29,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.830')] [2023-10-14 02:15:29,587][33226] Updated weights for policy 1, policy_version 30750 (0.0008) [2023-10-14 02:15:29,779][33201] Updated weights for policy 0, policy_version 30480 (0.0009) [2023-10-14 02:15:30,154][33201] Updated weights for policy 0, policy_version 30490 (0.0008) [2023-10-14 02:15:33,483][33226] Updated weights for policy 1, policy_version 30760 (0.0007) [2023-10-14 02:15:33,768][33201] Updated weights for policy 0, policy_version 30500 (0.0007) [2023-10-14 02:15:33,846][33226] Updated weights for policy 1, policy_version 30770 (0.0008) [2023-10-14 02:15:34,130][33201] Updated weights for policy 0, policy_version 30510 (0.0009) [2023-10-14 02:15:34,221][33226] Updated weights for policy 1, policy_version 30780 (0.0009) [2023-10-14 02:15:34,506][33201] Updated weights for policy 0, policy_version 30520 (0.0007) [2023-10-14 02:15:34,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 62750720. Throughput: 0: 1776.1, 1: 1790.4. Samples: 15695986. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:15:34,557][31953] Avg episode reward: [(0, '20.570'), (1, '20.830')] [2023-10-14 02:15:38,057][33226] Updated weights for policy 1, policy_version 30790 (0.0007) [2023-10-14 02:15:38,390][33201] Updated weights for policy 0, policy_version 30530 (0.0009) [2023-10-14 02:15:38,418][33226] Updated weights for policy 1, policy_version 30800 (0.0008) [2023-10-14 02:15:38,754][33201] Updated weights for policy 0, policy_version 30540 (0.0007) [2023-10-14 02:15:38,782][33226] Updated weights for policy 1, policy_version 30810 (0.0009) [2023-10-14 02:15:39,126][33201] Updated weights for policy 0, policy_version 30550 (0.0009) [2023-10-14 02:15:39,501][33201] Updated weights for policy 0, policy_version 30560 (0.0011) [2023-10-14 02:15:39,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 62849024. Throughput: 0: 1760.5, 1: 1765.7. Samples: 15715636. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:15:39,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.820')] [2023-10-14 02:15:42,744][33226] Updated weights for policy 1, policy_version 30820 (0.0009) [2023-10-14 02:15:43,114][33226] Updated weights for policy 1, policy_version 30830 (0.0009) [2023-10-14 02:15:43,331][33201] Updated weights for policy 0, policy_version 30570 (0.0007) [2023-10-14 02:15:43,482][33226] Updated weights for policy 1, policy_version 30840 (0.0008) [2023-10-14 02:15:43,703][33201] Updated weights for policy 0, policy_version 30580 (0.0008) [2023-10-14 02:15:44,079][33201] Updated weights for policy 0, policy_version 30590 (0.0007) [2023-10-14 02:15:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 62914560. Throughput: 0: 1761.2, 1: 1781.1. Samples: 15727326. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:15:44,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.830')] [2023-10-14 02:15:47,373][33226] Updated weights for policy 1, policy_version 30850 (0.0008) [2023-10-14 02:15:47,745][33226] Updated weights for policy 1, policy_version 30860 (0.0009) [2023-10-14 02:15:47,972][33201] Updated weights for policy 0, policy_version 30600 (0.0008) [2023-10-14 02:15:48,110][33226] Updated weights for policy 1, policy_version 30870 (0.0008) [2023-10-14 02:15:48,348][33201] Updated weights for policy 0, policy_version 30610 (0.0009) [2023-10-14 02:15:48,471][33226] Updated weights for policy 1, policy_version 30880 (0.0009) [2023-10-14 02:15:48,711][33201] Updated weights for policy 0, policy_version 30620 (0.0009) [2023-10-14 02:15:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 62980096. Throughput: 0: 1765.2, 1: 1776.6. Samples: 15748054. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 02:15:49,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.840')] [2023-10-14 02:15:52,208][33226] Updated weights for policy 1, policy_version 30890 (0.0007) [2023-10-14 02:15:52,568][33226] Updated weights for policy 1, policy_version 30900 (0.0009) [2023-10-14 02:15:52,676][33201] Updated weights for policy 0, policy_version 30630 (0.0007) [2023-10-14 02:15:52,942][33226] Updated weights for policy 1, policy_version 30910 (0.0008) [2023-10-14 02:15:53,052][33201] Updated weights for policy 0, policy_version 30640 (0.0010) [2023-10-14 02:15:53,423][33201] Updated weights for policy 0, policy_version 30650 (0.0010) [2023-10-14 02:15:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 63045632. Throughput: 0: 1744.4, 1: 1767.5. Samples: 15768496. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 02:15:54,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.890')] [2023-10-14 02:15:56,673][33226] Updated weights for policy 1, policy_version 30920 (0.0010) [2023-10-14 02:15:57,037][33226] Updated weights for policy 1, policy_version 30930 (0.0009) [2023-10-14 02:15:57,406][33226] Updated weights for policy 1, policy_version 30940 (0.0008) [2023-10-14 02:15:57,438][33201] Updated weights for policy 0, policy_version 30660 (0.0010) [2023-10-14 02:15:57,810][33201] Updated weights for policy 0, policy_version 30670 (0.0008) [2023-10-14 02:15:58,187][33201] Updated weights for policy 0, policy_version 30680 (0.0010) [2023-10-14 02:15:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 63111168. Throughput: 0: 1775.1, 1: 1785.7. Samples: 15779938. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 02:15:59,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.880')] [2023-10-14 02:16:01,220][33226] Updated weights for policy 1, policy_version 30950 (0.0010) [2023-10-14 02:16:01,596][33226] Updated weights for policy 1, policy_version 30960 (0.0008) [2023-10-14 02:16:01,953][33226] Updated weights for policy 1, policy_version 30970 (0.0008) [2023-10-14 02:16:01,958][33201] Updated weights for policy 0, policy_version 30690 (0.0007) [2023-10-14 02:16:02,329][33201] Updated weights for policy 0, policy_version 30700 (0.0007) [2023-10-14 02:16:02,698][33201] Updated weights for policy 0, policy_version 30710 (0.0008) [2023-10-14 02:16:03,068][33201] Updated weights for policy 0, policy_version 30720 (0.0008) [2023-10-14 02:16:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 63176704. Throughput: 0: 1743.2, 1: 1765.3. Samples: 15800026. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 02:16:04,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.870')] [2023-10-14 02:16:05,716][33226] Updated weights for policy 1, policy_version 30980 (0.0007) [2023-10-14 02:16:06,082][33226] Updated weights for policy 1, policy_version 30990 (0.0008) [2023-10-14 02:16:06,442][33226] Updated weights for policy 1, policy_version 31000 (0.0009) [2023-10-14 02:16:06,988][33201] Updated weights for policy 0, policy_version 30730 (0.0010) [2023-10-14 02:16:07,358][33201] Updated weights for policy 0, policy_version 30740 (0.0011) [2023-10-14 02:16:07,735][33201] Updated weights for policy 0, policy_version 30750 (0.0007) [2023-10-14 02:16:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 63242240. Throughput: 0: 1740.8, 1: 1772.6. Samples: 15822276. Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 02:16:09,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.870')] [2023-10-14 02:16:10,118][33226] Updated weights for policy 1, policy_version 31010 (0.0008) [2023-10-14 02:16:10,487][33226] Updated weights for policy 1, policy_version 31020 (0.0011) [2023-10-14 02:16:10,857][33226] Updated weights for policy 1, policy_version 31030 (0.0008) [2023-10-14 02:16:11,220][33226] Updated weights for policy 1, policy_version 31040 (0.0009) [2023-10-14 02:16:11,533][33201] Updated weights for policy 0, policy_version 30760 (0.0008) [2023-10-14 02:16:11,915][33201] Updated weights for policy 0, policy_version 30770 (0.0009) [2023-10-14 02:16:12,296][33201] Updated weights for policy 0, policy_version 30780 (0.0007) [2023-10-14 02:16:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 63307776. Throughput: 0: 1750.8, 1: 1773.0. Samples: 15832524. 
Policy #0 lag: (min: 31.0, avg: 34.8, max: 63.0) [2023-10-14 02:16:14,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.860')] [2023-10-14 02:16:14,993][33226] Updated weights for policy 1, policy_version 31050 (0.0008) [2023-10-14 02:16:15,366][33226] Updated weights for policy 1, policy_version 31060 (0.0009) [2023-10-14 02:16:15,726][33226] Updated weights for policy 1, policy_version 31070 (0.0008) [2023-10-14 02:16:16,092][33201] Updated weights for policy 0, policy_version 30790 (0.0008) [2023-10-14 02:16:16,462][33201] Updated weights for policy 0, policy_version 30800 (0.0009) [2023-10-14 02:16:16,850][33201] Updated weights for policy 0, policy_version 30810 (0.0009) [2023-10-14 02:16:19,544][33226] Updated weights for policy 1, policy_version 31080 (0.0008) [2023-10-14 02:16:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 63373312. Throughput: 0: 1740.6, 1: 1776.3. Samples: 15854246. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:16:19,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.860')] [2023-10-14 02:16:19,911][33226] Updated weights for policy 1, policy_version 31090 (0.0010) [2023-10-14 02:16:20,280][33226] Updated weights for policy 1, policy_version 31100 (0.0007) [2023-10-14 02:16:20,633][33201] Updated weights for policy 0, policy_version 30820 (0.0008) [2023-10-14 02:16:21,016][33201] Updated weights for policy 0, policy_version 30830 (0.0010) [2023-10-14 02:16:21,387][33201] Updated weights for policy 0, policy_version 30840 (0.0009) [2023-10-14 02:16:24,225][33226] Updated weights for policy 1, policy_version 31110 (0.0009) [2023-10-14 02:16:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 63438848. Throughput: 0: 1759.1, 1: 1801.0. Samples: 15875840. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:16:24,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.860')] [2023-10-14 02:16:24,593][33226] Updated weights for policy 1, policy_version 31120 (0.0009) [2023-10-14 02:16:24,957][33226] Updated weights for policy 1, policy_version 31130 (0.0007) [2023-10-14 02:16:25,245][33201] Updated weights for policy 0, policy_version 30850 (0.0007) [2023-10-14 02:16:25,628][33201] Updated weights for policy 0, policy_version 30860 (0.0009) [2023-10-14 02:16:25,997][33201] Updated weights for policy 0, policy_version 30870 (0.0008) [2023-10-14 02:16:26,354][33201] Updated weights for policy 0, policy_version 30880 (0.0007) [2023-10-14 02:16:28,693][33226] Updated weights for policy 1, policy_version 31140 (0.0008) [2023-10-14 02:16:29,063][33226] Updated weights for policy 1, policy_version 31150 (0.0008) [2023-10-14 02:16:29,431][33226] Updated weights for policy 1, policy_version 31160 (0.0007) [2023-10-14 02:16:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 63504384. Throughput: 0: 1736.5, 1: 1779.6. Samples: 15885552. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:16:29,557][31953] Avg episode reward: [(0, '20.600'), (1, '20.860')] [2023-10-14 02:16:30,206][33201] Updated weights for policy 0, policy_version 30890 (0.0007) [2023-10-14 02:16:30,577][33201] Updated weights for policy 0, policy_version 30900 (0.0008) [2023-10-14 02:16:30,957][33201] Updated weights for policy 0, policy_version 30910 (0.0010) [2023-10-14 02:16:33,306][33226] Updated weights for policy 1, policy_version 31170 (0.0008) [2023-10-14 02:16:33,671][33226] Updated weights for policy 1, policy_version 31180 (0.0008) [2023-10-14 02:16:34,039][33226] Updated weights for policy 1, policy_version 31190 (0.0010) [2023-10-14 02:16:34,404][33226] Updated weights for policy 1, policy_version 31200 (0.0008) [2023-10-14 02:16:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 63602688. Throughput: 0: 1751.6, 1: 1792.5. Samples: 15907540. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:16:34,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.860')] [2023-10-14 02:16:34,833][33201] Updated weights for policy 0, policy_version 30920 (0.0009) [2023-10-14 02:16:35,202][33201] Updated weights for policy 0, policy_version 30930 (0.0007) [2023-10-14 02:16:35,567][33201] Updated weights for policy 0, policy_version 30940 (0.0007) [2023-10-14 02:16:38,179][33226] Updated weights for policy 1, policy_version 31210 (0.0009) [2023-10-14 02:16:38,562][33226] Updated weights for policy 1, policy_version 31220 (0.0007) [2023-10-14 02:16:38,923][33226] Updated weights for policy 1, policy_version 31230 (0.0008) [2023-10-14 02:16:39,557][31953] Fps is (10 sec: 16383.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 63668224. Throughput: 0: 1771.2, 1: 1775.6. Samples: 15928102. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:16:39,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.870')] [2023-10-14 02:16:39,560][33201] Updated weights for policy 0, policy_version 30950 (0.0009) [2023-10-14 02:16:39,931][33201] Updated weights for policy 0, policy_version 30960 (0.0008) [2023-10-14 02:16:40,303][33201] Updated weights for policy 0, policy_version 30970 (0.0007) [2023-10-14 02:16:42,672][33226] Updated weights for policy 1, policy_version 31240 (0.0007) [2023-10-14 02:16:43,036][33226] Updated weights for policy 1, policy_version 31250 (0.0007) [2023-10-14 02:16:43,400][33226] Updated weights for policy 1, policy_version 31260 (0.0007) [2023-10-14 02:16:44,184][33201] Updated weights for policy 0, policy_version 30980 (0.0007) [2023-10-14 02:16:44,555][33201] Updated weights for policy 0, policy_version 30990 (0.0009) [2023-10-14 02:16:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 63733760. Throughput: 0: 1745.3, 1: 1793.2. Samples: 15939172. Policy #0 lag: (min: 13.0, avg: 15.6, max: 45.0) [2023-10-14 02:16:44,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.870')] [2023-10-14 02:16:44,925][33201] Updated weights for policy 0, policy_version 31000 (0.0008) [2023-10-14 02:16:47,162][33226] Updated weights for policy 1, policy_version 31270 (0.0009) [2023-10-14 02:16:47,526][33226] Updated weights for policy 1, policy_version 31280 (0.0010) [2023-10-14 02:16:47,892][33226] Updated weights for policy 1, policy_version 31290 (0.0008) [2023-10-14 02:16:48,628][33201] Updated weights for policy 0, policy_version 31010 (0.0009) [2023-10-14 02:16:49,003][33201] Updated weights for policy 0, policy_version 31020 (0.0011) [2023-10-14 02:16:49,370][33201] Updated weights for policy 0, policy_version 31030 (0.0009) [2023-10-14 02:16:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 63799296. Throughput: 0: 1774.7, 1: 1783.2. 
Samples: 15960134. Policy #0 lag: (min: 13.0, avg: 15.6, max: 45.0) [2023-10-14 02:16:49,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.880')] [2023-10-14 02:16:49,742][33201] Updated weights for policy 0, policy_version 31040 (0.0009) [2023-10-14 02:16:51,718][33226] Updated weights for policy 1, policy_version 31300 (0.0008) [2023-10-14 02:16:52,081][33226] Updated weights for policy 1, policy_version 31310 (0.0007) [2023-10-14 02:16:52,450][33226] Updated weights for policy 1, policy_version 31320 (0.0007) [2023-10-14 02:16:53,589][33201] Updated weights for policy 0, policy_version 31050 (0.0008) [2023-10-14 02:16:53,965][33201] Updated weights for policy 0, policy_version 31060 (0.0009) [2023-10-14 02:16:54,329][33201] Updated weights for policy 0, policy_version 31070 (0.0008) [2023-10-14 02:16:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 63897600. Throughput: 0: 1753.5, 1: 1771.9. Samples: 15980918. Policy #0 lag: (min: 13.0, avg: 15.6, max: 45.0) [2023-10-14 02:16:54,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.880')] [2023-10-14 02:16:56,184][33226] Updated weights for policy 1, policy_version 31330 (0.0008) [2023-10-14 02:16:56,550][33226] Updated weights for policy 1, policy_version 31340 (0.0008) [2023-10-14 02:16:56,911][33226] Updated weights for policy 1, policy_version 31350 (0.0007) [2023-10-14 02:16:57,282][33226] Updated weights for policy 1, policy_version 31360 (0.0009) [2023-10-14 02:16:58,059][33201] Updated weights for policy 0, policy_version 31080 (0.0009) [2023-10-14 02:16:58,429][33201] Updated weights for policy 0, policy_version 31090 (0.0008) [2023-10-14 02:16:58,793][33201] Updated weights for policy 0, policy_version 31100 (0.0007) [2023-10-14 02:16:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 63963136. Throughput: 0: 1766.8, 1: 1780.4. Samples: 15992150. 
Policy #0 lag: (min: 13.0, avg: 15.6, max: 45.0) [2023-10-14 02:16:59,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.880')] [2023-10-14 02:17:01,083][33226] Updated weights for policy 1, policy_version 31370 (0.0007) [2023-10-14 02:17:01,452][33226] Updated weights for policy 1, policy_version 31380 (0.0008) [2023-10-14 02:17:01,812][33226] Updated weights for policy 1, policy_version 31390 (0.0007) [2023-10-14 02:17:02,719][33201] Updated weights for policy 0, policy_version 31110 (0.0008) [2023-10-14 02:17:03,091][33201] Updated weights for policy 0, policy_version 31120 (0.0008) [2023-10-14 02:17:03,465][33201] Updated weights for policy 0, policy_version 31130 (0.0008) [2023-10-14 02:17:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 64028672. Throughput: 0: 1756.7, 1: 1773.5. Samples: 16013104. Policy #0 lag: (min: 27.0, avg: 32.1, max: 59.0) [2023-10-14 02:17:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.890')] [2023-10-14 02:17:05,569][33226] Updated weights for policy 1, policy_version 31400 (0.0008) [2023-10-14 02:17:05,941][33226] Updated weights for policy 1, policy_version 31410 (0.0009) [2023-10-14 02:17:06,318][33226] Updated weights for policy 1, policy_version 31420 (0.0010) [2023-10-14 02:17:07,309][33201] Updated weights for policy 0, policy_version 31140 (0.0009) [2023-10-14 02:17:07,669][33201] Updated weights for policy 0, policy_version 31150 (0.0009) [2023-10-14 02:17:08,039][33201] Updated weights for policy 0, policy_version 31160 (0.0007) [2023-10-14 02:17:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 64094208. Throughput: 0: 1748.6, 1: 1785.6. Samples: 16034880. 
Policy #0 lag: (min: 27.0, avg: 32.1, max: 59.0) [2023-10-14 02:17:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 02:17:10,081][33226] Updated weights for policy 1, policy_version 31430 (0.0007) [2023-10-14 02:17:10,445][33226] Updated weights for policy 1, policy_version 31440 (0.0008) [2023-10-14 02:17:10,804][33226] Updated weights for policy 1, policy_version 31450 (0.0007) [2023-10-14 02:17:11,761][33201] Updated weights for policy 0, policy_version 31170 (0.0008) [2023-10-14 02:17:12,132][33201] Updated weights for policy 0, policy_version 31180 (0.0009) [2023-10-14 02:17:12,505][33201] Updated weights for policy 0, policy_version 31190 (0.0008) [2023-10-14 02:17:12,869][33201] Updated weights for policy 0, policy_version 31200 (0.0009) [2023-10-14 02:17:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64159744. Throughput: 0: 1773.7, 1: 1779.8. Samples: 16045462. Policy #0 lag: (min: 27.0, avg: 32.1, max: 59.0) [2023-10-14 02:17:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 02:17:14,643][33226] Updated weights for policy 1, policy_version 31460 (0.0007) [2023-10-14 02:17:15,010][33226] Updated weights for policy 1, policy_version 31470 (0.0007) [2023-10-14 02:17:15,381][33226] Updated weights for policy 1, policy_version 31480 (0.0008) [2023-10-14 02:17:16,712][33201] Updated weights for policy 0, policy_version 31210 (0.0008) [2023-10-14 02:17:17,091][33201] Updated weights for policy 0, policy_version 31220 (0.0009) [2023-10-14 02:17:17,457][33201] Updated weights for policy 0, policy_version 31230 (0.0008) [2023-10-14 02:17:19,164][33226] Updated weights for policy 1, policy_version 31490 (0.0009) [2023-10-14 02:17:19,532][33226] Updated weights for policy 1, policy_version 31500 (0.0009) [2023-10-14 02:17:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64225280. Throughput: 0: 1751.3, 1: 1782.8. 
Samples: 16066576. Policy #0 lag: (min: 27.0, avg: 32.1, max: 59.0) [2023-10-14 02:17:19,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 02:17:19,897][33226] Updated weights for policy 1, policy_version 31510 (0.0010) [2023-10-14 02:17:20,255][33226] Updated weights for policy 1, policy_version 31520 (0.0011) [2023-10-14 02:17:21,372][33201] Updated weights for policy 0, policy_version 31240 (0.0008) [2023-10-14 02:17:21,748][33201] Updated weights for policy 0, policy_version 31250 (0.0007) [2023-10-14 02:17:22,120][33201] Updated weights for policy 0, policy_version 31260 (0.0007) [2023-10-14 02:17:24,065][33226] Updated weights for policy 1, policy_version 31530 (0.0008) [2023-10-14 02:17:24,431][33226] Updated weights for policy 1, policy_version 31540 (0.0007) [2023-10-14 02:17:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 64290816. Throughput: 0: 1752.9, 1: 1808.8. Samples: 16088376. Policy #0 lag: (min: 27.0, avg: 32.1, max: 59.0) [2023-10-14 02:17:24,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.920')] [2023-10-14 02:17:24,564][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000031264_32014336.pth... [2023-10-14 02:17:24,594][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000029632_30343168.pth [2023-10-14 02:17:24,797][33226] Updated weights for policy 1, policy_version 31550 (0.0007) [2023-10-14 02:17:24,869][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000031552_32309248.pth... 
[2023-10-14 02:17:24,907][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000029888_30605312.pth [2023-10-14 02:17:25,896][33201] Updated weights for policy 0, policy_version 31270 (0.0007) [2023-10-14 02:17:26,268][33201] Updated weights for policy 0, policy_version 31280 (0.0008) [2023-10-14 02:17:26,639][33201] Updated weights for policy 0, policy_version 31290 (0.0009) [2023-10-14 02:17:28,512][33226] Updated weights for policy 1, policy_version 31560 (0.0008) [2023-10-14 02:17:28,876][33226] Updated weights for policy 1, policy_version 31570 (0.0008) [2023-10-14 02:17:29,242][33226] Updated weights for policy 1, policy_version 31580 (0.0007) [2023-10-14 02:17:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 64389120. Throughput: 0: 1750.7, 1: 1787.1. Samples: 16098372. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:29,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 02:17:30,538][33201] Updated weights for policy 0, policy_version 31300 (0.0007) [2023-10-14 02:17:30,903][33201] Updated weights for policy 0, policy_version 31310 (0.0009) [2023-10-14 02:17:31,277][33201] Updated weights for policy 0, policy_version 31320 (0.0008) [2023-10-14 02:17:33,069][33226] Updated weights for policy 1, policy_version 31590 (0.0008) [2023-10-14 02:17:33,433][33226] Updated weights for policy 1, policy_version 31600 (0.0009) [2023-10-14 02:17:33,802][33226] Updated weights for policy 1, policy_version 31610 (0.0008) [2023-10-14 02:17:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64454656. Throughput: 0: 1756.5, 1: 1811.7. Samples: 16120702. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:34,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.950')] [2023-10-14 02:17:35,012][33201] Updated weights for policy 0, policy_version 31330 (0.0008) [2023-10-14 02:17:35,377][33201] Updated weights for policy 0, policy_version 31340 (0.0008) [2023-10-14 02:17:35,742][33201] Updated weights for policy 0, policy_version 31350 (0.0008) [2023-10-14 02:17:36,113][33201] Updated weights for policy 0, policy_version 31360 (0.0011) [2023-10-14 02:17:37,714][33226] Updated weights for policy 1, policy_version 31620 (0.0011) [2023-10-14 02:17:38,089][33226] Updated weights for policy 1, policy_version 31630 (0.0010) [2023-10-14 02:17:38,463][33226] Updated weights for policy 1, policy_version 31640 (0.0010) [2023-10-14 02:17:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64520192. Throughput: 0: 1782.8, 1: 1784.3. Samples: 16141436. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.950')] [2023-10-14 02:17:39,854][33201] Updated weights for policy 0, policy_version 31370 (0.0010) [2023-10-14 02:17:40,236][33201] Updated weights for policy 0, policy_version 31380 (0.0010) [2023-10-14 02:17:40,613][33201] Updated weights for policy 0, policy_version 31390 (0.0009) [2023-10-14 02:17:42,123][33226] Updated weights for policy 1, policy_version 31650 (0.0009) [2023-10-14 02:17:42,497][33226] Updated weights for policy 1, policy_version 31660 (0.0009) [2023-10-14 02:17:42,864][33226] Updated weights for policy 1, policy_version 31670 (0.0007) [2023-10-14 02:17:43,229][33226] Updated weights for policy 1, policy_version 31680 (0.0007) [2023-10-14 02:17:44,420][33201] Updated weights for policy 0, policy_version 31400 (0.0011) [2023-10-14 02:17:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64585728. Throughput: 0: 1760.3, 1: 1804.6. 
Samples: 16152570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:44,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.950')] [2023-10-14 02:17:44,791][33201] Updated weights for policy 0, policy_version 31410 (0.0009) [2023-10-14 02:17:45,174][33201] Updated weights for policy 0, policy_version 31420 (0.0010) [2023-10-14 02:17:46,932][33226] Updated weights for policy 1, policy_version 31690 (0.0010) [2023-10-14 02:17:47,309][33226] Updated weights for policy 1, policy_version 31700 (0.0009) [2023-10-14 02:17:47,674][33226] Updated weights for policy 1, policy_version 31710 (0.0011) [2023-10-14 02:17:49,144][33201] Updated weights for policy 0, policy_version 31430 (0.0007) [2023-10-14 02:17:49,517][33201] Updated weights for policy 0, policy_version 31440 (0.0007) [2023-10-14 02:17:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 64651264. Throughput: 0: 1775.8, 1: 1783.7. Samples: 16173282. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:49,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.960')] [2023-10-14 02:17:49,882][33201] Updated weights for policy 0, policy_version 31450 (0.0009) [2023-10-14 02:17:51,396][33226] Updated weights for policy 1, policy_version 31720 (0.0008) [2023-10-14 02:17:51,755][33226] Updated weights for policy 1, policy_version 31730 (0.0008) [2023-10-14 02:17:52,127][33226] Updated weights for policy 1, policy_version 31740 (0.0009) [2023-10-14 02:17:53,640][33201] Updated weights for policy 0, policy_version 31460 (0.0009) [2023-10-14 02:17:54,009][33201] Updated weights for policy 0, policy_version 31470 (0.0007) [2023-10-14 02:17:54,381][33201] Updated weights for policy 0, policy_version 31480 (0.0009) [2023-10-14 02:17:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 64716800. Throughput: 0: 1773.8, 1: 1783.6. Samples: 16194962. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.960')] [2023-10-14 02:17:55,865][33226] Updated weights for policy 1, policy_version 31750 (0.0009) [2023-10-14 02:17:56,239][33226] Updated weights for policy 1, policy_version 31760 (0.0009) [2023-10-14 02:17:56,608][33226] Updated weights for policy 1, policy_version 31770 (0.0010) [2023-10-14 02:17:58,104][33201] Updated weights for policy 0, policy_version 31490 (0.0009) [2023-10-14 02:17:58,478][33201] Updated weights for policy 0, policy_version 31500 (0.0008) [2023-10-14 02:17:58,851][33201] Updated weights for policy 0, policy_version 31510 (0.0008) [2023-10-14 02:17:59,224][33201] Updated weights for policy 0, policy_version 31520 (0.0009) [2023-10-14 02:17:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 64815104. Throughput: 0: 1766.7, 1: 1783.6. Samples: 16205222. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:17:59,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 02:18:00,145][33226] Updated weights for policy 1, policy_version 31780 (0.0008) [2023-10-14 02:18:00,522][33226] Updated weights for policy 1, policy_version 31790 (0.0008) [2023-10-14 02:18:00,897][33226] Updated weights for policy 1, policy_version 31800 (0.0009) [2023-10-14 02:18:03,072][33201] Updated weights for policy 0, policy_version 31530 (0.0009) [2023-10-14 02:18:03,446][33201] Updated weights for policy 0, policy_version 31540 (0.0008) [2023-10-14 02:18:03,814][33201] Updated weights for policy 0, policy_version 31550 (0.0007) [2023-10-14 02:18:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 64880640. Throughput: 0: 1776.4, 1: 1786.1. Samples: 16226890. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:18:04,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 02:18:04,677][33226] Updated weights for policy 1, policy_version 31810 (0.0010) [2023-10-14 02:18:05,044][33226] Updated weights for policy 1, policy_version 31820 (0.0011) [2023-10-14 02:18:05,416][33226] Updated weights for policy 1, policy_version 31830 (0.0009) [2023-10-14 02:18:05,777][33226] Updated weights for policy 1, policy_version 31840 (0.0008) [2023-10-14 02:18:07,757][33201] Updated weights for policy 0, policy_version 31560 (0.0007) [2023-10-14 02:18:08,152][33201] Updated weights for policy 0, policy_version 31570 (0.0008) [2023-10-14 02:18:08,527][33201] Updated weights for policy 0, policy_version 31580 (0.0008) [2023-10-14 02:18:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 64946176. Throughput: 0: 1754.7, 1: 1790.1. Samples: 16247894. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:18:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 02:18:09,805][33226] Updated weights for policy 1, policy_version 31850 (0.0010) [2023-10-14 02:18:10,182][33226] Updated weights for policy 1, policy_version 31860 (0.0009) [2023-10-14 02:18:10,547][33226] Updated weights for policy 1, policy_version 31870 (0.0008) [2023-10-14 02:18:12,363][33201] Updated weights for policy 0, policy_version 31590 (0.0009) [2023-10-14 02:18:12,733][33201] Updated weights for policy 0, policy_version 31600 (0.0010) [2023-10-14 02:18:13,103][33201] Updated weights for policy 0, policy_version 31610 (0.0010) [2023-10-14 02:18:14,319][33226] Updated weights for policy 1, policy_version 31880 (0.0007) [2023-10-14 02:18:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 65011712. Throughput: 0: 1791.1, 1: 1774.1. Samples: 16258804. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-14 02:18:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 02:18:14,693][33226] Updated weights for policy 1, policy_version 31890 (0.0008) [2023-10-14 02:18:15,058][33226] Updated weights for policy 1, policy_version 31900 (0.0009) [2023-10-14 02:18:16,832][33201] Updated weights for policy 0, policy_version 31620 (0.0009) [2023-10-14 02:18:17,205][33201] Updated weights for policy 0, policy_version 31630 (0.0008) [2023-10-14 02:18:17,570][33201] Updated weights for policy 0, policy_version 31640 (0.0009) [2023-10-14 02:18:18,914][33226] Updated weights for policy 1, policy_version 31910 (0.0010) [2023-10-14 02:18:19,283][33226] Updated weights for policy 1, policy_version 31920 (0.0008) [2023-10-14 02:18:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 65077248. Throughput: 0: 1754.1, 1: 1775.6. Samples: 16279538. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-14 02:18:19,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 02:18:19,654][33226] Updated weights for policy 1, policy_version 31930 (0.0008) [2023-10-14 02:18:21,434][33201] Updated weights for policy 0, policy_version 31650 (0.0009) [2023-10-14 02:18:21,803][33201] Updated weights for policy 0, policy_version 31660 (0.0008) [2023-10-14 02:18:22,175][33201] Updated weights for policy 0, policy_version 31670 (0.0007) [2023-10-14 02:18:22,548][33201] Updated weights for policy 0, policy_version 31680 (0.0010) [2023-10-14 02:18:23,392][33226] Updated weights for policy 1, policy_version 31940 (0.0008) [2023-10-14 02:18:23,770][33226] Updated weights for policy 1, policy_version 31950 (0.0008) [2023-10-14 02:18:24,130][33226] Updated weights for policy 1, policy_version 31960 (0.0007) [2023-10-14 02:18:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 65175552. Throughput: 0: 1749.4, 1: 1790.1. 
Samples: 16300710. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-14 02:18:24,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 02:18:26,621][33201] Updated weights for policy 0, policy_version 31690 (0.0009) [2023-10-14 02:18:26,991][33201] Updated weights for policy 0, policy_version 31700 (0.0010) [2023-10-14 02:18:27,360][33201] Updated weights for policy 0, policy_version 31710 (0.0010) [2023-10-14 02:18:27,987][33226] Updated weights for policy 1, policy_version 31970 (0.0009) [2023-10-14 02:18:28,356][33226] Updated weights for policy 1, policy_version 31980 (0.0007) [2023-10-14 02:18:28,731][33226] Updated weights for policy 1, policy_version 31990 (0.0010) [2023-10-14 02:18:29,094][33226] Updated weights for policy 1, policy_version 32000 (0.0007) [2023-10-14 02:18:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 65241088. Throughput: 0: 1758.7, 1: 1771.2. Samples: 16311418. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-14 02:18:29,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 02:18:31,108][33201] Updated weights for policy 0, policy_version 31720 (0.0008) [2023-10-14 02:18:31,485][33201] Updated weights for policy 0, policy_version 31730 (0.0009) [2023-10-14 02:18:31,856][33201] Updated weights for policy 0, policy_version 31740 (0.0008) [2023-10-14 02:18:32,837][33226] Updated weights for policy 1, policy_version 32010 (0.0007) [2023-10-14 02:18:33,200][33226] Updated weights for policy 1, policy_version 32020 (0.0010) [2023-10-14 02:18:33,565][33226] Updated weights for policy 1, policy_version 32030 (0.0007) [2023-10-14 02:18:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 65306624. Throughput: 0: 1750.7, 1: 1789.2. Samples: 16332578. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-10-14 02:18:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 02:18:35,674][33201] Updated weights for policy 0, policy_version 31750 (0.0008) [2023-10-14 02:18:36,038][33201] Updated weights for policy 0, policy_version 31760 (0.0008) [2023-10-14 02:18:36,419][33201] Updated weights for policy 0, policy_version 31770 (0.0008) [2023-10-14 02:18:37,437][33226] Updated weights for policy 1, policy_version 32040 (0.0008) [2023-10-14 02:18:37,813][33226] Updated weights for policy 1, policy_version 32050 (0.0008) [2023-10-14 02:18:38,172][33226] Updated weights for policy 1, policy_version 32060 (0.0008) [2023-10-14 02:18:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 65372160. Throughput: 0: 1774.9, 1: 1763.2. Samples: 16354178. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 02:18:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.950')] [2023-10-14 02:18:40,076][33201] Updated weights for policy 0, policy_version 31780 (0.0008) [2023-10-14 02:18:40,446][33201] Updated weights for policy 0, policy_version 31790 (0.0010) [2023-10-14 02:18:40,811][33201] Updated weights for policy 0, policy_version 31800 (0.0010) [2023-10-14 02:18:41,884][33226] Updated weights for policy 1, policy_version 32070 (0.0008) [2023-10-14 02:18:42,255][33226] Updated weights for policy 1, policy_version 32080 (0.0008) [2023-10-14 02:18:42,623][33226] Updated weights for policy 1, policy_version 32090 (0.0008) [2023-10-14 02:18:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 65437696. Throughput: 0: 1757.5, 1: 1787.0. Samples: 16364724. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 02:18:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.940')] [2023-10-14 02:18:44,599][33201] Updated weights for policy 0, policy_version 31810 (0.0010) [2023-10-14 02:18:44,973][33201] Updated weights for policy 0, policy_version 31820 (0.0009) [2023-10-14 02:18:45,342][33201] Updated weights for policy 0, policy_version 31830 (0.0009) [2023-10-14 02:18:45,716][33201] Updated weights for policy 0, policy_version 31840 (0.0010) [2023-10-14 02:18:46,499][33226] Updated weights for policy 1, policy_version 32100 (0.0009) [2023-10-14 02:18:46,865][33226] Updated weights for policy 1, policy_version 32110 (0.0007) [2023-10-14 02:18:47,229][33226] Updated weights for policy 1, policy_version 32120 (0.0010) [2023-10-14 02:18:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 65503232. Throughput: 0: 1765.0, 1: 1760.2. Samples: 16385522. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 02:18:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 02:18:49,608][33201] Updated weights for policy 0, policy_version 31850 (0.0008) [2023-10-14 02:18:49,969][33201] Updated weights for policy 0, policy_version 31860 (0.0010) [2023-10-14 02:18:50,345][33201] Updated weights for policy 0, policy_version 31870 (0.0009) [2023-10-14 02:18:51,046][33226] Updated weights for policy 1, policy_version 32130 (0.0007) [2023-10-14 02:18:51,418][33226] Updated weights for policy 1, policy_version 32140 (0.0009) [2023-10-14 02:18:51,786][33226] Updated weights for policy 1, policy_version 32150 (0.0009) [2023-10-14 02:18:52,156][33226] Updated weights for policy 1, policy_version 32160 (0.0008) [2023-10-14 02:18:54,214][33201] Updated weights for policy 0, policy_version 31880 (0.0008) [2023-10-14 02:18:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 65568768. Throughput: 0: 1782.0, 1: 1758.3. 
Samples: 16407206. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 02:18:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.940')] [2023-10-14 02:18:54,586][33201] Updated weights for policy 0, policy_version 31890 (0.0008) [2023-10-14 02:18:54,946][33201] Updated weights for policy 0, policy_version 31900 (0.0010) [2023-10-14 02:18:56,013][33226] Updated weights for policy 1, policy_version 32170 (0.0008) [2023-10-14 02:18:56,398][33226] Updated weights for policy 1, policy_version 32180 (0.0010) [2023-10-14 02:18:56,757][33226] Updated weights for policy 1, policy_version 32190 (0.0010) [2023-10-14 02:18:58,774][33201] Updated weights for policy 0, policy_version 31910 (0.0010) [2023-10-14 02:18:59,145][33201] Updated weights for policy 0, policy_version 31920 (0.0007) [2023-10-14 02:18:59,516][33201] Updated weights for policy 0, policy_version 31930 (0.0007) [2023-10-14 02:18:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 65634304. Throughput: 0: 1752.4, 1: 1760.5. Samples: 16416882. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 02:18:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 02:19:00,548][33226] Updated weights for policy 1, policy_version 32200 (0.0010) [2023-10-14 02:19:00,914][33226] Updated weights for policy 1, policy_version 32210 (0.0009) [2023-10-14 02:19:01,284][33226] Updated weights for policy 1, policy_version 32220 (0.0008) [2023-10-14 02:19:03,327][33201] Updated weights for policy 0, policy_version 31940 (0.0008) [2023-10-14 02:19:03,696][33201] Updated weights for policy 0, policy_version 31950 (0.0007) [2023-10-14 02:19:04,069][33201] Updated weights for policy 0, policy_version 31960 (0.0009) [2023-10-14 02:19:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 65732608. Throughput: 0: 1780.4, 1: 1760.3. Samples: 16438874. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 02:19:05,074][33226] Updated weights for policy 1, policy_version 32230 (0.0007) [2023-10-14 02:19:05,434][33226] Updated weights for policy 1, policy_version 32240 (0.0007) [2023-10-14 02:19:05,804][33226] Updated weights for policy 1, policy_version 32250 (0.0007) [2023-10-14 02:19:07,855][33201] Updated weights for policy 0, policy_version 31970 (0.0008) [2023-10-14 02:19:08,228][33201] Updated weights for policy 0, policy_version 31980 (0.0007) [2023-10-14 02:19:08,597][33201] Updated weights for policy 0, policy_version 31990 (0.0009) [2023-10-14 02:19:08,966][33201] Updated weights for policy 0, policy_version 32000 (0.0010) [2023-10-14 02:19:09,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 65798144. Throughput: 0: 1752.4, 1: 1780.0. Samples: 16459670. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 02:19:09,774][33226] Updated weights for policy 1, policy_version 32260 (0.0009) [2023-10-14 02:19:10,145][33226] Updated weights for policy 1, policy_version 32270 (0.0008) [2023-10-14 02:19:10,513][33226] Updated weights for policy 1, policy_version 32280 (0.0010) [2023-10-14 02:19:12,863][33201] Updated weights for policy 0, policy_version 32010 (0.0008) [2023-10-14 02:19:13,234][33201] Updated weights for policy 0, policy_version 32020 (0.0008) [2023-10-14 02:19:13,611][33201] Updated weights for policy 0, policy_version 32030 (0.0008) [2023-10-14 02:19:14,195][33226] Updated weights for policy 1, policy_version 32290 (0.0011) [2023-10-14 02:19:14,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 65863680. Throughput: 0: 1778.6, 1: 1762.1. Samples: 16470752. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:14,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 02:19:14,560][33226] Updated weights for policy 1, policy_version 32300 (0.0008) [2023-10-14 02:19:14,929][33226] Updated weights for policy 1, policy_version 32310 (0.0010) [2023-10-14 02:19:15,299][33226] Updated weights for policy 1, policy_version 32320 (0.0009) [2023-10-14 02:19:17,485][33201] Updated weights for policy 0, policy_version 32040 (0.0011) [2023-10-14 02:19:17,851][33201] Updated weights for policy 0, policy_version 32050 (0.0010) [2023-10-14 02:19:18,225][33201] Updated weights for policy 0, policy_version 32060 (0.0009) [2023-10-14 02:19:19,285][33226] Updated weights for policy 1, policy_version 32330 (0.0010) [2023-10-14 02:19:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 65929216. Throughput: 0: 1766.2, 1: 1774.6. Samples: 16491914. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:19,651][33226] Updated weights for policy 1, policy_version 32340 (0.0011) [2023-10-14 02:19:20,025][33226] Updated weights for policy 1, policy_version 32350 (0.0009) [2023-10-14 02:19:21,940][33201] Updated weights for policy 0, policy_version 32070 (0.0011) [2023-10-14 02:19:22,311][33201] Updated weights for policy 0, policy_version 32080 (0.0008) [2023-10-14 02:19:22,681][33201] Updated weights for policy 0, policy_version 32090 (0.0007) [2023-10-14 02:19:23,831][33226] Updated weights for policy 1, policy_version 32360 (0.0010) [2023-10-14 02:19:24,204][33226] Updated weights for policy 1, policy_version 32370 (0.0012) [2023-10-14 02:19:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 65994752. Throughput: 0: 1746.2, 1: 1783.2. Samples: 16513000. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) [2023-10-14 02:19:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:24,568][33226] Updated weights for policy 1, policy_version 32380 (0.0009) [2023-10-14 02:19:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000032096_32866304.pth... [2023-10-14 02:19:24,602][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000030464_31195136.pth [2023-10-14 02:19:24,714][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000032384_33161216.pth... [2023-10-14 02:19:24,752][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000030720_31457280.pth [2023-10-14 02:19:26,374][33201] Updated weights for policy 0, policy_version 32100 (0.0007) [2023-10-14 02:19:26,744][33201] Updated weights for policy 0, policy_version 32110 (0.0007) [2023-10-14 02:19:27,111][33201] Updated weights for policy 0, policy_version 32120 (0.0007) [2023-10-14 02:19:28,299][33226] Updated weights for policy 1, policy_version 32390 (0.0007) [2023-10-14 02:19:28,662][33226] Updated weights for policy 1, policy_version 32400 (0.0009) [2023-10-14 02:19:29,030][33226] Updated weights for policy 1, policy_version 32410 (0.0010) [2023-10-14 02:19:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 66093056. Throughput: 0: 1759.8, 1: 1775.3. Samples: 16523802. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) [2023-10-14 02:19:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:30,850][33201] Updated weights for policy 0, policy_version 32130 (0.0008) [2023-10-14 02:19:31,218][33201] Updated weights for policy 0, policy_version 32140 (0.0007) [2023-10-14 02:19:31,584][33201] Updated weights for policy 0, policy_version 32150 (0.0010) [2023-10-14 02:19:31,950][33201] Updated weights for policy 0, policy_version 32160 (0.0010) [2023-10-14 02:19:32,803][33226] Updated weights for policy 1, policy_version 32420 (0.0009) [2023-10-14 02:19:33,169][33226] Updated weights for policy 1, policy_version 32430 (0.0010) [2023-10-14 02:19:33,542][33226] Updated weights for policy 1, policy_version 32440 (0.0011) [2023-10-14 02:19:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 66158592. Throughput: 0: 1758.1, 1: 1791.9. Samples: 16545272. Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) [2023-10-14 02:19:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:35,894][33201] Updated weights for policy 0, policy_version 32170 (0.0008) [2023-10-14 02:19:36,270][33201] Updated weights for policy 0, policy_version 32180 (0.0010) [2023-10-14 02:19:36,637][33201] Updated weights for policy 0, policy_version 32190 (0.0010) [2023-10-14 02:19:37,468][33226] Updated weights for policy 1, policy_version 32450 (0.0010) [2023-10-14 02:19:37,825][33226] Updated weights for policy 1, policy_version 32460 (0.0009) [2023-10-14 02:19:38,197][33226] Updated weights for policy 1, policy_version 32470 (0.0008) [2023-10-14 02:19:38,564][33226] Updated weights for policy 1, policy_version 32480 (0.0009) [2023-10-14 02:19:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 66224128. Throughput: 0: 1765.5, 1: 1761.9. Samples: 16565942. 
Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) [2023-10-14 02:19:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:40,639][33201] Updated weights for policy 0, policy_version 32200 (0.0008) [2023-10-14 02:19:41,016][33201] Updated weights for policy 0, policy_version 32210 (0.0008) [2023-10-14 02:19:41,382][33201] Updated weights for policy 0, policy_version 32220 (0.0009) [2023-10-14 02:19:42,382][33226] Updated weights for policy 1, policy_version 32490 (0.0009) [2023-10-14 02:19:42,749][33226] Updated weights for policy 1, policy_version 32500 (0.0007) [2023-10-14 02:19:43,119][33226] Updated weights for policy 1, policy_version 32510 (0.0007) [2023-10-14 02:19:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 66289664. Throughput: 0: 1755.6, 1: 1799.6. Samples: 16576868. Policy #0 lag: (min: 9.0, avg: 13.3, max: 41.0) [2023-10-14 02:19:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:45,065][33201] Updated weights for policy 0, policy_version 32230 (0.0008) [2023-10-14 02:19:45,429][33201] Updated weights for policy 0, policy_version 32240 (0.0008) [2023-10-14 02:19:45,810][33201] Updated weights for policy 0, policy_version 32250 (0.0009) [2023-10-14 02:19:46,998][33226] Updated weights for policy 1, policy_version 32520 (0.0010) [2023-10-14 02:19:47,360][33226] Updated weights for policy 1, policy_version 32530 (0.0010) [2023-10-14 02:19:47,723][33226] Updated weights for policy 1, policy_version 32540 (0.0008) [2023-10-14 02:19:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 66355200. Throughput: 0: 1760.9, 1: 1762.8. Samples: 16597438. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 02:19:49,806][33201] Updated weights for policy 0, policy_version 32260 (0.0007) [2023-10-14 02:19:50,173][33201] Updated weights for policy 0, policy_version 32270 (0.0008) [2023-10-14 02:19:50,550][33201] Updated weights for policy 0, policy_version 32280 (0.0008) [2023-10-14 02:19:51,390][33226] Updated weights for policy 1, policy_version 32550 (0.0007) [2023-10-14 02:19:51,750][33226] Updated weights for policy 1, policy_version 32560 (0.0011) [2023-10-14 02:19:52,117][33226] Updated weights for policy 1, policy_version 32570 (0.0009) [2023-10-14 02:19:54,372][33201] Updated weights for policy 0, policy_version 32290 (0.0009) [2023-10-14 02:19:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 66420736. Throughput: 0: 1795.3, 1: 1765.1. Samples: 16619890. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:54,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.890')] [2023-10-14 02:19:54,743][33201] Updated weights for policy 0, policy_version 32300 (0.0008) [2023-10-14 02:19:55,114][33201] Updated weights for policy 0, policy_version 32310 (0.0009) [2023-10-14 02:19:55,486][33201] Updated weights for policy 0, policy_version 32320 (0.0009) [2023-10-14 02:19:55,840][33226] Updated weights for policy 1, policy_version 32580 (0.0010) [2023-10-14 02:19:56,217][33226] Updated weights for policy 1, policy_version 32590 (0.0008) [2023-10-14 02:19:56,583][33226] Updated weights for policy 1, policy_version 32600 (0.0009) [2023-10-14 02:19:59,234][33201] Updated weights for policy 0, policy_version 32330 (0.0007) [2023-10-14 02:19:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 66486272. Throughput: 0: 1757.7, 1: 1768.8. Samples: 16629446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:19:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.890')] [2023-10-14 02:19:59,617][33201] Updated weights for policy 0, policy_version 32340 (0.0010) [2023-10-14 02:19:59,985][33201] Updated weights for policy 0, policy_version 32350 (0.0010) [2023-10-14 02:20:00,418][33226] Updated weights for policy 1, policy_version 32610 (0.0008) [2023-10-14 02:20:00,780][33226] Updated weights for policy 1, policy_version 32620 (0.0010) [2023-10-14 02:20:01,152][33226] Updated weights for policy 1, policy_version 32630 (0.0009) [2023-10-14 02:20:01,513][33226] Updated weights for policy 1, policy_version 32640 (0.0009) [2023-10-14 02:20:03,768][33201] Updated weights for policy 0, policy_version 32360 (0.0007) [2023-10-14 02:20:04,149][33201] Updated weights for policy 0, policy_version 32370 (0.0009) [2023-10-14 02:20:04,513][33201] Updated weights for policy 0, policy_version 32380 (0.0007) [2023-10-14 02:20:04,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 66551808. Throughput: 0: 1782.4, 1: 1764.5. Samples: 16651526. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:20:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:20:05,219][33226] Updated weights for policy 1, policy_version 32650 (0.0007) [2023-10-14 02:20:05,588][33226] Updated weights for policy 1, policy_version 32660 (0.0008) [2023-10-14 02:20:05,963][33226] Updated weights for policy 1, policy_version 32670 (0.0008) [2023-10-14 02:20:08,304][33201] Updated weights for policy 0, policy_version 32390 (0.0008) [2023-10-14 02:20:08,673][33201] Updated weights for policy 0, policy_version 32400 (0.0008) [2023-10-14 02:20:09,045][33201] Updated weights for policy 0, policy_version 32410 (0.0008) [2023-10-14 02:20:09,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 66650112. Throughput: 0: 1769.2, 1: 1779.0. 
Samples: 16672670. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:20:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:20:09,808][33226] Updated weights for policy 1, policy_version 32680 (0.0009) [2023-10-14 02:20:10,172][33226] Updated weights for policy 1, policy_version 32690 (0.0008) [2023-10-14 02:20:10,541][33226] Updated weights for policy 1, policy_version 32700 (0.0008) [2023-10-14 02:20:12,787][33201] Updated weights for policy 0, policy_version 32420 (0.0008) [2023-10-14 02:20:13,158][33201] Updated weights for policy 0, policy_version 32430 (0.0007) [2023-10-14 02:20:13,523][33201] Updated weights for policy 0, policy_version 32440 (0.0009) [2023-10-14 02:20:14,207][33226] Updated weights for policy 1, policy_version 32710 (0.0007) [2023-10-14 02:20:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 66715648. Throughput: 0: 1786.3, 1: 1766.7. Samples: 16683686. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:20:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:20:14,568][33226] Updated weights for policy 1, policy_version 32720 (0.0009) [2023-10-14 02:20:14,930][33226] Updated weights for policy 1, policy_version 32730 (0.0007) [2023-10-14 02:20:17,226][33201] Updated weights for policy 0, policy_version 32450 (0.0008) [2023-10-14 02:20:17,600][33201] Updated weights for policy 0, policy_version 32460 (0.0010) [2023-10-14 02:20:17,981][33201] Updated weights for policy 0, policy_version 32470 (0.0008) [2023-10-14 02:20:18,353][33201] Updated weights for policy 0, policy_version 32480 (0.0008) [2023-10-14 02:20:18,726][33226] Updated weights for policy 1, policy_version 32740 (0.0008) [2023-10-14 02:20:19,097][33226] Updated weights for policy 1, policy_version 32750 (0.0009) [2023-10-14 02:20:19,474][33226] Updated weights for policy 1, policy_version 32760 (0.0009) [2023-10-14 02:20:19,557][31953] Fps is (10 sec: 
13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 66781184. Throughput: 0: 1771.6, 1: 1779.2. Samples: 16705056. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:20:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:20:22,109][33201] Updated weights for policy 0, policy_version 32490 (0.0011) [2023-10-14 02:20:22,477][33201] Updated weights for policy 0, policy_version 32500 (0.0010) [2023-10-14 02:20:22,849][33201] Updated weights for policy 0, policy_version 32510 (0.0008) [2023-10-14 02:20:23,295][33226] Updated weights for policy 1, policy_version 32770 (0.0008) [2023-10-14 02:20:23,657][33226] Updated weights for policy 1, policy_version 32780 (0.0008) [2023-10-14 02:20:24,021][33226] Updated weights for policy 1, policy_version 32790 (0.0009) [2023-10-14 02:20:24,394][33226] Updated weights for policy 1, policy_version 32800 (0.0009) [2023-10-14 02:20:24,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 66879488. Throughput: 0: 1767.2, 1: 1792.1. Samples: 16726110. Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:20:24,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:20:26,610][33201] Updated weights for policy 0, policy_version 32520 (0.0009) [2023-10-14 02:20:26,979][33201] Updated weights for policy 0, policy_version 32530 (0.0009) [2023-10-14 02:20:27,346][33201] Updated weights for policy 0, policy_version 32540 (0.0009) [2023-10-14 02:20:28,208][33226] Updated weights for policy 1, policy_version 32810 (0.0011) [2023-10-14 02:20:28,569][33226] Updated weights for policy 1, policy_version 32820 (0.0008) [2023-10-14 02:20:28,946][33226] Updated weights for policy 1, policy_version 32830 (0.0009) [2023-10-14 02:20:29,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 66945024. Throughput: 0: 1783.6, 1: 1778.8. Samples: 16737174. 
Policy #0 lag: (min: 29.0, avg: 29.0, max: 29.0) [2023-10-14 02:20:29,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.910')] [2023-10-14 02:20:31,037][33201] Updated weights for policy 0, policy_version 32550 (0.0011) [2023-10-14 02:20:31,411][33201] Updated weights for policy 0, policy_version 32560 (0.0008) [2023-10-14 02:20:31,780][33201] Updated weights for policy 0, policy_version 32570 (0.0009) [2023-10-14 02:20:32,699][33226] Updated weights for policy 1, policy_version 32840 (0.0009) [2023-10-14 02:20:33,065][33226] Updated weights for policy 1, policy_version 32850 (0.0010) [2023-10-14 02:20:33,436][33226] Updated weights for policy 1, policy_version 32860 (0.0008) [2023-10-14 02:20:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 67010560. Throughput: 0: 1770.8, 1: 1801.7. Samples: 16758202. Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) [2023-10-14 02:20:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:20:35,612][33201] Updated weights for policy 0, policy_version 32580 (0.0007) [2023-10-14 02:20:35,981][33201] Updated weights for policy 0, policy_version 32590 (0.0008) [2023-10-14 02:20:36,352][33201] Updated weights for policy 0, policy_version 32600 (0.0011) [2023-10-14 02:20:37,321][33226] Updated weights for policy 1, policy_version 32870 (0.0007) [2023-10-14 02:20:37,686][33226] Updated weights for policy 1, policy_version 32880 (0.0010) [2023-10-14 02:20:38,057][33226] Updated weights for policy 1, policy_version 32890 (0.0009) [2023-10-14 02:20:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 67076096. Throughput: 0: 1769.5, 1: 1778.5. Samples: 16779552. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) [2023-10-14 02:20:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:20:40,462][33201] Updated weights for policy 0, policy_version 32610 (0.0008) [2023-10-14 02:20:40,844][33201] Updated weights for policy 0, policy_version 32620 (0.0011) [2023-10-14 02:20:41,209][33201] Updated weights for policy 0, policy_version 32630 (0.0011) [2023-10-14 02:20:41,578][33201] Updated weights for policy 0, policy_version 32640 (0.0011) [2023-10-14 02:20:41,860][33226] Updated weights for policy 1, policy_version 32900 (0.0009) [2023-10-14 02:20:42,231][33226] Updated weights for policy 1, policy_version 32910 (0.0008) [2023-10-14 02:20:42,599][33226] Updated weights for policy 1, policy_version 32920 (0.0007) [2023-10-14 02:20:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 67141632. Throughput: 0: 1772.3, 1: 1802.8. Samples: 16790322. Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) [2023-10-14 02:20:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:20:45,388][33201] Updated weights for policy 0, policy_version 32650 (0.0007) [2023-10-14 02:20:45,761][33201] Updated weights for policy 0, policy_version 32660 (0.0007) [2023-10-14 02:20:46,139][33201] Updated weights for policy 0, policy_version 32670 (0.0008) [2023-10-14 02:20:46,214][33226] Updated weights for policy 1, policy_version 32930 (0.0009) [2023-10-14 02:20:46,573][33226] Updated weights for policy 1, policy_version 32940 (0.0007) [2023-10-14 02:20:46,954][33226] Updated weights for policy 1, policy_version 32950 (0.0008) [2023-10-14 02:20:47,310][33226] Updated weights for policy 1, policy_version 32960 (0.0008) [2023-10-14 02:20:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 67207168. Throughput: 0: 1770.6, 1: 1786.3. Samples: 16811584. 
Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) [2023-10-14 02:20:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:20:49,865][33201] Updated weights for policy 0, policy_version 32680 (0.0009) [2023-10-14 02:20:50,227][33201] Updated weights for policy 0, policy_version 32690 (0.0008) [2023-10-14 02:20:50,607][33201] Updated weights for policy 0, policy_version 32700 (0.0010) [2023-10-14 02:20:51,067][33226] Updated weights for policy 1, policy_version 32970 (0.0009) [2023-10-14 02:20:51,435][33226] Updated weights for policy 1, policy_version 32980 (0.0011) [2023-10-14 02:20:51,788][33226] Updated weights for policy 1, policy_version 32990 (0.0008) [2023-10-14 02:20:54,321][33201] Updated weights for policy 0, policy_version 32710 (0.0008) [2023-10-14 02:20:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 67272704. Throughput: 0: 1794.0, 1: 1781.8. Samples: 16833580. Policy #0 lag: (min: 1.0, avg: 4.2, max: 33.0) [2023-10-14 02:20:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:20:54,691][33201] Updated weights for policy 0, policy_version 32720 (0.0008) [2023-10-14 02:20:55,071][33201] Updated weights for policy 0, policy_version 32730 (0.0008) [2023-10-14 02:20:55,570][33226] Updated weights for policy 1, policy_version 33000 (0.0008) [2023-10-14 02:20:55,934][33226] Updated weights for policy 1, policy_version 33010 (0.0007) [2023-10-14 02:20:56,301][33226] Updated weights for policy 1, policy_version 33020 (0.0009) [2023-10-14 02:20:58,803][33201] Updated weights for policy 0, policy_version 32740 (0.0007) [2023-10-14 02:20:59,178][33201] Updated weights for policy 0, policy_version 32750 (0.0009) [2023-10-14 02:20:59,545][33201] Updated weights for policy 0, policy_version 32760 (0.0008) [2023-10-14 02:20:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 67338240. Throughput: 0: 1765.5, 1: 1781.1. 
Samples: 16843280. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:20:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:00,198][33226] Updated weights for policy 1, policy_version 33030 (0.0009) [2023-10-14 02:21:00,568][33226] Updated weights for policy 1, policy_version 33040 (0.0009) [2023-10-14 02:21:00,938][33226] Updated weights for policy 1, policy_version 33050 (0.0009) [2023-10-14 02:21:03,477][33201] Updated weights for policy 0, policy_version 32770 (0.0010) [2023-10-14 02:21:03,847][33201] Updated weights for policy 0, policy_version 32780 (0.0009) [2023-10-14 02:21:04,223][33201] Updated weights for policy 0, policy_version 32790 (0.0008) [2023-10-14 02:21:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 67403776. Throughput: 0: 1788.7, 1: 1774.2. Samples: 16865384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:21:04,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:04,590][33201] Updated weights for policy 0, policy_version 32800 (0.0007) [2023-10-14 02:21:04,772][33226] Updated weights for policy 1, policy_version 33060 (0.0009) [2023-10-14 02:21:05,150][33226] Updated weights for policy 1, policy_version 33070 (0.0008) [2023-10-14 02:21:05,519][33226] Updated weights for policy 1, policy_version 33080 (0.0007) [2023-10-14 02:21:08,353][33201] Updated weights for policy 0, policy_version 32810 (0.0010) [2023-10-14 02:21:08,726][33201] Updated weights for policy 0, policy_version 32820 (0.0010) [2023-10-14 02:21:09,099][33201] Updated weights for policy 0, policy_version 32830 (0.0007) [2023-10-14 02:21:09,327][33226] Updated weights for policy 1, policy_version 33090 (0.0008) [2023-10-14 02:21:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67502080. Throughput: 0: 1766.4, 1: 1795.4. Samples: 16886390. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:21:09,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:09,684][33226] Updated weights for policy 1, policy_version 33100 (0.0011) [2023-10-14 02:21:10,058][33226] Updated weights for policy 1, policy_version 33110 (0.0011) [2023-10-14 02:21:10,417][33226] Updated weights for policy 1, policy_version 33120 (0.0012) [2023-10-14 02:21:12,963][33201] Updated weights for policy 0, policy_version 32840 (0.0008) [2023-10-14 02:21:13,349][33201] Updated weights for policy 0, policy_version 32850 (0.0008) [2023-10-14 02:21:13,722][33201] Updated weights for policy 0, policy_version 32860 (0.0007) [2023-10-14 02:21:14,331][33226] Updated weights for policy 1, policy_version 33130 (0.0008) [2023-10-14 02:21:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67567616. Throughput: 0: 1781.2, 1: 1771.3. Samples: 16897036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:21:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:14,704][33226] Updated weights for policy 1, policy_version 33140 (0.0008) [2023-10-14 02:21:15,067][33226] Updated weights for policy 1, policy_version 33150 (0.0010) [2023-10-14 02:21:17,539][33201] Updated weights for policy 0, policy_version 32870 (0.0009) [2023-10-14 02:21:17,911][33201] Updated weights for policy 0, policy_version 32880 (0.0008) [2023-10-14 02:21:18,288][33201] Updated weights for policy 0, policy_version 32890 (0.0007) [2023-10-14 02:21:18,801][33226] Updated weights for policy 1, policy_version 33160 (0.0009) [2023-10-14 02:21:19,167][33226] Updated weights for policy 1, policy_version 33170 (0.0008) [2023-10-14 02:21:19,523][33226] Updated weights for policy 1, policy_version 33180 (0.0008) [2023-10-14 02:21:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67633152. Throughput: 0: 1769.0, 1: 1783.4. 
Samples: 16918060. Policy #0 lag: (min: 17.0, avg: 24.5, max: 49.0) [2023-10-14 02:21:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:22,142][33201] Updated weights for policy 0, policy_version 32900 (0.0008) [2023-10-14 02:21:22,516][33201] Updated weights for policy 0, policy_version 32910 (0.0010) [2023-10-14 02:21:22,880][33201] Updated weights for policy 0, policy_version 32920 (0.0009) [2023-10-14 02:21:23,473][33226] Updated weights for policy 1, policy_version 33190 (0.0011) [2023-10-14 02:21:23,852][33226] Updated weights for policy 1, policy_version 33200 (0.0009) [2023-10-14 02:21:24,220][33226] Updated weights for policy 1, policy_version 33210 (0.0007) [2023-10-14 02:21:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 67731456. Throughput: 0: 1753.3, 1: 1781.8. Samples: 16938630. Policy #0 lag: (min: 17.0, avg: 24.5, max: 49.0) [2023-10-14 02:21:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000032928_33718272.pth... [2023-10-14 02:21:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000033216_34013184.pth... 
[2023-10-14 02:21:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000031264_32014336.pth [2023-10-14 02:21:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000031552_32309248.pth [2023-10-14 02:21:24,605][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000032928_33718272.pth [2023-10-14 02:21:24,607][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000033216_34013184.pth [2023-10-14 02:21:26,769][33201] Updated weights for policy 0, policy_version 32930 (0.0008) [2023-10-14 02:21:27,133][33201] Updated weights for policy 0, policy_version 32940 (0.0007) [2023-10-14 02:21:27,511][33201] Updated weights for policy 0, policy_version 32950 (0.0009) [2023-10-14 02:21:27,878][33201] Updated weights for policy 0, policy_version 32960 (0.0009) [2023-10-14 02:21:27,957][33226] Updated weights for policy 1, policy_version 33220 (0.0008) [2023-10-14 02:21:28,326][33226] Updated weights for policy 1, policy_version 33230 (0.0009) [2023-10-14 02:21:28,703][33226] Updated weights for policy 1, policy_version 33240 (0.0008) [2023-10-14 02:21:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 67796992. Throughput: 0: 1773.4, 1: 1774.5. Samples: 16949980. 
Policy #0 lag: (min: 17.0, avg: 24.5, max: 49.0) [2023-10-14 02:21:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:31,634][33201] Updated weights for policy 0, policy_version 32970 (0.0009) [2023-10-14 02:21:32,007][33201] Updated weights for policy 0, policy_version 32980 (0.0010) [2023-10-14 02:21:32,387][33201] Updated weights for policy 0, policy_version 32990 (0.0008) [2023-10-14 02:21:32,555][33226] Updated weights for policy 1, policy_version 33250 (0.0009) [2023-10-14 02:21:32,925][33226] Updated weights for policy 1, policy_version 33260 (0.0007) [2023-10-14 02:21:33,301][33226] Updated weights for policy 1, policy_version 33270 (0.0008) [2023-10-14 02:21:33,665][33226] Updated weights for policy 1, policy_version 33280 (0.0009) [2023-10-14 02:21:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67862528. Throughput: 0: 1758.4, 1: 1780.3. Samples: 16970828. Policy #0 lag: (min: 17.0, avg: 24.5, max: 49.0) [2023-10-14 02:21:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:36,137][33201] Updated weights for policy 0, policy_version 33000 (0.0009) [2023-10-14 02:21:36,519][33201] Updated weights for policy 0, policy_version 33010 (0.0008) [2023-10-14 02:21:36,893][33201] Updated weights for policy 0, policy_version 33020 (0.0011) [2023-10-14 02:21:37,483][33226] Updated weights for policy 1, policy_version 33290 (0.0009) [2023-10-14 02:21:37,845][33226] Updated weights for policy 1, policy_version 33300 (0.0007) [2023-10-14 02:21:38,213][33226] Updated weights for policy 1, policy_version 33310 (0.0007) [2023-10-14 02:21:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67928064. Throughput: 0: 1757.9, 1: 1761.6. Samples: 16991956. 
Policy #0 lag: (min: 17.0, avg: 24.5, max: 49.0) [2023-10-14 02:21:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:40,975][33201] Updated weights for policy 0, policy_version 33030 (0.0010) [2023-10-14 02:21:41,340][33201] Updated weights for policy 0, policy_version 33040 (0.0009) [2023-10-14 02:21:41,711][33201] Updated weights for policy 0, policy_version 33050 (0.0008) [2023-10-14 02:21:42,096][33226] Updated weights for policy 1, policy_version 33320 (0.0008) [2023-10-14 02:21:42,469][33226] Updated weights for policy 1, policy_version 33330 (0.0007) [2023-10-14 02:21:42,840][33226] Updated weights for policy 1, policy_version 33340 (0.0008) [2023-10-14 02:21:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 67993600. Throughput: 0: 1752.1, 1: 1785.4. Samples: 17002470. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 02:21:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.980')] [2023-10-14 02:21:45,379][33201] Updated weights for policy 0, policy_version 33060 (0.0008) [2023-10-14 02:21:45,746][33201] Updated weights for policy 0, policy_version 33070 (0.0007) [2023-10-14 02:21:46,114][33201] Updated weights for policy 0, policy_version 33080 (0.0009) [2023-10-14 02:21:46,517][33226] Updated weights for policy 1, policy_version 33350 (0.0008) [2023-10-14 02:21:46,879][33226] Updated weights for policy 1, policy_version 33360 (0.0010) [2023-10-14 02:21:47,250][33226] Updated weights for policy 1, policy_version 33370 (0.0008) [2023-10-14 02:21:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 68059136. Throughput: 0: 1754.8, 1: 1763.0. Samples: 17023688. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 02:21:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:49,932][33201] Updated weights for policy 0, policy_version 33090 (0.0008) [2023-10-14 02:21:50,307][33201] Updated weights for policy 0, policy_version 33100 (0.0008) [2023-10-14 02:21:50,679][33201] Updated weights for policy 0, policy_version 33110 (0.0007) [2023-10-14 02:21:51,047][33201] Updated weights for policy 0, policy_version 33120 (0.0009) [2023-10-14 02:21:51,113][33226] Updated weights for policy 1, policy_version 33380 (0.0008) [2023-10-14 02:21:51,479][33226] Updated weights for policy 1, policy_version 33390 (0.0008) [2023-10-14 02:21:51,850][33226] Updated weights for policy 1, policy_version 33400 (0.0008) [2023-10-14 02:21:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 68124672. Throughput: 0: 1782.3, 1: 1763.5. Samples: 17045950. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 02:21:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:54,938][33201] Updated weights for policy 0, policy_version 33130 (0.0009) [2023-10-14 02:21:55,306][33201] Updated weights for policy 0, policy_version 33140 (0.0009) [2023-10-14 02:21:55,360][33226] Updated weights for policy 1, policy_version 33410 (0.0008) [2023-10-14 02:21:55,675][33201] Updated weights for policy 0, policy_version 33150 (0.0008) [2023-10-14 02:21:55,730][33226] Updated weights for policy 1, policy_version 33420 (0.0007) [2023-10-14 02:21:56,104][33226] Updated weights for policy 1, policy_version 33430 (0.0008) [2023-10-14 02:21:56,467][33226] Updated weights for policy 1, policy_version 33440 (0.0007) [2023-10-14 02:21:59,484][33201] Updated weights for policy 0, policy_version 33160 (0.0009) [2023-10-14 02:21:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 68190208. Throughput: 0: 1757.4, 1: 1769.3. 
Samples: 17055740. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 02:21:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:21:59,862][33201] Updated weights for policy 0, policy_version 33170 (0.0010) [2023-10-14 02:22:00,237][33201] Updated weights for policy 0, policy_version 33180 (0.0008) [2023-10-14 02:22:00,348][33226] Updated weights for policy 1, policy_version 33450 (0.0008) [2023-10-14 02:22:00,728][33226] Updated weights for policy 1, policy_version 33460 (0.0008) [2023-10-14 02:22:01,096][33226] Updated weights for policy 1, policy_version 33470 (0.0010) [2023-10-14 02:22:03,876][33201] Updated weights for policy 0, policy_version 33190 (0.0008) [2023-10-14 02:22:04,253][33201] Updated weights for policy 0, policy_version 33200 (0.0007) [2023-10-14 02:22:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 68255744. Throughput: 0: 1777.9, 1: 1772.3. Samples: 17077820. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 02:22:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:22:04,615][33201] Updated weights for policy 0, policy_version 33210 (0.0009) [2023-10-14 02:22:04,821][33226] Updated weights for policy 1, policy_version 33480 (0.0008) [2023-10-14 02:22:05,186][33226] Updated weights for policy 1, policy_version 33490 (0.0012) [2023-10-14 02:22:05,552][33226] Updated weights for policy 1, policy_version 33500 (0.0010) [2023-10-14 02:22:08,543][33201] Updated weights for policy 0, policy_version 33220 (0.0009) [2023-10-14 02:22:08,916][33201] Updated weights for policy 0, policy_version 33230 (0.0011) [2023-10-14 02:22:09,286][33201] Updated weights for policy 0, policy_version 33240 (0.0009) [2023-10-14 02:22:09,445][33226] Updated weights for policy 1, policy_version 33510 (0.0009) [2023-10-14 02:22:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 68321280. 
Throughput: 0: 1773.3, 1: 1795.3. Samples: 17099216. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:22:09,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 02:22:09,810][33226] Updated weights for policy 1, policy_version 33520 (0.0009) [2023-10-14 02:22:10,171][33226] Updated weights for policy 1, policy_version 33530 (0.0008) [2023-10-14 02:22:13,268][33201] Updated weights for policy 0, policy_version 33250 (0.0009) [2023-10-14 02:22:13,634][33201] Updated weights for policy 0, policy_version 33260 (0.0007) [2023-10-14 02:22:13,948][33226] Updated weights for policy 1, policy_version 33540 (0.0008) [2023-10-14 02:22:14,008][33201] Updated weights for policy 0, policy_version 33270 (0.0009) [2023-10-14 02:22:14,317][33226] Updated weights for policy 1, policy_version 33550 (0.0008) [2023-10-14 02:22:14,367][33201] Updated weights for policy 0, policy_version 33280 (0.0008) [2023-10-14 02:22:14,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 68419584. Throughput: 0: 1769.4, 1: 1776.9. Samples: 17109566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:22:14,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.930')] [2023-10-14 02:22:14,681][33226] Updated weights for policy 1, policy_version 33560 (0.0009) [2023-10-14 02:22:18,219][33201] Updated weights for policy 0, policy_version 33290 (0.0008) [2023-10-14 02:22:18,332][33226] Updated weights for policy 1, policy_version 33570 (0.0009) [2023-10-14 02:22:18,595][33201] Updated weights for policy 0, policy_version 33300 (0.0008) [2023-10-14 02:22:18,687][33226] Updated weights for policy 1, policy_version 33580 (0.0007) [2023-10-14 02:22:18,971][33201] Updated weights for policy 0, policy_version 33310 (0.0009) [2023-10-14 02:22:19,062][33226] Updated weights for policy 1, policy_version 33590 (0.0007) [2023-10-14 02:22:19,424][33226] Updated weights for policy 1, policy_version 33600 (0.0009) [2023-10-14 02:22:19,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 68517888. Throughput: 0: 1776.1, 1: 1792.2. Samples: 17131402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:22:19,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:22:22,811][33201] Updated weights for policy 0, policy_version 33320 (0.0009) [2023-10-14 02:22:23,185][33201] Updated weights for policy 0, policy_version 33330 (0.0009) [2023-10-14 02:22:23,287][33226] Updated weights for policy 1, policy_version 33610 (0.0009) [2023-10-14 02:22:23,562][33201] Updated weights for policy 0, policy_version 33340 (0.0008) [2023-10-14 02:22:23,655][33226] Updated weights for policy 1, policy_version 33620 (0.0009) [2023-10-14 02:22:24,016][33226] Updated weights for policy 1, policy_version 33630 (0.0010) [2023-10-14 02:22:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 68583424. Throughput: 0: 1747.1, 1: 1785.2. Samples: 17150912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:22:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:22:27,340][33201] Updated weights for policy 0, policy_version 33350 (0.0008) [2023-10-14 02:22:27,703][33226] Updated weights for policy 1, policy_version 33640 (0.0008) [2023-10-14 02:22:27,709][33201] Updated weights for policy 0, policy_version 33360 (0.0007) [2023-10-14 02:22:28,071][33226] Updated weights for policy 1, policy_version 33650 (0.0009) [2023-10-14 02:22:28,080][33201] Updated weights for policy 0, policy_version 33370 (0.0009) [2023-10-14 02:22:28,439][33226] Updated weights for policy 1, policy_version 33660 (0.0008) [2023-10-14 02:22:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 68648960. Throughput: 0: 1784.3, 1: 1786.1. Samples: 17163138. Policy #0 lag: (min: 27.0, avg: 35.5, max: 59.0) [2023-10-14 02:22:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.910')] [2023-10-14 02:22:31,940][33201] Updated weights for policy 0, policy_version 33380 (0.0009) [2023-10-14 02:22:32,306][33201] Updated weights for policy 0, policy_version 33390 (0.0009) [2023-10-14 02:22:32,414][33226] Updated weights for policy 1, policy_version 33670 (0.0008) [2023-10-14 02:22:32,684][33201] Updated weights for policy 0, policy_version 33400 (0.0007) [2023-10-14 02:22:32,783][33226] Updated weights for policy 1, policy_version 33680 (0.0007) [2023-10-14 02:22:33,150][33226] Updated weights for policy 1, policy_version 33690 (0.0007) [2023-10-14 02:22:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 68714496. Throughput: 0: 1746.9, 1: 1788.3. Samples: 17182772. 
Policy #0 lag: (min: 27.0, avg: 35.5, max: 59.0) [2023-10-14 02:22:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.910')] [2023-10-14 02:22:36,410][33201] Updated weights for policy 0, policy_version 33410 (0.0009) [2023-10-14 02:22:36,783][33201] Updated weights for policy 0, policy_version 33420 (0.0008) [2023-10-14 02:22:36,959][33226] Updated weights for policy 1, policy_version 33700 (0.0009) [2023-10-14 02:22:37,151][33201] Updated weights for policy 0, policy_version 33430 (0.0008) [2023-10-14 02:22:37,314][33226] Updated weights for policy 1, policy_version 33710 (0.0008) [2023-10-14 02:22:37,519][33201] Updated weights for policy 0, policy_version 33440 (0.0008) [2023-10-14 02:22:37,683][33226] Updated weights for policy 1, policy_version 33720 (0.0009) [2023-10-14 02:22:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 68780032. Throughput: 0: 1747.1, 1: 1769.1. Samples: 17204178. Policy #0 lag: (min: 27.0, avg: 35.5, max: 59.0) [2023-10-14 02:22:39,559][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:22:41,355][33201] Updated weights for policy 0, policy_version 33450 (0.0009) [2023-10-14 02:22:41,488][33226] Updated weights for policy 1, policy_version 33730 (0.0007) [2023-10-14 02:22:41,729][33201] Updated weights for policy 0, policy_version 33460 (0.0008) [2023-10-14 02:22:41,860][33226] Updated weights for policy 1, policy_version 33740 (0.0008) [2023-10-14 02:22:42,097][33201] Updated weights for policy 0, policy_version 33470 (0.0008) [2023-10-14 02:22:42,232][33226] Updated weights for policy 1, policy_version 33750 (0.0009) [2023-10-14 02:22:42,586][33226] Updated weights for policy 1, policy_version 33760 (0.0009) [2023-10-14 02:22:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 68845568. Throughput: 0: 1747.1, 1: 1784.0. Samples: 17214640. 
Policy #0 lag: (min: 27.0, avg: 35.5, max: 59.0) [2023-10-14 02:22:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:22:45,969][33201] Updated weights for policy 0, policy_version 33480 (0.0009) [2023-10-14 02:22:46,348][33201] Updated weights for policy 0, policy_version 33490 (0.0008) [2023-10-14 02:22:46,429][33226] Updated weights for policy 1, policy_version 33770 (0.0010) [2023-10-14 02:22:46,718][33201] Updated weights for policy 0, policy_version 33500 (0.0009) [2023-10-14 02:22:46,794][33226] Updated weights for policy 1, policy_version 33780 (0.0008) [2023-10-14 02:22:47,157][33226] Updated weights for policy 1, policy_version 33790 (0.0008) [2023-10-14 02:22:49,557][31953] Fps is (10 sec: 13107.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 68911104. Throughput: 0: 1743.6, 1: 1768.5. Samples: 17235862. Policy #0 lag: (min: 27.0, avg: 35.5, max: 59.0) [2023-10-14 02:22:49,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.890')] [2023-10-14 02:22:50,572][33201] Updated weights for policy 0, policy_version 33510 (0.0008) [2023-10-14 02:22:50,821][33226] Updated weights for policy 1, policy_version 33800 (0.0008) [2023-10-14 02:22:50,939][33201] Updated weights for policy 0, policy_version 33520 (0.0007) [2023-10-14 02:22:51,187][33226] Updated weights for policy 1, policy_version 33810 (0.0010) [2023-10-14 02:22:51,307][33201] Updated weights for policy 0, policy_version 33530 (0.0008) [2023-10-14 02:22:51,557][33226] Updated weights for policy 1, policy_version 33820 (0.0007) [2023-10-14 02:22:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 68976640. Throughput: 0: 1760.2, 1: 1770.4. Samples: 17258096. 
Policy #0 lag: (min: 1.0, avg: 7.7, max: 33.0) [2023-10-14 02:22:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.880')] [2023-10-14 02:22:55,122][33201] Updated weights for policy 0, policy_version 33540 (0.0009) [2023-10-14 02:22:55,394][33226] Updated weights for policy 1, policy_version 33830 (0.0009) [2023-10-14 02:22:55,499][33201] Updated weights for policy 0, policy_version 33550 (0.0009) [2023-10-14 02:22:55,753][33226] Updated weights for policy 1, policy_version 33840 (0.0007) [2023-10-14 02:22:55,863][33201] Updated weights for policy 0, policy_version 33560 (0.0008) [2023-10-14 02:22:56,124][33226] Updated weights for policy 1, policy_version 33850 (0.0008) [2023-10-14 02:22:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 69042176. Throughput: 0: 1745.1, 1: 1767.4. Samples: 17267630. Policy #0 lag: (min: 1.0, avg: 7.7, max: 33.0) [2023-10-14 02:22:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.880')] [2023-10-14 02:22:59,587][33201] Updated weights for policy 0, policy_version 33570 (0.0008) [2023-10-14 02:22:59,963][33201] Updated weights for policy 0, policy_version 33580 (0.0009) [2023-10-14 02:23:00,085][33226] Updated weights for policy 1, policy_version 33860 (0.0008) [2023-10-14 02:23:00,325][33201] Updated weights for policy 0, policy_version 33590 (0.0008) [2023-10-14 02:23:00,445][33226] Updated weights for policy 1, policy_version 33870 (0.0008) [2023-10-14 02:23:00,695][33201] Updated weights for policy 0, policy_version 33600 (0.0007) [2023-10-14 02:23:00,808][33226] Updated weights for policy 1, policy_version 33880 (0.0008) [2023-10-14 02:23:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 69107712. Throughput: 0: 1750.5, 1: 1763.0. Samples: 17289512. 
Policy #0 lag: (min: 1.0, avg: 7.7, max: 33.0) [2023-10-14 02:23:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:23:04,602][33226] Updated weights for policy 1, policy_version 33890 (0.0010) [2023-10-14 02:23:04,680][33201] Updated weights for policy 0, policy_version 33610 (0.0007) [2023-10-14 02:23:04,966][33226] Updated weights for policy 1, policy_version 33900 (0.0009) [2023-10-14 02:23:05,056][33201] Updated weights for policy 0, policy_version 33620 (0.0008) [2023-10-14 02:23:05,332][33226] Updated weights for policy 1, policy_version 33910 (0.0008) [2023-10-14 02:23:05,421][33201] Updated weights for policy 0, policy_version 33630 (0.0008) [2023-10-14 02:23:05,694][33226] Updated weights for policy 1, policy_version 33920 (0.0010) [2023-10-14 02:23:09,250][33201] Updated weights for policy 0, policy_version 33640 (0.0007) [2023-10-14 02:23:09,423][33226] Updated weights for policy 1, policy_version 33930 (0.0007) [2023-10-14 02:23:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 69173248. Throughput: 0: 1772.9, 1: 1794.7. Samples: 17311454. 
Policy #0 lag: (min: 1.0, avg: 7.7, max: 33.0) [2023-10-14 02:23:09,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:23:09,609][33201] Updated weights for policy 0, policy_version 33650 (0.0008) [2023-10-14 02:23:09,788][33226] Updated weights for policy 1, policy_version 33940 (0.0008) [2023-10-14 02:23:09,983][33201] Updated weights for policy 0, policy_version 33660 (0.0009) [2023-10-14 02:23:10,150][33226] Updated weights for policy 1, policy_version 33950 (0.0008) [2023-10-14 02:23:13,755][33201] Updated weights for policy 0, policy_version 33670 (0.0010) [2023-10-14 02:23:13,995][33226] Updated weights for policy 1, policy_version 33960 (0.0009) [2023-10-14 02:23:14,118][33201] Updated weights for policy 0, policy_version 33680 (0.0010) [2023-10-14 02:23:14,365][33226] Updated weights for policy 1, policy_version 33970 (0.0008) [2023-10-14 02:23:14,486][33201] Updated weights for policy 0, policy_version 33690 (0.0009) [2023-10-14 02:23:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 69238784. Throughput: 0: 1746.1, 1: 1768.3. Samples: 17321284. 
Policy #0 lag: (min: 1.0, avg: 7.7, max: 33.0) [2023-10-14 02:23:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:23:14,734][33226] Updated weights for policy 1, policy_version 33980 (0.0008) [2023-10-14 02:23:18,512][33201] Updated weights for policy 0, policy_version 33700 (0.0008) [2023-10-14 02:23:18,580][33226] Updated weights for policy 1, policy_version 33990 (0.0008) [2023-10-14 02:23:18,886][33201] Updated weights for policy 0, policy_version 33710 (0.0011) [2023-10-14 02:23:18,947][33226] Updated weights for policy 1, policy_version 34000 (0.0008) [2023-10-14 02:23:19,252][33201] Updated weights for policy 0, policy_version 33720 (0.0007) [2023-10-14 02:23:19,313][33226] Updated weights for policy 1, policy_version 34010 (0.0008) [2023-10-14 02:23:19,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 69369856. Throughput: 0: 1774.5, 1: 1788.1. Samples: 17343090. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 02:23:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 02:23:23,167][33201] Updated weights for policy 0, policy_version 33730 (0.0008) [2023-10-14 02:23:23,215][33226] Updated weights for policy 1, policy_version 34020 (0.0008) [2023-10-14 02:23:23,538][33201] Updated weights for policy 0, policy_version 33740 (0.0007) [2023-10-14 02:23:23,590][33226] Updated weights for policy 1, policy_version 34030 (0.0008) [2023-10-14 02:23:23,906][33201] Updated weights for policy 0, policy_version 33750 (0.0007) [2023-10-14 02:23:23,952][33226] Updated weights for policy 1, policy_version 34040 (0.0009) [2023-10-14 02:23:24,270][33201] Updated weights for policy 0, policy_version 33760 (0.0007) [2023-10-14 02:23:24,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69435392. Throughput: 0: 1747.6, 1: 1774.6. Samples: 17362678. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 02:23:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 02:23:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000033760_34570240.pth... [2023-10-14 02:23:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000034048_34865152.pth... [2023-10-14 02:23:24,598][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000032096_32866304.pth [2023-10-14 02:23:24,612][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000032384_33161216.pth [2023-10-14 02:23:27,777][33226] Updated weights for policy 1, policy_version 34050 (0.0007) [2023-10-14 02:23:28,014][33201] Updated weights for policy 0, policy_version 33770 (0.0008) [2023-10-14 02:23:28,135][33226] Updated weights for policy 1, policy_version 34060 (0.0009) [2023-10-14 02:23:28,385][33201] Updated weights for policy 0, policy_version 33780 (0.0009) [2023-10-14 02:23:28,501][33226] Updated weights for policy 1, policy_version 34070 (0.0007) [2023-10-14 02:23:28,752][33201] Updated weights for policy 0, policy_version 33790 (0.0008) [2023-10-14 02:23:28,870][33226] Updated weights for policy 1, policy_version 34080 (0.0008) [2023-10-14 02:23:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69500928. Throughput: 0: 1774.0, 1: 1779.8. Samples: 17374562. 
Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 02:23:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:23:32,708][33201] Updated weights for policy 0, policy_version 33800 (0.0009) [2023-10-14 02:23:32,745][33226] Updated weights for policy 1, policy_version 34090 (0.0009) [2023-10-14 02:23:33,085][33201] Updated weights for policy 0, policy_version 33810 (0.0008) [2023-10-14 02:23:33,125][33226] Updated weights for policy 1, policy_version 34100 (0.0008) [2023-10-14 02:23:33,450][33201] Updated weights for policy 0, policy_version 33820 (0.0008) [2023-10-14 02:23:33,484][33226] Updated weights for policy 1, policy_version 34110 (0.0008) [2023-10-14 02:23:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69566464. Throughput: 0: 1759.0, 1: 1775.6. Samples: 17394922. Policy #0 lag: (min: 31.0, avg: 35.3, max: 63.0) [2023-10-14 02:23:34,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 02:23:37,214][33226] Updated weights for policy 1, policy_version 34120 (0.0008) [2023-10-14 02:23:37,295][33201] Updated weights for policy 0, policy_version 33830 (0.0008) [2023-10-14 02:23:37,581][33226] Updated weights for policy 1, policy_version 34130 (0.0007) [2023-10-14 02:23:37,668][33201] Updated weights for policy 0, policy_version 33840 (0.0008) [2023-10-14 02:23:37,947][33226] Updated weights for policy 1, policy_version 34140 (0.0008) [2023-10-14 02:23:38,028][33201] Updated weights for policy 0, policy_version 33850 (0.0010) [2023-10-14 02:23:39,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69632000. Throughput: 0: 1742.7, 1: 1755.2. Samples: 17415502. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:23:39,559][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 02:23:41,950][33226] Updated weights for policy 1, policy_version 34150 (0.0008) [2023-10-14 02:23:41,980][33201] Updated weights for policy 0, policy_version 33860 (0.0009) [2023-10-14 02:23:42,320][33226] Updated weights for policy 1, policy_version 34160 (0.0008) [2023-10-14 02:23:42,350][33201] Updated weights for policy 0, policy_version 33870 (0.0007) [2023-10-14 02:23:42,680][33226] Updated weights for policy 1, policy_version 34170 (0.0008) [2023-10-14 02:23:42,718][33201] Updated weights for policy 0, policy_version 33880 (0.0009) [2023-10-14 02:23:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69697536. Throughput: 0: 1763.9, 1: 1780.0. Samples: 17427108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:23:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 02:23:46,409][33226] Updated weights for policy 1, policy_version 34180 (0.0008) [2023-10-14 02:23:46,577][33201] Updated weights for policy 0, policy_version 33890 (0.0008) [2023-10-14 02:23:46,764][33226] Updated weights for policy 1, policy_version 34190 (0.0008) [2023-10-14 02:23:46,948][33201] Updated weights for policy 0, policy_version 33900 (0.0008) [2023-10-14 02:23:47,138][33226] Updated weights for policy 1, policy_version 34200 (0.0008) [2023-10-14 02:23:47,325][33201] Updated weights for policy 0, policy_version 33910 (0.0007) [2023-10-14 02:23:47,688][33201] Updated weights for policy 0, policy_version 33920 (0.0008) [2023-10-14 02:23:49,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69763072. Throughput: 0: 1739.4, 1: 1751.4. Samples: 17446600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:23:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 02:23:51,021][33226] Updated weights for policy 1, policy_version 34210 (0.0008) [2023-10-14 02:23:51,388][33226] Updated weights for policy 1, policy_version 34220 (0.0008) [2023-10-14 02:23:51,662][33201] Updated weights for policy 0, policy_version 33930 (0.0008) [2023-10-14 02:23:51,755][33226] Updated weights for policy 1, policy_version 34230 (0.0007) [2023-10-14 02:23:52,036][33201] Updated weights for policy 0, policy_version 33940 (0.0007) [2023-10-14 02:23:52,123][33226] Updated weights for policy 1, policy_version 34240 (0.0007) [2023-10-14 02:23:52,408][33201] Updated weights for policy 0, policy_version 33950 (0.0007) [2023-10-14 02:23:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 69828608. Throughput: 0: 1744.3, 1: 1746.7. Samples: 17468548. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:23:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 02:23:55,971][33226] Updated weights for policy 1, policy_version 34250 (0.0008) [2023-10-14 02:23:56,241][33201] Updated weights for policy 0, policy_version 33960 (0.0007) [2023-10-14 02:23:56,333][33226] Updated weights for policy 1, policy_version 34260 (0.0008) [2023-10-14 02:23:56,599][33201] Updated weights for policy 0, policy_version 33970 (0.0007) [2023-10-14 02:23:56,702][33226] Updated weights for policy 1, policy_version 34270 (0.0008) [2023-10-14 02:23:56,967][33201] Updated weights for policy 0, policy_version 33980 (0.0009) [2023-10-14 02:23:59,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 69894144. Throughput: 0: 1741.1, 1: 1746.4. Samples: 17478226. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:23:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 02:24:00,429][33226] Updated weights for policy 1, policy_version 34280 (0.0008) [2023-10-14 02:24:00,799][33226] Updated weights for policy 1, policy_version 34290 (0.0009) [2023-10-14 02:24:00,855][33201] Updated weights for policy 0, policy_version 33990 (0.0008) [2023-10-14 02:24:01,165][33226] Updated weights for policy 1, policy_version 34300 (0.0008) [2023-10-14 02:24:01,224][33201] Updated weights for policy 0, policy_version 34000 (0.0008) [2023-10-14 02:24:01,593][33201] Updated weights for policy 0, policy_version 34010 (0.0008) [2023-10-14 02:24:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 69959680. Throughput: 0: 1742.5, 1: 1750.5. Samples: 17500278. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-14 02:24:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 02:24:04,891][33226] Updated weights for policy 1, policy_version 34310 (0.0008) [2023-10-14 02:24:05,263][33226] Updated weights for policy 1, policy_version 34320 (0.0008) [2023-10-14 02:24:05,434][33201] Updated weights for policy 0, policy_version 34020 (0.0007) [2023-10-14 02:24:05,622][33226] Updated weights for policy 1, policy_version 34330 (0.0007) [2023-10-14 02:24:05,797][33201] Updated weights for policy 0, policy_version 34030 (0.0007) [2023-10-14 02:24:06,168][33201] Updated weights for policy 0, policy_version 34040 (0.0008) [2023-10-14 02:24:09,327][33226] Updated weights for policy 1, policy_version 34340 (0.0007) [2023-10-14 02:24:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 70025216. Throughput: 0: 1769.6, 1: 1781.8. Samples: 17522492. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-14 02:24:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 02:24:09,693][33226] Updated weights for policy 1, policy_version 34350 (0.0008) [2023-10-14 02:24:09,924][33201] Updated weights for policy 0, policy_version 34050 (0.0008) [2023-10-14 02:24:10,069][33226] Updated weights for policy 1, policy_version 34360 (0.0008) [2023-10-14 02:24:10,301][33201] Updated weights for policy 0, policy_version 34060 (0.0009) [2023-10-14 02:24:10,672][33201] Updated weights for policy 0, policy_version 34070 (0.0008) [2023-10-14 02:24:11,037][33201] Updated weights for policy 0, policy_version 34080 (0.0008) [2023-10-14 02:24:13,980][33226] Updated weights for policy 1, policy_version 34370 (0.0009) [2023-10-14 02:24:14,346][33226] Updated weights for policy 1, policy_version 34380 (0.0008) [2023-10-14 02:24:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 70090752. Throughput: 0: 1741.6, 1: 1756.0. Samples: 17531956. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-14 02:24:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:24:14,712][33226] Updated weights for policy 1, policy_version 34390 (0.0008) [2023-10-14 02:24:14,835][33201] Updated weights for policy 0, policy_version 34090 (0.0009) [2023-10-14 02:24:15,077][33226] Updated weights for policy 1, policy_version 34400 (0.0008) [2023-10-14 02:24:15,205][33201] Updated weights for policy 0, policy_version 34100 (0.0009) [2023-10-14 02:24:15,572][33201] Updated weights for policy 0, policy_version 34110 (0.0010) [2023-10-14 02:24:18,955][33226] Updated weights for policy 1, policy_version 34410 (0.0009) [2023-10-14 02:24:19,290][33201] Updated weights for policy 0, policy_version 34120 (0.0009) [2023-10-14 02:24:19,321][33226] Updated weights for policy 1, policy_version 34420 (0.0008) [2023-10-14 02:24:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 14106.9). Total num frames: 70156288. Throughput: 0: 1766.2, 1: 1775.4. Samples: 17554296. 
Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-14 02:24:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:24:19,671][33201] Updated weights for policy 0, policy_version 34130 (0.0009) [2023-10-14 02:24:19,680][33226] Updated weights for policy 1, policy_version 34430 (0.0009) [2023-10-14 02:24:20,048][33201] Updated weights for policy 0, policy_version 34140 (0.0010) [2023-10-14 02:24:23,439][33226] Updated weights for policy 1, policy_version 34440 (0.0009) [2023-10-14 02:24:23,817][33226] Updated weights for policy 1, policy_version 34450 (0.0010) [2023-10-14 02:24:23,850][33201] Updated weights for policy 0, policy_version 34150 (0.0008) [2023-10-14 02:24:24,181][33226] Updated weights for policy 1, policy_version 34460 (0.0007) [2023-10-14 02:24:24,214][33201] Updated weights for policy 0, policy_version 34160 (0.0009) [2023-10-14 02:24:24,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 70254592. Throughput: 0: 1768.1, 1: 1769.4. Samples: 17574690. Policy #0 lag: (min: 31.0, avg: 37.9, max: 63.0) [2023-10-14 02:24:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.970')] [2023-10-14 02:24:24,586][33201] Updated weights for policy 0, policy_version 34170 (0.0007) [2023-10-14 02:24:28,063][33226] Updated weights for policy 1, policy_version 34470 (0.0009) [2023-10-14 02:24:28,432][33226] Updated weights for policy 1, policy_version 34480 (0.0007) [2023-10-14 02:24:28,572][33201] Updated weights for policy 0, policy_version 34180 (0.0008) [2023-10-14 02:24:28,799][33226] Updated weights for policy 1, policy_version 34490 (0.0008) [2023-10-14 02:24:28,939][33201] Updated weights for policy 0, policy_version 34190 (0.0008) [2023-10-14 02:24:29,318][33201] Updated weights for policy 0, policy_version 34200 (0.0009) [2023-10-14 02:24:29,557][31953] Fps is (10 sec: 16384.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 70320128. Throughput: 0: 1753.3, 1: 1770.4. 
Samples: 17585672. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-14 02:24:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.970')] [2023-10-14 02:24:32,518][33226] Updated weights for policy 1, policy_version 34500 (0.0008) [2023-10-14 02:24:32,884][33226] Updated weights for policy 1, policy_version 34510 (0.0007) [2023-10-14 02:24:33,158][33201] Updated weights for policy 0, policy_version 34210 (0.0008) [2023-10-14 02:24:33,257][33226] Updated weights for policy 1, policy_version 34520 (0.0007) [2023-10-14 02:24:33,528][33201] Updated weights for policy 0, policy_version 34220 (0.0009) [2023-10-14 02:24:33,897][33201] Updated weights for policy 0, policy_version 34230 (0.0008) [2023-10-14 02:24:34,269][33201] Updated weights for policy 0, policy_version 34240 (0.0009) [2023-10-14 02:24:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 70418432. Throughput: 0: 1780.3, 1: 1783.7. Samples: 17606978. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-14 02:24:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:24:37,169][33226] Updated weights for policy 1, policy_version 34530 (0.0007) [2023-10-14 02:24:37,537][33226] Updated weights for policy 1, policy_version 34540 (0.0009) [2023-10-14 02:24:37,899][33226] Updated weights for policy 1, policy_version 34550 (0.0010) [2023-10-14 02:24:38,129][33201] Updated weights for policy 0, policy_version 34250 (0.0009) [2023-10-14 02:24:38,268][33226] Updated weights for policy 1, policy_version 34560 (0.0008) [2023-10-14 02:24:38,502][33201] Updated weights for policy 0, policy_version 34260 (0.0010) [2023-10-14 02:24:38,880][33201] Updated weights for policy 0, policy_version 34270 (0.0007) [2023-10-14 02:24:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 70483968. Throughput: 0: 1751.8, 1: 1764.9. Samples: 17626800. 
Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-14 02:24:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:24:42,122][33226] Updated weights for policy 1, policy_version 34570 (0.0009) [2023-10-14 02:24:42,487][33226] Updated weights for policy 1, policy_version 34580 (0.0009) [2023-10-14 02:24:42,691][33201] Updated weights for policy 0, policy_version 34280 (0.0008) [2023-10-14 02:24:42,848][33226] Updated weights for policy 1, policy_version 34590 (0.0007) [2023-10-14 02:24:43,057][33201] Updated weights for policy 0, policy_version 34290 (0.0007) [2023-10-14 02:24:43,427][33201] Updated weights for policy 0, policy_version 34300 (0.0007) [2023-10-14 02:24:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 70549504. Throughput: 0: 1784.0, 1: 1790.3. Samples: 17639068. Policy #0 lag: (min: 31.0, avg: 36.3, max: 63.0) [2023-10-14 02:24:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:24:46,467][33226] Updated weights for policy 1, policy_version 34600 (0.0009) [2023-10-14 02:24:46,820][33226] Updated weights for policy 1, policy_version 34610 (0.0007) [2023-10-14 02:24:47,196][33226] Updated weights for policy 1, policy_version 34620 (0.0007) [2023-10-14 02:24:47,421][33201] Updated weights for policy 0, policy_version 34310 (0.0009) [2023-10-14 02:24:47,790][33201] Updated weights for policy 0, policy_version 34320 (0.0008) [2023-10-14 02:24:48,154][33201] Updated weights for policy 0, policy_version 34330 (0.0009) [2023-10-14 02:24:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 70615040. Throughput: 0: 1763.3, 1: 1766.3. Samples: 17659112. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:24:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:24:51,097][33226] Updated weights for policy 1, policy_version 34630 (0.0010) [2023-10-14 02:24:51,463][33226] Updated weights for policy 1, policy_version 34640 (0.0007) [2023-10-14 02:24:51,833][33226] Updated weights for policy 1, policy_version 34650 (0.0008) [2023-10-14 02:24:51,870][33201] Updated weights for policy 0, policy_version 34340 (0.0008) [2023-10-14 02:24:52,241][33201] Updated weights for policy 0, policy_version 34350 (0.0009) [2023-10-14 02:24:52,613][33201] Updated weights for policy 0, policy_version 34360 (0.0008) [2023-10-14 02:24:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 70680576. Throughput: 0: 1755.0, 1: 1770.0. Samples: 17681118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:24:54,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:24:55,552][33226] Updated weights for policy 1, policy_version 34660 (0.0008) [2023-10-14 02:24:55,913][33226] Updated weights for policy 1, policy_version 34670 (0.0008) [2023-10-14 02:24:56,280][33226] Updated weights for policy 1, policy_version 34680 (0.0009) [2023-10-14 02:24:56,505][33201] Updated weights for policy 0, policy_version 34370 (0.0008) [2023-10-14 02:24:56,871][33201] Updated weights for policy 0, policy_version 34380 (0.0007) [2023-10-14 02:24:57,238][33201] Updated weights for policy 0, policy_version 34390 (0.0007) [2023-10-14 02:24:57,603][33201] Updated weights for policy 0, policy_version 34400 (0.0010) [2023-10-14 02:24:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 70746112. Throughput: 0: 1768.7, 1: 1775.3. Samples: 17691438. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:24:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 02:25:00,111][33226] Updated weights for policy 1, policy_version 34690 (0.0007) [2023-10-14 02:25:00,477][33226] Updated weights for policy 1, policy_version 34700 (0.0010) [2023-10-14 02:25:00,831][33226] Updated weights for policy 1, policy_version 34710 (0.0008) [2023-10-14 02:25:01,204][33226] Updated weights for policy 1, policy_version 34720 (0.0007) [2023-10-14 02:25:01,324][33201] Updated weights for policy 0, policy_version 34410 (0.0007) [2023-10-14 02:25:01,700][33201] Updated weights for policy 0, policy_version 34420 (0.0007) [2023-10-14 02:25:02,064][33201] Updated weights for policy 0, policy_version 34430 (0.0007) [2023-10-14 02:25:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 70811648. Throughput: 0: 1753.2, 1: 1771.9. Samples: 17712926. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:25:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 02:25:05,103][33226] Updated weights for policy 1, policy_version 34730 (0.0008) [2023-10-14 02:25:05,465][33226] Updated weights for policy 1, policy_version 34740 (0.0009) [2023-10-14 02:25:05,741][33201] Updated weights for policy 0, policy_version 34440 (0.0007) [2023-10-14 02:25:05,830][33226] Updated weights for policy 1, policy_version 34750 (0.0007) [2023-10-14 02:25:06,110][33201] Updated weights for policy 0, policy_version 34450 (0.0009) [2023-10-14 02:25:06,490][33201] Updated weights for policy 0, policy_version 34460 (0.0009) [2023-10-14 02:25:09,514][33226] Updated weights for policy 1, policy_version 34760 (0.0007) [2023-10-14 02:25:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 70877184. Throughput: 0: 1773.9, 1: 1796.1. Samples: 17735340. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:25:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 02:25:09,869][33226] Updated weights for policy 1, policy_version 34770 (0.0010) [2023-10-14 02:25:10,222][33201] Updated weights for policy 0, policy_version 34470 (0.0008) [2023-10-14 02:25:10,232][33226] Updated weights for policy 1, policy_version 34780 (0.0009) [2023-10-14 02:25:10,595][33201] Updated weights for policy 0, policy_version 34480 (0.0008) [2023-10-14 02:25:10,968][33201] Updated weights for policy 0, policy_version 34490 (0.0009) [2023-10-14 02:25:14,094][33226] Updated weights for policy 1, policy_version 34790 (0.0009) [2023-10-14 02:25:14,466][33226] Updated weights for policy 1, policy_version 34800 (0.0010) [2023-10-14 02:25:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 70942720. Throughput: 0: 1764.5, 1: 1772.7. Samples: 17744848. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:25:14,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.930')] [2023-10-14 02:25:14,839][33201] Updated weights for policy 0, policy_version 34500 (0.0009) [2023-10-14 02:25:14,842][33226] Updated weights for policy 1, policy_version 34810 (0.0009) [2023-10-14 02:25:15,206][33201] Updated weights for policy 0, policy_version 34510 (0.0009) [2023-10-14 02:25:15,582][33201] Updated weights for policy 0, policy_version 34520 (0.0009) [2023-10-14 02:25:18,531][33226] Updated weights for policy 1, policy_version 34820 (0.0008) [2023-10-14 02:25:18,905][33226] Updated weights for policy 1, policy_version 34830 (0.0010) [2023-10-14 02:25:19,268][33226] Updated weights for policy 1, policy_version 34840 (0.0009) [2023-10-14 02:25:19,341][33201] Updated weights for policy 0, policy_version 34530 (0.0011) [2023-10-14 02:25:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 71008256. Throughput: 0: 1762.5, 1: 1794.1. 
Samples: 17767020. Policy #0 lag: (min: 31.0, avg: 32.7, max: 58.0) [2023-10-14 02:25:19,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.910')] [2023-10-14 02:25:19,713][33201] Updated weights for policy 0, policy_version 34540 (0.0009) [2023-10-14 02:25:20,087][33201] Updated weights for policy 0, policy_version 34550 (0.0009) [2023-10-14 02:25:20,448][33201] Updated weights for policy 0, policy_version 34560 (0.0009) [2023-10-14 02:25:23,111][33226] Updated weights for policy 1, policy_version 34850 (0.0008) [2023-10-14 02:25:23,477][33226] Updated weights for policy 1, policy_version 34860 (0.0008) [2023-10-14 02:25:23,844][33226] Updated weights for policy 1, policy_version 34870 (0.0008) [2023-10-14 02:25:24,210][33226] Updated weights for policy 1, policy_version 34880 (0.0007) [2023-10-14 02:25:24,262][33201] Updated weights for policy 0, policy_version 34570 (0.0008) [2023-10-14 02:25:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 71106560. Throughput: 0: 1791.6, 1: 1790.4. Samples: 17787994. Policy #0 lag: (min: 31.0, avg: 32.7, max: 58.0) [2023-10-14 02:25:24,559][31953] Avg episode reward: [(0, '20.670'), (1, '20.900')] [2023-10-14 02:25:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000034880_35717120.pth... [2023-10-14 02:25:24,606][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000033216_34013184.pth [2023-10-14 02:25:24,622][33201] Updated weights for policy 0, policy_version 34580 (0.0008) [2023-10-14 02:25:24,992][33201] Updated weights for policy 0, policy_version 34590 (0.0009) [2023-10-14 02:25:25,067][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000034592_35422208.pth... 
[2023-10-14 02:25:25,104][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000032928_33718272.pth [2023-10-14 02:25:28,038][33226] Updated weights for policy 1, policy_version 34890 (0.0008) [2023-10-14 02:25:28,403][33226] Updated weights for policy 1, policy_version 34900 (0.0010) [2023-10-14 02:25:28,770][33226] Updated weights for policy 1, policy_version 34910 (0.0007) [2023-10-14 02:25:28,900][33201] Updated weights for policy 0, policy_version 34600 (0.0008) [2023-10-14 02:25:29,267][33201] Updated weights for policy 0, policy_version 34610 (0.0007) [2023-10-14 02:25:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 71172096. Throughput: 0: 1759.7, 1: 1786.2. Samples: 17798636. Policy #0 lag: (min: 31.0, avg: 32.7, max: 58.0) [2023-10-14 02:25:29,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.890')] [2023-10-14 02:25:29,647][33201] Updated weights for policy 0, policy_version 34620 (0.0008) [2023-10-14 02:25:32,614][33226] Updated weights for policy 1, policy_version 34920 (0.0007) [2023-10-14 02:25:32,979][33226] Updated weights for policy 1, policy_version 34930 (0.0008) [2023-10-14 02:25:33,346][33226] Updated weights for policy 1, policy_version 34940 (0.0008) [2023-10-14 02:25:33,444][33201] Updated weights for policy 0, policy_version 34630 (0.0008) [2023-10-14 02:25:33,816][33201] Updated weights for policy 0, policy_version 34640 (0.0007) [2023-10-14 02:25:34,181][33201] Updated weights for policy 0, policy_version 34650 (0.0007) [2023-10-14 02:25:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 71270400. Throughput: 0: 1784.8, 1: 1789.4. Samples: 17819952. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 58.0) [2023-10-14 02:25:34,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.880')] [2023-10-14 02:25:37,208][33226] Updated weights for policy 1, policy_version 34950 (0.0010) [2023-10-14 02:25:37,572][33226] Updated weights for policy 1, policy_version 34960 (0.0010) [2023-10-14 02:25:37,945][33226] Updated weights for policy 1, policy_version 34970 (0.0010) [2023-10-14 02:25:38,122][33201] Updated weights for policy 0, policy_version 34660 (0.0008) [2023-10-14 02:25:38,492][33201] Updated weights for policy 0, policy_version 34670 (0.0010) [2023-10-14 02:25:38,868][33201] Updated weights for policy 0, policy_version 34680 (0.0008) [2023-10-14 02:25:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 71335936. Throughput: 0: 1763.5, 1: 1766.6. Samples: 17839974. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:25:39,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.880')] [2023-10-14 02:25:41,754][33226] Updated weights for policy 1, policy_version 34980 (0.0010) [2023-10-14 02:25:42,125][33226] Updated weights for policy 1, policy_version 34990 (0.0008) [2023-10-14 02:25:42,493][33226] Updated weights for policy 1, policy_version 35000 (0.0008) [2023-10-14 02:25:42,789][33201] Updated weights for policy 0, policy_version 34690 (0.0009) [2023-10-14 02:25:43,152][33201] Updated weights for policy 0, policy_version 34700 (0.0011) [2023-10-14 02:25:43,518][33201] Updated weights for policy 0, policy_version 34710 (0.0012) [2023-10-14 02:25:43,884][33201] Updated weights for policy 0, policy_version 34720 (0.0010) [2023-10-14 02:25:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 71401472. Throughput: 0: 1773.1, 1: 1785.5. Samples: 17851576. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:25:44,557][31953] Avg episode reward: [(0, '20.680'), (1, '20.890')] [2023-10-14 02:25:46,268][33226] Updated weights for policy 1, policy_version 35010 (0.0010) [2023-10-14 02:25:46,640][33226] Updated weights for policy 1, policy_version 35020 (0.0008) [2023-10-14 02:25:47,002][33226] Updated weights for policy 1, policy_version 35030 (0.0009) [2023-10-14 02:25:47,368][33226] Updated weights for policy 1, policy_version 35040 (0.0007) [2023-10-14 02:25:47,779][33201] Updated weights for policy 0, policy_version 34730 (0.0008) [2023-10-14 02:25:48,150][33201] Updated weights for policy 0, policy_version 34740 (0.0008) [2023-10-14 02:25:48,525][33201] Updated weights for policy 0, policy_version 34750 (0.0008) [2023-10-14 02:25:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 71467008. Throughput: 0: 1766.6, 1: 1766.6. Samples: 17871922. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:25:49,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.890')] [2023-10-14 02:25:51,134][33226] Updated weights for policy 1, policy_version 35050 (0.0010) [2023-10-14 02:25:51,498][33226] Updated weights for policy 1, policy_version 35060 (0.0010) [2023-10-14 02:25:51,863][33226] Updated weights for policy 1, policy_version 35070 (0.0012) [2023-10-14 02:25:52,401][33201] Updated weights for policy 0, policy_version 34760 (0.0007) [2023-10-14 02:25:52,780][33201] Updated weights for policy 0, policy_version 34770 (0.0007) [2023-10-14 02:25:53,145][33201] Updated weights for policy 0, policy_version 34780 (0.0007) [2023-10-14 02:25:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 71532544. Throughput: 0: 1743.5, 1: 1761.2. Samples: 17893052. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:25:54,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.880')] [2023-10-14 02:25:55,716][33226] Updated weights for policy 1, policy_version 35080 (0.0007) [2023-10-14 02:25:56,080][33226] Updated weights for policy 1, policy_version 35090 (0.0009) [2023-10-14 02:25:56,447][33226] Updated weights for policy 1, policy_version 35100 (0.0008) [2023-10-14 02:25:56,894][33201] Updated weights for policy 0, policy_version 34790 (0.0009) [2023-10-14 02:25:57,270][33201] Updated weights for policy 0, policy_version 34800 (0.0010) [2023-10-14 02:25:57,649][33201] Updated weights for policy 0, policy_version 34810 (0.0010) [2023-10-14 02:25:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 71598080. Throughput: 0: 1763.7, 1: 1762.3. Samples: 17903516. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:25:59,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.880')] [2023-10-14 02:26:00,258][33226] Updated weights for policy 1, policy_version 35110 (0.0008) [2023-10-14 02:26:00,626][33226] Updated weights for policy 1, policy_version 35120 (0.0009) [2023-10-14 02:26:00,992][33226] Updated weights for policy 1, policy_version 35130 (0.0011) [2023-10-14 02:26:01,348][33201] Updated weights for policy 0, policy_version 34820 (0.0010) [2023-10-14 02:26:01,722][33201] Updated weights for policy 0, policy_version 34830 (0.0011) [2023-10-14 02:26:02,091][33201] Updated weights for policy 0, policy_version 34840 (0.0010) [2023-10-14 02:26:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 71663616. Throughput: 0: 1747.1, 1: 1763.4. Samples: 17924992. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:04,557][31953] Avg episode reward: [(0, '20.680'), (1, '20.870')] [2023-10-14 02:26:04,857][33226] Updated weights for policy 1, policy_version 35140 (0.0010) [2023-10-14 02:26:05,222][33226] Updated weights for policy 1, policy_version 35150 (0.0010) [2023-10-14 02:26:05,591][33226] Updated weights for policy 1, policy_version 35160 (0.0009) [2023-10-14 02:26:05,946][33201] Updated weights for policy 0, policy_version 34850 (0.0011) [2023-10-14 02:26:06,318][33201] Updated weights for policy 0, policy_version 34860 (0.0011) [2023-10-14 02:26:06,683][33201] Updated weights for policy 0, policy_version 34870 (0.0008) [2023-10-14 02:26:07,050][33201] Updated weights for policy 0, policy_version 34880 (0.0007) [2023-10-14 02:26:09,382][33226] Updated weights for policy 1, policy_version 35170 (0.0009) [2023-10-14 02:26:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 71729152. Throughput: 0: 1751.1, 1: 1782.4. Samples: 17947002. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:09,557][31953] Avg episode reward: [(0, '20.690'), (1, '20.870')] [2023-10-14 02:26:09,752][33226] Updated weights for policy 1, policy_version 35180 (0.0010) [2023-10-14 02:26:10,123][33226] Updated weights for policy 1, policy_version 35190 (0.0011) [2023-10-14 02:26:10,487][33226] Updated weights for policy 1, policy_version 35200 (0.0007) [2023-10-14 02:26:10,830][33201] Updated weights for policy 0, policy_version 34890 (0.0008) [2023-10-14 02:26:11,200][33201] Updated weights for policy 0, policy_version 34900 (0.0009) [2023-10-14 02:26:11,564][33201] Updated weights for policy 0, policy_version 34910 (0.0009) [2023-10-14 02:26:14,147][33226] Updated weights for policy 1, policy_version 35210 (0.0011) [2023-10-14 02:26:14,521][33226] Updated weights for policy 1, policy_version 35220 (0.0010) [2023-10-14 02:26:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 71794688. Throughput: 0: 1750.3, 1: 1763.7. Samples: 17956762. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:14,557][31953] Avg episode reward: [(0, '20.600'), (1, '20.870')] [2023-10-14 02:26:14,890][33226] Updated weights for policy 1, policy_version 35230 (0.0007) [2023-10-14 02:26:15,316][33201] Updated weights for policy 0, policy_version 34920 (0.0007) [2023-10-14 02:26:15,699][33201] Updated weights for policy 0, policy_version 34930 (0.0008) [2023-10-14 02:26:16,077][33201] Updated weights for policy 0, policy_version 34940 (0.0009) [2023-10-14 02:26:18,689][33226] Updated weights for policy 1, policy_version 35240 (0.0008) [2023-10-14 02:26:19,064][33226] Updated weights for policy 1, policy_version 35250 (0.0007) [2023-10-14 02:26:19,436][33226] Updated weights for policy 1, policy_version 35260 (0.0008) [2023-10-14 02:26:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 71860224. Throughput: 0: 1751.7, 1: 1782.5. 
Samples: 17978992. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:19,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.880')] [2023-10-14 02:26:19,832][33201] Updated weights for policy 0, policy_version 34950 (0.0009) [2023-10-14 02:26:20,200][33201] Updated weights for policy 0, policy_version 34960 (0.0008) [2023-10-14 02:26:20,582][33201] Updated weights for policy 0, policy_version 34970 (0.0009) [2023-10-14 02:26:23,256][33226] Updated weights for policy 1, policy_version 35270 (0.0009) [2023-10-14 02:26:23,614][33226] Updated weights for policy 1, policy_version 35280 (0.0008) [2023-10-14 02:26:23,986][33226] Updated weights for policy 1, policy_version 35290 (0.0007) [2023-10-14 02:26:24,415][33201] Updated weights for policy 0, policy_version 34980 (0.0008) [2023-10-14 02:26:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 71958528. Throughput: 0: 1785.2, 1: 1775.2. Samples: 18000190. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:24,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.880')] [2023-10-14 02:26:24,783][33201] Updated weights for policy 0, policy_version 34990 (0.0008) [2023-10-14 02:26:25,153][33201] Updated weights for policy 0, policy_version 35000 (0.0008) [2023-10-14 02:26:27,683][33226] Updated weights for policy 1, policy_version 35300 (0.0008) [2023-10-14 02:26:28,055][33226] Updated weights for policy 1, policy_version 35310 (0.0009) [2023-10-14 02:26:28,425][33226] Updated weights for policy 1, policy_version 35320 (0.0008) [2023-10-14 02:26:28,931][33201] Updated weights for policy 0, policy_version 35010 (0.0007) [2023-10-14 02:26:29,315][33201] Updated weights for policy 0, policy_version 35020 (0.0009) [2023-10-14 02:26:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 72024064. Throughput: 0: 1761.9, 1: 1781.1. Samples: 18011008. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:26:29,559][31953] Avg episode reward: [(0, '20.600'), (1, '20.900')] [2023-10-14 02:26:29,691][33201] Updated weights for policy 0, policy_version 35030 (0.0010) [2023-10-14 02:26:30,054][33201] Updated weights for policy 0, policy_version 35040 (0.0010) [2023-10-14 02:26:32,323][33226] Updated weights for policy 1, policy_version 35330 (0.0008) [2023-10-14 02:26:32,691][33226] Updated weights for policy 1, policy_version 35340 (0.0011) [2023-10-14 02:26:33,054][33226] Updated weights for policy 1, policy_version 35350 (0.0011) [2023-10-14 02:26:33,424][33226] Updated weights for policy 1, policy_version 35360 (0.0010) [2023-10-14 02:26:33,870][33201] Updated weights for policy 0, policy_version 35050 (0.0009) [2023-10-14 02:26:34,239][33201] Updated weights for policy 0, policy_version 35060 (0.0009) [2023-10-14 02:26:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 72089600. Throughput: 0: 1779.8, 1: 1780.1. Samples: 18032114. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 02:26:34,557][31953] Avg episode reward: [(0, '20.600'), (1, '20.920')] [2023-10-14 02:26:34,620][33201] Updated weights for policy 0, policy_version 35070 (0.0007) [2023-10-14 02:26:37,336][33226] Updated weights for policy 1, policy_version 35370 (0.0011) [2023-10-14 02:26:37,713][33226] Updated weights for policy 1, policy_version 35380 (0.0010) [2023-10-14 02:26:38,076][33226] Updated weights for policy 1, policy_version 35390 (0.0007) [2023-10-14 02:26:38,598][33201] Updated weights for policy 0, policy_version 35080 (0.0009) [2023-10-14 02:26:38,979][33201] Updated weights for policy 0, policy_version 35090 (0.0007) [2023-10-14 02:26:39,347][33201] Updated weights for policy 0, policy_version 35100 (0.0008) [2023-10-14 02:26:39,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 72187904. Throughput: 0: 1773.4, 1: 1769.4. 
Samples: 18052480. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 02:26:39,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.920')] [2023-10-14 02:26:41,640][33226] Updated weights for policy 1, policy_version 35400 (0.0008) [2023-10-14 02:26:42,013][33226] Updated weights for policy 1, policy_version 35410 (0.0007) [2023-10-14 02:26:42,373][33226] Updated weights for policy 1, policy_version 35420 (0.0008) [2023-10-14 02:26:43,218][33201] Updated weights for policy 0, policy_version 35110 (0.0009) [2023-10-14 02:26:43,583][33201] Updated weights for policy 0, policy_version 35120 (0.0008) [2023-10-14 02:26:43,956][33201] Updated weights for policy 0, policy_version 35130 (0.0009) [2023-10-14 02:26:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 72253440. Throughput: 0: 1770.1, 1: 1788.9. Samples: 18063670. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 02:26:44,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.910')] [2023-10-14 02:26:46,097][33226] Updated weights for policy 1, policy_version 35430 (0.0008) [2023-10-14 02:26:46,456][33226] Updated weights for policy 1, policy_version 35440 (0.0008) [2023-10-14 02:26:46,830][33226] Updated weights for policy 1, policy_version 35450 (0.0009) [2023-10-14 02:26:47,585][33201] Updated weights for policy 0, policy_version 35140 (0.0010) [2023-10-14 02:26:47,966][33201] Updated weights for policy 0, policy_version 35150 (0.0011) [2023-10-14 02:26:48,328][33201] Updated weights for policy 0, policy_version 35160 (0.0009) [2023-10-14 02:26:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 72318976. Throughput: 0: 1773.8, 1: 1769.9. Samples: 18084462. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 02:26:49,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.910')] [2023-10-14 02:26:50,747][33226] Updated weights for policy 1, policy_version 35460 (0.0008) [2023-10-14 02:26:51,111][33226] Updated weights for policy 1, policy_version 35470 (0.0009) [2023-10-14 02:26:51,482][33226] Updated weights for policy 1, policy_version 35480 (0.0008) [2023-10-14 02:26:52,185][33201] Updated weights for policy 0, policy_version 35170 (0.0007) [2023-10-14 02:26:52,546][33201] Updated weights for policy 0, policy_version 35180 (0.0008) [2023-10-14 02:26:52,914][33201] Updated weights for policy 0, policy_version 35190 (0.0008) [2023-10-14 02:26:53,273][33201] Updated weights for policy 0, policy_version 35200 (0.0009) [2023-10-14 02:26:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 72384512. Throughput: 0: 1755.8, 1: 1779.9. Samples: 18106112. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:26:54,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.910')] [2023-10-14 02:26:55,345][33226] Updated weights for policy 1, policy_version 35490 (0.0010) [2023-10-14 02:26:55,712][33226] Updated weights for policy 1, policy_version 35500 (0.0008) [2023-10-14 02:26:56,082][33226] Updated weights for policy 1, policy_version 35510 (0.0007) [2023-10-14 02:26:56,449][33226] Updated weights for policy 1, policy_version 35520 (0.0008) [2023-10-14 02:26:57,081][33201] Updated weights for policy 0, policy_version 35210 (0.0007) [2023-10-14 02:26:57,442][33201] Updated weights for policy 0, policy_version 35220 (0.0007) [2023-10-14 02:26:57,814][33201] Updated weights for policy 0, policy_version 35230 (0.0008) [2023-10-14 02:26:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 72450048. Throughput: 0: 1777.1, 1: 1772.3. Samples: 18116490. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:26:59,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.890')] [2023-10-14 02:27:00,164][33226] Updated weights for policy 1, policy_version 35530 (0.0007) [2023-10-14 02:27:00,524][33226] Updated weights for policy 1, policy_version 35540 (0.0007) [2023-10-14 02:27:00,880][33226] Updated weights for policy 1, policy_version 35550 (0.0007) [2023-10-14 02:27:01,793][33201] Updated weights for policy 0, policy_version 35240 (0.0009) [2023-10-14 02:27:02,162][33201] Updated weights for policy 0, policy_version 35250 (0.0007) [2023-10-14 02:27:02,531][33201] Updated weights for policy 0, policy_version 35260 (0.0008) [2023-10-14 02:27:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 72515584. Throughput: 0: 1749.6, 1: 1778.8. Samples: 18137774. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:27:04,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.900')] [2023-10-14 02:27:04,621][33226] Updated weights for policy 1, policy_version 35560 (0.0008) [2023-10-14 02:27:04,994][33226] Updated weights for policy 1, policy_version 35570 (0.0008) [2023-10-14 02:27:05,367][33226] Updated weights for policy 1, policy_version 35580 (0.0008) [2023-10-14 02:27:06,511][33201] Updated weights for policy 0, policy_version 35270 (0.0007) [2023-10-14 02:27:06,879][33201] Updated weights for policy 0, policy_version 35280 (0.0008) [2023-10-14 02:27:07,259][33201] Updated weights for policy 0, policy_version 35290 (0.0010) [2023-10-14 02:27:09,209][33226] Updated weights for policy 1, policy_version 35590 (0.0008) [2023-10-14 02:27:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 72581120. Throughput: 0: 1744.5, 1: 1807.2. Samples: 18160016. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:27:09,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.900')] [2023-10-14 02:27:09,578][33226] Updated weights for policy 1, policy_version 35600 (0.0007) [2023-10-14 02:27:09,953][33226] Updated weights for policy 1, policy_version 35610 (0.0007) [2023-10-14 02:27:11,177][33201] Updated weights for policy 0, policy_version 35300 (0.0010) [2023-10-14 02:27:11,543][33201] Updated weights for policy 0, policy_version 35310 (0.0009) [2023-10-14 02:27:11,924][33201] Updated weights for policy 0, policy_version 35320 (0.0010) [2023-10-14 02:27:13,754][33226] Updated weights for policy 1, policy_version 35620 (0.0007) [2023-10-14 02:27:14,130][33226] Updated weights for policy 1, policy_version 35630 (0.0008) [2023-10-14 02:27:14,493][33226] Updated weights for policy 1, policy_version 35640 (0.0008) [2023-10-14 02:27:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 72646656. Throughput: 0: 1749.0, 1: 1782.4. Samples: 18169922. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:27:14,557][31953] Avg episode reward: [(0, '20.530'), (1, '20.910')] [2023-10-14 02:27:15,655][33201] Updated weights for policy 0, policy_version 35330 (0.0010) [2023-10-14 02:27:16,022][33201] Updated weights for policy 0, policy_version 35340 (0.0008) [2023-10-14 02:27:16,395][33201] Updated weights for policy 0, policy_version 35350 (0.0008) [2023-10-14 02:27:16,768][33201] Updated weights for policy 0, policy_version 35360 (0.0008) [2023-10-14 02:27:18,253][33226] Updated weights for policy 1, policy_version 35650 (0.0008) [2023-10-14 02:27:18,622][33226] Updated weights for policy 1, policy_version 35660 (0.0009) [2023-10-14 02:27:18,992][33226] Updated weights for policy 1, policy_version 35670 (0.0009) [2023-10-14 02:27:19,362][33226] Updated weights for policy 1, policy_version 35680 (0.0008) [2023-10-14 02:27:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 72744960. Throughput: 0: 1748.4, 1: 1806.8. Samples: 18192098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:27:19,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.920')] [2023-10-14 02:27:20,432][33201] Updated weights for policy 0, policy_version 35370 (0.0009) [2023-10-14 02:27:20,793][33201] Updated weights for policy 0, policy_version 35380 (0.0009) [2023-10-14 02:27:21,168][33201] Updated weights for policy 0, policy_version 35390 (0.0010) [2023-10-14 02:27:23,196][33226] Updated weights for policy 1, policy_version 35690 (0.0009) [2023-10-14 02:27:23,568][33226] Updated weights for policy 1, policy_version 35700 (0.0008) [2023-10-14 02:27:23,928][33226] Updated weights for policy 1, policy_version 35710 (0.0008) [2023-10-14 02:27:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 72810496. Throughput: 0: 1773.5, 1: 1789.6. Samples: 18212818. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:27:24,557][31953] Avg episode reward: [(0, '20.530'), (1, '20.920')] [2023-10-14 02:27:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000035712_36569088.pth... [2023-10-14 02:27:24,599][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000034048_34865152.pth [2023-10-14 02:27:24,877][33201] Updated weights for policy 0, policy_version 35400 (0.0008) [2023-10-14 02:27:25,260][33201] Updated weights for policy 0, policy_version 35410 (0.0009) [2023-10-14 02:27:25,633][33201] Updated weights for policy 0, policy_version 35420 (0.0008) [2023-10-14 02:27:25,776][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000035424_36274176.pth... [2023-10-14 02:27:25,805][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000033760_34570240.pth [2023-10-14 02:27:27,768][33226] Updated weights for policy 1, policy_version 35720 (0.0008) [2023-10-14 02:27:28,136][33226] Updated weights for policy 1, policy_version 35730 (0.0007) [2023-10-14 02:27:28,511][33226] Updated weights for policy 1, policy_version 35740 (0.0009) [2023-10-14 02:27:29,503][33201] Updated weights for policy 0, policy_version 35430 (0.0008) [2023-10-14 02:27:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 72876032. Throughput: 0: 1753.0, 1: 1798.4. Samples: 18223482. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:27:29,557][31953] Avg episode reward: [(0, '20.520'), (1, '20.920')] [2023-10-14 02:27:29,873][33201] Updated weights for policy 0, policy_version 35440 (0.0009) [2023-10-14 02:27:30,245][33201] Updated weights for policy 0, policy_version 35450 (0.0009) [2023-10-14 02:27:32,353][33226] Updated weights for policy 1, policy_version 35750 (0.0008) [2023-10-14 02:27:32,714][33226] Updated weights for policy 1, policy_version 35760 (0.0009) [2023-10-14 02:27:33,081][33226] Updated weights for policy 1, policy_version 35770 (0.0008) [2023-10-14 02:27:33,997][33201] Updated weights for policy 0, policy_version 35460 (0.0007) [2023-10-14 02:27:34,375][33201] Updated weights for policy 0, policy_version 35470 (0.0009) [2023-10-14 02:27:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 72941568. Throughput: 0: 1772.7, 1: 1787.2. Samples: 18244658. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:27:34,558][31953] Avg episode reward: [(0, '20.520'), (1, '20.920')] [2023-10-14 02:27:34,752][33201] Updated weights for policy 0, policy_version 35480 (0.0007) [2023-10-14 02:27:36,882][33226] Updated weights for policy 1, policy_version 35780 (0.0008) [2023-10-14 02:27:37,257][33226] Updated weights for policy 1, policy_version 35790 (0.0007) [2023-10-14 02:27:37,621][33226] Updated weights for policy 1, policy_version 35800 (0.0007) [2023-10-14 02:27:38,851][33201] Updated weights for policy 0, policy_version 35490 (0.0008) [2023-10-14 02:27:39,221][33201] Updated weights for policy 0, policy_version 35500 (0.0010) [2023-10-14 02:27:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 73007104. Throughput: 0: 1777.9, 1: 1776.4. Samples: 18266054. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:27:39,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.920')] [2023-10-14 02:27:39,595][33201] Updated weights for policy 0, policy_version 35510 (0.0011) [2023-10-14 02:27:39,966][33201] Updated weights for policy 0, policy_version 35520 (0.0010) [2023-10-14 02:27:41,375][33226] Updated weights for policy 1, policy_version 35810 (0.0007) [2023-10-14 02:27:41,731][33226] Updated weights for policy 1, policy_version 35820 (0.0007) [2023-10-14 02:27:42,100][33226] Updated weights for policy 1, policy_version 35830 (0.0009) [2023-10-14 02:27:42,462][33226] Updated weights for policy 1, policy_version 35840 (0.0009) [2023-10-14 02:27:43,824][33201] Updated weights for policy 0, policy_version 35530 (0.0010) [2023-10-14 02:27:44,203][33201] Updated weights for policy 0, policy_version 35540 (0.0008) [2023-10-14 02:27:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 73072640. Throughput: 0: 1762.0, 1: 1801.2. Samples: 18276836. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:27:44,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.930')] [2023-10-14 02:27:44,572][33201] Updated weights for policy 0, policy_version 35550 (0.0008) [2023-10-14 02:27:46,085][33226] Updated weights for policy 1, policy_version 35850 (0.0008) [2023-10-14 02:27:46,457][33226] Updated weights for policy 1, policy_version 35860 (0.0010) [2023-10-14 02:27:46,829][33226] Updated weights for policy 1, policy_version 35870 (0.0007) [2023-10-14 02:27:48,450][33201] Updated weights for policy 0, policy_version 35560 (0.0010) [2023-10-14 02:27:48,826][33201] Updated weights for policy 0, policy_version 35570 (0.0007) [2023-10-14 02:27:49,196][33201] Updated weights for policy 0, policy_version 35580 (0.0008) [2023-10-14 02:27:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 73170944. Throughput: 0: 1787.8, 1: 1780.0. 
Samples: 18298326. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:27:49,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.940')] [2023-10-14 02:27:50,748][33226] Updated weights for policy 1, policy_version 35880 (0.0007) [2023-10-14 02:27:51,103][33226] Updated weights for policy 1, policy_version 35890 (0.0010) [2023-10-14 02:27:51,472][33226] Updated weights for policy 1, policy_version 35900 (0.0009) [2023-10-14 02:27:52,983][33201] Updated weights for policy 0, policy_version 35590 (0.0008) [2023-10-14 02:27:53,343][33201] Updated weights for policy 0, policy_version 35600 (0.0008) [2023-10-14 02:27:53,711][33201] Updated weights for policy 0, policy_version 35610 (0.0008) [2023-10-14 02:27:54,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 73236480. Throughput: 0: 1752.7, 1: 1780.8. Samples: 18319026. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:27:54,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.940')] [2023-10-14 02:27:55,098][33226] Updated weights for policy 1, policy_version 35910 (0.0007) [2023-10-14 02:27:55,455][33226] Updated weights for policy 1, policy_version 35920 (0.0008) [2023-10-14 02:27:55,828][33226] Updated weights for policy 1, policy_version 35930 (0.0008) [2023-10-14 02:27:57,540][33201] Updated weights for policy 0, policy_version 35620 (0.0009) [2023-10-14 02:27:57,905][33201] Updated weights for policy 0, policy_version 35630 (0.0008) [2023-10-14 02:27:58,279][33201] Updated weights for policy 0, policy_version 35640 (0.0008) [2023-10-14 02:27:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 73302016. Throughput: 0: 1782.7, 1: 1776.6. Samples: 18330092. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:27:59,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.920')] [2023-10-14 02:27:59,815][33226] Updated weights for policy 1, policy_version 35940 (0.0009) [2023-10-14 02:28:00,179][33226] Updated weights for policy 1, policy_version 35950 (0.0008) [2023-10-14 02:28:00,555][33226] Updated weights for policy 1, policy_version 35960 (0.0009) [2023-10-14 02:28:02,029][33201] Updated weights for policy 0, policy_version 35650 (0.0008) [2023-10-14 02:28:02,417][33201] Updated weights for policy 0, policy_version 35660 (0.0007) [2023-10-14 02:28:02,790][33201] Updated weights for policy 0, policy_version 35670 (0.0010) [2023-10-14 02:28:03,151][33201] Updated weights for policy 0, policy_version 35680 (0.0010) [2023-10-14 02:28:04,306][33226] Updated weights for policy 1, policy_version 35970 (0.0009) [2023-10-14 02:28:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 73367552. Throughput: 0: 1752.3, 1: 1777.5. Samples: 18350938. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:28:04,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.920')] [2023-10-14 02:28:04,672][33226] Updated weights for policy 1, policy_version 35980 (0.0007) [2023-10-14 02:28:05,046][33226] Updated weights for policy 1, policy_version 35990 (0.0007) [2023-10-14 02:28:05,422][33226] Updated weights for policy 1, policy_version 36000 (0.0008) [2023-10-14 02:28:06,836][33201] Updated weights for policy 0, policy_version 35690 (0.0011) [2023-10-14 02:28:07,210][33201] Updated weights for policy 0, policy_version 35700 (0.0010) [2023-10-14 02:28:07,582][33201] Updated weights for policy 0, policy_version 35710 (0.0011) [2023-10-14 02:28:09,205][33226] Updated weights for policy 1, policy_version 36010 (0.0008) [2023-10-14 02:28:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 73433088. Throughput: 0: 1748.3, 1: 1804.2. 
Samples: 18372680. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:28:09,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.920')] [2023-10-14 02:28:09,592][33226] Updated weights for policy 1, policy_version 36020 (0.0009) [2023-10-14 02:28:09,954][33226] Updated weights for policy 1, policy_version 36030 (0.0008) [2023-10-14 02:28:11,400][33201] Updated weights for policy 0, policy_version 35720 (0.0009) [2023-10-14 02:28:11,781][33201] Updated weights for policy 0, policy_version 35730 (0.0007) [2023-10-14 02:28:12,154][33201] Updated weights for policy 0, policy_version 35740 (0.0007) [2023-10-14 02:28:13,680][33226] Updated weights for policy 1, policy_version 36040 (0.0009) [2023-10-14 02:28:14,041][33226] Updated weights for policy 1, policy_version 36050 (0.0009) [2023-10-14 02:28:14,410][33226] Updated weights for policy 1, policy_version 36060 (0.0007) [2023-10-14 02:28:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 73498624. Throughput: 0: 1758.9, 1: 1778.0. Samples: 18382644. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 02:28:14,558][31953] Avg episode reward: [(0, '20.780'), (1, '20.900')] [2023-10-14 02:28:16,158][33201] Updated weights for policy 0, policy_version 35750 (0.0009) [2023-10-14 02:28:16,529][33201] Updated weights for policy 0, policy_version 35760 (0.0008) [2023-10-14 02:28:16,893][33201] Updated weights for policy 0, policy_version 35770 (0.0008) [2023-10-14 02:28:18,173][33226] Updated weights for policy 1, policy_version 36070 (0.0008) [2023-10-14 02:28:18,542][33226] Updated weights for policy 1, policy_version 36080 (0.0007) [2023-10-14 02:28:18,916][33226] Updated weights for policy 1, policy_version 36090 (0.0008) [2023-10-14 02:28:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 73596928. Throughput: 0: 1748.0, 1: 1806.1. Samples: 18404594. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:28:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:28:20,826][33201] Updated weights for policy 0, policy_version 35780 (0.0010) [2023-10-14 02:28:21,189][33201] Updated weights for policy 0, policy_version 35790 (0.0010) [2023-10-14 02:28:21,560][33201] Updated weights for policy 0, policy_version 35800 (0.0010) [2023-10-14 02:28:22,562][33226] Updated weights for policy 1, policy_version 36100 (0.0008) [2023-10-14 02:28:22,929][33226] Updated weights for policy 1, policy_version 36110 (0.0009) [2023-10-14 02:28:23,306][33226] Updated weights for policy 1, policy_version 36120 (0.0007) [2023-10-14 02:28:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 73662464. Throughput: 0: 1757.5, 1: 1786.1. Samples: 18425514. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:28:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 02:28:25,500][33201] Updated weights for policy 0, policy_version 35810 (0.0010) [2023-10-14 02:28:25,872][33201] Updated weights for policy 0, policy_version 35820 (0.0012) [2023-10-14 02:28:26,243][33201] Updated weights for policy 0, policy_version 35830 (0.0008) [2023-10-14 02:28:26,622][33201] Updated weights for policy 0, policy_version 35840 (0.0007) [2023-10-14 02:28:27,010][33226] Updated weights for policy 1, policy_version 36130 (0.0007) [2023-10-14 02:28:27,386][33226] Updated weights for policy 1, policy_version 36140 (0.0010) [2023-10-14 02:28:27,740][33226] Updated weights for policy 1, policy_version 36150 (0.0009) [2023-10-14 02:28:28,103][33226] Updated weights for policy 1, policy_version 36160 (0.0011) [2023-10-14 02:28:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 73728000. Throughput: 0: 1746.8, 1: 1794.0. Samples: 18436176. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:28:29,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 02:28:30,408][33201] Updated weights for policy 0, policy_version 35850 (0.0007) [2023-10-14 02:28:30,779][33201] Updated weights for policy 0, policy_version 35860 (0.0010) [2023-10-14 02:28:31,149][33201] Updated weights for policy 0, policy_version 35870 (0.0010) [2023-10-14 02:28:32,099][33226] Updated weights for policy 1, policy_version 36170 (0.0008) [2023-10-14 02:28:32,461][33226] Updated weights for policy 1, policy_version 36180 (0.0008) [2023-10-14 02:28:32,835][33226] Updated weights for policy 1, policy_version 36190 (0.0009) [2023-10-14 02:28:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 73793536. Throughput: 0: 1744.3, 1: 1773.0. Samples: 18456604. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:28:34,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 02:28:35,113][33201] Updated weights for policy 0, policy_version 35880 (0.0008) [2023-10-14 02:28:35,487][33201] Updated weights for policy 0, policy_version 35890 (0.0010) [2023-10-14 02:28:35,856][33201] Updated weights for policy 0, policy_version 35900 (0.0010) [2023-10-14 02:28:36,671][33226] Updated weights for policy 1, policy_version 36200 (0.0010) [2023-10-14 02:28:37,037][33226] Updated weights for policy 1, policy_version 36210 (0.0008) [2023-10-14 02:28:37,409][33226] Updated weights for policy 1, policy_version 36220 (0.0007) [2023-10-14 02:28:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 73859072. Throughput: 0: 1777.1, 1: 1772.8. Samples: 18478770. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:28:39,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 02:28:39,742][33201] Updated weights for policy 0, policy_version 35910 (0.0008) [2023-10-14 02:28:40,110][33201] Updated weights for policy 0, policy_version 35920 (0.0008) [2023-10-14 02:28:40,490][33201] Updated weights for policy 0, policy_version 35930 (0.0008) [2023-10-14 02:28:41,067][33226] Updated weights for policy 1, policy_version 36230 (0.0007) [2023-10-14 02:28:41,433][33226] Updated weights for policy 1, policy_version 36240 (0.0008) [2023-10-14 02:28:41,805][33226] Updated weights for policy 1, policy_version 36250 (0.0007) [2023-10-14 02:28:44,216][33201] Updated weights for policy 0, policy_version 35940 (0.0008) [2023-10-14 02:28:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 73924608. Throughput: 0: 1742.4, 1: 1782.3. Samples: 18488702. Policy #0 lag: (min: 16.0, avg: 38.7, max: 40.0) [2023-10-14 02:28:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 02:28:44,599][33201] Updated weights for policy 0, policy_version 35950 (0.0009) [2023-10-14 02:28:44,970][33201] Updated weights for policy 0, policy_version 35960 (0.0011) [2023-10-14 02:28:45,338][33226] Updated weights for policy 1, policy_version 36260 (0.0009) [2023-10-14 02:28:45,710][33226] Updated weights for policy 1, policy_version 36270 (0.0009) [2023-10-14 02:28:46,071][33226] Updated weights for policy 1, policy_version 36280 (0.0007) [2023-10-14 02:28:48,824][33201] Updated weights for policy 0, policy_version 35970 (0.0010) [2023-10-14 02:28:49,194][33201] Updated weights for policy 0, policy_version 35980 (0.0007) [2023-10-14 02:28:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 73990144. Throughput: 0: 1771.2, 1: 1786.1. Samples: 18511018. 
Policy #0 lag: (min: 16.0, avg: 38.7, max: 40.0) [2023-10-14 02:28:49,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 02:28:49,571][33201] Updated weights for policy 0, policy_version 35990 (0.0008) [2023-10-14 02:28:49,833][33226] Updated weights for policy 1, policy_version 36290 (0.0008) [2023-10-14 02:28:49,939][33201] Updated weights for policy 0, policy_version 36000 (0.0009) [2023-10-14 02:28:50,201][33226] Updated weights for policy 1, policy_version 36300 (0.0009) [2023-10-14 02:28:50,573][33226] Updated weights for policy 1, policy_version 36310 (0.0009) [2023-10-14 02:28:50,941][33226] Updated weights for policy 1, policy_version 36320 (0.0008) [2023-10-14 02:28:53,522][33201] Updated weights for policy 0, policy_version 36010 (0.0008) [2023-10-14 02:28:53,891][33201] Updated weights for policy 0, policy_version 36020 (0.0007) [2023-10-14 02:28:54,270][33201] Updated weights for policy 0, policy_version 36030 (0.0007) [2023-10-14 02:28:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 74088448. Throughput: 0: 1754.2, 1: 1797.4. Samples: 18532500. 
Policy #0 lag: (min: 16.0, avg: 38.7, max: 40.0) [2023-10-14 02:28:54,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.950')] [2023-10-14 02:28:54,807][33226] Updated weights for policy 1, policy_version 36330 (0.0009) [2023-10-14 02:28:55,181][33226] Updated weights for policy 1, policy_version 36340 (0.0008) [2023-10-14 02:28:55,562][33226] Updated weights for policy 1, policy_version 36350 (0.0007) [2023-10-14 02:28:58,176][33201] Updated weights for policy 0, policy_version 36040 (0.0009) [2023-10-14 02:28:58,549][33201] Updated weights for policy 0, policy_version 36050 (0.0007) [2023-10-14 02:28:58,919][33201] Updated weights for policy 0, policy_version 36060 (0.0008) [2023-10-14 02:28:58,992][33226] Updated weights for policy 1, policy_version 36360 (0.0007) [2023-10-14 02:28:59,361][33226] Updated weights for policy 1, policy_version 36370 (0.0011) [2023-10-14 02:28:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 74153984. Throughput: 0: 1772.8, 1: 1792.6. Samples: 18543088. Policy #0 lag: (min: 16.0, avg: 38.7, max: 40.0) [2023-10-14 02:28:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 02:28:59,726][33226] Updated weights for policy 1, policy_version 36380 (0.0010) [2023-10-14 02:29:02,874][33201] Updated weights for policy 0, policy_version 36070 (0.0008) [2023-10-14 02:29:03,241][33201] Updated weights for policy 0, policy_version 36080 (0.0007) [2023-10-14 02:29:03,551][33226] Updated weights for policy 1, policy_version 36390 (0.0009) [2023-10-14 02:29:03,617][33201] Updated weights for policy 0, policy_version 36090 (0.0008) [2023-10-14 02:29:03,924][33226] Updated weights for policy 1, policy_version 36400 (0.0010) [2023-10-14 02:29:04,294][33226] Updated weights for policy 1, policy_version 36410 (0.0011) [2023-10-14 02:29:04,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 74252288. Throughput: 0: 1768.8, 1: 1792.2. 
Samples: 18564838. Policy #0 lag: (min: 16.0, avg: 38.7, max: 40.0) [2023-10-14 02:29:04,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 02:29:07,288][33201] Updated weights for policy 0, policy_version 36100 (0.0009) [2023-10-14 02:29:07,659][33201] Updated weights for policy 0, policy_version 36110 (0.0008) [2023-10-14 02:29:08,030][33201] Updated weights for policy 0, policy_version 36120 (0.0008) [2023-10-14 02:29:08,049][33226] Updated weights for policy 1, policy_version 36420 (0.0009) [2023-10-14 02:29:08,424][33226] Updated weights for policy 1, policy_version 36430 (0.0009) [2023-10-14 02:29:08,790][33226] Updated weights for policy 1, policy_version 36440 (0.0008) [2023-10-14 02:29:09,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14329.0). Total num frames: 74317824. Throughput: 0: 1751.5, 1: 1793.7. Samples: 18585046. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:09,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 02:29:11,861][33201] Updated weights for policy 0, policy_version 36130 (0.0009) [2023-10-14 02:29:12,244][33201] Updated weights for policy 0, policy_version 36140 (0.0008) [2023-10-14 02:29:12,599][33226] Updated weights for policy 1, policy_version 36450 (0.0009) [2023-10-14 02:29:12,611][33201] Updated weights for policy 0, policy_version 36150 (0.0008) [2023-10-14 02:29:12,968][33226] Updated weights for policy 1, policy_version 36460 (0.0008) [2023-10-14 02:29:12,975][33201] Updated weights for policy 0, policy_version 36160 (0.0010) [2023-10-14 02:29:13,330][33226] Updated weights for policy 1, policy_version 36470 (0.0008) [2023-10-14 02:29:13,693][33226] Updated weights for policy 1, policy_version 36480 (0.0009) [2023-10-14 02:29:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 74383360. Throughput: 0: 1775.9, 1: 1791.5. Samples: 18596710. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:29:16,848][33201] Updated weights for policy 0, policy_version 36170 (0.0009) [2023-10-14 02:29:17,211][33201] Updated weights for policy 0, policy_version 36180 (0.0008) [2023-10-14 02:29:17,586][33201] Updated weights for policy 0, policy_version 36190 (0.0008) [2023-10-14 02:29:17,588][33226] Updated weights for policy 1, policy_version 36490 (0.0008) [2023-10-14 02:29:17,958][33226] Updated weights for policy 1, policy_version 36500 (0.0009) [2023-10-14 02:29:18,331][33226] Updated weights for policy 1, policy_version 36510 (0.0009) [2023-10-14 02:29:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 74448896. Throughput: 0: 1753.6, 1: 1812.1. Samples: 18617058. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:29:21,340][33201] Updated weights for policy 0, policy_version 36200 (0.0009) [2023-10-14 02:29:21,707][33201] Updated weights for policy 0, policy_version 36210 (0.0008) [2023-10-14 02:29:22,074][33201] Updated weights for policy 0, policy_version 36220 (0.0008) [2023-10-14 02:29:22,107][33226] Updated weights for policy 1, policy_version 36520 (0.0007) [2023-10-14 02:29:22,481][33226] Updated weights for policy 1, policy_version 36530 (0.0009) [2023-10-14 02:29:22,847][33226] Updated weights for policy 1, policy_version 36540 (0.0009) [2023-10-14 02:29:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 74514432. Throughput: 0: 1758.7, 1: 1794.3. Samples: 18638654. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:29:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000036544_37421056.pth... 
[2023-10-14 02:29:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000036224_37093376.pth... [2023-10-14 02:29:24,606][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000034880_35717120.pth [2023-10-14 02:29:24,612][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000034592_35422208.pth [2023-10-14 02:29:26,042][33201] Updated weights for policy 0, policy_version 36230 (0.0009) [2023-10-14 02:29:26,408][33201] Updated weights for policy 0, policy_version 36240 (0.0008) [2023-10-14 02:29:26,682][33226] Updated weights for policy 1, policy_version 36550 (0.0011) [2023-10-14 02:29:26,778][33201] Updated weights for policy 0, policy_version 36250 (0.0009) [2023-10-14 02:29:27,053][33226] Updated weights for policy 1, policy_version 36560 (0.0009) [2023-10-14 02:29:27,413][33226] Updated weights for policy 1, policy_version 36570 (0.0008) [2023-10-14 02:29:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 74579968. Throughput: 0: 1752.8, 1: 1806.9. Samples: 18648886. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 02:29:30,579][33201] Updated weights for policy 0, policy_version 36260 (0.0009) [2023-10-14 02:29:30,957][33201] Updated weights for policy 0, policy_version 36270 (0.0007) [2023-10-14 02:29:31,318][33226] Updated weights for policy 1, policy_version 36580 (0.0008) [2023-10-14 02:29:31,319][33201] Updated weights for policy 0, policy_version 36280 (0.0009) [2023-10-14 02:29:31,694][33226] Updated weights for policy 1, policy_version 36590 (0.0010) [2023-10-14 02:29:32,056][33226] Updated weights for policy 1, policy_version 36600 (0.0008) [2023-10-14 02:29:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 74645504. Throughput: 0: 1757.2, 1: 1780.8. Samples: 18670228. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:29:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 02:29:35,083][33201] Updated weights for policy 0, policy_version 36290 (0.0008) [2023-10-14 02:29:35,456][33201] Updated weights for policy 0, policy_version 36300 (0.0008) [2023-10-14 02:29:35,835][33201] Updated weights for policy 0, policy_version 36310 (0.0007) [2023-10-14 02:29:35,852][33226] Updated weights for policy 1, policy_version 36610 (0.0010) [2023-10-14 02:29:36,199][33201] Updated weights for policy 0, policy_version 36320 (0.0008) [2023-10-14 02:29:36,226][33226] Updated weights for policy 1, policy_version 36620 (0.0010) [2023-10-14 02:29:36,588][33226] Updated weights for policy 1, policy_version 36630 (0.0009) [2023-10-14 02:29:36,951][33226] Updated weights for policy 1, policy_version 36640 (0.0008) [2023-10-14 02:29:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 74711040. Throughput: 0: 1777.1, 1: 1775.5. Samples: 18692368. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) [2023-10-14 02:29:39,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 02:29:39,937][33201] Updated weights for policy 0, policy_version 36330 (0.0007) [2023-10-14 02:29:40,307][33201] Updated weights for policy 0, policy_version 36340 (0.0009) [2023-10-14 02:29:40,678][33201] Updated weights for policy 0, policy_version 36350 (0.0007) [2023-10-14 02:29:40,796][33226] Updated weights for policy 1, policy_version 36650 (0.0008) [2023-10-14 02:29:41,179][33226] Updated weights for policy 1, policy_version 36660 (0.0011) [2023-10-14 02:29:41,543][33226] Updated weights for policy 1, policy_version 36670 (0.0009) [2023-10-14 02:29:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 74776576. Throughput: 0: 1753.0, 1: 1775.2. Samples: 18701856. 
Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) [2023-10-14 02:29:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 02:29:44,814][33201] Updated weights for policy 0, policy_version 36360 (0.0007) [2023-10-14 02:29:45,189][33201] Updated weights for policy 0, policy_version 36370 (0.0007) [2023-10-14 02:29:45,334][33226] Updated weights for policy 1, policy_version 36680 (0.0008) [2023-10-14 02:29:45,562][33201] Updated weights for policy 0, policy_version 36380 (0.0009) [2023-10-14 02:29:45,695][33226] Updated weights for policy 1, policy_version 36690 (0.0009) [2023-10-14 02:29:46,062][33226] Updated weights for policy 1, policy_version 36700 (0.0011) [2023-10-14 02:29:49,410][33201] Updated weights for policy 0, policy_version 36390 (0.0010) [2023-10-14 02:29:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 74842112. Throughput: 0: 1764.4, 1: 1772.1. Samples: 18723980. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) [2023-10-14 02:29:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 02:29:49,782][33201] Updated weights for policy 0, policy_version 36400 (0.0007) [2023-10-14 02:29:49,921][33226] Updated weights for policy 1, policy_version 36710 (0.0009) [2023-10-14 02:29:50,163][33201] Updated weights for policy 0, policy_version 36410 (0.0008) [2023-10-14 02:29:50,290][33226] Updated weights for policy 1, policy_version 36720 (0.0008) [2023-10-14 02:29:50,658][33226] Updated weights for policy 1, policy_version 36730 (0.0010) [2023-10-14 02:29:53,966][33201] Updated weights for policy 0, policy_version 36420 (0.0010) [2023-10-14 02:29:54,335][33201] Updated weights for policy 0, policy_version 36430 (0.0007) [2023-10-14 02:29:54,424][33226] Updated weights for policy 1, policy_version 36740 (0.0009) [2023-10-14 02:29:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 74907648. Throughput: 0: 1774.9, 1: 1795.2. 
Samples: 18745698. Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) [2023-10-14 02:29:54,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 02:29:54,698][33201] Updated weights for policy 0, policy_version 36440 (0.0008) [2023-10-14 02:29:54,787][33226] Updated weights for policy 1, policy_version 36750 (0.0010) [2023-10-14 02:29:55,152][33226] Updated weights for policy 1, policy_version 36760 (0.0008) [2023-10-14 02:29:58,363][33201] Updated weights for policy 0, policy_version 36450 (0.0009) [2023-10-14 02:29:58,749][33201] Updated weights for policy 0, policy_version 36460 (0.0008) [2023-10-14 02:29:58,934][33226] Updated weights for policy 1, policy_version 36770 (0.0007) [2023-10-14 02:29:59,115][33201] Updated weights for policy 0, policy_version 36470 (0.0008) [2023-10-14 02:29:59,301][33226] Updated weights for policy 1, policy_version 36780 (0.0008) [2023-10-14 02:29:59,487][33201] Updated weights for policy 0, policy_version 36480 (0.0009) [2023-10-14 02:29:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 75005952. Throughput: 0: 1764.8, 1: 1768.8. Samples: 18755720. 
Policy #0 lag: (min: 16.0, avg: 43.8, max: 48.0) [2023-10-14 02:29:59,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:29:59,671][33226] Updated weights for policy 1, policy_version 36790 (0.0008) [2023-10-14 02:30:00,032][33226] Updated weights for policy 1, policy_version 36800 (0.0010) [2023-10-14 02:30:03,318][33201] Updated weights for policy 0, policy_version 36490 (0.0008) [2023-10-14 02:30:03,690][33201] Updated weights for policy 0, policy_version 36500 (0.0007) [2023-10-14 02:30:03,849][33226] Updated weights for policy 1, policy_version 36810 (0.0009) [2023-10-14 02:30:04,060][33201] Updated weights for policy 0, policy_version 36510 (0.0008) [2023-10-14 02:30:04,216][33226] Updated weights for policy 1, policy_version 36820 (0.0008) [2023-10-14 02:30:04,557][31953] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 75071488. Throughput: 0: 1781.5, 1: 1786.6. Samples: 18777622. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:30:04,594][33226] Updated weights for policy 1, policy_version 36830 (0.0007) [2023-10-14 02:30:07,809][33201] Updated weights for policy 0, policy_version 36520 (0.0010) [2023-10-14 02:30:08,173][33201] Updated weights for policy 0, policy_version 36530 (0.0009) [2023-10-14 02:30:08,449][33226] Updated weights for policy 1, policy_version 36840 (0.0009) [2023-10-14 02:30:08,541][33201] Updated weights for policy 0, policy_version 36540 (0.0007) [2023-10-14 02:30:08,810][33226] Updated weights for policy 1, policy_version 36850 (0.0007) [2023-10-14 02:30:09,177][33226] Updated weights for policy 1, policy_version 36860 (0.0007) [2023-10-14 02:30:09,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 75169792. Throughput: 0: 1752.1, 1: 1776.3. Samples: 18797432. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:30:12,258][33201] Updated weights for policy 0, policy_version 36550 (0.0009) [2023-10-14 02:30:12,629][33201] Updated weights for policy 0, policy_version 36560 (0.0011) [2023-10-14 02:30:12,982][33226] Updated weights for policy 1, policy_version 36870 (0.0007) [2023-10-14 02:30:12,994][33201] Updated weights for policy 0, policy_version 36570 (0.0010) [2023-10-14 02:30:13,351][33226] Updated weights for policy 1, policy_version 36880 (0.0007) [2023-10-14 02:30:13,724][33226] Updated weights for policy 1, policy_version 36890 (0.0007) [2023-10-14 02:30:14,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 75235328. Throughput: 0: 1790.7, 1: 1774.4. Samples: 18809318. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:30:16,824][33201] Updated weights for policy 0, policy_version 36580 (0.0011) [2023-10-14 02:30:17,188][33201] Updated weights for policy 0, policy_version 36590 (0.0010) [2023-10-14 02:30:17,530][33226] Updated weights for policy 1, policy_version 36900 (0.0010) [2023-10-14 02:30:17,552][33201] Updated weights for policy 0, policy_version 36600 (0.0008) [2023-10-14 02:30:17,900][33226] Updated weights for policy 1, policy_version 36910 (0.0008) [2023-10-14 02:30:18,267][33226] Updated weights for policy 1, policy_version 36920 (0.0008) [2023-10-14 02:30:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 75300864. Throughput: 0: 1758.0, 1: 1779.2. Samples: 18829404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.900')] [2023-10-14 02:30:21,460][33201] Updated weights for policy 0, policy_version 36610 (0.0008) [2023-10-14 02:30:21,823][33201] Updated weights for policy 0, policy_version 36620 (0.0007) [2023-10-14 02:30:22,046][33226] Updated weights for policy 1, policy_version 36930 (0.0009) [2023-10-14 02:30:22,201][33201] Updated weights for policy 0, policy_version 36630 (0.0007) [2023-10-14 02:30:22,415][33226] Updated weights for policy 1, policy_version 36940 (0.0007) [2023-10-14 02:30:22,562][33201] Updated weights for policy 0, policy_version 36640 (0.0007) [2023-10-14 02:30:22,779][33226] Updated weights for policy 1, policy_version 36950 (0.0007) [2023-10-14 02:30:23,148][33226] Updated weights for policy 1, policy_version 36960 (0.0007) [2023-10-14 02:30:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 75366400. Throughput: 0: 1757.3, 1: 1764.0. Samples: 18850828. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:30:26,334][33201] Updated weights for policy 0, policy_version 36650 (0.0008) [2023-10-14 02:30:26,702][33201] Updated weights for policy 0, policy_version 36660 (0.0009) [2023-10-14 02:30:26,965][33226] Updated weights for policy 1, policy_version 36970 (0.0009) [2023-10-14 02:30:27,072][33201] Updated weights for policy 0, policy_version 36670 (0.0008) [2023-10-14 02:30:27,344][33226] Updated weights for policy 1, policy_version 36980 (0.0007) [2023-10-14 02:30:27,713][33226] Updated weights for policy 1, policy_version 36990 (0.0010) [2023-10-14 02:30:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 75431936. Throughput: 0: 1758.9, 1: 1784.7. Samples: 18861318. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:30:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 02:30:31,064][33201] Updated weights for policy 0, policy_version 36680 (0.0008) [2023-10-14 02:30:31,425][33201] Updated weights for policy 0, policy_version 36690 (0.0010) [2023-10-14 02:30:31,531][33226] Updated weights for policy 1, policy_version 37000 (0.0008) [2023-10-14 02:30:31,803][33201] Updated weights for policy 0, policy_version 36700 (0.0008) [2023-10-14 02:30:31,896][33226] Updated weights for policy 1, policy_version 37010 (0.0008) [2023-10-14 02:30:32,257][33226] Updated weights for policy 1, policy_version 37020 (0.0009) [2023-10-14 02:30:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 75497472. Throughput: 0: 1753.9, 1: 1760.6. Samples: 18882130. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.850')] [2023-10-14 02:30:35,827][33201] Updated weights for policy 0, policy_version 36710 (0.0008) [2023-10-14 02:30:35,937][33226] Updated weights for policy 1, policy_version 37030 (0.0008) [2023-10-14 02:30:36,198][33201] Updated weights for policy 0, policy_version 36720 (0.0008) [2023-10-14 02:30:36,301][33226] Updated weights for policy 1, policy_version 37040 (0.0009) [2023-10-14 02:30:36,572][33201] Updated weights for policy 0, policy_version 36730 (0.0008) [2023-10-14 02:30:36,670][33226] Updated weights for policy 1, policy_version 37050 (0.0008) [2023-10-14 02:30:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 75563008. Throughput: 0: 1755.9, 1: 1762.4. Samples: 18904018. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:39,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.860')] [2023-10-14 02:30:40,471][33201] Updated weights for policy 0, policy_version 36740 (0.0010) [2023-10-14 02:30:40,506][33226] Updated weights for policy 1, policy_version 37060 (0.0008) [2023-10-14 02:30:40,850][33201] Updated weights for policy 0, policy_version 36750 (0.0009) [2023-10-14 02:30:40,865][33226] Updated weights for policy 1, policy_version 37070 (0.0009) [2023-10-14 02:30:41,219][33201] Updated weights for policy 0, policy_version 36760 (0.0009) [2023-10-14 02:30:41,246][33226] Updated weights for policy 1, policy_version 37080 (0.0009) [2023-10-14 02:30:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 75628544. Throughput: 0: 1744.5, 1: 1765.6. Samples: 18913672. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:44,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.860')] [2023-10-14 02:30:45,006][33201] Updated weights for policy 0, policy_version 36770 (0.0008) [2023-10-14 02:30:45,014][33226] Updated weights for policy 1, policy_version 37090 (0.0009) [2023-10-14 02:30:45,379][33226] Updated weights for policy 1, policy_version 37100 (0.0009) [2023-10-14 02:30:45,384][33201] Updated weights for policy 0, policy_version 36780 (0.0007) [2023-10-14 02:30:45,745][33226] Updated weights for policy 1, policy_version 37110 (0.0007) [2023-10-14 02:30:45,765][33201] Updated weights for policy 0, policy_version 36790 (0.0007) [2023-10-14 02:30:46,108][33226] Updated weights for policy 1, policy_version 37120 (0.0009) [2023-10-14 02:30:46,130][33201] Updated weights for policy 0, policy_version 36800 (0.0007) [2023-10-14 02:30:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 75694080. Throughput: 0: 1752.2, 1: 1761.8. Samples: 18935752. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 02:30:49,905][33226] Updated weights for policy 1, policy_version 37130 (0.0009) [2023-10-14 02:30:49,910][33201] Updated weights for policy 0, policy_version 36810 (0.0007) [2023-10-14 02:30:50,273][33226] Updated weights for policy 1, policy_version 37140 (0.0008) [2023-10-14 02:30:50,276][33201] Updated weights for policy 0, policy_version 36820 (0.0008) [2023-10-14 02:30:50,638][33226] Updated weights for policy 1, policy_version 37150 (0.0009) [2023-10-14 02:30:50,643][33201] Updated weights for policy 0, policy_version 36830 (0.0008) [2023-10-14 02:30:54,415][33226] Updated weights for policy 1, policy_version 37160 (0.0008) [2023-10-14 02:30:54,558][33201] Updated weights for policy 0, policy_version 36840 (0.0007) [2023-10-14 02:30:54,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 75759616. Throughput: 0: 1777.8, 1: 1783.7. Samples: 18957702. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:54,559][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 02:30:54,782][33226] Updated weights for policy 1, policy_version 37170 (0.0008) [2023-10-14 02:30:54,931][33201] Updated weights for policy 0, policy_version 36850 (0.0007) [2023-10-14 02:30:55,147][33226] Updated weights for policy 1, policy_version 37180 (0.0008) [2023-10-14 02:30:55,292][33201] Updated weights for policy 0, policy_version 36860 (0.0008) [2023-10-14 02:30:58,901][33226] Updated weights for policy 1, policy_version 37190 (0.0009) [2023-10-14 02:30:59,129][33201] Updated weights for policy 0, policy_version 36870 (0.0010) [2023-10-14 02:30:59,266][33226] Updated weights for policy 1, policy_version 37200 (0.0007) [2023-10-14 02:30:59,498][33201] Updated weights for policy 0, policy_version 36880 (0.0007) [2023-10-14 02:30:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 75825152. Throughput: 0: 1742.8, 1: 1767.2. Samples: 18967270. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:30:59,559][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 02:30:59,634][33226] Updated weights for policy 1, policy_version 37210 (0.0008) [2023-10-14 02:30:59,867][33201] Updated weights for policy 0, policy_version 36890 (0.0010) [2023-10-14 02:31:03,362][33226] Updated weights for policy 1, policy_version 37220 (0.0007) [2023-10-14 02:31:03,602][33201] Updated weights for policy 0, policy_version 36900 (0.0007) [2023-10-14 02:31:03,724][33226] Updated weights for policy 1, policy_version 37230 (0.0007) [2023-10-14 02:31:03,973][33201] Updated weights for policy 0, policy_version 36910 (0.0007) [2023-10-14 02:31:04,090][33226] Updated weights for policy 1, policy_version 37240 (0.0010) [2023-10-14 02:31:04,344][33201] Updated weights for policy 0, policy_version 36920 (0.0007) [2023-10-14 02:31:04,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 75923456. Throughput: 0: 1773.6, 1: 1784.4. Samples: 18989516. 
Policy #0 lag: (min: 17.0, avg: 24.8, max: 49.0) [2023-10-14 02:31:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.840')] [2023-10-14 02:31:08,046][33226] Updated weights for policy 1, policy_version 37250 (0.0008) [2023-10-14 02:31:08,148][33201] Updated weights for policy 0, policy_version 36930 (0.0008) [2023-10-14 02:31:08,421][33226] Updated weights for policy 1, policy_version 37260 (0.0007) [2023-10-14 02:31:08,517][33201] Updated weights for policy 0, policy_version 36940 (0.0007) [2023-10-14 02:31:08,794][33226] Updated weights for policy 1, policy_version 37270 (0.0008) [2023-10-14 02:31:08,878][33201] Updated weights for policy 0, policy_version 36950 (0.0007) [2023-10-14 02:31:09,154][33226] Updated weights for policy 1, policy_version 37280 (0.0007) [2023-10-14 02:31:09,249][33201] Updated weights for policy 0, policy_version 36960 (0.0007) [2023-10-14 02:31:09,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 76021760. Throughput: 0: 1750.3, 1: 1773.5. Samples: 19009396. Policy #0 lag: (min: 17.0, avg: 24.8, max: 49.0) [2023-10-14 02:31:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.840')] [2023-10-14 02:31:12,882][33226] Updated weights for policy 1, policy_version 37290 (0.0008) [2023-10-14 02:31:13,185][33201] Updated weights for policy 0, policy_version 36970 (0.0008) [2023-10-14 02:31:13,245][33226] Updated weights for policy 1, policy_version 37300 (0.0009) [2023-10-14 02:31:13,549][33201] Updated weights for policy 0, policy_version 36980 (0.0008) [2023-10-14 02:31:13,616][33226] Updated weights for policy 1, policy_version 37310 (0.0008) [2023-10-14 02:31:13,926][33201] Updated weights for policy 0, policy_version 36990 (0.0008) [2023-10-14 02:31:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 76087296. Throughput: 0: 1772.5, 1: 1783.7. Samples: 19021346. 
Policy #0 lag: (min: 17.0, avg: 24.8, max: 49.0) [2023-10-14 02:31:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.890')] [2023-10-14 02:31:17,443][33226] Updated weights for policy 1, policy_version 37320 (0.0007) [2023-10-14 02:31:17,647][33201] Updated weights for policy 0, policy_version 37000 (0.0008) [2023-10-14 02:31:17,802][33226] Updated weights for policy 1, policy_version 37330 (0.0008) [2023-10-14 02:31:18,017][33201] Updated weights for policy 0, policy_version 37010 (0.0007) [2023-10-14 02:31:18,169][33226] Updated weights for policy 1, policy_version 37340 (0.0007) [2023-10-14 02:31:18,388][33201] Updated weights for policy 0, policy_version 37020 (0.0007) [2023-10-14 02:31:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 76152832. Throughput: 0: 1761.6, 1: 1781.8. Samples: 19041584. Policy #0 lag: (min: 17.0, avg: 24.8, max: 49.0) [2023-10-14 02:31:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:31:22,034][33226] Updated weights for policy 1, policy_version 37350 (0.0007) [2023-10-14 02:31:22,260][33201] Updated weights for policy 0, policy_version 37030 (0.0007) [2023-10-14 02:31:22,404][33226] Updated weights for policy 1, policy_version 37360 (0.0007) [2023-10-14 02:31:22,638][33201] Updated weights for policy 0, policy_version 37040 (0.0008) [2023-10-14 02:31:22,763][33226] Updated weights for policy 1, policy_version 37370 (0.0008) [2023-10-14 02:31:23,010][33201] Updated weights for policy 0, policy_version 37050 (0.0008) [2023-10-14 02:31:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 76218368. Throughput: 0: 1751.4, 1: 1772.8. Samples: 19062608. Policy #0 lag: (min: 17.0, avg: 24.8, max: 49.0) [2023-10-14 02:31:24,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:31:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000037376_38273024.pth... 
[2023-10-14 02:31:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000037056_37945344.pth... [2023-10-14 02:31:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000035712_36569088.pth [2023-10-14 02:31:24,609][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000035424_36274176.pth [2023-10-14 02:31:26,640][33226] Updated weights for policy 1, policy_version 37380 (0.0009) [2023-10-14 02:31:26,885][33201] Updated weights for policy 0, policy_version 37060 (0.0009) [2023-10-14 02:31:27,014][33226] Updated weights for policy 1, policy_version 37390 (0.0009) [2023-10-14 02:31:27,255][33201] Updated weights for policy 0, policy_version 37070 (0.0007) [2023-10-14 02:31:27,374][33226] Updated weights for policy 1, policy_version 37400 (0.0009) [2023-10-14 02:31:27,625][33201] Updated weights for policy 0, policy_version 37080 (0.0008) [2023-10-14 02:31:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 76283904. Throughput: 0: 1775.0, 1: 1785.3. Samples: 19073884. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:31:31,224][33226] Updated weights for policy 1, policy_version 37410 (0.0008) [2023-10-14 02:31:31,351][33201] Updated weights for policy 0, policy_version 37090 (0.0010) [2023-10-14 02:31:31,592][33226] Updated weights for policy 1, policy_version 37420 (0.0008) [2023-10-14 02:31:31,722][33201] Updated weights for policy 0, policy_version 37100 (0.0008) [2023-10-14 02:31:31,961][33226] Updated weights for policy 1, policy_version 37430 (0.0008) [2023-10-14 02:31:32,101][33201] Updated weights for policy 0, policy_version 37110 (0.0008) [2023-10-14 02:31:32,327][33226] Updated weights for policy 1, policy_version 37440 (0.0008) [2023-10-14 02:31:32,472][33201] Updated weights for policy 0, policy_version 37120 (0.0008) [2023-10-14 02:31:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 76349440. Throughput: 0: 1754.2, 1: 1765.9. Samples: 19094156. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:31:36,140][33226] Updated weights for policy 1, policy_version 37450 (0.0009) [2023-10-14 02:31:36,306][33201] Updated weights for policy 0, policy_version 37130 (0.0010) [2023-10-14 02:31:36,513][33226] Updated weights for policy 1, policy_version 37460 (0.0008) [2023-10-14 02:31:36,673][33201] Updated weights for policy 0, policy_version 37140 (0.0008) [2023-10-14 02:31:36,884][33226] Updated weights for policy 1, policy_version 37470 (0.0007) [2023-10-14 02:31:37,051][33201] Updated weights for policy 0, policy_version 37150 (0.0008) [2023-10-14 02:31:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 76414976. Throughput: 0: 1753.4, 1: 1765.5. Samples: 19116052. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 02:31:40,707][33226] Updated weights for policy 1, policy_version 37480 (0.0007) [2023-10-14 02:31:40,923][33201] Updated weights for policy 0, policy_version 37160 (0.0008) [2023-10-14 02:31:41,079][33226] Updated weights for policy 1, policy_version 37490 (0.0007) [2023-10-14 02:31:41,300][33201] Updated weights for policy 0, policy_version 37170 (0.0009) [2023-10-14 02:31:41,452][33226] Updated weights for policy 1, policy_version 37500 (0.0008) [2023-10-14 02:31:41,664][33201] Updated weights for policy 0, policy_version 37180 (0.0010) [2023-10-14 02:31:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 76480512. Throughput: 0: 1756.0, 1: 1764.4. Samples: 19125692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 02:31:45,247][33226] Updated weights for policy 1, policy_version 37510 (0.0007) [2023-10-14 02:31:45,426][33201] Updated weights for policy 0, policy_version 37190 (0.0008) [2023-10-14 02:31:45,613][33226] Updated weights for policy 1, policy_version 37520 (0.0007) [2023-10-14 02:31:45,805][33201] Updated weights for policy 0, policy_version 37200 (0.0009) [2023-10-14 02:31:45,980][33226] Updated weights for policy 1, policy_version 37530 (0.0007) [2023-10-14 02:31:46,169][33201] Updated weights for policy 0, policy_version 37210 (0.0007) [2023-10-14 02:31:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 76546048. Throughput: 0: 1756.9, 1: 1762.0. Samples: 19147864. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:49,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 02:31:49,742][33226] Updated weights for policy 1, policy_version 37540 (0.0007) [2023-10-14 02:31:50,022][33201] Updated weights for policy 0, policy_version 37220 (0.0007) [2023-10-14 02:31:50,112][33226] Updated weights for policy 1, policy_version 37550 (0.0007) [2023-10-14 02:31:50,389][33201] Updated weights for policy 0, policy_version 37230 (0.0008) [2023-10-14 02:31:50,486][33226] Updated weights for policy 1, policy_version 37560 (0.0008) [2023-10-14 02:31:50,762][33201] Updated weights for policy 0, policy_version 37240 (0.0008) [2023-10-14 02:31:54,120][33226] Updated weights for policy 1, policy_version 37570 (0.0007) [2023-10-14 02:31:54,494][33226] Updated weights for policy 1, policy_version 37580 (0.0009) [2023-10-14 02:31:54,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 76611584. Throughput: 0: 1772.9, 1: 1795.9. Samples: 19169990. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:31:54,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.800')] [2023-10-14 02:31:54,793][33201] Updated weights for policy 0, policy_version 37250 (0.0010) [2023-10-14 02:31:54,862][33226] Updated weights for policy 1, policy_version 37590 (0.0007) [2023-10-14 02:31:55,155][33201] Updated weights for policy 0, policy_version 37260 (0.0008) [2023-10-14 02:31:55,233][33226] Updated weights for policy 1, policy_version 37600 (0.0007) [2023-10-14 02:31:55,526][33201] Updated weights for policy 0, policy_version 37270 (0.0009) [2023-10-14 02:31:55,902][33201] Updated weights for policy 0, policy_version 37280 (0.0007) [2023-10-14 02:31:59,060][33226] Updated weights for policy 1, policy_version 37610 (0.0009) [2023-10-14 02:31:59,435][33226] Updated weights for policy 1, policy_version 37620 (0.0008) [2023-10-14 02:31:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 76677120. Throughput: 0: 1747.6, 1: 1770.3. Samples: 19179652. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:31:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.790')] [2023-10-14 02:31:59,751][33201] Updated weights for policy 0, policy_version 37290 (0.0007) [2023-10-14 02:31:59,799][33226] Updated weights for policy 1, policy_version 37630 (0.0009) [2023-10-14 02:32:00,122][33201] Updated weights for policy 0, policy_version 37300 (0.0007) [2023-10-14 02:32:00,491][33201] Updated weights for policy 0, policy_version 37310 (0.0007) [2023-10-14 02:32:03,614][33226] Updated weights for policy 1, policy_version 37640 (0.0010) [2023-10-14 02:32:03,977][33226] Updated weights for policy 1, policy_version 37650 (0.0010) [2023-10-14 02:32:04,336][33226] Updated weights for policy 1, policy_version 37660 (0.0008) [2023-10-14 02:32:04,378][33201] Updated weights for policy 0, policy_version 37320 (0.0007) [2023-10-14 02:32:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 76775424. Throughput: 0: 1763.3, 1: 1792.5. Samples: 19201594. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:32:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.810')] [2023-10-14 02:32:04,749][33201] Updated weights for policy 0, policy_version 37330 (0.0010) [2023-10-14 02:32:05,114][33201] Updated weights for policy 0, policy_version 37340 (0.0008) [2023-10-14 02:32:08,315][33226] Updated weights for policy 1, policy_version 37670 (0.0007) [2023-10-14 02:32:08,684][33226] Updated weights for policy 1, policy_version 37680 (0.0009) [2023-10-14 02:32:08,987][33201] Updated weights for policy 0, policy_version 37350 (0.0008) [2023-10-14 02:32:09,047][33226] Updated weights for policy 1, policy_version 37690 (0.0008) [2023-10-14 02:32:09,372][33201] Updated weights for policy 0, policy_version 37360 (0.0010) [2023-10-14 02:32:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 76840960. Throughput: 0: 1766.5, 1: 1777.0. 
Samples: 19222068. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:32:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.840')] [2023-10-14 02:32:09,748][33201] Updated weights for policy 0, policy_version 37370 (0.0010) [2023-10-14 02:32:12,879][33226] Updated weights for policy 1, policy_version 37700 (0.0008) [2023-10-14 02:32:13,246][33226] Updated weights for policy 1, policy_version 37710 (0.0009) [2023-10-14 02:32:13,422][33201] Updated weights for policy 0, policy_version 37380 (0.0009) [2023-10-14 02:32:13,619][33226] Updated weights for policy 1, policy_version 37720 (0.0008) [2023-10-14 02:32:13,802][33201] Updated weights for policy 0, policy_version 37390 (0.0007) [2023-10-14 02:32:14,161][33201] Updated weights for policy 0, policy_version 37400 (0.0007) [2023-10-14 02:32:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 76939264. Throughput: 0: 1753.7, 1: 1783.7. Samples: 19233068. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:32:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.840')] [2023-10-14 02:32:17,348][33226] Updated weights for policy 1, policy_version 37730 (0.0008) [2023-10-14 02:32:17,713][33226] Updated weights for policy 1, policy_version 37740 (0.0007) [2023-10-14 02:32:17,916][33201] Updated weights for policy 0, policy_version 37410 (0.0008) [2023-10-14 02:32:18,075][33226] Updated weights for policy 1, policy_version 37750 (0.0009) [2023-10-14 02:32:18,281][33201] Updated weights for policy 0, policy_version 37420 (0.0007) [2023-10-14 02:32:18,442][33226] Updated weights for policy 1, policy_version 37760 (0.0009) [2023-10-14 02:32:18,656][33201] Updated weights for policy 0, policy_version 37430 (0.0009) [2023-10-14 02:32:19,030][33201] Updated weights for policy 0, policy_version 37440 (0.0009) [2023-10-14 02:32:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77004800. 
Throughput: 0: 1768.2, 1: 1781.1. Samples: 19253876. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:32:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 02:32:22,268][33226] Updated weights for policy 1, policy_version 37770 (0.0009) [2023-10-14 02:32:22,635][33226] Updated weights for policy 1, policy_version 37780 (0.0008) [2023-10-14 02:32:22,903][33201] Updated weights for policy 0, policy_version 37450 (0.0008) [2023-10-14 02:32:23,000][33226] Updated weights for policy 1, policy_version 37790 (0.0007) [2023-10-14 02:32:23,274][33201] Updated weights for policy 0, policy_version 37460 (0.0009) [2023-10-14 02:32:23,640][33201] Updated weights for policy 0, policy_version 37470 (0.0008) [2023-10-14 02:32:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 77070336. Throughput: 0: 1746.2, 1: 1770.3. Samples: 19274294. Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.810')] [2023-10-14 02:32:26,787][33226] Updated weights for policy 1, policy_version 37800 (0.0008) [2023-10-14 02:32:27,158][33226] Updated weights for policy 1, policy_version 37810 (0.0009) [2023-10-14 02:32:27,515][33226] Updated weights for policy 1, policy_version 37820 (0.0009) [2023-10-14 02:32:27,652][33201] Updated weights for policy 0, policy_version 37480 (0.0007) [2023-10-14 02:32:28,020][33201] Updated weights for policy 0, policy_version 37490 (0.0010) [2023-10-14 02:32:28,396][33201] Updated weights for policy 0, policy_version 37500 (0.0009) [2023-10-14 02:32:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77135872. Throughput: 0: 1777.1, 1: 1785.3. Samples: 19285998. 
Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.740')] [2023-10-14 02:32:31,196][33226] Updated weights for policy 1, policy_version 37830 (0.0008) [2023-10-14 02:32:31,562][33226] Updated weights for policy 1, policy_version 37840 (0.0009) [2023-10-14 02:32:31,931][33226] Updated weights for policy 1, policy_version 37850 (0.0009) [2023-10-14 02:32:32,050][33201] Updated weights for policy 0, policy_version 37510 (0.0007) [2023-10-14 02:32:32,408][33201] Updated weights for policy 0, policy_version 37520 (0.0008) [2023-10-14 02:32:32,771][33201] Updated weights for policy 0, policy_version 37530 (0.0009) [2023-10-14 02:32:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77201408. Throughput: 0: 1744.2, 1: 1772.9. Samples: 19306132. Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.740')] [2023-10-14 02:32:35,834][33226] Updated weights for policy 1, policy_version 37860 (0.0008) [2023-10-14 02:32:36,198][33226] Updated weights for policy 1, policy_version 37870 (0.0008) [2023-10-14 02:32:36,567][33226] Updated weights for policy 1, policy_version 37880 (0.0009) [2023-10-14 02:32:36,606][33201] Updated weights for policy 0, policy_version 37540 (0.0009) [2023-10-14 02:32:36,976][33201] Updated weights for policy 0, policy_version 37550 (0.0008) [2023-10-14 02:32:37,338][33201] Updated weights for policy 0, policy_version 37560 (0.0008) [2023-10-14 02:32:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77266944. Throughput: 0: 1751.6, 1: 1768.3. Samples: 19328384. 
Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.740')] [2023-10-14 02:32:40,442][33226] Updated weights for policy 1, policy_version 37890 (0.0009) [2023-10-14 02:32:40,814][33226] Updated weights for policy 1, policy_version 37900 (0.0008) [2023-10-14 02:32:41,121][33201] Updated weights for policy 0, policy_version 37570 (0.0007) [2023-10-14 02:32:41,178][33226] Updated weights for policy 1, policy_version 37910 (0.0007) [2023-10-14 02:32:41,495][33201] Updated weights for policy 0, policy_version 37580 (0.0009) [2023-10-14 02:32:41,543][33226] Updated weights for policy 1, policy_version 37920 (0.0008) [2023-10-14 02:32:41,869][33201] Updated weights for policy 0, policy_version 37590 (0.0009) [2023-10-14 02:32:42,242][33201] Updated weights for policy 0, policy_version 37600 (0.0010) [2023-10-14 02:32:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 77332480. Throughput: 0: 1759.5, 1: 1764.8. Samples: 19338244. Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.740')] [2023-10-14 02:32:45,294][33226] Updated weights for policy 1, policy_version 37930 (0.0009) [2023-10-14 02:32:45,657][33226] Updated weights for policy 1, policy_version 37940 (0.0010) [2023-10-14 02:32:46,031][33226] Updated weights for policy 1, policy_version 37950 (0.0008) [2023-10-14 02:32:46,142][33201] Updated weights for policy 0, policy_version 37610 (0.0009) [2023-10-14 02:32:46,499][33201] Updated weights for policy 0, policy_version 37620 (0.0007) [2023-10-14 02:32:46,876][33201] Updated weights for policy 0, policy_version 37630 (0.0008) [2023-10-14 02:32:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 77398016. Throughput: 0: 1752.0, 1: 1773.6. Samples: 19360250. 
Policy #0 lag: (min: 23.0, avg: 38.0, max: 55.0) [2023-10-14 02:32:49,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.770')] [2023-10-14 02:32:49,927][33226] Updated weights for policy 1, policy_version 37960 (0.0009) [2023-10-14 02:32:50,301][33226] Updated weights for policy 1, policy_version 37970 (0.0009) [2023-10-14 02:32:50,662][33201] Updated weights for policy 0, policy_version 37640 (0.0008) [2023-10-14 02:32:50,663][33226] Updated weights for policy 1, policy_version 37980 (0.0010) [2023-10-14 02:32:51,038][33201] Updated weights for policy 0, policy_version 37650 (0.0009) [2023-10-14 02:32:51,405][33201] Updated weights for policy 0, policy_version 37660 (0.0010) [2023-10-14 02:32:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 77463552. Throughput: 0: 1768.5, 1: 1789.2. Samples: 19382166. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-14 02:32:54,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.770')] [2023-10-14 02:32:54,583][33226] Updated weights for policy 1, policy_version 37990 (0.0009) [2023-10-14 02:32:54,946][33226] Updated weights for policy 1, policy_version 38000 (0.0009) [2023-10-14 02:32:55,214][33201] Updated weights for policy 0, policy_version 37670 (0.0008) [2023-10-14 02:32:55,321][33226] Updated weights for policy 1, policy_version 38010 (0.0008) [2023-10-14 02:32:55,603][33201] Updated weights for policy 0, policy_version 37680 (0.0007) [2023-10-14 02:32:55,981][33201] Updated weights for policy 0, policy_version 37690 (0.0009) [2023-10-14 02:32:59,063][33226] Updated weights for policy 1, policy_version 38020 (0.0009) [2023-10-14 02:32:59,434][33226] Updated weights for policy 1, policy_version 38030 (0.0008) [2023-10-14 02:32:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 77529088. Throughput: 0: 1755.4, 1: 1766.7. Samples: 19391562. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-14 02:32:59,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.770')] [2023-10-14 02:32:59,797][33226] Updated weights for policy 1, policy_version 38040 (0.0008) [2023-10-14 02:32:59,854][33201] Updated weights for policy 0, policy_version 37700 (0.0009) [2023-10-14 02:33:00,230][33201] Updated weights for policy 0, policy_version 37710 (0.0008) [2023-10-14 02:33:00,601][33201] Updated weights for policy 0, policy_version 37720 (0.0010) [2023-10-14 02:33:03,593][33226] Updated weights for policy 1, policy_version 38050 (0.0007) [2023-10-14 02:33:03,953][33226] Updated weights for policy 1, policy_version 38060 (0.0009) [2023-10-14 02:33:04,316][33226] Updated weights for policy 1, policy_version 38070 (0.0010) [2023-10-14 02:33:04,355][33201] Updated weights for policy 0, policy_version 37730 (0.0009) [2023-10-14 02:33:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 77594624. Throughput: 0: 1762.4, 1: 1794.0. Samples: 19413914. 
Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-14 02:33:04,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.780')] [2023-10-14 02:33:04,684][33226] Updated weights for policy 1, policy_version 38080 (0.0007) [2023-10-14 02:33:04,714][33201] Updated weights for policy 0, policy_version 37740 (0.0009) [2023-10-14 02:33:05,076][33201] Updated weights for policy 0, policy_version 37750 (0.0010) [2023-10-14 02:33:05,439][33201] Updated weights for policy 0, policy_version 37760 (0.0008) [2023-10-14 02:33:08,350][33226] Updated weights for policy 1, policy_version 38090 (0.0009) [2023-10-14 02:33:08,720][33226] Updated weights for policy 1, policy_version 38100 (0.0008) [2023-10-14 02:33:09,076][33226] Updated weights for policy 1, policy_version 38110 (0.0008) [2023-10-14 02:33:09,292][33201] Updated weights for policy 0, policy_version 37770 (0.0009) [2023-10-14 02:33:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77692928. Throughput: 0: 1783.7, 1: 1783.6. Samples: 19434820. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-14 02:33:09,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.780')] [2023-10-14 02:33:09,656][33201] Updated weights for policy 0, policy_version 37780 (0.0008) [2023-10-14 02:33:10,026][33201] Updated weights for policy 0, policy_version 37790 (0.0008) [2023-10-14 02:33:12,877][33226] Updated weights for policy 1, policy_version 38120 (0.0009) [2023-10-14 02:33:13,259][33226] Updated weights for policy 1, policy_version 38130 (0.0009) [2023-10-14 02:33:13,627][33226] Updated weights for policy 1, policy_version 38140 (0.0008) [2023-10-14 02:33:13,844][33201] Updated weights for policy 0, policy_version 37800 (0.0009) [2023-10-14 02:33:14,218][33201] Updated weights for policy 0, policy_version 37810 (0.0010) [2023-10-14 02:33:14,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 77758464. Throughput: 0: 1754.0, 1: 1794.5. 
Samples: 19445680. Policy #0 lag: (min: 31.0, avg: 34.6, max: 63.0) [2023-10-14 02:33:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.800')] [2023-10-14 02:33:14,591][33201] Updated weights for policy 0, policy_version 37820 (0.0008) [2023-10-14 02:33:17,330][33226] Updated weights for policy 1, policy_version 38150 (0.0007) [2023-10-14 02:33:17,705][33226] Updated weights for policy 1, policy_version 38160 (0.0008) [2023-10-14 02:33:18,080][33226] Updated weights for policy 1, policy_version 38170 (0.0008) [2023-10-14 02:33:18,480][33201] Updated weights for policy 0, policy_version 37830 (0.0009) [2023-10-14 02:33:18,852][33201] Updated weights for policy 0, policy_version 37840 (0.0008) [2023-10-14 02:33:19,223][33201] Updated weights for policy 0, policy_version 37850 (0.0008) [2023-10-14 02:33:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 77856768. Throughput: 0: 1784.0, 1: 1787.3. Samples: 19466840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:19,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.800')] [2023-10-14 02:33:21,751][33226] Updated weights for policy 1, policy_version 38180 (0.0009) [2023-10-14 02:33:22,106][33226] Updated weights for policy 1, policy_version 38190 (0.0011) [2023-10-14 02:33:22,469][33226] Updated weights for policy 1, policy_version 38200 (0.0008) [2023-10-14 02:33:23,070][33201] Updated weights for policy 0, policy_version 37860 (0.0009) [2023-10-14 02:33:23,437][33201] Updated weights for policy 0, policy_version 37870 (0.0009) [2023-10-14 02:33:23,810][33201] Updated weights for policy 0, policy_version 37880 (0.0010) [2023-10-14 02:33:24,557][31953] Fps is (10 sec: 16383.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 77922304. Throughput: 0: 1753.2, 1: 1779.7. Samples: 19487366. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:24,559][31953] Avg episode reward: [(0, '20.820'), (1, '20.840')] [2023-10-14 02:33:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000038208_39124992.pth... [2023-10-14 02:33:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000037888_38797312.pth... [2023-10-14 02:33:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000036544_37421056.pth [2023-10-14 02:33:24,609][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000036224_37093376.pth [2023-10-14 02:33:26,190][33226] Updated weights for policy 1, policy_version 38210 (0.0008) [2023-10-14 02:33:26,563][33226] Updated weights for policy 1, policy_version 38220 (0.0009) [2023-10-14 02:33:26,931][33226] Updated weights for policy 1, policy_version 38230 (0.0010) [2023-10-14 02:33:27,302][33226] Updated weights for policy 1, policy_version 38240 (0.0008) [2023-10-14 02:33:27,589][33201] Updated weights for policy 0, policy_version 37890 (0.0008) [2023-10-14 02:33:27,961][33201] Updated weights for policy 0, policy_version 37900 (0.0007) [2023-10-14 02:33:28,331][33201] Updated weights for policy 0, policy_version 37910 (0.0010) [2023-10-14 02:33:28,709][33201] Updated weights for policy 0, policy_version 37920 (0.0009) [2023-10-14 02:33:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 77987840. Throughput: 0: 1775.2, 1: 1790.7. Samples: 19498710. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:29,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.880')] [2023-10-14 02:33:31,065][33226] Updated weights for policy 1, policy_version 38250 (0.0007) [2023-10-14 02:33:31,424][33226] Updated weights for policy 1, policy_version 38260 (0.0009) [2023-10-14 02:33:31,788][33226] Updated weights for policy 1, policy_version 38270 (0.0007) [2023-10-14 02:33:32,738][33201] Updated weights for policy 0, policy_version 37930 (0.0008) [2023-10-14 02:33:33,117][33201] Updated weights for policy 0, policy_version 37940 (0.0009) [2023-10-14 02:33:33,481][33201] Updated weights for policy 0, policy_version 37950 (0.0007) [2023-10-14 02:33:34,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78053376. Throughput: 0: 1763.3, 1: 1778.1. Samples: 19519616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 02:33:35,571][33226] Updated weights for policy 1, policy_version 38280 (0.0007) [2023-10-14 02:33:35,950][33226] Updated weights for policy 1, policy_version 38290 (0.0008) [2023-10-14 02:33:36,325][33226] Updated weights for policy 1, policy_version 38300 (0.0009) [2023-10-14 02:33:37,110][33201] Updated weights for policy 0, policy_version 37960 (0.0009) [2023-10-14 02:33:37,473][33201] Updated weights for policy 0, policy_version 37970 (0.0010) [2023-10-14 02:33:37,841][33201] Updated weights for policy 0, policy_version 37980 (0.0009) [2023-10-14 02:33:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78118912. Throughput: 0: 1745.6, 1: 1782.1. Samples: 19540912. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:33:40,041][33226] Updated weights for policy 1, policy_version 38310 (0.0008) [2023-10-14 02:33:40,411][33226] Updated weights for policy 1, policy_version 38320 (0.0010) [2023-10-14 02:33:40,784][33226] Updated weights for policy 1, policy_version 38330 (0.0008) [2023-10-14 02:33:42,054][33201] Updated weights for policy 0, policy_version 37990 (0.0009) [2023-10-14 02:33:42,452][33201] Updated weights for policy 0, policy_version 38000 (0.0008) [2023-10-14 02:33:42,826][33201] Updated weights for policy 0, policy_version 38010 (0.0007) [2023-10-14 02:33:44,535][33226] Updated weights for policy 1, policy_version 38340 (0.0010) [2023-10-14 02:33:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 78184448. Throughput: 0: 1770.9, 1: 1782.6. Samples: 19551470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:33:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:33:44,904][33226] Updated weights for policy 1, policy_version 38350 (0.0009) [2023-10-14 02:33:45,273][33226] Updated weights for policy 1, policy_version 38360 (0.0009) [2023-10-14 02:33:46,690][33201] Updated weights for policy 0, policy_version 38020 (0.0008) [2023-10-14 02:33:47,059][33201] Updated weights for policy 0, policy_version 38030 (0.0009) [2023-10-14 02:33:47,421][33201] Updated weights for policy 0, policy_version 38040 (0.0008) [2023-10-14 02:33:49,100][33226] Updated weights for policy 1, policy_version 38370 (0.0009) [2023-10-14 02:33:49,477][33226] Updated weights for policy 1, policy_version 38380 (0.0010) [2023-10-14 02:33:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 78249984. Throughput: 0: 1744.0, 1: 1783.0. Samples: 19572628. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:33:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:33:49,844][33226] Updated weights for policy 1, policy_version 38390 (0.0008) [2023-10-14 02:33:50,207][33226] Updated weights for policy 1, policy_version 38400 (0.0010) [2023-10-14 02:33:51,229][33201] Updated weights for policy 0, policy_version 38050 (0.0009) [2023-10-14 02:33:51,594][33201] Updated weights for policy 0, policy_version 38060 (0.0009) [2023-10-14 02:33:51,960][33201] Updated weights for policy 0, policy_version 38070 (0.0007) [2023-10-14 02:33:52,330][33201] Updated weights for policy 0, policy_version 38080 (0.0008) [2023-10-14 02:33:53,888][33226] Updated weights for policy 1, policy_version 38410 (0.0008) [2023-10-14 02:33:54,258][33226] Updated weights for policy 1, policy_version 38420 (0.0007) [2023-10-14 02:33:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 78315520. Throughput: 0: 1751.9, 1: 1798.4. Samples: 19594584. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:33:54,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.890')] [2023-10-14 02:33:54,620][33226] Updated weights for policy 1, policy_version 38430 (0.0007) [2023-10-14 02:33:56,005][33201] Updated weights for policy 0, policy_version 38090 (0.0009) [2023-10-14 02:33:56,372][33201] Updated weights for policy 0, policy_version 38100 (0.0011) [2023-10-14 02:33:56,749][33201] Updated weights for policy 0, policy_version 38110 (0.0008) [2023-10-14 02:33:58,437][33226] Updated weights for policy 1, policy_version 38440 (0.0008) [2023-10-14 02:33:58,795][33226] Updated weights for policy 1, policy_version 38450 (0.0008) [2023-10-14 02:33:59,162][33226] Updated weights for policy 1, policy_version 38460 (0.0011) [2023-10-14 02:33:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 78413824. Throughput: 0: 1751.6, 1: 1784.4. 
Samples: 19604800. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:33:59,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.890')] [2023-10-14 02:34:00,429][33201] Updated weights for policy 0, policy_version 38120 (0.0008) [2023-10-14 02:34:00,799][33201] Updated weights for policy 0, policy_version 38130 (0.0009) [2023-10-14 02:34:01,176][33201] Updated weights for policy 0, policy_version 38140 (0.0008) [2023-10-14 02:34:03,089][33226] Updated weights for policy 1, policy_version 38470 (0.0009) [2023-10-14 02:34:03,453][33226] Updated weights for policy 1, policy_version 38480 (0.0008) [2023-10-14 02:34:03,821][33226] Updated weights for policy 1, policy_version 38490 (0.0008) [2023-10-14 02:34:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 78479360. Throughput: 0: 1760.3, 1: 1802.2. Samples: 19627152. Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:34:04,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.890')] [2023-10-14 02:34:04,780][33201] Updated weights for policy 0, policy_version 38150 (0.0007) [2023-10-14 02:34:05,149][33201] Updated weights for policy 0, policy_version 38160 (0.0007) [2023-10-14 02:34:05,512][33201] Updated weights for policy 0, policy_version 38170 (0.0008) [2023-10-14 02:34:07,525][33226] Updated weights for policy 1, policy_version 38500 (0.0011) [2023-10-14 02:34:07,903][33226] Updated weights for policy 1, policy_version 38510 (0.0008) [2023-10-14 02:34:08,269][33226] Updated weights for policy 1, policy_version 38520 (0.0009) [2023-10-14 02:34:09,313][33201] Updated weights for policy 0, policy_version 38180 (0.0010) [2023-10-14 02:34:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 78544896. Throughput: 0: 1795.1, 1: 1779.3. Samples: 19648212. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:34:09,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.950')] [2023-10-14 02:34:09,692][33201] Updated weights for policy 0, policy_version 38190 (0.0009) [2023-10-14 02:34:10,054][33201] Updated weights for policy 0, policy_version 38200 (0.0007) [2023-10-14 02:34:12,043][33226] Updated weights for policy 1, policy_version 38530 (0.0007) [2023-10-14 02:34:12,408][33226] Updated weights for policy 1, policy_version 38540 (0.0008) [2023-10-14 02:34:12,781][33226] Updated weights for policy 1, policy_version 38550 (0.0009) [2023-10-14 02:34:13,143][33226] Updated weights for policy 1, policy_version 38560 (0.0008) [2023-10-14 02:34:13,802][33201] Updated weights for policy 0, policy_version 38210 (0.0009) [2023-10-14 02:34:14,174][33201] Updated weights for policy 0, policy_version 38220 (0.0011) [2023-10-14 02:34:14,541][33201] Updated weights for policy 0, policy_version 38230 (0.0009) [2023-10-14 02:34:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 78610432. Throughput: 0: 1768.4, 1: 1801.4. Samples: 19659348. 
Policy #0 lag: (min: 31.0, avg: 31.4, max: 44.0) [2023-10-14 02:34:14,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.990')] [2023-10-14 02:34:14,913][33201] Updated weights for policy 0, policy_version 38240 (0.0008) [2023-10-14 02:34:16,954][33226] Updated weights for policy 1, policy_version 38570 (0.0007) [2023-10-14 02:34:17,325][33226] Updated weights for policy 1, policy_version 38580 (0.0007) [2023-10-14 02:34:17,693][33226] Updated weights for policy 1, policy_version 38590 (0.0009) [2023-10-14 02:34:18,720][33201] Updated weights for policy 0, policy_version 38250 (0.0007) [2023-10-14 02:34:19,084][33201] Updated weights for policy 0, policy_version 38260 (0.0007) [2023-10-14 02:34:19,459][33201] Updated weights for policy 0, policy_version 38270 (0.0008) [2023-10-14 02:34:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78708736. Throughput: 0: 1793.3, 1: 1775.4. Samples: 19680206. Policy #0 lag: (min: 26.0, avg: 33.5, max: 58.0) [2023-10-14 02:34:19,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.990')] [2023-10-14 02:34:21,444][33226] Updated weights for policy 1, policy_version 38600 (0.0010) [2023-10-14 02:34:21,825][33226] Updated weights for policy 1, policy_version 38610 (0.0008) [2023-10-14 02:34:22,192][33226] Updated weights for policy 1, policy_version 38620 (0.0010) [2023-10-14 02:34:23,199][33201] Updated weights for policy 0, policy_version 38280 (0.0008) [2023-10-14 02:34:23,580][33201] Updated weights for policy 0, policy_version 38290 (0.0008) [2023-10-14 02:34:23,954][33201] Updated weights for policy 0, policy_version 38300 (0.0008) [2023-10-14 02:34:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 78774272. Throughput: 0: 1781.5, 1: 1785.4. Samples: 19701422. 
Policy #0 lag: (min: 26.0, avg: 33.5, max: 58.0) [2023-10-14 02:34:24,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.970')] [2023-10-14 02:34:25,857][33226] Updated weights for policy 1, policy_version 38630 (0.0009) [2023-10-14 02:34:26,226][33226] Updated weights for policy 1, policy_version 38640 (0.0007) [2023-10-14 02:34:26,586][33226] Updated weights for policy 1, policy_version 38650 (0.0007) [2023-10-14 02:34:27,798][33201] Updated weights for policy 0, policy_version 38310 (0.0007) [2023-10-14 02:34:28,187][33201] Updated weights for policy 0, policy_version 38320 (0.0007) [2023-10-14 02:34:28,564][33201] Updated weights for policy 0, policy_version 38330 (0.0008) [2023-10-14 02:34:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78839808. Throughput: 0: 1790.6, 1: 1784.8. Samples: 19712360. Policy #0 lag: (min: 26.0, avg: 33.5, max: 58.0) [2023-10-14 02:34:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 02:34:30,411][33226] Updated weights for policy 1, policy_version 38660 (0.0007) [2023-10-14 02:34:30,776][33226] Updated weights for policy 1, policy_version 38670 (0.0009) [2023-10-14 02:34:31,151][33226] Updated weights for policy 1, policy_version 38680 (0.0009) [2023-10-14 02:34:32,571][33201] Updated weights for policy 0, policy_version 38340 (0.0007) [2023-10-14 02:34:32,945][33201] Updated weights for policy 0, policy_version 38350 (0.0009) [2023-10-14 02:34:33,314][33201] Updated weights for policy 0, policy_version 38360 (0.0010) [2023-10-14 02:34:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78905344. Throughput: 0: 1791.2, 1: 1783.7. Samples: 19733498. 
Policy #0 lag: (min: 26.0, avg: 33.5, max: 58.0) [2023-10-14 02:34:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 02:34:34,794][33226] Updated weights for policy 1, policy_version 38690 (0.0009) [2023-10-14 02:34:35,160][33226] Updated weights for policy 1, policy_version 38700 (0.0010) [2023-10-14 02:34:35,528][33226] Updated weights for policy 1, policy_version 38710 (0.0010) [2023-10-14 02:34:35,897][33226] Updated weights for policy 1, policy_version 38720 (0.0010) [2023-10-14 02:34:37,088][33201] Updated weights for policy 0, policy_version 38370 (0.0010) [2023-10-14 02:34:37,452][33201] Updated weights for policy 0, policy_version 38380 (0.0008) [2023-10-14 02:34:37,824][33201] Updated weights for policy 0, policy_version 38390 (0.0007) [2023-10-14 02:34:38,201][33201] Updated weights for policy 0, policy_version 38400 (0.0007) [2023-10-14 02:34:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 78970880. Throughput: 0: 1772.5, 1: 1798.0. Samples: 19755258. Policy #0 lag: (min: 26.0, avg: 33.5, max: 58.0) [2023-10-14 02:34:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 02:34:39,670][33226] Updated weights for policy 1, policy_version 38730 (0.0011) [2023-10-14 02:34:40,036][33226] Updated weights for policy 1, policy_version 38740 (0.0011) [2023-10-14 02:34:40,406][33226] Updated weights for policy 1, policy_version 38750 (0.0012) [2023-10-14 02:34:42,167][33201] Updated weights for policy 0, policy_version 38410 (0.0010) [2023-10-14 02:34:42,547][33201] Updated weights for policy 0, policy_version 38420 (0.0009) [2023-10-14 02:34:42,917][33201] Updated weights for policy 0, policy_version 38430 (0.0007) [2023-10-14 02:34:44,297][33226] Updated weights for policy 1, policy_version 38760 (0.0010) [2023-10-14 02:34:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79036416. Throughput: 0: 1791.8, 1: 1783.2. 
Samples: 19765676. Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:34:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 02:34:44,662][33226] Updated weights for policy 1, policy_version 38770 (0.0009) [2023-10-14 02:34:45,030][33226] Updated weights for policy 1, policy_version 38780 (0.0007) [2023-10-14 02:34:46,601][33201] Updated weights for policy 0, policy_version 38440 (0.0007) [2023-10-14 02:34:46,972][33201] Updated weights for policy 0, policy_version 38450 (0.0009) [2023-10-14 02:34:47,350][33201] Updated weights for policy 0, policy_version 38460 (0.0008) [2023-10-14 02:34:48,837][33226] Updated weights for policy 1, policy_version 38790 (0.0009) [2023-10-14 02:34:49,204][33226] Updated weights for policy 1, policy_version 38800 (0.0011) [2023-10-14 02:34:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79101952. Throughput: 0: 1759.9, 1: 1784.1. Samples: 19786632. Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:34:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 02:34:49,571][33226] Updated weights for policy 1, policy_version 38810 (0.0010) [2023-10-14 02:34:51,246][33201] Updated weights for policy 0, policy_version 38470 (0.0010) [2023-10-14 02:34:51,615][33201] Updated weights for policy 0, policy_version 38480 (0.0008) [2023-10-14 02:34:51,995][33201] Updated weights for policy 0, policy_version 38490 (0.0008) [2023-10-14 02:34:53,338][33226] Updated weights for policy 1, policy_version 38820 (0.0007) [2023-10-14 02:34:53,707][33226] Updated weights for policy 1, policy_version 38830 (0.0007) [2023-10-14 02:34:54,078][33226] Updated weights for policy 1, policy_version 38840 (0.0008) [2023-10-14 02:34:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 79200256. Throughput: 0: 1757.8, 1: 1795.5. Samples: 19808110. 
Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:34:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 02:34:55,714][33201] Updated weights for policy 0, policy_version 38500 (0.0008) [2023-10-14 02:34:56,093][33201] Updated weights for policy 0, policy_version 38510 (0.0007) [2023-10-14 02:34:56,456][33201] Updated weights for policy 0, policy_version 38520 (0.0008) [2023-10-14 02:34:57,822][33226] Updated weights for policy 1, policy_version 38850 (0.0008) [2023-10-14 02:34:58,184][33226] Updated weights for policy 1, policy_version 38860 (0.0009) [2023-10-14 02:34:58,552][33226] Updated weights for policy 1, policy_version 38870 (0.0007) [2023-10-14 02:34:58,916][33226] Updated weights for policy 1, policy_version 38880 (0.0008) [2023-10-14 02:34:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79265792. Throughput: 0: 1757.1, 1: 1782.8. Samples: 19818640. Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:34:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:35:00,338][33201] Updated weights for policy 0, policy_version 38530 (0.0008) [2023-10-14 02:35:00,710][33201] Updated weights for policy 0, policy_version 38540 (0.0008) [2023-10-14 02:35:01,077][33201] Updated weights for policy 0, policy_version 38550 (0.0009) [2023-10-14 02:35:01,447][33201] Updated weights for policy 0, policy_version 38560 (0.0010) [2023-10-14 02:35:02,628][33226] Updated weights for policy 1, policy_version 38890 (0.0007) [2023-10-14 02:35:02,996][33226] Updated weights for policy 1, policy_version 38900 (0.0009) [2023-10-14 02:35:03,363][33226] Updated weights for policy 1, policy_version 38910 (0.0008) [2023-10-14 02:35:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 79331328. Throughput: 0: 1755.1, 1: 1801.2. Samples: 19840242. 
Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:35:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:35:05,214][33201] Updated weights for policy 0, policy_version 38570 (0.0009) [2023-10-14 02:35:05,590][33201] Updated weights for policy 0, policy_version 38580 (0.0011) [2023-10-14 02:35:05,945][33201] Updated weights for policy 0, policy_version 38590 (0.0007) [2023-10-14 02:35:07,353][33226] Updated weights for policy 1, policy_version 38920 (0.0009) [2023-10-14 02:35:07,732][33226] Updated weights for policy 1, policy_version 38930 (0.0009) [2023-10-14 02:35:08,095][33226] Updated weights for policy 1, policy_version 38940 (0.0010) [2023-10-14 02:35:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 79396864. Throughput: 0: 1782.0, 1: 1775.1. Samples: 19861492. Policy #0 lag: (min: 20.0, avg: 20.3, max: 31.0) [2023-10-14 02:35:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:35:09,685][33201] Updated weights for policy 0, policy_version 38600 (0.0009) [2023-10-14 02:35:10,054][33201] Updated weights for policy 0, policy_version 38610 (0.0008) [2023-10-14 02:35:10,423][33201] Updated weights for policy 0, policy_version 38620 (0.0009) [2023-10-14 02:35:11,714][33226] Updated weights for policy 1, policy_version 38950 (0.0008) [2023-10-14 02:35:12,083][33226] Updated weights for policy 1, policy_version 38960 (0.0008) [2023-10-14 02:35:12,451][33226] Updated weights for policy 1, policy_version 38970 (0.0009) [2023-10-14 02:35:14,261][33201] Updated weights for policy 0, policy_version 38630 (0.0009) [2023-10-14 02:35:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 79462400. Throughput: 0: 1751.5, 1: 1796.9. Samples: 19872036. 
Policy #0 lag: (min: 27.0, avg: 30.1, max: 59.0) [2023-10-14 02:35:14,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 02:35:14,643][33201] Updated weights for policy 0, policy_version 38640 (0.0008) [2023-10-14 02:35:15,005][33201] Updated weights for policy 0, policy_version 38650 (0.0007) [2023-10-14 02:35:16,332][33226] Updated weights for policy 1, policy_version 38980 (0.0009) [2023-10-14 02:35:16,697][33226] Updated weights for policy 1, policy_version 38990 (0.0010) [2023-10-14 02:35:17,070][33226] Updated weights for policy 1, policy_version 39000 (0.0007) [2023-10-14 02:35:18,701][33201] Updated weights for policy 0, policy_version 38660 (0.0008) [2023-10-14 02:35:19,080][33201] Updated weights for policy 0, policy_version 38670 (0.0009) [2023-10-14 02:35:19,458][33201] Updated weights for policy 0, policy_version 38680 (0.0007) [2023-10-14 02:35:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 79527936. Throughput: 0: 1778.2, 1: 1777.0. Samples: 19893482. 
Policy #0 lag: (min: 27.0, avg: 30.1, max: 59.0) [2023-10-14 02:35:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:20,954][33226] Updated weights for policy 1, policy_version 39010 (0.0008) [2023-10-14 02:35:21,319][33226] Updated weights for policy 1, policy_version 39020 (0.0010) [2023-10-14 02:35:21,685][33226] Updated weights for policy 1, policy_version 39030 (0.0008) [2023-10-14 02:35:22,058][33226] Updated weights for policy 1, policy_version 39040 (0.0008) [2023-10-14 02:35:23,222][33201] Updated weights for policy 0, policy_version 38690 (0.0008) [2023-10-14 02:35:23,590][33201] Updated weights for policy 0, policy_version 38700 (0.0009) [2023-10-14 02:35:23,963][33201] Updated weights for policy 0, policy_version 38710 (0.0009) [2023-10-14 02:35:24,330][33201] Updated weights for policy 0, policy_version 38720 (0.0008) [2023-10-14 02:35:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79626240. Throughput: 0: 1769.0, 1: 1772.4. Samples: 19914618. Policy #0 lag: (min: 27.0, avg: 30.1, max: 59.0) [2023-10-14 02:35:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000039040_39976960.pth... [2023-10-14 02:35:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000038720_39649280.pth... 
[2023-10-14 02:35:24,602][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000037056_37945344.pth [2023-10-14 02:35:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000037376_38273024.pth [2023-10-14 02:35:25,750][33226] Updated weights for policy 1, policy_version 39050 (0.0010) [2023-10-14 02:35:26,120][33226] Updated weights for policy 1, policy_version 39060 (0.0008) [2023-10-14 02:35:26,495][33226] Updated weights for policy 1, policy_version 39070 (0.0008) [2023-10-14 02:35:28,155][33201] Updated weights for policy 0, policy_version 38730 (0.0007) [2023-10-14 02:35:28,528][33201] Updated weights for policy 0, policy_version 38740 (0.0011) [2023-10-14 02:35:28,913][33201] Updated weights for policy 0, policy_version 38750 (0.0011) [2023-10-14 02:35:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79691776. Throughput: 0: 1778.4, 1: 1775.9. Samples: 19925622. Policy #0 lag: (min: 27.0, avg: 30.1, max: 59.0) [2023-10-14 02:35:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:30,329][33226] Updated weights for policy 1, policy_version 39080 (0.0010) [2023-10-14 02:35:30,697][33226] Updated weights for policy 1, policy_version 39090 (0.0009) [2023-10-14 02:35:31,063][33226] Updated weights for policy 1, policy_version 39100 (0.0008) [2023-10-14 02:35:32,795][33201] Updated weights for policy 0, policy_version 38760 (0.0008) [2023-10-14 02:35:33,169][33201] Updated weights for policy 0, policy_version 38770 (0.0008) [2023-10-14 02:35:33,546][33201] Updated weights for policy 0, policy_version 38780 (0.0007) [2023-10-14 02:35:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79757312. Throughput: 0: 1784.7, 1: 1778.0. Samples: 19946950. 
Policy #0 lag: (min: 27.0, avg: 30.1, max: 59.0) [2023-10-14 02:35:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:34,789][33226] Updated weights for policy 1, policy_version 39110 (0.0009) [2023-10-14 02:35:35,160][33226] Updated weights for policy 1, policy_version 39120 (0.0009) [2023-10-14 02:35:35,531][33226] Updated weights for policy 1, policy_version 39130 (0.0010) [2023-10-14 02:35:37,334][33201] Updated weights for policy 0, policy_version 38790 (0.0008) [2023-10-14 02:35:37,705][33201] Updated weights for policy 0, policy_version 38800 (0.0009) [2023-10-14 02:35:38,078][33201] Updated weights for policy 0, policy_version 38810 (0.0008) [2023-10-14 02:35:39,169][33226] Updated weights for policy 1, policy_version 39140 (0.0009) [2023-10-14 02:35:39,529][33226] Updated weights for policy 1, policy_version 39150 (0.0007) [2023-10-14 02:35:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 79822848. Throughput: 0: 1763.3, 1: 1796.3. Samples: 19968294. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:35:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:39,890][33226] Updated weights for policy 1, policy_version 39160 (0.0009) [2023-10-14 02:35:42,079][33201] Updated weights for policy 0, policy_version 38820 (0.0008) [2023-10-14 02:35:42,450][33201] Updated weights for policy 0, policy_version 38830 (0.0009) [2023-10-14 02:35:42,817][33201] Updated weights for policy 0, policy_version 38840 (0.0007) [2023-10-14 02:35:43,590][33226] Updated weights for policy 1, policy_version 39170 (0.0009) [2023-10-14 02:35:43,951][33226] Updated weights for policy 1, policy_version 39180 (0.0008) [2023-10-14 02:35:44,319][33226] Updated weights for policy 1, policy_version 39190 (0.0007) [2023-10-14 02:35:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 79888384. Throughput: 0: 1784.6, 1: 1777.0. 
Samples: 19978912. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:35:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 02:35:44,671][33226] Updated weights for policy 1, policy_version 39200 (0.0007) [2023-10-14 02:35:46,562][33201] Updated weights for policy 0, policy_version 38850 (0.0008) [2023-10-14 02:35:46,925][33201] Updated weights for policy 0, policy_version 38860 (0.0007) [2023-10-14 02:35:47,296][33201] Updated weights for policy 0, policy_version 38870 (0.0008) [2023-10-14 02:35:47,665][33201] Updated weights for policy 0, policy_version 38880 (0.0008) [2023-10-14 02:35:48,590][33226] Updated weights for policy 1, policy_version 39210 (0.0009) [2023-10-14 02:35:48,958][33226] Updated weights for policy 1, policy_version 39220 (0.0007) [2023-10-14 02:35:49,330][33226] Updated weights for policy 1, policy_version 39230 (0.0008) [2023-10-14 02:35:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 79986688. Throughput: 0: 1753.6, 1: 1793.5. Samples: 19999858. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:35:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:35:51,756][33201] Updated weights for policy 0, policy_version 38890 (0.0007) [2023-10-14 02:35:52,124][33201] Updated weights for policy 0, policy_version 38900 (0.0007) [2023-10-14 02:35:52,498][33201] Updated weights for policy 0, policy_version 38910 (0.0010) [2023-10-14 02:35:53,335][33226] Updated weights for policy 1, policy_version 39240 (0.0008) [2023-10-14 02:35:53,710][33226] Updated weights for policy 1, policy_version 39250 (0.0010) [2023-10-14 02:35:54,072][33226] Updated weights for policy 1, policy_version 39260 (0.0010) [2023-10-14 02:35:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 80052224. Throughput: 0: 1746.4, 1: 1782.0. Samples: 20020266. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:35:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:35:56,123][33201] Updated weights for policy 0, policy_version 38920 (0.0008) [2023-10-14 02:35:56,491][33201] Updated weights for policy 0, policy_version 38930 (0.0008) [2023-10-14 02:35:56,861][33201] Updated weights for policy 0, policy_version 38940 (0.0007) [2023-10-14 02:35:57,958][33226] Updated weights for policy 1, policy_version 39270 (0.0011) [2023-10-14 02:35:58,330][33226] Updated weights for policy 1, policy_version 39280 (0.0008) [2023-10-14 02:35:58,704][33226] Updated weights for policy 1, policy_version 39290 (0.0011) [2023-10-14 02:35:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 80117760. Throughput: 0: 1745.5, 1: 1783.0. Samples: 20030818. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:35:59,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:36:00,881][33201] Updated weights for policy 0, policy_version 38950 (0.0010) [2023-10-14 02:36:01,255][33201] Updated weights for policy 0, policy_version 38960 (0.0010) [2023-10-14 02:36:01,616][33201] Updated weights for policy 0, policy_version 38970 (0.0011) [2023-10-14 02:36:02,403][33226] Updated weights for policy 1, policy_version 39300 (0.0008) [2023-10-14 02:36:02,770][33226] Updated weights for policy 1, policy_version 39310 (0.0011) [2023-10-14 02:36:03,139][33226] Updated weights for policy 1, policy_version 39320 (0.0010) [2023-10-14 02:36:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 80183296. Throughput: 0: 1735.9, 1: 1780.6. Samples: 20051722. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:36:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.930')] [2023-10-14 02:36:05,608][33201] Updated weights for policy 0, policy_version 38980 (0.0011) [2023-10-14 02:36:05,993][33201] Updated weights for policy 0, policy_version 38990 (0.0011) [2023-10-14 02:36:06,370][33201] Updated weights for policy 0, policy_version 39000 (0.0008) [2023-10-14 02:36:06,815][33226] Updated weights for policy 1, policy_version 39330 (0.0009) [2023-10-14 02:36:07,192][33226] Updated weights for policy 1, policy_version 39340 (0.0008) [2023-10-14 02:36:07,554][33226] Updated weights for policy 1, policy_version 39350 (0.0007) [2023-10-14 02:36:07,920][33226] Updated weights for policy 1, policy_version 39360 (0.0008) [2023-10-14 02:36:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 80248832. Throughput: 0: 1752.7, 1: 1766.5. Samples: 20072982. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:36:10,347][33201] Updated weights for policy 0, policy_version 39010 (0.0009) [2023-10-14 02:36:10,715][33201] Updated weights for policy 0, policy_version 39020 (0.0009) [2023-10-14 02:36:11,092][33201] Updated weights for policy 0, policy_version 39030 (0.0009) [2023-10-14 02:36:11,461][33201] Updated weights for policy 0, policy_version 39040 (0.0009) [2023-10-14 02:36:11,774][33226] Updated weights for policy 1, policy_version 39370 (0.0010) [2023-10-14 02:36:12,140][33226] Updated weights for policy 1, policy_version 39380 (0.0010) [2023-10-14 02:36:12,505][33226] Updated weights for policy 1, policy_version 39390 (0.0010) [2023-10-14 02:36:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 80314368. Throughput: 0: 1721.5, 1: 1779.4. Samples: 20083162. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:14,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:36:15,349][33201] Updated weights for policy 0, policy_version 39050 (0.0008) [2023-10-14 02:36:15,713][33201] Updated weights for policy 0, policy_version 39060 (0.0009) [2023-10-14 02:36:16,081][33201] Updated weights for policy 0, policy_version 39070 (0.0010) [2023-10-14 02:36:16,309][33226] Updated weights for policy 1, policy_version 39400 (0.0008) [2023-10-14 02:36:16,686][33226] Updated weights for policy 1, policy_version 39410 (0.0008) [2023-10-14 02:36:17,055][33226] Updated weights for policy 1, policy_version 39420 (0.0007) [2023-10-14 02:36:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 80379904. Throughput: 0: 1736.1, 1: 1757.1. Samples: 20104148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:19,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:36:19,946][33201] Updated weights for policy 0, policy_version 39080 (0.0008) [2023-10-14 02:36:20,313][33201] Updated weights for policy 0, policy_version 39090 (0.0009) [2023-10-14 02:36:20,688][33201] Updated weights for policy 0, policy_version 39100 (0.0008) [2023-10-14 02:36:20,923][33226] Updated weights for policy 1, policy_version 39430 (0.0008) [2023-10-14 02:36:21,276][33226] Updated weights for policy 1, policy_version 39440 (0.0010) [2023-10-14 02:36:21,651][33226] Updated weights for policy 1, policy_version 39450 (0.0010) [2023-10-14 02:36:24,547][33201] Updated weights for policy 0, policy_version 39110 (0.0009) [2023-10-14 02:36:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.2, 300 sec: 14106.9). Total num frames: 80445440. Throughput: 0: 1755.4, 1: 1755.5. Samples: 20126288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:24,559][31953] Avg episode reward: [(0, '20.840'), (1, '20.900')] [2023-10-14 02:36:24,922][33201] Updated weights for policy 0, policy_version 39120 (0.0009) [2023-10-14 02:36:25,288][33201] Updated weights for policy 0, policy_version 39130 (0.0007) [2023-10-14 02:36:25,469][33226] Updated weights for policy 1, policy_version 39460 (0.0007) [2023-10-14 02:36:25,842][33226] Updated weights for policy 1, policy_version 39470 (0.0009) [2023-10-14 02:36:26,215][33226] Updated weights for policy 1, policy_version 39480 (0.0009) [2023-10-14 02:36:29,000][33201] Updated weights for policy 0, policy_version 39140 (0.0008) [2023-10-14 02:36:29,367][33201] Updated weights for policy 0, policy_version 39150 (0.0008) [2023-10-14 02:36:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 80510976. Throughput: 0: 1737.3, 1: 1757.8. Samples: 20136192. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:29,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:36:29,729][33201] Updated weights for policy 0, policy_version 39160 (0.0007) [2023-10-14 02:36:29,984][33226] Updated weights for policy 1, policy_version 39490 (0.0007) [2023-10-14 02:36:30,360][33226] Updated weights for policy 1, policy_version 39500 (0.0009) [2023-10-14 02:36:30,725][33226] Updated weights for policy 1, policy_version 39510 (0.0008) [2023-10-14 02:36:31,086][33226] Updated weights for policy 1, policy_version 39520 (0.0008) [2023-10-14 02:36:33,501][33201] Updated weights for policy 0, policy_version 39170 (0.0008) [2023-10-14 02:36:33,860][33201] Updated weights for policy 0, policy_version 39180 (0.0007) [2023-10-14 02:36:34,228][33201] Updated weights for policy 0, policy_version 39190 (0.0010) [2023-10-14 02:36:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 80576512. Throughput: 0: 1767.8, 1: 1755.2. 
Samples: 20158394. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 02:36:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 02:36:34,612][33201] Updated weights for policy 0, policy_version 39200 (0.0009) [2023-10-14 02:36:34,953][33226] Updated weights for policy 1, policy_version 39530 (0.0008) [2023-10-14 02:36:35,327][33226] Updated weights for policy 1, policy_version 39540 (0.0008) [2023-10-14 02:36:35,691][33226] Updated weights for policy 1, policy_version 39550 (0.0009) [2023-10-14 02:36:38,451][33201] Updated weights for policy 0, policy_version 39210 (0.0009) [2023-10-14 02:36:38,827][33201] Updated weights for policy 0, policy_version 39220 (0.0010) [2023-10-14 02:36:39,202][33201] Updated weights for policy 0, policy_version 39230 (0.0009) [2023-10-14 02:36:39,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 80674816. Throughput: 0: 1756.2, 1: 1786.7. Samples: 20179694. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:36:39,557][33226] Updated weights for policy 1, policy_version 39560 (0.0007) [2023-10-14 02:36:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 02:36:39,931][33226] Updated weights for policy 1, policy_version 39570 (0.0009) [2023-10-14 02:36:40,305][33226] Updated weights for policy 1, policy_version 39580 (0.0008) [2023-10-14 02:36:43,029][33201] Updated weights for policy 0, policy_version 39240 (0.0009) [2023-10-14 02:36:43,407][33201] Updated weights for policy 0, policy_version 39250 (0.0007) [2023-10-14 02:36:43,779][33201] Updated weights for policy 0, policy_version 39260 (0.0008) [2023-10-14 02:36:44,044][33226] Updated weights for policy 1, policy_version 39590 (0.0008) [2023-10-14 02:36:44,420][33226] Updated weights for policy 1, policy_version 39600 (0.0009) [2023-10-14 02:36:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 80740352. 
Throughput: 0: 1780.7, 1: 1761.1. Samples: 20190204. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:36:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 02:36:44,778][33226] Updated weights for policy 1, policy_version 39610 (0.0009) [2023-10-14 02:36:47,574][33201] Updated weights for policy 0, policy_version 39270 (0.0008) [2023-10-14 02:36:47,943][33201] Updated weights for policy 0, policy_version 39280 (0.0009) [2023-10-14 02:36:48,313][33201] Updated weights for policy 0, policy_version 39290 (0.0009) [2023-10-14 02:36:48,477][33226] Updated weights for policy 1, policy_version 39620 (0.0008) [2023-10-14 02:36:48,843][33226] Updated weights for policy 1, policy_version 39630 (0.0007) [2023-10-14 02:36:49,205][33226] Updated weights for policy 1, policy_version 39640 (0.0008) [2023-10-14 02:36:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 80838656. Throughput: 0: 1769.4, 1: 1786.6. Samples: 20211740. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:36:49,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 02:36:51,917][33201] Updated weights for policy 0, policy_version 39300 (0.0007) [2023-10-14 02:36:52,320][33201] Updated weights for policy 0, policy_version 39310 (0.0007) [2023-10-14 02:36:52,684][33201] Updated weights for policy 0, policy_version 39320 (0.0009) [2023-10-14 02:36:53,067][33226] Updated weights for policy 1, policy_version 39650 (0.0009) [2023-10-14 02:36:53,440][33226] Updated weights for policy 1, policy_version 39660 (0.0009) [2023-10-14 02:36:53,813][33226] Updated weights for policy 1, policy_version 39670 (0.0009) [2023-10-14 02:36:54,190][33226] Updated weights for policy 1, policy_version 39680 (0.0009) [2023-10-14 02:36:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 80904192. Throughput: 0: 1766.8, 1: 1772.1. Samples: 20232234. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:36:54,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 02:36:56,528][33201] Updated weights for policy 0, policy_version 39330 (0.0009) [2023-10-14 02:36:56,906][33201] Updated weights for policy 0, policy_version 39340 (0.0010) [2023-10-14 02:36:57,274][33201] Updated weights for policy 0, policy_version 39350 (0.0010) [2023-10-14 02:36:57,647][33201] Updated weights for policy 0, policy_version 39360 (0.0008) [2023-10-14 02:36:58,091][33226] Updated weights for policy 1, policy_version 39690 (0.0007) [2023-10-14 02:36:58,460][33226] Updated weights for policy 1, policy_version 39700 (0.0007) [2023-10-14 02:36:58,825][33226] Updated weights for policy 1, policy_version 39710 (0.0009) [2023-10-14 02:36:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 80969728. Throughput: 0: 1781.2, 1: 1780.3. Samples: 20243428. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:36:59,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:37:01,467][33201] Updated weights for policy 0, policy_version 39370 (0.0008) [2023-10-14 02:37:01,844][33201] Updated weights for policy 0, policy_version 39380 (0.0009) [2023-10-14 02:37:02,208][33201] Updated weights for policy 0, policy_version 39390 (0.0009) [2023-10-14 02:37:02,621][33226] Updated weights for policy 1, policy_version 39720 (0.0008) [2023-10-14 02:37:03,001][33226] Updated weights for policy 1, policy_version 39730 (0.0007) [2023-10-14 02:37:03,365][33226] Updated weights for policy 1, policy_version 39740 (0.0007) [2023-10-14 02:37:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 81035264. Throughput: 0: 1770.1, 1: 1790.4. Samples: 20264370. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:04,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:37:06,009][33201] Updated weights for policy 0, policy_version 39400 (0.0007) [2023-10-14 02:37:06,376][33201] Updated weights for policy 0, policy_version 39410 (0.0008) [2023-10-14 02:37:06,750][33201] Updated weights for policy 0, policy_version 39420 (0.0009) [2023-10-14 02:37:06,933][33226] Updated weights for policy 1, policy_version 39750 (0.0007) [2023-10-14 02:37:07,308][33226] Updated weights for policy 1, policy_version 39760 (0.0007) [2023-10-14 02:37:07,670][33226] Updated weights for policy 1, policy_version 39770 (0.0009) [2023-10-14 02:37:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 81100800. Throughput: 0: 1773.4, 1: 1776.4. Samples: 20286026. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:09,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:37:10,567][33201] Updated weights for policy 0, policy_version 39430 (0.0007) [2023-10-14 02:37:10,948][33201] Updated weights for policy 0, policy_version 39440 (0.0010) [2023-10-14 02:37:11,313][33201] Updated weights for policy 0, policy_version 39450 (0.0011) [2023-10-14 02:37:11,548][33226] Updated weights for policy 1, policy_version 39780 (0.0008) [2023-10-14 02:37:11,916][33226] Updated weights for policy 1, policy_version 39790 (0.0008) [2023-10-14 02:37:12,287][33226] Updated weights for policy 1, policy_version 39800 (0.0007) [2023-10-14 02:37:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 81166336. Throughput: 0: 1767.9, 1: 1792.6. Samples: 20296412. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:14,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:37:15,125][33201] Updated weights for policy 0, policy_version 39460 (0.0009) [2023-10-14 02:37:15,507][33201] Updated weights for policy 0, policy_version 39470 (0.0010) [2023-10-14 02:37:15,872][33201] Updated weights for policy 0, policy_version 39480 (0.0011) [2023-10-14 02:37:16,092][33226] Updated weights for policy 1, policy_version 39810 (0.0008) [2023-10-14 02:37:16,450][33226] Updated weights for policy 1, policy_version 39820 (0.0009) [2023-10-14 02:37:16,812][33226] Updated weights for policy 1, policy_version 39830 (0.0009) [2023-10-14 02:37:17,184][33226] Updated weights for policy 1, policy_version 39840 (0.0007) [2023-10-14 02:37:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 81231872. Throughput: 0: 1763.6, 1: 1777.9. Samples: 20317762. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:19,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.890')] [2023-10-14 02:37:19,607][33201] Updated weights for policy 0, policy_version 39490 (0.0010) [2023-10-14 02:37:19,978][33201] Updated weights for policy 0, policy_version 39500 (0.0008) [2023-10-14 02:37:20,358][33201] Updated weights for policy 0, policy_version 39510 (0.0011) [2023-10-14 02:37:20,728][33201] Updated weights for policy 0, policy_version 39520 (0.0008) [2023-10-14 02:37:20,796][33226] Updated weights for policy 1, policy_version 39850 (0.0009) [2023-10-14 02:37:21,158][33226] Updated weights for policy 1, policy_version 39860 (0.0008) [2023-10-14 02:37:21,526][33226] Updated weights for policy 1, policy_version 39870 (0.0007) [2023-10-14 02:37:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 81297408. Throughput: 0: 1782.4, 1: 1778.7. Samples: 20339944. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.880')] [2023-10-14 02:37:24,566][33201] Updated weights for policy 0, policy_version 39530 (0.0007) [2023-10-14 02:37:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000039872_40828928.pth... [2023-10-14 02:37:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000038208_39124992.pth [2023-10-14 02:37:24,941][33201] Updated weights for policy 0, policy_version 39540 (0.0009) [2023-10-14 02:37:25,307][33201] Updated weights for policy 0, policy_version 39550 (0.0007) [2023-10-14 02:37:25,376][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000039552_40501248.pth... [2023-10-14 02:37:25,405][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000037888_38797312.pth [2023-10-14 02:37:25,436][33226] Updated weights for policy 1, policy_version 39880 (0.0009) [2023-10-14 02:37:25,811][33226] Updated weights for policy 1, policy_version 39890 (0.0008) [2023-10-14 02:37:26,180][33226] Updated weights for policy 1, policy_version 39900 (0.0008) [2023-10-14 02:37:29,061][33201] Updated weights for policy 0, policy_version 39560 (0.0009) [2023-10-14 02:37:29,430][33201] Updated weights for policy 0, policy_version 39570 (0.0009) [2023-10-14 02:37:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 81362944. Throughput: 0: 1759.3, 1: 1778.5. Samples: 20349404. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:37:29,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:37:29,803][33201] Updated weights for policy 0, policy_version 39580 (0.0010) [2023-10-14 02:37:29,995][33226] Updated weights for policy 1, policy_version 39910 (0.0008) [2023-10-14 02:37:30,357][33226] Updated weights for policy 1, policy_version 39920 (0.0008) [2023-10-14 02:37:30,726][33226] Updated weights for policy 1, policy_version 39930 (0.0008) [2023-10-14 02:37:33,515][33201] Updated weights for policy 0, policy_version 39590 (0.0010) [2023-10-14 02:37:33,881][33201] Updated weights for policy 0, policy_version 39600 (0.0009) [2023-10-14 02:37:34,249][33201] Updated weights for policy 0, policy_version 39610 (0.0010) [2023-10-14 02:37:34,517][33226] Updated weights for policy 1, policy_version 39940 (0.0009) [2023-10-14 02:37:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 81461248. Throughput: 0: 1784.1, 1: 1772.7. Samples: 20371796. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 02:37:34,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.900')] [2023-10-14 02:37:34,885][33226] Updated weights for policy 1, policy_version 39950 (0.0007) [2023-10-14 02:37:35,251][33226] Updated weights for policy 1, policy_version 39960 (0.0007) [2023-10-14 02:37:38,288][33201] Updated weights for policy 0, policy_version 39620 (0.0007) [2023-10-14 02:37:38,684][33201] Updated weights for policy 0, policy_version 39630 (0.0009) [2023-10-14 02:37:38,941][33226] Updated weights for policy 1, policy_version 39970 (0.0008) [2023-10-14 02:37:39,051][33201] Updated weights for policy 0, policy_version 39640 (0.0007) [2023-10-14 02:37:39,315][33226] Updated weights for policy 1, policy_version 39980 (0.0007) [2023-10-14 02:37:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 81526784. Throughput: 0: 1765.9, 1: 1805.0. 
Samples: 20392926. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 02:37:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.900')] [2023-10-14 02:37:39,678][33226] Updated weights for policy 1, policy_version 39990 (0.0007) [2023-10-14 02:37:40,045][33226] Updated weights for policy 1, policy_version 40000 (0.0008) [2023-10-14 02:37:42,812][33201] Updated weights for policy 0, policy_version 39650 (0.0008) [2023-10-14 02:37:43,180][33201] Updated weights for policy 0, policy_version 39660 (0.0009) [2023-10-14 02:37:43,540][33201] Updated weights for policy 0, policy_version 39670 (0.0010) [2023-10-14 02:37:43,789][33226] Updated weights for policy 1, policy_version 40010 (0.0007) [2023-10-14 02:37:43,913][33201] Updated weights for policy 0, policy_version 39680 (0.0009) [2023-10-14 02:37:44,154][33226] Updated weights for policy 1, policy_version 40020 (0.0009) [2023-10-14 02:37:44,525][33226] Updated weights for policy 1, policy_version 40030 (0.0011) [2023-10-14 02:37:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 81592320. Throughput: 0: 1775.9, 1: 1783.7. Samples: 20403612. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 02:37:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 02:37:47,780][33201] Updated weights for policy 0, policy_version 39690 (0.0008) [2023-10-14 02:37:48,152][33201] Updated weights for policy 0, policy_version 39700 (0.0009) [2023-10-14 02:37:48,306][33226] Updated weights for policy 1, policy_version 40040 (0.0008) [2023-10-14 02:37:48,511][33201] Updated weights for policy 0, policy_version 39710 (0.0009) [2023-10-14 02:37:48,678][33226] Updated weights for policy 1, policy_version 40050 (0.0008) [2023-10-14 02:37:49,055][33226] Updated weights for policy 1, policy_version 40060 (0.0010) [2023-10-14 02:37:49,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 81690624. 
Throughput: 0: 1774.6, 1: 1801.2. Samples: 20425280. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 02:37:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 02:37:52,415][33201] Updated weights for policy 0, policy_version 39720 (0.0008) [2023-10-14 02:37:52,779][33201] Updated weights for policy 0, policy_version 39730 (0.0007) [2023-10-14 02:37:52,924][33226] Updated weights for policy 1, policy_version 40070 (0.0009) [2023-10-14 02:37:53,150][33201] Updated weights for policy 0, policy_version 39740 (0.0007) [2023-10-14 02:37:53,292][33226] Updated weights for policy 1, policy_version 40080 (0.0010) [2023-10-14 02:37:53,663][33226] Updated weights for policy 1, policy_version 40090 (0.0010) [2023-10-14 02:37:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 81756160. Throughput: 0: 1760.1, 1: 1783.9. Samples: 20445504. Policy #0 lag: (min: 31.0, avg: 34.1, max: 63.0) [2023-10-14 02:37:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 02:37:56,953][33201] Updated weights for policy 0, policy_version 39750 (0.0007) [2023-10-14 02:37:57,288][33226] Updated weights for policy 1, policy_version 40100 (0.0008) [2023-10-14 02:37:57,325][33201] Updated weights for policy 0, policy_version 39760 (0.0008) [2023-10-14 02:37:57,648][33226] Updated weights for policy 1, policy_version 40110 (0.0009) [2023-10-14 02:37:57,691][33201] Updated weights for policy 0, policy_version 39770 (0.0007) [2023-10-14 02:37:58,004][33226] Updated weights for policy 1, policy_version 40120 (0.0008) [2023-10-14 02:37:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 81821696. Throughput: 0: 1780.4, 1: 1801.5. Samples: 20457600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:37:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 02:38:01,487][33201] Updated weights for policy 0, policy_version 39780 (0.0008) [2023-10-14 02:38:01,834][33226] Updated weights for policy 1, policy_version 40130 (0.0010) [2023-10-14 02:38:01,855][33201] Updated weights for policy 0, policy_version 39790 (0.0007) [2023-10-14 02:38:02,203][33226] Updated weights for policy 1, policy_version 40140 (0.0007) [2023-10-14 02:38:02,229][33201] Updated weights for policy 0, policy_version 39800 (0.0007) [2023-10-14 02:38:02,564][33226] Updated weights for policy 1, policy_version 40150 (0.0008) [2023-10-14 02:38:02,928][33226] Updated weights for policy 1, policy_version 40160 (0.0009) [2023-10-14 02:38:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 81887232. Throughput: 0: 1760.5, 1: 1785.5. Samples: 20477332. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:38:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:38:05,961][33201] Updated weights for policy 0, policy_version 39810 (0.0008) [2023-10-14 02:38:06,325][33201] Updated weights for policy 0, policy_version 39820 (0.0009) [2023-10-14 02:38:06,651][33226] Updated weights for policy 1, policy_version 40170 (0.0008) [2023-10-14 02:38:06,697][33201] Updated weights for policy 0, policy_version 39830 (0.0007) [2023-10-14 02:38:07,021][33226] Updated weights for policy 1, policy_version 40180 (0.0008) [2023-10-14 02:38:07,063][33201] Updated weights for policy 0, policy_version 39840 (0.0007) [2023-10-14 02:38:07,378][33226] Updated weights for policy 1, policy_version 40190 (0.0009) [2023-10-14 02:38:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 81952768. Throughput: 0: 1762.2, 1: 1788.1. Samples: 20499706. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:38:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 02:38:10,789][33201] Updated weights for policy 0, policy_version 39850 (0.0010) [2023-10-14 02:38:11,161][33201] Updated weights for policy 0, policy_version 39860 (0.0009) [2023-10-14 02:38:11,306][33226] Updated weights for policy 1, policy_version 40200 (0.0010) [2023-10-14 02:38:11,526][33201] Updated weights for policy 0, policy_version 39870 (0.0008) [2023-10-14 02:38:11,692][33226] Updated weights for policy 1, policy_version 40210 (0.0009) [2023-10-14 02:38:12,058][33226] Updated weights for policy 1, policy_version 40220 (0.0010) [2023-10-14 02:38:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 82018304. Throughput: 0: 1762.2, 1: 1797.4. Samples: 20509586. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:38:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:15,341][33201] Updated weights for policy 0, policy_version 39880 (0.0008) [2023-10-14 02:38:15,711][33201] Updated weights for policy 0, policy_version 39890 (0.0008) [2023-10-14 02:38:15,735][33226] Updated weights for policy 1, policy_version 40230 (0.0009) [2023-10-14 02:38:16,071][33201] Updated weights for policy 0, policy_version 39900 (0.0007) [2023-10-14 02:38:16,100][33226] Updated weights for policy 1, policy_version 40240 (0.0008) [2023-10-14 02:38:16,470][33226] Updated weights for policy 1, policy_version 40250 (0.0009) [2023-10-14 02:38:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 82083840. Throughput: 0: 1757.5, 1: 1786.2. Samples: 20531262. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:38:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:20,102][33201] Updated weights for policy 0, policy_version 39910 (0.0007) [2023-10-14 02:38:20,390][33226] Updated weights for policy 1, policy_version 40260 (0.0009) [2023-10-14 02:38:20,467][33201] Updated weights for policy 0, policy_version 39920 (0.0008) [2023-10-14 02:38:20,751][33226] Updated weights for policy 1, policy_version 40270 (0.0007) [2023-10-14 02:38:20,845][33201] Updated weights for policy 0, policy_version 39930 (0.0009) [2023-10-14 02:38:21,121][33226] Updated weights for policy 1, policy_version 40280 (0.0007) [2023-10-14 02:38:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 82149376. Throughput: 0: 1775.3, 1: 1782.6. Samples: 20553032. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:38:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:24,856][33226] Updated weights for policy 1, policy_version 40290 (0.0008) [2023-10-14 02:38:24,995][33201] Updated weights for policy 0, policy_version 39940 (0.0008) [2023-10-14 02:38:25,217][33226] Updated weights for policy 1, policy_version 40300 (0.0008) [2023-10-14 02:38:25,395][33201] Updated weights for policy 0, policy_version 39950 (0.0008) [2023-10-14 02:38:25,585][33226] Updated weights for policy 1, policy_version 40310 (0.0007) [2023-10-14 02:38:25,766][33201] Updated weights for policy 0, policy_version 39960 (0.0008) [2023-10-14 02:38:25,947][33226] Updated weights for policy 1, policy_version 40320 (0.0009) [2023-10-14 02:38:29,555][33201] Updated weights for policy 0, policy_version 39970 (0.0008) [2023-10-14 02:38:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 82214912. Throughput: 0: 1750.2, 1: 1781.3. Samples: 20562532. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:38:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:29,771][33226] Updated weights for policy 1, policy_version 40330 (0.0008) [2023-10-14 02:38:29,933][33201] Updated weights for policy 0, policy_version 39980 (0.0008) [2023-10-14 02:38:30,146][33226] Updated weights for policy 1, policy_version 40340 (0.0009) [2023-10-14 02:38:30,304][33201] Updated weights for policy 0, policy_version 39990 (0.0008) [2023-10-14 02:38:30,515][33226] Updated weights for policy 1, policy_version 40350 (0.0007) [2023-10-14 02:38:30,668][33201] Updated weights for policy 0, policy_version 40000 (0.0008) [2023-10-14 02:38:34,353][33226] Updated weights for policy 1, policy_version 40360 (0.0008) [2023-10-14 02:38:34,389][33201] Updated weights for policy 0, policy_version 40010 (0.0009) [2023-10-14 02:38:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 82280448. Throughput: 0: 1768.4, 1: 1773.9. Samples: 20584684. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:38:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:34,713][33226] Updated weights for policy 1, policy_version 40370 (0.0007) [2023-10-14 02:38:34,746][33201] Updated weights for policy 0, policy_version 40020 (0.0008) [2023-10-14 02:38:35,092][33226] Updated weights for policy 1, policy_version 40380 (0.0007) [2023-10-14 02:38:35,118][33201] Updated weights for policy 0, policy_version 40030 (0.0009) [2023-10-14 02:38:38,718][33226] Updated weights for policy 1, policy_version 40390 (0.0008) [2023-10-14 02:38:38,982][33201] Updated weights for policy 0, policy_version 40040 (0.0007) [2023-10-14 02:38:39,089][33226] Updated weights for policy 1, policy_version 40400 (0.0007) [2023-10-14 02:38:39,347][33201] Updated weights for policy 0, policy_version 40050 (0.0007) [2023-10-14 02:38:39,457][33226] Updated weights for policy 1, policy_version 40410 (0.0008) [2023-10-14 02:38:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 82345984. Throughput: 0: 1767.8, 1: 1795.3. Samples: 20605844. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:38:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:39,720][33201] Updated weights for policy 0, policy_version 40060 (0.0008) [2023-10-14 02:38:43,200][33226] Updated weights for policy 1, policy_version 40420 (0.0009) [2023-10-14 02:38:43,395][33201] Updated weights for policy 0, policy_version 40070 (0.0009) [2023-10-14 02:38:43,563][33226] Updated weights for policy 1, policy_version 40430 (0.0008) [2023-10-14 02:38:43,764][33201] Updated weights for policy 0, policy_version 40080 (0.0008) [2023-10-14 02:38:43,930][33226] Updated weights for policy 1, policy_version 40440 (0.0008) [2023-10-14 02:38:44,132][33201] Updated weights for policy 0, policy_version 40090 (0.0007) [2023-10-14 02:38:44,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 82477056. Throughput: 0: 1758.4, 1: 1770.4. Samples: 20616396. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:38:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 02:38:47,763][33226] Updated weights for policy 1, policy_version 40450 (0.0008) [2023-10-14 02:38:48,055][33201] Updated weights for policy 0, policy_version 40100 (0.0009) [2023-10-14 02:38:48,129][33226] Updated weights for policy 1, policy_version 40460 (0.0008) [2023-10-14 02:38:48,419][33201] Updated weights for policy 0, policy_version 40110 (0.0009) [2023-10-14 02:38:48,498][33226] Updated weights for policy 1, policy_version 40470 (0.0009) [2023-10-14 02:38:48,788][33201] Updated weights for policy 0, policy_version 40120 (0.0007) [2023-10-14 02:38:48,861][33226] Updated weights for policy 1, policy_version 40480 (0.0007) [2023-10-14 02:38:49,557][31953] Fps is (10 sec: 19661.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 82542592. Throughput: 0: 1778.1, 1: 1799.9. Samples: 20638340. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 02:38:49,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 02:38:52,606][33226] Updated weights for policy 1, policy_version 40490 (0.0007) [2023-10-14 02:38:52,667][33201] Updated weights for policy 0, policy_version 40130 (0.0009) [2023-10-14 02:38:52,971][33226] Updated weights for policy 1, policy_version 40500 (0.0008) [2023-10-14 02:38:53,035][33201] Updated weights for policy 0, policy_version 40140 (0.0007) [2023-10-14 02:38:53,341][33226] Updated weights for policy 1, policy_version 40510 (0.0007) [2023-10-14 02:38:53,414][33201] Updated weights for policy 0, policy_version 40150 (0.0008) [2023-10-14 02:38:53,789][33201] Updated weights for policy 0, policy_version 40160 (0.0010) [2023-10-14 02:38:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 82608128. Throughput: 0: 1750.0, 1: 1777.0. Samples: 20658422. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:38:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 02:38:57,190][33226] Updated weights for policy 1, policy_version 40520 (0.0008) [2023-10-14 02:38:57,518][33201] Updated weights for policy 0, policy_version 40170 (0.0008) [2023-10-14 02:38:57,570][33226] Updated weights for policy 1, policy_version 40530 (0.0008) [2023-10-14 02:38:57,883][33201] Updated weights for policy 0, policy_version 40180 (0.0007) [2023-10-14 02:38:57,944][33226] Updated weights for policy 1, policy_version 40540 (0.0007) [2023-10-14 02:38:58,258][33201] Updated weights for policy 0, policy_version 40190 (0.0008) [2023-10-14 02:38:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 82673664. Throughput: 0: 1784.7, 1: 1798.3. Samples: 20670818. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:38:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 02:39:01,701][33226] Updated weights for policy 1, policy_version 40550 (0.0008) [2023-10-14 02:39:02,065][33226] Updated weights for policy 1, policy_version 40560 (0.0007) [2023-10-14 02:39:02,138][33201] Updated weights for policy 0, policy_version 40200 (0.0007) [2023-10-14 02:39:02,434][33226] Updated weights for policy 1, policy_version 40570 (0.0009) [2023-10-14 02:39:02,504][33201] Updated weights for policy 0, policy_version 40210 (0.0007) [2023-10-14 02:39:02,866][33201] Updated weights for policy 0, policy_version 40220 (0.0007) [2023-10-14 02:39:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 82739200. Throughput: 0: 1749.2, 1: 1777.1. Samples: 20689948. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:39:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 02:39:06,158][33226] Updated weights for policy 1, policy_version 40580 (0.0007) [2023-10-14 02:39:06,526][33226] Updated weights for policy 1, policy_version 40590 (0.0010) [2023-10-14 02:39:06,668][33201] Updated weights for policy 0, policy_version 40230 (0.0009) [2023-10-14 02:39:06,891][33226] Updated weights for policy 1, policy_version 40600 (0.0009) [2023-10-14 02:39:07,046][33201] Updated weights for policy 0, policy_version 40240 (0.0007) [2023-10-14 02:39:07,408][33201] Updated weights for policy 0, policy_version 40250 (0.0008) [2023-10-14 02:39:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 82804736. Throughput: 0: 1753.7, 1: 1780.5. Samples: 20712070. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:39:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 02:39:10,793][33226] Updated weights for policy 1, policy_version 40610 (0.0009) [2023-10-14 02:39:11,164][33226] Updated weights for policy 1, policy_version 40620 (0.0008) [2023-10-14 02:39:11,297][33201] Updated weights for policy 0, policy_version 40260 (0.0008) [2023-10-14 02:39:11,523][33226] Updated weights for policy 1, policy_version 40630 (0.0007) [2023-10-14 02:39:11,668][33201] Updated weights for policy 0, policy_version 40270 (0.0008) [2023-10-14 02:39:11,884][33226] Updated weights for policy 1, policy_version 40640 (0.0009) [2023-10-14 02:39:12,044][33201] Updated weights for policy 0, policy_version 40280 (0.0009) [2023-10-14 02:39:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 82870272. Throughput: 0: 1761.0, 1: 1777.8. Samples: 20721778. Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:39:14,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 02:39:15,666][33226] Updated weights for policy 1, policy_version 40650 (0.0007) [2023-10-14 02:39:15,806][33201] Updated weights for policy 0, policy_version 40290 (0.0008) [2023-10-14 02:39:16,040][33226] Updated weights for policy 1, policy_version 40660 (0.0008) [2023-10-14 02:39:16,172][33201] Updated weights for policy 0, policy_version 40300 (0.0007) [2023-10-14 02:39:16,405][33226] Updated weights for policy 1, policy_version 40670 (0.0008) [2023-10-14 02:39:16,543][33201] Updated weights for policy 0, policy_version 40310 (0.0007) [2023-10-14 02:39:16,906][33201] Updated weights for policy 0, policy_version 40320 (0.0010) [2023-10-14 02:39:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 82935808. Throughput: 0: 1751.7, 1: 1776.3. Samples: 20743444. 
Policy #0 lag: (min: 20.0, avg: 20.0, max: 20.0) [2023-10-14 02:39:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 02:39:20,181][33226] Updated weights for policy 1, policy_version 40680 (0.0008) [2023-10-14 02:39:20,550][33226] Updated weights for policy 1, policy_version 40690 (0.0008) [2023-10-14 02:39:20,832][33201] Updated weights for policy 0, policy_version 40330 (0.0007) [2023-10-14 02:39:20,922][33226] Updated weights for policy 1, policy_version 40700 (0.0008) [2023-10-14 02:39:21,194][33201] Updated weights for policy 0, policy_version 40340 (0.0007) [2023-10-14 02:39:21,568][33201] Updated weights for policy 0, policy_version 40350 (0.0008) [2023-10-14 02:39:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 83001344. Throughput: 0: 1766.1, 1: 1794.3. Samples: 20766060. Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-14 02:39:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 02:39:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000040352_41320448.pth... [2023-10-14 02:39:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000038720_39649280.pth [2023-10-14 02:39:24,710][33226] Updated weights for policy 1, policy_version 40710 (0.0008) [2023-10-14 02:39:25,072][33226] Updated weights for policy 1, policy_version 40720 (0.0008) [2023-10-14 02:39:25,285][33201] Updated weights for policy 0, policy_version 40360 (0.0008) [2023-10-14 02:39:25,440][33226] Updated weights for policy 1, policy_version 40730 (0.0009) [2023-10-14 02:39:25,653][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000040736_41713664.pth... 
[2023-10-14 02:39:25,656][33201] Updated weights for policy 0, policy_version 40370 (0.0008) [2023-10-14 02:39:25,693][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000039040_39976960.pth [2023-10-14 02:39:26,032][33201] Updated weights for policy 0, policy_version 40380 (0.0009) [2023-10-14 02:39:29,368][33226] Updated weights for policy 1, policy_version 40740 (0.0008) [2023-10-14 02:39:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 83066880. Throughput: 0: 1755.5, 1: 1782.8. Samples: 20775618. Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-14 02:39:29,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 02:39:29,727][33226] Updated weights for policy 1, policy_version 40750 (0.0008) [2023-10-14 02:39:29,899][33201] Updated weights for policy 0, policy_version 40390 (0.0008) [2023-10-14 02:39:30,100][33226] Updated weights for policy 1, policy_version 40760 (0.0008) [2023-10-14 02:39:30,264][33201] Updated weights for policy 0, policy_version 40400 (0.0008) [2023-10-14 02:39:30,631][33201] Updated weights for policy 0, policy_version 40410 (0.0010) [2023-10-14 02:39:33,843][33226] Updated weights for policy 1, policy_version 40770 (0.0008) [2023-10-14 02:39:34,223][33226] Updated weights for policy 1, policy_version 40780 (0.0011) [2023-10-14 02:39:34,464][33201] Updated weights for policy 0, policy_version 40420 (0.0010) [2023-10-14 02:39:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 83132416. Throughput: 0: 1759.9, 1: 1784.4. Samples: 20797834. 
Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-14 02:39:34,557][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 02:39:34,592][33226] Updated weights for policy 1, policy_version 40790 (0.0010) [2023-10-14 02:39:34,836][33201] Updated weights for policy 0, policy_version 40430 (0.0008) [2023-10-14 02:39:34,953][33226] Updated weights for policy 1, policy_version 40800 (0.0010) [2023-10-14 02:39:35,201][33201] Updated weights for policy 0, policy_version 40440 (0.0009) [2023-10-14 02:39:38,669][33226] Updated weights for policy 1, policy_version 40810 (0.0011) [2023-10-14 02:39:39,028][33201] Updated weights for policy 0, policy_version 40450 (0.0009) [2023-10-14 02:39:39,042][33226] Updated weights for policy 1, policy_version 40820 (0.0008) [2023-10-14 02:39:39,409][33201] Updated weights for policy 0, policy_version 40460 (0.0008) [2023-10-14 02:39:39,415][33226] Updated weights for policy 1, policy_version 40830 (0.0008) [2023-10-14 02:39:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 83230720. Throughput: 0: 1781.0, 1: 1789.7. Samples: 20819104. 
Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-14 02:39:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 02:39:39,775][33201] Updated weights for policy 0, policy_version 40470 (0.0009) [2023-10-14 02:39:40,146][33201] Updated weights for policy 0, policy_version 40480 (0.0008) [2023-10-14 02:39:43,395][33226] Updated weights for policy 1, policy_version 40840 (0.0009) [2023-10-14 02:39:43,775][33226] Updated weights for policy 1, policy_version 40850 (0.0009) [2023-10-14 02:39:44,037][33201] Updated weights for policy 0, policy_version 40490 (0.0008) [2023-10-14 02:39:44,139][33226] Updated weights for policy 1, policy_version 40860 (0.0008) [2023-10-14 02:39:44,411][33201] Updated weights for policy 0, policy_version 40500 (0.0007) [2023-10-14 02:39:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 83296256. Throughput: 0: 1745.6, 1: 1779.6. Samples: 20829452. Policy #0 lag: (min: 31.0, avg: 39.6, max: 63.0) [2023-10-14 02:39:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 02:39:44,775][33201] Updated weights for policy 0, policy_version 40510 (0.0008) [2023-10-14 02:39:47,847][33226] Updated weights for policy 1, policy_version 40870 (0.0008) [2023-10-14 02:39:48,208][33226] Updated weights for policy 1, policy_version 40880 (0.0010) [2023-10-14 02:39:48,497][33201] Updated weights for policy 0, policy_version 40520 (0.0009) [2023-10-14 02:39:48,587][33226] Updated weights for policy 1, policy_version 40890 (0.0008) [2023-10-14 02:39:48,872][33201] Updated weights for policy 0, policy_version 40530 (0.0008) [2023-10-14 02:39:49,243][33201] Updated weights for policy 0, policy_version 40540 (0.0009) [2023-10-14 02:39:49,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 83394560. Throughput: 0: 1783.6, 1: 1799.1. Samples: 20851170. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:39:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 02:39:52,421][33226] Updated weights for policy 1, policy_version 40900 (0.0009) [2023-10-14 02:39:52,786][33226] Updated weights for policy 1, policy_version 40910 (0.0007) [2023-10-14 02:39:53,009][33201] Updated weights for policy 0, policy_version 40550 (0.0009) [2023-10-14 02:39:53,153][33226] Updated weights for policy 1, policy_version 40920 (0.0008) [2023-10-14 02:39:53,374][33201] Updated weights for policy 0, policy_version 40560 (0.0009) [2023-10-14 02:39:53,737][33201] Updated weights for policy 0, policy_version 40570 (0.0009) [2023-10-14 02:39:54,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 83460096. Throughput: 0: 1754.3, 1: 1767.0. Samples: 20870532. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:39:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 02:39:57,049][33226] Updated weights for policy 1, policy_version 40930 (0.0009) [2023-10-14 02:39:57,421][33226] Updated weights for policy 1, policy_version 40940 (0.0010) [2023-10-14 02:39:57,600][33201] Updated weights for policy 0, policy_version 40580 (0.0010) [2023-10-14 02:39:57,786][33226] Updated weights for policy 1, policy_version 40950 (0.0007) [2023-10-14 02:39:57,960][33201] Updated weights for policy 0, policy_version 40590 (0.0009) [2023-10-14 02:39:58,154][33226] Updated weights for policy 1, policy_version 40960 (0.0008) [2023-10-14 02:39:58,328][33201] Updated weights for policy 0, policy_version 40600 (0.0009) [2023-10-14 02:39:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 83525632. Throughput: 0: 1779.2, 1: 1799.8. Samples: 20882832. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:39:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 02:40:02,007][33226] Updated weights for policy 1, policy_version 40970 (0.0008) [2023-10-14 02:40:02,253][33201] Updated weights for policy 0, policy_version 40610 (0.0009) [2023-10-14 02:40:02,373][33226] Updated weights for policy 1, policy_version 40980 (0.0008) [2023-10-14 02:40:02,623][33201] Updated weights for policy 0, policy_version 40620 (0.0007) [2023-10-14 02:40:02,740][33226] Updated weights for policy 1, policy_version 40990 (0.0008) [2023-10-14 02:40:02,997][33201] Updated weights for policy 0, policy_version 40630 (0.0007) [2023-10-14 02:40:03,363][33201] Updated weights for policy 0, policy_version 40640 (0.0007) [2023-10-14 02:40:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 83591168. Throughput: 0: 1759.3, 1: 1771.4. Samples: 20902328. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:40:04,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 02:40:06,348][33226] Updated weights for policy 1, policy_version 41000 (0.0009) [2023-10-14 02:40:06,710][33226] Updated weights for policy 1, policy_version 41010 (0.0011) [2023-10-14 02:40:07,074][33226] Updated weights for policy 1, policy_version 41020 (0.0007) [2023-10-14 02:40:07,384][33201] Updated weights for policy 0, policy_version 40650 (0.0010) [2023-10-14 02:40:07,752][33201] Updated weights for policy 0, policy_version 40660 (0.0009) [2023-10-14 02:40:08,123][33201] Updated weights for policy 0, policy_version 40670 (0.0010) [2023-10-14 02:40:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 83656704. Throughput: 0: 1743.0, 1: 1765.4. Samples: 20923936. 
Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:40:09,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 02:40:11,017][33226] Updated weights for policy 1, policy_version 41030 (0.0007) [2023-10-14 02:40:11,385][33226] Updated weights for policy 1, policy_version 41040 (0.0009) [2023-10-14 02:40:11,748][33226] Updated weights for policy 1, policy_version 41050 (0.0009) [2023-10-14 02:40:12,012][33201] Updated weights for policy 0, policy_version 40680 (0.0008) [2023-10-14 02:40:12,384][33201] Updated weights for policy 0, policy_version 40690 (0.0007) [2023-10-14 02:40:12,760][33201] Updated weights for policy 0, policy_version 40700 (0.0008) [2023-10-14 02:40:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 83722240. Throughput: 0: 1768.6, 1: 1763.1. Samples: 20934548. Policy #0 lag: (min: 31.0, avg: 38.5, max: 63.0) [2023-10-14 02:40:14,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 02:40:15,338][33226] Updated weights for policy 1, policy_version 41060 (0.0008) [2023-10-14 02:40:15,711][33226] Updated weights for policy 1, policy_version 41070 (0.0007) [2023-10-14 02:40:16,083][33226] Updated weights for policy 1, policy_version 41080 (0.0007) [2023-10-14 02:40:16,497][33201] Updated weights for policy 0, policy_version 40710 (0.0008) [2023-10-14 02:40:16,867][33201] Updated weights for policy 0, policy_version 40720 (0.0011) [2023-10-14 02:40:17,242][33201] Updated weights for policy 0, policy_version 40730 (0.0010) [2023-10-14 02:40:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 83787776. Throughput: 0: 1740.2, 1: 1767.1. Samples: 20955662. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 02:40:19,900][33226] Updated weights for policy 1, policy_version 41090 (0.0009) [2023-10-14 02:40:20,270][33226] Updated weights for policy 1, policy_version 41100 (0.0007) [2023-10-14 02:40:20,646][33226] Updated weights for policy 1, policy_version 41110 (0.0007) [2023-10-14 02:40:21,010][33226] Updated weights for policy 1, policy_version 41120 (0.0008) [2023-10-14 02:40:21,260][33201] Updated weights for policy 0, policy_version 40740 (0.0008) [2023-10-14 02:40:21,626][33201] Updated weights for policy 0, policy_version 40750 (0.0009) [2023-10-14 02:40:21,996][33201] Updated weights for policy 0, policy_version 40760 (0.0009) [2023-10-14 02:40:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 83853312. Throughput: 0: 1743.7, 1: 1779.1. Samples: 20977628. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 02:40:24,827][33226] Updated weights for policy 1, policy_version 41130 (0.0009) [2023-10-14 02:40:25,194][33226] Updated weights for policy 1, policy_version 41140 (0.0007) [2023-10-14 02:40:25,568][33226] Updated weights for policy 1, policy_version 41150 (0.0009) [2023-10-14 02:40:25,880][33201] Updated weights for policy 0, policy_version 40770 (0.0007) [2023-10-14 02:40:26,258][33201] Updated weights for policy 0, policy_version 40780 (0.0010) [2023-10-14 02:40:26,629][33201] Updated weights for policy 0, policy_version 40790 (0.0009) [2023-10-14 02:40:27,006][33201] Updated weights for policy 0, policy_version 40800 (0.0010) [2023-10-14 02:40:29,324][33226] Updated weights for policy 1, policy_version 41160 (0.0008) [2023-10-14 02:40:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 83918848. Throughput: 0: 1743.0, 1: 1762.6. 
Samples: 20987204. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:29,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 02:40:29,697][33226] Updated weights for policy 1, policy_version 41170 (0.0009) [2023-10-14 02:40:30,075][33226] Updated weights for policy 1, policy_version 41180 (0.0009) [2023-10-14 02:40:30,708][33201] Updated weights for policy 0, policy_version 40810 (0.0007) [2023-10-14 02:40:31,088][33201] Updated weights for policy 0, policy_version 40820 (0.0008) [2023-10-14 02:40:31,446][33201] Updated weights for policy 0, policy_version 40830 (0.0008) [2023-10-14 02:40:33,655][33226] Updated weights for policy 1, policy_version 41190 (0.0008) [2023-10-14 02:40:34,018][33226] Updated weights for policy 1, policy_version 41200 (0.0007) [2023-10-14 02:40:34,379][33226] Updated weights for policy 1, policy_version 41210 (0.0008) [2023-10-14 02:40:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 83984384. Throughput: 0: 1737.4, 1: 1778.3. Samples: 21009378. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:34,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 02:40:35,379][33201] Updated weights for policy 0, policy_version 40840 (0.0008) [2023-10-14 02:40:35,753][33201] Updated weights for policy 0, policy_version 40850 (0.0010) [2023-10-14 02:40:36,122][33201] Updated weights for policy 0, policy_version 40860 (0.0009) [2023-10-14 02:40:38,415][33226] Updated weights for policy 1, policy_version 41220 (0.0008) [2023-10-14 02:40:38,789][33226] Updated weights for policy 1, policy_version 41230 (0.0007) [2023-10-14 02:40:39,150][33226] Updated weights for policy 1, policy_version 41240 (0.0008) [2023-10-14 02:40:39,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84082688. Throughput: 0: 1773.7, 1: 1787.5. Samples: 21030786. 
Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:39,560][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 02:40:39,678][33201] Updated weights for policy 0, policy_version 40870 (0.0010) [2023-10-14 02:40:40,060][33201] Updated weights for policy 0, policy_version 40880 (0.0010) [2023-10-14 02:40:40,431][33201] Updated weights for policy 0, policy_version 40890 (0.0009) [2023-10-14 02:40:43,031][33226] Updated weights for policy 1, policy_version 41250 (0.0007) [2023-10-14 02:40:43,403][33226] Updated weights for policy 1, policy_version 41260 (0.0008) [2023-10-14 02:40:43,769][33226] Updated weights for policy 1, policy_version 41270 (0.0007) [2023-10-14 02:40:44,131][33226] Updated weights for policy 1, policy_version 41280 (0.0008) [2023-10-14 02:40:44,320][33201] Updated weights for policy 0, policy_version 40900 (0.0009) [2023-10-14 02:40:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 84148224. Throughput: 0: 1742.7, 1: 1776.2. Samples: 21041184. Policy #0 lag: (min: 31.0, avg: 38.4, max: 63.0) [2023-10-14 02:40:44,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 02:40:44,702][33201] Updated weights for policy 0, policy_version 40910 (0.0009) [2023-10-14 02:40:45,073][33201] Updated weights for policy 0, policy_version 40920 (0.0009) [2023-10-14 02:40:47,919][33226] Updated weights for policy 1, policy_version 41290 (0.0011) [2023-10-14 02:40:48,280][33226] Updated weights for policy 1, policy_version 41300 (0.0009) [2023-10-14 02:40:48,646][33226] Updated weights for policy 1, policy_version 41310 (0.0010) [2023-10-14 02:40:48,974][33201] Updated weights for policy 0, policy_version 40930 (0.0009) [2023-10-14 02:40:49,342][33201] Updated weights for policy 0, policy_version 40940 (0.0008) [2023-10-14 02:40:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 84213760. Throughput: 0: 1765.7, 1: 1796.2. 
Samples: 21062612. Policy #0 lag: (min: 11.0, avg: 17.7, max: 43.0) [2023-10-14 02:40:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 02:40:49,721][33201] Updated weights for policy 0, policy_version 40950 (0.0010) [2023-10-14 02:40:50,088][33201] Updated weights for policy 0, policy_version 40960 (0.0009) [2023-10-14 02:40:52,307][33226] Updated weights for policy 1, policy_version 41320 (0.0008) [2023-10-14 02:40:52,674][33226] Updated weights for policy 1, policy_version 41330 (0.0009) [2023-10-14 02:40:53,041][33226] Updated weights for policy 1, policy_version 41340 (0.0007) [2023-10-14 02:40:54,067][33201] Updated weights for policy 0, policy_version 40970 (0.0010) [2023-10-14 02:40:54,436][33201] Updated weights for policy 0, policy_version 40980 (0.0007) [2023-10-14 02:40:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 84279296. Throughput: 0: 1766.2, 1: 1778.2. Samples: 21083432. Policy #0 lag: (min: 11.0, avg: 17.7, max: 43.0) [2023-10-14 02:40:54,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.930')] [2023-10-14 02:40:54,814][33201] Updated weights for policy 0, policy_version 40990 (0.0007) [2023-10-14 02:40:56,848][33226] Updated weights for policy 1, policy_version 41350 (0.0009) [2023-10-14 02:40:57,210][33226] Updated weights for policy 1, policy_version 41360 (0.0009) [2023-10-14 02:40:57,576][33226] Updated weights for policy 1, policy_version 41370 (0.0009) [2023-10-14 02:40:58,615][33201] Updated weights for policy 0, policy_version 41000 (0.0008) [2023-10-14 02:40:58,994][33201] Updated weights for policy 0, policy_version 41010 (0.0007) [2023-10-14 02:40:59,364][33201] Updated weights for policy 0, policy_version 41020 (0.0007) [2023-10-14 02:40:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84377600. Throughput: 0: 1750.1, 1: 1807.3. Samples: 21094626. 
Policy #0 lag: (min: 11.0, avg: 17.7, max: 43.0) [2023-10-14 02:40:59,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.930')] [2023-10-14 02:41:01,220][33226] Updated weights for policy 1, policy_version 41380 (0.0008) [2023-10-14 02:41:01,580][33226] Updated weights for policy 1, policy_version 41390 (0.0007) [2023-10-14 02:41:01,945][33226] Updated weights for policy 1, policy_version 41400 (0.0007) [2023-10-14 02:41:03,196][33201] Updated weights for policy 0, policy_version 41030 (0.0007) [2023-10-14 02:41:03,566][33201] Updated weights for policy 0, policy_version 41040 (0.0007) [2023-10-14 02:41:03,930][33201] Updated weights for policy 0, policy_version 41050 (0.0008) [2023-10-14 02:41:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84443136. Throughput: 0: 1776.2, 1: 1784.4. Samples: 21115888. Policy #0 lag: (min: 11.0, avg: 17.7, max: 43.0) [2023-10-14 02:41:04,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 02:41:05,656][33226] Updated weights for policy 1, policy_version 41410 (0.0008) [2023-10-14 02:41:06,026][33226] Updated weights for policy 1, policy_version 41420 (0.0008) [2023-10-14 02:41:06,389][33226] Updated weights for policy 1, policy_version 41430 (0.0008) [2023-10-14 02:41:06,763][33226] Updated weights for policy 1, policy_version 41440 (0.0007) [2023-10-14 02:41:07,772][33201] Updated weights for policy 0, policy_version 41060 (0.0009) [2023-10-14 02:41:08,146][33201] Updated weights for policy 0, policy_version 41070 (0.0007) [2023-10-14 02:41:08,529][33201] Updated weights for policy 0, policy_version 41080 (0.0008) [2023-10-14 02:41:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 84508672. Throughput: 0: 1750.3, 1: 1788.8. Samples: 21136888. 
Policy #0 lag: (min: 11.0, avg: 17.7, max: 43.0) [2023-10-14 02:41:09,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.930')] [2023-10-14 02:41:10,634][33226] Updated weights for policy 1, policy_version 41450 (0.0008) [2023-10-14 02:41:10,999][33226] Updated weights for policy 1, policy_version 41460 (0.0008) [2023-10-14 02:41:11,361][33226] Updated weights for policy 1, policy_version 41470 (0.0009) [2023-10-14 02:41:12,173][33201] Updated weights for policy 0, policy_version 41090 (0.0007) [2023-10-14 02:41:12,546][33201] Updated weights for policy 0, policy_version 41100 (0.0009) [2023-10-14 02:41:12,924][33201] Updated weights for policy 0, policy_version 41110 (0.0008) [2023-10-14 02:41:13,305][33201] Updated weights for policy 0, policy_version 41120 (0.0009) [2023-10-14 02:41:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84574208. Throughput: 0: 1787.3, 1: 1786.4. Samples: 21148020. Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:14,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.930')] [2023-10-14 02:41:15,336][33226] Updated weights for policy 1, policy_version 41480 (0.0008) [2023-10-14 02:41:15,711][33226] Updated weights for policy 1, policy_version 41490 (0.0007) [2023-10-14 02:41:16,073][33226] Updated weights for policy 1, policy_version 41500 (0.0009) [2023-10-14 02:41:17,192][33201] Updated weights for policy 0, policy_version 41130 (0.0009) [2023-10-14 02:41:17,558][33201] Updated weights for policy 0, policy_version 41140 (0.0011) [2023-10-14 02:41:17,933][33201] Updated weights for policy 0, policy_version 41150 (0.0010) [2023-10-14 02:41:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84639744. Throughput: 0: 1750.7, 1: 1776.7. Samples: 21168112. 
Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:19,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.940')] [2023-10-14 02:41:19,718][33226] Updated weights for policy 1, policy_version 41510 (0.0007) [2023-10-14 02:41:20,085][33226] Updated weights for policy 1, policy_version 41520 (0.0007) [2023-10-14 02:41:20,454][33226] Updated weights for policy 1, policy_version 41530 (0.0007) [2023-10-14 02:41:21,823][33201] Updated weights for policy 0, policy_version 41160 (0.0012) [2023-10-14 02:41:22,191][33201] Updated weights for policy 0, policy_version 41170 (0.0008) [2023-10-14 02:41:22,559][33201] Updated weights for policy 0, policy_version 41180 (0.0007) [2023-10-14 02:41:24,129][33226] Updated weights for policy 1, policy_version 41540 (0.0009) [2023-10-14 02:41:24,500][33226] Updated weights for policy 1, policy_version 41550 (0.0010) [2023-10-14 02:41:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 84705280. Throughput: 0: 1745.4, 1: 1794.0. Samples: 21190062. Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:24,559][31953] Avg episode reward: [(0, '20.770'), (1, '20.940')] [2023-10-14 02:41:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000041184_42172416.pth... [2023-10-14 02:41:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000039552_40501248.pth [2023-10-14 02:41:24,612][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000041184_42172416.pth [2023-10-14 02:41:24,871][33226] Updated weights for policy 1, policy_version 41560 (0.0011) [2023-10-14 02:41:25,163][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000041568_42565632.pth... 
[2023-10-14 02:41:25,191][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000039872_40828928.pth [2023-10-14 02:41:25,195][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000041568_42565632.pth [2023-10-14 02:41:26,413][33201] Updated weights for policy 0, policy_version 41190 (0.0008) [2023-10-14 02:41:26,792][33201] Updated weights for policy 0, policy_version 41200 (0.0009) [2023-10-14 02:41:27,161][33201] Updated weights for policy 0, policy_version 41210 (0.0008) [2023-10-14 02:41:28,660][33226] Updated weights for policy 1, policy_version 41570 (0.0010) [2023-10-14 02:41:29,023][33226] Updated weights for policy 1, policy_version 41580 (0.0009) [2023-10-14 02:41:29,396][33226] Updated weights for policy 1, policy_version 41590 (0.0010) [2023-10-14 02:41:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 84770816. Throughput: 0: 1756.9, 1: 1773.4. Samples: 21200048. Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:29,558][31953] Avg episode reward: [(0, '20.380'), (1, '20.940')] [2023-10-14 02:41:29,767][33226] Updated weights for policy 1, policy_version 41600 (0.0010) [2023-10-14 02:41:31,088][33201] Updated weights for policy 0, policy_version 41220 (0.0009) [2023-10-14 02:41:31,458][33201] Updated weights for policy 0, policy_version 41230 (0.0008) [2023-10-14 02:41:31,832][33201] Updated weights for policy 0, policy_version 41240 (0.0009) [2023-10-14 02:41:33,761][33226] Updated weights for policy 1, policy_version 41610 (0.0009) [2023-10-14 02:41:34,135][33226] Updated weights for policy 1, policy_version 41620 (0.0009) [2023-10-14 02:41:34,511][33226] Updated weights for policy 1, policy_version 41630 (0.0007) [2023-10-14 02:41:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 84836352. Throughput: 0: 1749.0, 1: 1786.1. Samples: 21221690. 
Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:34,558][31953] Avg episode reward: [(0, '20.380'), (1, '20.940')] [2023-10-14 02:41:35,679][33201] Updated weights for policy 0, policy_version 41250 (0.0008) [2023-10-14 02:41:36,074][33201] Updated weights for policy 0, policy_version 41260 (0.0008) [2023-10-14 02:41:36,448][33201] Updated weights for policy 0, policy_version 41270 (0.0007) [2023-10-14 02:41:36,814][33201] Updated weights for policy 0, policy_version 41280 (0.0008) [2023-10-14 02:41:38,279][33226] Updated weights for policy 1, policy_version 41640 (0.0008) [2023-10-14 02:41:38,639][33226] Updated weights for policy 1, policy_version 41650 (0.0007) [2023-10-14 02:41:39,005][33226] Updated weights for policy 1, policy_version 41660 (0.0008) [2023-10-14 02:41:39,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 84934656. Throughput: 0: 1768.5, 1: 1779.0. Samples: 21243068. Policy #0 lag: (min: 17.0, avg: 45.5, max: 48.0) [2023-10-14 02:41:39,558][31953] Avg episode reward: [(0, '20.300'), (1, '20.930')] [2023-10-14 02:41:40,566][33201] Updated weights for policy 0, policy_version 41290 (0.0007) [2023-10-14 02:41:40,939][33201] Updated weights for policy 0, policy_version 41300 (0.0007) [2023-10-14 02:41:41,302][33201] Updated weights for policy 0, policy_version 41310 (0.0008) [2023-10-14 02:41:42,729][33226] Updated weights for policy 1, policy_version 41670 (0.0007) [2023-10-14 02:41:43,096][33226] Updated weights for policy 1, policy_version 41680 (0.0008) [2023-10-14 02:41:43,467][33226] Updated weights for policy 1, policy_version 41690 (0.0008) [2023-10-14 02:41:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 85000192. Throughput: 0: 1757.5, 1: 1783.6. Samples: 21253976. 
Policy #0 lag: (min: 5.0, avg: 12.3, max: 37.0) [2023-10-14 02:41:44,558][31953] Avg episode reward: [(0, '20.300'), (1, '20.930')] [2023-10-14 02:41:45,181][33201] Updated weights for policy 0, policy_version 41320 (0.0010) [2023-10-14 02:41:45,555][33201] Updated weights for policy 0, policy_version 41330 (0.0010) [2023-10-14 02:41:45,926][33201] Updated weights for policy 0, policy_version 41340 (0.0008) [2023-10-14 02:41:47,209][33226] Updated weights for policy 1, policy_version 41700 (0.0008) [2023-10-14 02:41:47,578][33226] Updated weights for policy 1, policy_version 41710 (0.0009) [2023-10-14 02:41:47,947][33226] Updated weights for policy 1, policy_version 41720 (0.0008) [2023-10-14 02:41:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 85065728. Throughput: 0: 1755.1, 1: 1778.9. Samples: 21274918. Policy #0 lag: (min: 5.0, avg: 12.3, max: 37.0) [2023-10-14 02:41:49,558][31953] Avg episode reward: [(0, '20.320'), (1, '20.930')] [2023-10-14 02:41:49,612][33201] Updated weights for policy 0, policy_version 41350 (0.0007) [2023-10-14 02:41:49,982][33201] Updated weights for policy 0, policy_version 41360 (0.0007) [2023-10-14 02:41:50,353][33201] Updated weights for policy 0, policy_version 41370 (0.0007) [2023-10-14 02:41:51,645][33226] Updated weights for policy 1, policy_version 41730 (0.0008) [2023-10-14 02:41:52,015][33226] Updated weights for policy 1, policy_version 41740 (0.0008) [2023-10-14 02:41:52,376][33226] Updated weights for policy 1, policy_version 41750 (0.0007) [2023-10-14 02:41:52,742][33226] Updated weights for policy 1, policy_version 41760 (0.0008) [2023-10-14 02:41:54,076][33201] Updated weights for policy 0, policy_version 41380 (0.0008) [2023-10-14 02:41:54,445][33201] Updated weights for policy 0, policy_version 41390 (0.0009) [2023-10-14 02:41:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 85131264. Throughput: 0: 1782.8, 1: 1766.7. 
Samples: 21296616. Policy #0 lag: (min: 5.0, avg: 12.3, max: 37.0) [2023-10-14 02:41:54,558][31953] Avg episode reward: [(0, '20.320'), (1, '20.930')] [2023-10-14 02:41:54,820][33201] Updated weights for policy 0, policy_version 41400 (0.0009) [2023-10-14 02:41:56,571][33226] Updated weights for policy 1, policy_version 41770 (0.0009) [2023-10-14 02:41:56,930][33226] Updated weights for policy 1, policy_version 41780 (0.0007) [2023-10-14 02:41:57,303][33226] Updated weights for policy 1, policy_version 41790 (0.0007) [2023-10-14 02:41:58,602][33201] Updated weights for policy 0, policy_version 41410 (0.0008) [2023-10-14 02:41:58,972][33201] Updated weights for policy 0, policy_version 41420 (0.0008) [2023-10-14 02:41:59,343][33201] Updated weights for policy 0, policy_version 41430 (0.0007) [2023-10-14 02:41:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 85196800. Throughput: 0: 1752.4, 1: 1784.2. Samples: 21307168. Policy #0 lag: (min: 5.0, avg: 12.3, max: 37.0) [2023-10-14 02:41:59,558][31953] Avg episode reward: [(0, '20.340'), (1, '20.930')] [2023-10-14 02:41:59,711][33201] Updated weights for policy 0, policy_version 41440 (0.0008) [2023-10-14 02:42:01,035][33226] Updated weights for policy 1, policy_version 41800 (0.0008) [2023-10-14 02:42:01,401][33226] Updated weights for policy 1, policy_version 41810 (0.0007) [2023-10-14 02:42:01,775][33226] Updated weights for policy 1, policy_version 41820 (0.0009) [2023-10-14 02:42:03,499][33201] Updated weights for policy 0, policy_version 41450 (0.0009) [2023-10-14 02:42:03,862][33201] Updated weights for policy 0, policy_version 41460 (0.0008) [2023-10-14 02:42:04,230][33201] Updated weights for policy 0, policy_version 41470 (0.0008) [2023-10-14 02:42:04,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 85295104. Throughput: 0: 1793.3, 1: 1777.5. Samples: 21328798. 
Policy #0 lag: (min: 5.0, avg: 12.3, max: 37.0) [2023-10-14 02:42:04,557][31953] Avg episode reward: [(0, '20.340'), (1, '20.930')] [2023-10-14 02:42:05,789][33226] Updated weights for policy 1, policy_version 41830 (0.0009) [2023-10-14 02:42:06,149][33226] Updated weights for policy 1, policy_version 41840 (0.0009) [2023-10-14 02:42:06,524][33226] Updated weights for policy 1, policy_version 41850 (0.0009) [2023-10-14 02:42:08,058][33201] Updated weights for policy 0, policy_version 41480 (0.0008) [2023-10-14 02:42:08,428][33201] Updated weights for policy 0, policy_version 41490 (0.0010) [2023-10-14 02:42:08,804][33201] Updated weights for policy 0, policy_version 41500 (0.0008) [2023-10-14 02:42:09,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 85360640. Throughput: 0: 1765.5, 1: 1777.3. Samples: 21349484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:09,558][31953] Avg episode reward: [(0, '20.340'), (1, '20.930')] [2023-10-14 02:42:10,338][33226] Updated weights for policy 1, policy_version 41860 (0.0008) [2023-10-14 02:42:10,701][33226] Updated weights for policy 1, policy_version 41870 (0.0009) [2023-10-14 02:42:11,071][33226] Updated weights for policy 1, policy_version 41880 (0.0008) [2023-10-14 02:42:12,419][33201] Updated weights for policy 0, policy_version 41510 (0.0007) [2023-10-14 02:42:12,788][33201] Updated weights for policy 0, policy_version 41520 (0.0008) [2023-10-14 02:42:13,160][33201] Updated weights for policy 0, policy_version 41530 (0.0008) [2023-10-14 02:42:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 85426176. Throughput: 0: 1791.9, 1: 1779.7. Samples: 21360770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:14,558][31953] Avg episode reward: [(0, '20.340'), (1, '20.920')] [2023-10-14 02:42:14,784][33226] Updated weights for policy 1, policy_version 41890 (0.0009) [2023-10-14 02:42:15,147][33226] Updated weights for policy 1, policy_version 41900 (0.0007) [2023-10-14 02:42:15,516][33226] Updated weights for policy 1, policy_version 41910 (0.0009) [2023-10-14 02:42:15,895][33226] Updated weights for policy 1, policy_version 41920 (0.0008) [2023-10-14 02:42:16,955][33201] Updated weights for policy 0, policy_version 41540 (0.0009) [2023-10-14 02:42:17,322][33201] Updated weights for policy 0, policy_version 41550 (0.0008) [2023-10-14 02:42:17,695][33201] Updated weights for policy 0, policy_version 41560 (0.0008) [2023-10-14 02:42:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 85491712. Throughput: 0: 1773.7, 1: 1785.0. Samples: 21381830. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:19,558][31953] Avg episode reward: [(0, '20.340'), (1, '20.920')] [2023-10-14 02:42:19,676][33226] Updated weights for policy 1, policy_version 41930 (0.0009) [2023-10-14 02:42:20,029][33226] Updated weights for policy 1, policy_version 41940 (0.0008) [2023-10-14 02:42:20,407][33226] Updated weights for policy 1, policy_version 41950 (0.0008) [2023-10-14 02:42:21,656][33201] Updated weights for policy 0, policy_version 41570 (0.0009) [2023-10-14 02:42:22,081][33201] Updated weights for policy 0, policy_version 41580 (0.0007) [2023-10-14 02:42:22,452][33201] Updated weights for policy 0, policy_version 41590 (0.0008) [2023-10-14 02:42:22,814][33201] Updated weights for policy 0, policy_version 41600 (0.0007) [2023-10-14 02:42:24,020][33226] Updated weights for policy 1, policy_version 41960 (0.0009) [2023-10-14 02:42:24,389][33226] Updated weights for policy 1, policy_version 41970 (0.0007) [2023-10-14 02:42:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 
14199.5, 300 sec: 14218.0). Total num frames: 85557248. Throughput: 0: 1767.3, 1: 1803.9. Samples: 21403772. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:24,558][31953] Avg episode reward: [(0, '20.390'), (1, '20.920')] [2023-10-14 02:42:24,759][33226] Updated weights for policy 1, policy_version 41980 (0.0007) [2023-10-14 02:42:26,456][33201] Updated weights for policy 0, policy_version 41610 (0.0008) [2023-10-14 02:42:26,828][33201] Updated weights for policy 0, policy_version 41620 (0.0008) [2023-10-14 02:42:27,191][33201] Updated weights for policy 0, policy_version 41630 (0.0007) [2023-10-14 02:42:28,571][33226] Updated weights for policy 1, policy_version 41990 (0.0007) [2023-10-14 02:42:28,942][33226] Updated weights for policy 1, policy_version 42000 (0.0009) [2023-10-14 02:42:29,322][33226] Updated weights for policy 1, policy_version 42010 (0.0008) [2023-10-14 02:42:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 85655552. Throughput: 0: 1779.1, 1: 1781.4. Samples: 21414196. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:29,558][31953] Avg episode reward: [(0, '20.500'), (1, '20.920')] [2023-10-14 02:42:31,057][33201] Updated weights for policy 0, policy_version 41640 (0.0008) [2023-10-14 02:42:31,431][33201] Updated weights for policy 0, policy_version 41650 (0.0009) [2023-10-14 02:42:31,793][33201] Updated weights for policy 0, policy_version 41660 (0.0010) [2023-10-14 02:42:33,129][33226] Updated weights for policy 1, policy_version 42020 (0.0008) [2023-10-14 02:42:33,499][33226] Updated weights for policy 1, policy_version 42030 (0.0009) [2023-10-14 02:42:33,865][33226] Updated weights for policy 1, policy_version 42040 (0.0009) [2023-10-14 02:42:34,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 85721088. Throughput: 0: 1774.1, 1: 1805.6. Samples: 21436002. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:42:34,557][31953] Avg episode reward: [(0, '20.510'), (1, '20.920')] [2023-10-14 02:42:35,681][33201] Updated weights for policy 0, policy_version 41670 (0.0007) [2023-10-14 02:42:36,045][33201] Updated weights for policy 0, policy_version 41680 (0.0009) [2023-10-14 02:42:36,420][33201] Updated weights for policy 0, policy_version 41690 (0.0009) [2023-10-14 02:42:37,560][33226] Updated weights for policy 1, policy_version 42050 (0.0008) [2023-10-14 02:42:37,917][33226] Updated weights for policy 1, policy_version 42060 (0.0009) [2023-10-14 02:42:38,283][33226] Updated weights for policy 1, policy_version 42070 (0.0008) [2023-10-14 02:42:38,650][33226] Updated weights for policy 1, policy_version 42080 (0.0008) [2023-10-14 02:42:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 85786624. Throughput: 0: 1778.2, 1: 1783.4. Samples: 21456886. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:42:39,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.960')] [2023-10-14 02:42:40,203][33201] Updated weights for policy 0, policy_version 41700 (0.0010) [2023-10-14 02:42:40,566][33201] Updated weights for policy 0, policy_version 41710 (0.0008) [2023-10-14 02:42:40,935][33201] Updated weights for policy 0, policy_version 41720 (0.0010) [2023-10-14 02:42:42,504][33226] Updated weights for policy 1, policy_version 42090 (0.0009) [2023-10-14 02:42:42,873][33226] Updated weights for policy 1, policy_version 42100 (0.0008) [2023-10-14 02:42:43,247][33226] Updated weights for policy 1, policy_version 42110 (0.0008) [2023-10-14 02:42:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 85852160. Throughput: 0: 1768.6, 1: 1800.5. Samples: 21467780. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:42:44,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.960')] [2023-10-14 02:42:44,776][33201] Updated weights for policy 0, policy_version 41730 (0.0009) [2023-10-14 02:42:45,143][33201] Updated weights for policy 0, policy_version 41740 (0.0009) [2023-10-14 02:42:45,514][33201] Updated weights for policy 0, policy_version 41750 (0.0007) [2023-10-14 02:42:45,882][33201] Updated weights for policy 0, policy_version 41760 (0.0007) [2023-10-14 02:42:46,964][33226] Updated weights for policy 1, policy_version 42120 (0.0008) [2023-10-14 02:42:47,340][33226] Updated weights for policy 1, policy_version 42130 (0.0008) [2023-10-14 02:42:47,702][33226] Updated weights for policy 1, policy_version 42140 (0.0010) [2023-10-14 02:42:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 85917696. Throughput: 0: 1761.9, 1: 1783.6. Samples: 21488342. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:42:49,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.960')] [2023-10-14 02:42:49,866][33201] Updated weights for policy 0, policy_version 41770 (0.0010) [2023-10-14 02:42:50,235][33201] Updated weights for policy 0, policy_version 41780 (0.0007) [2023-10-14 02:42:50,619][33201] Updated weights for policy 0, policy_version 41790 (0.0007) [2023-10-14 02:42:51,536][33226] Updated weights for policy 1, policy_version 42150 (0.0008) [2023-10-14 02:42:51,927][33226] Updated weights for policy 1, policy_version 42160 (0.0008) [2023-10-14 02:42:52,296][33226] Updated weights for policy 1, policy_version 42170 (0.0008) [2023-10-14 02:42:54,315][33201] Updated weights for policy 0, policy_version 41800 (0.0008) [2023-10-14 02:42:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 85983232. Throughput: 0: 1785.2, 1: 1784.7. Samples: 21510130. 
Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:42:54,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.930')] [2023-10-14 02:42:54,683][33201] Updated weights for policy 0, policy_version 41810 (0.0007) [2023-10-14 02:42:55,049][33201] Updated weights for policy 0, policy_version 41820 (0.0008) [2023-10-14 02:42:55,981][33226] Updated weights for policy 1, policy_version 42180 (0.0009) [2023-10-14 02:42:56,346][33226] Updated weights for policy 1, policy_version 42190 (0.0010) [2023-10-14 02:42:56,721][33226] Updated weights for policy 1, policy_version 42200 (0.0008) [2023-10-14 02:42:58,858][33201] Updated weights for policy 0, policy_version 41830 (0.0008) [2023-10-14 02:42:59,237][33201] Updated weights for policy 0, policy_version 41840 (0.0007) [2023-10-14 02:42:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86048768. Throughput: 0: 1750.4, 1: 1787.8. Samples: 21519990. Policy #0 lag: (min: 31.0, avg: 36.5, max: 63.0) [2023-10-14 02:42:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.930')] [2023-10-14 02:42:59,603][33201] Updated weights for policy 0, policy_version 41850 (0.0007) [2023-10-14 02:43:00,564][33226] Updated weights for policy 1, policy_version 42210 (0.0008) [2023-10-14 02:43:00,930][33226] Updated weights for policy 1, policy_version 42220 (0.0007) [2023-10-14 02:43:01,298][33226] Updated weights for policy 1, policy_version 42230 (0.0008) [2023-10-14 02:43:01,669][33226] Updated weights for policy 1, policy_version 42240 (0.0010) [2023-10-14 02:43:03,373][33201] Updated weights for policy 0, policy_version 41860 (0.0009) [2023-10-14 02:43:03,743][33201] Updated weights for policy 0, policy_version 41870 (0.0009) [2023-10-14 02:43:04,108][33201] Updated weights for policy 0, policy_version 41880 (0.0009) [2023-10-14 02:43:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 86147072. Throughput: 0: 1785.3, 1: 1776.3. 
Samples: 21542100. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:04,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:43:05,338][33226] Updated weights for policy 1, policy_version 42250 (0.0008) [2023-10-14 02:43:05,705][33226] Updated weights for policy 1, policy_version 42260 (0.0009) [2023-10-14 02:43:06,073][33226] Updated weights for policy 1, policy_version 42270 (0.0011) [2023-10-14 02:43:07,962][33201] Updated weights for policy 0, policy_version 41890 (0.0010) [2023-10-14 02:43:08,349][33201] Updated weights for policy 0, policy_version 41900 (0.0008) [2023-10-14 02:43:08,725][33201] Updated weights for policy 0, policy_version 41910 (0.0007) [2023-10-14 02:43:09,098][33201] Updated weights for policy 0, policy_version 41920 (0.0007) [2023-10-14 02:43:09,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 86212608. Throughput: 0: 1753.6, 1: 1784.5. Samples: 21562986. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:09,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:43:09,779][33226] Updated weights for policy 1, policy_version 42280 (0.0009) [2023-10-14 02:43:10,154][33226] Updated weights for policy 1, policy_version 42290 (0.0007) [2023-10-14 02:43:10,520][33226] Updated weights for policy 1, policy_version 42300 (0.0007) [2023-10-14 02:43:13,077][33201] Updated weights for policy 0, policy_version 41930 (0.0007) [2023-10-14 02:43:13,443][33201] Updated weights for policy 0, policy_version 41940 (0.0007) [2023-10-14 02:43:13,819][33201] Updated weights for policy 0, policy_version 41950 (0.0008) [2023-10-14 02:43:14,389][33226] Updated weights for policy 1, policy_version 42310 (0.0007) [2023-10-14 02:43:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 86278144. Throughput: 0: 1771.7, 1: 1772.1. Samples: 21573668. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:43:14,754][33226] Updated weights for policy 1, policy_version 42320 (0.0009) [2023-10-14 02:43:15,128][33226] Updated weights for policy 1, policy_version 42330 (0.0009) [2023-10-14 02:43:17,697][33201] Updated weights for policy 0, policy_version 41960 (0.0008) [2023-10-14 02:43:18,076][33201] Updated weights for policy 0, policy_version 41970 (0.0009) [2023-10-14 02:43:18,447][33201] Updated weights for policy 0, policy_version 41980 (0.0007) [2023-10-14 02:43:18,926][33226] Updated weights for policy 1, policy_version 42340 (0.0009) [2023-10-14 02:43:19,292][33226] Updated weights for policy 1, policy_version 42350 (0.0009) [2023-10-14 02:43:19,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 86343680. Throughput: 0: 1761.5, 1: 1774.3. Samples: 21595114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:19,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:43:19,654][33226] Updated weights for policy 1, policy_version 42360 (0.0009) [2023-10-14 02:43:22,346][33201] Updated weights for policy 0, policy_version 41990 (0.0008) [2023-10-14 02:43:22,717][33201] Updated weights for policy 0, policy_version 42000 (0.0008) [2023-10-14 02:43:23,088][33201] Updated weights for policy 0, policy_version 42010 (0.0007) [2023-10-14 02:43:23,535][33226] Updated weights for policy 1, policy_version 42370 (0.0008) [2023-10-14 02:43:23,908][33226] Updated weights for policy 1, policy_version 42380 (0.0008) [2023-10-14 02:43:24,272][33226] Updated weights for policy 1, policy_version 42390 (0.0008) [2023-10-14 02:43:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 86409216. Throughput: 0: 1741.3, 1: 1795.2. Samples: 21616028. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 02:43:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000042016_43024384.pth... [2023-10-14 02:43:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000040352_41320448.pth [2023-10-14 02:43:24,630][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000042400_43417600.pth... [2023-10-14 02:43:24,631][33226] Updated weights for policy 1, policy_version 42400 (0.0009) [2023-10-14 02:43:24,658][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000040736_41713664.pth [2023-10-14 02:43:27,070][33201] Updated weights for policy 0, policy_version 42020 (0.0008) [2023-10-14 02:43:27,441][33201] Updated weights for policy 0, policy_version 42030 (0.0007) [2023-10-14 02:43:27,821][33201] Updated weights for policy 0, policy_version 42040 (0.0009) [2023-10-14 02:43:28,327][33226] Updated weights for policy 1, policy_version 42410 (0.0008) [2023-10-14 02:43:28,698][33226] Updated weights for policy 1, policy_version 42420 (0.0008) [2023-10-14 02:43:29,070][33226] Updated weights for policy 1, policy_version 42430 (0.0008) [2023-10-14 02:43:29,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 86507520. Throughput: 0: 1766.8, 1: 1780.1. Samples: 21627390. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:43:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 02:43:31,532][33201] Updated weights for policy 0, policy_version 42050 (0.0007) [2023-10-14 02:43:31,900][33201] Updated weights for policy 0, policy_version 42060 (0.0008) [2023-10-14 02:43:32,277][33201] Updated weights for policy 0, policy_version 42070 (0.0008) [2023-10-14 02:43:32,647][33201] Updated weights for policy 0, policy_version 42080 (0.0007) [2023-10-14 02:43:32,793][33226] Updated weights for policy 1, policy_version 42440 (0.0008) [2023-10-14 02:43:33,155][33226] Updated weights for policy 1, policy_version 42450 (0.0011) [2023-10-14 02:43:33,526][33226] Updated weights for policy 1, policy_version 42460 (0.0009) [2023-10-14 02:43:34,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 86573056. Throughput: 0: 1744.6, 1: 1799.0. Samples: 21647806. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 02:43:36,414][33201] Updated weights for policy 0, policy_version 42090 (0.0008) [2023-10-14 02:43:36,790][33201] Updated weights for policy 0, policy_version 42100 (0.0008) [2023-10-14 02:43:37,158][33201] Updated weights for policy 0, policy_version 42110 (0.0009) [2023-10-14 02:43:37,432][33226] Updated weights for policy 1, policy_version 42470 (0.0009) [2023-10-14 02:43:37,805][33226] Updated weights for policy 1, policy_version 42480 (0.0010) [2023-10-14 02:43:38,170][33226] Updated weights for policy 1, policy_version 42490 (0.0007) [2023-10-14 02:43:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86638592. Throughput: 0: 1754.4, 1: 1778.7. Samples: 21669120. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.830')] [2023-10-14 02:43:41,010][33201] Updated weights for policy 0, policy_version 42120 (0.0008) [2023-10-14 02:43:41,386][33201] Updated weights for policy 0, policy_version 42130 (0.0010) [2023-10-14 02:43:41,752][33201] Updated weights for policy 0, policy_version 42140 (0.0007) [2023-10-14 02:43:41,909][33226] Updated weights for policy 1, policy_version 42500 (0.0008) [2023-10-14 02:43:42,280][33226] Updated weights for policy 1, policy_version 42510 (0.0007) [2023-10-14 02:43:42,647][33226] Updated weights for policy 1, policy_version 42520 (0.0008) [2023-10-14 02:43:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86704128. Throughput: 0: 1750.0, 1: 1798.8. Samples: 21679682. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:44,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.800')] [2023-10-14 02:43:45,388][33201] Updated weights for policy 0, policy_version 42150 (0.0007) [2023-10-14 02:43:45,757][33201] Updated weights for policy 0, policy_version 42160 (0.0007) [2023-10-14 02:43:46,127][33201] Updated weights for policy 0, policy_version 42170 (0.0007) [2023-10-14 02:43:46,506][33226] Updated weights for policy 1, policy_version 42530 (0.0007) [2023-10-14 02:43:46,883][33226] Updated weights for policy 1, policy_version 42540 (0.0011) [2023-10-14 02:43:47,242][33226] Updated weights for policy 1, policy_version 42550 (0.0011) [2023-10-14 02:43:47,616][33226] Updated weights for policy 1, policy_version 42560 (0.0007) [2023-10-14 02:43:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86769664. Throughput: 0: 1749.2, 1: 1773.1. Samples: 21700602. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:49,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.780')] [2023-10-14 02:43:49,962][33201] Updated weights for policy 0, policy_version 42180 (0.0008) [2023-10-14 02:43:50,333][33201] Updated weights for policy 0, policy_version 42190 (0.0007) [2023-10-14 02:43:50,706][33201] Updated weights for policy 0, policy_version 42200 (0.0008) [2023-10-14 02:43:51,267][33226] Updated weights for policy 1, policy_version 42570 (0.0008) [2023-10-14 02:43:51,636][33226] Updated weights for policy 1, policy_version 42580 (0.0008) [2023-10-14 02:43:52,006][33226] Updated weights for policy 1, policy_version 42590 (0.0008) [2023-10-14 02:43:54,455][33201] Updated weights for policy 0, policy_version 42210 (0.0008) [2023-10-14 02:43:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86835200. Throughput: 0: 1782.3, 1: 1772.2. Samples: 21722942. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:54,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.730')] [2023-10-14 02:43:54,870][33201] Updated weights for policy 0, policy_version 42220 (0.0009) [2023-10-14 02:43:55,246][33201] Updated weights for policy 0, policy_version 42230 (0.0008) [2023-10-14 02:43:55,622][33201] Updated weights for policy 0, policy_version 42240 (0.0008) [2023-10-14 02:43:55,802][33226] Updated weights for policy 1, policy_version 42600 (0.0007) [2023-10-14 02:43:56,154][33226] Updated weights for policy 1, policy_version 42610 (0.0007) [2023-10-14 02:43:56,527][33226] Updated weights for policy 1, policy_version 42620 (0.0007) [2023-10-14 02:43:59,462][33201] Updated weights for policy 0, policy_version 42250 (0.0010) [2023-10-14 02:43:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 86900736. Throughput: 0: 1750.7, 1: 1778.7. Samples: 21732490. 
Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:43:59,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.720')] [2023-10-14 02:43:59,839][33201] Updated weights for policy 0, policy_version 42260 (0.0010) [2023-10-14 02:44:00,219][33201] Updated weights for policy 0, policy_version 42270 (0.0007) [2023-10-14 02:44:00,241][33226] Updated weights for policy 1, policy_version 42630 (0.0007) [2023-10-14 02:44:00,605][33226] Updated weights for policy 1, policy_version 42640 (0.0008) [2023-10-14 02:44:00,964][33226] Updated weights for policy 1, policy_version 42650 (0.0009) [2023-10-14 02:44:04,122][33201] Updated weights for policy 0, policy_version 42280 (0.0009) [2023-10-14 02:44:04,486][33201] Updated weights for policy 0, policy_version 42290 (0.0008) [2023-10-14 02:44:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 86966272. Throughput: 0: 1765.4, 1: 1779.2. Samples: 21754622. Policy #0 lag: (min: 31.0, avg: 37.7, max: 63.0) [2023-10-14 02:44:04,557][31953] Avg episode reward: [(0, '20.820'), (1, '20.700')] [2023-10-14 02:44:04,789][33226] Updated weights for policy 1, policy_version 42660 (0.0010) [2023-10-14 02:44:04,860][33201] Updated weights for policy 0, policy_version 42300 (0.0010) [2023-10-14 02:44:05,156][33226] Updated weights for policy 1, policy_version 42670 (0.0007) [2023-10-14 02:44:05,527][33226] Updated weights for policy 1, policy_version 42680 (0.0009) [2023-10-14 02:44:08,810][33201] Updated weights for policy 0, policy_version 42310 (0.0009) [2023-10-14 02:44:09,191][33201] Updated weights for policy 0, policy_version 42320 (0.0007) [2023-10-14 02:44:09,412][33226] Updated weights for policy 1, policy_version 42690 (0.0008) [2023-10-14 02:44:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 87031808. Throughput: 0: 1769.1, 1: 1787.8. Samples: 21776090. 
Policy #0 lag: (min: 9.0, avg: 16.0, max: 41.0) [2023-10-14 02:44:09,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.700')] [2023-10-14 02:44:09,560][33201] Updated weights for policy 0, policy_version 42330 (0.0007) [2023-10-14 02:44:09,770][33226] Updated weights for policy 1, policy_version 42700 (0.0007) [2023-10-14 02:44:10,138][33226] Updated weights for policy 1, policy_version 42710 (0.0008) [2023-10-14 02:44:10,501][33226] Updated weights for policy 1, policy_version 42720 (0.0008) [2023-10-14 02:44:13,448][33201] Updated weights for policy 0, policy_version 42340 (0.0009) [2023-10-14 02:44:13,814][33201] Updated weights for policy 0, policy_version 42350 (0.0009) [2023-10-14 02:44:14,190][33201] Updated weights for policy 0, policy_version 42360 (0.0008) [2023-10-14 02:44:14,297][33226] Updated weights for policy 1, policy_version 42730 (0.0008) [2023-10-14 02:44:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 87130112. Throughput: 0: 1756.4, 1: 1773.1. Samples: 21786222. 
Policy #0 lag: (min: 9.0, avg: 16.0, max: 41.0) [2023-10-14 02:44:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.690')] [2023-10-14 02:44:14,669][33226] Updated weights for policy 1, policy_version 42740 (0.0008) [2023-10-14 02:44:15,032][33226] Updated weights for policy 1, policy_version 42750 (0.0008) [2023-10-14 02:44:17,913][33201] Updated weights for policy 0, policy_version 42370 (0.0008) [2023-10-14 02:44:18,299][33201] Updated weights for policy 0, policy_version 42380 (0.0009) [2023-10-14 02:44:18,632][33226] Updated weights for policy 1, policy_version 42760 (0.0007) [2023-10-14 02:44:18,665][33201] Updated weights for policy 0, policy_version 42390 (0.0008) [2023-10-14 02:44:18,998][33226] Updated weights for policy 1, policy_version 42770 (0.0009) [2023-10-14 02:44:19,038][33201] Updated weights for policy 0, policy_version 42400 (0.0007) [2023-10-14 02:44:19,366][33226] Updated weights for policy 1, policy_version 42780 (0.0010) [2023-10-14 02:44:19,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14745.7, 300 sec: 14329.1). Total num frames: 87228416. Throughput: 0: 1779.9, 1: 1789.3. Samples: 21808420. Policy #0 lag: (min: 9.0, avg: 16.0, max: 41.0) [2023-10-14 02:44:19,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.690')] [2023-10-14 02:44:22,874][33201] Updated weights for policy 0, policy_version 42410 (0.0007) [2023-10-14 02:44:23,234][33201] Updated weights for policy 0, policy_version 42420 (0.0008) [2023-10-14 02:44:23,351][33226] Updated weights for policy 1, policy_version 42790 (0.0010) [2023-10-14 02:44:23,602][33201] Updated weights for policy 0, policy_version 42430 (0.0008) [2023-10-14 02:44:23,730][33226] Updated weights for policy 1, policy_version 42800 (0.0008) [2023-10-14 02:44:24,095][33226] Updated weights for policy 1, policy_version 42810 (0.0009) [2023-10-14 02:44:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 87293952. Throughput: 0: 1747.4, 1: 1785.9. 
Samples: 21828120. Policy #0 lag: (min: 9.0, avg: 16.0, max: 41.0) [2023-10-14 02:44:24,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.690')] [2023-10-14 02:44:27,430][33201] Updated weights for policy 0, policy_version 42440 (0.0009) [2023-10-14 02:44:27,797][33201] Updated weights for policy 0, policy_version 42450 (0.0007) [2023-10-14 02:44:27,883][33226] Updated weights for policy 1, policy_version 42820 (0.0008) [2023-10-14 02:44:28,168][33201] Updated weights for policy 0, policy_version 42460 (0.0007) [2023-10-14 02:44:28,248][33226] Updated weights for policy 1, policy_version 42830 (0.0009) [2023-10-14 02:44:28,617][33226] Updated weights for policy 1, policy_version 42840 (0.0009) [2023-10-14 02:44:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 87359488. Throughput: 0: 1781.5, 1: 1781.3. Samples: 21840010. Policy #0 lag: (min: 9.0, avg: 16.0, max: 41.0) [2023-10-14 02:44:29,557][31953] Avg episode reward: [(0, '20.740'), (1, '20.690')] [2023-10-14 02:44:32,058][33201] Updated weights for policy 0, policy_version 42470 (0.0008) [2023-10-14 02:44:32,421][33201] Updated weights for policy 0, policy_version 42480 (0.0007) [2023-10-14 02:44:32,586][33226] Updated weights for policy 1, policy_version 42850 (0.0008) [2023-10-14 02:44:32,796][33201] Updated weights for policy 0, policy_version 42490 (0.0009) [2023-10-14 02:44:32,955][33226] Updated weights for policy 1, policy_version 42860 (0.0008) [2023-10-14 02:44:33,312][33226] Updated weights for policy 1, policy_version 42870 (0.0008) [2023-10-14 02:44:33,677][33226] Updated weights for policy 1, policy_version 42880 (0.0007) [2023-10-14 02:44:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 87425024. Throughput: 0: 1741.3, 1: 1799.0. Samples: 21859916. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:34,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.680')] [2023-10-14 02:44:36,652][33201] Updated weights for policy 0, policy_version 42500 (0.0009) [2023-10-14 02:44:37,013][33201] Updated weights for policy 0, policy_version 42510 (0.0011) [2023-10-14 02:44:37,375][33201] Updated weights for policy 0, policy_version 42520 (0.0009) [2023-10-14 02:44:37,535][33226] Updated weights for policy 1, policy_version 42890 (0.0008) [2023-10-14 02:44:37,899][33226] Updated weights for policy 1, policy_version 42900 (0.0010) [2023-10-14 02:44:38,264][33226] Updated weights for policy 1, policy_version 42910 (0.0009) [2023-10-14 02:44:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 87490560. Throughput: 0: 1742.3, 1: 1769.1. Samples: 21880954. Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:39,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.680')] [2023-10-14 02:44:41,341][33201] Updated weights for policy 0, policy_version 42530 (0.0010) [2023-10-14 02:44:41,756][33201] Updated weights for policy 0, policy_version 42540 (0.0007) [2023-10-14 02:44:42,093][33226] Updated weights for policy 1, policy_version 42920 (0.0008) [2023-10-14 02:44:42,118][33201] Updated weights for policy 0, policy_version 42550 (0.0008) [2023-10-14 02:44:42,454][33226] Updated weights for policy 1, policy_version 42930 (0.0009) [2023-10-14 02:44:42,490][33201] Updated weights for policy 0, policy_version 42560 (0.0007) [2023-10-14 02:44:42,820][33226] Updated weights for policy 1, policy_version 42940 (0.0008) [2023-10-14 02:44:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 87556096. Throughput: 0: 1755.5, 1: 1794.8. Samples: 21892256. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:44,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.710')] [2023-10-14 02:44:46,168][33201] Updated weights for policy 0, policy_version 42570 (0.0007) [2023-10-14 02:44:46,540][33201] Updated weights for policy 0, policy_version 42580 (0.0007) [2023-10-14 02:44:46,548][33226] Updated weights for policy 1, policy_version 42950 (0.0007) [2023-10-14 02:44:46,910][33201] Updated weights for policy 0, policy_version 42590 (0.0008) [2023-10-14 02:44:46,912][33226] Updated weights for policy 1, policy_version 42960 (0.0007) [2023-10-14 02:44:47,279][33226] Updated weights for policy 1, policy_version 42970 (0.0007) [2023-10-14 02:44:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 87621632. Throughput: 0: 1746.7, 1: 1763.4. Samples: 21912578. Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:49,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.710')] [2023-10-14 02:44:50,870][33201] Updated weights for policy 0, policy_version 42600 (0.0007) [2023-10-14 02:44:51,130][33226] Updated weights for policy 1, policy_version 42980 (0.0008) [2023-10-14 02:44:51,250][33201] Updated weights for policy 0, policy_version 42610 (0.0009) [2023-10-14 02:44:51,498][33226] Updated weights for policy 1, policy_version 42990 (0.0007) [2023-10-14 02:44:51,624][33201] Updated weights for policy 0, policy_version 42620 (0.0008) [2023-10-14 02:44:51,864][33226] Updated weights for policy 1, policy_version 43000 (0.0009) [2023-10-14 02:44:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 87687168. Throughput: 0: 1761.3, 1: 1766.6. Samples: 21934848. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:54,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.710')] [2023-10-14 02:44:55,384][33201] Updated weights for policy 0, policy_version 42630 (0.0008) [2023-10-14 02:44:55,526][33226] Updated weights for policy 1, policy_version 43010 (0.0009) [2023-10-14 02:44:55,749][33201] Updated weights for policy 0, policy_version 42640 (0.0007) [2023-10-14 02:44:55,899][33226] Updated weights for policy 1, policy_version 43020 (0.0008) [2023-10-14 02:44:56,126][33201] Updated weights for policy 0, policy_version 42650 (0.0007) [2023-10-14 02:44:56,255][33226] Updated weights for policy 1, policy_version 43030 (0.0009) [2023-10-14 02:44:56,620][33226] Updated weights for policy 1, policy_version 43040 (0.0008) [2023-10-14 02:44:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 87752704. Throughput: 0: 1752.3, 1: 1767.3. Samples: 21944600. Policy #0 lag: (min: 22.0, avg: 22.0, max: 25.0) [2023-10-14 02:44:59,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.730')] [2023-10-14 02:44:59,916][33201] Updated weights for policy 0, policy_version 42660 (0.0008) [2023-10-14 02:45:00,280][33201] Updated weights for policy 0, policy_version 42670 (0.0009) [2023-10-14 02:45:00,399][33226] Updated weights for policy 1, policy_version 43050 (0.0008) [2023-10-14 02:45:00,643][33201] Updated weights for policy 0, policy_version 42680 (0.0008) [2023-10-14 02:45:00,767][33226] Updated weights for policy 1, policy_version 43060 (0.0007) [2023-10-14 02:45:01,128][33226] Updated weights for policy 1, policy_version 43070 (0.0008) [2023-10-14 02:45:04,514][33201] Updated weights for policy 0, policy_version 42690 (0.0007) [2023-10-14 02:45:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 87818240. Throughput: 0: 1754.5, 1: 1764.6. Samples: 21966780. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:04,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.740')] [2023-10-14 02:45:04,874][33201] Updated weights for policy 0, policy_version 42700 (0.0008) [2023-10-14 02:45:04,999][33226] Updated weights for policy 1, policy_version 43080 (0.0009) [2023-10-14 02:45:05,246][33201] Updated weights for policy 0, policy_version 42710 (0.0007) [2023-10-14 02:45:05,354][33226] Updated weights for policy 1, policy_version 43090 (0.0008) [2023-10-14 02:45:05,617][33201] Updated weights for policy 0, policy_version 42720 (0.0007) [2023-10-14 02:45:05,732][33226] Updated weights for policy 1, policy_version 43100 (0.0008) [2023-10-14 02:45:09,481][33201] Updated weights for policy 0, policy_version 42730 (0.0008) [2023-10-14 02:45:09,509][33226] Updated weights for policy 1, policy_version 43110 (0.0007) [2023-10-14 02:45:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 87883776. Throughput: 0: 1783.4, 1: 1788.2. Samples: 21988842. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:09,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.740')] [2023-10-14 02:45:09,850][33201] Updated weights for policy 0, policy_version 42740 (0.0008) [2023-10-14 02:45:09,884][33226] Updated weights for policy 1, policy_version 43120 (0.0009) [2023-10-14 02:45:10,224][33201] Updated weights for policy 0, policy_version 42750 (0.0009) [2023-10-14 02:45:10,255][33226] Updated weights for policy 1, policy_version 43130 (0.0009) [2023-10-14 02:45:13,955][33226] Updated weights for policy 1, policy_version 43140 (0.0007) [2023-10-14 02:45:14,088][33201] Updated weights for policy 0, policy_version 42760 (0.0007) [2023-10-14 02:45:14,319][33226] Updated weights for policy 1, policy_version 43150 (0.0009) [2023-10-14 02:45:14,449][33201] Updated weights for policy 0, policy_version 42770 (0.0007) [2023-10-14 02:45:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 87949312. Throughput: 0: 1750.8, 1: 1767.6. Samples: 21998334. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:14,557][31953] Avg episode reward: [(0, '20.610'), (1, '20.740')] [2023-10-14 02:45:14,687][33226] Updated weights for policy 1, policy_version 43160 (0.0009) [2023-10-14 02:45:14,826][33201] Updated weights for policy 0, policy_version 42780 (0.0009) [2023-10-14 02:45:18,593][33201] Updated weights for policy 0, policy_version 42790 (0.0008) [2023-10-14 02:45:18,657][33226] Updated weights for policy 1, policy_version 43170 (0.0007) [2023-10-14 02:45:18,968][33201] Updated weights for policy 0, policy_version 42800 (0.0009) [2023-10-14 02:45:19,036][33226] Updated weights for policy 1, policy_version 43180 (0.0008) [2023-10-14 02:45:19,331][33201] Updated weights for policy 0, policy_version 42810 (0.0009) [2023-10-14 02:45:19,400][33226] Updated weights for policy 1, policy_version 43190 (0.0008) [2023-10-14 02:45:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 88047616. Throughput: 0: 1786.2, 1: 1781.7. Samples: 22020472. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:19,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.740')] [2023-10-14 02:45:19,759][33226] Updated weights for policy 1, policy_version 43200 (0.0008) [2023-10-14 02:45:23,052][33201] Updated weights for policy 0, policy_version 42820 (0.0007) [2023-10-14 02:45:23,419][33201] Updated weights for policy 0, policy_version 42830 (0.0008) [2023-10-14 02:45:23,633][33226] Updated weights for policy 1, policy_version 43210 (0.0010) [2023-10-14 02:45:23,788][33201] Updated weights for policy 0, policy_version 42840 (0.0007) [2023-10-14 02:45:24,006][33226] Updated weights for policy 1, policy_version 43220 (0.0008) [2023-10-14 02:45:24,374][33226] Updated weights for policy 1, policy_version 43230 (0.0009) [2023-10-14 02:45:24,557][31953] Fps is (10 sec: 19660.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 88145920. Throughput: 0: 1756.0, 1: 1788.5. 
Samples: 22040460. Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:24,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.800')] [2023-10-14 02:45:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000042848_43876352.pth... [2023-10-14 02:45:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000043232_44269568.pth... [2023-10-14 02:45:24,600][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000041184_42172416.pth [2023-10-14 02:45:24,611][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000041568_42565632.pth [2023-10-14 02:45:27,632][33201] Updated weights for policy 0, policy_version 42850 (0.0008) [2023-10-14 02:45:28,016][33201] Updated weights for policy 0, policy_version 42860 (0.0008) [2023-10-14 02:45:28,166][33226] Updated weights for policy 1, policy_version 43240 (0.0008) [2023-10-14 02:45:28,383][33201] Updated weights for policy 0, policy_version 42870 (0.0007) [2023-10-14 02:45:28,535][33226] Updated weights for policy 1, policy_version 43250 (0.0008) [2023-10-14 02:45:28,757][33201] Updated weights for policy 0, policy_version 42880 (0.0009) [2023-10-14 02:45:28,908][33226] Updated weights for policy 1, policy_version 43260 (0.0008) [2023-10-14 02:45:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 88211456. Throughput: 0: 1779.6, 1: 1775.7. Samples: 22052242. 
Policy #0 lag: (min: 31.0, avg: 32.8, max: 60.0) [2023-10-14 02:45:29,557][31953] Avg episode reward: [(0, '20.660'), (1, '20.800')] [2023-10-14 02:45:32,601][33201] Updated weights for policy 0, policy_version 42890 (0.0008) [2023-10-14 02:45:32,701][33226] Updated weights for policy 1, policy_version 43270 (0.0010) [2023-10-14 02:45:32,977][33201] Updated weights for policy 0, policy_version 42900 (0.0008) [2023-10-14 02:45:33,069][33226] Updated weights for policy 1, policy_version 43280 (0.0009) [2023-10-14 02:45:33,343][33201] Updated weights for policy 0, policy_version 42910 (0.0009) [2023-10-14 02:45:33,436][33226] Updated weights for policy 1, policy_version 43290 (0.0008) [2023-10-14 02:45:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 88276992. Throughput: 0: 1764.3, 1: 1792.2. Samples: 22072618. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:34,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.810')] [2023-10-14 02:45:37,237][33201] Updated weights for policy 0, policy_version 42920 (0.0008) [2023-10-14 02:45:37,424][33226] Updated weights for policy 1, policy_version 43300 (0.0010) [2023-10-14 02:45:37,606][33201] Updated weights for policy 0, policy_version 42930 (0.0008) [2023-10-14 02:45:37,788][33226] Updated weights for policy 1, policy_version 43310 (0.0008) [2023-10-14 02:45:37,971][33201] Updated weights for policy 0, policy_version 42940 (0.0007) [2023-10-14 02:45:38,155][33226] Updated weights for policy 1, policy_version 43320 (0.0007) [2023-10-14 02:45:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 88342528. Throughput: 0: 1750.6, 1: 1762.5. Samples: 22092940. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:39,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.880')] [2023-10-14 02:45:41,882][33226] Updated weights for policy 1, policy_version 43330 (0.0008) [2023-10-14 02:45:41,951][33201] Updated weights for policy 0, policy_version 42950 (0.0007) [2023-10-14 02:45:42,250][33226] Updated weights for policy 1, policy_version 43340 (0.0009) [2023-10-14 02:45:42,314][33201] Updated weights for policy 0, policy_version 42960 (0.0008) [2023-10-14 02:45:42,622][33226] Updated weights for policy 1, policy_version 43350 (0.0008) [2023-10-14 02:45:42,689][33201] Updated weights for policy 0, policy_version 42970 (0.0007) [2023-10-14 02:45:42,993][33226] Updated weights for policy 1, policy_version 43360 (0.0009) [2023-10-14 02:45:44,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 88408064. Throughput: 0: 1769.3, 1: 1790.3. Samples: 22104782. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:44,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.900')] [2023-10-14 02:45:46,467][33201] Updated weights for policy 0, policy_version 42980 (0.0009) [2023-10-14 02:45:46,771][33226] Updated weights for policy 1, policy_version 43370 (0.0010) [2023-10-14 02:45:46,838][33201] Updated weights for policy 0, policy_version 42990 (0.0008) [2023-10-14 02:45:47,139][33226] Updated weights for policy 1, policy_version 43380 (0.0007) [2023-10-14 02:45:47,205][33201] Updated weights for policy 0, policy_version 43000 (0.0007) [2023-10-14 02:45:47,504][33226] Updated weights for policy 1, policy_version 43390 (0.0007) [2023-10-14 02:45:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 88473600. Throughput: 0: 1745.9, 1: 1756.6. Samples: 22124394. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:49,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.900')] [2023-10-14 02:45:51,178][33201] Updated weights for policy 0, policy_version 43010 (0.0008) [2023-10-14 02:45:51,415][33226] Updated weights for policy 1, policy_version 43400 (0.0009) [2023-10-14 02:45:51,556][33201] Updated weights for policy 0, policy_version 43020 (0.0008) [2023-10-14 02:45:51,782][33226] Updated weights for policy 1, policy_version 43410 (0.0007) [2023-10-14 02:45:51,924][33201] Updated weights for policy 0, policy_version 43030 (0.0009) [2023-10-14 02:45:52,145][33226] Updated weights for policy 1, policy_version 43420 (0.0008) [2023-10-14 02:45:52,294][33201] Updated weights for policy 0, policy_version 43040 (0.0009) [2023-10-14 02:45:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 88539136. Throughput: 0: 1744.1, 1: 1750.9. Samples: 22146118. Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:54,557][31953] Avg episode reward: [(0, '20.760'), (1, '20.930')] [2023-10-14 02:45:56,059][33226] Updated weights for policy 1, policy_version 43430 (0.0009) [2023-10-14 02:45:56,096][33201] Updated weights for policy 0, policy_version 43050 (0.0010) [2023-10-14 02:45:56,446][33226] Updated weights for policy 1, policy_version 43440 (0.0008) [2023-10-14 02:45:56,459][33201] Updated weights for policy 0, policy_version 43060 (0.0007) [2023-10-14 02:45:56,808][33226] Updated weights for policy 1, policy_version 43450 (0.0010) [2023-10-14 02:45:56,826][33201] Updated weights for policy 0, policy_version 43070 (0.0007) [2023-10-14 02:45:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 88604672. Throughput: 0: 1745.3, 1: 1749.9. Samples: 22155620. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 60.0) [2023-10-14 02:45:59,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.950')] [2023-10-14 02:46:00,588][33226] Updated weights for policy 1, policy_version 43460 (0.0007) [2023-10-14 02:46:00,666][33201] Updated weights for policy 0, policy_version 43080 (0.0008) [2023-10-14 02:46:00,961][33226] Updated weights for policy 1, policy_version 43470 (0.0007) [2023-10-14 02:46:01,037][33201] Updated weights for policy 0, policy_version 43090 (0.0009) [2023-10-14 02:46:01,325][33226] Updated weights for policy 1, policy_version 43480 (0.0008) [2023-10-14 02:46:01,403][33201] Updated weights for policy 0, policy_version 43100 (0.0007) [2023-10-14 02:46:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 88670208. Throughput: 0: 1747.2, 1: 1743.7. Samples: 22177564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:04,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.950')] [2023-10-14 02:46:05,117][33226] Updated weights for policy 1, policy_version 43490 (0.0007) [2023-10-14 02:46:05,141][33201] Updated weights for policy 0, policy_version 43110 (0.0007) [2023-10-14 02:46:05,487][33226] Updated weights for policy 1, policy_version 43500 (0.0008) [2023-10-14 02:46:05,518][33201] Updated weights for policy 0, policy_version 43120 (0.0007) [2023-10-14 02:46:05,854][33226] Updated weights for policy 1, policy_version 43510 (0.0008) [2023-10-14 02:46:05,898][33201] Updated weights for policy 0, policy_version 43130 (0.0009) [2023-10-14 02:46:06,217][33226] Updated weights for policy 1, policy_version 43520 (0.0007) [2023-10-14 02:46:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 88735744. Throughput: 0: 1778.7, 1: 1762.1. Samples: 22199794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:09,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.950')] [2023-10-14 02:46:09,784][33201] Updated weights for policy 0, policy_version 43140 (0.0010) [2023-10-14 02:46:10,003][33226] Updated weights for policy 1, policy_version 43530 (0.0009) [2023-10-14 02:46:10,153][33201] Updated weights for policy 0, policy_version 43150 (0.0009) [2023-10-14 02:46:10,373][33226] Updated weights for policy 1, policy_version 43540 (0.0010) [2023-10-14 02:46:10,517][33201] Updated weights for policy 0, policy_version 43160 (0.0007) [2023-10-14 02:46:10,739][33226] Updated weights for policy 1, policy_version 43550 (0.0008) [2023-10-14 02:46:14,320][33201] Updated weights for policy 0, policy_version 43170 (0.0007) [2023-10-14 02:46:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 88801280. Throughput: 0: 1746.0, 1: 1743.9. Samples: 22209290. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:14,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.950')] [2023-10-14 02:46:14,598][33226] Updated weights for policy 1, policy_version 43560 (0.0007) [2023-10-14 02:46:14,700][33201] Updated weights for policy 0, policy_version 43180 (0.0008) [2023-10-14 02:46:14,960][33226] Updated weights for policy 1, policy_version 43570 (0.0009) [2023-10-14 02:46:15,065][33201] Updated weights for policy 0, policy_version 43190 (0.0007) [2023-10-14 02:46:15,327][33226] Updated weights for policy 1, policy_version 43580 (0.0008) [2023-10-14 02:46:15,436][33201] Updated weights for policy 0, policy_version 43200 (0.0008) [2023-10-14 02:46:19,082][33201] Updated weights for policy 0, policy_version 43210 (0.0008) [2023-10-14 02:46:19,261][33226] Updated weights for policy 1, policy_version 43590 (0.0007) [2023-10-14 02:46:19,451][33201] Updated weights for policy 0, policy_version 43220 (0.0007) [2023-10-14 02:46:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 
13653.3, 300 sec: 14106.9). Total num frames: 88866816. Throughput: 0: 1770.5, 1: 1755.5. Samples: 22231286. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:19,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.950')] [2023-10-14 02:46:19,625][33226] Updated weights for policy 1, policy_version 43600 (0.0008) [2023-10-14 02:46:19,825][33201] Updated weights for policy 0, policy_version 43230 (0.0009) [2023-10-14 02:46:19,999][33226] Updated weights for policy 1, policy_version 43610 (0.0009) [2023-10-14 02:46:23,649][33201] Updated weights for policy 0, policy_version 43240 (0.0008) [2023-10-14 02:46:23,752][33226] Updated weights for policy 1, policy_version 43620 (0.0009) [2023-10-14 02:46:24,019][33201] Updated weights for policy 0, policy_version 43250 (0.0010) [2023-10-14 02:46:24,109][33226] Updated weights for policy 1, policy_version 43630 (0.0008) [2023-10-14 02:46:24,387][33201] Updated weights for policy 0, policy_version 43260 (0.0008) [2023-10-14 02:46:24,474][33226] Updated weights for policy 1, policy_version 43640 (0.0009) [2023-10-14 02:46:24,557][31953] Fps is (10 sec: 16383.3, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 88965120. Throughput: 0: 1766.0, 1: 1771.4. Samples: 22252124. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:24,559][31953] Avg episode reward: [(0, '20.610'), (1, '20.960')] [2023-10-14 02:46:28,229][33226] Updated weights for policy 1, policy_version 43650 (0.0007) [2023-10-14 02:46:28,303][33201] Updated weights for policy 0, policy_version 43270 (0.0007) [2023-10-14 02:46:28,598][33226] Updated weights for policy 1, policy_version 43660 (0.0007) [2023-10-14 02:46:28,669][33201] Updated weights for policy 0, policy_version 43280 (0.0007) [2023-10-14 02:46:28,960][33226] Updated weights for policy 1, policy_version 43670 (0.0008) [2023-10-14 02:46:29,039][33201] Updated weights for policy 0, policy_version 43290 (0.0007) [2023-10-14 02:46:29,323][33226] Updated weights for policy 1, policy_version 43680 (0.0010) [2023-10-14 02:46:29,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 89063424. Throughput: 0: 1766.3, 1: 1751.2. Samples: 22263066. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:46:29,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.960')] [2023-10-14 02:46:32,955][33201] Updated weights for policy 0, policy_version 43300 (0.0009) [2023-10-14 02:46:33,200][33226] Updated weights for policy 1, policy_version 43690 (0.0007) [2023-10-14 02:46:33,317][33201] Updated weights for policy 0, policy_version 43310 (0.0009) [2023-10-14 02:46:33,565][33226] Updated weights for policy 1, policy_version 43700 (0.0008) [2023-10-14 02:46:33,693][33201] Updated weights for policy 0, policy_version 43320 (0.0008) [2023-10-14 02:46:33,935][33226] Updated weights for policy 1, policy_version 43710 (0.0008) [2023-10-14 02:46:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 89128960. Throughput: 0: 1780.1, 1: 1783.4. Samples: 22284752. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:34,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.940')] [2023-10-14 02:46:37,685][33201] Updated weights for policy 0, policy_version 43330 (0.0007) [2023-10-14 02:46:37,730][33226] Updated weights for policy 1, policy_version 43720 (0.0008) [2023-10-14 02:46:38,055][33201] Updated weights for policy 0, policy_version 43340 (0.0008) [2023-10-14 02:46:38,089][33226] Updated weights for policy 1, policy_version 43730 (0.0008) [2023-10-14 02:46:38,428][33201] Updated weights for policy 0, policy_version 43350 (0.0009) [2023-10-14 02:46:38,456][33226] Updated weights for policy 1, policy_version 43740 (0.0008) [2023-10-14 02:46:38,801][33201] Updated weights for policy 0, policy_version 43360 (0.0010) [2023-10-14 02:46:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 89194496. Throughput: 0: 1756.9, 1: 1763.3. Samples: 22304526. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:39,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.940')] [2023-10-14 02:46:42,287][33226] Updated weights for policy 1, policy_version 43750 (0.0009) [2023-10-14 02:46:42,488][33201] Updated weights for policy 0, policy_version 43370 (0.0007) [2023-10-14 02:46:42,674][33226] Updated weights for policy 1, policy_version 43760 (0.0008) [2023-10-14 02:46:42,853][33201] Updated weights for policy 0, policy_version 43380 (0.0010) [2023-10-14 02:46:43,039][33226] Updated weights for policy 1, policy_version 43770 (0.0007) [2023-10-14 02:46:43,224][33201] Updated weights for policy 0, policy_version 43390 (0.0008) [2023-10-14 02:46:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 89260032. Throughput: 0: 1789.7, 1: 1796.9. Samples: 22317018. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:44,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.940')] [2023-10-14 02:46:46,780][33226] Updated weights for policy 1, policy_version 43780 (0.0007) [2023-10-14 02:46:47,112][33201] Updated weights for policy 0, policy_version 43400 (0.0007) [2023-10-14 02:46:47,137][33226] Updated weights for policy 1, policy_version 43790 (0.0007) [2023-10-14 02:46:47,481][33201] Updated weights for policy 0, policy_version 43410 (0.0008) [2023-10-14 02:46:47,502][33226] Updated weights for policy 1, policy_version 43800 (0.0007) [2023-10-14 02:46:47,852][33201] Updated weights for policy 0, policy_version 43420 (0.0010) [2023-10-14 02:46:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 89325568. Throughput: 0: 1755.2, 1: 1765.0. Samples: 22335976. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:49,561][31953] Avg episode reward: [(0, '20.640'), (1, '20.940')] [2023-10-14 02:46:51,299][33226] Updated weights for policy 1, policy_version 43810 (0.0008) [2023-10-14 02:46:51,646][33201] Updated weights for policy 0, policy_version 43430 (0.0008) [2023-10-14 02:46:51,677][33226] Updated weights for policy 1, policy_version 43820 (0.0008) [2023-10-14 02:46:52,019][33201] Updated weights for policy 0, policy_version 43440 (0.0010) [2023-10-14 02:46:52,037][33226] Updated weights for policy 1, policy_version 43830 (0.0008) [2023-10-14 02:46:52,386][33201] Updated weights for policy 0, policy_version 43450 (0.0007) [2023-10-14 02:46:52,402][33226] Updated weights for policy 1, policy_version 43840 (0.0008) [2023-10-14 02:46:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 89391104. Throughput: 0: 1749.5, 1: 1764.0. Samples: 22357900. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:54,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.930')] [2023-10-14 02:46:56,343][33201] Updated weights for policy 0, policy_version 43460 (0.0007) [2023-10-14 02:46:56,359][33226] Updated weights for policy 1, policy_version 43850 (0.0009) [2023-10-14 02:46:56,713][33201] Updated weights for policy 0, policy_version 43470 (0.0007) [2023-10-14 02:46:56,723][33226] Updated weights for policy 1, policy_version 43860 (0.0009) [2023-10-14 02:46:57,075][33201] Updated weights for policy 0, policy_version 43480 (0.0007) [2023-10-14 02:46:57,085][33226] Updated weights for policy 1, policy_version 43870 (0.0007) [2023-10-14 02:46:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 89456640. Throughput: 0: 1758.3, 1: 1774.8. Samples: 22368278. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:46:59,557][31953] Avg episode reward: [(0, '20.630'), (1, '20.920')] [2023-10-14 02:47:00,943][33201] Updated weights for policy 0, policy_version 43490 (0.0008) [2023-10-14 02:47:00,951][33226] Updated weights for policy 1, policy_version 43880 (0.0007) [2023-10-14 02:47:01,324][33226] Updated weights for policy 1, policy_version 43890 (0.0009) [2023-10-14 02:47:01,326][33201] Updated weights for policy 0, policy_version 43500 (0.0010) [2023-10-14 02:47:01,679][33226] Updated weights for policy 1, policy_version 43900 (0.0008) [2023-10-14 02:47:01,694][33201] Updated weights for policy 0, policy_version 43510 (0.0008) [2023-10-14 02:47:02,056][33201] Updated weights for policy 0, policy_version 43520 (0.0007) [2023-10-14 02:47:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 89522176. Throughput: 0: 1749.0, 1: 1765.5. Samples: 22389438. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 02:47:04,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.920')] [2023-10-14 02:47:05,720][33226] Updated weights for policy 1, policy_version 43910 (0.0008) [2023-10-14 02:47:05,908][33201] Updated weights for policy 0, policy_version 43530 (0.0007) [2023-10-14 02:47:06,087][33226] Updated weights for policy 1, policy_version 43920 (0.0008) [2023-10-14 02:47:06,276][33201] Updated weights for policy 0, policy_version 43540 (0.0008) [2023-10-14 02:47:06,447][33226] Updated weights for policy 1, policy_version 43930 (0.0009) [2023-10-14 02:47:06,644][33201] Updated weights for policy 0, policy_version 43550 (0.0008) [2023-10-14 02:47:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 89587712. Throughput: 0: 1764.4, 1: 1775.1. Samples: 22411402. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:09,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.900')] [2023-10-14 02:47:10,216][33226] Updated weights for policy 1, policy_version 43940 (0.0007) [2023-10-14 02:47:10,410][33201] Updated weights for policy 0, policy_version 43560 (0.0007) [2023-10-14 02:47:10,585][33226] Updated weights for policy 1, policy_version 43950 (0.0007) [2023-10-14 02:47:10,774][33201] Updated weights for policy 0, policy_version 43570 (0.0007) [2023-10-14 02:47:10,951][33226] Updated weights for policy 1, policy_version 43960 (0.0007) [2023-10-14 02:47:11,141][33201] Updated weights for policy 0, policy_version 43580 (0.0009) [2023-10-14 02:47:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 89653248. Throughput: 0: 1744.4, 1: 1766.0. Samples: 22421034. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:14,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.920')] [2023-10-14 02:47:14,769][33226] Updated weights for policy 1, policy_version 43970 (0.0009) [2023-10-14 02:47:15,004][33201] Updated weights for policy 0, policy_version 43590 (0.0009) [2023-10-14 02:47:15,130][33226] Updated weights for policy 1, policy_version 43980 (0.0008) [2023-10-14 02:47:15,368][33201] Updated weights for policy 0, policy_version 43600 (0.0008) [2023-10-14 02:47:15,508][33226] Updated weights for policy 1, policy_version 43990 (0.0008) [2023-10-14 02:47:15,733][33201] Updated weights for policy 0, policy_version 43610 (0.0008) [2023-10-14 02:47:15,874][33226] Updated weights for policy 1, policy_version 44000 (0.0008) [2023-10-14 02:47:19,542][33201] Updated weights for policy 0, policy_version 43620 (0.0009) [2023-10-14 02:47:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 89718784. Throughput: 0: 1754.9, 1: 1763.6. Samples: 22443084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:19,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.940')] [2023-10-14 02:47:19,625][33226] Updated weights for policy 1, policy_version 44010 (0.0007) [2023-10-14 02:47:19,907][33201] Updated weights for policy 0, policy_version 43630 (0.0009) [2023-10-14 02:47:19,988][33226] Updated weights for policy 1, policy_version 44020 (0.0007) [2023-10-14 02:47:20,283][33201] Updated weights for policy 0, policy_version 43640 (0.0007) [2023-10-14 02:47:20,354][33226] Updated weights for policy 1, policy_version 44030 (0.0008) [2023-10-14 02:47:24,014][33201] Updated weights for policy 0, policy_version 43650 (0.0009) [2023-10-14 02:47:24,103][33226] Updated weights for policy 1, policy_version 44040 (0.0009) [2023-10-14 02:47:24,380][33201] Updated weights for policy 0, policy_version 43660 (0.0007) [2023-10-14 02:47:24,471][33226] Updated weights for policy 1, policy_version 44050 (0.0008) [2023-10-14 02:47:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13995.8). Total num frames: 89784320. Throughput: 0: 1776.9, 1: 1788.8. Samples: 22464980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:24,557][31953] Avg episode reward: [(0, '20.660'), (1, '20.940')] [2023-10-14 02:47:24,754][33201] Updated weights for policy 0, policy_version 43670 (0.0007) [2023-10-14 02:47:24,840][33226] Updated weights for policy 1, policy_version 44060 (0.0007) [2023-10-14 02:47:24,977][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000044064_45121536.pth... [2023-10-14 02:47:25,006][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000042400_43417600.pth [2023-10-14 02:47:25,118][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000043680_44728320.pth... 
[2023-10-14 02:47:25,122][33201] Updated weights for policy 0, policy_version 43680 (0.0008) [2023-10-14 02:47:25,157][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000042016_43024384.pth [2023-10-14 02:47:28,704][33226] Updated weights for policy 1, policy_version 44070 (0.0009) [2023-10-14 02:47:29,067][33226] Updated weights for policy 1, policy_version 44080 (0.0008) [2023-10-14 02:47:29,252][33201] Updated weights for policy 0, policy_version 43690 (0.0009) [2023-10-14 02:47:29,434][33226] Updated weights for policy 1, policy_version 44090 (0.0008) [2023-10-14 02:47:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13995.8). Total num frames: 89849856. Throughput: 0: 1740.4, 1: 1763.5. Samples: 22474692. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:29,557][31953] Avg episode reward: [(0, '20.660'), (1, '20.940')] [2023-10-14 02:47:29,613][33201] Updated weights for policy 0, policy_version 43700 (0.0009) [2023-10-14 02:47:29,981][33201] Updated weights for policy 0, policy_version 43710 (0.0010) [2023-10-14 02:47:33,297][33226] Updated weights for policy 1, policy_version 44100 (0.0007) [2023-10-14 02:47:33,664][33226] Updated weights for policy 1, policy_version 44110 (0.0008) [2023-10-14 02:47:33,784][33201] Updated weights for policy 0, policy_version 43720 (0.0008) [2023-10-14 02:47:34,029][33226] Updated weights for policy 1, policy_version 44120 (0.0008) [2023-10-14 02:47:34,159][33201] Updated weights for policy 0, policy_version 43730 (0.0008) [2023-10-14 02:47:34,529][33201] Updated weights for policy 0, policy_version 43740 (0.0008) [2023-10-14 02:47:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 89948160. Throughput: 0: 1772.8, 1: 1792.7. Samples: 22496426. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:47:34,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.930')] [2023-10-14 02:47:38,009][33226] Updated weights for policy 1, policy_version 44130 (0.0009) [2023-10-14 02:47:38,317][33201] Updated weights for policy 0, policy_version 43750 (0.0008) [2023-10-14 02:47:38,373][33226] Updated weights for policy 1, policy_version 44140 (0.0008) [2023-10-14 02:47:38,682][33201] Updated weights for policy 0, policy_version 43760 (0.0009) [2023-10-14 02:47:38,741][33226] Updated weights for policy 1, policy_version 44150 (0.0008) [2023-10-14 02:47:39,059][33201] Updated weights for policy 0, policy_version 43770 (0.0007) [2023-10-14 02:47:39,111][33226] Updated weights for policy 1, policy_version 44160 (0.0007) [2023-10-14 02:47:39,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 90046464. Throughput: 0: 1751.4, 1: 1761.9. Samples: 22515998. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:47:39,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.930')] [2023-10-14 02:47:42,832][33226] Updated weights for policy 1, policy_version 44170 (0.0007) [2023-10-14 02:47:42,940][33201] Updated weights for policy 0, policy_version 43780 (0.0009) [2023-10-14 02:47:43,195][33226] Updated weights for policy 1, policy_version 44180 (0.0007) [2023-10-14 02:47:43,308][33201] Updated weights for policy 0, policy_version 43790 (0.0007) [2023-10-14 02:47:43,563][33226] Updated weights for policy 1, policy_version 44190 (0.0010) [2023-10-14 02:47:43,681][33201] Updated weights for policy 0, policy_version 43800 (0.0007) [2023-10-14 02:47:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 90112000. Throughput: 0: 1764.4, 1: 1776.4. Samples: 22527616. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:47:44,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.930')] [2023-10-14 02:47:47,386][33226] Updated weights for policy 1, policy_version 44200 (0.0008) [2023-10-14 02:47:47,554][33201] Updated weights for policy 0, policy_version 43810 (0.0009) [2023-10-14 02:47:47,746][33226] Updated weights for policy 1, policy_version 44210 (0.0009) [2023-10-14 02:47:47,923][33201] Updated weights for policy 0, policy_version 43820 (0.0009) [2023-10-14 02:47:48,111][33226] Updated weights for policy 1, policy_version 44220 (0.0008) [2023-10-14 02:47:48,286][33201] Updated weights for policy 0, policy_version 43830 (0.0010) [2023-10-14 02:47:48,650][33201] Updated weights for policy 0, policy_version 43840 (0.0007) [2023-10-14 02:47:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 90177536. Throughput: 0: 1755.5, 1: 1763.4. Samples: 22547790. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:47:49,559][31953] Avg episode reward: [(0, '20.660'), (1, '20.930')] [2023-10-14 02:47:51,834][33226] Updated weights for policy 1, policy_version 44230 (0.0009) [2023-10-14 02:47:52,202][33226] Updated weights for policy 1, policy_version 44240 (0.0007) [2023-10-14 02:47:52,536][33201] Updated weights for policy 0, policy_version 43850 (0.0008) [2023-10-14 02:47:52,576][33226] Updated weights for policy 1, policy_version 44250 (0.0009) [2023-10-14 02:47:52,897][33201] Updated weights for policy 0, policy_version 43860 (0.0009) [2023-10-14 02:47:53,269][33201] Updated weights for policy 0, policy_version 43870 (0.0008) [2023-10-14 02:47:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 90243072. Throughput: 0: 1736.9, 1: 1755.6. Samples: 22568568. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:47:54,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.930')] [2023-10-14 02:47:56,484][33226] Updated weights for policy 1, policy_version 44260 (0.0008) [2023-10-14 02:47:56,852][33226] Updated weights for policy 1, policy_version 44270 (0.0008) [2023-10-14 02:47:57,197][33201] Updated weights for policy 0, policy_version 43880 (0.0007) [2023-10-14 02:47:57,207][33226] Updated weights for policy 1, policy_version 44280 (0.0009) [2023-10-14 02:47:57,562][33201] Updated weights for policy 0, policy_version 43890 (0.0009) [2023-10-14 02:47:57,935][33201] Updated weights for policy 0, policy_version 43900 (0.0007) [2023-10-14 02:47:59,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 90308608. Throughput: 0: 1758.8, 1: 1768.4. Samples: 22579760. Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:47:59,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:48:01,002][33226] Updated weights for policy 1, policy_version 44290 (0.0008) [2023-10-14 02:48:01,365][33226] Updated weights for policy 1, policy_version 44300 (0.0008) [2023-10-14 02:48:01,726][33226] Updated weights for policy 1, policy_version 44310 (0.0007) [2023-10-14 02:48:01,830][33201] Updated weights for policy 0, policy_version 43910 (0.0008) [2023-10-14 02:48:02,091][33226] Updated weights for policy 1, policy_version 44320 (0.0008) [2023-10-14 02:48:02,207][33201] Updated weights for policy 0, policy_version 43920 (0.0007) [2023-10-14 02:48:02,578][33201] Updated weights for policy 0, policy_version 43930 (0.0007) [2023-10-14 02:48:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 90374144. Throughput: 0: 1729.8, 1: 1755.2. Samples: 22599908. 
Policy #0 lag: (min: 11.0, avg: 19.0, max: 43.0) [2023-10-14 02:48:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.930')] [2023-10-14 02:48:05,967][33226] Updated weights for policy 1, policy_version 44330 (0.0009) [2023-10-14 02:48:06,332][33226] Updated weights for policy 1, policy_version 44340 (0.0007) [2023-10-14 02:48:06,420][33201] Updated weights for policy 0, policy_version 43940 (0.0009) [2023-10-14 02:48:06,704][33226] Updated weights for policy 1, policy_version 44350 (0.0009) [2023-10-14 02:48:06,787][33201] Updated weights for policy 0, policy_version 43950 (0.0009) [2023-10-14 02:48:07,156][33201] Updated weights for policy 0, policy_version 43960 (0.0010) [2023-10-14 02:48:09,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 90439680. Throughput: 0: 1732.7, 1: 1760.9. Samples: 22622196. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.930')] [2023-10-14 02:48:10,391][33226] Updated weights for policy 1, policy_version 44360 (0.0009) [2023-10-14 02:48:10,761][33226] Updated weights for policy 1, policy_version 44370 (0.0007) [2023-10-14 02:48:11,075][33201] Updated weights for policy 0, policy_version 43970 (0.0009) [2023-10-14 02:48:11,138][33226] Updated weights for policy 1, policy_version 44380 (0.0008) [2023-10-14 02:48:11,434][33201] Updated weights for policy 0, policy_version 43980 (0.0008) [2023-10-14 02:48:11,806][33201] Updated weights for policy 0, policy_version 43990 (0.0007) [2023-10-14 02:48:12,177][33201] Updated weights for policy 0, policy_version 44000 (0.0007) [2023-10-14 02:48:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 90505216. Throughput: 0: 1744.9, 1: 1754.0. Samples: 22632144. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:14,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 02:48:14,868][33226] Updated weights for policy 1, policy_version 44390 (0.0008) [2023-10-14 02:48:15,252][33226] Updated weights for policy 1, policy_version 44400 (0.0007) [2023-10-14 02:48:15,622][33226] Updated weights for policy 1, policy_version 44410 (0.0007) [2023-10-14 02:48:15,918][33201] Updated weights for policy 0, policy_version 44010 (0.0008) [2023-10-14 02:48:16,285][33201] Updated weights for policy 0, policy_version 44020 (0.0009) [2023-10-14 02:48:16,663][33201] Updated weights for policy 0, policy_version 44030 (0.0009) [2023-10-14 02:48:19,490][33226] Updated weights for policy 1, policy_version 44420 (0.0009) [2023-10-14 02:48:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 90570752. Throughput: 0: 1743.2, 1: 1758.8. Samples: 22654014. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:19,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.940')] [2023-10-14 02:48:19,848][33226] Updated weights for policy 1, policy_version 44430 (0.0012) [2023-10-14 02:48:20,213][33226] Updated weights for policy 1, policy_version 44440 (0.0011) [2023-10-14 02:48:20,478][33201] Updated weights for policy 0, policy_version 44040 (0.0008) [2023-10-14 02:48:20,850][33201] Updated weights for policy 0, policy_version 44050 (0.0010) [2023-10-14 02:48:21,229][33201] Updated weights for policy 0, policy_version 44060 (0.0010) [2023-10-14 02:48:24,002][33226] Updated weights for policy 1, policy_version 44450 (0.0009) [2023-10-14 02:48:24,382][33226] Updated weights for policy 1, policy_version 44460 (0.0007) [2023-10-14 02:48:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 90636288. Throughput: 0: 1761.5, 1: 1791.1. Samples: 22675862. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:24,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.950')] [2023-10-14 02:48:24,753][33226] Updated weights for policy 1, policy_version 44470 (0.0009) [2023-10-14 02:48:25,020][33201] Updated weights for policy 0, policy_version 44070 (0.0010) [2023-10-14 02:48:25,113][33226] Updated weights for policy 1, policy_version 44480 (0.0008) [2023-10-14 02:48:25,387][33201] Updated weights for policy 0, policy_version 44080 (0.0007) [2023-10-14 02:48:25,764][33201] Updated weights for policy 0, policy_version 44090 (0.0007) [2023-10-14 02:48:28,798][33226] Updated weights for policy 1, policy_version 44490 (0.0008) [2023-10-14 02:48:29,161][33226] Updated weights for policy 1, policy_version 44500 (0.0008) [2023-10-14 02:48:29,496][33201] Updated weights for policy 0, policy_version 44100 (0.0010) [2023-10-14 02:48:29,527][33226] Updated weights for policy 1, policy_version 44510 (0.0007) [2023-10-14 02:48:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 90701824. Throughput: 0: 1739.1, 1: 1768.2. Samples: 22685446. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:29,557][31953] Avg episode reward: [(0, '20.810'), (1, '20.920')] [2023-10-14 02:48:29,863][33201] Updated weights for policy 0, policy_version 44110 (0.0010) [2023-10-14 02:48:30,236][33201] Updated weights for policy 0, policy_version 44120 (0.0008) [2023-10-14 02:48:33,395][33226] Updated weights for policy 1, policy_version 44520 (0.0009) [2023-10-14 02:48:33,763][33226] Updated weights for policy 1, policy_version 44530 (0.0008) [2023-10-14 02:48:34,124][33226] Updated weights for policy 1, policy_version 44540 (0.0009) [2023-10-14 02:48:34,263][33201] Updated weights for policy 0, policy_version 44130 (0.0007) [2023-10-14 02:48:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 90800128. Throughput: 0: 1760.1, 1: 1794.8. 
Samples: 22707762. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:34,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.930')] [2023-10-14 02:48:34,641][33201] Updated weights for policy 0, policy_version 44140 (0.0008) [2023-10-14 02:48:35,012][33201] Updated weights for policy 0, policy_version 44150 (0.0010) [2023-10-14 02:48:35,383][33201] Updated weights for policy 0, policy_version 44160 (0.0008) [2023-10-14 02:48:38,209][33226] Updated weights for policy 1, policy_version 44550 (0.0008) [2023-10-14 02:48:38,584][33226] Updated weights for policy 1, policy_version 44560 (0.0008) [2023-10-14 02:48:38,946][33226] Updated weights for policy 1, policy_version 44570 (0.0011) [2023-10-14 02:48:39,523][33201] Updated weights for policy 0, policy_version 44170 (0.0008) [2023-10-14 02:48:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 90865664. Throughput: 0: 1770.9, 1: 1772.8. Samples: 22728038. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 02:48:39,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.930')] [2023-10-14 02:48:39,885][33201] Updated weights for policy 0, policy_version 44180 (0.0010) [2023-10-14 02:48:40,272][33201] Updated weights for policy 0, policy_version 44190 (0.0010) [2023-10-14 02:48:42,627][33226] Updated weights for policy 1, policy_version 44580 (0.0007) [2023-10-14 02:48:42,997][33226] Updated weights for policy 1, policy_version 44590 (0.0008) [2023-10-14 02:48:43,372][33226] Updated weights for policy 1, policy_version 44600 (0.0008) [2023-10-14 02:48:43,899][33201] Updated weights for policy 0, policy_version 44200 (0.0009) [2023-10-14 02:48:44,264][33201] Updated weights for policy 0, policy_version 44210 (0.0009) [2023-10-14 02:48:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 90931200. Throughput: 0: 1747.3, 1: 1787.9. Samples: 22738844. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:48:44,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.940')] [2023-10-14 02:48:44,650][33201] Updated weights for policy 0, policy_version 44220 (0.0009) [2023-10-14 02:48:47,271][33226] Updated weights for policy 1, policy_version 44610 (0.0007) [2023-10-14 02:48:47,648][33226] Updated weights for policy 1, policy_version 44620 (0.0007) [2023-10-14 02:48:48,008][33226] Updated weights for policy 1, policy_version 44630 (0.0007) [2023-10-14 02:48:48,366][33201] Updated weights for policy 0, policy_version 44230 (0.0008) [2023-10-14 02:48:48,381][33226] Updated weights for policy 1, policy_version 44640 (0.0009) [2023-10-14 02:48:48,737][33201] Updated weights for policy 0, policy_version 44240 (0.0010) [2023-10-14 02:48:49,101][33201] Updated weights for policy 0, policy_version 44250 (0.0008) [2023-10-14 02:48:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 91029504. Throughput: 0: 1785.2, 1: 1778.2. Samples: 22760262. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:48:49,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.940')] [2023-10-14 02:48:51,882][33226] Updated weights for policy 1, policy_version 44650 (0.0008) [2023-10-14 02:48:52,257][33226] Updated weights for policy 1, policy_version 44660 (0.0008) [2023-10-14 02:48:52,619][33226] Updated weights for policy 1, policy_version 44670 (0.0007) [2023-10-14 02:48:52,867][33201] Updated weights for policy 0, policy_version 44260 (0.0008) [2023-10-14 02:48:53,240][33201] Updated weights for policy 0, policy_version 44270 (0.0011) [2023-10-14 02:48:53,606][33201] Updated weights for policy 0, policy_version 44280 (0.0009) [2023-10-14 02:48:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 91095040. Throughput: 0: 1751.9, 1: 1769.5. Samples: 22780658. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:48:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.950')] [2023-10-14 02:48:56,349][33226] Updated weights for policy 1, policy_version 44680 (0.0008) [2023-10-14 02:48:56,713][33226] Updated weights for policy 1, policy_version 44690 (0.0009) [2023-10-14 02:48:57,075][33226] Updated weights for policy 1, policy_version 44700 (0.0007) [2023-10-14 02:48:57,641][33201] Updated weights for policy 0, policy_version 44290 (0.0008) [2023-10-14 02:48:58,000][33201] Updated weights for policy 0, policy_version 44300 (0.0009) [2023-10-14 02:48:58,374][33201] Updated weights for policy 0, policy_version 44310 (0.0010) [2023-10-14 02:48:58,751][33201] Updated weights for policy 0, policy_version 44320 (0.0010) [2023-10-14 02:48:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 91160576. Throughput: 0: 1772.4, 1: 1781.2. Samples: 22792058. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:48:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 02:49:00,865][33226] Updated weights for policy 1, policy_version 44710 (0.0008) [2023-10-14 02:49:01,233][33226] Updated weights for policy 1, policy_version 44720 (0.0009) [2023-10-14 02:49:01,599][33226] Updated weights for policy 1, policy_version 44730 (0.0009) [2023-10-14 02:49:02,658][33201] Updated weights for policy 0, policy_version 44330 (0.0010) [2023-10-14 02:49:03,030][33201] Updated weights for policy 0, policy_version 44340 (0.0007) [2023-10-14 02:49:03,410][33201] Updated weights for policy 0, policy_version 44350 (0.0007) [2023-10-14 02:49:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 91226112. Throughput: 0: 1750.5, 1: 1778.8. Samples: 22812832. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:49:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:49:05,369][33226] Updated weights for policy 1, policy_version 44740 (0.0008) [2023-10-14 02:49:05,749][33226] Updated weights for policy 1, policy_version 44750 (0.0007) [2023-10-14 02:49:06,116][33226] Updated weights for policy 1, policy_version 44760 (0.0009) [2023-10-14 02:49:07,212][33201] Updated weights for policy 0, policy_version 44360 (0.0009) [2023-10-14 02:49:07,590][33201] Updated weights for policy 0, policy_version 44370 (0.0010) [2023-10-14 02:49:07,961][33201] Updated weights for policy 0, policy_version 44380 (0.0010) [2023-10-14 02:49:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 91291648. Throughput: 0: 1737.8, 1: 1779.0. Samples: 22834120. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 02:49:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:49:09,921][33226] Updated weights for policy 1, policy_version 44770 (0.0009) [2023-10-14 02:49:10,296][33226] Updated weights for policy 1, policy_version 44780 (0.0008) [2023-10-14 02:49:10,665][33226] Updated weights for policy 1, policy_version 44790 (0.0008) [2023-10-14 02:49:11,030][33226] Updated weights for policy 1, policy_version 44800 (0.0008) [2023-10-14 02:49:11,825][33201] Updated weights for policy 0, policy_version 44390 (0.0008) [2023-10-14 02:49:12,194][33201] Updated weights for policy 0, policy_version 44400 (0.0010) [2023-10-14 02:49:12,573][33201] Updated weights for policy 0, policy_version 44410 (0.0008) [2023-10-14 02:49:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 91357184. Throughput: 0: 1757.9, 1: 1776.3. Samples: 22844482. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:14,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.930')] [2023-10-14 02:49:14,946][33226] Updated weights for policy 1, policy_version 44810 (0.0009) [2023-10-14 02:49:15,311][33226] Updated weights for policy 1, policy_version 44820 (0.0007) [2023-10-14 02:49:15,676][33226] Updated weights for policy 1, policy_version 44830 (0.0010) [2023-10-14 02:49:16,362][33201] Updated weights for policy 0, policy_version 44420 (0.0010) [2023-10-14 02:49:16,734][33201] Updated weights for policy 0, policy_version 44430 (0.0009) [2023-10-14 02:49:17,097][33201] Updated weights for policy 0, policy_version 44440 (0.0008) [2023-10-14 02:49:19,350][33226] Updated weights for policy 1, policy_version 44840 (0.0007) [2023-10-14 02:49:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 91422720. Throughput: 0: 1737.4, 1: 1774.1. Samples: 22865778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:19,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 02:49:19,720][33226] Updated weights for policy 1, policy_version 44850 (0.0010) [2023-10-14 02:49:20,088][33226] Updated weights for policy 1, policy_version 44860 (0.0009) [2023-10-14 02:49:21,046][33201] Updated weights for policy 0, policy_version 44450 (0.0008) [2023-10-14 02:49:21,420][33201] Updated weights for policy 0, policy_version 44460 (0.0009) [2023-10-14 02:49:21,785][33201] Updated weights for policy 0, policy_version 44470 (0.0008) [2023-10-14 02:49:22,154][33201] Updated weights for policy 0, policy_version 44480 (0.0009) [2023-10-14 02:49:23,833][33226] Updated weights for policy 1, policy_version 44870 (0.0009) [2023-10-14 02:49:24,194][33226] Updated weights for policy 1, policy_version 44880 (0.0010) [2023-10-14 02:49:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 91488256. Throughput: 0: 1745.0, 1: 1802.7. 
Samples: 22887686. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 02:49:24,564][33226] Updated weights for policy 1, policy_version 44890 (0.0009) [2023-10-14 02:49:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000044480_45547520.pth... [2023-10-14 02:49:24,602][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000042848_43876352.pth [2023-10-14 02:49:24,778][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000044896_45973504.pth... [2023-10-14 02:49:24,815][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000043232_44269568.pth [2023-10-14 02:49:25,967][33201] Updated weights for policy 0, policy_version 44490 (0.0009) [2023-10-14 02:49:26,342][33201] Updated weights for policy 0, policy_version 44500 (0.0010) [2023-10-14 02:49:26,718][33201] Updated weights for policy 0, policy_version 44510 (0.0008) [2023-10-14 02:49:28,292][33226] Updated weights for policy 1, policy_version 44900 (0.0007) [2023-10-14 02:49:28,664][33226] Updated weights for policy 1, policy_version 44910 (0.0009) [2023-10-14 02:49:29,045][33226] Updated weights for policy 1, policy_version 44920 (0.0009) [2023-10-14 02:49:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.5, 300 sec: 14106.9). Total num frames: 91586560. Throughput: 0: 1745.8, 1: 1789.1. Samples: 22897916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:29,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.940')] [2023-10-14 02:49:30,469][33201] Updated weights for policy 0, policy_version 44520 (0.0008) [2023-10-14 02:49:30,844][33201] Updated weights for policy 0, policy_version 44530 (0.0009) [2023-10-14 02:49:31,214][33201] Updated weights for policy 0, policy_version 44540 (0.0008) [2023-10-14 02:49:32,883][33226] Updated weights for policy 1, policy_version 44930 (0.0009) [2023-10-14 02:49:33,246][33226] Updated weights for policy 1, policy_version 44940 (0.0009) [2023-10-14 02:49:33,617][33226] Updated weights for policy 1, policy_version 44950 (0.0010) [2023-10-14 02:49:33,983][33226] Updated weights for policy 1, policy_version 44960 (0.0010) [2023-10-14 02:49:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 91652096. Throughput: 0: 1739.2, 1: 1808.4. Samples: 22919902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:34,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.940')] [2023-10-14 02:49:35,023][33201] Updated weights for policy 0, policy_version 44550 (0.0008) [2023-10-14 02:49:35,395][33201] Updated weights for policy 0, policy_version 44560 (0.0008) [2023-10-14 02:49:35,757][33201] Updated weights for policy 0, policy_version 44570 (0.0008) [2023-10-14 02:49:37,748][33226] Updated weights for policy 1, policy_version 44970 (0.0010) [2023-10-14 02:49:38,116][33226] Updated weights for policy 1, policy_version 44980 (0.0011) [2023-10-14 02:49:38,489][33226] Updated weights for policy 1, policy_version 44990 (0.0011) [2023-10-14 02:49:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 91717632. Throughput: 0: 1772.9, 1: 1782.2. Samples: 22940640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:39,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.940')] [2023-10-14 02:49:39,649][33201] Updated weights for policy 0, policy_version 44580 (0.0007) [2023-10-14 02:49:40,019][33201] Updated weights for policy 0, policy_version 44590 (0.0007) [2023-10-14 02:49:40,396][33201] Updated weights for policy 0, policy_version 44600 (0.0007) [2023-10-14 02:49:42,376][33226] Updated weights for policy 1, policy_version 45000 (0.0009) [2023-10-14 02:49:42,751][33226] Updated weights for policy 1, policy_version 45010 (0.0007) [2023-10-14 02:49:43,110][33226] Updated weights for policy 1, policy_version 45020 (0.0007) [2023-10-14 02:49:44,143][33201] Updated weights for policy 0, policy_version 44610 (0.0009) [2023-10-14 02:49:44,510][33201] Updated weights for policy 0, policy_version 44620 (0.0007) [2023-10-14 02:49:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 91783168. Throughput: 0: 1744.8, 1: 1803.2. Samples: 22951714. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:49:44,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 02:49:44,877][33201] Updated weights for policy 0, policy_version 44630 (0.0007) [2023-10-14 02:49:45,249][33201] Updated weights for policy 0, policy_version 44640 (0.0007) [2023-10-14 02:49:46,726][33226] Updated weights for policy 1, policy_version 45030 (0.0008) [2023-10-14 02:49:47,123][33226] Updated weights for policy 1, policy_version 45040 (0.0008) [2023-10-14 02:49:47,489][33226] Updated weights for policy 1, policy_version 45050 (0.0008) [2023-10-14 02:49:49,133][33201] Updated weights for policy 0, policy_version 44650 (0.0008) [2023-10-14 02:49:49,504][33201] Updated weights for policy 0, policy_version 44660 (0.0008) [2023-10-14 02:49:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 91848704. Throughput: 0: 1771.0, 1: 1775.2. 
Samples: 22972414. Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:49:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.890')] [2023-10-14 02:49:49,871][33201] Updated weights for policy 0, policy_version 44670 (0.0009) [2023-10-14 02:49:51,275][33226] Updated weights for policy 1, policy_version 45060 (0.0010) [2023-10-14 02:49:51,642][33226] Updated weights for policy 1, policy_version 45070 (0.0010) [2023-10-14 02:49:52,019][33226] Updated weights for policy 1, policy_version 45080 (0.0010) [2023-10-14 02:49:53,631][33201] Updated weights for policy 0, policy_version 44680 (0.0011) [2023-10-14 02:49:53,999][33201] Updated weights for policy 0, policy_version 44690 (0.0011) [2023-10-14 02:49:54,368][33201] Updated weights for policy 0, policy_version 44700 (0.0008) [2023-10-14 02:49:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 91947008. Throughput: 0: 1768.0, 1: 1772.7. Samples: 22993448. Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:49:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.890')] [2023-10-14 02:49:55,735][33226] Updated weights for policy 1, policy_version 45090 (0.0007) [2023-10-14 02:49:56,107][33226] Updated weights for policy 1, policy_version 45100 (0.0007) [2023-10-14 02:49:56,475][33226] Updated weights for policy 1, policy_version 45110 (0.0009) [2023-10-14 02:49:56,839][33226] Updated weights for policy 1, policy_version 45120 (0.0011) [2023-10-14 02:49:58,329][33201] Updated weights for policy 0, policy_version 44710 (0.0011) [2023-10-14 02:49:58,700][33201] Updated weights for policy 0, policy_version 44720 (0.0008) [2023-10-14 02:49:59,078][33201] Updated weights for policy 0, policy_version 44730 (0.0007) [2023-10-14 02:49:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 92012544. Throughput: 0: 1766.3, 1: 1775.3. Samples: 23003856. 
Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:49:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.890')] [2023-10-14 02:50:00,621][33226] Updated weights for policy 1, policy_version 45130 (0.0011) [2023-10-14 02:50:00,991][33226] Updated weights for policy 1, policy_version 45140 (0.0009) [2023-10-14 02:50:01,347][33226] Updated weights for policy 1, policy_version 45150 (0.0011) [2023-10-14 02:50:02,831][33201] Updated weights for policy 0, policy_version 44740 (0.0008) [2023-10-14 02:50:03,197][33201] Updated weights for policy 0, policy_version 44750 (0.0009) [2023-10-14 02:50:03,573][33201] Updated weights for policy 0, policy_version 44760 (0.0007) [2023-10-14 02:50:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 92078080. Throughput: 0: 1772.3, 1: 1778.0. Samples: 23025546. Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:50:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.890')] [2023-10-14 02:50:05,238][33226] Updated weights for policy 1, policy_version 45160 (0.0010) [2023-10-14 02:50:05,603][33226] Updated weights for policy 1, policy_version 45170 (0.0009) [2023-10-14 02:50:05,969][33226] Updated weights for policy 1, policy_version 45180 (0.0008) [2023-10-14 02:50:07,420][33201] Updated weights for policy 0, policy_version 44770 (0.0008) [2023-10-14 02:50:07,779][33201] Updated weights for policy 0, policy_version 44780 (0.0008) [2023-10-14 02:50:08,149][33201] Updated weights for policy 0, policy_version 44790 (0.0008) [2023-10-14 02:50:08,523][33201] Updated weights for policy 0, policy_version 44800 (0.0009) [2023-10-14 02:50:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 92143616. Throughput: 0: 1753.0, 1: 1783.4. Samples: 23046826. 
Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:50:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.890')] [2023-10-14 02:50:09,822][33226] Updated weights for policy 1, policy_version 45190 (0.0008) [2023-10-14 02:50:10,194][33226] Updated weights for policy 1, policy_version 45200 (0.0007) [2023-10-14 02:50:10,557][33226] Updated weights for policy 1, policy_version 45210 (0.0007) [2023-10-14 02:50:12,499][33201] Updated weights for policy 0, policy_version 44810 (0.0009) [2023-10-14 02:50:12,877][33201] Updated weights for policy 0, policy_version 44820 (0.0007) [2023-10-14 02:50:13,246][33201] Updated weights for policy 0, policy_version 44830 (0.0007) [2023-10-14 02:50:14,168][33226] Updated weights for policy 1, policy_version 45220 (0.0007) [2023-10-14 02:50:14,542][33226] Updated weights for policy 1, policy_version 45230 (0.0007) [2023-10-14 02:50:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 92209152. Throughput: 0: 1785.7, 1: 1768.4. Samples: 23057852. Policy #0 lag: (min: 28.0, avg: 30.8, max: 60.0) [2023-10-14 02:50:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 02:50:14,911][33226] Updated weights for policy 1, policy_version 45240 (0.0010) [2023-10-14 02:50:17,038][33201] Updated weights for policy 0, policy_version 44840 (0.0007) [2023-10-14 02:50:17,410][33201] Updated weights for policy 0, policy_version 44850 (0.0009) [2023-10-14 02:50:17,778][33201] Updated weights for policy 0, policy_version 44860 (0.0008) [2023-10-14 02:50:18,557][33226] Updated weights for policy 1, policy_version 45250 (0.0010) [2023-10-14 02:50:18,924][33226] Updated weights for policy 1, policy_version 45260 (0.0007) [2023-10-14 02:50:19,293][33226] Updated weights for policy 1, policy_version 45270 (0.0010) [2023-10-14 02:50:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 92274688. Throughput: 0: 1752.4, 1: 1783.2. 
Samples: 23079000. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:19,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.920')] [2023-10-14 02:50:19,659][33226] Updated weights for policy 1, policy_version 45280 (0.0009) [2023-10-14 02:50:21,623][33201] Updated weights for policy 0, policy_version 44870 (0.0011) [2023-10-14 02:50:21,995][33201] Updated weights for policy 0, policy_version 44880 (0.0010) [2023-10-14 02:50:22,371][33201] Updated weights for policy 0, policy_version 44890 (0.0008) [2023-10-14 02:50:23,528][33226] Updated weights for policy 1, policy_version 45290 (0.0009) [2023-10-14 02:50:23,896][33226] Updated weights for policy 1, policy_version 45300 (0.0009) [2023-10-14 02:50:24,261][33226] Updated weights for policy 1, policy_version 45310 (0.0008) [2023-10-14 02:50:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 92372992. Throughput: 0: 1755.7, 1: 1792.9. Samples: 23100330. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:24,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:50:26,158][33201] Updated weights for policy 0, policy_version 44900 (0.0010) [2023-10-14 02:50:26,521][33201] Updated weights for policy 0, policy_version 44910 (0.0011) [2023-10-14 02:50:26,889][33201] Updated weights for policy 0, policy_version 44920 (0.0010) [2023-10-14 02:50:27,962][33226] Updated weights for policy 1, policy_version 45320 (0.0008) [2023-10-14 02:50:28,329][33226] Updated weights for policy 1, policy_version 45330 (0.0008) [2023-10-14 02:50:28,697][33226] Updated weights for policy 1, policy_version 45340 (0.0009) [2023-10-14 02:50:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 92438528. Throughput: 0: 1756.0, 1: 1789.5. Samples: 23111260. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:29,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.920')] [2023-10-14 02:50:30,734][33201] Updated weights for policy 0, policy_version 44930 (0.0010) [2023-10-14 02:50:31,103][33201] Updated weights for policy 0, policy_version 44940 (0.0009) [2023-10-14 02:50:31,466][33201] Updated weights for policy 0, policy_version 44950 (0.0010) [2023-10-14 02:50:31,844][33201] Updated weights for policy 0, policy_version 44960 (0.0007) [2023-10-14 02:50:32,583][33226] Updated weights for policy 1, policy_version 45350 (0.0008) [2023-10-14 02:50:32,979][33226] Updated weights for policy 1, policy_version 45360 (0.0010) [2023-10-14 02:50:33,358][33226] Updated weights for policy 1, policy_version 45370 (0.0009) [2023-10-14 02:50:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 92504064. Throughput: 0: 1750.4, 1: 1801.2. Samples: 23132240. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:34,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 02:50:35,672][33201] Updated weights for policy 0, policy_version 44970 (0.0011) [2023-10-14 02:50:36,033][33201] Updated weights for policy 0, policy_version 44980 (0.0009) [2023-10-14 02:50:36,411][33201] Updated weights for policy 0, policy_version 44990 (0.0007) [2023-10-14 02:50:37,153][33226] Updated weights for policy 1, policy_version 45380 (0.0009) [2023-10-14 02:50:37,525][33226] Updated weights for policy 1, policy_version 45390 (0.0008) [2023-10-14 02:50:37,894][33226] Updated weights for policy 1, policy_version 45400 (0.0007) [2023-10-14 02:50:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 92569600. Throughput: 0: 1773.6, 1: 1785.9. Samples: 23153622. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:50:40,255][33201] Updated weights for policy 0, policy_version 45000 (0.0008) [2023-10-14 02:50:40,628][33201] Updated weights for policy 0, policy_version 45010 (0.0007) [2023-10-14 02:50:41,005][33201] Updated weights for policy 0, policy_version 45020 (0.0008) [2023-10-14 02:50:41,665][33226] Updated weights for policy 1, policy_version 45410 (0.0009) [2023-10-14 02:50:42,019][33226] Updated weights for policy 1, policy_version 45420 (0.0009) [2023-10-14 02:50:42,384][33226] Updated weights for policy 1, policy_version 45430 (0.0008) [2023-10-14 02:50:42,741][33226] Updated weights for policy 1, policy_version 45440 (0.0009) [2023-10-14 02:50:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 92635136. Throughput: 0: 1758.8, 1: 1807.9. Samples: 23164354. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:44,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.910')] [2023-10-14 02:50:44,618][33201] Updated weights for policy 0, policy_version 45030 (0.0009) [2023-10-14 02:50:44,988][33201] Updated weights for policy 0, policy_version 45040 (0.0009) [2023-10-14 02:50:45,362][33201] Updated weights for policy 0, policy_version 45050 (0.0007) [2023-10-14 02:50:46,676][33226] Updated weights for policy 1, policy_version 45450 (0.0007) [2023-10-14 02:50:47,039][33226] Updated weights for policy 1, policy_version 45460 (0.0007) [2023-10-14 02:50:47,407][33226] Updated weights for policy 1, policy_version 45470 (0.0008) [2023-10-14 02:50:49,155][33201] Updated weights for policy 0, policy_version 45060 (0.0007) [2023-10-14 02:50:49,529][33201] Updated weights for policy 0, policy_version 45070 (0.0010) [2023-10-14 02:50:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 92700672. Throughput: 0: 1775.6, 1: 1777.0. 
Samples: 23185412. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-10-14 02:50:49,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 02:50:49,909][33201] Updated weights for policy 0, policy_version 45080 (0.0009) [2023-10-14 02:50:51,036][33226] Updated weights for policy 1, policy_version 45480 (0.0009) [2023-10-14 02:50:51,416][33226] Updated weights for policy 1, policy_version 45490 (0.0009) [2023-10-14 02:50:51,779][33226] Updated weights for policy 1, policy_version 45500 (0.0012) [2023-10-14 02:50:53,787][33201] Updated weights for policy 0, policy_version 45090 (0.0009) [2023-10-14 02:50:54,152][33201] Updated weights for policy 0, policy_version 45100 (0.0009) [2023-10-14 02:50:54,521][33201] Updated weights for policy 0, policy_version 45110 (0.0010) [2023-10-14 02:50:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 92766208. Throughput: 0: 1785.7, 1: 1783.2. Samples: 23207426. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:50:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 02:50:54,884][33201] Updated weights for policy 0, policy_version 45120 (0.0010) [2023-10-14 02:50:55,560][33226] Updated weights for policy 1, policy_version 45510 (0.0008) [2023-10-14 02:50:55,929][33226] Updated weights for policy 1, policy_version 45520 (0.0009) [2023-10-14 02:50:56,300][33226] Updated weights for policy 1, policy_version 45530 (0.0008) [2023-10-14 02:50:58,842][33201] Updated weights for policy 0, policy_version 45130 (0.0007) [2023-10-14 02:50:59,216][33201] Updated weights for policy 0, policy_version 45140 (0.0007) [2023-10-14 02:50:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 92831744. Throughput: 0: 1763.4, 1: 1783.6. Samples: 23217470. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:50:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 02:50:59,590][33201] Updated weights for policy 0, policy_version 45150 (0.0008) [2023-10-14 02:50:59,974][33226] Updated weights for policy 1, policy_version 45540 (0.0012) [2023-10-14 02:51:00,341][33226] Updated weights for policy 1, policy_version 45550 (0.0010) [2023-10-14 02:51:00,710][33226] Updated weights for policy 1, policy_version 45560 (0.0009) [2023-10-14 02:51:03,499][33201] Updated weights for policy 0, policy_version 45160 (0.0009) [2023-10-14 02:51:03,871][33201] Updated weights for policy 0, policy_version 45170 (0.0008) [2023-10-14 02:51:04,238][33201] Updated weights for policy 0, policy_version 45180 (0.0008) [2023-10-14 02:51:04,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 92930048. Throughput: 0: 1789.6, 1: 1775.6. Samples: 23239430. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:51:04,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 02:51:04,590][33226] Updated weights for policy 1, policy_version 45570 (0.0008) [2023-10-14 02:51:04,967][33226] Updated weights for policy 1, policy_version 45580 (0.0007) [2023-10-14 02:51:05,333][33226] Updated weights for policy 1, policy_version 45590 (0.0008) [2023-10-14 02:51:05,698][33226] Updated weights for policy 1, policy_version 45600 (0.0008) [2023-10-14 02:51:07,994][33201] Updated weights for policy 0, policy_version 45190 (0.0009) [2023-10-14 02:51:08,369][33201] Updated weights for policy 0, policy_version 45200 (0.0009) [2023-10-14 02:51:08,737][33201] Updated weights for policy 0, policy_version 45210 (0.0008) [2023-10-14 02:51:09,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 92995584. Throughput: 0: 1753.7, 1: 1797.4. Samples: 23260128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:51:09,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 02:51:09,663][33226] Updated weights for policy 1, policy_version 45610 (0.0009) [2023-10-14 02:51:10,023][33226] Updated weights for policy 1, policy_version 45620 (0.0009) [2023-10-14 02:51:10,393][33226] Updated weights for policy 1, policy_version 45630 (0.0010) [2023-10-14 02:51:12,473][33201] Updated weights for policy 0, policy_version 45220 (0.0008) [2023-10-14 02:51:12,843][33201] Updated weights for policy 0, policy_version 45230 (0.0009) [2023-10-14 02:51:13,210][33201] Updated weights for policy 0, policy_version 45240 (0.0007) [2023-10-14 02:51:14,209][33226] Updated weights for policy 1, policy_version 45640 (0.0007) [2023-10-14 02:51:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 93061120. Throughput: 0: 1786.9, 1: 1767.9. Samples: 23271228. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:51:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 02:51:14,575][33226] Updated weights for policy 1, policy_version 45650 (0.0007) [2023-10-14 02:51:14,934][33226] Updated weights for policy 1, policy_version 45660 (0.0009) [2023-10-14 02:51:17,102][33201] Updated weights for policy 0, policy_version 45250 (0.0007) [2023-10-14 02:51:17,465][33201] Updated weights for policy 0, policy_version 45260 (0.0008) [2023-10-14 02:51:17,834][33201] Updated weights for policy 0, policy_version 45270 (0.0009) [2023-10-14 02:51:18,206][33201] Updated weights for policy 0, policy_version 45280 (0.0010) [2023-10-14 02:51:18,600][33226] Updated weights for policy 1, policy_version 45670 (0.0009) [2023-10-14 02:51:18,967][33226] Updated weights for policy 1, policy_version 45680 (0.0010) [2023-10-14 02:51:19,325][33226] Updated weights for policy 1, policy_version 45690 (0.0009) [2023-10-14 02:51:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 
14745.6, 300 sec: 14218.0). Total num frames: 93159424. Throughput: 0: 1760.7, 1: 1795.8. Samples: 23292284. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:51:19,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.930')] [2023-10-14 02:51:21,957][33201] Updated weights for policy 0, policy_version 45290 (0.0009) [2023-10-14 02:51:22,320][33201] Updated weights for policy 0, policy_version 45300 (0.0007) [2023-10-14 02:51:22,691][33201] Updated weights for policy 0, policy_version 45310 (0.0008) [2023-10-14 02:51:22,946][33226] Updated weights for policy 1, policy_version 45700 (0.0008) [2023-10-14 02:51:23,310][33226] Updated weights for policy 1, policy_version 45710 (0.0008) [2023-10-14 02:51:23,684][33226] Updated weights for policy 1, policy_version 45720 (0.0008) [2023-10-14 02:51:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93224960. Throughput: 0: 1762.6, 1: 1787.1. Samples: 23313356. Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.930')] [2023-10-14 02:51:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000045728_46825472.pth... [2023-10-14 02:51:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000045312_46399488.pth... 
[2023-10-14 02:51:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000043680_44728320.pth [2023-10-14 02:51:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000044064_45121536.pth [2023-10-14 02:51:26,359][33201] Updated weights for policy 0, policy_version 45320 (0.0007) [2023-10-14 02:51:26,733][33201] Updated weights for policy 0, policy_version 45330 (0.0007) [2023-10-14 02:51:27,110][33201] Updated weights for policy 0, policy_version 45340 (0.0007) [2023-10-14 02:51:27,429][33226] Updated weights for policy 1, policy_version 45730 (0.0008) [2023-10-14 02:51:27,795][33226] Updated weights for policy 1, policy_version 45740 (0.0007) [2023-10-14 02:51:28,161][33226] Updated weights for policy 1, policy_version 45750 (0.0007) [2023-10-14 02:51:28,529][33226] Updated weights for policy 1, policy_version 45760 (0.0009) [2023-10-14 02:51:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93290496. Throughput: 0: 1769.5, 1: 1792.9. Samples: 23324662. Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:29,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.930')] [2023-10-14 02:51:30,904][33201] Updated weights for policy 0, policy_version 45350 (0.0008) [2023-10-14 02:51:31,266][33201] Updated weights for policy 0, policy_version 45360 (0.0009) [2023-10-14 02:51:31,643][33201] Updated weights for policy 0, policy_version 45370 (0.0007) [2023-10-14 02:51:32,219][33226] Updated weights for policy 1, policy_version 45770 (0.0008) [2023-10-14 02:51:32,597][33226] Updated weights for policy 1, policy_version 45780 (0.0009) [2023-10-14 02:51:32,978][33226] Updated weights for policy 1, policy_version 45790 (0.0009) [2023-10-14 02:51:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93356032. Throughput: 0: 1758.4, 1: 1796.3. Samples: 23345376. 
Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:34,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.980')] [2023-10-14 02:51:35,498][33201] Updated weights for policy 0, policy_version 45380 (0.0008) [2023-10-14 02:51:35,875][33201] Updated weights for policy 0, policy_version 45390 (0.0008) [2023-10-14 02:51:36,245][33201] Updated weights for policy 0, policy_version 45400 (0.0008) [2023-10-14 02:51:36,768][33226] Updated weights for policy 1, policy_version 45800 (0.0009) [2023-10-14 02:51:37,143][33226] Updated weights for policy 1, policy_version 45810 (0.0007) [2023-10-14 02:51:37,501][33226] Updated weights for policy 1, policy_version 45820 (0.0009) [2023-10-14 02:51:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93421568. Throughput: 0: 1770.6, 1: 1787.7. Samples: 23367546. Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 02:51:40,114][33201] Updated weights for policy 0, policy_version 45410 (0.0008) [2023-10-14 02:51:40,480][33201] Updated weights for policy 0, policy_version 45420 (0.0008) [2023-10-14 02:51:40,864][33201] Updated weights for policy 0, policy_version 45430 (0.0007) [2023-10-14 02:51:41,203][33226] Updated weights for policy 1, policy_version 45830 (0.0009) [2023-10-14 02:51:41,221][33201] Updated weights for policy 0, policy_version 45440 (0.0007) [2023-10-14 02:51:41,572][33226] Updated weights for policy 1, policy_version 45840 (0.0008) [2023-10-14 02:51:41,934][33226] Updated weights for policy 1, policy_version 45850 (0.0009) [2023-10-14 02:51:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93487104. Throughput: 0: 1760.8, 1: 1796.2. Samples: 23377536. 
Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 02:51:45,013][33201] Updated weights for policy 0, policy_version 45450 (0.0009) [2023-10-14 02:51:45,383][33201] Updated weights for policy 0, policy_version 45460 (0.0009) [2023-10-14 02:51:45,724][33226] Updated weights for policy 1, policy_version 45860 (0.0008) [2023-10-14 02:51:45,745][33201] Updated weights for policy 0, policy_version 45470 (0.0007) [2023-10-14 02:51:46,097][33226] Updated weights for policy 1, policy_version 45870 (0.0009) [2023-10-14 02:51:46,465][33226] Updated weights for policy 1, policy_version 45880 (0.0010) [2023-10-14 02:51:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 93552640. Throughput: 0: 1763.5, 1: 1784.7. Samples: 23399098. Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:49,562][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 02:51:49,586][33201] Updated weights for policy 0, policy_version 45480 (0.0009) [2023-10-14 02:51:49,953][33201] Updated weights for policy 0, policy_version 45490 (0.0009) [2023-10-14 02:51:50,294][33226] Updated weights for policy 1, policy_version 45890 (0.0008) [2023-10-14 02:51:50,324][33201] Updated weights for policy 0, policy_version 45500 (0.0007) [2023-10-14 02:51:50,664][33226] Updated weights for policy 1, policy_version 45900 (0.0007) [2023-10-14 02:51:51,031][33226] Updated weights for policy 1, policy_version 45910 (0.0008) [2023-10-14 02:51:51,401][33226] Updated weights for policy 1, policy_version 45920 (0.0009) [2023-10-14 02:51:54,281][33201] Updated weights for policy 0, policy_version 45510 (0.0010) [2023-10-14 02:51:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93618176. Throughput: 0: 1792.3, 1: 1786.3. Samples: 23421166. 
Policy #0 lag: (min: 26.0, avg: 29.2, max: 58.0) [2023-10-14 02:51:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 02:51:54,648][33201] Updated weights for policy 0, policy_version 45520 (0.0010) [2023-10-14 02:51:55,013][33201] Updated weights for policy 0, policy_version 45530 (0.0008) [2023-10-14 02:51:55,138][33226] Updated weights for policy 1, policy_version 45930 (0.0008) [2023-10-14 02:51:55,512][33226] Updated weights for policy 1, policy_version 45940 (0.0008) [2023-10-14 02:51:55,873][33226] Updated weights for policy 1, policy_version 45950 (0.0008) [2023-10-14 02:51:58,904][33201] Updated weights for policy 0, policy_version 45540 (0.0007) [2023-10-14 02:51:59,283][33201] Updated weights for policy 0, policy_version 45550 (0.0009) [2023-10-14 02:51:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 93683712. Throughput: 0: 1761.6, 1: 1786.1. Samples: 23430876. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:51:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 02:51:59,661][33201] Updated weights for policy 0, policy_version 45560 (0.0009) [2023-10-14 02:51:59,779][33226] Updated weights for policy 1, policy_version 45960 (0.0009) [2023-10-14 02:52:00,147][33226] Updated weights for policy 1, policy_version 45970 (0.0010) [2023-10-14 02:52:00,532][33226] Updated weights for policy 1, policy_version 45980 (0.0010) [2023-10-14 02:52:03,499][33201] Updated weights for policy 0, policy_version 45570 (0.0009) [2023-10-14 02:52:03,879][33201] Updated weights for policy 0, policy_version 45580 (0.0010) [2023-10-14 02:52:04,252][33201] Updated weights for policy 0, policy_version 45590 (0.0010) [2023-10-14 02:52:04,414][33226] Updated weights for policy 1, policy_version 45990 (0.0007) [2023-10-14 02:52:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 93749248. Throughput: 0: 1791.2, 1: 1774.1. 
Samples: 23452722. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 02:52:04,625][33201] Updated weights for policy 0, policy_version 45600 (0.0009) [2023-10-14 02:52:04,796][33226] Updated weights for policy 1, policy_version 46000 (0.0008) [2023-10-14 02:52:05,166][33226] Updated weights for policy 1, policy_version 46010 (0.0010) [2023-10-14 02:52:08,540][33201] Updated weights for policy 0, policy_version 45610 (0.0007) [2023-10-14 02:52:08,898][33201] Updated weights for policy 0, policy_version 45620 (0.0010) [2023-10-14 02:52:08,964][33226] Updated weights for policy 1, policy_version 46020 (0.0009) [2023-10-14 02:52:09,264][33201] Updated weights for policy 0, policy_version 45630 (0.0008) [2023-10-14 02:52:09,319][33226] Updated weights for policy 1, policy_version 46030 (0.0007) [2023-10-14 02:52:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 93847552. Throughput: 0: 1760.2, 1: 1794.4. Samples: 23473312. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 02:52:09,688][33226] Updated weights for policy 1, policy_version 46040 (0.0008) [2023-10-14 02:52:13,147][33201] Updated weights for policy 0, policy_version 45640 (0.0008) [2023-10-14 02:52:13,515][33201] Updated weights for policy 0, policy_version 45650 (0.0007) [2023-10-14 02:52:13,519][33226] Updated weights for policy 1, policy_version 46050 (0.0007) [2023-10-14 02:52:13,880][33201] Updated weights for policy 0, policy_version 45660 (0.0007) [2023-10-14 02:52:13,893][33226] Updated weights for policy 1, policy_version 46060 (0.0007) [2023-10-14 02:52:14,254][33226] Updated weights for policy 1, policy_version 46070 (0.0007) [2023-10-14 02:52:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 93913088. 
Throughput: 0: 1771.2, 1: 1769.1. Samples: 23483974. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 02:52:14,624][33226] Updated weights for policy 1, policy_version 46080 (0.0008) [2023-10-14 02:52:17,667][33201] Updated weights for policy 0, policy_version 45670 (0.0010) [2023-10-14 02:52:18,036][33201] Updated weights for policy 0, policy_version 45680 (0.0009) [2023-10-14 02:52:18,294][33226] Updated weights for policy 1, policy_version 46090 (0.0007) [2023-10-14 02:52:18,410][33201] Updated weights for policy 0, policy_version 45690 (0.0007) [2023-10-14 02:52:18,667][33226] Updated weights for policy 1, policy_version 46100 (0.0007) [2023-10-14 02:52:19,032][33226] Updated weights for policy 1, policy_version 46110 (0.0007) [2023-10-14 02:52:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 94011392. Throughput: 0: 1765.0, 1: 1794.9. Samples: 23505570. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 02:52:22,209][33201] Updated weights for policy 0, policy_version 45700 (0.0008) [2023-10-14 02:52:22,581][33201] Updated weights for policy 0, policy_version 45710 (0.0010) [2023-10-14 02:52:22,950][33201] Updated weights for policy 0, policy_version 45720 (0.0008) [2023-10-14 02:52:22,990][33226] Updated weights for policy 1, policy_version 46120 (0.0009) [2023-10-14 02:52:23,361][33226] Updated weights for policy 1, policy_version 46130 (0.0007) [2023-10-14 02:52:23,725][33226] Updated weights for policy 1, policy_version 46140 (0.0007) [2023-10-14 02:52:24,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 94076928. Throughput: 0: 1744.9, 1: 1764.3. Samples: 23525462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 02:52:26,860][33201] Updated weights for policy 0, policy_version 45730 (0.0009) [2023-10-14 02:52:27,233][33201] Updated weights for policy 0, policy_version 45740 (0.0007) [2023-10-14 02:52:27,600][33201] Updated weights for policy 0, policy_version 45750 (0.0008) [2023-10-14 02:52:27,671][33226] Updated weights for policy 1, policy_version 46150 (0.0010) [2023-10-14 02:52:27,974][33201] Updated weights for policy 0, policy_version 45760 (0.0008) [2023-10-14 02:52:28,045][33226] Updated weights for policy 1, policy_version 46160 (0.0008) [2023-10-14 02:52:28,413][33226] Updated weights for policy 1, policy_version 46170 (0.0009) [2023-10-14 02:52:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 94142464. Throughput: 0: 1765.9, 1: 1780.1. Samples: 23537108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 02:52:31,877][33201] Updated weights for policy 0, policy_version 45770 (0.0007) [2023-10-14 02:52:32,245][33201] Updated weights for policy 0, policy_version 45780 (0.0007) [2023-10-14 02:52:32,246][33226] Updated weights for policy 1, policy_version 46180 (0.0008) [2023-10-14 02:52:32,616][33201] Updated weights for policy 0, policy_version 45790 (0.0008) [2023-10-14 02:52:32,620][33226] Updated weights for policy 1, policy_version 46190 (0.0007) [2023-10-14 02:52:32,984][33226] Updated weights for policy 1, policy_version 46200 (0.0010) [2023-10-14 02:52:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94208000. Throughput: 0: 1741.5, 1: 1768.4. Samples: 23557044. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 02:52:36,445][33201] Updated weights for policy 0, policy_version 45800 (0.0009) [2023-10-14 02:52:36,814][33201] Updated weights for policy 0, policy_version 45810 (0.0008) [2023-10-14 02:52:36,831][33226] Updated weights for policy 1, policy_version 46210 (0.0007) [2023-10-14 02:52:37,186][33201] Updated weights for policy 0, policy_version 45820 (0.0008) [2023-10-14 02:52:37,191][33226] Updated weights for policy 1, policy_version 46220 (0.0008) [2023-10-14 02:52:37,556][33226] Updated weights for policy 1, policy_version 46230 (0.0010) [2023-10-14 02:52:37,920][33226] Updated weights for policy 1, policy_version 46240 (0.0008) [2023-10-14 02:52:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94273536. Throughput: 0: 1745.0, 1: 1751.5. Samples: 23578508. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 02:52:41,003][33201] Updated weights for policy 0, policy_version 45830 (0.0007) [2023-10-14 02:52:41,366][33201] Updated weights for policy 0, policy_version 45840 (0.0008) [2023-10-14 02:52:41,574][33226] Updated weights for policy 1, policy_version 46250 (0.0009) [2023-10-14 02:52:41,733][33201] Updated weights for policy 0, policy_version 45850 (0.0008) [2023-10-14 02:52:41,929][33226] Updated weights for policy 1, policy_version 46260 (0.0008) [2023-10-14 02:52:42,300][33226] Updated weights for policy 1, policy_version 46270 (0.0007) [2023-10-14 02:52:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94339072. Throughput: 0: 1738.5, 1: 1770.8. Samples: 23588794. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 02:52:45,583][33201] Updated weights for policy 0, policy_version 45860 (0.0007) [2023-10-14 02:52:45,962][33201] Updated weights for policy 0, policy_version 45870 (0.0008) [2023-10-14 02:52:46,138][33226] Updated weights for policy 1, policy_version 46280 (0.0009) [2023-10-14 02:52:46,330][33201] Updated weights for policy 0, policy_version 45880 (0.0008) [2023-10-14 02:52:46,496][33226] Updated weights for policy 1, policy_version 46290 (0.0007) [2023-10-14 02:52:46,860][33226] Updated weights for policy 1, policy_version 46300 (0.0009) [2023-10-14 02:52:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94404608. Throughput: 0: 1736.8, 1: 1761.7. Samples: 23610154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 02:52:50,184][33201] Updated weights for policy 0, policy_version 45890 (0.0008) [2023-10-14 02:52:50,552][33201] Updated weights for policy 0, policy_version 45900 (0.0009) [2023-10-14 02:52:50,789][33226] Updated weights for policy 1, policy_version 46310 (0.0008) [2023-10-14 02:52:50,924][33201] Updated weights for policy 0, policy_version 45910 (0.0008) [2023-10-14 02:52:51,169][33226] Updated weights for policy 1, policy_version 46320 (0.0009) [2023-10-14 02:52:51,291][33201] Updated weights for policy 0, policy_version 45920 (0.0008) [2023-10-14 02:52:51,534][33226] Updated weights for policy 1, policy_version 46330 (0.0009) [2023-10-14 02:52:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94470144. Throughput: 0: 1762.3, 1: 1767.3. Samples: 23632144. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 02:52:55,202][33201] Updated weights for policy 0, policy_version 45930 (0.0008) [2023-10-14 02:52:55,252][33226] Updated weights for policy 1, policy_version 46340 (0.0009) [2023-10-14 02:52:55,568][33201] Updated weights for policy 0, policy_version 45940 (0.0008) [2023-10-14 02:52:55,622][33226] Updated weights for policy 1, policy_version 46350 (0.0009) [2023-10-14 02:52:55,950][33201] Updated weights for policy 0, policy_version 45950 (0.0007) [2023-10-14 02:52:55,979][33226] Updated weights for policy 1, policy_version 46360 (0.0009) [2023-10-14 02:52:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 94535680. Throughput: 0: 1743.2, 1: 1762.4. Samples: 23641726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:52:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 02:52:59,669][33201] Updated weights for policy 0, policy_version 45960 (0.0008) [2023-10-14 02:52:59,794][33226] Updated weights for policy 1, policy_version 46370 (0.0010) [2023-10-14 02:53:00,044][33201] Updated weights for policy 0, policy_version 45970 (0.0007) [2023-10-14 02:53:00,160][33226] Updated weights for policy 1, policy_version 46380 (0.0008) [2023-10-14 02:53:00,415][33201] Updated weights for policy 0, policy_version 45980 (0.0007) [2023-10-14 02:53:00,543][33226] Updated weights for policy 1, policy_version 46390 (0.0009) [2023-10-14 02:53:00,898][33226] Updated weights for policy 1, policy_version 46400 (0.0010) [2023-10-14 02:53:04,194][33201] Updated weights for policy 0, policy_version 45990 (0.0009) [2023-10-14 02:53:04,526][33226] Updated weights for policy 1, policy_version 46410 (0.0008) [2023-10-14 02:53:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 94601216. Throughput: 0: 1759.2, 1: 1767.9. 
Samples: 23664290. Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 02:53:04,567][33201] Updated weights for policy 0, policy_version 46000 (0.0007) [2023-10-14 02:53:04,900][33226] Updated weights for policy 1, policy_version 46420 (0.0009) [2023-10-14 02:53:04,936][33201] Updated weights for policy 0, policy_version 46010 (0.0008) [2023-10-14 02:53:05,263][33226] Updated weights for policy 1, policy_version 46430 (0.0008) [2023-10-14 02:53:08,906][33201] Updated weights for policy 0, policy_version 46020 (0.0008) [2023-10-14 02:53:09,241][33226] Updated weights for policy 1, policy_version 46440 (0.0008) [2023-10-14 02:53:09,277][33201] Updated weights for policy 0, policy_version 46030 (0.0007) [2023-10-14 02:53:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 94666752. Throughput: 0: 1765.9, 1: 1793.3. Samples: 23685622. Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 02:53:09,612][33226] Updated weights for policy 1, policy_version 46450 (0.0009) [2023-10-14 02:53:09,658][33201] Updated weights for policy 0, policy_version 46040 (0.0008) [2023-10-14 02:53:09,980][33226] Updated weights for policy 1, policy_version 46460 (0.0007) [2023-10-14 02:53:13,594][33201] Updated weights for policy 0, policy_version 46050 (0.0008) [2023-10-14 02:53:13,634][33226] Updated weights for policy 1, policy_version 46470 (0.0009) [2023-10-14 02:53:13,963][33201] Updated weights for policy 0, policy_version 46060 (0.0009) [2023-10-14 02:53:14,003][33226] Updated weights for policy 1, policy_version 46480 (0.0010) [2023-10-14 02:53:14,331][33201] Updated weights for policy 0, policy_version 46070 (0.0008) [2023-10-14 02:53:14,362][33226] Updated weights for policy 1, policy_version 46490 (0.0008) [2023-10-14 02:53:14,557][31953] Fps is (10 sec: 
13107.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 94732288. Throughput: 0: 1753.5, 1: 1768.0. Samples: 23695576. Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 02:53:14,709][33201] Updated weights for policy 0, policy_version 46080 (0.0010) [2023-10-14 02:53:18,249][33226] Updated weights for policy 1, policy_version 46500 (0.0008) [2023-10-14 02:53:18,424][33201] Updated weights for policy 0, policy_version 46090 (0.0009) [2023-10-14 02:53:18,618][33226] Updated weights for policy 1, policy_version 46510 (0.0009) [2023-10-14 02:53:18,785][33201] Updated weights for policy 0, policy_version 46100 (0.0008) [2023-10-14 02:53:18,988][33226] Updated weights for policy 1, policy_version 46520 (0.0009) [2023-10-14 02:53:19,153][33201] Updated weights for policy 0, policy_version 46110 (0.0007) [2023-10-14 02:53:19,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 94863360. Throughput: 0: 1775.3, 1: 1793.0. Samples: 23717614. Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 02:53:22,829][33226] Updated weights for policy 1, policy_version 46530 (0.0008) [2023-10-14 02:53:23,107][33201] Updated weights for policy 0, policy_version 46120 (0.0008) [2023-10-14 02:53:23,198][33226] Updated weights for policy 1, policy_version 46540 (0.0009) [2023-10-14 02:53:23,469][33201] Updated weights for policy 0, policy_version 46130 (0.0007) [2023-10-14 02:53:23,560][33226] Updated weights for policy 1, policy_version 46550 (0.0008) [2023-10-14 02:53:23,842][33201] Updated weights for policy 0, policy_version 46140 (0.0007) [2023-10-14 02:53:23,922][33226] Updated weights for policy 1, policy_version 46560 (0.0009) [2023-10-14 02:53:24,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 94928896. 
Throughput: 0: 1738.9, 1: 1774.8. Samples: 23736628. Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 02:53:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000046144_47251456.pth... [2023-10-14 02:53:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000046560_47677440.pth... [2023-10-14 02:53:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000044896_45973504.pth [2023-10-14 02:53:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000044480_45547520.pth [2023-10-14 02:53:27,691][33201] Updated weights for policy 0, policy_version 46150 (0.0008) [2023-10-14 02:53:27,692][33226] Updated weights for policy 1, policy_version 46570 (0.0008) [2023-10-14 02:53:28,053][33226] Updated weights for policy 1, policy_version 46580 (0.0008) [2023-10-14 02:53:28,067][33201] Updated weights for policy 0, policy_version 46160 (0.0007) [2023-10-14 02:53:28,424][33226] Updated weights for policy 1, policy_version 46590 (0.0008) [2023-10-14 02:53:28,440][33201] Updated weights for policy 0, policy_version 46170 (0.0009) [2023-10-14 02:53:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 94994432. Throughput: 0: 1774.2, 1: 1786.6. Samples: 23749028. 
Policy #0 lag: (min: 1.0, avg: 5.0, max: 33.0) [2023-10-14 02:53:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:53:32,147][33201] Updated weights for policy 0, policy_version 46180 (0.0009) [2023-10-14 02:53:32,192][33226] Updated weights for policy 1, policy_version 46600 (0.0009) [2023-10-14 02:53:32,512][33201] Updated weights for policy 0, policy_version 46190 (0.0009) [2023-10-14 02:53:32,552][33226] Updated weights for policy 1, policy_version 46610 (0.0009) [2023-10-14 02:53:32,881][33201] Updated weights for policy 0, policy_version 46200 (0.0007) [2023-10-14 02:53:32,916][33226] Updated weights for policy 1, policy_version 46620 (0.0010) [2023-10-14 02:53:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95059968. Throughput: 0: 1747.8, 1: 1775.4. Samples: 23768700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 02:53:36,826][33201] Updated weights for policy 0, policy_version 46210 (0.0007) [2023-10-14 02:53:36,855][33226] Updated weights for policy 1, policy_version 46630 (0.0009) [2023-10-14 02:53:37,199][33201] Updated weights for policy 0, policy_version 46220 (0.0008) [2023-10-14 02:53:37,231][33226] Updated weights for policy 1, policy_version 46640 (0.0007) [2023-10-14 02:53:37,569][33201] Updated weights for policy 0, policy_version 46230 (0.0008) [2023-10-14 02:53:37,591][33226] Updated weights for policy 1, policy_version 46650 (0.0009) [2023-10-14 02:53:37,933][33201] Updated weights for policy 0, policy_version 46240 (0.0008) [2023-10-14 02:53:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95125504. Throughput: 0: 1746.3, 1: 1764.0. Samples: 23790110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:39,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:53:41,596][33226] Updated weights for policy 1, policy_version 46660 (0.0007) [2023-10-14 02:53:41,819][33201] Updated weights for policy 0, policy_version 46250 (0.0009) [2023-10-14 02:53:41,956][33226] Updated weights for policy 1, policy_version 46670 (0.0007) [2023-10-14 02:53:42,187][33201] Updated weights for policy 0, policy_version 46260 (0.0007) [2023-10-14 02:53:42,331][33226] Updated weights for policy 1, policy_version 46680 (0.0007) [2023-10-14 02:53:42,554][33201] Updated weights for policy 0, policy_version 46270 (0.0009) [2023-10-14 02:53:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 95191040. Throughput: 0: 1760.4, 1: 1780.4. Samples: 23801060. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:53:46,081][33226] Updated weights for policy 1, policy_version 46690 (0.0010) [2023-10-14 02:53:46,348][33201] Updated weights for policy 0, policy_version 46280 (0.0009) [2023-10-14 02:53:46,444][33226] Updated weights for policy 1, policy_version 46700 (0.0009) [2023-10-14 02:53:46,711][33201] Updated weights for policy 0, policy_version 46290 (0.0007) [2023-10-14 02:53:46,813][33226] Updated weights for policy 1, policy_version 46710 (0.0008) [2023-10-14 02:53:47,084][33201] Updated weights for policy 0, policy_version 46300 (0.0008) [2023-10-14 02:53:47,172][33226] Updated weights for policy 1, policy_version 46720 (0.0009) [2023-10-14 02:53:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 95256576. Throughput: 0: 1744.6, 1: 1756.3. Samples: 23821828. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 02:53:50,864][33201] Updated weights for policy 0, policy_version 46310 (0.0007) [2023-10-14 02:53:51,015][33226] Updated weights for policy 1, policy_version 46730 (0.0007) [2023-10-14 02:53:51,231][33201] Updated weights for policy 0, policy_version 46320 (0.0007) [2023-10-14 02:53:51,379][33226] Updated weights for policy 1, policy_version 46740 (0.0007) [2023-10-14 02:53:51,603][33201] Updated weights for policy 0, policy_version 46330 (0.0009) [2023-10-14 02:53:51,746][33226] Updated weights for policy 1, policy_version 46750 (0.0008) [2023-10-14 02:53:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 95322112. Throughput: 0: 1754.5, 1: 1765.6. Samples: 23844028. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:54,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 02:53:55,443][33201] Updated weights for policy 0, policy_version 46340 (0.0008) [2023-10-14 02:53:55,521][33226] Updated weights for policy 1, policy_version 46760 (0.0007) [2023-10-14 02:53:55,813][33201] Updated weights for policy 0, policy_version 46350 (0.0009) [2023-10-14 02:53:55,883][33226] Updated weights for policy 1, policy_version 46770 (0.0008) [2023-10-14 02:53:56,186][33201] Updated weights for policy 0, policy_version 46360 (0.0009) [2023-10-14 02:53:56,252][33226] Updated weights for policy 1, policy_version 46780 (0.0009) [2023-10-14 02:53:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 95387648. Throughput: 0: 1747.6, 1: 1761.6. Samples: 23853492. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:53:59,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.940')] [2023-10-14 02:53:59,895][33201] Updated weights for policy 0, policy_version 46370 (0.0008) [2023-10-14 02:54:00,084][33226] Updated weights for policy 1, policy_version 46790 (0.0007) [2023-10-14 02:54:00,261][33201] Updated weights for policy 0, policy_version 46380 (0.0007) [2023-10-14 02:54:00,451][33226] Updated weights for policy 1, policy_version 46800 (0.0009) [2023-10-14 02:54:00,638][33201] Updated weights for policy 0, policy_version 46390 (0.0007) [2023-10-14 02:54:00,815][33226] Updated weights for policy 1, policy_version 46810 (0.0007) [2023-10-14 02:54:01,003][33201] Updated weights for policy 0, policy_version 46400 (0.0007) [2023-10-14 02:54:04,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 95453184. Throughput: 0: 1760.2, 1: 1756.4. Samples: 23875862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:54:04,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.940')] [2023-10-14 02:54:04,634][33226] Updated weights for policy 1, policy_version 46820 (0.0008) [2023-10-14 02:54:04,869][33201] Updated weights for policy 0, policy_version 46410 (0.0009) [2023-10-14 02:54:04,998][33226] Updated weights for policy 1, policy_version 46830 (0.0008) [2023-10-14 02:54:05,242][33201] Updated weights for policy 0, policy_version 46420 (0.0007) [2023-10-14 02:54:05,361][33226] Updated weights for policy 1, policy_version 46840 (0.0007) [2023-10-14 02:54:05,614][33201] Updated weights for policy 0, policy_version 46430 (0.0008) [2023-10-14 02:54:09,297][33226] Updated weights for policy 1, policy_version 46850 (0.0007) [2023-10-14 02:54:09,486][33201] Updated weights for policy 0, policy_version 46440 (0.0009) [2023-10-14 02:54:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 95518720. Throughput: 0: 1792.4, 1: 1782.3. 
Samples: 23897490. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:09,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 02:54:09,667][33226] Updated weights for policy 1, policy_version 46860 (0.0010) [2023-10-14 02:54:09,857][33201] Updated weights for policy 0, policy_version 46450 (0.0009) [2023-10-14 02:54:10,040][33226] Updated weights for policy 1, policy_version 46870 (0.0009) [2023-10-14 02:54:10,232][33201] Updated weights for policy 0, policy_version 46460 (0.0007) [2023-10-14 02:54:10,414][33226] Updated weights for policy 1, policy_version 46880 (0.0007) [2023-10-14 02:54:13,948][33201] Updated weights for policy 0, policy_version 46470 (0.0007) [2023-10-14 02:54:14,229][33226] Updated weights for policy 1, policy_version 46890 (0.0007) [2023-10-14 02:54:14,311][33201] Updated weights for policy 0, policy_version 46480 (0.0007) [2023-10-14 02:54:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 95584256. Throughput: 0: 1757.8, 1: 1753.1. Samples: 23907018. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 02:54:14,586][33226] Updated weights for policy 1, policy_version 46900 (0.0007) [2023-10-14 02:54:14,681][33201] Updated weights for policy 0, policy_version 46490 (0.0007) [2023-10-14 02:54:14,954][33226] Updated weights for policy 1, policy_version 46910 (0.0007) [2023-10-14 02:54:18,590][33201] Updated weights for policy 0, policy_version 46500 (0.0007) [2023-10-14 02:54:18,864][33226] Updated weights for policy 1, policy_version 46920 (0.0007) [2023-10-14 02:54:18,956][33201] Updated weights for policy 0, policy_version 46510 (0.0007) [2023-10-14 02:54:19,232][33226] Updated weights for policy 1, policy_version 46930 (0.0009) [2023-10-14 02:54:19,331][33201] Updated weights for policy 0, policy_version 46520 (0.0007) [2023-10-14 02:54:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13107.2, 300 sec: 14106.9). Total num frames: 95649792. Throughput: 0: 1784.2, 1: 1776.5. Samples: 23928930. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:19,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.840')] [2023-10-14 02:54:19,586][33226] Updated weights for policy 1, policy_version 46940 (0.0009) [2023-10-14 02:54:23,137][33201] Updated weights for policy 0, policy_version 46530 (0.0008) [2023-10-14 02:54:23,370][33226] Updated weights for policy 1, policy_version 46950 (0.0007) [2023-10-14 02:54:23,504][33201] Updated weights for policy 0, policy_version 46540 (0.0008) [2023-10-14 02:54:23,756][33226] Updated weights for policy 1, policy_version 46960 (0.0008) [2023-10-14 02:54:23,877][33201] Updated weights for policy 0, policy_version 46550 (0.0008) [2023-10-14 02:54:24,113][33226] Updated weights for policy 1, policy_version 46970 (0.0008) [2023-10-14 02:54:24,246][33201] Updated weights for policy 0, policy_version 46560 (0.0009) [2023-10-14 02:54:24,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95780864. Throughput: 0: 1760.4, 1: 1766.9. Samples: 23948840. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.840')] [2023-10-14 02:54:27,945][33226] Updated weights for policy 1, policy_version 46980 (0.0009) [2023-10-14 02:54:28,119][33201] Updated weights for policy 0, policy_version 46570 (0.0007) [2023-10-14 02:54:28,306][33226] Updated weights for policy 1, policy_version 46990 (0.0008) [2023-10-14 02:54:28,482][33201] Updated weights for policy 0, policy_version 46580 (0.0007) [2023-10-14 02:54:28,671][33226] Updated weights for policy 1, policy_version 47000 (0.0008) [2023-10-14 02:54:28,856][33201] Updated weights for policy 0, policy_version 46590 (0.0010) [2023-10-14 02:54:29,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95846400. Throughput: 0: 1768.9, 1: 1772.3. Samples: 23960412. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:29,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.840')] [2023-10-14 02:54:32,470][33226] Updated weights for policy 1, policy_version 47010 (0.0007) [2023-10-14 02:54:32,780][33201] Updated weights for policy 0, policy_version 46600 (0.0008) [2023-10-14 02:54:32,833][33226] Updated weights for policy 1, policy_version 47020 (0.0007) [2023-10-14 02:54:33,152][33201] Updated weights for policy 0, policy_version 46610 (0.0007) [2023-10-14 02:54:33,203][33226] Updated weights for policy 1, policy_version 47030 (0.0009) [2023-10-14 02:54:33,520][33201] Updated weights for policy 0, policy_version 46620 (0.0008) [2023-10-14 02:54:33,560][33226] Updated weights for policy 1, policy_version 47040 (0.0008) [2023-10-14 02:54:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95911936. Throughput: 0: 1760.6, 1: 1773.4. Samples: 23980856. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 02:54:34,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.810')] [2023-10-14 02:54:37,404][33201] Updated weights for policy 0, policy_version 46630 (0.0007) [2023-10-14 02:54:37,477][33226] Updated weights for policy 1, policy_version 47050 (0.0007) [2023-10-14 02:54:37,780][33201] Updated weights for policy 0, policy_version 46640 (0.0009) [2023-10-14 02:54:37,833][33226] Updated weights for policy 1, policy_version 47060 (0.0007) [2023-10-14 02:54:38,146][33201] Updated weights for policy 0, policy_version 46650 (0.0007) [2023-10-14 02:54:38,191][33226] Updated weights for policy 1, policy_version 47070 (0.0007) [2023-10-14 02:54:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 95977472. Throughput: 0: 1743.7, 1: 1746.8. Samples: 24001100. 
Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:54:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.810')] [2023-10-14 02:54:41,934][33201] Updated weights for policy 0, policy_version 46660 (0.0008) [2023-10-14 02:54:42,012][33226] Updated weights for policy 1, policy_version 47080 (0.0008) [2023-10-14 02:54:42,304][33201] Updated weights for policy 0, policy_version 46670 (0.0009) [2023-10-14 02:54:42,376][33226] Updated weights for policy 1, policy_version 47090 (0.0008) [2023-10-14 02:54:42,676][33201] Updated weights for policy 0, policy_version 46680 (0.0008) [2023-10-14 02:54:42,747][33226] Updated weights for policy 1, policy_version 47100 (0.0007) [2023-10-14 02:54:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 96043008. Throughput: 0: 1769.6, 1: 1774.9. Samples: 24012994. Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:54:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.810')] [2023-10-14 02:54:46,554][33226] Updated weights for policy 1, policy_version 47110 (0.0007) [2023-10-14 02:54:46,692][33201] Updated weights for policy 0, policy_version 46690 (0.0008) [2023-10-14 02:54:46,932][33226] Updated weights for policy 1, policy_version 47120 (0.0008) [2023-10-14 02:54:47,056][33201] Updated weights for policy 0, policy_version 46700 (0.0007) [2023-10-14 02:54:47,302][33226] Updated weights for policy 1, policy_version 47130 (0.0008) [2023-10-14 02:54:47,426][33201] Updated weights for policy 0, policy_version 46710 (0.0010) [2023-10-14 02:54:47,796][33201] Updated weights for policy 0, policy_version 46720 (0.0007) [2023-10-14 02:54:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 96108544. Throughput: 0: 1733.5, 1: 1751.8. Samples: 24032698. 
Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:54:49,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.810')] [2023-10-14 02:54:51,151][33226] Updated weights for policy 1, policy_version 47140 (0.0007) [2023-10-14 02:54:51,504][33226] Updated weights for policy 1, policy_version 47150 (0.0008) [2023-10-14 02:54:51,637][33201] Updated weights for policy 0, policy_version 46730 (0.0009) [2023-10-14 02:54:51,869][33226] Updated weights for policy 1, policy_version 47160 (0.0007) [2023-10-14 02:54:52,002][33201] Updated weights for policy 0, policy_version 46740 (0.0008) [2023-10-14 02:54:52,370][33201] Updated weights for policy 0, policy_version 46750 (0.0007) [2023-10-14 02:54:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 96174080. Throughput: 0: 1734.4, 1: 1757.8. Samples: 24054640. Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:54:54,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.810')] [2023-10-14 02:54:55,712][33226] Updated weights for policy 1, policy_version 47170 (0.0008) [2023-10-14 02:54:56,077][33226] Updated weights for policy 1, policy_version 47180 (0.0009) [2023-10-14 02:54:56,199][33201] Updated weights for policy 0, policy_version 46760 (0.0008) [2023-10-14 02:54:56,456][33226] Updated weights for policy 1, policy_version 47190 (0.0007) [2023-10-14 02:54:56,572][33201] Updated weights for policy 0, policy_version 46770 (0.0008) [2023-10-14 02:54:56,814][33226] Updated weights for policy 1, policy_version 47200 (0.0007) [2023-10-14 02:54:56,938][33201] Updated weights for policy 0, policy_version 46780 (0.0008) [2023-10-14 02:54:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 96239616. Throughput: 0: 1736.7, 1: 1756.1. Samples: 24064198. 
Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:54:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.810')] [2023-10-14 02:55:00,536][33226] Updated weights for policy 1, policy_version 47210 (0.0007) [2023-10-14 02:55:00,905][33226] Updated weights for policy 1, policy_version 47220 (0.0007) [2023-10-14 02:55:00,924][33201] Updated weights for policy 0, policy_version 46790 (0.0008) [2023-10-14 02:55:01,261][33226] Updated weights for policy 1, policy_version 47230 (0.0010) [2023-10-14 02:55:01,294][33201] Updated weights for policy 0, policy_version 46800 (0.0009) [2023-10-14 02:55:01,672][33201] Updated weights for policy 0, policy_version 46810 (0.0010) [2023-10-14 02:55:04,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 96305152. Throughput: 0: 1734.8, 1: 1762.3. Samples: 24086296. Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:55:04,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.810')] [2023-10-14 02:55:04,860][33226] Updated weights for policy 1, policy_version 47240 (0.0009) [2023-10-14 02:55:05,223][33226] Updated weights for policy 1, policy_version 47250 (0.0009) [2023-10-14 02:55:05,389][33201] Updated weights for policy 0, policy_version 46820 (0.0007) [2023-10-14 02:55:05,579][33226] Updated weights for policy 1, policy_version 47260 (0.0007) [2023-10-14 02:55:05,763][33201] Updated weights for policy 0, policy_version 46830 (0.0007) [2023-10-14 02:55:06,126][33201] Updated weights for policy 0, policy_version 46840 (0.0008) [2023-10-14 02:55:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 96370688. Throughput: 0: 1759.3, 1: 1785.4. Samples: 24108352. 
Policy #0 lag: (min: 8.0, avg: 22.3, max: 40.0) [2023-10-14 02:55:09,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.840')] [2023-10-14 02:55:09,574][33226] Updated weights for policy 1, policy_version 47270 (0.0010) [2023-10-14 02:55:09,949][33201] Updated weights for policy 0, policy_version 46850 (0.0009) [2023-10-14 02:55:09,962][33226] Updated weights for policy 1, policy_version 47280 (0.0009) [2023-10-14 02:55:10,316][33201] Updated weights for policy 0, policy_version 46860 (0.0007) [2023-10-14 02:55:10,333][33226] Updated weights for policy 1, policy_version 47290 (0.0010) [2023-10-14 02:55:10,689][33201] Updated weights for policy 0, policy_version 46870 (0.0011) [2023-10-14 02:55:11,064][33201] Updated weights for policy 0, policy_version 46880 (0.0010) [2023-10-14 02:55:14,011][33226] Updated weights for policy 1, policy_version 47300 (0.0007) [2023-10-14 02:55:14,375][33226] Updated weights for policy 1, policy_version 47310 (0.0008) [2023-10-14 02:55:14,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 96436224. Throughput: 0: 1731.4, 1: 1762.5. Samples: 24117636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.850')] [2023-10-14 02:55:14,742][33226] Updated weights for policy 1, policy_version 47320 (0.0009) [2023-10-14 02:55:15,022][33201] Updated weights for policy 0, policy_version 46890 (0.0007) [2023-10-14 02:55:15,398][33201] Updated weights for policy 0, policy_version 46900 (0.0009) [2023-10-14 02:55:15,765][33201] Updated weights for policy 0, policy_version 46910 (0.0010) [2023-10-14 02:55:18,462][33226] Updated weights for policy 1, policy_version 47330 (0.0008) [2023-10-14 02:55:18,832][33226] Updated weights for policy 1, policy_version 47340 (0.0008) [2023-10-14 02:55:19,206][33226] Updated weights for policy 1, policy_version 47350 (0.0009) [2023-10-14 02:55:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 96501760. Throughput: 0: 1749.6, 1: 1785.4. Samples: 24139932. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:19,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.850')] [2023-10-14 02:55:19,572][33226] Updated weights for policy 1, policy_version 47360 (0.0009) [2023-10-14 02:55:19,750][33201] Updated weights for policy 0, policy_version 46920 (0.0007) [2023-10-14 02:55:20,113][33201] Updated weights for policy 0, policy_version 46930 (0.0008) [2023-10-14 02:55:20,488][33201] Updated weights for policy 0, policy_version 46940 (0.0008) [2023-10-14 02:55:23,392][33226] Updated weights for policy 1, policy_version 47370 (0.0009) [2023-10-14 02:55:23,752][33226] Updated weights for policy 1, policy_version 47380 (0.0010) [2023-10-14 02:55:24,128][33226] Updated weights for policy 1, policy_version 47390 (0.0010) [2023-10-14 02:55:24,203][33201] Updated weights for policy 0, policy_version 46950 (0.0008) [2023-10-14 02:55:24,557][31953] Fps is (10 sec: 16384.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 96600064. Throughput: 0: 1767.5, 1: 1783.6. 
Samples: 24160898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:24,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.850')] [2023-10-14 02:55:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000047392_48529408.pth... [2023-10-14 02:55:24,572][33201] Updated weights for policy 0, policy_version 46960 (0.0008) [2023-10-14 02:55:24,604][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000045728_46825472.pth [2023-10-14 02:55:24,941][33201] Updated weights for policy 0, policy_version 46970 (0.0007) [2023-10-14 02:55:25,163][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000046976_48103424.pth... [2023-10-14 02:55:25,193][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000045312_46399488.pth [2023-10-14 02:55:27,883][33226] Updated weights for policy 1, policy_version 47400 (0.0009) [2023-10-14 02:55:28,245][33226] Updated weights for policy 1, policy_version 47410 (0.0010) [2023-10-14 02:55:28,606][33226] Updated weights for policy 1, policy_version 47420 (0.0010) [2023-10-14 02:55:28,728][33201] Updated weights for policy 0, policy_version 46980 (0.0008) [2023-10-14 02:55:29,100][33201] Updated weights for policy 0, policy_version 46990 (0.0007) [2023-10-14 02:55:29,470][33201] Updated weights for policy 0, policy_version 47000 (0.0009) [2023-10-14 02:55:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 96665600. Throughput: 0: 1744.3, 1: 1787.8. Samples: 24171936. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.850')] [2023-10-14 02:55:32,495][33226] Updated weights for policy 1, policy_version 47430 (0.0009) [2023-10-14 02:55:32,857][33226] Updated weights for policy 1, policy_version 47440 (0.0012) [2023-10-14 02:55:33,229][33226] Updated weights for policy 1, policy_version 47450 (0.0008) [2023-10-14 02:55:33,244][33201] Updated weights for policy 0, policy_version 47010 (0.0009) [2023-10-14 02:55:33,617][33201] Updated weights for policy 0, policy_version 47020 (0.0008) [2023-10-14 02:55:33,988][33201] Updated weights for policy 0, policy_version 47030 (0.0008) [2023-10-14 02:55:34,360][33201] Updated weights for policy 0, policy_version 47040 (0.0010) [2023-10-14 02:55:34,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 96763904. Throughput: 0: 1777.7, 1: 1794.2. Samples: 24193434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.850')] [2023-10-14 02:55:37,124][33226] Updated weights for policy 1, policy_version 47460 (0.0009) [2023-10-14 02:55:37,488][33226] Updated weights for policy 1, policy_version 47470 (0.0010) [2023-10-14 02:55:37,854][33226] Updated weights for policy 1, policy_version 47480 (0.0007) [2023-10-14 02:55:38,186][33201] Updated weights for policy 0, policy_version 47050 (0.0008) [2023-10-14 02:55:38,553][33201] Updated weights for policy 0, policy_version 47060 (0.0009) [2023-10-14 02:55:38,926][33201] Updated weights for policy 0, policy_version 47070 (0.0007) [2023-10-14 02:55:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 96829440. Throughput: 0: 1748.7, 1: 1778.7. Samples: 24213372. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 02:55:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.840')] [2023-10-14 02:55:41,647][33226] Updated weights for policy 1, policy_version 47490 (0.0007) [2023-10-14 02:55:42,013][33226] Updated weights for policy 1, policy_version 47500 (0.0008) [2023-10-14 02:55:42,376][33226] Updated weights for policy 1, policy_version 47510 (0.0007) [2023-10-14 02:55:42,743][33226] Updated weights for policy 1, policy_version 47520 (0.0007) [2023-10-14 02:55:42,832][33201] Updated weights for policy 0, policy_version 47080 (0.0009) [2023-10-14 02:55:43,218][33201] Updated weights for policy 0, policy_version 47090 (0.0011) [2023-10-14 02:55:43,589][33201] Updated weights for policy 0, policy_version 47100 (0.0010) [2023-10-14 02:55:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 96894976. Throughput: 0: 1779.8, 1: 1804.5. Samples: 24225488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:55:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.820')] [2023-10-14 02:55:46,347][33226] Updated weights for policy 1, policy_version 47530 (0.0008) [2023-10-14 02:55:46,711][33226] Updated weights for policy 1, policy_version 47540 (0.0008) [2023-10-14 02:55:47,078][33226] Updated weights for policy 1, policy_version 47550 (0.0009) [2023-10-14 02:55:47,202][33201] Updated weights for policy 0, policy_version 47110 (0.0009) [2023-10-14 02:55:47,570][33201] Updated weights for policy 0, policy_version 47120 (0.0010) [2023-10-14 02:55:47,942][33201] Updated weights for policy 0, policy_version 47130 (0.0010) [2023-10-14 02:55:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 96960512. Throughput: 0: 1759.6, 1: 1778.8. Samples: 24245528. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:55:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.820')] [2023-10-14 02:55:50,975][33226] Updated weights for policy 1, policy_version 47560 (0.0009) [2023-10-14 02:55:51,346][33226] Updated weights for policy 1, policy_version 47570 (0.0009) [2023-10-14 02:55:51,709][33226] Updated weights for policy 1, policy_version 47580 (0.0007) [2023-10-14 02:55:51,783][33201] Updated weights for policy 0, policy_version 47140 (0.0007) [2023-10-14 02:55:52,151][33201] Updated weights for policy 0, policy_version 47150 (0.0007) [2023-10-14 02:55:52,514][33201] Updated weights for policy 0, policy_version 47160 (0.0008) [2023-10-14 02:55:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 97026048. Throughput: 0: 1758.2, 1: 1779.5. Samples: 24267548. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:55:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.800')] [2023-10-14 02:55:55,456][33226] Updated weights for policy 1, policy_version 47590 (0.0009) [2023-10-14 02:55:55,835][33226] Updated weights for policy 1, policy_version 47600 (0.0007) [2023-10-14 02:55:56,194][33226] Updated weights for policy 1, policy_version 47610 (0.0009) [2023-10-14 02:55:56,447][33201] Updated weights for policy 0, policy_version 47170 (0.0008) [2023-10-14 02:55:56,815][33201] Updated weights for policy 0, policy_version 47180 (0.0008) [2023-10-14 02:55:57,183][33201] Updated weights for policy 0, policy_version 47190 (0.0008) [2023-10-14 02:55:57,547][33201] Updated weights for policy 0, policy_version 47200 (0.0009) [2023-10-14 02:55:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 97091584. Throughput: 0: 1775.8, 1: 1777.4. Samples: 24277532. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:55:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 02:56:00,083][33226] Updated weights for policy 1, policy_version 47620 (0.0009) [2023-10-14 02:56:00,444][33226] Updated weights for policy 1, policy_version 47630 (0.0009) [2023-10-14 02:56:00,813][33226] Updated weights for policy 1, policy_version 47640 (0.0007) [2023-10-14 02:56:01,354][33201] Updated weights for policy 0, policy_version 47210 (0.0009) [2023-10-14 02:56:01,730][33201] Updated weights for policy 0, policy_version 47220 (0.0009) [2023-10-14 02:56:02,102][33201] Updated weights for policy 0, policy_version 47230 (0.0007) [2023-10-14 02:56:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 97157120. Throughput: 0: 1769.5, 1: 1774.4. Samples: 24299408. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:56:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.900')] [2023-10-14 02:56:04,588][33226] Updated weights for policy 1, policy_version 47650 (0.0008) [2023-10-14 02:56:04,951][33226] Updated weights for policy 1, policy_version 47660 (0.0011) [2023-10-14 02:56:05,320][33226] Updated weights for policy 1, policy_version 47670 (0.0010) [2023-10-14 02:56:05,686][33226] Updated weights for policy 1, policy_version 47680 (0.0009) [2023-10-14 02:56:05,883][33201] Updated weights for policy 0, policy_version 47240 (0.0008) [2023-10-14 02:56:06,259][33201] Updated weights for policy 0, policy_version 47250 (0.0010) [2023-10-14 02:56:06,642][33201] Updated weights for policy 0, policy_version 47260 (0.0010) [2023-10-14 02:56:09,380][33226] Updated weights for policy 1, policy_version 47690 (0.0008) [2023-10-14 02:56:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 97222656. Throughput: 0: 1768.0, 1: 1803.1. Samples: 24321598. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:56:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.900')] [2023-10-14 02:56:09,743][33226] Updated weights for policy 1, policy_version 47700 (0.0007) [2023-10-14 02:56:10,108][33226] Updated weights for policy 1, policy_version 47710 (0.0007) [2023-10-14 02:56:10,482][33201] Updated weights for policy 0, policy_version 47270 (0.0010) [2023-10-14 02:56:10,850][33201] Updated weights for policy 0, policy_version 47280 (0.0010) [2023-10-14 02:56:11,229][33201] Updated weights for policy 0, policy_version 47290 (0.0010) [2023-10-14 02:56:13,859][33226] Updated weights for policy 1, policy_version 47720 (0.0009) [2023-10-14 02:56:14,221][33226] Updated weights for policy 1, policy_version 47730 (0.0010) [2023-10-14 02:56:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 97288192. Throughput: 0: 1764.3, 1: 1775.3. Samples: 24331218. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 02:56:14,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.900')] [2023-10-14 02:56:14,598][33226] Updated weights for policy 1, policy_version 47740 (0.0009) [2023-10-14 02:56:15,019][33201] Updated weights for policy 0, policy_version 47300 (0.0009) [2023-10-14 02:56:15,383][33201] Updated weights for policy 0, policy_version 47310 (0.0008) [2023-10-14 02:56:15,760][33201] Updated weights for policy 0, policy_version 47320 (0.0008) [2023-10-14 02:56:18,373][33226] Updated weights for policy 1, policy_version 47750 (0.0009) [2023-10-14 02:56:18,741][33226] Updated weights for policy 1, policy_version 47760 (0.0007) [2023-10-14 02:56:19,110][33226] Updated weights for policy 1, policy_version 47770 (0.0007) [2023-10-14 02:56:19,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 97386496. Throughput: 0: 1754.9, 1: 1798.3. Samples: 24353324. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 02:56:19,609][33201] Updated weights for policy 0, policy_version 47330 (0.0008) [2023-10-14 02:56:19,976][33201] Updated weights for policy 0, policy_version 47340 (0.0010) [2023-10-14 02:56:20,351][33201] Updated weights for policy 0, policy_version 47350 (0.0009) [2023-10-14 02:56:20,710][33201] Updated weights for policy 0, policy_version 47360 (0.0010) [2023-10-14 02:56:22,892][33226] Updated weights for policy 1, policy_version 47780 (0.0009) [2023-10-14 02:56:23,262][33226] Updated weights for policy 1, policy_version 47790 (0.0008) [2023-10-14 02:56:23,624][33226] Updated weights for policy 1, policy_version 47800 (0.0008) [2023-10-14 02:56:24,539][33201] Updated weights for policy 0, policy_version 47370 (0.0007) [2023-10-14 02:56:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 97452032. Throughput: 0: 1793.7, 1: 1782.9. Samples: 24374320. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:24,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.930')] [2023-10-14 02:56:24,905][33201] Updated weights for policy 0, policy_version 47380 (0.0010) [2023-10-14 02:56:25,273][33201] Updated weights for policy 0, policy_version 47390 (0.0008) [2023-10-14 02:56:27,370][33226] Updated weights for policy 1, policy_version 47810 (0.0010) [2023-10-14 02:56:27,744][33226] Updated weights for policy 1, policy_version 47820 (0.0009) [2023-10-14 02:56:28,104][33226] Updated weights for policy 1, policy_version 47830 (0.0010) [2023-10-14 02:56:28,470][33226] Updated weights for policy 1, policy_version 47840 (0.0010) [2023-10-14 02:56:29,168][33201] Updated weights for policy 0, policy_version 47400 (0.0009) [2023-10-14 02:56:29,530][33201] Updated weights for policy 0, policy_version 47410 (0.0007) [2023-10-14 02:56:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 97517568. Throughput: 0: 1761.6, 1: 1791.4. Samples: 24385372. Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:29,557][31953] Avg episode reward: [(0, '20.880'), (1, '20.930')] [2023-10-14 02:56:29,909][33201] Updated weights for policy 0, policy_version 47420 (0.0007) [2023-10-14 02:56:32,474][33226] Updated weights for policy 1, policy_version 47850 (0.0008) [2023-10-14 02:56:32,844][33226] Updated weights for policy 1, policy_version 47860 (0.0007) [2023-10-14 02:56:33,212][33226] Updated weights for policy 1, policy_version 47870 (0.0007) [2023-10-14 02:56:33,540][33201] Updated weights for policy 0, policy_version 47430 (0.0007) [2023-10-14 02:56:33,904][33201] Updated weights for policy 0, policy_version 47440 (0.0007) [2023-10-14 02:56:34,280][33201] Updated weights for policy 0, policy_version 47450 (0.0009) [2023-10-14 02:56:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 97615872. Throughput: 0: 1788.9, 1: 1788.4. 
Samples: 24406508. Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:34,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.930')] [2023-10-14 02:56:36,906][33226] Updated weights for policy 1, policy_version 47880 (0.0009) [2023-10-14 02:56:37,273][33226] Updated weights for policy 1, policy_version 47890 (0.0007) [2023-10-14 02:56:37,641][33226] Updated weights for policy 1, policy_version 47900 (0.0008) [2023-10-14 02:56:38,107][33201] Updated weights for policy 0, policy_version 47460 (0.0008) [2023-10-14 02:56:38,482][33201] Updated weights for policy 0, policy_version 47470 (0.0007) [2023-10-14 02:56:38,864][33201] Updated weights for policy 0, policy_version 47480 (0.0008) [2023-10-14 02:56:39,557][31953] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 97681408. Throughput: 0: 1764.0, 1: 1780.0. Samples: 24427030. Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.910')] [2023-10-14 02:56:41,557][33226] Updated weights for policy 1, policy_version 47910 (0.0009) [2023-10-14 02:56:41,936][33226] Updated weights for policy 1, policy_version 47920 (0.0009) [2023-10-14 02:56:42,301][33226] Updated weights for policy 1, policy_version 47930 (0.0009) [2023-10-14 02:56:42,973][33201] Updated weights for policy 0, policy_version 47490 (0.0009) [2023-10-14 02:56:43,334][33201] Updated weights for policy 0, policy_version 47500 (0.0010) [2023-10-14 02:56:43,706][33201] Updated weights for policy 0, policy_version 47510 (0.0007) [2023-10-14 02:56:44,078][33201] Updated weights for policy 0, policy_version 47520 (0.0008) [2023-10-14 02:56:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 97746944. Throughput: 0: 1780.3, 1: 1796.6. Samples: 24438492. 
Policy #0 lag: (min: 2.0, avg: 11.0, max: 34.0) [2023-10-14 02:56:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.910')] [2023-10-14 02:56:46,178][33226] Updated weights for policy 1, policy_version 47940 (0.0008) [2023-10-14 02:56:46,546][33226] Updated weights for policy 1, policy_version 47950 (0.0009) [2023-10-14 02:56:46,920][33226] Updated weights for policy 1, policy_version 47960 (0.0008) [2023-10-14 02:56:47,845][33201] Updated weights for policy 0, policy_version 47530 (0.0007) [2023-10-14 02:56:48,226][33201] Updated weights for policy 0, policy_version 47540 (0.0009) [2023-10-14 02:56:48,586][33201] Updated weights for policy 0, policy_version 47550 (0.0008) [2023-10-14 02:56:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 97812480. Throughput: 0: 1774.8, 1: 1774.3. Samples: 24459120. Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:56:49,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:56:50,694][33226] Updated weights for policy 1, policy_version 47970 (0.0009) [2023-10-14 02:56:51,062][33226] Updated weights for policy 1, policy_version 47980 (0.0009) [2023-10-14 02:56:51,431][33226] Updated weights for policy 1, policy_version 47990 (0.0008) [2023-10-14 02:56:51,802][33226] Updated weights for policy 1, policy_version 48000 (0.0009) [2023-10-14 02:56:52,490][33201] Updated weights for policy 0, policy_version 47560 (0.0008) [2023-10-14 02:56:52,849][33201] Updated weights for policy 0, policy_version 47570 (0.0008) [2023-10-14 02:56:53,226][33201] Updated weights for policy 0, policy_version 47580 (0.0008) [2023-10-14 02:56:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 97878016. Throughput: 0: 1762.3, 1: 1767.4. Samples: 24480432. 
Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:56:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:56:55,603][33226] Updated weights for policy 1, policy_version 48010 (0.0010) [2023-10-14 02:56:55,967][33226] Updated weights for policy 1, policy_version 48020 (0.0008) [2023-10-14 02:56:56,339][33226] Updated weights for policy 1, policy_version 48030 (0.0009) [2023-10-14 02:56:56,886][33201] Updated weights for policy 0, policy_version 47590 (0.0009) [2023-10-14 02:56:57,253][33201] Updated weights for policy 0, policy_version 47600 (0.0008) [2023-10-14 02:56:57,629][33201] Updated weights for policy 0, policy_version 47610 (0.0010) [2023-10-14 02:56:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 97943552. Throughput: 0: 1785.0, 1: 1767.3. Samples: 24491072. Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:56:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:56:59,972][33226] Updated weights for policy 1, policy_version 48040 (0.0010) [2023-10-14 02:57:00,340][33226] Updated weights for policy 1, policy_version 48050 (0.0012) [2023-10-14 02:57:00,713][33226] Updated weights for policy 1, policy_version 48060 (0.0009) [2023-10-14 02:57:01,321][33201] Updated weights for policy 0, policy_version 47620 (0.0009) [2023-10-14 02:57:01,677][33201] Updated weights for policy 0, policy_version 47630 (0.0010) [2023-10-14 02:57:02,047][33201] Updated weights for policy 0, policy_version 47640 (0.0008) [2023-10-14 02:57:04,393][33226] Updated weights for policy 1, policy_version 48070 (0.0007) [2023-10-14 02:57:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 98009088. Throughput: 0: 1773.5, 1: 1766.2. Samples: 24512610. 
Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:57:04,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:57:04,763][33226] Updated weights for policy 1, policy_version 48080 (0.0007) [2023-10-14 02:57:05,140][33226] Updated weights for policy 1, policy_version 48090 (0.0008) [2023-10-14 02:57:05,871][33201] Updated weights for policy 0, policy_version 47650 (0.0007) [2023-10-14 02:57:06,238][33201] Updated weights for policy 0, policy_version 47660 (0.0011) [2023-10-14 02:57:06,608][33201] Updated weights for policy 0, policy_version 47670 (0.0008) [2023-10-14 02:57:06,990][33201] Updated weights for policy 0, policy_version 47680 (0.0008) [2023-10-14 02:57:08,966][33226] Updated weights for policy 1, policy_version 48100 (0.0007) [2023-10-14 02:57:09,340][33226] Updated weights for policy 1, policy_version 48110 (0.0008) [2023-10-14 02:57:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 98074624. Throughput: 0: 1768.9, 1: 1797.5. Samples: 24534810. 
Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:57:09,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:57:09,704][33226] Updated weights for policy 1, policy_version 48120 (0.0007) [2023-10-14 02:57:10,790][33201] Updated weights for policy 0, policy_version 47690 (0.0009) [2023-10-14 02:57:11,159][33201] Updated weights for policy 0, policy_version 47700 (0.0011) [2023-10-14 02:57:11,531][33201] Updated weights for policy 0, policy_version 47710 (0.0010) [2023-10-14 02:57:13,440][33226] Updated weights for policy 1, policy_version 48130 (0.0008) [2023-10-14 02:57:13,816][33226] Updated weights for policy 1, policy_version 48140 (0.0009) [2023-10-14 02:57:14,180][33226] Updated weights for policy 1, policy_version 48150 (0.0008) [2023-10-14 02:57:14,543][33226] Updated weights for policy 1, policy_version 48160 (0.0007) [2023-10-14 02:57:14,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 98172928. Throughput: 0: 1770.0, 1: 1772.2. Samples: 24544770. Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:57:14,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:57:15,297][33201] Updated weights for policy 0, policy_version 47720 (0.0010) [2023-10-14 02:57:15,673][33201] Updated weights for policy 0, policy_version 47730 (0.0008) [2023-10-14 02:57:16,046][33201] Updated weights for policy 0, policy_version 47740 (0.0010) [2023-10-14 02:57:18,164][33226] Updated weights for policy 1, policy_version 48170 (0.0009) [2023-10-14 02:57:18,534][33226] Updated weights for policy 1, policy_version 48180 (0.0008) [2023-10-14 02:57:18,907][33226] Updated weights for policy 1, policy_version 48190 (0.0008) [2023-10-14 02:57:19,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 98238464. Throughput: 0: 1766.4, 1: 1795.9. Samples: 24566812. 
Policy #0 lag: (min: 17.0, avg: 34.5, max: 49.0) [2023-10-14 02:57:19,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.910')] [2023-10-14 02:57:19,843][33201] Updated weights for policy 0, policy_version 47750 (0.0007) [2023-10-14 02:57:20,214][33201] Updated weights for policy 0, policy_version 47760 (0.0008) [2023-10-14 02:57:20,585][33201] Updated weights for policy 0, policy_version 47770 (0.0008) [2023-10-14 02:57:22,638][33226] Updated weights for policy 1, policy_version 48200 (0.0007) [2023-10-14 02:57:23,004][33226] Updated weights for policy 1, policy_version 48210 (0.0007) [2023-10-14 02:57:23,368][33226] Updated weights for policy 1, policy_version 48220 (0.0011) [2023-10-14 02:57:24,476][33201] Updated weights for policy 0, policy_version 47780 (0.0007) [2023-10-14 02:57:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 98304000. Throughput: 0: 1794.8, 1: 1776.4. Samples: 24587732. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:24,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.920')] [2023-10-14 02:57:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000048224_49381376.pth... [2023-10-14 02:57:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000046560_47677440.pth [2023-10-14 02:57:24,855][33201] Updated weights for policy 0, policy_version 47790 (0.0008) [2023-10-14 02:57:25,220][33201] Updated weights for policy 0, policy_version 47800 (0.0007) [2023-10-14 02:57:25,513][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth... 
[2023-10-14 02:57:25,541][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000046144_47251456.pth [2023-10-14 02:57:27,210][33226] Updated weights for policy 1, policy_version 48230 (0.0009) [2023-10-14 02:57:27,592][33226] Updated weights for policy 1, policy_version 48240 (0.0011) [2023-10-14 02:57:27,956][33226] Updated weights for policy 1, policy_version 48250 (0.0010) [2023-10-14 02:57:29,039][33201] Updated weights for policy 0, policy_version 47810 (0.0007) [2023-10-14 02:57:29,407][33201] Updated weights for policy 0, policy_version 47820 (0.0008) [2023-10-14 02:57:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 98369536. Throughput: 0: 1764.3, 1: 1796.6. Samples: 24598730. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:29,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 02:57:29,779][33201] Updated weights for policy 0, policy_version 47830 (0.0010) [2023-10-14 02:57:30,148][33201] Updated weights for policy 0, policy_version 47840 (0.0009) [2023-10-14 02:57:31,808][33226] Updated weights for policy 1, policy_version 48260 (0.0009) [2023-10-14 02:57:32,185][33226] Updated weights for policy 1, policy_version 48270 (0.0008) [2023-10-14 02:57:32,553][33226] Updated weights for policy 1, policy_version 48280 (0.0008) [2023-10-14 02:57:33,831][33201] Updated weights for policy 0, policy_version 47850 (0.0007) [2023-10-14 02:57:34,199][33201] Updated weights for policy 0, policy_version 47860 (0.0007) [2023-10-14 02:57:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 98435072. Throughput: 0: 1781.0, 1: 1784.6. Samples: 24619574. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:34,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.950')] [2023-10-14 02:57:34,567][33201] Updated weights for policy 0, policy_version 47870 (0.0008) [2023-10-14 02:57:36,076][33226] Updated weights for policy 1, policy_version 48290 (0.0008) [2023-10-14 02:57:36,435][33226] Updated weights for policy 1, policy_version 48300 (0.0010) [2023-10-14 02:57:36,799][33226] Updated weights for policy 1, policy_version 48310 (0.0009) [2023-10-14 02:57:37,161][33226] Updated weights for policy 1, policy_version 48320 (0.0007) [2023-10-14 02:57:38,548][33201] Updated weights for policy 0, policy_version 47880 (0.0009) [2023-10-14 02:57:38,921][33201] Updated weights for policy 0, policy_version 47890 (0.0010) [2023-10-14 02:57:39,285][33201] Updated weights for policy 0, policy_version 47900 (0.0009) [2023-10-14 02:57:39,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 98533376. Throughput: 0: 1771.0, 1: 1795.6. Samples: 24640928. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:39,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.950')] [2023-10-14 02:57:40,872][33226] Updated weights for policy 1, policy_version 48330 (0.0007) [2023-10-14 02:57:41,244][33226] Updated weights for policy 1, policy_version 48340 (0.0007) [2023-10-14 02:57:41,604][33226] Updated weights for policy 1, policy_version 48350 (0.0008) [2023-10-14 02:57:43,095][33201] Updated weights for policy 0, policy_version 47910 (0.0008) [2023-10-14 02:57:43,464][33201] Updated weights for policy 0, policy_version 47920 (0.0011) [2023-10-14 02:57:43,839][33201] Updated weights for policy 0, policy_version 47930 (0.0010) [2023-10-14 02:57:44,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 98598912. Throughput: 0: 1769.8, 1: 1799.9. Samples: 24651706. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:44,557][31953] Avg episode reward: [(0, '20.850'), (1, '20.950')] [2023-10-14 02:57:45,282][33226] Updated weights for policy 1, policy_version 48360 (0.0010) [2023-10-14 02:57:45,648][33226] Updated weights for policy 1, policy_version 48370 (0.0007) [2023-10-14 02:57:46,022][33226] Updated weights for policy 1, policy_version 48380 (0.0007) [2023-10-14 02:57:47,745][33201] Updated weights for policy 0, policy_version 47940 (0.0008) [2023-10-14 02:57:48,115][33201] Updated weights for policy 0, policy_version 47950 (0.0008) [2023-10-14 02:57:48,486][33201] Updated weights for policy 0, policy_version 47960 (0.0010) [2023-10-14 02:57:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 98664448. Throughput: 0: 1770.9, 1: 1797.5. Samples: 24673188. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-10-14 02:57:49,557][31953] Avg episode reward: [(0, '20.620'), (1, '20.960')] [2023-10-14 02:57:49,921][33226] Updated weights for policy 1, policy_version 48390 (0.0008) [2023-10-14 02:57:50,292][33226] Updated weights for policy 1, policy_version 48400 (0.0008) [2023-10-14 02:57:50,660][33226] Updated weights for policy 1, policy_version 48410 (0.0007) [2023-10-14 02:57:52,306][33201] Updated weights for policy 0, policy_version 47970 (0.0008) [2023-10-14 02:57:52,679][33201] Updated weights for policy 0, policy_version 47980 (0.0009) [2023-10-14 02:57:53,045][33201] Updated weights for policy 0, policy_version 47990 (0.0007) [2023-10-14 02:57:53,421][33201] Updated weights for policy 0, policy_version 48000 (0.0007) [2023-10-14 02:57:54,319][33226] Updated weights for policy 1, policy_version 48420 (0.0007) [2023-10-14 02:57:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 98729984. Throughput: 0: 1749.0, 1: 1808.7. Samples: 24694906. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:57:54,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.950')] [2023-10-14 02:57:54,686][33226] Updated weights for policy 1, policy_version 48430 (0.0010) [2023-10-14 02:57:55,055][33226] Updated weights for policy 1, policy_version 48440 (0.0009) [2023-10-14 02:57:57,369][33201] Updated weights for policy 0, policy_version 48010 (0.0008) [2023-10-14 02:57:57,740][33201] Updated weights for policy 0, policy_version 48020 (0.0011) [2023-10-14 02:57:58,118][33201] Updated weights for policy 0, policy_version 48030 (0.0009) [2023-10-14 02:57:58,869][33226] Updated weights for policy 1, policy_version 48450 (0.0008) [2023-10-14 02:57:59,242][33226] Updated weights for policy 1, policy_version 48460 (0.0011) [2023-10-14 02:57:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 98795520. Throughput: 0: 1774.4, 1: 1799.8. Samples: 24705606. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:57:59,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.950')] [2023-10-14 02:57:59,606][33226] Updated weights for policy 1, policy_version 48470 (0.0010) [2023-10-14 02:57:59,970][33226] Updated weights for policy 1, policy_version 48480 (0.0011) [2023-10-14 02:58:01,904][33201] Updated weights for policy 0, policy_version 48040 (0.0010) [2023-10-14 02:58:02,269][33201] Updated weights for policy 0, policy_version 48050 (0.0010) [2023-10-14 02:58:02,647][33201] Updated weights for policy 0, policy_version 48060 (0.0009) [2023-10-14 02:58:03,765][33226] Updated weights for policy 1, policy_version 48490 (0.0008) [2023-10-14 02:58:04,125][33226] Updated weights for policy 1, policy_version 48500 (0.0007) [2023-10-14 02:58:04,500][33226] Updated weights for policy 1, policy_version 48510 (0.0009) [2023-10-14 02:58:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 98861056. Throughput: 0: 1746.4, 1: 1805.7. 
Samples: 24726654. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:58:04,557][31953] Avg episode reward: [(0, '20.620'), (1, '20.910')] [2023-10-14 02:58:06,688][33201] Updated weights for policy 0, policy_version 48070 (0.0009) [2023-10-14 02:58:07,075][33201] Updated weights for policy 0, policy_version 48080 (0.0010) [2023-10-14 02:58:07,443][33201] Updated weights for policy 0, policy_version 48090 (0.0008) [2023-10-14 02:58:08,282][33226] Updated weights for policy 1, policy_version 48520 (0.0010) [2023-10-14 02:58:08,657][33226] Updated weights for policy 1, policy_version 48530 (0.0008) [2023-10-14 02:58:09,009][33226] Updated weights for policy 1, policy_version 48540 (0.0009) [2023-10-14 02:58:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 98959360. Throughput: 0: 1740.8, 1: 1804.0. Samples: 24747246. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:58:09,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.880')] [2023-10-14 02:58:11,282][33201] Updated weights for policy 0, policy_version 48100 (0.0008) [2023-10-14 02:58:11,651][33201] Updated weights for policy 0, policy_version 48110 (0.0011) [2023-10-14 02:58:12,026][33201] Updated weights for policy 0, policy_version 48120 (0.0009) [2023-10-14 02:58:12,766][33226] Updated weights for policy 1, policy_version 48550 (0.0008) [2023-10-14 02:58:13,146][33226] Updated weights for policy 1, policy_version 48560 (0.0009) [2023-10-14 02:58:13,516][33226] Updated weights for policy 1, policy_version 48570 (0.0008) [2023-10-14 02:58:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 99024896. Throughput: 0: 1748.3, 1: 1800.5. Samples: 24758426. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:58:14,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.880')] [2023-10-14 02:58:15,656][33201] Updated weights for policy 0, policy_version 48130 (0.0008) [2023-10-14 02:58:16,024][33201] Updated weights for policy 0, policy_version 48140 (0.0010) [2023-10-14 02:58:16,394][33201] Updated weights for policy 0, policy_version 48150 (0.0009) [2023-10-14 02:58:16,762][33201] Updated weights for policy 0, policy_version 48160 (0.0007) [2023-10-14 02:58:17,309][33226] Updated weights for policy 1, policy_version 48580 (0.0007) [2023-10-14 02:58:17,672][33226] Updated weights for policy 1, policy_version 48590 (0.0008) [2023-10-14 02:58:18,042][33226] Updated weights for policy 1, policy_version 48600 (0.0009) [2023-10-14 02:58:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 99090432. Throughput: 0: 1739.6, 1: 1809.0. Samples: 24779262. Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:58:19,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.880')] [2023-10-14 02:58:20,503][33201] Updated weights for policy 0, policy_version 48170 (0.0009) [2023-10-14 02:58:20,875][33201] Updated weights for policy 0, policy_version 48180 (0.0007) [2023-10-14 02:58:21,250][33201] Updated weights for policy 0, policy_version 48190 (0.0008) [2023-10-14 02:58:21,953][33226] Updated weights for policy 1, policy_version 48610 (0.0009) [2023-10-14 02:58:22,313][33226] Updated weights for policy 1, policy_version 48620 (0.0008) [2023-10-14 02:58:22,684][33226] Updated weights for policy 1, policy_version 48630 (0.0008) [2023-10-14 02:58:23,053][33226] Updated weights for policy 1, policy_version 48640 (0.0008) [2023-10-14 02:58:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 99155968. Throughput: 0: 1770.6, 1: 1787.9. Samples: 24801062. 
Policy #0 lag: (min: 17.0, avg: 27.2, max: 49.0) [2023-10-14 02:58:24,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.890')] [2023-10-14 02:58:25,060][33201] Updated weights for policy 0, policy_version 48200 (0.0007) [2023-10-14 02:58:25,429][33201] Updated weights for policy 0, policy_version 48210 (0.0007) [2023-10-14 02:58:25,798][33201] Updated weights for policy 0, policy_version 48220 (0.0008) [2023-10-14 02:58:26,726][33226] Updated weights for policy 1, policy_version 48650 (0.0007) [2023-10-14 02:58:27,091][33226] Updated weights for policy 1, policy_version 48660 (0.0008) [2023-10-14 02:58:27,455][33226] Updated weights for policy 1, policy_version 48670 (0.0009) [2023-10-14 02:58:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 99221504. Throughput: 0: 1748.4, 1: 1802.4. Samples: 24811494. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:29,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.860')] [2023-10-14 02:58:29,661][33201] Updated weights for policy 0, policy_version 48230 (0.0008) [2023-10-14 02:58:30,024][33201] Updated weights for policy 0, policy_version 48240 (0.0007) [2023-10-14 02:58:30,400][33201] Updated weights for policy 0, policy_version 48250 (0.0009) [2023-10-14 02:58:31,245][33226] Updated weights for policy 1, policy_version 48680 (0.0009) [2023-10-14 02:58:31,616][33226] Updated weights for policy 1, policy_version 48690 (0.0009) [2023-10-14 02:58:31,988][33226] Updated weights for policy 1, policy_version 48700 (0.0009) [2023-10-14 02:58:34,157][33201] Updated weights for policy 0, policy_version 48260 (0.0009) [2023-10-14 02:58:34,532][33201] Updated weights for policy 0, policy_version 48270 (0.0010) [2023-10-14 02:58:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 99287040. Throughput: 0: 1765.9, 1: 1784.9. Samples: 24832974. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:34,557][31953] Avg episode reward: [(0, '20.650'), (1, '20.850')] [2023-10-14 02:58:34,898][33201] Updated weights for policy 0, policy_version 48280 (0.0008) [2023-10-14 02:58:35,768][33226] Updated weights for policy 1, policy_version 48710 (0.0008) [2023-10-14 02:58:36,138][33226] Updated weights for policy 1, policy_version 48720 (0.0009) [2023-10-14 02:58:36,502][33226] Updated weights for policy 1, policy_version 48730 (0.0007) [2023-10-14 02:58:38,700][33201] Updated weights for policy 0, policy_version 48290 (0.0010) [2023-10-14 02:58:39,084][33201] Updated weights for policy 0, policy_version 48300 (0.0009) [2023-10-14 02:58:39,453][33201] Updated weights for policy 0, policy_version 48310 (0.0008) [2023-10-14 02:58:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 99352576. Throughput: 0: 1769.2, 1: 1777.6. Samples: 24854512. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:39,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.840')] [2023-10-14 02:58:39,824][33201] Updated weights for policy 0, policy_version 48320 (0.0008) [2023-10-14 02:58:40,301][33226] Updated weights for policy 1, policy_version 48740 (0.0009) [2023-10-14 02:58:40,676][33226] Updated weights for policy 1, policy_version 48750 (0.0010) [2023-10-14 02:58:41,056][33226] Updated weights for policy 1, policy_version 48760 (0.0009) [2023-10-14 02:58:43,636][33201] Updated weights for policy 0, policy_version 48330 (0.0008) [2023-10-14 02:58:43,999][33201] Updated weights for policy 0, policy_version 48340 (0.0009) [2023-10-14 02:58:44,366][33201] Updated weights for policy 0, policy_version 48350 (0.0010) [2023-10-14 02:58:44,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 99450880. Throughput: 0: 1757.8, 1: 1778.4. Samples: 24864734. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:44,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.820')] [2023-10-14 02:58:44,744][33226] Updated weights for policy 1, policy_version 48770 (0.0008) [2023-10-14 02:58:45,113][33226] Updated weights for policy 1, policy_version 48780 (0.0010) [2023-10-14 02:58:45,479][33226] Updated weights for policy 1, policy_version 48790 (0.0009) [2023-10-14 02:58:45,844][33226] Updated weights for policy 1, policy_version 48800 (0.0007) [2023-10-14 02:58:48,247][33201] Updated weights for policy 0, policy_version 48360 (0.0008) [2023-10-14 02:58:48,611][33201] Updated weights for policy 0, policy_version 48370 (0.0010) [2023-10-14 02:58:48,982][33201] Updated weights for policy 0, policy_version 48380 (0.0011) [2023-10-14 02:58:49,554][33226] Updated weights for policy 1, policy_version 48810 (0.0012) [2023-10-14 02:58:49,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 99516416. Throughput: 0: 1783.3, 1: 1774.1. Samples: 24886736. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:49,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.820')] [2023-10-14 02:58:49,926][33226] Updated weights for policy 1, policy_version 48820 (0.0009) [2023-10-14 02:58:50,300][33226] Updated weights for policy 1, policy_version 48830 (0.0011) [2023-10-14 02:58:52,834][33201] Updated weights for policy 0, policy_version 48390 (0.0009) [2023-10-14 02:58:53,213][33201] Updated weights for policy 0, policy_version 48400 (0.0009) [2023-10-14 02:58:53,588][33201] Updated weights for policy 0, policy_version 48410 (0.0008) [2023-10-14 02:58:54,067][33226] Updated weights for policy 1, policy_version 48840 (0.0008) [2023-10-14 02:58:54,424][33226] Updated weights for policy 1, policy_version 48850 (0.0010) [2023-10-14 02:58:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 99581952. Throughput: 0: 1759.9, 1: 1796.4. 
Samples: 24907276. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 02:58:54,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.800')] [2023-10-14 02:58:54,789][33226] Updated weights for policy 1, policy_version 48860 (0.0011) [2023-10-14 02:58:57,193][33201] Updated weights for policy 0, policy_version 48420 (0.0007) [2023-10-14 02:58:57,568][33201] Updated weights for policy 0, policy_version 48430 (0.0008) [2023-10-14 02:58:57,941][33201] Updated weights for policy 0, policy_version 48440 (0.0007) [2023-10-14 02:58:58,725][33226] Updated weights for policy 1, policy_version 48870 (0.0008) [2023-10-14 02:58:59,125][33226] Updated weights for policy 1, policy_version 48880 (0.0009) [2023-10-14 02:58:59,504][33226] Updated weights for policy 1, policy_version 48890 (0.0009) [2023-10-14 02:58:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 99647488. Throughput: 0: 1789.3, 1: 1773.0. Samples: 24918732. Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:58:59,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.790')] [2023-10-14 02:59:01,778][33201] Updated weights for policy 0, policy_version 48450 (0.0007) [2023-10-14 02:59:02,143][33201] Updated weights for policy 0, policy_version 48460 (0.0010) [2023-10-14 02:59:02,519][33201] Updated weights for policy 0, policy_version 48470 (0.0010) [2023-10-14 02:59:02,888][33201] Updated weights for policy 0, policy_version 48480 (0.0007) [2023-10-14 02:59:03,377][33226] Updated weights for policy 1, policy_version 48900 (0.0011) [2023-10-14 02:59:03,748][33226] Updated weights for policy 1, policy_version 48910 (0.0008) [2023-10-14 02:59:04,115][33226] Updated weights for policy 1, policy_version 48920 (0.0008) [2023-10-14 02:59:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 99745792. Throughput: 0: 1768.2, 1: 1794.9. Samples: 24939602. 
Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:04,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.790')] [2023-10-14 02:59:06,728][33201] Updated weights for policy 0, policy_version 48490 (0.0010) [2023-10-14 02:59:07,094][33201] Updated weights for policy 0, policy_version 48500 (0.0011) [2023-10-14 02:59:07,469][33201] Updated weights for policy 0, policy_version 48510 (0.0010) [2023-10-14 02:59:07,853][33226] Updated weights for policy 1, policy_version 48930 (0.0008) [2023-10-14 02:59:08,212][33226] Updated weights for policy 1, policy_version 48940 (0.0007) [2023-10-14 02:59:08,578][33226] Updated weights for policy 1, policy_version 48950 (0.0007) [2023-10-14 02:59:08,941][33226] Updated weights for policy 1, policy_version 48960 (0.0008) [2023-10-14 02:59:09,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 99811328. Throughput: 0: 1763.0, 1: 1778.1. Samples: 24960410. Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:09,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.790')] [2023-10-14 02:59:11,384][33201] Updated weights for policy 0, policy_version 48520 (0.0009) [2023-10-14 02:59:11,748][33201] Updated weights for policy 0, policy_version 48530 (0.0011) [2023-10-14 02:59:12,120][33201] Updated weights for policy 0, policy_version 48540 (0.0010) [2023-10-14 02:59:12,849][33226] Updated weights for policy 1, policy_version 48970 (0.0007) [2023-10-14 02:59:13,212][33226] Updated weights for policy 1, policy_version 48980 (0.0009) [2023-10-14 02:59:13,578][33226] Updated weights for policy 1, policy_version 48990 (0.0010) [2023-10-14 02:59:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 99876864. Throughput: 0: 1768.8, 1: 1786.9. Samples: 24971504. 
Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:14,558][31953] Avg episode reward: [(0, '20.370'), (1, '20.790')] [2023-10-14 02:59:15,962][33201] Updated weights for policy 0, policy_version 48550 (0.0008) [2023-10-14 02:59:16,330][33201] Updated weights for policy 0, policy_version 48560 (0.0009) [2023-10-14 02:59:16,705][33201] Updated weights for policy 0, policy_version 48570 (0.0011) [2023-10-14 02:59:17,320][33226] Updated weights for policy 1, policy_version 49000 (0.0010) [2023-10-14 02:59:17,675][33226] Updated weights for policy 1, policy_version 49010 (0.0009) [2023-10-14 02:59:18,048][33226] Updated weights for policy 1, policy_version 49020 (0.0009) [2023-10-14 02:59:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 99942400. Throughput: 0: 1760.2, 1: 1779.5. Samples: 24992258. Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:19,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.770')] [2023-10-14 02:59:20,532][33201] Updated weights for policy 0, policy_version 48580 (0.0009) [2023-10-14 02:59:20,900][33201] Updated weights for policy 0, policy_version 48590 (0.0009) [2023-10-14 02:59:21,268][33201] Updated weights for policy 0, policy_version 48600 (0.0008) [2023-10-14 02:59:21,719][33226] Updated weights for policy 1, policy_version 49030 (0.0009) [2023-10-14 02:59:22,082][33226] Updated weights for policy 1, policy_version 49040 (0.0008) [2023-10-14 02:59:22,453][33226] Updated weights for policy 1, policy_version 49050 (0.0009) [2023-10-14 02:59:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 100007936. Throughput: 0: 1777.6, 1: 1772.2. Samples: 25014254. Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:24,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.770')] [2023-10-14 02:59:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000049056_50233344.pth... 
[2023-10-14 02:59:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000048608_49774592.pth... [2023-10-14 02:59:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000046976_48103424.pth [2023-10-14 02:59:24,609][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000047392_48529408.pth [2023-10-14 02:59:25,061][33201] Updated weights for policy 0, policy_version 48610 (0.0009) [2023-10-14 02:59:25,431][33201] Updated weights for policy 0, policy_version 48620 (0.0007) [2023-10-14 02:59:25,804][33201] Updated weights for policy 0, policy_version 48630 (0.0007) [2023-10-14 02:59:26,176][33201] Updated weights for policy 0, policy_version 48640 (0.0009) [2023-10-14 02:59:26,282][33226] Updated weights for policy 1, policy_version 49060 (0.0010) [2023-10-14 02:59:26,648][33226] Updated weights for policy 1, policy_version 49070 (0.0009) [2023-10-14 02:59:27,016][33226] Updated weights for policy 1, policy_version 49080 (0.0007) [2023-10-14 02:59:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 100073472. Throughput: 0: 1761.0, 1: 1785.6. Samples: 25024330. 
Policy #0 lag: (min: 30.0, avg: 33.8, max: 62.0) [2023-10-14 02:59:29,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.790')] [2023-10-14 02:59:30,050][33201] Updated weights for policy 0, policy_version 48650 (0.0009) [2023-10-14 02:59:30,414][33201] Updated weights for policy 0, policy_version 48660 (0.0007) [2023-10-14 02:59:30,791][33201] Updated weights for policy 0, policy_version 48670 (0.0007) [2023-10-14 02:59:30,857][33226] Updated weights for policy 1, policy_version 49090 (0.0007) [2023-10-14 02:59:31,221][33226] Updated weights for policy 1, policy_version 49100 (0.0008) [2023-10-14 02:59:31,580][33226] Updated weights for policy 1, policy_version 49110 (0.0010) [2023-10-14 02:59:31,944][33226] Updated weights for policy 1, policy_version 49120 (0.0009) [2023-10-14 02:59:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 100139008. Throughput: 0: 1760.1, 1: 1771.0. Samples: 25045636. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:34,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.790')] [2023-10-14 02:59:34,574][33201] Updated weights for policy 0, policy_version 48680 (0.0009) [2023-10-14 02:59:34,951][33201] Updated weights for policy 0, policy_version 48690 (0.0011) [2023-10-14 02:59:35,310][33201] Updated weights for policy 0, policy_version 48700 (0.0010) [2023-10-14 02:59:35,729][33226] Updated weights for policy 1, policy_version 49130 (0.0010) [2023-10-14 02:59:36,096][33226] Updated weights for policy 1, policy_version 49140 (0.0010) [2023-10-14 02:59:36,459][33226] Updated weights for policy 1, policy_version 49150 (0.0010) [2023-10-14 02:59:39,353][33201] Updated weights for policy 0, policy_version 48710 (0.0007) [2023-10-14 02:59:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 100204544. Throughput: 0: 1785.4, 1: 1774.3. Samples: 25067462. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:39,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.800')] [2023-10-14 02:59:39,729][33201] Updated weights for policy 0, policy_version 48720 (0.0008) [2023-10-14 02:59:40,097][33201] Updated weights for policy 0, policy_version 48730 (0.0008) [2023-10-14 02:59:40,170][33226] Updated weights for policy 1, policy_version 49160 (0.0009) [2023-10-14 02:59:40,530][33226] Updated weights for policy 1, policy_version 49170 (0.0010) [2023-10-14 02:59:40,904][33226] Updated weights for policy 1, policy_version 49180 (0.0011) [2023-10-14 02:59:44,034][33201] Updated weights for policy 0, policy_version 48740 (0.0009) [2023-10-14 02:59:44,402][33201] Updated weights for policy 0, policy_version 48750 (0.0011) [2023-10-14 02:59:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 100270080. Throughput: 0: 1744.4, 1: 1771.8. Samples: 25076960. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:44,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.800')] [2023-10-14 02:59:44,775][33201] Updated weights for policy 0, policy_version 48760 (0.0007) [2023-10-14 02:59:44,887][33226] Updated weights for policy 1, policy_version 49190 (0.0010) [2023-10-14 02:59:45,265][33226] Updated weights for policy 1, policy_version 49200 (0.0008) [2023-10-14 02:59:45,641][33226] Updated weights for policy 1, policy_version 49210 (0.0008) [2023-10-14 02:59:48,611][33201] Updated weights for policy 0, policy_version 48770 (0.0008) [2023-10-14 02:59:48,991][33201] Updated weights for policy 0, policy_version 48780 (0.0009) [2023-10-14 02:59:49,357][33201] Updated weights for policy 0, policy_version 48790 (0.0007) [2023-10-14 02:59:49,423][33226] Updated weights for policy 1, policy_version 49220 (0.0008) [2023-10-14 02:59:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 100335616. Throughput: 0: 1775.1, 1: 1767.5. 
Samples: 25099020. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:49,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.840')] [2023-10-14 02:59:49,725][33201] Updated weights for policy 0, policy_version 48800 (0.0008) [2023-10-14 02:59:49,791][33226] Updated weights for policy 1, policy_version 49230 (0.0007) [2023-10-14 02:59:50,155][33226] Updated weights for policy 1, policy_version 49240 (0.0010) [2023-10-14 02:59:53,376][33201] Updated weights for policy 0, policy_version 48810 (0.0009) [2023-10-14 02:59:53,752][33201] Updated weights for policy 0, policy_version 48820 (0.0011) [2023-10-14 02:59:54,008][33226] Updated weights for policy 1, policy_version 49250 (0.0009) [2023-10-14 02:59:54,115][33201] Updated weights for policy 0, policy_version 48830 (0.0009) [2023-10-14 02:59:54,382][33226] Updated weights for policy 1, policy_version 49260 (0.0009) [2023-10-14 02:59:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 100433920. Throughput: 0: 1744.5, 1: 1794.5. Samples: 25119662. 
Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:54,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.870')] [2023-10-14 02:59:54,757][33226] Updated weights for policy 1, policy_version 49270 (0.0008) [2023-10-14 02:59:55,120][33226] Updated weights for policy 1, policy_version 49280 (0.0007) [2023-10-14 02:59:58,080][33201] Updated weights for policy 0, policy_version 48840 (0.0008) [2023-10-14 02:59:58,442][33201] Updated weights for policy 0, policy_version 48850 (0.0009) [2023-10-14 02:59:58,809][33201] Updated weights for policy 0, policy_version 48860 (0.0007) [2023-10-14 02:59:58,818][33226] Updated weights for policy 1, policy_version 49290 (0.0009) [2023-10-14 02:59:59,182][33226] Updated weights for policy 1, policy_version 49300 (0.0009) [2023-10-14 02:59:59,551][33226] Updated weights for policy 1, policy_version 49310 (0.0007) [2023-10-14 02:59:59,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 100499456. Throughput: 0: 1766.8, 1: 1769.0. Samples: 25130612. Policy #0 lag: (min: 31.0, avg: 35.8, max: 63.0) [2023-10-14 02:59:59,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.870')] [2023-10-14 03:00:02,664][33201] Updated weights for policy 0, policy_version 48870 (0.0007) [2023-10-14 03:00:03,040][33201] Updated weights for policy 0, policy_version 48880 (0.0010) [2023-10-14 03:00:03,365][33226] Updated weights for policy 1, policy_version 49320 (0.0007) [2023-10-14 03:00:03,410][33201] Updated weights for policy 0, policy_version 48890 (0.0007) [2023-10-14 03:00:03,733][33226] Updated weights for policy 1, policy_version 49330 (0.0009) [2023-10-14 03:00:04,107][33226] Updated weights for policy 1, policy_version 49340 (0.0011) [2023-10-14 03:00:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 100597760. Throughput: 0: 1756.5, 1: 1795.2. Samples: 25152084. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:04,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.870')] [2023-10-14 03:00:07,278][33201] Updated weights for policy 0, policy_version 48900 (0.0009) [2023-10-14 03:00:07,650][33201] Updated weights for policy 0, policy_version 48910 (0.0008) [2023-10-14 03:00:07,931][33226] Updated weights for policy 1, policy_version 49350 (0.0010) [2023-10-14 03:00:08,024][33201] Updated weights for policy 0, policy_version 48920 (0.0010) [2023-10-14 03:00:08,307][33226] Updated weights for policy 1, policy_version 49360 (0.0008) [2023-10-14 03:00:08,670][33226] Updated weights for policy 1, policy_version 49370 (0.0009) [2023-10-14 03:00:09,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 100663296. Throughput: 0: 1744.0, 1: 1762.0. Samples: 25172022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:09,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.870')] [2023-10-14 03:00:11,811][33201] Updated weights for policy 0, policy_version 48930 (0.0008) [2023-10-14 03:00:12,182][33201] Updated weights for policy 0, policy_version 48940 (0.0008) [2023-10-14 03:00:12,471][33226] Updated weights for policy 1, policy_version 49380 (0.0009) [2023-10-14 03:00:12,563][33201] Updated weights for policy 0, policy_version 48950 (0.0009) [2023-10-14 03:00:12,827][33226] Updated weights for policy 1, policy_version 49390 (0.0008) [2023-10-14 03:00:12,929][33201] Updated weights for policy 0, policy_version 48960 (0.0008) [2023-10-14 03:00:13,191][33226] Updated weights for policy 1, policy_version 49400 (0.0008) [2023-10-14 03:00:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 100728832. Throughput: 0: 1769.1, 1: 1782.0. Samples: 25184126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:14,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.900')] [2023-10-14 03:00:16,687][33201] Updated weights for policy 0, policy_version 48970 (0.0009) [2023-10-14 03:00:17,051][33226] Updated weights for policy 1, policy_version 49410 (0.0009) [2023-10-14 03:00:17,058][33201] Updated weights for policy 0, policy_version 48980 (0.0010) [2023-10-14 03:00:17,418][33226] Updated weights for policy 1, policy_version 49420 (0.0010) [2023-10-14 03:00:17,430][33201] Updated weights for policy 0, policy_version 48990 (0.0009) [2023-10-14 03:00:17,790][33226] Updated weights for policy 1, policy_version 49430 (0.0009) [2023-10-14 03:00:18,162][33226] Updated weights for policy 1, policy_version 49440 (0.0009) [2023-10-14 03:00:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 100794368. Throughput: 0: 1750.9, 1: 1769.1. Samples: 25204036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:00:21,176][33201] Updated weights for policy 0, policy_version 49000 (0.0009) [2023-10-14 03:00:21,543][33201] Updated weights for policy 0, policy_version 49010 (0.0010) [2023-10-14 03:00:21,916][33201] Updated weights for policy 0, policy_version 49020 (0.0009) [2023-10-14 03:00:21,951][33226] Updated weights for policy 1, policy_version 49450 (0.0008) [2023-10-14 03:00:22,318][33226] Updated weights for policy 1, policy_version 49460 (0.0009) [2023-10-14 03:00:22,690][33226] Updated weights for policy 1, policy_version 49470 (0.0009) [2023-10-14 03:00:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 100859904. Throughput: 0: 1752.2, 1: 1768.2. Samples: 25225880. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 03:00:25,935][33201] Updated weights for policy 0, policy_version 49030 (0.0009) [2023-10-14 03:00:26,317][33201] Updated weights for policy 0, policy_version 49040 (0.0008) [2023-10-14 03:00:26,561][33226] Updated weights for policy 1, policy_version 49480 (0.0009) [2023-10-14 03:00:26,684][33201] Updated weights for policy 0, policy_version 49050 (0.0008) [2023-10-14 03:00:26,930][33226] Updated weights for policy 1, policy_version 49490 (0.0009) [2023-10-14 03:00:27,288][33226] Updated weights for policy 1, policy_version 49500 (0.0007) [2023-10-14 03:00:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 100925440. Throughput: 0: 1749.7, 1: 1777.3. Samples: 25235674. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 03:00:30,565][33201] Updated weights for policy 0, policy_version 49060 (0.0010) [2023-10-14 03:00:30,901][33226] Updated weights for policy 1, policy_version 49510 (0.0010) [2023-10-14 03:00:30,936][33201] Updated weights for policy 0, policy_version 49070 (0.0008) [2023-10-14 03:00:31,275][33226] Updated weights for policy 1, policy_version 49520 (0.0008) [2023-10-14 03:00:31,302][33201] Updated weights for policy 0, policy_version 49080 (0.0010) [2023-10-14 03:00:31,636][33226] Updated weights for policy 1, policy_version 49530 (0.0008) [2023-10-14 03:00:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 100990976. Throughput: 0: 1743.6, 1: 1775.4. Samples: 25257374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:00:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 03:00:35,155][33201] Updated weights for policy 0, policy_version 49090 (0.0009) [2023-10-14 03:00:35,367][33226] Updated weights for policy 1, policy_version 49540 (0.0008) [2023-10-14 03:00:35,523][33201] Updated weights for policy 0, policy_version 49100 (0.0008) [2023-10-14 03:00:35,738][33226] Updated weights for policy 1, policy_version 49550 (0.0007) [2023-10-14 03:00:35,888][33201] Updated weights for policy 0, policy_version 49110 (0.0007) [2023-10-14 03:00:36,106][33226] Updated weights for policy 1, policy_version 49560 (0.0007) [2023-10-14 03:00:36,256][33201] Updated weights for policy 0, policy_version 49120 (0.0008) [2023-10-14 03:00:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 101056512. Throughput: 0: 1768.0, 1: 1775.4. Samples: 25279114. Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:00:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 03:00:39,883][33226] Updated weights for policy 1, policy_version 49570 (0.0009) [2023-10-14 03:00:40,138][33201] Updated weights for policy 0, policy_version 49130 (0.0007) [2023-10-14 03:00:40,255][33226] Updated weights for policy 1, policy_version 49580 (0.0008) [2023-10-14 03:00:40,511][33201] Updated weights for policy 0, policy_version 49140 (0.0009) [2023-10-14 03:00:40,622][33226] Updated weights for policy 1, policy_version 49590 (0.0007) [2023-10-14 03:00:40,882][33201] Updated weights for policy 0, policy_version 49150 (0.0009) [2023-10-14 03:00:40,994][33226] Updated weights for policy 1, policy_version 49600 (0.0007) [2023-10-14 03:00:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 101122048. Throughput: 0: 1739.9, 1: 1772.2. Samples: 25288656. 
Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:00:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 03:00:44,746][33201] Updated weights for policy 0, policy_version 49160 (0.0008) [2023-10-14 03:00:44,881][33226] Updated weights for policy 1, policy_version 49610 (0.0008) [2023-10-14 03:00:45,118][33201] Updated weights for policy 0, policy_version 49170 (0.0008) [2023-10-14 03:00:45,242][33226] Updated weights for policy 1, policy_version 49620 (0.0007) [2023-10-14 03:00:45,476][33201] Updated weights for policy 0, policy_version 49180 (0.0008) [2023-10-14 03:00:45,615][33226] Updated weights for policy 1, policy_version 49630 (0.0007) [2023-10-14 03:00:49,399][33201] Updated weights for policy 0, policy_version 49190 (0.0007) [2023-10-14 03:00:49,462][33226] Updated weights for policy 1, policy_version 49640 (0.0008) [2023-10-14 03:00:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 101187584. Throughput: 0: 1757.1, 1: 1761.2. Samples: 25310406. 
Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:00:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:00:49,772][33201] Updated weights for policy 0, policy_version 49200 (0.0008) [2023-10-14 03:00:49,829][33226] Updated weights for policy 1, policy_version 49650 (0.0008) [2023-10-14 03:00:50,142][33201] Updated weights for policy 0, policy_version 49210 (0.0008) [2023-10-14 03:00:50,184][33226] Updated weights for policy 1, policy_version 49660 (0.0008) [2023-10-14 03:00:53,957][33226] Updated weights for policy 1, policy_version 49670 (0.0007) [2023-10-14 03:00:53,978][33201] Updated weights for policy 0, policy_version 49220 (0.0008) [2023-10-14 03:00:54,328][33226] Updated weights for policy 1, policy_version 49680 (0.0007) [2023-10-14 03:00:54,350][33201] Updated weights for policy 0, policy_version 49230 (0.0008) [2023-10-14 03:00:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 101253120. Throughput: 0: 1759.1, 1: 1797.0. Samples: 25332048. 
Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:00:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:00:54,691][33226] Updated weights for policy 1, policy_version 49690 (0.0007) [2023-10-14 03:00:54,724][33201] Updated weights for policy 0, policy_version 49240 (0.0007) [2023-10-14 03:00:58,506][33201] Updated weights for policy 0, policy_version 49250 (0.0008) [2023-10-14 03:00:58,640][33226] Updated weights for policy 1, policy_version 49700 (0.0007) [2023-10-14 03:00:58,881][33201] Updated weights for policy 0, policy_version 49260 (0.0007) [2023-10-14 03:00:59,004][33226] Updated weights for policy 1, policy_version 49710 (0.0007) [2023-10-14 03:00:59,247][33201] Updated weights for policy 0, policy_version 49270 (0.0007) [2023-10-14 03:00:59,379][33226] Updated weights for policy 1, policy_version 49720 (0.0007) [2023-10-14 03:00:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 101318656. Throughput: 0: 1745.9, 1: 1766.9. Samples: 25342200. 
Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:00:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:00:59,622][33201] Updated weights for policy 0, policy_version 49280 (0.0007) [2023-10-14 03:01:03,170][33226] Updated weights for policy 1, policy_version 49730 (0.0009) [2023-10-14 03:01:03,487][33201] Updated weights for policy 0, policy_version 49290 (0.0007) [2023-10-14 03:01:03,537][33226] Updated weights for policy 1, policy_version 49740 (0.0007) [2023-10-14 03:01:03,853][33201] Updated weights for policy 0, policy_version 49300 (0.0009) [2023-10-14 03:01:03,893][33226] Updated weights for policy 1, policy_version 49750 (0.0008) [2023-10-14 03:01:04,225][33201] Updated weights for policy 0, policy_version 49310 (0.0010) [2023-10-14 03:01:04,263][33226] Updated weights for policy 1, policy_version 49760 (0.0009) [2023-10-14 03:01:04,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 101449728. Throughput: 0: 1767.9, 1: 1788.1. Samples: 25364058. Policy #0 lag: (min: 2.0, avg: 28.7, max: 32.0) [2023-10-14 03:01:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.970')] [2023-10-14 03:01:07,945][33201] Updated weights for policy 0, policy_version 49320 (0.0007) [2023-10-14 03:01:07,985][33226] Updated weights for policy 1, policy_version 49770 (0.0009) [2023-10-14 03:01:08,321][33201] Updated weights for policy 0, policy_version 49330 (0.0007) [2023-10-14 03:01:08,346][33226] Updated weights for policy 1, policy_version 49780 (0.0008) [2023-10-14 03:01:08,692][33201] Updated weights for policy 0, policy_version 49340 (0.0007) [2023-10-14 03:01:08,721][33226] Updated weights for policy 1, policy_version 49790 (0.0008) [2023-10-14 03:01:09,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 101515264. Throughput: 0: 1737.6, 1: 1758.9. Samples: 25383220. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:09,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:01:12,597][33226] Updated weights for policy 1, policy_version 49800 (0.0008) [2023-10-14 03:01:12,623][33201] Updated weights for policy 0, policy_version 49350 (0.0008) [2023-10-14 03:01:12,960][33226] Updated weights for policy 1, policy_version 49810 (0.0007) [2023-10-14 03:01:13,017][33201] Updated weights for policy 0, policy_version 49360 (0.0008) [2023-10-14 03:01:13,333][33226] Updated weights for policy 1, policy_version 49820 (0.0007) [2023-10-14 03:01:13,390][33201] Updated weights for policy 0, policy_version 49370 (0.0007) [2023-10-14 03:01:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 101580800. Throughput: 0: 1778.0, 1: 1775.9. Samples: 25395602. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:14,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:01:17,049][33226] Updated weights for policy 1, policy_version 49830 (0.0010) [2023-10-14 03:01:17,344][33201] Updated weights for policy 0, policy_version 49380 (0.0009) [2023-10-14 03:01:17,413][33226] Updated weights for policy 1, policy_version 49840 (0.0009) [2023-10-14 03:01:17,719][33201] Updated weights for policy 0, policy_version 49390 (0.0008) [2023-10-14 03:01:17,779][33226] Updated weights for policy 1, policy_version 49850 (0.0009) [2023-10-14 03:01:18,079][33201] Updated weights for policy 0, policy_version 49400 (0.0007) [2023-10-14 03:01:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 101646336. Throughput: 0: 1749.1, 1: 1756.4. Samples: 25415118. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:19,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:01:21,676][33226] Updated weights for policy 1, policy_version 49860 (0.0009) [2023-10-14 03:01:22,064][33201] Updated weights for policy 0, policy_version 49410 (0.0007) [2023-10-14 03:01:22,080][33226] Updated weights for policy 1, policy_version 49870 (0.0008) [2023-10-14 03:01:22,432][33201] Updated weights for policy 0, policy_version 49420 (0.0009) [2023-10-14 03:01:22,447][33226] Updated weights for policy 1, policy_version 49880 (0.0009) [2023-10-14 03:01:22,797][33201] Updated weights for policy 0, policy_version 49430 (0.0009) [2023-10-14 03:01:23,168][33201] Updated weights for policy 0, policy_version 49440 (0.0009) [2023-10-14 03:01:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 101711872. Throughput: 0: 1741.3, 1: 1751.0. Samples: 25436268. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:24,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:01:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000049888_51085312.pth... [2023-10-14 03:01:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000049440_50626560.pth... 
[2023-10-14 03:01:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000047808_48955392.pth [2023-10-14 03:01:24,605][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000048224_49381376.pth [2023-10-14 03:01:24,607][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000049440_50626560.pth [2023-10-14 03:01:24,609][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000049888_51085312.pth [2023-10-14 03:01:26,237][33226] Updated weights for policy 1, policy_version 49890 (0.0008) [2023-10-14 03:01:26,597][33226] Updated weights for policy 1, policy_version 49900 (0.0011) [2023-10-14 03:01:26,959][33226] Updated weights for policy 1, policy_version 49910 (0.0008) [2023-10-14 03:01:27,039][33201] Updated weights for policy 0, policy_version 49450 (0.0007) [2023-10-14 03:01:27,330][33226] Updated weights for policy 1, policy_version 49920 (0.0007) [2023-10-14 03:01:27,409][33201] Updated weights for policy 0, policy_version 49460 (0.0008) [2023-10-14 03:01:27,784][33201] Updated weights for policy 0, policy_version 49470 (0.0008) [2023-10-14 03:01:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 101777408. Throughput: 0: 1763.1, 1: 1766.2. Samples: 25447474. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:29,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:01:31,227][33226] Updated weights for policy 1, policy_version 49930 (0.0009) [2023-10-14 03:01:31,468][33201] Updated weights for policy 0, policy_version 49480 (0.0008) [2023-10-14 03:01:31,588][33226] Updated weights for policy 1, policy_version 49940 (0.0009) [2023-10-14 03:01:31,841][33201] Updated weights for policy 0, policy_version 49490 (0.0007) [2023-10-14 03:01:31,954][33226] Updated weights for policy 1, policy_version 49950 (0.0007) [2023-10-14 03:01:32,209][33201] Updated weights for policy 0, policy_version 49500 (0.0007) [2023-10-14 03:01:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 101842944. Throughput: 0: 1747.4, 1: 1767.1. Samples: 25468560. Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:34,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.960')] [2023-10-14 03:01:35,699][33226] Updated weights for policy 1, policy_version 49960 (0.0007) [2023-10-14 03:01:36,042][33201] Updated weights for policy 0, policy_version 49510 (0.0008) [2023-10-14 03:01:36,072][33226] Updated weights for policy 1, policy_version 49970 (0.0007) [2023-10-14 03:01:36,417][33201] Updated weights for policy 0, policy_version 49520 (0.0007) [2023-10-14 03:01:36,433][33226] Updated weights for policy 1, policy_version 49980 (0.0007) [2023-10-14 03:01:36,794][33201] Updated weights for policy 0, policy_version 49530 (0.0008) [2023-10-14 03:01:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 101908480. Throughput: 0: 1758.0, 1: 1768.8. Samples: 25490756. 
Policy #0 lag: (min: 19.0, avg: 24.6, max: 51.0) [2023-10-14 03:01:39,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.960')] [2023-10-14 03:01:40,184][33226] Updated weights for policy 1, policy_version 49990 (0.0008) [2023-10-14 03:01:40,552][33226] Updated weights for policy 1, policy_version 50000 (0.0010) [2023-10-14 03:01:40,584][33201] Updated weights for policy 0, policy_version 49540 (0.0008) [2023-10-14 03:01:40,922][33226] Updated weights for policy 1, policy_version 50010 (0.0009) [2023-10-14 03:01:40,959][33201] Updated weights for policy 0, policy_version 49550 (0.0007) [2023-10-14 03:01:41,334][33201] Updated weights for policy 0, policy_version 49560 (0.0008) [2023-10-14 03:01:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 101974016. Throughput: 0: 1748.9, 1: 1766.2. Samples: 25500380. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:01:44,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.960')] [2023-10-14 03:01:44,701][33226] Updated weights for policy 1, policy_version 50020 (0.0010) [2023-10-14 03:01:45,067][33226] Updated weights for policy 1, policy_version 50030 (0.0010) [2023-10-14 03:01:45,151][33201] Updated weights for policy 0, policy_version 49570 (0.0009) [2023-10-14 03:01:45,438][33226] Updated weights for policy 1, policy_version 50040 (0.0007) [2023-10-14 03:01:45,517][33201] Updated weights for policy 0, policy_version 49580 (0.0009) [2023-10-14 03:01:45,891][33201] Updated weights for policy 0, policy_version 49590 (0.0007) [2023-10-14 03:01:46,258][33201] Updated weights for policy 0, policy_version 49600 (0.0009) [2023-10-14 03:01:49,416][33226] Updated weights for policy 1, policy_version 50050 (0.0009) [2023-10-14 03:01:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 102039552. Throughput: 0: 1747.3, 1: 1773.7. Samples: 25522504. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:01:49,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.960')] [2023-10-14 03:01:49,795][33226] Updated weights for policy 1, policy_version 50060 (0.0009) [2023-10-14 03:01:50,152][33226] Updated weights for policy 1, policy_version 50070 (0.0009) [2023-10-14 03:01:50,175][33201] Updated weights for policy 0, policy_version 49610 (0.0009) [2023-10-14 03:01:50,525][33226] Updated weights for policy 1, policy_version 50080 (0.0010) [2023-10-14 03:01:50,552][33201] Updated weights for policy 0, policy_version 49620 (0.0008) [2023-10-14 03:01:50,926][33201] Updated weights for policy 0, policy_version 49630 (0.0009) [2023-10-14 03:01:54,296][33226] Updated weights for policy 1, policy_version 50090 (0.0009) [2023-10-14 03:01:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 102105088. Throughput: 0: 1779.2, 1: 1805.7. Samples: 25544540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:01:54,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.960')] [2023-10-14 03:01:54,657][33201] Updated weights for policy 0, policy_version 49640 (0.0008) [2023-10-14 03:01:54,666][33226] Updated weights for policy 1, policy_version 50100 (0.0007) [2023-10-14 03:01:55,026][33201] Updated weights for policy 0, policy_version 49650 (0.0008) [2023-10-14 03:01:55,041][33226] Updated weights for policy 1, policy_version 50110 (0.0008) [2023-10-14 03:01:55,402][33201] Updated weights for policy 0, policy_version 49660 (0.0008) [2023-10-14 03:01:58,662][33226] Updated weights for policy 1, policy_version 50120 (0.0009) [2023-10-14 03:01:59,032][33226] Updated weights for policy 1, policy_version 50130 (0.0008) [2023-10-14 03:01:59,219][33201] Updated weights for policy 0, policy_version 49670 (0.0007) [2023-10-14 03:01:59,397][33226] Updated weights for policy 1, policy_version 50140 (0.0008) [2023-10-14 03:01:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 
14745.6, 300 sec: 14218.0). Total num frames: 102203392. Throughput: 0: 1745.3, 1: 1780.7. Samples: 25554272. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:01:59,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.960')] [2023-10-14 03:01:59,590][33201] Updated weights for policy 0, policy_version 49680 (0.0008) [2023-10-14 03:01:59,963][33201] Updated weights for policy 0, policy_version 49690 (0.0010) [2023-10-14 03:02:03,022][33226] Updated weights for policy 1, policy_version 50150 (0.0008) [2023-10-14 03:02:03,385][33226] Updated weights for policy 1, policy_version 50160 (0.0008) [2023-10-14 03:02:03,739][33226] Updated weights for policy 1, policy_version 50170 (0.0011) [2023-10-14 03:02:03,962][33201] Updated weights for policy 0, policy_version 49700 (0.0010) [2023-10-14 03:02:04,320][33201] Updated weights for policy 0, policy_version 49710 (0.0009) [2023-10-14 03:02:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 102268928. Throughput: 0: 1774.4, 1: 1809.8. Samples: 25576408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:04,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.960')] [2023-10-14 03:02:04,688][33201] Updated weights for policy 0, policy_version 49720 (0.0011) [2023-10-14 03:02:07,470][33226] Updated weights for policy 1, policy_version 50180 (0.0008) [2023-10-14 03:02:07,842][33226] Updated weights for policy 1, policy_version 50190 (0.0009) [2023-10-14 03:02:08,205][33226] Updated weights for policy 1, policy_version 50200 (0.0008) [2023-10-14 03:02:08,467][33201] Updated weights for policy 0, policy_version 49730 (0.0009) [2023-10-14 03:02:08,836][33201] Updated weights for policy 0, policy_version 49740 (0.0010) [2023-10-14 03:02:09,209][33201] Updated weights for policy 0, policy_version 49750 (0.0009) [2023-10-14 03:02:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 102334464. 
Throughput: 0: 1770.5, 1: 1796.3. Samples: 25596774. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:09,557][31953] Avg episode reward: [(0, '20.740'), (1, '20.970')] [2023-10-14 03:02:09,582][33201] Updated weights for policy 0, policy_version 49760 (0.0008) [2023-10-14 03:02:11,907][33226] Updated weights for policy 1, policy_version 50210 (0.0010) [2023-10-14 03:02:12,273][33226] Updated weights for policy 1, policy_version 50220 (0.0007) [2023-10-14 03:02:12,641][33226] Updated weights for policy 1, policy_version 50230 (0.0007) [2023-10-14 03:02:13,008][33226] Updated weights for policy 1, policy_version 50240 (0.0007) [2023-10-14 03:02:13,415][33201] Updated weights for policy 0, policy_version 49770 (0.0009) [2023-10-14 03:02:13,777][33201] Updated weights for policy 0, policy_version 49780 (0.0010) [2023-10-14 03:02:14,146][33201] Updated weights for policy 0, policy_version 49790 (0.0009) [2023-10-14 03:02:14,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 102432768. Throughput: 0: 1764.4, 1: 1812.2. Samples: 25608422. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:14,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.970')] [2023-10-14 03:02:16,660][33226] Updated weights for policy 1, policy_version 50250 (0.0008) [2023-10-14 03:02:17,022][33226] Updated weights for policy 1, policy_version 50260 (0.0007) [2023-10-14 03:02:17,389][33226] Updated weights for policy 1, policy_version 50270 (0.0010) [2023-10-14 03:02:17,986][33201] Updated weights for policy 0, policy_version 49800 (0.0010) [2023-10-14 03:02:18,362][33201] Updated weights for policy 0, policy_version 49810 (0.0010) [2023-10-14 03:02:18,734][33201] Updated weights for policy 0, policy_version 49820 (0.0009) [2023-10-14 03:02:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 102498304. Throughput: 0: 1771.0, 1: 1793.4. Samples: 25628958. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:19,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:02:21,148][33226] Updated weights for policy 1, policy_version 50280 (0.0010) [2023-10-14 03:02:21,508][33226] Updated weights for policy 1, policy_version 50290 (0.0010) [2023-10-14 03:02:21,874][33226] Updated weights for policy 1, policy_version 50300 (0.0009) [2023-10-14 03:02:22,597][33201] Updated weights for policy 0, policy_version 49830 (0.0010) [2023-10-14 03:02:22,969][33201] Updated weights for policy 0, policy_version 49840 (0.0008) [2023-10-14 03:02:23,341][33201] Updated weights for policy 0, policy_version 49850 (0.0007) [2023-10-14 03:02:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 102563840. Throughput: 0: 1747.9, 1: 1793.4. Samples: 25650116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:24,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:02:25,883][33226] Updated weights for policy 1, policy_version 50310 (0.0008) [2023-10-14 03:02:26,252][33226] Updated weights for policy 1, policy_version 50320 (0.0008) [2023-10-14 03:02:26,626][33226] Updated weights for policy 1, policy_version 50330 (0.0008) [2023-10-14 03:02:27,051][33201] Updated weights for policy 0, policy_version 49860 (0.0009) [2023-10-14 03:02:27,420][33201] Updated weights for policy 0, policy_version 49870 (0.0008) [2023-10-14 03:02:27,799][33201] Updated weights for policy 0, policy_version 49880 (0.0007) [2023-10-14 03:02:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 102629376. Throughput: 0: 1776.8, 1: 1789.7. Samples: 25660872. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:29,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:02:30,412][33226] Updated weights for policy 1, policy_version 50340 (0.0009) [2023-10-14 03:02:30,782][33226] Updated weights for policy 1, policy_version 50350 (0.0009) [2023-10-14 03:02:31,143][33226] Updated weights for policy 1, policy_version 50360 (0.0010) [2023-10-14 03:02:31,697][33201] Updated weights for policy 0, policy_version 49890 (0.0009) [2023-10-14 03:02:32,072][33201] Updated weights for policy 0, policy_version 49900 (0.0008) [2023-10-14 03:02:32,446][33201] Updated weights for policy 0, policy_version 49910 (0.0009) [2023-10-14 03:02:32,816][33201] Updated weights for policy 0, policy_version 49920 (0.0009) [2023-10-14 03:02:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 102694912. Throughput: 0: 1751.9, 1: 1792.2. Samples: 25681990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:34,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:02:34,929][33226] Updated weights for policy 1, policy_version 50370 (0.0010) [2023-10-14 03:02:35,295][33226] Updated weights for policy 1, policy_version 50380 (0.0007) [2023-10-14 03:02:35,663][33226] Updated weights for policy 1, policy_version 50390 (0.0007) [2023-10-14 03:02:36,026][33226] Updated weights for policy 1, policy_version 50400 (0.0009) [2023-10-14 03:02:36,517][33201] Updated weights for policy 0, policy_version 49930 (0.0009) [2023-10-14 03:02:36,892][33201] Updated weights for policy 0, policy_version 49940 (0.0009) [2023-10-14 03:02:37,263][33201] Updated weights for policy 0, policy_version 49950 (0.0009) [2023-10-14 03:02:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 102760448. Throughput: 0: 1757.8, 1: 1793.3. Samples: 25704338. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:39,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:02:39,765][33226] Updated weights for policy 1, policy_version 50410 (0.0010) [2023-10-14 03:02:40,132][33226] Updated weights for policy 1, policy_version 50420 (0.0010) [2023-10-14 03:02:40,501][33226] Updated weights for policy 1, policy_version 50430 (0.0010) [2023-10-14 03:02:40,962][33201] Updated weights for policy 0, policy_version 49960 (0.0009) [2023-10-14 03:02:41,327][33201] Updated weights for policy 0, policy_version 49970 (0.0007) [2023-10-14 03:02:41,706][33201] Updated weights for policy 0, policy_version 49980 (0.0009) [2023-10-14 03:02:44,301][33226] Updated weights for policy 1, policy_version 50440 (0.0009) [2023-10-14 03:02:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 102825984. Throughput: 0: 1761.2, 1: 1790.1. Samples: 25714082. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:02:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.990')] [2023-10-14 03:02:44,660][33226] Updated weights for policy 1, policy_version 50450 (0.0009) [2023-10-14 03:02:45,018][33226] Updated weights for policy 1, policy_version 50460 (0.0011) [2023-10-14 03:02:45,558][33201] Updated weights for policy 0, policy_version 49990 (0.0008) [2023-10-14 03:02:45,951][33201] Updated weights for policy 0, policy_version 50000 (0.0007) [2023-10-14 03:02:46,323][33201] Updated weights for policy 0, policy_version 50010 (0.0008) [2023-10-14 03:02:48,806][33226] Updated weights for policy 1, policy_version 50470 (0.0008) [2023-10-14 03:02:49,165][33226] Updated weights for policy 1, policy_version 50480 (0.0009) [2023-10-14 03:02:49,539][33226] Updated weights for policy 1, policy_version 50490 (0.0009) [2023-10-14 03:02:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 102891520. Throughput: 0: 1763.6, 1: 1792.1. 
Samples: 25736416. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:02:49,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.990')] [2023-10-14 03:02:50,043][33201] Updated weights for policy 0, policy_version 50020 (0.0008) [2023-10-14 03:02:50,415][33201] Updated weights for policy 0, policy_version 50030 (0.0007) [2023-10-14 03:02:50,793][33201] Updated weights for policy 0, policy_version 50040 (0.0007) [2023-10-14 03:02:53,322][33226] Updated weights for policy 1, policy_version 50500 (0.0007) [2023-10-14 03:02:53,723][33226] Updated weights for policy 1, policy_version 50510 (0.0007) [2023-10-14 03:02:54,081][33226] Updated weights for policy 1, policy_version 50520 (0.0009) [2023-10-14 03:02:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 102989824. Throughput: 0: 1781.2, 1: 1793.9. Samples: 25757658. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:02:54,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.990')] [2023-10-14 03:02:54,603][33201] Updated weights for policy 0, policy_version 50050 (0.0008) [2023-10-14 03:02:54,971][33201] Updated weights for policy 0, policy_version 50060 (0.0007) [2023-10-14 03:02:55,341][33201] Updated weights for policy 0, policy_version 50070 (0.0008) [2023-10-14 03:02:55,711][33201] Updated weights for policy 0, policy_version 50080 (0.0007) [2023-10-14 03:02:57,790][33226] Updated weights for policy 1, policy_version 50530 (0.0008) [2023-10-14 03:02:58,157][33226] Updated weights for policy 1, policy_version 50540 (0.0007) [2023-10-14 03:02:58,523][33226] Updated weights for policy 1, policy_version 50550 (0.0007) [2023-10-14 03:02:58,895][33226] Updated weights for policy 1, policy_version 50560 (0.0007) [2023-10-14 03:02:59,457][33201] Updated weights for policy 0, policy_version 50090 (0.0009) [2023-10-14 03:02:59,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 103055360. 
Throughput: 0: 1766.8, 1: 1787.3. Samples: 25768358. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:02:59,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.990')] [2023-10-14 03:02:59,831][33201] Updated weights for policy 0, policy_version 50100 (0.0008) [2023-10-14 03:03:00,215][33201] Updated weights for policy 0, policy_version 50110 (0.0009) [2023-10-14 03:03:02,687][33226] Updated weights for policy 1, policy_version 50570 (0.0007) [2023-10-14 03:03:03,054][33226] Updated weights for policy 1, policy_version 50580 (0.0010) [2023-10-14 03:03:03,418][33226] Updated weights for policy 1, policy_version 50590 (0.0007) [2023-10-14 03:03:04,014][33201] Updated weights for policy 0, policy_version 50120 (0.0010) [2023-10-14 03:03:04,380][33201] Updated weights for policy 0, policy_version 50130 (0.0009) [2023-10-14 03:03:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 103120896. Throughput: 0: 1779.4, 1: 1802.2. Samples: 25790130. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:03:04,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.970')] [2023-10-14 03:03:04,763][33201] Updated weights for policy 0, policy_version 50140 (0.0009) [2023-10-14 03:03:07,133][33226] Updated weights for policy 1, policy_version 50600 (0.0011) [2023-10-14 03:03:07,507][33226] Updated weights for policy 1, policy_version 50610 (0.0008) [2023-10-14 03:03:07,874][33226] Updated weights for policy 1, policy_version 50620 (0.0008) [2023-10-14 03:03:08,630][33201] Updated weights for policy 0, policy_version 50150 (0.0009) [2023-10-14 03:03:09,004][33201] Updated weights for policy 0, policy_version 50160 (0.0007) [2023-10-14 03:03:09,374][33201] Updated weights for policy 0, policy_version 50170 (0.0007) [2023-10-14 03:03:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 103186432. Throughput: 0: 1782.9, 1: 1785.8. Samples: 25810708. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:03:09,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.970')] [2023-10-14 03:03:11,728][33226] Updated weights for policy 1, policy_version 50630 (0.0008) [2023-10-14 03:03:12,090][33226] Updated weights for policy 1, policy_version 50640 (0.0008) [2023-10-14 03:03:12,460][33226] Updated weights for policy 1, policy_version 50650 (0.0007) [2023-10-14 03:03:13,073][33201] Updated weights for policy 0, policy_version 50180 (0.0008) [2023-10-14 03:03:13,442][33201] Updated weights for policy 0, policy_version 50190 (0.0009) [2023-10-14 03:03:13,817][33201] Updated weights for policy 0, policy_version 50200 (0.0010) [2023-10-14 03:03:14,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103284736. Throughput: 0: 1771.7, 1: 1806.6. Samples: 25821894. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:03:14,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.950')] [2023-10-14 03:03:16,136][33226] Updated weights for policy 1, policy_version 50660 (0.0009) [2023-10-14 03:03:16,514][33226] Updated weights for policy 1, policy_version 50670 (0.0009) [2023-10-14 03:03:16,870][33226] Updated weights for policy 1, policy_version 50680 (0.0008) [2023-10-14 03:03:17,668][33201] Updated weights for policy 0, policy_version 50210 (0.0008) [2023-10-14 03:03:18,052][33201] Updated weights for policy 0, policy_version 50220 (0.0007) [2023-10-14 03:03:18,416][33201] Updated weights for policy 0, policy_version 50230 (0.0008) [2023-10-14 03:03:18,791][33201] Updated weights for policy 0, policy_version 50240 (0.0009) [2023-10-14 03:03:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103350272. Throughput: 0: 1785.4, 1: 1784.6. Samples: 25842638. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:03:19,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:20,547][33226] Updated weights for policy 1, policy_version 50690 (0.0010) [2023-10-14 03:03:20,913][33226] Updated weights for policy 1, policy_version 50700 (0.0009) [2023-10-14 03:03:21,267][33226] Updated weights for policy 1, policy_version 50710 (0.0008) [2023-10-14 03:03:21,629][33226] Updated weights for policy 1, policy_version 50720 (0.0008) [2023-10-14 03:03:22,561][33201] Updated weights for policy 0, policy_version 50250 (0.0009) [2023-10-14 03:03:22,931][33201] Updated weights for policy 0, policy_version 50260 (0.0010) [2023-10-14 03:03:23,307][33201] Updated weights for policy 0, policy_version 50270 (0.0008) [2023-10-14 03:03:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103415808. Throughput: 0: 1763.0, 1: 1792.7. Samples: 25864342. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:24,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000050272_51478528.pth... [2023-10-14 03:03:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000050720_51937280.pth... 
[2023-10-14 03:03:24,614][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000049056_50233344.pth [2023-10-14 03:03:24,615][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000048608_49774592.pth [2023-10-14 03:03:25,285][33226] Updated weights for policy 1, policy_version 50730 (0.0008) [2023-10-14 03:03:25,644][33226] Updated weights for policy 1, policy_version 50740 (0.0008) [2023-10-14 03:03:26,011][33226] Updated weights for policy 1, policy_version 50750 (0.0007) [2023-10-14 03:03:27,114][33201] Updated weights for policy 0, policy_version 50280 (0.0010) [2023-10-14 03:03:27,491][33201] Updated weights for policy 0, policy_version 50290 (0.0010) [2023-10-14 03:03:27,860][33201] Updated weights for policy 0, policy_version 50300 (0.0007) [2023-10-14 03:03:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103481344. Throughput: 0: 1787.6, 1: 1790.9. Samples: 25875114. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:29,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:29,817][33226] Updated weights for policy 1, policy_version 50760 (0.0008) [2023-10-14 03:03:30,184][33226] Updated weights for policy 1, policy_version 50770 (0.0008) [2023-10-14 03:03:30,552][33226] Updated weights for policy 1, policy_version 50780 (0.0008) [2023-10-14 03:03:31,538][33201] Updated weights for policy 0, policy_version 50310 (0.0008) [2023-10-14 03:03:31,905][33201] Updated weights for policy 0, policy_version 50320 (0.0008) [2023-10-14 03:03:32,283][33201] Updated weights for policy 0, policy_version 50330 (0.0007) [2023-10-14 03:03:34,341][33226] Updated weights for policy 1, policy_version 50790 (0.0008) [2023-10-14 03:03:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103546880. Throughput: 0: 1764.1, 1: 1787.1. Samples: 25896220. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:34,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:34,705][33226] Updated weights for policy 1, policy_version 50800 (0.0008) [2023-10-14 03:03:35,068][33226] Updated weights for policy 1, policy_version 50810 (0.0008) [2023-10-14 03:03:36,251][33201] Updated weights for policy 0, policy_version 50340 (0.0007) [2023-10-14 03:03:36,642][33201] Updated weights for policy 0, policy_version 50350 (0.0007) [2023-10-14 03:03:37,010][33201] Updated weights for policy 0, policy_version 50360 (0.0008) [2023-10-14 03:03:38,956][33226] Updated weights for policy 1, policy_version 50820 (0.0009) [2023-10-14 03:03:39,360][33226] Updated weights for policy 1, policy_version 50830 (0.0007) [2023-10-14 03:03:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 103612416. Throughput: 0: 1761.1, 1: 1799.5. Samples: 25917884. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:39,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:39,720][33226] Updated weights for policy 1, policy_version 50840 (0.0008) [2023-10-14 03:03:40,721][33201] Updated weights for policy 0, policy_version 50370 (0.0008) [2023-10-14 03:03:41,099][33201] Updated weights for policy 0, policy_version 50380 (0.0008) [2023-10-14 03:03:41,471][33201] Updated weights for policy 0, policy_version 50390 (0.0010) [2023-10-14 03:03:41,850][33201] Updated weights for policy 0, policy_version 50400 (0.0009) [2023-10-14 03:03:43,392][33226] Updated weights for policy 1, policy_version 50850 (0.0008) [2023-10-14 03:03:43,754][33226] Updated weights for policy 1, policy_version 50860 (0.0008) [2023-10-14 03:03:44,118][33226] Updated weights for policy 1, policy_version 50870 (0.0010) [2023-10-14 03:03:44,490][33226] Updated weights for policy 1, policy_version 50880 (0.0011) [2023-10-14 03:03:44,557][31953] Fps is (10 sec: 16383.7, 60 sec: 
14745.5, 300 sec: 14218.0). Total num frames: 103710720. Throughput: 0: 1758.1, 1: 1778.6. Samples: 25927510. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:44,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:03:45,847][33201] Updated weights for policy 0, policy_version 50410 (0.0007) [2023-10-14 03:03:46,219][33201] Updated weights for policy 0, policy_version 50420 (0.0009) [2023-10-14 03:03:46,590][33201] Updated weights for policy 0, policy_version 50430 (0.0012) [2023-10-14 03:03:48,372][33226] Updated weights for policy 1, policy_version 50890 (0.0009) [2023-10-14 03:03:48,737][33226] Updated weights for policy 1, policy_version 50900 (0.0010) [2023-10-14 03:03:49,112][33226] Updated weights for policy 1, policy_version 50910 (0.0007) [2023-10-14 03:03:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 103776256. Throughput: 0: 1750.5, 1: 1791.3. Samples: 25949508. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:49,557][31953] Avg episode reward: [(0, '20.670'), (1, '20.960')] [2023-10-14 03:03:50,563][33201] Updated weights for policy 0, policy_version 50440 (0.0008) [2023-10-14 03:03:50,926][33201] Updated weights for policy 0, policy_version 50450 (0.0008) [2023-10-14 03:03:51,306][33201] Updated weights for policy 0, policy_version 50460 (0.0007) [2023-10-14 03:03:52,991][33226] Updated weights for policy 1, policy_version 50920 (0.0010) [2023-10-14 03:03:53,358][33226] Updated weights for policy 1, policy_version 50930 (0.0009) [2023-10-14 03:03:53,733][33226] Updated weights for policy 1, policy_version 50940 (0.0009) [2023-10-14 03:03:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 103841792. Throughput: 0: 1768.3, 1: 1773.4. Samples: 25970082. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 03:03:54,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.960')] [2023-10-14 03:03:55,275][33201] Updated weights for policy 0, policy_version 50470 (0.0010) [2023-10-14 03:03:55,644][33201] Updated weights for policy 0, policy_version 50480 (0.0010) [2023-10-14 03:03:56,019][33201] Updated weights for policy 0, policy_version 50490 (0.0008) [2023-10-14 03:03:57,547][33226] Updated weights for policy 1, policy_version 50950 (0.0010) [2023-10-14 03:03:57,911][33226] Updated weights for policy 1, policy_version 50960 (0.0010) [2023-10-14 03:03:58,280][33226] Updated weights for policy 1, policy_version 50970 (0.0009) [2023-10-14 03:03:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 103907328. Throughput: 0: 1747.2, 1: 1787.0. Samples: 25980930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:03:59,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.960')] [2023-10-14 03:03:59,792][33201] Updated weights for policy 0, policy_version 50500 (0.0008) [2023-10-14 03:04:00,160][33201] Updated weights for policy 0, policy_version 50510 (0.0007) [2023-10-14 03:04:00,524][33201] Updated weights for policy 0, policy_version 50520 (0.0008) [2023-10-14 03:04:02,282][33226] Updated weights for policy 1, policy_version 50980 (0.0008) [2023-10-14 03:04:02,641][33226] Updated weights for policy 1, policy_version 50990 (0.0010) [2023-10-14 03:04:03,016][33226] Updated weights for policy 1, policy_version 51000 (0.0011) [2023-10-14 03:04:04,339][33201] Updated weights for policy 0, policy_version 50530 (0.0009) [2023-10-14 03:04:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 103972864. Throughput: 0: 1760.2, 1: 1780.2. Samples: 26001958. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:04,557][31953] Avg episode reward: [(0, '20.630'), (1, '20.960')] [2023-10-14 03:04:04,708][33201] Updated weights for policy 0, policy_version 50540 (0.0007) [2023-10-14 03:04:05,077][33201] Updated weights for policy 0, policy_version 50550 (0.0007) [2023-10-14 03:04:05,442][33201] Updated weights for policy 0, policy_version 50560 (0.0007) [2023-10-14 03:04:06,856][33226] Updated weights for policy 1, policy_version 51010 (0.0011) [2023-10-14 03:04:07,231][33226] Updated weights for policy 1, policy_version 51020 (0.0011) [2023-10-14 03:04:07,599][33226] Updated weights for policy 1, policy_version 51030 (0.0007) [2023-10-14 03:04:07,974][33226] Updated weights for policy 1, policy_version 51040 (0.0008) [2023-10-14 03:04:09,313][33201] Updated weights for policy 0, policy_version 50570 (0.0010) [2023-10-14 03:04:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 104038400. Throughput: 0: 1774.0, 1: 1761.0. Samples: 26023420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:09,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.920')] [2023-10-14 03:04:09,679][33201] Updated weights for policy 0, policy_version 50580 (0.0010) [2023-10-14 03:04:10,049][33201] Updated weights for policy 0, policy_version 50590 (0.0010) [2023-10-14 03:04:11,723][33226] Updated weights for policy 1, policy_version 51050 (0.0008) [2023-10-14 03:04:12,084][33226] Updated weights for policy 1, policy_version 51060 (0.0008) [2023-10-14 03:04:12,439][33226] Updated weights for policy 1, policy_version 51070 (0.0007) [2023-10-14 03:04:13,897][33201] Updated weights for policy 0, policy_version 50600 (0.0009) [2023-10-14 03:04:14,260][33201] Updated weights for policy 0, policy_version 50610 (0.0008) [2023-10-14 03:04:14,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 104103936. Throughput: 0: 1748.7, 1: 1779.0. 
Samples: 26033860. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:14,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.920')] [2023-10-14 03:04:14,634][33201] Updated weights for policy 0, policy_version 50620 (0.0007) [2023-10-14 03:04:16,232][33226] Updated weights for policy 1, policy_version 51080 (0.0008) [2023-10-14 03:04:16,606][33226] Updated weights for policy 1, policy_version 51090 (0.0011) [2023-10-14 03:04:16,971][33226] Updated weights for policy 1, policy_version 51100 (0.0009) [2023-10-14 03:04:18,511][33201] Updated weights for policy 0, policy_version 50630 (0.0007) [2023-10-14 03:04:18,876][33201] Updated weights for policy 0, policy_version 50640 (0.0009) [2023-10-14 03:04:19,249][33201] Updated weights for policy 0, policy_version 50650 (0.0011) [2023-10-14 03:04:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 104202240. Throughput: 0: 1771.1, 1: 1761.4. Samples: 26055184. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:19,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 03:04:20,683][33226] Updated weights for policy 1, policy_version 51110 (0.0008) [2023-10-14 03:04:21,051][33226] Updated weights for policy 1, policy_version 51120 (0.0010) [2023-10-14 03:04:21,423][33226] Updated weights for policy 1, policy_version 51130 (0.0011) [2023-10-14 03:04:23,296][33201] Updated weights for policy 0, policy_version 50660 (0.0011) [2023-10-14 03:04:23,681][33201] Updated weights for policy 0, policy_version 50670 (0.0007) [2023-10-14 03:04:24,046][33201] Updated weights for policy 0, policy_version 50680 (0.0009) [2023-10-14 03:04:24,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 104267776. Throughput: 0: 1741.2, 1: 1774.8. Samples: 26076106. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.910')] [2023-10-14 03:04:25,241][33226] Updated weights for policy 1, policy_version 51140 (0.0010) [2023-10-14 03:04:25,642][33226] Updated weights for policy 1, policy_version 51150 (0.0009) [2023-10-14 03:04:25,997][33226] Updated weights for policy 1, policy_version 51160 (0.0010) [2023-10-14 03:04:27,849][33201] Updated weights for policy 0, policy_version 50690 (0.0010) [2023-10-14 03:04:28,227][33201] Updated weights for policy 0, policy_version 50700 (0.0007) [2023-10-14 03:04:28,597][33201] Updated weights for policy 0, policy_version 50710 (0.0008) [2023-10-14 03:04:28,972][33201] Updated weights for policy 0, policy_version 50720 (0.0010) [2023-10-14 03:04:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 104333312. Throughput: 0: 1765.6, 1: 1769.2. Samples: 26086580. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:29,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.910')] [2023-10-14 03:04:29,828][33226] Updated weights for policy 1, policy_version 51170 (0.0011) [2023-10-14 03:04:30,187][33226] Updated weights for policy 1, policy_version 51180 (0.0010) [2023-10-14 03:04:30,554][33226] Updated weights for policy 1, policy_version 51190 (0.0007) [2023-10-14 03:04:30,919][33226] Updated weights for policy 1, policy_version 51200 (0.0007) [2023-10-14 03:04:32,803][33201] Updated weights for policy 0, policy_version 50730 (0.0007) [2023-10-14 03:04:33,176][33201] Updated weights for policy 0, policy_version 50740 (0.0007) [2023-10-14 03:04:33,551][33201] Updated weights for policy 0, policy_version 50750 (0.0009) [2023-10-14 03:04:34,557][31953] Fps is (10 sec: 13107.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 104398848. Throughput: 0: 1753.1, 1: 1771.7. Samples: 26108126. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:34,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.890')] [2023-10-14 03:04:34,567][33226] Updated weights for policy 1, policy_version 51210 (0.0009) [2023-10-14 03:04:34,934][33226] Updated weights for policy 1, policy_version 51220 (0.0009) [2023-10-14 03:04:35,290][33226] Updated weights for policy 1, policy_version 51230 (0.0009) [2023-10-14 03:04:37,448][33201] Updated weights for policy 0, policy_version 50760 (0.0009) [2023-10-14 03:04:37,823][33201] Updated weights for policy 0, policy_version 50770 (0.0009) [2023-10-14 03:04:38,193][33201] Updated weights for policy 0, policy_version 50780 (0.0009) [2023-10-14 03:04:39,044][33226] Updated weights for policy 1, policy_version 51240 (0.0008) [2023-10-14 03:04:39,412][33226] Updated weights for policy 1, policy_version 51250 (0.0008) [2023-10-14 03:04:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 104464384. Throughput: 0: 1740.2, 1: 1798.8. Samples: 26129338. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:39,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.850')] [2023-10-14 03:04:39,781][33226] Updated weights for policy 1, policy_version 51260 (0.0009) [2023-10-14 03:04:41,991][33201] Updated weights for policy 0, policy_version 50790 (0.0009) [2023-10-14 03:04:42,354][33201] Updated weights for policy 0, policy_version 50800 (0.0009) [2023-10-14 03:04:42,722][33201] Updated weights for policy 0, policy_version 50810 (0.0009) [2023-10-14 03:04:43,607][33226] Updated weights for policy 1, policy_version 51270 (0.0008) [2023-10-14 03:04:43,974][33226] Updated weights for policy 1, policy_version 51280 (0.0008) [2023-10-14 03:04:44,339][33226] Updated weights for policy 1, policy_version 51290 (0.0008) [2023-10-14 03:04:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 104529920. Throughput: 0: 1765.8, 1: 1772.9. 
Samples: 26140172. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:44,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.850')] [2023-10-14 03:04:46,634][33201] Updated weights for policy 0, policy_version 50820 (0.0008) [2023-10-14 03:04:47,001][33201] Updated weights for policy 0, policy_version 50830 (0.0010) [2023-10-14 03:04:47,375][33201] Updated weights for policy 0, policy_version 50840 (0.0010) [2023-10-14 03:04:48,066][33226] Updated weights for policy 1, policy_version 51300 (0.0007) [2023-10-14 03:04:48,434][33226] Updated weights for policy 1, policy_version 51310 (0.0008) [2023-10-14 03:04:48,809][33226] Updated weights for policy 1, policy_version 51320 (0.0008) [2023-10-14 03:04:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 104628224. Throughput: 0: 1737.6, 1: 1797.9. Samples: 26161054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:49,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.850')] [2023-10-14 03:04:51,306][33201] Updated weights for policy 0, policy_version 50850 (0.0010) [2023-10-14 03:04:51,676][33201] Updated weights for policy 0, policy_version 50860 (0.0009) [2023-10-14 03:04:52,041][33201] Updated weights for policy 0, policy_version 50870 (0.0009) [2023-10-14 03:04:52,414][33201] Updated weights for policy 0, policy_version 50880 (0.0009) [2023-10-14 03:04:52,481][33226] Updated weights for policy 1, policy_version 51330 (0.0009) [2023-10-14 03:04:52,842][33226] Updated weights for policy 1, policy_version 51340 (0.0007) [2023-10-14 03:04:53,215][33226] Updated weights for policy 1, policy_version 51350 (0.0009) [2023-10-14 03:04:53,579][33226] Updated weights for policy 1, policy_version 51360 (0.0008) [2023-10-14 03:04:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 104693760. Throughput: 0: 1739.8, 1: 1781.6. Samples: 26181886. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:54,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.870')] [2023-10-14 03:04:56,085][33201] Updated weights for policy 0, policy_version 50890 (0.0008) [2023-10-14 03:04:56,454][33201] Updated weights for policy 0, policy_version 50900 (0.0007) [2023-10-14 03:04:56,824][33201] Updated weights for policy 0, policy_version 50910 (0.0007) [2023-10-14 03:04:57,350][33226] Updated weights for policy 1, policy_version 51370 (0.0008) [2023-10-14 03:04:57,709][33226] Updated weights for policy 1, policy_version 51380 (0.0008) [2023-10-14 03:04:58,087][33226] Updated weights for policy 1, policy_version 51390 (0.0008) [2023-10-14 03:04:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 104759296. Throughput: 0: 1741.9, 1: 1798.0. Samples: 26193156. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:04:59,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.870')] [2023-10-14 03:05:00,598][33201] Updated weights for policy 0, policy_version 50920 (0.0009) [2023-10-14 03:05:00,969][33201] Updated weights for policy 0, policy_version 50930 (0.0007) [2023-10-14 03:05:01,341][33201] Updated weights for policy 0, policy_version 50940 (0.0008) [2023-10-14 03:05:01,927][33226] Updated weights for policy 1, policy_version 51400 (0.0008) [2023-10-14 03:05:02,292][33226] Updated weights for policy 1, policy_version 51410 (0.0008) [2023-10-14 03:05:02,668][33226] Updated weights for policy 1, policy_version 51420 (0.0009) [2023-10-14 03:05:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 104824832. Throughput: 0: 1748.9, 1: 1778.0. Samples: 26213892. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:05:04,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.890')] [2023-10-14 03:05:05,154][33201] Updated weights for policy 0, policy_version 50950 (0.0008) [2023-10-14 03:05:05,532][33201] Updated weights for policy 0, policy_version 50960 (0.0008) [2023-10-14 03:05:05,910][33201] Updated weights for policy 0, policy_version 50970 (0.0008) [2023-10-14 03:05:06,438][33226] Updated weights for policy 1, policy_version 51430 (0.0008) [2023-10-14 03:05:06,800][33226] Updated weights for policy 1, policy_version 51440 (0.0008) [2023-10-14 03:05:07,174][33226] Updated weights for policy 1, policy_version 51450 (0.0009) [2023-10-14 03:05:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 104890368. Throughput: 0: 1780.4, 1: 1772.3. Samples: 26235976. Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:09,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.870')] [2023-10-14 03:05:09,709][33201] Updated weights for policy 0, policy_version 50980 (0.0009) [2023-10-14 03:05:10,083][33201] Updated weights for policy 0, policy_version 50990 (0.0009) [2023-10-14 03:05:10,461][33201] Updated weights for policy 0, policy_version 51000 (0.0009) [2023-10-14 03:05:10,997][33226] Updated weights for policy 1, policy_version 51460 (0.0009) [2023-10-14 03:05:11,376][33226] Updated weights for policy 1, policy_version 51470 (0.0007) [2023-10-14 03:05:11,742][33226] Updated weights for policy 1, policy_version 51480 (0.0008) [2023-10-14 03:05:14,132][33201] Updated weights for policy 0, policy_version 51010 (0.0008) [2023-10-14 03:05:14,513][33201] Updated weights for policy 0, policy_version 51020 (0.0009) [2023-10-14 03:05:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 104955904. Throughput: 0: 1756.5, 1: 1776.6. Samples: 26245570. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:14,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.870')] [2023-10-14 03:05:14,884][33201] Updated weights for policy 0, policy_version 51030 (0.0010) [2023-10-14 03:05:15,249][33201] Updated weights for policy 0, policy_version 51040 (0.0009) [2023-10-14 03:05:15,548][33226] Updated weights for policy 1, policy_version 51490 (0.0007) [2023-10-14 03:05:15,922][33226] Updated weights for policy 1, policy_version 51500 (0.0008) [2023-10-14 03:05:16,284][33226] Updated weights for policy 1, policy_version 51510 (0.0008) [2023-10-14 03:05:16,647][33226] Updated weights for policy 1, policy_version 51520 (0.0008) [2023-10-14 03:05:18,976][33201] Updated weights for policy 0, policy_version 51050 (0.0008) [2023-10-14 03:05:19,348][33201] Updated weights for policy 0, policy_version 51060 (0.0007) [2023-10-14 03:05:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 105021440. Throughput: 0: 1777.6, 1: 1770.5. Samples: 26267792. Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:19,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.870')] [2023-10-14 03:05:19,722][33201] Updated weights for policy 0, policy_version 51070 (0.0008) [2023-10-14 03:05:20,525][33226] Updated weights for policy 1, policy_version 51530 (0.0007) [2023-10-14 03:05:20,888][33226] Updated weights for policy 1, policy_version 51540 (0.0010) [2023-10-14 03:05:21,264][33226] Updated weights for policy 1, policy_version 51550 (0.0008) [2023-10-14 03:05:23,507][33201] Updated weights for policy 0, policy_version 51080 (0.0008) [2023-10-14 03:05:23,878][33201] Updated weights for policy 0, policy_version 51090 (0.0009) [2023-10-14 03:05:24,242][33201] Updated weights for policy 0, policy_version 51100 (0.0010) [2023-10-14 03:05:24,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 105119744. Throughput: 0: 1767.9, 1: 1777.2. 
Samples: 26288864. Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 03:05:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000051104_52330496.pth... [2023-10-14 03:05:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000049440_50626560.pth [2023-10-14 03:05:24,918][33226] Updated weights for policy 1, policy_version 51560 (0.0008) [2023-10-14 03:05:25,291][33226] Updated weights for policy 1, policy_version 51570 (0.0008) [2023-10-14 03:05:25,660][33226] Updated weights for policy 1, policy_version 51580 (0.0008) [2023-10-14 03:05:25,806][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000051584_52822016.pth... [2023-10-14 03:05:25,846][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000049888_51085312.pth [2023-10-14 03:05:28,261][33201] Updated weights for policy 0, policy_version 51110 (0.0008) [2023-10-14 03:05:28,633][33201] Updated weights for policy 0, policy_version 51120 (0.0009) [2023-10-14 03:05:29,000][33201] Updated weights for policy 0, policy_version 51130 (0.0010) [2023-10-14 03:05:29,480][33226] Updated weights for policy 1, policy_version 51590 (0.0007) [2023-10-14 03:05:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 105185280. Throughput: 0: 1766.8, 1: 1772.0. Samples: 26299422. 
Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 03:05:29,841][33226] Updated weights for policy 1, policy_version 51600 (0.0010) [2023-10-14 03:05:30,212][33226] Updated weights for policy 1, policy_version 51610 (0.0008) [2023-10-14 03:05:32,979][33201] Updated weights for policy 0, policy_version 51140 (0.0009) [2023-10-14 03:05:33,352][33201] Updated weights for policy 0, policy_version 51150 (0.0010) [2023-10-14 03:05:33,727][33201] Updated weights for policy 0, policy_version 51160 (0.0009) [2023-10-14 03:05:33,996][33226] Updated weights for policy 1, policy_version 51620 (0.0008) [2023-10-14 03:05:34,362][33226] Updated weights for policy 1, policy_version 51630 (0.0009) [2023-10-14 03:05:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 105250816. Throughput: 0: 1783.8, 1: 1775.0. Samples: 26321200. Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:34,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 03:05:34,716][33226] Updated weights for policy 1, policy_version 51640 (0.0009) [2023-10-14 03:05:37,522][33201] Updated weights for policy 0, policy_version 51170 (0.0008) [2023-10-14 03:05:37,894][33201] Updated weights for policy 0, policy_version 51180 (0.0008) [2023-10-14 03:05:38,267][33201] Updated weights for policy 0, policy_version 51190 (0.0009) [2023-10-14 03:05:38,595][33226] Updated weights for policy 1, policy_version 51650 (0.0008) [2023-10-14 03:05:38,642][33201] Updated weights for policy 0, policy_version 51200 (0.0010) [2023-10-14 03:05:38,962][33226] Updated weights for policy 1, policy_version 51660 (0.0007) [2023-10-14 03:05:39,332][33226] Updated weights for policy 1, policy_version 51670 (0.0008) [2023-10-14 03:05:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 105316352. Throughput: 0: 1758.7, 1: 1792.5. 
Samples: 26341690. Policy #0 lag: (min: 31.0, avg: 31.7, max: 47.0) [2023-10-14 03:05:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 03:05:39,694][33226] Updated weights for policy 1, policy_version 51680 (0.0011) [2023-10-14 03:05:42,423][33201] Updated weights for policy 0, policy_version 51210 (0.0011) [2023-10-14 03:05:42,790][33201] Updated weights for policy 0, policy_version 51220 (0.0009) [2023-10-14 03:05:43,161][33201] Updated weights for policy 0, policy_version 51230 (0.0007) [2023-10-14 03:05:43,522][33226] Updated weights for policy 1, policy_version 51690 (0.0007) [2023-10-14 03:05:43,896][33226] Updated weights for policy 1, policy_version 51700 (0.0007) [2023-10-14 03:05:44,264][33226] Updated weights for policy 1, policy_version 51710 (0.0008) [2023-10-14 03:05:44,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 105414656. Throughput: 0: 1783.2, 1: 1768.9. Samples: 26353000. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:05:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 03:05:47,026][33201] Updated weights for policy 0, policy_version 51240 (0.0011) [2023-10-14 03:05:47,395][33201] Updated weights for policy 0, policy_version 51250 (0.0008) [2023-10-14 03:05:47,767][33201] Updated weights for policy 0, policy_version 51260 (0.0008) [2023-10-14 03:05:48,054][33226] Updated weights for policy 1, policy_version 51720 (0.0008) [2023-10-14 03:05:48,417][33226] Updated weights for policy 1, policy_version 51730 (0.0010) [2023-10-14 03:05:48,788][33226] Updated weights for policy 1, policy_version 51740 (0.0009) [2023-10-14 03:05:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 105480192. Throughput: 0: 1746.1, 1: 1797.1. Samples: 26373340. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:05:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.830')] [2023-10-14 03:05:51,500][33201] Updated weights for policy 0, policy_version 51270 (0.0009) [2023-10-14 03:05:51,874][33201] Updated weights for policy 0, policy_version 51280 (0.0010) [2023-10-14 03:05:52,247][33201] Updated weights for policy 0, policy_version 51290 (0.0010) [2023-10-14 03:05:52,459][33226] Updated weights for policy 1, policy_version 51750 (0.0007) [2023-10-14 03:05:52,824][33226] Updated weights for policy 1, policy_version 51760 (0.0007) [2023-10-14 03:05:53,197][33226] Updated weights for policy 1, policy_version 51770 (0.0008) [2023-10-14 03:05:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 105545728. Throughput: 0: 1752.6, 1: 1777.9. Samples: 26394848. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:05:54,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.870')] [2023-10-14 03:05:56,027][33201] Updated weights for policy 0, policy_version 51300 (0.0010) [2023-10-14 03:05:56,421][33201] Updated weights for policy 0, policy_version 51310 (0.0008) [2023-10-14 03:05:56,787][33201] Updated weights for policy 0, policy_version 51320 (0.0007) [2023-10-14 03:05:57,156][33226] Updated weights for policy 1, policy_version 51780 (0.0007) [2023-10-14 03:05:57,561][33226] Updated weights for policy 1, policy_version 51790 (0.0008) [2023-10-14 03:05:57,929][33226] Updated weights for policy 1, policy_version 51800 (0.0009) [2023-10-14 03:05:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 105611264. Throughput: 0: 1751.8, 1: 1804.5. Samples: 26405606. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:05:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 03:06:00,509][33201] Updated weights for policy 0, policy_version 51330 (0.0008) [2023-10-14 03:06:00,885][33201] Updated weights for policy 0, policy_version 51340 (0.0009) [2023-10-14 03:06:01,261][33201] Updated weights for policy 0, policy_version 51350 (0.0010) [2023-10-14 03:06:01,624][33201] Updated weights for policy 0, policy_version 51360 (0.0008) [2023-10-14 03:06:01,831][33226] Updated weights for policy 1, policy_version 51810 (0.0008) [2023-10-14 03:06:02,197][33226] Updated weights for policy 1, policy_version 51820 (0.0010) [2023-10-14 03:06:02,564][33226] Updated weights for policy 1, policy_version 51830 (0.0008) [2023-10-14 03:06:02,925][33226] Updated weights for policy 1, policy_version 51840 (0.0008) [2023-10-14 03:06:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 105676800. Throughput: 0: 1751.9, 1: 1772.6. Samples: 26426392. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:06:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 03:06:05,358][33201] Updated weights for policy 0, policy_version 51370 (0.0007) [2023-10-14 03:06:05,727][33201] Updated weights for policy 0, policy_version 51380 (0.0008) [2023-10-14 03:06:06,101][33201] Updated weights for policy 0, policy_version 51390 (0.0009) [2023-10-14 03:06:06,712][33226] Updated weights for policy 1, policy_version 51850 (0.0010) [2023-10-14 03:06:07,084][33226] Updated weights for policy 1, policy_version 51860 (0.0008) [2023-10-14 03:06:07,447][33226] Updated weights for policy 1, policy_version 51870 (0.0007) [2023-10-14 03:06:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 105742336. Throughput: 0: 1781.7, 1: 1773.0. Samples: 26448824. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:06:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.860')] [2023-10-14 03:06:09,887][33201] Updated weights for policy 0, policy_version 51400 (0.0007) [2023-10-14 03:06:10,263][33201] Updated weights for policy 0, policy_version 51410 (0.0008) [2023-10-14 03:06:10,633][33201] Updated weights for policy 0, policy_version 51420 (0.0009) [2023-10-14 03:06:11,139][33226] Updated weights for policy 1, policy_version 51880 (0.0010) [2023-10-14 03:06:11,504][33226] Updated weights for policy 1, policy_version 51890 (0.0010) [2023-10-14 03:06:11,877][33226] Updated weights for policy 1, policy_version 51900 (0.0010) [2023-10-14 03:06:14,466][33201] Updated weights for policy 0, policy_version 51430 (0.0009) [2023-10-14 03:06:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 105807872. Throughput: 0: 1759.3, 1: 1779.4. Samples: 26458664. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:06:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.860')] [2023-10-14 03:06:14,847][33201] Updated weights for policy 0, policy_version 51440 (0.0007) [2023-10-14 03:06:15,219][33201] Updated weights for policy 0, policy_version 51450 (0.0009) [2023-10-14 03:06:15,683][33226] Updated weights for policy 1, policy_version 51910 (0.0007) [2023-10-14 03:06:16,055][33226] Updated weights for policy 1, policy_version 51920 (0.0008) [2023-10-14 03:06:16,430][33226] Updated weights for policy 1, policy_version 51930 (0.0009) [2023-10-14 03:06:18,992][33201] Updated weights for policy 0, policy_version 51460 (0.0010) [2023-10-14 03:06:19,368][33201] Updated weights for policy 0, policy_version 51470 (0.0008) [2023-10-14 03:06:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 105873408. Throughput: 0: 1772.2, 1: 1771.6. Samples: 26480674. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:06:19,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 03:06:19,730][33201] Updated weights for policy 0, policy_version 51480 (0.0009) [2023-10-14 03:06:20,047][33226] Updated weights for policy 1, policy_version 51940 (0.0009) [2023-10-14 03:06:20,412][33226] Updated weights for policy 1, policy_version 51950 (0.0008) [2023-10-14 03:06:20,781][33226] Updated weights for policy 1, policy_version 51960 (0.0009) [2023-10-14 03:06:23,598][33201] Updated weights for policy 0, policy_version 51490 (0.0010) [2023-10-14 03:06:23,960][33201] Updated weights for policy 0, policy_version 51500 (0.0010) [2023-10-14 03:06:24,337][33201] Updated weights for policy 0, policy_version 51510 (0.0009) [2023-10-14 03:06:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 105938944. Throughput: 0: 1788.8, 1: 1783.4. Samples: 26502442. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 03:06:24,655][33226] Updated weights for policy 1, policy_version 51970 (0.0009) [2023-10-14 03:06:24,714][33201] Updated weights for policy 0, policy_version 51520 (0.0008) [2023-10-14 03:06:25,023][33226] Updated weights for policy 1, policy_version 51980 (0.0010) [2023-10-14 03:06:25,386][33226] Updated weights for policy 1, policy_version 51990 (0.0008) [2023-10-14 03:06:25,747][33226] Updated weights for policy 1, policy_version 52000 (0.0009) [2023-10-14 03:06:28,517][33201] Updated weights for policy 0, policy_version 51530 (0.0008) [2023-10-14 03:06:28,883][33201] Updated weights for policy 0, policy_version 51540 (0.0009) [2023-10-14 03:06:29,257][33201] Updated weights for policy 0, policy_version 51550 (0.0007) [2023-10-14 03:06:29,445][33226] Updated weights for policy 1, policy_version 52010 (0.0009) [2023-10-14 03:06:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 
14199.5, 300 sec: 14218.0). Total num frames: 106037248. Throughput: 0: 1774.4, 1: 1775.1. Samples: 26512728. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.920')] [2023-10-14 03:06:29,811][33226] Updated weights for policy 1, policy_version 52020 (0.0009) [2023-10-14 03:06:30,180][33226] Updated weights for policy 1, policy_version 52030 (0.0009) [2023-10-14 03:06:33,093][33201] Updated weights for policy 0, policy_version 51560 (0.0008) [2023-10-14 03:06:33,461][33201] Updated weights for policy 0, policy_version 51570 (0.0008) [2023-10-14 03:06:33,813][33226] Updated weights for policy 1, policy_version 52040 (0.0008) [2023-10-14 03:06:33,831][33201] Updated weights for policy 0, policy_version 51580 (0.0007) [2023-10-14 03:06:34,183][33226] Updated weights for policy 1, policy_version 52050 (0.0009) [2023-10-14 03:06:34,553][33226] Updated weights for policy 1, policy_version 52060 (0.0010) [2023-10-14 03:06:34,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 106102784. Throughput: 0: 1795.6, 1: 1787.9. Samples: 26534596. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:34,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.920')] [2023-10-14 03:06:37,704][33201] Updated weights for policy 0, policy_version 51590 (0.0008) [2023-10-14 03:06:38,091][33201] Updated weights for policy 0, policy_version 51600 (0.0008) [2023-10-14 03:06:38,381][33226] Updated weights for policy 1, policy_version 52070 (0.0009) [2023-10-14 03:06:38,460][33201] Updated weights for policy 0, policy_version 51610 (0.0008) [2023-10-14 03:06:38,746][33226] Updated weights for policy 1, policy_version 52080 (0.0009) [2023-10-14 03:06:39,118][33226] Updated weights for policy 1, policy_version 52090 (0.0011) [2023-10-14 03:06:39,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.7, 300 sec: 14329.1). Total num frames: 106201088. 
Throughput: 0: 1767.9, 1: 1788.9. Samples: 26554902. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.920')] [2023-10-14 03:06:42,420][33201] Updated weights for policy 0, policy_version 51620 (0.0008) [2023-10-14 03:06:42,806][33201] Updated weights for policy 0, policy_version 51630 (0.0009) [2023-10-14 03:06:42,900][33226] Updated weights for policy 1, policy_version 52100 (0.0008) [2023-10-14 03:06:43,181][33201] Updated weights for policy 0, policy_version 51640 (0.0008) [2023-10-14 03:06:43,271][33226] Updated weights for policy 1, policy_version 52110 (0.0008) [2023-10-14 03:06:43,635][33226] Updated weights for policy 1, policy_version 52120 (0.0008) [2023-10-14 03:06:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 106266624. Throughput: 0: 1799.6, 1: 1783.6. Samples: 26566854. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.920')] [2023-10-14 03:06:46,992][33201] Updated weights for policy 0, policy_version 51650 (0.0008) [2023-10-14 03:06:47,356][33201] Updated weights for policy 0, policy_version 51660 (0.0007) [2023-10-14 03:06:47,446][33226] Updated weights for policy 1, policy_version 52130 (0.0009) [2023-10-14 03:06:47,728][33201] Updated weights for policy 0, policy_version 51670 (0.0007) [2023-10-14 03:06:47,822][33226] Updated weights for policy 1, policy_version 52140 (0.0010) [2023-10-14 03:06:48,097][33201] Updated weights for policy 0, policy_version 51680 (0.0009) [2023-10-14 03:06:48,189][33226] Updated weights for policy 1, policy_version 52150 (0.0008) [2023-10-14 03:06:48,554][33226] Updated weights for policy 1, policy_version 52160 (0.0009) [2023-10-14 03:06:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 106332160. Throughput: 0: 1762.7, 1: 1798.3. Samples: 26586636. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:06:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.900')] [2023-10-14 03:06:51,989][33201] Updated weights for policy 0, policy_version 51690 (0.0010) [2023-10-14 03:06:52,363][33201] Updated weights for policy 0, policy_version 51700 (0.0008) [2023-10-14 03:06:52,654][33226] Updated weights for policy 1, policy_version 52170 (0.0009) [2023-10-14 03:06:52,726][33201] Updated weights for policy 0, policy_version 51710 (0.0007) [2023-10-14 03:06:53,034][33226] Updated weights for policy 1, policy_version 52180 (0.0010) [2023-10-14 03:06:53,398][33226] Updated weights for policy 1, policy_version 52190 (0.0009) [2023-10-14 03:06:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 106397696. Throughput: 0: 1756.4, 1: 1773.1. Samples: 26607650. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:06:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:06:56,452][33201] Updated weights for policy 0, policy_version 51720 (0.0009) [2023-10-14 03:06:56,818][33201] Updated weights for policy 0, policy_version 51730 (0.0008) [2023-10-14 03:06:57,193][33201] Updated weights for policy 0, policy_version 51740 (0.0007) [2023-10-14 03:06:57,199][33226] Updated weights for policy 1, policy_version 52200 (0.0009) [2023-10-14 03:06:57,560][33226] Updated weights for policy 1, policy_version 52210 (0.0010) [2023-10-14 03:06:57,923][33226] Updated weights for policy 1, policy_version 52220 (0.0011) [2023-10-14 03:06:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 106463232. Throughput: 0: 1762.5, 1: 1798.2. Samples: 26618896. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:06:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:01,083][33201] Updated weights for policy 0, policy_version 51750 (0.0008) [2023-10-14 03:07:01,446][33201] Updated weights for policy 0, policy_version 51760 (0.0010) [2023-10-14 03:07:01,679][33226] Updated weights for policy 1, policy_version 52230 (0.0008) [2023-10-14 03:07:01,824][33201] Updated weights for policy 0, policy_version 51770 (0.0009) [2023-10-14 03:07:02,048][33226] Updated weights for policy 1, policy_version 52240 (0.0008) [2023-10-14 03:07:02,405][33226] Updated weights for policy 1, policy_version 52250 (0.0007) [2023-10-14 03:07:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 106528768. Throughput: 0: 1746.9, 1: 1774.1. Samples: 26639118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:05,622][33201] Updated weights for policy 0, policy_version 51780 (0.0008) [2023-10-14 03:07:06,001][33201] Updated weights for policy 0, policy_version 51790 (0.0007) [2023-10-14 03:07:06,240][33226] Updated weights for policy 1, policy_version 52260 (0.0009) [2023-10-14 03:07:06,363][33201] Updated weights for policy 0, policy_version 51800 (0.0008) [2023-10-14 03:07:06,609][33226] Updated weights for policy 1, policy_version 52270 (0.0009) [2023-10-14 03:07:06,973][33226] Updated weights for policy 1, policy_version 52280 (0.0009) [2023-10-14 03:07:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 106594304. Throughput: 0: 1762.2, 1: 1767.8. Samples: 26661292. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.910')] [2023-10-14 03:07:10,050][33201] Updated weights for policy 0, policy_version 51810 (0.0008) [2023-10-14 03:07:10,422][33201] Updated weights for policy 0, policy_version 51820 (0.0008) [2023-10-14 03:07:10,791][33201] Updated weights for policy 0, policy_version 51830 (0.0009) [2023-10-14 03:07:10,979][33226] Updated weights for policy 1, policy_version 52290 (0.0010) [2023-10-14 03:07:11,161][33201] Updated weights for policy 0, policy_version 51840 (0.0009) [2023-10-14 03:07:11,342][33226] Updated weights for policy 1, policy_version 52300 (0.0007) [2023-10-14 03:07:11,706][33226] Updated weights for policy 1, policy_version 52310 (0.0007) [2023-10-14 03:07:12,076][33226] Updated weights for policy 1, policy_version 52320 (0.0007) [2023-10-14 03:07:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 106659840. Throughput: 0: 1745.3, 1: 1769.6. Samples: 26670902. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:15,107][33201] Updated weights for policy 0, policy_version 51850 (0.0009) [2023-10-14 03:07:15,466][33201] Updated weights for policy 0, policy_version 51860 (0.0009) [2023-10-14 03:07:15,846][33201] Updated weights for policy 0, policy_version 51870 (0.0009) [2023-10-14 03:07:15,961][33226] Updated weights for policy 1, policy_version 52330 (0.0009) [2023-10-14 03:07:16,328][33226] Updated weights for policy 1, policy_version 52340 (0.0009) [2023-10-14 03:07:16,691][33226] Updated weights for policy 1, policy_version 52350 (0.0008) [2023-10-14 03:07:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 106725376. Throughput: 0: 1753.0, 1: 1755.1. Samples: 26692462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:19,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:19,787][33201] Updated weights for policy 0, policy_version 51880 (0.0007) [2023-10-14 03:07:20,166][33201] Updated weights for policy 0, policy_version 51890 (0.0008) [2023-10-14 03:07:20,527][33201] Updated weights for policy 0, policy_version 51900 (0.0007) [2023-10-14 03:07:20,553][33226] Updated weights for policy 1, policy_version 52360 (0.0008) [2023-10-14 03:07:20,920][33226] Updated weights for policy 1, policy_version 52370 (0.0008) [2023-10-14 03:07:21,291][33226] Updated weights for policy 1, policy_version 52380 (0.0009) [2023-10-14 03:07:24,322][33201] Updated weights for policy 0, policy_version 51910 (0.0008) [2023-10-14 03:07:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 106790912. Throughput: 0: 1773.4, 1: 1774.7. Samples: 26714564. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000052384_53641216.pth... [2023-10-14 03:07:24,602][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000050720_51937280.pth [2023-10-14 03:07:24,690][33201] Updated weights for policy 0, policy_version 51920 (0.0008) [2023-10-14 03:07:25,038][33226] Updated weights for policy 1, policy_version 52390 (0.0008) [2023-10-14 03:07:25,057][33201] Updated weights for policy 0, policy_version 51930 (0.0007) [2023-10-14 03:07:25,276][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000051936_53182464.pth... 
[2023-10-14 03:07:25,308][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000050272_51478528.pth [2023-10-14 03:07:25,409][33226] Updated weights for policy 1, policy_version 52400 (0.0009) [2023-10-14 03:07:25,772][33226] Updated weights for policy 1, policy_version 52410 (0.0008) [2023-10-14 03:07:29,120][33201] Updated weights for policy 0, policy_version 51940 (0.0008) [2023-10-14 03:07:29,525][33201] Updated weights for policy 0, policy_version 51950 (0.0008) [2023-10-14 03:07:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 106856448. Throughput: 0: 1743.5, 1: 1750.8. Samples: 26724098. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:07:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:07:29,650][33226] Updated weights for policy 1, policy_version 52420 (0.0008) [2023-10-14 03:07:29,893][33201] Updated weights for policy 0, policy_version 51960 (0.0008) [2023-10-14 03:07:30,024][33226] Updated weights for policy 1, policy_version 52430 (0.0009) [2023-10-14 03:07:30,384][33226] Updated weights for policy 1, policy_version 52440 (0.0008) [2023-10-14 03:07:33,782][33201] Updated weights for policy 0, policy_version 51970 (0.0007) [2023-10-14 03:07:34,107][33226] Updated weights for policy 1, policy_version 52450 (0.0008) [2023-10-14 03:07:34,159][33201] Updated weights for policy 0, policy_version 51980 (0.0010) [2023-10-14 03:07:34,481][33226] Updated weights for policy 1, policy_version 52460 (0.0008) [2023-10-14 03:07:34,530][33201] Updated weights for policy 0, policy_version 51990 (0.0008) [2023-10-14 03:07:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 106921984. Throughput: 0: 1766.0, 1: 1772.3. Samples: 26745858. 
Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 03:07:34,851][33226] Updated weights for policy 1, policy_version 52470 (0.0008) [2023-10-14 03:07:34,900][33201] Updated weights for policy 0, policy_version 52000 (0.0008) [2023-10-14 03:07:35,228][33226] Updated weights for policy 1, policy_version 52480 (0.0010) [2023-10-14 03:07:38,734][33201] Updated weights for policy 0, policy_version 52010 (0.0008) [2023-10-14 03:07:38,943][33226] Updated weights for policy 1, policy_version 52490 (0.0008) [2023-10-14 03:07:39,095][33201] Updated weights for policy 0, policy_version 52020 (0.0007) [2023-10-14 03:07:39,310][33226] Updated weights for policy 1, policy_version 52500 (0.0010) [2023-10-14 03:07:39,465][33201] Updated weights for policy 0, policy_version 52030 (0.0007) [2023-10-14 03:07:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 107020288. Throughput: 0: 1747.7, 1: 1785.3. Samples: 26766634. Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:39,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.930')] [2023-10-14 03:07:39,674][33226] Updated weights for policy 1, policy_version 52510 (0.0009) [2023-10-14 03:07:43,338][33201] Updated weights for policy 0, policy_version 52040 (0.0007) [2023-10-14 03:07:43,461][33226] Updated weights for policy 1, policy_version 52520 (0.0008) [2023-10-14 03:07:43,712][33201] Updated weights for policy 0, policy_version 52050 (0.0007) [2023-10-14 03:07:43,832][33226] Updated weights for policy 1, policy_version 52530 (0.0008) [2023-10-14 03:07:44,083][33201] Updated weights for policy 0, policy_version 52060 (0.0008) [2023-10-14 03:07:44,197][33226] Updated weights for policy 1, policy_version 52540 (0.0007) [2023-10-14 03:07:44,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 107118592. Throughput: 0: 1758.4, 1: 1764.8. 
Samples: 26777442. Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.930')] [2023-10-14 03:07:47,851][33201] Updated weights for policy 0, policy_version 52070 (0.0008) [2023-10-14 03:07:47,994][33226] Updated weights for policy 1, policy_version 52550 (0.0008) [2023-10-14 03:07:48,220][33201] Updated weights for policy 0, policy_version 52080 (0.0008) [2023-10-14 03:07:48,359][33226] Updated weights for policy 1, policy_version 52560 (0.0008) [2023-10-14 03:07:48,588][33201] Updated weights for policy 0, policy_version 52090 (0.0009) [2023-10-14 03:07:48,721][33226] Updated weights for policy 1, policy_version 52570 (0.0008) [2023-10-14 03:07:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 107184128. Throughput: 0: 1758.5, 1: 1786.3. Samples: 26798634. Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 03:07:52,329][33201] Updated weights for policy 0, policy_version 52100 (0.0008) [2023-10-14 03:07:52,538][33226] Updated weights for policy 1, policy_version 52580 (0.0009) [2023-10-14 03:07:52,712][33201] Updated weights for policy 0, policy_version 52110 (0.0007) [2023-10-14 03:07:52,895][33226] Updated weights for policy 1, policy_version 52590 (0.0008) [2023-10-14 03:07:53,072][33201] Updated weights for policy 0, policy_version 52120 (0.0007) [2023-10-14 03:07:53,270][33226] Updated weights for policy 1, policy_version 52600 (0.0008) [2023-10-14 03:07:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 107249664. Throughput: 0: 1734.1, 1: 1762.0. Samples: 26818616. 
Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:07:56,989][33201] Updated weights for policy 0, policy_version 52130 (0.0007) [2023-10-14 03:07:57,151][33226] Updated weights for policy 1, policy_version 52610 (0.0010) [2023-10-14 03:07:57,359][33201] Updated weights for policy 0, policy_version 52140 (0.0007) [2023-10-14 03:07:57,517][33226] Updated weights for policy 1, policy_version 52620 (0.0008) [2023-10-14 03:07:57,720][33201] Updated weights for policy 0, policy_version 52150 (0.0008) [2023-10-14 03:07:57,884][33226] Updated weights for policy 1, policy_version 52630 (0.0007) [2023-10-14 03:07:58,091][33201] Updated weights for policy 0, policy_version 52160 (0.0008) [2023-10-14 03:07:58,248][33226] Updated weights for policy 1, policy_version 52640 (0.0008) [2023-10-14 03:07:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 107315200. Throughput: 0: 1762.2, 1: 1792.9. Samples: 26830882. Policy #0 lag: (min: 13.0, avg: 36.5, max: 40.0) [2023-10-14 03:07:59,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.950')] [2023-10-14 03:08:01,937][33226] Updated weights for policy 1, policy_version 52650 (0.0007) [2023-10-14 03:08:02,034][33201] Updated weights for policy 0, policy_version 52170 (0.0008) [2023-10-14 03:08:02,307][33226] Updated weights for policy 1, policy_version 52660 (0.0007) [2023-10-14 03:08:02,398][33201] Updated weights for policy 0, policy_version 52180 (0.0007) [2023-10-14 03:08:02,671][33226] Updated weights for policy 1, policy_version 52670 (0.0007) [2023-10-14 03:08:02,773][33201] Updated weights for policy 0, policy_version 52190 (0.0008) [2023-10-14 03:08:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 107380736. Throughput: 0: 1733.5, 1: 1770.0. Samples: 26850120. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:04,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.950')] [2023-10-14 03:08:06,434][33226] Updated weights for policy 1, policy_version 52680 (0.0007) [2023-10-14 03:08:06,694][33201] Updated weights for policy 0, policy_version 52200 (0.0008) [2023-10-14 03:08:06,797][33226] Updated weights for policy 1, policy_version 52690 (0.0009) [2023-10-14 03:08:07,066][33201] Updated weights for policy 0, policy_version 52210 (0.0008) [2023-10-14 03:08:07,165][33226] Updated weights for policy 1, policy_version 52700 (0.0009) [2023-10-14 03:08:07,441][33201] Updated weights for policy 0, policy_version 52220 (0.0007) [2023-10-14 03:08:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 107446272. Throughput: 0: 1733.1, 1: 1771.6. Samples: 26872272. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:09,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.900')] [2023-10-14 03:08:11,019][33226] Updated weights for policy 1, policy_version 52710 (0.0009) [2023-10-14 03:08:11,367][33201] Updated weights for policy 0, policy_version 52230 (0.0009) [2023-10-14 03:08:11,388][33226] Updated weights for policy 1, policy_version 52720 (0.0008) [2023-10-14 03:08:11,730][33201] Updated weights for policy 0, policy_version 52240 (0.0007) [2023-10-14 03:08:11,743][33226] Updated weights for policy 1, policy_version 52730 (0.0008) [2023-10-14 03:08:12,105][33201] Updated weights for policy 0, policy_version 52250 (0.0009) [2023-10-14 03:08:14,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 107511808. Throughput: 0: 1740.8, 1: 1774.0. Samples: 26882266. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:14,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.900')] [2023-10-14 03:08:15,586][33226] Updated weights for policy 1, policy_version 52740 (0.0008) [2023-10-14 03:08:15,645][33201] Updated weights for policy 0, policy_version 52260 (0.0008) [2023-10-14 03:08:15,984][33226] Updated weights for policy 1, policy_version 52750 (0.0007) [2023-10-14 03:08:16,040][33201] Updated weights for policy 0, policy_version 52270 (0.0007) [2023-10-14 03:08:16,340][33226] Updated weights for policy 1, policy_version 52760 (0.0008) [2023-10-14 03:08:16,405][33201] Updated weights for policy 0, policy_version 52280 (0.0007) [2023-10-14 03:08:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 107577344. Throughput: 0: 1741.6, 1: 1762.4. Samples: 26903540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:19,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.880')] [2023-10-14 03:08:20,267][33226] Updated weights for policy 1, policy_version 52770 (0.0008) [2023-10-14 03:08:20,416][33201] Updated weights for policy 0, policy_version 52290 (0.0010) [2023-10-14 03:08:20,633][33226] Updated weights for policy 1, policy_version 52780 (0.0007) [2023-10-14 03:08:20,780][33201] Updated weights for policy 0, policy_version 52300 (0.0007) [2023-10-14 03:08:20,986][33226] Updated weights for policy 1, policy_version 52790 (0.0007) [2023-10-14 03:08:21,156][33201] Updated weights for policy 0, policy_version 52310 (0.0008) [2023-10-14 03:08:21,358][33226] Updated weights for policy 1, policy_version 52800 (0.0009) [2023-10-14 03:08:21,521][33201] Updated weights for policy 0, policy_version 52320 (0.0009) [2023-10-14 03:08:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 107642880. Throughput: 0: 1759.1, 1: 1778.1. Samples: 26925810. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:24,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.870')] [2023-10-14 03:08:25,098][33226] Updated weights for policy 1, policy_version 52810 (0.0008) [2023-10-14 03:08:25,350][33201] Updated weights for policy 0, policy_version 52330 (0.0008) [2023-10-14 03:08:25,465][33226] Updated weights for policy 1, policy_version 52820 (0.0008) [2023-10-14 03:08:25,722][33201] Updated weights for policy 0, policy_version 52340 (0.0008) [2023-10-14 03:08:25,833][33226] Updated weights for policy 1, policy_version 52830 (0.0007) [2023-10-14 03:08:26,085][33201] Updated weights for policy 0, policy_version 52350 (0.0009) [2023-10-14 03:08:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 107708416. Throughput: 0: 1742.0, 1: 1764.0. Samples: 26935210. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:29,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.860')] [2023-10-14 03:08:29,720][33226] Updated weights for policy 1, policy_version 52840 (0.0008) [2023-10-14 03:08:29,922][33201] Updated weights for policy 0, policy_version 52360 (0.0008) [2023-10-14 03:08:30,081][33226] Updated weights for policy 1, policy_version 52850 (0.0008) [2023-10-14 03:08:30,297][33201] Updated weights for policy 0, policy_version 52370 (0.0008) [2023-10-14 03:08:30,444][33226] Updated weights for policy 1, policy_version 52860 (0.0008) [2023-10-14 03:08:30,676][33201] Updated weights for policy 0, policy_version 52380 (0.0009) [2023-10-14 03:08:34,054][33226] Updated weights for policy 1, policy_version 52870 (0.0009) [2023-10-14 03:08:34,417][33226] Updated weights for policy 1, policy_version 52880 (0.0008) [2023-10-14 03:08:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 107773952. Throughput: 0: 1751.7, 1: 1771.6. Samples: 26957184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:34,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.880')] [2023-10-14 03:08:34,589][33201] Updated weights for policy 0, policy_version 52390 (0.0009) [2023-10-14 03:08:34,780][33226] Updated weights for policy 1, policy_version 52890 (0.0007) [2023-10-14 03:08:34,960][33201] Updated weights for policy 0, policy_version 52400 (0.0008) [2023-10-14 03:08:35,333][33201] Updated weights for policy 0, policy_version 52410 (0.0008) [2023-10-14 03:08:38,667][33226] Updated weights for policy 1, policy_version 52900 (0.0007) [2023-10-14 03:08:39,025][33226] Updated weights for policy 1, policy_version 52910 (0.0008) [2023-10-14 03:08:39,097][33201] Updated weights for policy 0, policy_version 52420 (0.0007) [2023-10-14 03:08:39,396][33226] Updated weights for policy 1, policy_version 52920 (0.0007) [2023-10-14 03:08:39,463][33201] Updated weights for policy 0, policy_version 52430 (0.0007) [2023-10-14 03:08:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 107839488. Throughput: 0: 1771.7, 1: 1789.4. Samples: 26978862. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:08:39,557][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 03:08:39,833][33201] Updated weights for policy 0, policy_version 52440 (0.0007) [2023-10-14 03:08:43,186][33226] Updated weights for policy 1, policy_version 52930 (0.0007) [2023-10-14 03:08:43,548][33226] Updated weights for policy 1, policy_version 52940 (0.0008) [2023-10-14 03:08:43,722][33201] Updated weights for policy 0, policy_version 52450 (0.0008) [2023-10-14 03:08:43,908][33226] Updated weights for policy 1, policy_version 52950 (0.0008) [2023-10-14 03:08:44,093][33201] Updated weights for policy 0, policy_version 52460 (0.0008) [2023-10-14 03:08:44,283][33226] Updated weights for policy 1, policy_version 52960 (0.0009) [2023-10-14 03:08:44,472][33201] Updated weights for policy 0, policy_version 52470 (0.0007) [2023-10-14 03:08:44,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 107937792. Throughput: 0: 1749.6, 1: 1768.9. Samples: 26989212. Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:08:44,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 03:08:44,836][33201] Updated weights for policy 0, policy_version 52480 (0.0009) [2023-10-14 03:08:48,033][33226] Updated weights for policy 1, policy_version 52970 (0.0008) [2023-10-14 03:08:48,402][33226] Updated weights for policy 1, policy_version 52980 (0.0008) [2023-10-14 03:08:48,715][33201] Updated weights for policy 0, policy_version 52490 (0.0009) [2023-10-14 03:08:48,764][33226] Updated weights for policy 1, policy_version 52990 (0.0007) [2023-10-14 03:08:49,087][33201] Updated weights for policy 0, policy_version 52500 (0.0007) [2023-10-14 03:08:49,455][33201] Updated weights for policy 0, policy_version 52510 (0.0008) [2023-10-14 03:08:49,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 108036096. Throughput: 0: 1783.5, 1: 1790.3. 
Samples: 27010938. Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:08:49,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 03:08:52,555][33226] Updated weights for policy 1, policy_version 53000 (0.0008) [2023-10-14 03:08:52,922][33226] Updated weights for policy 1, policy_version 53010 (0.0008) [2023-10-14 03:08:53,172][33201] Updated weights for policy 0, policy_version 52520 (0.0009) [2023-10-14 03:08:53,286][33226] Updated weights for policy 1, policy_version 53020 (0.0008) [2023-10-14 03:08:53,543][33201] Updated weights for policy 0, policy_version 52530 (0.0010) [2023-10-14 03:08:53,924][33201] Updated weights for policy 0, policy_version 52540 (0.0008) [2023-10-14 03:08:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 108101632. Throughput: 0: 1755.4, 1: 1769.8. Samples: 27030904. Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:08:54,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 03:08:57,224][33226] Updated weights for policy 1, policy_version 53030 (0.0009) [2023-10-14 03:08:57,595][33226] Updated weights for policy 1, policy_version 53040 (0.0009) [2023-10-14 03:08:57,799][33201] Updated weights for policy 0, policy_version 52550 (0.0008) [2023-10-14 03:08:57,962][33226] Updated weights for policy 1, policy_version 53050 (0.0007) [2023-10-14 03:08:58,164][33201] Updated weights for policy 0, policy_version 52560 (0.0008) [2023-10-14 03:08:58,539][33201] Updated weights for policy 0, policy_version 52570 (0.0009) [2023-10-14 03:08:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 108167168. Throughput: 0: 1780.3, 1: 1795.2. Samples: 27043164. 
Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:08:59,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.890')] [2023-10-14 03:09:01,665][33226] Updated weights for policy 1, policy_version 53060 (0.0010) [2023-10-14 03:09:02,044][33226] Updated weights for policy 1, policy_version 53070 (0.0009) [2023-10-14 03:09:02,346][33201] Updated weights for policy 0, policy_version 52580 (0.0008) [2023-10-14 03:09:02,402][33226] Updated weights for policy 1, policy_version 53080 (0.0008) [2023-10-14 03:09:02,740][33201] Updated weights for policy 0, policy_version 52590 (0.0007) [2023-10-14 03:09:03,109][33201] Updated weights for policy 0, policy_version 52600 (0.0009) [2023-10-14 03:09:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 108232704. Throughput: 0: 1765.1, 1: 1774.0. Samples: 27062798. Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:09:04,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.890')] [2023-10-14 03:09:06,302][33226] Updated weights for policy 1, policy_version 53090 (0.0008) [2023-10-14 03:09:06,664][33226] Updated weights for policy 1, policy_version 53100 (0.0008) [2023-10-14 03:09:06,773][33201] Updated weights for policy 0, policy_version 52610 (0.0007) [2023-10-14 03:09:07,033][33226] Updated weights for policy 1, policy_version 53110 (0.0007) [2023-10-14 03:09:07,142][33201] Updated weights for policy 0, policy_version 52620 (0.0007) [2023-10-14 03:09:07,404][33226] Updated weights for policy 1, policy_version 53120 (0.0008) [2023-10-14 03:09:07,514][33201] Updated weights for policy 0, policy_version 52630 (0.0010) [2023-10-14 03:09:07,887][33201] Updated weights for policy 0, policy_version 52640 (0.0009) [2023-10-14 03:09:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 108298240. Throughput: 0: 1758.8, 1: 1771.0. Samples: 27084652. 
Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:09:09,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.890')] [2023-10-14 03:09:11,137][33226] Updated weights for policy 1, policy_version 53130 (0.0008) [2023-10-14 03:09:11,507][33226] Updated weights for policy 1, policy_version 53140 (0.0008) [2023-10-14 03:09:11,803][33201] Updated weights for policy 0, policy_version 52650 (0.0007) [2023-10-14 03:09:11,873][33226] Updated weights for policy 1, policy_version 53150 (0.0009) [2023-10-14 03:09:12,167][33201] Updated weights for policy 0, policy_version 52660 (0.0007) [2023-10-14 03:09:12,539][33201] Updated weights for policy 0, policy_version 52670 (0.0007) [2023-10-14 03:09:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 108363776. Throughput: 0: 1773.6, 1: 1776.4. Samples: 27094956. Policy #0 lag: (min: 28.0, avg: 52.2, max: 56.0) [2023-10-14 03:09:14,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.890')] [2023-10-14 03:09:15,752][33226] Updated weights for policy 1, policy_version 53160 (0.0007) [2023-10-14 03:09:16,118][33226] Updated weights for policy 1, policy_version 53170 (0.0008) [2023-10-14 03:09:16,236][33201] Updated weights for policy 0, policy_version 52680 (0.0007) [2023-10-14 03:09:16,487][33226] Updated weights for policy 1, policy_version 53180 (0.0008) [2023-10-14 03:09:16,608][33201] Updated weights for policy 0, policy_version 52690 (0.0007) [2023-10-14 03:09:16,982][33201] Updated weights for policy 0, policy_version 52700 (0.0010) [2023-10-14 03:09:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 108429312. Throughput: 0: 1763.0, 1: 1770.0. Samples: 27116172. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:19,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.890')] [2023-10-14 03:09:20,180][33226] Updated weights for policy 1, policy_version 53190 (0.0010) [2023-10-14 03:09:20,545][33226] Updated weights for policy 1, policy_version 53200 (0.0010) [2023-10-14 03:09:20,731][33201] Updated weights for policy 0, policy_version 52710 (0.0009) [2023-10-14 03:09:20,921][33226] Updated weights for policy 1, policy_version 53210 (0.0007) [2023-10-14 03:09:21,106][33201] Updated weights for policy 0, policy_version 52720 (0.0007) [2023-10-14 03:09:21,474][33201] Updated weights for policy 0, policy_version 52730 (0.0008) [2023-10-14 03:09:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 108494848. Throughput: 0: 1761.8, 1: 1777.8. Samples: 27138144. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:24,558][31953] Avg episode reward: [(0, '20.620'), (1, '20.910')] [2023-10-14 03:09:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000052736_54001664.pth... [2023-10-14 03:09:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000053216_54493184.pth... 
[2023-10-14 03:09:24,602][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000051104_52330496.pth [2023-10-14 03:09:24,611][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000051584_52822016.pth [2023-10-14 03:09:24,877][33226] Updated weights for policy 1, policy_version 53220 (0.0009) [2023-10-14 03:09:25,239][33226] Updated weights for policy 1, policy_version 53230 (0.0008) [2023-10-14 03:09:25,461][33201] Updated weights for policy 0, policy_version 52740 (0.0008) [2023-10-14 03:09:25,602][33226] Updated weights for policy 1, policy_version 53240 (0.0007) [2023-10-14 03:09:25,831][33201] Updated weights for policy 0, policy_version 52750 (0.0009) [2023-10-14 03:09:26,211][33201] Updated weights for policy 0, policy_version 52760 (0.0010) [2023-10-14 03:09:29,146][33226] Updated weights for policy 1, policy_version 53250 (0.0007) [2023-10-14 03:09:29,521][33226] Updated weights for policy 1, policy_version 53260 (0.0008) [2023-10-14 03:09:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 108560384. Throughput: 0: 1759.4, 1: 1765.3. Samples: 27147822. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:29,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 03:09:29,883][33226] Updated weights for policy 1, policy_version 53270 (0.0007) [2023-10-14 03:09:30,002][33201] Updated weights for policy 0, policy_version 52770 (0.0009) [2023-10-14 03:09:30,240][33226] Updated weights for policy 1, policy_version 53280 (0.0007) [2023-10-14 03:09:30,378][33201] Updated weights for policy 0, policy_version 52780 (0.0010) [2023-10-14 03:09:30,749][33201] Updated weights for policy 0, policy_version 52790 (0.0009) [2023-10-14 03:09:31,119][33201] Updated weights for policy 0, policy_version 52800 (0.0009) [2023-10-14 03:09:34,067][33226] Updated weights for policy 1, policy_version 53290 (0.0009) [2023-10-14 03:09:34,442][33226] Updated weights for policy 1, policy_version 53300 (0.0008) [2023-10-14 03:09:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 108625920. Throughput: 0: 1756.9, 1: 1779.9. Samples: 27170098. 
Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:34,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 03:09:34,804][33226] Updated weights for policy 1, policy_version 53310 (0.0009) [2023-10-14 03:09:34,946][33201] Updated weights for policy 0, policy_version 52810 (0.0008) [2023-10-14 03:09:35,309][33201] Updated weights for policy 0, policy_version 52820 (0.0011) [2023-10-14 03:09:35,680][33201] Updated weights for policy 0, policy_version 52830 (0.0010) [2023-10-14 03:09:38,637][33226] Updated weights for policy 1, policy_version 53320 (0.0007) [2023-10-14 03:09:38,998][33226] Updated weights for policy 1, policy_version 53330 (0.0008) [2023-10-14 03:09:39,365][33226] Updated weights for policy 1, policy_version 53340 (0.0008) [2023-10-14 03:09:39,420][33201] Updated weights for policy 0, policy_version 52840 (0.0007) [2023-10-14 03:09:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 108724224. Throughput: 0: 1791.7, 1: 1781.0. Samples: 27191672. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:39,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 03:09:39,777][33201] Updated weights for policy 0, policy_version 52850 (0.0007) [2023-10-14 03:09:40,149][33201] Updated weights for policy 0, policy_version 52860 (0.0007) [2023-10-14 03:09:43,277][33226] Updated weights for policy 1, policy_version 53350 (0.0007) [2023-10-14 03:09:43,654][33226] Updated weights for policy 1, policy_version 53360 (0.0007) [2023-10-14 03:09:44,027][33226] Updated weights for policy 1, policy_version 53370 (0.0008) [2023-10-14 03:09:44,056][33201] Updated weights for policy 0, policy_version 52870 (0.0007) [2023-10-14 03:09:44,435][33201] Updated weights for policy 0, policy_version 52880 (0.0008) [2023-10-14 03:09:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 108789760. Throughput: 0: 1760.1, 1: 1771.2. 
Samples: 27202072. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:44,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.910')] [2023-10-14 03:09:44,804][33201] Updated weights for policy 0, policy_version 52890 (0.0010) [2023-10-14 03:09:47,889][33226] Updated weights for policy 1, policy_version 53380 (0.0009) [2023-10-14 03:09:48,272][33226] Updated weights for policy 1, policy_version 53390 (0.0011) [2023-10-14 03:09:48,645][33226] Updated weights for policy 1, policy_version 53400 (0.0009) [2023-10-14 03:09:48,795][33201] Updated weights for policy 0, policy_version 52900 (0.0009) [2023-10-14 03:09:49,183][33201] Updated weights for policy 0, policy_version 52910 (0.0008) [2023-10-14 03:09:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 108855296. Throughput: 0: 1785.9, 1: 1795.3. Samples: 27223954. Policy #0 lag: (min: 31.0, avg: 33.7, max: 63.0) [2023-10-14 03:09:49,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.890')] [2023-10-14 03:09:49,560][33201] Updated weights for policy 0, policy_version 52920 (0.0007) [2023-10-14 03:09:52,276][33226] Updated weights for policy 1, policy_version 53410 (0.0007) [2023-10-14 03:09:52,632][33226] Updated weights for policy 1, policy_version 53420 (0.0007) [2023-10-14 03:09:52,994][33226] Updated weights for policy 1, policy_version 53430 (0.0008) [2023-10-14 03:09:53,226][33201] Updated weights for policy 0, policy_version 52930 (0.0007) [2023-10-14 03:09:53,360][33226] Updated weights for policy 1, policy_version 53440 (0.0008) [2023-10-14 03:09:53,592][33201] Updated weights for policy 0, policy_version 52940 (0.0007) [2023-10-14 03:09:53,963][33201] Updated weights for policy 0, policy_version 52950 (0.0007) [2023-10-14 03:09:54,342][33201] Updated weights for policy 0, policy_version 52960 (0.0009) [2023-10-14 03:09:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 108953600. 
Throughput: 0: 1772.8, 1: 1769.6. Samples: 27244058. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:09:54,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 03:09:57,249][33226] Updated weights for policy 1, policy_version 53450 (0.0008) [2023-10-14 03:09:57,616][33226] Updated weights for policy 1, policy_version 53460 (0.0007) [2023-10-14 03:09:57,987][33226] Updated weights for policy 1, policy_version 53470 (0.0008) [2023-10-14 03:09:58,046][33201] Updated weights for policy 0, policy_version 52970 (0.0007) [2023-10-14 03:09:58,421][33201] Updated weights for policy 0, policy_version 52980 (0.0007) [2023-10-14 03:09:58,790][33201] Updated weights for policy 0, policy_version 52990 (0.0007) [2023-10-14 03:09:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109019136. Throughput: 0: 1785.9, 1: 1795.8. Samples: 27256134. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:09:59,557][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 03:10:01,662][33226] Updated weights for policy 1, policy_version 53480 (0.0008) [2023-10-14 03:10:02,028][33226] Updated weights for policy 1, policy_version 53490 (0.0008) [2023-10-14 03:10:02,400][33226] Updated weights for policy 1, policy_version 53500 (0.0007) [2023-10-14 03:10:02,463][33201] Updated weights for policy 0, policy_version 53000 (0.0007) [2023-10-14 03:10:02,826][33201] Updated weights for policy 0, policy_version 53010 (0.0007) [2023-10-14 03:10:03,194][33201] Updated weights for policy 0, policy_version 53020 (0.0009) [2023-10-14 03:10:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109084672. Throughput: 0: 1777.5, 1: 1774.6. Samples: 27276016. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:04,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.960')] [2023-10-14 03:10:06,219][33226] Updated weights for policy 1, policy_version 53510 (0.0008) [2023-10-14 03:10:06,584][33226] Updated weights for policy 1, policy_version 53520 (0.0008) [2023-10-14 03:10:06,912][33201] Updated weights for policy 0, policy_version 53030 (0.0010) [2023-10-14 03:10:06,952][33226] Updated weights for policy 1, policy_version 53530 (0.0008) [2023-10-14 03:10:07,272][33201] Updated weights for policy 0, policy_version 53040 (0.0008) [2023-10-14 03:10:07,649][33201] Updated weights for policy 0, policy_version 53050 (0.0010) [2023-10-14 03:10:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109150208. Throughput: 0: 1764.1, 1: 1775.0. Samples: 27297404. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:09,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.960')] [2023-10-14 03:10:10,705][33226] Updated weights for policy 1, policy_version 53540 (0.0008) [2023-10-14 03:10:11,072][33226] Updated weights for policy 1, policy_version 53550 (0.0009) [2023-10-14 03:10:11,446][33226] Updated weights for policy 1, policy_version 53560 (0.0010) [2023-10-14 03:10:11,698][33201] Updated weights for policy 0, policy_version 53060 (0.0007) [2023-10-14 03:10:12,057][33201] Updated weights for policy 0, policy_version 53070 (0.0011) [2023-10-14 03:10:12,440][33201] Updated weights for policy 0, policy_version 53080 (0.0008) [2023-10-14 03:10:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109215744. Throughput: 0: 1780.7, 1: 1774.6. Samples: 27307810. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:14,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 03:10:15,259][33226] Updated weights for policy 1, policy_version 53570 (0.0009) [2023-10-14 03:10:15,620][33226] Updated weights for policy 1, policy_version 53580 (0.0007) [2023-10-14 03:10:15,990][33226] Updated weights for policy 1, policy_version 53590 (0.0008) [2023-10-14 03:10:16,282][33201] Updated weights for policy 0, policy_version 53090 (0.0008) [2023-10-14 03:10:16,352][33226] Updated weights for policy 1, policy_version 53600 (0.0007) [2023-10-14 03:10:16,646][33201] Updated weights for policy 0, policy_version 53100 (0.0009) [2023-10-14 03:10:17,022][33201] Updated weights for policy 0, policy_version 53110 (0.0009) [2023-10-14 03:10:17,386][33201] Updated weights for policy 0, policy_version 53120 (0.0008) [2023-10-14 03:10:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 109281280. Throughput: 0: 1766.3, 1: 1769.7. Samples: 27329218. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:19,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 03:10:20,181][33226] Updated weights for policy 1, policy_version 53610 (0.0008) [2023-10-14 03:10:20,549][33226] Updated weights for policy 1, policy_version 53620 (0.0008) [2023-10-14 03:10:20,917][33226] Updated weights for policy 1, policy_version 53630 (0.0008) [2023-10-14 03:10:21,185][33201] Updated weights for policy 0, policy_version 53130 (0.0007) [2023-10-14 03:10:21,556][33201] Updated weights for policy 0, policy_version 53140 (0.0007) [2023-10-14 03:10:21,921][33201] Updated weights for policy 0, policy_version 53150 (0.0007) [2023-10-14 03:10:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 109346816. Throughput: 0: 1765.8, 1: 1790.1. Samples: 27351690. 
Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:24,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.930')] [2023-10-14 03:10:24,726][33226] Updated weights for policy 1, policy_version 53640 (0.0007) [2023-10-14 03:10:25,090][33226] Updated weights for policy 1, policy_version 53650 (0.0007) [2023-10-14 03:10:25,452][33226] Updated weights for policy 1, policy_version 53660 (0.0008) [2023-10-14 03:10:25,711][33201] Updated weights for policy 0, policy_version 53160 (0.0009) [2023-10-14 03:10:26,085][33201] Updated weights for policy 0, policy_version 53170 (0.0011) [2023-10-14 03:10:26,457][33201] Updated weights for policy 0, policy_version 53180 (0.0010) [2023-10-14 03:10:29,065][33226] Updated weights for policy 1, policy_version 53670 (0.0008) [2023-10-14 03:10:29,431][33226] Updated weights for policy 1, policy_version 53680 (0.0007) [2023-10-14 03:10:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 109412352. Throughput: 0: 1763.1, 1: 1777.3. Samples: 27361388. Policy #0 lag: (min: 31.0, avg: 39.8, max: 63.0) [2023-10-14 03:10:29,557][31953] Avg episode reward: [(0, '20.700'), (1, '20.930')] [2023-10-14 03:10:29,804][33226] Updated weights for policy 1, policy_version 53690 (0.0010) [2023-10-14 03:10:30,262][33201] Updated weights for policy 0, policy_version 53190 (0.0009) [2023-10-14 03:10:30,650][33201] Updated weights for policy 0, policy_version 53200 (0.0010) [2023-10-14 03:10:31,016][33201] Updated weights for policy 0, policy_version 53210 (0.0008) [2023-10-14 03:10:33,651][33226] Updated weights for policy 1, policy_version 53700 (0.0009) [2023-10-14 03:10:34,063][33226] Updated weights for policy 1, policy_version 53710 (0.0009) [2023-10-14 03:10:34,434][33226] Updated weights for policy 1, policy_version 53720 (0.0009) [2023-10-14 03:10:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 109477888. Throughput: 0: 1759.0, 1: 1790.7. 
Samples: 27383690. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:34,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.930')] [2023-10-14 03:10:34,815][33201] Updated weights for policy 0, policy_version 53220 (0.0008) [2023-10-14 03:10:35,214][33201] Updated weights for policy 0, policy_version 53230 (0.0010) [2023-10-14 03:10:35,585][33201] Updated weights for policy 0, policy_version 53240 (0.0008) [2023-10-14 03:10:38,207][33226] Updated weights for policy 1, policy_version 53730 (0.0008) [2023-10-14 03:10:38,580][33226] Updated weights for policy 1, policy_version 53740 (0.0011) [2023-10-14 03:10:38,934][33226] Updated weights for policy 1, policy_version 53750 (0.0007) [2023-10-14 03:10:39,191][33201] Updated weights for policy 0, policy_version 53250 (0.0008) [2023-10-14 03:10:39,299][33226] Updated weights for policy 1, policy_version 53760 (0.0007) [2023-10-14 03:10:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 109576192. Throughput: 0: 1781.5, 1: 1784.7. Samples: 27404536. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:10:39,565][33201] Updated weights for policy 0, policy_version 53260 (0.0010) [2023-10-14 03:10:39,937][33201] Updated weights for policy 0, policy_version 53270 (0.0010) [2023-10-14 03:10:40,315][33201] Updated weights for policy 0, policy_version 53280 (0.0010) [2023-10-14 03:10:43,040][33226] Updated weights for policy 1, policy_version 53770 (0.0007) [2023-10-14 03:10:43,420][33226] Updated weights for policy 1, policy_version 53780 (0.0008) [2023-10-14 03:10:43,777][33226] Updated weights for policy 1, policy_version 53790 (0.0008) [2023-10-14 03:10:44,301][33201] Updated weights for policy 0, policy_version 53290 (0.0007) [2023-10-14 03:10:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 109641728. 
Throughput: 0: 1752.7, 1: 1781.8. Samples: 27415186. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:44,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 03:10:44,669][33201] Updated weights for policy 0, policy_version 53300 (0.0010) [2023-10-14 03:10:45,038][33201] Updated weights for policy 0, policy_version 53310 (0.0007) [2023-10-14 03:10:47,514][33226] Updated weights for policy 1, policy_version 53800 (0.0009) [2023-10-14 03:10:47,881][33226] Updated weights for policy 1, policy_version 53810 (0.0008) [2023-10-14 03:10:48,244][33226] Updated weights for policy 1, policy_version 53820 (0.0010) [2023-10-14 03:10:48,902][33201] Updated weights for policy 0, policy_version 53320 (0.0008) [2023-10-14 03:10:49,277][33201] Updated weights for policy 0, policy_version 53330 (0.0010) [2023-10-14 03:10:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 109707264. Throughput: 0: 1771.1, 1: 1790.2. Samples: 27436276. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:49,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 03:10:49,640][33201] Updated weights for policy 0, policy_version 53340 (0.0011) [2023-10-14 03:10:52,056][33226] Updated weights for policy 1, policy_version 53830 (0.0008) [2023-10-14 03:10:52,424][33226] Updated weights for policy 1, policy_version 53840 (0.0009) [2023-10-14 03:10:52,810][33226] Updated weights for policy 1, policy_version 53850 (0.0009) [2023-10-14 03:10:53,471][33201] Updated weights for policy 0, policy_version 53350 (0.0009) [2023-10-14 03:10:53,837][33201] Updated weights for policy 0, policy_version 53360 (0.0009) [2023-10-14 03:10:54,199][33201] Updated weights for policy 0, policy_version 53370 (0.0007) [2023-10-14 03:10:54,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 109805568. Throughput: 0: 1765.9, 1: 1780.0. Samples: 27456974. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:54,559][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 03:10:56,658][33226] Updated weights for policy 1, policy_version 53860 (0.0008) [2023-10-14 03:10:57,026][33226] Updated weights for policy 1, policy_version 53870 (0.0007) [2023-10-14 03:10:57,395][33226] Updated weights for policy 1, policy_version 53880 (0.0008) [2023-10-14 03:10:58,067][33201] Updated weights for policy 0, policy_version 53380 (0.0007) [2023-10-14 03:10:58,442][33201] Updated weights for policy 0, policy_version 53390 (0.0009) [2023-10-14 03:10:58,813][33201] Updated weights for policy 0, policy_version 53400 (0.0008) [2023-10-14 03:10:59,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109871104. Throughput: 0: 1769.4, 1: 1800.4. Samples: 27468450. Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:10:59,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.930')] [2023-10-14 03:11:01,189][33226] Updated weights for policy 1, policy_version 53890 (0.0009) [2023-10-14 03:11:01,555][33226] Updated weights for policy 1, policy_version 53900 (0.0009) [2023-10-14 03:11:01,929][33226] Updated weights for policy 1, policy_version 53910 (0.0007) [2023-10-14 03:11:02,302][33226] Updated weights for policy 1, policy_version 53920 (0.0008) [2023-10-14 03:11:02,653][33201] Updated weights for policy 0, policy_version 53410 (0.0008) [2023-10-14 03:11:03,031][33201] Updated weights for policy 0, policy_version 53420 (0.0009) [2023-10-14 03:11:03,387][33201] Updated weights for policy 0, policy_version 53430 (0.0009) [2023-10-14 03:11:03,763][33201] Updated weights for policy 0, policy_version 53440 (0.0009) [2023-10-14 03:11:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 109936640. Throughput: 0: 1767.2, 1: 1781.4. Samples: 27488908. 
Policy #0 lag: (min: 31.0, avg: 32.7, max: 59.0) [2023-10-14 03:11:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.910')] [2023-10-14 03:11:06,007][33226] Updated weights for policy 1, policy_version 53930 (0.0011) [2023-10-14 03:11:06,370][33226] Updated weights for policy 1, policy_version 53940 (0.0007) [2023-10-14 03:11:06,745][33226] Updated weights for policy 1, policy_version 53950 (0.0008) [2023-10-14 03:11:07,742][33201] Updated weights for policy 0, policy_version 53450 (0.0009) [2023-10-14 03:11:08,114][33201] Updated weights for policy 0, policy_version 53460 (0.0009) [2023-10-14 03:11:08,490][33201] Updated weights for policy 0, policy_version 53470 (0.0009) [2023-10-14 03:11:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110002176. Throughput: 0: 1738.3, 1: 1779.5. Samples: 27509990. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.910')] [2023-10-14 03:11:10,481][33226] Updated weights for policy 1, policy_version 53960 (0.0008) [2023-10-14 03:11:10,843][33226] Updated weights for policy 1, policy_version 53970 (0.0009) [2023-10-14 03:11:11,207][33226] Updated weights for policy 1, policy_version 53980 (0.0007) [2023-10-14 03:11:12,490][33201] Updated weights for policy 0, policy_version 53480 (0.0007) [2023-10-14 03:11:12,873][33201] Updated weights for policy 0, policy_version 53490 (0.0009) [2023-10-14 03:11:13,247][33201] Updated weights for policy 0, policy_version 53500 (0.0009) [2023-10-14 03:11:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110067712. Throughput: 0: 1773.4, 1: 1776.8. Samples: 27521148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.910')] [2023-10-14 03:11:14,998][33226] Updated weights for policy 1, policy_version 53990 (0.0011) [2023-10-14 03:11:15,368][33226] Updated weights for policy 1, policy_version 54000 (0.0010) [2023-10-14 03:11:15,736][33226] Updated weights for policy 1, policy_version 54010 (0.0010) [2023-10-14 03:11:17,126][33201] Updated weights for policy 0, policy_version 53510 (0.0008) [2023-10-14 03:11:17,497][33201] Updated weights for policy 0, policy_version 53520 (0.0009) [2023-10-14 03:11:17,867][33201] Updated weights for policy 0, policy_version 53530 (0.0008) [2023-10-14 03:11:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110133248. Throughput: 0: 1745.2, 1: 1772.9. Samples: 27542004. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:19,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 03:11:19,594][33226] Updated weights for policy 1, policy_version 54020 (0.0008) [2023-10-14 03:11:19,991][33226] Updated weights for policy 1, policy_version 54030 (0.0008) [2023-10-14 03:11:20,356][33226] Updated weights for policy 1, policy_version 54040 (0.0008) [2023-10-14 03:11:21,648][33201] Updated weights for policy 0, policy_version 53540 (0.0009) [2023-10-14 03:11:22,040][33201] Updated weights for policy 0, policy_version 53550 (0.0010) [2023-10-14 03:11:22,408][33201] Updated weights for policy 0, policy_version 53560 (0.0007) [2023-10-14 03:11:23,924][33226] Updated weights for policy 1, policy_version 54050 (0.0008) [2023-10-14 03:11:24,297][33226] Updated weights for policy 1, policy_version 54060 (0.0008) [2023-10-14 03:11:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 110198784. Throughput: 0: 1741.3, 1: 1802.4. Samples: 27564004. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:24,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.900')] [2023-10-14 03:11:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000053568_54853632.pth... [2023-10-14 03:11:24,599][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000051936_53182464.pth [2023-10-14 03:11:24,658][33226] Updated weights for policy 1, policy_version 54070 (0.0009) [2023-10-14 03:11:25,020][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000054080_55377920.pth... [2023-10-14 03:11:25,022][33226] Updated weights for policy 1, policy_version 54080 (0.0007) [2023-10-14 03:11:25,049][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000052384_53641216.pth [2023-10-14 03:11:26,248][33201] Updated weights for policy 0, policy_version 53570 (0.0009) [2023-10-14 03:11:26,614][33201] Updated weights for policy 0, policy_version 53580 (0.0008) [2023-10-14 03:11:26,985][33201] Updated weights for policy 0, policy_version 53590 (0.0007) [2023-10-14 03:11:27,355][33201] Updated weights for policy 0, policy_version 53600 (0.0008) [2023-10-14 03:11:28,846][33226] Updated weights for policy 1, policy_version 54090 (0.0007) [2023-10-14 03:11:29,212][33226] Updated weights for policy 1, policy_version 54100 (0.0008) [2023-10-14 03:11:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 110264320. Throughput: 0: 1752.8, 1: 1780.8. Samples: 27574196. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.900')] [2023-10-14 03:11:29,577][33226] Updated weights for policy 1, policy_version 54110 (0.0011) [2023-10-14 03:11:31,124][33201] Updated weights for policy 0, policy_version 53610 (0.0008) [2023-10-14 03:11:31,494][33201] Updated weights for policy 0, policy_version 53620 (0.0008) [2023-10-14 03:11:31,868][33201] Updated weights for policy 0, policy_version 53630 (0.0010) [2023-10-14 03:11:33,508][33226] Updated weights for policy 1, policy_version 54120 (0.0011) [2023-10-14 03:11:33,868][33226] Updated weights for policy 1, policy_version 54130 (0.0010) [2023-10-14 03:11:34,236][33226] Updated weights for policy 1, policy_version 54140 (0.0010) [2023-10-14 03:11:34,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 110362624. Throughput: 0: 1747.4, 1: 1797.6. Samples: 27595800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.920')] [2023-10-14 03:11:35,740][33201] Updated weights for policy 0, policy_version 53640 (0.0008) [2023-10-14 03:11:36,105][33201] Updated weights for policy 0, policy_version 53650 (0.0007) [2023-10-14 03:11:36,484][33201] Updated weights for policy 0, policy_version 53660 (0.0009) [2023-10-14 03:11:38,175][33226] Updated weights for policy 1, policy_version 54150 (0.0009) [2023-10-14 03:11:38,539][33226] Updated weights for policy 1, policy_version 54160 (0.0008) [2023-10-14 03:11:38,912][33226] Updated weights for policy 1, policy_version 54170 (0.0008) [2023-10-14 03:11:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 110428160. Throughput: 0: 1766.1, 1: 1779.3. Samples: 27616516. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:11:40,209][33201] Updated weights for policy 0, policy_version 53670 (0.0007) [2023-10-14 03:11:40,590][33201] Updated weights for policy 0, policy_version 53680 (0.0007) [2023-10-14 03:11:40,950][33201] Updated weights for policy 0, policy_version 53690 (0.0010) [2023-10-14 03:11:42,690][33226] Updated weights for policy 1, policy_version 54180 (0.0008) [2023-10-14 03:11:43,062][33226] Updated weights for policy 1, policy_version 54190 (0.0007) [2023-10-14 03:11:43,425][33226] Updated weights for policy 1, policy_version 54200 (0.0008) [2023-10-14 03:11:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 110493696. Throughput: 0: 1748.3, 1: 1782.2. Samples: 27627324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:11:44,709][33201] Updated weights for policy 0, policy_version 53700 (0.0009) [2023-10-14 03:11:45,089][33201] Updated weights for policy 0, policy_version 53710 (0.0008) [2023-10-14 03:11:45,455][33201] Updated weights for policy 0, policy_version 53720 (0.0008) [2023-10-14 03:11:47,042][33226] Updated weights for policy 1, policy_version 54210 (0.0009) [2023-10-14 03:11:47,418][33226] Updated weights for policy 1, policy_version 54220 (0.0009) [2023-10-14 03:11:47,792][33226] Updated weights for policy 1, policy_version 54230 (0.0009) [2023-10-14 03:11:48,148][33226] Updated weights for policy 1, policy_version 54240 (0.0007) [2023-10-14 03:11:49,196][33201] Updated weights for policy 0, policy_version 53730 (0.0009) [2023-10-14 03:11:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 110559232. Throughput: 0: 1768.6, 1: 1779.9. Samples: 27648590. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:11:49,562][33201] Updated weights for policy 0, policy_version 53740 (0.0008) [2023-10-14 03:11:49,928][33201] Updated weights for policy 0, policy_version 53750 (0.0007) [2023-10-14 03:11:50,302][33201] Updated weights for policy 0, policy_version 53760 (0.0008) [2023-10-14 03:11:51,778][33226] Updated weights for policy 1, policy_version 54250 (0.0010) [2023-10-14 03:11:52,136][33226] Updated weights for policy 1, policy_version 54260 (0.0009) [2023-10-14 03:11:52,503][33226] Updated weights for policy 1, policy_version 54270 (0.0009) [2023-10-14 03:11:54,145][33201] Updated weights for policy 0, policy_version 53770 (0.0008) [2023-10-14 03:11:54,512][33201] Updated weights for policy 0, policy_version 53780 (0.0010) [2023-10-14 03:11:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 110624768. Throughput: 0: 1787.5, 1: 1777.1. Samples: 27670396. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:11:54,882][33201] Updated weights for policy 0, policy_version 53790 (0.0009) [2023-10-14 03:11:56,349][33226] Updated weights for policy 1, policy_version 54280 (0.0008) [2023-10-14 03:11:56,713][33226] Updated weights for policy 1, policy_version 54290 (0.0008) [2023-10-14 03:11:57,087][33226] Updated weights for policy 1, policy_version 54300 (0.0009) [2023-10-14 03:11:58,633][33201] Updated weights for policy 0, policy_version 53800 (0.0008) [2023-10-14 03:11:59,001][33201] Updated weights for policy 0, policy_version 53810 (0.0009) [2023-10-14 03:11:59,380][33201] Updated weights for policy 0, policy_version 53820 (0.0007) [2023-10-14 03:11:59,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110723072. Throughput: 0: 1763.4, 1: 1784.8. 
Samples: 27680816. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:11:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:12:00,916][33226] Updated weights for policy 1, policy_version 54310 (0.0008) [2023-10-14 03:12:01,279][33226] Updated weights for policy 1, policy_version 54320 (0.0011) [2023-10-14 03:12:01,641][33226] Updated weights for policy 1, policy_version 54330 (0.0010) [2023-10-14 03:12:03,294][33201] Updated weights for policy 0, policy_version 53830 (0.0010) [2023-10-14 03:12:03,663][33201] Updated weights for policy 0, policy_version 53840 (0.0009) [2023-10-14 03:12:04,038][33201] Updated weights for policy 0, policy_version 53850 (0.0008) [2023-10-14 03:12:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 110788608. Throughput: 0: 1789.8, 1: 1774.3. Samples: 27702390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:04,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.970')] [2023-10-14 03:12:05,482][33226] Updated weights for policy 1, policy_version 54340 (0.0009) [2023-10-14 03:12:05,873][33226] Updated weights for policy 1, policy_version 54350 (0.0007) [2023-10-14 03:12:06,239][33226] Updated weights for policy 1, policy_version 54360 (0.0010) [2023-10-14 03:12:07,896][33201] Updated weights for policy 0, policy_version 53860 (0.0008) [2023-10-14 03:12:08,279][33201] Updated weights for policy 0, policy_version 53870 (0.0007) [2023-10-14 03:12:08,653][33201] Updated weights for policy 0, policy_version 53880 (0.0007) [2023-10-14 03:12:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110854144. Throughput: 0: 1760.0, 1: 1774.1. Samples: 27723040. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:09,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:12:10,030][33226] Updated weights for policy 1, policy_version 54370 (0.0011) [2023-10-14 03:12:10,405][33226] Updated weights for policy 1, policy_version 54380 (0.0008) [2023-10-14 03:12:10,762][33226] Updated weights for policy 1, policy_version 54390 (0.0007) [2023-10-14 03:12:11,132][33226] Updated weights for policy 1, policy_version 54400 (0.0008) [2023-10-14 03:12:12,199][33201] Updated weights for policy 0, policy_version 53890 (0.0008) [2023-10-14 03:12:12,574][33201] Updated weights for policy 0, policy_version 53900 (0.0009) [2023-10-14 03:12:12,943][33201] Updated weights for policy 0, policy_version 53910 (0.0010) [2023-10-14 03:12:13,324][33201] Updated weights for policy 0, policy_version 53920 (0.0011) [2023-10-14 03:12:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 110919680. Throughput: 0: 1788.5, 1: 1768.1. Samples: 27734246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:14,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.950')] [2023-10-14 03:12:14,808][33226] Updated weights for policy 1, policy_version 54410 (0.0007) [2023-10-14 03:12:15,174][33226] Updated weights for policy 1, policy_version 54420 (0.0009) [2023-10-14 03:12:15,541][33226] Updated weights for policy 1, policy_version 54430 (0.0008) [2023-10-14 03:12:17,072][33201] Updated weights for policy 0, policy_version 53930 (0.0007) [2023-10-14 03:12:17,443][33201] Updated weights for policy 0, policy_version 53940 (0.0009) [2023-10-14 03:12:17,805][33201] Updated weights for policy 0, policy_version 53950 (0.0009) [2023-10-14 03:12:19,373][33226] Updated weights for policy 1, policy_version 54440 (0.0007) [2023-10-14 03:12:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 110985216. Throughput: 0: 1766.8, 1: 1774.7. 
Samples: 27755168. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:19,728][33226] Updated weights for policy 1, policy_version 54450 (0.0008) [2023-10-14 03:12:20,100][33226] Updated weights for policy 1, policy_version 54460 (0.0009) [2023-10-14 03:12:21,653][33201] Updated weights for policy 0, policy_version 53960 (0.0011) [2023-10-14 03:12:22,014][33201] Updated weights for policy 0, policy_version 53970 (0.0008) [2023-10-14 03:12:22,386][33201] Updated weights for policy 0, policy_version 53980 (0.0007) [2023-10-14 03:12:23,722][33226] Updated weights for policy 1, policy_version 54470 (0.0009) [2023-10-14 03:12:24,088][33226] Updated weights for policy 1, policy_version 54480 (0.0009) [2023-10-14 03:12:24,456][33226] Updated weights for policy 1, policy_version 54490 (0.0007) [2023-10-14 03:12:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 111050752. Throughput: 0: 1765.8, 1: 1800.6. Samples: 27777002. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:26,091][33201] Updated weights for policy 0, policy_version 53990 (0.0007) [2023-10-14 03:12:26,452][33201] Updated weights for policy 0, policy_version 54000 (0.0007) [2023-10-14 03:12:26,826][33201] Updated weights for policy 0, policy_version 54010 (0.0008) [2023-10-14 03:12:28,177][33226] Updated weights for policy 1, policy_version 54500 (0.0008) [2023-10-14 03:12:28,546][33226] Updated weights for policy 1, policy_version 54510 (0.0007) [2023-10-14 03:12:28,913][33226] Updated weights for policy 1, policy_version 54520 (0.0008) [2023-10-14 03:12:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 111149056. Throughput: 0: 1767.1, 1: 1791.2. Samples: 27787446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:30,787][33201] Updated weights for policy 0, policy_version 54020 (0.0007) [2023-10-14 03:12:31,155][33201] Updated weights for policy 0, policy_version 54030 (0.0007) [2023-10-14 03:12:31,529][33201] Updated weights for policy 0, policy_version 54040 (0.0009) [2023-10-14 03:12:32,719][33226] Updated weights for policy 1, policy_version 54530 (0.0008) [2023-10-14 03:12:33,091][33226] Updated weights for policy 1, policy_version 54540 (0.0009) [2023-10-14 03:12:33,452][33226] Updated weights for policy 1, policy_version 54550 (0.0008) [2023-10-14 03:12:33,815][33226] Updated weights for policy 1, policy_version 54560 (0.0008) [2023-10-14 03:12:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 111214592. Throughput: 0: 1760.8, 1: 1808.8. Samples: 27809224. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:35,415][33201] Updated weights for policy 0, policy_version 54050 (0.0008) [2023-10-14 03:12:35,781][33201] Updated weights for policy 0, policy_version 54060 (0.0008) [2023-10-14 03:12:36,144][33201] Updated weights for policy 0, policy_version 54070 (0.0009) [2023-10-14 03:12:36,514][33201] Updated weights for policy 0, policy_version 54080 (0.0008) [2023-10-14 03:12:37,602][33226] Updated weights for policy 1, policy_version 54570 (0.0009) [2023-10-14 03:12:37,977][33226] Updated weights for policy 1, policy_version 54580 (0.0009) [2023-10-14 03:12:38,335][33226] Updated weights for policy 1, policy_version 54590 (0.0008) [2023-10-14 03:12:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 111280128. Throughput: 0: 1770.6, 1: 1791.1. Samples: 27830670. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:40,307][33201] Updated weights for policy 0, policy_version 54090 (0.0010) [2023-10-14 03:12:40,681][33201] Updated weights for policy 0, policy_version 54100 (0.0011) [2023-10-14 03:12:41,051][33201] Updated weights for policy 0, policy_version 54110 (0.0011) [2023-10-14 03:12:42,112][33226] Updated weights for policy 1, policy_version 54600 (0.0010) [2023-10-14 03:12:42,485][33226] Updated weights for policy 1, policy_version 54610 (0.0007) [2023-10-14 03:12:42,855][33226] Updated weights for policy 1, policy_version 54620 (0.0007) [2023-10-14 03:12:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 111345664. Throughput: 0: 1761.8, 1: 1812.6. Samples: 27841666. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:44,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:12:44,904][33201] Updated weights for policy 0, policy_version 54120 (0.0009) [2023-10-14 03:12:45,284][33201] Updated weights for policy 0, policy_version 54130 (0.0008) [2023-10-14 03:12:45,659][33201] Updated weights for policy 0, policy_version 54140 (0.0008) [2023-10-14 03:12:46,682][33226] Updated weights for policy 1, policy_version 54630 (0.0008) [2023-10-14 03:12:47,055][33226] Updated weights for policy 1, policy_version 54640 (0.0010) [2023-10-14 03:12:47,426][33226] Updated weights for policy 1, policy_version 54650 (0.0007) [2023-10-14 03:12:49,458][33201] Updated weights for policy 0, policy_version 54150 (0.0009) [2023-10-14 03:12:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 111411200. Throughput: 0: 1767.0, 1: 1789.2. Samples: 27862420. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:12:49,839][33201] Updated weights for policy 0, policy_version 54160 (0.0010) [2023-10-14 03:12:50,222][33201] Updated weights for policy 0, policy_version 54170 (0.0011) [2023-10-14 03:12:51,319][33226] Updated weights for policy 1, policy_version 54660 (0.0009) [2023-10-14 03:12:51,722][33226] Updated weights for policy 1, policy_version 54670 (0.0009) [2023-10-14 03:12:52,089][33226] Updated weights for policy 1, policy_version 54680 (0.0007) [2023-10-14 03:12:54,122][33201] Updated weights for policy 0, policy_version 54180 (0.0009) [2023-10-14 03:12:54,517][33201] Updated weights for policy 0, policy_version 54190 (0.0007) [2023-10-14 03:12:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 111476736. Throughput: 0: 1793.6, 1: 1786.9. Samples: 27884164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:12:54,887][33201] Updated weights for policy 0, policy_version 54200 (0.0008) [2023-10-14 03:12:55,861][33226] Updated weights for policy 1, policy_version 54690 (0.0008) [2023-10-14 03:12:56,227][33226] Updated weights for policy 1, policy_version 54700 (0.0010) [2023-10-14 03:12:56,596][33226] Updated weights for policy 1, policy_version 54710 (0.0009) [2023-10-14 03:12:56,965][33226] Updated weights for policy 1, policy_version 54720 (0.0008) [2023-10-14 03:12:58,612][33201] Updated weights for policy 0, policy_version 54210 (0.0008) [2023-10-14 03:12:58,995][33201] Updated weights for policy 0, policy_version 54220 (0.0010) [2023-10-14 03:12:59,365][33201] Updated weights for policy 0, policy_version 54230 (0.0007) [2023-10-14 03:12:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 111542272. Throughput: 0: 1760.9, 1: 1792.5. 
Samples: 27894150. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:12:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.970')] [2023-10-14 03:12:59,735][33201] Updated weights for policy 0, policy_version 54240 (0.0009) [2023-10-14 03:13:00,717][33226] Updated weights for policy 1, policy_version 54730 (0.0008) [2023-10-14 03:13:01,080][33226] Updated weights for policy 1, policy_version 54740 (0.0007) [2023-10-14 03:13:01,447][33226] Updated weights for policy 1, policy_version 54750 (0.0010) [2023-10-14 03:13:03,466][33201] Updated weights for policy 0, policy_version 54250 (0.0008) [2023-10-14 03:13:03,837][33201] Updated weights for policy 0, policy_version 54260 (0.0010) [2023-10-14 03:13:04,207][33201] Updated weights for policy 0, policy_version 54270 (0.0011) [2023-10-14 03:13:04,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 111640576. Throughput: 0: 1797.2, 1: 1792.7. Samples: 27916710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 03:13:05,062][33226] Updated weights for policy 1, policy_version 54760 (0.0009) [2023-10-14 03:13:05,423][33226] Updated weights for policy 1, policy_version 54770 (0.0008) [2023-10-14 03:13:05,781][33226] Updated weights for policy 1, policy_version 54780 (0.0007) [2023-10-14 03:13:08,127][33201] Updated weights for policy 0, policy_version 54280 (0.0009) [2023-10-14 03:13:08,498][33201] Updated weights for policy 0, policy_version 54290 (0.0009) [2023-10-14 03:13:08,860][33201] Updated weights for policy 0, policy_version 54300 (0.0008) [2023-10-14 03:13:09,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 111706112. Throughput: 0: 1761.2, 1: 1801.0. Samples: 27937300. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:09,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.980')] [2023-10-14 03:13:09,628][33226] Updated weights for policy 1, policy_version 54790 (0.0010) [2023-10-14 03:13:10,006][33226] Updated weights for policy 1, policy_version 54800 (0.0010) [2023-10-14 03:13:10,377][33226] Updated weights for policy 1, policy_version 54810 (0.0011) [2023-10-14 03:13:12,595][33201] Updated weights for policy 0, policy_version 54310 (0.0011) [2023-10-14 03:13:12,970][33201] Updated weights for policy 0, policy_version 54320 (0.0009) [2023-10-14 03:13:13,349][33201] Updated weights for policy 0, policy_version 54330 (0.0008) [2023-10-14 03:13:14,167][33226] Updated weights for policy 1, policy_version 54820 (0.0009) [2023-10-14 03:13:14,537][33226] Updated weights for policy 1, policy_version 54830 (0.0008) [2023-10-14 03:13:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 111771648. Throughput: 0: 1789.5, 1: 1785.4. Samples: 27948314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.980')] [2023-10-14 03:13:14,908][33226] Updated weights for policy 1, policy_version 54840 (0.0010) [2023-10-14 03:13:16,953][33201] Updated weights for policy 0, policy_version 54340 (0.0008) [2023-10-14 03:13:17,330][33201] Updated weights for policy 0, policy_version 54350 (0.0007) [2023-10-14 03:13:17,689][33201] Updated weights for policy 0, policy_version 54360 (0.0010) [2023-10-14 03:13:18,607][33226] Updated weights for policy 1, policy_version 54850 (0.0009) [2023-10-14 03:13:18,974][33226] Updated weights for policy 1, policy_version 54860 (0.0008) [2023-10-14 03:13:19,348][33226] Updated weights for policy 1, policy_version 54870 (0.0010) [2023-10-14 03:13:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 111837184. Throughput: 0: 1762.9, 1: 1791.4. 
Samples: 27969164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:19,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:13:19,709][33226] Updated weights for policy 1, policy_version 54880 (0.0011) [2023-10-14 03:13:21,616][33201] Updated weights for policy 0, policy_version 54370 (0.0009) [2023-10-14 03:13:21,971][33201] Updated weights for policy 0, policy_version 54380 (0.0008) [2023-10-14 03:13:22,341][33201] Updated weights for policy 0, policy_version 54390 (0.0007) [2023-10-14 03:13:22,715][33201] Updated weights for policy 0, policy_version 54400 (0.0009) [2023-10-14 03:13:23,596][33226] Updated weights for policy 1, policy_version 54890 (0.0009) [2023-10-14 03:13:23,962][33226] Updated weights for policy 1, policy_version 54900 (0.0008) [2023-10-14 03:13:24,323][33226] Updated weights for policy 1, policy_version 54910 (0.0007) [2023-10-14 03:13:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 111935488. Throughput: 0: 1763.0, 1: 1788.4. Samples: 27990482. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:24,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:13:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000054912_56229888.pth... [2023-10-14 03:13:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000054400_55705600.pth... 
[2023-10-14 03:13:24,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000053216_54493184.pth [2023-10-14 03:13:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000052736_54001664.pth [2023-10-14 03:13:26,424][33201] Updated weights for policy 0, policy_version 54410 (0.0010) [2023-10-14 03:13:26,794][33201] Updated weights for policy 0, policy_version 54420 (0.0011) [2023-10-14 03:13:27,163][33201] Updated weights for policy 0, policy_version 54430 (0.0009) [2023-10-14 03:13:28,083][33226] Updated weights for policy 1, policy_version 54920 (0.0009) [2023-10-14 03:13:28,446][33226] Updated weights for policy 1, policy_version 54930 (0.0008) [2023-10-14 03:13:28,809][33226] Updated weights for policy 1, policy_version 54940 (0.0009) [2023-10-14 03:13:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 112001024. Throughput: 0: 1768.3, 1: 1777.0. Samples: 28001208. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:29,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:13:31,111][33201] Updated weights for policy 0, policy_version 54440 (0.0009) [2023-10-14 03:13:31,487][33201] Updated weights for policy 0, policy_version 54450 (0.0008) [2023-10-14 03:13:31,850][33201] Updated weights for policy 0, policy_version 54460 (0.0009) [2023-10-14 03:13:32,644][33226] Updated weights for policy 1, policy_version 54950 (0.0008) [2023-10-14 03:13:33,006][33226] Updated weights for policy 1, policy_version 54960 (0.0007) [2023-10-14 03:13:33,375][33226] Updated weights for policy 1, policy_version 54970 (0.0007) [2023-10-14 03:13:34,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 112066560. Throughput: 0: 1761.1, 1: 1793.1. Samples: 28022358. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:13:34,559][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:13:35,619][33201] Updated weights for policy 0, policy_version 54470 (0.0010) [2023-10-14 03:13:35,993][33201] Updated weights for policy 0, policy_version 54480 (0.0008) [2023-10-14 03:13:36,363][33201] Updated weights for policy 0, policy_version 54490 (0.0007) [2023-10-14 03:13:37,329][33226] Updated weights for policy 1, policy_version 54980 (0.0008) [2023-10-14 03:13:37,735][33226] Updated weights for policy 1, policy_version 54990 (0.0008) [2023-10-14 03:13:38,106][33226] Updated weights for policy 1, policy_version 55000 (0.0008) [2023-10-14 03:13:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112132096. Throughput: 0: 1774.1, 1: 1769.5. Samples: 28043628. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:13:39,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:13:40,184][33201] Updated weights for policy 0, policy_version 54500 (0.0008) [2023-10-14 03:13:40,566][33201] Updated weights for policy 0, policy_version 54510 (0.0007) [2023-10-14 03:13:40,934][33201] Updated weights for policy 0, policy_version 54520 (0.0009) [2023-10-14 03:13:41,957][33226] Updated weights for policy 1, policy_version 55010 (0.0008) [2023-10-14 03:13:42,324][33226] Updated weights for policy 1, policy_version 55020 (0.0007) [2023-10-14 03:13:42,695][33226] Updated weights for policy 1, policy_version 55030 (0.0008) [2023-10-14 03:13:43,063][33226] Updated weights for policy 1, policy_version 55040 (0.0008) [2023-10-14 03:13:44,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 112197632. Throughput: 0: 1765.7, 1: 1793.1. Samples: 28054296. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:13:44,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:13:44,856][33201] Updated weights for policy 0, policy_version 54530 (0.0008) [2023-10-14 03:13:45,224][33201] Updated weights for policy 0, policy_version 54540 (0.0008) [2023-10-14 03:13:45,595][33201] Updated weights for policy 0, policy_version 54550 (0.0007) [2023-10-14 03:13:45,962][33201] Updated weights for policy 0, policy_version 54560 (0.0008) [2023-10-14 03:13:46,967][33226] Updated weights for policy 1, policy_version 55050 (0.0009) [2023-10-14 03:13:47,334][33226] Updated weights for policy 1, policy_version 55060 (0.0009) [2023-10-14 03:13:47,698][33226] Updated weights for policy 1, policy_version 55070 (0.0008) [2023-10-14 03:13:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 112263168. Throughput: 0: 1759.7, 1: 1756.5. Samples: 28074942. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:13:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 03:13:49,892][33201] Updated weights for policy 0, policy_version 54570 (0.0008) [2023-10-14 03:13:50,267][33201] Updated weights for policy 0, policy_version 54580 (0.0008) [2023-10-14 03:13:50,627][33201] Updated weights for policy 0, policy_version 54590 (0.0008) [2023-10-14 03:13:51,615][33226] Updated weights for policy 1, policy_version 55080 (0.0010) [2023-10-14 03:13:51,984][33226] Updated weights for policy 1, policy_version 55090 (0.0009) [2023-10-14 03:13:52,355][33226] Updated weights for policy 1, policy_version 55100 (0.0008) [2023-10-14 03:13:54,160][33201] Updated weights for policy 0, policy_version 54600 (0.0010) [2023-10-14 03:13:54,541][33201] Updated weights for policy 0, policy_version 54610 (0.0007) [2023-10-14 03:13:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 112328704. Throughput: 0: 1798.9, 1: 1745.1. 
Samples: 28096776. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:13:54,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.990')] [2023-10-14 03:13:54,907][33201] Updated weights for policy 0, policy_version 54620 (0.0011) [2023-10-14 03:13:56,173][33226] Updated weights for policy 1, policy_version 55110 (0.0007) [2023-10-14 03:13:56,541][33226] Updated weights for policy 1, policy_version 55120 (0.0008) [2023-10-14 03:13:56,916][33226] Updated weights for policy 1, policy_version 55130 (0.0011) [2023-10-14 03:13:58,862][33201] Updated weights for policy 0, policy_version 54630 (0.0009) [2023-10-14 03:13:59,228][33201] Updated weights for policy 0, policy_version 54640 (0.0007) [2023-10-14 03:13:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 112394240. Throughput: 0: 1771.2, 1: 1752.1. Samples: 28106862. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:13:59,558][31953] Avg episode reward: [(0, '20.760'), (1, '21.000')] [2023-10-14 03:13:59,602][33201] Updated weights for policy 0, policy_version 54650 (0.0008) [2023-10-14 03:14:00,860][33226] Updated weights for policy 1, policy_version 55140 (0.0009) [2023-10-14 03:14:01,224][33226] Updated weights for policy 1, policy_version 55150 (0.0010) [2023-10-14 03:14:01,597][33226] Updated weights for policy 1, policy_version 55160 (0.0009) [2023-10-14 03:14:03,378][33201] Updated weights for policy 0, policy_version 54660 (0.0007) [2023-10-14 03:14:03,756][33201] Updated weights for policy 0, policy_version 54670 (0.0008) [2023-10-14 03:14:04,123][33201] Updated weights for policy 0, policy_version 54680 (0.0007) [2023-10-14 03:14:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112492544. Throughput: 0: 1799.4, 1: 1742.0. Samples: 28128528. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:14:04,557][31953] Avg episode reward: [(0, '20.740'), (1, '20.990')] [2023-10-14 03:14:05,286][33226] Updated weights for policy 1, policy_version 55170 (0.0008) [2023-10-14 03:14:05,649][33226] Updated weights for policy 1, policy_version 55180 (0.0009) [2023-10-14 03:14:06,016][33226] Updated weights for policy 1, policy_version 55190 (0.0010) [2023-10-14 03:14:06,378][33226] Updated weights for policy 1, policy_version 55200 (0.0007) [2023-10-14 03:14:07,909][33201] Updated weights for policy 0, policy_version 54690 (0.0010) [2023-10-14 03:14:08,278][33201] Updated weights for policy 0, policy_version 54700 (0.0008) [2023-10-14 03:14:08,653][33201] Updated weights for policy 0, policy_version 54710 (0.0009) [2023-10-14 03:14:09,017][33201] Updated weights for policy 0, policy_version 54720 (0.0009) [2023-10-14 03:14:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112558080. Throughput: 0: 1758.2, 1: 1770.8. Samples: 28149286. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:14:09,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:14:10,047][33226] Updated weights for policy 1, policy_version 55210 (0.0008) [2023-10-14 03:14:10,407][33226] Updated weights for policy 1, policy_version 55220 (0.0009) [2023-10-14 03:14:10,779][33226] Updated weights for policy 1, policy_version 55230 (0.0010) [2023-10-14 03:14:12,900][33201] Updated weights for policy 0, policy_version 54730 (0.0008) [2023-10-14 03:14:13,270][33201] Updated weights for policy 0, policy_version 54740 (0.0008) [2023-10-14 03:14:13,648][33201] Updated weights for policy 0, policy_version 54750 (0.0007) [2023-10-14 03:14:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112623616. Throughput: 0: 1782.0, 1: 1750.8. Samples: 28160180. 
Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:14,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.970')] [2023-10-14 03:14:14,651][33226] Updated weights for policy 1, policy_version 55240 (0.0008) [2023-10-14 03:14:15,013][33226] Updated weights for policy 1, policy_version 55250 (0.0009) [2023-10-14 03:14:15,378][33226] Updated weights for policy 1, policy_version 55260 (0.0008) [2023-10-14 03:14:17,690][33201] Updated weights for policy 0, policy_version 54760 (0.0008) [2023-10-14 03:14:18,060][33201] Updated weights for policy 0, policy_version 54770 (0.0007) [2023-10-14 03:14:18,435][33201] Updated weights for policy 0, policy_version 54780 (0.0010) [2023-10-14 03:14:19,235][33226] Updated weights for policy 1, policy_version 55270 (0.0007) [2023-10-14 03:14:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112689152. Throughput: 0: 1770.6, 1: 1770.4. Samples: 28181700. Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:19,557][31953] Avg episode reward: [(0, '20.660'), (1, '20.970')] [2023-10-14 03:14:19,593][33226] Updated weights for policy 1, policy_version 55280 (0.0008) [2023-10-14 03:14:19,963][33226] Updated weights for policy 1, policy_version 55290 (0.0010) [2023-10-14 03:14:22,106][33201] Updated weights for policy 0, policy_version 54790 (0.0007) [2023-10-14 03:14:22,478][33201] Updated weights for policy 0, policy_version 54800 (0.0008) [2023-10-14 03:14:22,851][33201] Updated weights for policy 0, policy_version 54810 (0.0010) [2023-10-14 03:14:23,724][33226] Updated weights for policy 1, policy_version 55300 (0.0008) [2023-10-14 03:14:24,111][33226] Updated weights for policy 1, policy_version 55310 (0.0007) [2023-10-14 03:14:24,470][33226] Updated weights for policy 1, policy_version 55320 (0.0008) [2023-10-14 03:14:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 112754688. Throughput: 0: 1751.1, 1: 1786.1. 
Samples: 28202800. Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:24,558][31953] Avg episode reward: [(0, '20.630'), (1, '20.970')] [2023-10-14 03:14:26,685][33201] Updated weights for policy 0, policy_version 54820 (0.0009) [2023-10-14 03:14:27,071][33201] Updated weights for policy 0, policy_version 54830 (0.0009) [2023-10-14 03:14:27,439][33201] Updated weights for policy 0, policy_version 54840 (0.0009) [2023-10-14 03:14:28,331][33226] Updated weights for policy 1, policy_version 55330 (0.0010) [2023-10-14 03:14:28,692][33226] Updated weights for policy 1, policy_version 55340 (0.0009) [2023-10-14 03:14:29,067][33226] Updated weights for policy 1, policy_version 55350 (0.0008) [2023-10-14 03:14:29,433][33226] Updated weights for policy 1, policy_version 55360 (0.0008) [2023-10-14 03:14:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 112852992. Throughput: 0: 1772.6, 1: 1766.6. Samples: 28213560. Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:29,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.970')] [2023-10-14 03:14:31,318][33201] Updated weights for policy 0, policy_version 54850 (0.0008) [2023-10-14 03:14:31,698][33201] Updated weights for policy 0, policy_version 54860 (0.0008) [2023-10-14 03:14:32,073][33201] Updated weights for policy 0, policy_version 54870 (0.0009) [2023-10-14 03:14:32,436][33201] Updated weights for policy 0, policy_version 54880 (0.0007) [2023-10-14 03:14:33,202][33226] Updated weights for policy 1, policy_version 55370 (0.0009) [2023-10-14 03:14:33,578][33226] Updated weights for policy 1, policy_version 55380 (0.0008) [2023-10-14 03:14:33,948][33226] Updated weights for policy 1, policy_version 55390 (0.0008) [2023-10-14 03:14:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 112918528. Throughput: 0: 1749.4, 1: 1795.2. Samples: 28234448. 
Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:34,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.970')] [2023-10-14 03:14:36,351][33201] Updated weights for policy 0, policy_version 54890 (0.0008) [2023-10-14 03:14:36,727][33201] Updated weights for policy 0, policy_version 54900 (0.0009) [2023-10-14 03:14:37,096][33201] Updated weights for policy 0, policy_version 54910 (0.0008) [2023-10-14 03:14:37,706][33226] Updated weights for policy 1, policy_version 55400 (0.0008) [2023-10-14 03:14:38,066][33226] Updated weights for policy 1, policy_version 55410 (0.0007) [2023-10-14 03:14:38,431][33226] Updated weights for policy 1, policy_version 55420 (0.0007) [2023-10-14 03:14:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 112984064. Throughput: 0: 1746.9, 1: 1775.0. Samples: 28255264. Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:39,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.970')] [2023-10-14 03:14:40,927][33201] Updated weights for policy 0, policy_version 54920 (0.0009) [2023-10-14 03:14:41,293][33201] Updated weights for policy 0, policy_version 54930 (0.0011) [2023-10-14 03:14:41,667][33201] Updated weights for policy 0, policy_version 54940 (0.0009) [2023-10-14 03:14:42,156][33226] Updated weights for policy 1, policy_version 55430 (0.0009) [2023-10-14 03:14:42,523][33226] Updated weights for policy 1, policy_version 55440 (0.0008) [2023-10-14 03:14:42,899][33226] Updated weights for policy 1, policy_version 55450 (0.0008) [2023-10-14 03:14:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113049600. Throughput: 0: 1742.1, 1: 1802.4. Samples: 28266364. 
Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:44,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.970')] [2023-10-14 03:14:45,228][33201] Updated weights for policy 0, policy_version 54950 (0.0008) [2023-10-14 03:14:45,595][33201] Updated weights for policy 0, policy_version 54960 (0.0008) [2023-10-14 03:14:45,972][33201] Updated weights for policy 0, policy_version 54970 (0.0007) [2023-10-14 03:14:46,657][33226] Updated weights for policy 1, policy_version 55460 (0.0008) [2023-10-14 03:14:47,027][33226] Updated weights for policy 1, policy_version 55470 (0.0008) [2023-10-14 03:14:47,386][33226] Updated weights for policy 1, policy_version 55480 (0.0007) [2023-10-14 03:14:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 113115136. Throughput: 0: 1749.3, 1: 1779.0. Samples: 28287302. Policy #0 lag: (min: 13.0, avg: 21.0, max: 45.0) [2023-10-14 03:14:49,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.970')] [2023-10-14 03:14:49,907][33201] Updated weights for policy 0, policy_version 54980 (0.0009) [2023-10-14 03:14:50,280][33201] Updated weights for policy 0, policy_version 54990 (0.0009) [2023-10-14 03:14:50,644][33201] Updated weights for policy 0, policy_version 55000 (0.0010) [2023-10-14 03:14:51,233][33226] Updated weights for policy 1, policy_version 55490 (0.0009) [2023-10-14 03:14:51,597][33226] Updated weights for policy 1, policy_version 55500 (0.0009) [2023-10-14 03:14:51,961][33226] Updated weights for policy 1, policy_version 55510 (0.0007) [2023-10-14 03:14:52,330][33226] Updated weights for policy 1, policy_version 55520 (0.0007) [2023-10-14 03:14:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 113180672. Throughput: 0: 1781.9, 1: 1775.6. Samples: 28309374. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:14:54,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.970')] [2023-10-14 03:14:54,560][33201] Updated weights for policy 0, policy_version 55010 (0.0007) [2023-10-14 03:14:54,927][33201] Updated weights for policy 0, policy_version 55020 (0.0007) [2023-10-14 03:14:55,298][33201] Updated weights for policy 0, policy_version 55030 (0.0008) [2023-10-14 03:14:55,675][33201] Updated weights for policy 0, policy_version 55040 (0.0009) [2023-10-14 03:14:56,192][33226] Updated weights for policy 1, policy_version 55530 (0.0009) [2023-10-14 03:14:56,554][33226] Updated weights for policy 1, policy_version 55540 (0.0008) [2023-10-14 03:14:56,918][33226] Updated weights for policy 1, policy_version 55550 (0.0008) [2023-10-14 03:14:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 113246208. Throughput: 0: 1753.8, 1: 1778.8. Samples: 28319148. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:14:59,557][31953] Avg episode reward: [(0, '20.710'), (1, '20.970')] [2023-10-14 03:14:59,613][33201] Updated weights for policy 0, policy_version 55050 (0.0008) [2023-10-14 03:14:59,973][33201] Updated weights for policy 0, policy_version 55060 (0.0011) [2023-10-14 03:15:00,343][33201] Updated weights for policy 0, policy_version 55070 (0.0008) [2023-10-14 03:15:00,727][33226] Updated weights for policy 1, policy_version 55560 (0.0009) [2023-10-14 03:15:01,086][33226] Updated weights for policy 1, policy_version 55570 (0.0008) [2023-10-14 03:15:01,458][33226] Updated weights for policy 1, policy_version 55580 (0.0008) [2023-10-14 03:15:04,219][33201] Updated weights for policy 0, policy_version 55080 (0.0009) [2023-10-14 03:15:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 113311744. Throughput: 0: 1767.9, 1: 1774.3. Samples: 28341098. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:04,557][31953] Avg episode reward: [(0, '20.710'), (1, '20.970')] [2023-10-14 03:15:04,597][33201] Updated weights for policy 0, policy_version 55090 (0.0007) [2023-10-14 03:15:04,965][33201] Updated weights for policy 0, policy_version 55100 (0.0008) [2023-10-14 03:15:05,161][33226] Updated weights for policy 1, policy_version 55590 (0.0007) [2023-10-14 03:15:05,521][33226] Updated weights for policy 1, policy_version 55600 (0.0008) [2023-10-14 03:15:05,896][33226] Updated weights for policy 1, policy_version 55610 (0.0008) [2023-10-14 03:15:08,745][33201] Updated weights for policy 0, policy_version 55110 (0.0008) [2023-10-14 03:15:09,114][33201] Updated weights for policy 0, policy_version 55120 (0.0008) [2023-10-14 03:15:09,481][33201] Updated weights for policy 0, policy_version 55130 (0.0009) [2023-10-14 03:15:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 113377280. Throughput: 0: 1763.9, 1: 1789.3. Samples: 28362694. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:09,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.970')] [2023-10-14 03:15:09,685][33226] Updated weights for policy 1, policy_version 55620 (0.0009) [2023-10-14 03:15:10,054][33226] Updated weights for policy 1, policy_version 55630 (0.0012) [2023-10-14 03:15:10,419][33226] Updated weights for policy 1, policy_version 55640 (0.0011) [2023-10-14 03:15:13,197][33201] Updated weights for policy 0, policy_version 55140 (0.0007) [2023-10-14 03:15:13,578][33201] Updated weights for policy 0, policy_version 55150 (0.0009) [2023-10-14 03:15:13,944][33201] Updated weights for policy 0, policy_version 55160 (0.0008) [2023-10-14 03:15:14,192][33226] Updated weights for policy 1, policy_version 55650 (0.0009) [2023-10-14 03:15:14,554][33226] Updated weights for policy 1, policy_version 55660 (0.0008) [2023-10-14 03:15:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113475584. Throughput: 0: 1760.6, 1: 1780.9. Samples: 28372928. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:14,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.960')] [2023-10-14 03:15:14,915][33226] Updated weights for policy 1, policy_version 55670 (0.0009) [2023-10-14 03:15:15,280][33226] Updated weights for policy 1, policy_version 55680 (0.0008) [2023-10-14 03:15:17,700][33201] Updated weights for policy 0, policy_version 55170 (0.0008) [2023-10-14 03:15:18,083][33201] Updated weights for policy 0, policy_version 55180 (0.0007) [2023-10-14 03:15:18,443][33201] Updated weights for policy 0, policy_version 55190 (0.0007) [2023-10-14 03:15:18,811][33201] Updated weights for policy 0, policy_version 55200 (0.0008) [2023-10-14 03:15:18,969][33226] Updated weights for policy 1, policy_version 55690 (0.0009) [2023-10-14 03:15:19,335][33226] Updated weights for policy 1, policy_version 55700 (0.0007) [2023-10-14 03:15:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113541120. Throughput: 0: 1776.5, 1: 1785.5. Samples: 28394736. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:19,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.960')] [2023-10-14 03:15:19,700][33226] Updated weights for policy 1, policy_version 55710 (0.0007) [2023-10-14 03:15:22,636][33201] Updated weights for policy 0, policy_version 55210 (0.0008) [2023-10-14 03:15:22,998][33201] Updated weights for policy 0, policy_version 55220 (0.0007) [2023-10-14 03:15:23,378][33201] Updated weights for policy 0, policy_version 55230 (0.0007) [2023-10-14 03:15:23,407][33226] Updated weights for policy 1, policy_version 55720 (0.0008) [2023-10-14 03:15:23,777][33226] Updated weights for policy 1, policy_version 55730 (0.0009) [2023-10-14 03:15:24,145][33226] Updated weights for policy 1, policy_version 55740 (0.0007) [2023-10-14 03:15:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14745.6, 300 sec: 14329.0). Total num frames: 113639424. Throughput: 0: 1758.3, 1: 1794.2. 
Samples: 28415124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:24,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.960')] [2023-10-14 03:15:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000055744_57081856.pth... [2023-10-14 03:15:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000055232_56557568.pth... [2023-10-14 03:15:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000053568_54853632.pth [2023-10-14 03:15:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000054080_55377920.pth [2023-10-14 03:15:27,356][33201] Updated weights for policy 0, policy_version 55240 (0.0008) [2023-10-14 03:15:27,720][33201] Updated weights for policy 0, policy_version 55250 (0.0010) [2023-10-14 03:15:28,005][33226] Updated weights for policy 1, policy_version 55750 (0.0007) [2023-10-14 03:15:28,094][33201] Updated weights for policy 0, policy_version 55260 (0.0008) [2023-10-14 03:15:28,372][33226] Updated weights for policy 1, policy_version 55760 (0.0009) [2023-10-14 03:15:28,746][33226] Updated weights for policy 1, policy_version 55770 (0.0007) [2023-10-14 03:15:29,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 113704960. Throughput: 0: 1782.9, 1: 1782.7. Samples: 28426818. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:29,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.960')] [2023-10-14 03:15:31,892][33201] Updated weights for policy 0, policy_version 55270 (0.0009) [2023-10-14 03:15:32,270][33201] Updated weights for policy 0, policy_version 55280 (0.0009) [2023-10-14 03:15:32,499][33226] Updated weights for policy 1, policy_version 55780 (0.0008) [2023-10-14 03:15:32,635][33201] Updated weights for policy 0, policy_version 55290 (0.0007) [2023-10-14 03:15:32,862][33226] Updated weights for policy 1, policy_version 55790 (0.0007) [2023-10-14 03:15:33,234][33226] Updated weights for policy 1, policy_version 55800 (0.0008) [2023-10-14 03:15:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113770496. Throughput: 0: 1747.1, 1: 1798.2. Samples: 28446840. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:34,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.960')] [2023-10-14 03:15:36,465][33201] Updated weights for policy 0, policy_version 55300 (0.0008) [2023-10-14 03:15:36,835][33201] Updated weights for policy 0, policy_version 55310 (0.0010) [2023-10-14 03:15:37,202][33201] Updated weights for policy 0, policy_version 55320 (0.0011) [2023-10-14 03:15:37,252][33226] Updated weights for policy 1, policy_version 55810 (0.0010) [2023-10-14 03:15:37,612][33226] Updated weights for policy 1, policy_version 55820 (0.0008) [2023-10-14 03:15:37,984][33226] Updated weights for policy 1, policy_version 55830 (0.0008) [2023-10-14 03:15:38,352][33226] Updated weights for policy 1, policy_version 55840 (0.0010) [2023-10-14 03:15:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113836032. Throughput: 0: 1752.6, 1: 1774.8. Samples: 28468106. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:39,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.960')] [2023-10-14 03:15:41,095][33201] Updated weights for policy 0, policy_version 55330 (0.0009) [2023-10-14 03:15:41,466][33201] Updated weights for policy 0, policy_version 55340 (0.0009) [2023-10-14 03:15:41,832][33201] Updated weights for policy 0, policy_version 55350 (0.0008) [2023-10-14 03:15:42,122][33226] Updated weights for policy 1, policy_version 55850 (0.0009) [2023-10-14 03:15:42,197][33201] Updated weights for policy 0, policy_version 55360 (0.0008) [2023-10-14 03:15:42,491][33226] Updated weights for policy 1, policy_version 55860 (0.0009) [2023-10-14 03:15:42,867][33226] Updated weights for policy 1, policy_version 55870 (0.0008) [2023-10-14 03:15:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 113901568. Throughput: 0: 1756.2, 1: 1800.5. Samples: 28479200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:44,558][31953] Avg episode reward: [(0, '20.670'), (1, '20.960')] [2023-10-14 03:15:46,026][33201] Updated weights for policy 0, policy_version 55370 (0.0009) [2023-10-14 03:15:46,394][33201] Updated weights for policy 0, policy_version 55380 (0.0010) [2023-10-14 03:15:46,584][33226] Updated weights for policy 1, policy_version 55880 (0.0008) [2023-10-14 03:15:46,768][33201] Updated weights for policy 0, policy_version 55390 (0.0008) [2023-10-14 03:15:46,941][33226] Updated weights for policy 1, policy_version 55890 (0.0010) [2023-10-14 03:15:47,311][33226] Updated weights for policy 1, policy_version 55900 (0.0008) [2023-10-14 03:15:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 113967104. Throughput: 0: 1750.8, 1: 1774.3. Samples: 28499728. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:49,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.970')] [2023-10-14 03:15:50,706][33201] Updated weights for policy 0, policy_version 55400 (0.0008) [2023-10-14 03:15:51,028][33226] Updated weights for policy 1, policy_version 55910 (0.0007) [2023-10-14 03:15:51,070][33201] Updated weights for policy 0, policy_version 55410 (0.0008) [2023-10-14 03:15:51,393][33226] Updated weights for policy 1, policy_version 55920 (0.0008) [2023-10-14 03:15:51,433][33201] Updated weights for policy 0, policy_version 55420 (0.0008) [2023-10-14 03:15:51,754][33226] Updated weights for policy 1, policy_version 55930 (0.0008) [2023-10-14 03:15:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 114032640. Throughput: 0: 1762.4, 1: 1775.7. Samples: 28521910. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:54,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.990')] [2023-10-14 03:15:55,237][33201] Updated weights for policy 0, policy_version 55430 (0.0009) [2023-10-14 03:15:55,609][33201] Updated weights for policy 0, policy_version 55440 (0.0008) [2023-10-14 03:15:55,643][33226] Updated weights for policy 1, policy_version 55940 (0.0008) [2023-10-14 03:15:55,987][33201] Updated weights for policy 0, policy_version 55450 (0.0008) [2023-10-14 03:15:56,029][33226] Updated weights for policy 1, policy_version 55950 (0.0008) [2023-10-14 03:15:56,397][33226] Updated weights for policy 1, policy_version 55960 (0.0007) [2023-10-14 03:15:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 114098176. Throughput: 0: 1747.2, 1: 1773.6. Samples: 28531364. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:15:59,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.980')] [2023-10-14 03:15:59,859][33201] Updated weights for policy 0, policy_version 55460 (0.0008) [2023-10-14 03:16:00,094][33226] Updated weights for policy 1, policy_version 55970 (0.0009) [2023-10-14 03:16:00,237][33201] Updated weights for policy 0, policy_version 55470 (0.0008) [2023-10-14 03:16:00,462][33226] Updated weights for policy 1, policy_version 55980 (0.0009) [2023-10-14 03:16:00,606][33201] Updated weights for policy 0, policy_version 55480 (0.0008) [2023-10-14 03:16:00,842][33226] Updated weights for policy 1, policy_version 55990 (0.0008) [2023-10-14 03:16:01,213][33226] Updated weights for policy 1, policy_version 56000 (0.0010) [2023-10-14 03:16:04,373][33201] Updated weights for policy 0, policy_version 55490 (0.0008) [2023-10-14 03:16:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 114163712. Throughput: 0: 1755.9, 1: 1764.1. Samples: 28553136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:16:04,559][31953] Avg episode reward: [(0, '20.700'), (1, '20.980')] [2023-10-14 03:16:04,748][33201] Updated weights for policy 0, policy_version 55500 (0.0008) [2023-10-14 03:16:05,063][33226] Updated weights for policy 1, policy_version 56010 (0.0007) [2023-10-14 03:16:05,113][33201] Updated weights for policy 0, policy_version 55510 (0.0009) [2023-10-14 03:16:05,428][33226] Updated weights for policy 1, policy_version 56020 (0.0008) [2023-10-14 03:16:05,475][33201] Updated weights for policy 0, policy_version 55520 (0.0009) [2023-10-14 03:16:05,803][33226] Updated weights for policy 1, policy_version 56030 (0.0008) [2023-10-14 03:16:09,304][33201] Updated weights for policy 0, policy_version 55530 (0.0007) [2023-10-14 03:16:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 114229248. Throughput: 0: 1769.7, 1: 1786.4. 
Samples: 28575148. Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:09,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.960')] [2023-10-14 03:16:09,612][33226] Updated weights for policy 1, policy_version 56040 (0.0007) [2023-10-14 03:16:09,669][33201] Updated weights for policy 0, policy_version 55540 (0.0008) [2023-10-14 03:16:09,976][33226] Updated weights for policy 1, policy_version 56050 (0.0008) [2023-10-14 03:16:10,038][33201] Updated weights for policy 0, policy_version 55550 (0.0007) [2023-10-14 03:16:10,342][33226] Updated weights for policy 1, policy_version 56060 (0.0007) [2023-10-14 03:16:13,895][33201] Updated weights for policy 0, policy_version 55560 (0.0007) [2023-10-14 03:16:14,065][33226] Updated weights for policy 1, policy_version 56070 (0.0007) [2023-10-14 03:16:14,256][33201] Updated weights for policy 0, policy_version 55570 (0.0008) [2023-10-14 03:16:14,425][33226] Updated weights for policy 1, policy_version 56080 (0.0009) [2023-10-14 03:16:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 114294784. Throughput: 0: 1748.1, 1: 1767.3. Samples: 28585010. 
Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:14,557][31953] Avg episode reward: [(0, '20.700'), (1, '20.960')] [2023-10-14 03:16:14,619][33201] Updated weights for policy 0, policy_version 55580 (0.0008) [2023-10-14 03:16:14,783][33226] Updated weights for policy 1, policy_version 56090 (0.0009) [2023-10-14 03:16:18,456][33201] Updated weights for policy 0, policy_version 55590 (0.0008) [2023-10-14 03:16:18,665][33226] Updated weights for policy 1, policy_version 56100 (0.0008) [2023-10-14 03:16:18,833][33201] Updated weights for policy 0, policy_version 55600 (0.0009) [2023-10-14 03:16:19,029][33226] Updated weights for policy 1, policy_version 56110 (0.0008) [2023-10-14 03:16:19,197][33201] Updated weights for policy 0, policy_version 55610 (0.0007) [2023-10-14 03:16:19,387][33226] Updated weights for policy 1, policy_version 56120 (0.0007) [2023-10-14 03:16:19,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 114393088. Throughput: 0: 1776.5, 1: 1781.9. Samples: 28606968. Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:19,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.960')] [2023-10-14 03:16:22,881][33226] Updated weights for policy 1, policy_version 56130 (0.0008) [2023-10-14 03:16:23,076][33201] Updated weights for policy 0, policy_version 55620 (0.0009) [2023-10-14 03:16:23,248][33226] Updated weights for policy 1, policy_version 56140 (0.0008) [2023-10-14 03:16:23,443][33201] Updated weights for policy 0, policy_version 55630 (0.0007) [2023-10-14 03:16:23,612][33226] Updated weights for policy 1, policy_version 56150 (0.0009) [2023-10-14 03:16:23,804][33201] Updated weights for policy 0, policy_version 55640 (0.0009) [2023-10-14 03:16:23,977][33226] Updated weights for policy 1, policy_version 56160 (0.0007) [2023-10-14 03:16:24,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 114491392. Throughput: 0: 1742.7, 1: 1777.8. 
Samples: 28626526. Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:24,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:16:27,638][33201] Updated weights for policy 0, policy_version 55650 (0.0008) [2023-10-14 03:16:27,756][33226] Updated weights for policy 1, policy_version 56170 (0.0007) [2023-10-14 03:16:28,009][33201] Updated weights for policy 0, policy_version 55660 (0.0007) [2023-10-14 03:16:28,130][33226] Updated weights for policy 1, policy_version 56180 (0.0007) [2023-10-14 03:16:28,391][33201] Updated weights for policy 0, policy_version 55670 (0.0009) [2023-10-14 03:16:28,494][33226] Updated weights for policy 1, policy_version 56190 (0.0009) [2023-10-14 03:16:28,754][33201] Updated weights for policy 0, policy_version 55680 (0.0008) [2023-10-14 03:16:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 114556928. Throughput: 0: 1769.2, 1: 1782.6. Samples: 28639032. Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:29,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:16:32,311][33226] Updated weights for policy 1, policy_version 56200 (0.0008) [2023-10-14 03:16:32,590][33201] Updated weights for policy 0, policy_version 55690 (0.0007) [2023-10-14 03:16:32,676][33226] Updated weights for policy 1, policy_version 56210 (0.0009) [2023-10-14 03:16:32,953][33201] Updated weights for policy 0, policy_version 55700 (0.0007) [2023-10-14 03:16:33,042][33226] Updated weights for policy 1, policy_version 56220 (0.0007) [2023-10-14 03:16:33,319][33201] Updated weights for policy 0, policy_version 55710 (0.0008) [2023-10-14 03:16:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 114622464. Throughput: 0: 1752.2, 1: 1783.8. Samples: 28658850. 
Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:34,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 03:16:36,955][33226] Updated weights for policy 1, policy_version 56230 (0.0008) [2023-10-14 03:16:37,272][33201] Updated weights for policy 0, policy_version 55720 (0.0007) [2023-10-14 03:16:37,326][33226] Updated weights for policy 1, policy_version 56240 (0.0008) [2023-10-14 03:16:37,652][33201] Updated weights for policy 0, policy_version 55730 (0.0009) [2023-10-14 03:16:37,696][33226] Updated weights for policy 1, policy_version 56250 (0.0007) [2023-10-14 03:16:38,019][33201] Updated weights for policy 0, policy_version 55740 (0.0010) [2023-10-14 03:16:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 114688000. Throughput: 0: 1740.4, 1: 1769.6. Samples: 28679860. Policy #0 lag: (min: 23.0, avg: 24.0, max: 45.0) [2023-10-14 03:16:39,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.960')] [2023-10-14 03:16:41,714][33226] Updated weights for policy 1, policy_version 56260 (0.0008) [2023-10-14 03:16:41,792][33201] Updated weights for policy 0, policy_version 55750 (0.0007) [2023-10-14 03:16:42,107][33226] Updated weights for policy 1, policy_version 56270 (0.0008) [2023-10-14 03:16:42,158][33201] Updated weights for policy 0, policy_version 55760 (0.0008) [2023-10-14 03:16:42,473][33226] Updated weights for policy 1, policy_version 56280 (0.0007) [2023-10-14 03:16:42,527][33201] Updated weights for policy 0, policy_version 55770 (0.0007) [2023-10-14 03:16:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 114753536. Throughput: 0: 1761.8, 1: 1789.0. Samples: 28691150. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:16:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 03:16:46,361][33226] Updated weights for policy 1, policy_version 56290 (0.0007) [2023-10-14 03:16:46,459][33201] Updated weights for policy 0, policy_version 55780 (0.0008) [2023-10-14 03:16:46,732][33226] Updated weights for policy 1, policy_version 56300 (0.0008) [2023-10-14 03:16:46,850][33201] Updated weights for policy 0, policy_version 55790 (0.0007) [2023-10-14 03:16:47,093][33226] Updated weights for policy 1, policy_version 56310 (0.0007) [2023-10-14 03:16:47,220][33201] Updated weights for policy 0, policy_version 55800 (0.0007) [2023-10-14 03:16:47,453][33226] Updated weights for policy 1, policy_version 56320 (0.0007) [2023-10-14 03:16:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 114819072. Throughput: 0: 1735.0, 1: 1767.5. Samples: 28710746. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:16:49,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 03:16:51,158][33201] Updated weights for policy 0, policy_version 55810 (0.0007) [2023-10-14 03:16:51,194][33226] Updated weights for policy 1, policy_version 56330 (0.0008) [2023-10-14 03:16:51,521][33201] Updated weights for policy 0, policy_version 55820 (0.0010) [2023-10-14 03:16:51,558][33226] Updated weights for policy 1, policy_version 56340 (0.0010) [2023-10-14 03:16:51,896][33201] Updated weights for policy 0, policy_version 55830 (0.0008) [2023-10-14 03:16:51,933][33226] Updated weights for policy 1, policy_version 56350 (0.0008) [2023-10-14 03:16:52,271][33201] Updated weights for policy 0, policy_version 55840 (0.0007) [2023-10-14 03:16:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 114884608. Throughput: 0: 1739.0, 1: 1770.4. Samples: 28733072. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:16:54,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.960')] [2023-10-14 03:16:55,570][33226] Updated weights for policy 1, policy_version 56360 (0.0010) [2023-10-14 03:16:55,938][33226] Updated weights for policy 1, policy_version 56370 (0.0011) [2023-10-14 03:16:56,259][33201] Updated weights for policy 0, policy_version 55850 (0.0007) [2023-10-14 03:16:56,300][33226] Updated weights for policy 1, policy_version 56380 (0.0011) [2023-10-14 03:16:56,627][33201] Updated weights for policy 0, policy_version 55860 (0.0008) [2023-10-14 03:16:56,995][33201] Updated weights for policy 0, policy_version 55870 (0.0008) [2023-10-14 03:16:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 114950144. Throughput: 0: 1735.9, 1: 1772.9. Samples: 28742910. Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:16:59,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 03:17:00,145][33226] Updated weights for policy 1, policy_version 56390 (0.0010) [2023-10-14 03:17:00,515][33226] Updated weights for policy 1, policy_version 56400 (0.0009) [2023-10-14 03:17:00,797][33201] Updated weights for policy 0, policy_version 55880 (0.0008) [2023-10-14 03:17:00,882][33226] Updated weights for policy 1, policy_version 56410 (0.0008) [2023-10-14 03:17:01,170][33201] Updated weights for policy 0, policy_version 55890 (0.0007) [2023-10-14 03:17:01,536][33201] Updated weights for policy 0, policy_version 55900 (0.0007) [2023-10-14 03:17:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115015680. Throughput: 0: 1736.3, 1: 1781.2. Samples: 28765254. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:17:04,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.950')] [2023-10-14 03:17:04,573][33226] Updated weights for policy 1, policy_version 56420 (0.0009) [2023-10-14 03:17:04,945][33226] Updated weights for policy 1, policy_version 56430 (0.0011) [2023-10-14 03:17:05,325][33226] Updated weights for policy 1, policy_version 56440 (0.0010) [2023-10-14 03:17:05,527][33201] Updated weights for policy 0, policy_version 55910 (0.0008) [2023-10-14 03:17:05,903][33201] Updated weights for policy 0, policy_version 55920 (0.0010) [2023-10-14 03:17:06,285][33201] Updated weights for policy 0, policy_version 55930 (0.0007) [2023-10-14 03:17:09,177][33226] Updated weights for policy 1, policy_version 56450 (0.0008) [2023-10-14 03:17:09,549][33226] Updated weights for policy 1, policy_version 56460 (0.0008) [2023-10-14 03:17:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115081216. Throughput: 0: 1768.4, 1: 1804.0. Samples: 28787286. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:17:09,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.930')] [2023-10-14 03:17:09,923][33226] Updated weights for policy 1, policy_version 56470 (0.0009) [2023-10-14 03:17:09,953][33201] Updated weights for policy 0, policy_version 55940 (0.0009) [2023-10-14 03:17:10,285][33226] Updated weights for policy 1, policy_version 56480 (0.0007) [2023-10-14 03:17:10,324][33201] Updated weights for policy 0, policy_version 55950 (0.0007) [2023-10-14 03:17:10,694][33201] Updated weights for policy 0, policy_version 55960 (0.0007) [2023-10-14 03:17:14,047][33226] Updated weights for policy 1, policy_version 56490 (0.0008) [2023-10-14 03:17:14,415][33226] Updated weights for policy 1, policy_version 56500 (0.0007) [2023-10-14 03:17:14,502][33201] Updated weights for policy 0, policy_version 55970 (0.0008) [2023-10-14 03:17:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 115146752. Throughput: 0: 1739.8, 1: 1770.3. Samples: 28796988. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:17:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:17:14,789][33226] Updated weights for policy 1, policy_version 56510 (0.0009) [2023-10-14 03:17:14,874][33201] Updated weights for policy 0, policy_version 55980 (0.0009) [2023-10-14 03:17:15,248][33201] Updated weights for policy 0, policy_version 55990 (0.0010) [2023-10-14 03:17:15,621][33201] Updated weights for policy 0, policy_version 56000 (0.0011) [2023-10-14 03:17:18,514][33226] Updated weights for policy 1, policy_version 56520 (0.0008) [2023-10-14 03:17:18,882][33226] Updated weights for policy 1, policy_version 56530 (0.0008) [2023-10-14 03:17:19,250][33226] Updated weights for policy 1, policy_version 56540 (0.0010) [2023-10-14 03:17:19,338][33201] Updated weights for policy 0, policy_version 56010 (0.0009) [2023-10-14 03:17:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 115245056. Throughput: 0: 1766.3, 1: 1796.4. Samples: 28819168. 
Policy #0 lag: (min: 23.0, avg: 23.0, max: 23.0) [2023-10-14 03:17:19,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-14 03:17:19,710][33201] Updated weights for policy 0, policy_version 56020 (0.0008) [2023-10-14 03:17:20,073][33201] Updated weights for policy 0, policy_version 56030 (0.0008) [2023-10-14 03:17:23,195][33226] Updated weights for policy 1, policy_version 56550 (0.0010) [2023-10-14 03:17:23,562][33226] Updated weights for policy 1, policy_version 56560 (0.0008) [2023-10-14 03:17:23,813][33201] Updated weights for policy 0, policy_version 56040 (0.0009) [2023-10-14 03:17:23,929][33226] Updated weights for policy 1, policy_version 56570 (0.0007) [2023-10-14 03:17:24,180][33201] Updated weights for policy 0, policy_version 56050 (0.0008) [2023-10-14 03:17:24,552][33201] Updated weights for policy 0, policy_version 56060 (0.0007) [2023-10-14 03:17:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 115310592. Throughput: 0: 1769.7, 1: 1774.5. Samples: 28839348. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:24,559][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 03:17:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000056576_57933824.pth... [2023-10-14 03:17:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000054912_56229888.pth [2023-10-14 03:17:24,692][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000056064_57409536.pth... 
[2023-10-14 03:17:24,722][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000054400_55705600.pth [2023-10-14 03:17:27,790][33226] Updated weights for policy 1, policy_version 56580 (0.0009) [2023-10-14 03:17:28,165][33226] Updated weights for policy 1, policy_version 56590 (0.0007) [2023-10-14 03:17:28,529][33226] Updated weights for policy 1, policy_version 56600 (0.0007) [2023-10-14 03:17:28,550][33201] Updated weights for policy 0, policy_version 56070 (0.0008) [2023-10-14 03:17:28,926][33201] Updated weights for policy 0, policy_version 56080 (0.0009) [2023-10-14 03:17:29,300][33201] Updated weights for policy 0, policy_version 56090 (0.0008) [2023-10-14 03:17:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 115408896. Throughput: 0: 1759.1, 1: 1783.0. Samples: 28850544. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:29,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:17:32,410][33226] Updated weights for policy 1, policy_version 56610 (0.0008) [2023-10-14 03:17:32,782][33226] Updated weights for policy 1, policy_version 56620 (0.0008) [2023-10-14 03:17:33,139][33226] Updated weights for policy 1, policy_version 56630 (0.0008) [2023-10-14 03:17:33,174][33201] Updated weights for policy 0, policy_version 56100 (0.0008) [2023-10-14 03:17:33,510][33226] Updated weights for policy 1, policy_version 56640 (0.0008) [2023-10-14 03:17:33,566][33201] Updated weights for policy 0, policy_version 56110 (0.0007) [2023-10-14 03:17:33,929][33201] Updated weights for policy 0, policy_version 56120 (0.0009) [2023-10-14 03:17:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 115474432. Throughput: 0: 1781.5, 1: 1789.6. Samples: 28871444. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:34,559][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:17:37,255][33226] Updated weights for policy 1, policy_version 56650 (0.0008) [2023-10-14 03:17:37,624][33226] Updated weights for policy 1, policy_version 56660 (0.0008) [2023-10-14 03:17:37,761][33201] Updated weights for policy 0, policy_version 56130 (0.0009) [2023-10-14 03:17:37,998][33226] Updated weights for policy 1, policy_version 56670 (0.0010) [2023-10-14 03:17:38,136][33201] Updated weights for policy 0, policy_version 56140 (0.0008) [2023-10-14 03:17:38,506][33201] Updated weights for policy 0, policy_version 56150 (0.0008) [2023-10-14 03:17:38,881][33201] Updated weights for policy 0, policy_version 56160 (0.0007) [2023-10-14 03:17:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 115539968. Throughput: 0: 1751.8, 1: 1769.8. Samples: 28891542. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:17:41,599][33226] Updated weights for policy 1, policy_version 56680 (0.0010) [2023-10-14 03:17:41,960][33226] Updated weights for policy 1, policy_version 56690 (0.0009) [2023-10-14 03:17:42,332][33226] Updated weights for policy 1, policy_version 56700 (0.0008) [2023-10-14 03:17:42,792][33201] Updated weights for policy 0, policy_version 56170 (0.0008) [2023-10-14 03:17:43,154][33201] Updated weights for policy 0, policy_version 56180 (0.0008) [2023-10-14 03:17:43,538][33201] Updated weights for policy 0, policy_version 56190 (0.0008) [2023-10-14 03:17:44,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 115605504. Throughput: 0: 1781.6, 1: 1781.3. Samples: 28903242. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:44,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:17:46,185][33226] Updated weights for policy 1, policy_version 56710 (0.0009) [2023-10-14 03:17:46,556][33226] Updated weights for policy 1, policy_version 56720 (0.0009) [2023-10-14 03:17:46,917][33226] Updated weights for policy 1, policy_version 56730 (0.0009) [2023-10-14 03:17:47,345][33201] Updated weights for policy 0, policy_version 56200 (0.0008) [2023-10-14 03:17:47,704][33201] Updated weights for policy 0, policy_version 56210 (0.0008) [2023-10-14 03:17:48,068][33201] Updated weights for policy 0, policy_version 56220 (0.0008) [2023-10-14 03:17:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 115671040. Throughput: 0: 1750.2, 1: 1762.3. Samples: 28923314. Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:49,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 03:17:50,723][33226] Updated weights for policy 1, policy_version 56740 (0.0009) [2023-10-14 03:17:51,088][33226] Updated weights for policy 1, policy_version 56750 (0.0010) [2023-10-14 03:17:51,451][33226] Updated weights for policy 1, policy_version 56760 (0.0010) [2023-10-14 03:17:52,083][33201] Updated weights for policy 0, policy_version 56230 (0.0008) [2023-10-14 03:17:52,460][33201] Updated weights for policy 0, policy_version 56240 (0.0009) [2023-10-14 03:17:52,832][33201] Updated weights for policy 0, policy_version 56250 (0.0008) [2023-10-14 03:17:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 115736576. Throughput: 0: 1743.6, 1: 1770.2. Samples: 28945408. 
Policy #0 lag: (min: 31.0, avg: 36.4, max: 63.0) [2023-10-14 03:17:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 03:17:55,217][33226] Updated weights for policy 1, policy_version 56770 (0.0009) [2023-10-14 03:17:55,592][33226] Updated weights for policy 1, policy_version 56780 (0.0008) [2023-10-14 03:17:55,967][33226] Updated weights for policy 1, policy_version 56790 (0.0008) [2023-10-14 03:17:56,331][33226] Updated weights for policy 1, policy_version 56800 (0.0009) [2023-10-14 03:17:56,544][33201] Updated weights for policy 0, policy_version 56260 (0.0008) [2023-10-14 03:17:56,922][33201] Updated weights for policy 0, policy_version 56270 (0.0007) [2023-10-14 03:17:57,287][33201] Updated weights for policy 0, policy_version 56280 (0.0007) [2023-10-14 03:17:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115802112. Throughput: 0: 1758.3, 1: 1769.5. Samples: 28955738. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:17:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 03:18:00,093][33226] Updated weights for policy 1, policy_version 56810 (0.0009) [2023-10-14 03:18:00,460][33226] Updated weights for policy 1, policy_version 56820 (0.0008) [2023-10-14 03:18:00,822][33226] Updated weights for policy 1, policy_version 56830 (0.0008) [2023-10-14 03:18:01,072][33201] Updated weights for policy 0, policy_version 56290 (0.0008) [2023-10-14 03:18:01,440][33201] Updated weights for policy 0, policy_version 56300 (0.0010) [2023-10-14 03:18:01,813][33201] Updated weights for policy 0, policy_version 56310 (0.0007) [2023-10-14 03:18:02,184][33201] Updated weights for policy 0, policy_version 56320 (0.0008) [2023-10-14 03:18:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115867648. Throughput: 0: 1745.9, 1: 1768.9. Samples: 28977334. 
Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:04,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 03:18:04,626][33226] Updated weights for policy 1, policy_version 56840 (0.0008) [2023-10-14 03:18:04,991][33226] Updated weights for policy 1, policy_version 56850 (0.0007) [2023-10-14 03:18:05,357][33226] Updated weights for policy 1, policy_version 56860 (0.0007) [2023-10-14 03:18:05,902][33201] Updated weights for policy 0, policy_version 56330 (0.0010) [2023-10-14 03:18:06,277][33201] Updated weights for policy 0, policy_version 56340 (0.0010) [2023-10-14 03:18:06,644][33201] Updated weights for policy 0, policy_version 56350 (0.0011) [2023-10-14 03:18:09,027][33226] Updated weights for policy 1, policy_version 56870 (0.0007) [2023-10-14 03:18:09,399][33226] Updated weights for policy 1, policy_version 56880 (0.0009) [2023-10-14 03:18:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115933184. Throughput: 0: 1763.4, 1: 1796.1. Samples: 28999526. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:09,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 03:18:09,763][33226] Updated weights for policy 1, policy_version 56890 (0.0008) [2023-10-14 03:18:10,346][33201] Updated weights for policy 0, policy_version 56360 (0.0008) [2023-10-14 03:18:10,715][33201] Updated weights for policy 0, policy_version 56370 (0.0008) [2023-10-14 03:18:11,088][33201] Updated weights for policy 0, policy_version 56380 (0.0007) [2023-10-14 03:18:13,595][33226] Updated weights for policy 1, policy_version 56900 (0.0007) [2023-10-14 03:18:13,994][33226] Updated weights for policy 1, policy_version 56910 (0.0008) [2023-10-14 03:18:14,364][33226] Updated weights for policy 1, policy_version 56920 (0.0009) [2023-10-14 03:18:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 115998720. Throughput: 0: 1754.9, 1: 1777.1. 
Samples: 29009484. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.960')] [2023-10-14 03:18:14,855][33201] Updated weights for policy 0, policy_version 56390 (0.0008) [2023-10-14 03:18:15,231][33201] Updated weights for policy 0, policy_version 56400 (0.0008) [2023-10-14 03:18:15,602][33201] Updated weights for policy 0, policy_version 56410 (0.0011) [2023-10-14 03:18:18,243][33226] Updated weights for policy 1, policy_version 56930 (0.0009) [2023-10-14 03:18:18,607][33226] Updated weights for policy 1, policy_version 56940 (0.0010) [2023-10-14 03:18:18,981][33226] Updated weights for policy 1, policy_version 56950 (0.0011) [2023-10-14 03:18:19,347][33226] Updated weights for policy 1, policy_version 56960 (0.0009) [2023-10-14 03:18:19,497][33201] Updated weights for policy 0, policy_version 56420 (0.0010) [2023-10-14 03:18:19,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 116097024. Throughput: 0: 1763.5, 1: 1792.9. Samples: 29031482. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:19,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.960')] [2023-10-14 03:18:19,899][33201] Updated weights for policy 0, policy_version 56430 (0.0008) [2023-10-14 03:18:20,266][33201] Updated weights for policy 0, policy_version 56440 (0.0009) [2023-10-14 03:18:23,171][33226] Updated weights for policy 1, policy_version 56970 (0.0007) [2023-10-14 03:18:23,537][33226] Updated weights for policy 1, policy_version 56980 (0.0008) [2023-10-14 03:18:23,901][33226] Updated weights for policy 1, policy_version 56990 (0.0009) [2023-10-14 03:18:24,014][33201] Updated weights for policy 0, policy_version 56450 (0.0008) [2023-10-14 03:18:24,392][33201] Updated weights for policy 0, policy_version 56460 (0.0007) [2023-10-14 03:18:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 116162560. 
Throughput: 0: 1789.3, 1: 1781.2. Samples: 29052214. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:24,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.960')] [2023-10-14 03:18:24,769][33201] Updated weights for policy 0, policy_version 56470 (0.0008) [2023-10-14 03:18:25,137][33201] Updated weights for policy 0, policy_version 56480 (0.0009) [2023-10-14 03:18:27,546][33226] Updated weights for policy 1, policy_version 57000 (0.0008) [2023-10-14 03:18:27,919][33226] Updated weights for policy 1, policy_version 57010 (0.0008) [2023-10-14 03:18:28,278][33226] Updated weights for policy 1, policy_version 57020 (0.0009) [2023-10-14 03:18:29,046][33201] Updated weights for policy 0, policy_version 56490 (0.0007) [2023-10-14 03:18:29,418][33201] Updated weights for policy 0, policy_version 56500 (0.0009) [2023-10-14 03:18:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 116228096. Throughput: 0: 1760.5, 1: 1799.0. Samples: 29063422. Policy #0 lag: (min: 11.0, avg: 11.5, max: 25.0) [2023-10-14 03:18:29,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.960')] [2023-10-14 03:18:29,795][33201] Updated weights for policy 0, policy_version 56510 (0.0008) [2023-10-14 03:18:32,015][33226] Updated weights for policy 1, policy_version 57030 (0.0008) [2023-10-14 03:18:32,381][33226] Updated weights for policy 1, policy_version 57040 (0.0009) [2023-10-14 03:18:32,761][33226] Updated weights for policy 1, policy_version 57050 (0.0008) [2023-10-14 03:18:33,543][33201] Updated weights for policy 0, policy_version 56520 (0.0007) [2023-10-14 03:18:33,911][33201] Updated weights for policy 0, policy_version 56530 (0.0008) [2023-10-14 03:18:34,281][33201] Updated weights for policy 0, policy_version 56540 (0.0007) [2023-10-14 03:18:34,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 116326400. Throughput: 0: 1792.2, 1: 1782.7. Samples: 29084184. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:34,557][31953] Avg episode reward: [(0, '20.790'), (1, '20.950')] [2023-10-14 03:18:36,627][33226] Updated weights for policy 1, policy_version 57060 (0.0008) [2023-10-14 03:18:36,997][33226] Updated weights for policy 1, policy_version 57070 (0.0008) [2023-10-14 03:18:37,359][33226] Updated weights for policy 1, policy_version 57080 (0.0007) [2023-10-14 03:18:38,153][33201] Updated weights for policy 0, policy_version 56550 (0.0007) [2023-10-14 03:18:38,529][33201] Updated weights for policy 0, policy_version 56560 (0.0007) [2023-10-14 03:18:38,896][33201] Updated weights for policy 0, policy_version 56570 (0.0009) [2023-10-14 03:18:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 116391936. Throughput: 0: 1766.0, 1: 1771.0. Samples: 29104574. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:39,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.930')] [2023-10-14 03:18:41,170][33226] Updated weights for policy 1, policy_version 57090 (0.0008) [2023-10-14 03:18:41,538][33226] Updated weights for policy 1, policy_version 57100 (0.0010) [2023-10-14 03:18:41,902][33226] Updated weights for policy 1, policy_version 57110 (0.0011) [2023-10-14 03:18:42,266][33226] Updated weights for policy 1, policy_version 57120 (0.0011) [2023-10-14 03:18:42,878][33201] Updated weights for policy 0, policy_version 56580 (0.0007) [2023-10-14 03:18:43,251][33201] Updated weights for policy 0, policy_version 56590 (0.0009) [2023-10-14 03:18:43,634][33201] Updated weights for policy 0, policy_version 56600 (0.0008) [2023-10-14 03:18:44,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 116457472. Throughput: 0: 1777.7, 1: 1780.8. Samples: 29115870. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.930')] [2023-10-14 03:18:46,261][33226] Updated weights for policy 1, policy_version 57130 (0.0009) [2023-10-14 03:18:46,614][33226] Updated weights for policy 1, policy_version 57140 (0.0008) [2023-10-14 03:18:46,978][33226] Updated weights for policy 1, policy_version 57150 (0.0009) [2023-10-14 03:18:47,475][33201] Updated weights for policy 0, policy_version 56610 (0.0009) [2023-10-14 03:18:47,850][33201] Updated weights for policy 0, policy_version 56620 (0.0008) [2023-10-14 03:18:48,225][33201] Updated weights for policy 0, policy_version 56630 (0.0010) [2023-10-14 03:18:48,590][33201] Updated weights for policy 0, policy_version 56640 (0.0008) [2023-10-14 03:18:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 116523008. Throughput: 0: 1772.8, 1: 1763.9. Samples: 29136484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.950')] [2023-10-14 03:18:50,709][33226] Updated weights for policy 1, policy_version 57160 (0.0007) [2023-10-14 03:18:51,073][33226] Updated weights for policy 1, policy_version 57170 (0.0007) [2023-10-14 03:18:51,437][33226] Updated weights for policy 1, policy_version 57180 (0.0008) [2023-10-14 03:18:52,404][33201] Updated weights for policy 0, policy_version 56650 (0.0007) [2023-10-14 03:18:52,774][33201] Updated weights for policy 0, policy_version 56660 (0.0008) [2023-10-14 03:18:53,143][33201] Updated weights for policy 0, policy_version 56670 (0.0008) [2023-10-14 03:18:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 116588544. Throughput: 0: 1757.3, 1: 1777.9. Samples: 29158612. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.970')] [2023-10-14 03:18:55,091][33226] Updated weights for policy 1, policy_version 57190 (0.0008) [2023-10-14 03:18:55,464][33226] Updated weights for policy 1, policy_version 57200 (0.0007) [2023-10-14 03:18:55,834][33226] Updated weights for policy 1, policy_version 57210 (0.0007) [2023-10-14 03:18:56,813][33201] Updated weights for policy 0, policy_version 56680 (0.0007) [2023-10-14 03:18:57,188][33201] Updated weights for policy 0, policy_version 56690 (0.0007) [2023-10-14 03:18:57,555][33201] Updated weights for policy 0, policy_version 56700 (0.0008) [2023-10-14 03:18:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 116654080. Throughput: 0: 1774.1, 1: 1775.2. Samples: 29169202. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:18:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.970')] [2023-10-14 03:18:59,636][33226] Updated weights for policy 1, policy_version 57220 (0.0008) [2023-10-14 03:19:00,027][33226] Updated weights for policy 1, policy_version 57230 (0.0008) [2023-10-14 03:19:00,399][33226] Updated weights for policy 1, policy_version 57240 (0.0007) [2023-10-14 03:19:01,366][33201] Updated weights for policy 0, policy_version 56710 (0.0009) [2023-10-14 03:19:01,732][33201] Updated weights for policy 0, policy_version 56720 (0.0010) [2023-10-14 03:19:02,099][33201] Updated weights for policy 0, policy_version 56730 (0.0010) [2023-10-14 03:19:04,074][33226] Updated weights for policy 1, policy_version 57250 (0.0007) [2023-10-14 03:19:04,435][33226] Updated weights for policy 1, policy_version 57260 (0.0008) [2023-10-14 03:19:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 116719616. Throughput: 0: 1753.4, 1: 1781.3. Samples: 29190542. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:19:04,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:19:04,794][33226] Updated weights for policy 1, policy_version 57270 (0.0010) [2023-10-14 03:19:05,166][33226] Updated weights for policy 1, policy_version 57280 (0.0007) [2023-10-14 03:19:05,970][33201] Updated weights for policy 0, policy_version 56740 (0.0009) [2023-10-14 03:19:06,355][33201] Updated weights for policy 0, policy_version 56750 (0.0007) [2023-10-14 03:19:06,726][33201] Updated weights for policy 0, policy_version 56760 (0.0007) [2023-10-14 03:19:08,851][33226] Updated weights for policy 1, policy_version 57290 (0.0009) [2023-10-14 03:19:09,214][33226] Updated weights for policy 1, policy_version 57300 (0.0010) [2023-10-14 03:19:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 116785152. Throughput: 0: 1761.3, 1: 1801.8. Samples: 29212556. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:19:09,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:19:09,572][33226] Updated weights for policy 1, policy_version 57310 (0.0008) [2023-10-14 03:19:10,413][33201] Updated weights for policy 0, policy_version 56770 (0.0007) [2023-10-14 03:19:10,785][33201] Updated weights for policy 0, policy_version 56780 (0.0009) [2023-10-14 03:19:11,164][33201] Updated weights for policy 0, policy_version 56790 (0.0010) [2023-10-14 03:19:11,534][33201] Updated weights for policy 0, policy_version 56800 (0.0009) [2023-10-14 03:19:13,335][33226] Updated weights for policy 1, policy_version 57320 (0.0008) [2023-10-14 03:19:13,698][33226] Updated weights for policy 1, policy_version 57330 (0.0010) [2023-10-14 03:19:14,072][33226] Updated weights for policy 1, policy_version 57340 (0.0008) [2023-10-14 03:19:14,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 116883456. Throughput: 0: 1760.6, 1: 1781.7. 
Samples: 29222828. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:19:15,384][33201] Updated weights for policy 0, policy_version 56810 (0.0010) [2023-10-14 03:19:15,748][33201] Updated weights for policy 0, policy_version 56820 (0.0007) [2023-10-14 03:19:16,116][33201] Updated weights for policy 0, policy_version 56830 (0.0008) [2023-10-14 03:19:17,898][33226] Updated weights for policy 1, policy_version 57350 (0.0008) [2023-10-14 03:19:18,275][33226] Updated weights for policy 1, policy_version 57360 (0.0008) [2023-10-14 03:19:18,644][33226] Updated weights for policy 1, policy_version 57370 (0.0009) [2023-10-14 03:19:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 116948992. Throughput: 0: 1764.2, 1: 1799.7. Samples: 29244562. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:19:19,804][33201] Updated weights for policy 0, policy_version 56840 (0.0008) [2023-10-14 03:19:20,171][33201] Updated weights for policy 0, policy_version 56850 (0.0009) [2023-10-14 03:19:20,550][33201] Updated weights for policy 0, policy_version 56860 (0.0008) [2023-10-14 03:19:22,512][33226] Updated weights for policy 1, policy_version 57380 (0.0010) [2023-10-14 03:19:22,881][33226] Updated weights for policy 1, policy_version 57390 (0.0009) [2023-10-14 03:19:23,250][33226] Updated weights for policy 1, policy_version 57400 (0.0009) [2023-10-14 03:19:24,346][33201] Updated weights for policy 0, policy_version 56870 (0.0008) [2023-10-14 03:19:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117014528. Throughput: 0: 1798.9, 1: 1779.2. Samples: 29265590. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:19:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000057408_58785792.pth... [2023-10-14 03:19:24,605][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000055744_57081856.pth [2023-10-14 03:19:24,713][33201] Updated weights for policy 0, policy_version 56880 (0.0008) [2023-10-14 03:19:25,085][33201] Updated weights for policy 0, policy_version 56890 (0.0008) [2023-10-14 03:19:25,302][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000056896_58261504.pth... [2023-10-14 03:19:25,333][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000055232_56557568.pth [2023-10-14 03:19:27,134][33226] Updated weights for policy 1, policy_version 57410 (0.0008) [2023-10-14 03:19:27,504][33226] Updated weights for policy 1, policy_version 57420 (0.0010) [2023-10-14 03:19:27,877][33226] Updated weights for policy 1, policy_version 57430 (0.0010) [2023-10-14 03:19:28,233][33226] Updated weights for policy 1, policy_version 57440 (0.0010) [2023-10-14 03:19:28,888][33201] Updated weights for policy 0, policy_version 56900 (0.0010) [2023-10-14 03:19:29,263][33201] Updated weights for policy 0, policy_version 56910 (0.0007) [2023-10-14 03:19:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117080064. Throughput: 0: 1770.1, 1: 1803.4. Samples: 29276674. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:29,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:19:29,630][33201] Updated weights for policy 0, policy_version 56920 (0.0008) [2023-10-14 03:19:32,064][33226] Updated weights for policy 1, policy_version 57450 (0.0008) [2023-10-14 03:19:32,425][33226] Updated weights for policy 1, policy_version 57460 (0.0008) [2023-10-14 03:19:32,797][33226] Updated weights for policy 1, policy_version 57470 (0.0007) [2023-10-14 03:19:33,455][33201] Updated weights for policy 0, policy_version 56930 (0.0009) [2023-10-14 03:19:33,830][33201] Updated weights for policy 0, policy_version 56940 (0.0008) [2023-10-14 03:19:34,204][33201] Updated weights for policy 0, policy_version 56950 (0.0009) [2023-10-14 03:19:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 117145600. Throughput: 0: 1789.8, 1: 1787.1. Samples: 29297444. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:34,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:19:34,580][33201] Updated weights for policy 0, policy_version 56960 (0.0009) [2023-10-14 03:19:36,417][33226] Updated weights for policy 1, policy_version 57480 (0.0007) [2023-10-14 03:19:36,771][33226] Updated weights for policy 1, policy_version 57490 (0.0009) [2023-10-14 03:19:37,140][33226] Updated weights for policy 1, policy_version 57500 (0.0009) [2023-10-14 03:19:38,398][33201] Updated weights for policy 0, policy_version 56970 (0.0009) [2023-10-14 03:19:38,775][33201] Updated weights for policy 0, policy_version 56980 (0.0009) [2023-10-14 03:19:39,146][33201] Updated weights for policy 0, policy_version 56990 (0.0008) [2023-10-14 03:19:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117243904. Throughput: 0: 1773.2, 1: 1778.9. Samples: 29318458. 
Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:39,561][31953] Avg episode reward: [(0, '20.890'), (1, '20.930')] [2023-10-14 03:19:40,928][33226] Updated weights for policy 1, policy_version 57510 (0.0009) [2023-10-14 03:19:41,293][33226] Updated weights for policy 1, policy_version 57520 (0.0009) [2023-10-14 03:19:41,670][33226] Updated weights for policy 1, policy_version 57530 (0.0009) [2023-10-14 03:19:42,938][33201] Updated weights for policy 0, policy_version 57000 (0.0007) [2023-10-14 03:19:43,309][33201] Updated weights for policy 0, policy_version 57010 (0.0007) [2023-10-14 03:19:43,681][33201] Updated weights for policy 0, policy_version 57020 (0.0009) [2023-10-14 03:19:44,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117309440. Throughput: 0: 1777.4, 1: 1780.7. Samples: 29329314. Policy #0 lag: (min: 31.0, avg: 34.3, max: 63.0) [2023-10-14 03:19:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 03:19:45,529][33226] Updated weights for policy 1, policy_version 57540 (0.0010) [2023-10-14 03:19:45,933][33226] Updated weights for policy 1, policy_version 57550 (0.0009) [2023-10-14 03:19:46,301][33226] Updated weights for policy 1, policy_version 57560 (0.0009) [2023-10-14 03:19:47,627][33201] Updated weights for policy 0, policy_version 57030 (0.0009) [2023-10-14 03:19:47,996][33201] Updated weights for policy 0, policy_version 57040 (0.0012) [2023-10-14 03:19:48,367][33201] Updated weights for policy 0, policy_version 57050 (0.0011) [2023-10-14 03:19:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117374976. Throughput: 0: 1777.2, 1: 1773.5. Samples: 29350324. 
Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:19:49,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.900')] [2023-10-14 03:19:50,094][33226] Updated weights for policy 1, policy_version 57570 (0.0008) [2023-10-14 03:19:50,470][33226] Updated weights for policy 1, policy_version 57580 (0.0008) [2023-10-14 03:19:50,841][33226] Updated weights for policy 1, policy_version 57590 (0.0008) [2023-10-14 03:19:51,195][33226] Updated weights for policy 1, policy_version 57600 (0.0007) [2023-10-14 03:19:52,309][33201] Updated weights for policy 0, policy_version 57060 (0.0008) [2023-10-14 03:19:52,702][33201] Updated weights for policy 0, policy_version 57070 (0.0009) [2023-10-14 03:19:53,063][33201] Updated weights for policy 0, policy_version 57080 (0.0009) [2023-10-14 03:19:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117440512. Throughput: 0: 1756.5, 1: 1782.4. Samples: 29371810. Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:19:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.900')] [2023-10-14 03:19:54,764][33226] Updated weights for policy 1, policy_version 57610 (0.0009) [2023-10-14 03:19:55,136][33226] Updated weights for policy 1, policy_version 57620 (0.0010) [2023-10-14 03:19:55,501][33226] Updated weights for policy 1, policy_version 57630 (0.0007) [2023-10-14 03:19:57,013][33201] Updated weights for policy 0, policy_version 57090 (0.0010) [2023-10-14 03:19:57,380][33201] Updated weights for policy 0, policy_version 57100 (0.0008) [2023-10-14 03:19:57,750][33201] Updated weights for policy 0, policy_version 57110 (0.0009) [2023-10-14 03:19:58,122][33201] Updated weights for policy 0, policy_version 57120 (0.0008) [2023-10-14 03:19:59,125][33226] Updated weights for policy 1, policy_version 57640 (0.0007) [2023-10-14 03:19:59,501][33226] Updated weights for policy 1, policy_version 57650 (0.0008) [2023-10-14 03:19:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 
14199.5, 300 sec: 14218.0). Total num frames: 117506048. Throughput: 0: 1783.1, 1: 1773.6. Samples: 29382880. Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:19:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 03:19:59,865][33226] Updated weights for policy 1, policy_version 57660 (0.0007) [2023-10-14 03:20:02,021][33201] Updated weights for policy 0, policy_version 57130 (0.0007) [2023-10-14 03:20:02,396][33201] Updated weights for policy 0, policy_version 57140 (0.0007) [2023-10-14 03:20:02,770][33201] Updated weights for policy 0, policy_version 57150 (0.0009) [2023-10-14 03:20:03,709][33226] Updated weights for policy 1, policy_version 57670 (0.0008) [2023-10-14 03:20:04,067][33226] Updated weights for policy 1, policy_version 57680 (0.0007) [2023-10-14 03:20:04,438][33226] Updated weights for policy 1, policy_version 57690 (0.0007) [2023-10-14 03:20:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117571584. Throughput: 0: 1750.4, 1: 1790.9. Samples: 29403916. Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:20:04,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 03:20:06,437][33201] Updated weights for policy 0, policy_version 57160 (0.0009) [2023-10-14 03:20:06,800][33201] Updated weights for policy 0, policy_version 57170 (0.0007) [2023-10-14 03:20:07,178][33201] Updated weights for policy 0, policy_version 57180 (0.0008) [2023-10-14 03:20:08,143][33226] Updated weights for policy 1, policy_version 57700 (0.0007) [2023-10-14 03:20:08,519][33226] Updated weights for policy 1, policy_version 57710 (0.0009) [2023-10-14 03:20:08,897][33226] Updated weights for policy 1, policy_version 57720 (0.0007) [2023-10-14 03:20:09,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 117669888. Throughput: 0: 1753.5, 1: 1796.2. Samples: 29425326. 
Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:20:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.860')] [2023-10-14 03:20:11,050][33201] Updated weights for policy 0, policy_version 57190 (0.0009) [2023-10-14 03:20:11,415][33201] Updated weights for policy 0, policy_version 57200 (0.0009) [2023-10-14 03:20:11,800][33201] Updated weights for policy 0, policy_version 57210 (0.0010) [2023-10-14 03:20:12,691][33226] Updated weights for policy 1, policy_version 57730 (0.0007) [2023-10-14 03:20:13,056][33226] Updated weights for policy 1, policy_version 57740 (0.0008) [2023-10-14 03:20:13,413][33226] Updated weights for policy 1, policy_version 57750 (0.0010) [2023-10-14 03:20:13,783][33226] Updated weights for policy 1, policy_version 57760 (0.0010) [2023-10-14 03:20:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 117735424. Throughput: 0: 1747.1, 1: 1795.3. Samples: 29436082. Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:20:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 03:20:15,430][33201] Updated weights for policy 0, policy_version 57220 (0.0009) [2023-10-14 03:20:15,800][33201] Updated weights for policy 0, policy_version 57230 (0.0007) [2023-10-14 03:20:16,176][33201] Updated weights for policy 0, policy_version 57240 (0.0009) [2023-10-14 03:20:17,567][33226] Updated weights for policy 1, policy_version 57770 (0.0010) [2023-10-14 03:20:17,941][33226] Updated weights for policy 1, policy_version 57780 (0.0008) [2023-10-14 03:20:18,297][33226] Updated weights for policy 1, policy_version 57790 (0.0009) [2023-10-14 03:20:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117800960. Throughput: 0: 1748.6, 1: 1807.2. Samples: 29457452. 
Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:20:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 03:20:20,009][33201] Updated weights for policy 0, policy_version 57250 (0.0009) [2023-10-14 03:20:20,379][33201] Updated weights for policy 0, policy_version 57260 (0.0010) [2023-10-14 03:20:20,755][33201] Updated weights for policy 0, policy_version 57270 (0.0010) [2023-10-14 03:20:21,124][33201] Updated weights for policy 0, policy_version 57280 (0.0010) [2023-10-14 03:20:22,018][33226] Updated weights for policy 1, policy_version 57800 (0.0008) [2023-10-14 03:20:22,382][33226] Updated weights for policy 1, policy_version 57810 (0.0008) [2023-10-14 03:20:22,741][33226] Updated weights for policy 1, policy_version 57820 (0.0010) [2023-10-14 03:20:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117866496. Throughput: 0: 1770.8, 1: 1790.3. Samples: 29478708. Policy #0 lag: (min: 16.0, avg: 39.2, max: 48.0) [2023-10-14 03:20:24,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:25,156][33201] Updated weights for policy 0, policy_version 57290 (0.0009) [2023-10-14 03:20:25,529][33201] Updated weights for policy 0, policy_version 57300 (0.0009) [2023-10-14 03:20:25,894][33201] Updated weights for policy 0, policy_version 57310 (0.0008) [2023-10-14 03:20:26,594][33226] Updated weights for policy 1, policy_version 57830 (0.0009) [2023-10-14 03:20:26,966][33226] Updated weights for policy 1, policy_version 57840 (0.0008) [2023-10-14 03:20:27,331][33226] Updated weights for policy 1, policy_version 57850 (0.0008) [2023-10-14 03:20:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117932032. Throughput: 0: 1745.0, 1: 1803.3. Samples: 29488986. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:29,691][33201] Updated weights for policy 0, policy_version 57320 (0.0008) [2023-10-14 03:20:30,057][33201] Updated weights for policy 0, policy_version 57330 (0.0011) [2023-10-14 03:20:30,430][33201] Updated weights for policy 0, policy_version 57340 (0.0007) [2023-10-14 03:20:31,174][33226] Updated weights for policy 1, policy_version 57860 (0.0010) [2023-10-14 03:20:31,539][33226] Updated weights for policy 1, policy_version 57870 (0.0009) [2023-10-14 03:20:31,910][33226] Updated weights for policy 1, policy_version 57880 (0.0007) [2023-10-14 03:20:34,377][33201] Updated weights for policy 0, policy_version 57350 (0.0007) [2023-10-14 03:20:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 117997568. Throughput: 0: 1761.1, 1: 1793.0. Samples: 29510260. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:34,752][33201] Updated weights for policy 0, policy_version 57360 (0.0009) [2023-10-14 03:20:35,118][33201] Updated weights for policy 0, policy_version 57370 (0.0009) [2023-10-14 03:20:35,596][33226] Updated weights for policy 1, policy_version 57890 (0.0008) [2023-10-14 03:20:35,993][33226] Updated weights for policy 1, policy_version 57900 (0.0009) [2023-10-14 03:20:36,361][33226] Updated weights for policy 1, policy_version 57910 (0.0009) [2023-10-14 03:20:36,727][33226] Updated weights for policy 1, policy_version 57920 (0.0008) [2023-10-14 03:20:39,078][33201] Updated weights for policy 0, policy_version 57380 (0.0008) [2023-10-14 03:20:39,452][33201] Updated weights for policy 0, policy_version 57390 (0.0009) [2023-10-14 03:20:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 118063104. Throughput: 0: 1771.4, 1: 1791.2. 
Samples: 29532124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:39,827][33201] Updated weights for policy 0, policy_version 57400 (0.0011) [2023-10-14 03:20:40,460][33226] Updated weights for policy 1, policy_version 57930 (0.0010) [2023-10-14 03:20:40,832][33226] Updated weights for policy 1, policy_version 57940 (0.0010) [2023-10-14 03:20:41,196][33226] Updated weights for policy 1, policy_version 57950 (0.0011) [2023-10-14 03:20:43,511][33201] Updated weights for policy 0, policy_version 57410 (0.0009) [2023-10-14 03:20:43,874][33201] Updated weights for policy 0, policy_version 57420 (0.0008) [2023-10-14 03:20:44,247][33201] Updated weights for policy 0, policy_version 57430 (0.0009) [2023-10-14 03:20:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 118128640. Throughput: 0: 1751.1, 1: 1785.4. Samples: 29542022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:44,613][33201] Updated weights for policy 0, policy_version 57440 (0.0009) [2023-10-14 03:20:44,869][33226] Updated weights for policy 1, policy_version 57960 (0.0011) [2023-10-14 03:20:45,236][33226] Updated weights for policy 1, policy_version 57970 (0.0010) [2023-10-14 03:20:45,615][33226] Updated weights for policy 1, policy_version 57980 (0.0009) [2023-10-14 03:20:48,500][33201] Updated weights for policy 0, policy_version 57450 (0.0008) [2023-10-14 03:20:48,867][33201] Updated weights for policy 0, policy_version 57460 (0.0008) [2023-10-14 03:20:49,231][33201] Updated weights for policy 0, policy_version 57470 (0.0007) [2023-10-14 03:20:49,430][33226] Updated weights for policy 1, policy_version 57990 (0.0007) [2023-10-14 03:20:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 118226944. 
Throughput: 0: 1781.0, 1: 1785.8. Samples: 29564420. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:20:49,806][33226] Updated weights for policy 1, policy_version 58000 (0.0008) [2023-10-14 03:20:50,168][33226] Updated weights for policy 1, policy_version 58010 (0.0009) [2023-10-14 03:20:52,782][33201] Updated weights for policy 0, policy_version 57480 (0.0007) [2023-10-14 03:20:53,159][33201] Updated weights for policy 0, policy_version 57490 (0.0008) [2023-10-14 03:20:53,526][33201] Updated weights for policy 0, policy_version 57500 (0.0008) [2023-10-14 03:20:54,091][33226] Updated weights for policy 1, policy_version 58020 (0.0009) [2023-10-14 03:20:54,460][33226] Updated weights for policy 1, policy_version 58030 (0.0008) [2023-10-14 03:20:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 118292480. Throughput: 0: 1744.5, 1: 1803.6. Samples: 29584992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 03:20:54,835][33226] Updated weights for policy 1, policy_version 58040 (0.0008) [2023-10-14 03:20:57,373][33201] Updated weights for policy 0, policy_version 57510 (0.0008) [2023-10-14 03:20:57,746][33201] Updated weights for policy 0, policy_version 57520 (0.0008) [2023-10-14 03:20:58,121][33201] Updated weights for policy 0, policy_version 57530 (0.0008) [2023-10-14 03:20:58,430][33226] Updated weights for policy 1, policy_version 58050 (0.0009) [2023-10-14 03:20:58,795][33226] Updated weights for policy 1, policy_version 58060 (0.0010) [2023-10-14 03:20:59,162][33226] Updated weights for policy 1, policy_version 58070 (0.0007) [2023-10-14 03:20:59,527][33226] Updated weights for policy 1, policy_version 58080 (0.0009) [2023-10-14 03:20:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). 
Total num frames: 118390784. Throughput: 0: 1784.2, 1: 1775.4. Samples: 29596264. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:20:59,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 03:21:01,819][33201] Updated weights for policy 0, policy_version 57540 (0.0007) [2023-10-14 03:21:02,191][33201] Updated weights for policy 0, policy_version 57550 (0.0010) [2023-10-14 03:21:02,546][33201] Updated weights for policy 0, policy_version 57560 (0.0010) [2023-10-14 03:21:03,359][33226] Updated weights for policy 1, policy_version 58090 (0.0010) [2023-10-14 03:21:03,732][33226] Updated weights for policy 1, policy_version 58100 (0.0009) [2023-10-14 03:21:04,099][33226] Updated weights for policy 1, policy_version 58110 (0.0008) [2023-10-14 03:21:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 118456320. Throughput: 0: 1751.5, 1: 1805.9. Samples: 29617536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:04,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.860')] [2023-10-14 03:21:06,393][33201] Updated weights for policy 0, policy_version 57570 (0.0008) [2023-10-14 03:21:06,763][33201] Updated weights for policy 0, policy_version 57580 (0.0009) [2023-10-14 03:21:07,134][33201] Updated weights for policy 0, policy_version 57590 (0.0009) [2023-10-14 03:21:07,509][33201] Updated weights for policy 0, policy_version 57600 (0.0008) [2023-10-14 03:21:07,862][33226] Updated weights for policy 1, policy_version 58120 (0.0010) [2023-10-14 03:21:08,239][33226] Updated weights for policy 1, policy_version 58130 (0.0007) [2023-10-14 03:21:08,598][33226] Updated weights for policy 1, policy_version 58140 (0.0008) [2023-10-14 03:21:09,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 118521856. Throughput: 0: 1758.7, 1: 1787.1. Samples: 29638268. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.860')] [2023-10-14 03:21:11,399][33201] Updated weights for policy 0, policy_version 57610 (0.0010) [2023-10-14 03:21:11,778][33201] Updated weights for policy 0, policy_version 57620 (0.0010) [2023-10-14 03:21:12,150][33201] Updated weights for policy 0, policy_version 57630 (0.0011) [2023-10-14 03:21:12,409][33226] Updated weights for policy 1, policy_version 58150 (0.0008) [2023-10-14 03:21:12,777][33226] Updated weights for policy 1, policy_version 58160 (0.0008) [2023-10-14 03:21:13,146][33226] Updated weights for policy 1, policy_version 58170 (0.0007) [2023-10-14 03:21:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 118587392. Throughput: 0: 1769.5, 1: 1803.3. Samples: 29649764. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.860')] [2023-10-14 03:21:15,962][33201] Updated weights for policy 0, policy_version 57640 (0.0009) [2023-10-14 03:21:16,328][33201] Updated weights for policy 0, policy_version 57650 (0.0010) [2023-10-14 03:21:16,701][33201] Updated weights for policy 0, policy_version 57660 (0.0007) [2023-10-14 03:21:16,790][33226] Updated weights for policy 1, policy_version 58180 (0.0009) [2023-10-14 03:21:17,148][33226] Updated weights for policy 1, policy_version 58190 (0.0010) [2023-10-14 03:21:17,514][33226] Updated weights for policy 1, policy_version 58200 (0.0010) [2023-10-14 03:21:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 118652928. Throughput: 0: 1765.4, 1: 1795.9. Samples: 29670516. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.860')] [2023-10-14 03:21:20,544][33201] Updated weights for policy 0, policy_version 57670 (0.0010) [2023-10-14 03:21:20,914][33201] Updated weights for policy 0, policy_version 57680 (0.0012) [2023-10-14 03:21:21,289][33201] Updated weights for policy 0, policy_version 57690 (0.0008) [2023-10-14 03:21:21,389][33226] Updated weights for policy 1, policy_version 58210 (0.0011) [2023-10-14 03:21:21,802][33226] Updated weights for policy 1, policy_version 58220 (0.0009) [2023-10-14 03:21:22,171][33226] Updated weights for policy 1, policy_version 58230 (0.0011) [2023-10-14 03:21:22,540][33226] Updated weights for policy 1, policy_version 58240 (0.0007) [2023-10-14 03:21:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 118718464. Throughput: 0: 1773.3, 1: 1795.5. Samples: 29692720. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.860')] [2023-10-14 03:21:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000057696_59080704.pth... [2023-10-14 03:21:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000058240_59637760.pth... 
[2023-10-14 03:21:24,610][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000056064_57409536.pth [2023-10-14 03:21:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000056576_57933824.pth [2023-10-14 03:21:24,615][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000057696_59080704.pth [2023-10-14 03:21:24,615][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000058240_59637760.pth [2023-10-14 03:21:25,080][33201] Updated weights for policy 0, policy_version 57700 (0.0008) [2023-10-14 03:21:25,470][33201] Updated weights for policy 0, policy_version 57710 (0.0009) [2023-10-14 03:21:25,846][33201] Updated weights for policy 0, policy_version 57720 (0.0008) [2023-10-14 03:21:26,181][33226] Updated weights for policy 1, policy_version 58250 (0.0007) [2023-10-14 03:21:26,543][33226] Updated weights for policy 1, policy_version 58260 (0.0007) [2023-10-14 03:21:26,910][33226] Updated weights for policy 1, policy_version 58270 (0.0008) [2023-10-14 03:21:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 118784000. Throughput: 0: 1763.3, 1: 1800.0. Samples: 29702372. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.860')] [2023-10-14 03:21:29,610][33201] Updated weights for policy 0, policy_version 57730 (0.0007) [2023-10-14 03:21:29,985][33201] Updated weights for policy 0, policy_version 57740 (0.0008) [2023-10-14 03:21:30,359][33201] Updated weights for policy 0, policy_version 57750 (0.0009) [2023-10-14 03:21:30,569][33226] Updated weights for policy 1, policy_version 58280 (0.0008) [2023-10-14 03:21:30,730][33201] Updated weights for policy 0, policy_version 57760 (0.0008) [2023-10-14 03:21:30,930][33226] Updated weights for policy 1, policy_version 58290 (0.0011) [2023-10-14 03:21:31,299][33226] Updated weights for policy 1, policy_version 58300 (0.0008) [2023-10-14 03:21:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 118849536. Throughput: 0: 1761.0, 1: 1793.4. Samples: 29724368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.850')] [2023-10-14 03:21:34,656][33201] Updated weights for policy 0, policy_version 57770 (0.0008) [2023-10-14 03:21:35,023][33201] Updated weights for policy 0, policy_version 57780 (0.0008) [2023-10-14 03:21:35,198][33226] Updated weights for policy 1, policy_version 58310 (0.0010) [2023-10-14 03:21:35,390][33201] Updated weights for policy 0, policy_version 57790 (0.0008) [2023-10-14 03:21:35,561][33226] Updated weights for policy 1, policy_version 58320 (0.0010) [2023-10-14 03:21:35,927][33226] Updated weights for policy 1, policy_version 58330 (0.0011) [2023-10-14 03:21:39,076][33201] Updated weights for policy 0, policy_version 57800 (0.0012) [2023-10-14 03:21:39,443][33201] Updated weights for policy 0, policy_version 57810 (0.0008) [2023-10-14 03:21:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 118915072. Throughput: 0: 1785.0, 1: 1794.7. 
Samples: 29746076. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:21:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 03:21:39,721][33226] Updated weights for policy 1, policy_version 58340 (0.0008) [2023-10-14 03:21:39,818][33201] Updated weights for policy 0, policy_version 57820 (0.0008) [2023-10-14 03:21:40,086][33226] Updated weights for policy 1, policy_version 58350 (0.0008) [2023-10-14 03:21:40,458][33226] Updated weights for policy 1, policy_version 58360 (0.0008) [2023-10-14 03:21:43,696][33201] Updated weights for policy 0, policy_version 57830 (0.0009) [2023-10-14 03:21:44,070][33201] Updated weights for policy 0, policy_version 57840 (0.0008) [2023-10-14 03:21:44,171][33226] Updated weights for policy 1, policy_version 58370 (0.0007) [2023-10-14 03:21:44,439][33201] Updated weights for policy 0, policy_version 57850 (0.0007) [2023-10-14 03:21:44,546][33226] Updated weights for policy 1, policy_version 58380 (0.0008) [2023-10-14 03:21:44,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 118980608. Throughput: 0: 1759.2, 1: 1788.7. Samples: 29755920. 
Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:21:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 03:21:44,914][33226] Updated weights for policy 1, policy_version 58390 (0.0010) [2023-10-14 03:21:45,286][33226] Updated weights for policy 1, policy_version 58400 (0.0008) [2023-10-14 03:21:48,445][33201] Updated weights for policy 0, policy_version 57860 (0.0009) [2023-10-14 03:21:48,828][33201] Updated weights for policy 0, policy_version 57870 (0.0009) [2023-10-14 03:21:49,098][33226] Updated weights for policy 1, policy_version 58410 (0.0009) [2023-10-14 03:21:49,198][33201] Updated weights for policy 0, policy_version 57880 (0.0008) [2023-10-14 03:21:49,463][33226] Updated weights for policy 1, policy_version 58420 (0.0007) [2023-10-14 03:21:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 119078912. Throughput: 0: 1783.0, 1: 1779.7. Samples: 29777856. Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:21:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 03:21:49,827][33226] Updated weights for policy 1, policy_version 58430 (0.0008) [2023-10-14 03:21:53,039][33201] Updated weights for policy 0, policy_version 57890 (0.0007) [2023-10-14 03:21:53,408][33201] Updated weights for policy 0, policy_version 57900 (0.0007) [2023-10-14 03:21:53,591][33226] Updated weights for policy 1, policy_version 58440 (0.0008) [2023-10-14 03:21:53,782][33201] Updated weights for policy 0, policy_version 57910 (0.0007) [2023-10-14 03:21:53,958][33226] Updated weights for policy 1, policy_version 58450 (0.0008) [2023-10-14 03:21:54,140][33201] Updated weights for policy 0, policy_version 57920 (0.0008) [2023-10-14 03:21:54,319][33226] Updated weights for policy 1, policy_version 58460 (0.0008) [2023-10-14 03:21:54,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14745.5, 300 sec: 14329.1). Total num frames: 119177216. Throughput: 0: 1749.0, 1: 1797.7. 
Samples: 29797872. Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:21:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:21:57,980][33201] Updated weights for policy 0, policy_version 57930 (0.0008) [2023-10-14 03:21:58,074][33226] Updated weights for policy 1, policy_version 58470 (0.0008) [2023-10-14 03:21:58,354][33201] Updated weights for policy 0, policy_version 57940 (0.0008) [2023-10-14 03:21:58,436][33226] Updated weights for policy 1, policy_version 58480 (0.0007) [2023-10-14 03:21:58,718][33201] Updated weights for policy 0, policy_version 57950 (0.0007) [2023-10-14 03:21:58,808][33226] Updated weights for policy 1, policy_version 58490 (0.0008) [2023-10-14 03:21:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 119242752. Throughput: 0: 1772.2, 1: 1781.6. Samples: 29809682. Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:21:59,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:22:02,457][33201] Updated weights for policy 0, policy_version 57960 (0.0009) [2023-10-14 03:22:02,718][33226] Updated weights for policy 1, policy_version 58500 (0.0008) [2023-10-14 03:22:02,821][33201] Updated weights for policy 0, policy_version 57970 (0.0007) [2023-10-14 03:22:03,079][33226] Updated weights for policy 1, policy_version 58510 (0.0007) [2023-10-14 03:22:03,184][33201] Updated weights for policy 0, policy_version 57980 (0.0007) [2023-10-14 03:22:03,440][33226] Updated weights for policy 1, policy_version 58520 (0.0009) [2023-10-14 03:22:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 119308288. Throughput: 0: 1754.9, 1: 1796.6. Samples: 29830332. 
Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:22:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 03:22:07,006][33201] Updated weights for policy 0, policy_version 57990 (0.0009) [2023-10-14 03:22:07,316][33226] Updated weights for policy 1, policy_version 58530 (0.0009) [2023-10-14 03:22:07,377][33201] Updated weights for policy 0, policy_version 58000 (0.0010) [2023-10-14 03:22:07,736][33226] Updated weights for policy 1, policy_version 58540 (0.0008) [2023-10-14 03:22:07,746][33201] Updated weights for policy 0, policy_version 58010 (0.0007) [2023-10-14 03:22:08,099][33226] Updated weights for policy 1, policy_version 58550 (0.0007) [2023-10-14 03:22:08,470][33226] Updated weights for policy 1, policy_version 58560 (0.0008) [2023-10-14 03:22:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 119373824. Throughput: 0: 1746.3, 1: 1767.6. Samples: 29850842. Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:22:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 03:22:11,717][33201] Updated weights for policy 0, policy_version 58020 (0.0008) [2023-10-14 03:22:12,111][33201] Updated weights for policy 0, policy_version 58030 (0.0008) [2023-10-14 03:22:12,218][33226] Updated weights for policy 1, policy_version 58570 (0.0010) [2023-10-14 03:22:12,487][33201] Updated weights for policy 0, policy_version 58040 (0.0007) [2023-10-14 03:22:12,596][33226] Updated weights for policy 1, policy_version 58580 (0.0009) [2023-10-14 03:22:12,965][33226] Updated weights for policy 1, policy_version 58590 (0.0007) [2023-10-14 03:22:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 119439360. Throughput: 0: 1761.6, 1: 1794.8. Samples: 29862406. 
Policy #0 lag: (min: 21.0, avg: 28.3, max: 53.0) [2023-10-14 03:22:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:16,144][33201] Updated weights for policy 0, policy_version 58050 (0.0007) [2023-10-14 03:22:16,516][33201] Updated weights for policy 0, policy_version 58060 (0.0008) [2023-10-14 03:22:16,883][33201] Updated weights for policy 0, policy_version 58070 (0.0008) [2023-10-14 03:22:16,919][33226] Updated weights for policy 1, policy_version 58600 (0.0009) [2023-10-14 03:22:17,257][33201] Updated weights for policy 0, policy_version 58080 (0.0008) [2023-10-14 03:22:17,286][33226] Updated weights for policy 1, policy_version 58610 (0.0009) [2023-10-14 03:22:17,652][33226] Updated weights for policy 1, policy_version 58620 (0.0008) [2023-10-14 03:22:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 119504896. Throughput: 0: 1748.7, 1: 1761.9. Samples: 29882344. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.820')] [2023-10-14 03:22:21,121][33201] Updated weights for policy 0, policy_version 58090 (0.0010) [2023-10-14 03:22:21,285][33226] Updated weights for policy 1, policy_version 58630 (0.0009) [2023-10-14 03:22:21,487][33201] Updated weights for policy 0, policy_version 58100 (0.0009) [2023-10-14 03:22:21,651][33226] Updated weights for policy 1, policy_version 58640 (0.0008) [2023-10-14 03:22:21,843][33201] Updated weights for policy 0, policy_version 58110 (0.0009) [2023-10-14 03:22:22,015][33226] Updated weights for policy 1, policy_version 58650 (0.0009) [2023-10-14 03:22:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 119570432. Throughput: 0: 1752.8, 1: 1761.6. Samples: 29904224. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.820')] [2023-10-14 03:22:25,801][33226] Updated weights for policy 1, policy_version 58660 (0.0007) [2023-10-14 03:22:25,815][33201] Updated weights for policy 0, policy_version 58120 (0.0009) [2023-10-14 03:22:26,176][33226] Updated weights for policy 1, policy_version 58670 (0.0008) [2023-10-14 03:22:26,184][33201] Updated weights for policy 0, policy_version 58130 (0.0009) [2023-10-14 03:22:26,538][33226] Updated weights for policy 1, policy_version 58680 (0.0007) [2023-10-14 03:22:26,553][33201] Updated weights for policy 0, policy_version 58140 (0.0009) [2023-10-14 03:22:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 119635968. Throughput: 0: 1745.4, 1: 1768.0. Samples: 29914020. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:29,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:30,171][33201] Updated weights for policy 0, policy_version 58150 (0.0007) [2023-10-14 03:22:30,366][33226] Updated weights for policy 1, policy_version 58690 (0.0008) [2023-10-14 03:22:30,542][33201] Updated weights for policy 0, policy_version 58160 (0.0007) [2023-10-14 03:22:30,737][33226] Updated weights for policy 1, policy_version 58700 (0.0009) [2023-10-14 03:22:30,923][33201] Updated weights for policy 0, policy_version 58170 (0.0007) [2023-10-14 03:22:31,108][33226] Updated weights for policy 1, policy_version 58710 (0.0008) [2023-10-14 03:22:31,477][33226] Updated weights for policy 1, policy_version 58720 (0.0009) [2023-10-14 03:22:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 119701504. Throughput: 0: 1751.6, 1: 1771.8. Samples: 29936410. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:34,725][33201] Updated weights for policy 0, policy_version 58180 (0.0009) [2023-10-14 03:22:35,099][33201] Updated weights for policy 0, policy_version 58190 (0.0008) [2023-10-14 03:22:35,311][33226] Updated weights for policy 1, policy_version 58730 (0.0008) [2023-10-14 03:22:35,466][33201] Updated weights for policy 0, policy_version 58200 (0.0009) [2023-10-14 03:22:35,682][33226] Updated weights for policy 1, policy_version 58740 (0.0008) [2023-10-14 03:22:36,043][33226] Updated weights for policy 1, policy_version 58750 (0.0009) [2023-10-14 03:22:39,286][33201] Updated weights for policy 0, policy_version 58210 (0.0007) [2023-10-14 03:22:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 119767040. Throughput: 0: 1782.4, 1: 1784.0. Samples: 29958358. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:39,657][33201] Updated weights for policy 0, policy_version 58220 (0.0008) [2023-10-14 03:22:39,773][33226] Updated weights for policy 1, policy_version 58760 (0.0009) [2023-10-14 03:22:40,028][33201] Updated weights for policy 0, policy_version 58230 (0.0009) [2023-10-14 03:22:40,136][33226] Updated weights for policy 1, policy_version 58770 (0.0007) [2023-10-14 03:22:40,388][33201] Updated weights for policy 0, policy_version 58240 (0.0009) [2023-10-14 03:22:40,500][33226] Updated weights for policy 1, policy_version 58780 (0.0009) [2023-10-14 03:22:44,265][33201] Updated weights for policy 0, policy_version 58250 (0.0008) [2023-10-14 03:22:44,326][33226] Updated weights for policy 1, policy_version 58790 (0.0008) [2023-10-14 03:22:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 119832576. Throughput: 0: 1748.4, 1: 1767.2. 
Samples: 29967884. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:44,639][33201] Updated weights for policy 0, policy_version 58260 (0.0009) [2023-10-14 03:22:44,692][33226] Updated weights for policy 1, policy_version 58800 (0.0008) [2023-10-14 03:22:45,006][33201] Updated weights for policy 0, policy_version 58270 (0.0009) [2023-10-14 03:22:45,062][33226] Updated weights for policy 1, policy_version 58810 (0.0008) [2023-10-14 03:22:48,844][33226] Updated weights for policy 1, policy_version 58820 (0.0008) [2023-10-14 03:22:48,874][33201] Updated weights for policy 0, policy_version 58280 (0.0008) [2023-10-14 03:22:49,219][33226] Updated weights for policy 1, policy_version 58830 (0.0009) [2023-10-14 03:22:49,239][33201] Updated weights for policy 0, policy_version 58290 (0.0007) [2023-10-14 03:22:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 119898112. Throughput: 0: 1771.6, 1: 1777.7. Samples: 29990052. 
Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:49,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.820')] [2023-10-14 03:22:49,581][33226] Updated weights for policy 1, policy_version 58840 (0.0007) [2023-10-14 03:22:49,607][33201] Updated weights for policy 0, policy_version 58300 (0.0007) [2023-10-14 03:22:53,530][33226] Updated weights for policy 1, policy_version 58850 (0.0008) [2023-10-14 03:22:53,624][33201] Updated weights for policy 0, policy_version 58310 (0.0007) [2023-10-14 03:22:53,930][33226] Updated weights for policy 1, policy_version 58860 (0.0009) [2023-10-14 03:22:53,992][33201] Updated weights for policy 0, policy_version 58320 (0.0007) [2023-10-14 03:22:54,292][33226] Updated weights for policy 1, policy_version 58870 (0.0007) [2023-10-14 03:22:54,363][33201] Updated weights for policy 0, policy_version 58330 (0.0007) [2023-10-14 03:22:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 14106.9). Total num frames: 119963648. Throughput: 0: 1759.8, 1: 1784.5. Samples: 30010336. Policy #0 lag: (min: 31.0, avg: 31.5, max: 45.0) [2023-10-14 03:22:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:22:54,665][33226] Updated weights for policy 1, policy_version 58880 (0.0008) [2023-10-14 03:22:58,183][33201] Updated weights for policy 0, policy_version 58340 (0.0008) [2023-10-14 03:22:58,471][33226] Updated weights for policy 1, policy_version 58890 (0.0007) [2023-10-14 03:22:58,551][33201] Updated weights for policy 0, policy_version 58350 (0.0008) [2023-10-14 03:22:58,845][33226] Updated weights for policy 1, policy_version 58900 (0.0007) [2023-10-14 03:22:58,915][33201] Updated weights for policy 0, policy_version 58360 (0.0007) [2023-10-14 03:22:59,214][33226] Updated weights for policy 1, policy_version 58910 (0.0008) [2023-10-14 03:22:59,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 120094720. Throughput: 0: 1766.0, 1: 1763.3. 
Samples: 30021224. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:22:59,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:23:02,765][33201] Updated weights for policy 0, policy_version 58370 (0.0007) [2023-10-14 03:23:03,038][33226] Updated weights for policy 1, policy_version 58920 (0.0008) [2023-10-14 03:23:03,127][33201] Updated weights for policy 0, policy_version 58380 (0.0007) [2023-10-14 03:23:03,408][33226] Updated weights for policy 1, policy_version 58930 (0.0007) [2023-10-14 03:23:03,500][33201] Updated weights for policy 0, policy_version 58390 (0.0008) [2023-10-14 03:23:03,770][33226] Updated weights for policy 1, policy_version 58940 (0.0008) [2023-10-14 03:23:03,861][33201] Updated weights for policy 0, policy_version 58400 (0.0007) [2023-10-14 03:23:04,557][31953] Fps is (10 sec: 19661.0, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 120160256. Throughput: 0: 1773.5, 1: 1791.7. Samples: 30042778. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:23:07,642][33226] Updated weights for policy 1, policy_version 58950 (0.0007) [2023-10-14 03:23:07,696][33201] Updated weights for policy 0, policy_version 58410 (0.0007) [2023-10-14 03:23:08,012][33226] Updated weights for policy 1, policy_version 58960 (0.0007) [2023-10-14 03:23:08,069][33201] Updated weights for policy 0, policy_version 58420 (0.0007) [2023-10-14 03:23:08,368][33226] Updated weights for policy 1, policy_version 58970 (0.0007) [2023-10-14 03:23:08,435][33201] Updated weights for policy 0, policy_version 58430 (0.0007) [2023-10-14 03:23:09,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 120225792. Throughput: 0: 1760.3, 1: 1769.7. Samples: 30063072. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:09,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:23:12,150][33201] Updated weights for policy 0, policy_version 58440 (0.0010) [2023-10-14 03:23:12,310][33226] Updated weights for policy 1, policy_version 58980 (0.0009) [2023-10-14 03:23:12,527][33201] Updated weights for policy 0, policy_version 58450 (0.0009) [2023-10-14 03:23:12,681][33226] Updated weights for policy 1, policy_version 58990 (0.0009) [2023-10-14 03:23:12,886][33201] Updated weights for policy 0, policy_version 58460 (0.0009) [2023-10-14 03:23:13,041][33226] Updated weights for policy 1, policy_version 59000 (0.0009) [2023-10-14 03:23:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 120291328. Throughput: 0: 1787.9, 1: 1793.7. Samples: 30075190. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:23:16,776][33201] Updated weights for policy 0, policy_version 58470 (0.0009) [2023-10-14 03:23:16,776][33226] Updated weights for policy 1, policy_version 59010 (0.0008) [2023-10-14 03:23:17,131][33226] Updated weights for policy 1, policy_version 59020 (0.0008) [2023-10-14 03:23:17,139][33201] Updated weights for policy 0, policy_version 58480 (0.0008) [2023-10-14 03:23:17,502][33226] Updated weights for policy 1, policy_version 59030 (0.0009) [2023-10-14 03:23:17,511][33201] Updated weights for policy 0, policy_version 58490 (0.0007) [2023-10-14 03:23:17,856][33226] Updated weights for policy 1, policy_version 59040 (0.0008) [2023-10-14 03:23:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 120356864. Throughput: 0: 1758.7, 1: 1756.5. Samples: 30094598. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:23:21,187][33201] Updated weights for policy 0, policy_version 58500 (0.0008) [2023-10-14 03:23:21,553][33201] Updated weights for policy 0, policy_version 58510 (0.0009) [2023-10-14 03:23:21,566][33226] Updated weights for policy 1, policy_version 59050 (0.0007) [2023-10-14 03:23:21,928][33226] Updated weights for policy 1, policy_version 59060 (0.0008) [2023-10-14 03:23:21,929][33201] Updated weights for policy 0, policy_version 58520 (0.0009) [2023-10-14 03:23:22,288][33226] Updated weights for policy 1, policy_version 59070 (0.0009) [2023-10-14 03:23:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 120422400. Throughput: 0: 1758.3, 1: 1765.9. Samples: 30116952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.890')] [2023-10-14 03:23:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000058528_59932672.pth... [2023-10-14 03:23:24,570][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000059072_60489728.pth... 
[2023-10-14 03:23:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000057408_58785792.pth [2023-10-14 03:23:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000056896_58261504.pth [2023-10-14 03:23:25,699][33201] Updated weights for policy 0, policy_version 58530 (0.0008) [2023-10-14 03:23:25,996][33226] Updated weights for policy 1, policy_version 59080 (0.0008) [2023-10-14 03:23:26,066][33201] Updated weights for policy 0, policy_version 58540 (0.0009) [2023-10-14 03:23:26,372][33226] Updated weights for policy 1, policy_version 59090 (0.0008) [2023-10-14 03:23:26,431][33201] Updated weights for policy 0, policy_version 58550 (0.0007) [2023-10-14 03:23:26,731][33226] Updated weights for policy 1, policy_version 59100 (0.0009) [2023-10-14 03:23:26,804][33201] Updated weights for policy 0, policy_version 58560 (0.0007) [2023-10-14 03:23:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 120487936. Throughput: 0: 1762.5, 1: 1765.2. Samples: 30126632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 03:23:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.890')] [2023-10-14 03:23:30,664][33226] Updated weights for policy 1, policy_version 59110 (0.0008) [2023-10-14 03:23:30,676][33201] Updated weights for policy 0, policy_version 58570 (0.0007) [2023-10-14 03:23:31,033][33226] Updated weights for policy 1, policy_version 59120 (0.0007) [2023-10-14 03:23:31,046][33201] Updated weights for policy 0, policy_version 58580 (0.0007) [2023-10-14 03:23:31,401][33226] Updated weights for policy 1, policy_version 59130 (0.0008) [2023-10-14 03:23:31,423][33201] Updated weights for policy 0, policy_version 58590 (0.0007) [2023-10-14 03:23:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 120553472. Throughput: 0: 1764.6, 1: 1764.4. Samples: 30148856. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.890')] [2023-10-14 03:23:35,146][33226] Updated weights for policy 1, policy_version 59140 (0.0009) [2023-10-14 03:23:35,228][33201] Updated weights for policy 0, policy_version 58600 (0.0008) [2023-10-14 03:23:35,508][33226] Updated weights for policy 1, policy_version 59150 (0.0008) [2023-10-14 03:23:35,603][33201] Updated weights for policy 0, policy_version 58610 (0.0007) [2023-10-14 03:23:35,860][33226] Updated weights for policy 1, policy_version 59160 (0.0010) [2023-10-14 03:23:35,970][33201] Updated weights for policy 0, policy_version 58620 (0.0009) [2023-10-14 03:23:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 120619008. Throughput: 0: 1779.0, 1: 1778.7. Samples: 30170432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 03:23:39,788][33226] Updated weights for policy 1, policy_version 59170 (0.0010) [2023-10-14 03:23:39,827][33201] Updated weights for policy 0, policy_version 58630 (0.0009) [2023-10-14 03:23:40,188][33226] Updated weights for policy 1, policy_version 59180 (0.0009) [2023-10-14 03:23:40,208][33201] Updated weights for policy 0, policy_version 58640 (0.0007) [2023-10-14 03:23:40,554][33226] Updated weights for policy 1, policy_version 59190 (0.0010) [2023-10-14 03:23:40,574][33201] Updated weights for policy 0, policy_version 58650 (0.0007) [2023-10-14 03:23:40,918][33226] Updated weights for policy 1, policy_version 59200 (0.0010) [2023-10-14 03:23:44,389][33201] Updated weights for policy 0, policy_version 58660 (0.0008) [2023-10-14 03:23:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 120684544. Throughput: 0: 1758.9, 1: 1765.2. Samples: 30179808. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 03:23:44,728][33226] Updated weights for policy 1, policy_version 59210 (0.0009) [2023-10-14 03:23:44,775][33201] Updated weights for policy 0, policy_version 58670 (0.0008) [2023-10-14 03:23:45,090][33226] Updated weights for policy 1, policy_version 59220 (0.0008) [2023-10-14 03:23:45,149][33201] Updated weights for policy 0, policy_version 58680 (0.0009) [2023-10-14 03:23:45,454][33226] Updated weights for policy 1, policy_version 59230 (0.0009) [2023-10-14 03:23:48,947][33201] Updated weights for policy 0, policy_version 58690 (0.0007) [2023-10-14 03:23:49,219][33226] Updated weights for policy 1, policy_version 59240 (0.0008) [2023-10-14 03:23:49,316][33201] Updated weights for policy 0, policy_version 58700 (0.0007) [2023-10-14 03:23:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 120750080. Throughput: 0: 1771.9, 1: 1765.7. Samples: 30201966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.940')] [2023-10-14 03:23:49,577][33226] Updated weights for policy 1, policy_version 59250 (0.0007) [2023-10-14 03:23:49,685][33201] Updated weights for policy 0, policy_version 58710 (0.0008) [2023-10-14 03:23:49,942][33226] Updated weights for policy 1, policy_version 59260 (0.0009) [2023-10-14 03:23:50,050][33201] Updated weights for policy 0, policy_version 58720 (0.0007) [2023-10-14 03:23:53,662][33226] Updated weights for policy 1, policy_version 59270 (0.0009) [2023-10-14 03:23:53,730][33201] Updated weights for policy 0, policy_version 58730 (0.0007) [2023-10-14 03:23:54,031][33226] Updated weights for policy 1, policy_version 59280 (0.0007) [2023-10-14 03:23:54,095][33201] Updated weights for policy 0, policy_version 58740 (0.0009) [2023-10-14 03:23:54,396][33226] Updated weights for policy 1, policy_version 59290 (0.0009) [2023-10-14 03:23:54,474][33201] Updated weights for policy 0, policy_version 58750 (0.0009) [2023-10-14 03:23:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 120848384. Throughput: 0: 1772.4, 1: 1777.0. Samples: 30222796. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 03:23:58,368][33226] Updated weights for policy 1, policy_version 59300 (0.0009) [2023-10-14 03:23:58,425][33201] Updated weights for policy 0, policy_version 58760 (0.0007) [2023-10-14 03:23:58,739][33226] Updated weights for policy 1, policy_version 59310 (0.0008) [2023-10-14 03:23:58,798][33201] Updated weights for policy 0, policy_version 58770 (0.0008) [2023-10-14 03:23:59,103][33226] Updated weights for policy 1, policy_version 59320 (0.0009) [2023-10-14 03:23:59,168][33201] Updated weights for policy 0, policy_version 58780 (0.0008) [2023-10-14 03:23:59,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 120946688. Throughput: 0: 1760.4, 1: 1762.6. Samples: 30233726. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:23:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:03,174][33201] Updated weights for policy 0, policy_version 58790 (0.0007) [2023-10-14 03:24:03,175][33226] Updated weights for policy 1, policy_version 59330 (0.0008) [2023-10-14 03:24:03,540][33226] Updated weights for policy 1, policy_version 59340 (0.0009) [2023-10-14 03:24:03,541][33201] Updated weights for policy 0, policy_version 58800 (0.0007) [2023-10-14 03:24:03,903][33226] Updated weights for policy 1, policy_version 59350 (0.0009) [2023-10-14 03:24:03,911][33201] Updated weights for policy 0, policy_version 58810 (0.0007) [2023-10-14 03:24:04,273][33226] Updated weights for policy 1, policy_version 59360 (0.0009) [2023-10-14 03:24:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 121012224. Throughput: 0: 1782.8, 1: 1790.0. Samples: 30255372. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:24:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:07,819][33201] Updated weights for policy 0, policy_version 58820 (0.0008) [2023-10-14 03:24:08,025][33226] Updated weights for policy 1, policy_version 59370 (0.0007) [2023-10-14 03:24:08,189][33201] Updated weights for policy 0, policy_version 58830 (0.0009) [2023-10-14 03:24:08,406][33226] Updated weights for policy 1, policy_version 59380 (0.0008) [2023-10-14 03:24:08,567][33201] Updated weights for policy 0, policy_version 58840 (0.0008) [2023-10-14 03:24:08,769][33226] Updated weights for policy 1, policy_version 59390 (0.0009) [2023-10-14 03:24:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121077760. Throughput: 0: 1753.7, 1: 1751.9. Samples: 30274704. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:12,380][33201] Updated weights for policy 0, policy_version 58850 (0.0008) [2023-10-14 03:24:12,411][33226] Updated weights for policy 1, policy_version 59400 (0.0008) [2023-10-14 03:24:12,751][33201] Updated weights for policy 0, policy_version 58860 (0.0008) [2023-10-14 03:24:12,767][33226] Updated weights for policy 1, policy_version 59410 (0.0008) [2023-10-14 03:24:13,118][33201] Updated weights for policy 0, policy_version 58870 (0.0008) [2023-10-14 03:24:13,129][33226] Updated weights for policy 1, policy_version 59420 (0.0009) [2023-10-14 03:24:13,483][33201] Updated weights for policy 0, policy_version 58880 (0.0008) [2023-10-14 03:24:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121143296. Throughput: 0: 1782.6, 1: 1786.4. Samples: 30287238. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:17,090][33226] Updated weights for policy 1, policy_version 59430 (0.0009) [2023-10-14 03:24:17,449][33226] Updated weights for policy 1, policy_version 59440 (0.0007) [2023-10-14 03:24:17,468][33201] Updated weights for policy 0, policy_version 58890 (0.0009) [2023-10-14 03:24:17,820][33226] Updated weights for policy 1, policy_version 59450 (0.0009) [2023-10-14 03:24:17,833][33201] Updated weights for policy 0, policy_version 58900 (0.0009) [2023-10-14 03:24:18,209][33201] Updated weights for policy 0, policy_version 58910 (0.0010) [2023-10-14 03:24:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121208832. Throughput: 0: 1747.0, 1: 1752.9. Samples: 30306352. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:19,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:21,452][33226] Updated weights for policy 1, policy_version 59460 (0.0008) [2023-10-14 03:24:21,829][33226] Updated weights for policy 1, policy_version 59470 (0.0008) [2023-10-14 03:24:22,181][33201] Updated weights for policy 0, policy_version 58920 (0.0007) [2023-10-14 03:24:22,202][33226] Updated weights for policy 1, policy_version 59480 (0.0007) [2023-10-14 03:24:22,550][33201] Updated weights for policy 0, policy_version 58930 (0.0008) [2023-10-14 03:24:22,927][33201] Updated weights for policy 0, policy_version 58940 (0.0008) [2023-10-14 03:24:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121274368. Throughput: 0: 1740.8, 1: 1764.0. Samples: 30328148. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:24:26,004][33226] Updated weights for policy 1, policy_version 59490 (0.0007) [2023-10-14 03:24:26,409][33226] Updated weights for policy 1, policy_version 59500 (0.0008) [2023-10-14 03:24:26,778][33226] Updated weights for policy 1, policy_version 59510 (0.0008) [2023-10-14 03:24:26,874][33201] Updated weights for policy 0, policy_version 58950 (0.0010) [2023-10-14 03:24:27,142][33226] Updated weights for policy 1, policy_version 59520 (0.0009) [2023-10-14 03:24:27,250][33201] Updated weights for policy 0, policy_version 58960 (0.0007) [2023-10-14 03:24:27,623][33201] Updated weights for policy 0, policy_version 58970 (0.0007) [2023-10-14 03:24:29,557][31953] Fps is (10 sec: 13106.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 121339904. Throughput: 0: 1759.9, 1: 1769.8. Samples: 30338648. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 03:24:30,916][33226] Updated weights for policy 1, policy_version 59530 (0.0008) [2023-10-14 03:24:31,273][33226] Updated weights for policy 1, policy_version 59540 (0.0008) [2023-10-14 03:24:31,303][33201] Updated weights for policy 0, policy_version 58980 (0.0009) [2023-10-14 03:24:31,640][33226] Updated weights for policy 1, policy_version 59550 (0.0009) [2023-10-14 03:24:31,675][33201] Updated weights for policy 0, policy_version 58990 (0.0008) [2023-10-14 03:24:32,047][33201] Updated weights for policy 0, policy_version 59000 (0.0009) [2023-10-14 03:24:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 121405440. Throughput: 0: 1738.9, 1: 1765.7. Samples: 30359674. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 03:24:35,530][33226] Updated weights for policy 1, policy_version 59560 (0.0008) [2023-10-14 03:24:35,899][33226] Updated weights for policy 1, policy_version 59570 (0.0007) [2023-10-14 03:24:35,989][33201] Updated weights for policy 0, policy_version 59010 (0.0008) [2023-10-14 03:24:36,267][33226] Updated weights for policy 1, policy_version 59580 (0.0007) [2023-10-14 03:24:36,395][33201] Updated weights for policy 0, policy_version 59020 (0.0009) [2023-10-14 03:24:36,772][33201] Updated weights for policy 0, policy_version 59030 (0.0008) [2023-10-14 03:24:37,138][33201] Updated weights for policy 0, policy_version 59040 (0.0008) [2023-10-14 03:24:39,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 121470976. Throughput: 0: 1748.3, 1: 1783.4. Samples: 30381722. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:24:39,942][33226] Updated weights for policy 1, policy_version 59590 (0.0009) [2023-10-14 03:24:40,315][33226] Updated weights for policy 1, policy_version 59600 (0.0008) [2023-10-14 03:24:40,676][33226] Updated weights for policy 1, policy_version 59610 (0.0010) [2023-10-14 03:24:40,864][33201] Updated weights for policy 0, policy_version 59050 (0.0008) [2023-10-14 03:24:41,237][33201] Updated weights for policy 0, policy_version 59060 (0.0009) [2023-10-14 03:24:41,602][33201] Updated weights for policy 0, policy_version 59070 (0.0009) [2023-10-14 03:24:44,349][33226] Updated weights for policy 1, policy_version 59620 (0.0007) [2023-10-14 03:24:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 121536512. Throughput: 0: 1732.9, 1: 1772.1. Samples: 30391452. 
Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:24:44,712][33226] Updated weights for policy 1, policy_version 59630 (0.0007) [2023-10-14 03:24:45,072][33226] Updated weights for policy 1, policy_version 59640 (0.0007) [2023-10-14 03:24:45,349][33201] Updated weights for policy 0, policy_version 59080 (0.0009) [2023-10-14 03:24:45,714][33201] Updated weights for policy 0, policy_version 59090 (0.0010) [2023-10-14 03:24:46,084][33201] Updated weights for policy 0, policy_version 59100 (0.0011) [2023-10-14 03:24:48,851][33226] Updated weights for policy 1, policy_version 59650 (0.0008) [2023-10-14 03:24:49,231][33226] Updated weights for policy 1, policy_version 59660 (0.0007) [2023-10-14 03:24:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 121602048. Throughput: 0: 1738.7, 1: 1776.0. Samples: 30413530. Policy #0 lag: (min: 31.0, avg: 31.6, max: 47.0) [2023-10-14 03:24:49,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:24:49,595][33226] Updated weights for policy 1, policy_version 59670 (0.0009) [2023-10-14 03:24:49,952][33201] Updated weights for policy 0, policy_version 59110 (0.0008) [2023-10-14 03:24:49,957][33226] Updated weights for policy 1, policy_version 59680 (0.0010) [2023-10-14 03:24:50,316][33201] Updated weights for policy 0, policy_version 59120 (0.0010) [2023-10-14 03:24:50,681][33201] Updated weights for policy 0, policy_version 59130 (0.0007) [2023-10-14 03:24:53,843][33226] Updated weights for policy 1, policy_version 59690 (0.0010) [2023-10-14 03:24:54,209][33226] Updated weights for policy 1, policy_version 59700 (0.0010) [2023-10-14 03:24:54,438][33201] Updated weights for policy 0, policy_version 59140 (0.0007) [2023-10-14 03:24:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 121667584. Throughput: 0: 1772.1, 1: 1793.6. 
Samples: 30435158. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:24:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:24:54,578][33226] Updated weights for policy 1, policy_version 59710 (0.0007) [2023-10-14 03:24:54,807][33201] Updated weights for policy 0, policy_version 59150 (0.0008) [2023-10-14 03:24:55,176][33201] Updated weights for policy 0, policy_version 59160 (0.0007) [2023-10-14 03:24:58,392][33226] Updated weights for policy 1, policy_version 59720 (0.0009) [2023-10-14 03:24:58,768][33226] Updated weights for policy 1, policy_version 59730 (0.0009) [2023-10-14 03:24:58,983][33201] Updated weights for policy 0, policy_version 59170 (0.0009) [2023-10-14 03:24:59,138][33226] Updated weights for policy 1, policy_version 59740 (0.0009) [2023-10-14 03:24:59,350][33201] Updated weights for policy 0, policy_version 59180 (0.0007) [2023-10-14 03:24:59,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 121765888. Throughput: 0: 1742.8, 1: 1771.0. Samples: 30445362. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:24:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:24:59,720][33201] Updated weights for policy 0, policy_version 59190 (0.0009) [2023-10-14 03:25:00,095][33201] Updated weights for policy 0, policy_version 59200 (0.0011) [2023-10-14 03:25:02,908][33226] Updated weights for policy 1, policy_version 59750 (0.0008) [2023-10-14 03:25:03,265][33226] Updated weights for policy 1, policy_version 59760 (0.0010) [2023-10-14 03:25:03,631][33226] Updated weights for policy 1, policy_version 59770 (0.0007) [2023-10-14 03:25:04,087][33201] Updated weights for policy 0, policy_version 59210 (0.0007) [2023-10-14 03:25:04,457][33201] Updated weights for policy 0, policy_version 59220 (0.0007) [2023-10-14 03:25:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 121831424. 
Throughput: 0: 1768.7, 1: 1796.3. Samples: 30466780. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 03:25:04,834][33201] Updated weights for policy 0, policy_version 59230 (0.0008) [2023-10-14 03:25:07,381][33226] Updated weights for policy 1, policy_version 59780 (0.0008) [2023-10-14 03:25:07,755][33226] Updated weights for policy 1, policy_version 59790 (0.0007) [2023-10-14 03:25:08,119][33226] Updated weights for policy 1, policy_version 59800 (0.0009) [2023-10-14 03:25:08,623][33201] Updated weights for policy 0, policy_version 59240 (0.0008) [2023-10-14 03:25:08,988][33201] Updated weights for policy 0, policy_version 59250 (0.0008) [2023-10-14 03:25:09,371][33201] Updated weights for policy 0, policy_version 59260 (0.0010) [2023-10-14 03:25:09,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121929728. Throughput: 0: 1763.8, 1: 1772.7. Samples: 30487290. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 03:25:11,926][33226] Updated weights for policy 1, policy_version 59810 (0.0009) [2023-10-14 03:25:12,351][33226] Updated weights for policy 1, policy_version 59820 (0.0009) [2023-10-14 03:25:12,724][33226] Updated weights for policy 1, policy_version 59830 (0.0009) [2023-10-14 03:25:13,086][33226] Updated weights for policy 1, policy_version 59840 (0.0008) [2023-10-14 03:25:13,233][33201] Updated weights for policy 0, policy_version 59270 (0.0008) [2023-10-14 03:25:13,609][33201] Updated weights for policy 0, policy_version 59280 (0.0010) [2023-10-14 03:25:13,976][33201] Updated weights for policy 0, policy_version 59290 (0.0009) [2023-10-14 03:25:14,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 121995264. Throughput: 0: 1764.4, 1: 1801.0. Samples: 30499092. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 03:25:16,740][33226] Updated weights for policy 1, policy_version 59850 (0.0010) [2023-10-14 03:25:17,110][33226] Updated weights for policy 1, policy_version 59860 (0.0009) [2023-10-14 03:25:17,481][33226] Updated weights for policy 1, policy_version 59870 (0.0010) [2023-10-14 03:25:17,788][33201] Updated weights for policy 0, policy_version 59300 (0.0009) [2023-10-14 03:25:18,156][33201] Updated weights for policy 0, policy_version 59310 (0.0008) [2023-10-14 03:25:18,535][33201] Updated weights for policy 0, policy_version 59320 (0.0008) [2023-10-14 03:25:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 122060800. Throughput: 0: 1766.4, 1: 1777.3. Samples: 30519142. Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:19,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.960')] [2023-10-14 03:25:21,352][33226] Updated weights for policy 1, policy_version 59880 (0.0010) [2023-10-14 03:25:21,722][33226] Updated weights for policy 1, policy_version 59890 (0.0010) [2023-10-14 03:25:22,076][33226] Updated weights for policy 1, policy_version 59900 (0.0008) [2023-10-14 03:25:22,397][33201] Updated weights for policy 0, policy_version 59330 (0.0009) [2023-10-14 03:25:22,796][33201] Updated weights for policy 0, policy_version 59340 (0.0008) [2023-10-14 03:25:23,168][33201] Updated weights for policy 0, policy_version 59350 (0.0007) [2023-10-14 03:25:23,538][33201] Updated weights for policy 0, policy_version 59360 (0.0008) [2023-10-14 03:25:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 122126336. Throughput: 0: 1749.6, 1: 1771.9. Samples: 30540188. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:24,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 03:25:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000059904_61341696.pth... [2023-10-14 03:25:24,565][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000059360_60784640.pth... [2023-10-14 03:25:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000058240_59637760.pth [2023-10-14 03:25:24,611][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000057696_59080704.pth [2023-10-14 03:25:25,838][33226] Updated weights for policy 1, policy_version 59910 (0.0007) [2023-10-14 03:25:26,209][33226] Updated weights for policy 1, policy_version 59920 (0.0008) [2023-10-14 03:25:26,581][33226] Updated weights for policy 1, policy_version 59930 (0.0009) [2023-10-14 03:25:27,341][33201] Updated weights for policy 0, policy_version 59370 (0.0008) [2023-10-14 03:25:27,701][33201] Updated weights for policy 0, policy_version 59380 (0.0010) [2023-10-14 03:25:28,081][33201] Updated weights for policy 0, policy_version 59390 (0.0008) [2023-10-14 03:25:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 122191872. Throughput: 0: 1782.1, 1: 1768.0. Samples: 30551208. 
Policy #0 lag: (min: 31.0, avg: 34.5, max: 63.0) [2023-10-14 03:25:29,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.950')] [2023-10-14 03:25:30,402][33226] Updated weights for policy 1, policy_version 59940 (0.0008) [2023-10-14 03:25:30,766][33226] Updated weights for policy 1, policy_version 59950 (0.0008) [2023-10-14 03:25:31,137][33226] Updated weights for policy 1, policy_version 59960 (0.0008) [2023-10-14 03:25:31,887][33201] Updated weights for policy 0, policy_version 59400 (0.0008) [2023-10-14 03:25:32,262][33201] Updated weights for policy 0, policy_version 59410 (0.0008) [2023-10-14 03:25:32,631][33201] Updated weights for policy 0, policy_version 59420 (0.0009) [2023-10-14 03:25:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 122257408. Throughput: 0: 1752.1, 1: 1775.5. Samples: 30572270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:34,852][33226] Updated weights for policy 1, policy_version 59970 (0.0010) [2023-10-14 03:25:35,226][33226] Updated weights for policy 1, policy_version 59980 (0.0008) [2023-10-14 03:25:35,586][33226] Updated weights for policy 1, policy_version 59990 (0.0009) [2023-10-14 03:25:35,952][33226] Updated weights for policy 1, policy_version 60000 (0.0008) [2023-10-14 03:25:36,407][33201] Updated weights for policy 0, policy_version 59430 (0.0008) [2023-10-14 03:25:36,780][33201] Updated weights for policy 0, policy_version 59440 (0.0008) [2023-10-14 03:25:37,155][33201] Updated weights for policy 0, policy_version 59450 (0.0007) [2023-10-14 03:25:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 122322944. Throughput: 0: 1747.6, 1: 1783.6. Samples: 30594064. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:39,949][33226] Updated weights for policy 1, policy_version 60010 (0.0010) [2023-10-14 03:25:40,303][33226] Updated weights for policy 1, policy_version 60020 (0.0009) [2023-10-14 03:25:40,676][33226] Updated weights for policy 1, policy_version 60030 (0.0007) [2023-10-14 03:25:40,958][33201] Updated weights for policy 0, policy_version 59460 (0.0008) [2023-10-14 03:25:41,328][33201] Updated weights for policy 0, policy_version 59470 (0.0008) [2023-10-14 03:25:41,697][33201] Updated weights for policy 0, policy_version 59480 (0.0007) [2023-10-14 03:25:44,415][33226] Updated weights for policy 1, policy_version 60040 (0.0008) [2023-10-14 03:25:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 122388480. Throughput: 0: 1747.1, 1: 1767.7. Samples: 30603528. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:44,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:44,782][33226] Updated weights for policy 1, policy_version 60050 (0.0010) [2023-10-14 03:25:45,149][33226] Updated weights for policy 1, policy_version 60060 (0.0007) [2023-10-14 03:25:45,640][33201] Updated weights for policy 0, policy_version 59490 (0.0008) [2023-10-14 03:25:46,010][33201] Updated weights for policy 0, policy_version 59500 (0.0009) [2023-10-14 03:25:46,388][33201] Updated weights for policy 0, policy_version 59510 (0.0007) [2023-10-14 03:25:46,753][33201] Updated weights for policy 0, policy_version 59520 (0.0008) [2023-10-14 03:25:48,960][33226] Updated weights for policy 1, policy_version 60070 (0.0009) [2023-10-14 03:25:49,317][33226] Updated weights for policy 1, policy_version 60080 (0.0008) [2023-10-14 03:25:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 122454016. Throughput: 0: 1752.1, 1: 1784.5. 
Samples: 30625930. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:49,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:49,689][33226] Updated weights for policy 1, policy_version 60090 (0.0008) [2023-10-14 03:25:50,746][33201] Updated weights for policy 0, policy_version 59530 (0.0008) [2023-10-14 03:25:51,124][33201] Updated weights for policy 0, policy_version 59540 (0.0008) [2023-10-14 03:25:51,492][33201] Updated weights for policy 0, policy_version 59550 (0.0007) [2023-10-14 03:25:53,497][33226] Updated weights for policy 1, policy_version 60100 (0.0007) [2023-10-14 03:25:53,869][33226] Updated weights for policy 1, policy_version 60110 (0.0007) [2023-10-14 03:25:54,240][33226] Updated weights for policy 1, policy_version 60120 (0.0009) [2023-10-14 03:25:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 122552320. Throughput: 0: 1774.8, 1: 1786.4. Samples: 30647544. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:54,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:55,079][33201] Updated weights for policy 0, policy_version 59560 (0.0008) [2023-10-14 03:25:55,454][33201] Updated weights for policy 0, policy_version 59570 (0.0010) [2023-10-14 03:25:55,823][33201] Updated weights for policy 0, policy_version 59580 (0.0011) [2023-10-14 03:25:58,068][33226] Updated weights for policy 1, policy_version 60130 (0.0007) [2023-10-14 03:25:58,481][33226] Updated weights for policy 1, policy_version 60140 (0.0008) [2023-10-14 03:25:58,844][33226] Updated weights for policy 1, policy_version 60150 (0.0009) [2023-10-14 03:25:59,218][33226] Updated weights for policy 1, policy_version 60160 (0.0007) [2023-10-14 03:25:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 122617856. Throughput: 0: 1755.7, 1: 1773.7. Samples: 30657916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:25:59,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:25:59,724][33201] Updated weights for policy 0, policy_version 59590 (0.0009) [2023-10-14 03:26:00,096][33201] Updated weights for policy 0, policy_version 59600 (0.0007) [2023-10-14 03:26:00,461][33201] Updated weights for policy 0, policy_version 59610 (0.0007) [2023-10-14 03:26:03,009][33226] Updated weights for policy 1, policy_version 60170 (0.0009) [2023-10-14 03:26:03,382][33226] Updated weights for policy 1, policy_version 60180 (0.0008) [2023-10-14 03:26:03,755][33226] Updated weights for policy 1, policy_version 60190 (0.0007) [2023-10-14 03:26:04,222][33201] Updated weights for policy 0, policy_version 59620 (0.0008) [2023-10-14 03:26:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 122683392. Throughput: 0: 1768.6, 1: 1794.4. Samples: 30679474. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:04,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:26:04,597][33201] Updated weights for policy 0, policy_version 59630 (0.0008) [2023-10-14 03:26:04,961][33201] Updated weights for policy 0, policy_version 59640 (0.0010) [2023-10-14 03:26:07,469][33226] Updated weights for policy 1, policy_version 60200 (0.0007) [2023-10-14 03:26:07,845][33226] Updated weights for policy 1, policy_version 60210 (0.0007) [2023-10-14 03:26:08,209][33226] Updated weights for policy 1, policy_version 60220 (0.0008) [2023-10-14 03:26:08,912][33201] Updated weights for policy 0, policy_version 59650 (0.0008) [2023-10-14 03:26:09,325][33201] Updated weights for policy 0, policy_version 59660 (0.0008) [2023-10-14 03:26:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 122748928. Throughput: 0: 1783.1, 1: 1774.8. Samples: 30700296. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:09,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.960')] [2023-10-14 03:26:09,684][33201] Updated weights for policy 0, policy_version 59670 (0.0010) [2023-10-14 03:26:10,068][33201] Updated weights for policy 0, policy_version 59680 (0.0010) [2023-10-14 03:26:11,884][33226] Updated weights for policy 1, policy_version 60230 (0.0008) [2023-10-14 03:26:12,245][33226] Updated weights for policy 1, policy_version 60240 (0.0007) [2023-10-14 03:26:12,612][33226] Updated weights for policy 1, policy_version 60250 (0.0007) [2023-10-14 03:26:13,732][33201] Updated weights for policy 0, policy_version 59690 (0.0010) [2023-10-14 03:26:14,092][33201] Updated weights for policy 0, policy_version 59700 (0.0008) [2023-10-14 03:26:14,472][33201] Updated weights for policy 0, policy_version 59710 (0.0007) [2023-10-14 03:26:14,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 122847232. Throughput: 0: 1757.2, 1: 1803.9. Samples: 30711460. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 03:26:16,348][33226] Updated weights for policy 1, policy_version 60260 (0.0008) [2023-10-14 03:26:16,712][33226] Updated weights for policy 1, policy_version 60270 (0.0010) [2023-10-14 03:26:17,082][33226] Updated weights for policy 1, policy_version 60280 (0.0011) [2023-10-14 03:26:18,404][33201] Updated weights for policy 0, policy_version 59720 (0.0009) [2023-10-14 03:26:18,785][33201] Updated weights for policy 0, policy_version 59730 (0.0008) [2023-10-14 03:26:19,155][33201] Updated weights for policy 0, policy_version 59740 (0.0008) [2023-10-14 03:26:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 122912768. Throughput: 0: 1784.4, 1: 1774.8. Samples: 30732436. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:19,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 03:26:20,898][33226] Updated weights for policy 1, policy_version 60290 (0.0010) [2023-10-14 03:26:21,267][33226] Updated weights for policy 1, policy_version 60300 (0.0007) [2023-10-14 03:26:21,629][33226] Updated weights for policy 1, policy_version 60310 (0.0007) [2023-10-14 03:26:22,002][33226] Updated weights for policy 1, policy_version 60320 (0.0007) [2023-10-14 03:26:22,954][33201] Updated weights for policy 0, policy_version 59750 (0.0008) [2023-10-14 03:26:23,329][33201] Updated weights for policy 0, policy_version 59760 (0.0008) [2023-10-14 03:26:23,707][33201] Updated weights for policy 0, policy_version 59770 (0.0008) [2023-10-14 03:26:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 122978304. Throughput: 0: 1753.3, 1: 1780.7. Samples: 30753094. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:24,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.990')] [2023-10-14 03:26:25,877][33226] Updated weights for policy 1, policy_version 60330 (0.0008) [2023-10-14 03:26:26,250][33226] Updated weights for policy 1, policy_version 60340 (0.0009) [2023-10-14 03:26:26,613][33226] Updated weights for policy 1, policy_version 60350 (0.0010) [2023-10-14 03:26:27,534][33201] Updated weights for policy 0, policy_version 59780 (0.0009) [2023-10-14 03:26:27,914][33201] Updated weights for policy 0, policy_version 59790 (0.0010) [2023-10-14 03:26:28,286][33201] Updated weights for policy 0, policy_version 59800 (0.0010) [2023-10-14 03:26:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 123043840. Throughput: 0: 1788.7, 1: 1782.9. Samples: 30764250. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:29,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.990')] [2023-10-14 03:26:30,448][33226] Updated weights for policy 1, policy_version 60360 (0.0009) [2023-10-14 03:26:30,810][33226] Updated weights for policy 1, policy_version 60370 (0.0009) [2023-10-14 03:26:31,174][33226] Updated weights for policy 1, policy_version 60380 (0.0008) [2023-10-14 03:26:32,193][33201] Updated weights for policy 0, policy_version 59810 (0.0010) [2023-10-14 03:26:32,562][33201] Updated weights for policy 0, policy_version 59820 (0.0007) [2023-10-14 03:26:32,926][33201] Updated weights for policy 0, policy_version 59830 (0.0007) [2023-10-14 03:26:33,301][33201] Updated weights for policy 0, policy_version 59840 (0.0007) [2023-10-14 03:26:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 123109376. Throughput: 0: 1762.9, 1: 1773.6. Samples: 30785072. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:34,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.980')] [2023-10-14 03:26:34,939][33226] Updated weights for policy 1, policy_version 60390 (0.0009) [2023-10-14 03:26:35,304][33226] Updated weights for policy 1, policy_version 60400 (0.0008) [2023-10-14 03:26:35,675][33226] Updated weights for policy 1, policy_version 60410 (0.0008) [2023-10-14 03:26:37,062][33201] Updated weights for policy 0, policy_version 59850 (0.0010) [2023-10-14 03:26:37,433][33201] Updated weights for policy 0, policy_version 59860 (0.0010) [2023-10-14 03:26:37,809][33201] Updated weights for policy 0, policy_version 59870 (0.0007) [2023-10-14 03:26:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 123174912. Throughput: 0: 1751.6, 1: 1788.8. Samples: 30806858. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:39,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:26:39,578][33226] Updated weights for policy 1, policy_version 60420 (0.0007) [2023-10-14 03:26:39,942][33226] Updated weights for policy 1, policy_version 60430 (0.0007) [2023-10-14 03:26:40,311][33226] Updated weights for policy 1, policy_version 60440 (0.0008) [2023-10-14 03:26:41,559][33201] Updated weights for policy 0, policy_version 59880 (0.0009) [2023-10-14 03:26:41,933][33201] Updated weights for policy 0, policy_version 59890 (0.0008) [2023-10-14 03:26:42,299][33201] Updated weights for policy 0, policy_version 59900 (0.0007) [2023-10-14 03:26:44,273][33226] Updated weights for policy 1, policy_version 60450 (0.0010) [2023-10-14 03:26:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 123240448. Throughput: 0: 1763.1, 1: 1770.5. Samples: 30816928. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:44,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.970')] [2023-10-14 03:26:44,680][33226] Updated weights for policy 1, policy_version 60460 (0.0008) [2023-10-14 03:26:45,047][33226] Updated weights for policy 1, policy_version 60470 (0.0010) [2023-10-14 03:26:45,416][33226] Updated weights for policy 1, policy_version 60480 (0.0012) [2023-10-14 03:26:46,213][33201] Updated weights for policy 0, policy_version 59910 (0.0008) [2023-10-14 03:26:46,587][33201] Updated weights for policy 0, policy_version 59920 (0.0009) [2023-10-14 03:26:46,960][33201] Updated weights for policy 0, policy_version 59930 (0.0008) [2023-10-14 03:26:49,265][33226] Updated weights for policy 1, policy_version 60490 (0.0007) [2023-10-14 03:26:49,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 123305984. Throughput: 0: 1749.5, 1: 1774.5. Samples: 30838054. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:49,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.970')] [2023-10-14 03:26:49,625][33226] Updated weights for policy 1, policy_version 60500 (0.0008) [2023-10-14 03:26:49,996][33226] Updated weights for policy 1, policy_version 60510 (0.0008) [2023-10-14 03:26:50,749][33201] Updated weights for policy 0, policy_version 59940 (0.0007) [2023-10-14 03:26:51,115][33201] Updated weights for policy 0, policy_version 59950 (0.0007) [2023-10-14 03:26:51,494][33201] Updated weights for policy 0, policy_version 59960 (0.0008) [2023-10-14 03:26:53,644][33226] Updated weights for policy 1, policy_version 60520 (0.0007) [2023-10-14 03:26:54,018][33226] Updated weights for policy 1, policy_version 60530 (0.0008) [2023-10-14 03:26:54,381][33226] Updated weights for policy 1, policy_version 60540 (0.0007) [2023-10-14 03:26:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 123404288. Throughput: 0: 1760.0, 1: 1781.2. Samples: 30859648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:26:54,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.970')] [2023-10-14 03:26:55,478][33201] Updated weights for policy 0, policy_version 59970 (0.0009) [2023-10-14 03:26:55,880][33201] Updated weights for policy 0, policy_version 59980 (0.0008) [2023-10-14 03:26:56,250][33201] Updated weights for policy 0, policy_version 59990 (0.0009) [2023-10-14 03:26:56,624][33201] Updated weights for policy 0, policy_version 60000 (0.0011) [2023-10-14 03:26:58,230][33226] Updated weights for policy 1, policy_version 60550 (0.0009) [2023-10-14 03:26:58,604][33226] Updated weights for policy 1, policy_version 60560 (0.0007) [2023-10-14 03:26:58,976][33226] Updated weights for policy 1, policy_version 60570 (0.0008) [2023-10-14 03:26:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 123469824. Throughput: 0: 1746.2, 1: 1768.2. 
Samples: 30869608. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:26:59,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.970')] [2023-10-14 03:27:00,250][33201] Updated weights for policy 0, policy_version 60010 (0.0008) [2023-10-14 03:27:00,616][33201] Updated weights for policy 0, policy_version 60020 (0.0009) [2023-10-14 03:27:00,990][33201] Updated weights for policy 0, policy_version 60030 (0.0008) [2023-10-14 03:27:02,827][33226] Updated weights for policy 1, policy_version 60580 (0.0007) [2023-10-14 03:27:03,190][33226] Updated weights for policy 1, policy_version 60590 (0.0007) [2023-10-14 03:27:03,567][33226] Updated weights for policy 1, policy_version 60600 (0.0010) [2023-10-14 03:27:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 123535360. Throughput: 0: 1752.9, 1: 1781.7. Samples: 30891494. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:04,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.970')] [2023-10-14 03:27:04,717][33201] Updated weights for policy 0, policy_version 60040 (0.0008) [2023-10-14 03:27:05,082][33201] Updated weights for policy 0, policy_version 60050 (0.0008) [2023-10-14 03:27:05,449][33201] Updated weights for policy 0, policy_version 60060 (0.0010) [2023-10-14 03:27:07,338][33226] Updated weights for policy 1, policy_version 60610 (0.0008) [2023-10-14 03:27:07,698][33226] Updated weights for policy 1, policy_version 60620 (0.0007) [2023-10-14 03:27:08,063][33226] Updated weights for policy 1, policy_version 60630 (0.0009) [2023-10-14 03:27:08,431][33226] Updated weights for policy 1, policy_version 60640 (0.0008) [2023-10-14 03:27:09,271][33201] Updated weights for policy 0, policy_version 60070 (0.0009) [2023-10-14 03:27:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 123600896. Throughput: 0: 1790.2, 1: 1759.7. Samples: 30912840. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:09,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.970')] [2023-10-14 03:27:09,642][33201] Updated weights for policy 0, policy_version 60080 (0.0010) [2023-10-14 03:27:10,003][33201] Updated weights for policy 0, policy_version 60090 (0.0010) [2023-10-14 03:27:12,224][33226] Updated weights for policy 1, policy_version 60650 (0.0009) [2023-10-14 03:27:12,588][33226] Updated weights for policy 1, policy_version 60660 (0.0008) [2023-10-14 03:27:12,960][33226] Updated weights for policy 1, policy_version 60670 (0.0009) [2023-10-14 03:27:13,687][33201] Updated weights for policy 0, policy_version 60100 (0.0008) [2023-10-14 03:27:14,048][33201] Updated weights for policy 0, policy_version 60110 (0.0009) [2023-10-14 03:27:14,423][33201] Updated weights for policy 0, policy_version 60120 (0.0010) [2023-10-14 03:27:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 123666432. Throughput: 0: 1755.2, 1: 1790.4. Samples: 30923806. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:14,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.970')] [2023-10-14 03:27:16,616][33226] Updated weights for policy 1, policy_version 60680 (0.0008) [2023-10-14 03:27:16,977][33226] Updated weights for policy 1, policy_version 60690 (0.0008) [2023-10-14 03:27:17,354][33226] Updated weights for policy 1, policy_version 60700 (0.0007) [2023-10-14 03:27:18,286][33201] Updated weights for policy 0, policy_version 60130 (0.0009) [2023-10-14 03:27:18,650][33201] Updated weights for policy 0, policy_version 60140 (0.0007) [2023-10-14 03:27:19,018][33201] Updated weights for policy 0, policy_version 60150 (0.0010) [2023-10-14 03:27:19,387][33201] Updated weights for policy 0, policy_version 60160 (0.0008) [2023-10-14 03:27:19,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 123764736. Throughput: 0: 1786.1, 1: 1765.4. 
Samples: 30944890. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:19,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.980')] [2023-10-14 03:27:21,107][33226] Updated weights for policy 1, policy_version 60710 (0.0008) [2023-10-14 03:27:21,474][33226] Updated weights for policy 1, policy_version 60720 (0.0008) [2023-10-14 03:27:21,855][33226] Updated weights for policy 1, policy_version 60730 (0.0009) [2023-10-14 03:27:23,160][33201] Updated weights for policy 0, policy_version 60170 (0.0008) [2023-10-14 03:27:23,537][33201] Updated weights for policy 0, policy_version 60180 (0.0008) [2023-10-14 03:27:23,906][33201] Updated weights for policy 0, policy_version 60190 (0.0010) [2023-10-14 03:27:24,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 123830272. Throughput: 0: 1762.0, 1: 1769.1. Samples: 30965762. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:24,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.980')] [2023-10-14 03:27:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000060736_62193664.pth... [2023-10-14 03:27:24,571][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000060192_61636608.pth... 
[2023-10-14 03:27:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000059072_60489728.pth [2023-10-14 03:27:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000058528_59932672.pth [2023-10-14 03:27:25,701][33226] Updated weights for policy 1, policy_version 60740 (0.0008) [2023-10-14 03:27:26,062][33226] Updated weights for policy 1, policy_version 60750 (0.0008) [2023-10-14 03:27:26,430][33226] Updated weights for policy 1, policy_version 60760 (0.0007) [2023-10-14 03:27:27,824][33201] Updated weights for policy 0, policy_version 60200 (0.0009) [2023-10-14 03:27:28,202][33201] Updated weights for policy 0, policy_version 60210 (0.0010) [2023-10-14 03:27:28,572][33201] Updated weights for policy 0, policy_version 60220 (0.0010) [2023-10-14 03:27:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 123895808. Throughput: 0: 1780.2, 1: 1770.9. Samples: 30976728. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:29,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.980')] [2023-10-14 03:27:30,221][33226] Updated weights for policy 1, policy_version 60770 (0.0009) [2023-10-14 03:27:30,576][33226] Updated weights for policy 1, policy_version 60780 (0.0008) [2023-10-14 03:27:30,936][33226] Updated weights for policy 1, policy_version 60790 (0.0008) [2023-10-14 03:27:31,299][33226] Updated weights for policy 1, policy_version 60800 (0.0008) [2023-10-14 03:27:32,414][33201] Updated weights for policy 0, policy_version 60230 (0.0009) [2023-10-14 03:27:32,785][33201] Updated weights for policy 0, policy_version 60240 (0.0007) [2023-10-14 03:27:33,148][33201] Updated weights for policy 0, policy_version 60250 (0.0008) [2023-10-14 03:27:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 123961344. Throughput: 0: 1770.8, 1: 1781.4. Samples: 30997902. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 03:27:34,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.980')] [2023-10-14 03:27:35,121][33226] Updated weights for policy 1, policy_version 60810 (0.0007) [2023-10-14 03:27:35,494][33226] Updated weights for policy 1, policy_version 60820 (0.0007) [2023-10-14 03:27:35,864][33226] Updated weights for policy 1, policy_version 60830 (0.0007) [2023-10-14 03:27:37,066][33201] Updated weights for policy 0, policy_version 60260 (0.0008) [2023-10-14 03:27:37,440][33201] Updated weights for policy 0, policy_version 60270 (0.0010) [2023-10-14 03:27:37,814][33201] Updated weights for policy 0, policy_version 60280 (0.0010) [2023-10-14 03:27:39,546][33226] Updated weights for policy 1, policy_version 60840 (0.0008) [2023-10-14 03:27:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 124026880. Throughput: 0: 1758.1, 1: 1789.8. Samples: 31019304. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:27:39,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.980')] [2023-10-14 03:27:39,907][33226] Updated weights for policy 1, policy_version 60850 (0.0008) [2023-10-14 03:27:40,282][33226] Updated weights for policy 1, policy_version 60860 (0.0010) [2023-10-14 03:27:41,521][33201] Updated weights for policy 0, policy_version 60290 (0.0007) [2023-10-14 03:27:41,928][33201] Updated weights for policy 0, policy_version 60300 (0.0007) [2023-10-14 03:27:42,293][33201] Updated weights for policy 0, policy_version 60310 (0.0008) [2023-10-14 03:27:42,663][33201] Updated weights for policy 0, policy_version 60320 (0.0007) [2023-10-14 03:27:44,178][33226] Updated weights for policy 1, policy_version 60870 (0.0008) [2023-10-14 03:27:44,548][33226] Updated weights for policy 1, policy_version 60880 (0.0010) [2023-10-14 03:27:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 124092416. Throughput: 0: 1782.0, 1: 1769.7. 
Samples: 31029438. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:27:44,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.980')] [2023-10-14 03:27:44,914][33226] Updated weights for policy 1, policy_version 60890 (0.0011) [2023-10-14 03:27:46,465][33201] Updated weights for policy 0, policy_version 60330 (0.0008) [2023-10-14 03:27:46,839][33201] Updated weights for policy 0, policy_version 60340 (0.0009) [2023-10-14 03:27:47,206][33201] Updated weights for policy 0, policy_version 60350 (0.0010) [2023-10-14 03:27:48,803][33226] Updated weights for policy 1, policy_version 60900 (0.0009) [2023-10-14 03:27:49,182][33226] Updated weights for policy 1, policy_version 60910 (0.0008) [2023-10-14 03:27:49,548][33226] Updated weights for policy 1, policy_version 60920 (0.0007) [2023-10-14 03:27:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 124157952. Throughput: 0: 1758.5, 1: 1779.3. Samples: 31050696. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:27:49,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.970')] [2023-10-14 03:27:51,108][33201] Updated weights for policy 0, policy_version 60360 (0.0010) [2023-10-14 03:27:51,476][33201] Updated weights for policy 0, policy_version 60370 (0.0010) [2023-10-14 03:27:51,847][33201] Updated weights for policy 0, policy_version 60380 (0.0010) [2023-10-14 03:27:53,297][33226] Updated weights for policy 1, policy_version 60930 (0.0010) [2023-10-14 03:27:53,658][33226] Updated weights for policy 1, policy_version 60940 (0.0007) [2023-10-14 03:27:54,025][33226] Updated weights for policy 1, policy_version 60950 (0.0008) [2023-10-14 03:27:54,391][33226] Updated weights for policy 1, policy_version 60960 (0.0007) [2023-10-14 03:27:54,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 124256256. Throughput: 0: 1752.8, 1: 1782.7. Samples: 31071938. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:27:54,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.970')] [2023-10-14 03:27:55,767][33201] Updated weights for policy 0, policy_version 60390 (0.0010) [2023-10-14 03:27:56,144][33201] Updated weights for policy 0, policy_version 60400 (0.0009) [2023-10-14 03:27:56,506][33201] Updated weights for policy 0, policy_version 60410 (0.0010) [2023-10-14 03:27:58,205][33226] Updated weights for policy 1, policy_version 60970 (0.0009) [2023-10-14 03:27:58,572][33226] Updated weights for policy 1, policy_version 60980 (0.0007) [2023-10-14 03:27:58,940][33226] Updated weights for policy 1, policy_version 60990 (0.0007) [2023-10-14 03:27:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 124321792. Throughput: 0: 1751.2, 1: 1775.4. Samples: 31082504. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:27:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.970')] [2023-10-14 03:28:00,269][33201] Updated weights for policy 0, policy_version 60420 (0.0008) [2023-10-14 03:28:00,647][33201] Updated weights for policy 0, policy_version 60430 (0.0007) [2023-10-14 03:28:01,020][33201] Updated weights for policy 0, policy_version 60440 (0.0007) [2023-10-14 03:28:02,705][33226] Updated weights for policy 1, policy_version 61000 (0.0008) [2023-10-14 03:28:03,078][33226] Updated weights for policy 1, policy_version 61010 (0.0007) [2023-10-14 03:28:03,442][33226] Updated weights for policy 1, policy_version 61020 (0.0007) [2023-10-14 03:28:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 124387328. Throughput: 0: 1742.4, 1: 1789.1. Samples: 31103810. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:28:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:28:04,957][33201] Updated weights for policy 0, policy_version 60450 (0.0009) [2023-10-14 03:28:05,327][33201] Updated weights for policy 0, policy_version 60460 (0.0010) [2023-10-14 03:28:05,690][33201] Updated weights for policy 0, policy_version 60470 (0.0010) [2023-10-14 03:28:06,059][33201] Updated weights for policy 0, policy_version 60480 (0.0008) [2023-10-14 03:28:07,238][33226] Updated weights for policy 1, policy_version 61030 (0.0008) [2023-10-14 03:28:07,600][33226] Updated weights for policy 1, policy_version 61040 (0.0009) [2023-10-14 03:28:07,962][33226] Updated weights for policy 1, policy_version 61050 (0.0008) [2023-10-14 03:28:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 124452864. Throughput: 0: 1776.9, 1: 1765.9. Samples: 31125184. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:28:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:28:09,891][33201] Updated weights for policy 0, policy_version 60490 (0.0010) [2023-10-14 03:28:10,262][33201] Updated weights for policy 0, policy_version 60500 (0.0009) [2023-10-14 03:28:10,632][33201] Updated weights for policy 0, policy_version 60510 (0.0010) [2023-10-14 03:28:11,766][33226] Updated weights for policy 1, policy_version 61060 (0.0008) [2023-10-14 03:28:12,123][33226] Updated weights for policy 1, policy_version 61070 (0.0008) [2023-10-14 03:28:12,492][33226] Updated weights for policy 1, policy_version 61080 (0.0009) [2023-10-14 03:28:14,542][33201] Updated weights for policy 0, policy_version 60520 (0.0007) [2023-10-14 03:28:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 124518400. Throughput: 0: 1746.1, 1: 1785.4. Samples: 31135646. 
Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:28:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:28:14,907][33201] Updated weights for policy 0, policy_version 60530 (0.0008) [2023-10-14 03:28:15,291][33201] Updated weights for policy 0, policy_version 60540 (0.0009) [2023-10-14 03:28:16,249][33226] Updated weights for policy 1, policy_version 61090 (0.0008) [2023-10-14 03:28:16,611][33226] Updated weights for policy 1, policy_version 61100 (0.0009) [2023-10-14 03:28:16,986][33226] Updated weights for policy 1, policy_version 61110 (0.0008) [2023-10-14 03:28:17,356][33226] Updated weights for policy 1, policy_version 61120 (0.0007) [2023-10-14 03:28:19,139][33201] Updated weights for policy 0, policy_version 60550 (0.0009) [2023-10-14 03:28:19,520][33201] Updated weights for policy 0, policy_version 60560 (0.0008) [2023-10-14 03:28:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 124583936. Throughput: 0: 1767.7, 1: 1758.0. Samples: 31156556. Policy #0 lag: (min: 28.0, avg: 28.0, max: 28.0) [2023-10-14 03:28:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:28:19,887][33201] Updated weights for policy 0, policy_version 60570 (0.0008) [2023-10-14 03:28:21,341][33226] Updated weights for policy 1, policy_version 61130 (0.0011) [2023-10-14 03:28:21,706][33226] Updated weights for policy 1, policy_version 61140 (0.0010) [2023-10-14 03:28:22,079][33226] Updated weights for policy 1, policy_version 61150 (0.0008) [2023-10-14 03:28:23,594][33201] Updated weights for policy 0, policy_version 60580 (0.0010) [2023-10-14 03:28:23,962][33201] Updated weights for policy 0, policy_version 60590 (0.0008) [2023-10-14 03:28:24,335][33201] Updated weights for policy 0, policy_version 60600 (0.0010) [2023-10-14 03:28:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 124649472. Throughput: 0: 1763.2, 1: 1759.8. 
Samples: 31177838. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:28:25,752][33226] Updated weights for policy 1, policy_version 61160 (0.0007) [2023-10-14 03:28:26,122][33226] Updated weights for policy 1, policy_version 61170 (0.0009) [2023-10-14 03:28:26,483][33226] Updated weights for policy 1, policy_version 61180 (0.0007) [2023-10-14 03:28:28,078][33201] Updated weights for policy 0, policy_version 60610 (0.0009) [2023-10-14 03:28:28,449][33201] Updated weights for policy 0, policy_version 60620 (0.0008) [2023-10-14 03:28:28,824][33201] Updated weights for policy 0, policy_version 60630 (0.0008) [2023-10-14 03:28:29,188][33201] Updated weights for policy 0, policy_version 60640 (0.0009) [2023-10-14 03:28:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 124747776. Throughput: 0: 1763.5, 1: 1768.0. Samples: 31188356. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:28:30,494][33226] Updated weights for policy 1, policy_version 61190 (0.0009) [2023-10-14 03:28:30,859][33226] Updated weights for policy 1, policy_version 61200 (0.0008) [2023-10-14 03:28:31,219][33226] Updated weights for policy 1, policy_version 61210 (0.0010) [2023-10-14 03:28:33,133][33201] Updated weights for policy 0, policy_version 60650 (0.0009) [2023-10-14 03:28:33,497][33201] Updated weights for policy 0, policy_version 60660 (0.0009) [2023-10-14 03:28:33,866][33201] Updated weights for policy 0, policy_version 60670 (0.0007) [2023-10-14 03:28:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 124813312. Throughput: 0: 1771.2, 1: 1764.0. Samples: 31209780. 
Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:34,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 03:28:34,969][33226] Updated weights for policy 1, policy_version 61220 (0.0009) [2023-10-14 03:28:35,334][33226] Updated weights for policy 1, policy_version 61230 (0.0009) [2023-10-14 03:28:35,689][33226] Updated weights for policy 1, policy_version 61240 (0.0010) [2023-10-14 03:28:37,661][33201] Updated weights for policy 0, policy_version 60680 (0.0010) [2023-10-14 03:28:38,034][33201] Updated weights for policy 0, policy_version 60690 (0.0011) [2023-10-14 03:28:38,409][33201] Updated weights for policy 0, policy_version 60700 (0.0009) [2023-10-14 03:28:39,534][33226] Updated weights for policy 1, policy_version 61250 (0.0010) [2023-10-14 03:28:39,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 124878848. Throughput: 0: 1749.7, 1: 1782.8. Samples: 31230902. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:39,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 03:28:39,899][33226] Updated weights for policy 1, policy_version 61260 (0.0010) [2023-10-14 03:28:40,270][33226] Updated weights for policy 1, policy_version 61270 (0.0010) [2023-10-14 03:28:40,641][33226] Updated weights for policy 1, policy_version 61280 (0.0010) [2023-10-14 03:28:42,307][33201] Updated weights for policy 0, policy_version 60710 (0.0009) [2023-10-14 03:28:42,677][33201] Updated weights for policy 0, policy_version 60720 (0.0007) [2023-10-14 03:28:43,060][33201] Updated weights for policy 0, policy_version 60730 (0.0007) [2023-10-14 03:28:44,179][33226] Updated weights for policy 1, policy_version 61290 (0.0008) [2023-10-14 03:28:44,534][33226] Updated weights for policy 1, policy_version 61300 (0.0007) [2023-10-14 03:28:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 124944384. Throughput: 0: 1779.4, 1: 1765.5. 
Samples: 31242024. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:44,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 03:28:44,900][33226] Updated weights for policy 1, policy_version 61310 (0.0010) [2023-10-14 03:28:46,955][33201] Updated weights for policy 0, policy_version 60740 (0.0009) [2023-10-14 03:28:47,328][33201] Updated weights for policy 0, policy_version 60750 (0.0011) [2023-10-14 03:28:47,695][33201] Updated weights for policy 0, policy_version 60760 (0.0008) [2023-10-14 03:28:48,771][33226] Updated weights for policy 1, policy_version 61320 (0.0007) [2023-10-14 03:28:49,138][33226] Updated weights for policy 1, policy_version 61330 (0.0007) [2023-10-14 03:28:49,508][33226] Updated weights for policy 1, policy_version 61340 (0.0007) [2023-10-14 03:28:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 125009920. Throughput: 0: 1751.6, 1: 1782.4. Samples: 31262840. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:49,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 03:28:51,685][33201] Updated weights for policy 0, policy_version 60770 (0.0009) [2023-10-14 03:28:52,062][33201] Updated weights for policy 0, policy_version 60780 (0.0010) [2023-10-14 03:28:52,430][33201] Updated weights for policy 0, policy_version 60790 (0.0008) [2023-10-14 03:28:52,803][33201] Updated weights for policy 0, policy_version 60800 (0.0008) [2023-10-14 03:28:53,342][33226] Updated weights for policy 1, policy_version 61350 (0.0007) [2023-10-14 03:28:53,712][33226] Updated weights for policy 1, policy_version 61360 (0.0008) [2023-10-14 03:28:54,080][33226] Updated weights for policy 1, policy_version 61370 (0.0010) [2023-10-14 03:28:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 125108224. Throughput: 0: 1750.0, 1: 1779.2. Samples: 31284002. 
Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 03:28:56,494][33201] Updated weights for policy 0, policy_version 60810 (0.0008) [2023-10-14 03:28:56,865][33201] Updated weights for policy 0, policy_version 60820 (0.0010) [2023-10-14 03:28:57,241][33201] Updated weights for policy 0, policy_version 60830 (0.0007) [2023-10-14 03:28:57,861][33226] Updated weights for policy 1, policy_version 61380 (0.0007) [2023-10-14 03:28:58,226][33226] Updated weights for policy 1, policy_version 61390 (0.0009) [2023-10-14 03:28:58,589][33226] Updated weights for policy 1, policy_version 61400 (0.0007) [2023-10-14 03:28:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 125173760. Throughput: 0: 1758.0, 1: 1783.4. Samples: 31295010. Policy #0 lag: (min: 31.0, avg: 44.8, max: 63.0) [2023-10-14 03:28:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:00,917][33201] Updated weights for policy 0, policy_version 60840 (0.0008) [2023-10-14 03:29:01,287][33201] Updated weights for policy 0, policy_version 60850 (0.0010) [2023-10-14 03:29:01,664][33201] Updated weights for policy 0, policy_version 60860 (0.0010) [2023-10-14 03:29:02,431][33226] Updated weights for policy 1, policy_version 61410 (0.0009) [2023-10-14 03:29:02,809][33226] Updated weights for policy 1, policy_version 61420 (0.0010) [2023-10-14 03:29:03,168][33226] Updated weights for policy 1, policy_version 61430 (0.0009) [2023-10-14 03:29:03,540][33226] Updated weights for policy 1, policy_version 61440 (0.0010) [2023-10-14 03:29:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 125239296. Throughput: 0: 1753.7, 1: 1791.4. Samples: 31316088. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:04,559][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:05,632][33201] Updated weights for policy 0, policy_version 60870 (0.0011) [2023-10-14 03:29:05,993][33201] Updated weights for policy 0, policy_version 60880 (0.0011) [2023-10-14 03:29:06,368][33201] Updated weights for policy 0, policy_version 60890 (0.0010) [2023-10-14 03:29:07,287][33226] Updated weights for policy 1, policy_version 61450 (0.0009) [2023-10-14 03:29:07,657][33226] Updated weights for policy 1, policy_version 61460 (0.0009) [2023-10-14 03:29:08,024][33226] Updated weights for policy 1, policy_version 61470 (0.0009) [2023-10-14 03:29:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 125304832. Throughput: 0: 1770.4, 1: 1775.2. Samples: 31337392. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:09,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:09,945][33201] Updated weights for policy 0, policy_version 60900 (0.0009) [2023-10-14 03:29:10,319][33201] Updated weights for policy 0, policy_version 60910 (0.0008) [2023-10-14 03:29:10,700][33201] Updated weights for policy 0, policy_version 60920 (0.0007) [2023-10-14 03:29:11,796][33226] Updated weights for policy 1, policy_version 61480 (0.0007) [2023-10-14 03:29:12,167][33226] Updated weights for policy 1, policy_version 61490 (0.0007) [2023-10-14 03:29:12,536][33226] Updated weights for policy 1, policy_version 61500 (0.0007) [2023-10-14 03:29:14,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 125370368. Throughput: 0: 1756.5, 1: 1790.7. Samples: 31347980. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:14,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:14,586][33201] Updated weights for policy 0, policy_version 60930 (0.0008) [2023-10-14 03:29:14,982][33201] Updated weights for policy 0, policy_version 60940 (0.0009) [2023-10-14 03:29:15,358][33201] Updated weights for policy 0, policy_version 60950 (0.0008) [2023-10-14 03:29:15,727][33201] Updated weights for policy 0, policy_version 60960 (0.0008) [2023-10-14 03:29:16,318][33226] Updated weights for policy 1, policy_version 61510 (0.0008) [2023-10-14 03:29:16,683][33226] Updated weights for policy 1, policy_version 61520 (0.0008) [2023-10-14 03:29:17,058][33226] Updated weights for policy 1, policy_version 61530 (0.0008) [2023-10-14 03:29:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 125435904. Throughput: 0: 1764.3, 1: 1777.9. Samples: 31369176. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:19,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:19,658][33201] Updated weights for policy 0, policy_version 60970 (0.0009) [2023-10-14 03:29:20,032][33201] Updated weights for policy 0, policy_version 60980 (0.0008) [2023-10-14 03:29:20,403][33201] Updated weights for policy 0, policy_version 60990 (0.0009) [2023-10-14 03:29:20,955][33226] Updated weights for policy 1, policy_version 61540 (0.0007) [2023-10-14 03:29:21,316][33226] Updated weights for policy 1, policy_version 61550 (0.0008) [2023-10-14 03:29:21,691][33226] Updated weights for policy 1, policy_version 61560 (0.0008) [2023-10-14 03:29:24,095][33201] Updated weights for policy 0, policy_version 61000 (0.0009) [2023-10-14 03:29:24,472][33201] Updated weights for policy 0, policy_version 61010 (0.0009) [2023-10-14 03:29:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 125501440. Throughput: 0: 1781.9, 1: 1774.8. 
Samples: 31390954. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:24,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 03:29:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000061568_63045632.pth... [2023-10-14 03:29:24,601][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000059904_61341696.pth [2023-10-14 03:29:24,833][33201] Updated weights for policy 0, policy_version 61020 (0.0008) [2023-10-14 03:29:24,984][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000061024_62488576.pth... [2023-10-14 03:29:25,023][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000059360_60784640.pth [2023-10-14 03:29:25,644][33226] Updated weights for policy 1, policy_version 61570 (0.0008) [2023-10-14 03:29:26,016][33226] Updated weights for policy 1, policy_version 61580 (0.0008) [2023-10-14 03:29:26,378][33226] Updated weights for policy 1, policy_version 61590 (0.0011) [2023-10-14 03:29:26,742][33226] Updated weights for policy 1, policy_version 61600 (0.0011) [2023-10-14 03:29:28,635][33201] Updated weights for policy 0, policy_version 61030 (0.0009) [2023-10-14 03:29:28,999][33201] Updated weights for policy 0, policy_version 61040 (0.0009) [2023-10-14 03:29:29,369][33201] Updated weights for policy 0, policy_version 61050 (0.0009) [2023-10-14 03:29:29,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 125566976. Throughput: 0: 1760.8, 1: 1769.8. Samples: 31400902. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:29,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.970')] [2023-10-14 03:29:30,623][33226] Updated weights for policy 1, policy_version 61610 (0.0007) [2023-10-14 03:29:30,991][33226] Updated weights for policy 1, policy_version 61620 (0.0007) [2023-10-14 03:29:31,355][33226] Updated weights for policy 1, policy_version 61630 (0.0008) [2023-10-14 03:29:33,245][33201] Updated weights for policy 0, policy_version 61060 (0.0010) [2023-10-14 03:29:33,604][33201] Updated weights for policy 0, policy_version 61070 (0.0010) [2023-10-14 03:29:33,981][33201] Updated weights for policy 0, policy_version 61080 (0.0009) [2023-10-14 03:29:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 125665280. Throughput: 0: 1790.2, 1: 1761.7. Samples: 31422674. Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:34,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 03:29:35,018][33226] Updated weights for policy 1, policy_version 61640 (0.0010) [2023-10-14 03:29:35,394][33226] Updated weights for policy 1, policy_version 61650 (0.0008) [2023-10-14 03:29:35,752][33226] Updated weights for policy 1, policy_version 61660 (0.0010) [2023-10-14 03:29:37,656][33201] Updated weights for policy 0, policy_version 61090 (0.0009) [2023-10-14 03:29:38,017][33201] Updated weights for policy 0, policy_version 61100 (0.0008) [2023-10-14 03:29:38,394][33201] Updated weights for policy 0, policy_version 61110 (0.0009) [2023-10-14 03:29:38,755][33201] Updated weights for policy 0, policy_version 61120 (0.0007) [2023-10-14 03:29:39,550][33226] Updated weights for policy 1, policy_version 61670 (0.0010) [2023-10-14 03:29:39,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 125730816. Throughput: 0: 1758.9, 1: 1789.6. Samples: 31443682. 
Policy #0 lag: (min: 31.0, avg: 33.3, max: 63.0) [2023-10-14 03:29:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 03:29:39,904][33226] Updated weights for policy 1, policy_version 61680 (0.0008) [2023-10-14 03:29:40,286][33226] Updated weights for policy 1, policy_version 61690 (0.0009) [2023-10-14 03:29:42,602][33201] Updated weights for policy 0, policy_version 61130 (0.0009) [2023-10-14 03:29:42,975][33201] Updated weights for policy 0, policy_version 61140 (0.0010) [2023-10-14 03:29:43,350][33201] Updated weights for policy 0, policy_version 61150 (0.0010) [2023-10-14 03:29:44,070][33226] Updated weights for policy 1, policy_version 61700 (0.0009) [2023-10-14 03:29:44,442][33226] Updated weights for policy 1, policy_version 61710 (0.0009) [2023-10-14 03:29:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 125796352. Throughput: 0: 1784.7, 1: 1759.7. Samples: 31454508. Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:29:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 03:29:44,801][33226] Updated weights for policy 1, policy_version 61720 (0.0008) [2023-10-14 03:29:47,273][33201] Updated weights for policy 0, policy_version 61160 (0.0009) [2023-10-14 03:29:47,642][33201] Updated weights for policy 0, policy_version 61170 (0.0010) [2023-10-14 03:29:48,020][33201] Updated weights for policy 0, policy_version 61180 (0.0008) [2023-10-14 03:29:48,606][33226] Updated weights for policy 1, policy_version 61730 (0.0009) [2023-10-14 03:29:48,966][33226] Updated weights for policy 1, policy_version 61740 (0.0007) [2023-10-14 03:29:49,335][33226] Updated weights for policy 1, policy_version 61750 (0.0008) [2023-10-14 03:29:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 125861888. Throughput: 0: 1762.0, 1: 1780.7. Samples: 31475506. 
Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:29:49,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 03:29:49,705][33226] Updated weights for policy 1, policy_version 61760 (0.0008) [2023-10-14 03:29:51,784][33201] Updated weights for policy 0, policy_version 61190 (0.0007) [2023-10-14 03:29:52,155][33201] Updated weights for policy 0, policy_version 61200 (0.0007) [2023-10-14 03:29:52,526][33201] Updated weights for policy 0, policy_version 61210 (0.0008) [2023-10-14 03:29:53,607][33226] Updated weights for policy 1, policy_version 61770 (0.0008) [2023-10-14 03:29:53,983][33226] Updated weights for policy 1, policy_version 61780 (0.0008) [2023-10-14 03:29:54,357][33226] Updated weights for policy 1, policy_version 61790 (0.0008) [2023-10-14 03:29:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 125960192. Throughput: 0: 1761.3, 1: 1778.6. Samples: 31496688. Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:29:54,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 03:29:56,364][33201] Updated weights for policy 0, policy_version 61220 (0.0008) [2023-10-14 03:29:56,732][33201] Updated weights for policy 0, policy_version 61230 (0.0008) [2023-10-14 03:29:57,093][33201] Updated weights for policy 0, policy_version 61240 (0.0008) [2023-10-14 03:29:58,176][33226] Updated weights for policy 1, policy_version 61800 (0.0009) [2023-10-14 03:29:58,538][33226] Updated weights for policy 1, policy_version 61810 (0.0008) [2023-10-14 03:29:58,917][33226] Updated weights for policy 1, policy_version 61820 (0.0009) [2023-10-14 03:29:59,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 126025728. Throughput: 0: 1767.5, 1: 1776.3. Samples: 31507450. 
Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:29:59,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.930')] [2023-10-14 03:30:00,889][33201] Updated weights for policy 0, policy_version 61250 (0.0008) [2023-10-14 03:30:01,263][33201] Updated weights for policy 0, policy_version 61260 (0.0007) [2023-10-14 03:30:01,636][33201] Updated weights for policy 0, policy_version 61270 (0.0011) [2023-10-14 03:30:02,011][33201] Updated weights for policy 0, policy_version 61280 (0.0011) [2023-10-14 03:30:02,754][33226] Updated weights for policy 1, policy_version 61830 (0.0009) [2023-10-14 03:30:03,122][33226] Updated weights for policy 1, policy_version 61840 (0.0009) [2023-10-14 03:30:03,494][33226] Updated weights for policy 1, policy_version 61850 (0.0008) [2023-10-14 03:30:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 126091264. Throughput: 0: 1764.8, 1: 1783.1. Samples: 31528832. Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:30:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 03:30:05,713][33201] Updated weights for policy 0, policy_version 61290 (0.0011) [2023-10-14 03:30:06,070][33201] Updated weights for policy 0, policy_version 61300 (0.0009) [2023-10-14 03:30:06,448][33201] Updated weights for policy 0, policy_version 61310 (0.0009) [2023-10-14 03:30:07,276][33226] Updated weights for policy 1, policy_version 61860 (0.0009) [2023-10-14 03:30:07,644][33226] Updated weights for policy 1, policy_version 61870 (0.0008) [2023-10-14 03:30:08,010][33226] Updated weights for policy 1, policy_version 61880 (0.0007) [2023-10-14 03:30:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 126156800. Throughput: 0: 1768.1, 1: 1768.0. Samples: 31550078. 
Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:30:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 03:30:10,445][33201] Updated weights for policy 0, policy_version 61320 (0.0009) [2023-10-14 03:30:10,823][33201] Updated weights for policy 0, policy_version 61330 (0.0010) [2023-10-14 03:30:11,205][33201] Updated weights for policy 0, policy_version 61340 (0.0008) [2023-10-14 03:30:11,874][33226] Updated weights for policy 1, policy_version 61890 (0.0008) [2023-10-14 03:30:12,243][33226] Updated weights for policy 1, policy_version 61900 (0.0007) [2023-10-14 03:30:12,602][33226] Updated weights for policy 1, policy_version 61910 (0.0007) [2023-10-14 03:30:12,975][33226] Updated weights for policy 1, policy_version 61920 (0.0008) [2023-10-14 03:30:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 126222336. Throughput: 0: 1757.0, 1: 1795.2. Samples: 31560750. Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:30:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 03:30:14,969][33201] Updated weights for policy 0, policy_version 61350 (0.0009) [2023-10-14 03:30:15,342][33201] Updated weights for policy 0, policy_version 61360 (0.0009) [2023-10-14 03:30:15,708][33201] Updated weights for policy 0, policy_version 61370 (0.0007) [2023-10-14 03:30:16,551][33226] Updated weights for policy 1, policy_version 61930 (0.0010) [2023-10-14 03:30:16,923][33226] Updated weights for policy 1, policy_version 61940 (0.0007) [2023-10-14 03:30:17,296][33226] Updated weights for policy 1, policy_version 61950 (0.0007) [2023-10-14 03:30:19,497][33201] Updated weights for policy 0, policy_version 61380 (0.0008) [2023-10-14 03:30:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 126287872. Throughput: 0: 1761.5, 1: 1777.4. Samples: 31581924. 
Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:30:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 03:30:19,873][33201] Updated weights for policy 0, policy_version 61390 (0.0008) [2023-10-14 03:30:20,243][33201] Updated weights for policy 0, policy_version 61400 (0.0009) [2023-10-14 03:30:21,031][33226] Updated weights for policy 1, policy_version 61960 (0.0009) [2023-10-14 03:30:21,396][33226] Updated weights for policy 1, policy_version 61970 (0.0009) [2023-10-14 03:30:21,758][33226] Updated weights for policy 1, policy_version 61980 (0.0007) [2023-10-14 03:30:23,991][33201] Updated weights for policy 0, policy_version 61410 (0.0011) [2023-10-14 03:30:24,351][33201] Updated weights for policy 0, policy_version 61420 (0.0009) [2023-10-14 03:30:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 126353408. Throughput: 0: 1786.0, 1: 1779.7. Samples: 31604136. Policy #0 lag: (min: 29.0, avg: 31.9, max: 61.0) [2023-10-14 03:30:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 03:30:24,726][33201] Updated weights for policy 0, policy_version 61430 (0.0008) [2023-10-14 03:30:25,097][33201] Updated weights for policy 0, policy_version 61440 (0.0007) [2023-10-14 03:30:25,332][33226] Updated weights for policy 1, policy_version 61990 (0.0010) [2023-10-14 03:30:25,697][33226] Updated weights for policy 1, policy_version 62000 (0.0009) [2023-10-14 03:30:26,050][33226] Updated weights for policy 1, policy_version 62010 (0.0007) [2023-10-14 03:30:28,865][33201] Updated weights for policy 0, policy_version 61450 (0.0009) [2023-10-14 03:30:29,238][33201] Updated weights for policy 0, policy_version 61460 (0.0011) [2023-10-14 03:30:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 126418944. Throughput: 0: 1761.1, 1: 1785.3. Samples: 31614098. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:29,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 03:30:29,604][33201] Updated weights for policy 0, policy_version 61470 (0.0011) [2023-10-14 03:30:30,025][33226] Updated weights for policy 1, policy_version 62020 (0.0008) [2023-10-14 03:30:30,383][33226] Updated weights for policy 1, policy_version 62030 (0.0007) [2023-10-14 03:30:30,752][33226] Updated weights for policy 1, policy_version 62040 (0.0009) [2023-10-14 03:30:33,437][33201] Updated weights for policy 0, policy_version 61480 (0.0010) [2023-10-14 03:30:33,815][33201] Updated weights for policy 0, policy_version 61490 (0.0009) [2023-10-14 03:30:34,180][33201] Updated weights for policy 0, policy_version 61500 (0.0008) [2023-10-14 03:30:34,453][33226] Updated weights for policy 1, policy_version 62050 (0.0007) [2023-10-14 03:30:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 126517248. Throughput: 0: 1790.0, 1: 1778.8. Samples: 31636098. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 03:30:34,816][33226] Updated weights for policy 1, policy_version 62060 (0.0008) [2023-10-14 03:30:35,193][33226] Updated weights for policy 1, policy_version 62070 (0.0010) [2023-10-14 03:30:35,547][33226] Updated weights for policy 1, policy_version 62080 (0.0010) [2023-10-14 03:30:38,211][33201] Updated weights for policy 0, policy_version 61510 (0.0007) [2023-10-14 03:30:38,576][33201] Updated weights for policy 0, policy_version 61520 (0.0011) [2023-10-14 03:30:38,944][33201] Updated weights for policy 0, policy_version 61530 (0.0010) [2023-10-14 03:30:39,522][33226] Updated weights for policy 1, policy_version 62090 (0.0009) [2023-10-14 03:30:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 126582784. Throughput: 0: 1753.9, 1: 1802.1. 
Samples: 31656706. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:39,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 03:30:39,889][33226] Updated weights for policy 1, policy_version 62100 (0.0010) [2023-10-14 03:30:40,256][33226] Updated weights for policy 1, policy_version 62110 (0.0008) [2023-10-14 03:30:42,821][33201] Updated weights for policy 0, policy_version 61540 (0.0010) [2023-10-14 03:30:43,198][33201] Updated weights for policy 0, policy_version 61550 (0.0008) [2023-10-14 03:30:43,571][33201] Updated weights for policy 0, policy_version 61560 (0.0008) [2023-10-14 03:30:44,026][33226] Updated weights for policy 1, policy_version 62120 (0.0009) [2023-10-14 03:30:44,398][33226] Updated weights for policy 1, policy_version 62130 (0.0010) [2023-10-14 03:30:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 126648320. Throughput: 0: 1773.8, 1: 1781.7. Samples: 31667448. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 03:30:44,760][33226] Updated weights for policy 1, policy_version 62140 (0.0008) [2023-10-14 03:30:47,431][33201] Updated weights for policy 0, policy_version 61570 (0.0008) [2023-10-14 03:30:47,801][33201] Updated weights for policy 0, policy_version 61580 (0.0009) [2023-10-14 03:30:48,168][33201] Updated weights for policy 0, policy_version 61590 (0.0010) [2023-10-14 03:30:48,439][33226] Updated weights for policy 1, policy_version 62150 (0.0010) [2023-10-14 03:30:48,544][33201] Updated weights for policy 0, policy_version 61600 (0.0009) [2023-10-14 03:30:48,802][33226] Updated weights for policy 1, policy_version 62160 (0.0007) [2023-10-14 03:30:49,165][33226] Updated weights for policy 1, policy_version 62170 (0.0011) [2023-10-14 03:30:49,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 126746624. 
Throughput: 0: 1762.9, 1: 1796.4. Samples: 31689002. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 03:30:52,356][33201] Updated weights for policy 0, policy_version 61610 (0.0008) [2023-10-14 03:30:52,740][33201] Updated weights for policy 0, policy_version 61620 (0.0009) [2023-10-14 03:30:53,038][33226] Updated weights for policy 1, policy_version 62180 (0.0009) [2023-10-14 03:30:53,116][33201] Updated weights for policy 0, policy_version 61630 (0.0008) [2023-10-14 03:30:53,408][33226] Updated weights for policy 1, policy_version 62190 (0.0007) [2023-10-14 03:30:53,776][33226] Updated weights for policy 1, policy_version 62200 (0.0008) [2023-10-14 03:30:54,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 126812160. Throughput: 0: 1752.8, 1: 1784.0. Samples: 31709234. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 03:30:56,803][33201] Updated weights for policy 0, policy_version 61640 (0.0008) [2023-10-14 03:30:57,179][33201] Updated weights for policy 0, policy_version 61650 (0.0008) [2023-10-14 03:30:57,518][33226] Updated weights for policy 1, policy_version 62210 (0.0008) [2023-10-14 03:30:57,547][33201] Updated weights for policy 0, policy_version 61660 (0.0007) [2023-10-14 03:30:57,890][33226] Updated weights for policy 1, policy_version 62220 (0.0008) [2023-10-14 03:30:58,247][33226] Updated weights for policy 1, policy_version 62230 (0.0008) [2023-10-14 03:30:58,612][33226] Updated weights for policy 1, policy_version 62240 (0.0007) [2023-10-14 03:30:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 126877696. Throughput: 0: 1772.4, 1: 1785.7. Samples: 31720866. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:30:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 03:31:01,382][33201] Updated weights for policy 0, policy_version 61670 (0.0009) [2023-10-14 03:31:01,758][33201] Updated weights for policy 0, policy_version 61680 (0.0008) [2023-10-14 03:31:02,119][33201] Updated weights for policy 0, policy_version 61690 (0.0008) [2023-10-14 03:31:02,480][33226] Updated weights for policy 1, policy_version 62250 (0.0007) [2023-10-14 03:31:02,842][33226] Updated weights for policy 1, policy_version 62260 (0.0007) [2023-10-14 03:31:03,217][33226] Updated weights for policy 1, policy_version 62270 (0.0010) [2023-10-14 03:31:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 126943232. Throughput: 0: 1754.8, 1: 1789.4. Samples: 31741416. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 03:31:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 03:31:06,240][33201] Updated weights for policy 0, policy_version 61700 (0.0007) [2023-10-14 03:31:06,610][33201] Updated weights for policy 0, policy_version 61710 (0.0010) [2023-10-14 03:31:06,980][33201] Updated weights for policy 0, policy_version 61720 (0.0009) [2023-10-14 03:31:07,024][33226] Updated weights for policy 1, policy_version 62280 (0.0008) [2023-10-14 03:31:07,383][33226] Updated weights for policy 1, policy_version 62290 (0.0007) [2023-10-14 03:31:07,762][33226] Updated weights for policy 1, policy_version 62300 (0.0008) [2023-10-14 03:31:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 127008768. Throughput: 0: 1752.0, 1: 1771.7. Samples: 31762704. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 03:31:10,718][33201] Updated weights for policy 0, policy_version 61730 (0.0009) [2023-10-14 03:31:11,097][33201] Updated weights for policy 0, policy_version 61740 (0.0009) [2023-10-14 03:31:11,455][33201] Updated weights for policy 0, policy_version 61750 (0.0008) [2023-10-14 03:31:11,455][33226] Updated weights for policy 1, policy_version 62310 (0.0009) [2023-10-14 03:31:11,820][33201] Updated weights for policy 0, policy_version 61760 (0.0008) [2023-10-14 03:31:11,829][33226] Updated weights for policy 1, policy_version 62320 (0.0007) [2023-10-14 03:31:12,186][33226] Updated weights for policy 1, policy_version 62330 (0.0008) [2023-10-14 03:31:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 127074304. Throughput: 0: 1746.6, 1: 1784.4. Samples: 31772992. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 03:31:15,570][33201] Updated weights for policy 0, policy_version 61770 (0.0007) [2023-10-14 03:31:15,943][33201] Updated weights for policy 0, policy_version 61780 (0.0007) [2023-10-14 03:31:16,054][33226] Updated weights for policy 1, policy_version 62340 (0.0008) [2023-10-14 03:31:16,302][33201] Updated weights for policy 0, policy_version 61790 (0.0007) [2023-10-14 03:31:16,419][33226] Updated weights for policy 1, policy_version 62350 (0.0008) [2023-10-14 03:31:16,783][33226] Updated weights for policy 1, policy_version 62360 (0.0008) [2023-10-14 03:31:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 127139840. Throughput: 0: 1749.8, 1: 1769.3. Samples: 31794460. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 03:31:20,138][33201] Updated weights for policy 0, policy_version 61800 (0.0008) [2023-10-14 03:31:20,493][33226] Updated weights for policy 1, policy_version 62370 (0.0007) [2023-10-14 03:31:20,503][33201] Updated weights for policy 0, policy_version 61810 (0.0008) [2023-10-14 03:31:20,846][33226] Updated weights for policy 1, policy_version 62380 (0.0007) [2023-10-14 03:31:20,870][33201] Updated weights for policy 0, policy_version 61820 (0.0007) [2023-10-14 03:31:21,211][33226] Updated weights for policy 1, policy_version 62390 (0.0009) [2023-10-14 03:31:21,587][33226] Updated weights for policy 1, policy_version 62400 (0.0008) [2023-10-14 03:31:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 127205376. Throughput: 0: 1779.8, 1: 1768.0. Samples: 31816358. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:31:24,565][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000062400_63897600.pth... [2023-10-14 03:31:24,593][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000060736_62193664.pth [2023-10-14 03:31:24,734][33201] Updated weights for policy 0, policy_version 61830 (0.0009) [2023-10-14 03:31:25,107][33201] Updated weights for policy 0, policy_version 61840 (0.0009) [2023-10-14 03:31:25,477][33201] Updated weights for policy 0, policy_version 61850 (0.0008) [2023-10-14 03:31:25,517][33226] Updated weights for policy 1, policy_version 62410 (0.0009) [2023-10-14 03:31:25,695][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000061856_63340544.pth... 
[2023-10-14 03:31:25,724][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000060192_61636608.pth [2023-10-14 03:31:25,897][33226] Updated weights for policy 1, policy_version 62420 (0.0008) [2023-10-14 03:31:26,264][33226] Updated weights for policy 1, policy_version 62430 (0.0008) [2023-10-14 03:31:29,435][33201] Updated weights for policy 0, policy_version 61860 (0.0008) [2023-10-14 03:31:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 127270912. Throughput: 0: 1747.7, 1: 1768.0. Samples: 31825652. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 03:31:29,813][33201] Updated weights for policy 0, policy_version 61870 (0.0008) [2023-10-14 03:31:29,996][33226] Updated weights for policy 1, policy_version 62440 (0.0008) [2023-10-14 03:31:30,193][33201] Updated weights for policy 0, policy_version 61880 (0.0007) [2023-10-14 03:31:30,360][33226] Updated weights for policy 1, policy_version 62450 (0.0010) [2023-10-14 03:31:30,726][33226] Updated weights for policy 1, policy_version 62460 (0.0009) [2023-10-14 03:31:34,165][33201] Updated weights for policy 0, policy_version 61890 (0.0008) [2023-10-14 03:31:34,517][33226] Updated weights for policy 1, policy_version 62470 (0.0007) [2023-10-14 03:31:34,530][33201] Updated weights for policy 0, policy_version 61900 (0.0007) [2023-10-14 03:31:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 127336448. Throughput: 0: 1760.8, 1: 1765.1. Samples: 31847666. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 03:31:34,871][33226] Updated weights for policy 1, policy_version 62480 (0.0009) [2023-10-14 03:31:34,912][33201] Updated weights for policy 0, policy_version 61910 (0.0010) [2023-10-14 03:31:35,238][33226] Updated weights for policy 1, policy_version 62490 (0.0007) [2023-10-14 03:31:35,282][33201] Updated weights for policy 0, policy_version 61920 (0.0008) [2023-10-14 03:31:39,042][33226] Updated weights for policy 1, policy_version 62500 (0.0008) [2023-10-14 03:31:39,264][33201] Updated weights for policy 0, policy_version 61930 (0.0008) [2023-10-14 03:31:39,411][33226] Updated weights for policy 1, policy_version 62510 (0.0007) [2023-10-14 03:31:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 127401984. Throughput: 0: 1760.1, 1: 1792.8. Samples: 31869116. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 03:31:39,636][33201] Updated weights for policy 0, policy_version 61940 (0.0007) [2023-10-14 03:31:39,785][33226] Updated weights for policy 1, policy_version 62520 (0.0008) [2023-10-14 03:31:40,005][33201] Updated weights for policy 0, policy_version 61950 (0.0008) [2023-10-14 03:31:43,649][33226] Updated weights for policy 1, policy_version 62530 (0.0007) [2023-10-14 03:31:43,835][33201] Updated weights for policy 0, policy_version 61960 (0.0007) [2023-10-14 03:31:44,015][33226] Updated weights for policy 1, policy_version 62540 (0.0008) [2023-10-14 03:31:44,212][33201] Updated weights for policy 0, policy_version 61970 (0.0007) [2023-10-14 03:31:44,386][33226] Updated weights for policy 1, policy_version 62550 (0.0008) [2023-10-14 03:31:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 127467520. Throughput: 0: 1749.5, 1: 1766.2. 
Samples: 31879070. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:31:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '21.000')] [2023-10-14 03:31:44,588][33201] Updated weights for policy 0, policy_version 61980 (0.0007) [2023-10-14 03:31:44,751][33226] Updated weights for policy 1, policy_version 62560 (0.0008) [2023-10-14 03:31:48,323][33201] Updated weights for policy 0, policy_version 61990 (0.0009) [2023-10-14 03:31:48,506][33226] Updated weights for policy 1, policy_version 62570 (0.0009) [2023-10-14 03:31:48,695][33201] Updated weights for policy 0, policy_version 62000 (0.0007) [2023-10-14 03:31:48,880][33226] Updated weights for policy 1, policy_version 62580 (0.0011) [2023-10-14 03:31:49,067][33201] Updated weights for policy 0, policy_version 62010 (0.0009) [2023-10-14 03:31:49,241][33226] Updated weights for policy 1, policy_version 62590 (0.0007) [2023-10-14 03:31:49,557][31953] Fps is (10 sec: 19661.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 127598592. Throughput: 0: 1767.5, 1: 1785.7. Samples: 31901306. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:31:49,557][31953] Avg episode reward: [(0, '20.990'), (1, '21.000')] [2023-10-14 03:31:52,902][33201] Updated weights for policy 0, policy_version 62020 (0.0008) [2023-10-14 03:31:52,945][33226] Updated weights for policy 1, policy_version 62600 (0.0008) [2023-10-14 03:31:53,263][33201] Updated weights for policy 0, policy_version 62030 (0.0008) [2023-10-14 03:31:53,312][33226] Updated weights for policy 1, policy_version 62610 (0.0008) [2023-10-14 03:31:53,637][33201] Updated weights for policy 0, policy_version 62040 (0.0008) [2023-10-14 03:31:53,680][33226] Updated weights for policy 1, policy_version 62620 (0.0007) [2023-10-14 03:31:54,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 127664128. Throughput: 0: 1736.7, 1: 1769.6. Samples: 31920486. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:31:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '21.000')] [2023-10-14 03:31:57,427][33226] Updated weights for policy 1, policy_version 62630 (0.0008) [2023-10-14 03:31:57,552][33201] Updated weights for policy 0, policy_version 62050 (0.0009) [2023-10-14 03:31:57,788][33226] Updated weights for policy 1, policy_version 62640 (0.0008) [2023-10-14 03:31:57,913][33201] Updated weights for policy 0, policy_version 62060 (0.0009) [2023-10-14 03:31:58,159][33226] Updated weights for policy 1, policy_version 62650 (0.0007) [2023-10-14 03:31:58,278][33201] Updated weights for policy 0, policy_version 62070 (0.0008) [2023-10-14 03:31:58,649][33201] Updated weights for policy 0, policy_version 62080 (0.0009) [2023-10-14 03:31:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 127729664. Throughput: 0: 1764.4, 1: 1789.8. Samples: 31932934. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:31:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 03:32:01,973][33226] Updated weights for policy 1, policy_version 62660 (0.0008) [2023-10-14 03:32:02,346][33226] Updated weights for policy 1, policy_version 62670 (0.0010) [2023-10-14 03:32:02,479][33201] Updated weights for policy 0, policy_version 62090 (0.0007) [2023-10-14 03:32:02,704][33226] Updated weights for policy 1, policy_version 62680 (0.0008) [2023-10-14 03:32:02,846][33201] Updated weights for policy 0, policy_version 62100 (0.0008) [2023-10-14 03:32:03,218][33201] Updated weights for policy 0, policy_version 62110 (0.0009) [2023-10-14 03:32:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 127795200. Throughput: 0: 1736.3, 1: 1776.7. Samples: 31952546. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.990')] [2023-10-14 03:32:06,640][33226] Updated weights for policy 1, policy_version 62690 (0.0008) [2023-10-14 03:32:07,000][33226] Updated weights for policy 1, policy_version 62700 (0.0008) [2023-10-14 03:32:07,094][33201] Updated weights for policy 0, policy_version 62120 (0.0008) [2023-10-14 03:32:07,355][33226] Updated weights for policy 1, policy_version 62710 (0.0008) [2023-10-14 03:32:07,466][33201] Updated weights for policy 0, policy_version 62130 (0.0008) [2023-10-14 03:32:07,725][33226] Updated weights for policy 1, policy_version 62720 (0.0007) [2023-10-14 03:32:07,838][33201] Updated weights for policy 0, policy_version 62140 (0.0008) [2023-10-14 03:32:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 127860736. Throughput: 0: 1730.2, 1: 1769.7. Samples: 31973854. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.990')] [2023-10-14 03:32:11,597][33226] Updated weights for policy 1, policy_version 62730 (0.0008) [2023-10-14 03:32:11,615][33201] Updated weights for policy 0, policy_version 62150 (0.0007) [2023-10-14 03:32:11,964][33226] Updated weights for policy 1, policy_version 62740 (0.0009) [2023-10-14 03:32:11,984][33201] Updated weights for policy 0, policy_version 62160 (0.0008) [2023-10-14 03:32:12,339][33226] Updated weights for policy 1, policy_version 62750 (0.0008) [2023-10-14 03:32:12,365][33201] Updated weights for policy 0, policy_version 62170 (0.0008) [2023-10-14 03:32:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 127926272. Throughput: 0: 1748.3, 1: 1779.2. Samples: 31984390. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:14,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.990')] [2023-10-14 03:32:16,158][33226] Updated weights for policy 1, policy_version 62760 (0.0008) [2023-10-14 03:32:16,227][33201] Updated weights for policy 0, policy_version 62180 (0.0007) [2023-10-14 03:32:16,533][33226] Updated weights for policy 1, policy_version 62770 (0.0009) [2023-10-14 03:32:16,600][33201] Updated weights for policy 0, policy_version 62190 (0.0007) [2023-10-14 03:32:16,896][33226] Updated weights for policy 1, policy_version 62780 (0.0007) [2023-10-14 03:32:16,965][33201] Updated weights for policy 0, policy_version 62200 (0.0008) [2023-10-14 03:32:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 127991808. Throughput: 0: 1736.1, 1: 1763.2. Samples: 32005138. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 03:32:20,657][33226] Updated weights for policy 1, policy_version 62790 (0.0008) [2023-10-14 03:32:20,783][33201] Updated weights for policy 0, policy_version 62210 (0.0008) [2023-10-14 03:32:21,024][33226] Updated weights for policy 1, policy_version 62800 (0.0007) [2023-10-14 03:32:21,150][33201] Updated weights for policy 0, policy_version 62220 (0.0008) [2023-10-14 03:32:21,394][33226] Updated weights for policy 1, policy_version 62810 (0.0008) [2023-10-14 03:32:21,523][33201] Updated weights for policy 0, policy_version 62230 (0.0008) [2023-10-14 03:32:21,881][33201] Updated weights for policy 0, policy_version 62240 (0.0008) [2023-10-14 03:32:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128057344. Throughput: 0: 1748.8, 1: 1764.4. Samples: 32027208. 
Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:24,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:32:25,235][33226] Updated weights for policy 1, policy_version 62820 (0.0009) [2023-10-14 03:32:25,592][33226] Updated weights for policy 1, policy_version 62830 (0.0007) [2023-10-14 03:32:25,958][33226] Updated weights for policy 1, policy_version 62840 (0.0008) [2023-10-14 03:32:26,024][33201] Updated weights for policy 0, policy_version 62250 (0.0008) [2023-10-14 03:32:26,392][33201] Updated weights for policy 0, policy_version 62260 (0.0007) [2023-10-14 03:32:26,765][33201] Updated weights for policy 0, policy_version 62270 (0.0008) [2023-10-14 03:32:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128122880. Throughput: 0: 1740.0, 1: 1765.0. Samples: 32036792. Policy #0 lag: (min: 31.0, avg: 36.8, max: 63.0) [2023-10-14 03:32:29,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:32:29,751][33226] Updated weights for policy 1, policy_version 62850 (0.0009) [2023-10-14 03:32:30,113][33226] Updated weights for policy 1, policy_version 62860 (0.0011) [2023-10-14 03:32:30,482][33226] Updated weights for policy 1, policy_version 62870 (0.0009) [2023-10-14 03:32:30,562][33201] Updated weights for policy 0, policy_version 62280 (0.0008) [2023-10-14 03:32:30,847][33226] Updated weights for policy 1, policy_version 62880 (0.0009) [2023-10-14 03:32:30,936][33201] Updated weights for policy 0, policy_version 62290 (0.0007) [2023-10-14 03:32:31,304][33201] Updated weights for policy 0, policy_version 62300 (0.0008) [2023-10-14 03:32:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128188416. Throughput: 0: 1738.6, 1: 1759.1. Samples: 32058702. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 03:32:34,787][33226] Updated weights for policy 1, policy_version 62890 (0.0009) [2023-10-14 03:32:34,954][33201] Updated weights for policy 0, policy_version 62310 (0.0008) [2023-10-14 03:32:35,155][33226] Updated weights for policy 1, policy_version 62900 (0.0008) [2023-10-14 03:32:35,318][33201] Updated weights for policy 0, policy_version 62320 (0.0008) [2023-10-14 03:32:35,532][33226] Updated weights for policy 1, policy_version 62910 (0.0008) [2023-10-14 03:32:35,686][33201] Updated weights for policy 0, policy_version 62330 (0.0010) [2023-10-14 03:32:39,373][33226] Updated weights for policy 1, policy_version 62920 (0.0007) [2023-10-14 03:32:39,522][33201] Updated weights for policy 0, policy_version 62340 (0.0008) [2023-10-14 03:32:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128253952. Throughput: 0: 1778.9, 1: 1787.8. Samples: 32080988. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:39,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:32:39,736][33226] Updated weights for policy 1, policy_version 62930 (0.0008) [2023-10-14 03:32:39,895][33201] Updated weights for policy 0, policy_version 62350 (0.0008) [2023-10-14 03:32:40,094][33226] Updated weights for policy 1, policy_version 62940 (0.0009) [2023-10-14 03:32:40,255][33201] Updated weights for policy 0, policy_version 62360 (0.0010) [2023-10-14 03:32:43,862][33226] Updated weights for policy 1, policy_version 62950 (0.0008) [2023-10-14 03:32:44,066][33201] Updated weights for policy 0, policy_version 62370 (0.0009) [2023-10-14 03:32:44,234][33226] Updated weights for policy 1, policy_version 62960 (0.0008) [2023-10-14 03:32:44,450][33201] Updated weights for policy 0, policy_version 62380 (0.0008) [2023-10-14 03:32:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128319488. Throughput: 0: 1748.7, 1: 1755.7. Samples: 32090630. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:32:44,602][33226] Updated weights for policy 1, policy_version 62970 (0.0008) [2023-10-14 03:32:44,817][33201] Updated weights for policy 0, policy_version 62390 (0.0010) [2023-10-14 03:32:45,190][33201] Updated weights for policy 0, policy_version 62400 (0.0009) [2023-10-14 03:32:48,196][33226] Updated weights for policy 1, policy_version 62980 (0.0008) [2023-10-14 03:32:48,568][33226] Updated weights for policy 1, policy_version 62990 (0.0008) [2023-10-14 03:32:48,935][33226] Updated weights for policy 1, policy_version 63000 (0.0008) [2023-10-14 03:32:49,085][33201] Updated weights for policy 0, policy_version 62410 (0.0008) [2023-10-14 03:32:49,461][33201] Updated weights for policy 0, policy_version 62420 (0.0009) [2023-10-14 03:32:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 128417792. Throughput: 0: 1770.9, 1: 1790.0. Samples: 32112786. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:32:49,825][33201] Updated weights for policy 0, policy_version 62430 (0.0009) [2023-10-14 03:32:52,710][33226] Updated weights for policy 1, policy_version 63010 (0.0008) [2023-10-14 03:32:53,078][33226] Updated weights for policy 1, policy_version 63020 (0.0007) [2023-10-14 03:32:53,433][33226] Updated weights for policy 1, policy_version 63030 (0.0007) [2023-10-14 03:32:53,781][33201] Updated weights for policy 0, policy_version 62440 (0.0009) [2023-10-14 03:32:53,801][33226] Updated weights for policy 1, policy_version 63040 (0.0007) [2023-10-14 03:32:54,150][33201] Updated weights for policy 0, policy_version 62450 (0.0008) [2023-10-14 03:32:54,521][33201] Updated weights for policy 0, policy_version 62460 (0.0008) [2023-10-14 03:32:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 128483328. Throughput: 0: 1764.0, 1: 1767.2. Samples: 32132758. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:32:57,661][33226] Updated weights for policy 1, policy_version 63050 (0.0008) [2023-10-14 03:32:58,029][33226] Updated weights for policy 1, policy_version 63060 (0.0007) [2023-10-14 03:32:58,197][33201] Updated weights for policy 0, policy_version 62470 (0.0009) [2023-10-14 03:32:58,401][33226] Updated weights for policy 1, policy_version 63070 (0.0010) [2023-10-14 03:32:58,563][33201] Updated weights for policy 0, policy_version 62480 (0.0007) [2023-10-14 03:32:58,944][33201] Updated weights for policy 0, policy_version 62490 (0.0008) [2023-10-14 03:32:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 128581632. Throughput: 0: 1766.0, 1: 1796.2. Samples: 32144688. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:32:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:33:02,155][33226] Updated weights for policy 1, policy_version 63080 (0.0009) [2023-10-14 03:33:02,521][33226] Updated weights for policy 1, policy_version 63090 (0.0007) [2023-10-14 03:33:02,853][33201] Updated weights for policy 0, policy_version 62500 (0.0008) [2023-10-14 03:33:02,879][33226] Updated weights for policy 1, policy_version 63100 (0.0008) [2023-10-14 03:33:03,220][33201] Updated weights for policy 0, policy_version 62510 (0.0010) [2023-10-14 03:33:03,595][33201] Updated weights for policy 0, policy_version 62520 (0.0009) [2023-10-14 03:33:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 128647168. Throughput: 0: 1774.6, 1: 1776.2. Samples: 32164924. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:33:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 03:33:06,722][33226] Updated weights for policy 1, policy_version 63110 (0.0007) [2023-10-14 03:33:07,087][33226] Updated weights for policy 1, policy_version 63120 (0.0009) [2023-10-14 03:33:07,448][33226] Updated weights for policy 1, policy_version 63130 (0.0008) [2023-10-14 03:33:07,617][33201] Updated weights for policy 0, policy_version 62530 (0.0008) [2023-10-14 03:33:07,980][33201] Updated weights for policy 0, policy_version 62540 (0.0010) [2023-10-14 03:33:08,361][33201] Updated weights for policy 0, policy_version 62550 (0.0007) [2023-10-14 03:33:08,735][33201] Updated weights for policy 0, policy_version 62560 (0.0007) [2023-10-14 03:33:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 128712704. Throughput: 0: 1746.7, 1: 1781.0. Samples: 32185954. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-10-14 03:33:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:11,040][33226] Updated weights for policy 1, policy_version 63140 (0.0008) [2023-10-14 03:33:11,410][33226] Updated weights for policy 1, policy_version 63150 (0.0009) [2023-10-14 03:33:11,772][33226] Updated weights for policy 1, policy_version 63160 (0.0009) [2023-10-14 03:33:12,689][33201] Updated weights for policy 0, policy_version 62570 (0.0009) [2023-10-14 03:33:13,055][33201] Updated weights for policy 0, policy_version 62580 (0.0010) [2023-10-14 03:33:13,430][33201] Updated weights for policy 0, policy_version 62590 (0.0008) [2023-10-14 03:33:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 128778240. Throughput: 0: 1780.5, 1: 1782.3. Samples: 32197120. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:15,435][33226] Updated weights for policy 1, policy_version 63170 (0.0008) [2023-10-14 03:33:15,810][33226] Updated weights for policy 1, policy_version 63180 (0.0007) [2023-10-14 03:33:16,183][33226] Updated weights for policy 1, policy_version 63190 (0.0008) [2023-10-14 03:33:16,549][33226] Updated weights for policy 1, policy_version 63200 (0.0007) [2023-10-14 03:33:17,070][33201] Updated weights for policy 0, policy_version 62600 (0.0009) [2023-10-14 03:33:17,434][33201] Updated weights for policy 0, policy_version 62610 (0.0008) [2023-10-14 03:33:17,809][33201] Updated weights for policy 0, policy_version 62620 (0.0008) [2023-10-14 03:33:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 128843776. Throughput: 0: 1745.8, 1: 1790.3. Samples: 32217828. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:20,320][33226] Updated weights for policy 1, policy_version 63210 (0.0009) [2023-10-14 03:33:20,686][33226] Updated weights for policy 1, policy_version 63220 (0.0009) [2023-10-14 03:33:21,055][33226] Updated weights for policy 1, policy_version 63230 (0.0008) [2023-10-14 03:33:21,519][33201] Updated weights for policy 0, policy_version 62630 (0.0009) [2023-10-14 03:33:21,887][33201] Updated weights for policy 0, policy_version 62640 (0.0009) [2023-10-14 03:33:22,266][33201] Updated weights for policy 0, policy_version 62650 (0.0007) [2023-10-14 03:33:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 128909312. Throughput: 0: 1745.4, 1: 1792.2. Samples: 32240180. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000062656_64159744.pth... [2023-10-14 03:33:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000061024_62488576.pth [2023-10-14 03:33:24,813][33226] Updated weights for policy 1, policy_version 63240 (0.0008) [2023-10-14 03:33:25,177][33226] Updated weights for policy 1, policy_version 63250 (0.0009) [2023-10-14 03:33:25,531][33226] Updated weights for policy 1, policy_version 63260 (0.0008) [2023-10-14 03:33:25,676][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000063264_64782336.pth... 
[2023-10-14 03:33:25,704][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000061568_63045632.pth [2023-10-14 03:33:26,117][33201] Updated weights for policy 0, policy_version 62660 (0.0008) [2023-10-14 03:33:26,484][33201] Updated weights for policy 0, policy_version 62670 (0.0007) [2023-10-14 03:33:26,856][33201] Updated weights for policy 0, policy_version 62680 (0.0007) [2023-10-14 03:33:29,333][33226] Updated weights for policy 1, policy_version 63270 (0.0007) [2023-10-14 03:33:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 128974848. Throughput: 0: 1748.8, 1: 1790.4. Samples: 32249892. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:29,705][33226] Updated weights for policy 1, policy_version 63280 (0.0009) [2023-10-14 03:33:30,073][33226] Updated weights for policy 1, policy_version 63290 (0.0008) [2023-10-14 03:33:30,544][33201] Updated weights for policy 0, policy_version 62690 (0.0007) [2023-10-14 03:33:30,903][33201] Updated weights for policy 0, policy_version 62700 (0.0008) [2023-10-14 03:33:31,278][33201] Updated weights for policy 0, policy_version 62710 (0.0010) [2023-10-14 03:33:31,649][33201] Updated weights for policy 0, policy_version 62720 (0.0008) [2023-10-14 03:33:33,843][33226] Updated weights for policy 1, policy_version 63300 (0.0007) [2023-10-14 03:33:34,206][33226] Updated weights for policy 1, policy_version 63310 (0.0007) [2023-10-14 03:33:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 129040384. Throughput: 0: 1752.4, 1: 1788.5. Samples: 32272128. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:34,579][33226] Updated weights for policy 1, policy_version 63320 (0.0007) [2023-10-14 03:33:35,338][33201] Updated weights for policy 0, policy_version 62730 (0.0008) [2023-10-14 03:33:35,719][33201] Updated weights for policy 0, policy_version 62740 (0.0007) [2023-10-14 03:33:36,085][33201] Updated weights for policy 0, policy_version 62750 (0.0007) [2023-10-14 03:33:38,493][33226] Updated weights for policy 1, policy_version 63330 (0.0009) [2023-10-14 03:33:38,856][33226] Updated weights for policy 1, policy_version 63340 (0.0010) [2023-10-14 03:33:39,217][33226] Updated weights for policy 1, policy_version 63350 (0.0008) [2023-10-14 03:33:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 129105920. Throughput: 0: 1773.9, 1: 1798.4. Samples: 32293510. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 03:33:39,578][33226] Updated weights for policy 1, policy_version 63360 (0.0009) [2023-10-14 03:33:39,993][33201] Updated weights for policy 0, policy_version 62760 (0.0009) [2023-10-14 03:33:40,368][33201] Updated weights for policy 0, policy_version 62770 (0.0008) [2023-10-14 03:33:40,736][33201] Updated weights for policy 0, policy_version 62780 (0.0007) [2023-10-14 03:33:43,503][33226] Updated weights for policy 1, policy_version 63370 (0.0007) [2023-10-14 03:33:43,881][33226] Updated weights for policy 1, policy_version 63380 (0.0007) [2023-10-14 03:33:44,246][33226] Updated weights for policy 1, policy_version 63390 (0.0007) [2023-10-14 03:33:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 129204224. Throughput: 0: 1754.7, 1: 1778.6. Samples: 32303686. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.950')] [2023-10-14 03:33:44,741][33201] Updated weights for policy 0, policy_version 62790 (0.0008) [2023-10-14 03:33:45,108][33201] Updated weights for policy 0, policy_version 62800 (0.0008) [2023-10-14 03:33:45,480][33201] Updated weights for policy 0, policy_version 62810 (0.0009) [2023-10-14 03:33:47,921][33226] Updated weights for policy 1, policy_version 63400 (0.0010) [2023-10-14 03:33:48,292][33226] Updated weights for policy 1, policy_version 63410 (0.0010) [2023-10-14 03:33:48,652][33226] Updated weights for policy 1, policy_version 63420 (0.0009) [2023-10-14 03:33:49,317][33201] Updated weights for policy 0, policy_version 62820 (0.0007) [2023-10-14 03:33:49,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 129269760. Throughput: 0: 1758.0, 1: 1803.7. Samples: 32325200. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:33:49,690][33201] Updated weights for policy 0, policy_version 62830 (0.0009) [2023-10-14 03:33:50,059][33201] Updated weights for policy 0, policy_version 62840 (0.0010) [2023-10-14 03:33:52,574][33226] Updated weights for policy 1, policy_version 63430 (0.0010) [2023-10-14 03:33:52,938][33226] Updated weights for policy 1, policy_version 63440 (0.0010) [2023-10-14 03:33:53,303][33226] Updated weights for policy 1, policy_version 63450 (0.0010) [2023-10-14 03:33:53,816][33201] Updated weights for policy 0, policy_version 62850 (0.0009) [2023-10-14 03:33:54,198][33201] Updated weights for policy 0, policy_version 62860 (0.0011) [2023-10-14 03:33:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 129335296. Throughput: 0: 1776.8, 1: 1774.0. Samples: 32345742. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:33:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:33:54,568][33201] Updated weights for policy 0, policy_version 62870 (0.0010) [2023-10-14 03:33:54,945][33201] Updated weights for policy 0, policy_version 62880 (0.0010) [2023-10-14 03:33:57,101][33226] Updated weights for policy 1, policy_version 63460 (0.0007) [2023-10-14 03:33:57,478][33226] Updated weights for policy 1, policy_version 63470 (0.0009) [2023-10-14 03:33:57,847][33226] Updated weights for policy 1, policy_version 63480 (0.0009) [2023-10-14 03:33:58,892][33201] Updated weights for policy 0, policy_version 62890 (0.0007) [2023-10-14 03:33:59,265][33201] Updated weights for policy 0, policy_version 62900 (0.0007) [2023-10-14 03:33:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 129400832. Throughput: 0: 1754.0, 1: 1802.0. Samples: 32357142. Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:33:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 03:33:59,637][33201] Updated weights for policy 0, policy_version 62910 (0.0009) [2023-10-14 03:34:01,859][33226] Updated weights for policy 1, policy_version 63490 (0.0008) [2023-10-14 03:34:02,237][33226] Updated weights for policy 1, policy_version 63500 (0.0008) [2023-10-14 03:34:02,599][33226] Updated weights for policy 1, policy_version 63510 (0.0007) [2023-10-14 03:34:02,970][33226] Updated weights for policy 1, policy_version 63520 (0.0007) [2023-10-14 03:34:03,498][33201] Updated weights for policy 0, policy_version 62920 (0.0010) [2023-10-14 03:34:03,872][33201] Updated weights for policy 0, policy_version 62930 (0.0008) [2023-10-14 03:34:04,244][33201] Updated weights for policy 0, policy_version 62940 (0.0008) [2023-10-14 03:34:04,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 129499136. Throughput: 0: 1785.8, 1: 1765.1. 
Samples: 32377622. Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 03:34:06,828][33226] Updated weights for policy 1, policy_version 63530 (0.0011) [2023-10-14 03:34:07,199][33226] Updated weights for policy 1, policy_version 63540 (0.0011) [2023-10-14 03:34:07,576][33226] Updated weights for policy 1, policy_version 63550 (0.0010) [2023-10-14 03:34:08,100][33201] Updated weights for policy 0, policy_version 62950 (0.0010) [2023-10-14 03:34:08,462][33201] Updated weights for policy 0, policy_version 62960 (0.0008) [2023-10-14 03:34:08,835][33201] Updated weights for policy 0, policy_version 62970 (0.0009) [2023-10-14 03:34:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 129564672. Throughput: 0: 1753.2, 1: 1758.4. Samples: 32398202. Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 03:34:11,344][33226] Updated weights for policy 1, policy_version 63560 (0.0010) [2023-10-14 03:34:11,710][33226] Updated weights for policy 1, policy_version 63570 (0.0007) [2023-10-14 03:34:12,083][33226] Updated weights for policy 1, policy_version 63580 (0.0007) [2023-10-14 03:34:12,734][33201] Updated weights for policy 0, policy_version 62980 (0.0008) [2023-10-14 03:34:13,103][33201] Updated weights for policy 0, policy_version 62990 (0.0008) [2023-10-14 03:34:13,477][33201] Updated weights for policy 0, policy_version 63000 (0.0008) [2023-10-14 03:34:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 129630208. Throughput: 0: 1785.7, 1: 1768.9. Samples: 32409850. 
Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 03:34:15,594][33226] Updated weights for policy 1, policy_version 63590 (0.0007) [2023-10-14 03:34:15,964][33226] Updated weights for policy 1, policy_version 63600 (0.0007) [2023-10-14 03:34:16,328][33226] Updated weights for policy 1, policy_version 63610 (0.0008) [2023-10-14 03:34:17,008][33201] Updated weights for policy 0, policy_version 63010 (0.0008) [2023-10-14 03:34:17,372][33201] Updated weights for policy 0, policy_version 63020 (0.0008) [2023-10-14 03:34:17,747][33201] Updated weights for policy 0, policy_version 63030 (0.0008) [2023-10-14 03:34:18,116][33201] Updated weights for policy 0, policy_version 63040 (0.0007) [2023-10-14 03:34:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 129695744. Throughput: 0: 1760.4, 1: 1769.0. Samples: 32430952. Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:19,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:34:20,082][33226] Updated weights for policy 1, policy_version 63620 (0.0008) [2023-10-14 03:34:20,443][33226] Updated weights for policy 1, policy_version 63630 (0.0009) [2023-10-14 03:34:20,810][33226] Updated weights for policy 1, policy_version 63640 (0.0008) [2023-10-14 03:34:21,947][33201] Updated weights for policy 0, policy_version 63050 (0.0008) [2023-10-14 03:34:22,314][33201] Updated weights for policy 0, policy_version 63060 (0.0007) [2023-10-14 03:34:22,687][33201] Updated weights for policy 0, policy_version 63070 (0.0007) [2023-10-14 03:34:24,448][33226] Updated weights for policy 1, policy_version 63650 (0.0008) [2023-10-14 03:34:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 129761280. Throughput: 0: 1757.9, 1: 1797.3. Samples: 32453492. 
Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:24,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:34:24,810][33226] Updated weights for policy 1, policy_version 63660 (0.0008) [2023-10-14 03:34:25,174][33226] Updated weights for policy 1, policy_version 63670 (0.0008) [2023-10-14 03:34:25,539][33226] Updated weights for policy 1, policy_version 63680 (0.0007) [2023-10-14 03:34:26,421][33201] Updated weights for policy 0, policy_version 63080 (0.0008) [2023-10-14 03:34:26,785][33201] Updated weights for policy 0, policy_version 63090 (0.0007) [2023-10-14 03:34:27,160][33201] Updated weights for policy 0, policy_version 63100 (0.0007) [2023-10-14 03:34:29,412][33226] Updated weights for policy 1, policy_version 63690 (0.0009) [2023-10-14 03:34:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 129826816. Throughput: 0: 1768.0, 1: 1782.3. Samples: 32463448. Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 03:34:29,787][33226] Updated weights for policy 1, policy_version 63700 (0.0009) [2023-10-14 03:34:30,156][33226] Updated weights for policy 1, policy_version 63710 (0.0012) [2023-10-14 03:34:30,899][33201] Updated weights for policy 0, policy_version 63110 (0.0008) [2023-10-14 03:34:31,267][33201] Updated weights for policy 0, policy_version 63120 (0.0007) [2023-10-14 03:34:31,637][33201] Updated weights for policy 0, policy_version 63130 (0.0007) [2023-10-14 03:34:34,035][33226] Updated weights for policy 1, policy_version 63720 (0.0009) [2023-10-14 03:34:34,402][33226] Updated weights for policy 1, policy_version 63730 (0.0007) [2023-10-14 03:34:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 129892352. Throughput: 0: 1764.7, 1: 1786.6. Samples: 32485006. 
Policy #0 lag: (min: 18.0, avg: 18.2, max: 27.0) [2023-10-14 03:34:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 03:34:34,769][33226] Updated weights for policy 1, policy_version 63740 (0.0007) [2023-10-14 03:34:35,516][33201] Updated weights for policy 0, policy_version 63140 (0.0008) [2023-10-14 03:34:35,878][33201] Updated weights for policy 0, policy_version 63150 (0.0009) [2023-10-14 03:34:36,247][33201] Updated weights for policy 0, policy_version 63160 (0.0009) [2023-10-14 03:34:38,307][33226] Updated weights for policy 1, policy_version 63750 (0.0008) [2023-10-14 03:34:38,672][33226] Updated weights for policy 1, policy_version 63760 (0.0007) [2023-10-14 03:34:39,046][33226] Updated weights for policy 1, policy_version 63770 (0.0007) [2023-10-14 03:34:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 129990656. Throughput: 0: 1774.4, 1: 1798.5. Samples: 32506522. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:34:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 03:34:40,284][33201] Updated weights for policy 0, policy_version 63170 (0.0008) [2023-10-14 03:34:40,654][33201] Updated weights for policy 0, policy_version 63180 (0.0008) [2023-10-14 03:34:41,025][33201] Updated weights for policy 0, policy_version 63190 (0.0009) [2023-10-14 03:34:41,394][33201] Updated weights for policy 0, policy_version 63200 (0.0008) [2023-10-14 03:34:42,680][33226] Updated weights for policy 1, policy_version 63780 (0.0009) [2023-10-14 03:34:43,053][33226] Updated weights for policy 1, policy_version 63790 (0.0009) [2023-10-14 03:34:43,426][33226] Updated weights for policy 1, policy_version 63800 (0.0009) [2023-10-14 03:34:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130056192. Throughput: 0: 1764.9, 1: 1794.9. Samples: 32517334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:34:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 03:34:45,172][33201] Updated weights for policy 0, policy_version 63210 (0.0010) [2023-10-14 03:34:45,537][33201] Updated weights for policy 0, policy_version 63220 (0.0008) [2023-10-14 03:34:45,923][33201] Updated weights for policy 0, policy_version 63230 (0.0008) [2023-10-14 03:34:47,199][33226] Updated weights for policy 1, policy_version 63810 (0.0008) [2023-10-14 03:34:47,567][33226] Updated weights for policy 1, policy_version 63820 (0.0009) [2023-10-14 03:34:47,920][33226] Updated weights for policy 1, policy_version 63830 (0.0010) [2023-10-14 03:34:48,295][33226] Updated weights for policy 1, policy_version 63840 (0.0011) [2023-10-14 03:34:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 130121728. Throughput: 0: 1767.7, 1: 1813.8. Samples: 32538790. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:34:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 03:34:49,804][33201] Updated weights for policy 0, policy_version 63240 (0.0010) [2023-10-14 03:34:50,171][33201] Updated weights for policy 0, policy_version 63250 (0.0010) [2023-10-14 03:34:50,544][33201] Updated weights for policy 0, policy_version 63260 (0.0010) [2023-10-14 03:34:51,962][33226] Updated weights for policy 1, policy_version 63850 (0.0009) [2023-10-14 03:34:52,331][33226] Updated weights for policy 1, policy_version 63860 (0.0010) [2023-10-14 03:34:52,707][33226] Updated weights for policy 1, policy_version 63870 (0.0009) [2023-10-14 03:34:54,327][33201] Updated weights for policy 0, policy_version 63270 (0.0010) [2023-10-14 03:34:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 130187264. Throughput: 0: 1797.3, 1: 1805.4. Samples: 32560324. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:34:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 03:34:54,695][33201] Updated weights for policy 0, policy_version 63280 (0.0011) [2023-10-14 03:34:55,060][33201] Updated weights for policy 0, policy_version 63290 (0.0009) [2023-10-14 03:34:56,522][33226] Updated weights for policy 1, policy_version 63880 (0.0007) [2023-10-14 03:34:56,899][33226] Updated weights for policy 1, policy_version 63890 (0.0009) [2023-10-14 03:34:57,272][33226] Updated weights for policy 1, policy_version 63900 (0.0007) [2023-10-14 03:34:58,910][33201] Updated weights for policy 0, policy_version 63300 (0.0007) [2023-10-14 03:34:59,269][33201] Updated weights for policy 0, policy_version 63310 (0.0007) [2023-10-14 03:34:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 130252800. Throughput: 0: 1761.9, 1: 1810.5. Samples: 32570608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:34:59,560][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 03:34:59,644][33201] Updated weights for policy 0, policy_version 63320 (0.0008) [2023-10-14 03:35:01,040][33226] Updated weights for policy 1, policy_version 63910 (0.0009) [2023-10-14 03:35:01,408][33226] Updated weights for policy 1, policy_version 63920 (0.0008) [2023-10-14 03:35:01,773][33226] Updated weights for policy 1, policy_version 63930 (0.0009) [2023-10-14 03:35:03,343][33201] Updated weights for policy 0, policy_version 63330 (0.0009) [2023-10-14 03:35:03,714][33201] Updated weights for policy 0, policy_version 63340 (0.0008) [2023-10-14 03:35:04,095][33201] Updated weights for policy 0, policy_version 63350 (0.0008) [2023-10-14 03:35:04,460][33201] Updated weights for policy 0, policy_version 63360 (0.0007) [2023-10-14 03:35:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130351104. Throughput: 0: 1794.7, 1: 1794.7. 
Samples: 32592478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 03:35:05,699][33226] Updated weights for policy 1, policy_version 63940 (0.0008) [2023-10-14 03:35:06,057][33226] Updated weights for policy 1, policy_version 63950 (0.0009) [2023-10-14 03:35:06,428][33226] Updated weights for policy 1, policy_version 63960 (0.0008) [2023-10-14 03:35:08,114][33201] Updated weights for policy 0, policy_version 63370 (0.0007) [2023-10-14 03:35:08,479][33201] Updated weights for policy 0, policy_version 63380 (0.0007) [2023-10-14 03:35:08,844][33201] Updated weights for policy 0, policy_version 63390 (0.0009) [2023-10-14 03:35:09,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130416640. Throughput: 0: 1764.7, 1: 1783.0. Samples: 32613140. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:35:10,143][33226] Updated weights for policy 1, policy_version 63970 (0.0009) [2023-10-14 03:35:10,513][33226] Updated weights for policy 1, policy_version 63980 (0.0008) [2023-10-14 03:35:10,875][33226] Updated weights for policy 1, policy_version 63990 (0.0007) [2023-10-14 03:35:11,244][33226] Updated weights for policy 1, policy_version 64000 (0.0008) [2023-10-14 03:35:12,788][33201] Updated weights for policy 0, policy_version 63400 (0.0009) [2023-10-14 03:35:13,164][33201] Updated weights for policy 0, policy_version 63410 (0.0008) [2023-10-14 03:35:13,535][33201] Updated weights for policy 0, policy_version 63420 (0.0008) [2023-10-14 03:35:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130482176. Throughput: 0: 1787.4, 1: 1786.9. Samples: 32624290. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.970')] [2023-10-14 03:35:15,083][33226] Updated weights for policy 1, policy_version 64010 (0.0008) [2023-10-14 03:35:15,451][33226] Updated weights for policy 1, policy_version 64020 (0.0008) [2023-10-14 03:35:15,822][33226] Updated weights for policy 1, policy_version 64030 (0.0007) [2023-10-14 03:35:17,150][33201] Updated weights for policy 0, policy_version 63430 (0.0009) [2023-10-14 03:35:17,527][33201] Updated weights for policy 0, policy_version 63440 (0.0011) [2023-10-14 03:35:17,896][33201] Updated weights for policy 0, policy_version 63450 (0.0010) [2023-10-14 03:35:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130547712. Throughput: 0: 1763.1, 1: 1793.5. Samples: 32645054. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:19,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.970')] [2023-10-14 03:35:19,566][33226] Updated weights for policy 1, policy_version 64040 (0.0010) [2023-10-14 03:35:19,927][33226] Updated weights for policy 1, policy_version 64050 (0.0010) [2023-10-14 03:35:20,290][33226] Updated weights for policy 1, policy_version 64060 (0.0011) [2023-10-14 03:35:21,829][33201] Updated weights for policy 0, policy_version 63460 (0.0008) [2023-10-14 03:35:22,194][33201] Updated weights for policy 0, policy_version 63470 (0.0007) [2023-10-14 03:35:22,571][33201] Updated weights for policy 0, policy_version 63480 (0.0007) [2023-10-14 03:35:24,076][33226] Updated weights for policy 1, policy_version 64070 (0.0008) [2023-10-14 03:35:24,449][33226] Updated weights for policy 1, policy_version 64080 (0.0008) [2023-10-14 03:35:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 130613248. Throughput: 0: 1757.5, 1: 1804.7. Samples: 32666822. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:24,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.970')] [2023-10-14 03:35:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000063488_65011712.pth... [2023-10-14 03:35:24,613][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000061856_63340544.pth [2023-10-14 03:35:24,815][33226] Updated weights for policy 1, policy_version 64090 (0.0009) [2023-10-14 03:35:25,034][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000064096_65634304.pth... [2023-10-14 03:35:25,062][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000062400_63897600.pth [2023-10-14 03:35:26,376][33201] Updated weights for policy 0, policy_version 63490 (0.0009) [2023-10-14 03:35:26,754][33201] Updated weights for policy 0, policy_version 63500 (0.0007) [2023-10-14 03:35:27,123][33201] Updated weights for policy 0, policy_version 63510 (0.0007) [2023-10-14 03:35:27,497][33201] Updated weights for policy 0, policy_version 63520 (0.0009) [2023-10-14 03:35:28,641][33226] Updated weights for policy 1, policy_version 64100 (0.0009) [2023-10-14 03:35:29,004][33226] Updated weights for policy 1, policy_version 64110 (0.0008) [2023-10-14 03:35:29,374][33226] Updated weights for policy 1, policy_version 64120 (0.0008) [2023-10-14 03:35:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 130678784. Throughput: 0: 1769.6, 1: 1781.0. Samples: 32677110. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:29,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.970')] [2023-10-14 03:35:31,359][33201] Updated weights for policy 0, policy_version 63530 (0.0009) [2023-10-14 03:35:31,744][33201] Updated weights for policy 0, policy_version 63540 (0.0008) [2023-10-14 03:35:32,115][33201] Updated weights for policy 0, policy_version 63550 (0.0009) [2023-10-14 03:35:33,095][33226] Updated weights for policy 1, policy_version 64130 (0.0009) [2023-10-14 03:35:33,467][33226] Updated weights for policy 1, policy_version 64140 (0.0007) [2023-10-14 03:35:33,833][33226] Updated weights for policy 1, policy_version 64150 (0.0007) [2023-10-14 03:35:34,189][33226] Updated weights for policy 1, policy_version 64160 (0.0008) [2023-10-14 03:35:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 130777088. Throughput: 0: 1758.9, 1: 1800.4. Samples: 32698958. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:34,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.970')] [2023-10-14 03:35:36,248][33201] Updated weights for policy 0, policy_version 63560 (0.0009) [2023-10-14 03:35:36,622][33201] Updated weights for policy 0, policy_version 63570 (0.0009) [2023-10-14 03:35:36,996][33201] Updated weights for policy 0, policy_version 63580 (0.0011) [2023-10-14 03:35:37,944][33226] Updated weights for policy 1, policy_version 64170 (0.0009) [2023-10-14 03:35:38,321][33226] Updated weights for policy 1, policy_version 64180 (0.0009) [2023-10-14 03:35:38,676][33226] Updated weights for policy 1, policy_version 64190 (0.0009) [2023-10-14 03:35:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 130842624. Throughput: 0: 1758.0, 1: 1779.6. Samples: 32719514. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:35:40,806][33201] Updated weights for policy 0, policy_version 63590 (0.0009) [2023-10-14 03:35:41,180][33201] Updated weights for policy 0, policy_version 63600 (0.0008) [2023-10-14 03:35:41,546][33201] Updated weights for policy 0, policy_version 63610 (0.0007) [2023-10-14 03:35:42,561][33226] Updated weights for policy 1, policy_version 64200 (0.0008) [2023-10-14 03:35:42,918][33226] Updated weights for policy 1, policy_version 64210 (0.0007) [2023-10-14 03:35:43,287][33226] Updated weights for policy 1, policy_version 64220 (0.0009) [2023-10-14 03:35:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 130908160. Throughput: 0: 1755.6, 1: 1799.0. Samples: 32730568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:35:45,370][33201] Updated weights for policy 0, policy_version 63620 (0.0009) [2023-10-14 03:35:45,741][33201] Updated weights for policy 0, policy_version 63630 (0.0008) [2023-10-14 03:35:46,114][33201] Updated weights for policy 0, policy_version 63640 (0.0007) [2023-10-14 03:35:47,142][33226] Updated weights for policy 1, policy_version 64230 (0.0009) [2023-10-14 03:35:47,504][33226] Updated weights for policy 1, policy_version 64240 (0.0008) [2023-10-14 03:35:47,865][33226] Updated weights for policy 1, policy_version 64250 (0.0011) [2023-10-14 03:35:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 130973696. Throughput: 0: 1750.1, 1: 1783.2. Samples: 32751478. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:35:49,765][33201] Updated weights for policy 0, policy_version 63650 (0.0007) [2023-10-14 03:35:50,132][33201] Updated weights for policy 0, policy_version 63660 (0.0010) [2023-10-14 03:35:50,495][33201] Updated weights for policy 0, policy_version 63670 (0.0007) [2023-10-14 03:35:50,861][33201] Updated weights for policy 0, policy_version 63680 (0.0007) [2023-10-14 03:35:51,530][33226] Updated weights for policy 1, policy_version 64260 (0.0011) [2023-10-14 03:35:51,903][33226] Updated weights for policy 1, policy_version 64270 (0.0009) [2023-10-14 03:35:52,265][33226] Updated weights for policy 1, policy_version 64280 (0.0007) [2023-10-14 03:35:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 131039232. Throughput: 0: 1780.8, 1: 1786.8. Samples: 32773684. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:35:54,635][33201] Updated weights for policy 0, policy_version 63690 (0.0007) [2023-10-14 03:35:55,017][33201] Updated weights for policy 0, policy_version 63700 (0.0007) [2023-10-14 03:35:55,381][33201] Updated weights for policy 0, policy_version 63710 (0.0013) [2023-10-14 03:35:56,098][33226] Updated weights for policy 1, policy_version 64290 (0.0007) [2023-10-14 03:35:56,475][33226] Updated weights for policy 1, policy_version 64300 (0.0009) [2023-10-14 03:35:56,839][33226] Updated weights for policy 1, policy_version 64310 (0.0008) [2023-10-14 03:35:57,203][33226] Updated weights for policy 1, policy_version 64320 (0.0008) [2023-10-14 03:35:59,119][33201] Updated weights for policy 0, policy_version 63720 (0.0007) [2023-10-14 03:35:59,492][33201] Updated weights for policy 0, policy_version 63730 (0.0008) [2023-10-14 03:35:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 
14199.5, 300 sec: 14106.9). Total num frames: 131104768. Throughput: 0: 1748.2, 1: 1789.6. Samples: 32783494. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:35:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:35:59,868][33201] Updated weights for policy 0, policy_version 63740 (0.0009) [2023-10-14 03:36:00,910][33226] Updated weights for policy 1, policy_version 64330 (0.0009) [2023-10-14 03:36:01,279][33226] Updated weights for policy 1, policy_version 64340 (0.0009) [2023-10-14 03:36:01,650][33226] Updated weights for policy 1, policy_version 64350 (0.0009) [2023-10-14 03:36:03,815][33201] Updated weights for policy 0, policy_version 63750 (0.0008) [2023-10-14 03:36:04,188][33201] Updated weights for policy 0, policy_version 63760 (0.0008) [2023-10-14 03:36:04,554][33201] Updated weights for policy 0, policy_version 63770 (0.0009) [2023-10-14 03:36:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 131170304. Throughput: 0: 1783.0, 1: 1783.1. Samples: 32805528. Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:04,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 03:36:05,466][33226] Updated weights for policy 1, policy_version 64360 (0.0009) [2023-10-14 03:36:05,829][33226] Updated weights for policy 1, policy_version 64370 (0.0011) [2023-10-14 03:36:06,196][33226] Updated weights for policy 1, policy_version 64380 (0.0010) [2023-10-14 03:36:08,364][33201] Updated weights for policy 0, policy_version 63780 (0.0008) [2023-10-14 03:36:08,744][33201] Updated weights for policy 0, policy_version 63790 (0.0007) [2023-10-14 03:36:09,112][33201] Updated weights for policy 0, policy_version 63800 (0.0007) [2023-10-14 03:36:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 131268608. Throughput: 0: 1765.3, 1: 1784.6. Samples: 32826566. 
Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:09,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 03:36:10,115][33226] Updated weights for policy 1, policy_version 64390 (0.0008) [2023-10-14 03:36:10,480][33226] Updated weights for policy 1, policy_version 64400 (0.0011) [2023-10-14 03:36:10,842][33226] Updated weights for policy 1, policy_version 64410 (0.0008) [2023-10-14 03:36:12,850][33201] Updated weights for policy 0, policy_version 63810 (0.0009) [2023-10-14 03:36:13,229][33201] Updated weights for policy 0, policy_version 63820 (0.0009) [2023-10-14 03:36:13,597][33201] Updated weights for policy 0, policy_version 63830 (0.0008) [2023-10-14 03:36:13,970][33201] Updated weights for policy 0, policy_version 63840 (0.0008) [2023-10-14 03:36:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 131334144. Throughput: 0: 1776.8, 1: 1776.8. Samples: 32837020. Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.980')] [2023-10-14 03:36:14,597][33226] Updated weights for policy 1, policy_version 64420 (0.0008) [2023-10-14 03:36:14,959][33226] Updated weights for policy 1, policy_version 64430 (0.0009) [2023-10-14 03:36:15,327][33226] Updated weights for policy 1, policy_version 64440 (0.0009) [2023-10-14 03:36:17,875][33201] Updated weights for policy 0, policy_version 63850 (0.0009) [2023-10-14 03:36:18,247][33201] Updated weights for policy 0, policy_version 63860 (0.0008) [2023-10-14 03:36:18,613][33201] Updated weights for policy 0, policy_version 63870 (0.0010) [2023-10-14 03:36:19,076][33226] Updated weights for policy 1, policy_version 64450 (0.0008) [2023-10-14 03:36:19,448][33226] Updated weights for policy 1, policy_version 64460 (0.0010) [2023-10-14 03:36:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 131399680. Throughput: 0: 1769.0, 1: 1772.9. 
Samples: 32858346. Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:19,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.980')] [2023-10-14 03:36:19,820][33226] Updated weights for policy 1, policy_version 64470 (0.0009) [2023-10-14 03:36:20,182][33226] Updated weights for policy 1, policy_version 64480 (0.0009) [2023-10-14 03:36:22,533][33201] Updated weights for policy 0, policy_version 63880 (0.0008) [2023-10-14 03:36:22,910][33201] Updated weights for policy 0, policy_version 63890 (0.0008) [2023-10-14 03:36:23,272][33201] Updated weights for policy 0, policy_version 63900 (0.0010) [2023-10-14 03:36:23,938][33226] Updated weights for policy 1, policy_version 64490 (0.0009) [2023-10-14 03:36:24,309][33226] Updated weights for policy 1, policy_version 64500 (0.0009) [2023-10-14 03:36:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 131465216. Throughput: 0: 1753.9, 1: 1799.7. Samples: 32879424. Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:24,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.980')] [2023-10-14 03:36:24,679][33226] Updated weights for policy 1, policy_version 64510 (0.0009) [2023-10-14 03:36:27,025][33201] Updated weights for policy 0, policy_version 63910 (0.0009) [2023-10-14 03:36:27,397][33201] Updated weights for policy 0, policy_version 63920 (0.0008) [2023-10-14 03:36:27,760][33201] Updated weights for policy 0, policy_version 63930 (0.0009) [2023-10-14 03:36:28,465][33226] Updated weights for policy 1, policy_version 64520 (0.0008) [2023-10-14 03:36:28,837][33226] Updated weights for policy 1, policy_version 64530 (0.0008) [2023-10-14 03:36:29,197][33226] Updated weights for policy 1, policy_version 64540 (0.0007) [2023-10-14 03:36:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 131563520. Throughput: 0: 1783.3, 1: 1774.5. Samples: 32890666. 
Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:29,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.980')] [2023-10-14 03:36:31,379][33201] Updated weights for policy 0, policy_version 63940 (0.0008) [2023-10-14 03:36:31,743][33201] Updated weights for policy 0, policy_version 63950 (0.0007) [2023-10-14 03:36:32,118][33201] Updated weights for policy 0, policy_version 63960 (0.0008) [2023-10-14 03:36:33,056][33226] Updated weights for policy 1, policy_version 64550 (0.0008) [2023-10-14 03:36:33,420][33226] Updated weights for policy 1, policy_version 64560 (0.0007) [2023-10-14 03:36:33,795][33226] Updated weights for policy 1, policy_version 64570 (0.0007) [2023-10-14 03:36:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 131629056. Throughput: 0: 1756.1, 1: 1801.6. Samples: 32911576. Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:34,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.980')] [2023-10-14 03:36:36,172][33201] Updated weights for policy 0, policy_version 63970 (0.0008) [2023-10-14 03:36:36,552][33201] Updated weights for policy 0, policy_version 63980 (0.0009) [2023-10-14 03:36:36,924][33201] Updated weights for policy 0, policy_version 63990 (0.0009) [2023-10-14 03:36:37,289][33201] Updated weights for policy 0, policy_version 64000 (0.0008) [2023-10-14 03:36:37,518][33226] Updated weights for policy 1, policy_version 64580 (0.0008) [2023-10-14 03:36:37,894][33226] Updated weights for policy 1, policy_version 64590 (0.0007) [2023-10-14 03:36:38,253][33226] Updated weights for policy 1, policy_version 64600 (0.0009) [2023-10-14 03:36:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 131694592. Throughput: 0: 1756.4, 1: 1770.0. Samples: 32932372. 
Policy #0 lag: (min: 21.0, avg: 24.0, max: 53.0) [2023-10-14 03:36:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.980')] [2023-10-14 03:36:41,245][33201] Updated weights for policy 0, policy_version 64010 (0.0010) [2023-10-14 03:36:41,616][33201] Updated weights for policy 0, policy_version 64020 (0.0010) [2023-10-14 03:36:41,990][33201] Updated weights for policy 0, policy_version 64030 (0.0009) [2023-10-14 03:36:41,990][33226] Updated weights for policy 1, policy_version 64610 (0.0008) [2023-10-14 03:36:42,361][33226] Updated weights for policy 1, policy_version 64620 (0.0008) [2023-10-14 03:36:42,720][33226] Updated weights for policy 1, policy_version 64630 (0.0008) [2023-10-14 03:36:43,079][33226] Updated weights for policy 1, policy_version 64640 (0.0008) [2023-10-14 03:36:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 131760128. Throughput: 0: 1759.3, 1: 1797.0. Samples: 32943528. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:36:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:36:45,893][33201] Updated weights for policy 0, policy_version 64040 (0.0008) [2023-10-14 03:36:46,262][33201] Updated weights for policy 0, policy_version 64050 (0.0007) [2023-10-14 03:36:46,639][33201] Updated weights for policy 0, policy_version 64060 (0.0008) [2023-10-14 03:36:46,792][33226] Updated weights for policy 1, policy_version 64650 (0.0008) [2023-10-14 03:36:47,158][33226] Updated weights for policy 1, policy_version 64660 (0.0008) [2023-10-14 03:36:47,522][33226] Updated weights for policy 1, policy_version 64670 (0.0008) [2023-10-14 03:36:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 131825664. Throughput: 0: 1753.1, 1: 1771.9. Samples: 32964154. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:36:49,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.970')] [2023-10-14 03:36:50,620][33201] Updated weights for policy 0, policy_version 64070 (0.0009) [2023-10-14 03:36:50,987][33201] Updated weights for policy 0, policy_version 64080 (0.0007) [2023-10-14 03:36:51,349][33201] Updated weights for policy 0, policy_version 64090 (0.0009) [2023-10-14 03:36:51,567][33226] Updated weights for policy 1, policy_version 64680 (0.0007) [2023-10-14 03:36:51,961][33226] Updated weights for policy 1, policy_version 64690 (0.0007) [2023-10-14 03:36:52,332][33226] Updated weights for policy 1, policy_version 64700 (0.0008) [2023-10-14 03:36:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 131891200. Throughput: 0: 1770.4, 1: 1771.2. Samples: 32985934. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:36:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.970')] [2023-10-14 03:36:55,234][33201] Updated weights for policy 0, policy_version 64100 (0.0007) [2023-10-14 03:36:55,603][33201] Updated weights for policy 0, policy_version 64110 (0.0008) [2023-10-14 03:36:55,977][33201] Updated weights for policy 0, policy_version 64120 (0.0008) [2023-10-14 03:36:55,996][33226] Updated weights for policy 1, policy_version 64710 (0.0008) [2023-10-14 03:36:56,364][33226] Updated weights for policy 1, policy_version 64720 (0.0007) [2023-10-14 03:36:56,729][33226] Updated weights for policy 1, policy_version 64730 (0.0008) [2023-10-14 03:36:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 131956736. Throughput: 0: 1746.8, 1: 1774.0. Samples: 32995458. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:36:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:36:59,649][33201] Updated weights for policy 0, policy_version 64130 (0.0008) [2023-10-14 03:37:00,019][33201] Updated weights for policy 0, policy_version 64140 (0.0009) [2023-10-14 03:37:00,400][33201] Updated weights for policy 0, policy_version 64150 (0.0009) [2023-10-14 03:37:00,617][33226] Updated weights for policy 1, policy_version 64740 (0.0009) [2023-10-14 03:37:00,769][33201] Updated weights for policy 0, policy_version 64160 (0.0008) [2023-10-14 03:37:00,985][33226] Updated weights for policy 1, policy_version 64750 (0.0007) [2023-10-14 03:37:01,357][33226] Updated weights for policy 1, policy_version 64760 (0.0010) [2023-10-14 03:37:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 132022272. Throughput: 0: 1768.6, 1: 1764.4. Samples: 33017334. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:37:04,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:37:04,594][33201] Updated weights for policy 0, policy_version 64170 (0.0010) [2023-10-14 03:37:04,964][33201] Updated weights for policy 0, policy_version 64180 (0.0008) [2023-10-14 03:37:05,256][33226] Updated weights for policy 1, policy_version 64770 (0.0007) [2023-10-14 03:37:05,331][33201] Updated weights for policy 0, policy_version 64190 (0.0009) [2023-10-14 03:37:05,625][33226] Updated weights for policy 1, policy_version 64780 (0.0009) [2023-10-14 03:37:05,994][33226] Updated weights for policy 1, policy_version 64790 (0.0007) [2023-10-14 03:37:06,357][33226] Updated weights for policy 1, policy_version 64800 (0.0010) [2023-10-14 03:37:09,154][33201] Updated weights for policy 0, policy_version 64200 (0.0007) [2023-10-14 03:37:09,538][33201] Updated weights for policy 0, policy_version 64210 (0.0008) [2023-10-14 03:37:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 
13653.4, 300 sec: 14106.9). Total num frames: 132087808. Throughput: 0: 1779.4, 1: 1772.2. Samples: 33039246. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:37:09,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:37:09,898][33201] Updated weights for policy 0, policy_version 64220 (0.0008) [2023-10-14 03:37:10,075][33226] Updated weights for policy 1, policy_version 64810 (0.0008) [2023-10-14 03:37:10,449][33226] Updated weights for policy 1, policy_version 64820 (0.0008) [2023-10-14 03:37:10,818][33226] Updated weights for policy 1, policy_version 64830 (0.0009) [2023-10-14 03:37:13,622][33201] Updated weights for policy 0, policy_version 64230 (0.0009) [2023-10-14 03:37:13,984][33201] Updated weights for policy 0, policy_version 64240 (0.0009) [2023-10-14 03:37:14,353][33201] Updated weights for policy 0, policy_version 64250 (0.0009) [2023-10-14 03:37:14,554][33226] Updated weights for policy 1, policy_version 64840 (0.0010) [2023-10-14 03:37:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 132153344. Throughput: 0: 1757.3, 1: 1761.9. Samples: 33049030. 
Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:37:14,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:37:14,913][33226] Updated weights for policy 1, policy_version 64850 (0.0011) [2023-10-14 03:37:15,280][33226] Updated weights for policy 1, policy_version 64860 (0.0010) [2023-10-14 03:37:18,064][33201] Updated weights for policy 0, policy_version 64260 (0.0007) [2023-10-14 03:37:18,439][33201] Updated weights for policy 0, policy_version 64270 (0.0008) [2023-10-14 03:37:18,804][33201] Updated weights for policy 0, policy_version 64280 (0.0009) [2023-10-14 03:37:19,099][33226] Updated weights for policy 1, policy_version 64870 (0.0010) [2023-10-14 03:37:19,461][33226] Updated weights for policy 1, policy_version 64880 (0.0009) [2023-10-14 03:37:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 132251648. Throughput: 0: 1780.6, 1: 1768.4. Samples: 33071280. Policy #0 lag: (min: 30.0, avg: 30.0, max: 30.0) [2023-10-14 03:37:19,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:37:19,831][33226] Updated weights for policy 1, policy_version 64890 (0.0011) [2023-10-14 03:37:22,605][33201] Updated weights for policy 0, policy_version 64290 (0.0007) [2023-10-14 03:37:22,981][33201] Updated weights for policy 0, policy_version 64300 (0.0009) [2023-10-14 03:37:23,351][33201] Updated weights for policy 0, policy_version 64310 (0.0009) [2023-10-14 03:37:23,659][33226] Updated weights for policy 1, policy_version 64900 (0.0010) [2023-10-14 03:37:23,719][33201] Updated weights for policy 0, policy_version 64320 (0.0007) [2023-10-14 03:37:24,022][33226] Updated weights for policy 1, policy_version 64910 (0.0009) [2023-10-14 03:37:24,394][33226] Updated weights for policy 1, policy_version 64920 (0.0009) [2023-10-14 03:37:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 132317184. Throughput: 0: 1757.9, 1: 1789.9. 
Samples: 33092022. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:37:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000064320_65863680.pth... [2023-10-14 03:37:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000062656_64159744.pth [2023-10-14 03:37:24,685][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000064928_66486272.pth... [2023-10-14 03:37:24,724][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000063264_64782336.pth [2023-10-14 03:37:27,564][33201] Updated weights for policy 0, policy_version 64330 (0.0010) [2023-10-14 03:37:27,944][33201] Updated weights for policy 0, policy_version 64340 (0.0009) [2023-10-14 03:37:28,265][33226] Updated weights for policy 1, policy_version 64930 (0.0011) [2023-10-14 03:37:28,312][33201] Updated weights for policy 0, policy_version 64350 (0.0008) [2023-10-14 03:37:28,627][33226] Updated weights for policy 1, policy_version 64940 (0.0009) [2023-10-14 03:37:29,000][33226] Updated weights for policy 1, policy_version 64950 (0.0010) [2023-10-14 03:37:29,366][33226] Updated weights for policy 1, policy_version 64960 (0.0007) [2023-10-14 03:37:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 132415488. Throughput: 0: 1786.9, 1: 1764.1. Samples: 33103322. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:37:32,192][33201] Updated weights for policy 0, policy_version 64360 (0.0008) [2023-10-14 03:37:32,560][33201] Updated weights for policy 0, policy_version 64370 (0.0008) [2023-10-14 03:37:32,937][33201] Updated weights for policy 0, policy_version 64380 (0.0007) [2023-10-14 03:37:33,293][33226] Updated weights for policy 1, policy_version 64970 (0.0009) [2023-10-14 03:37:33,654][33226] Updated weights for policy 1, policy_version 64980 (0.0011) [2023-10-14 03:37:34,017][33226] Updated weights for policy 1, policy_version 64990 (0.0011) [2023-10-14 03:37:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 132481024. Throughput: 0: 1759.7, 1: 1789.7. Samples: 33123878. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:34,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.990')] [2023-10-14 03:37:36,689][33201] Updated weights for policy 0, policy_version 64390 (0.0009) [2023-10-14 03:37:37,051][33201] Updated weights for policy 0, policy_version 64400 (0.0010) [2023-10-14 03:37:37,424][33201] Updated weights for policy 0, policy_version 64410 (0.0008) [2023-10-14 03:37:37,987][33226] Updated weights for policy 1, policy_version 65000 (0.0010) [2023-10-14 03:37:38,369][33226] Updated weights for policy 1, policy_version 65010 (0.0009) [2023-10-14 03:37:38,735][33226] Updated weights for policy 1, policy_version 65020 (0.0008) [2023-10-14 03:37:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 132546560. Throughput: 0: 1767.2, 1: 1753.1. Samples: 33144346. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:39,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 03:37:41,030][33201] Updated weights for policy 0, policy_version 64420 (0.0008) [2023-10-14 03:37:41,393][33201] Updated weights for policy 0, policy_version 64430 (0.0009) [2023-10-14 03:37:41,765][33201] Updated weights for policy 0, policy_version 64440 (0.0010) [2023-10-14 03:37:42,499][33226] Updated weights for policy 1, policy_version 65030 (0.0009) [2023-10-14 03:37:42,876][33226] Updated weights for policy 1, policy_version 65040 (0.0008) [2023-10-14 03:37:43,242][33226] Updated weights for policy 1, policy_version 65050 (0.0008) [2023-10-14 03:37:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 132612096. Throughput: 0: 1771.9, 1: 1784.0. Samples: 33155476. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:44,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 03:37:45,600][33201] Updated weights for policy 0, policy_version 64450 (0.0008) [2023-10-14 03:37:45,970][33201] Updated weights for policy 0, policy_version 64460 (0.0007) [2023-10-14 03:37:46,346][33201] Updated weights for policy 0, policy_version 64470 (0.0008) [2023-10-14 03:37:46,709][33201] Updated weights for policy 0, policy_version 64480 (0.0008) [2023-10-14 03:37:47,051][33226] Updated weights for policy 1, policy_version 65060 (0.0009) [2023-10-14 03:37:47,415][33226] Updated weights for policy 1, policy_version 65070 (0.0010) [2023-10-14 03:37:47,779][33226] Updated weights for policy 1, policy_version 65080 (0.0009) [2023-10-14 03:37:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 132677632. Throughput: 0: 1767.7, 1: 1769.0. Samples: 33176486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 03:37:50,509][33201] Updated weights for policy 0, policy_version 64490 (0.0010) [2023-10-14 03:37:50,877][33201] Updated weights for policy 0, policy_version 64500 (0.0008) [2023-10-14 03:37:51,246][33201] Updated weights for policy 0, policy_version 64510 (0.0007) [2023-10-14 03:37:51,552][33226] Updated weights for policy 1, policy_version 65090 (0.0009) [2023-10-14 03:37:51,921][33226] Updated weights for policy 1, policy_version 65100 (0.0007) [2023-10-14 03:37:52,285][33226] Updated weights for policy 1, policy_version 65110 (0.0007) [2023-10-14 03:37:52,650][33226] Updated weights for policy 1, policy_version 65120 (0.0007) [2023-10-14 03:37:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 132743168. Throughput: 0: 1774.2, 1: 1763.6. Samples: 33198448. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.940')] [2023-10-14 03:37:55,222][33201] Updated weights for policy 0, policy_version 64520 (0.0008) [2023-10-14 03:37:55,600][33201] Updated weights for policy 0, policy_version 64530 (0.0009) [2023-10-14 03:37:55,972][33201] Updated weights for policy 0, policy_version 64540 (0.0009) [2023-10-14 03:37:56,482][33226] Updated weights for policy 1, policy_version 65130 (0.0008) [2023-10-14 03:37:56,843][33226] Updated weights for policy 1, policy_version 65140 (0.0008) [2023-10-14 03:37:57,210][33226] Updated weights for policy 1, policy_version 65150 (0.0008) [2023-10-14 03:37:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 132808704. Throughput: 0: 1765.3, 1: 1774.4. Samples: 33208318. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:37:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.940')] [2023-10-14 03:37:59,761][33201] Updated weights for policy 0, policy_version 64550 (0.0008) [2023-10-14 03:38:00,136][33201] Updated weights for policy 0, policy_version 64560 (0.0009) [2023-10-14 03:38:00,499][33201] Updated weights for policy 0, policy_version 64570 (0.0007) [2023-10-14 03:38:01,047][33226] Updated weights for policy 1, policy_version 65160 (0.0008) [2023-10-14 03:38:01,413][33226] Updated weights for policy 1, policy_version 65170 (0.0009) [2023-10-14 03:38:01,768][33226] Updated weights for policy 1, policy_version 65180 (0.0007) [2023-10-14 03:38:04,522][33201] Updated weights for policy 0, policy_version 64580 (0.0008) [2023-10-14 03:38:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 132874240. Throughput: 0: 1765.1, 1: 1760.6. Samples: 33229934. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:38:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.930')] [2023-10-14 03:38:04,898][33201] Updated weights for policy 0, policy_version 64590 (0.0010) [2023-10-14 03:38:05,259][33201] Updated weights for policy 0, policy_version 64600 (0.0010) [2023-10-14 03:38:05,453][33226] Updated weights for policy 1, policy_version 65190 (0.0007) [2023-10-14 03:38:05,820][33226] Updated weights for policy 1, policy_version 65200 (0.0008) [2023-10-14 03:38:06,190][33226] Updated weights for policy 1, policy_version 65210 (0.0010) [2023-10-14 03:38:08,981][33201] Updated weights for policy 0, policy_version 64610 (0.0008) [2023-10-14 03:38:09,349][33201] Updated weights for policy 0, policy_version 64620 (0.0008) [2023-10-14 03:38:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 132939776. Throughput: 0: 1781.4, 1: 1772.1. Samples: 33251932. 
Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:38:09,718][33201] Updated weights for policy 0, policy_version 64630 (0.0010) [2023-10-14 03:38:09,859][33226] Updated weights for policy 1, policy_version 65220 (0.0009) [2023-10-14 03:38:10,097][33201] Updated weights for policy 0, policy_version 64640 (0.0008) [2023-10-14 03:38:10,226][33226] Updated weights for policy 1, policy_version 65230 (0.0007) [2023-10-14 03:38:10,593][33226] Updated weights for policy 1, policy_version 65240 (0.0009) [2023-10-14 03:38:13,801][33201] Updated weights for policy 0, policy_version 64650 (0.0009) [2023-10-14 03:38:14,170][33201] Updated weights for policy 0, policy_version 64660 (0.0008) [2023-10-14 03:38:14,489][33226] Updated weights for policy 1, policy_version 65250 (0.0009) [2023-10-14 03:38:14,544][33201] Updated weights for policy 0, policy_version 64670 (0.0008) [2023-10-14 03:38:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 133005312. Throughput: 0: 1754.9, 1: 1760.9. Samples: 33261534. 
Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:38:14,854][33226] Updated weights for policy 1, policy_version 65260 (0.0008) [2023-10-14 03:38:15,230][33226] Updated weights for policy 1, policy_version 65270 (0.0008) [2023-10-14 03:38:15,589][33226] Updated weights for policy 1, policy_version 65280 (0.0007) [2023-10-14 03:38:18,460][33201] Updated weights for policy 0, policy_version 64680 (0.0008) [2023-10-14 03:38:18,826][33201] Updated weights for policy 0, policy_version 64690 (0.0007) [2023-10-14 03:38:19,195][33201] Updated weights for policy 0, policy_version 64700 (0.0008) [2023-10-14 03:38:19,465][33226] Updated weights for policy 1, policy_version 65290 (0.0009) [2023-10-14 03:38:19,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 133103616. Throughput: 0: 1788.2, 1: 1763.4. Samples: 33283702. Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:38:19,838][33226] Updated weights for policy 1, policy_version 65300 (0.0012) [2023-10-14 03:38:20,206][33226] Updated weights for policy 1, policy_version 65310 (0.0009) [2023-10-14 03:38:22,870][33201] Updated weights for policy 0, policy_version 64710 (0.0008) [2023-10-14 03:38:23,237][33201] Updated weights for policy 0, policy_version 64720 (0.0007) [2023-10-14 03:38:23,606][33201] Updated weights for policy 0, policy_version 64730 (0.0007) [2023-10-14 03:38:24,019][33226] Updated weights for policy 1, policy_version 65320 (0.0008) [2023-10-14 03:38:24,396][33226] Updated weights for policy 1, policy_version 65330 (0.0010) [2023-10-14 03:38:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 133169152. Throughput: 0: 1752.4, 1: 1793.4. Samples: 33303906. 
Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:24,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:38:24,775][33226] Updated weights for policy 1, policy_version 65340 (0.0011) [2023-10-14 03:38:27,531][33201] Updated weights for policy 0, policy_version 64740 (0.0009) [2023-10-14 03:38:27,894][33201] Updated weights for policy 0, policy_version 64750 (0.0011) [2023-10-14 03:38:28,280][33201] Updated weights for policy 0, policy_version 64760 (0.0009) [2023-10-14 03:38:28,784][33226] Updated weights for policy 1, policy_version 65350 (0.0009) [2023-10-14 03:38:29,152][33226] Updated weights for policy 1, policy_version 65360 (0.0008) [2023-10-14 03:38:29,520][33226] Updated weights for policy 1, policy_version 65370 (0.0008) [2023-10-14 03:38:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 133234688. Throughput: 0: 1781.8, 1: 1763.6. Samples: 33315020. Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.920')] [2023-10-14 03:38:31,983][33201] Updated weights for policy 0, policy_version 64770 (0.0009) [2023-10-14 03:38:32,355][33201] Updated weights for policy 0, policy_version 64780 (0.0007) [2023-10-14 03:38:32,728][33201] Updated weights for policy 0, policy_version 64790 (0.0007) [2023-10-14 03:38:33,099][33201] Updated weights for policy 0, policy_version 64800 (0.0009) [2023-10-14 03:38:33,257][33226] Updated weights for policy 1, policy_version 65380 (0.0009) [2023-10-14 03:38:33,632][33226] Updated weights for policy 1, policy_version 65390 (0.0011) [2023-10-14 03:38:33,996][33226] Updated weights for policy 1, policy_version 65400 (0.0008) [2023-10-14 03:38:34,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 133332992. Throughput: 0: 1754.1, 1: 1791.4. Samples: 33336034. 
Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:34,559][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:38:36,835][33201] Updated weights for policy 0, policy_version 64810 (0.0010) [2023-10-14 03:38:37,204][33201] Updated weights for policy 0, policy_version 64820 (0.0008) [2023-10-14 03:38:37,576][33201] Updated weights for policy 0, policy_version 64830 (0.0009) [2023-10-14 03:38:37,748][33226] Updated weights for policy 1, policy_version 65410 (0.0007) [2023-10-14 03:38:38,122][33226] Updated weights for policy 1, policy_version 65420 (0.0008) [2023-10-14 03:38:38,489][33226] Updated weights for policy 1, policy_version 65430 (0.0008) [2023-10-14 03:38:38,858][33226] Updated weights for policy 1, policy_version 65440 (0.0007) [2023-10-14 03:38:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 133398528. Throughput: 0: 1757.4, 1: 1761.2. Samples: 33356788. Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:38:41,516][33201] Updated weights for policy 0, policy_version 64840 (0.0009) [2023-10-14 03:38:41,886][33201] Updated weights for policy 0, policy_version 64850 (0.0008) [2023-10-14 03:38:42,258][33201] Updated weights for policy 0, policy_version 64860 (0.0007) [2023-10-14 03:38:42,617][33226] Updated weights for policy 1, policy_version 65450 (0.0009) [2023-10-14 03:38:42,994][33226] Updated weights for policy 1, policy_version 65460 (0.0007) [2023-10-14 03:38:43,357][33226] Updated weights for policy 1, policy_version 65470 (0.0007) [2023-10-14 03:38:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 133464064. Throughput: 0: 1769.2, 1: 1782.9. Samples: 33368166. 
Policy #0 lag: (min: 26.0, avg: 27.6, max: 46.0) [2023-10-14 03:38:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:38:46,036][33201] Updated weights for policy 0, policy_version 64870 (0.0008) [2023-10-14 03:38:46,405][33201] Updated weights for policy 0, policy_version 64880 (0.0009) [2023-10-14 03:38:46,776][33201] Updated weights for policy 0, policy_version 64890 (0.0009) [2023-10-14 03:38:47,230][33226] Updated weights for policy 1, policy_version 65480 (0.0010) [2023-10-14 03:38:47,601][33226] Updated weights for policy 1, policy_version 65490 (0.0008) [2023-10-14 03:38:47,967][33226] Updated weights for policy 1, policy_version 65500 (0.0008) [2023-10-14 03:38:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 133529600. Throughput: 0: 1767.0, 1: 1762.7. Samples: 33388768. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:38:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:38:50,527][33201] Updated weights for policy 0, policy_version 64900 (0.0009) [2023-10-14 03:38:50,905][33201] Updated weights for policy 0, policy_version 64910 (0.0009) [2023-10-14 03:38:51,269][33201] Updated weights for policy 0, policy_version 64920 (0.0008) [2023-10-14 03:38:51,920][33226] Updated weights for policy 1, policy_version 65510 (0.0008) [2023-10-14 03:38:52,291][33226] Updated weights for policy 1, policy_version 65520 (0.0008) [2023-10-14 03:38:52,651][33226] Updated weights for policy 1, policy_version 65530 (0.0008) [2023-10-14 03:38:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 133595136. Throughput: 0: 1770.9, 1: 1751.6. Samples: 33410444. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:38:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:38:55,113][33201] Updated weights for policy 0, policy_version 64930 (0.0008) [2023-10-14 03:38:55,477][33201] Updated weights for policy 0, policy_version 64940 (0.0007) [2023-10-14 03:38:55,841][33201] Updated weights for policy 0, policy_version 64950 (0.0007) [2023-10-14 03:38:56,215][33201] Updated weights for policy 0, policy_version 64960 (0.0009) [2023-10-14 03:38:56,411][33226] Updated weights for policy 1, policy_version 65540 (0.0008) [2023-10-14 03:38:56,769][33226] Updated weights for policy 1, policy_version 65550 (0.0009) [2023-10-14 03:38:57,132][33226] Updated weights for policy 1, policy_version 65560 (0.0008) [2023-10-14 03:38:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 133660672. Throughput: 0: 1770.3, 1: 1770.4. Samples: 33420864. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:38:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.910')] [2023-10-14 03:39:00,109][33201] Updated weights for policy 0, policy_version 64970 (0.0009) [2023-10-14 03:39:00,484][33201] Updated weights for policy 0, policy_version 64980 (0.0009) [2023-10-14 03:39:00,859][33201] Updated weights for policy 0, policy_version 64990 (0.0008) [2023-10-14 03:39:00,902][33226] Updated weights for policy 1, policy_version 65570 (0.0011) [2023-10-14 03:39:01,273][33226] Updated weights for policy 1, policy_version 65580 (0.0008) [2023-10-14 03:39:01,641][33226] Updated weights for policy 1, policy_version 65590 (0.0009) [2023-10-14 03:39:02,005][33226] Updated weights for policy 1, policy_version 65600 (0.0008) [2023-10-14 03:39:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 133726208. Throughput: 0: 1762.2, 1: 1761.9. Samples: 33442286. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:39:04,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:39:04,668][33201] Updated weights for policy 0, policy_version 65000 (0.0010) [2023-10-14 03:39:05,040][33201] Updated weights for policy 0, policy_version 65010 (0.0009) [2023-10-14 03:39:05,405][33201] Updated weights for policy 0, policy_version 65020 (0.0007) [2023-10-14 03:39:05,739][33226] Updated weights for policy 1, policy_version 65610 (0.0007) [2023-10-14 03:39:06,115][33226] Updated weights for policy 1, policy_version 65620 (0.0008) [2023-10-14 03:39:06,478][33226] Updated weights for policy 1, policy_version 65630 (0.0008) [2023-10-14 03:39:09,174][33201] Updated weights for policy 0, policy_version 65030 (0.0009) [2023-10-14 03:39:09,537][33201] Updated weights for policy 0, policy_version 65040 (0.0007) [2023-10-14 03:39:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 133791744. Throughput: 0: 1793.8, 1: 1774.0. Samples: 33464460. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:39:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.910')] [2023-10-14 03:39:09,903][33201] Updated weights for policy 0, policy_version 65050 (0.0009) [2023-10-14 03:39:10,248][33226] Updated weights for policy 1, policy_version 65640 (0.0009) [2023-10-14 03:39:10,616][33226] Updated weights for policy 1, policy_version 65650 (0.0008) [2023-10-14 03:39:10,985][33226] Updated weights for policy 1, policy_version 65660 (0.0009) [2023-10-14 03:39:13,698][33201] Updated weights for policy 0, policy_version 65060 (0.0008) [2023-10-14 03:39:14,069][33201] Updated weights for policy 0, policy_version 65070 (0.0008) [2023-10-14 03:39:14,439][33201] Updated weights for policy 0, policy_version 65080 (0.0010) [2023-10-14 03:39:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 133857280. Throughput: 0: 1768.4, 1: 1772.7. 
Samples: 33474366. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:39:14,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.910')] [2023-10-14 03:39:14,765][33226] Updated weights for policy 1, policy_version 65670 (0.0008) [2023-10-14 03:39:15,133][33226] Updated weights for policy 1, policy_version 65680 (0.0009) [2023-10-14 03:39:15,505][33226] Updated weights for policy 1, policy_version 65690 (0.0008) [2023-10-14 03:39:18,225][33201] Updated weights for policy 0, policy_version 65090 (0.0007) [2023-10-14 03:39:18,599][33201] Updated weights for policy 0, policy_version 65100 (0.0009) [2023-10-14 03:39:18,974][33201] Updated weights for policy 0, policy_version 65110 (0.0009) [2023-10-14 03:39:19,299][33226] Updated weights for policy 1, policy_version 65700 (0.0009) [2023-10-14 03:39:19,343][33201] Updated weights for policy 0, policy_version 65120 (0.0007) [2023-10-14 03:39:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 133955584. Throughput: 0: 1795.1, 1: 1772.0. Samples: 33496554. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:39:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.910')] [2023-10-14 03:39:19,665][33226] Updated weights for policy 1, policy_version 65710 (0.0010) [2023-10-14 03:39:20,039][33226] Updated weights for policy 1, policy_version 65720 (0.0009) [2023-10-14 03:39:23,285][33201] Updated weights for policy 0, policy_version 65130 (0.0007) [2023-10-14 03:39:23,660][33201] Updated weights for policy 0, policy_version 65140 (0.0009) [2023-10-14 03:39:23,872][33226] Updated weights for policy 1, policy_version 65730 (0.0008) [2023-10-14 03:39:24,023][33201] Updated weights for policy 0, policy_version 65150 (0.0008) [2023-10-14 03:39:24,237][33226] Updated weights for policy 1, policy_version 65740 (0.0008) [2023-10-14 03:39:24,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 134021120. 
Throughput: 0: 1756.5, 1: 1800.9. Samples: 33516872. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:39:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:39:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000065152_66715648.pth... [2023-10-14 03:39:24,597][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000063488_65011712.pth [2023-10-14 03:39:24,613][33226] Updated weights for policy 1, policy_version 65750 (0.0008) [2023-10-14 03:39:24,972][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000065760_67338240.pth... [2023-10-14 03:39:24,976][33226] Updated weights for policy 1, policy_version 65760 (0.0008) [2023-10-14 03:39:25,012][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000064096_65634304.pth [2023-10-14 03:39:28,040][33201] Updated weights for policy 0, policy_version 65160 (0.0007) [2023-10-14 03:39:28,416][33201] Updated weights for policy 0, policy_version 65170 (0.0010) [2023-10-14 03:39:28,791][33201] Updated weights for policy 0, policy_version 65180 (0.0007) [2023-10-14 03:39:28,851][33226] Updated weights for policy 1, policy_version 65770 (0.0009) [2023-10-14 03:39:29,210][33226] Updated weights for policy 1, policy_version 65780 (0.0009) [2023-10-14 03:39:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 134086656. Throughput: 0: 1778.3, 1: 1771.0. Samples: 33527884. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 03:39:29,572][33226] Updated weights for policy 1, policy_version 65790 (0.0007) [2023-10-14 03:39:32,554][33201] Updated weights for policy 0, policy_version 65190 (0.0008) [2023-10-14 03:39:32,928][33201] Updated weights for policy 0, policy_version 65200 (0.0007) [2023-10-14 03:39:33,296][33201] Updated weights for policy 0, policy_version 65210 (0.0007) [2023-10-14 03:39:33,342][33226] Updated weights for policy 1, policy_version 65800 (0.0009) [2023-10-14 03:39:33,710][33226] Updated weights for policy 1, policy_version 65810 (0.0010) [2023-10-14 03:39:34,078][33226] Updated weights for policy 1, policy_version 65820 (0.0007) [2023-10-14 03:39:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 134184960. Throughput: 0: 1760.9, 1: 1800.1. Samples: 33549010. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:39:37,107][33201] Updated weights for policy 0, policy_version 65220 (0.0010) [2023-10-14 03:39:37,480][33201] Updated weights for policy 0, policy_version 65230 (0.0011) [2023-10-14 03:39:37,853][33201] Updated weights for policy 0, policy_version 65240 (0.0010) [2023-10-14 03:39:38,045][33226] Updated weights for policy 1, policy_version 65830 (0.0009) [2023-10-14 03:39:38,421][33226] Updated weights for policy 1, policy_version 65840 (0.0011) [2023-10-14 03:39:38,789][33226] Updated weights for policy 1, policy_version 65850 (0.0010) [2023-10-14 03:39:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 134250496. Throughput: 0: 1747.8, 1: 1774.4. Samples: 33568944. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:39:41,704][33201] Updated weights for policy 0, policy_version 65250 (0.0008) [2023-10-14 03:39:42,069][33201] Updated weights for policy 0, policy_version 65260 (0.0010) [2023-10-14 03:39:42,447][33201] Updated weights for policy 0, policy_version 65270 (0.0010) [2023-10-14 03:39:42,629][33226] Updated weights for policy 1, policy_version 65860 (0.0008) [2023-10-14 03:39:42,816][33201] Updated weights for policy 0, policy_version 65280 (0.0007) [2023-10-14 03:39:42,995][33226] Updated weights for policy 1, policy_version 65870 (0.0008) [2023-10-14 03:39:43,371][33226] Updated weights for policy 1, policy_version 65880 (0.0008) [2023-10-14 03:39:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 134316032. Throughput: 0: 1763.3, 1: 1786.5. Samples: 33580606. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:44,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 03:39:46,723][33201] Updated weights for policy 0, policy_version 65290 (0.0007) [2023-10-14 03:39:47,100][33201] Updated weights for policy 0, policy_version 65300 (0.0008) [2023-10-14 03:39:47,191][33226] Updated weights for policy 1, policy_version 65890 (0.0007) [2023-10-14 03:39:47,460][33201] Updated weights for policy 0, policy_version 65310 (0.0008) [2023-10-14 03:39:47,559][33226] Updated weights for policy 1, policy_version 65900 (0.0007) [2023-10-14 03:39:47,915][33226] Updated weights for policy 1, policy_version 65910 (0.0008) [2023-10-14 03:39:48,285][33226] Updated weights for policy 1, policy_version 65920 (0.0009) [2023-10-14 03:39:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 134381568. Throughput: 0: 1747.5, 1: 1772.7. Samples: 33600694. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:49,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 03:39:51,273][33201] Updated weights for policy 0, policy_version 65320 (0.0008) [2023-10-14 03:39:51,631][33201] Updated weights for policy 0, policy_version 65330 (0.0008) [2023-10-14 03:39:51,919][33226] Updated weights for policy 1, policy_version 65930 (0.0008) [2023-10-14 03:39:52,008][33201] Updated weights for policy 0, policy_version 65340 (0.0008) [2023-10-14 03:39:52,281][33226] Updated weights for policy 1, policy_version 65940 (0.0008) [2023-10-14 03:39:52,648][33226] Updated weights for policy 1, policy_version 65950 (0.0008) [2023-10-14 03:39:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 134447104. Throughput: 0: 1745.2, 1: 1763.6. Samples: 33622356. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 03:39:55,884][33201] Updated weights for policy 0, policy_version 65350 (0.0008) [2023-10-14 03:39:56,254][33201] Updated weights for policy 0, policy_version 65360 (0.0007) [2023-10-14 03:39:56,609][33226] Updated weights for policy 1, policy_version 65960 (0.0009) [2023-10-14 03:39:56,626][33201] Updated weights for policy 0, policy_version 65370 (0.0007) [2023-10-14 03:39:56,991][33226] Updated weights for policy 1, policy_version 65970 (0.0008) [2023-10-14 03:39:57,357][33226] Updated weights for policy 1, policy_version 65980 (0.0011) [2023-10-14 03:39:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 134512640. Throughput: 0: 1738.4, 1: 1775.0. Samples: 33632468. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:39:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:40:00,483][33201] Updated weights for policy 0, policy_version 65380 (0.0007) [2023-10-14 03:40:00,849][33201] Updated weights for policy 0, policy_version 65390 (0.0008) [2023-10-14 03:40:01,128][33226] Updated weights for policy 1, policy_version 65990 (0.0008) [2023-10-14 03:40:01,210][33201] Updated weights for policy 0, policy_version 65400 (0.0008) [2023-10-14 03:40:01,487][33226] Updated weights for policy 1, policy_version 66000 (0.0008) [2023-10-14 03:40:01,862][33226] Updated weights for policy 1, policy_version 66010 (0.0008) [2023-10-14 03:40:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 134578176. Throughput: 0: 1742.4, 1: 1758.4. Samples: 33654090. Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:40:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.980')] [2023-10-14 03:40:05,051][33201] Updated weights for policy 0, policy_version 65410 (0.0007) [2023-10-14 03:40:05,421][33201] Updated weights for policy 0, policy_version 65420 (0.0009) [2023-10-14 03:40:05,664][33226] Updated weights for policy 1, policy_version 66020 (0.0008) [2023-10-14 03:40:05,787][33201] Updated weights for policy 0, policy_version 65430 (0.0008) [2023-10-14 03:40:06,030][33226] Updated weights for policy 1, policy_version 66030 (0.0008) [2023-10-14 03:40:06,159][33201] Updated weights for policy 0, policy_version 65440 (0.0007) [2023-10-14 03:40:06,410][33226] Updated weights for policy 1, policy_version 66040 (0.0009) [2023-10-14 03:40:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 134643712. Throughput: 0: 1780.5, 1: 1762.5. Samples: 33676308. 
Policy #0 lag: (min: 31.0, avg: 39.0, max: 63.0) [2023-10-14 03:40:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:40:09,775][33201] Updated weights for policy 0, policy_version 65450 (0.0009) [2023-10-14 03:40:10,153][33201] Updated weights for policy 0, policy_version 65460 (0.0008) [2023-10-14 03:40:10,179][33226] Updated weights for policy 1, policy_version 66050 (0.0010) [2023-10-14 03:40:10,513][33201] Updated weights for policy 0, policy_version 65470 (0.0007) [2023-10-14 03:40:10,547][33226] Updated weights for policy 1, policy_version 66060 (0.0007) [2023-10-14 03:40:10,910][33226] Updated weights for policy 1, policy_version 66070 (0.0007) [2023-10-14 03:40:11,277][33226] Updated weights for policy 1, policy_version 66080 (0.0008) [2023-10-14 03:40:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 134709248. Throughput: 0: 1749.5, 1: 1761.5. Samples: 33685878. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:14,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 03:40:14,590][33201] Updated weights for policy 0, policy_version 65480 (0.0009) [2023-10-14 03:40:14,926][33226] Updated weights for policy 1, policy_version 66090 (0.0008) [2023-10-14 03:40:14,961][33201] Updated weights for policy 0, policy_version 65490 (0.0009) [2023-10-14 03:40:15,290][33226] Updated weights for policy 1, policy_version 66100 (0.0007) [2023-10-14 03:40:15,337][33201] Updated weights for policy 0, policy_version 65500 (0.0007) [2023-10-14 03:40:15,655][33226] Updated weights for policy 1, policy_version 66110 (0.0009) [2023-10-14 03:40:19,142][33201] Updated weights for policy 0, policy_version 65510 (0.0009) [2023-10-14 03:40:19,508][33201] Updated weights for policy 0, policy_version 65520 (0.0010) [2023-10-14 03:40:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 134774784. Throughput: 0: 1766.4, 1: 1764.9. 
Samples: 33707922. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:19,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 03:40:19,587][33226] Updated weights for policy 1, policy_version 66120 (0.0008) [2023-10-14 03:40:19,877][33201] Updated weights for policy 0, policy_version 65530 (0.0008) [2023-10-14 03:40:19,953][33226] Updated weights for policy 1, policy_version 66130 (0.0007) [2023-10-14 03:40:20,317][33226] Updated weights for policy 1, policy_version 66140 (0.0010) [2023-10-14 03:40:23,575][33201] Updated weights for policy 0, policy_version 65540 (0.0008) [2023-10-14 03:40:23,936][33201] Updated weights for policy 0, policy_version 65550 (0.0008) [2023-10-14 03:40:24,170][33226] Updated weights for policy 1, policy_version 66150 (0.0008) [2023-10-14 03:40:24,309][33201] Updated weights for policy 0, policy_version 65560 (0.0007) [2023-10-14 03:40:24,547][33226] Updated weights for policy 1, policy_version 66160 (0.0008) [2023-10-14 03:40:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 134840320. Throughput: 0: 1764.4, 1: 1796.9. Samples: 33729198. 
Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.990')] [2023-10-14 03:40:24,909][33226] Updated weights for policy 1, policy_version 66170 (0.0010) [2023-10-14 03:40:28,240][33201] Updated weights for policy 0, policy_version 65570 (0.0009) [2023-10-14 03:40:28,613][33201] Updated weights for policy 0, policy_version 65580 (0.0010) [2023-10-14 03:40:28,697][33226] Updated weights for policy 1, policy_version 66180 (0.0009) [2023-10-14 03:40:28,980][33201] Updated weights for policy 0, policy_version 65590 (0.0008) [2023-10-14 03:40:29,063][33226] Updated weights for policy 1, policy_version 66190 (0.0008) [2023-10-14 03:40:29,353][33201] Updated weights for policy 0, policy_version 65600 (0.0008) [2023-10-14 03:40:29,429][33226] Updated weights for policy 1, policy_version 66200 (0.0009) [2023-10-14 03:40:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 134938624. Throughput: 0: 1759.8, 1: 1774.0. Samples: 33739628. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.990')] [2023-10-14 03:40:33,161][33226] Updated weights for policy 1, policy_version 66210 (0.0009) [2023-10-14 03:40:33,217][33201] Updated weights for policy 0, policy_version 65610 (0.0008) [2023-10-14 03:40:33,529][33226] Updated weights for policy 1, policy_version 66220 (0.0009) [2023-10-14 03:40:33,587][33201] Updated weights for policy 0, policy_version 65620 (0.0007) [2023-10-14 03:40:33,891][33226] Updated weights for policy 1, policy_version 66230 (0.0008) [2023-10-14 03:40:33,951][33201] Updated weights for policy 0, policy_version 65630 (0.0008) [2023-10-14 03:40:34,259][33226] Updated weights for policy 1, policy_version 66240 (0.0008) [2023-10-14 03:40:34,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 135036928. Throughput: 0: 1771.0, 1: 1799.0. 
Samples: 33761346. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.990')] [2023-10-14 03:40:37,834][33201] Updated weights for policy 0, policy_version 65640 (0.0009) [2023-10-14 03:40:38,128][33226] Updated weights for policy 1, policy_version 66250 (0.0009) [2023-10-14 03:40:38,204][33201] Updated weights for policy 0, policy_version 65650 (0.0008) [2023-10-14 03:40:38,494][33226] Updated weights for policy 1, policy_version 66260 (0.0009) [2023-10-14 03:40:38,572][33201] Updated weights for policy 0, policy_version 65660 (0.0008) [2023-10-14 03:40:38,858][33226] Updated weights for policy 1, policy_version 66270 (0.0008) [2023-10-14 03:40:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 135102464. Throughput: 0: 1748.4, 1: 1772.5. Samples: 33780798. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:40:42,344][33201] Updated weights for policy 0, policy_version 65670 (0.0008) [2023-10-14 03:40:42,700][33201] Updated weights for policy 0, policy_version 65680 (0.0007) [2023-10-14 03:40:42,732][33226] Updated weights for policy 1, policy_version 66280 (0.0009) [2023-10-14 03:40:43,071][33201] Updated weights for policy 0, policy_version 65690 (0.0009) [2023-10-14 03:40:43,104][33226] Updated weights for policy 1, policy_version 66290 (0.0008) [2023-10-14 03:40:43,468][33226] Updated weights for policy 1, policy_version 66300 (0.0008) [2023-10-14 03:40:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 135168000. Throughput: 0: 1778.6, 1: 1792.0. Samples: 33793146. 
Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:44,557][31953] Avg episode reward: [(0, '20.840'), (1, '20.990')] [2023-10-14 03:40:47,013][33201] Updated weights for policy 0, policy_version 65700 (0.0007) [2023-10-14 03:40:47,360][33226] Updated weights for policy 1, policy_version 66310 (0.0010) [2023-10-14 03:40:47,385][33201] Updated weights for policy 0, policy_version 65710 (0.0009) [2023-10-14 03:40:47,721][33226] Updated weights for policy 1, policy_version 66320 (0.0009) [2023-10-14 03:40:47,761][33201] Updated weights for policy 0, policy_version 65720 (0.0009) [2023-10-14 03:40:48,090][33226] Updated weights for policy 1, policy_version 66330 (0.0009) [2023-10-14 03:40:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 135233536. Throughput: 0: 1743.7, 1: 1778.4. Samples: 33812584. Policy #0 lag: (min: 8.0, avg: 29.8, max: 32.0) [2023-10-14 03:40:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.990')] [2023-10-14 03:40:51,608][33201] Updated weights for policy 0, policy_version 65730 (0.0009) [2023-10-14 03:40:51,860][33226] Updated weights for policy 1, policy_version 66340 (0.0009) [2023-10-14 03:40:51,978][33201] Updated weights for policy 0, policy_version 65740 (0.0009) [2023-10-14 03:40:52,227][33226] Updated weights for policy 1, policy_version 66350 (0.0007) [2023-10-14 03:40:52,345][33201] Updated weights for policy 0, policy_version 65750 (0.0007) [2023-10-14 03:40:52,594][33226] Updated weights for policy 1, policy_version 66360 (0.0009) [2023-10-14 03:40:52,711][33201] Updated weights for policy 0, policy_version 65760 (0.0008) [2023-10-14 03:40:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 135299072. Throughput: 0: 1737.0, 1: 1772.3. Samples: 33834226. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:40:54,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:40:56,299][33226] Updated weights for policy 1, policy_version 66370 (0.0009) [2023-10-14 03:40:56,463][33201] Updated weights for policy 0, policy_version 65770 (0.0009) [2023-10-14 03:40:56,654][33226] Updated weights for policy 1, policy_version 66380 (0.0010) [2023-10-14 03:40:56,831][33201] Updated weights for policy 0, policy_version 65780 (0.0007) [2023-10-14 03:40:57,029][33226] Updated weights for policy 1, policy_version 66390 (0.0009) [2023-10-14 03:40:57,205][33201] Updated weights for policy 0, policy_version 65790 (0.0007) [2023-10-14 03:40:57,392][33226] Updated weights for policy 1, policy_version 66400 (0.0007) [2023-10-14 03:40:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 135364608. Throughput: 0: 1746.7, 1: 1784.0. Samples: 33844760. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:40:59,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:01,128][33226] Updated weights for policy 1, policy_version 66410 (0.0007) [2023-10-14 03:41:01,220][33201] Updated weights for policy 0, policy_version 65800 (0.0007) [2023-10-14 03:41:01,494][33226] Updated weights for policy 1, policy_version 66420 (0.0008) [2023-10-14 03:41:01,586][33201] Updated weights for policy 0, policy_version 65810 (0.0009) [2023-10-14 03:41:01,868][33226] Updated weights for policy 1, policy_version 66430 (0.0008) [2023-10-14 03:41:01,947][33201] Updated weights for policy 0, policy_version 65820 (0.0007) [2023-10-14 03:41:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 135430144. Throughput: 0: 1740.1, 1: 1769.0. Samples: 33865832. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:04,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:05,685][33201] Updated weights for policy 0, policy_version 65830 (0.0008) [2023-10-14 03:41:05,758][33226] Updated weights for policy 1, policy_version 66440 (0.0008) [2023-10-14 03:41:06,055][33201] Updated weights for policy 0, policy_version 65840 (0.0008) [2023-10-14 03:41:06,131][33226] Updated weights for policy 1, policy_version 66450 (0.0008) [2023-10-14 03:41:06,430][33201] Updated weights for policy 0, policy_version 65850 (0.0008) [2023-10-14 03:41:06,491][33226] Updated weights for policy 1, policy_version 66460 (0.0007) [2023-10-14 03:41:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 135495680. Throughput: 0: 1754.6, 1: 1767.5. Samples: 33887692. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:09,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:10,273][33226] Updated weights for policy 1, policy_version 66470 (0.0008) [2023-10-14 03:41:10,347][33201] Updated weights for policy 0, policy_version 65860 (0.0009) [2023-10-14 03:41:10,645][33226] Updated weights for policy 1, policy_version 66480 (0.0008) [2023-10-14 03:41:10,714][33201] Updated weights for policy 0, policy_version 65870 (0.0010) [2023-10-14 03:41:11,013][33226] Updated weights for policy 1, policy_version 66490 (0.0007) [2023-10-14 03:41:11,079][33201] Updated weights for policy 0, policy_version 65880 (0.0009) [2023-10-14 03:41:14,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 135561216. Throughput: 0: 1741.2, 1: 1762.0. Samples: 33897268. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:14,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:14,913][33226] Updated weights for policy 1, policy_version 66500 (0.0009) [2023-10-14 03:41:14,917][33201] Updated weights for policy 0, policy_version 65890 (0.0010) [2023-10-14 03:41:15,274][33226] Updated weights for policy 1, policy_version 66510 (0.0008) [2023-10-14 03:41:15,281][33201] Updated weights for policy 0, policy_version 65900 (0.0009) [2023-10-14 03:41:15,641][33226] Updated weights for policy 1, policy_version 66520 (0.0008) [2023-10-14 03:41:15,659][33201] Updated weights for policy 0, policy_version 65910 (0.0009) [2023-10-14 03:41:16,024][33201] Updated weights for policy 0, policy_version 65920 (0.0008) [2023-10-14 03:41:19,338][33226] Updated weights for policy 1, policy_version 66530 (0.0008) [2023-10-14 03:41:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 135626752. Throughput: 0: 1748.4, 1: 1758.8. Samples: 33919168. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:19,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:19,708][33226] Updated weights for policy 1, policy_version 66540 (0.0009) [2023-10-14 03:41:19,932][33201] Updated weights for policy 0, policy_version 65930 (0.0008) [2023-10-14 03:41:20,081][33226] Updated weights for policy 1, policy_version 66550 (0.0009) [2023-10-14 03:41:20,301][33201] Updated weights for policy 0, policy_version 65940 (0.0008) [2023-10-14 03:41:20,446][33226] Updated weights for policy 1, policy_version 66560 (0.0008) [2023-10-14 03:41:20,682][33201] Updated weights for policy 0, policy_version 65950 (0.0009) [2023-10-14 03:41:24,293][33226] Updated weights for policy 1, policy_version 66570 (0.0008) [2023-10-14 03:41:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 135692288. Throughput: 0: 1770.1, 1: 1792.0. 
Samples: 33941088. Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:24,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:24,652][33226] Updated weights for policy 1, policy_version 66580 (0.0008) [2023-10-14 03:41:24,660][33201] Updated weights for policy 0, policy_version 65960 (0.0009) [2023-10-14 03:41:25,021][33226] Updated weights for policy 1, policy_version 66590 (0.0008) [2023-10-14 03:41:25,040][33201] Updated weights for policy 0, policy_version 65970 (0.0008) [2023-10-14 03:41:25,087][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000066592_68190208.pth... [2023-10-14 03:41:25,117][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000064928_66486272.pth [2023-10-14 03:41:25,120][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000066592_68190208.pth [2023-10-14 03:41:25,410][33201] Updated weights for policy 0, policy_version 65980 (0.0010) [2023-10-14 03:41:25,561][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000065984_67567616.pth... [2023-10-14 03:41:25,590][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000064320_65863680.pth [2023-10-14 03:41:25,594][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000065984_67567616.pth [2023-10-14 03:41:28,851][33226] Updated weights for policy 1, policy_version 66600 (0.0008) [2023-10-14 03:41:29,218][33201] Updated weights for policy 0, policy_version 65990 (0.0008) [2023-10-14 03:41:29,231][33226] Updated weights for policy 1, policy_version 66610 (0.0007) [2023-10-14 03:41:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 13995.8). Total num frames: 135757824. Throughput: 0: 1737.2, 1: 1766.7. Samples: 33950822. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:29,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.990')] [2023-10-14 03:41:29,596][33201] Updated weights for policy 0, policy_version 66000 (0.0008) [2023-10-14 03:41:29,597][33226] Updated weights for policy 1, policy_version 66620 (0.0007) [2023-10-14 03:41:29,964][33201] Updated weights for policy 0, policy_version 66010 (0.0010) [2023-10-14 03:41:33,334][33226] Updated weights for policy 1, policy_version 66630 (0.0009) [2023-10-14 03:41:33,675][33201] Updated weights for policy 0, policy_version 66020 (0.0010) [2023-10-14 03:41:33,699][33226] Updated weights for policy 1, policy_version 66640 (0.0007) [2023-10-14 03:41:34,043][33201] Updated weights for policy 0, policy_version 66030 (0.0007) [2023-10-14 03:41:34,068][33226] Updated weights for policy 1, policy_version 66650 (0.0009) [2023-10-14 03:41:34,408][33201] Updated weights for policy 0, policy_version 66040 (0.0007) [2023-10-14 03:41:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 135856128. Throughput: 0: 1777.7, 1: 1793.0. Samples: 33973268. 
Policy #0 lag: (min: 31.0, avg: 37.3, max: 63.0) [2023-10-14 03:41:34,558][31953] Avg episode reward: [(0, '20.860'), (1, '21.000')] [2023-10-14 03:41:37,937][33226] Updated weights for policy 1, policy_version 66660 (0.0008) [2023-10-14 03:41:38,168][33201] Updated weights for policy 0, policy_version 66050 (0.0007) [2023-10-14 03:41:38,307][33226] Updated weights for policy 1, policy_version 66670 (0.0008) [2023-10-14 03:41:38,535][33201] Updated weights for policy 0, policy_version 66060 (0.0009) [2023-10-14 03:41:38,669][33226] Updated weights for policy 1, policy_version 66680 (0.0007) [2023-10-14 03:41:38,905][33201] Updated weights for policy 0, policy_version 66070 (0.0007) [2023-10-14 03:41:39,265][33201] Updated weights for policy 0, policy_version 66080 (0.0008) [2023-10-14 03:41:39,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 135954432. Throughput: 0: 1755.4, 1: 1765.3. Samples: 33992660. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:41:39,559][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:41:42,464][33226] Updated weights for policy 1, policy_version 66690 (0.0008) [2023-10-14 03:41:42,841][33226] Updated weights for policy 1, policy_version 66700 (0.0010) [2023-10-14 03:41:43,188][33201] Updated weights for policy 0, policy_version 66090 (0.0007) [2023-10-14 03:41:43,209][33226] Updated weights for policy 1, policy_version 66710 (0.0007) [2023-10-14 03:41:43,554][33201] Updated weights for policy 0, policy_version 66100 (0.0008) [2023-10-14 03:41:43,576][33226] Updated weights for policy 1, policy_version 66720 (0.0008) [2023-10-14 03:41:43,919][33201] Updated weights for policy 0, policy_version 66110 (0.0008) [2023-10-14 03:41:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 136019968. Throughput: 0: 1767.9, 1: 1780.2. Samples: 34004424. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:41:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:41:47,293][33226] Updated weights for policy 1, policy_version 66730 (0.0008) [2023-10-14 03:41:47,661][33226] Updated weights for policy 1, policy_version 66740 (0.0009) [2023-10-14 03:41:47,914][33201] Updated weights for policy 0, policy_version 66120 (0.0008) [2023-10-14 03:41:48,036][33226] Updated weights for policy 1, policy_version 66750 (0.0008) [2023-10-14 03:41:48,279][33201] Updated weights for policy 0, policy_version 66130 (0.0009) [2023-10-14 03:41:48,658][33201] Updated weights for policy 0, policy_version 66140 (0.0009) [2023-10-14 03:41:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 136085504. Throughput: 0: 1767.7, 1: 1764.6. Samples: 34024784. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:41:49,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:41:51,879][33226] Updated weights for policy 1, policy_version 66760 (0.0007) [2023-10-14 03:41:52,254][33226] Updated weights for policy 1, policy_version 66770 (0.0007) [2023-10-14 03:41:52,598][33201] Updated weights for policy 0, policy_version 66150 (0.0007) [2023-10-14 03:41:52,624][33226] Updated weights for policy 1, policy_version 66780 (0.0007) [2023-10-14 03:41:52,989][33201] Updated weights for policy 0, policy_version 66160 (0.0008) [2023-10-14 03:41:53,346][33201] Updated weights for policy 0, policy_version 66170 (0.0008) [2023-10-14 03:41:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 136151040. Throughput: 0: 1744.8, 1: 1762.4. Samples: 34045518. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:41:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:41:56,427][33226] Updated weights for policy 1, policy_version 66790 (0.0008) [2023-10-14 03:41:56,800][33226] Updated weights for policy 1, policy_version 66800 (0.0010) [2023-10-14 03:41:57,068][33201] Updated weights for policy 0, policy_version 66180 (0.0008) [2023-10-14 03:41:57,157][33226] Updated weights for policy 1, policy_version 66810 (0.0007) [2023-10-14 03:41:57,444][33201] Updated weights for policy 0, policy_version 66190 (0.0008) [2023-10-14 03:41:57,812][33201] Updated weights for policy 0, policy_version 66200 (0.0009) [2023-10-14 03:41:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 136216576. Throughput: 0: 1777.7, 1: 1775.9. Samples: 34057178. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:41:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:42:01,062][33226] Updated weights for policy 1, policy_version 66820 (0.0008) [2023-10-14 03:42:01,426][33226] Updated weights for policy 1, policy_version 66830 (0.0008) [2023-10-14 03:42:01,524][33201] Updated weights for policy 0, policy_version 66210 (0.0008) [2023-10-14 03:42:01,788][33226] Updated weights for policy 1, policy_version 66840 (0.0007) [2023-10-14 03:42:01,899][33201] Updated weights for policy 0, policy_version 66220 (0.0010) [2023-10-14 03:42:02,273][33201] Updated weights for policy 0, policy_version 66230 (0.0009) [2023-10-14 03:42:02,640][33201] Updated weights for policy 0, policy_version 66240 (0.0009) [2023-10-14 03:42:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 136282112. Throughput: 0: 1755.5, 1: 1765.7. Samples: 34077622. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:42:04,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:42:05,777][33226] Updated weights for policy 1, policy_version 66850 (0.0007) [2023-10-14 03:42:06,147][33226] Updated weights for policy 1, policy_version 66860 (0.0008) [2023-10-14 03:42:06,400][33201] Updated weights for policy 0, policy_version 66250 (0.0008) [2023-10-14 03:42:06,519][33226] Updated weights for policy 1, policy_version 66870 (0.0007) [2023-10-14 03:42:06,774][33201] Updated weights for policy 0, policy_version 66260 (0.0008) [2023-10-14 03:42:06,879][33226] Updated weights for policy 1, policy_version 66880 (0.0007) [2023-10-14 03:42:07,139][33201] Updated weights for policy 0, policy_version 66270 (0.0011) [2023-10-14 03:42:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 136347648. Throughput: 0: 1760.9, 1: 1762.6. Samples: 34099646. Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:42:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:42:10,639][33226] Updated weights for policy 1, policy_version 66890 (0.0009) [2023-10-14 03:42:10,944][33201] Updated weights for policy 0, policy_version 66280 (0.0009) [2023-10-14 03:42:11,006][33226] Updated weights for policy 1, policy_version 66900 (0.0008) [2023-10-14 03:42:11,314][33201] Updated weights for policy 0, policy_version 66290 (0.0008) [2023-10-14 03:42:11,371][33226] Updated weights for policy 1, policy_version 66910 (0.0009) [2023-10-14 03:42:11,685][33201] Updated weights for policy 0, policy_version 66300 (0.0011) [2023-10-14 03:42:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 136413184. Throughput: 0: 1762.9, 1: 1755.6. Samples: 34109154. 
Policy #0 lag: (min: 31.0, avg: 31.9, max: 51.0) [2023-10-14 03:42:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:42:15,394][33226] Updated weights for policy 1, policy_version 66920 (0.0009) [2023-10-14 03:42:15,491][33201] Updated weights for policy 0, policy_version 66310 (0.0009) [2023-10-14 03:42:15,760][33226] Updated weights for policy 1, policy_version 66930 (0.0009) [2023-10-14 03:42:15,864][33201] Updated weights for policy 0, policy_version 66320 (0.0007) [2023-10-14 03:42:16,122][33226] Updated weights for policy 1, policy_version 66940 (0.0007) [2023-10-14 03:42:16,236][33201] Updated weights for policy 0, policy_version 66330 (0.0008) [2023-10-14 03:42:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 136478720. Throughput: 0: 1757.6, 1: 1750.7. Samples: 34131142. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:19,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:42:19,902][33226] Updated weights for policy 1, policy_version 66950 (0.0010) [2023-10-14 03:42:19,933][33201] Updated weights for policy 0, policy_version 66340 (0.0007) [2023-10-14 03:42:20,263][33226] Updated weights for policy 1, policy_version 66960 (0.0008) [2023-10-14 03:42:20,305][33201] Updated weights for policy 0, policy_version 66350 (0.0008) [2023-10-14 03:42:20,637][33226] Updated weights for policy 1, policy_version 66970 (0.0009) [2023-10-14 03:42:20,676][33201] Updated weights for policy 0, policy_version 66360 (0.0009) [2023-10-14 03:42:24,454][33226] Updated weights for policy 1, policy_version 66980 (0.0007) [2023-10-14 03:42:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 136544256. Throughput: 0: 1779.0, 1: 1781.1. Samples: 34152866. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:42:24,705][33201] Updated weights for policy 0, policy_version 66370 (0.0009) [2023-10-14 03:42:24,817][33226] Updated weights for policy 1, policy_version 66990 (0.0008) [2023-10-14 03:42:25,062][33201] Updated weights for policy 0, policy_version 66380 (0.0008) [2023-10-14 03:42:25,180][33226] Updated weights for policy 1, policy_version 67000 (0.0008) [2023-10-14 03:42:25,440][33201] Updated weights for policy 0, policy_version 66390 (0.0008) [2023-10-14 03:42:25,802][33201] Updated weights for policy 0, policy_version 66400 (0.0010) [2023-10-14 03:42:28,944][33226] Updated weights for policy 1, policy_version 67010 (0.0007) [2023-10-14 03:42:29,300][33226] Updated weights for policy 1, policy_version 67020 (0.0007) [2023-10-14 03:42:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 136609792. Throughput: 0: 1755.9, 1: 1750.2. Samples: 34162198. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:42:29,662][33226] Updated weights for policy 1, policy_version 67030 (0.0008) [2023-10-14 03:42:29,729][33201] Updated weights for policy 0, policy_version 66410 (0.0009) [2023-10-14 03:42:30,025][33226] Updated weights for policy 1, policy_version 67040 (0.0009) [2023-10-14 03:42:30,091][33201] Updated weights for policy 0, policy_version 66420 (0.0008) [2023-10-14 03:42:30,467][33201] Updated weights for policy 0, policy_version 66430 (0.0010) [2023-10-14 03:42:33,916][33226] Updated weights for policy 1, policy_version 67050 (0.0008) [2023-10-14 03:42:34,274][33226] Updated weights for policy 1, policy_version 67060 (0.0009) [2023-10-14 03:42:34,348][33201] Updated weights for policy 0, policy_version 66440 (0.0008) [2023-10-14 03:42:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 13995.8). Total num frames: 136675328. Throughput: 0: 1763.5, 1: 1774.5. Samples: 34183992. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.990')] [2023-10-14 03:42:34,647][33226] Updated weights for policy 1, policy_version 67070 (0.0007) [2023-10-14 03:42:34,719][33201] Updated weights for policy 0, policy_version 66450 (0.0007) [2023-10-14 03:42:35,092][33201] Updated weights for policy 0, policy_version 66460 (0.0010) [2023-10-14 03:42:38,441][33226] Updated weights for policy 1, policy_version 67080 (0.0008) [2023-10-14 03:42:38,779][33201] Updated weights for policy 0, policy_version 66470 (0.0008) [2023-10-14 03:42:38,810][33226] Updated weights for policy 1, policy_version 67090 (0.0008) [2023-10-14 03:42:39,155][33201] Updated weights for policy 0, policy_version 66480 (0.0009) [2023-10-14 03:42:39,176][33226] Updated weights for policy 1, policy_version 67100 (0.0009) [2023-10-14 03:42:39,530][33201] Updated weights for policy 0, policy_version 66490 (0.0009) [2023-10-14 03:42:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 136773632. Throughput: 0: 1779.6, 1: 1759.5. Samples: 34204778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:39,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:42:42,958][33226] Updated weights for policy 1, policy_version 67110 (0.0009) [2023-10-14 03:42:43,242][33201] Updated weights for policy 0, policy_version 66500 (0.0008) [2023-10-14 03:42:43,316][33226] Updated weights for policy 1, policy_version 67120 (0.0007) [2023-10-14 03:42:43,619][33201] Updated weights for policy 0, policy_version 66510 (0.0008) [2023-10-14 03:42:43,686][33226] Updated weights for policy 1, policy_version 67130 (0.0007) [2023-10-14 03:42:43,985][33201] Updated weights for policy 0, policy_version 66520 (0.0009) [2023-10-14 03:42:44,557][31953] Fps is (10 sec: 19660.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 136871936. Throughput: 0: 1760.0, 1: 1768.9. 
Samples: 34215976. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:42:47,624][33226] Updated weights for policy 1, policy_version 67140 (0.0008) [2023-10-14 03:42:47,809][33201] Updated weights for policy 0, policy_version 66530 (0.0009) [2023-10-14 03:42:47,986][33226] Updated weights for policy 1, policy_version 67150 (0.0008) [2023-10-14 03:42:48,172][33201] Updated weights for policy 0, policy_version 66540 (0.0008) [2023-10-14 03:42:48,354][33226] Updated weights for policy 1, policy_version 67160 (0.0008) [2023-10-14 03:42:48,546][33201] Updated weights for policy 0, policy_version 66550 (0.0009) [2023-10-14 03:42:48,915][33201] Updated weights for policy 0, policy_version 66560 (0.0007) [2023-10-14 03:42:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 136937472. Throughput: 0: 1775.0, 1: 1769.0. Samples: 34237104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:49,560][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:42:52,059][33226] Updated weights for policy 1, policy_version 67170 (0.0009) [2023-10-14 03:42:52,434][33226] Updated weights for policy 1, policy_version 67180 (0.0008) [2023-10-14 03:42:52,768][33201] Updated weights for policy 0, policy_version 66570 (0.0007) [2023-10-14 03:42:52,798][33226] Updated weights for policy 1, policy_version 67190 (0.0009) [2023-10-14 03:42:53,135][33201] Updated weights for policy 0, policy_version 66580 (0.0008) [2023-10-14 03:42:53,170][33226] Updated weights for policy 1, policy_version 67200 (0.0009) [2023-10-14 03:42:53,505][33201] Updated weights for policy 0, policy_version 66590 (0.0008) [2023-10-14 03:42:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137003008. Throughput: 0: 1749.3, 1: 1747.7. Samples: 34257010. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:42:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.990')] [2023-10-14 03:42:56,934][33226] Updated weights for policy 1, policy_version 67210 (0.0011) [2023-10-14 03:42:57,306][33226] Updated weights for policy 1, policy_version 67220 (0.0009) [2023-10-14 03:42:57,402][33201] Updated weights for policy 0, policy_version 66600 (0.0007) [2023-10-14 03:42:57,672][33226] Updated weights for policy 1, policy_version 67230 (0.0009) [2023-10-14 03:42:57,772][33201] Updated weights for policy 0, policy_version 66610 (0.0009) [2023-10-14 03:42:58,138][33201] Updated weights for policy 0, policy_version 66620 (0.0007) [2023-10-14 03:42:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137068544. Throughput: 0: 1785.0, 1: 1770.2. Samples: 34269140. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:42:59,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.990')] [2023-10-14 03:43:01,580][33226] Updated weights for policy 1, policy_version 67240 (0.0008) [2023-10-14 03:43:01,946][33226] Updated weights for policy 1, policy_version 67250 (0.0009) [2023-10-14 03:43:01,972][33201] Updated weights for policy 0, policy_version 66630 (0.0007) [2023-10-14 03:43:02,312][33226] Updated weights for policy 1, policy_version 67260 (0.0008) [2023-10-14 03:43:02,337][33201] Updated weights for policy 0, policy_version 66640 (0.0008) [2023-10-14 03:43:02,718][33201] Updated weights for policy 0, policy_version 66650 (0.0010) [2023-10-14 03:43:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137134080. Throughput: 0: 1748.2, 1: 1753.9. Samples: 34288734. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:43:06,361][33226] Updated weights for policy 1, policy_version 67270 (0.0008) [2023-10-14 03:43:06,437][33201] Updated weights for policy 0, policy_version 66660 (0.0010) [2023-10-14 03:43:06,739][33226] Updated weights for policy 1, policy_version 67280 (0.0009) [2023-10-14 03:43:06,811][33201] Updated weights for policy 0, policy_version 66670 (0.0010) [2023-10-14 03:43:07,107][33226] Updated weights for policy 1, policy_version 67290 (0.0009) [2023-10-14 03:43:07,183][33201] Updated weights for policy 0, policy_version 66680 (0.0009) [2023-10-14 03:43:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137199616. Throughput: 0: 1758.5, 1: 1751.2. Samples: 34310806. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:09,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:43:10,815][33226] Updated weights for policy 1, policy_version 67300 (0.0010) [2023-10-14 03:43:10,984][33201] Updated weights for policy 0, policy_version 66690 (0.0010) [2023-10-14 03:43:11,179][33226] Updated weights for policy 1, policy_version 67310 (0.0009) [2023-10-14 03:43:11,352][33201] Updated weights for policy 0, policy_version 66700 (0.0008) [2023-10-14 03:43:11,537][33226] Updated weights for policy 1, policy_version 67320 (0.0008) [2023-10-14 03:43:11,713][33201] Updated weights for policy 0, policy_version 66710 (0.0007) [2023-10-14 03:43:12,083][33201] Updated weights for policy 0, policy_version 66720 (0.0008) [2023-10-14 03:43:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 137265152. Throughput: 0: 1760.3, 1: 1758.4. Samples: 34320544. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:43:15,303][33226] Updated weights for policy 1, policy_version 67330 (0.0007) [2023-10-14 03:43:15,680][33226] Updated weights for policy 1, policy_version 67340 (0.0008) [2023-10-14 03:43:15,816][33201] Updated weights for policy 0, policy_version 66730 (0.0008) [2023-10-14 03:43:16,043][33226] Updated weights for policy 1, policy_version 67350 (0.0008) [2023-10-14 03:43:16,189][33201] Updated weights for policy 0, policy_version 66740 (0.0008) [2023-10-14 03:43:16,407][33226] Updated weights for policy 1, policy_version 67360 (0.0007) [2023-10-14 03:43:16,560][33201] Updated weights for policy 0, policy_version 66750 (0.0008) [2023-10-14 03:43:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 137330688. Throughput: 0: 1765.5, 1: 1759.5. Samples: 34342616. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:19,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 03:43:20,181][33226] Updated weights for policy 1, policy_version 67370 (0.0007) [2023-10-14 03:43:20,549][33226] Updated weights for policy 1, policy_version 67380 (0.0007) [2023-10-14 03:43:20,609][33201] Updated weights for policy 0, policy_version 66760 (0.0007) [2023-10-14 03:43:20,915][33226] Updated weights for policy 1, policy_version 67390 (0.0009) [2023-10-14 03:43:20,981][33201] Updated weights for policy 0, policy_version 66770 (0.0009) [2023-10-14 03:43:21,357][33201] Updated weights for policy 0, policy_version 66780 (0.0010) [2023-10-14 03:43:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 137396224. Throughput: 0: 1770.8, 1: 1782.3. Samples: 34364672. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:24,558][31953] Avg episode reward: [(0, '20.930'), (1, '21.000')] [2023-10-14 03:43:24,572][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth... [2023-10-14 03:43:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000065152_66715648.pth [2023-10-14 03:43:24,784][33226] Updated weights for policy 1, policy_version 67400 (0.0008) [2023-10-14 03:43:25,146][33226] Updated weights for policy 1, policy_version 67410 (0.0007) [2023-10-14 03:43:25,318][33201] Updated weights for policy 0, policy_version 66790 (0.0008) [2023-10-14 03:43:25,515][33226] Updated weights for policy 1, policy_version 67420 (0.0007) [2023-10-14 03:43:25,660][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000067424_69042176.pth... [2023-10-14 03:43:25,691][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000065760_67338240.pth [2023-10-14 03:43:25,694][33201] Updated weights for policy 0, policy_version 66800 (0.0007) [2023-10-14 03:43:26,060][33201] Updated weights for policy 0, policy_version 66810 (0.0008) [2023-10-14 03:43:29,340][33226] Updated weights for policy 1, policy_version 67430 (0.0007) [2023-10-14 03:43:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 137461760. Throughput: 0: 1751.7, 1: 1762.8. Samples: 34374130. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:29,558][31953] Avg episode reward: [(0, '20.910'), (1, '21.000')] [2023-10-14 03:43:29,707][33226] Updated weights for policy 1, policy_version 67440 (0.0009) [2023-10-14 03:43:29,977][33201] Updated weights for policy 0, policy_version 66820 (0.0007) [2023-10-14 03:43:30,075][33226] Updated weights for policy 1, policy_version 67450 (0.0008) [2023-10-14 03:43:30,338][33201] Updated weights for policy 0, policy_version 66830 (0.0007) [2023-10-14 03:43:30,706][33201] Updated weights for policy 0, policy_version 66840 (0.0007) [2023-10-14 03:43:33,848][33226] Updated weights for policy 1, policy_version 67460 (0.0007) [2023-10-14 03:43:34,216][33226] Updated weights for policy 1, policy_version 67470 (0.0008) [2023-10-14 03:43:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 137527296. Throughput: 0: 1760.5, 1: 1777.3. Samples: 34396304. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '21.000')] [2023-10-14 03:43:34,577][33226] Updated weights for policy 1, policy_version 67480 (0.0007) [2023-10-14 03:43:34,611][33201] Updated weights for policy 0, policy_version 66850 (0.0008) [2023-10-14 03:43:34,980][33201] Updated weights for policy 0, policy_version 66860 (0.0010) [2023-10-14 03:43:35,353][33201] Updated weights for policy 0, policy_version 66870 (0.0009) [2023-10-14 03:43:35,727][33201] Updated weights for policy 0, policy_version 66880 (0.0009) [2023-10-14 03:43:38,360][33226] Updated weights for policy 1, policy_version 67490 (0.0007) [2023-10-14 03:43:38,728][33226] Updated weights for policy 1, policy_version 67500 (0.0008) [2023-10-14 03:43:39,092][33226] Updated weights for policy 1, policy_version 67510 (0.0008) [2023-10-14 03:43:39,447][33201] Updated weights for policy 0, policy_version 66890 (0.0007) [2023-10-14 03:43:39,460][33226] Updated weights for policy 1, 
policy_version 67520 (0.0007) [2023-10-14 03:43:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 137625600. Throughput: 0: 1788.1, 1: 1781.7. Samples: 34417650. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 03:43:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '21.000')] [2023-10-14 03:43:39,823][33201] Updated weights for policy 0, policy_version 66900 (0.0010) [2023-10-14 03:43:40,198][33201] Updated weights for policy 0, policy_version 66910 (0.0009) [2023-10-14 03:43:43,227][33226] Updated weights for policy 1, policy_version 67530 (0.0009) [2023-10-14 03:43:43,600][33226] Updated weights for policy 1, policy_version 67540 (0.0007) [2023-10-14 03:43:43,957][33226] Updated weights for policy 1, policy_version 67550 (0.0009) [2023-10-14 03:43:43,980][33201] Updated weights for policy 0, policy_version 66920 (0.0009) [2023-10-14 03:43:44,348][33201] Updated weights for policy 0, policy_version 66930 (0.0009) [2023-10-14 03:43:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 137691136. Throughput: 0: 1751.9, 1: 1781.7. Samples: 34428150. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:43:44,557][31953] Avg episode reward: [(0, '20.920'), (1, '21.000')] [2023-10-14 03:43:44,714][33201] Updated weights for policy 0, policy_version 66940 (0.0011) [2023-10-14 03:43:47,733][33226] Updated weights for policy 1, policy_version 67560 (0.0010) [2023-10-14 03:43:48,102][33226] Updated weights for policy 1, policy_version 67570 (0.0011) [2023-10-14 03:43:48,383][33201] Updated weights for policy 0, policy_version 66950 (0.0009) [2023-10-14 03:43:48,464][33226] Updated weights for policy 1, policy_version 67580 (0.0008) [2023-10-14 03:43:48,764][33201] Updated weights for policy 0, policy_version 66960 (0.0010) [2023-10-14 03:43:49,128][33201] Updated weights for policy 0, policy_version 66970 (0.0008) [2023-10-14 03:43:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137789440. Throughput: 0: 1790.1, 1: 1789.1. Samples: 34449800. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:43:49,558][31953] Avg episode reward: [(0, '20.920'), (1, '21.000')] [2023-10-14 03:43:52,249][33226] Updated weights for policy 1, policy_version 67590 (0.0008) [2023-10-14 03:43:52,613][33226] Updated weights for policy 1, policy_version 67600 (0.0009) [2023-10-14 03:43:52,902][33201] Updated weights for policy 0, policy_version 66980 (0.0008) [2023-10-14 03:43:52,982][33226] Updated weights for policy 1, policy_version 67610 (0.0009) [2023-10-14 03:43:53,268][33201] Updated weights for policy 0, policy_version 66990 (0.0009) [2023-10-14 03:43:53,640][33201] Updated weights for policy 0, policy_version 67000 (0.0007) [2023-10-14 03:43:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137854976. Throughput: 0: 1750.8, 1: 1781.0. Samples: 34469740. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:43:54,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 03:43:56,853][33226] Updated weights for policy 1, policy_version 67620 (0.0007) [2023-10-14 03:43:57,217][33226] Updated weights for policy 1, policy_version 67630 (0.0007) [2023-10-14 03:43:57,410][33201] Updated weights for policy 0, policy_version 67010 (0.0010) [2023-10-14 03:43:57,579][33226] Updated weights for policy 1, policy_version 67640 (0.0007) [2023-10-14 03:43:57,769][33201] Updated weights for policy 0, policy_version 67020 (0.0007) [2023-10-14 03:43:58,139][33201] Updated weights for policy 0, policy_version 67030 (0.0009) [2023-10-14 03:43:58,510][33201] Updated weights for policy 0, policy_version 67040 (0.0008) [2023-10-14 03:43:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 137920512. Throughput: 0: 1785.0, 1: 1799.0. Samples: 34481822. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:43:59,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:01,354][33226] Updated weights for policy 1, policy_version 67650 (0.0007) [2023-10-14 03:44:01,714][33226] Updated weights for policy 1, policy_version 67660 (0.0007) [2023-10-14 03:44:02,083][33226] Updated weights for policy 1, policy_version 67670 (0.0010) [2023-10-14 03:44:02,450][33226] Updated weights for policy 1, policy_version 67680 (0.0008) [2023-10-14 03:44:02,482][33201] Updated weights for policy 0, policy_version 67050 (0.0009) [2023-10-14 03:44:02,844][33201] Updated weights for policy 0, policy_version 67060 (0.0010) [2023-10-14 03:44:03,218][33201] Updated weights for policy 0, policy_version 67070 (0.0008) [2023-10-14 03:44:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 137986048. Throughput: 0: 1755.6, 1: 1778.6. Samples: 34501654. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:44:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:06,202][33226] Updated weights for policy 1, policy_version 67690 (0.0009) [2023-10-14 03:44:06,570][33226] Updated weights for policy 1, policy_version 67700 (0.0008) [2023-10-14 03:44:06,944][33226] Updated weights for policy 1, policy_version 67710 (0.0009) [2023-10-14 03:44:07,160][33201] Updated weights for policy 0, policy_version 67080 (0.0007) [2023-10-14 03:44:07,529][33201] Updated weights for policy 0, policy_version 67090 (0.0008) [2023-10-14 03:44:07,895][33201] Updated weights for policy 0, policy_version 67100 (0.0008) [2023-10-14 03:44:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 138051584. Throughput: 0: 1749.6, 1: 1778.3. Samples: 34523426. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:44:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:10,617][33226] Updated weights for policy 1, policy_version 67720 (0.0008) [2023-10-14 03:44:10,991][33226] Updated weights for policy 1, policy_version 67730 (0.0009) [2023-10-14 03:44:11,353][33226] Updated weights for policy 1, policy_version 67740 (0.0010) [2023-10-14 03:44:11,800][33201] Updated weights for policy 0, policy_version 67110 (0.0009) [2023-10-14 03:44:12,198][33201] Updated weights for policy 0, policy_version 67120 (0.0008) [2023-10-14 03:44:12,571][33201] Updated weights for policy 0, policy_version 67130 (0.0007) [2023-10-14 03:44:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 138117120. Throughput: 0: 1770.3, 1: 1776.4. Samples: 34533730. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:44:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:15,347][33226] Updated weights for policy 1, policy_version 67750 (0.0008) [2023-10-14 03:44:15,712][33226] Updated weights for policy 1, policy_version 67760 (0.0008) [2023-10-14 03:44:16,088][33226] Updated weights for policy 1, policy_version 67770 (0.0007) [2023-10-14 03:44:16,378][33201] Updated weights for policy 0, policy_version 67140 (0.0008) [2023-10-14 03:44:16,754][33201] Updated weights for policy 0, policy_version 67150 (0.0010) [2023-10-14 03:44:17,124][33201] Updated weights for policy 0, policy_version 67160 (0.0009) [2023-10-14 03:44:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 138182656. Throughput: 0: 1750.7, 1: 1776.8. Samples: 34555044. Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:44:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:19,785][33226] Updated weights for policy 1, policy_version 67780 (0.0009) [2023-10-14 03:44:20,153][33226] Updated weights for policy 1, policy_version 67790 (0.0009) [2023-10-14 03:44:20,522][33226] Updated weights for policy 1, policy_version 67800 (0.0008) [2023-10-14 03:44:20,962][33201] Updated weights for policy 0, policy_version 67170 (0.0009) [2023-10-14 03:44:21,323][33201] Updated weights for policy 0, policy_version 67180 (0.0010) [2023-10-14 03:44:21,691][33201] Updated weights for policy 0, policy_version 67190 (0.0008) [2023-10-14 03:44:22,059][33201] Updated weights for policy 0, policy_version 67200 (0.0010) [2023-10-14 03:44:24,303][33226] Updated weights for policy 1, policy_version 67810 (0.0010) [2023-10-14 03:44:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 138248192. Throughput: 0: 1745.8, 1: 1796.4. Samples: 34577050. 
Policy #0 lag: (min: 31.0, avg: 34.0, max: 63.0) [2023-10-14 03:44:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:24,680][33226] Updated weights for policy 1, policy_version 67820 (0.0010) [2023-10-14 03:44:25,035][33226] Updated weights for policy 1, policy_version 67830 (0.0010) [2023-10-14 03:44:25,399][33226] Updated weights for policy 1, policy_version 67840 (0.0010) [2023-10-14 03:44:25,908][33201] Updated weights for policy 0, policy_version 67210 (0.0011) [2023-10-14 03:44:26,275][33201] Updated weights for policy 0, policy_version 67220 (0.0010) [2023-10-14 03:44:26,639][33201] Updated weights for policy 0, policy_version 67230 (0.0011) [2023-10-14 03:44:29,254][33226] Updated weights for policy 1, policy_version 67850 (0.0007) [2023-10-14 03:44:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 138313728. Throughput: 0: 1745.3, 1: 1772.8. Samples: 34586466. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 03:44:29,624][33226] Updated weights for policy 1, policy_version 67860 (0.0009) [2023-10-14 03:44:29,987][33226] Updated weights for policy 1, policy_version 67870 (0.0009) [2023-10-14 03:44:30,576][33201] Updated weights for policy 0, policy_version 67240 (0.0009) [2023-10-14 03:44:30,954][33201] Updated weights for policy 0, policy_version 67250 (0.0009) [2023-10-14 03:44:31,317][33201] Updated weights for policy 0, policy_version 67260 (0.0011) [2023-10-14 03:44:33,735][33226] Updated weights for policy 1, policy_version 67880 (0.0008) [2023-10-14 03:44:34,106][33226] Updated weights for policy 1, policy_version 67890 (0.0008) [2023-10-14 03:44:34,460][33226] Updated weights for policy 1, policy_version 67900 (0.0008) [2023-10-14 03:44:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 138379264. Throughput: 0: 1742.1, 1: 1783.2. 
Samples: 34608438. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:34,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:44:34,939][33201] Updated weights for policy 0, policy_version 67270 (0.0011) [2023-10-14 03:44:35,314][33201] Updated weights for policy 0, policy_version 67280 (0.0008) [2023-10-14 03:44:35,687][33201] Updated weights for policy 0, policy_version 67290 (0.0008) [2023-10-14 03:44:38,225][33226] Updated weights for policy 1, policy_version 67910 (0.0007) [2023-10-14 03:44:38,619][33226] Updated weights for policy 1, policy_version 67920 (0.0009) [2023-10-14 03:44:38,987][33226] Updated weights for policy 1, policy_version 67930 (0.0007) [2023-10-14 03:44:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 138477568. Throughput: 0: 1771.2, 1: 1778.4. Samples: 34629470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:44:39,599][33201] Updated weights for policy 0, policy_version 67300 (0.0008) [2023-10-14 03:44:39,972][33201] Updated weights for policy 0, policy_version 67310 (0.0009) [2023-10-14 03:44:40,348][33201] Updated weights for policy 0, policy_version 67320 (0.0011) [2023-10-14 03:44:42,675][33226] Updated weights for policy 1, policy_version 67940 (0.0008) [2023-10-14 03:44:43,033][33226] Updated weights for policy 1, policy_version 67950 (0.0009) [2023-10-14 03:44:43,395][33226] Updated weights for policy 1, policy_version 67960 (0.0011) [2023-10-14 03:44:44,244][33201] Updated weights for policy 0, policy_version 67330 (0.0010) [2023-10-14 03:44:44,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 138543104. Throughput: 0: 1735.2, 1: 1779.1. Samples: 34639966. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:44:44,603][33201] Updated weights for policy 0, policy_version 67340 (0.0008) [2023-10-14 03:44:44,979][33201] Updated weights for policy 0, policy_version 67350 (0.0011) [2023-10-14 03:44:45,344][33201] Updated weights for policy 0, policy_version 67360 (0.0010) [2023-10-14 03:44:47,274][33226] Updated weights for policy 1, policy_version 67970 (0.0009) [2023-10-14 03:44:47,633][33226] Updated weights for policy 1, policy_version 67980 (0.0009) [2023-10-14 03:44:48,009][33226] Updated weights for policy 1, policy_version 67990 (0.0007) [2023-10-14 03:44:48,375][33226] Updated weights for policy 1, policy_version 68000 (0.0010) [2023-10-14 03:44:49,227][33201] Updated weights for policy 0, policy_version 67370 (0.0008) [2023-10-14 03:44:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 138608640. Throughput: 0: 1762.8, 1: 1782.0. Samples: 34661170. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:44:49,604][33201] Updated weights for policy 0, policy_version 67380 (0.0008) [2023-10-14 03:44:49,972][33201] Updated weights for policy 0, policy_version 67390 (0.0008) [2023-10-14 03:44:52,166][33226] Updated weights for policy 1, policy_version 68010 (0.0008) [2023-10-14 03:44:52,520][33226] Updated weights for policy 1, policy_version 68020 (0.0012) [2023-10-14 03:44:52,889][33226] Updated weights for policy 1, policy_version 68030 (0.0008) [2023-10-14 03:44:53,714][33201] Updated weights for policy 0, policy_version 67400 (0.0010) [2023-10-14 03:44:54,086][33201] Updated weights for policy 0, policy_version 67410 (0.0009) [2023-10-14 03:44:54,463][33201] Updated weights for policy 0, policy_version 67420 (0.0008) [2023-10-14 03:44:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 138674176. Throughput: 0: 1755.2, 1: 1769.4. Samples: 34682030. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:44:56,791][33226] Updated weights for policy 1, policy_version 68040 (0.0009) [2023-10-14 03:44:57,154][33226] Updated weights for policy 1, policy_version 68050 (0.0007) [2023-10-14 03:44:57,522][33226] Updated weights for policy 1, policy_version 68060 (0.0007) [2023-10-14 03:44:58,678][33201] Updated weights for policy 0, policy_version 67430 (0.0008) [2023-10-14 03:44:59,063][33201] Updated weights for policy 0, policy_version 67440 (0.0009) [2023-10-14 03:44:59,439][33201] Updated weights for policy 0, policy_version 67450 (0.0011) [2023-10-14 03:44:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 138739712. Throughput: 0: 1751.9, 1: 1790.3. Samples: 34693128. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:44:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:01,265][33226] Updated weights for policy 1, policy_version 68070 (0.0009) [2023-10-14 03:45:01,639][33226] Updated weights for policy 1, policy_version 68080 (0.0009) [2023-10-14 03:45:02,016][33226] Updated weights for policy 1, policy_version 68090 (0.0007) [2023-10-14 03:45:03,347][33201] Updated weights for policy 0, policy_version 67460 (0.0009) [2023-10-14 03:45:03,701][33201] Updated weights for policy 0, policy_version 67470 (0.0010) [2023-10-14 03:45:04,077][33201] Updated weights for policy 0, policy_version 67480 (0.0011) [2023-10-14 03:45:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 138838016. Throughput: 0: 1765.1, 1: 1766.9. Samples: 34713980. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:45:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:05,923][33226] Updated weights for policy 1, policy_version 68100 (0.0007) [2023-10-14 03:45:06,299][33226] Updated weights for policy 1, policy_version 68110 (0.0009) [2023-10-14 03:45:06,659][33226] Updated weights for policy 1, policy_version 68120 (0.0008) [2023-10-14 03:45:08,002][33201] Updated weights for policy 0, policy_version 67490 (0.0007) [2023-10-14 03:45:08,386][33201] Updated weights for policy 0, policy_version 67500 (0.0010) [2023-10-14 03:45:08,762][33201] Updated weights for policy 0, policy_version 67510 (0.0011) [2023-10-14 03:45:09,130][33201] Updated weights for policy 0, policy_version 67520 (0.0009) [2023-10-14 03:45:09,557][31953] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 138903552. Throughput: 0: 1735.2, 1: 1768.9. Samples: 34734734. 
Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:10,462][33226] Updated weights for policy 1, policy_version 68130 (0.0007) [2023-10-14 03:45:10,839][33226] Updated weights for policy 1, policy_version 68140 (0.0007) [2023-10-14 03:45:11,208][33226] Updated weights for policy 1, policy_version 68150 (0.0011) [2023-10-14 03:45:11,574][33226] Updated weights for policy 1, policy_version 68160 (0.0009) [2023-10-14 03:45:13,062][33201] Updated weights for policy 0, policy_version 67530 (0.0010) [2023-10-14 03:45:13,444][33201] Updated weights for policy 0, policy_version 67540 (0.0011) [2023-10-14 03:45:13,800][33201] Updated weights for policy 0, policy_version 67550 (0.0010) [2023-10-14 03:45:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 138969088. Throughput: 0: 1762.8, 1: 1772.3. Samples: 34745546. Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:15,348][33226] Updated weights for policy 1, policy_version 68170 (0.0008) [2023-10-14 03:45:15,701][33226] Updated weights for policy 1, policy_version 68180 (0.0007) [2023-10-14 03:45:16,071][33226] Updated weights for policy 1, policy_version 68190 (0.0007) [2023-10-14 03:45:17,590][33201] Updated weights for policy 0, policy_version 67560 (0.0008) [2023-10-14 03:45:17,959][33201] Updated weights for policy 0, policy_version 67570 (0.0009) [2023-10-14 03:45:18,327][33201] Updated weights for policy 0, policy_version 67580 (0.0011) [2023-10-14 03:45:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 139034624. Throughput: 0: 1740.2, 1: 1778.0. Samples: 34766756. 
Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:19,861][33226] Updated weights for policy 1, policy_version 68200 (0.0007) [2023-10-14 03:45:20,229][33226] Updated weights for policy 1, policy_version 68210 (0.0007) [2023-10-14 03:45:20,604][33226] Updated weights for policy 1, policy_version 68220 (0.0007) [2023-10-14 03:45:22,115][33201] Updated weights for policy 0, policy_version 67590 (0.0011) [2023-10-14 03:45:22,486][33201] Updated weights for policy 0, policy_version 67600 (0.0010) [2023-10-14 03:45:22,865][33201] Updated weights for policy 0, policy_version 67610 (0.0008) [2023-10-14 03:45:24,471][33226] Updated weights for policy 1, policy_version 68230 (0.0008) [2023-10-14 03:45:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 139100160. Throughput: 0: 1730.5, 1: 1803.6. Samples: 34788502. Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000067616_69238784.pth... [2023-10-14 03:45:24,602][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000065984_67567616.pth [2023-10-14 03:45:24,868][33226] Updated weights for policy 1, policy_version 68240 (0.0009) [2023-10-14 03:45:25,236][33226] Updated weights for policy 1, policy_version 68250 (0.0008) [2023-10-14 03:45:25,449][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000068256_69894144.pth... 
[2023-10-14 03:45:25,488][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000066592_68190208.pth [2023-10-14 03:45:26,704][33201] Updated weights for policy 0, policy_version 67620 (0.0008) [2023-10-14 03:45:27,083][33201] Updated weights for policy 0, policy_version 67630 (0.0009) [2023-10-14 03:45:27,440][33201] Updated weights for policy 0, policy_version 67640 (0.0008) [2023-10-14 03:45:29,014][33226] Updated weights for policy 1, policy_version 68260 (0.0007) [2023-10-14 03:45:29,381][33226] Updated weights for policy 1, policy_version 68270 (0.0009) [2023-10-14 03:45:29,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 139165696. Throughput: 0: 1751.5, 1: 1775.3. Samples: 34798676. Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:29,560][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:29,751][33226] Updated weights for policy 1, policy_version 68280 (0.0009) [2023-10-14 03:45:31,305][33201] Updated weights for policy 0, policy_version 67650 (0.0009) [2023-10-14 03:45:31,674][33201] Updated weights for policy 0, policy_version 67660 (0.0008) [2023-10-14 03:45:32,053][33201] Updated weights for policy 0, policy_version 67670 (0.0008) [2023-10-14 03:45:32,417][33201] Updated weights for policy 0, policy_version 67680 (0.0008) [2023-10-14 03:45:33,576][33226] Updated weights for policy 1, policy_version 68290 (0.0007) [2023-10-14 03:45:33,944][33226] Updated weights for policy 1, policy_version 68300 (0.0007) [2023-10-14 03:45:34,305][33226] Updated weights for policy 1, policy_version 68310 (0.0008) [2023-10-14 03:45:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 139231232. Throughput: 0: 1731.2, 1: 1791.5. Samples: 34819692. 
Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:34,679][33226] Updated weights for policy 1, policy_version 68320 (0.0010) [2023-10-14 03:45:36,216][33201] Updated weights for policy 0, policy_version 67690 (0.0007) [2023-10-14 03:45:36,581][33201] Updated weights for policy 0, policy_version 67700 (0.0009) [2023-10-14 03:45:36,958][33201] Updated weights for policy 0, policy_version 67710 (0.0010) [2023-10-14 03:45:38,478][33226] Updated weights for policy 1, policy_version 68330 (0.0009) [2023-10-14 03:45:38,850][33226] Updated weights for policy 1, policy_version 68340 (0.0008) [2023-10-14 03:45:39,210][33226] Updated weights for policy 1, policy_version 68350 (0.0010) [2023-10-14 03:45:39,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 139329536. Throughput: 0: 1751.4, 1: 1780.4. Samples: 34840958. Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:45:40,751][33201] Updated weights for policy 0, policy_version 67720 (0.0008) [2023-10-14 03:45:41,128][33201] Updated weights for policy 0, policy_version 67730 (0.0008) [2023-10-14 03:45:41,485][33201] Updated weights for policy 0, policy_version 67740 (0.0010) [2023-10-14 03:45:42,968][33226] Updated weights for policy 1, policy_version 68360 (0.0008) [2023-10-14 03:45:43,334][33226] Updated weights for policy 1, policy_version 68370 (0.0007) [2023-10-14 03:45:43,696][33226] Updated weights for policy 1, policy_version 68380 (0.0007) [2023-10-14 03:45:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 139395072. Throughput: 0: 1741.1, 1: 1782.7. Samples: 34851698. 
Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.980')] [2023-10-14 03:45:45,391][33201] Updated weights for policy 0, policy_version 67750 (0.0010) [2023-10-14 03:45:45,768][33201] Updated weights for policy 0, policy_version 67760 (0.0011) [2023-10-14 03:45:46,138][33201] Updated weights for policy 0, policy_version 67770 (0.0010) [2023-10-14 03:45:47,451][33226] Updated weights for policy 1, policy_version 68390 (0.0010) [2023-10-14 03:45:47,811][33226] Updated weights for policy 1, policy_version 68400 (0.0011) [2023-10-14 03:45:48,176][33226] Updated weights for policy 1, policy_version 68410 (0.0007) [2023-10-14 03:45:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 139460608. Throughput: 0: 1748.8, 1: 1784.5. Samples: 34872978. Policy #0 lag: (min: 10.0, avg: 21.7, max: 42.0) [2023-10-14 03:45:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 03:45:49,808][33201] Updated weights for policy 0, policy_version 67780 (0.0009) [2023-10-14 03:45:50,180][33201] Updated weights for policy 0, policy_version 67790 (0.0009) [2023-10-14 03:45:50,541][33201] Updated weights for policy 0, policy_version 67800 (0.0007) [2023-10-14 03:45:51,772][33226] Updated weights for policy 1, policy_version 68420 (0.0008) [2023-10-14 03:45:52,136][33226] Updated weights for policy 1, policy_version 68430 (0.0008) [2023-10-14 03:45:52,498][33226] Updated weights for policy 1, policy_version 68440 (0.0008) [2023-10-14 03:45:54,404][33201] Updated weights for policy 0, policy_version 67810 (0.0008) [2023-10-14 03:45:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 139526144. Throughput: 0: 1781.9, 1: 1772.9. Samples: 34894700. 
Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:45:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.920')] [2023-10-14 03:45:54,785][33201] Updated weights for policy 0, policy_version 67820 (0.0007) [2023-10-14 03:45:55,154][33201] Updated weights for policy 0, policy_version 67830 (0.0007) [2023-10-14 03:45:55,526][33201] Updated weights for policy 0, policy_version 67840 (0.0008) [2023-10-14 03:45:56,407][33226] Updated weights for policy 1, policy_version 68450 (0.0008) [2023-10-14 03:45:56,771][33226] Updated weights for policy 1, policy_version 68460 (0.0011) [2023-10-14 03:45:57,136][33226] Updated weights for policy 1, policy_version 68470 (0.0010) [2023-10-14 03:45:57,508][33226] Updated weights for policy 1, policy_version 68480 (0.0009) [2023-10-14 03:45:59,178][33201] Updated weights for policy 0, policy_version 67850 (0.0009) [2023-10-14 03:45:59,556][33201] Updated weights for policy 0, policy_version 67860 (0.0007) [2023-10-14 03:45:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 139591680. Throughput: 0: 1754.5, 1: 1787.0. Samples: 34904914. 
Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:45:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.870')] [2023-10-14 03:45:59,925][33201] Updated weights for policy 0, policy_version 67870 (0.0007) [2023-10-14 03:46:01,401][33226] Updated weights for policy 1, policy_version 68490 (0.0009) [2023-10-14 03:46:01,767][33226] Updated weights for policy 1, policy_version 68500 (0.0009) [2023-10-14 03:46:02,130][33226] Updated weights for policy 1, policy_version 68510 (0.0008) [2023-10-14 03:46:03,806][33201] Updated weights for policy 0, policy_version 67880 (0.0009) [2023-10-14 03:46:04,176][33201] Updated weights for policy 0, policy_version 67890 (0.0007) [2023-10-14 03:46:04,556][33201] Updated weights for policy 0, policy_version 67900 (0.0009) [2023-10-14 03:46:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 139657216. Throughput: 0: 1777.8, 1: 1772.0. Samples: 34926498. Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:04,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.850')] [2023-10-14 03:46:05,842][33226] Updated weights for policy 1, policy_version 68520 (0.0009) [2023-10-14 03:46:06,207][33226] Updated weights for policy 1, policy_version 68530 (0.0009) [2023-10-14 03:46:06,566][33226] Updated weights for policy 1, policy_version 68540 (0.0008) [2023-10-14 03:46:08,276][33201] Updated weights for policy 0, policy_version 67910 (0.0009) [2023-10-14 03:46:08,642][33201] Updated weights for policy 0, policy_version 67920 (0.0009) [2023-10-14 03:46:09,011][33201] Updated weights for policy 0, policy_version 67930 (0.0011) [2023-10-14 03:46:09,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 139755520. Throughput: 0: 1766.8, 1: 1765.0. Samples: 34947434. 
Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:09,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 03:46:10,501][33226] Updated weights for policy 1, policy_version 68550 (0.0009) [2023-10-14 03:46:10,879][33226] Updated weights for policy 1, policy_version 68560 (0.0007) [2023-10-14 03:46:11,243][33226] Updated weights for policy 1, policy_version 68570 (0.0008) [2023-10-14 03:46:12,761][33201] Updated weights for policy 0, policy_version 67940 (0.0009) [2023-10-14 03:46:13,134][33201] Updated weights for policy 0, policy_version 67950 (0.0011) [2023-10-14 03:46:13,505][33201] Updated weights for policy 0, policy_version 67960 (0.0009) [2023-10-14 03:46:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 139821056. Throughput: 0: 1779.4, 1: 1768.0. Samples: 34958308. Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:14,559][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 03:46:15,155][33226] Updated weights for policy 1, policy_version 68580 (0.0009) [2023-10-14 03:46:15,518][33226] Updated weights for policy 1, policy_version 68590 (0.0011) [2023-10-14 03:46:15,886][33226] Updated weights for policy 1, policy_version 68600 (0.0010) [2023-10-14 03:46:17,360][33201] Updated weights for policy 0, policy_version 67970 (0.0009) [2023-10-14 03:46:17,734][33201] Updated weights for policy 0, policy_version 67980 (0.0007) [2023-10-14 03:46:18,099][33201] Updated weights for policy 0, policy_version 67990 (0.0010) [2023-10-14 03:46:18,471][33201] Updated weights for policy 0, policy_version 68000 (0.0008) [2023-10-14 03:46:19,505][33226] Updated weights for policy 1, policy_version 68610 (0.0008) [2023-10-14 03:46:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 139886592. Throughput: 0: 1780.3, 1: 1774.0. Samples: 34979638. 
Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.810')] [2023-10-14 03:46:19,877][33226] Updated weights for policy 1, policy_version 68620 (0.0011) [2023-10-14 03:46:20,256][33226] Updated weights for policy 1, policy_version 68630 (0.0009) [2023-10-14 03:46:20,623][33226] Updated weights for policy 1, policy_version 68640 (0.0009) [2023-10-14 03:46:22,332][33201] Updated weights for policy 0, policy_version 68010 (0.0008) [2023-10-14 03:46:22,698][33201] Updated weights for policy 0, policy_version 68020 (0.0007) [2023-10-14 03:46:23,063][33201] Updated weights for policy 0, policy_version 68030 (0.0008) [2023-10-14 03:46:24,466][33226] Updated weights for policy 1, policy_version 68650 (0.0008) [2023-10-14 03:46:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 139952128. Throughput: 0: 1764.6, 1: 1799.4. Samples: 35001336. Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 03:46:24,835][33226] Updated weights for policy 1, policy_version 68660 (0.0009) [2023-10-14 03:46:25,199][33226] Updated weights for policy 1, policy_version 68670 (0.0008) [2023-10-14 03:46:27,013][33201] Updated weights for policy 0, policy_version 68040 (0.0009) [2023-10-14 03:46:27,384][33201] Updated weights for policy 0, policy_version 68050 (0.0008) [2023-10-14 03:46:27,759][33201] Updated weights for policy 0, policy_version 68060 (0.0008) [2023-10-14 03:46:29,010][33226] Updated weights for policy 1, policy_version 68680 (0.0010) [2023-10-14 03:46:29,383][33226] Updated weights for policy 1, policy_version 68690 (0.0008) [2023-10-14 03:46:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 140017664. Throughput: 0: 1783.6, 1: 1777.0. Samples: 35011924. 
Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:29,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 03:46:29,749][33226] Updated weights for policy 1, policy_version 68700 (0.0008) [2023-10-14 03:46:31,582][33201] Updated weights for policy 0, policy_version 68070 (0.0007) [2023-10-14 03:46:31,946][33201] Updated weights for policy 0, policy_version 68080 (0.0007) [2023-10-14 03:46:32,310][33201] Updated weights for policy 0, policy_version 68090 (0.0007) [2023-10-14 03:46:33,593][33226] Updated weights for policy 1, policy_version 68710 (0.0008) [2023-10-14 03:46:33,963][33226] Updated weights for policy 1, policy_version 68720 (0.0012) [2023-10-14 03:46:34,326][33226] Updated weights for policy 1, policy_version 68730 (0.0010) [2023-10-14 03:46:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 140115968. Throughput: 0: 1762.7, 1: 1795.0. Samples: 35033074. Policy #0 lag: (min: 8.0, avg: 37.2, max: 40.0) [2023-10-14 03:46:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 03:46:36,147][33201] Updated weights for policy 0, policy_version 68100 (0.0008) [2023-10-14 03:46:36,532][33201] Updated weights for policy 0, policy_version 68110 (0.0009) [2023-10-14 03:46:36,906][33201] Updated weights for policy 0, policy_version 68120 (0.0008) [2023-10-14 03:46:38,040][33226] Updated weights for policy 1, policy_version 68740 (0.0009) [2023-10-14 03:46:38,412][33226] Updated weights for policy 1, policy_version 68750 (0.0007) [2023-10-14 03:46:38,771][33226] Updated weights for policy 1, policy_version 68760 (0.0008) [2023-10-14 03:46:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 140181504. Throughput: 0: 1762.8, 1: 1777.2. Samples: 35054000. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:46:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 03:46:40,643][33201] Updated weights for policy 0, policy_version 68130 (0.0007) [2023-10-14 03:46:41,013][33201] Updated weights for policy 0, policy_version 68140 (0.0007) [2023-10-14 03:46:41,388][33201] Updated weights for policy 0, policy_version 68150 (0.0008) [2023-10-14 03:46:41,755][33201] Updated weights for policy 0, policy_version 68160 (0.0009) [2023-10-14 03:46:42,404][33226] Updated weights for policy 1, policy_version 68770 (0.0010) [2023-10-14 03:46:42,772][33226] Updated weights for policy 1, policy_version 68780 (0.0010) [2023-10-14 03:46:43,138][33226] Updated weights for policy 1, policy_version 68790 (0.0008) [2023-10-14 03:46:43,510][33226] Updated weights for policy 1, policy_version 68800 (0.0011) [2023-10-14 03:46:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 140247040. Throughput: 0: 1764.9, 1: 1796.3. Samples: 35065170. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:46:44,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 03:46:45,266][33201] Updated weights for policy 0, policy_version 68170 (0.0007) [2023-10-14 03:46:45,637][33201] Updated weights for policy 0, policy_version 68180 (0.0011) [2023-10-14 03:46:46,012][33201] Updated weights for policy 0, policy_version 68190 (0.0009) [2023-10-14 03:46:47,295][33226] Updated weights for policy 1, policy_version 68810 (0.0007) [2023-10-14 03:46:47,652][33226] Updated weights for policy 1, policy_version 68820 (0.0007) [2023-10-14 03:46:48,018][33226] Updated weights for policy 1, policy_version 68830 (0.0007) [2023-10-14 03:46:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 140312576. Throughput: 0: 1767.2, 1: 1782.3. Samples: 35086226. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:46:49,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.830')] [2023-10-14 03:46:50,045][33201] Updated weights for policy 0, policy_version 68200 (0.0009) [2023-10-14 03:46:50,411][33201] Updated weights for policy 0, policy_version 68210 (0.0008) [2023-10-14 03:46:50,780][33201] Updated weights for policy 0, policy_version 68220 (0.0007) [2023-10-14 03:46:51,787][33226] Updated weights for policy 1, policy_version 68840 (0.0009) [2023-10-14 03:46:52,154][33226] Updated weights for policy 1, policy_version 68850 (0.0008) [2023-10-14 03:46:52,516][33226] Updated weights for policy 1, policy_version 68860 (0.0009) [2023-10-14 03:46:54,545][33201] Updated weights for policy 0, policy_version 68230 (0.0008) [2023-10-14 03:46:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 140378112. Throughput: 0: 1792.6, 1: 1773.7. Samples: 35107918. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:46:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.830')] [2023-10-14 03:46:54,909][33201] Updated weights for policy 0, policy_version 68240 (0.0010) [2023-10-14 03:46:55,287][33201] Updated weights for policy 0, policy_version 68250 (0.0008) [2023-10-14 03:46:56,394][33226] Updated weights for policy 1, policy_version 68870 (0.0010) [2023-10-14 03:46:56,782][33226] Updated weights for policy 1, policy_version 68880 (0.0007) [2023-10-14 03:46:57,146][33226] Updated weights for policy 1, policy_version 68890 (0.0007) [2023-10-14 03:46:59,193][33201] Updated weights for policy 0, policy_version 68260 (0.0008) [2023-10-14 03:46:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 140443648. Throughput: 0: 1762.7, 1: 1789.3. Samples: 35118148. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:46:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.810')] [2023-10-14 03:46:59,561][33201] Updated weights for policy 0, policy_version 68270 (0.0008) [2023-10-14 03:46:59,941][33201] Updated weights for policy 0, policy_version 68280 (0.0010) [2023-10-14 03:47:01,004][33226] Updated weights for policy 1, policy_version 68900 (0.0009) [2023-10-14 03:47:01,357][33226] Updated weights for policy 1, policy_version 68910 (0.0011) [2023-10-14 03:47:01,721][33226] Updated weights for policy 1, policy_version 68920 (0.0008) [2023-10-14 03:47:03,723][33201] Updated weights for policy 0, policy_version 68290 (0.0008) [2023-10-14 03:47:04,097][33201] Updated weights for policy 0, policy_version 68300 (0.0009) [2023-10-14 03:47:04,467][33201] Updated weights for policy 0, policy_version 68310 (0.0009) [2023-10-14 03:47:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 140509184. Throughput: 0: 1786.4, 1: 1772.6. Samples: 35139794. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:47:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.790')] [2023-10-14 03:47:04,839][33201] Updated weights for policy 0, policy_version 68320 (0.0010) [2023-10-14 03:47:05,340][33226] Updated weights for policy 1, policy_version 68930 (0.0007) [2023-10-14 03:47:05,700][33226] Updated weights for policy 1, policy_version 68940 (0.0007) [2023-10-14 03:47:06,071][33226] Updated weights for policy 1, policy_version 68950 (0.0009) [2023-10-14 03:47:06,439][33226] Updated weights for policy 1, policy_version 68960 (0.0008) [2023-10-14 03:47:08,665][33201] Updated weights for policy 0, policy_version 68330 (0.0008) [2023-10-14 03:47:09,045][33201] Updated weights for policy 0, policy_version 68340 (0.0009) [2023-10-14 03:47:09,420][33201] Updated weights for policy 0, policy_version 68350 (0.0008) [2023-10-14 03:47:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 140607488. Throughput: 0: 1775.8, 1: 1777.0. Samples: 35161214. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:47:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.780')] [2023-10-14 03:47:10,141][33226] Updated weights for policy 1, policy_version 68970 (0.0008) [2023-10-14 03:47:10,494][33226] Updated weights for policy 1, policy_version 68980 (0.0008) [2023-10-14 03:47:10,869][33226] Updated weights for policy 1, policy_version 68990 (0.0007) [2023-10-14 03:47:13,092][33201] Updated weights for policy 0, policy_version 68360 (0.0007) [2023-10-14 03:47:13,457][33201] Updated weights for policy 0, policy_version 68370 (0.0007) [2023-10-14 03:47:13,840][33201] Updated weights for policy 0, policy_version 68380 (0.0007) [2023-10-14 03:47:14,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 140673024. Throughput: 0: 1778.8, 1: 1776.3. Samples: 35171902. 
Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:47:14,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.780')] [2023-10-14 03:47:14,631][33226] Updated weights for policy 1, policy_version 69000 (0.0007) [2023-10-14 03:47:14,995][33226] Updated weights for policy 1, policy_version 69010 (0.0009) [2023-10-14 03:47:15,350][33226] Updated weights for policy 1, policy_version 69020 (0.0010) [2023-10-14 03:47:17,655][33201] Updated weights for policy 0, policy_version 68390 (0.0009) [2023-10-14 03:47:18,022][33201] Updated weights for policy 0, policy_version 68400 (0.0011) [2023-10-14 03:47:18,391][33201] Updated weights for policy 0, policy_version 68410 (0.0011) [2023-10-14 03:47:19,220][33226] Updated weights for policy 1, policy_version 69030 (0.0010) [2023-10-14 03:47:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 140738560. Throughput: 0: 1784.6, 1: 1776.0. Samples: 35193304. Policy #0 lag: (min: 21.0, avg: 21.0, max: 21.0) [2023-10-14 03:47:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.780')] [2023-10-14 03:47:19,584][33226] Updated weights for policy 1, policy_version 69040 (0.0011) [2023-10-14 03:47:19,945][33226] Updated weights for policy 1, policy_version 69050 (0.0008) [2023-10-14 03:47:22,298][33201] Updated weights for policy 0, policy_version 68420 (0.0007) [2023-10-14 03:47:22,674][33201] Updated weights for policy 0, policy_version 68430 (0.0007) [2023-10-14 03:47:23,050][33201] Updated weights for policy 0, policy_version 68440 (0.0008) [2023-10-14 03:47:23,724][33226] Updated weights for policy 1, policy_version 69060 (0.0009) [2023-10-14 03:47:24,086][33226] Updated weights for policy 1, policy_version 69070 (0.0008) [2023-10-14 03:47:24,451][33226] Updated weights for policy 1, policy_version 69080 (0.0007) [2023-10-14 03:47:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 140804096. Throughput: 0: 1765.7, 1: 1798.5. 
Samples: 35214390. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.780')] [2023-10-14 03:47:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000068448_70090752.pth... [2023-10-14 03:47:24,604][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000066784_68386816.pth [2023-10-14 03:47:24,750][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000069088_70746112.pth... [2023-10-14 03:47:24,778][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000067424_69042176.pth [2023-10-14 03:47:26,920][33201] Updated weights for policy 0, policy_version 68450 (0.0008) [2023-10-14 03:47:27,291][33201] Updated weights for policy 0, policy_version 68460 (0.0007) [2023-10-14 03:47:27,659][33201] Updated weights for policy 0, policy_version 68470 (0.0007) [2023-10-14 03:47:28,030][33201] Updated weights for policy 0, policy_version 68480 (0.0007) [2023-10-14 03:47:28,214][33226] Updated weights for policy 1, policy_version 69090 (0.0007) [2023-10-14 03:47:28,586][33226] Updated weights for policy 1, policy_version 69100 (0.0008) [2023-10-14 03:47:28,953][33226] Updated weights for policy 1, policy_version 69110 (0.0007) [2023-10-14 03:47:29,326][33226] Updated weights for policy 1, policy_version 69120 (0.0007) [2023-10-14 03:47:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 140902400. Throughput: 0: 1792.3, 1: 1775.7. Samples: 35225730. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:29,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.780')] [2023-10-14 03:47:31,790][33201] Updated weights for policy 0, policy_version 68490 (0.0008) [2023-10-14 03:47:32,159][33201] Updated weights for policy 0, policy_version 68500 (0.0007) [2023-10-14 03:47:32,535][33201] Updated weights for policy 0, policy_version 68510 (0.0007) [2023-10-14 03:47:33,175][33226] Updated weights for policy 1, policy_version 69130 (0.0007) [2023-10-14 03:47:33,550][33226] Updated weights for policy 1, policy_version 69140 (0.0008) [2023-10-14 03:47:33,911][33226] Updated weights for policy 1, policy_version 69150 (0.0009) [2023-10-14 03:47:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 140967936. Throughput: 0: 1762.4, 1: 1802.7. Samples: 35246652. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.790')] [2023-10-14 03:47:36,369][33201] Updated weights for policy 0, policy_version 68520 (0.0007) [2023-10-14 03:47:36,743][33201] Updated weights for policy 0, policy_version 68530 (0.0011) [2023-10-14 03:47:37,108][33201] Updated weights for policy 0, policy_version 68540 (0.0008) [2023-10-14 03:47:37,657][33226] Updated weights for policy 1, policy_version 69160 (0.0008) [2023-10-14 03:47:38,023][33226] Updated weights for policy 1, policy_version 69170 (0.0008) [2023-10-14 03:47:38,396][33226] Updated weights for policy 1, policy_version 69180 (0.0010) [2023-10-14 03:47:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 141033472. Throughput: 0: 1764.6, 1: 1785.2. Samples: 35267656. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.790')] [2023-10-14 03:47:40,848][33201] Updated weights for policy 0, policy_version 68550 (0.0008) [2023-10-14 03:47:41,216][33201] Updated weights for policy 0, policy_version 68560 (0.0008) [2023-10-14 03:47:41,591][33201] Updated weights for policy 0, policy_version 68570 (0.0008) [2023-10-14 03:47:42,223][33226] Updated weights for policy 1, policy_version 69190 (0.0007) [2023-10-14 03:47:42,625][33226] Updated weights for policy 1, policy_version 69200 (0.0007) [2023-10-14 03:47:42,991][33226] Updated weights for policy 1, policy_version 69210 (0.0008) [2023-10-14 03:47:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 141099008. Throughput: 0: 1766.2, 1: 1803.2. Samples: 35278770. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:44,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 03:47:45,327][33201] Updated weights for policy 0, policy_version 68580 (0.0009) [2023-10-14 03:47:45,697][33201] Updated weights for policy 0, policy_version 68590 (0.0010) [2023-10-14 03:47:46,061][33201] Updated weights for policy 0, policy_version 68600 (0.0011) [2023-10-14 03:47:46,953][33226] Updated weights for policy 1, policy_version 69220 (0.0008) [2023-10-14 03:47:47,317][33226] Updated weights for policy 1, policy_version 69230 (0.0008) [2023-10-14 03:47:47,675][33226] Updated weights for policy 1, policy_version 69240 (0.0009) [2023-10-14 03:47:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 141164544. Throughput: 0: 1763.1, 1: 1785.5. Samples: 35299480. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:49,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:47:49,915][33201] Updated weights for policy 0, policy_version 68610 (0.0010) [2023-10-14 03:47:50,284][33201] Updated weights for policy 0, policy_version 68620 (0.0008) [2023-10-14 03:47:50,666][33201] Updated weights for policy 0, policy_version 68630 (0.0008) [2023-10-14 03:47:51,035][33201] Updated weights for policy 0, policy_version 68640 (0.0007) [2023-10-14 03:47:51,425][33226] Updated weights for policy 1, policy_version 69250 (0.0011) [2023-10-14 03:47:51,787][33226] Updated weights for policy 1, policy_version 69260 (0.0009) [2023-10-14 03:47:52,155][33226] Updated weights for policy 1, policy_version 69270 (0.0008) [2023-10-14 03:47:52,517][33226] Updated weights for policy 1, policy_version 69280 (0.0007) [2023-10-14 03:47:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 141230080. Throughput: 0: 1783.4, 1: 1775.8. Samples: 35321378. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.900')] [2023-10-14 03:47:54,812][33201] Updated weights for policy 0, policy_version 68650 (0.0011) [2023-10-14 03:47:55,184][33201] Updated weights for policy 0, policy_version 68660 (0.0010) [2023-10-14 03:47:55,545][33201] Updated weights for policy 0, policy_version 68670 (0.0009) [2023-10-14 03:47:56,061][33226] Updated weights for policy 1, policy_version 69290 (0.0009) [2023-10-14 03:47:56,432][33226] Updated weights for policy 1, policy_version 69300 (0.0009) [2023-10-14 03:47:56,794][33226] Updated weights for policy 1, policy_version 69310 (0.0009) [2023-10-14 03:47:59,220][33201] Updated weights for policy 0, policy_version 68680 (0.0009) [2023-10-14 03:47:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 141295616. Throughput: 0: 1762.8, 1: 1780.8. 
Samples: 35331364. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:47:59,562][31953] Avg episode reward: [(0, '20.940'), (1, '20.900')] [2023-10-14 03:47:59,592][33201] Updated weights for policy 0, policy_version 68690 (0.0007) [2023-10-14 03:47:59,955][33201] Updated weights for policy 0, policy_version 68700 (0.0007) [2023-10-14 03:48:00,570][33226] Updated weights for policy 1, policy_version 69320 (0.0011) [2023-10-14 03:48:00,935][33226] Updated weights for policy 1, policy_version 69330 (0.0010) [2023-10-14 03:48:01,297][33226] Updated weights for policy 1, policy_version 69340 (0.0008) [2023-10-14 03:48:03,886][33201] Updated weights for policy 0, policy_version 68710 (0.0010) [2023-10-14 03:48:04,258][33201] Updated weights for policy 0, policy_version 68720 (0.0009) [2023-10-14 03:48:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 141361152. Throughput: 0: 1776.8, 1: 1784.4. Samples: 35353560. Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:48:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.920')] [2023-10-14 03:48:04,630][33201] Updated weights for policy 0, policy_version 68730 (0.0010) [2023-10-14 03:48:05,016][33226] Updated weights for policy 1, policy_version 69350 (0.0007) [2023-10-14 03:48:05,375][33226] Updated weights for policy 1, policy_version 69360 (0.0010) [2023-10-14 03:48:05,753][33226] Updated weights for policy 1, policy_version 69370 (0.0007) [2023-10-14 03:48:08,535][33201] Updated weights for policy 0, policy_version 68740 (0.0008) [2023-10-14 03:48:08,939][33201] Updated weights for policy 0, policy_version 68750 (0.0007) [2023-10-14 03:48:09,312][33201] Updated weights for policy 0, policy_version 68760 (0.0009) [2023-10-14 03:48:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 141426688. Throughput: 0: 1774.4, 1: 1789.4. Samples: 35374764. 
Policy #0 lag: (min: 31.0, avg: 33.6, max: 63.0) [2023-10-14 03:48:09,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.920')] [2023-10-14 03:48:09,603][33226] Updated weights for policy 1, policy_version 69380 (0.0010) [2023-10-14 03:48:09,974][33226] Updated weights for policy 1, policy_version 69390 (0.0010) [2023-10-14 03:48:10,334][33226] Updated weights for policy 1, policy_version 69400 (0.0007) [2023-10-14 03:48:13,059][33201] Updated weights for policy 0, policy_version 68770 (0.0011) [2023-10-14 03:48:13,446][33201] Updated weights for policy 0, policy_version 68780 (0.0009) [2023-10-14 03:48:13,813][33201] Updated weights for policy 0, policy_version 68790 (0.0010) [2023-10-14 03:48:14,087][33226] Updated weights for policy 1, policy_version 69410 (0.0008) [2023-10-14 03:48:14,168][33201] Updated weights for policy 0, policy_version 68800 (0.0010) [2023-10-14 03:48:14,462][33226] Updated weights for policy 1, policy_version 69420 (0.0008) [2023-10-14 03:48:14,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 141524992. Throughput: 0: 1766.7, 1: 1778.8. Samples: 35385280. 
Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.890')] [2023-10-14 03:48:14,822][33226] Updated weights for policy 1, policy_version 69430 (0.0007) [2023-10-14 03:48:15,190][33226] Updated weights for policy 1, policy_version 69440 (0.0009) [2023-10-14 03:48:17,905][33201] Updated weights for policy 0, policy_version 68810 (0.0008) [2023-10-14 03:48:18,282][33201] Updated weights for policy 0, policy_version 68820 (0.0007) [2023-10-14 03:48:18,643][33201] Updated weights for policy 0, policy_version 68830 (0.0009) [2023-10-14 03:48:19,037][33226] Updated weights for policy 1, policy_version 69450 (0.0011) [2023-10-14 03:48:19,401][33226] Updated weights for policy 1, policy_version 69460 (0.0008) [2023-10-14 03:48:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 141590528. Throughput: 0: 1784.0, 1: 1779.5. Samples: 35407008. Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.870')] [2023-10-14 03:48:19,778][33226] Updated weights for policy 1, policy_version 69470 (0.0008) [2023-10-14 03:48:22,414][33201] Updated weights for policy 0, policy_version 68840 (0.0007) [2023-10-14 03:48:22,788][33201] Updated weights for policy 0, policy_version 68850 (0.0007) [2023-10-14 03:48:23,160][33201] Updated weights for policy 0, policy_version 68860 (0.0007) [2023-10-14 03:48:23,658][33226] Updated weights for policy 1, policy_version 69480 (0.0008) [2023-10-14 03:48:24,020][33226] Updated weights for policy 1, policy_version 69490 (0.0009) [2023-10-14 03:48:24,382][33226] Updated weights for policy 1, policy_version 69500 (0.0008) [2023-10-14 03:48:24,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 141688832. Throughput: 0: 1763.7, 1: 1790.0. Samples: 35427574. 
Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 03:48:26,912][33201] Updated weights for policy 0, policy_version 68870 (0.0007) [2023-10-14 03:48:27,294][33201] Updated weights for policy 0, policy_version 68880 (0.0009) [2023-10-14 03:48:27,657][33201] Updated weights for policy 0, policy_version 68890 (0.0009) [2023-10-14 03:48:28,113][33226] Updated weights for policy 1, policy_version 69510 (0.0008) [2023-10-14 03:48:28,489][33226] Updated weights for policy 1, policy_version 69520 (0.0009) [2023-10-14 03:48:28,860][33226] Updated weights for policy 1, policy_version 69530 (0.0008) [2023-10-14 03:48:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 141754368. Throughput: 0: 1785.4, 1: 1772.1. Samples: 35438858. Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 03:48:31,474][33201] Updated weights for policy 0, policy_version 68900 (0.0008) [2023-10-14 03:48:31,845][33201] Updated weights for policy 0, policy_version 68910 (0.0009) [2023-10-14 03:48:32,211][33201] Updated weights for policy 0, policy_version 68920 (0.0009) [2023-10-14 03:48:32,752][33226] Updated weights for policy 1, policy_version 69540 (0.0009) [2023-10-14 03:48:33,120][33226] Updated weights for policy 1, policy_version 69550 (0.0007) [2023-10-14 03:48:33,484][33226] Updated weights for policy 1, policy_version 69560 (0.0008) [2023-10-14 03:48:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 141819904. Throughput: 0: 1762.1, 1: 1791.4. Samples: 35459390. 
Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.780')] [2023-10-14 03:48:35,936][33201] Updated weights for policy 0, policy_version 68930 (0.0008) [2023-10-14 03:48:36,300][33201] Updated weights for policy 0, policy_version 68940 (0.0008) [2023-10-14 03:48:36,668][33201] Updated weights for policy 0, policy_version 68950 (0.0007) [2023-10-14 03:48:37,037][33201] Updated weights for policy 0, policy_version 68960 (0.0009) [2023-10-14 03:48:37,468][33226] Updated weights for policy 1, policy_version 69570 (0.0009) [2023-10-14 03:48:37,839][33226] Updated weights for policy 1, policy_version 69580 (0.0008) [2023-10-14 03:48:38,202][33226] Updated weights for policy 1, policy_version 69590 (0.0009) [2023-10-14 03:48:38,575][33226] Updated weights for policy 1, policy_version 69600 (0.0008) [2023-10-14 03:48:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 141885440. Throughput: 0: 1772.2, 1: 1765.3. Samples: 35480566. Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.780')] [2023-10-14 03:48:40,849][33201] Updated weights for policy 0, policy_version 68970 (0.0007) [2023-10-14 03:48:41,221][33201] Updated weights for policy 0, policy_version 68980 (0.0009) [2023-10-14 03:48:41,586][33201] Updated weights for policy 0, policy_version 68990 (0.0010) [2023-10-14 03:48:42,289][33226] Updated weights for policy 1, policy_version 69610 (0.0011) [2023-10-14 03:48:42,656][33226] Updated weights for policy 1, policy_version 69620 (0.0011) [2023-10-14 03:48:43,020][33226] Updated weights for policy 1, policy_version 69630 (0.0008) [2023-10-14 03:48:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 141950976. Throughput: 0: 1768.5, 1: 1795.5. Samples: 35491746. 
Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:44,559][31953] Avg episode reward: [(0, '20.980'), (1, '20.760')] [2023-10-14 03:48:45,468][33201] Updated weights for policy 0, policy_version 69000 (0.0008) [2023-10-14 03:48:45,835][33201] Updated weights for policy 0, policy_version 69010 (0.0010) [2023-10-14 03:48:46,204][33201] Updated weights for policy 0, policy_version 69020 (0.0009) [2023-10-14 03:48:46,776][33226] Updated weights for policy 1, policy_version 69640 (0.0008) [2023-10-14 03:48:47,143][33226] Updated weights for policy 1, policy_version 69650 (0.0010) [2023-10-14 03:48:47,504][33226] Updated weights for policy 1, policy_version 69660 (0.0007) [2023-10-14 03:48:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 142016512. Throughput: 0: 1771.4, 1: 1769.4. Samples: 35512894. Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:49,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.740')] [2023-10-14 03:48:49,959][33201] Updated weights for policy 0, policy_version 69030 (0.0010) [2023-10-14 03:48:50,329][33201] Updated weights for policy 0, policy_version 69040 (0.0007) [2023-10-14 03:48:50,699][33201] Updated weights for policy 0, policy_version 69050 (0.0009) [2023-10-14 03:48:51,232][33226] Updated weights for policy 1, policy_version 69670 (0.0008) [2023-10-14 03:48:51,601][33226] Updated weights for policy 1, policy_version 69680 (0.0010) [2023-10-14 03:48:51,965][33226] Updated weights for policy 1, policy_version 69690 (0.0009) [2023-10-14 03:48:54,389][33201] Updated weights for policy 0, policy_version 69060 (0.0008) [2023-10-14 03:48:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 142082048. Throughput: 0: 1796.5, 1: 1768.0. Samples: 35535164. 
Policy #0 lag: (min: 5.0, avg: 7.6, max: 37.0) [2023-10-14 03:48:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.770')] [2023-10-14 03:48:54,786][33201] Updated weights for policy 0, policy_version 69070 (0.0009) [2023-10-14 03:48:55,152][33201] Updated weights for policy 0, policy_version 69080 (0.0008) [2023-10-14 03:48:55,798][33226] Updated weights for policy 1, policy_version 69700 (0.0010) [2023-10-14 03:48:56,169][33226] Updated weights for policy 1, policy_version 69710 (0.0010) [2023-10-14 03:48:56,530][33226] Updated weights for policy 1, policy_version 69720 (0.0008) [2023-10-14 03:48:58,972][33201] Updated weights for policy 0, policy_version 69090 (0.0008) [2023-10-14 03:48:59,345][33201] Updated weights for policy 0, policy_version 69100 (0.0009) [2023-10-14 03:48:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 142147584. Throughput: 0: 1773.9, 1: 1766.6. Samples: 35544602. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:48:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.770')] [2023-10-14 03:48:59,705][33201] Updated weights for policy 0, policy_version 69110 (0.0009) [2023-10-14 03:49:00,073][33201] Updated weights for policy 0, policy_version 69120 (0.0008) [2023-10-14 03:49:00,406][33226] Updated weights for policy 1, policy_version 69730 (0.0008) [2023-10-14 03:49:00,770][33226] Updated weights for policy 1, policy_version 69740 (0.0007) [2023-10-14 03:49:01,130][33226] Updated weights for policy 1, policy_version 69750 (0.0010) [2023-10-14 03:49:01,500][33226] Updated weights for policy 1, policy_version 69760 (0.0007) [2023-10-14 03:49:03,959][33201] Updated weights for policy 0, policy_version 69130 (0.0007) [2023-10-14 03:49:04,335][33201] Updated weights for policy 0, policy_version 69140 (0.0010) [2023-10-14 03:49:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 142213120. Throughput: 0: 1785.3, 1: 1766.1. 
Samples: 35566822. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:04,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.770')] [2023-10-14 03:49:04,708][33201] Updated weights for policy 0, policy_version 69150 (0.0010) [2023-10-14 03:49:05,271][33226] Updated weights for policy 1, policy_version 69770 (0.0008) [2023-10-14 03:49:05,637][33226] Updated weights for policy 1, policy_version 69780 (0.0008) [2023-10-14 03:49:05,997][33226] Updated weights for policy 1, policy_version 69790 (0.0007) [2023-10-14 03:49:08,621][33201] Updated weights for policy 0, policy_version 69160 (0.0009) [2023-10-14 03:49:08,986][33201] Updated weights for policy 0, policy_version 69170 (0.0008) [2023-10-14 03:49:09,365][33201] Updated weights for policy 0, policy_version 69180 (0.0008) [2023-10-14 03:49:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 142311424. Throughput: 0: 1784.7, 1: 1784.8. Samples: 35588200. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.770')] [2023-10-14 03:49:09,821][33226] Updated weights for policy 1, policy_version 69800 (0.0008) [2023-10-14 03:49:10,184][33226] Updated weights for policy 1, policy_version 69810 (0.0008) [2023-10-14 03:49:10,553][33226] Updated weights for policy 1, policy_version 69820 (0.0008) [2023-10-14 03:49:13,060][33201] Updated weights for policy 0, policy_version 69190 (0.0007) [2023-10-14 03:49:13,435][33201] Updated weights for policy 0, policy_version 69200 (0.0007) [2023-10-14 03:49:13,807][33201] Updated weights for policy 0, policy_version 69210 (0.0008) [2023-10-14 03:49:14,355][33226] Updated weights for policy 1, policy_version 69830 (0.0010) [2023-10-14 03:49:14,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 142376960. Throughput: 0: 1782.8, 1: 1768.3. Samples: 35598658. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.770')] [2023-10-14 03:49:14,732][33226] Updated weights for policy 1, policy_version 69840 (0.0010) [2023-10-14 03:49:15,097][33226] Updated weights for policy 1, policy_version 69850 (0.0009) [2023-10-14 03:49:17,840][33201] Updated weights for policy 0, policy_version 69220 (0.0010) [2023-10-14 03:49:18,215][33201] Updated weights for policy 0, policy_version 69230 (0.0010) [2023-10-14 03:49:18,580][33201] Updated weights for policy 0, policy_version 69240 (0.0007) [2023-10-14 03:49:18,895][33226] Updated weights for policy 1, policy_version 69860 (0.0009) [2023-10-14 03:49:19,262][33226] Updated weights for policy 1, policy_version 69870 (0.0010) [2023-10-14 03:49:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 142442496. Throughput: 0: 1788.5, 1: 1782.4. Samples: 35620080. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.770')] [2023-10-14 03:49:19,639][33226] Updated weights for policy 1, policy_version 69880 (0.0010) [2023-10-14 03:49:22,320][33201] Updated weights for policy 0, policy_version 69250 (0.0008) [2023-10-14 03:49:22,684][33201] Updated weights for policy 0, policy_version 69260 (0.0007) [2023-10-14 03:49:23,063][33201] Updated weights for policy 0, policy_version 69270 (0.0007) [2023-10-14 03:49:23,427][33201] Updated weights for policy 0, policy_version 69280 (0.0007) [2023-10-14 03:49:23,512][33226] Updated weights for policy 1, policy_version 69890 (0.0010) [2023-10-14 03:49:23,881][33226] Updated weights for policy 1, policy_version 69900 (0.0010) [2023-10-14 03:49:24,255][33226] Updated weights for policy 1, policy_version 69910 (0.0007) [2023-10-14 03:49:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 142508032. Throughput: 0: 1764.1, 1: 1790.4. 
Samples: 35640520. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.680')] [2023-10-14 03:49:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000069280_70942720.pth... [2023-10-14 03:49:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000067616_69238784.pth [2023-10-14 03:49:24,610][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000069920_71598080.pth... [2023-10-14 03:49:24,611][33226] Updated weights for policy 1, policy_version 69920 (0.0009) [2023-10-14 03:49:24,640][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000068256_69894144.pth [2023-10-14 03:49:27,226][33201] Updated weights for policy 0, policy_version 69290 (0.0007) [2023-10-14 03:49:27,596][33201] Updated weights for policy 0, policy_version 69300 (0.0008) [2023-10-14 03:49:27,967][33201] Updated weights for policy 0, policy_version 69310 (0.0010) [2023-10-14 03:49:28,422][33226] Updated weights for policy 1, policy_version 69930 (0.0009) [2023-10-14 03:49:28,787][33226] Updated weights for policy 1, policy_version 69940 (0.0009) [2023-10-14 03:49:29,151][33226] Updated weights for policy 1, policy_version 69950 (0.0008) [2023-10-14 03:49:29,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 142606336. Throughput: 0: 1786.5, 1: 1771.1. Samples: 35651834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:29,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.680')] [2023-10-14 03:49:31,619][33201] Updated weights for policy 0, policy_version 69320 (0.0010) [2023-10-14 03:49:31,994][33201] Updated weights for policy 0, policy_version 69330 (0.0010) [2023-10-14 03:49:32,361][33201] Updated weights for policy 0, policy_version 69340 (0.0009) [2023-10-14 03:49:32,905][33226] Updated weights for policy 1, policy_version 69960 (0.0007) [2023-10-14 03:49:33,272][33226] Updated weights for policy 1, policy_version 69970 (0.0007) [2023-10-14 03:49:33,644][33226] Updated weights for policy 1, policy_version 69980 (0.0008) [2023-10-14 03:49:34,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 142671872. Throughput: 0: 1762.3, 1: 1784.0. Samples: 35672478. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:34,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.680')] [2023-10-14 03:49:36,208][33201] Updated weights for policy 0, policy_version 69350 (0.0009) [2023-10-14 03:49:36,575][33201] Updated weights for policy 0, policy_version 69360 (0.0008) [2023-10-14 03:49:36,952][33201] Updated weights for policy 0, policy_version 69370 (0.0009) [2023-10-14 03:49:37,390][33226] Updated weights for policy 1, policy_version 69990 (0.0009) [2023-10-14 03:49:37,751][33226] Updated weights for policy 1, policy_version 70000 (0.0009) [2023-10-14 03:49:38,120][33226] Updated weights for policy 1, policy_version 70010 (0.0010) [2023-10-14 03:49:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 142737408. Throughput: 0: 1757.5, 1: 1765.2. Samples: 35693686. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:49:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.660')] [2023-10-14 03:49:40,971][33201] Updated weights for policy 0, policy_version 69380 (0.0010) [2023-10-14 03:49:41,369][33201] Updated weights for policy 0, policy_version 69390 (0.0007) [2023-10-14 03:49:41,737][33201] Updated weights for policy 0, policy_version 69400 (0.0009) [2023-10-14 03:49:42,080][33226] Updated weights for policy 1, policy_version 70020 (0.0008) [2023-10-14 03:49:42,451][33226] Updated weights for policy 1, policy_version 70030 (0.0007) [2023-10-14 03:49:42,816][33226] Updated weights for policy 1, policy_version 70040 (0.0009) [2023-10-14 03:49:44,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 142802944. Throughput: 0: 1755.2, 1: 1796.4. Samples: 35704426. Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:49:44,559][31953] Avg episode reward: [(0, '20.990'), (1, '20.670')] [2023-10-14 03:49:45,530][33201] Updated weights for policy 0, policy_version 69410 (0.0007) [2023-10-14 03:49:45,903][33201] Updated weights for policy 0, policy_version 69420 (0.0007) [2023-10-14 03:49:46,274][33201] Updated weights for policy 0, policy_version 69430 (0.0008) [2023-10-14 03:49:46,405][33226] Updated weights for policy 1, policy_version 70050 (0.0007) [2023-10-14 03:49:46,644][33201] Updated weights for policy 0, policy_version 69440 (0.0007) [2023-10-14 03:49:46,764][33226] Updated weights for policy 1, policy_version 70060 (0.0009) [2023-10-14 03:49:47,130][33226] Updated weights for policy 1, policy_version 70070 (0.0007) [2023-10-14 03:49:47,495][33226] Updated weights for policy 1, policy_version 70080 (0.0008) [2023-10-14 03:49:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 142868480. Throughput: 0: 1753.3, 1: 1770.6. Samples: 35725396. 
Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:49:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.690')] [2023-10-14 03:49:50,556][33201] Updated weights for policy 0, policy_version 69450 (0.0008) [2023-10-14 03:49:50,930][33201] Updated weights for policy 0, policy_version 69460 (0.0009) [2023-10-14 03:49:51,304][33201] Updated weights for policy 0, policy_version 69470 (0.0009) [2023-10-14 03:49:51,349][33226] Updated weights for policy 1, policy_version 70090 (0.0008) [2023-10-14 03:49:51,710][33226] Updated weights for policy 1, policy_version 70100 (0.0007) [2023-10-14 03:49:52,084][33226] Updated weights for policy 1, policy_version 70110 (0.0007) [2023-10-14 03:49:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 142934016. Throughput: 0: 1768.7, 1: 1770.4. Samples: 35747458. Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:49:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.690')] [2023-10-14 03:49:55,067][33201] Updated weights for policy 0, policy_version 69480 (0.0009) [2023-10-14 03:49:55,446][33201] Updated weights for policy 0, policy_version 69490 (0.0008) [2023-10-14 03:49:55,821][33201] Updated weights for policy 0, policy_version 69500 (0.0008) [2023-10-14 03:49:55,823][33226] Updated weights for policy 1, policy_version 70120 (0.0007) [2023-10-14 03:49:56,188][33226] Updated weights for policy 1, policy_version 70130 (0.0010) [2023-10-14 03:49:56,560][33226] Updated weights for policy 1, policy_version 70140 (0.0010) [2023-10-14 03:49:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 142999552. Throughput: 0: 1746.0, 1: 1770.6. Samples: 35756904. 
Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:49:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.690')] [2023-10-14 03:49:59,645][33201] Updated weights for policy 0, policy_version 69510 (0.0011) [2023-10-14 03:50:00,016][33201] Updated weights for policy 0, policy_version 69520 (0.0010) [2023-10-14 03:50:00,357][33226] Updated weights for policy 1, policy_version 70150 (0.0010) [2023-10-14 03:50:00,398][33201] Updated weights for policy 0, policy_version 69530 (0.0008) [2023-10-14 03:50:00,723][33226] Updated weights for policy 1, policy_version 70160 (0.0008) [2023-10-14 03:50:01,086][33226] Updated weights for policy 1, policy_version 70170 (0.0009) [2023-10-14 03:50:04,164][33201] Updated weights for policy 0, policy_version 69540 (0.0008) [2023-10-14 03:50:04,537][33201] Updated weights for policy 0, policy_version 69550 (0.0008) [2023-10-14 03:50:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 143065088. Throughput: 0: 1765.8, 1: 1768.3. Samples: 35779112. 
Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:50:04,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.720')] [2023-10-14 03:50:04,910][33201] Updated weights for policy 0, policy_version 69560 (0.0008) [2023-10-14 03:50:05,081][33226] Updated weights for policy 1, policy_version 70180 (0.0011) [2023-10-14 03:50:05,494][33226] Updated weights for policy 1, policy_version 70190 (0.0009) [2023-10-14 03:50:05,857][33226] Updated weights for policy 1, policy_version 70200 (0.0008) [2023-10-14 03:50:08,745][33201] Updated weights for policy 0, policy_version 69570 (0.0008) [2023-10-14 03:50:09,110][33201] Updated weights for policy 0, policy_version 69580 (0.0009) [2023-10-14 03:50:09,480][33201] Updated weights for policy 0, policy_version 69590 (0.0007) [2023-10-14 03:50:09,501][33226] Updated weights for policy 1, policy_version 70210 (0.0009) [2023-10-14 03:50:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 143130624. Throughput: 0: 1773.4, 1: 1780.8. Samples: 35800456. 
Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:50:09,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.740')] [2023-10-14 03:50:09,848][33201] Updated weights for policy 0, policy_version 69600 (0.0008) [2023-10-14 03:50:09,870][33226] Updated weights for policy 1, policy_version 70220 (0.0008) [2023-10-14 03:50:10,237][33226] Updated weights for policy 1, policy_version 70230 (0.0009) [2023-10-14 03:50:10,613][33226] Updated weights for policy 1, policy_version 70240 (0.0009) [2023-10-14 03:50:13,660][33201] Updated weights for policy 0, policy_version 69610 (0.0009) [2023-10-14 03:50:14,017][33201] Updated weights for policy 0, policy_version 69620 (0.0007) [2023-10-14 03:50:14,381][33201] Updated weights for policy 0, policy_version 69630 (0.0007) [2023-10-14 03:50:14,460][33226] Updated weights for policy 1, policy_version 70250 (0.0008) [2023-10-14 03:50:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143228928. Throughput: 0: 1762.1, 1: 1761.7. Samples: 35810408. Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:50:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.790')] [2023-10-14 03:50:14,836][33226] Updated weights for policy 1, policy_version 70260 (0.0007) [2023-10-14 03:50:15,192][33226] Updated weights for policy 1, policy_version 70270 (0.0008) [2023-10-14 03:50:18,167][33201] Updated weights for policy 0, policy_version 69640 (0.0007) [2023-10-14 03:50:18,538][33201] Updated weights for policy 0, policy_version 69650 (0.0008) [2023-10-14 03:50:18,908][33201] Updated weights for policy 0, policy_version 69660 (0.0007) [2023-10-14 03:50:19,099][33226] Updated weights for policy 1, policy_version 70280 (0.0007) [2023-10-14 03:50:19,461][33226] Updated weights for policy 1, policy_version 70290 (0.0008) [2023-10-14 03:50:19,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 143294464. Throughput: 0: 1776.5, 1: 1772.3. 
Samples: 35832176. Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:50:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.810')] [2023-10-14 03:50:19,829][33226] Updated weights for policy 1, policy_version 70300 (0.0009) [2023-10-14 03:50:22,777][33201] Updated weights for policy 0, policy_version 69670 (0.0007) [2023-10-14 03:50:23,152][33201] Updated weights for policy 0, policy_version 69680 (0.0007) [2023-10-14 03:50:23,528][33201] Updated weights for policy 0, policy_version 69690 (0.0007) [2023-10-14 03:50:23,565][33226] Updated weights for policy 1, policy_version 70310 (0.0008) [2023-10-14 03:50:23,929][33226] Updated weights for policy 1, policy_version 70320 (0.0008) [2023-10-14 03:50:24,286][33226] Updated weights for policy 1, policy_version 70330 (0.0008) [2023-10-14 03:50:24,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 143392768. Throughput: 0: 1749.2, 1: 1780.1. Samples: 35852508. Policy #0 lag: (min: 9.0, avg: 19.4, max: 41.0) [2023-10-14 03:50:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.850')] [2023-10-14 03:50:27,578][33201] Updated weights for policy 0, policy_version 69700 (0.0009) [2023-10-14 03:50:27,974][33201] Updated weights for policy 0, policy_version 69710 (0.0008) [2023-10-14 03:50:28,125][33226] Updated weights for policy 1, policy_version 70340 (0.0009) [2023-10-14 03:50:28,341][33201] Updated weights for policy 0, policy_version 69720 (0.0008) [2023-10-14 03:50:28,488][33226] Updated weights for policy 1, policy_version 70350 (0.0009) [2023-10-14 03:50:28,863][33226] Updated weights for policy 1, policy_version 70360 (0.0007) [2023-10-14 03:50:29,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 143458304. Throughput: 0: 1781.6, 1: 1763.5. Samples: 35863954. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:29,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 03:50:32,123][33201] Updated weights for policy 0, policy_version 69730 (0.0008) [2023-10-14 03:50:32,493][33201] Updated weights for policy 0, policy_version 69740 (0.0008) [2023-10-14 03:50:32,846][33226] Updated weights for policy 1, policy_version 70370 (0.0007) [2023-10-14 03:50:32,860][33201] Updated weights for policy 0, policy_version 69750 (0.0008) [2023-10-14 03:50:33,216][33226] Updated weights for policy 1, policy_version 70380 (0.0009) [2023-10-14 03:50:33,239][33201] Updated weights for policy 0, policy_version 69760 (0.0010) [2023-10-14 03:50:33,571][33226] Updated weights for policy 1, policy_version 70390 (0.0008) [2023-10-14 03:50:33,941][33226] Updated weights for policy 1, policy_version 70400 (0.0008) [2023-10-14 03:50:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143523840. Throughput: 0: 1750.3, 1: 1776.7. Samples: 35884110. Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 03:50:37,075][33201] Updated weights for policy 0, policy_version 69770 (0.0007) [2023-10-14 03:50:37,450][33201] Updated weights for policy 0, policy_version 69780 (0.0009) [2023-10-14 03:50:37,688][33226] Updated weights for policy 1, policy_version 70410 (0.0008) [2023-10-14 03:50:37,807][33201] Updated weights for policy 0, policy_version 69790 (0.0007) [2023-10-14 03:50:38,057][33226] Updated weights for policy 1, policy_version 70420 (0.0009) [2023-10-14 03:50:38,430][33226] Updated weights for policy 1, policy_version 70430 (0.0010) [2023-10-14 03:50:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143589376. Throughput: 0: 1746.1, 1: 1750.1. Samples: 35904786. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.860')] [2023-10-14 03:50:41,581][33201] Updated weights for policy 0, policy_version 69800 (0.0008) [2023-10-14 03:50:41,961][33201] Updated weights for policy 0, policy_version 69810 (0.0008) [2023-10-14 03:50:42,297][33226] Updated weights for policy 1, policy_version 70440 (0.0007) [2023-10-14 03:50:42,333][33201] Updated weights for policy 0, policy_version 69820 (0.0009) [2023-10-14 03:50:42,663][33226] Updated weights for policy 1, policy_version 70450 (0.0008) [2023-10-14 03:50:43,036][33226] Updated weights for policy 1, policy_version 70460 (0.0007) [2023-10-14 03:50:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143654912. Throughput: 0: 1756.0, 1: 1783.0. Samples: 35916162. Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.860')] [2023-10-14 03:50:46,103][33201] Updated weights for policy 0, policy_version 69830 (0.0008) [2023-10-14 03:50:46,475][33201] Updated weights for policy 0, policy_version 69840 (0.0007) [2023-10-14 03:50:46,822][33226] Updated weights for policy 1, policy_version 70470 (0.0008) [2023-10-14 03:50:46,837][33201] Updated weights for policy 0, policy_version 69850 (0.0007) [2023-10-14 03:50:47,186][33226] Updated weights for policy 1, policy_version 70480 (0.0009) [2023-10-14 03:50:47,552][33226] Updated weights for policy 1, policy_version 70490 (0.0010) [2023-10-14 03:50:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143720448. Throughput: 0: 1743.4, 1: 1750.8. Samples: 35936352. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.860')] [2023-10-14 03:50:50,709][33201] Updated weights for policy 0, policy_version 69860 (0.0010) [2023-10-14 03:50:51,066][33201] Updated weights for policy 0, policy_version 69870 (0.0009) [2023-10-14 03:50:51,436][33201] Updated weights for policy 0, policy_version 69880 (0.0009) [2023-10-14 03:50:51,540][33226] Updated weights for policy 1, policy_version 70500 (0.0008) [2023-10-14 03:50:51,930][33226] Updated weights for policy 1, policy_version 70510 (0.0008) [2023-10-14 03:50:52,293][33226] Updated weights for policy 1, policy_version 70520 (0.0009) [2023-10-14 03:50:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 143785984. Throughput: 0: 1750.7, 1: 1754.6. Samples: 35958192. Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 03:50:55,256][33201] Updated weights for policy 0, policy_version 69890 (0.0009) [2023-10-14 03:50:55,613][33201] Updated weights for policy 0, policy_version 69900 (0.0007) [2023-10-14 03:50:55,978][33201] Updated weights for policy 0, policy_version 69910 (0.0008) [2023-10-14 03:50:56,031][33226] Updated weights for policy 1, policy_version 70530 (0.0008) [2023-10-14 03:50:56,347][33201] Updated weights for policy 0, policy_version 69920 (0.0009) [2023-10-14 03:50:56,392][33226] Updated weights for policy 1, policy_version 70540 (0.0009) [2023-10-14 03:50:56,767][33226] Updated weights for policy 1, policy_version 70550 (0.0010) [2023-10-14 03:50:57,129][33226] Updated weights for policy 1, policy_version 70560 (0.0009) [2023-10-14 03:50:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 143851520. Throughput: 0: 1741.3, 1: 1763.0. Samples: 35968102. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:50:59,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 03:51:00,302][33201] Updated weights for policy 0, policy_version 69930 (0.0009) [2023-10-14 03:51:00,676][33201] Updated weights for policy 0, policy_version 69940 (0.0008) [2023-10-14 03:51:00,863][33226] Updated weights for policy 1, policy_version 70570 (0.0008) [2023-10-14 03:51:01,038][33201] Updated weights for policy 0, policy_version 69950 (0.0009) [2023-10-14 03:51:01,235][33226] Updated weights for policy 1, policy_version 70580 (0.0009) [2023-10-14 03:51:01,600][33226] Updated weights for policy 1, policy_version 70590 (0.0009) [2023-10-14 03:51:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 143917056. Throughput: 0: 1752.4, 1: 1758.2. Samples: 35990154. Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:51:04,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.860')] [2023-10-14 03:51:04,818][33201] Updated weights for policy 0, policy_version 69960 (0.0008) [2023-10-14 03:51:05,184][33201] Updated weights for policy 0, policy_version 69970 (0.0008) [2023-10-14 03:51:05,344][33226] Updated weights for policy 1, policy_version 70600 (0.0008) [2023-10-14 03:51:05,554][33201] Updated weights for policy 0, policy_version 69980 (0.0008) [2023-10-14 03:51:05,704][33226] Updated weights for policy 1, policy_version 70610 (0.0008) [2023-10-14 03:51:06,076][33226] Updated weights for policy 1, policy_version 70620 (0.0010) [2023-10-14 03:51:09,439][33201] Updated weights for policy 0, policy_version 69990 (0.0010) [2023-10-14 03:51:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 143982592. Throughput: 0: 1774.8, 1: 1772.1. Samples: 36012116. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:51:09,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.860')] [2023-10-14 03:51:09,805][33201] Updated weights for policy 0, policy_version 70000 (0.0009) [2023-10-14 03:51:09,910][33226] Updated weights for policy 1, policy_version 70630 (0.0009) [2023-10-14 03:51:10,172][33201] Updated weights for policy 0, policy_version 70010 (0.0008) [2023-10-14 03:51:10,272][33226] Updated weights for policy 1, policy_version 70640 (0.0008) [2023-10-14 03:51:10,643][33226] Updated weights for policy 1, policy_version 70650 (0.0008) [2023-10-14 03:51:13,905][33201] Updated weights for policy 0, policy_version 70020 (0.0008) [2023-10-14 03:51:14,289][33201] Updated weights for policy 0, policy_version 70030 (0.0010) [2023-10-14 03:51:14,392][33226] Updated weights for policy 1, policy_version 70660 (0.0007) [2023-10-14 03:51:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 144048128. Throughput: 0: 1750.3, 1: 1758.8. Samples: 36021864. 
Policy #0 lag: (min: 14.0, avg: 19.3, max: 46.0) [2023-10-14 03:51:14,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.950')] [2023-10-14 03:51:14,660][33201] Updated weights for policy 0, policy_version 70040 (0.0008) [2023-10-14 03:51:14,755][33226] Updated weights for policy 1, policy_version 70670 (0.0008) [2023-10-14 03:51:15,127][33226] Updated weights for policy 1, policy_version 70680 (0.0008) [2023-10-14 03:51:18,482][33201] Updated weights for policy 0, policy_version 70050 (0.0008) [2023-10-14 03:51:18,852][33201] Updated weights for policy 0, policy_version 70060 (0.0010) [2023-10-14 03:51:19,000][33226] Updated weights for policy 1, policy_version 70690 (0.0008) [2023-10-14 03:51:19,221][33201] Updated weights for policy 0, policy_version 70070 (0.0008) [2023-10-14 03:51:19,360][33226] Updated weights for policy 1, policy_version 70700 (0.0007) [2023-10-14 03:51:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 144113664. Throughput: 0: 1781.8, 1: 1773.7. Samples: 36044108. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:19,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.950')] [2023-10-14 03:51:19,594][33201] Updated weights for policy 0, policy_version 70080 (0.0008) [2023-10-14 03:51:19,735][33226] Updated weights for policy 1, policy_version 70710 (0.0009) [2023-10-14 03:51:20,103][33226] Updated weights for policy 1, policy_version 70720 (0.0012) [2023-10-14 03:51:23,675][33201] Updated weights for policy 0, policy_version 70090 (0.0008) [2023-10-14 03:51:23,935][33226] Updated weights for policy 1, policy_version 70730 (0.0008) [2023-10-14 03:51:24,040][33201] Updated weights for policy 0, policy_version 70100 (0.0007) [2023-10-14 03:51:24,310][33226] Updated weights for policy 1, policy_version 70740 (0.0009) [2023-10-14 03:51:24,408][33201] Updated weights for policy 0, policy_version 70110 (0.0008) [2023-10-14 03:51:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 144211968. Throughput: 0: 1762.8, 1: 1782.4. Samples: 36064324. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:24,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.950')] [2023-10-14 03:51:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000070112_71794688.pth... [2023-10-14 03:51:24,600][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000068448_70090752.pth [2023-10-14 03:51:24,672][33226] Updated weights for policy 1, policy_version 70750 (0.0010) [2023-10-14 03:51:24,746][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000070752_72450048.pth... 
[2023-10-14 03:51:24,780][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000069088_70746112.pth [2023-10-14 03:51:28,296][33201] Updated weights for policy 0, policy_version 70120 (0.0008) [2023-10-14 03:51:28,514][33226] Updated weights for policy 1, policy_version 70760 (0.0011) [2023-10-14 03:51:28,667][33201] Updated weights for policy 0, policy_version 70130 (0.0009) [2023-10-14 03:51:28,879][33226] Updated weights for policy 1, policy_version 70770 (0.0009) [2023-10-14 03:51:29,040][33201] Updated weights for policy 0, policy_version 70140 (0.0010) [2023-10-14 03:51:29,239][33226] Updated weights for policy 1, policy_version 70780 (0.0008) [2023-10-14 03:51:29,557][31953] Fps is (10 sec: 19661.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 144310272. Throughput: 0: 1776.8, 1: 1761.9. Samples: 36075406. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:29,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 03:51:32,968][33201] Updated weights for policy 0, policy_version 70150 (0.0008) [2023-10-14 03:51:33,184][33226] Updated weights for policy 1, policy_version 70790 (0.0008) [2023-10-14 03:51:33,333][33201] Updated weights for policy 0, policy_version 70160 (0.0008) [2023-10-14 03:51:33,543][33226] Updated weights for policy 1, policy_version 70800 (0.0007) [2023-10-14 03:51:33,705][33201] Updated weights for policy 0, policy_version 70170 (0.0009) [2023-10-14 03:51:33,916][33226] Updated weights for policy 1, policy_version 70810 (0.0008) [2023-10-14 03:51:34,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 144375808. Throughput: 0: 1774.8, 1: 1787.1. Samples: 36096636. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:34,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 03:51:37,575][33201] Updated weights for policy 0, policy_version 70180 (0.0010) [2023-10-14 03:51:37,911][33226] Updated weights for policy 1, policy_version 70820 (0.0008) [2023-10-14 03:51:37,953][33201] Updated weights for policy 0, policy_version 70190 (0.0010) [2023-10-14 03:51:38,304][33226] Updated weights for policy 1, policy_version 70830 (0.0010) [2023-10-14 03:51:38,314][33201] Updated weights for policy 0, policy_version 70200 (0.0009) [2023-10-14 03:51:38,662][33226] Updated weights for policy 1, policy_version 70840 (0.0009) [2023-10-14 03:51:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 144441344. Throughput: 0: 1750.1, 1: 1753.2. Samples: 36115836. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 03:51:42,172][33201] Updated weights for policy 0, policy_version 70210 (0.0008) [2023-10-14 03:51:42,388][33226] Updated weights for policy 1, policy_version 70850 (0.0008) [2023-10-14 03:51:42,549][33201] Updated weights for policy 0, policy_version 70220 (0.0009) [2023-10-14 03:51:42,752][33226] Updated weights for policy 1, policy_version 70860 (0.0009) [2023-10-14 03:51:42,913][33201] Updated weights for policy 0, policy_version 70230 (0.0007) [2023-10-14 03:51:43,123][33226] Updated weights for policy 1, policy_version 70870 (0.0007) [2023-10-14 03:51:43,277][33201] Updated weights for policy 0, policy_version 70240 (0.0008) [2023-10-14 03:51:43,482][33226] Updated weights for policy 1, policy_version 70880 (0.0008) [2023-10-14 03:51:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 144506880. Throughput: 0: 1781.4, 1: 1780.3. Samples: 36128380. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 03:51:47,027][33201] Updated weights for policy 0, policy_version 70250 (0.0008) [2023-10-14 03:51:47,310][33226] Updated weights for policy 1, policy_version 70890 (0.0008) [2023-10-14 03:51:47,394][33201] Updated weights for policy 0, policy_version 70260 (0.0007) [2023-10-14 03:51:47,673][33226] Updated weights for policy 1, policy_version 70900 (0.0009) [2023-10-14 03:51:47,755][33201] Updated weights for policy 0, policy_version 70270 (0.0008) [2023-10-14 03:51:48,049][33226] Updated weights for policy 1, policy_version 70910 (0.0008) [2023-10-14 03:51:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 144572416. Throughput: 0: 1745.5, 1: 1757.2. Samples: 36147778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 03:51:51,483][33201] Updated weights for policy 0, policy_version 70280 (0.0010) [2023-10-14 03:51:51,847][33201] Updated weights for policy 0, policy_version 70290 (0.0008) [2023-10-14 03:51:51,964][33226] Updated weights for policy 1, policy_version 70920 (0.0007) [2023-10-14 03:51:52,222][33201] Updated weights for policy 0, policy_version 70300 (0.0008) [2023-10-14 03:51:52,339][33226] Updated weights for policy 1, policy_version 70930 (0.0007) [2023-10-14 03:51:52,711][33226] Updated weights for policy 1, policy_version 70940 (0.0007) [2023-10-14 03:51:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 144637952. Throughput: 0: 1754.9, 1: 1747.8. Samples: 36169736. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:54,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.930')] [2023-10-14 03:51:55,840][33201] Updated weights for policy 0, policy_version 70310 (0.0007) [2023-10-14 03:51:56,203][33201] Updated weights for policy 0, policy_version 70320 (0.0009) [2023-10-14 03:51:56,398][33226] Updated weights for policy 1, policy_version 70950 (0.0007) [2023-10-14 03:51:56,581][33201] Updated weights for policy 0, policy_version 70330 (0.0009) [2023-10-14 03:51:56,754][33226] Updated weights for policy 1, policy_version 70960 (0.0007) [2023-10-14 03:51:57,115][33226] Updated weights for policy 1, policy_version 70970 (0.0009) [2023-10-14 03:51:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 144703488. Throughput: 0: 1753.2, 1: 1759.8. Samples: 36179948. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:51:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 03:52:00,385][33201] Updated weights for policy 0, policy_version 70340 (0.0007) [2023-10-14 03:52:00,754][33201] Updated weights for policy 0, policy_version 70350 (0.0009) [2023-10-14 03:52:00,887][33226] Updated weights for policy 1, policy_version 70980 (0.0008) [2023-10-14 03:52:01,121][33201] Updated weights for policy 0, policy_version 70360 (0.0009) [2023-10-14 03:52:01,266][33226] Updated weights for policy 1, policy_version 70990 (0.0007) [2023-10-14 03:52:01,626][33226] Updated weights for policy 1, policy_version 71000 (0.0011) [2023-10-14 03:52:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 144769024. Throughput: 0: 1757.9, 1: 1752.7. Samples: 36202084. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 03:52:04,941][33201] Updated weights for policy 0, policy_version 70370 (0.0007) [2023-10-14 03:52:05,351][33201] Updated weights for policy 0, policy_version 70380 (0.0009) [2023-10-14 03:52:05,390][33226] Updated weights for policy 1, policy_version 71010 (0.0010) [2023-10-14 03:52:05,718][33201] Updated weights for policy 0, policy_version 70390 (0.0007) [2023-10-14 03:52:05,756][33226] Updated weights for policy 1, policy_version 71020 (0.0007) [2023-10-14 03:52:06,084][33201] Updated weights for policy 0, policy_version 70400 (0.0010) [2023-10-14 03:52:06,128][33226] Updated weights for policy 1, policy_version 71030 (0.0007) [2023-10-14 03:52:06,492][33226] Updated weights for policy 1, policy_version 71040 (0.0007) [2023-10-14 03:52:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 144834560. Throughput: 0: 1776.5, 1: 1771.7. Samples: 36223994. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 03:52:09,930][33201] Updated weights for policy 0, policy_version 70410 (0.0007) [2023-10-14 03:52:10,306][33201] Updated weights for policy 0, policy_version 70420 (0.0008) [2023-10-14 03:52:10,342][33226] Updated weights for policy 1, policy_version 71050 (0.0008) [2023-10-14 03:52:10,673][33201] Updated weights for policy 0, policy_version 70430 (0.0007) [2023-10-14 03:52:10,712][33226] Updated weights for policy 1, policy_version 71060 (0.0008) [2023-10-14 03:52:11,081][33226] Updated weights for policy 1, policy_version 71070 (0.0010) [2023-10-14 03:52:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 144900096. Throughput: 0: 1751.3, 1: 1762.0. Samples: 36233506. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.930')] [2023-10-14 03:52:14,634][33201] Updated weights for policy 0, policy_version 70440 (0.0009) [2023-10-14 03:52:14,833][33226] Updated weights for policy 1, policy_version 71080 (0.0010) [2023-10-14 03:52:15,001][33201] Updated weights for policy 0, policy_version 70450 (0.0008) [2023-10-14 03:52:15,199][33226] Updated weights for policy 1, policy_version 71090 (0.0008) [2023-10-14 03:52:15,387][33201] Updated weights for policy 0, policy_version 70460 (0.0008) [2023-10-14 03:52:15,555][33226] Updated weights for policy 1, policy_version 71100 (0.0009) [2023-10-14 03:52:19,345][33201] Updated weights for policy 0, policy_version 70470 (0.0008) [2023-10-14 03:52:19,448][33226] Updated weights for policy 1, policy_version 71110 (0.0007) [2023-10-14 03:52:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 144965632. Throughput: 0: 1758.1, 1: 1768.5. Samples: 36255332. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:19,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 03:52:19,714][33201] Updated weights for policy 0, policy_version 70480 (0.0008) [2023-10-14 03:52:19,814][33226] Updated weights for policy 1, policy_version 71120 (0.0008) [2023-10-14 03:52:20,078][33201] Updated weights for policy 0, policy_version 70490 (0.0008) [2023-10-14 03:52:20,192][33226] Updated weights for policy 1, policy_version 71130 (0.0008) [2023-10-14 03:52:23,746][33201] Updated weights for policy 0, policy_version 70500 (0.0008) [2023-10-14 03:52:23,934][33226] Updated weights for policy 1, policy_version 71140 (0.0008) [2023-10-14 03:52:24,128][33201] Updated weights for policy 0, policy_version 70510 (0.0008) [2023-10-14 03:52:24,312][33226] Updated weights for policy 1, policy_version 71150 (0.0007) [2023-10-14 03:52:24,498][33201] Updated weights for policy 0, policy_version 70520 (0.0007) [2023-10-14 03:52:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 145031168. Throughput: 0: 1778.5, 1: 1797.6. Samples: 36276760. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 03:52:24,675][33226] Updated weights for policy 1, policy_version 71160 (0.0007) [2023-10-14 03:52:28,335][33201] Updated weights for policy 0, policy_version 70530 (0.0008) [2023-10-14 03:52:28,523][33226] Updated weights for policy 1, policy_version 71170 (0.0007) [2023-10-14 03:52:28,702][33201] Updated weights for policy 0, policy_version 70540 (0.0007) [2023-10-14 03:52:28,884][33226] Updated weights for policy 1, policy_version 71180 (0.0007) [2023-10-14 03:52:29,069][33201] Updated weights for policy 0, policy_version 70550 (0.0008) [2023-10-14 03:52:29,243][33226] Updated weights for policy 1, policy_version 71190 (0.0007) [2023-10-14 03:52:29,432][33201] Updated weights for policy 0, policy_version 70560 (0.0008) [2023-10-14 03:52:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 145129472. Throughput: 0: 1758.3, 1: 1766.7. Samples: 36287006. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:52:29,616][33226] Updated weights for policy 1, policy_version 71200 (0.0007) [2023-10-14 03:52:33,202][33201] Updated weights for policy 0, policy_version 70570 (0.0011) [2023-10-14 03:52:33,570][33201] Updated weights for policy 0, policy_version 70580 (0.0007) [2023-10-14 03:52:33,624][33226] Updated weights for policy 1, policy_version 71210 (0.0007) [2023-10-14 03:52:33,935][33201] Updated weights for policy 0, policy_version 70590 (0.0007) [2023-10-14 03:52:33,992][33226] Updated weights for policy 1, policy_version 71220 (0.0008) [2023-10-14 03:52:34,359][33226] Updated weights for policy 1, policy_version 71230 (0.0008) [2023-10-14 03:52:34,557][31953] Fps is (10 sec: 19661.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145227776. Throughput: 0: 1782.7, 1: 1788.9. 
Samples: 36308498. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:34,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:52:37,905][33201] Updated weights for policy 0, policy_version 70600 (0.0008) [2023-10-14 03:52:38,267][33201] Updated weights for policy 0, policy_version 70610 (0.0008) [2023-10-14 03:52:38,360][33226] Updated weights for policy 1, policy_version 71240 (0.0008) [2023-10-14 03:52:38,647][33201] Updated weights for policy 0, policy_version 70620 (0.0009) [2023-10-14 03:52:38,724][33226] Updated weights for policy 1, policy_version 71250 (0.0007) [2023-10-14 03:52:39,091][33226] Updated weights for policy 1, policy_version 71260 (0.0008) [2023-10-14 03:52:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145293312. Throughput: 0: 1750.8, 1: 1765.9. Samples: 36327990. Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:52:42,507][33201] Updated weights for policy 0, policy_version 70630 (0.0009) [2023-10-14 03:52:42,881][33201] Updated weights for policy 0, policy_version 70640 (0.0009) [2023-10-14 03:52:43,154][33226] Updated weights for policy 1, policy_version 71270 (0.0008) [2023-10-14 03:52:43,246][33201] Updated weights for policy 0, policy_version 70650 (0.0009) [2023-10-14 03:52:43,528][33226] Updated weights for policy 1, policy_version 71280 (0.0008) [2023-10-14 03:52:43,896][33226] Updated weights for policy 1, policy_version 71290 (0.0008) [2023-10-14 03:52:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145358848. Throughput: 0: 1780.5, 1: 1769.2. Samples: 36339686. 
Policy #0 lag: (min: 31.0, avg: 33.8, max: 63.0) [2023-10-14 03:52:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 03:52:47,087][33201] Updated weights for policy 0, policy_version 70660 (0.0008) [2023-10-14 03:52:47,461][33201] Updated weights for policy 0, policy_version 70670 (0.0009) [2023-10-14 03:52:47,842][33201] Updated weights for policy 0, policy_version 70680 (0.0008) [2023-10-14 03:52:47,868][33226] Updated weights for policy 1, policy_version 71300 (0.0008) [2023-10-14 03:52:48,232][33226] Updated weights for policy 1, policy_version 71310 (0.0009) [2023-10-14 03:52:48,595][33226] Updated weights for policy 1, policy_version 71320 (0.0009) [2023-10-14 03:52:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145424384. Throughput: 0: 1741.2, 1: 1764.9. Samples: 36359858. Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:52:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:52:51,853][33201] Updated weights for policy 0, policy_version 70690 (0.0007) [2023-10-14 03:52:52,254][33201] Updated weights for policy 0, policy_version 70700 (0.0007) [2023-10-14 03:52:52,418][33226] Updated weights for policy 1, policy_version 71330 (0.0008) [2023-10-14 03:52:52,619][33201] Updated weights for policy 0, policy_version 70710 (0.0007) [2023-10-14 03:52:52,792][33226] Updated weights for policy 1, policy_version 71340 (0.0007) [2023-10-14 03:52:52,989][33201] Updated weights for policy 0, policy_version 70720 (0.0009) [2023-10-14 03:52:53,168][33226] Updated weights for policy 1, policy_version 71350 (0.0009) [2023-10-14 03:52:53,535][33226] Updated weights for policy 1, policy_version 71360 (0.0007) [2023-10-14 03:52:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145489920. Throughput: 0: 1741.6, 1: 1738.7. Samples: 36380606. 
Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:52:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:52:56,776][33201] Updated weights for policy 0, policy_version 70730 (0.0008) [2023-10-14 03:52:57,143][33201] Updated weights for policy 0, policy_version 70740 (0.0007) [2023-10-14 03:52:57,221][33226] Updated weights for policy 1, policy_version 71370 (0.0008) [2023-10-14 03:52:57,512][33201] Updated weights for policy 0, policy_version 70750 (0.0007) [2023-10-14 03:52:57,586][33226] Updated weights for policy 1, policy_version 71380 (0.0008) [2023-10-14 03:52:57,942][33226] Updated weights for policy 1, policy_version 71390 (0.0009) [2023-10-14 03:52:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145555456. Throughput: 0: 1759.9, 1: 1768.4. Samples: 36392278. Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:52:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:53:01,338][33201] Updated weights for policy 0, policy_version 70760 (0.0009) [2023-10-14 03:53:01,714][33201] Updated weights for policy 0, policy_version 70770 (0.0009) [2023-10-14 03:53:01,862][33226] Updated weights for policy 1, policy_version 71400 (0.0008) [2023-10-14 03:53:02,077][33201] Updated weights for policy 0, policy_version 70780 (0.0008) [2023-10-14 03:53:02,232][33226] Updated weights for policy 1, policy_version 71410 (0.0008) [2023-10-14 03:53:02,598][33226] Updated weights for policy 1, policy_version 71420 (0.0007) [2023-10-14 03:53:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 145620992. Throughput: 0: 1754.9, 1: 1739.2. Samples: 36412564. 
Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:53:05,810][33201] Updated weights for policy 0, policy_version 70790 (0.0008) [2023-10-14 03:53:06,178][33201] Updated weights for policy 0, policy_version 70800 (0.0008) [2023-10-14 03:53:06,211][33226] Updated weights for policy 1, policy_version 71430 (0.0009) [2023-10-14 03:53:06,549][33201] Updated weights for policy 0, policy_version 70810 (0.0007) [2023-10-14 03:53:06,585][33226] Updated weights for policy 1, policy_version 71440 (0.0008) [2023-10-14 03:53:06,947][33226] Updated weights for policy 1, policy_version 71450 (0.0009) [2023-10-14 03:53:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 145686528. Throughput: 0: 1759.3, 1: 1750.7. Samples: 36434712. Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:53:10,504][33201] Updated weights for policy 0, policy_version 70820 (0.0008) [2023-10-14 03:53:10,869][33201] Updated weights for policy 0, policy_version 70830 (0.0009) [2023-10-14 03:53:10,939][33226] Updated weights for policy 1, policy_version 71460 (0.0008) [2023-10-14 03:53:11,231][33201] Updated weights for policy 0, policy_version 70840 (0.0009) [2023-10-14 03:53:11,343][33226] Updated weights for policy 1, policy_version 71470 (0.0009) [2023-10-14 03:53:11,709][33226] Updated weights for policy 1, policy_version 71480 (0.0010) [2023-10-14 03:53:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 145752064. Throughput: 0: 1748.4, 1: 1746.5. Samples: 36444278. 
Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:14,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 03:53:14,779][33201] Updated weights for policy 0, policy_version 70850 (0.0007) [2023-10-14 03:53:15,166][33201] Updated weights for policy 0, policy_version 70860 (0.0009) [2023-10-14 03:53:15,493][33226] Updated weights for policy 1, policy_version 71490 (0.0008) [2023-10-14 03:53:15,540][33201] Updated weights for policy 0, policy_version 70870 (0.0008) [2023-10-14 03:53:15,849][33226] Updated weights for policy 1, policy_version 71500 (0.0009) [2023-10-14 03:53:15,904][33201] Updated weights for policy 0, policy_version 70880 (0.0008) [2023-10-14 03:53:16,211][33226] Updated weights for policy 1, policy_version 71510 (0.0011) [2023-10-14 03:53:16,583][33226] Updated weights for policy 1, policy_version 71520 (0.0010) [2023-10-14 03:53:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 145817600. Throughput: 0: 1760.5, 1: 1749.7. Samples: 36466460. Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:53:19,779][33201] Updated weights for policy 0, policy_version 70890 (0.0008) [2023-10-14 03:53:20,151][33201] Updated weights for policy 0, policy_version 70900 (0.0009) [2023-10-14 03:53:20,343][33226] Updated weights for policy 1, policy_version 71530 (0.0010) [2023-10-14 03:53:20,525][33201] Updated weights for policy 0, policy_version 70910 (0.0007) [2023-10-14 03:53:20,700][33226] Updated weights for policy 1, policy_version 71540 (0.0007) [2023-10-14 03:53:21,067][33226] Updated weights for policy 1, policy_version 71550 (0.0007) [2023-10-14 03:53:24,195][33201] Updated weights for policy 0, policy_version 70920 (0.0008) [2023-10-14 03:53:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 145883136. Throughput: 0: 1785.3, 1: 1779.5. 
Samples: 36488406. Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.970')] [2023-10-14 03:53:24,567][33201] Updated weights for policy 0, policy_version 70930 (0.0008) [2023-10-14 03:53:24,907][33226] Updated weights for policy 1, policy_version 71560 (0.0008) [2023-10-14 03:53:24,928][33201] Updated weights for policy 0, policy_version 70940 (0.0008) [2023-10-14 03:53:25,079][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000070944_72646656.pth... [2023-10-14 03:53:25,107][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000069280_70942720.pth [2023-10-14 03:53:25,271][33226] Updated weights for policy 1, policy_version 71570 (0.0007) [2023-10-14 03:53:25,645][33226] Updated weights for policy 1, policy_version 71580 (0.0009) [2023-10-14 03:53:25,789][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000071584_73302016.pth... [2023-10-14 03:53:25,826][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000069920_71598080.pth [2023-10-14 03:53:28,692][33201] Updated weights for policy 0, policy_version 70950 (0.0007) [2023-10-14 03:53:29,064][33201] Updated weights for policy 0, policy_version 70960 (0.0008) [2023-10-14 03:53:29,439][33201] Updated weights for policy 0, policy_version 70970 (0.0008) [2023-10-14 03:53:29,486][33226] Updated weights for policy 1, policy_version 71590 (0.0007) [2023-10-14 03:53:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.4, 300 sec: 13995.8). Total num frames: 145948672. Throughput: 0: 1760.3, 1: 1764.4. Samples: 36498300. 
Policy #0 lag: (min: 23.0, avg: 28.3, max: 55.0) [2023-10-14 03:53:29,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:53:29,849][33226] Updated weights for policy 1, policy_version 71600 (0.0008) [2023-10-14 03:53:30,213][33226] Updated weights for policy 1, policy_version 71610 (0.0008) [2023-10-14 03:53:33,224][33201] Updated weights for policy 0, policy_version 70980 (0.0009) [2023-10-14 03:53:33,601][33201] Updated weights for policy 0, policy_version 70990 (0.0007) [2023-10-14 03:53:33,964][33201] Updated weights for policy 0, policy_version 71000 (0.0007) [2023-10-14 03:53:34,008][33226] Updated weights for policy 1, policy_version 71620 (0.0009) [2023-10-14 03:53:34,376][33226] Updated weights for policy 1, policy_version 71630 (0.0007) [2023-10-14 03:53:34,557][31953] Fps is (10 sec: 16384.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 146046976. Throughput: 0: 1795.6, 1: 1774.9. Samples: 36520528. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 03:53:34,737][33226] Updated weights for policy 1, policy_version 71640 (0.0009) [2023-10-14 03:53:37,838][33201] Updated weights for policy 0, policy_version 71010 (0.0007) [2023-10-14 03:53:38,254][33201] Updated weights for policy 0, policy_version 71020 (0.0009) [2023-10-14 03:53:38,355][33226] Updated weights for policy 1, policy_version 71650 (0.0007) [2023-10-14 03:53:38,632][33201] Updated weights for policy 0, policy_version 71030 (0.0007) [2023-10-14 03:53:38,715][33226] Updated weights for policy 1, policy_version 71660 (0.0007) [2023-10-14 03:53:38,998][33201] Updated weights for policy 0, policy_version 71040 (0.0007) [2023-10-14 03:53:39,080][33226] Updated weights for policy 1, policy_version 71670 (0.0007) [2023-10-14 03:53:39,449][33226] Updated weights for policy 1, policy_version 71680 (0.0010) [2023-10-14 03:53:39,557][31953] Fps is (10 sec: 19660.7, 60 sec: 
14199.5, 300 sec: 14218.0). Total num frames: 146145280. Throughput: 0: 1765.1, 1: 1784.4. Samples: 36540332. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:39,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.990')] [2023-10-14 03:53:42,980][33201] Updated weights for policy 0, policy_version 71050 (0.0007) [2023-10-14 03:53:43,185][33226] Updated weights for policy 1, policy_version 71690 (0.0008) [2023-10-14 03:53:43,355][33201] Updated weights for policy 0, policy_version 71060 (0.0008) [2023-10-14 03:53:43,566][33226] Updated weights for policy 1, policy_version 71700 (0.0007) [2023-10-14 03:53:43,721][33201] Updated weights for policy 0, policy_version 71070 (0.0007) [2023-10-14 03:53:43,931][33226] Updated weights for policy 1, policy_version 71710 (0.0008) [2023-10-14 03:53:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 146210816. Throughput: 0: 1777.2, 1: 1772.6. Samples: 36552022. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:44,558][31953] Avg episode reward: [(0, '20.790'), (1, '21.000')] [2023-10-14 03:53:47,527][33201] Updated weights for policy 0, policy_version 71080 (0.0007) [2023-10-14 03:53:47,552][33226] Updated weights for policy 1, policy_version 71720 (0.0008) [2023-10-14 03:53:47,911][33201] Updated weights for policy 0, policy_version 71090 (0.0008) [2023-10-14 03:53:47,927][33226] Updated weights for policy 1, policy_version 71730 (0.0007) [2023-10-14 03:53:48,281][33201] Updated weights for policy 0, policy_version 71100 (0.0009) [2023-10-14 03:53:48,288][33226] Updated weights for policy 1, policy_version 71740 (0.0009) [2023-10-14 03:53:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 146276352. Throughput: 0: 1766.9, 1: 1788.9. Samples: 36572576. 
Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:49,558][31953] Avg episode reward: [(0, '20.780'), (1, '21.000')] [2023-10-14 03:53:52,055][33201] Updated weights for policy 0, policy_version 71110 (0.0008) [2023-10-14 03:53:52,099][33226] Updated weights for policy 1, policy_version 71750 (0.0008) [2023-10-14 03:53:52,421][33201] Updated weights for policy 0, policy_version 71120 (0.0007) [2023-10-14 03:53:52,466][33226] Updated weights for policy 1, policy_version 71760 (0.0008) [2023-10-14 03:53:52,799][33201] Updated weights for policy 0, policy_version 71130 (0.0008) [2023-10-14 03:53:52,841][33226] Updated weights for policy 1, policy_version 71770 (0.0009) [2023-10-14 03:53:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 146341888. Throughput: 0: 1757.9, 1: 1766.9. Samples: 36593332. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:54,558][31953] Avg episode reward: [(0, '20.780'), (1, '20.980')] [2023-10-14 03:53:56,612][33201] Updated weights for policy 0, policy_version 71140 (0.0007) [2023-10-14 03:53:56,763][33226] Updated weights for policy 1, policy_version 71780 (0.0008) [2023-10-14 03:53:56,981][33201] Updated weights for policy 0, policy_version 71150 (0.0007) [2023-10-14 03:53:57,177][33226] Updated weights for policy 1, policy_version 71790 (0.0007) [2023-10-14 03:53:57,357][33201] Updated weights for policy 0, policy_version 71160 (0.0007) [2023-10-14 03:53:57,547][33226] Updated weights for policy 1, policy_version 71800 (0.0008) [2023-10-14 03:53:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 146407424. Throughput: 0: 1769.6, 1: 1791.7. Samples: 36604538. 
Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:53:59,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.980')] [2023-10-14 03:54:01,207][33201] Updated weights for policy 0, policy_version 71170 (0.0008) [2023-10-14 03:54:01,465][33226] Updated weights for policy 1, policy_version 71810 (0.0010) [2023-10-14 03:54:01,577][33201] Updated weights for policy 0, policy_version 71180 (0.0009) [2023-10-14 03:54:01,828][33226] Updated weights for policy 1, policy_version 71820 (0.0008) [2023-10-14 03:54:01,932][33201] Updated weights for policy 0, policy_version 71190 (0.0008) [2023-10-14 03:54:02,194][33226] Updated weights for policy 1, policy_version 71830 (0.0008) [2023-10-14 03:54:02,305][33201] Updated weights for policy 0, policy_version 71200 (0.0007) [2023-10-14 03:54:02,559][33226] Updated weights for policy 1, policy_version 71840 (0.0010) [2023-10-14 03:54:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 146472960. Throughput: 0: 1750.8, 1: 1764.4. Samples: 36624644. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:54:04,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.980')] [2023-10-14 03:54:06,207][33201] Updated weights for policy 0, policy_version 71210 (0.0008) [2023-10-14 03:54:06,404][33226] Updated weights for policy 1, policy_version 71850 (0.0008) [2023-10-14 03:54:06,574][33201] Updated weights for policy 0, policy_version 71220 (0.0008) [2023-10-14 03:54:06,768][33226] Updated weights for policy 1, policy_version 71860 (0.0009) [2023-10-14 03:54:06,939][33201] Updated weights for policy 0, policy_version 71230 (0.0008) [2023-10-14 03:54:07,143][33226] Updated weights for policy 1, policy_version 71870 (0.0008) [2023-10-14 03:54:09,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 146538496. Throughput: 0: 1749.9, 1: 1761.9. Samples: 36646436. 
Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:54:09,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.980')] [2023-10-14 03:54:10,885][33201] Updated weights for policy 0, policy_version 71240 (0.0008) [2023-10-14 03:54:11,067][33226] Updated weights for policy 1, policy_version 71880 (0.0007) [2023-10-14 03:54:11,257][33201] Updated weights for policy 0, policy_version 71250 (0.0010) [2023-10-14 03:54:11,432][33226] Updated weights for policy 1, policy_version 71890 (0.0008) [2023-10-14 03:54:11,616][33201] Updated weights for policy 0, policy_version 71260 (0.0007) [2023-10-14 03:54:11,797][33226] Updated weights for policy 1, policy_version 71900 (0.0009) [2023-10-14 03:54:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 146604032. Throughput: 0: 1740.5, 1: 1762.6. Samples: 36655940. Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:54:14,557][31953] Avg episode reward: [(0, '20.760'), (1, '20.980')] [2023-10-14 03:54:15,364][33201] Updated weights for policy 0, policy_version 71270 (0.0008) [2023-10-14 03:54:15,642][33226] Updated weights for policy 1, policy_version 71910 (0.0008) [2023-10-14 03:54:15,732][33201] Updated weights for policy 0, policy_version 71280 (0.0008) [2023-10-14 03:54:16,000][33226] Updated weights for policy 1, policy_version 71920 (0.0008) [2023-10-14 03:54:16,095][33201] Updated weights for policy 0, policy_version 71290 (0.0008) [2023-10-14 03:54:16,367][33226] Updated weights for policy 1, policy_version 71930 (0.0008) [2023-10-14 03:54:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 146669568. Throughput: 0: 1741.6, 1: 1756.9. Samples: 36677962. 
Policy #0 lag: (min: 30.0, avg: 34.4, max: 62.0) [2023-10-14 03:54:19,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.980')] [2023-10-14 03:54:20,017][33201] Updated weights for policy 0, policy_version 71300 (0.0008) [2023-10-14 03:54:20,252][33226] Updated weights for policy 1, policy_version 71940 (0.0009) [2023-10-14 03:54:20,394][33201] Updated weights for policy 0, policy_version 71310 (0.0009) [2023-10-14 03:54:20,620][33226] Updated weights for policy 1, policy_version 71950 (0.0007) [2023-10-14 03:54:20,770][33201] Updated weights for policy 0, policy_version 71320 (0.0008) [2023-10-14 03:54:20,978][33226] Updated weights for policy 1, policy_version 71960 (0.0008) [2023-10-14 03:54:24,497][33201] Updated weights for policy 0, policy_version 71330 (0.0010) [2023-10-14 03:54:24,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 146735104. Throughput: 0: 1782.9, 1: 1772.6. Samples: 36700328. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:24,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.980')] [2023-10-14 03:54:24,746][33226] Updated weights for policy 1, policy_version 71970 (0.0009) [2023-10-14 03:54:24,889][33201] Updated weights for policy 0, policy_version 71340 (0.0008) [2023-10-14 03:54:25,117][33226] Updated weights for policy 1, policy_version 71980 (0.0010) [2023-10-14 03:54:25,250][33201] Updated weights for policy 0, policy_version 71350 (0.0007) [2023-10-14 03:54:25,486][33226] Updated weights for policy 1, policy_version 71990 (0.0008) [2023-10-14 03:54:25,628][33201] Updated weights for policy 0, policy_version 71360 (0.0007) [2023-10-14 03:54:25,844][33226] Updated weights for policy 1, policy_version 72000 (0.0009) [2023-10-14 03:54:29,392][33201] Updated weights for policy 0, policy_version 71370 (0.0007) [2023-10-14 03:54:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 146800640. Throughput: 0: 1751.5, 1: 1751.2. 
Samples: 36709640. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:29,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.960')] [2023-10-14 03:54:29,751][33226] Updated weights for policy 1, policy_version 72010 (0.0008) [2023-10-14 03:54:29,760][33201] Updated weights for policy 0, policy_version 71380 (0.0008) [2023-10-14 03:54:30,114][33226] Updated weights for policy 1, policy_version 72020 (0.0011) [2023-10-14 03:54:30,119][33201] Updated weights for policy 0, policy_version 71390 (0.0008) [2023-10-14 03:54:30,485][33226] Updated weights for policy 1, policy_version 72030 (0.0008) [2023-10-14 03:54:34,022][33201] Updated weights for policy 0, policy_version 71400 (0.0007) [2023-10-14 03:54:34,078][33226] Updated weights for policy 1, policy_version 72040 (0.0008) [2023-10-14 03:54:34,387][33201] Updated weights for policy 0, policy_version 71410 (0.0007) [2023-10-14 03:54:34,436][33226] Updated weights for policy 1, policy_version 72050 (0.0007) [2023-10-14 03:54:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 146866176. Throughput: 0: 1773.1, 1: 1770.2. Samples: 36732024. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:34,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.910')] [2023-10-14 03:54:34,758][33201] Updated weights for policy 0, policy_version 71420 (0.0009) [2023-10-14 03:54:34,801][33226] Updated weights for policy 1, policy_version 72060 (0.0009) [2023-10-14 03:54:38,434][33226] Updated weights for policy 1, policy_version 72070 (0.0007) [2023-10-14 03:54:38,480][33201] Updated weights for policy 0, policy_version 71430 (0.0008) [2023-10-14 03:54:38,802][33226] Updated weights for policy 1, policy_version 72080 (0.0007) [2023-10-14 03:54:38,846][33201] Updated weights for policy 0, policy_version 71440 (0.0008) [2023-10-14 03:54:39,173][33226] Updated weights for policy 1, policy_version 72090 (0.0007) [2023-10-14 03:54:39,225][33201] Updated weights for policy 0, policy_version 71450 (0.0009) [2023-10-14 03:54:39,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 146997248. Throughput: 0: 1765.5, 1: 1778.1. Samples: 36752792. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:39,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.910')] [2023-10-14 03:54:42,999][33226] Updated weights for policy 1, policy_version 72100 (0.0007) [2023-10-14 03:54:43,105][33201] Updated weights for policy 0, policy_version 71460 (0.0008) [2023-10-14 03:54:43,399][33226] Updated weights for policy 1, policy_version 72110 (0.0008) [2023-10-14 03:54:43,482][33201] Updated weights for policy 0, policy_version 71470 (0.0007) [2023-10-14 03:54:43,768][33226] Updated weights for policy 1, policy_version 72120 (0.0008) [2023-10-14 03:54:43,854][33201] Updated weights for policy 0, policy_version 71480 (0.0009) [2023-10-14 03:54:44,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 147062784. Throughput: 0: 1772.0, 1: 1777.4. Samples: 36764260. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:44,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.910')] [2023-10-14 03:54:47,610][33226] Updated weights for policy 1, policy_version 72130 (0.0008) [2023-10-14 03:54:47,651][33201] Updated weights for policy 0, policy_version 71490 (0.0010) [2023-10-14 03:54:47,978][33226] Updated weights for policy 1, policy_version 72140 (0.0008) [2023-10-14 03:54:48,020][33201] Updated weights for policy 0, policy_version 71500 (0.0008) [2023-10-14 03:54:48,342][33226] Updated weights for policy 1, policy_version 72150 (0.0009) [2023-10-14 03:54:48,390][33201] Updated weights for policy 0, policy_version 71510 (0.0008) [2023-10-14 03:54:48,711][33226] Updated weights for policy 1, policy_version 72160 (0.0008) [2023-10-14 03:54:48,753][33201] Updated weights for policy 0, policy_version 71520 (0.0008) [2023-10-14 03:54:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 147128320. Throughput: 0: 1776.7, 1: 1788.5. Samples: 36785074. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:49,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.910')] [2023-10-14 03:54:52,437][33226] Updated weights for policy 1, policy_version 72170 (0.0008) [2023-10-14 03:54:52,637][33201] Updated weights for policy 0, policy_version 71530 (0.0008) [2023-10-14 03:54:52,816][33226] Updated weights for policy 1, policy_version 72180 (0.0009) [2023-10-14 03:54:53,003][33201] Updated weights for policy 0, policy_version 71540 (0.0007) [2023-10-14 03:54:53,187][33226] Updated weights for policy 1, policy_version 72190 (0.0008) [2023-10-14 03:54:53,380][33201] Updated weights for policy 0, policy_version 71550 (0.0008) [2023-10-14 03:54:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 147193856. Throughput: 0: 1760.8, 1: 1771.8. Samples: 36805404. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:54,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.910')] [2023-10-14 03:54:57,068][33226] Updated weights for policy 1, policy_version 72200 (0.0008) [2023-10-14 03:54:57,102][33201] Updated weights for policy 0, policy_version 71560 (0.0007) [2023-10-14 03:54:57,421][33226] Updated weights for policy 1, policy_version 72210 (0.0008) [2023-10-14 03:54:57,475][33201] Updated weights for policy 0, policy_version 71570 (0.0008) [2023-10-14 03:54:57,781][33226] Updated weights for policy 1, policy_version 72220 (0.0008) [2023-10-14 03:54:57,847][33201] Updated weights for policy 0, policy_version 71580 (0.0007) [2023-10-14 03:54:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 147259392. Throughput: 0: 1787.4, 1: 1794.1. Samples: 36817108. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:54:59,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.910')] [2023-10-14 03:55:01,540][33226] Updated weights for policy 1, policy_version 72230 (0.0009) [2023-10-14 03:55:01,611][33201] Updated weights for policy 0, policy_version 71590 (0.0008) [2023-10-14 03:55:01,914][33226] Updated weights for policy 1, policy_version 72240 (0.0008) [2023-10-14 03:55:01,978][33201] Updated weights for policy 0, policy_version 71600 (0.0008) [2023-10-14 03:55:02,277][33226] Updated weights for policy 1, policy_version 72250 (0.0009) [2023-10-14 03:55:02,353][33201] Updated weights for policy 0, policy_version 71610 (0.0010) [2023-10-14 03:55:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 147324928. Throughput: 0: 1764.3, 1: 1773.5. Samples: 36837166. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:04,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.870')] [2023-10-14 03:55:05,962][33226] Updated weights for policy 1, policy_version 72260 (0.0009) [2023-10-14 03:55:06,124][33201] Updated weights for policy 0, policy_version 71620 (0.0007) [2023-10-14 03:55:06,327][33226] Updated weights for policy 1, policy_version 72270 (0.0008) [2023-10-14 03:55:06,488][33201] Updated weights for policy 0, policy_version 71630 (0.0007) [2023-10-14 03:55:06,693][33226] Updated weights for policy 1, policy_version 72280 (0.0008) [2023-10-14 03:55:06,862][33201] Updated weights for policy 0, policy_version 71640 (0.0007) [2023-10-14 03:55:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 147390464. Throughput: 0: 1760.3, 1: 1772.5. Samples: 36859300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:09,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.870')] [2023-10-14 03:55:10,513][33226] Updated weights for policy 1, policy_version 72290 (0.0008) [2023-10-14 03:55:10,718][33201] Updated weights for policy 0, policy_version 71650 (0.0007) [2023-10-14 03:55:10,879][33226] Updated weights for policy 1, policy_version 72300 (0.0009) [2023-10-14 03:55:11,114][33201] Updated weights for policy 0, policy_version 71660 (0.0008) [2023-10-14 03:55:11,243][33226] Updated weights for policy 1, policy_version 72310 (0.0007) [2023-10-14 03:55:11,476][33201] Updated weights for policy 0, policy_version 71670 (0.0007) [2023-10-14 03:55:11,602][33226] Updated weights for policy 1, policy_version 72320 (0.0008) [2023-10-14 03:55:11,850][33201] Updated weights for policy 0, policy_version 71680 (0.0007) [2023-10-14 03:55:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 147456000. Throughput: 0: 1760.5, 1: 1778.4. Samples: 36868892. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 03:55:15,502][33226] Updated weights for policy 1, policy_version 72330 (0.0008) [2023-10-14 03:55:15,622][33201] Updated weights for policy 0, policy_version 71690 (0.0008) [2023-10-14 03:55:15,865][33226] Updated weights for policy 1, policy_version 72340 (0.0008) [2023-10-14 03:55:15,987][33201] Updated weights for policy 0, policy_version 71700 (0.0007) [2023-10-14 03:55:16,221][33226] Updated weights for policy 1, policy_version 72350 (0.0009) [2023-10-14 03:55:16,360][33201] Updated weights for policy 0, policy_version 71710 (0.0008) [2023-10-14 03:55:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 147521536. Throughput: 0: 1765.7, 1: 1769.9. Samples: 36891126. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:19,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 03:55:20,057][33201] Updated weights for policy 0, policy_version 71720 (0.0009) [2023-10-14 03:55:20,175][33226] Updated weights for policy 1, policy_version 72360 (0.0008) [2023-10-14 03:55:20,421][33201] Updated weights for policy 0, policy_version 71730 (0.0007) [2023-10-14 03:55:20,534][33226] Updated weights for policy 1, policy_version 72370 (0.0007) [2023-10-14 03:55:20,798][33201] Updated weights for policy 0, policy_version 71740 (0.0007) [2023-10-14 03:55:20,902][33226] Updated weights for policy 1, policy_version 72380 (0.0009) [2023-10-14 03:55:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 147587072. Throughput: 0: 1783.0, 1: 1781.4. Samples: 36913192. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.820')] [2023-10-14 03:55:24,565][33201] Updated weights for policy 0, policy_version 71750 (0.0007) [2023-10-14 03:55:24,609][33226] Updated weights for policy 1, policy_version 72390 (0.0008) [2023-10-14 03:55:24,940][33201] Updated weights for policy 0, policy_version 71760 (0.0008) [2023-10-14 03:55:24,984][33226] Updated weights for policy 1, policy_version 72400 (0.0008) [2023-10-14 03:55:25,319][33201] Updated weights for policy 0, policy_version 71770 (0.0009) [2023-10-14 03:55:25,347][33226] Updated weights for policy 1, policy_version 72410 (0.0008) [2023-10-14 03:55:25,528][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000071776_73498624.pth... [2023-10-14 03:55:25,563][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000070112_71794688.pth [2023-10-14 03:55:25,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000072416_74153984.pth... [2023-10-14 03:55:25,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000070752_72450048.pth [2023-10-14 03:55:29,210][33201] Updated weights for policy 0, policy_version 71780 (0.0008) [2023-10-14 03:55:29,366][33226] Updated weights for policy 1, policy_version 72420 (0.0007) [2023-10-14 03:55:29,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 147652608. Throughput: 0: 1762.1, 1: 1760.2. Samples: 36922764. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.810')] [2023-10-14 03:55:29,570][33201] Updated weights for policy 0, policy_version 71790 (0.0009) [2023-10-14 03:55:29,775][33226] Updated weights for policy 1, policy_version 72430 (0.0007) [2023-10-14 03:55:29,942][33201] Updated weights for policy 0, policy_version 71800 (0.0008) [2023-10-14 03:55:30,132][33226] Updated weights for policy 1, policy_version 72440 (0.0009) [2023-10-14 03:55:33,783][33201] Updated weights for policy 0, policy_version 71810 (0.0009) [2023-10-14 03:55:33,909][33226] Updated weights for policy 1, policy_version 72450 (0.0007) [2023-10-14 03:55:34,143][33201] Updated weights for policy 0, policy_version 71820 (0.0007) [2023-10-14 03:55:34,276][33226] Updated weights for policy 1, policy_version 72460 (0.0009) [2023-10-14 03:55:34,512][33201] Updated weights for policy 0, policy_version 71830 (0.0007) [2023-10-14 03:55:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 147718144. Throughput: 0: 1769.3, 1: 1771.9. Samples: 36944428. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:34,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.810')] [2023-10-14 03:55:34,640][33226] Updated weights for policy 1, policy_version 72470 (0.0008) [2023-10-14 03:55:34,877][33201] Updated weights for policy 0, policy_version 71840 (0.0007) [2023-10-14 03:55:34,992][33226] Updated weights for policy 1, policy_version 72480 (0.0007) [2023-10-14 03:55:38,729][33201] Updated weights for policy 0, policy_version 71850 (0.0007) [2023-10-14 03:55:38,916][33226] Updated weights for policy 1, policy_version 72490 (0.0008) [2023-10-14 03:55:39,102][33201] Updated weights for policy 0, policy_version 71860 (0.0010) [2023-10-14 03:55:39,278][33226] Updated weights for policy 1, policy_version 72500 (0.0008) [2023-10-14 03:55:39,468][33201] Updated weights for policy 0, policy_version 71870 (0.0007) [2023-10-14 03:55:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 147816448. Throughput: 0: 1772.9, 1: 1780.3. Samples: 36965296. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.810')] [2023-10-14 03:55:39,650][33226] Updated weights for policy 1, policy_version 72510 (0.0009) [2023-10-14 03:55:43,333][33226] Updated weights for policy 1, policy_version 72520 (0.0009) [2023-10-14 03:55:43,384][33201] Updated weights for policy 0, policy_version 71880 (0.0008) [2023-10-14 03:55:43,698][33226] Updated weights for policy 1, policy_version 72530 (0.0008) [2023-10-14 03:55:43,764][33201] Updated weights for policy 0, policy_version 71890 (0.0008) [2023-10-14 03:55:44,065][33226] Updated weights for policy 1, policy_version 72540 (0.0008) [2023-10-14 03:55:44,127][33201] Updated weights for policy 0, policy_version 71900 (0.0009) [2023-10-14 03:55:44,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 147914752. Throughput: 0: 1767.8, 1: 1769.1. 
Samples: 36976270. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.740')] [2023-10-14 03:55:47,927][33201] Updated weights for policy 0, policy_version 71910 (0.0009) [2023-10-14 03:55:47,981][33226] Updated weights for policy 1, policy_version 72550 (0.0008) [2023-10-14 03:55:48,291][33201] Updated weights for policy 0, policy_version 71920 (0.0010) [2023-10-14 03:55:48,343][33226] Updated weights for policy 1, policy_version 72560 (0.0008) [2023-10-14 03:55:48,658][33201] Updated weights for policy 0, policy_version 71930 (0.0009) [2023-10-14 03:55:48,701][33226] Updated weights for policy 1, policy_version 72570 (0.0008) [2023-10-14 03:55:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 147980288. Throughput: 0: 1778.0, 1: 1790.1. Samples: 36997728. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 03:55:49,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.760')] [2023-10-14 03:55:52,562][33226] Updated weights for policy 1, policy_version 72580 (0.0008) [2023-10-14 03:55:52,619][33201] Updated weights for policy 0, policy_version 71940 (0.0009) [2023-10-14 03:55:52,926][33226] Updated weights for policy 1, policy_version 72590 (0.0007) [2023-10-14 03:55:52,981][33201] Updated weights for policy 0, policy_version 71950 (0.0009) [2023-10-14 03:55:53,285][33226] Updated weights for policy 1, policy_version 72600 (0.0007) [2023-10-14 03:55:53,352][33201] Updated weights for policy 0, policy_version 71960 (0.0009) [2023-10-14 03:55:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148045824. Throughput: 0: 1754.5, 1: 1763.2. Samples: 37017598. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:55:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.760')] [2023-10-14 03:55:56,976][33226] Updated weights for policy 1, policy_version 72610 (0.0008) [2023-10-14 03:55:57,196][33201] Updated weights for policy 0, policy_version 71970 (0.0007) [2023-10-14 03:55:57,352][33226] Updated weights for policy 1, policy_version 72620 (0.0007) [2023-10-14 03:55:57,611][33201] Updated weights for policy 0, policy_version 71980 (0.0009) [2023-10-14 03:55:57,708][33226] Updated weights for policy 1, policy_version 72630 (0.0008) [2023-10-14 03:55:57,979][33201] Updated weights for policy 0, policy_version 71990 (0.0008) [2023-10-14 03:55:58,070][33226] Updated weights for policy 1, policy_version 72640 (0.0007) [2023-10-14 03:55:58,352][33201] Updated weights for policy 0, policy_version 72000 (0.0007) [2023-10-14 03:55:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148111360. Throughput: 0: 1789.4, 1: 1794.1. Samples: 37030150. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:55:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.740')] [2023-10-14 03:56:01,848][33226] Updated weights for policy 1, policy_version 72650 (0.0007) [2023-10-14 03:56:02,215][33226] Updated weights for policy 1, policy_version 72660 (0.0007) [2023-10-14 03:56:02,244][33201] Updated weights for policy 0, policy_version 72010 (0.0007) [2023-10-14 03:56:02,587][33226] Updated weights for policy 1, policy_version 72670 (0.0009) [2023-10-14 03:56:02,608][33201] Updated weights for policy 0, policy_version 72020 (0.0007) [2023-10-14 03:56:02,976][33201] Updated weights for policy 0, policy_version 72030 (0.0009) [2023-10-14 03:56:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148176896. Throughput: 0: 1748.8, 1: 1766.6. Samples: 37049320. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 03:56:06,495][33226] Updated weights for policy 1, policy_version 72680 (0.0008) [2023-10-14 03:56:06,866][33226] Updated weights for policy 1, policy_version 72690 (0.0008) [2023-10-14 03:56:06,880][33201] Updated weights for policy 0, policy_version 72040 (0.0007) [2023-10-14 03:56:07,228][33226] Updated weights for policy 1, policy_version 72700 (0.0009) [2023-10-14 03:56:07,253][33201] Updated weights for policy 0, policy_version 72050 (0.0007) [2023-10-14 03:56:07,633][33201] Updated weights for policy 0, policy_version 72060 (0.0009) [2023-10-14 03:56:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148242432. Throughput: 0: 1747.8, 1: 1766.2. Samples: 37071322. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 03:56:10,946][33226] Updated weights for policy 1, policy_version 72710 (0.0007) [2023-10-14 03:56:11,307][33226] Updated weights for policy 1, policy_version 72720 (0.0009) [2023-10-14 03:56:11,549][33201] Updated weights for policy 0, policy_version 72070 (0.0008) [2023-10-14 03:56:11,670][33226] Updated weights for policy 1, policy_version 72730 (0.0007) [2023-10-14 03:56:11,922][33201] Updated weights for policy 0, policy_version 72080 (0.0007) [2023-10-14 03:56:12,294][33201] Updated weights for policy 0, policy_version 72090 (0.0008) [2023-10-14 03:56:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148307968. Throughput: 0: 1760.4, 1: 1768.5. Samples: 37081564. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 03:56:15,682][33226] Updated weights for policy 1, policy_version 72740 (0.0008) [2023-10-14 03:56:16,056][33201] Updated weights for policy 0, policy_version 72100 (0.0010) [2023-10-14 03:56:16,075][33226] Updated weights for policy 1, policy_version 72750 (0.0008) [2023-10-14 03:56:16,421][33201] Updated weights for policy 0, policy_version 72110 (0.0008) [2023-10-14 03:56:16,425][33226] Updated weights for policy 1, policy_version 72760 (0.0009) [2023-10-14 03:56:16,792][33201] Updated weights for policy 0, policy_version 72120 (0.0009) [2023-10-14 03:56:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.6, 300 sec: 14106.9). Total num frames: 148373504. Throughput: 0: 1755.4, 1: 1771.1. Samples: 37103120. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.710')] [2023-10-14 03:56:20,193][33226] Updated weights for policy 1, policy_version 72770 (0.0008) [2023-10-14 03:56:20,474][33201] Updated weights for policy 0, policy_version 72130 (0.0009) [2023-10-14 03:56:20,548][33226] Updated weights for policy 1, policy_version 72780 (0.0008) [2023-10-14 03:56:20,855][33201] Updated weights for policy 0, policy_version 72140 (0.0009) [2023-10-14 03:56:20,916][33226] Updated weights for policy 1, policy_version 72790 (0.0009) [2023-10-14 03:56:21,230][33201] Updated weights for policy 0, policy_version 72150 (0.0009) [2023-10-14 03:56:21,283][33226] Updated weights for policy 1, policy_version 72800 (0.0009) [2023-10-14 03:56:21,594][33201] Updated weights for policy 0, policy_version 72160 (0.0008) [2023-10-14 03:56:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 148439040. Throughput: 0: 1776.6, 1: 1782.7. Samples: 37125464. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.770')] [2023-10-14 03:56:25,093][33226] Updated weights for policy 1, policy_version 72810 (0.0007) [2023-10-14 03:56:25,163][33201] Updated weights for policy 0, policy_version 72170 (0.0009) [2023-10-14 03:56:25,456][33226] Updated weights for policy 1, policy_version 72820 (0.0007) [2023-10-14 03:56:25,529][33201] Updated weights for policy 0, policy_version 72180 (0.0007) [2023-10-14 03:56:25,824][33226] Updated weights for policy 1, policy_version 72830 (0.0009) [2023-10-14 03:56:25,894][33201] Updated weights for policy 0, policy_version 72190 (0.0008) [2023-10-14 03:56:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 148504576. Throughput: 0: 1757.8, 1: 1770.2. Samples: 37135030. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:29,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 03:56:29,722][33226] Updated weights for policy 1, policy_version 72840 (0.0008) [2023-10-14 03:56:29,883][33201] Updated weights for policy 0, policy_version 72200 (0.0008) [2023-10-14 03:56:30,096][33226] Updated weights for policy 1, policy_version 72850 (0.0008) [2023-10-14 03:56:30,257][33201] Updated weights for policy 0, policy_version 72210 (0.0008) [2023-10-14 03:56:30,463][33226] Updated weights for policy 1, policy_version 72860 (0.0008) [2023-10-14 03:56:30,633][33201] Updated weights for policy 0, policy_version 72220 (0.0009) [2023-10-14 03:56:33,984][33226] Updated weights for policy 1, policy_version 72870 (0.0009) [2023-10-14 03:56:34,353][33226] Updated weights for policy 1, policy_version 72880 (0.0009) [2023-10-14 03:56:34,373][33201] Updated weights for policy 0, policy_version 72230 (0.0009) [2023-10-14 03:56:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 148570112. Throughput: 0: 1765.2, 1: 1778.6. 
Samples: 37157196. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 03:56:34,716][33226] Updated weights for policy 1, policy_version 72890 (0.0007) [2023-10-14 03:56:34,734][33201] Updated weights for policy 0, policy_version 72240 (0.0008) [2023-10-14 03:56:35,105][33201] Updated weights for policy 0, policy_version 72250 (0.0007) [2023-10-14 03:56:38,543][33226] Updated weights for policy 1, policy_version 72900 (0.0008) [2023-10-14 03:56:38,907][33226] Updated weights for policy 1, policy_version 72910 (0.0007) [2023-10-14 03:56:39,020][33201] Updated weights for policy 0, policy_version 72260 (0.0008) [2023-10-14 03:56:39,277][33226] Updated weights for policy 1, policy_version 72920 (0.0008) [2023-10-14 03:56:39,392][33201] Updated weights for policy 0, policy_version 72270 (0.0008) [2023-10-14 03:56:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 148635648. Throughput: 0: 1780.1, 1: 1789.9. Samples: 37178246. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-10-14 03:56:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 03:56:39,760][33201] Updated weights for policy 0, policy_version 72280 (0.0009) [2023-10-14 03:56:43,104][33226] Updated weights for policy 1, policy_version 72930 (0.0010) [2023-10-14 03:56:43,468][33226] Updated weights for policy 1, policy_version 72940 (0.0009) [2023-10-14 03:56:43,762][33201] Updated weights for policy 0, policy_version 72290 (0.0010) [2023-10-14 03:56:43,843][33226] Updated weights for policy 1, policy_version 72950 (0.0009) [2023-10-14 03:56:44,174][33201] Updated weights for policy 0, policy_version 72300 (0.0008) [2023-10-14 03:56:44,205][33226] Updated weights for policy 1, policy_version 72960 (0.0010) [2023-10-14 03:56:44,545][33201] Updated weights for policy 0, policy_version 72310 (0.0009) [2023-10-14 03:56:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 148733952. Throughput: 0: 1752.0, 1: 1772.5. Samples: 37188754. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:56:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 03:56:44,910][33201] Updated weights for policy 0, policy_version 72320 (0.0008) [2023-10-14 03:56:47,980][33226] Updated weights for policy 1, policy_version 72970 (0.0007) [2023-10-14 03:56:48,344][33226] Updated weights for policy 1, policy_version 72980 (0.0009) [2023-10-14 03:56:48,706][33201] Updated weights for policy 0, policy_version 72330 (0.0009) [2023-10-14 03:56:48,709][33226] Updated weights for policy 1, policy_version 72990 (0.0010) [2023-10-14 03:56:49,067][33201] Updated weights for policy 0, policy_version 72340 (0.0007) [2023-10-14 03:56:49,447][33201] Updated weights for policy 0, policy_version 72350 (0.0008) [2023-10-14 03:56:49,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 148832256. Throughput: 0: 1782.9, 1: 1793.5. 
Samples: 37210258. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:56:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 03:56:52,269][33226] Updated weights for policy 1, policy_version 73000 (0.0010) [2023-10-14 03:56:52,625][33226] Updated weights for policy 1, policy_version 73010 (0.0007) [2023-10-14 03:56:53,004][33226] Updated weights for policy 1, policy_version 73020 (0.0007) [2023-10-14 03:56:53,347][33201] Updated weights for policy 0, policy_version 72360 (0.0008) [2023-10-14 03:56:53,723][33201] Updated weights for policy 0, policy_version 72370 (0.0011) [2023-10-14 03:56:54,084][33201] Updated weights for policy 0, policy_version 72380 (0.0010) [2023-10-14 03:56:54,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148897792. Throughput: 0: 1756.9, 1: 1778.1. Samples: 37230396. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:56:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 03:56:56,729][33226] Updated weights for policy 1, policy_version 73030 (0.0008) [2023-10-14 03:56:57,090][33226] Updated weights for policy 1, policy_version 73040 (0.0007) [2023-10-14 03:56:57,459][33226] Updated weights for policy 1, policy_version 73050 (0.0009) [2023-10-14 03:56:57,957][33201] Updated weights for policy 0, policy_version 72390 (0.0009) [2023-10-14 03:56:58,329][33201] Updated weights for policy 0, policy_version 72400 (0.0008) [2023-10-14 03:56:58,707][33201] Updated weights for policy 0, policy_version 72410 (0.0008) [2023-10-14 03:56:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 148963328. Throughput: 0: 1767.8, 1: 1798.4. Samples: 37242046. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:56:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.760')] [2023-10-14 03:57:01,383][33226] Updated weights for policy 1, policy_version 73060 (0.0010) [2023-10-14 03:57:01,740][33226] Updated weights for policy 1, policy_version 73070 (0.0011) [2023-10-14 03:57:02,101][33226] Updated weights for policy 1, policy_version 73080 (0.0010) [2023-10-14 03:57:02,511][33201] Updated weights for policy 0, policy_version 72420 (0.0007) [2023-10-14 03:57:02,879][33201] Updated weights for policy 0, policy_version 72430 (0.0008) [2023-10-14 03:57:03,244][33201] Updated weights for policy 0, policy_version 72440 (0.0007) [2023-10-14 03:57:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 149028864. Throughput: 0: 1759.5, 1: 1779.9. Samples: 37262390. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:57:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.750')] [2023-10-14 03:57:06,053][33226] Updated weights for policy 1, policy_version 73090 (0.0008) [2023-10-14 03:57:06,481][33226] Updated weights for policy 1, policy_version 73100 (0.0011) [2023-10-14 03:57:06,846][33226] Updated weights for policy 1, policy_version 73110 (0.0011) [2023-10-14 03:57:07,037][33201] Updated weights for policy 0, policy_version 72450 (0.0010) [2023-10-14 03:57:07,199][33226] Updated weights for policy 1, policy_version 73120 (0.0009) [2023-10-14 03:57:07,409][33201] Updated weights for policy 0, policy_version 72460 (0.0008) [2023-10-14 03:57:07,786][33201] Updated weights for policy 0, policy_version 72470 (0.0008) [2023-10-14 03:57:08,157][33201] Updated weights for policy 0, policy_version 72480 (0.0009) [2023-10-14 03:57:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 149094400. Throughput: 0: 1741.7, 1: 1770.1. Samples: 37283496. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:57:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.750')] [2023-10-14 03:57:10,980][33226] Updated weights for policy 1, policy_version 73130 (0.0008) [2023-10-14 03:57:11,338][33226] Updated weights for policy 1, policy_version 73140 (0.0010) [2023-10-14 03:57:11,710][33226] Updated weights for policy 1, policy_version 73150 (0.0007) [2023-10-14 03:57:11,935][33201] Updated weights for policy 0, policy_version 72490 (0.0007) [2023-10-14 03:57:12,306][33201] Updated weights for policy 0, policy_version 72500 (0.0007) [2023-10-14 03:57:12,685][33201] Updated weights for policy 0, policy_version 72510 (0.0007) [2023-10-14 03:57:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 149159936. Throughput: 0: 1761.6, 1: 1766.8. Samples: 37293812. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:57:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.800')] [2023-10-14 03:57:15,520][33226] Updated weights for policy 1, policy_version 73160 (0.0007) [2023-10-14 03:57:15,881][33226] Updated weights for policy 1, policy_version 73170 (0.0007) [2023-10-14 03:57:16,253][33226] Updated weights for policy 1, policy_version 73180 (0.0009) [2023-10-14 03:57:16,389][33201] Updated weights for policy 0, policy_version 72520 (0.0008) [2023-10-14 03:57:16,760][33201] Updated weights for policy 0, policy_version 72530 (0.0008) [2023-10-14 03:57:17,128][33201] Updated weights for policy 0, policy_version 72540 (0.0010) [2023-10-14 03:57:19,558][31953] Fps is (10 sec: 13106.6, 60 sec: 14199.3, 300 sec: 14218.0). Total num frames: 149225472. Throughput: 0: 1748.3, 1: 1760.9. Samples: 37315114. 
Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:57:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.810')] [2023-10-14 03:57:19,960][33226] Updated weights for policy 1, policy_version 73190 (0.0009) [2023-10-14 03:57:20,330][33226] Updated weights for policy 1, policy_version 73200 (0.0007) [2023-10-14 03:57:20,702][33226] Updated weights for policy 1, policy_version 73210 (0.0010) [2023-10-14 03:57:20,928][33201] Updated weights for policy 0, policy_version 72550 (0.0008) [2023-10-14 03:57:21,292][33201] Updated weights for policy 0, policy_version 72560 (0.0007) [2023-10-14 03:57:21,676][33201] Updated weights for policy 0, policy_version 72570 (0.0011) [2023-10-14 03:57:24,456][33226] Updated weights for policy 1, policy_version 73220 (0.0008) [2023-10-14 03:57:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 149291008. Throughput: 0: 1755.7, 1: 1783.5. Samples: 37337510. Policy #0 lag: (min: 31.0, avg: 38.1, max: 63.0) [2023-10-14 03:57:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.810')] [2023-10-14 03:57:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000072576_74317824.pth... [2023-10-14 03:57:24,597][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000070944_72646656.pth [2023-10-14 03:57:24,823][33226] Updated weights for policy 1, policy_version 73230 (0.0007) [2023-10-14 03:57:25,186][33226] Updated weights for policy 1, policy_version 73240 (0.0007) [2023-10-14 03:57:25,479][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000073248_75005952.pth... 
[2023-10-14 03:57:25,507][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000071584_73302016.pth [2023-10-14 03:57:25,526][33201] Updated weights for policy 0, policy_version 72580 (0.0010) [2023-10-14 03:57:25,894][33201] Updated weights for policy 0, policy_version 72590 (0.0008) [2023-10-14 03:57:26,271][33201] Updated weights for policy 0, policy_version 72600 (0.0009) [2023-10-14 03:57:28,956][33226] Updated weights for policy 1, policy_version 73250 (0.0008) [2023-10-14 03:57:29,320][33226] Updated weights for policy 1, policy_version 73260 (0.0008) [2023-10-14 03:57:29,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 149356544. Throughput: 0: 1751.6, 1: 1768.2. Samples: 37347146. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.810')] [2023-10-14 03:57:29,686][33226] Updated weights for policy 1, policy_version 73270 (0.0008) [2023-10-14 03:57:30,045][33226] Updated weights for policy 1, policy_version 73280 (0.0008) [2023-10-14 03:57:30,063][33201] Updated weights for policy 0, policy_version 72610 (0.0009) [2023-10-14 03:57:30,430][33201] Updated weights for policy 0, policy_version 72620 (0.0010) [2023-10-14 03:57:30,812][33201] Updated weights for policy 0, policy_version 72630 (0.0011) [2023-10-14 03:57:31,174][33201] Updated weights for policy 0, policy_version 72640 (0.0009) [2023-10-14 03:57:33,956][33226] Updated weights for policy 1, policy_version 73290 (0.0010) [2023-10-14 03:57:34,325][33226] Updated weights for policy 1, policy_version 73300 (0.0009) [2023-10-14 03:57:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 149422080. Throughput: 0: 1754.5, 1: 1781.0. Samples: 37369356. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:34,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.860')] [2023-10-14 03:57:34,701][33226] Updated weights for policy 1, policy_version 73310 (0.0009) [2023-10-14 03:57:35,092][33201] Updated weights for policy 0, policy_version 72650 (0.0008) [2023-10-14 03:57:35,473][33201] Updated weights for policy 0, policy_version 72660 (0.0007) [2023-10-14 03:57:35,841][33201] Updated weights for policy 0, policy_version 72670 (0.0010) [2023-10-14 03:57:38,302][33226] Updated weights for policy 1, policy_version 73320 (0.0009) [2023-10-14 03:57:38,667][33226] Updated weights for policy 1, policy_version 73330 (0.0009) [2023-10-14 03:57:39,041][33226] Updated weights for policy 1, policy_version 73340 (0.0010) [2023-10-14 03:57:39,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 149520384. Throughput: 0: 1778.1, 1: 1775.4. Samples: 37390304. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:39,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.860')] [2023-10-14 03:57:39,711][33201] Updated weights for policy 0, policy_version 72680 (0.0008) [2023-10-14 03:57:40,080][33201] Updated weights for policy 0, policy_version 72690 (0.0010) [2023-10-14 03:57:40,454][33201] Updated weights for policy 0, policy_version 72700 (0.0009) [2023-10-14 03:57:42,811][33226] Updated weights for policy 1, policy_version 73350 (0.0008) [2023-10-14 03:57:43,176][33226] Updated weights for policy 1, policy_version 73360 (0.0008) [2023-10-14 03:57:43,547][33226] Updated weights for policy 1, policy_version 73370 (0.0007) [2023-10-14 03:57:44,098][33201] Updated weights for policy 0, policy_version 72710 (0.0008) [2023-10-14 03:57:44,472][33201] Updated weights for policy 0, policy_version 72720 (0.0009) [2023-10-14 03:57:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 149585920. Throughput: 0: 1757.6, 1: 1777.3. 
Samples: 37401118. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.860')] [2023-10-14 03:57:44,842][33201] Updated weights for policy 0, policy_version 72730 (0.0007) [2023-10-14 03:57:47,419][33226] Updated weights for policy 1, policy_version 73380 (0.0008) [2023-10-14 03:57:47,790][33226] Updated weights for policy 1, policy_version 73390 (0.0009) [2023-10-14 03:57:48,149][33226] Updated weights for policy 1, policy_version 73400 (0.0010) [2023-10-14 03:57:48,764][33201] Updated weights for policy 0, policy_version 72740 (0.0007) [2023-10-14 03:57:49,133][33201] Updated weights for policy 0, policy_version 72750 (0.0008) [2023-10-14 03:57:49,502][33201] Updated weights for policy 0, policy_version 72760 (0.0008) [2023-10-14 03:57:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 149651456. Throughput: 0: 1777.0, 1: 1778.4. Samples: 37422384. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.880')] [2023-10-14 03:57:52,018][33226] Updated weights for policy 1, policy_version 73410 (0.0008) [2023-10-14 03:57:52,413][33226] Updated weights for policy 1, policy_version 73420 (0.0007) [2023-10-14 03:57:52,779][33226] Updated weights for policy 1, policy_version 73430 (0.0008) [2023-10-14 03:57:53,144][33226] Updated weights for policy 1, policy_version 73440 (0.0009) [2023-10-14 03:57:53,380][33201] Updated weights for policy 0, policy_version 72770 (0.0009) [2023-10-14 03:57:53,750][33201] Updated weights for policy 0, policy_version 72780 (0.0008) [2023-10-14 03:57:54,123][33201] Updated weights for policy 0, policy_version 72790 (0.0007) [2023-10-14 03:57:54,488][33201] Updated weights for policy 0, policy_version 72800 (0.0011) [2023-10-14 03:57:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 149749760. 
Throughput: 0: 1770.4, 1: 1771.9. Samples: 37442898. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:54,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.890')] [2023-10-14 03:57:56,793][33226] Updated weights for policy 1, policy_version 73450 (0.0009) [2023-10-14 03:57:57,166][33226] Updated weights for policy 1, policy_version 73460 (0.0007) [2023-10-14 03:57:57,540][33226] Updated weights for policy 1, policy_version 73470 (0.0008) [2023-10-14 03:57:58,293][33201] Updated weights for policy 0, policy_version 72810 (0.0008) [2023-10-14 03:57:58,669][33201] Updated weights for policy 0, policy_version 72820 (0.0007) [2023-10-14 03:57:59,040][33201] Updated weights for policy 0, policy_version 72830 (0.0007) [2023-10-14 03:57:59,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 149815296. Throughput: 0: 1769.2, 1: 1798.8. Samples: 37454368. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:57:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.890')] [2023-10-14 03:58:01,401][33226] Updated weights for policy 1, policy_version 73480 (0.0008) [2023-10-14 03:58:01,763][33226] Updated weights for policy 1, policy_version 73490 (0.0008) [2023-10-14 03:58:02,135][33226] Updated weights for policy 1, policy_version 73500 (0.0009) [2023-10-14 03:58:02,678][33201] Updated weights for policy 0, policy_version 72840 (0.0008) [2023-10-14 03:58:03,043][33201] Updated weights for policy 0, policy_version 72850 (0.0007) [2023-10-14 03:58:03,415][33201] Updated weights for policy 0, policy_version 72860 (0.0007) [2023-10-14 03:58:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 149880832. Throughput: 0: 1769.2, 1: 1786.6. Samples: 37475122. 
Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:58:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 03:58:05,735][33226] Updated weights for policy 1, policy_version 73510 (0.0008) [2023-10-14 03:58:06,102][33226] Updated weights for policy 1, policy_version 73520 (0.0008) [2023-10-14 03:58:06,470][33226] Updated weights for policy 1, policy_version 73530 (0.0008) [2023-10-14 03:58:07,391][33201] Updated weights for policy 0, policy_version 72870 (0.0008) [2023-10-14 03:58:07,756][33201] Updated weights for policy 0, policy_version 72880 (0.0010) [2023-10-14 03:58:08,126][33201] Updated weights for policy 0, policy_version 72890 (0.0009) [2023-10-14 03:58:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 149946368. Throughput: 0: 1753.3, 1: 1778.4. Samples: 37496440. Policy #0 lag: (min: 0.0, avg: 26.1, max: 32.0) [2023-10-14 03:58:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 03:58:10,455][33226] Updated weights for policy 1, policy_version 73540 (0.0010) [2023-10-14 03:58:10,822][33226] Updated weights for policy 1, policy_version 73550 (0.0007) [2023-10-14 03:58:11,176][33226] Updated weights for policy 1, policy_version 73560 (0.0007) [2023-10-14 03:58:12,158][33201] Updated weights for policy 0, policy_version 72900 (0.0007) [2023-10-14 03:58:12,533][33201] Updated weights for policy 0, policy_version 72910 (0.0007) [2023-10-14 03:58:12,898][33201] Updated weights for policy 0, policy_version 72920 (0.0008) [2023-10-14 03:58:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 150011904. Throughput: 0: 1783.6, 1: 1775.1. Samples: 37507288. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.910')] [2023-10-14 03:58:14,966][33226] Updated weights for policy 1, policy_version 73570 (0.0008) [2023-10-14 03:58:15,331][33226] Updated weights for policy 1, policy_version 73580 (0.0009) [2023-10-14 03:58:15,700][33226] Updated weights for policy 1, policy_version 73590 (0.0007) [2023-10-14 03:58:16,068][33226] Updated weights for policy 1, policy_version 73600 (0.0008) [2023-10-14 03:58:16,906][33201] Updated weights for policy 0, policy_version 72930 (0.0008) [2023-10-14 03:58:17,279][33201] Updated weights for policy 0, policy_version 72940 (0.0007) [2023-10-14 03:58:17,637][33201] Updated weights for policy 0, policy_version 72950 (0.0010) [2023-10-14 03:58:18,007][33201] Updated weights for policy 0, policy_version 72960 (0.0010) [2023-10-14 03:58:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 150077440. Throughput: 0: 1750.2, 1: 1772.9. Samples: 37527892. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.910')] [2023-10-14 03:58:19,877][33226] Updated weights for policy 1, policy_version 73610 (0.0009) [2023-10-14 03:58:20,242][33226] Updated weights for policy 1, policy_version 73620 (0.0007) [2023-10-14 03:58:20,615][33226] Updated weights for policy 1, policy_version 73630 (0.0010) [2023-10-14 03:58:21,700][33201] Updated weights for policy 0, policy_version 72970 (0.0007) [2023-10-14 03:58:22,083][33201] Updated weights for policy 0, policy_version 72980 (0.0009) [2023-10-14 03:58:22,442][33201] Updated weights for policy 0, policy_version 72990 (0.0010) [2023-10-14 03:58:24,429][33226] Updated weights for policy 1, policy_version 73640 (0.0008) [2023-10-14 03:58:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 150142976. Throughput: 0: 1753.1, 1: 1799.9. 
Samples: 37550188. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.910')] [2023-10-14 03:58:24,800][33226] Updated weights for policy 1, policy_version 73650 (0.0008) [2023-10-14 03:58:25,168][33226] Updated weights for policy 1, policy_version 73660 (0.0009) [2023-10-14 03:58:26,305][33201] Updated weights for policy 0, policy_version 73000 (0.0008) [2023-10-14 03:58:26,671][33201] Updated weights for policy 0, policy_version 73010 (0.0008) [2023-10-14 03:58:27,037][33201] Updated weights for policy 0, policy_version 73020 (0.0007) [2023-10-14 03:58:29,051][33226] Updated weights for policy 1, policy_version 73670 (0.0010) [2023-10-14 03:58:29,422][33226] Updated weights for policy 1, policy_version 73680 (0.0010) [2023-10-14 03:58:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 150208512. Throughput: 0: 1756.3, 1: 1771.2. Samples: 37559858. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:29,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.910')] [2023-10-14 03:58:29,787][33226] Updated weights for policy 1, policy_version 73690 (0.0009) [2023-10-14 03:58:30,678][33201] Updated weights for policy 0, policy_version 73030 (0.0008) [2023-10-14 03:58:31,045][33201] Updated weights for policy 0, policy_version 73040 (0.0009) [2023-10-14 03:58:31,417][33201] Updated weights for policy 0, policy_version 73050 (0.0010) [2023-10-14 03:58:33,695][33226] Updated weights for policy 1, policy_version 73700 (0.0009) [2023-10-14 03:58:34,057][33226] Updated weights for policy 1, policy_version 73710 (0.0009) [2023-10-14 03:58:34,423][33226] Updated weights for policy 1, policy_version 73720 (0.0010) [2023-10-14 03:58:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 150274048. Throughput: 0: 1749.0, 1: 1789.9. Samples: 37581634. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:34,557][31953] Avg episode reward: [(0, '20.920'), (1, '20.940')] [2023-10-14 03:58:35,297][33201] Updated weights for policy 0, policy_version 73060 (0.0008) [2023-10-14 03:58:35,675][33201] Updated weights for policy 0, policy_version 73070 (0.0010) [2023-10-14 03:58:36,044][33201] Updated weights for policy 0, policy_version 73080 (0.0011) [2023-10-14 03:58:38,276][33226] Updated weights for policy 1, policy_version 73730 (0.0011) [2023-10-14 03:58:38,704][33226] Updated weights for policy 1, policy_version 73740 (0.0007) [2023-10-14 03:58:39,076][33226] Updated weights for policy 1, policy_version 73750 (0.0008) [2023-10-14 03:58:39,441][33226] Updated weights for policy 1, policy_version 73760 (0.0009) [2023-10-14 03:58:39,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 150372352. Throughput: 0: 1769.9, 1: 1785.6. Samples: 37602894. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 03:58:39,965][33201] Updated weights for policy 0, policy_version 73090 (0.0009) [2023-10-14 03:58:40,335][33201] Updated weights for policy 0, policy_version 73100 (0.0007) [2023-10-14 03:58:40,708][33201] Updated weights for policy 0, policy_version 73110 (0.0011) [2023-10-14 03:58:41,086][33201] Updated weights for policy 0, policy_version 73120 (0.0007) [2023-10-14 03:58:43,197][33226] Updated weights for policy 1, policy_version 73770 (0.0010) [2023-10-14 03:58:43,561][33226] Updated weights for policy 1, policy_version 73780 (0.0009) [2023-10-14 03:58:43,943][33226] Updated weights for policy 1, policy_version 73790 (0.0010) [2023-10-14 03:58:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 150437888. Throughput: 0: 1750.3, 1: 1780.8. Samples: 37613266. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.940')] [2023-10-14 03:58:44,911][33201] Updated weights for policy 0, policy_version 73130 (0.0008) [2023-10-14 03:58:45,293][33201] Updated weights for policy 0, policy_version 73140 (0.0008) [2023-10-14 03:58:45,650][33201] Updated weights for policy 0, policy_version 73150 (0.0007) [2023-10-14 03:58:47,569][33226] Updated weights for policy 1, policy_version 73800 (0.0008) [2023-10-14 03:58:47,929][33226] Updated weights for policy 1, policy_version 73810 (0.0008) [2023-10-14 03:58:48,298][33226] Updated weights for policy 1, policy_version 73820 (0.0008) [2023-10-14 03:58:49,362][33201] Updated weights for policy 0, policy_version 73160 (0.0008) [2023-10-14 03:58:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 150503424. Throughput: 0: 1769.7, 1: 1784.8. Samples: 37635074. Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 03:58:49,727][33201] Updated weights for policy 0, policy_version 73170 (0.0009) [2023-10-14 03:58:50,097][33201] Updated weights for policy 0, policy_version 73180 (0.0009) [2023-10-14 03:58:51,954][33226] Updated weights for policy 1, policy_version 73830 (0.0008) [2023-10-14 03:58:52,329][33226] Updated weights for policy 1, policy_version 73840 (0.0008) [2023-10-14 03:58:52,697][33226] Updated weights for policy 1, policy_version 73850 (0.0007) [2023-10-14 03:58:53,899][33201] Updated weights for policy 0, policy_version 73190 (0.0008) [2023-10-14 03:58:54,271][33201] Updated weights for policy 0, policy_version 73200 (0.0010) [2023-10-14 03:58:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 150568960. Throughput: 0: 1775.3, 1: 1775.3. Samples: 37656218. 
Policy #0 lag: (min: 26.0, avg: 26.0, max: 26.0) [2023-10-14 03:58:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 03:58:54,639][33201] Updated weights for policy 0, policy_version 73210 (0.0009) [2023-10-14 03:58:56,338][33226] Updated weights for policy 1, policy_version 73860 (0.0009) [2023-10-14 03:58:56,704][33226] Updated weights for policy 1, policy_version 73870 (0.0009) [2023-10-14 03:58:57,079][33226] Updated weights for policy 1, policy_version 73880 (0.0010) [2023-10-14 03:58:58,491][33201] Updated weights for policy 0, policy_version 73220 (0.0007) [2023-10-14 03:58:58,862][33201] Updated weights for policy 0, policy_version 73230 (0.0008) [2023-10-14 03:58:59,229][33201] Updated weights for policy 0, policy_version 73240 (0.0008) [2023-10-14 03:58:59,557][31953] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 150667264. Throughput: 0: 1756.9, 1: 1793.7. Samples: 37667068. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:58:59,559][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 03:59:00,951][33226] Updated weights for policy 1, policy_version 73890 (0.0009) [2023-10-14 03:59:01,318][33226] Updated weights for policy 1, policy_version 73900 (0.0009) [2023-10-14 03:59:01,692][33226] Updated weights for policy 1, policy_version 73910 (0.0010) [2023-10-14 03:59:02,060][33226] Updated weights for policy 1, policy_version 73920 (0.0009) [2023-10-14 03:59:03,024][33201] Updated weights for policy 0, policy_version 73250 (0.0010) [2023-10-14 03:59:03,398][33201] Updated weights for policy 0, policy_version 73260 (0.0009) [2023-10-14 03:59:03,765][33201] Updated weights for policy 0, policy_version 73270 (0.0008) [2023-10-14 03:59:04,127][33201] Updated weights for policy 0, policy_version 73280 (0.0010) [2023-10-14 03:59:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 150732800. Throughput: 0: 1790.4, 1: 1780.9. 
Samples: 37688598. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.910')] [2023-10-14 03:59:05,756][33226] Updated weights for policy 1, policy_version 73930 (0.0007) [2023-10-14 03:59:06,118][33226] Updated weights for policy 1, policy_version 73940 (0.0010) [2023-10-14 03:59:06,488][33226] Updated weights for policy 1, policy_version 73950 (0.0011) [2023-10-14 03:59:08,118][33201] Updated weights for policy 0, policy_version 73290 (0.0008) [2023-10-14 03:59:08,498][33201] Updated weights for policy 0, policy_version 73300 (0.0008) [2023-10-14 03:59:08,870][33201] Updated weights for policy 0, policy_version 73310 (0.0008) [2023-10-14 03:59:09,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 150798336. Throughput: 0: 1755.7, 1: 1780.1. Samples: 37709300. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.910')] [2023-10-14 03:59:10,227][33226] Updated weights for policy 1, policy_version 73960 (0.0009) [2023-10-14 03:59:10,596][33226] Updated weights for policy 1, policy_version 73970 (0.0008) [2023-10-14 03:59:10,967][33226] Updated weights for policy 1, policy_version 73980 (0.0008) [2023-10-14 03:59:12,592][33201] Updated weights for policy 0, policy_version 73320 (0.0007) [2023-10-14 03:59:12,960][33201] Updated weights for policy 0, policy_version 73330 (0.0009) [2023-10-14 03:59:13,339][33201] Updated weights for policy 0, policy_version 73340 (0.0008) [2023-10-14 03:59:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 150863872. Throughput: 0: 1788.0, 1: 1779.1. Samples: 37720378. 
Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.900')] [2023-10-14 03:59:14,787][33226] Updated weights for policy 1, policy_version 73990 (0.0009) [2023-10-14 03:59:15,143][33226] Updated weights for policy 1, policy_version 74000 (0.0008) [2023-10-14 03:59:15,509][33226] Updated weights for policy 1, policy_version 74010 (0.0009) [2023-10-14 03:59:16,996][33201] Updated weights for policy 0, policy_version 73350 (0.0008) [2023-10-14 03:59:17,362][33201] Updated weights for policy 0, policy_version 73360 (0.0009) [2023-10-14 03:59:17,744][33201] Updated weights for policy 0, policy_version 73370 (0.0008) [2023-10-14 03:59:19,224][33226] Updated weights for policy 1, policy_version 74020 (0.0009) [2023-10-14 03:59:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 150929408. Throughput: 0: 1761.5, 1: 1784.9. Samples: 37741224. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:19,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.900')] [2023-10-14 03:59:19,599][33226] Updated weights for policy 1, policy_version 74030 (0.0010) [2023-10-14 03:59:19,969][33226] Updated weights for policy 1, policy_version 74040 (0.0007) [2023-10-14 03:59:21,546][33201] Updated weights for policy 0, policy_version 73380 (0.0008) [2023-10-14 03:59:21,915][33201] Updated weights for policy 0, policy_version 73390 (0.0009) [2023-10-14 03:59:22,282][33201] Updated weights for policy 0, policy_version 73400 (0.0010) [2023-10-14 03:59:23,958][33226] Updated weights for policy 1, policy_version 74050 (0.0009) [2023-10-14 03:59:24,362][33226] Updated weights for policy 1, policy_version 74060 (0.0010) [2023-10-14 03:59:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 150994944. Throughput: 0: 1761.3, 1: 1803.9. Samples: 37763328. 
Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:24,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.860')] [2023-10-14 03:59:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000073408_75169792.pth... [2023-10-14 03:59:24,600][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000071776_73498624.pth [2023-10-14 03:59:24,725][33226] Updated weights for policy 1, policy_version 74070 (0.0010) [2023-10-14 03:59:25,092][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000074080_75857920.pth... [2023-10-14 03:59:25,095][33226] Updated weights for policy 1, policy_version 74080 (0.0009) [2023-10-14 03:59:25,132][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000072416_74153984.pth [2023-10-14 03:59:26,026][33201] Updated weights for policy 0, policy_version 73410 (0.0011) [2023-10-14 03:59:26,402][33201] Updated weights for policy 0, policy_version 73420 (0.0010) [2023-10-14 03:59:26,766][33201] Updated weights for policy 0, policy_version 73430 (0.0010) [2023-10-14 03:59:27,135][33201] Updated weights for policy 0, policy_version 73440 (0.0009) [2023-10-14 03:59:28,839][33226] Updated weights for policy 1, policy_version 74090 (0.0010) [2023-10-14 03:59:29,203][33226] Updated weights for policy 1, policy_version 74100 (0.0008) [2023-10-14 03:59:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 151060480. Throughput: 0: 1770.4, 1: 1786.3. Samples: 37773316. 
Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:29,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.860')] [2023-10-14 03:59:29,564][33226] Updated weights for policy 1, policy_version 74110 (0.0009) [2023-10-14 03:59:31,084][33201] Updated weights for policy 0, policy_version 73450 (0.0007) [2023-10-14 03:59:31,463][33201] Updated weights for policy 0, policy_version 73460 (0.0011) [2023-10-14 03:59:31,827][33201] Updated weights for policy 0, policy_version 73470 (0.0008) [2023-10-14 03:59:33,412][33226] Updated weights for policy 1, policy_version 74120 (0.0008) [2023-10-14 03:59:33,775][33226] Updated weights for policy 1, policy_version 74130 (0.0008) [2023-10-14 03:59:34,147][33226] Updated weights for policy 1, policy_version 74140 (0.0007) [2023-10-14 03:59:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.5, 300 sec: 14106.9). Total num frames: 151158784. Throughput: 0: 1769.5, 1: 1796.8. Samples: 37795558. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:34,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.840')] [2023-10-14 03:59:35,518][33201] Updated weights for policy 0, policy_version 73480 (0.0008) [2023-10-14 03:59:35,889][33201] Updated weights for policy 0, policy_version 73490 (0.0008) [2023-10-14 03:59:36,258][33201] Updated weights for policy 0, policy_version 73500 (0.0008) [2023-10-14 03:59:37,905][33226] Updated weights for policy 1, policy_version 74150 (0.0007) [2023-10-14 03:59:38,263][33226] Updated weights for policy 1, policy_version 74160 (0.0010) [2023-10-14 03:59:38,633][33226] Updated weights for policy 1, policy_version 74170 (0.0008) [2023-10-14 03:59:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 151224320. Throughput: 0: 1782.5, 1: 1777.7. Samples: 37816426. 
Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:39,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.850')] [2023-10-14 03:59:40,086][33201] Updated weights for policy 0, policy_version 73510 (0.0010) [2023-10-14 03:59:40,457][33201] Updated weights for policy 0, policy_version 73520 (0.0008) [2023-10-14 03:59:40,830][33201] Updated weights for policy 0, policy_version 73530 (0.0007) [2023-10-14 03:59:42,422][33226] Updated weights for policy 1, policy_version 74180 (0.0009) [2023-10-14 03:59:42,799][33226] Updated weights for policy 1, policy_version 74190 (0.0010) [2023-10-14 03:59:43,154][33226] Updated weights for policy 1, policy_version 74200 (0.0011) [2023-10-14 03:59:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 151289856. Throughput: 0: 1766.5, 1: 1796.3. Samples: 37827396. Policy #0 lag: (min: 10.0, avg: 10.4, max: 23.0) [2023-10-14 03:59:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.850')] [2023-10-14 03:59:44,680][33201] Updated weights for policy 0, policy_version 73540 (0.0009) [2023-10-14 03:59:45,049][33201] Updated weights for policy 0, policy_version 73550 (0.0011) [2023-10-14 03:59:45,417][33201] Updated weights for policy 0, policy_version 73560 (0.0008) [2023-10-14 03:59:47,070][33226] Updated weights for policy 1, policy_version 74210 (0.0010) [2023-10-14 03:59:47,438][33226] Updated weights for policy 1, policy_version 74220 (0.0008) [2023-10-14 03:59:47,795][33226] Updated weights for policy 1, policy_version 74230 (0.0009) [2023-10-14 03:59:48,169][33226] Updated weights for policy 1, policy_version 74240 (0.0008) [2023-10-14 03:59:49,319][33201] Updated weights for policy 0, policy_version 73570 (0.0007) [2023-10-14 03:59:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 151355392. Throughput: 0: 1768.9, 1: 1780.4. Samples: 37848320. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 03:59:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.840')] [2023-10-14 03:59:49,699][33201] Updated weights for policy 0, policy_version 73580 (0.0009) [2023-10-14 03:59:50,059][33201] Updated weights for policy 0, policy_version 73590 (0.0010) [2023-10-14 03:59:50,434][33201] Updated weights for policy 0, policy_version 73600 (0.0008) [2023-10-14 03:59:51,957][33226] Updated weights for policy 1, policy_version 74250 (0.0007) [2023-10-14 03:59:52,317][33226] Updated weights for policy 1, policy_version 74260 (0.0007) [2023-10-14 03:59:52,687][33226] Updated weights for policy 1, policy_version 74270 (0.0007) [2023-10-14 03:59:54,183][33201] Updated weights for policy 0, policy_version 73610 (0.0009) [2023-10-14 03:59:54,548][33201] Updated weights for policy 0, policy_version 73620 (0.0007) [2023-10-14 03:59:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 151420928. Throughput: 0: 1793.9, 1: 1765.9. Samples: 37869492. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 03:59:54,559][31953] Avg episode reward: [(0, '20.950'), (1, '20.840')] [2023-10-14 03:59:54,914][33201] Updated weights for policy 0, policy_version 73630 (0.0010) [2023-10-14 03:59:56,492][33226] Updated weights for policy 1, policy_version 74280 (0.0008) [2023-10-14 03:59:56,854][33226] Updated weights for policy 1, policy_version 74290 (0.0007) [2023-10-14 03:59:57,219][33226] Updated weights for policy 1, policy_version 74300 (0.0008) [2023-10-14 03:59:58,732][33201] Updated weights for policy 0, policy_version 73640 (0.0008) [2023-10-14 03:59:59,105][33201] Updated weights for policy 0, policy_version 73650 (0.0008) [2023-10-14 03:59:59,474][33201] Updated weights for policy 0, policy_version 73660 (0.0011) [2023-10-14 03:59:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 151486464. Throughput: 0: 1761.9, 1: 1782.3. 
Samples: 37879866. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 03:59:59,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.840')] [2023-10-14 04:00:01,192][33226] Updated weights for policy 1, policy_version 74310 (0.0007) [2023-10-14 04:00:01,565][33226] Updated weights for policy 1, policy_version 74320 (0.0009) [2023-10-14 04:00:01,926][33226] Updated weights for policy 1, policy_version 74330 (0.0009) [2023-10-14 04:00:03,306][33201] Updated weights for policy 0, policy_version 73670 (0.0008) [2023-10-14 04:00:03,671][33201] Updated weights for policy 0, policy_version 73680 (0.0008) [2023-10-14 04:00:04,038][33201] Updated weights for policy 0, policy_version 73690 (0.0008) [2023-10-14 04:00:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 151584768. Throughput: 0: 1796.8, 1: 1764.5. Samples: 37901480. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 04:00:05,648][33226] Updated weights for policy 1, policy_version 74340 (0.0007) [2023-10-14 04:00:06,029][33226] Updated weights for policy 1, policy_version 74350 (0.0009) [2023-10-14 04:00:06,395][33226] Updated weights for policy 1, policy_version 74360 (0.0010) [2023-10-14 04:00:07,837][33201] Updated weights for policy 0, policy_version 73700 (0.0009) [2023-10-14 04:00:08,206][33201] Updated weights for policy 0, policy_version 73710 (0.0008) [2023-10-14 04:00:08,573][33201] Updated weights for policy 0, policy_version 73720 (0.0008) [2023-10-14 04:00:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 151650304. Throughput: 0: 1764.3, 1: 1769.8. Samples: 37922362. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:09,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 04:00:10,306][33226] Updated weights for policy 1, policy_version 74370 (0.0008) [2023-10-14 04:00:10,718][33226] Updated weights for policy 1, policy_version 74380 (0.0008) [2023-10-14 04:00:11,085][33226] Updated weights for policy 1, policy_version 74390 (0.0010) [2023-10-14 04:00:11,449][33226] Updated weights for policy 1, policy_version 74400 (0.0010) [2023-10-14 04:00:12,355][33201] Updated weights for policy 0, policy_version 73730 (0.0010) [2023-10-14 04:00:12,719][33201] Updated weights for policy 0, policy_version 73740 (0.0009) [2023-10-14 04:00:13,099][33201] Updated weights for policy 0, policy_version 73750 (0.0007) [2023-10-14 04:00:13,457][33201] Updated weights for policy 0, policy_version 73760 (0.0010) [2023-10-14 04:00:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 151715840. Throughput: 0: 1788.3, 1: 1762.3. Samples: 37933098. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 04:00:15,173][33226] Updated weights for policy 1, policy_version 74410 (0.0007) [2023-10-14 04:00:15,538][33226] Updated weights for policy 1, policy_version 74420 (0.0008) [2023-10-14 04:00:15,908][33226] Updated weights for policy 1, policy_version 74430 (0.0007) [2023-10-14 04:00:17,386][33201] Updated weights for policy 0, policy_version 73770 (0.0007) [2023-10-14 04:00:17,749][33201] Updated weights for policy 0, policy_version 73780 (0.0009) [2023-10-14 04:00:18,119][33201] Updated weights for policy 0, policy_version 73790 (0.0008) [2023-10-14 04:00:19,462][33226] Updated weights for policy 1, policy_version 74440 (0.0009) [2023-10-14 04:00:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 151781376. Throughput: 0: 1759.8, 1: 1765.3. 
Samples: 37954190. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.840')] [2023-10-14 04:00:19,831][33226] Updated weights for policy 1, policy_version 74450 (0.0008) [2023-10-14 04:00:20,197][33226] Updated weights for policy 1, policy_version 74460 (0.0011) [2023-10-14 04:00:22,023][33201] Updated weights for policy 0, policy_version 73800 (0.0008) [2023-10-14 04:00:22,400][33201] Updated weights for policy 0, policy_version 73810 (0.0009) [2023-10-14 04:00:22,772][33201] Updated weights for policy 0, policy_version 73820 (0.0008) [2023-10-14 04:00:23,983][33226] Updated weights for policy 1, policy_version 74470 (0.0010) [2023-10-14 04:00:24,353][33226] Updated weights for policy 1, policy_version 74480 (0.0007) [2023-10-14 04:00:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 151846912. Throughput: 0: 1753.6, 1: 1791.6. Samples: 37975962. Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:24,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.840')] [2023-10-14 04:00:24,730][33226] Updated weights for policy 1, policy_version 74490 (0.0008) [2023-10-14 04:00:26,537][33201] Updated weights for policy 0, policy_version 73830 (0.0009) [2023-10-14 04:00:26,902][33201] Updated weights for policy 0, policy_version 73840 (0.0009) [2023-10-14 04:00:27,286][33201] Updated weights for policy 0, policy_version 73850 (0.0007) [2023-10-14 04:00:28,561][33226] Updated weights for policy 1, policy_version 74500 (0.0011) [2023-10-14 04:00:28,926][33226] Updated weights for policy 1, policy_version 74510 (0.0009) [2023-10-14 04:00:29,296][33226] Updated weights for policy 1, policy_version 74520 (0.0008) [2023-10-14 04:00:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 151912448. Throughput: 0: 1766.5, 1: 1760.2. Samples: 37986098. 
Policy #0 lag: (min: 1.0, avg: 13.1, max: 33.0) [2023-10-14 04:00:29,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.840')] [2023-10-14 04:00:31,082][33201] Updated weights for policy 0, policy_version 73860 (0.0008) [2023-10-14 04:00:31,450][33201] Updated weights for policy 0, policy_version 73870 (0.0008) [2023-10-14 04:00:31,827][33201] Updated weights for policy 0, policy_version 73880 (0.0007) [2023-10-14 04:00:33,087][33226] Updated weights for policy 1, policy_version 74530 (0.0008) [2023-10-14 04:00:33,449][33226] Updated weights for policy 1, policy_version 74540 (0.0008) [2023-10-14 04:00:33,819][33226] Updated weights for policy 1, policy_version 74550 (0.0008) [2023-10-14 04:00:34,179][33226] Updated weights for policy 1, policy_version 74560 (0.0007) [2023-10-14 04:00:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152010752. Throughput: 0: 1753.1, 1: 1789.5. Samples: 38007738. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.840')] [2023-10-14 04:00:35,623][33201] Updated weights for policy 0, policy_version 73890 (0.0008) [2023-10-14 04:00:35,995][33201] Updated weights for policy 0, policy_version 73900 (0.0011) [2023-10-14 04:00:36,362][33201] Updated weights for policy 0, policy_version 73910 (0.0007) [2023-10-14 04:00:36,742][33201] Updated weights for policy 0, policy_version 73920 (0.0009) [2023-10-14 04:00:37,881][33226] Updated weights for policy 1, policy_version 74570 (0.0007) [2023-10-14 04:00:38,247][33226] Updated weights for policy 1, policy_version 74580 (0.0007) [2023-10-14 04:00:38,606][33226] Updated weights for policy 1, policy_version 74590 (0.0007) [2023-10-14 04:00:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 152076288. Throughput: 0: 1764.9, 1: 1771.8. Samples: 38028642. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:39,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.850')] [2023-10-14 04:00:40,643][33201] Updated weights for policy 0, policy_version 73930 (0.0007) [2023-10-14 04:00:41,026][33201] Updated weights for policy 0, policy_version 73940 (0.0009) [2023-10-14 04:00:41,387][33201] Updated weights for policy 0, policy_version 73950 (0.0010) [2023-10-14 04:00:42,468][33226] Updated weights for policy 1, policy_version 74600 (0.0009) [2023-10-14 04:00:42,839][33226] Updated weights for policy 1, policy_version 74610 (0.0008) [2023-10-14 04:00:43,199][33226] Updated weights for policy 1, policy_version 74620 (0.0008) [2023-10-14 04:00:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 152141824. Throughput: 0: 1753.2, 1: 1789.5. Samples: 38039284. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.850')] [2023-10-14 04:00:45,214][33201] Updated weights for policy 0, policy_version 73960 (0.0008) [2023-10-14 04:00:45,579][33201] Updated weights for policy 0, policy_version 73970 (0.0008) [2023-10-14 04:00:45,955][33201] Updated weights for policy 0, policy_version 73980 (0.0008) [2023-10-14 04:00:46,997][33226] Updated weights for policy 1, policy_version 74630 (0.0009) [2023-10-14 04:00:47,366][33226] Updated weights for policy 1, policy_version 74640 (0.0011) [2023-10-14 04:00:47,739][33226] Updated weights for policy 1, policy_version 74650 (0.0009) [2023-10-14 04:00:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 152207360. Throughput: 0: 1746.1, 1: 1772.3. Samples: 38059806. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:49,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.850')] [2023-10-14 04:00:49,961][33201] Updated weights for policy 0, policy_version 73990 (0.0010) [2023-10-14 04:00:50,334][33201] Updated weights for policy 0, policy_version 74000 (0.0011) [2023-10-14 04:00:50,711][33201] Updated weights for policy 0, policy_version 74010 (0.0009) [2023-10-14 04:00:51,602][33226] Updated weights for policy 1, policy_version 74660 (0.0008) [2023-10-14 04:00:51,975][33226] Updated weights for policy 1, policy_version 74670 (0.0011) [2023-10-14 04:00:52,346][33226] Updated weights for policy 1, policy_version 74680 (0.0007) [2023-10-14 04:00:54,457][33201] Updated weights for policy 0, policy_version 74020 (0.0008) [2023-10-14 04:00:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 152272896. Throughput: 0: 1776.9, 1: 1764.6. Samples: 38081730. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:54,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.850')] [2023-10-14 04:00:54,824][33201] Updated weights for policy 0, policy_version 74030 (0.0010) [2023-10-14 04:00:55,202][33201] Updated weights for policy 0, policy_version 74040 (0.0008) [2023-10-14 04:00:56,271][33226] Updated weights for policy 1, policy_version 74690 (0.0008) [2023-10-14 04:00:56,691][33226] Updated weights for policy 1, policy_version 74700 (0.0009) [2023-10-14 04:00:57,058][33226] Updated weights for policy 1, policy_version 74710 (0.0011) [2023-10-14 04:00:57,419][33226] Updated weights for policy 1, policy_version 74720 (0.0009) [2023-10-14 04:00:58,949][33201] Updated weights for policy 0, policy_version 74050 (0.0009) [2023-10-14 04:00:59,316][33201] Updated weights for policy 0, policy_version 74060 (0.0007) [2023-10-14 04:00:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 152338432. Throughput: 0: 1744.9, 1: 1780.9. 
Samples: 38091758. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:00:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.850')] [2023-10-14 04:00:59,686][33201] Updated weights for policy 0, policy_version 74070 (0.0008) [2023-10-14 04:01:00,057][33201] Updated weights for policy 0, policy_version 74080 (0.0009) [2023-10-14 04:01:01,268][33226] Updated weights for policy 1, policy_version 74730 (0.0009) [2023-10-14 04:01:01,632][33226] Updated weights for policy 1, policy_version 74740 (0.0007) [2023-10-14 04:01:02,002][33226] Updated weights for policy 1, policy_version 74750 (0.0009) [2023-10-14 04:01:03,935][33201] Updated weights for policy 0, policy_version 74090 (0.0008) [2023-10-14 04:01:04,311][33201] Updated weights for policy 0, policy_version 74100 (0.0008) [2023-10-14 04:01:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 152403968. Throughput: 0: 1777.1, 1: 1763.9. Samples: 38113536. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:01:04,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.850')] [2023-10-14 04:01:04,683][33201] Updated weights for policy 0, policy_version 74110 (0.0009) [2023-10-14 04:01:05,767][33226] Updated weights for policy 1, policy_version 74760 (0.0008) [2023-10-14 04:01:06,134][33226] Updated weights for policy 1, policy_version 74770 (0.0008) [2023-10-14 04:01:06,501][33226] Updated weights for policy 1, policy_version 74780 (0.0007) [2023-10-14 04:01:08,455][33201] Updated weights for policy 0, policy_version 74120 (0.0009) [2023-10-14 04:01:08,810][33201] Updated weights for policy 0, policy_version 74130 (0.0008) [2023-10-14 04:01:09,185][33201] Updated weights for policy 0, policy_version 74140 (0.0010) [2023-10-14 04:01:09,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152502272. Throughput: 0: 1762.4, 1: 1772.3. Samples: 38135020. 
Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:01:09,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.870')] [2023-10-14 04:01:10,214][33226] Updated weights for policy 1, policy_version 74790 (0.0007) [2023-10-14 04:01:10,568][33226] Updated weights for policy 1, policy_version 74800 (0.0007) [2023-10-14 04:01:10,934][33226] Updated weights for policy 1, policy_version 74810 (0.0008) [2023-10-14 04:01:13,145][33201] Updated weights for policy 0, policy_version 74150 (0.0008) [2023-10-14 04:01:13,511][33201] Updated weights for policy 0, policy_version 74160 (0.0008) [2023-10-14 04:01:13,886][33201] Updated weights for policy 0, policy_version 74170 (0.0008) [2023-10-14 04:01:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152567808. Throughput: 0: 1777.1, 1: 1769.1. Samples: 38145676. Policy #0 lag: (min: 31.0, avg: 32.1, max: 54.0) [2023-10-14 04:01:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:01:14,751][33226] Updated weights for policy 1, policy_version 74820 (0.0009) [2023-10-14 04:01:15,115][33226] Updated weights for policy 1, policy_version 74830 (0.0007) [2023-10-14 04:01:15,484][33226] Updated weights for policy 1, policy_version 74840 (0.0007) [2023-10-14 04:01:17,521][33201] Updated weights for policy 0, policy_version 74180 (0.0008) [2023-10-14 04:01:17,878][33201] Updated weights for policy 0, policy_version 74190 (0.0009) [2023-10-14 04:01:18,257][33201] Updated weights for policy 0, policy_version 74200 (0.0007) [2023-10-14 04:01:19,070][33226] Updated weights for policy 1, policy_version 74850 (0.0007) [2023-10-14 04:01:19,442][33226] Updated weights for policy 1, policy_version 74860 (0.0007) [2023-10-14 04:01:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152633344. Throughput: 0: 1769.4, 1: 1770.8. Samples: 38167046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:19,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:01:19,798][33226] Updated weights for policy 1, policy_version 74870 (0.0008) [2023-10-14 04:01:20,168][33226] Updated weights for policy 1, policy_version 74880 (0.0007) [2023-10-14 04:01:21,939][33201] Updated weights for policy 0, policy_version 74210 (0.0007) [2023-10-14 04:01:22,316][33201] Updated weights for policy 0, policy_version 74220 (0.0008) [2023-10-14 04:01:22,684][33201] Updated weights for policy 0, policy_version 74230 (0.0007) [2023-10-14 04:01:23,055][33201] Updated weights for policy 0, policy_version 74240 (0.0008) [2023-10-14 04:01:23,996][33226] Updated weights for policy 1, policy_version 74890 (0.0009) [2023-10-14 04:01:24,358][33226] Updated weights for policy 1, policy_version 74900 (0.0007) [2023-10-14 04:01:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152698880. Throughput: 0: 1759.8, 1: 1790.2. Samples: 38188390. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:24,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-14 04:01:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000074240_76021760.pth... [2023-10-14 04:01:24,598][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000072576_74317824.pth [2023-10-14 04:01:24,602][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000074240_76021760.pth [2023-10-14 04:01:24,728][33226] Updated weights for policy 1, policy_version 74910 (0.0010) [2023-10-14 04:01:24,795][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000074912_76709888.pth... 
[2023-10-14 04:01:24,831][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000073248_75005952.pth [2023-10-14 04:01:24,837][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000074912_76709888.pth [2023-10-14 04:01:26,896][33201] Updated weights for policy 0, policy_version 74250 (0.0010) [2023-10-14 04:01:27,261][33201] Updated weights for policy 0, policy_version 74260 (0.0008) [2023-10-14 04:01:27,640][33201] Updated weights for policy 0, policy_version 74270 (0.0009) [2023-10-14 04:01:28,557][33226] Updated weights for policy 1, policy_version 74920 (0.0008) [2023-10-14 04:01:28,932][33226] Updated weights for policy 1, policy_version 74930 (0.0008) [2023-10-14 04:01:29,302][33226] Updated weights for policy 1, policy_version 74940 (0.0010) [2023-10-14 04:01:29,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14745.6, 300 sec: 14329.0). Total num frames: 152797184. Throughput: 0: 1781.8, 1: 1769.4. Samples: 38199086. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:29,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.930')] [2023-10-14 04:01:31,508][33201] Updated weights for policy 0, policy_version 74280 (0.0010) [2023-10-14 04:01:31,871][33201] Updated weights for policy 0, policy_version 74290 (0.0009) [2023-10-14 04:01:32,239][33201] Updated weights for policy 0, policy_version 74300 (0.0008) [2023-10-14 04:01:33,021][33226] Updated weights for policy 1, policy_version 74950 (0.0009) [2023-10-14 04:01:33,393][33226] Updated weights for policy 1, policy_version 74960 (0.0007) [2023-10-14 04:01:33,765][33226] Updated weights for policy 1, policy_version 74970 (0.0008) [2023-10-14 04:01:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 152862720. Throughput: 0: 1768.9, 1: 1801.2. Samples: 38220462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:34,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:01:36,083][33201] Updated weights for policy 0, policy_version 74310 (0.0007) [2023-10-14 04:01:36,456][33201] Updated weights for policy 0, policy_version 74320 (0.0007) [2023-10-14 04:01:36,827][33201] Updated weights for policy 0, policy_version 74330 (0.0008) [2023-10-14 04:01:37,584][33226] Updated weights for policy 1, policy_version 74980 (0.0009) [2023-10-14 04:01:37,951][33226] Updated weights for policy 1, policy_version 74990 (0.0009) [2023-10-14 04:01:38,321][33226] Updated weights for policy 1, policy_version 75000 (0.0008) [2023-10-14 04:01:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 152928256. Throughput: 0: 1770.5, 1: 1778.9. Samples: 38241456. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:39,557][31953] Avg episode reward: [(0, '20.810'), (1, '20.930')] [2023-10-14 04:01:40,654][33201] Updated weights for policy 0, policy_version 74340 (0.0008) [2023-10-14 04:01:41,027][33201] Updated weights for policy 0, policy_version 74350 (0.0008) [2023-10-14 04:01:41,390][33201] Updated weights for policy 0, policy_version 74360 (0.0009) [2023-10-14 04:01:42,129][33226] Updated weights for policy 1, policy_version 75010 (0.0008) [2023-10-14 04:01:42,555][33226] Updated weights for policy 1, policy_version 75020 (0.0008) [2023-10-14 04:01:42,922][33226] Updated weights for policy 1, policy_version 75030 (0.0007) [2023-10-14 04:01:43,288][33226] Updated weights for policy 1, policy_version 75040 (0.0007) [2023-10-14 04:01:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 152993792. Throughput: 0: 1768.8, 1: 1805.7. Samples: 38252610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:44,558][31953] Avg episode reward: [(0, '20.780'), (1, '20.920')] [2023-10-14 04:01:45,340][33201] Updated weights for policy 0, policy_version 74370 (0.0008) [2023-10-14 04:01:45,705][33201] Updated weights for policy 0, policy_version 74380 (0.0009) [2023-10-14 04:01:46,077][33201] Updated weights for policy 0, policy_version 74390 (0.0009) [2023-10-14 04:01:46,446][33201] Updated weights for policy 0, policy_version 74400 (0.0010) [2023-10-14 04:01:46,926][33226] Updated weights for policy 1, policy_version 75050 (0.0009) [2023-10-14 04:01:47,294][33226] Updated weights for policy 1, policy_version 75060 (0.0008) [2023-10-14 04:01:47,664][33226] Updated weights for policy 1, policy_version 75070 (0.0008) [2023-10-14 04:01:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 153059328. Throughput: 0: 1764.4, 1: 1784.7. Samples: 38273246. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:49,558][31953] Avg episode reward: [(0, '20.780'), (1, '20.920')] [2023-10-14 04:01:50,297][33201] Updated weights for policy 0, policy_version 74410 (0.0008) [2023-10-14 04:01:50,657][33201] Updated weights for policy 0, policy_version 74420 (0.0007) [2023-10-14 04:01:51,039][33201] Updated weights for policy 0, policy_version 74430 (0.0007) [2023-10-14 04:01:51,458][33226] Updated weights for policy 1, policy_version 75080 (0.0008) [2023-10-14 04:01:51,824][33226] Updated weights for policy 1, policy_version 75090 (0.0007) [2023-10-14 04:01:52,194][33226] Updated weights for policy 1, policy_version 75100 (0.0008) [2023-10-14 04:01:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 153124864. Throughput: 0: 1788.5, 1: 1775.4. Samples: 38295396. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:54,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.940')] [2023-10-14 04:01:54,618][33201] Updated weights for policy 0, policy_version 74440 (0.0009) [2023-10-14 04:01:54,985][33201] Updated weights for policy 0, policy_version 74450 (0.0009) [2023-10-14 04:01:55,358][33201] Updated weights for policy 0, policy_version 74460 (0.0009) [2023-10-14 04:01:55,978][33226] Updated weights for policy 1, policy_version 75110 (0.0007) [2023-10-14 04:01:56,350][33226] Updated weights for policy 1, policy_version 75120 (0.0008) [2023-10-14 04:01:56,720][33226] Updated weights for policy 1, policy_version 75130 (0.0008) [2023-10-14 04:01:59,181][33201] Updated weights for policy 0, policy_version 74470 (0.0007) [2023-10-14 04:01:59,556][33201] Updated weights for policy 0, policy_version 74480 (0.0008) [2023-10-14 04:01:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 153190400. Throughput: 0: 1762.6, 1: 1779.1. Samples: 38305052. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:01:59,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.940')] [2023-10-14 04:01:59,921][33201] Updated weights for policy 0, policy_version 74490 (0.0009) [2023-10-14 04:02:00,543][33226] Updated weights for policy 1, policy_version 75140 (0.0007) [2023-10-14 04:02:00,911][33226] Updated weights for policy 1, policy_version 75150 (0.0007) [2023-10-14 04:02:01,280][33226] Updated weights for policy 1, policy_version 75160 (0.0009) [2023-10-14 04:02:03,632][33201] Updated weights for policy 0, policy_version 74500 (0.0007) [2023-10-14 04:02:04,005][33201] Updated weights for policy 0, policy_version 74510 (0.0007) [2023-10-14 04:02:04,372][33201] Updated weights for policy 0, policy_version 74520 (0.0007) [2023-10-14 04:02:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 153255936. Throughput: 0: 1777.9, 1: 1778.9. 
Samples: 38327106. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:04,558][31953] Avg episode reward: [(0, '20.790'), (1, '20.940')] [2023-10-14 04:02:04,877][33226] Updated weights for policy 1, policy_version 75170 (0.0008) [2023-10-14 04:02:05,238][33226] Updated weights for policy 1, policy_version 75180 (0.0010) [2023-10-14 04:02:05,608][33226] Updated weights for policy 1, policy_version 75190 (0.0007) [2023-10-14 04:02:05,976][33226] Updated weights for policy 1, policy_version 75200 (0.0008) [2023-10-14 04:02:08,264][33201] Updated weights for policy 0, policy_version 74530 (0.0008) [2023-10-14 04:02:08,625][33201] Updated weights for policy 0, policy_version 74540 (0.0009) [2023-10-14 04:02:09,000][33201] Updated weights for policy 0, policy_version 74550 (0.0009) [2023-10-14 04:02:09,370][33201] Updated weights for policy 0, policy_version 74560 (0.0007) [2023-10-14 04:02:09,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 153354240. Throughput: 0: 1759.1, 1: 1792.0. Samples: 38348190. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:09,558][31953] Avg episode reward: [(0, '20.760'), (1, '20.940')] [2023-10-14 04:02:09,808][33226] Updated weights for policy 1, policy_version 75210 (0.0008) [2023-10-14 04:02:10,186][33226] Updated weights for policy 1, policy_version 75220 (0.0010) [2023-10-14 04:02:10,562][33226] Updated weights for policy 1, policy_version 75230 (0.0008) [2023-10-14 04:02:13,116][33201] Updated weights for policy 0, policy_version 74570 (0.0011) [2023-10-14 04:02:13,480][33201] Updated weights for policy 0, policy_version 74580 (0.0009) [2023-10-14 04:02:13,855][33201] Updated weights for policy 0, policy_version 74590 (0.0010) [2023-10-14 04:02:14,419][33226] Updated weights for policy 1, policy_version 75240 (0.0008) [2023-10-14 04:02:14,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 153419776. 
Throughput: 0: 1771.1, 1: 1778.5. Samples: 38358818. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:14,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.930')] [2023-10-14 04:02:14,788][33226] Updated weights for policy 1, policy_version 75250 (0.0007) [2023-10-14 04:02:15,155][33226] Updated weights for policy 1, policy_version 75260 (0.0009) [2023-10-14 04:02:17,683][33201] Updated weights for policy 0, policy_version 74600 (0.0009) [2023-10-14 04:02:18,044][33201] Updated weights for policy 0, policy_version 74610 (0.0007) [2023-10-14 04:02:18,411][33201] Updated weights for policy 0, policy_version 74620 (0.0009) [2023-10-14 04:02:18,997][33226] Updated weights for policy 1, policy_version 75270 (0.0008) [2023-10-14 04:02:19,357][33226] Updated weights for policy 1, policy_version 75280 (0.0007) [2023-10-14 04:02:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 153485312. Throughput: 0: 1765.3, 1: 1781.2. Samples: 38380050. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:19,557][31953] Avg episode reward: [(0, '20.770'), (1, '20.930')] [2023-10-14 04:02:19,726][33226] Updated weights for policy 1, policy_version 75290 (0.0008) [2023-10-14 04:02:22,276][33201] Updated weights for policy 0, policy_version 74630 (0.0009) [2023-10-14 04:02:22,644][33201] Updated weights for policy 0, policy_version 74640 (0.0007) [2023-10-14 04:02:23,010][33201] Updated weights for policy 0, policy_version 74650 (0.0008) [2023-10-14 04:02:23,423][33226] Updated weights for policy 1, policy_version 75300 (0.0008) [2023-10-14 04:02:23,783][33226] Updated weights for policy 1, policy_version 75310 (0.0010) [2023-10-14 04:02:24,153][33226] Updated weights for policy 1, policy_version 75320 (0.0010) [2023-10-14 04:02:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 153583616. Throughput: 0: 1753.2, 1: 1792.8. Samples: 38401030. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:24,558][31953] Avg episode reward: [(0, '20.750'), (1, '20.940')] [2023-10-14 04:02:26,951][33201] Updated weights for policy 0, policy_version 74660 (0.0010) [2023-10-14 04:02:27,323][33201] Updated weights for policy 0, policy_version 74670 (0.0009) [2023-10-14 04:02:27,686][33201] Updated weights for policy 0, policy_version 74680 (0.0008) [2023-10-14 04:02:27,964][33226] Updated weights for policy 1, policy_version 75330 (0.0011) [2023-10-14 04:02:28,367][33226] Updated weights for policy 1, policy_version 75340 (0.0008) [2023-10-14 04:02:28,731][33226] Updated weights for policy 1, policy_version 75350 (0.0008) [2023-10-14 04:02:29,100][33226] Updated weights for policy 1, policy_version 75360 (0.0008) [2023-10-14 04:02:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 153649152. Throughput: 0: 1776.1, 1: 1773.7. Samples: 38412354. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:29,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.950')] [2023-10-14 04:02:31,807][33201] Updated weights for policy 0, policy_version 74690 (0.0008) [2023-10-14 04:02:32,183][33201] Updated weights for policy 0, policy_version 74700 (0.0009) [2023-10-14 04:02:32,549][33201] Updated weights for policy 0, policy_version 74710 (0.0007) [2023-10-14 04:02:32,920][33226] Updated weights for policy 1, policy_version 75370 (0.0010) [2023-10-14 04:02:32,921][33201] Updated weights for policy 0, policy_version 74720 (0.0010) [2023-10-14 04:02:33,281][33226] Updated weights for policy 1, policy_version 75380 (0.0009) [2023-10-14 04:02:33,642][33226] Updated weights for policy 1, policy_version 75390 (0.0010) [2023-10-14 04:02:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 153714688. Throughput: 0: 1745.3, 1: 1792.8. Samples: 38432462. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:34,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.960')] [2023-10-14 04:02:36,806][33201] Updated weights for policy 0, policy_version 74730 (0.0008) [2023-10-14 04:02:37,179][33201] Updated weights for policy 0, policy_version 74740 (0.0008) [2023-10-14 04:02:37,500][33226] Updated weights for policy 1, policy_version 75400 (0.0009) [2023-10-14 04:02:37,545][33201] Updated weights for policy 0, policy_version 74750 (0.0008) [2023-10-14 04:02:37,867][33226] Updated weights for policy 1, policy_version 75410 (0.0008) [2023-10-14 04:02:38,232][33226] Updated weights for policy 1, policy_version 75420 (0.0008) [2023-10-14 04:02:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 153780224. Throughput: 0: 1738.6, 1: 1771.0. Samples: 38453330. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:39,558][31953] Avg episode reward: [(0, '20.710'), (1, '20.960')] [2023-10-14 04:02:41,332][33201] Updated weights for policy 0, policy_version 74760 (0.0008) [2023-10-14 04:02:41,698][33201] Updated weights for policy 0, policy_version 74770 (0.0008) [2023-10-14 04:02:42,068][33201] Updated weights for policy 0, policy_version 74780 (0.0009) [2023-10-14 04:02:42,169][33226] Updated weights for policy 1, policy_version 75430 (0.0007) [2023-10-14 04:02:42,534][33226] Updated weights for policy 1, policy_version 75440 (0.0008) [2023-10-14 04:02:42,905][33226] Updated weights for policy 1, policy_version 75450 (0.0007) [2023-10-14 04:02:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 153845760. Throughput: 0: 1743.6, 1: 1797.5. Samples: 38464398. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:44,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.960')] [2023-10-14 04:02:45,884][33201] Updated weights for policy 0, policy_version 74790 (0.0007) [2023-10-14 04:02:46,250][33201] Updated weights for policy 0, policy_version 74800 (0.0008) [2023-10-14 04:02:46,612][33201] Updated weights for policy 0, policy_version 74810 (0.0009) [2023-10-14 04:02:46,788][33226] Updated weights for policy 1, policy_version 75460 (0.0009) [2023-10-14 04:02:47,155][33226] Updated weights for policy 1, policy_version 75470 (0.0008) [2023-10-14 04:02:47,522][33226] Updated weights for policy 1, policy_version 75480 (0.0008) [2023-10-14 04:02:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 153911296. Throughput: 0: 1745.8, 1: 1765.0. Samples: 38485094. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:49,558][31953] Avg episode reward: [(0, '20.640'), (1, '20.960')] [2023-10-14 04:02:50,468][33201] Updated weights for policy 0, policy_version 74820 (0.0008) [2023-10-14 04:02:50,832][33201] Updated weights for policy 0, policy_version 74830 (0.0007) [2023-10-14 04:02:51,208][33201] Updated weights for policy 0, policy_version 74840 (0.0008) [2023-10-14 04:02:51,278][33226] Updated weights for policy 1, policy_version 75490 (0.0009) [2023-10-14 04:02:51,646][33226] Updated weights for policy 1, policy_version 75500 (0.0008) [2023-10-14 04:02:52,022][33226] Updated weights for policy 1, policy_version 75510 (0.0007) [2023-10-14 04:02:52,395][33226] Updated weights for policy 1, policy_version 75520 (0.0010) [2023-10-14 04:02:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 153976832. Throughput: 0: 1776.5, 1: 1757.9. Samples: 38507238. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:54,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.950')] [2023-10-14 04:02:54,929][33201] Updated weights for policy 0, policy_version 74850 (0.0009) [2023-10-14 04:02:55,300][33201] Updated weights for policy 0, policy_version 74860 (0.0007) [2023-10-14 04:02:55,679][33201] Updated weights for policy 0, policy_version 74870 (0.0007) [2023-10-14 04:02:56,052][33201] Updated weights for policy 0, policy_version 74880 (0.0007) [2023-10-14 04:02:56,262][33226] Updated weights for policy 1, policy_version 75530 (0.0008) [2023-10-14 04:02:56,634][33226] Updated weights for policy 1, policy_version 75540 (0.0008) [2023-10-14 04:02:56,998][33226] Updated weights for policy 1, policy_version 75550 (0.0009) [2023-10-14 04:02:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 154042368. Throughput: 0: 1751.4, 1: 1768.5. Samples: 38517212. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:02:59,558][31953] Avg episode reward: [(0, '20.600'), (1, '20.950')] [2023-10-14 04:02:59,821][33201] Updated weights for policy 0, policy_version 74890 (0.0010) [2023-10-14 04:03:00,199][33201] Updated weights for policy 0, policy_version 74900 (0.0009) [2023-10-14 04:03:00,560][33201] Updated weights for policy 0, policy_version 74910 (0.0010) [2023-10-14 04:03:00,709][33226] Updated weights for policy 1, policy_version 75560 (0.0008) [2023-10-14 04:03:01,080][33226] Updated weights for policy 1, policy_version 75570 (0.0009) [2023-10-14 04:03:01,454][33226] Updated weights for policy 1, policy_version 75580 (0.0008) [2023-10-14 04:03:04,265][33201] Updated weights for policy 0, policy_version 74920 (0.0009) [2023-10-14 04:03:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 154107904. Throughput: 0: 1772.3, 1: 1763.3. Samples: 38539154. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:04,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.890')] [2023-10-14 04:03:04,638][33201] Updated weights for policy 0, policy_version 74930 (0.0007) [2023-10-14 04:03:05,009][33201] Updated weights for policy 0, policy_version 74940 (0.0009) [2023-10-14 04:03:05,210][33226] Updated weights for policy 1, policy_version 75590 (0.0008) [2023-10-14 04:03:05,572][33226] Updated weights for policy 1, policy_version 75600 (0.0007) [2023-10-14 04:03:05,949][33226] Updated weights for policy 1, policy_version 75610 (0.0008) [2023-10-14 04:03:08,859][33201] Updated weights for policy 0, policy_version 74950 (0.0007) [2023-10-14 04:03:09,231][33201] Updated weights for policy 0, policy_version 74960 (0.0007) [2023-10-14 04:03:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 154173440. Throughput: 0: 1771.7, 1: 1779.3. Samples: 38560826. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:09,557][31953] Avg episode reward: [(0, '20.570'), (1, '20.890')] [2023-10-14 04:03:09,599][33201] Updated weights for policy 0, policy_version 74970 (0.0008) [2023-10-14 04:03:09,703][33226] Updated weights for policy 1, policy_version 75620 (0.0007) [2023-10-14 04:03:10,074][33226] Updated weights for policy 1, policy_version 75630 (0.0010) [2023-10-14 04:03:10,450][33226] Updated weights for policy 1, policy_version 75640 (0.0008) [2023-10-14 04:03:13,256][33201] Updated weights for policy 0, policy_version 74980 (0.0009) [2023-10-14 04:03:13,626][33201] Updated weights for policy 0, policy_version 74990 (0.0010) [2023-10-14 04:03:13,998][33201] Updated weights for policy 0, policy_version 75000 (0.0009) [2023-10-14 04:03:14,428][33226] Updated weights for policy 1, policy_version 75650 (0.0008) [2023-10-14 04:03:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 154271744. Throughput: 0: 1765.4, 1: 1763.1. 
Samples: 38571138. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:14,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.890')] [2023-10-14 04:03:14,836][33226] Updated weights for policy 1, policy_version 75660 (0.0008) [2023-10-14 04:03:15,199][33226] Updated weights for policy 1, policy_version 75670 (0.0010) [2023-10-14 04:03:15,569][33226] Updated weights for policy 1, policy_version 75680 (0.0010) [2023-10-14 04:03:17,731][33201] Updated weights for policy 0, policy_version 75010 (0.0008) [2023-10-14 04:03:18,100][33201] Updated weights for policy 0, policy_version 75020 (0.0007) [2023-10-14 04:03:18,473][33201] Updated weights for policy 0, policy_version 75030 (0.0007) [2023-10-14 04:03:18,849][33201] Updated weights for policy 0, policy_version 75040 (0.0010) [2023-10-14 04:03:19,252][33226] Updated weights for policy 1, policy_version 75690 (0.0008) [2023-10-14 04:03:19,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 154337280. Throughput: 0: 1785.0, 1: 1778.4. Samples: 38592814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:19,558][31953] Avg episode reward: [(0, '20.580'), (1, '20.900')] [2023-10-14 04:03:19,624][33226] Updated weights for policy 1, policy_version 75700 (0.0010) [2023-10-14 04:03:19,989][33226] Updated weights for policy 1, policy_version 75710 (0.0010) [2023-10-14 04:03:22,830][33201] Updated weights for policy 0, policy_version 75050 (0.0010) [2023-10-14 04:03:23,200][33201] Updated weights for policy 0, policy_version 75060 (0.0010) [2023-10-14 04:03:23,582][33201] Updated weights for policy 0, policy_version 75070 (0.0007) [2023-10-14 04:03:23,725][33226] Updated weights for policy 1, policy_version 75720 (0.0008) [2023-10-14 04:03:24,087][33226] Updated weights for policy 1, policy_version 75730 (0.0008) [2023-10-14 04:03:24,452][33226] Updated weights for policy 1, policy_version 75740 (0.0009) [2023-10-14 04:03:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 154402816. Throughput: 0: 1766.1, 1: 1790.7. Samples: 38613384. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:24,558][31953] Avg episode reward: [(0, '20.570'), (1, '20.900')] [2023-10-14 04:03:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000075072_76873728.pth... [2023-10-14 04:03:24,595][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000075744_77561856.pth... 
[2023-10-14 04:03:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000073408_75169792.pth [2023-10-14 04:03:24,636][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000074080_75857920.pth [2023-10-14 04:03:27,540][33201] Updated weights for policy 0, policy_version 75080 (0.0008) [2023-10-14 04:03:27,907][33201] Updated weights for policy 0, policy_version 75090 (0.0008) [2023-10-14 04:03:28,116][33226] Updated weights for policy 1, policy_version 75750 (0.0008) [2023-10-14 04:03:28,278][33201] Updated weights for policy 0, policy_version 75100 (0.0008) [2023-10-14 04:03:28,480][33226] Updated weights for policy 1, policy_version 75760 (0.0009) [2023-10-14 04:03:28,858][33226] Updated weights for policy 1, policy_version 75770 (0.0010) [2023-10-14 04:03:29,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 154501120. Throughput: 0: 1792.3, 1: 1778.0. Samples: 38625060. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:29,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.900')] [2023-10-14 04:03:32,106][33201] Updated weights for policy 0, policy_version 75110 (0.0007) [2023-10-14 04:03:32,479][33201] Updated weights for policy 0, policy_version 75120 (0.0007) [2023-10-14 04:03:32,674][33226] Updated weights for policy 1, policy_version 75780 (0.0008) [2023-10-14 04:03:32,842][33201] Updated weights for policy 0, policy_version 75130 (0.0008) [2023-10-14 04:03:33,040][33226] Updated weights for policy 1, policy_version 75790 (0.0008) [2023-10-14 04:03:33,408][33226] Updated weights for policy 1, policy_version 75800 (0.0010) [2023-10-14 04:03:34,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 154566656. Throughput: 0: 1762.0, 1: 1800.6. Samples: 38645410. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:34,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.890')] [2023-10-14 04:03:36,691][33201] Updated weights for policy 0, policy_version 75140 (0.0008) [2023-10-14 04:03:37,066][33201] Updated weights for policy 0, policy_version 75150 (0.0009) [2023-10-14 04:03:37,094][33226] Updated weights for policy 1, policy_version 75810 (0.0007) [2023-10-14 04:03:37,433][33201] Updated weights for policy 0, policy_version 75160 (0.0009) [2023-10-14 04:03:37,452][33226] Updated weights for policy 1, policy_version 75820 (0.0008) [2023-10-14 04:03:37,816][33226] Updated weights for policy 1, policy_version 75830 (0.0009) [2023-10-14 04:03:38,179][33226] Updated weights for policy 1, policy_version 75840 (0.0010) [2023-10-14 04:03:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 154632192. Throughput: 0: 1758.0, 1: 1783.8. Samples: 38666618. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:39,558][31953] Avg episode reward: [(0, '20.560'), (1, '20.890')] [2023-10-14 04:03:41,417][33201] Updated weights for policy 0, policy_version 75170 (0.0007) [2023-10-14 04:03:41,782][33201] Updated weights for policy 0, policy_version 75180 (0.0008) [2023-10-14 04:03:42,012][33226] Updated weights for policy 1, policy_version 75850 (0.0009) [2023-10-14 04:03:42,152][33201] Updated weights for policy 0, policy_version 75190 (0.0007) [2023-10-14 04:03:42,375][33226] Updated weights for policy 1, policy_version 75860 (0.0008) [2023-10-14 04:03:42,526][33201] Updated weights for policy 0, policy_version 75200 (0.0007) [2023-10-14 04:03:42,741][33226] Updated weights for policy 1, policy_version 75870 (0.0007) [2023-10-14 04:03:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 154697728. Throughput: 0: 1766.0, 1: 1802.0. Samples: 38677770. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:03:44,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.870')] [2023-10-14 04:03:46,358][33201] Updated weights for policy 0, policy_version 75210 (0.0008) [2023-10-14 04:03:46,512][33226] Updated weights for policy 1, policy_version 75880 (0.0007) [2023-10-14 04:03:46,722][33201] Updated weights for policy 0, policy_version 75220 (0.0008) [2023-10-14 04:03:46,872][33226] Updated weights for policy 1, policy_version 75890 (0.0007) [2023-10-14 04:03:47,093][33201] Updated weights for policy 0, policy_version 75230 (0.0008) [2023-10-14 04:03:47,234][33226] Updated weights for policy 1, policy_version 75900 (0.0008) [2023-10-14 04:03:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 154763264. Throughput: 0: 1750.6, 1: 1785.0. Samples: 38698256. Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:03:49,557][31953] Avg episode reward: [(0, '20.560'), (1, '20.870')] [2023-10-14 04:03:50,834][33201] Updated weights for policy 0, policy_version 75240 (0.0008) [2023-10-14 04:03:51,046][33226] Updated weights for policy 1, policy_version 75910 (0.0010) [2023-10-14 04:03:51,192][33201] Updated weights for policy 0, policy_version 75250 (0.0007) [2023-10-14 04:03:51,412][33226] Updated weights for policy 1, policy_version 75920 (0.0008) [2023-10-14 04:03:51,565][33201] Updated weights for policy 0, policy_version 75260 (0.0007) [2023-10-14 04:03:51,779][33226] Updated weights for policy 1, policy_version 75930 (0.0008) [2023-10-14 04:03:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 154828800. Throughput: 0: 1762.3, 1: 1782.3. Samples: 38720332. 
Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:03:54,558][31953] Avg episode reward: [(0, '20.480'), (1, '20.870')] [2023-10-14 04:03:55,348][33201] Updated weights for policy 0, policy_version 75270 (0.0008) [2023-10-14 04:03:55,625][33226] Updated weights for policy 1, policy_version 75940 (0.0008) [2023-10-14 04:03:55,729][33201] Updated weights for policy 0, policy_version 75280 (0.0008) [2023-10-14 04:03:55,998][33226] Updated weights for policy 1, policy_version 75950 (0.0009) [2023-10-14 04:03:56,094][33201] Updated weights for policy 0, policy_version 75290 (0.0007) [2023-10-14 04:03:56,352][33226] Updated weights for policy 1, policy_version 75960 (0.0009) [2023-10-14 04:03:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 154894336. Throughput: 0: 1746.3, 1: 1780.3. Samples: 38729832. Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:03:59,558][31953] Avg episode reward: [(0, '20.460'), (1, '20.880')] [2023-10-14 04:04:00,007][33201] Updated weights for policy 0, policy_version 75300 (0.0007) [2023-10-14 04:04:00,217][33226] Updated weights for policy 1, policy_version 75970 (0.0007) [2023-10-14 04:04:00,385][33201] Updated weights for policy 0, policy_version 75310 (0.0009) [2023-10-14 04:04:00,582][33226] Updated weights for policy 1, policy_version 75980 (0.0008) [2023-10-14 04:04:00,755][33201] Updated weights for policy 0, policy_version 75320 (0.0009) [2023-10-14 04:04:00,956][33226] Updated weights for policy 1, policy_version 75990 (0.0010) [2023-10-14 04:04:01,317][33226] Updated weights for policy 1, policy_version 76000 (0.0010) [2023-10-14 04:04:04,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 154959872. Throughput: 0: 1751.9, 1: 1776.4. Samples: 38751586. 
Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:04,559][31953] Avg episode reward: [(0, '20.440'), (1, '20.870')] [2023-10-14 04:04:04,691][33201] Updated weights for policy 0, policy_version 75330 (0.0009) [2023-10-14 04:04:05,057][33201] Updated weights for policy 0, policy_version 75340 (0.0007) [2023-10-14 04:04:05,339][33226] Updated weights for policy 1, policy_version 76010 (0.0008) [2023-10-14 04:04:05,430][33201] Updated weights for policy 0, policy_version 75350 (0.0009) [2023-10-14 04:04:05,714][33226] Updated weights for policy 1, policy_version 76020 (0.0008) [2023-10-14 04:04:05,806][33201] Updated weights for policy 0, policy_version 75360 (0.0007) [2023-10-14 04:04:06,075][33226] Updated weights for policy 1, policy_version 76030 (0.0010) [2023-10-14 04:04:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 155025408. Throughput: 0: 1772.8, 1: 1787.9. Samples: 38773618. Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:09,558][31953] Avg episode reward: [(0, '20.350'), (1, '20.870')] [2023-10-14 04:04:09,678][33201] Updated weights for policy 0, policy_version 75370 (0.0009) [2023-10-14 04:04:09,765][33226] Updated weights for policy 1, policy_version 76040 (0.0007) [2023-10-14 04:04:10,039][33201] Updated weights for policy 0, policy_version 75380 (0.0008) [2023-10-14 04:04:10,134][33226] Updated weights for policy 1, policy_version 76050 (0.0008) [2023-10-14 04:04:10,403][33201] Updated weights for policy 0, policy_version 75390 (0.0008) [2023-10-14 04:04:10,510][33226] Updated weights for policy 1, policy_version 76060 (0.0007) [2023-10-14 04:04:14,144][33226] Updated weights for policy 1, policy_version 76070 (0.0007) [2023-10-14 04:04:14,268][33201] Updated weights for policy 0, policy_version 75400 (0.0008) [2023-10-14 04:04:14,520][33226] Updated weights for policy 1, policy_version 76080 (0.0009) [2023-10-14 04:04:14,557][31953] Fps is (10 sec: 13107.6, 60 sec: 
13653.3, 300 sec: 14106.9). Total num frames: 155090944. Throughput: 0: 1743.6, 1: 1772.5. Samples: 38783286. Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:14,558][31953] Avg episode reward: [(0, '20.330'), (1, '20.870')] [2023-10-14 04:04:14,630][33201] Updated weights for policy 0, policy_version 75410 (0.0008) [2023-10-14 04:04:14,883][33226] Updated weights for policy 1, policy_version 76090 (0.0008) [2023-10-14 04:04:14,997][33201] Updated weights for policy 0, policy_version 75420 (0.0008) [2023-10-14 04:04:18,641][33226] Updated weights for policy 1, policy_version 76100 (0.0007) [2023-10-14 04:04:18,765][33201] Updated weights for policy 0, policy_version 75430 (0.0007) [2023-10-14 04:04:19,009][33226] Updated weights for policy 1, policy_version 76110 (0.0007) [2023-10-14 04:04:19,132][33201] Updated weights for policy 0, policy_version 75440 (0.0009) [2023-10-14 04:04:19,381][33226] Updated weights for policy 1, policy_version 76120 (0.0009) [2023-10-14 04:04:19,498][33201] Updated weights for policy 0, policy_version 75450 (0.0008) [2023-10-14 04:04:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 155156480. Throughput: 0: 1776.5, 1: 1786.9. Samples: 38805766. 
Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:19,557][31953] Avg episode reward: [(0, '20.360'), (1, '20.870')] [2023-10-14 04:04:23,223][33226] Updated weights for policy 1, policy_version 76130 (0.0007) [2023-10-14 04:04:23,476][33201] Updated weights for policy 0, policy_version 75460 (0.0008) [2023-10-14 04:04:23,590][33226] Updated weights for policy 1, policy_version 76140 (0.0007) [2023-10-14 04:04:23,847][33201] Updated weights for policy 0, policy_version 75470 (0.0008) [2023-10-14 04:04:23,951][33226] Updated weights for policy 1, policy_version 76150 (0.0009) [2023-10-14 04:04:24,227][33201] Updated weights for policy 0, policy_version 75480 (0.0010) [2023-10-14 04:04:24,317][33226] Updated weights for policy 1, policy_version 76160 (0.0008) [2023-10-14 04:04:24,557][31953] Fps is (10 sec: 19660.4, 60 sec: 14745.5, 300 sec: 14329.0). Total num frames: 155287552. Throughput: 0: 1751.2, 1: 1779.2. Samples: 38825484. Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:24,558][31953] Avg episode reward: [(0, '20.420'), (1, '20.870')] [2023-10-14 04:04:27,994][33201] Updated weights for policy 0, policy_version 75490 (0.0007) [2023-10-14 04:04:28,114][33226] Updated weights for policy 1, policy_version 76170 (0.0008) [2023-10-14 04:04:28,370][33201] Updated weights for policy 0, policy_version 75500 (0.0008) [2023-10-14 04:04:28,482][33226] Updated weights for policy 1, policy_version 76180 (0.0008) [2023-10-14 04:04:28,745][33201] Updated weights for policy 0, policy_version 75510 (0.0008) [2023-10-14 04:04:28,863][33226] Updated weights for policy 1, policy_version 76190 (0.0009) [2023-10-14 04:04:29,119][33201] Updated weights for policy 0, policy_version 75520 (0.0007) [2023-10-14 04:04:29,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 155353088. Throughput: 0: 1756.2, 1: 1776.6. Samples: 38836746. 
Policy #0 lag: (min: 12.0, avg: 24.6, max: 44.0) [2023-10-14 04:04:29,557][31953] Avg episode reward: [(0, '20.470'), (1, '20.870')] [2023-10-14 04:04:32,832][33226] Updated weights for policy 1, policy_version 76200 (0.0008) [2023-10-14 04:04:33,039][33201] Updated weights for policy 0, policy_version 75530 (0.0007) [2023-10-14 04:04:33,187][33226] Updated weights for policy 1, policy_version 76210 (0.0008) [2023-10-14 04:04:33,412][33201] Updated weights for policy 0, policy_version 75540 (0.0008) [2023-10-14 04:04:33,551][33226] Updated weights for policy 1, policy_version 76220 (0.0007) [2023-10-14 04:04:33,786][33201] Updated weights for policy 0, policy_version 75550 (0.0007) [2023-10-14 04:04:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 155418624. Throughput: 0: 1757.4, 1: 1784.7. Samples: 38857654. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:34,558][31953] Avg episode reward: [(0, '20.430'), (1, '20.880')] [2023-10-14 04:04:37,389][33226] Updated weights for policy 1, policy_version 76230 (0.0007) [2023-10-14 04:04:37,547][33201] Updated weights for policy 0, policy_version 75560 (0.0007) [2023-10-14 04:04:37,751][33226] Updated weights for policy 1, policy_version 76240 (0.0008) [2023-10-14 04:04:37,908][33201] Updated weights for policy 0, policy_version 75570 (0.0007) [2023-10-14 04:04:38,119][33226] Updated weights for policy 1, policy_version 76250 (0.0007) [2023-10-14 04:04:38,277][33201] Updated weights for policy 0, policy_version 75580 (0.0010) [2023-10-14 04:04:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 155484160. Throughput: 0: 1735.0, 1: 1761.9. Samples: 38877692. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:39,558][31953] Avg episode reward: [(0, '20.430'), (1, '20.890')] [2023-10-14 04:04:42,071][33226] Updated weights for policy 1, policy_version 76260 (0.0007) [2023-10-14 04:04:42,110][33201] Updated weights for policy 0, policy_version 75590 (0.0008) [2023-10-14 04:04:42,433][33226] Updated weights for policy 1, policy_version 76270 (0.0007) [2023-10-14 04:04:42,479][33201] Updated weights for policy 0, policy_version 75600 (0.0008) [2023-10-14 04:04:42,801][33226] Updated weights for policy 1, policy_version 76280 (0.0007) [2023-10-14 04:04:42,842][33201] Updated weights for policy 0, policy_version 75610 (0.0008) [2023-10-14 04:04:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 155549696. Throughput: 0: 1761.9, 1: 1789.3. Samples: 38889634. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:44,558][31953] Avg episode reward: [(0, '20.430'), (1, '20.890')] [2023-10-14 04:04:46,858][33226] Updated weights for policy 1, policy_version 76290 (0.0009) [2023-10-14 04:04:46,962][33201] Updated weights for policy 0, policy_version 75620 (0.0008) [2023-10-14 04:04:47,229][33226] Updated weights for policy 1, policy_version 76300 (0.0007) [2023-10-14 04:04:47,327][33201] Updated weights for policy 0, policy_version 75630 (0.0008) [2023-10-14 04:04:47,595][33226] Updated weights for policy 1, policy_version 76310 (0.0008) [2023-10-14 04:04:47,692][33201] Updated weights for policy 0, policy_version 75640 (0.0008) [2023-10-14 04:04:47,953][33226] Updated weights for policy 1, policy_version 76320 (0.0008) [2023-10-14 04:04:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 155615232. Throughput: 0: 1732.3, 1: 1756.1. Samples: 38908564. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:49,557][31953] Avg episode reward: [(0, '20.450'), (1, '20.950')] [2023-10-14 04:04:51,613][33201] Updated weights for policy 0, policy_version 75650 (0.0008) [2023-10-14 04:04:51,890][33226] Updated weights for policy 1, policy_version 76330 (0.0008) [2023-10-14 04:04:51,975][33201] Updated weights for policy 0, policy_version 75660 (0.0007) [2023-10-14 04:04:52,268][33226] Updated weights for policy 1, policy_version 76340 (0.0008) [2023-10-14 04:04:52,339][33201] Updated weights for policy 0, policy_version 75670 (0.0009) [2023-10-14 04:04:52,626][33226] Updated weights for policy 1, policy_version 76350 (0.0008) [2023-10-14 04:04:52,710][33201] Updated weights for policy 0, policy_version 75680 (0.0007) [2023-10-14 04:04:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 155680768. Throughput: 0: 1731.6, 1: 1752.7. Samples: 38930414. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:54,558][31953] Avg episode reward: [(0, '20.470'), (1, '20.930')] [2023-10-14 04:04:56,444][33226] Updated weights for policy 1, policy_version 76360 (0.0007) [2023-10-14 04:04:56,587][33201] Updated weights for policy 0, policy_version 75690 (0.0010) [2023-10-14 04:04:56,817][33226] Updated weights for policy 1, policy_version 76370 (0.0007) [2023-10-14 04:04:56,954][33201] Updated weights for policy 0, policy_version 75700 (0.0007) [2023-10-14 04:04:57,181][33226] Updated weights for policy 1, policy_version 76380 (0.0009) [2023-10-14 04:04:57,328][33201] Updated weights for policy 0, policy_version 75710 (0.0008) [2023-10-14 04:04:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 155746304. Throughput: 0: 1737.9, 1: 1758.8. Samples: 38940638. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:04:59,558][31953] Avg episode reward: [(0, '20.470'), (1, '20.780')] [2023-10-14 04:05:00,834][33226] Updated weights for policy 1, policy_version 76390 (0.0009) [2023-10-14 04:05:01,189][33226] Updated weights for policy 1, policy_version 76400 (0.0008) [2023-10-14 04:05:01,280][33201] Updated weights for policy 0, policy_version 75720 (0.0008) [2023-10-14 04:05:01,551][33226] Updated weights for policy 1, policy_version 76410 (0.0008) [2023-10-14 04:05:01,649][33201] Updated weights for policy 0, policy_version 75730 (0.0009) [2023-10-14 04:05:02,015][33201] Updated weights for policy 0, policy_version 75740 (0.0010) [2023-10-14 04:05:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 155811840. Throughput: 0: 1725.1, 1: 1746.4. Samples: 38961988. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:05:04,558][31953] Avg episode reward: [(0, '20.490'), (1, '20.780')] [2023-10-14 04:05:05,349][33226] Updated weights for policy 1, policy_version 76420 (0.0008) [2023-10-14 04:05:05,711][33226] Updated weights for policy 1, policy_version 76430 (0.0008) [2023-10-14 04:05:05,940][33201] Updated weights for policy 0, policy_version 75750 (0.0010) [2023-10-14 04:05:06,069][33226] Updated weights for policy 1, policy_version 76440 (0.0007) [2023-10-14 04:05:06,308][33201] Updated weights for policy 0, policy_version 75760 (0.0008) [2023-10-14 04:05:06,676][33201] Updated weights for policy 0, policy_version 75770 (0.0008) [2023-10-14 04:05:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 155877376. Throughput: 0: 1752.1, 1: 1769.3. Samples: 38983948. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:05:09,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.780')] [2023-10-14 04:05:09,777][33226] Updated weights for policy 1, policy_version 76450 (0.0008) [2023-10-14 04:05:10,148][33226] Updated weights for policy 1, policy_version 76460 (0.0009) [2023-10-14 04:05:10,452][33201] Updated weights for policy 0, policy_version 75780 (0.0008) [2023-10-14 04:05:10,518][33226] Updated weights for policy 1, policy_version 76470 (0.0008) [2023-10-14 04:05:10,823][33201] Updated weights for policy 0, policy_version 75790 (0.0007) [2023-10-14 04:05:10,877][33226] Updated weights for policy 1, policy_version 76480 (0.0008) [2023-10-14 04:05:11,187][33201] Updated weights for policy 0, policy_version 75800 (0.0008) [2023-10-14 04:05:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 155942912. Throughput: 0: 1737.3, 1: 1749.2. Samples: 38993636. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:05:14,558][31953] Avg episode reward: [(0, '20.530'), (1, '20.780')] [2023-10-14 04:05:14,743][33226] Updated weights for policy 1, policy_version 76490 (0.0008) [2023-10-14 04:05:14,944][33201] Updated weights for policy 0, policy_version 75810 (0.0008) [2023-10-14 04:05:15,100][33226] Updated weights for policy 1, policy_version 76500 (0.0008) [2023-10-14 04:05:15,312][33201] Updated weights for policy 0, policy_version 75820 (0.0009) [2023-10-14 04:05:15,467][33226] Updated weights for policy 1, policy_version 76510 (0.0008) [2023-10-14 04:05:15,691][33201] Updated weights for policy 0, policy_version 75830 (0.0008) [2023-10-14 04:05:16,048][33201] Updated weights for policy 0, policy_version 75840 (0.0010) [2023-10-14 04:05:19,277][33226] Updated weights for policy 1, policy_version 76520 (0.0009) [2023-10-14 04:05:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 156008448. Throughput: 0: 1752.2, 1: 1758.6. 
Samples: 39015642. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:05:19,558][31953] Avg episode reward: [(0, '20.550'), (1, '20.790')] [2023-10-14 04:05:19,642][33226] Updated weights for policy 1, policy_version 76530 (0.0008) [2023-10-14 04:05:19,795][33201] Updated weights for policy 0, policy_version 75850 (0.0008) [2023-10-14 04:05:20,009][33226] Updated weights for policy 1, policy_version 76540 (0.0009) [2023-10-14 04:05:20,167][33201] Updated weights for policy 0, policy_version 75860 (0.0008) [2023-10-14 04:05:20,535][33201] Updated weights for policy 0, policy_version 75870 (0.0008) [2023-10-14 04:05:23,627][33226] Updated weights for policy 1, policy_version 76550 (0.0007) [2023-10-14 04:05:23,990][33226] Updated weights for policy 1, policy_version 76560 (0.0008) [2023-10-14 04:05:24,359][33226] Updated weights for policy 1, policy_version 76570 (0.0007) [2023-10-14 04:05:24,380][33201] Updated weights for policy 0, policy_version 75880 (0.0008) [2023-10-14 04:05:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13107.3, 300 sec: 14106.9). Total num frames: 156073984. Throughput: 0: 1771.3, 1: 1770.4. Samples: 39037068. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:24,558][31953] Avg episode reward: [(0, '20.590'), (1, '20.790')] [2023-10-14 04:05:24,572][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000076576_78413824.pth... [2023-10-14 04:05:24,611][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000074912_76709888.pth [2023-10-14 04:05:24,749][33201] Updated weights for policy 0, policy_version 75890 (0.0008) [2023-10-14 04:05:25,121][33201] Updated weights for policy 0, policy_version 75900 (0.0008) [2023-10-14 04:05:25,270][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000075904_77725696.pth... 
[2023-10-14 04:05:25,310][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000074240_76021760.pth [2023-10-14 04:05:28,057][33226] Updated weights for policy 1, policy_version 76580 (0.0007) [2023-10-14 04:05:28,434][33226] Updated weights for policy 1, policy_version 76590 (0.0010) [2023-10-14 04:05:28,801][33226] Updated weights for policy 1, policy_version 76600 (0.0008) [2023-10-14 04:05:28,897][33201] Updated weights for policy 0, policy_version 75910 (0.0008) [2023-10-14 04:05:29,267][33201] Updated weights for policy 0, policy_version 75920 (0.0007) [2023-10-14 04:05:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 156172288. Throughput: 0: 1744.6, 1: 1760.6. Samples: 39047368. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:29,558][31953] Avg episode reward: [(0, '20.610'), (1, '20.800')] [2023-10-14 04:05:29,632][33201] Updated weights for policy 0, policy_version 75930 (0.0008) [2023-10-14 04:05:32,576][33226] Updated weights for policy 1, policy_version 76610 (0.0007) [2023-10-14 04:05:32,941][33226] Updated weights for policy 1, policy_version 76620 (0.0007) [2023-10-14 04:05:33,301][33226] Updated weights for policy 1, policy_version 76630 (0.0007) [2023-10-14 04:05:33,670][33226] Updated weights for policy 1, policy_version 76640 (0.0007) [2023-10-14 04:05:33,678][33201] Updated weights for policy 0, policy_version 75940 (0.0008) [2023-10-14 04:05:34,055][33201] Updated weights for policy 0, policy_version 75950 (0.0008) [2023-10-14 04:05:34,429][33201] Updated weights for policy 0, policy_version 75960 (0.0007) [2023-10-14 04:05:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 156237824. Throughput: 0: 1776.4, 1: 1785.2. Samples: 39068836. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:34,557][31953] Avg episode reward: [(0, '20.670'), (1, '20.800')] [2023-10-14 04:05:37,598][33226] Updated weights for policy 1, policy_version 76650 (0.0008) [2023-10-14 04:05:37,973][33226] Updated weights for policy 1, policy_version 76660 (0.0009) [2023-10-14 04:05:38,241][33201] Updated weights for policy 0, policy_version 75970 (0.0007) [2023-10-14 04:05:38,337][33226] Updated weights for policy 1, policy_version 76670 (0.0008) [2023-10-14 04:05:38,605][33201] Updated weights for policy 0, policy_version 75980 (0.0008) [2023-10-14 04:05:38,978][33201] Updated weights for policy 0, policy_version 75990 (0.0009) [2023-10-14 04:05:39,353][33201] Updated weights for policy 0, policy_version 76000 (0.0007) [2023-10-14 04:05:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 156336128. Throughput: 0: 1746.9, 1: 1770.6. Samples: 39088704. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:39,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.800')] [2023-10-14 04:05:42,188][33226] Updated weights for policy 1, policy_version 76680 (0.0009) [2023-10-14 04:05:42,563][33226] Updated weights for policy 1, policy_version 76690 (0.0007) [2023-10-14 04:05:42,928][33226] Updated weights for policy 1, policy_version 76700 (0.0007) [2023-10-14 04:05:43,143][33201] Updated weights for policy 0, policy_version 76010 (0.0008) [2023-10-14 04:05:43,508][33201] Updated weights for policy 0, policy_version 76020 (0.0008) [2023-10-14 04:05:43,869][33201] Updated weights for policy 0, policy_version 76030 (0.0010) [2023-10-14 04:05:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 156401664. Throughput: 0: 1763.2, 1: 1791.8. Samples: 39100614. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:44,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.800')] [2023-10-14 04:05:46,774][33226] Updated weights for policy 1, policy_version 76710 (0.0007) [2023-10-14 04:05:47,140][33226] Updated weights for policy 1, policy_version 76720 (0.0008) [2023-10-14 04:05:47,513][33226] Updated weights for policy 1, policy_version 76730 (0.0008) [2023-10-14 04:05:47,782][33201] Updated weights for policy 0, policy_version 76040 (0.0009) [2023-10-14 04:05:48,148][33201] Updated weights for policy 0, policy_version 76050 (0.0007) [2023-10-14 04:05:48,515][33201] Updated weights for policy 0, policy_version 76060 (0.0009) [2023-10-14 04:05:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 156467200. Throughput: 0: 1765.2, 1: 1764.3. Samples: 39120818. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.810')] [2023-10-14 04:05:51,257][33226] Updated weights for policy 1, policy_version 76740 (0.0008) [2023-10-14 04:05:51,623][33226] Updated weights for policy 1, policy_version 76750 (0.0010) [2023-10-14 04:05:51,985][33226] Updated weights for policy 1, policy_version 76760 (0.0008) [2023-10-14 04:05:52,168][33201] Updated weights for policy 0, policy_version 76070 (0.0009) [2023-10-14 04:05:52,548][33201] Updated weights for policy 0, policy_version 76080 (0.0010) [2023-10-14 04:05:52,923][33201] Updated weights for policy 0, policy_version 76090 (0.0010) [2023-10-14 04:05:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 156532736. Throughput: 0: 1755.1, 1: 1765.3. Samples: 39142368. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.810')] [2023-10-14 04:05:55,745][33226] Updated weights for policy 1, policy_version 76770 (0.0008) [2023-10-14 04:05:56,108][33226] Updated weights for policy 1, policy_version 76780 (0.0007) [2023-10-14 04:05:56,470][33226] Updated weights for policy 1, policy_version 76790 (0.0008) [2023-10-14 04:05:56,826][33226] Updated weights for policy 1, policy_version 76800 (0.0008) [2023-10-14 04:05:56,882][33201] Updated weights for policy 0, policy_version 76100 (0.0009) [2023-10-14 04:05:57,251][33201] Updated weights for policy 0, policy_version 76110 (0.0011) [2023-10-14 04:05:57,614][33201] Updated weights for policy 0, policy_version 76120 (0.0009) [2023-10-14 04:05:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 156598272. Throughput: 0: 1780.7, 1: 1764.5. Samples: 39153170. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:05:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.810')] [2023-10-14 04:06:00,779][33226] Updated weights for policy 1, policy_version 76810 (0.0009) [2023-10-14 04:06:01,155][33226] Updated weights for policy 1, policy_version 76820 (0.0009) [2023-10-14 04:06:01,484][33201] Updated weights for policy 0, policy_version 76130 (0.0010) [2023-10-14 04:06:01,520][33226] Updated weights for policy 1, policy_version 76830 (0.0009) [2023-10-14 04:06:01,850][33201] Updated weights for policy 0, policy_version 76140 (0.0008) [2023-10-14 04:06:02,225][33201] Updated weights for policy 0, policy_version 76150 (0.0007) [2023-10-14 04:06:02,588][33201] Updated weights for policy 0, policy_version 76160 (0.0008) [2023-10-14 04:06:04,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 156663808. Throughput: 0: 1753.2, 1: 1768.2. Samples: 39174102. 
Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:06:04,557][31953] Avg episode reward: [(0, '20.890'), (1, '20.800')] [2023-10-14 04:06:05,403][33226] Updated weights for policy 1, policy_version 76840 (0.0008) [2023-10-14 04:06:05,765][33226] Updated weights for policy 1, policy_version 76850 (0.0008) [2023-10-14 04:06:06,132][33226] Updated weights for policy 1, policy_version 76860 (0.0007) [2023-10-14 04:06:06,505][33201] Updated weights for policy 0, policy_version 76170 (0.0008) [2023-10-14 04:06:06,885][33201] Updated weights for policy 0, policy_version 76180 (0.0008) [2023-10-14 04:06:07,257][33201] Updated weights for policy 0, policy_version 76190 (0.0007) [2023-10-14 04:06:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 156729344. Throughput: 0: 1755.6, 1: 1781.7. Samples: 39196248. Policy #0 lag: (min: 31.0, avg: 40.5, max: 63.0) [2023-10-14 04:06:09,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.800')] [2023-10-14 04:06:09,846][33226] Updated weights for policy 1, policy_version 76870 (0.0010) [2023-10-14 04:06:10,211][33226] Updated weights for policy 1, policy_version 76880 (0.0010) [2023-10-14 04:06:10,576][33226] Updated weights for policy 1, policy_version 76890 (0.0010) [2023-10-14 04:06:11,065][33201] Updated weights for policy 0, policy_version 76200 (0.0008) [2023-10-14 04:06:11,431][33201] Updated weights for policy 0, policy_version 76210 (0.0007) [2023-10-14 04:06:11,804][33201] Updated weights for policy 0, policy_version 76220 (0.0010) [2023-10-14 04:06:14,447][33226] Updated weights for policy 1, policy_version 76900 (0.0009) [2023-10-14 04:06:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 156794880. Throughput: 0: 1757.4, 1: 1765.1. Samples: 39205880. 
Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.800')] [2023-10-14 04:06:14,811][33226] Updated weights for policy 1, policy_version 76910 (0.0009) [2023-10-14 04:06:15,180][33226] Updated weights for policy 1, policy_version 76920 (0.0008) [2023-10-14 04:06:15,542][33201] Updated weights for policy 0, policy_version 76230 (0.0009) [2023-10-14 04:06:15,913][33201] Updated weights for policy 0, policy_version 76240 (0.0009) [2023-10-14 04:06:16,288][33201] Updated weights for policy 0, policy_version 76250 (0.0008) [2023-10-14 04:06:18,871][33226] Updated weights for policy 1, policy_version 76930 (0.0008) [2023-10-14 04:06:19,247][33226] Updated weights for policy 1, policy_version 76940 (0.0007) [2023-10-14 04:06:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 156860416. Throughput: 0: 1758.1, 1: 1777.5. Samples: 39227936. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.800')] [2023-10-14 04:06:19,613][33226] Updated weights for policy 1, policy_version 76950 (0.0009) [2023-10-14 04:06:19,971][33226] Updated weights for policy 1, policy_version 76960 (0.0010) [2023-10-14 04:06:20,134][33201] Updated weights for policy 0, policy_version 76260 (0.0007) [2023-10-14 04:06:20,499][33201] Updated weights for policy 0, policy_version 76270 (0.0007) [2023-10-14 04:06:20,867][33201] Updated weights for policy 0, policy_version 76280 (0.0007) [2023-10-14 04:06:23,737][33226] Updated weights for policy 1, policy_version 76970 (0.0007) [2023-10-14 04:06:24,104][33226] Updated weights for policy 1, policy_version 76980 (0.0007) [2023-10-14 04:06:24,472][33226] Updated weights for policy 1, policy_version 76990 (0.0007) [2023-10-14 04:06:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 156958720. Throughput: 0: 1795.3, 1: 1785.0. 
Samples: 39249818. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.800')] [2023-10-14 04:06:24,568][33201] Updated weights for policy 0, policy_version 76290 (0.0009) [2023-10-14 04:06:24,941][33201] Updated weights for policy 0, policy_version 76300 (0.0007) [2023-10-14 04:06:25,306][33201] Updated weights for policy 0, policy_version 76310 (0.0008) [2023-10-14 04:06:25,679][33201] Updated weights for policy 0, policy_version 76320 (0.0009) [2023-10-14 04:06:28,354][33226] Updated weights for policy 1, policy_version 77000 (0.0010) [2023-10-14 04:06:28,713][33226] Updated weights for policy 1, policy_version 77010 (0.0010) [2023-10-14 04:06:29,080][33226] Updated weights for policy 1, policy_version 77020 (0.0007) [2023-10-14 04:06:29,325][33201] Updated weights for policy 0, policy_version 76330 (0.0008) [2023-10-14 04:06:29,557][31953] Fps is (10 sec: 16383.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 157024256. Throughput: 0: 1775.2, 1: 1773.2. Samples: 39260296. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:29,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.780')] [2023-10-14 04:06:29,704][33201] Updated weights for policy 0, policy_version 76340 (0.0007) [2023-10-14 04:06:30,075][33201] Updated weights for policy 0, policy_version 76350 (0.0007) [2023-10-14 04:06:33,006][33226] Updated weights for policy 1, policy_version 77030 (0.0007) [2023-10-14 04:06:33,376][33226] Updated weights for policy 1, policy_version 77040 (0.0009) [2023-10-14 04:06:33,745][33226] Updated weights for policy 1, policy_version 77050 (0.0007) [2023-10-14 04:06:33,847][33201] Updated weights for policy 0, policy_version 76360 (0.0007) [2023-10-14 04:06:34,214][33201] Updated weights for policy 0, policy_version 76370 (0.0007) [2023-10-14 04:06:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 157089792. 
Throughput: 0: 1786.2, 1: 1801.2. Samples: 39282250. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.780')] [2023-10-14 04:06:34,583][33201] Updated weights for policy 0, policy_version 76380 (0.0009) [2023-10-14 04:06:37,570][33226] Updated weights for policy 1, policy_version 77060 (0.0008) [2023-10-14 04:06:37,940][33226] Updated weights for policy 1, policy_version 77070 (0.0008) [2023-10-14 04:06:38,291][33201] Updated weights for policy 0, policy_version 76390 (0.0009) [2023-10-14 04:06:38,313][33226] Updated weights for policy 1, policy_version 77080 (0.0008) [2023-10-14 04:06:38,658][33201] Updated weights for policy 0, policy_version 76400 (0.0008) [2023-10-14 04:06:39,028][33201] Updated weights for policy 0, policy_version 76410 (0.0011) [2023-10-14 04:06:39,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 157188096. Throughput: 0: 1779.1, 1: 1771.4. Samples: 39302138. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 04:06:41,832][33226] Updated weights for policy 1, policy_version 77090 (0.0008) [2023-10-14 04:06:42,202][33226] Updated weights for policy 1, policy_version 77100 (0.0008) [2023-10-14 04:06:42,572][33226] Updated weights for policy 1, policy_version 77110 (0.0007) [2023-10-14 04:06:42,837][33201] Updated weights for policy 0, policy_version 76420 (0.0008) [2023-10-14 04:06:42,933][33226] Updated weights for policy 1, policy_version 77120 (0.0008) [2023-10-14 04:06:43,219][33201] Updated weights for policy 0, policy_version 76430 (0.0010) [2023-10-14 04:06:43,582][33201] Updated weights for policy 0, policy_version 76440 (0.0009) [2023-10-14 04:06:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 157253632. Throughput: 0: 1779.0, 1: 1804.4. Samples: 39314420. 
Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.800')] [2023-10-14 04:06:46,629][33226] Updated weights for policy 1, policy_version 77130 (0.0008) [2023-10-14 04:06:46,987][33226] Updated weights for policy 1, policy_version 77140 (0.0007) [2023-10-14 04:06:47,362][33226] Updated weights for policy 1, policy_version 77150 (0.0007) [2023-10-14 04:06:47,435][33201] Updated weights for policy 0, policy_version 76450 (0.0010) [2023-10-14 04:06:47,805][33201] Updated weights for policy 0, policy_version 76460 (0.0009) [2023-10-14 04:06:48,179][33201] Updated weights for policy 0, policy_version 76470 (0.0007) [2023-10-14 04:06:48,541][33201] Updated weights for policy 0, policy_version 76480 (0.0009) [2023-10-14 04:06:49,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 157319168. Throughput: 0: 1785.4, 1: 1780.7. Samples: 39334580. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 04:06:51,216][33226] Updated weights for policy 1, policy_version 77160 (0.0009) [2023-10-14 04:06:51,576][33226] Updated weights for policy 1, policy_version 77170 (0.0010) [2023-10-14 04:06:51,941][33226] Updated weights for policy 1, policy_version 77180 (0.0008) [2023-10-14 04:06:52,199][33201] Updated weights for policy 0, policy_version 76490 (0.0007) [2023-10-14 04:06:52,573][33201] Updated weights for policy 0, policy_version 76500 (0.0010) [2023-10-14 04:06:52,956][33201] Updated weights for policy 0, policy_version 76510 (0.0008) [2023-10-14 04:06:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 157384704. Throughput: 0: 1775.8, 1: 1780.5. Samples: 39356282. 
Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:06:55,715][33226] Updated weights for policy 1, policy_version 77190 (0.0009) [2023-10-14 04:06:56,082][33226] Updated weights for policy 1, policy_version 77200 (0.0008) [2023-10-14 04:06:56,444][33226] Updated weights for policy 1, policy_version 77210 (0.0008) [2023-10-14 04:06:56,943][33201] Updated weights for policy 0, policy_version 76520 (0.0008) [2023-10-14 04:06:57,323][33201] Updated weights for policy 0, policy_version 76530 (0.0007) [2023-10-14 04:06:57,694][33201] Updated weights for policy 0, policy_version 76540 (0.0007) [2023-10-14 04:06:59,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 157450240. Throughput: 0: 1793.8, 1: 1779.2. Samples: 39366662. Policy #0 lag: (min: 8.0, avg: 36.7, max: 40.0) [2023-10-14 04:06:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.950')] [2023-10-14 04:07:00,322][33226] Updated weights for policy 1, policy_version 77220 (0.0010) [2023-10-14 04:07:00,680][33226] Updated weights for policy 1, policy_version 77230 (0.0011) [2023-10-14 04:07:01,055][33226] Updated weights for policy 1, policy_version 77240 (0.0010) [2023-10-14 04:07:01,528][33201] Updated weights for policy 0, policy_version 76550 (0.0008) [2023-10-14 04:07:01,904][33201] Updated weights for policy 0, policy_version 76560 (0.0008) [2023-10-14 04:07:02,279][33201] Updated weights for policy 0, policy_version 76570 (0.0008) [2023-10-14 04:07:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 157515776. Throughput: 0: 1770.9, 1: 1777.6. Samples: 39387618. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:04,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 04:07:04,843][33226] Updated weights for policy 1, policy_version 77250 (0.0010) [2023-10-14 04:07:05,209][33226] Updated weights for policy 1, policy_version 77260 (0.0007) [2023-10-14 04:07:05,572][33226] Updated weights for policy 1, policy_version 77270 (0.0008) [2023-10-14 04:07:05,928][33201] Updated weights for policy 0, policy_version 76580 (0.0009) [2023-10-14 04:07:05,941][33226] Updated weights for policy 1, policy_version 77280 (0.0008) [2023-10-14 04:07:06,296][33201] Updated weights for policy 0, policy_version 76590 (0.0011) [2023-10-14 04:07:06,660][33201] Updated weights for policy 0, policy_version 76600 (0.0010) [2023-10-14 04:07:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 157581312. Throughput: 0: 1767.8, 1: 1791.2. Samples: 39409972. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 04:07:09,798][33226] Updated weights for policy 1, policy_version 77290 (0.0008) [2023-10-14 04:07:10,174][33226] Updated weights for policy 1, policy_version 77300 (0.0007) [2023-10-14 04:07:10,495][33201] Updated weights for policy 0, policy_version 76610 (0.0008) [2023-10-14 04:07:10,532][33226] Updated weights for policy 1, policy_version 77310 (0.0007) [2023-10-14 04:07:10,869][33201] Updated weights for policy 0, policy_version 76620 (0.0008) [2023-10-14 04:07:11,231][33201] Updated weights for policy 0, policy_version 76630 (0.0008) [2023-10-14 04:07:11,602][33201] Updated weights for policy 0, policy_version 76640 (0.0008) [2023-10-14 04:07:14,189][33226] Updated weights for policy 1, policy_version 77320 (0.0007) [2023-10-14 04:07:14,553][33226] Updated weights for policy 1, policy_version 77330 (0.0007) [2023-10-14 04:07:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 
14199.5, 300 sec: 14106.9). Total num frames: 157646848. Throughput: 0: 1765.5, 1: 1774.1. Samples: 39419580. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 04:07:14,922][33226] Updated weights for policy 1, policy_version 77340 (0.0008) [2023-10-14 04:07:15,430][33201] Updated weights for policy 0, policy_version 76650 (0.0008) [2023-10-14 04:07:15,798][33201] Updated weights for policy 0, policy_version 76660 (0.0009) [2023-10-14 04:07:16,174][33201] Updated weights for policy 0, policy_version 76670 (0.0007) [2023-10-14 04:07:18,867][33226] Updated weights for policy 1, policy_version 77350 (0.0008) [2023-10-14 04:07:19,233][33226] Updated weights for policy 1, policy_version 77360 (0.0007) [2023-10-14 04:07:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 157712384. Throughput: 0: 1763.4, 1: 1773.9. Samples: 39441428. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:19,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.970')] [2023-10-14 04:07:19,606][33226] Updated weights for policy 1, policy_version 77370 (0.0008) [2023-10-14 04:07:20,117][33201] Updated weights for policy 0, policy_version 76680 (0.0009) [2023-10-14 04:07:20,487][33201] Updated weights for policy 0, policy_version 76690 (0.0007) [2023-10-14 04:07:20,856][33201] Updated weights for policy 0, policy_version 76700 (0.0008) [2023-10-14 04:07:23,376][33226] Updated weights for policy 1, policy_version 77380 (0.0009) [2023-10-14 04:07:23,745][33226] Updated weights for policy 1, policy_version 77390 (0.0010) [2023-10-14 04:07:24,115][33226] Updated weights for policy 1, policy_version 77400 (0.0010) [2023-10-14 04:07:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 157810688. Throughput: 0: 1779.0, 1: 1785.6. Samples: 39462544. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:24,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000077408_79265792.pth... [2023-10-14 04:07:24,605][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000075744_77561856.pth [2023-10-14 04:07:24,624][33201] Updated weights for policy 0, policy_version 76710 (0.0008) [2023-10-14 04:07:24,993][33201] Updated weights for policy 0, policy_version 76720 (0.0009) [2023-10-14 04:07:25,366][33201] Updated weights for policy 0, policy_version 76730 (0.0009) [2023-10-14 04:07:25,583][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000076736_78577664.pth... [2023-10-14 04:07:25,621][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000075072_76873728.pth [2023-10-14 04:07:28,067][33226] Updated weights for policy 1, policy_version 77410 (0.0008) [2023-10-14 04:07:28,431][33226] Updated weights for policy 1, policy_version 77420 (0.0007) [2023-10-14 04:07:28,793][33226] Updated weights for policy 1, policy_version 77430 (0.0010) [2023-10-14 04:07:29,139][33201] Updated weights for policy 0, policy_version 76740 (0.0007) [2023-10-14 04:07:29,158][33226] Updated weights for policy 1, policy_version 77440 (0.0009) [2023-10-14 04:07:29,503][33201] Updated weights for policy 0, policy_version 76750 (0.0008) [2023-10-14 04:07:29,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 157876224. Throughput: 0: 1753.6, 1: 1770.2. Samples: 39472994. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:29,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:29,869][33201] Updated weights for policy 0, policy_version 76760 (0.0010) [2023-10-14 04:07:32,856][33226] Updated weights for policy 1, policy_version 77450 (0.0010) [2023-10-14 04:07:33,237][33226] Updated weights for policy 1, policy_version 77460 (0.0010) [2023-10-14 04:07:33,599][33226] Updated weights for policy 1, policy_version 77470 (0.0007) [2023-10-14 04:07:33,738][33201] Updated weights for policy 0, policy_version 76770 (0.0008) [2023-10-14 04:07:34,103][33201] Updated weights for policy 0, policy_version 76780 (0.0009) [2023-10-14 04:07:34,482][33201] Updated weights for policy 0, policy_version 76790 (0.0010) [2023-10-14 04:07:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 157941760. Throughput: 0: 1775.6, 1: 1783.3. Samples: 39494734. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:34,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:34,857][33201] Updated weights for policy 0, policy_version 76800 (0.0007) [2023-10-14 04:07:37,370][33226] Updated weights for policy 1, policy_version 77480 (0.0008) [2023-10-14 04:07:37,740][33226] Updated weights for policy 1, policy_version 77490 (0.0010) [2023-10-14 04:07:38,100][33226] Updated weights for policy 1, policy_version 77500 (0.0009) [2023-10-14 04:07:38,844][33201] Updated weights for policy 0, policy_version 76810 (0.0009) [2023-10-14 04:07:39,214][33201] Updated weights for policy 0, policy_version 76820 (0.0010) [2023-10-14 04:07:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 158007296. Throughput: 0: 1768.8, 1: 1767.2. Samples: 39515406. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:39,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:39,581][33201] Updated weights for policy 0, policy_version 76830 (0.0011) [2023-10-14 04:07:41,780][33226] Updated weights for policy 1, policy_version 77510 (0.0007) [2023-10-14 04:07:42,137][33226] Updated weights for policy 1, policy_version 77520 (0.0008) [2023-10-14 04:07:42,508][33226] Updated weights for policy 1, policy_version 77530 (0.0007) [2023-10-14 04:07:43,410][33201] Updated weights for policy 0, policy_version 76840 (0.0008) [2023-10-14 04:07:43,788][33201] Updated weights for policy 0, policy_version 76850 (0.0009) [2023-10-14 04:07:44,155][33201] Updated weights for policy 0, policy_version 76860 (0.0009) [2023-10-14 04:07:44,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 158105600. Throughput: 0: 1763.6, 1: 1791.2. Samples: 39526628. Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:44,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:46,173][33226] Updated weights for policy 1, policy_version 77540 (0.0010) [2023-10-14 04:07:46,537][33226] Updated weights for policy 1, policy_version 77550 (0.0008) [2023-10-14 04:07:46,897][33226] Updated weights for policy 1, policy_version 77560 (0.0007) [2023-10-14 04:07:47,842][33201] Updated weights for policy 0, policy_version 76870 (0.0008) [2023-10-14 04:07:48,216][33201] Updated weights for policy 0, policy_version 76880 (0.0007) [2023-10-14 04:07:48,585][33201] Updated weights for policy 0, policy_version 76890 (0.0008) [2023-10-14 04:07:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 158171136. Throughput: 0: 1778.0, 1: 1777.3. Samples: 39547610. 
Policy #0 lag: (min: 31.0, avg: 38.3, max: 63.0) [2023-10-14 04:07:49,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.970')] [2023-10-14 04:07:50,833][33226] Updated weights for policy 1, policy_version 77570 (0.0008) [2023-10-14 04:07:51,190][33226] Updated weights for policy 1, policy_version 77580 (0.0008) [2023-10-14 04:07:51,553][33226] Updated weights for policy 1, policy_version 77590 (0.0011) [2023-10-14 04:07:51,921][33226] Updated weights for policy 1, policy_version 77600 (0.0009) [2023-10-14 04:07:52,261][33201] Updated weights for policy 0, policy_version 76900 (0.0008) [2023-10-14 04:07:52,630][33201] Updated weights for policy 0, policy_version 76910 (0.0008) [2023-10-14 04:07:52,997][33201] Updated weights for policy 0, policy_version 76920 (0.0008) [2023-10-14 04:07:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 158236672. Throughput: 0: 1758.2, 1: 1773.7. Samples: 39568908. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:07:54,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 04:07:55,679][33226] Updated weights for policy 1, policy_version 77610 (0.0009) [2023-10-14 04:07:56,051][33226] Updated weights for policy 1, policy_version 77620 (0.0008) [2023-10-14 04:07:56,405][33226] Updated weights for policy 1, policy_version 77630 (0.0009) [2023-10-14 04:07:56,781][33201] Updated weights for policy 0, policy_version 76930 (0.0007) [2023-10-14 04:07:57,147][33201] Updated weights for policy 0, policy_version 76940 (0.0008) [2023-10-14 04:07:57,513][33201] Updated weights for policy 0, policy_version 76950 (0.0009) [2023-10-14 04:07:57,883][33201] Updated weights for policy 0, policy_version 76960 (0.0008) [2023-10-14 04:07:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 158302208. Throughput: 0: 1782.9, 1: 1773.5. Samples: 39579618. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:07:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 04:08:00,285][33226] Updated weights for policy 1, policy_version 77640 (0.0008) [2023-10-14 04:08:00,649][33226] Updated weights for policy 1, policy_version 77650 (0.0007) [2023-10-14 04:08:01,018][33226] Updated weights for policy 1, policy_version 77660 (0.0007) [2023-10-14 04:08:01,714][33201] Updated weights for policy 0, policy_version 76970 (0.0011) [2023-10-14 04:08:02,077][33201] Updated weights for policy 0, policy_version 76980 (0.0008) [2023-10-14 04:08:02,442][33201] Updated weights for policy 0, policy_version 76990 (0.0007) [2023-10-14 04:08:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 158367744. Throughput: 0: 1759.6, 1: 1778.7. Samples: 39600654. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:04,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 04:08:04,780][33226] Updated weights for policy 1, policy_version 77670 (0.0009) [2023-10-14 04:08:05,154][33226] Updated weights for policy 1, policy_version 77680 (0.0010) [2023-10-14 04:08:05,520][33226] Updated weights for policy 1, policy_version 77690 (0.0008) [2023-10-14 04:08:06,182][33201] Updated weights for policy 0, policy_version 77000 (0.0009) [2023-10-14 04:08:06,563][33201] Updated weights for policy 0, policy_version 77010 (0.0009) [2023-10-14 04:08:06,921][33201] Updated weights for policy 0, policy_version 77020 (0.0009) [2023-10-14 04:08:09,294][33226] Updated weights for policy 1, policy_version 77700 (0.0008) [2023-10-14 04:08:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158433280. Throughput: 0: 1762.3, 1: 1801.1. Samples: 39622896. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:09,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.980')] [2023-10-14 04:08:09,654][33226] Updated weights for policy 1, policy_version 77710 (0.0008) [2023-10-14 04:08:10,028][33226] Updated weights for policy 1, policy_version 77720 (0.0009) [2023-10-14 04:08:10,901][33201] Updated weights for policy 0, policy_version 77030 (0.0009) [2023-10-14 04:08:11,273][33201] Updated weights for policy 0, policy_version 77040 (0.0010) [2023-10-14 04:08:11,642][33201] Updated weights for policy 0, policy_version 77050 (0.0011) [2023-10-14 04:08:13,959][33226] Updated weights for policy 1, policy_version 77730 (0.0008) [2023-10-14 04:08:14,332][33226] Updated weights for policy 1, policy_version 77740 (0.0009) [2023-10-14 04:08:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158498816. Throughput: 0: 1764.9, 1: 1782.5. Samples: 39632622. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:14,558][31953] Avg episode reward: [(0, '20.880'), (1, '21.000')] [2023-10-14 04:08:14,699][33226] Updated weights for policy 1, policy_version 77750 (0.0008) [2023-10-14 04:08:15,072][33226] Updated weights for policy 1, policy_version 77760 (0.0008) [2023-10-14 04:08:15,609][33201] Updated weights for policy 0, policy_version 77060 (0.0010) [2023-10-14 04:08:15,977][33201] Updated weights for policy 0, policy_version 77070 (0.0007) [2023-10-14 04:08:16,347][33201] Updated weights for policy 0, policy_version 77080 (0.0007) [2023-10-14 04:08:18,998][33226] Updated weights for policy 1, policy_version 77770 (0.0007) [2023-10-14 04:08:19,366][33226] Updated weights for policy 1, policy_version 77780 (0.0008) [2023-10-14 04:08:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158564352. Throughput: 0: 1765.9, 1: 1781.1. Samples: 39654346. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:19,558][31953] Avg episode reward: [(0, '20.880'), (1, '21.000')] [2023-10-14 04:08:19,732][33226] Updated weights for policy 1, policy_version 77790 (0.0007) [2023-10-14 04:08:20,158][33201] Updated weights for policy 0, policy_version 77090 (0.0009) [2023-10-14 04:08:20,523][33201] Updated weights for policy 0, policy_version 77100 (0.0007) [2023-10-14 04:08:20,900][33201] Updated weights for policy 0, policy_version 77110 (0.0008) [2023-10-14 04:08:21,264][33201] Updated weights for policy 0, policy_version 77120 (0.0008) [2023-10-14 04:08:23,521][33226] Updated weights for policy 1, policy_version 77800 (0.0007) [2023-10-14 04:08:23,898][33226] Updated weights for policy 1, policy_version 77810 (0.0007) [2023-10-14 04:08:24,255][33226] Updated weights for policy 1, policy_version 77820 (0.0008) [2023-10-14 04:08:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158662656. Throughput: 0: 1782.4, 1: 1778.2. Samples: 39675632. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:24,558][31953] Avg episode reward: [(0, '20.880'), (1, '21.000')] [2023-10-14 04:08:24,947][33201] Updated weights for policy 0, policy_version 77130 (0.0009) [2023-10-14 04:08:25,315][33201] Updated weights for policy 0, policy_version 77140 (0.0007) [2023-10-14 04:08:25,692][33201] Updated weights for policy 0, policy_version 77150 (0.0007) [2023-10-14 04:08:27,909][33226] Updated weights for policy 1, policy_version 77830 (0.0010) [2023-10-14 04:08:28,266][33226] Updated weights for policy 1, policy_version 77840 (0.0009) [2023-10-14 04:08:28,639][33226] Updated weights for policy 1, policy_version 77850 (0.0008) [2023-10-14 04:08:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158728192. Throughput: 0: 1763.8, 1: 1777.8. Samples: 39685998. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:29,558][31953] Avg episode reward: [(0, '20.880'), (1, '21.000')] [2023-10-14 04:08:29,584][33201] Updated weights for policy 0, policy_version 77160 (0.0008) [2023-10-14 04:08:29,952][33201] Updated weights for policy 0, policy_version 77170 (0.0008) [2023-10-14 04:08:30,331][33201] Updated weights for policy 0, policy_version 77180 (0.0009) [2023-10-14 04:08:32,562][33226] Updated weights for policy 1, policy_version 77860 (0.0008) [2023-10-14 04:08:32,934][33226] Updated weights for policy 1, policy_version 77870 (0.0008) [2023-10-14 04:08:33,294][33226] Updated weights for policy 1, policy_version 77880 (0.0010) [2023-10-14 04:08:34,137][33201] Updated weights for policy 0, policy_version 77190 (0.0009) [2023-10-14 04:08:34,505][33201] Updated weights for policy 0, policy_version 77200 (0.0008) [2023-10-14 04:08:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158793728. Throughput: 0: 1770.3, 1: 1773.9. Samples: 39707100. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '21.000')] [2023-10-14 04:08:34,877][33201] Updated weights for policy 0, policy_version 77210 (0.0009) [2023-10-14 04:08:37,133][33226] Updated weights for policy 1, policy_version 77890 (0.0010) [2023-10-14 04:08:37,492][33226] Updated weights for policy 1, policy_version 77900 (0.0008) [2023-10-14 04:08:37,870][33226] Updated weights for policy 1, policy_version 77910 (0.0009) [2023-10-14 04:08:38,238][33226] Updated weights for policy 1, policy_version 77920 (0.0008) [2023-10-14 04:08:38,769][33201] Updated weights for policy 0, policy_version 77220 (0.0007) [2023-10-14 04:08:39,135][33201] Updated weights for policy 0, policy_version 77230 (0.0009) [2023-10-14 04:08:39,516][33201] Updated weights for policy 0, policy_version 77240 (0.0010) [2023-10-14 04:08:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 158859264. Throughput: 0: 1770.9, 1: 1757.2. Samples: 39727674. Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:08:42,152][33226] Updated weights for policy 1, policy_version 77930 (0.0008) [2023-10-14 04:08:42,525][33226] Updated weights for policy 1, policy_version 77940 (0.0010) [2023-10-14 04:08:42,887][33226] Updated weights for policy 1, policy_version 77950 (0.0008) [2023-10-14 04:08:43,569][33201] Updated weights for policy 0, policy_version 77250 (0.0012) [2023-10-14 04:08:43,933][33201] Updated weights for policy 0, policy_version 77260 (0.0010) [2023-10-14 04:08:44,302][33201] Updated weights for policy 0, policy_version 77270 (0.0010) [2023-10-14 04:08:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 158924800. Throughput: 0: 1755.5, 1: 1783.8. Samples: 39738886. 
Policy #0 lag: (min: 31.0, avg: 35.9, max: 63.0) [2023-10-14 04:08:44,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:08:44,675][33201] Updated weights for policy 0, policy_version 77280 (0.0009) [2023-10-14 04:08:46,773][33226] Updated weights for policy 1, policy_version 77960 (0.0009) [2023-10-14 04:08:47,133][33226] Updated weights for policy 1, policy_version 77970 (0.0010) [2023-10-14 04:08:47,502][33226] Updated weights for policy 1, policy_version 77980 (0.0010) [2023-10-14 04:08:48,720][33201] Updated weights for policy 0, policy_version 77290 (0.0008) [2023-10-14 04:08:49,083][33201] Updated weights for policy 0, policy_version 77300 (0.0011) [2023-10-14 04:08:49,446][33201] Updated weights for policy 0, policy_version 77310 (0.0010) [2023-10-14 04:08:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159023104. Throughput: 0: 1776.5, 1: 1755.8. Samples: 39759608. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:08:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 04:08:51,191][33226] Updated weights for policy 1, policy_version 77990 (0.0010) [2023-10-14 04:08:51,563][33226] Updated weights for policy 1, policy_version 78000 (0.0008) [2023-10-14 04:08:51,930][33226] Updated weights for policy 1, policy_version 78010 (0.0007) [2023-10-14 04:08:52,988][33201] Updated weights for policy 0, policy_version 77320 (0.0008) [2023-10-14 04:08:53,348][33201] Updated weights for policy 0, policy_version 77330 (0.0009) [2023-10-14 04:08:53,726][33201] Updated weights for policy 0, policy_version 77340 (0.0010) [2023-10-14 04:08:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159088640. Throughput: 0: 1746.0, 1: 1763.2. Samples: 39780810. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:08:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.840')] [2023-10-14 04:08:55,710][33226] Updated weights for policy 1, policy_version 78020 (0.0009) [2023-10-14 04:08:56,080][33226] Updated weights for policy 1, policy_version 78030 (0.0008) [2023-10-14 04:08:56,453][33226] Updated weights for policy 1, policy_version 78040 (0.0008) [2023-10-14 04:08:57,547][33201] Updated weights for policy 0, policy_version 77350 (0.0010) [2023-10-14 04:08:57,906][33201] Updated weights for policy 0, policy_version 77360 (0.0011) [2023-10-14 04:08:58,281][33201] Updated weights for policy 0, policy_version 77370 (0.0011) [2023-10-14 04:08:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159154176. Throughput: 0: 1776.4, 1: 1757.8. Samples: 39791660. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:08:59,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.840')] [2023-10-14 04:09:00,311][33226] Updated weights for policy 1, policy_version 78050 (0.0009) [2023-10-14 04:09:00,679][33226] Updated weights for policy 1, policy_version 78060 (0.0008) [2023-10-14 04:09:01,053][33226] Updated weights for policy 1, policy_version 78070 (0.0008) [2023-10-14 04:09:01,418][33226] Updated weights for policy 1, policy_version 78080 (0.0011) [2023-10-14 04:09:02,069][33201] Updated weights for policy 0, policy_version 77380 (0.0007) [2023-10-14 04:09:02,444][33201] Updated weights for policy 0, policy_version 77390 (0.0008) [2023-10-14 04:09:02,810][33201] Updated weights for policy 0, policy_version 77400 (0.0009) [2023-10-14 04:09:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159219712. Throughput: 0: 1744.0, 1: 1771.0. Samples: 39812522. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.840')] [2023-10-14 04:09:05,215][33226] Updated weights for policy 1, policy_version 78090 (0.0009) [2023-10-14 04:09:05,576][33226] Updated weights for policy 1, policy_version 78100 (0.0008) [2023-10-14 04:09:05,943][33226] Updated weights for policy 1, policy_version 78110 (0.0008) [2023-10-14 04:09:06,520][33201] Updated weights for policy 0, policy_version 77410 (0.0008) [2023-10-14 04:09:06,891][33201] Updated weights for policy 0, policy_version 77420 (0.0007) [2023-10-14 04:09:07,274][33201] Updated weights for policy 0, policy_version 77430 (0.0011) [2023-10-14 04:09:07,639][33201] Updated weights for policy 0, policy_version 77440 (0.0009) [2023-10-14 04:09:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159285248. Throughput: 0: 1748.0, 1: 1791.9. Samples: 39834928. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.840')] [2023-10-14 04:09:09,676][33226] Updated weights for policy 1, policy_version 78120 (0.0008) [2023-10-14 04:09:10,053][33226] Updated weights for policy 1, policy_version 78130 (0.0008) [2023-10-14 04:09:10,415][33226] Updated weights for policy 1, policy_version 78140 (0.0010) [2023-10-14 04:09:11,488][33201] Updated weights for policy 0, policy_version 77450 (0.0007) [2023-10-14 04:09:11,857][33201] Updated weights for policy 0, policy_version 77460 (0.0008) [2023-10-14 04:09:12,223][33201] Updated weights for policy 0, policy_version 77470 (0.0010) [2023-10-14 04:09:14,345][33226] Updated weights for policy 1, policy_version 78150 (0.0008) [2023-10-14 04:09:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159350784. Throughput: 0: 1758.4, 1: 1766.3. Samples: 39844610. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.780')] [2023-10-14 04:09:14,707][33226] Updated weights for policy 1, policy_version 78160 (0.0010) [2023-10-14 04:09:15,078][33226] Updated weights for policy 1, policy_version 78170 (0.0009) [2023-10-14 04:09:15,872][33201] Updated weights for policy 0, policy_version 77480 (0.0009) [2023-10-14 04:09:16,238][33201] Updated weights for policy 0, policy_version 77490 (0.0007) [2023-10-14 04:09:16,619][33201] Updated weights for policy 0, policy_version 77500 (0.0010) [2023-10-14 04:09:19,031][33226] Updated weights for policy 1, policy_version 78180 (0.0009) [2023-10-14 04:09:19,407][33226] Updated weights for policy 1, policy_version 78190 (0.0008) [2023-10-14 04:09:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 159416320. Throughput: 0: 1759.3, 1: 1781.8. Samples: 39866448. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.780')] [2023-10-14 04:09:19,773][33226] Updated weights for policy 1, policy_version 78200 (0.0010) [2023-10-14 04:09:20,363][33201] Updated weights for policy 0, policy_version 77510 (0.0011) [2023-10-14 04:09:20,735][33201] Updated weights for policy 0, policy_version 77520 (0.0008) [2023-10-14 04:09:21,099][33201] Updated weights for policy 0, policy_version 77530 (0.0010) [2023-10-14 04:09:23,401][33226] Updated weights for policy 1, policy_version 78210 (0.0009) [2023-10-14 04:09:23,770][33226] Updated weights for policy 1, policy_version 78220 (0.0010) [2023-10-14 04:09:24,133][33226] Updated weights for policy 1, policy_version 78230 (0.0009) [2023-10-14 04:09:24,497][33226] Updated weights for policy 1, policy_version 78240 (0.0007) [2023-10-14 04:09:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 159514624. Throughput: 0: 1778.6, 1: 1784.5. 
Samples: 39888012. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.750')] [2023-10-14 04:09:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000078240_80117760.pth... [2023-10-14 04:09:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000077536_79396864.pth... [2023-10-14 04:09:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000075904_77725696.pth [2023-10-14 04:09:24,610][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000076576_78413824.pth [2023-10-14 04:09:24,893][33201] Updated weights for policy 0, policy_version 77540 (0.0010) [2023-10-14 04:09:25,267][33201] Updated weights for policy 0, policy_version 77550 (0.0007) [2023-10-14 04:09:25,630][33201] Updated weights for policy 0, policy_version 77560 (0.0008) [2023-10-14 04:09:28,279][33226] Updated weights for policy 1, policy_version 78250 (0.0007) [2023-10-14 04:09:28,654][33226] Updated weights for policy 1, policy_version 78260 (0.0008) [2023-10-14 04:09:29,025][33226] Updated weights for policy 1, policy_version 78270 (0.0009) [2023-10-14 04:09:29,549][33201] Updated weights for policy 0, policy_version 77570 (0.0007) [2023-10-14 04:09:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 159580160. Throughput: 0: 1768.7, 1: 1774.7. Samples: 39898342. 
Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.730')] [2023-10-14 04:09:29,914][33201] Updated weights for policy 0, policy_version 77580 (0.0008) [2023-10-14 04:09:30,289][33201] Updated weights for policy 0, policy_version 77590 (0.0010) [2023-10-14 04:09:30,649][33201] Updated weights for policy 0, policy_version 77600 (0.0010) [2023-10-14 04:09:32,925][33226] Updated weights for policy 1, policy_version 78280 (0.0010) [2023-10-14 04:09:33,290][33226] Updated weights for policy 1, policy_version 78290 (0.0010) [2023-10-14 04:09:33,656][33226] Updated weights for policy 1, policy_version 78300 (0.0010) [2023-10-14 04:09:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 159645696. Throughput: 0: 1769.7, 1: 1787.8. Samples: 39919696. Policy #0 lag: (min: 18.0, avg: 18.0, max: 18.0) [2023-10-14 04:09:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:09:34,599][33201] Updated weights for policy 0, policy_version 77610 (0.0007) [2023-10-14 04:09:34,960][33201] Updated weights for policy 0, policy_version 77620 (0.0008) [2023-10-14 04:09:35,328][33201] Updated weights for policy 0, policy_version 77630 (0.0007) [2023-10-14 04:09:37,402][33226] Updated weights for policy 1, policy_version 78310 (0.0009) [2023-10-14 04:09:37,761][33226] Updated weights for policy 1, policy_version 78320 (0.0007) [2023-10-14 04:09:38,126][33226] Updated weights for policy 1, policy_version 78330 (0.0009) [2023-10-14 04:09:38,984][33201] Updated weights for policy 0, policy_version 77640 (0.0008) [2023-10-14 04:09:39,356][33201] Updated weights for policy 0, policy_version 77650 (0.0007) [2023-10-14 04:09:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 159711232. Throughput: 0: 1797.2, 1: 1755.8. Samples: 39940696. 
Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:09:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:09:39,718][33201] Updated weights for policy 0, policy_version 77660 (0.0009) [2023-10-14 04:09:41,959][33226] Updated weights for policy 1, policy_version 78340 (0.0009) [2023-10-14 04:09:42,328][33226] Updated weights for policy 1, policy_version 78350 (0.0009) [2023-10-14 04:09:42,687][33226] Updated weights for policy 1, policy_version 78360 (0.0009) [2023-10-14 04:09:43,550][33201] Updated weights for policy 0, policy_version 77670 (0.0009) [2023-10-14 04:09:43,924][33201] Updated weights for policy 0, policy_version 77680 (0.0010) [2023-10-14 04:09:44,290][33201] Updated weights for policy 0, policy_version 77690 (0.0010) [2023-10-14 04:09:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.5, 300 sec: 14218.0). Total num frames: 159809536. Throughput: 0: 1772.7, 1: 1791.6. Samples: 39952058. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:09:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:09:46,383][33226] Updated weights for policy 1, policy_version 78370 (0.0009) [2023-10-14 04:09:46,758][33226] Updated weights for policy 1, policy_version 78380 (0.0008) [2023-10-14 04:09:47,123][33226] Updated weights for policy 1, policy_version 78390 (0.0008) [2023-10-14 04:09:47,505][33226] Updated weights for policy 1, policy_version 78400 (0.0009) [2023-10-14 04:09:48,196][33201] Updated weights for policy 0, policy_version 77700 (0.0009) [2023-10-14 04:09:48,565][33201] Updated weights for policy 0, policy_version 77710 (0.0009) [2023-10-14 04:09:48,936][33201] Updated weights for policy 0, policy_version 77720 (0.0007) [2023-10-14 04:09:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 159875072. Throughput: 0: 1798.8, 1: 1764.1. Samples: 39972850. 
Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:09:49,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:09:51,393][33226] Updated weights for policy 1, policy_version 78410 (0.0008) [2023-10-14 04:09:51,766][33226] Updated weights for policy 1, policy_version 78420 (0.0007) [2023-10-14 04:09:52,142][33226] Updated weights for policy 1, policy_version 78430 (0.0008) [2023-10-14 04:09:52,894][33201] Updated weights for policy 0, policy_version 77730 (0.0009) [2023-10-14 04:09:53,257][33201] Updated weights for policy 0, policy_version 77740 (0.0011) [2023-10-14 04:09:53,629][33201] Updated weights for policy 0, policy_version 77750 (0.0008) [2023-10-14 04:09:54,007][33201] Updated weights for policy 0, policy_version 77760 (0.0010) [2023-10-14 04:09:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 159940608. Throughput: 0: 1758.1, 1: 1763.4. Samples: 39993394. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:09:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:09:55,920][33226] Updated weights for policy 1, policy_version 78440 (0.0008) [2023-10-14 04:09:56,274][33226] Updated weights for policy 1, policy_version 78450 (0.0008) [2023-10-14 04:09:56,645][33226] Updated weights for policy 1, policy_version 78460 (0.0008) [2023-10-14 04:09:58,000][33201] Updated weights for policy 0, policy_version 77770 (0.0007) [2023-10-14 04:09:58,372][33201] Updated weights for policy 0, policy_version 77780 (0.0009) [2023-10-14 04:09:58,739][33201] Updated weights for policy 0, policy_version 77790 (0.0009) [2023-10-14 04:09:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 160006144. Throughput: 0: 1782.4, 1: 1764.0. Samples: 40004200. 
Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:09:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:00,346][33226] Updated weights for policy 1, policy_version 78470 (0.0007) [2023-10-14 04:10:00,715][33226] Updated weights for policy 1, policy_version 78480 (0.0008) [2023-10-14 04:10:01,086][33226] Updated weights for policy 1, policy_version 78490 (0.0010) [2023-10-14 04:10:02,388][33201] Updated weights for policy 0, policy_version 77800 (0.0007) [2023-10-14 04:10:02,754][33201] Updated weights for policy 0, policy_version 77810 (0.0007) [2023-10-14 04:10:03,127][33201] Updated weights for policy 0, policy_version 77820 (0.0007) [2023-10-14 04:10:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160071680. Throughput: 0: 1758.4, 1: 1769.2. Samples: 40025190. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:10:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:04,847][33226] Updated weights for policy 1, policy_version 78500 (0.0011) [2023-10-14 04:10:05,213][33226] Updated weights for policy 1, policy_version 78510 (0.0010) [2023-10-14 04:10:05,588][33226] Updated weights for policy 1, policy_version 78520 (0.0008) [2023-10-14 04:10:07,009][33201] Updated weights for policy 0, policy_version 77830 (0.0008) [2023-10-14 04:10:07,376][33201] Updated weights for policy 0, policy_version 77840 (0.0009) [2023-10-14 04:10:07,744][33201] Updated weights for policy 0, policy_version 77850 (0.0008) [2023-10-14 04:10:09,379][33226] Updated weights for policy 1, policy_version 78530 (0.0008) [2023-10-14 04:10:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 160137216. Throughput: 0: 1745.4, 1: 1788.1. Samples: 40047020. 
Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:10:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:09,756][33226] Updated weights for policy 1, policy_version 78540 (0.0009) [2023-10-14 04:10:10,120][33226] Updated weights for policy 1, policy_version 78550 (0.0007) [2023-10-14 04:10:10,483][33226] Updated weights for policy 1, policy_version 78560 (0.0008) [2023-10-14 04:10:11,677][33201] Updated weights for policy 0, policy_version 77860 (0.0009) [2023-10-14 04:10:12,046][33201] Updated weights for policy 0, policy_version 77870 (0.0007) [2023-10-14 04:10:12,418][33201] Updated weights for policy 0, policy_version 77880 (0.0010) [2023-10-14 04:10:14,512][33226] Updated weights for policy 1, policy_version 78570 (0.0010) [2023-10-14 04:10:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 160202752. Throughput: 0: 1760.1, 1: 1767.3. Samples: 40057074. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:10:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:14,875][33226] Updated weights for policy 1, policy_version 78580 (0.0010) [2023-10-14 04:10:15,251][33226] Updated weights for policy 1, policy_version 78590 (0.0009) [2023-10-14 04:10:16,172][33201] Updated weights for policy 0, policy_version 77890 (0.0010) [2023-10-14 04:10:16,543][33201] Updated weights for policy 0, policy_version 77900 (0.0007) [2023-10-14 04:10:16,914][33201] Updated weights for policy 0, policy_version 77910 (0.0009) [2023-10-14 04:10:17,280][33201] Updated weights for policy 0, policy_version 77920 (0.0007) [2023-10-14 04:10:18,978][33226] Updated weights for policy 1, policy_version 78600 (0.0009) [2023-10-14 04:10:19,349][33226] Updated weights for policy 1, policy_version 78610 (0.0010) [2023-10-14 04:10:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160268288. Throughput: 0: 1746.4, 1: 1783.6. 
Samples: 40078546. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:10:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:19,712][33226] Updated weights for policy 1, policy_version 78620 (0.0009) [2023-10-14 04:10:21,154][33201] Updated weights for policy 0, policy_version 77930 (0.0009) [2023-10-14 04:10:21,521][33201] Updated weights for policy 0, policy_version 77940 (0.0010) [2023-10-14 04:10:21,882][33201] Updated weights for policy 0, policy_version 77950 (0.0008) [2023-10-14 04:10:23,429][33226] Updated weights for policy 1, policy_version 78630 (0.0008) [2023-10-14 04:10:23,794][33226] Updated weights for policy 1, policy_version 78640 (0.0011) [2023-10-14 04:10:24,163][33226] Updated weights for policy 1, policy_version 78650 (0.0011) [2023-10-14 04:10:24,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160366592. Throughput: 0: 1753.3, 1: 1792.1. Samples: 40100238. Policy #0 lag: (min: 0.0, avg: 21.0, max: 32.0) [2023-10-14 04:10:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.720')] [2023-10-14 04:10:25,566][33201] Updated weights for policy 0, policy_version 77960 (0.0007) [2023-10-14 04:10:25,934][33201] Updated weights for policy 0, policy_version 77970 (0.0007) [2023-10-14 04:10:26,307][33201] Updated weights for policy 0, policy_version 77980 (0.0009) [2023-10-14 04:10:27,922][33226] Updated weights for policy 1, policy_version 78660 (0.0010) [2023-10-14 04:10:28,288][33226] Updated weights for policy 1, policy_version 78670 (0.0010) [2023-10-14 04:10:28,648][33226] Updated weights for policy 1, policy_version 78680 (0.0010) [2023-10-14 04:10:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160432128. Throughput: 0: 1747.5, 1: 1781.2. Samples: 40110848. 
Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.880')] [2023-10-14 04:10:29,957][33201] Updated weights for policy 0, policy_version 77990 (0.0008) [2023-10-14 04:10:30,328][33201] Updated weights for policy 0, policy_version 78000 (0.0008) [2023-10-14 04:10:30,692][33201] Updated weights for policy 0, policy_version 78010 (0.0010) [2023-10-14 04:10:32,501][33226] Updated weights for policy 1, policy_version 78690 (0.0012) [2023-10-14 04:10:32,865][33226] Updated weights for policy 1, policy_version 78700 (0.0010) [2023-10-14 04:10:33,238][33226] Updated weights for policy 1, policy_version 78710 (0.0011) [2023-10-14 04:10:33,610][33226] Updated weights for policy 1, policy_version 78720 (0.0010) [2023-10-14 04:10:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 160497664. Throughput: 0: 1752.1, 1: 1790.2. Samples: 40132252. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.880')] [2023-10-14 04:10:34,565][33201] Updated weights for policy 0, policy_version 78020 (0.0007) [2023-10-14 04:10:34,927][33201] Updated weights for policy 0, policy_version 78030 (0.0009) [2023-10-14 04:10:35,303][33201] Updated weights for policy 0, policy_version 78040 (0.0007) [2023-10-14 04:10:37,403][33226] Updated weights for policy 1, policy_version 78730 (0.0010) [2023-10-14 04:10:37,762][33226] Updated weights for policy 1, policy_version 78740 (0.0010) [2023-10-14 04:10:38,136][33226] Updated weights for policy 1, policy_version 78750 (0.0011) [2023-10-14 04:10:39,147][33201] Updated weights for policy 0, policy_version 78050 (0.0010) [2023-10-14 04:10:39,522][33201] Updated weights for policy 0, policy_version 78060 (0.0009) [2023-10-14 04:10:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 160563200. Throughput: 0: 1791.6, 1: 1766.3. 
Samples: 40153496. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.880')] [2023-10-14 04:10:39,897][33201] Updated weights for policy 0, policy_version 78070 (0.0008) [2023-10-14 04:10:40,260][33201] Updated weights for policy 0, policy_version 78080 (0.0008) [2023-10-14 04:10:41,974][33226] Updated weights for policy 1, policy_version 78760 (0.0010) [2023-10-14 04:10:42,351][33226] Updated weights for policy 1, policy_version 78770 (0.0009) [2023-10-14 04:10:42,726][33226] Updated weights for policy 1, policy_version 78780 (0.0008) [2023-10-14 04:10:44,273][33201] Updated weights for policy 0, policy_version 78090 (0.0009) [2023-10-14 04:10:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 160628736. Throughput: 0: 1761.6, 1: 1793.7. Samples: 40164188. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 04:10:44,656][33201] Updated weights for policy 0, policy_version 78100 (0.0009) [2023-10-14 04:10:45,027][33201] Updated weights for policy 0, policy_version 78110 (0.0009) [2023-10-14 04:10:46,537][33226] Updated weights for policy 1, policy_version 78790 (0.0008) [2023-10-14 04:10:46,913][33226] Updated weights for policy 1, policy_version 78800 (0.0008) [2023-10-14 04:10:47,296][33226] Updated weights for policy 1, policy_version 78810 (0.0008) [2023-10-14 04:10:48,813][33201] Updated weights for policy 0, policy_version 78120 (0.0009) [2023-10-14 04:10:49,181][33201] Updated weights for policy 0, policy_version 78130 (0.0008) [2023-10-14 04:10:49,555][33201] Updated weights for policy 0, policy_version 78140 (0.0008) [2023-10-14 04:10:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 160694272. Throughput: 0: 1786.9, 1: 1764.8. Samples: 40185020. 
Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:49,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 04:10:51,152][33226] Updated weights for policy 1, policy_version 78820 (0.0008) [2023-10-14 04:10:51,523][33226] Updated weights for policy 1, policy_version 78830 (0.0008) [2023-10-14 04:10:51,888][33226] Updated weights for policy 1, policy_version 78840 (0.0007) [2023-10-14 04:10:53,330][33201] Updated weights for policy 0, policy_version 78150 (0.0008) [2023-10-14 04:10:53,694][33201] Updated weights for policy 0, policy_version 78160 (0.0011) [2023-10-14 04:10:54,066][33201] Updated weights for policy 0, policy_version 78170 (0.0008) [2023-10-14 04:10:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160792576. Throughput: 0: 1772.3, 1: 1757.1. Samples: 40205842. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 04:10:55,818][33226] Updated weights for policy 1, policy_version 78850 (0.0007) [2023-10-14 04:10:56,179][33226] Updated weights for policy 1, policy_version 78860 (0.0011) [2023-10-14 04:10:56,545][33226] Updated weights for policy 1, policy_version 78870 (0.0010) [2023-10-14 04:10:56,912][33226] Updated weights for policy 1, policy_version 78880 (0.0007) [2023-10-14 04:10:57,946][33201] Updated weights for policy 0, policy_version 78180 (0.0008) [2023-10-14 04:10:58,319][33201] Updated weights for policy 0, policy_version 78190 (0.0007) [2023-10-14 04:10:58,682][33201] Updated weights for policy 0, policy_version 78200 (0.0007) [2023-10-14 04:10:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160858112. Throughput: 0: 1783.9, 1: 1760.5. Samples: 40216572. 
Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:10:59,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.850')] [2023-10-14 04:11:00,624][33226] Updated weights for policy 1, policy_version 78890 (0.0008) [2023-10-14 04:11:00,989][33226] Updated weights for policy 1, policy_version 78900 (0.0007) [2023-10-14 04:11:01,348][33226] Updated weights for policy 1, policy_version 78910 (0.0010) [2023-10-14 04:11:02,429][33201] Updated weights for policy 0, policy_version 78210 (0.0009) [2023-10-14 04:11:02,797][33201] Updated weights for policy 0, policy_version 78220 (0.0010) [2023-10-14 04:11:03,171][33201] Updated weights for policy 0, policy_version 78230 (0.0007) [2023-10-14 04:11:03,542][33201] Updated weights for policy 0, policy_version 78240 (0.0007) [2023-10-14 04:11:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160923648. Throughput: 0: 1779.6, 1: 1761.6. Samples: 40237898. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:11:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.910')] [2023-10-14 04:11:05,267][33226] Updated weights for policy 1, policy_version 78920 (0.0009) [2023-10-14 04:11:05,639][33226] Updated weights for policy 1, policy_version 78930 (0.0009) [2023-10-14 04:11:06,014][33226] Updated weights for policy 1, policy_version 78940 (0.0007) [2023-10-14 04:11:07,240][33201] Updated weights for policy 0, policy_version 78250 (0.0011) [2023-10-14 04:11:07,607][33201] Updated weights for policy 0, policy_version 78260 (0.0009) [2023-10-14 04:11:07,976][33201] Updated weights for policy 0, policy_version 78270 (0.0011) [2023-10-14 04:11:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 160989184. Throughput: 0: 1767.0, 1: 1778.5. Samples: 40259788. 
Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:11:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:11:09,723][33226] Updated weights for policy 1, policy_version 78950 (0.0009) [2023-10-14 04:11:10,096][33226] Updated weights for policy 1, policy_version 78960 (0.0008) [2023-10-14 04:11:10,467][33226] Updated weights for policy 1, policy_version 78970 (0.0007) [2023-10-14 04:11:11,744][33201] Updated weights for policy 0, policy_version 78280 (0.0010) [2023-10-14 04:11:12,106][33201] Updated weights for policy 0, policy_version 78290 (0.0010) [2023-10-14 04:11:12,478][33201] Updated weights for policy 0, policy_version 78300 (0.0010) [2023-10-14 04:11:14,194][33226] Updated weights for policy 1, policy_version 78980 (0.0008) [2023-10-14 04:11:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161054720. Throughput: 0: 1781.0, 1: 1760.3. Samples: 40270204. Policy #0 lag: (min: 30.0, avg: 35.3, max: 62.0) [2023-10-14 04:11:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:11:14,567][33226] Updated weights for policy 1, policy_version 78990 (0.0007) [2023-10-14 04:11:14,926][33226] Updated weights for policy 1, policy_version 79000 (0.0007) [2023-10-14 04:11:16,304][33201] Updated weights for policy 0, policy_version 78310 (0.0009) [2023-10-14 04:11:16,681][33201] Updated weights for policy 0, policy_version 78320 (0.0009) [2023-10-14 04:11:17,047][33201] Updated weights for policy 0, policy_version 78330 (0.0007) [2023-10-14 04:11:18,602][33226] Updated weights for policy 1, policy_version 79010 (0.0008) [2023-10-14 04:11:18,971][33226] Updated weights for policy 1, policy_version 79020 (0.0009) [2023-10-14 04:11:19,341][33226] Updated weights for policy 1, policy_version 79030 (0.0010) [2023-10-14 04:11:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 161120256. Throughput: 0: 1764.5, 1: 1780.9. 
Samples: 40291798. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:19,701][33226] Updated weights for policy 1, policy_version 79040 (0.0009) [2023-10-14 04:11:20,775][33201] Updated weights for policy 0, policy_version 78340 (0.0008) [2023-10-14 04:11:21,145][33201] Updated weights for policy 0, policy_version 78350 (0.0009) [2023-10-14 04:11:21,511][33201] Updated weights for policy 0, policy_version 78360 (0.0009) [2023-10-14 04:11:23,458][33226] Updated weights for policy 1, policy_version 79050 (0.0008) [2023-10-14 04:11:23,814][33226] Updated weights for policy 1, policy_version 79060 (0.0010) [2023-10-14 04:11:24,180][33226] Updated weights for policy 1, policy_version 79070 (0.0007) [2023-10-14 04:11:24,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161218560. Throughput: 0: 1764.8, 1: 1779.3. Samples: 40312978. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:24,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:11:24,563][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000078368_80248832.pth... [2023-10-14 04:11:24,564][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000079072_80969728.pth... 
[2023-10-14 04:11:24,594][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000077408_79265792.pth [2023-10-14 04:11:24,597][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000076736_78577664.pth [2023-10-14 04:11:25,309][33201] Updated weights for policy 0, policy_version 78370 (0.0011) [2023-10-14 04:11:25,683][33201] Updated weights for policy 0, policy_version 78380 (0.0007) [2023-10-14 04:11:26,053][33201] Updated weights for policy 0, policy_version 78390 (0.0009) [2023-10-14 04:11:26,429][33201] Updated weights for policy 0, policy_version 78400 (0.0008) [2023-10-14 04:11:28,037][33226] Updated weights for policy 1, policy_version 79080 (0.0011) [2023-10-14 04:11:28,402][33226] Updated weights for policy 1, policy_version 79090 (0.0010) [2023-10-14 04:11:28,769][33226] Updated weights for policy 1, policy_version 79100 (0.0009) [2023-10-14 04:11:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161284096. Throughput: 0: 1764.8, 1: 1779.2. Samples: 40323666. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:11:30,351][33201] Updated weights for policy 0, policy_version 78410 (0.0009) [2023-10-14 04:11:30,718][33201] Updated weights for policy 0, policy_version 78420 (0.0007) [2023-10-14 04:11:31,088][33201] Updated weights for policy 0, policy_version 78430 (0.0007) [2023-10-14 04:11:32,328][33226] Updated weights for policy 1, policy_version 79110 (0.0008) [2023-10-14 04:11:32,685][33226] Updated weights for policy 1, policy_version 79120 (0.0009) [2023-10-14 04:11:33,056][33226] Updated weights for policy 1, policy_version 79130 (0.0007) [2023-10-14 04:11:34,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 161349632. Throughput: 0: 1767.2, 1: 1790.6. Samples: 40345124. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:34,559][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:35,056][33201] Updated weights for policy 0, policy_version 78440 (0.0008) [2023-10-14 04:11:35,442][33201] Updated weights for policy 0, policy_version 78450 (0.0007) [2023-10-14 04:11:35,814][33201] Updated weights for policy 0, policy_version 78460 (0.0008) [2023-10-14 04:11:36,831][33226] Updated weights for policy 1, policy_version 79140 (0.0008) [2023-10-14 04:11:37,199][33226] Updated weights for policy 1, policy_version 79150 (0.0008) [2023-10-14 04:11:37,574][33226] Updated weights for policy 1, policy_version 79160 (0.0008) [2023-10-14 04:11:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 161415168. Throughput: 0: 1780.7, 1: 1788.5. Samples: 40366458. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:39,692][33201] Updated weights for policy 0, policy_version 78470 (0.0008) [2023-10-14 04:11:40,075][33201] Updated weights for policy 0, policy_version 78480 (0.0007) [2023-10-14 04:11:40,449][33201] Updated weights for policy 0, policy_version 78490 (0.0008) [2023-10-14 04:11:41,302][33226] Updated weights for policy 1, policy_version 79170 (0.0010) [2023-10-14 04:11:41,668][33226] Updated weights for policy 1, policy_version 79180 (0.0011) [2023-10-14 04:11:42,034][33226] Updated weights for policy 1, policy_version 79190 (0.0007) [2023-10-14 04:11:42,401][33226] Updated weights for policy 1, policy_version 79200 (0.0008) [2023-10-14 04:11:44,153][33201] Updated weights for policy 0, policy_version 78500 (0.0007) [2023-10-14 04:11:44,523][33201] Updated weights for policy 0, policy_version 78510 (0.0007) [2023-10-14 04:11:44,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 161480704. Throughput: 0: 1754.4, 1: 1806.0. 
Samples: 40376790. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:44,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:44,889][33201] Updated weights for policy 0, policy_version 78520 (0.0007) [2023-10-14 04:11:46,087][33226] Updated weights for policy 1, policy_version 79210 (0.0008) [2023-10-14 04:11:46,449][33226] Updated weights for policy 1, policy_version 79220 (0.0007) [2023-10-14 04:11:46,824][33226] Updated weights for policy 1, policy_version 79230 (0.0008) [2023-10-14 04:11:48,866][33201] Updated weights for policy 0, policy_version 78530 (0.0007) [2023-10-14 04:11:49,233][33201] Updated weights for policy 0, policy_version 78540 (0.0008) [2023-10-14 04:11:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 161546240. Throughput: 0: 1778.8, 1: 1797.0. Samples: 40398806. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:49,597][33201] Updated weights for policy 0, policy_version 78550 (0.0009) [2023-10-14 04:11:49,968][33201] Updated weights for policy 0, policy_version 78560 (0.0008) [2023-10-14 04:11:50,680][33226] Updated weights for policy 1, policy_version 79240 (0.0007) [2023-10-14 04:11:51,052][33226] Updated weights for policy 1, policy_version 79250 (0.0007) [2023-10-14 04:11:51,425][33226] Updated weights for policy 1, policy_version 79260 (0.0008) [2023-10-14 04:11:53,712][33201] Updated weights for policy 0, policy_version 78570 (0.0008) [2023-10-14 04:11:54,080][33201] Updated weights for policy 0, policy_version 78580 (0.0008) [2023-10-14 04:11:54,461][33201] Updated weights for policy 0, policy_version 78590 (0.0008) [2023-10-14 04:11:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 161644544. Throughput: 0: 1768.6, 1: 1794.1. Samples: 40420110. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:55,105][33226] Updated weights for policy 1, policy_version 79270 (0.0009) [2023-10-14 04:11:55,475][33226] Updated weights for policy 1, policy_version 79280 (0.0007) [2023-10-14 04:11:55,838][33226] Updated weights for policy 1, policy_version 79290 (0.0007) [2023-10-14 04:11:58,171][33201] Updated weights for policy 0, policy_version 78600 (0.0007) [2023-10-14 04:11:58,546][33201] Updated weights for policy 0, policy_version 78610 (0.0007) [2023-10-14 04:11:58,915][33201] Updated weights for policy 0, policy_version 78620 (0.0007) [2023-10-14 04:11:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161710080. Throughput: 0: 1773.8, 1: 1792.1. Samples: 40430668. Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:11:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:11:59,678][33226] Updated weights for policy 1, policy_version 79300 (0.0007) [2023-10-14 04:12:00,044][33226] Updated weights for policy 1, policy_version 79310 (0.0007) [2023-10-14 04:12:00,410][33226] Updated weights for policy 1, policy_version 79320 (0.0008) [2023-10-14 04:12:02,590][33201] Updated weights for policy 0, policy_version 78630 (0.0009) [2023-10-14 04:12:02,962][33201] Updated weights for policy 0, policy_version 78640 (0.0009) [2023-10-14 04:12:03,321][33201] Updated weights for policy 0, policy_version 78650 (0.0011) [2023-10-14 04:12:04,226][33226] Updated weights for policy 1, policy_version 79330 (0.0009) [2023-10-14 04:12:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161775616. Throughput: 0: 1770.1, 1: 1791.0. Samples: 40452048. 
Policy #0 lag: (min: 27.0, avg: 27.0, max: 27.0) [2023-10-14 04:12:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:12:04,591][33226] Updated weights for policy 1, policy_version 79340 (0.0008) [2023-10-14 04:12:04,953][33226] Updated weights for policy 1, policy_version 79350 (0.0010) [2023-10-14 04:12:05,315][33226] Updated weights for policy 1, policy_version 79360 (0.0009) [2023-10-14 04:12:07,187][33201] Updated weights for policy 0, policy_version 78660 (0.0010) [2023-10-14 04:12:07,555][33201] Updated weights for policy 0, policy_version 78670 (0.0009) [2023-10-14 04:12:07,922][33201] Updated weights for policy 0, policy_version 78680 (0.0008) [2023-10-14 04:12:09,115][33226] Updated weights for policy 1, policy_version 79370 (0.0009) [2023-10-14 04:12:09,494][33226] Updated weights for policy 1, policy_version 79380 (0.0009) [2023-10-14 04:12:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 161841152. Throughput: 0: 1755.7, 1: 1812.1. Samples: 40473532. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:12:09,858][33226] Updated weights for policy 1, policy_version 79390 (0.0008) [2023-10-14 04:12:11,639][33201] Updated weights for policy 0, policy_version 78690 (0.0008) [2023-10-14 04:12:12,005][33201] Updated weights for policy 0, policy_version 78700 (0.0008) [2023-10-14 04:12:12,382][33201] Updated weights for policy 0, policy_version 78710 (0.0008) [2023-10-14 04:12:12,742][33201] Updated weights for policy 0, policy_version 78720 (0.0007) [2023-10-14 04:12:13,545][33226] Updated weights for policy 1, policy_version 79400 (0.0009) [2023-10-14 04:12:13,912][33226] Updated weights for policy 1, policy_version 79410 (0.0009) [2023-10-14 04:12:14,284][33226] Updated weights for policy 1, policy_version 79420 (0.0009) [2023-10-14 04:12:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 161939456. Throughput: 0: 1773.6, 1: 1795.9. Samples: 40484294. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:12:16,619][33201] Updated weights for policy 0, policy_version 78730 (0.0009) [2023-10-14 04:12:16,988][33201] Updated weights for policy 0, policy_version 78740 (0.0007) [2023-10-14 04:12:17,349][33201] Updated weights for policy 0, policy_version 78750 (0.0010) [2023-10-14 04:12:18,015][33226] Updated weights for policy 1, policy_version 79430 (0.0007) [2023-10-14 04:12:18,386][33226] Updated weights for policy 1, policy_version 79440 (0.0009) [2023-10-14 04:12:18,752][33226] Updated weights for policy 1, policy_version 79450 (0.0010) [2023-10-14 04:12:19,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 162004992. Throughput: 0: 1754.6, 1: 1810.7. Samples: 40505562. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:12:21,082][33201] Updated weights for policy 0, policy_version 78760 (0.0007) [2023-10-14 04:12:21,445][33201] Updated weights for policy 0, policy_version 78770 (0.0009) [2023-10-14 04:12:21,817][33201] Updated weights for policy 0, policy_version 78780 (0.0008) [2023-10-14 04:12:22,613][33226] Updated weights for policy 1, policy_version 79460 (0.0010) [2023-10-14 04:12:22,978][33226] Updated weights for policy 1, policy_version 79470 (0.0009) [2023-10-14 04:12:23,338][33226] Updated weights for policy 1, policy_version 79480 (0.0007) [2023-10-14 04:12:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 162070528. Throughput: 0: 1769.2, 1: 1788.0. Samples: 40526534. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.940')] [2023-10-14 04:12:25,650][33201] Updated weights for policy 0, policy_version 78790 (0.0009) [2023-10-14 04:12:26,025][33201] Updated weights for policy 0, policy_version 78800 (0.0008) [2023-10-14 04:12:26,385][33201] Updated weights for policy 0, policy_version 78810 (0.0008) [2023-10-14 04:12:27,051][33226] Updated weights for policy 1, policy_version 79490 (0.0008) [2023-10-14 04:12:27,418][33226] Updated weights for policy 1, policy_version 79500 (0.0007) [2023-10-14 04:12:27,790][33226] Updated weights for policy 1, policy_version 79510 (0.0007) [2023-10-14 04:12:28,153][33226] Updated weights for policy 1, policy_version 79520 (0.0007) [2023-10-14 04:12:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 162136064. Throughput: 0: 1769.8, 1: 1807.3. Samples: 40537760. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 04:12:30,335][33201] Updated weights for policy 0, policy_version 78820 (0.0007) [2023-10-14 04:12:30,700][33201] Updated weights for policy 0, policy_version 78830 (0.0007) [2023-10-14 04:12:31,072][33201] Updated weights for policy 0, policy_version 78840 (0.0008) [2023-10-14 04:12:31,900][33226] Updated weights for policy 1, policy_version 79530 (0.0008) [2023-10-14 04:12:32,267][33226] Updated weights for policy 1, policy_version 79540 (0.0008) [2023-10-14 04:12:32,637][33226] Updated weights for policy 1, policy_version 79550 (0.0008) [2023-10-14 04:12:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162201600. Throughput: 0: 1764.4, 1: 1784.8. Samples: 40558520. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.900')] [2023-10-14 04:12:35,151][33201] Updated weights for policy 0, policy_version 78850 (0.0009) [2023-10-14 04:12:35,515][33201] Updated weights for policy 0, policy_version 78860 (0.0011) [2023-10-14 04:12:35,889][33201] Updated weights for policy 0, policy_version 78870 (0.0009) [2023-10-14 04:12:36,254][33201] Updated weights for policy 0, policy_version 78880 (0.0009) [2023-10-14 04:12:36,553][33226] Updated weights for policy 1, policy_version 79560 (0.0008) [2023-10-14 04:12:36,917][33226] Updated weights for policy 1, policy_version 79570 (0.0010) [2023-10-14 04:12:37,286][33226] Updated weights for policy 1, policy_version 79580 (0.0007) [2023-10-14 04:12:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 162267136. Throughput: 0: 1783.3, 1: 1782.0. Samples: 40580546. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:39,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.900')] [2023-10-14 04:12:39,861][33201] Updated weights for policy 0, policy_version 78890 (0.0007) [2023-10-14 04:12:40,229][33201] Updated weights for policy 0, policy_version 78900 (0.0007) [2023-10-14 04:12:40,608][33201] Updated weights for policy 0, policy_version 78910 (0.0008) [2023-10-14 04:12:41,049][33226] Updated weights for policy 1, policy_version 79590 (0.0009) [2023-10-14 04:12:41,411][33226] Updated weights for policy 1, policy_version 79600 (0.0008) [2023-10-14 04:12:41,773][33226] Updated weights for policy 1, policy_version 79610 (0.0010) [2023-10-14 04:12:44,227][33201] Updated weights for policy 0, policy_version 78920 (0.0010) [2023-10-14 04:12:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 162332672. Throughput: 0: 1764.1, 1: 1789.1. Samples: 40590564. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 04:12:44,599][33201] Updated weights for policy 0, policy_version 78930 (0.0011) [2023-10-14 04:12:44,976][33201] Updated weights for policy 0, policy_version 78940 (0.0010) [2023-10-14 04:12:45,401][33226] Updated weights for policy 1, policy_version 79620 (0.0008) [2023-10-14 04:12:45,767][33226] Updated weights for policy 1, policy_version 79630 (0.0007) [2023-10-14 04:12:46,138][33226] Updated weights for policy 1, policy_version 79640 (0.0009) [2023-10-14 04:12:48,879][33201] Updated weights for policy 0, policy_version 78950 (0.0009) [2023-10-14 04:12:49,245][33201] Updated weights for policy 0, policy_version 78960 (0.0011) [2023-10-14 04:12:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 162398208. Throughput: 0: 1785.2, 1: 1787.2. Samples: 40612810. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.920')] [2023-10-14 04:12:49,611][33201] Updated weights for policy 0, policy_version 78970 (0.0010) [2023-10-14 04:12:49,926][33226] Updated weights for policy 1, policy_version 79650 (0.0009) [2023-10-14 04:12:50,287][33226] Updated weights for policy 1, policy_version 79660 (0.0012) [2023-10-14 04:12:50,660][33226] Updated weights for policy 1, policy_version 79670 (0.0011) [2023-10-14 04:12:51,020][33226] Updated weights for policy 1, policy_version 79680 (0.0011) [2023-10-14 04:12:53,483][33201] Updated weights for policy 0, policy_version 78980 (0.0008) [2023-10-14 04:12:53,852][33201] Updated weights for policy 0, policy_version 78990 (0.0011) [2023-10-14 04:12:54,222][33201] Updated weights for policy 0, policy_version 79000 (0.0007) [2023-10-14 04:12:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162496512. Throughput: 0: 1779.9, 1: 1788.0. Samples: 40634090. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:12:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 04:12:54,960][33226] Updated weights for policy 1, policy_version 79690 (0.0009) [2023-10-14 04:12:55,323][33226] Updated weights for policy 1, policy_version 79700 (0.0009) [2023-10-14 04:12:55,695][33226] Updated weights for policy 1, policy_version 79710 (0.0007) [2023-10-14 04:12:58,029][33201] Updated weights for policy 0, policy_version 79010 (0.0007) [2023-10-14 04:12:58,401][33201] Updated weights for policy 0, policy_version 79020 (0.0007) [2023-10-14 04:12:58,773][33201] Updated weights for policy 0, policy_version 79030 (0.0008) [2023-10-14 04:12:59,145][33201] Updated weights for policy 0, policy_version 79040 (0.0009) [2023-10-14 04:12:59,520][33226] Updated weights for policy 1, policy_version 79720 (0.0008) [2023-10-14 04:12:59,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162562048. Throughput: 0: 1784.0, 1: 1776.9. Samples: 40644536. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:12:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 04:12:59,891][33226] Updated weights for policy 1, policy_version 79730 (0.0010) [2023-10-14 04:13:00,248][33226] Updated weights for policy 1, policy_version 79740 (0.0007) [2023-10-14 04:13:03,003][33201] Updated weights for policy 0, policy_version 79050 (0.0007) [2023-10-14 04:13:03,379][33201] Updated weights for policy 0, policy_version 79060 (0.0007) [2023-10-14 04:13:03,753][33201] Updated weights for policy 0, policy_version 79070 (0.0008) [2023-10-14 04:13:03,964][33226] Updated weights for policy 1, policy_version 79750 (0.0008) [2023-10-14 04:13:04,343][33226] Updated weights for policy 1, policy_version 79760 (0.0008) [2023-10-14 04:13:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 162627584. Throughput: 0: 1787.5, 1: 1782.0. 
Samples: 40666188. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 04:13:04,709][33226] Updated weights for policy 1, policy_version 79770 (0.0008) [2023-10-14 04:13:07,636][33201] Updated weights for policy 0, policy_version 79080 (0.0009) [2023-10-14 04:13:08,015][33201] Updated weights for policy 0, policy_version 79090 (0.0008) [2023-10-14 04:13:08,338][33226] Updated weights for policy 1, policy_version 79780 (0.0008) [2023-10-14 04:13:08,377][33201] Updated weights for policy 0, policy_version 79100 (0.0010) [2023-10-14 04:13:08,704][33226] Updated weights for policy 1, policy_version 79790 (0.0007) [2023-10-14 04:13:09,069][33226] Updated weights for policy 1, policy_version 79800 (0.0008) [2023-10-14 04:13:09,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 162725888. Throughput: 0: 1759.8, 1: 1798.3. Samples: 40686648. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 04:13:12,342][33201] Updated weights for policy 0, policy_version 79110 (0.0007) [2023-10-14 04:13:12,709][33201] Updated weights for policy 0, policy_version 79120 (0.0007) [2023-10-14 04:13:12,759][33226] Updated weights for policy 1, policy_version 79810 (0.0007) [2023-10-14 04:13:13,071][33201] Updated weights for policy 0, policy_version 79130 (0.0008) [2023-10-14 04:13:13,126][33226] Updated weights for policy 1, policy_version 79820 (0.0008) [2023-10-14 04:13:13,502][33226] Updated weights for policy 1, policy_version 79830 (0.0009) [2023-10-14 04:13:13,863][33226] Updated weights for policy 1, policy_version 79840 (0.0008) [2023-10-14 04:13:14,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 162791424. Throughput: 0: 1787.6, 1: 1782.2. Samples: 40698402. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.920')] [2023-10-14 04:13:17,037][33201] Updated weights for policy 0, policy_version 79140 (0.0008) [2023-10-14 04:13:17,405][33201] Updated weights for policy 0, policy_version 79150 (0.0011) [2023-10-14 04:13:17,739][33226] Updated weights for policy 1, policy_version 79850 (0.0009) [2023-10-14 04:13:17,777][33201] Updated weights for policy 0, policy_version 79160 (0.0008) [2023-10-14 04:13:18,108][33226] Updated weights for policy 1, policy_version 79860 (0.0009) [2023-10-14 04:13:18,474][33226] Updated weights for policy 1, policy_version 79870 (0.0009) [2023-10-14 04:13:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162856960. Throughput: 0: 1752.9, 1: 1800.3. Samples: 40718414. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:21,512][33201] Updated weights for policy 0, policy_version 79170 (0.0007) [2023-10-14 04:13:21,888][33201] Updated weights for policy 0, policy_version 79180 (0.0008) [2023-10-14 04:13:22,256][33201] Updated weights for policy 0, policy_version 79190 (0.0008) [2023-10-14 04:13:22,297][33226] Updated weights for policy 1, policy_version 79880 (0.0009) [2023-10-14 04:13:22,638][33201] Updated weights for policy 0, policy_version 79200 (0.0009) [2023-10-14 04:13:22,669][33226] Updated weights for policy 1, policy_version 79890 (0.0010) [2023-10-14 04:13:23,029][33226] Updated weights for policy 1, policy_version 79900 (0.0009) [2023-10-14 04:13:24,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162922496. Throughput: 0: 1751.5, 1: 1787.9. Samples: 40739816. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000079904_81821696.pth... [2023-10-14 04:13:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000079200_81100800.pth... [2023-10-14 04:13:24,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000078240_80117760.pth [2023-10-14 04:13:24,606][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000077536_79396864.pth [2023-10-14 04:13:26,603][33201] Updated weights for policy 0, policy_version 79210 (0.0009) [2023-10-14 04:13:26,687][33226] Updated weights for policy 1, policy_version 79910 (0.0008) [2023-10-14 04:13:26,966][33201] Updated weights for policy 0, policy_version 79220 (0.0009) [2023-10-14 04:13:27,048][33226] Updated weights for policy 1, policy_version 79920 (0.0009) [2023-10-14 04:13:27,339][33201] Updated weights for policy 0, policy_version 79230 (0.0008) [2023-10-14 04:13:27,421][33226] Updated weights for policy 1, policy_version 79930 (0.0009) [2023-10-14 04:13:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 162988032. Throughput: 0: 1754.1, 1: 1806.7. Samples: 40750800. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:31,009][33201] Updated weights for policy 0, policy_version 79240 (0.0008) [2023-10-14 04:13:31,267][33226] Updated weights for policy 1, policy_version 79940 (0.0009) [2023-10-14 04:13:31,378][33201] Updated weights for policy 0, policy_version 79250 (0.0008) [2023-10-14 04:13:31,624][33226] Updated weights for policy 1, policy_version 79950 (0.0008) [2023-10-14 04:13:31,741][33201] Updated weights for policy 0, policy_version 79260 (0.0008) [2023-10-14 04:13:31,989][33226] Updated weights for policy 1, policy_version 79960 (0.0009) [2023-10-14 04:13:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 163053568. Throughput: 0: 1743.7, 1: 1785.2. Samples: 40771608. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:35,463][33201] Updated weights for policy 0, policy_version 79270 (0.0009) [2023-10-14 04:13:35,688][33226] Updated weights for policy 1, policy_version 79970 (0.0009) [2023-10-14 04:13:35,829][33201] Updated weights for policy 0, policy_version 79280 (0.0010) [2023-10-14 04:13:36,059][33226] Updated weights for policy 1, policy_version 79980 (0.0007) [2023-10-14 04:13:36,192][33201] Updated weights for policy 0, policy_version 79290 (0.0008) [2023-10-14 04:13:36,414][33226] Updated weights for policy 1, policy_version 79990 (0.0007) [2023-10-14 04:13:36,783][33226] Updated weights for policy 1, policy_version 80000 (0.0008) [2023-10-14 04:13:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 163119104. Throughput: 0: 1761.1, 1: 1786.3. Samples: 40793724. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:39,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:40,132][33201] Updated weights for policy 0, policy_version 79300 (0.0009) [2023-10-14 04:13:40,506][33201] Updated weights for policy 0, policy_version 79310 (0.0008) [2023-10-14 04:13:40,666][33226] Updated weights for policy 1, policy_version 80010 (0.0007) [2023-10-14 04:13:40,880][33201] Updated weights for policy 0, policy_version 79320 (0.0008) [2023-10-14 04:13:41,027][33226] Updated weights for policy 1, policy_version 80020 (0.0008) [2023-10-14 04:13:41,392][33226] Updated weights for policy 1, policy_version 80030 (0.0010) [2023-10-14 04:13:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 163184640. Throughput: 0: 1742.7, 1: 1787.9. Samples: 40803412. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:44,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:44,709][33201] Updated weights for policy 0, policy_version 79330 (0.0007) [2023-10-14 04:13:45,010][33226] Updated weights for policy 1, policy_version 80040 (0.0010) [2023-10-14 04:13:45,081][33201] Updated weights for policy 0, policy_version 79340 (0.0010) [2023-10-14 04:13:45,374][33226] Updated weights for policy 1, policy_version 80050 (0.0007) [2023-10-14 04:13:45,458][33201] Updated weights for policy 0, policy_version 79350 (0.0009) [2023-10-14 04:13:45,741][33226] Updated weights for policy 1, policy_version 80060 (0.0008) [2023-10-14 04:13:45,820][33201] Updated weights for policy 0, policy_version 79360 (0.0007) [2023-10-14 04:13:49,543][33226] Updated weights for policy 1, policy_version 80070 (0.0008) [2023-10-14 04:13:49,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 163250176. Throughput: 0: 1752.5, 1: 1787.2. Samples: 40825476. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:13:49,701][33201] Updated weights for policy 0, policy_version 79370 (0.0008) [2023-10-14 04:13:49,911][33226] Updated weights for policy 1, policy_version 80080 (0.0008) [2023-10-14 04:13:50,065][33201] Updated weights for policy 0, policy_version 79380 (0.0007) [2023-10-14 04:13:50,274][33226] Updated weights for policy 1, policy_version 80090 (0.0007) [2023-10-14 04:13:50,440][33201] Updated weights for policy 0, policy_version 79390 (0.0008) [2023-10-14 04:13:54,018][33226] Updated weights for policy 1, policy_version 80100 (0.0009) [2023-10-14 04:13:54,183][33201] Updated weights for policy 0, policy_version 79400 (0.0008) [2023-10-14 04:13:54,387][33226] Updated weights for policy 1, policy_version 80110 (0.0009) [2023-10-14 04:13:54,554][33201] Updated weights for policy 0, policy_version 79410 (0.0009) [2023-10-14 04:13:54,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 163315712. Throughput: 0: 1774.2, 1: 1801.5. Samples: 40847554. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.930')] [2023-10-14 04:13:54,760][33226] Updated weights for policy 1, policy_version 80120 (0.0008) [2023-10-14 04:13:54,917][33201] Updated weights for policy 0, policy_version 79420 (0.0007) [2023-10-14 04:13:58,651][33226] Updated weights for policy 1, policy_version 80130 (0.0008) [2023-10-14 04:13:58,704][33201] Updated weights for policy 0, policy_version 79430 (0.0007) [2023-10-14 04:13:59,013][33226] Updated weights for policy 1, policy_version 80140 (0.0007) [2023-10-14 04:13:59,071][33201] Updated weights for policy 0, policy_version 79440 (0.0007) [2023-10-14 04:13:59,380][33226] Updated weights for policy 1, policy_version 80150 (0.0007) [2023-10-14 04:13:59,441][33201] Updated weights for policy 0, policy_version 79450 (0.0007) [2023-10-14 04:13:59,557][31953] Fps is (10 sec: 13107.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 163381248. Throughput: 0: 1752.3, 1: 1781.3. Samples: 40857410. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:13:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 04:13:59,746][33226] Updated weights for policy 1, policy_version 80160 (0.0007) [2023-10-14 04:14:03,362][33201] Updated weights for policy 0, policy_version 79460 (0.0008) [2023-10-14 04:14:03,527][33226] Updated weights for policy 1, policy_version 80170 (0.0008) [2023-10-14 04:14:03,723][33201] Updated weights for policy 0, policy_version 79470 (0.0009) [2023-10-14 04:14:03,899][33226] Updated weights for policy 1, policy_version 80180 (0.0008) [2023-10-14 04:14:04,087][33201] Updated weights for policy 0, policy_version 79480 (0.0008) [2023-10-14 04:14:04,261][33226] Updated weights for policy 1, policy_version 80190 (0.0008) [2023-10-14 04:14:04,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14745.6, 300 sec: 14329.0). Total num frames: 163512320. Throughput: 0: 1787.5, 1: 1792.1. 
Samples: 40879498. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.940')] [2023-10-14 04:14:07,796][33201] Updated weights for policy 0, policy_version 79490 (0.0008) [2023-10-14 04:14:08,161][33201] Updated weights for policy 0, policy_version 79500 (0.0008) [2023-10-14 04:14:08,281][33226] Updated weights for policy 1, policy_version 80200 (0.0008) [2023-10-14 04:14:08,528][33201] Updated weights for policy 0, policy_version 79510 (0.0007) [2023-10-14 04:14:08,666][33226] Updated weights for policy 1, policy_version 80210 (0.0007) [2023-10-14 04:14:08,893][33201] Updated weights for policy 0, policy_version 79520 (0.0008) [2023-10-14 04:14:09,029][33226] Updated weights for policy 1, policy_version 80220 (0.0008) [2023-10-14 04:14:09,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 163577856. Throughput: 0: 1757.0, 1: 1772.5. Samples: 40898644. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.930')] [2023-10-14 04:14:12,746][33201] Updated weights for policy 0, policy_version 79530 (0.0008) [2023-10-14 04:14:12,916][33226] Updated weights for policy 1, policy_version 80230 (0.0009) [2023-10-14 04:14:13,109][33201] Updated weights for policy 0, policy_version 79540 (0.0009) [2023-10-14 04:14:13,291][33226] Updated weights for policy 1, policy_version 80240 (0.0009) [2023-10-14 04:14:13,485][33201] Updated weights for policy 0, policy_version 79550 (0.0008) [2023-10-14 04:14:13,653][33226] Updated weights for policy 1, policy_version 80250 (0.0009) [2023-10-14 04:14:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 163643392. Throughput: 0: 1779.6, 1: 1770.8. Samples: 40910568. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.950')] [2023-10-14 04:14:17,276][33201] Updated weights for policy 0, policy_version 79560 (0.0008) [2023-10-14 04:14:17,531][33226] Updated weights for policy 1, policy_version 80260 (0.0008) [2023-10-14 04:14:17,638][33201] Updated weights for policy 0, policy_version 79570 (0.0008) [2023-10-14 04:14:17,903][33226] Updated weights for policy 1, policy_version 80270 (0.0008) [2023-10-14 04:14:18,009][33201] Updated weights for policy 0, policy_version 79580 (0.0008) [2023-10-14 04:14:18,265][33226] Updated weights for policy 1, policy_version 80280 (0.0009) [2023-10-14 04:14:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 163708928. Throughput: 0: 1759.3, 1: 1773.7. Samples: 40930596. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:14:21,913][33201] Updated weights for policy 0, policy_version 79590 (0.0008) [2023-10-14 04:14:22,159][33226] Updated weights for policy 1, policy_version 80290 (0.0009) [2023-10-14 04:14:22,278][33201] Updated weights for policy 0, policy_version 79600 (0.0007) [2023-10-14 04:14:22,518][33226] Updated weights for policy 1, policy_version 80300 (0.0008) [2023-10-14 04:14:22,652][33201] Updated weights for policy 0, policy_version 79610 (0.0007) [2023-10-14 04:14:22,892][33226] Updated weights for policy 1, policy_version 80310 (0.0007) [2023-10-14 04:14:23,253][33226] Updated weights for policy 1, policy_version 80320 (0.0009) [2023-10-14 04:14:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 163774464. Throughput: 0: 1753.9, 1: 1754.3. Samples: 40951592. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:14:26,477][33201] Updated weights for policy 0, policy_version 79620 (0.0009) [2023-10-14 04:14:26,849][33201] Updated weights for policy 0, policy_version 79630 (0.0009) [2023-10-14 04:14:27,058][33226] Updated weights for policy 1, policy_version 80330 (0.0007) [2023-10-14 04:14:27,209][33201] Updated weights for policy 0, policy_version 79640 (0.0008) [2023-10-14 04:14:27,425][33226] Updated weights for policy 1, policy_version 80340 (0.0007) [2023-10-14 04:14:27,801][33226] Updated weights for policy 1, policy_version 80350 (0.0011) [2023-10-14 04:14:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 163840000. Throughput: 0: 1763.6, 1: 1779.3. Samples: 40962844. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:14:31,164][33201] Updated weights for policy 0, policy_version 79650 (0.0008) [2023-10-14 04:14:31,535][33201] Updated weights for policy 0, policy_version 79660 (0.0007) [2023-10-14 04:14:31,743][33226] Updated weights for policy 1, policy_version 80360 (0.0008) [2023-10-14 04:14:31,897][33201] Updated weights for policy 0, policy_version 79670 (0.0007) [2023-10-14 04:14:32,106][33226] Updated weights for policy 1, policy_version 80370 (0.0008) [2023-10-14 04:14:32,272][33201] Updated weights for policy 0, policy_version 79680 (0.0008) [2023-10-14 04:14:32,468][33226] Updated weights for policy 1, policy_version 80380 (0.0007) [2023-10-14 04:14:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 163905536. Throughput: 0: 1751.0, 1: 1747.6. Samples: 40982914. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:14:36,118][33226] Updated weights for policy 1, policy_version 80390 (0.0007) [2023-10-14 04:14:36,150][33201] Updated weights for policy 0, policy_version 79690 (0.0007) [2023-10-14 04:14:36,487][33226] Updated weights for policy 1, policy_version 80400 (0.0009) [2023-10-14 04:14:36,514][33201] Updated weights for policy 0, policy_version 79700 (0.0008) [2023-10-14 04:14:36,846][33226] Updated weights for policy 1, policy_version 80410 (0.0010) [2023-10-14 04:14:36,889][33201] Updated weights for policy 0, policy_version 79710 (0.0009) [2023-10-14 04:14:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 163971072. Throughput: 0: 1752.1, 1: 1756.3. Samples: 41005432. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:14:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:14:40,530][33226] Updated weights for policy 1, policy_version 80420 (0.0010) [2023-10-14 04:14:40,899][33226] Updated weights for policy 1, policy_version 80430 (0.0008) [2023-10-14 04:14:40,933][33201] Updated weights for policy 0, policy_version 79720 (0.0007) [2023-10-14 04:14:41,262][33226] Updated weights for policy 1, policy_version 80440 (0.0009) [2023-10-14 04:14:41,302][33201] Updated weights for policy 0, policy_version 79730 (0.0010) [2023-10-14 04:14:41,673][33201] Updated weights for policy 0, policy_version 79740 (0.0009) [2023-10-14 04:14:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 164036608. Throughput: 0: 1739.7, 1: 1757.8. Samples: 41014798. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:14:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:14:45,263][33226] Updated weights for policy 1, policy_version 80450 (0.0008) [2023-10-14 04:14:45,507][33201] Updated weights for policy 0, policy_version 79750 (0.0008) [2023-10-14 04:14:45,623][33226] Updated weights for policy 1, policy_version 80460 (0.0007) [2023-10-14 04:14:45,873][33201] Updated weights for policy 0, policy_version 79760 (0.0009) [2023-10-14 04:14:45,991][33226] Updated weights for policy 1, policy_version 80470 (0.0009) [2023-10-14 04:14:46,249][33201] Updated weights for policy 0, policy_version 79770 (0.0008) [2023-10-14 04:14:46,356][33226] Updated weights for policy 1, policy_version 80480 (0.0009) [2023-10-14 04:14:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 164102144. Throughput: 0: 1734.1, 1: 1753.7. Samples: 41036450. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:14:49,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:14:50,052][33201] Updated weights for policy 0, policy_version 79780 (0.0008) [2023-10-14 04:14:50,101][33226] Updated weights for policy 1, policy_version 80490 (0.0008) [2023-10-14 04:14:50,426][33201] Updated weights for policy 0, policy_version 79790 (0.0009) [2023-10-14 04:14:50,465][33226] Updated weights for policy 1, policy_version 80500 (0.0007) [2023-10-14 04:14:50,803][33201] Updated weights for policy 0, policy_version 79800 (0.0008) [2023-10-14 04:14:50,820][33226] Updated weights for policy 1, policy_version 80510 (0.0008) [2023-10-14 04:14:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 164167680. Throughput: 0: 1763.6, 1: 1793.6. Samples: 41058716. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:14:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:14:54,612][33201] Updated weights for policy 0, policy_version 79810 (0.0007) [2023-10-14 04:14:54,636][33226] Updated weights for policy 1, policy_version 80520 (0.0008) [2023-10-14 04:14:54,980][33201] Updated weights for policy 0, policy_version 79820 (0.0008) [2023-10-14 04:14:55,009][33226] Updated weights for policy 1, policy_version 80530 (0.0008) [2023-10-14 04:14:55,353][33201] Updated weights for policy 0, policy_version 79830 (0.0009) [2023-10-14 04:14:55,391][33226] Updated weights for policy 1, policy_version 80540 (0.0009) [2023-10-14 04:14:55,731][33201] Updated weights for policy 0, policy_version 79840 (0.0007) [2023-10-14 04:14:59,173][33226] Updated weights for policy 1, policy_version 80550 (0.0007) [2023-10-14 04:14:59,538][33226] Updated weights for policy 1, policy_version 80560 (0.0007) [2023-10-14 04:14:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 164233216. Throughput: 0: 1732.5, 1: 1767.4. Samples: 41068060. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:14:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 04:14:59,612][33201] Updated weights for policy 0, policy_version 79850 (0.0008) [2023-10-14 04:14:59,914][33226] Updated weights for policy 1, policy_version 80570 (0.0008) [2023-10-14 04:14:59,983][33201] Updated weights for policy 0, policy_version 79860 (0.0008) [2023-10-14 04:15:00,347][33201] Updated weights for policy 0, policy_version 79870 (0.0009) [2023-10-14 04:15:03,665][33226] Updated weights for policy 1, policy_version 80580 (0.0008) [2023-10-14 04:15:04,026][33226] Updated weights for policy 1, policy_version 80590 (0.0009) [2023-10-14 04:15:04,063][33201] Updated weights for policy 0, policy_version 79880 (0.0008) [2023-10-14 04:15:04,389][33226] Updated weights for policy 1, policy_version 80600 (0.0008) [2023-10-14 04:15:04,438][33201] Updated weights for policy 0, policy_version 79890 (0.0007) [2023-10-14 04:15:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 14106.9). Total num frames: 164298752. Throughput: 0: 1761.9, 1: 1789.0. Samples: 41090386. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 04:15:04,804][33201] Updated weights for policy 0, policy_version 79900 (0.0007) [2023-10-14 04:15:08,039][33226] Updated weights for policy 1, policy_version 80610 (0.0008) [2023-10-14 04:15:08,408][33226] Updated weights for policy 1, policy_version 80620 (0.0009) [2023-10-14 04:15:08,669][33201] Updated weights for policy 0, policy_version 79910 (0.0007) [2023-10-14 04:15:08,784][33226] Updated weights for policy 1, policy_version 80630 (0.0009) [2023-10-14 04:15:09,039][33201] Updated weights for policy 0, policy_version 79920 (0.0007) [2023-10-14 04:15:09,149][33226] Updated weights for policy 1, policy_version 80640 (0.0008) [2023-10-14 04:15:09,404][33201] Updated weights for policy 0, policy_version 79930 (0.0009) [2023-10-14 04:15:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 164397056. Throughput: 0: 1748.7, 1: 1784.5. Samples: 41110586. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:09,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 04:15:12,822][33226] Updated weights for policy 1, policy_version 80650 (0.0008) [2023-10-14 04:15:13,198][33226] Updated weights for policy 1, policy_version 80660 (0.0008) [2023-10-14 04:15:13,340][33201] Updated weights for policy 0, policy_version 79940 (0.0007) [2023-10-14 04:15:13,576][33226] Updated weights for policy 1, policy_version 80670 (0.0009) [2023-10-14 04:15:13,709][33201] Updated weights for policy 0, policy_version 79950 (0.0008) [2023-10-14 04:15:14,079][33201] Updated weights for policy 0, policy_version 79960 (0.0008) [2023-10-14 04:15:14,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 164495360. Throughput: 0: 1748.3, 1: 1786.3. Samples: 41121898. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 04:15:17,283][33226] Updated weights for policy 1, policy_version 80680 (0.0008) [2023-10-14 04:15:17,652][33226] Updated weights for policy 1, policy_version 80690 (0.0009) [2023-10-14 04:15:18,028][33226] Updated weights for policy 1, policy_version 80700 (0.0009) [2023-10-14 04:15:18,037][33201] Updated weights for policy 0, policy_version 79970 (0.0009) [2023-10-14 04:15:18,410][33201] Updated weights for policy 0, policy_version 79980 (0.0007) [2023-10-14 04:15:18,783][33201] Updated weights for policy 0, policy_version 79990 (0.0009) [2023-10-14 04:15:19,149][33201] Updated weights for policy 0, policy_version 80000 (0.0011) [2023-10-14 04:15:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 164560896. Throughput: 0: 1761.3, 1: 1791.8. Samples: 41142804. Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.950')] [2023-10-14 04:15:21,991][33226] Updated weights for policy 1, policy_version 80710 (0.0008) [2023-10-14 04:15:22,351][33226] Updated weights for policy 1, policy_version 80720 (0.0009) [2023-10-14 04:15:22,713][33226] Updated weights for policy 1, policy_version 80730 (0.0008) [2023-10-14 04:15:23,002][33201] Updated weights for policy 0, policy_version 80010 (0.0008) [2023-10-14 04:15:23,382][33201] Updated weights for policy 0, policy_version 80020 (0.0007) [2023-10-14 04:15:23,750][33201] Updated weights for policy 0, policy_version 80030 (0.0009) [2023-10-14 04:15:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 164626432. Throughput: 0: 1732.0, 1: 1771.5. Samples: 41163088. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:15:24,570][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000080032_81952768.pth... [2023-10-14 04:15:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000080736_82673664.pth... [2023-10-14 04:15:24,608][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000078368_80248832.pth [2023-10-14 04:15:24,612][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000079072_80969728.pth [2023-10-14 04:15:26,546][33226] Updated weights for policy 1, policy_version 80740 (0.0008) [2023-10-14 04:15:26,908][33226] Updated weights for policy 1, policy_version 80750 (0.0007) [2023-10-14 04:15:27,274][33226] Updated weights for policy 1, policy_version 80760 (0.0007) [2023-10-14 04:15:27,624][33201] Updated weights for policy 0, policy_version 80040 (0.0009) [2023-10-14 04:15:27,991][33201] Updated weights for policy 0, policy_version 80050 (0.0009) [2023-10-14 04:15:28,363][33201] Updated weights for policy 0, policy_version 80060 (0.0007) [2023-10-14 04:15:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 164691968. Throughput: 0: 1771.8, 1: 1789.3. Samples: 41175048. 
Policy #0 lag: (min: 29.0, avg: 32.4, max: 61.0) [2023-10-14 04:15:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:15:30,988][33226] Updated weights for policy 1, policy_version 80770 (0.0008) [2023-10-14 04:15:31,337][33226] Updated weights for policy 1, policy_version 80780 (0.0010) [2023-10-14 04:15:31,703][33226] Updated weights for policy 1, policy_version 80790 (0.0009) [2023-10-14 04:15:32,069][33226] Updated weights for policy 1, policy_version 80800 (0.0008) [2023-10-14 04:15:32,307][33201] Updated weights for policy 0, policy_version 80070 (0.0009) [2023-10-14 04:15:32,687][33201] Updated weights for policy 0, policy_version 80080 (0.0010) [2023-10-14 04:15:33,055][33201] Updated weights for policy 0, policy_version 80090 (0.0011) [2023-10-14 04:15:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 164757504. Throughput: 0: 1746.7, 1: 1786.3. Samples: 41195434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:15:35,779][33226] Updated weights for policy 1, policy_version 80810 (0.0008) [2023-10-14 04:15:36,153][33226] Updated weights for policy 1, policy_version 80820 (0.0008) [2023-10-14 04:15:36,520][33226] Updated weights for policy 1, policy_version 80830 (0.0011) [2023-10-14 04:15:36,952][33201] Updated weights for policy 0, policy_version 80100 (0.0010) [2023-10-14 04:15:37,326][33201] Updated weights for policy 0, policy_version 80110 (0.0010) [2023-10-14 04:15:37,704][33201] Updated weights for policy 0, policy_version 80120 (0.0010) [2023-10-14 04:15:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 164823040. Throughput: 0: 1742.1, 1: 1786.7. Samples: 41217510. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:15:40,253][33226] Updated weights for policy 1, policy_version 80840 (0.0009) [2023-10-14 04:15:40,628][33226] Updated weights for policy 1, policy_version 80850 (0.0009) [2023-10-14 04:15:40,988][33226] Updated weights for policy 1, policy_version 80860 (0.0010) [2023-10-14 04:15:41,577][33201] Updated weights for policy 0, policy_version 80130 (0.0010) [2023-10-14 04:15:41,951][33201] Updated weights for policy 0, policy_version 80140 (0.0008) [2023-10-14 04:15:42,316][33201] Updated weights for policy 0, policy_version 80150 (0.0008) [2023-10-14 04:15:42,694][33201] Updated weights for policy 0, policy_version 80160 (0.0009) [2023-10-14 04:15:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 164888576. Throughput: 0: 1762.8, 1: 1785.0. Samples: 41227710. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:15:44,680][33226] Updated weights for policy 1, policy_version 80870 (0.0010) [2023-10-14 04:15:45,055][33226] Updated weights for policy 1, policy_version 80880 (0.0009) [2023-10-14 04:15:45,428][33226] Updated weights for policy 1, policy_version 80890 (0.0009) [2023-10-14 04:15:46,560][33201] Updated weights for policy 0, policy_version 80170 (0.0009) [2023-10-14 04:15:46,931][33201] Updated weights for policy 0, policy_version 80180 (0.0010) [2023-10-14 04:15:47,295][33201] Updated weights for policy 0, policy_version 80190 (0.0007) [2023-10-14 04:15:49,278][33226] Updated weights for policy 1, policy_version 80900 (0.0008) [2023-10-14 04:15:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 164954112. Throughput: 0: 1744.9, 1: 1787.7. Samples: 41249356. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:15:49,642][33226] Updated weights for policy 1, policy_version 80910 (0.0009) [2023-10-14 04:15:50,001][33226] Updated weights for policy 1, policy_version 80920 (0.0008) [2023-10-14 04:15:51,141][33201] Updated weights for policy 0, policy_version 80200 (0.0009) [2023-10-14 04:15:51,518][33201] Updated weights for policy 0, policy_version 80210 (0.0009) [2023-10-14 04:15:51,883][33201] Updated weights for policy 0, policy_version 80220 (0.0010) [2023-10-14 04:15:53,845][33226] Updated weights for policy 1, policy_version 80930 (0.0008) [2023-10-14 04:15:54,209][33226] Updated weights for policy 1, policy_version 80940 (0.0009) [2023-10-14 04:15:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 165019648. Throughput: 0: 1763.3, 1: 1810.7. Samples: 41271416. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 04:15:54,592][33226] Updated weights for policy 1, policy_version 80950 (0.0008) [2023-10-14 04:15:54,954][33226] Updated weights for policy 1, policy_version 80960 (0.0011) [2023-10-14 04:15:55,619][33201] Updated weights for policy 0, policy_version 80230 (0.0009) [2023-10-14 04:15:55,993][33201] Updated weights for policy 0, policy_version 80240 (0.0009) [2023-10-14 04:15:56,364][33201] Updated weights for policy 0, policy_version 80250 (0.0007) [2023-10-14 04:15:58,737][33226] Updated weights for policy 1, policy_version 80970 (0.0008) [2023-10-14 04:15:59,111][33226] Updated weights for policy 1, policy_version 80980 (0.0007) [2023-10-14 04:15:59,473][33226] Updated weights for policy 1, policy_version 80990 (0.0007) [2023-10-14 04:15:59,557][31953] Fps is (10 sec: 16384.5, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 165117952. Throughput: 0: 1749.4, 1: 1788.5. 
Samples: 41281104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:15:59,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.940')] [2023-10-14 04:16:00,130][33201] Updated weights for policy 0, policy_version 80260 (0.0008) [2023-10-14 04:16:00,504][33201] Updated weights for policy 0, policy_version 80270 (0.0007) [2023-10-14 04:16:00,877][33201] Updated weights for policy 0, policy_version 80280 (0.0009) [2023-10-14 04:16:03,178][33226] Updated weights for policy 1, policy_version 81000 (0.0009) [2023-10-14 04:16:03,541][33226] Updated weights for policy 1, policy_version 81010 (0.0009) [2023-10-14 04:16:03,918][33226] Updated weights for policy 1, policy_version 81020 (0.0009) [2023-10-14 04:16:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 165183488. Throughput: 0: 1754.0, 1: 1812.8. Samples: 41303314. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:16:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 04:16:04,623][33201] Updated weights for policy 0, policy_version 80290 (0.0010) [2023-10-14 04:16:04,995][33201] Updated weights for policy 0, policy_version 80300 (0.0009) [2023-10-14 04:16:05,356][33201] Updated weights for policy 0, policy_version 80310 (0.0008) [2023-10-14 04:16:05,731][33201] Updated weights for policy 0, policy_version 80320 (0.0009) [2023-10-14 04:16:07,668][33226] Updated weights for policy 1, policy_version 81030 (0.0008) [2023-10-14 04:16:08,043][33226] Updated weights for policy 1, policy_version 81040 (0.0009) [2023-10-14 04:16:08,412][33226] Updated weights for policy 1, policy_version 81050 (0.0008) [2023-10-14 04:16:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 165249024. Throughput: 0: 1783.9, 1: 1799.4. Samples: 41324338. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:16:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.950')] [2023-10-14 04:16:09,568][33201] Updated weights for policy 0, policy_version 80330 (0.0011) [2023-10-14 04:16:09,938][33201] Updated weights for policy 0, policy_version 80340 (0.0010) [2023-10-14 04:16:10,307][33201] Updated weights for policy 0, policy_version 80350 (0.0007) [2023-10-14 04:16:12,179][33226] Updated weights for policy 1, policy_version 81060 (0.0007) [2023-10-14 04:16:12,557][33226] Updated weights for policy 1, policy_version 81070 (0.0009) [2023-10-14 04:16:12,928][33226] Updated weights for policy 1, policy_version 81080 (0.0009) [2023-10-14 04:16:14,320][33201] Updated weights for policy 0, policy_version 80360 (0.0008) [2023-10-14 04:16:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 165314560. Throughput: 0: 1749.6, 1: 1808.9. Samples: 41335178. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:16:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:14,685][33201] Updated weights for policy 0, policy_version 80370 (0.0008) [2023-10-14 04:16:15,056][33201] Updated weights for policy 0, policy_version 80380 (0.0008) [2023-10-14 04:16:16,832][33226] Updated weights for policy 1, policy_version 81090 (0.0009) [2023-10-14 04:16:17,191][33226] Updated weights for policy 1, policy_version 81100 (0.0010) [2023-10-14 04:16:17,553][33226] Updated weights for policy 1, policy_version 81110 (0.0009) [2023-10-14 04:16:17,928][33226] Updated weights for policy 1, policy_version 81120 (0.0009) [2023-10-14 04:16:18,785][33201] Updated weights for policy 0, policy_version 80390 (0.0008) [2023-10-14 04:16:19,155][33201] Updated weights for policy 0, policy_version 80400 (0.0008) [2023-10-14 04:16:19,523][33201] Updated weights for policy 0, policy_version 80410 (0.0007) [2023-10-14 04:16:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 
13653.3, 300 sec: 14106.9). Total num frames: 165380096. Throughput: 0: 1783.6, 1: 1782.7. Samples: 41355918. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:16:19,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:21,633][33226] Updated weights for policy 1, policy_version 81130 (0.0008) [2023-10-14 04:16:22,006][33226] Updated weights for policy 1, policy_version 81140 (0.0008) [2023-10-14 04:16:22,364][33226] Updated weights for policy 1, policy_version 81150 (0.0007) [2023-10-14 04:16:23,151][33201] Updated weights for policy 0, policy_version 80420 (0.0007) [2023-10-14 04:16:23,519][33201] Updated weights for policy 0, policy_version 80430 (0.0007) [2023-10-14 04:16:23,897][33201] Updated weights for policy 0, policy_version 80440 (0.0009) [2023-10-14 04:16:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 165478400. Throughput: 0: 1763.0, 1: 1782.4. Samples: 41377054. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:26,052][33226] Updated weights for policy 1, policy_version 81160 (0.0008) [2023-10-14 04:16:26,435][33226] Updated weights for policy 1, policy_version 81170 (0.0011) [2023-10-14 04:16:26,791][33226] Updated weights for policy 1, policy_version 81180 (0.0010) [2023-10-14 04:16:27,748][33201] Updated weights for policy 0, policy_version 80450 (0.0008) [2023-10-14 04:16:28,116][33201] Updated weights for policy 0, policy_version 80460 (0.0008) [2023-10-14 04:16:28,492][33201] Updated weights for policy 0, policy_version 80470 (0.0007) [2023-10-14 04:16:28,866][33201] Updated weights for policy 0, policy_version 80480 (0.0009) [2023-10-14 04:16:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 165543936. Throughput: 0: 1773.3, 1: 1784.8. Samples: 41387826. 
Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:29,560][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:30,618][33226] Updated weights for policy 1, policy_version 81190 (0.0008) [2023-10-14 04:16:30,978][33226] Updated weights for policy 1, policy_version 81200 (0.0008) [2023-10-14 04:16:31,348][33226] Updated weights for policy 1, policy_version 81210 (0.0009) [2023-10-14 04:16:32,608][33201] Updated weights for policy 0, policy_version 80490 (0.0009) [2023-10-14 04:16:32,974][33201] Updated weights for policy 0, policy_version 80500 (0.0008) [2023-10-14 04:16:33,349][33201] Updated weights for policy 0, policy_version 80510 (0.0009) [2023-10-14 04:16:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 165609472. Throughput: 0: 1766.8, 1: 1778.3. Samples: 41408882. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:35,144][33226] Updated weights for policy 1, policy_version 81220 (0.0008) [2023-10-14 04:16:35,523][33226] Updated weights for policy 1, policy_version 81230 (0.0007) [2023-10-14 04:16:35,884][33226] Updated weights for policy 1, policy_version 81240 (0.0010) [2023-10-14 04:16:37,097][33201] Updated weights for policy 0, policy_version 80520 (0.0008) [2023-10-14 04:16:37,462][33201] Updated weights for policy 0, policy_version 80530 (0.0009) [2023-10-14 04:16:37,839][33201] Updated weights for policy 0, policy_version 80540 (0.0010) [2023-10-14 04:16:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 165675008. Throughput: 0: 1755.0, 1: 1785.4. Samples: 41430736. 
Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.950')] [2023-10-14 04:16:39,728][33226] Updated weights for policy 1, policy_version 81250 (0.0008) [2023-10-14 04:16:40,095][33226] Updated weights for policy 1, policy_version 81260 (0.0011) [2023-10-14 04:16:40,459][33226] Updated weights for policy 1, policy_version 81270 (0.0010) [2023-10-14 04:16:40,820][33226] Updated weights for policy 1, policy_version 81280 (0.0008) [2023-10-14 04:16:41,680][33201] Updated weights for policy 0, policy_version 80550 (0.0009) [2023-10-14 04:16:42,063][33201] Updated weights for policy 0, policy_version 80560 (0.0008) [2023-10-14 04:16:42,434][33201] Updated weights for policy 0, policy_version 80570 (0.0009) [2023-10-14 04:16:44,398][33226] Updated weights for policy 1, policy_version 81290 (0.0010) [2023-10-14 04:16:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 165740544. Throughput: 0: 1777.1, 1: 1779.5. Samples: 41441148. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 04:16:44,762][33226] Updated weights for policy 1, policy_version 81300 (0.0010) [2023-10-14 04:16:45,130][33226] Updated weights for policy 1, policy_version 81310 (0.0009) [2023-10-14 04:16:46,317][33201] Updated weights for policy 0, policy_version 80580 (0.0007) [2023-10-14 04:16:46,688][33201] Updated weights for policy 0, policy_version 80590 (0.0008) [2023-10-14 04:16:47,058][33201] Updated weights for policy 0, policy_version 80600 (0.0007) [2023-10-14 04:16:49,015][33226] Updated weights for policy 1, policy_version 81320 (0.0009) [2023-10-14 04:16:49,369][33226] Updated weights for policy 1, policy_version 81330 (0.0012) [2023-10-14 04:16:49,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 165806080. Throughput: 0: 1761.6, 1: 1780.7. 
Samples: 41462720. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:16:49,736][33226] Updated weights for policy 1, policy_version 81340 (0.0007) [2023-10-14 04:16:50,931][33201] Updated weights for policy 0, policy_version 80610 (0.0008) [2023-10-14 04:16:51,301][33201] Updated weights for policy 0, policy_version 80620 (0.0008) [2023-10-14 04:16:51,679][33201] Updated weights for policy 0, policy_version 80630 (0.0007) [2023-10-14 04:16:52,042][33201] Updated weights for policy 0, policy_version 80640 (0.0008) [2023-10-14 04:16:53,345][33226] Updated weights for policy 1, policy_version 81350 (0.0008) [2023-10-14 04:16:53,716][33226] Updated weights for policy 1, policy_version 81360 (0.0008) [2023-10-14 04:16:54,089][33226] Updated weights for policy 1, policy_version 81370 (0.0007) [2023-10-14 04:16:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 165904384. Throughput: 0: 1758.1, 1: 1791.4. Samples: 41484064. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:16:55,883][33201] Updated weights for policy 0, policy_version 80650 (0.0008) [2023-10-14 04:16:56,254][33201] Updated weights for policy 0, policy_version 80660 (0.0007) [2023-10-14 04:16:56,634][33201] Updated weights for policy 0, policy_version 80670 (0.0010) [2023-10-14 04:16:57,944][33226] Updated weights for policy 1, policy_version 81380 (0.0010) [2023-10-14 04:16:58,308][33226] Updated weights for policy 1, policy_version 81390 (0.0008) [2023-10-14 04:16:58,686][33226] Updated weights for policy 1, policy_version 81400 (0.0007) [2023-10-14 04:16:59,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 165969920. Throughput: 0: 1756.0, 1: 1782.2. Samples: 41494396. 
Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:16:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:17:00,595][33201] Updated weights for policy 0, policy_version 80680 (0.0008) [2023-10-14 04:17:00,967][33201] Updated weights for policy 0, policy_version 80690 (0.0008) [2023-10-14 04:17:01,336][33201] Updated weights for policy 0, policy_version 80700 (0.0008) [2023-10-14 04:17:02,375][33226] Updated weights for policy 1, policy_version 81410 (0.0008) [2023-10-14 04:17:02,738][33226] Updated weights for policy 1, policy_version 81420 (0.0008) [2023-10-14 04:17:03,104][33226] Updated weights for policy 1, policy_version 81430 (0.0009) [2023-10-14 04:17:03,475][33226] Updated weights for policy 1, policy_version 81440 (0.0008) [2023-10-14 04:17:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166035456. Throughput: 0: 1750.4, 1: 1799.3. Samples: 41515654. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:17:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:05,174][33201] Updated weights for policy 0, policy_version 80710 (0.0010) [2023-10-14 04:17:05,545][33201] Updated weights for policy 0, policy_version 80720 (0.0007) [2023-10-14 04:17:05,916][33201] Updated weights for policy 0, policy_version 80730 (0.0008) [2023-10-14 04:17:07,181][33226] Updated weights for policy 1, policy_version 81450 (0.0008) [2023-10-14 04:17:07,547][33226] Updated weights for policy 1, policy_version 81460 (0.0010) [2023-10-14 04:17:07,921][33226] Updated weights for policy 1, policy_version 81470 (0.0010) [2023-10-14 04:17:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 166100992. Throughput: 0: 1778.4, 1: 1787.8. Samples: 41537534. 
Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:17:09,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 04:17:09,569][33201] Updated weights for policy 0, policy_version 80740 (0.0008) [2023-10-14 04:17:09,936][33201] Updated weights for policy 0, policy_version 80750 (0.0007) [2023-10-14 04:17:10,304][33201] Updated weights for policy 0, policy_version 80760 (0.0009) [2023-10-14 04:17:11,668][33226] Updated weights for policy 1, policy_version 81480 (0.0008) [2023-10-14 04:17:12,040][33226] Updated weights for policy 1, policy_version 81490 (0.0007) [2023-10-14 04:17:12,403][33226] Updated weights for policy 1, policy_version 81500 (0.0007) [2023-10-14 04:17:13,970][33201] Updated weights for policy 0, policy_version 80770 (0.0007) [2023-10-14 04:17:14,334][33201] Updated weights for policy 0, policy_version 80780 (0.0008) [2023-10-14 04:17:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 166166528. Throughput: 0: 1753.1, 1: 1807.8. Samples: 41548066. Policy #0 lag: (min: 10.0, avg: 18.3, max: 42.0) [2023-10-14 04:17:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 04:17:14,703][33201] Updated weights for policy 0, policy_version 80790 (0.0010) [2023-10-14 04:17:15,078][33201] Updated weights for policy 0, policy_version 80800 (0.0011) [2023-10-14 04:17:16,197][33226] Updated weights for policy 1, policy_version 81510 (0.0009) [2023-10-14 04:17:16,560][33226] Updated weights for policy 1, policy_version 81520 (0.0007) [2023-10-14 04:17:16,924][33226] Updated weights for policy 1, policy_version 81530 (0.0007) [2023-10-14 04:17:19,053][33201] Updated weights for policy 0, policy_version 80810 (0.0008) [2023-10-14 04:17:19,427][33201] Updated weights for policy 0, policy_version 80820 (0.0009) [2023-10-14 04:17:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 166232064. Throughput: 0: 1777.4, 1: 1792.8. 
Samples: 41569538. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 04:17:19,789][33201] Updated weights for policy 0, policy_version 80830 (0.0007) [2023-10-14 04:17:20,606][33226] Updated weights for policy 1, policy_version 81540 (0.0009) [2023-10-14 04:17:20,981][33226] Updated weights for policy 1, policy_version 81550 (0.0008) [2023-10-14 04:17:21,343][33226] Updated weights for policy 1, policy_version 81560 (0.0009) [2023-10-14 04:17:23,613][33201] Updated weights for policy 0, policy_version 80840 (0.0007) [2023-10-14 04:17:23,981][33201] Updated weights for policy 0, policy_version 80850 (0.0008) [2023-10-14 04:17:24,353][33201] Updated weights for policy 0, policy_version 80860 (0.0009) [2023-10-14 04:17:24,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 166330368. Throughput: 0: 1766.1, 1: 1787.6. Samples: 41590656. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:17:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000081568_83525632.pth... [2023-10-14 04:17:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000080864_82804736.pth... 
[2023-10-14 04:17:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000079200_81100800.pth [2023-10-14 04:17:24,612][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000079904_81821696.pth [2023-10-14 04:17:25,252][33226] Updated weights for policy 1, policy_version 81570 (0.0010) [2023-10-14 04:17:25,614][33226] Updated weights for policy 1, policy_version 81580 (0.0009) [2023-10-14 04:17:25,981][33226] Updated weights for policy 1, policy_version 81590 (0.0009) [2023-10-14 04:17:26,357][33226] Updated weights for policy 1, policy_version 81600 (0.0008) [2023-10-14 04:17:28,027][33201] Updated weights for policy 0, policy_version 80870 (0.0008) [2023-10-14 04:17:28,404][33201] Updated weights for policy 0, policy_version 80880 (0.0008) [2023-10-14 04:17:28,780][33201] Updated weights for policy 0, policy_version 80890 (0.0008) [2023-10-14 04:17:29,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166395904. Throughput: 0: 1766.8, 1: 1787.6. Samples: 41601094. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:17:30,090][33226] Updated weights for policy 1, policy_version 81610 (0.0007) [2023-10-14 04:17:30,453][33226] Updated weights for policy 1, policy_version 81620 (0.0007) [2023-10-14 04:17:30,814][33226] Updated weights for policy 1, policy_version 81630 (0.0008) [2023-10-14 04:17:32,547][33201] Updated weights for policy 0, policy_version 80900 (0.0008) [2023-10-14 04:17:32,913][33201] Updated weights for policy 0, policy_version 80910 (0.0007) [2023-10-14 04:17:33,284][33201] Updated weights for policy 0, policy_version 80920 (0.0007) [2023-10-14 04:17:34,465][33226] Updated weights for policy 1, policy_version 81640 (0.0010) [2023-10-14 04:17:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166461440. 
Throughput: 0: 1771.2, 1: 1791.1. Samples: 41623024. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:34,837][33226] Updated weights for policy 1, policy_version 81650 (0.0011) [2023-10-14 04:17:35,204][33226] Updated weights for policy 1, policy_version 81660 (0.0011) [2023-10-14 04:17:37,345][33201] Updated weights for policy 0, policy_version 80930 (0.0008) [2023-10-14 04:17:37,719][33201] Updated weights for policy 0, policy_version 80940 (0.0010) [2023-10-14 04:17:38,085][33201] Updated weights for policy 0, policy_version 80950 (0.0009) [2023-10-14 04:17:38,446][33201] Updated weights for policy 0, policy_version 80960 (0.0010) [2023-10-14 04:17:38,966][33226] Updated weights for policy 1, policy_version 81670 (0.0011) [2023-10-14 04:17:39,334][33226] Updated weights for policy 1, policy_version 81680 (0.0008) [2023-10-14 04:17:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166526976. Throughput: 0: 1753.8, 1: 1804.2. Samples: 41644176. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:39,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:39,701][33226] Updated weights for policy 1, policy_version 81690 (0.0007) [2023-10-14 04:17:42,377][33201] Updated weights for policy 0, policy_version 80970 (0.0010) [2023-10-14 04:17:42,748][33201] Updated weights for policy 0, policy_version 80980 (0.0009) [2023-10-14 04:17:43,117][33201] Updated weights for policy 0, policy_version 80990 (0.0007) [2023-10-14 04:17:43,532][33226] Updated weights for policy 1, policy_version 81700 (0.0010) [2023-10-14 04:17:43,894][33226] Updated weights for policy 1, policy_version 81710 (0.0010) [2023-10-14 04:17:44,260][33226] Updated weights for policy 1, policy_version 81720 (0.0010) [2023-10-14 04:17:44,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14745.6, 300 sec: 14329.1). 
Total num frames: 166625280. Throughput: 0: 1790.5, 1: 1787.3. Samples: 41655396. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:46,856][33201] Updated weights for policy 0, policy_version 81000 (0.0007) [2023-10-14 04:17:47,215][33201] Updated weights for policy 0, policy_version 81010 (0.0009) [2023-10-14 04:17:47,587][33201] Updated weights for policy 0, policy_version 81020 (0.0007) [2023-10-14 04:17:48,090][33226] Updated weights for policy 1, policy_version 81730 (0.0007) [2023-10-14 04:17:48,459][33226] Updated weights for policy 1, policy_version 81740 (0.0008) [2023-10-14 04:17:48,825][33226] Updated weights for policy 1, policy_version 81750 (0.0009) [2023-10-14 04:17:49,185][33226] Updated weights for policy 1, policy_version 81760 (0.0007) [2023-10-14 04:17:49,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.7, 300 sec: 14218.0). Total num frames: 166690816. Throughput: 0: 1764.3, 1: 1805.0. Samples: 41676272. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:51,373][33201] Updated weights for policy 0, policy_version 81030 (0.0010) [2023-10-14 04:17:51,749][33201] Updated weights for policy 0, policy_version 81040 (0.0010) [2023-10-14 04:17:52,122][33201] Updated weights for policy 0, policy_version 81050 (0.0007) [2023-10-14 04:17:52,881][33226] Updated weights for policy 1, policy_version 81770 (0.0007) [2023-10-14 04:17:53,250][33226] Updated weights for policy 1, policy_version 81780 (0.0007) [2023-10-14 04:17:53,622][33226] Updated weights for policy 1, policy_version 81790 (0.0007) [2023-10-14 04:17:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166756352. Throughput: 0: 1764.9, 1: 1781.4. Samples: 41697116. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:17:55,844][33201] Updated weights for policy 0, policy_version 81060 (0.0009) [2023-10-14 04:17:56,211][33201] Updated weights for policy 0, policy_version 81070 (0.0011) [2023-10-14 04:17:56,576][33201] Updated weights for policy 0, policy_version 81080 (0.0010) [2023-10-14 04:17:57,660][33226] Updated weights for policy 1, policy_version 81800 (0.0008) [2023-10-14 04:17:58,041][33226] Updated weights for policy 1, policy_version 81810 (0.0008) [2023-10-14 04:17:58,398][33226] Updated weights for policy 1, policy_version 81820 (0.0008) [2023-10-14 04:17:59,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 166821888. Throughput: 0: 1762.2, 1: 1792.8. Samples: 41708040. Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:17:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:18:00,637][33201] Updated weights for policy 0, policy_version 81090 (0.0010) [2023-10-14 04:18:01,002][33201] Updated weights for policy 0, policy_version 81100 (0.0010) [2023-10-14 04:18:01,380][33201] Updated weights for policy 0, policy_version 81110 (0.0010) [2023-10-14 04:18:01,741][33201] Updated weights for policy 0, policy_version 81120 (0.0010) [2023-10-14 04:18:02,240][33226] Updated weights for policy 1, policy_version 81830 (0.0008) [2023-10-14 04:18:02,598][33226] Updated weights for policy 1, policy_version 81840 (0.0008) [2023-10-14 04:18:02,968][33226] Updated weights for policy 1, policy_version 81850 (0.0007) [2023-10-14 04:18:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 166887424. Throughput: 0: 1761.4, 1: 1782.3. Samples: 41729002. 
Policy #0 lag: (min: 31.0, avg: 38.2, max: 63.0) [2023-10-14 04:18:04,559][31953] Avg episode reward: [(0, '20.980'), (1, '20.990')] [2023-10-14 04:18:05,517][33201] Updated weights for policy 0, policy_version 81130 (0.0009) [2023-10-14 04:18:05,886][33201] Updated weights for policy 0, policy_version 81140 (0.0007) [2023-10-14 04:18:06,243][33201] Updated weights for policy 0, policy_version 81150 (0.0008) [2023-10-14 04:18:06,574][33226] Updated weights for policy 1, policy_version 81860 (0.0007) [2023-10-14 04:18:06,944][33226] Updated weights for policy 1, policy_version 81870 (0.0009) [2023-10-14 04:18:07,312][33226] Updated weights for policy 1, policy_version 81880 (0.0009) [2023-10-14 04:18:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 166952960. Throughput: 0: 1784.2, 1: 1778.2. Samples: 41750964. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:18:10,091][33201] Updated weights for policy 0, policy_version 81160 (0.0008) [2023-10-14 04:18:10,466][33201] Updated weights for policy 0, policy_version 81170 (0.0008) [2023-10-14 04:18:10,834][33201] Updated weights for policy 0, policy_version 81180 (0.0007) [2023-10-14 04:18:11,136][33226] Updated weights for policy 1, policy_version 81890 (0.0012) [2023-10-14 04:18:11,505][33226] Updated weights for policy 1, policy_version 81900 (0.0010) [2023-10-14 04:18:11,877][33226] Updated weights for policy 1, policy_version 81910 (0.0010) [2023-10-14 04:18:12,237][33226] Updated weights for policy 1, policy_version 81920 (0.0007) [2023-10-14 04:18:14,542][33201] Updated weights for policy 0, policy_version 81190 (0.0009) [2023-10-14 04:18:14,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 167018496. Throughput: 0: 1763.0, 1: 1789.9. Samples: 41760976. 
Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:18:14,906][33201] Updated weights for policy 0, policy_version 81200 (0.0009) [2023-10-14 04:18:15,275][33201] Updated weights for policy 0, policy_version 81210 (0.0007) [2023-10-14 04:18:16,045][33226] Updated weights for policy 1, policy_version 81930 (0.0008) [2023-10-14 04:18:16,416][33226] Updated weights for policy 1, policy_version 81940 (0.0009) [2023-10-14 04:18:16,786][33226] Updated weights for policy 1, policy_version 81950 (0.0008) [2023-10-14 04:18:18,983][33201] Updated weights for policy 0, policy_version 81220 (0.0009) [2023-10-14 04:18:19,351][33201] Updated weights for policy 0, policy_version 81230 (0.0011) [2023-10-14 04:18:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 167084032. Throughput: 0: 1777.0, 1: 1775.6. Samples: 41782890. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:18:19,731][33201] Updated weights for policy 0, policy_version 81240 (0.0011) [2023-10-14 04:18:20,501][33226] Updated weights for policy 1, policy_version 81960 (0.0008) [2023-10-14 04:18:20,879][33226] Updated weights for policy 1, policy_version 81970 (0.0007) [2023-10-14 04:18:21,239][33226] Updated weights for policy 1, policy_version 81980 (0.0009) [2023-10-14 04:18:23,587][33201] Updated weights for policy 0, policy_version 81250 (0.0008) [2023-10-14 04:18:23,945][33201] Updated weights for policy 0, policy_version 81260 (0.0009) [2023-10-14 04:18:24,319][33201] Updated weights for policy 0, policy_version 81270 (0.0009) [2023-10-14 04:18:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 167149568. Throughput: 0: 1782.9, 1: 1777.7. Samples: 41804406. 
Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 04:18:24,681][33201] Updated weights for policy 0, policy_version 81280 (0.0009) [2023-10-14 04:18:24,925][33226] Updated weights for policy 1, policy_version 81990 (0.0010) [2023-10-14 04:18:25,298][33226] Updated weights for policy 1, policy_version 82000 (0.0010) [2023-10-14 04:18:25,669][33226] Updated weights for policy 1, policy_version 82010 (0.0009) [2023-10-14 04:18:28,540][33201] Updated weights for policy 0, policy_version 81290 (0.0007) [2023-10-14 04:18:28,898][33201] Updated weights for policy 0, policy_version 81300 (0.0007) [2023-10-14 04:18:29,264][33201] Updated weights for policy 0, policy_version 81310 (0.0007) [2023-10-14 04:18:29,554][33226] Updated weights for policy 1, policy_version 82020 (0.0009) [2023-10-14 04:18:29,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 167247872. Throughput: 0: 1765.0, 1: 1774.6. Samples: 41814680. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:29,912][33226] Updated weights for policy 1, policy_version 82030 (0.0010) [2023-10-14 04:18:30,282][33226] Updated weights for policy 1, policy_version 82040 (0.0007) [2023-10-14 04:18:33,243][33201] Updated weights for policy 0, policy_version 81320 (0.0008) [2023-10-14 04:18:33,621][33201] Updated weights for policy 0, policy_version 81330 (0.0008) [2023-10-14 04:18:33,990][33201] Updated weights for policy 0, policy_version 81340 (0.0011) [2023-10-14 04:18:34,221][33226] Updated weights for policy 1, policy_version 82050 (0.0008) [2023-10-14 04:18:34,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 167313408. Throughput: 0: 1785.9, 1: 1770.5. Samples: 41836310. 
Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:34,592][33226] Updated weights for policy 1, policy_version 82060 (0.0009) [2023-10-14 04:18:34,959][33226] Updated weights for policy 1, policy_version 82070 (0.0008) [2023-10-14 04:18:35,325][33226] Updated weights for policy 1, policy_version 82080 (0.0008) [2023-10-14 04:18:37,961][33201] Updated weights for policy 0, policy_version 81350 (0.0009) [2023-10-14 04:18:38,333][33201] Updated weights for policy 0, policy_version 81360 (0.0007) [2023-10-14 04:18:38,695][33201] Updated weights for policy 0, policy_version 81370 (0.0007) [2023-10-14 04:18:38,952][33226] Updated weights for policy 1, policy_version 82090 (0.0007) [2023-10-14 04:18:39,312][33226] Updated weights for policy 1, policy_version 82100 (0.0008) [2023-10-14 04:18:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 167378944. Throughput: 0: 1751.7, 1: 1798.3. Samples: 41856866. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:39,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:39,685][33226] Updated weights for policy 1, policy_version 82110 (0.0011) [2023-10-14 04:18:42,671][33201] Updated weights for policy 0, policy_version 81380 (0.0009) [2023-10-14 04:18:43,027][33201] Updated weights for policy 0, policy_version 81390 (0.0009) [2023-10-14 04:18:43,408][33201] Updated weights for policy 0, policy_version 81400 (0.0008) [2023-10-14 04:18:43,435][33226] Updated weights for policy 1, policy_version 82120 (0.0008) [2023-10-14 04:18:43,809][33226] Updated weights for policy 1, policy_version 82130 (0.0008) [2023-10-14 04:18:44,178][33226] Updated weights for policy 1, policy_version 82140 (0.0007) [2023-10-14 04:18:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 167477248. Throughput: 0: 1782.1, 1: 1782.0. 
Samples: 41868428. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:44,559][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:47,090][33201] Updated weights for policy 0, policy_version 81410 (0.0009) [2023-10-14 04:18:47,455][33201] Updated weights for policy 0, policy_version 81420 (0.0009) [2023-10-14 04:18:47,794][33226] Updated weights for policy 1, policy_version 82150 (0.0008) [2023-10-14 04:18:47,835][33201] Updated weights for policy 0, policy_version 81430 (0.0008) [2023-10-14 04:18:48,162][33226] Updated weights for policy 1, policy_version 82160 (0.0007) [2023-10-14 04:18:48,195][33201] Updated weights for policy 0, policy_version 81440 (0.0009) [2023-10-14 04:18:48,523][33226] Updated weights for policy 1, policy_version 82170 (0.0009) [2023-10-14 04:18:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 167542784. Throughput: 0: 1750.7, 1: 1797.6. Samples: 41888674. Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:49,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:52,142][33201] Updated weights for policy 0, policy_version 81450 (0.0008) [2023-10-14 04:18:52,416][33226] Updated weights for policy 1, policy_version 82180 (0.0007) [2023-10-14 04:18:52,516][33201] Updated weights for policy 0, policy_version 81460 (0.0008) [2023-10-14 04:18:52,777][33226] Updated weights for policy 1, policy_version 82190 (0.0008) [2023-10-14 04:18:52,887][33201] Updated weights for policy 0, policy_version 81470 (0.0009) [2023-10-14 04:18:53,146][33226] Updated weights for policy 1, policy_version 82200 (0.0008) [2023-10-14 04:18:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 167608320. Throughput: 0: 1744.3, 1: 1777.1. Samples: 41909428. 
Policy #0 lag: (min: 8.0, avg: 35.9, max: 40.0) [2023-10-14 04:18:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:18:56,658][33201] Updated weights for policy 0, policy_version 81480 (0.0008) [2023-10-14 04:18:56,951][33226] Updated weights for policy 1, policy_version 82210 (0.0010) [2023-10-14 04:18:57,016][33201] Updated weights for policy 0, policy_version 81490 (0.0008) [2023-10-14 04:18:57,322][33226] Updated weights for policy 1, policy_version 82220 (0.0008) [2023-10-14 04:18:57,390][33201] Updated weights for policy 0, policy_version 81500 (0.0007) [2023-10-14 04:18:57,689][33226] Updated weights for policy 1, policy_version 82230 (0.0008) [2023-10-14 04:18:58,050][33226] Updated weights for policy 1, policy_version 82240 (0.0007) [2023-10-14 04:18:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 167673856. Throughput: 0: 1760.8, 1: 1798.1. Samples: 41921124. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:18:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:19:01,118][33201] Updated weights for policy 0, policy_version 81510 (0.0008) [2023-10-14 04:19:01,497][33201] Updated weights for policy 0, policy_version 81520 (0.0011) [2023-10-14 04:19:01,839][33226] Updated weights for policy 1, policy_version 82250 (0.0008) [2023-10-14 04:19:01,862][33201] Updated weights for policy 0, policy_version 81530 (0.0008) [2023-10-14 04:19:02,206][33226] Updated weights for policy 1, policy_version 82260 (0.0007) [2023-10-14 04:19:02,566][33226] Updated weights for policy 1, policy_version 82270 (0.0008) [2023-10-14 04:19:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 167739392. Throughput: 0: 1744.5, 1: 1779.0. Samples: 41941446. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:19:05,812][33201] Updated weights for policy 0, policy_version 81540 (0.0008) [2023-10-14 04:19:06,181][33201] Updated weights for policy 0, policy_version 81550 (0.0009) [2023-10-14 04:19:06,321][33226] Updated weights for policy 1, policy_version 82280 (0.0007) [2023-10-14 04:19:06,559][33201] Updated weights for policy 0, policy_version 81560 (0.0009) [2023-10-14 04:19:06,684][33226] Updated weights for policy 1, policy_version 82290 (0.0009) [2023-10-14 04:19:07,049][33226] Updated weights for policy 1, policy_version 82300 (0.0009) [2023-10-14 04:19:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 167804928. Throughput: 0: 1754.9, 1: 1787.3. Samples: 41963802. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:19:10,346][33201] Updated weights for policy 0, policy_version 81570 (0.0008) [2023-10-14 04:19:10,678][33226] Updated weights for policy 1, policy_version 82310 (0.0008) [2023-10-14 04:19:10,720][33201] Updated weights for policy 0, policy_version 81580 (0.0008) [2023-10-14 04:19:11,051][33226] Updated weights for policy 1, policy_version 82320 (0.0009) [2023-10-14 04:19:11,085][33201] Updated weights for policy 0, policy_version 81590 (0.0009) [2023-10-14 04:19:11,406][33226] Updated weights for policy 1, policy_version 82330 (0.0008) [2023-10-14 04:19:11,457][33201] Updated weights for policy 0, policy_version 81600 (0.0008) [2023-10-14 04:19:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 167870464. Throughput: 0: 1735.8, 1: 1793.2. Samples: 41973486. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:14,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:15,171][33226] Updated weights for policy 1, policy_version 82340 (0.0008) [2023-10-14 04:19:15,329][33201] Updated weights for policy 0, policy_version 81610 (0.0008) [2023-10-14 04:19:15,542][33226] Updated weights for policy 1, policy_version 82350 (0.0008) [2023-10-14 04:19:15,699][33201] Updated weights for policy 0, policy_version 81620 (0.0008) [2023-10-14 04:19:15,908][33226] Updated weights for policy 1, policy_version 82360 (0.0009) [2023-10-14 04:19:16,061][33201] Updated weights for policy 0, policy_version 81630 (0.0009) [2023-10-14 04:19:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 167936000. Throughput: 0: 1736.2, 1: 1798.4. Samples: 41995368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:19,774][33226] Updated weights for policy 1, policy_version 82370 (0.0008) [2023-10-14 04:19:19,993][33201] Updated weights for policy 0, policy_version 81640 (0.0007) [2023-10-14 04:19:20,149][33226] Updated weights for policy 1, policy_version 82380 (0.0008) [2023-10-14 04:19:20,350][33201] Updated weights for policy 0, policy_version 81650 (0.0008) [2023-10-14 04:19:20,524][33226] Updated weights for policy 1, policy_version 82390 (0.0007) [2023-10-14 04:19:20,718][33201] Updated weights for policy 0, policy_version 81660 (0.0008) [2023-10-14 04:19:20,885][33226] Updated weights for policy 1, policy_version 82400 (0.0008) [2023-10-14 04:19:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 168001536. Throughput: 0: 1766.9, 1: 1804.9. Samples: 42017600. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:24,609][33201] Updated weights for policy 0, policy_version 81670 (0.0008) [2023-10-14 04:19:24,644][33226] Updated weights for policy 1, policy_version 82410 (0.0008) [2023-10-14 04:19:24,983][33201] Updated weights for policy 0, policy_version 81680 (0.0007) [2023-10-14 04:19:25,001][33226] Updated weights for policy 1, policy_version 82420 (0.0008) [2023-10-14 04:19:25,342][33201] Updated weights for policy 0, policy_version 81690 (0.0009) [2023-10-14 04:19:25,368][33226] Updated weights for policy 1, policy_version 82430 (0.0007) [2023-10-14 04:19:25,441][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000082432_84410368.pth... [2023-10-14 04:19:25,470][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000080736_82673664.pth [2023-10-14 04:19:25,562][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000081696_83656704.pth... [2023-10-14 04:19:25,600][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000080032_81952768.pth [2023-10-14 04:19:29,059][33201] Updated weights for policy 0, policy_version 81700 (0.0008) [2023-10-14 04:19:29,397][33226] Updated weights for policy 1, policy_version 82440 (0.0009) [2023-10-14 04:19:29,425][33201] Updated weights for policy 0, policy_version 81710 (0.0007) [2023-10-14 04:19:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 168067072. Throughput: 0: 1733.8, 1: 1791.5. Samples: 42027068. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:29,778][33226] Updated weights for policy 1, policy_version 82450 (0.0008) [2023-10-14 04:19:29,793][33201] Updated weights for policy 0, policy_version 81720 (0.0007) [2023-10-14 04:19:30,155][33226] Updated weights for policy 1, policy_version 82460 (0.0009) [2023-10-14 04:19:33,657][33201] Updated weights for policy 0, policy_version 81730 (0.0009) [2023-10-14 04:19:33,791][33226] Updated weights for policy 1, policy_version 82470 (0.0008) [2023-10-14 04:19:34,019][33201] Updated weights for policy 0, policy_version 81740 (0.0008) [2023-10-14 04:19:34,168][33226] Updated weights for policy 1, policy_version 82480 (0.0009) [2023-10-14 04:19:34,394][33201] Updated weights for policy 0, policy_version 81750 (0.0007) [2023-10-14 04:19:34,535][33226] Updated weights for policy 1, policy_version 82490 (0.0008) [2023-10-14 04:19:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 168132608. Throughput: 0: 1767.5, 1: 1800.4. Samples: 42049226. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:34,760][33201] Updated weights for policy 0, policy_version 81760 (0.0008) [2023-10-14 04:19:38,329][33226] Updated weights for policy 1, policy_version 82500 (0.0008) [2023-10-14 04:19:38,620][33201] Updated weights for policy 0, policy_version 81770 (0.0007) [2023-10-14 04:19:38,693][33226] Updated weights for policy 1, policy_version 82510 (0.0009) [2023-10-14 04:19:38,989][33201] Updated weights for policy 0, policy_version 81780 (0.0007) [2023-10-14 04:19:39,060][33226] Updated weights for policy 1, policy_version 82520 (0.0008) [2023-10-14 04:19:39,366][33201] Updated weights for policy 0, policy_version 81790 (0.0008) [2023-10-14 04:19:39,557][31953] Fps is (10 sec: 19660.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 168263680. Throughput: 0: 1748.9, 1: 1806.3. Samples: 42069408. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:42,727][33226] Updated weights for policy 1, policy_version 82530 (0.0008) [2023-10-14 04:19:43,092][33226] Updated weights for policy 1, policy_version 82540 (0.0008) [2023-10-14 04:19:43,401][33201] Updated weights for policy 0, policy_version 81800 (0.0008) [2023-10-14 04:19:43,459][33226] Updated weights for policy 1, policy_version 82550 (0.0008) [2023-10-14 04:19:43,772][33201] Updated weights for policy 0, policy_version 81810 (0.0007) [2023-10-14 04:19:43,821][33226] Updated weights for policy 1, policy_version 82560 (0.0007) [2023-10-14 04:19:44,130][33201] Updated weights for policy 0, policy_version 81820 (0.0009) [2023-10-14 04:19:44,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 168329216. Throughput: 0: 1751.3, 1: 1794.6. Samples: 42080688. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:19:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:19:47,570][33226] Updated weights for policy 1, policy_version 82570 (0.0008) [2023-10-14 04:19:47,928][33226] Updated weights for policy 1, policy_version 82580 (0.0008) [2023-10-14 04:19:48,232][33201] Updated weights for policy 0, policy_version 81830 (0.0007) [2023-10-14 04:19:48,292][33226] Updated weights for policy 1, policy_version 82590 (0.0009) [2023-10-14 04:19:48,595][33201] Updated weights for policy 0, policy_version 81840 (0.0008) [2023-10-14 04:19:48,963][33201] Updated weights for policy 0, policy_version 81850 (0.0007) [2023-10-14 04:19:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 168394752. Throughput: 0: 1758.8, 1: 1800.4. Samples: 42101606. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:19:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:19:52,047][33226] Updated weights for policy 1, policy_version 82600 (0.0008) [2023-10-14 04:19:52,420][33226] Updated weights for policy 1, policy_version 82610 (0.0008) [2023-10-14 04:19:52,783][33201] Updated weights for policy 0, policy_version 81860 (0.0008) [2023-10-14 04:19:52,784][33226] Updated weights for policy 1, policy_version 82620 (0.0008) [2023-10-14 04:19:53,142][33201] Updated weights for policy 0, policy_version 81870 (0.0007) [2023-10-14 04:19:53,520][33201] Updated weights for policy 0, policy_version 81880 (0.0008) [2023-10-14 04:19:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 168460288. Throughput: 0: 1733.2, 1: 1792.0. Samples: 42122432. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:19:54,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 04:19:56,258][33226] Updated weights for policy 1, policy_version 82630 (0.0009) [2023-10-14 04:19:56,616][33226] Updated weights for policy 1, policy_version 82640 (0.0011) [2023-10-14 04:19:56,984][33226] Updated weights for policy 1, policy_version 82650 (0.0007) [2023-10-14 04:19:57,411][33201] Updated weights for policy 0, policy_version 81890 (0.0009) [2023-10-14 04:19:57,780][33201] Updated weights for policy 0, policy_version 81900 (0.0009) [2023-10-14 04:19:58,153][33201] Updated weights for policy 0, policy_version 81910 (0.0008) [2023-10-14 04:19:58,523][33201] Updated weights for policy 0, policy_version 81920 (0.0010) [2023-10-14 04:19:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 168525824. Throughput: 0: 1769.3, 1: 1796.9. Samples: 42133960. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:19:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 04:20:00,850][33226] Updated weights for policy 1, policy_version 82660 (0.0008) [2023-10-14 04:20:01,217][33226] Updated weights for policy 1, policy_version 82670 (0.0007) [2023-10-14 04:20:01,580][33226] Updated weights for policy 1, policy_version 82680 (0.0010) [2023-10-14 04:20:02,432][33201] Updated weights for policy 0, policy_version 81930 (0.0007) [2023-10-14 04:20:02,798][33201] Updated weights for policy 0, policy_version 81940 (0.0009) [2023-10-14 04:20:03,179][33201] Updated weights for policy 0, policy_version 81950 (0.0008) [2023-10-14 04:20:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 168591360. Throughput: 0: 1749.1, 1: 1786.8. Samples: 42154480. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 04:20:05,304][33226] Updated weights for policy 1, policy_version 82690 (0.0007) [2023-10-14 04:20:05,669][33226] Updated weights for policy 1, policy_version 82700 (0.0007) [2023-10-14 04:20:06,036][33226] Updated weights for policy 1, policy_version 82710 (0.0008) [2023-10-14 04:20:06,405][33226] Updated weights for policy 1, policy_version 82720 (0.0008) [2023-10-14 04:20:06,901][33201] Updated weights for policy 0, policy_version 81960 (0.0010) [2023-10-14 04:20:07,273][33201] Updated weights for policy 0, policy_version 81970 (0.0009) [2023-10-14 04:20:07,639][33201] Updated weights for policy 0, policy_version 81980 (0.0008) [2023-10-14 04:20:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 168656896. Throughput: 0: 1745.8, 1: 1787.7. Samples: 42176604. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:09,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:20:10,155][33226] Updated weights for policy 1, policy_version 82730 (0.0008) [2023-10-14 04:20:10,523][33226] Updated weights for policy 1, policy_version 82740 (0.0009) [2023-10-14 04:20:10,881][33226] Updated weights for policy 1, policy_version 82750 (0.0008) [2023-10-14 04:20:11,558][33201] Updated weights for policy 0, policy_version 81990 (0.0010) [2023-10-14 04:20:11,934][33201] Updated weights for policy 0, policy_version 82000 (0.0009) [2023-10-14 04:20:12,305][33201] Updated weights for policy 0, policy_version 82010 (0.0011) [2023-10-14 04:20:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 168722432. Throughput: 0: 1758.1, 1: 1788.7. Samples: 42186676. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:14,601][33226] Updated weights for policy 1, policy_version 82760 (0.0009) [2023-10-14 04:20:14,978][33226] Updated weights for policy 1, policy_version 82770 (0.0009) [2023-10-14 04:20:15,352][33226] Updated weights for policy 1, policy_version 82780 (0.0008) [2023-10-14 04:20:16,149][33201] Updated weights for policy 0, policy_version 82020 (0.0011) [2023-10-14 04:20:16,522][33201] Updated weights for policy 0, policy_version 82030 (0.0010) [2023-10-14 04:20:16,888][33201] Updated weights for policy 0, policy_version 82040 (0.0011) [2023-10-14 04:20:19,142][33226] Updated weights for policy 1, policy_version 82790 (0.0008) [2023-10-14 04:20:19,512][33226] Updated weights for policy 1, policy_version 82800 (0.0011) [2023-10-14 04:20:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 168787968. Throughput: 0: 1745.6, 1: 1791.2. Samples: 42208384. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:19,880][33226] Updated weights for policy 1, policy_version 82810 (0.0007) [2023-10-14 04:20:20,619][33201] Updated weights for policy 0, policy_version 82050 (0.0008) [2023-10-14 04:20:20,995][33201] Updated weights for policy 0, policy_version 82060 (0.0010) [2023-10-14 04:20:21,371][33201] Updated weights for policy 0, policy_version 82070 (0.0008) [2023-10-14 04:20:21,740][33201] Updated weights for policy 0, policy_version 82080 (0.0009) [2023-10-14 04:20:23,723][33226] Updated weights for policy 1, policy_version 82820 (0.0009) [2023-10-14 04:20:24,093][33226] Updated weights for policy 1, policy_version 82830 (0.0009) [2023-10-14 04:20:24,458][33226] Updated weights for policy 1, policy_version 82840 (0.0009) [2023-10-14 04:20:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 168853504. Throughput: 0: 1765.8, 1: 1799.5. Samples: 42229846. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:25,541][33201] Updated weights for policy 0, policy_version 82090 (0.0008) [2023-10-14 04:20:25,918][33201] Updated weights for policy 0, policy_version 82100 (0.0009) [2023-10-14 04:20:26,281][33201] Updated weights for policy 0, policy_version 82110 (0.0008) [2023-10-14 04:20:28,243][33226] Updated weights for policy 1, policy_version 82850 (0.0008) [2023-10-14 04:20:28,624][33226] Updated weights for policy 1, policy_version 82860 (0.0009) [2023-10-14 04:20:28,994][33226] Updated weights for policy 1, policy_version 82870 (0.0011) [2023-10-14 04:20:29,358][33226] Updated weights for policy 1, policy_version 82880 (0.0010) [2023-10-14 04:20:29,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 168951808. Throughput: 0: 1749.0, 1: 1793.4. 
Samples: 42240096. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:29,911][33201] Updated weights for policy 0, policy_version 82120 (0.0008) [2023-10-14 04:20:30,292][33201] Updated weights for policy 0, policy_version 82130 (0.0008) [2023-10-14 04:20:30,662][33201] Updated weights for policy 0, policy_version 82140 (0.0010) [2023-10-14 04:20:33,178][33226] Updated weights for policy 1, policy_version 82890 (0.0008) [2023-10-14 04:20:33,551][33226] Updated weights for policy 1, policy_version 82900 (0.0007) [2023-10-14 04:20:33,911][33226] Updated weights for policy 1, policy_version 82910 (0.0008) [2023-10-14 04:20:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 169017344. Throughput: 0: 1754.2, 1: 1813.8. Samples: 42262168. Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:34,625][33201] Updated weights for policy 0, policy_version 82150 (0.0007) [2023-10-14 04:20:34,988][33201] Updated weights for policy 0, policy_version 82160 (0.0008) [2023-10-14 04:20:35,359][33201] Updated weights for policy 0, policy_version 82170 (0.0008) [2023-10-14 04:20:37,774][33226] Updated weights for policy 1, policy_version 82920 (0.0009) [2023-10-14 04:20:38,141][33226] Updated weights for policy 1, policy_version 82930 (0.0010) [2023-10-14 04:20:38,512][33226] Updated weights for policy 1, policy_version 82940 (0.0009) [2023-10-14 04:20:39,377][33201] Updated weights for policy 0, policy_version 82180 (0.0009) [2023-10-14 04:20:39,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 169082880. Throughput: 0: 1779.5, 1: 1784.8. Samples: 42282828. 
Policy #0 lag: (min: 0.0, avg: 22.1, max: 32.0) [2023-10-14 04:20:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:39,751][33201] Updated weights for policy 0, policy_version 82190 (0.0009) [2023-10-14 04:20:40,119][33201] Updated weights for policy 0, policy_version 82200 (0.0009) [2023-10-14 04:20:42,166][33226] Updated weights for policy 1, policy_version 82950 (0.0007) [2023-10-14 04:20:42,533][33226] Updated weights for policy 1, policy_version 82960 (0.0007) [2023-10-14 04:20:42,900][33226] Updated weights for policy 1, policy_version 82970 (0.0008) [2023-10-14 04:20:43,888][33201] Updated weights for policy 0, policy_version 82210 (0.0007) [2023-10-14 04:20:44,262][33201] Updated weights for policy 0, policy_version 82220 (0.0008) [2023-10-14 04:20:44,557][31953] Fps is (10 sec: 13106.8, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 169148416. Throughput: 0: 1746.7, 1: 1812.4. Samples: 42294122. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:20:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.970')] [2023-10-14 04:20:44,641][33201] Updated weights for policy 0, policy_version 82230 (0.0007) [2023-10-14 04:20:45,010][33201] Updated weights for policy 0, policy_version 82240 (0.0008) [2023-10-14 04:20:46,758][33226] Updated weights for policy 1, policy_version 82980 (0.0009) [2023-10-14 04:20:47,130][33226] Updated weights for policy 1, policy_version 82990 (0.0008) [2023-10-14 04:20:47,487][33226] Updated weights for policy 1, policy_version 83000 (0.0008) [2023-10-14 04:20:48,877][33201] Updated weights for policy 0, policy_version 82250 (0.0008) [2023-10-14 04:20:49,239][33201] Updated weights for policy 0, policy_version 82260 (0.0009) [2023-10-14 04:20:49,557][31953] Fps is (10 sec: 13107.7, 60 sec: 13653.4, 300 sec: 14218.0). Total num frames: 169213952. Throughput: 0: 1774.2, 1: 1784.7. Samples: 42314630. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:20:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:20:49,608][33201] Updated weights for policy 0, policy_version 82270 (0.0008) [2023-10-14 04:20:51,246][33226] Updated weights for policy 1, policy_version 83010 (0.0009) [2023-10-14 04:20:51,615][33226] Updated weights for policy 1, policy_version 83020 (0.0009) [2023-10-14 04:20:51,979][33226] Updated weights for policy 1, policy_version 83030 (0.0009) [2023-10-14 04:20:52,344][33226] Updated weights for policy 1, policy_version 83040 (0.0008) [2023-10-14 04:20:53,434][33201] Updated weights for policy 0, policy_version 82280 (0.0008) [2023-10-14 04:20:53,804][33201] Updated weights for policy 0, policy_version 82290 (0.0009) [2023-10-14 04:20:54,187][33201] Updated weights for policy 0, policy_version 82300 (0.0010) [2023-10-14 04:20:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 169312256. Throughput: 0: 1754.2, 1: 1785.3. Samples: 42335882. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:20:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.970')] [2023-10-14 04:20:56,003][33226] Updated weights for policy 1, policy_version 83050 (0.0008) [2023-10-14 04:20:56,371][33226] Updated weights for policy 1, policy_version 83060 (0.0007) [2023-10-14 04:20:56,737][33226] Updated weights for policy 1, policy_version 83070 (0.0008) [2023-10-14 04:20:57,841][33201] Updated weights for policy 0, policy_version 82310 (0.0008) [2023-10-14 04:20:58,224][33201] Updated weights for policy 0, policy_version 82320 (0.0007) [2023-10-14 04:20:58,593][33201] Updated weights for policy 0, policy_version 82330 (0.0009) [2023-10-14 04:20:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169377792. Throughput: 0: 1771.6, 1: 1790.6. Samples: 42346974. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:20:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 04:21:00,380][33226] Updated weights for policy 1, policy_version 83080 (0.0007) [2023-10-14 04:21:00,742][33226] Updated weights for policy 1, policy_version 83090 (0.0007) [2023-10-14 04:21:01,105][33226] Updated weights for policy 1, policy_version 83100 (0.0009) [2023-10-14 04:21:02,373][33201] Updated weights for policy 0, policy_version 82340 (0.0007) [2023-10-14 04:21:02,741][33201] Updated weights for policy 0, policy_version 82350 (0.0009) [2023-10-14 04:21:03,101][33201] Updated weights for policy 0, policy_version 82360 (0.0010) [2023-10-14 04:21:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169443328. Throughput: 0: 1756.5, 1: 1792.5. Samples: 42368092. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 04:21:04,905][33226] Updated weights for policy 1, policy_version 83110 (0.0010) [2023-10-14 04:21:05,298][33226] Updated weights for policy 1, policy_version 83120 (0.0010) [2023-10-14 04:21:05,671][33226] Updated weights for policy 1, policy_version 83130 (0.0008) [2023-10-14 04:21:06,944][33201] Updated weights for policy 0, policy_version 82370 (0.0009) [2023-10-14 04:21:07,317][33201] Updated weights for policy 0, policy_version 82380 (0.0007) [2023-10-14 04:21:07,686][33201] Updated weights for policy 0, policy_version 82390 (0.0010) [2023-10-14 04:21:08,054][33201] Updated weights for policy 0, policy_version 82400 (0.0008) [2023-10-14 04:21:09,400][33226] Updated weights for policy 1, policy_version 83140 (0.0009) [2023-10-14 04:21:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169508864. Throughput: 0: 1750.0, 1: 1802.4. Samples: 42389704. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 04:21:09,768][33226] Updated weights for policy 1, policy_version 83150 (0.0009) [2023-10-14 04:21:10,135][33226] Updated weights for policy 1, policy_version 83160 (0.0008) [2023-10-14 04:21:11,977][33201] Updated weights for policy 0, policy_version 82410 (0.0008) [2023-10-14 04:21:12,343][33201] Updated weights for policy 0, policy_version 82420 (0.0007) [2023-10-14 04:21:12,714][33201] Updated weights for policy 0, policy_version 82430 (0.0007) [2023-10-14 04:21:14,052][33226] Updated weights for policy 1, policy_version 83170 (0.0008) [2023-10-14 04:21:14,419][33226] Updated weights for policy 1, policy_version 83180 (0.0008) [2023-10-14 04:21:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169574400. Throughput: 0: 1769.7, 1: 1789.9. Samples: 42400278. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:14,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:21:14,785][33226] Updated weights for policy 1, policy_version 83190 (0.0007) [2023-10-14 04:21:15,147][33226] Updated weights for policy 1, policy_version 83200 (0.0007) [2023-10-14 04:21:16,589][33201] Updated weights for policy 0, policy_version 82440 (0.0009) [2023-10-14 04:21:16,968][33201] Updated weights for policy 0, policy_version 82450 (0.0008) [2023-10-14 04:21:17,338][33201] Updated weights for policy 0, policy_version 82460 (0.0007) [2023-10-14 04:21:18,954][33226] Updated weights for policy 1, policy_version 83210 (0.0008) [2023-10-14 04:21:19,329][33226] Updated weights for policy 1, policy_version 83220 (0.0008) [2023-10-14 04:21:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 169639936. Throughput: 0: 1751.4, 1: 1786.8. Samples: 42421388. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:21:19,696][33226] Updated weights for policy 1, policy_version 83230 (0.0008) [2023-10-14 04:21:20,964][33201] Updated weights for policy 0, policy_version 82470 (0.0007) [2023-10-14 04:21:21,336][33201] Updated weights for policy 0, policy_version 82480 (0.0009) [2023-10-14 04:21:21,706][33201] Updated weights for policy 0, policy_version 82490 (0.0010) [2023-10-14 04:21:23,609][33226] Updated weights for policy 1, policy_version 83240 (0.0009) [2023-10-14 04:21:23,964][33226] Updated weights for policy 1, policy_version 83250 (0.0009) [2023-10-14 04:21:24,337][33226] Updated weights for policy 1, policy_version 83260 (0.0008) [2023-10-14 04:21:24,557][31953] Fps is (10 sec: 16383.2, 60 sec: 14745.5, 300 sec: 14218.0). Total num frames: 169738240. Throughput: 0: 1764.1, 1: 1793.0. Samples: 42442898. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:24,559][31953] Avg episode reward: [(0, '20.990'), (1, '20.960')] [2023-10-14 04:21:24,571][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000082496_84475904.pth... [2023-10-14 04:21:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000083264_85262336.pth... 
[2023-10-14 04:21:24,610][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000080864_82804736.pth [2023-10-14 04:21:24,615][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000081568_83525632.pth [2023-10-14 04:21:24,616][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000082496_84475904.pth [2023-10-14 04:21:24,619][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000083264_85262336.pth [2023-10-14 04:21:25,471][33201] Updated weights for policy 0, policy_version 82500 (0.0007) [2023-10-14 04:21:25,843][33201] Updated weights for policy 0, policy_version 82510 (0.0008) [2023-10-14 04:21:26,209][33201] Updated weights for policy 0, policy_version 82520 (0.0008) [2023-10-14 04:21:28,171][33226] Updated weights for policy 1, policy_version 83270 (0.0009) [2023-10-14 04:21:28,540][33226] Updated weights for policy 1, policy_version 83280 (0.0008) [2023-10-14 04:21:28,914][33226] Updated weights for policy 1, policy_version 83290 (0.0007) [2023-10-14 04:21:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169803776. Throughput: 0: 1762.7, 1: 1771.5. Samples: 42453162. 
Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:21:30,056][33201] Updated weights for policy 0, policy_version 82530 (0.0008) [2023-10-14 04:21:30,432][33201] Updated weights for policy 0, policy_version 82540 (0.0009) [2023-10-14 04:21:30,795][33201] Updated weights for policy 0, policy_version 82550 (0.0008) [2023-10-14 04:21:31,177][33201] Updated weights for policy 0, policy_version 82560 (0.0007) [2023-10-14 04:21:32,597][33226] Updated weights for policy 1, policy_version 83300 (0.0007) [2023-10-14 04:21:32,957][33226] Updated weights for policy 1, policy_version 83310 (0.0010) [2023-10-14 04:21:33,320][33226] Updated weights for policy 1, policy_version 83320 (0.0008) [2023-10-14 04:21:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 169869312. Throughput: 0: 1764.3, 1: 1793.6. Samples: 42474736. Policy #0 lag: (min: 31.0, avg: 42.4, max: 63.0) [2023-10-14 04:21:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:21:34,962][33201] Updated weights for policy 0, policy_version 82570 (0.0010) [2023-10-14 04:21:35,330][33201] Updated weights for policy 0, policy_version 82580 (0.0009) [2023-10-14 04:21:35,705][33201] Updated weights for policy 0, policy_version 82590 (0.0008) [2023-10-14 04:21:37,218][33226] Updated weights for policy 1, policy_version 83330 (0.0008) [2023-10-14 04:21:37,592][33226] Updated weights for policy 1, policy_version 83340 (0.0007) [2023-10-14 04:21:37,953][33226] Updated weights for policy 1, policy_version 83350 (0.0009) [2023-10-14 04:21:38,313][33226] Updated weights for policy 1, policy_version 83360 (0.0007) [2023-10-14 04:21:39,495][33201] Updated weights for policy 0, policy_version 82600 (0.0010) [2023-10-14 04:21:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 169934848. Throughput: 0: 1788.5, 1: 1768.4. 
Samples: 42495944. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:21:39,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:21:39,872][33201] Updated weights for policy 0, policy_version 82610 (0.0010) [2023-10-14 04:21:40,236][33201] Updated weights for policy 0, policy_version 82620 (0.0008) [2023-10-14 04:21:42,021][33226] Updated weights for policy 1, policy_version 83370 (0.0008) [2023-10-14 04:21:42,383][33226] Updated weights for policy 1, policy_version 83380 (0.0007) [2023-10-14 04:21:42,749][33226] Updated weights for policy 1, policy_version 83390 (0.0007) [2023-10-14 04:21:44,138][33201] Updated weights for policy 0, policy_version 82630 (0.0008) [2023-10-14 04:21:44,513][33201] Updated weights for policy 0, policy_version 82640 (0.0008) [2023-10-14 04:21:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170000384. Throughput: 0: 1760.7, 1: 1786.9. Samples: 42506614. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:21:44,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:21:44,888][33201] Updated weights for policy 0, policy_version 82650 (0.0010) [2023-10-14 04:21:46,579][33226] Updated weights for policy 1, policy_version 83400 (0.0007) [2023-10-14 04:21:46,950][33226] Updated weights for policy 1, policy_version 83410 (0.0010) [2023-10-14 04:21:47,319][33226] Updated weights for policy 1, policy_version 83420 (0.0007) [2023-10-14 04:21:48,605][33201] Updated weights for policy 0, policy_version 82660 (0.0008) [2023-10-14 04:21:48,975][33201] Updated weights for policy 0, policy_version 82670 (0.0008) [2023-10-14 04:21:49,335][33201] Updated weights for policy 0, policy_version 82680 (0.0008) [2023-10-14 04:21:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 170065920. Throughput: 0: 1789.6, 1: 1760.0. Samples: 42527824. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:21:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.960')] [2023-10-14 04:21:51,260][33226] Updated weights for policy 1, policy_version 83430 (0.0007) [2023-10-14 04:21:51,652][33226] Updated weights for policy 1, policy_version 83440 (0.0009) [2023-10-14 04:21:52,018][33226] Updated weights for policy 1, policy_version 83450 (0.0008) [2023-10-14 04:21:53,203][33201] Updated weights for policy 0, policy_version 82690 (0.0008) [2023-10-14 04:21:53,571][33201] Updated weights for policy 0, policy_version 82700 (0.0007) [2023-10-14 04:21:53,945][33201] Updated weights for policy 0, policy_version 82710 (0.0008) [2023-10-14 04:21:54,305][33201] Updated weights for policy 0, policy_version 82720 (0.0008) [2023-10-14 04:21:54,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170164224. Throughput: 0: 1774.9, 1: 1756.4. Samples: 42548612. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:21:54,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.980')] [2023-10-14 04:21:55,776][33226] Updated weights for policy 1, policy_version 83460 (0.0009) [2023-10-14 04:21:56,145][33226] Updated weights for policy 1, policy_version 83470 (0.0008) [2023-10-14 04:21:56,518][33226] Updated weights for policy 1, policy_version 83480 (0.0007) [2023-10-14 04:21:58,217][33201] Updated weights for policy 0, policy_version 82730 (0.0008) [2023-10-14 04:21:58,585][33201] Updated weights for policy 0, policy_version 82740 (0.0007) [2023-10-14 04:21:58,953][33201] Updated weights for policy 0, policy_version 82750 (0.0011) [2023-10-14 04:21:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170229760. Throughput: 0: 1777.5, 1: 1757.2. Samples: 42559342. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:21:59,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:00,365][33226] Updated weights for policy 1, policy_version 83490 (0.0009) [2023-10-14 04:22:00,730][33226] Updated weights for policy 1, policy_version 83500 (0.0009) [2023-10-14 04:22:01,105][33226] Updated weights for policy 1, policy_version 83510 (0.0008) [2023-10-14 04:22:01,462][33226] Updated weights for policy 1, policy_version 83520 (0.0007) [2023-10-14 04:22:02,861][33201] Updated weights for policy 0, policy_version 82760 (0.0008) [2023-10-14 04:22:03,226][33201] Updated weights for policy 0, policy_version 82770 (0.0009) [2023-10-14 04:22:03,609][33201] Updated weights for policy 0, policy_version 82780 (0.0008) [2023-10-14 04:22:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170295296. Throughput: 0: 1779.6, 1: 1762.6. Samples: 42580784. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:22:04,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:05,266][33226] Updated weights for policy 1, policy_version 83530 (0.0007) [2023-10-14 04:22:05,627][33226] Updated weights for policy 1, policy_version 83540 (0.0007) [2023-10-14 04:22:05,998][33226] Updated weights for policy 1, policy_version 83550 (0.0007) [2023-10-14 04:22:07,379][33201] Updated weights for policy 0, policy_version 82790 (0.0010) [2023-10-14 04:22:07,737][33201] Updated weights for policy 0, policy_version 82800 (0.0008) [2023-10-14 04:22:08,102][33201] Updated weights for policy 0, policy_version 82810 (0.0007) [2023-10-14 04:22:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170360832. Throughput: 0: 1752.5, 1: 1785.8. Samples: 42602122. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:22:09,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:09,689][33226] Updated weights for policy 1, policy_version 83560 (0.0008) [2023-10-14 04:22:10,056][33226] Updated weights for policy 1, policy_version 83570 (0.0007) [2023-10-14 04:22:10,429][33226] Updated weights for policy 1, policy_version 83580 (0.0011) [2023-10-14 04:22:11,954][33201] Updated weights for policy 0, policy_version 82820 (0.0011) [2023-10-14 04:22:12,331][33201] Updated weights for policy 0, policy_version 82830 (0.0010) [2023-10-14 04:22:12,697][33201] Updated weights for policy 0, policy_version 82840 (0.0008) [2023-10-14 04:22:14,191][33226] Updated weights for policy 1, policy_version 83590 (0.0008) [2023-10-14 04:22:14,556][33226] Updated weights for policy 1, policy_version 83600 (0.0009) [2023-10-14 04:22:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 170426368. Throughput: 0: 1780.3, 1: 1767.7. Samples: 42612820. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:22:14,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:14,928][33226] Updated weights for policy 1, policy_version 83610 (0.0009) [2023-10-14 04:22:16,465][33201] Updated weights for policy 0, policy_version 82850 (0.0007) [2023-10-14 04:22:16,849][33201] Updated weights for policy 0, policy_version 82860 (0.0011) [2023-10-14 04:22:17,215][33201] Updated weights for policy 0, policy_version 82870 (0.0008) [2023-10-14 04:22:17,577][33201] Updated weights for policy 0, policy_version 82880 (0.0007) [2023-10-14 04:22:18,783][33226] Updated weights for policy 1, policy_version 83620 (0.0009) [2023-10-14 04:22:19,154][33226] Updated weights for policy 1, policy_version 83630 (0.0008) [2023-10-14 04:22:19,514][33226] Updated weights for policy 1, policy_version 83640 (0.0009) [2023-10-14 04:22:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 170491904. Throughput: 0: 1756.4, 1: 1782.0. Samples: 42633962. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:22:19,562][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:21,362][33201] Updated weights for policy 0, policy_version 82890 (0.0008) [2023-10-14 04:22:21,724][33201] Updated weights for policy 0, policy_version 82900 (0.0007) [2023-10-14 04:22:22,098][33201] Updated weights for policy 0, policy_version 82910 (0.0007) [2023-10-14 04:22:23,224][33226] Updated weights for policy 1, policy_version 83650 (0.0008) [2023-10-14 04:22:23,587][33226] Updated weights for policy 1, policy_version 83660 (0.0007) [2023-10-14 04:22:23,960][33226] Updated weights for policy 1, policy_version 83670 (0.0011) [2023-10-14 04:22:24,335][33226] Updated weights for policy 1, policy_version 83680 (0.0009) [2023-10-14 04:22:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170590208. Throughput: 0: 1760.0, 1: 1784.3. 
Samples: 42655438. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:22:24,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.990')] [2023-10-14 04:22:25,788][33201] Updated weights for policy 0, policy_version 82920 (0.0009) [2023-10-14 04:22:26,161][33201] Updated weights for policy 0, policy_version 82930 (0.0007) [2023-10-14 04:22:26,533][33201] Updated weights for policy 0, policy_version 82940 (0.0009) [2023-10-14 04:22:28,049][33226] Updated weights for policy 1, policy_version 83690 (0.0010) [2023-10-14 04:22:28,410][33226] Updated weights for policy 1, policy_version 83700 (0.0008) [2023-10-14 04:22:28,773][33226] Updated weights for policy 1, policy_version 83710 (0.0007) [2023-10-14 04:22:29,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170655744. Throughput: 0: 1763.2, 1: 1783.9. Samples: 42666230. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:29,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.980')] [2023-10-14 04:22:30,290][33201] Updated weights for policy 0, policy_version 82950 (0.0008) [2023-10-14 04:22:30,663][33201] Updated weights for policy 0, policy_version 82960 (0.0007) [2023-10-14 04:22:31,045][33201] Updated weights for policy 0, policy_version 82970 (0.0008) [2023-10-14 04:22:32,583][33226] Updated weights for policy 1, policy_version 83720 (0.0009) [2023-10-14 04:22:32,956][33226] Updated weights for policy 1, policy_version 83730 (0.0007) [2023-10-14 04:22:33,324][33226] Updated weights for policy 1, policy_version 83740 (0.0007) [2023-10-14 04:22:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 170721280. Throughput: 0: 1763.2, 1: 1791.3. Samples: 42687774. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:34,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.980')] [2023-10-14 04:22:35,080][33201] Updated weights for policy 0, policy_version 82980 (0.0007) [2023-10-14 04:22:35,456][33201] Updated weights for policy 0, policy_version 82990 (0.0007) [2023-10-14 04:22:35,831][33201] Updated weights for policy 0, policy_version 83000 (0.0007) [2023-10-14 04:22:37,170][33226] Updated weights for policy 1, policy_version 83750 (0.0009) [2023-10-14 04:22:37,560][33226] Updated weights for policy 1, policy_version 83760 (0.0009) [2023-10-14 04:22:37,926][33226] Updated weights for policy 1, policy_version 83770 (0.0010) [2023-10-14 04:22:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 170786816. Throughput: 0: 1787.8, 1: 1782.8. Samples: 42709288. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.980')] [2023-10-14 04:22:39,619][33201] Updated weights for policy 0, policy_version 83010 (0.0009) [2023-10-14 04:22:39,989][33201] Updated weights for policy 0, policy_version 83020 (0.0008) [2023-10-14 04:22:40,363][33201] Updated weights for policy 0, policy_version 83030 (0.0008) [2023-10-14 04:22:40,738][33201] Updated weights for policy 0, policy_version 83040 (0.0007) [2023-10-14 04:22:41,703][33226] Updated weights for policy 1, policy_version 83780 (0.0010) [2023-10-14 04:22:42,073][33226] Updated weights for policy 1, policy_version 83790 (0.0008) [2023-10-14 04:22:42,430][33226] Updated weights for policy 1, policy_version 83800 (0.0009) [2023-10-14 04:22:44,547][33201] Updated weights for policy 0, policy_version 83050 (0.0007) [2023-10-14 04:22:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 170852352. Throughput: 0: 1761.9, 1: 1802.1. Samples: 42719722. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:44,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.990')] [2023-10-14 04:22:44,907][33201] Updated weights for policy 0, policy_version 83060 (0.0009) [2023-10-14 04:22:45,283][33201] Updated weights for policy 0, policy_version 83070 (0.0007) [2023-10-14 04:22:45,977][33226] Updated weights for policy 1, policy_version 83810 (0.0009) [2023-10-14 04:22:46,338][33226] Updated weights for policy 1, policy_version 83820 (0.0007) [2023-10-14 04:22:46,705][33226] Updated weights for policy 1, policy_version 83830 (0.0007) [2023-10-14 04:22:47,080][33226] Updated weights for policy 1, policy_version 83840 (0.0008) [2023-10-14 04:22:49,100][33201] Updated weights for policy 0, policy_version 83080 (0.0008) [2023-10-14 04:22:49,476][33201] Updated weights for policy 0, policy_version 83090 (0.0010) [2023-10-14 04:22:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 170917888. Throughput: 0: 1779.4, 1: 1781.8. Samples: 42741036. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:49,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.980')] [2023-10-14 04:22:49,848][33201] Updated weights for policy 0, policy_version 83100 (0.0009) [2023-10-14 04:22:50,906][33226] Updated weights for policy 1, policy_version 83850 (0.0009) [2023-10-14 04:22:51,269][33226] Updated weights for policy 1, policy_version 83860 (0.0010) [2023-10-14 04:22:51,636][33226] Updated weights for policy 1, policy_version 83870 (0.0010) [2023-10-14 04:22:53,705][33201] Updated weights for policy 0, policy_version 83110 (0.0009) [2023-10-14 04:22:54,081][33201] Updated weights for policy 0, policy_version 83120 (0.0008) [2023-10-14 04:22:54,444][33201] Updated weights for policy 0, policy_version 83130 (0.0008) [2023-10-14 04:22:54,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 170983424. Throughput: 0: 1781.3, 1: 1782.4. 
Samples: 42762490. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:54,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.870')] [2023-10-14 04:22:55,488][33226] Updated weights for policy 1, policy_version 83880 (0.0009) [2023-10-14 04:22:55,848][33226] Updated weights for policy 1, policy_version 83890 (0.0011) [2023-10-14 04:22:56,219][33226] Updated weights for policy 1, policy_version 83900 (0.0008) [2023-10-14 04:22:58,185][33201] Updated weights for policy 0, policy_version 83140 (0.0007) [2023-10-14 04:22:58,546][33201] Updated weights for policy 0, policy_version 83150 (0.0011) [2023-10-14 04:22:58,912][33201] Updated weights for policy 0, policy_version 83160 (0.0008) [2023-10-14 04:22:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 171081728. Throughput: 0: 1773.2, 1: 1784.3. Samples: 42772904. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:22:59,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:22:59,899][33226] Updated weights for policy 1, policy_version 83910 (0.0009) [2023-10-14 04:23:00,261][33226] Updated weights for policy 1, policy_version 83920 (0.0010) [2023-10-14 04:23:00,629][33226] Updated weights for policy 1, policy_version 83930 (0.0009) [2023-10-14 04:23:02,758][33201] Updated weights for policy 0, policy_version 83170 (0.0009) [2023-10-14 04:23:03,123][33201] Updated weights for policy 0, policy_version 83180 (0.0007) [2023-10-14 04:23:03,498][33201] Updated weights for policy 0, policy_version 83190 (0.0008) [2023-10-14 04:23:03,861][33201] Updated weights for policy 0, policy_version 83200 (0.0008) [2023-10-14 04:23:04,457][33226] Updated weights for policy 1, policy_version 83940 (0.0008) [2023-10-14 04:23:04,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 171147264. Throughput: 0: 1785.7, 1: 1788.8. Samples: 42794814. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:04,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:23:04,832][33226] Updated weights for policy 1, policy_version 83950 (0.0009) [2023-10-14 04:23:05,194][33226] Updated weights for policy 1, policy_version 83960 (0.0011) [2023-10-14 04:23:07,630][33201] Updated weights for policy 0, policy_version 83210 (0.0009) [2023-10-14 04:23:07,992][33201] Updated weights for policy 0, policy_version 83220 (0.0010) [2023-10-14 04:23:08,369][33201] Updated weights for policy 0, policy_version 83230 (0.0010) [2023-10-14 04:23:08,871][33226] Updated weights for policy 1, policy_version 83970 (0.0009) [2023-10-14 04:23:09,231][33226] Updated weights for policy 1, policy_version 83980 (0.0009) [2023-10-14 04:23:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 171212800. Throughput: 0: 1760.7, 1: 1806.7. Samples: 42815968. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:23:09,599][33226] Updated weights for policy 1, policy_version 83990 (0.0007) [2023-10-14 04:23:09,957][33226] Updated weights for policy 1, policy_version 84000 (0.0007) [2023-10-14 04:23:12,141][33201] Updated weights for policy 0, policy_version 83240 (0.0009) [2023-10-14 04:23:12,509][33201] Updated weights for policy 0, policy_version 83250 (0.0008) [2023-10-14 04:23:12,883][33201] Updated weights for policy 0, policy_version 83260 (0.0007) [2023-10-14 04:23:13,760][33226] Updated weights for policy 1, policy_version 84010 (0.0010) [2023-10-14 04:23:14,122][33226] Updated weights for policy 1, policy_version 84020 (0.0011) [2023-10-14 04:23:14,493][33226] Updated weights for policy 1, policy_version 84030 (0.0009) [2023-10-14 04:23:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 171278336. Throughput: 0: 1788.2, 1: 1787.1. 
Samples: 42827118. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:14,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:23:16,633][33201] Updated weights for policy 0, policy_version 83270 (0.0008) [2023-10-14 04:23:17,007][33201] Updated weights for policy 0, policy_version 83280 (0.0008) [2023-10-14 04:23:17,385][33201] Updated weights for policy 0, policy_version 83290 (0.0009) [2023-10-14 04:23:18,337][33226] Updated weights for policy 1, policy_version 84040 (0.0009) [2023-10-14 04:23:18,699][33226] Updated weights for policy 1, policy_version 84050 (0.0007) [2023-10-14 04:23:19,066][33226] Updated weights for policy 1, policy_version 84060 (0.0007) [2023-10-14 04:23:19,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 171376640. Throughput: 0: 1761.6, 1: 1803.3. Samples: 42848198. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:19,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:23:21,244][33201] Updated weights for policy 0, policy_version 83300 (0.0009) [2023-10-14 04:23:21,628][33201] Updated weights for policy 0, policy_version 83310 (0.0009) [2023-10-14 04:23:22,006][33201] Updated weights for policy 0, policy_version 83320 (0.0008) [2023-10-14 04:23:23,064][33226] Updated weights for policy 1, policy_version 84070 (0.0010) [2023-10-14 04:23:23,451][33226] Updated weights for policy 1, policy_version 84080 (0.0011) [2023-10-14 04:23:23,828][33226] Updated weights for policy 1, policy_version 84090 (0.0008) [2023-10-14 04:23:24,557][31953] Fps is (10 sec: 16383.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 171442176. Throughput: 0: 1764.1, 1: 1783.1. Samples: 42868916. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:24,559][31953] Avg episode reward: [(0, '20.980'), (1, '20.840')] [2023-10-14 04:23:24,569][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000083328_85327872.pth... [2023-10-14 04:23:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000084096_86114304.pth... [2023-10-14 04:23:24,610][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000081696_83656704.pth [2023-10-14 04:23:24,611][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000082432_84410368.pth [2023-10-14 04:23:25,694][33201] Updated weights for policy 0, policy_version 83330 (0.0008) [2023-10-14 04:23:26,061][33201] Updated weights for policy 0, policy_version 83340 (0.0009) [2023-10-14 04:23:26,431][33201] Updated weights for policy 0, policy_version 83350 (0.0007) [2023-10-14 04:23:26,792][33201] Updated weights for policy 0, policy_version 83360 (0.0007) [2023-10-14 04:23:27,515][33226] Updated weights for policy 1, policy_version 84100 (0.0009) [2023-10-14 04:23:27,886][33226] Updated weights for policy 1, policy_version 84110 (0.0008) [2023-10-14 04:23:28,255][33226] Updated weights for policy 1, policy_version 84120 (0.0008) [2023-10-14 04:23:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 171507712. Throughput: 0: 1763.1, 1: 1790.6. Samples: 42879640. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.840')] [2023-10-14 04:23:30,604][33201] Updated weights for policy 0, policy_version 83370 (0.0008) [2023-10-14 04:23:30,973][33201] Updated weights for policy 0, policy_version 83380 (0.0007) [2023-10-14 04:23:31,340][33201] Updated weights for policy 0, policy_version 83390 (0.0008) [2023-10-14 04:23:31,977][33226] Updated weights for policy 1, policy_version 84130 (0.0008) [2023-10-14 04:23:32,341][33226] Updated weights for policy 1, policy_version 84140 (0.0009) [2023-10-14 04:23:32,714][33226] Updated weights for policy 1, policy_version 84150 (0.0012) [2023-10-14 04:23:33,088][33226] Updated weights for policy 1, policy_version 84160 (0.0011) [2023-10-14 04:23:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 171573248. Throughput: 0: 1764.2, 1: 1788.3. Samples: 42900898. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:34,559][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:23:35,130][33201] Updated weights for policy 0, policy_version 83400 (0.0007) [2023-10-14 04:23:35,497][33201] Updated weights for policy 0, policy_version 83410 (0.0008) [2023-10-14 04:23:35,870][33201] Updated weights for policy 0, policy_version 83420 (0.0007) [2023-10-14 04:23:36,930][33226] Updated weights for policy 1, policy_version 84170 (0.0010) [2023-10-14 04:23:37,307][33226] Updated weights for policy 1, policy_version 84180 (0.0009) [2023-10-14 04:23:37,674][33226] Updated weights for policy 1, policy_version 84190 (0.0011) [2023-10-14 04:23:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 171638784. Throughput: 0: 1785.5, 1: 1778.2. Samples: 42922854. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:23:39,658][33201] Updated weights for policy 0, policy_version 83430 (0.0008) [2023-10-14 04:23:40,037][33201] Updated weights for policy 0, policy_version 83440 (0.0008) [2023-10-14 04:23:40,412][33201] Updated weights for policy 0, policy_version 83450 (0.0008) [2023-10-14 04:23:41,477][33226] Updated weights for policy 1, policy_version 84200 (0.0009) [2023-10-14 04:23:41,849][33226] Updated weights for policy 1, policy_version 84210 (0.0007) [2023-10-14 04:23:42,209][33226] Updated weights for policy 1, policy_version 84220 (0.0009) [2023-10-14 04:23:44,074][33201] Updated weights for policy 0, policy_version 83460 (0.0007) [2023-10-14 04:23:44,443][33201] Updated weights for policy 0, policy_version 83470 (0.0007) [2023-10-14 04:23:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 171704320. Throughput: 0: 1772.4, 1: 1790.4. Samples: 42933232. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:23:44,814][33201] Updated weights for policy 0, policy_version 83480 (0.0007) [2023-10-14 04:23:45,901][33226] Updated weights for policy 1, policy_version 84230 (0.0008) [2023-10-14 04:23:46,268][33226] Updated weights for policy 1, policy_version 84240 (0.0009) [2023-10-14 04:23:46,639][33226] Updated weights for policy 1, policy_version 84250 (0.0009) [2023-10-14 04:23:48,644][33201] Updated weights for policy 0, policy_version 83490 (0.0007) [2023-10-14 04:23:49,004][33201] Updated weights for policy 0, policy_version 83500 (0.0009) [2023-10-14 04:23:49,373][33201] Updated weights for policy 0, policy_version 83510 (0.0007) [2023-10-14 04:23:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 171769856. Throughput: 0: 1786.4, 1: 1769.9. 
Samples: 42954848. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:23:49,745][33201] Updated weights for policy 0, policy_version 83520 (0.0008) [2023-10-14 04:23:50,505][33226] Updated weights for policy 1, policy_version 84260 (0.0008) [2023-10-14 04:23:50,860][33226] Updated weights for policy 1, policy_version 84270 (0.0010) [2023-10-14 04:23:51,237][33226] Updated weights for policy 1, policy_version 84280 (0.0008) [2023-10-14 04:23:53,625][33201] Updated weights for policy 0, policy_version 83530 (0.0007) [2023-10-14 04:23:53,995][33201] Updated weights for policy 0, policy_version 83540 (0.0008) [2023-10-14 04:23:54,372][33201] Updated weights for policy 0, policy_version 83550 (0.0008) [2023-10-14 04:23:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 171868160. Throughput: 0: 1784.0, 1: 1774.8. Samples: 42976114. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:23:54,907][33226] Updated weights for policy 1, policy_version 84290 (0.0010) [2023-10-14 04:23:55,272][33226] Updated weights for policy 1, policy_version 84300 (0.0011) [2023-10-14 04:23:55,646][33226] Updated weights for policy 1, policy_version 84310 (0.0009) [2023-10-14 04:23:56,007][33226] Updated weights for policy 1, policy_version 84320 (0.0007) [2023-10-14 04:23:58,378][33201] Updated weights for policy 0, policy_version 83560 (0.0010) [2023-10-14 04:23:58,738][33201] Updated weights for policy 0, policy_version 83570 (0.0007) [2023-10-14 04:23:59,113][33201] Updated weights for policy 0, policy_version 83580 (0.0009) [2023-10-14 04:23:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 171933696. Throughput: 0: 1775.6, 1: 1768.7. Samples: 42986610. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:23:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:24:00,033][33226] Updated weights for policy 1, policy_version 84330 (0.0009) [2023-10-14 04:24:00,393][33226] Updated weights for policy 1, policy_version 84340 (0.0008) [2023-10-14 04:24:00,761][33226] Updated weights for policy 1, policy_version 84350 (0.0008) [2023-10-14 04:24:02,984][33201] Updated weights for policy 0, policy_version 83590 (0.0008) [2023-10-14 04:24:03,344][33201] Updated weights for policy 0, policy_version 83600 (0.0007) [2023-10-14 04:24:03,712][33201] Updated weights for policy 0, policy_version 83610 (0.0010) [2023-10-14 04:24:04,478][33226] Updated weights for policy 1, policy_version 84360 (0.0009) [2023-10-14 04:24:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 171999232. Throughput: 0: 1787.5, 1: 1768.4. Samples: 43008214. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:24:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:24:04,839][33226] Updated weights for policy 1, policy_version 84370 (0.0008) [2023-10-14 04:24:05,202][33226] Updated weights for policy 1, policy_version 84380 (0.0008) [2023-10-14 04:24:07,562][33201] Updated weights for policy 0, policy_version 83620 (0.0009) [2023-10-14 04:24:07,949][33201] Updated weights for policy 0, policy_version 83630 (0.0008) [2023-10-14 04:24:08,320][33201] Updated weights for policy 0, policy_version 83640 (0.0008) [2023-10-14 04:24:09,029][33226] Updated weights for policy 1, policy_version 84390 (0.0007) [2023-10-14 04:24:09,404][33226] Updated weights for policy 1, policy_version 84400 (0.0010) [2023-10-14 04:24:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 172064768. Throughput: 0: 1758.7, 1: 1794.1. Samples: 43028790. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:24:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.840')] [2023-10-14 04:24:09,767][33226] Updated weights for policy 1, policy_version 84410 (0.0008) [2023-10-14 04:24:12,056][33201] Updated weights for policy 0, policy_version 83650 (0.0008) [2023-10-14 04:24:12,431][33201] Updated weights for policy 0, policy_version 83660 (0.0007) [2023-10-14 04:24:12,794][33201] Updated weights for policy 0, policy_version 83670 (0.0008) [2023-10-14 04:24:13,157][33201] Updated weights for policy 0, policy_version 83680 (0.0007) [2023-10-14 04:24:13,328][33226] Updated weights for policy 1, policy_version 84420 (0.0008) [2023-10-14 04:24:13,698][33226] Updated weights for policy 1, policy_version 84430 (0.0009) [2023-10-14 04:24:14,051][33226] Updated weights for policy 1, policy_version 84440 (0.0008) [2023-10-14 04:24:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 172163072. Throughput: 0: 1791.8, 1: 1774.1. Samples: 43040104. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:24:14,557][31953] Avg episode reward: [(0, '20.940'), (1, '20.840')] [2023-10-14 04:24:16,952][33201] Updated weights for policy 0, policy_version 83690 (0.0010) [2023-10-14 04:24:17,315][33201] Updated weights for policy 0, policy_version 83700 (0.0010) [2023-10-14 04:24:17,679][33201] Updated weights for policy 0, policy_version 83710 (0.0010) [2023-10-14 04:24:17,898][33226] Updated weights for policy 1, policy_version 84450 (0.0008) [2023-10-14 04:24:18,263][33226] Updated weights for policy 1, policy_version 84460 (0.0009) [2023-10-14 04:24:18,634][33226] Updated weights for policy 1, policy_version 84470 (0.0008) [2023-10-14 04:24:18,997][33226] Updated weights for policy 1, policy_version 84480 (0.0010) [2023-10-14 04:24:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 172228608. Throughput: 0: 1762.8, 1: 1796.5. 
Samples: 43061064. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.850')] [2023-10-14 04:24:21,442][33201] Updated weights for policy 0, policy_version 83720 (0.0010) [2023-10-14 04:24:21,816][33201] Updated weights for policy 0, policy_version 83730 (0.0010) [2023-10-14 04:24:22,180][33201] Updated weights for policy 0, policy_version 83740 (0.0010) [2023-10-14 04:24:22,863][33226] Updated weights for policy 1, policy_version 84490 (0.0010) [2023-10-14 04:24:23,231][33226] Updated weights for policy 1, policy_version 84500 (0.0007) [2023-10-14 04:24:23,590][33226] Updated weights for policy 1, policy_version 84510 (0.0011) [2023-10-14 04:24:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 172294144. Throughput: 0: 1761.1, 1: 1776.7. Samples: 43082054. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.850')] [2023-10-14 04:24:26,056][33201] Updated weights for policy 0, policy_version 83750 (0.0009) [2023-10-14 04:24:26,428][33201] Updated weights for policy 0, policy_version 83760 (0.0010) [2023-10-14 04:24:26,806][33201] Updated weights for policy 0, policy_version 83770 (0.0010) [2023-10-14 04:24:27,359][33226] Updated weights for policy 1, policy_version 84520 (0.0008) [2023-10-14 04:24:27,729][33226] Updated weights for policy 1, policy_version 84530 (0.0007) [2023-10-14 04:24:28,091][33226] Updated weights for policy 1, policy_version 84540 (0.0008) [2023-10-14 04:24:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 172359680. Throughput: 0: 1752.4, 1: 1797.5. Samples: 43092976. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.850')] [2023-10-14 04:24:30,805][33201] Updated weights for policy 0, policy_version 83780 (0.0008) [2023-10-14 04:24:31,170][33201] Updated weights for policy 0, policy_version 83790 (0.0007) [2023-10-14 04:24:31,544][33201] Updated weights for policy 0, policy_version 83800 (0.0008) [2023-10-14 04:24:31,815][33226] Updated weights for policy 1, policy_version 84550 (0.0009) [2023-10-14 04:24:32,182][33226] Updated weights for policy 1, policy_version 84560 (0.0008) [2023-10-14 04:24:32,550][33226] Updated weights for policy 1, policy_version 84570 (0.0009) [2023-10-14 04:24:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 172425216. Throughput: 0: 1743.9, 1: 1783.9. Samples: 43113596. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.850')] [2023-10-14 04:24:35,423][33201] Updated weights for policy 0, policy_version 83810 (0.0008) [2023-10-14 04:24:35,793][33201] Updated weights for policy 0, policy_version 83820 (0.0010) [2023-10-14 04:24:36,159][33201] Updated weights for policy 0, policy_version 83830 (0.0009) [2023-10-14 04:24:36,405][33226] Updated weights for policy 1, policy_version 84580 (0.0008) [2023-10-14 04:24:36,526][33201] Updated weights for policy 0, policy_version 83840 (0.0009) [2023-10-14 04:24:36,762][33226] Updated weights for policy 1, policy_version 84590 (0.0008) [2023-10-14 04:24:37,128][33226] Updated weights for policy 1, policy_version 84600 (0.0008) [2023-10-14 04:24:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 172490752. Throughput: 0: 1765.1, 1: 1780.1. Samples: 43135648. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:39,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.930')] [2023-10-14 04:24:40,424][33201] Updated weights for policy 0, policy_version 83850 (0.0007) [2023-10-14 04:24:40,741][33226] Updated weights for policy 1, policy_version 84610 (0.0008) [2023-10-14 04:24:40,794][33201] Updated weights for policy 0, policy_version 83860 (0.0008) [2023-10-14 04:24:41,104][33226] Updated weights for policy 1, policy_version 84620 (0.0008) [2023-10-14 04:24:41,165][33201] Updated weights for policy 0, policy_version 83870 (0.0008) [2023-10-14 04:24:41,469][33226] Updated weights for policy 1, policy_version 84630 (0.0008) [2023-10-14 04:24:41,837][33226] Updated weights for policy 1, policy_version 84640 (0.0010) [2023-10-14 04:24:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 172556288. Throughput: 0: 1741.3, 1: 1785.9. Samples: 43145332. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:44,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 04:24:45,046][33201] Updated weights for policy 0, policy_version 83880 (0.0009) [2023-10-14 04:24:45,420][33201] Updated weights for policy 0, policy_version 83890 (0.0009) [2023-10-14 04:24:45,616][33226] Updated weights for policy 1, policy_version 84650 (0.0007) [2023-10-14 04:24:45,789][33201] Updated weights for policy 0, policy_version 83900 (0.0010) [2023-10-14 04:24:45,993][33226] Updated weights for policy 1, policy_version 84660 (0.0009) [2023-10-14 04:24:46,371][33226] Updated weights for policy 1, policy_version 84670 (0.0007) [2023-10-14 04:24:49,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 172621824. Throughput: 0: 1749.4, 1: 1788.2. Samples: 43167404. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:49,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 04:24:49,630][33201] Updated weights for policy 0, policy_version 83910 (0.0008) [2023-10-14 04:24:50,007][33201] Updated weights for policy 0, policy_version 83920 (0.0009) [2023-10-14 04:24:50,183][33226] Updated weights for policy 1, policy_version 84680 (0.0008) [2023-10-14 04:24:50,381][33201] Updated weights for policy 0, policy_version 83930 (0.0008) [2023-10-14 04:24:50,555][33226] Updated weights for policy 1, policy_version 84690 (0.0009) [2023-10-14 04:24:50,913][33226] Updated weights for policy 1, policy_version 84700 (0.0009) [2023-10-14 04:24:54,245][33201] Updated weights for policy 0, policy_version 83940 (0.0009) [2023-10-14 04:24:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 172687360. Throughput: 0: 1775.4, 1: 1791.4. Samples: 43189296. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 04:24:54,623][33201] Updated weights for policy 0, policy_version 83950 (0.0008) [2023-10-14 04:24:54,783][33226] Updated weights for policy 1, policy_version 84710 (0.0008) [2023-10-14 04:24:54,994][33201] Updated weights for policy 0, policy_version 83960 (0.0008) [2023-10-14 04:24:55,172][33226] Updated weights for policy 1, policy_version 84720 (0.0009) [2023-10-14 04:24:55,540][33226] Updated weights for policy 1, policy_version 84730 (0.0009) [2023-10-14 04:24:58,705][33201] Updated weights for policy 0, policy_version 83970 (0.0007) [2023-10-14 04:24:59,073][33201] Updated weights for policy 0, policy_version 83980 (0.0007) [2023-10-14 04:24:59,431][33226] Updated weights for policy 1, policy_version 84740 (0.0010) [2023-10-14 04:24:59,447][33201] Updated weights for policy 0, policy_version 83990 (0.0008) [2023-10-14 04:24:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 
13653.3, 300 sec: 14106.9). Total num frames: 172752896. Throughput: 0: 1745.6, 1: 1778.4. Samples: 43198686. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:24:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 04:24:59,803][33226] Updated weights for policy 1, policy_version 84750 (0.0007) [2023-10-14 04:24:59,813][33201] Updated weights for policy 0, policy_version 84000 (0.0010) [2023-10-14 04:25:00,163][33226] Updated weights for policy 1, policy_version 84760 (0.0008) [2023-10-14 04:25:03,612][33201] Updated weights for policy 0, policy_version 84010 (0.0007) [2023-10-14 04:25:03,885][33226] Updated weights for policy 1, policy_version 84770 (0.0008) [2023-10-14 04:25:03,981][33201] Updated weights for policy 0, policy_version 84020 (0.0009) [2023-10-14 04:25:04,245][33226] Updated weights for policy 1, policy_version 84780 (0.0007) [2023-10-14 04:25:04,352][33201] Updated weights for policy 0, policy_version 84030 (0.0008) [2023-10-14 04:25:04,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 172851200. Throughput: 0: 1773.8, 1: 1777.0. Samples: 43220848. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:25:04,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 04:25:04,607][33226] Updated weights for policy 1, policy_version 84790 (0.0009) [2023-10-14 04:25:04,975][33226] Updated weights for policy 1, policy_version 84800 (0.0010) [2023-10-14 04:25:08,243][33201] Updated weights for policy 0, policy_version 84040 (0.0008) [2023-10-14 04:25:08,619][33201] Updated weights for policy 0, policy_version 84050 (0.0007) [2023-10-14 04:25:08,777][33226] Updated weights for policy 1, policy_version 84810 (0.0007) [2023-10-14 04:25:08,987][33201] Updated weights for policy 0, policy_version 84060 (0.0008) [2023-10-14 04:25:09,141][33226] Updated weights for policy 1, policy_version 84820 (0.0008) [2023-10-14 04:25:09,516][33226] Updated weights for policy 1, policy_version 84830 (0.0010) [2023-10-14 04:25:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 172916736. Throughput: 0: 1739.5, 1: 1793.2. Samples: 43241024. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:25:09,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.980')] [2023-10-14 04:25:12,768][33201] Updated weights for policy 0, policy_version 84070 (0.0007) [2023-10-14 04:25:13,135][33201] Updated weights for policy 0, policy_version 84080 (0.0007) [2023-10-14 04:25:13,389][33226] Updated weights for policy 1, policy_version 84840 (0.0007) [2023-10-14 04:25:13,500][33201] Updated weights for policy 0, policy_version 84090 (0.0010) [2023-10-14 04:25:13,758][33226] Updated weights for policy 1, policy_version 84850 (0.0008) [2023-10-14 04:25:14,122][33226] Updated weights for policy 1, policy_version 84860 (0.0009) [2023-10-14 04:25:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 173015040. Throughput: 0: 1773.5, 1: 1771.8. Samples: 43252512. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.990')] [2023-10-14 04:25:17,457][33201] Updated weights for policy 0, policy_version 84100 (0.0009) [2023-10-14 04:25:17,834][33201] Updated weights for policy 0, policy_version 84110 (0.0009) [2023-10-14 04:25:17,965][33226] Updated weights for policy 1, policy_version 84870 (0.0008) [2023-10-14 04:25:18,202][33201] Updated weights for policy 0, policy_version 84120 (0.0008) [2023-10-14 04:25:18,336][33226] Updated weights for policy 1, policy_version 84880 (0.0009) [2023-10-14 04:25:18,708][33226] Updated weights for policy 1, policy_version 84890 (0.0008) [2023-10-14 04:25:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 173080576. Throughput: 0: 1752.5, 1: 1790.8. Samples: 43273048. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:19,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 04:25:22,028][33201] Updated weights for policy 0, policy_version 84130 (0.0008) [2023-10-14 04:25:22,393][33201] Updated weights for policy 0, policy_version 84140 (0.0007) [2023-10-14 04:25:22,417][33226] Updated weights for policy 1, policy_version 84900 (0.0008) [2023-10-14 04:25:22,758][33201] Updated weights for policy 0, policy_version 84150 (0.0009) [2023-10-14 04:25:22,774][33226] Updated weights for policy 1, policy_version 84910 (0.0008) [2023-10-14 04:25:23,129][33201] Updated weights for policy 0, policy_version 84160 (0.0009) [2023-10-14 04:25:23,140][33226] Updated weights for policy 1, policy_version 84920 (0.0008) [2023-10-14 04:25:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173146112. Throughput: 0: 1743.0, 1: 1763.0. Samples: 43293420. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 04:25:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000084928_86966272.pth... [2023-10-14 04:25:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000084160_86179840.pth... [2023-10-14 04:25:24,600][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000082496_84475904.pth [2023-10-14 04:25:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000083264_85262336.pth [2023-10-14 04:25:26,876][33201] Updated weights for policy 0, policy_version 84170 (0.0010) [2023-10-14 04:25:26,988][33226] Updated weights for policy 1, policy_version 84930 (0.0009) [2023-10-14 04:25:27,246][33201] Updated weights for policy 0, policy_version 84180 (0.0008) [2023-10-14 04:25:27,356][33226] Updated weights for policy 1, policy_version 84940 (0.0009) [2023-10-14 04:25:27,610][33201] Updated weights for policy 0, policy_version 84190 (0.0008) [2023-10-14 04:25:27,714][33226] Updated weights for policy 1, policy_version 84950 (0.0010) [2023-10-14 04:25:28,080][33226] Updated weights for policy 1, policy_version 84960 (0.0011) [2023-10-14 04:25:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173211648. Throughput: 0: 1766.0, 1: 1790.8. Samples: 43305386. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:29,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 04:25:31,300][33201] Updated weights for policy 0, policy_version 84200 (0.0007) [2023-10-14 04:25:31,667][33201] Updated weights for policy 0, policy_version 84210 (0.0008) [2023-10-14 04:25:32,038][33201] Updated weights for policy 0, policy_version 84220 (0.0009) [2023-10-14 04:25:32,067][33226] Updated weights for policy 1, policy_version 84970 (0.0008) [2023-10-14 04:25:32,427][33226] Updated weights for policy 1, policy_version 84980 (0.0010) [2023-10-14 04:25:32,795][33226] Updated weights for policy 1, policy_version 84990 (0.0008) [2023-10-14 04:25:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173277184. Throughput: 0: 1759.9, 1: 1749.0. Samples: 43325302. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 04:25:35,952][33201] Updated weights for policy 0, policy_version 84230 (0.0008) [2023-10-14 04:25:36,315][33201] Updated weights for policy 0, policy_version 84240 (0.0008) [2023-10-14 04:25:36,540][33226] Updated weights for policy 1, policy_version 85000 (0.0008) [2023-10-14 04:25:36,684][33201] Updated weights for policy 0, policy_version 84250 (0.0009) [2023-10-14 04:25:36,906][33226] Updated weights for policy 1, policy_version 85010 (0.0008) [2023-10-14 04:25:37,262][33226] Updated weights for policy 1, policy_version 85020 (0.0009) [2023-10-14 04:25:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173342720. Throughput: 0: 1762.8, 1: 1755.7. Samples: 43347632. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.990')] [2023-10-14 04:25:40,558][33201] Updated weights for policy 0, policy_version 84260 (0.0009) [2023-10-14 04:25:40,957][33201] Updated weights for policy 0, policy_version 84270 (0.0008) [2023-10-14 04:25:41,130][33226] Updated weights for policy 1, policy_version 85030 (0.0009) [2023-10-14 04:25:41,322][33201] Updated weights for policy 0, policy_version 84280 (0.0009) [2023-10-14 04:25:41,521][33226] Updated weights for policy 1, policy_version 85040 (0.0010) [2023-10-14 04:25:41,885][33226] Updated weights for policy 1, policy_version 85050 (0.0009) [2023-10-14 04:25:44,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173408256. Throughput: 0: 1762.5, 1: 1763.2. Samples: 43357344. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:44,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.990')] [2023-10-14 04:25:45,066][33201] Updated weights for policy 0, policy_version 84290 (0.0009) [2023-10-14 04:25:45,438][33201] Updated weights for policy 0, policy_version 84300 (0.0009) [2023-10-14 04:25:45,662][33226] Updated weights for policy 1, policy_version 85060 (0.0010) [2023-10-14 04:25:45,806][33201] Updated weights for policy 0, policy_version 84310 (0.0007) [2023-10-14 04:25:46,023][33226] Updated weights for policy 1, policy_version 85070 (0.0009) [2023-10-14 04:25:46,173][33201] Updated weights for policy 0, policy_version 84320 (0.0009) [2023-10-14 04:25:46,389][33226] Updated weights for policy 1, policy_version 85080 (0.0010) [2023-10-14 04:25:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 173473792. Throughput: 0: 1762.4, 1: 1757.9. Samples: 43379260. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:49,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.980')] [2023-10-14 04:25:49,827][33201] Updated weights for policy 0, policy_version 84330 (0.0009) [2023-10-14 04:25:50,185][33201] Updated weights for policy 0, policy_version 84340 (0.0008) [2023-10-14 04:25:50,260][33226] Updated weights for policy 1, policy_version 85090 (0.0009) [2023-10-14 04:25:50,556][33201] Updated weights for policy 0, policy_version 84350 (0.0008) [2023-10-14 04:25:50,621][33226] Updated weights for policy 1, policy_version 85100 (0.0008) [2023-10-14 04:25:50,985][33226] Updated weights for policy 1, policy_version 85110 (0.0009) [2023-10-14 04:25:51,350][33226] Updated weights for policy 1, policy_version 85120 (0.0010) [2023-10-14 04:25:54,428][33201] Updated weights for policy 0, policy_version 84360 (0.0007) [2023-10-14 04:25:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 173539328. Throughput: 0: 1800.8, 1: 1771.1. Samples: 43401760. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:54,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 04:25:54,799][33201] Updated weights for policy 0, policy_version 84370 (0.0007) [2023-10-14 04:25:55,075][33226] Updated weights for policy 1, policy_version 85130 (0.0009) [2023-10-14 04:25:55,167][33201] Updated weights for policy 0, policy_version 84380 (0.0008) [2023-10-14 04:25:55,448][33226] Updated weights for policy 1, policy_version 85140 (0.0008) [2023-10-14 04:25:55,814][33226] Updated weights for policy 1, policy_version 85150 (0.0008) [2023-10-14 04:25:58,910][33201] Updated weights for policy 0, policy_version 84390 (0.0009) [2023-10-14 04:25:59,289][33201] Updated weights for policy 0, policy_version 84400 (0.0009) [2023-10-14 04:25:59,538][33226] Updated weights for policy 1, policy_version 85160 (0.0007) [2023-10-14 04:25:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 173604864. Throughput: 0: 1771.7, 1: 1761.1. Samples: 43411488. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:25:59,558][31953] Avg episode reward: [(0, '20.890'), (1, '20.980')] [2023-10-14 04:25:59,648][33201] Updated weights for policy 0, policy_version 84410 (0.0008) [2023-10-14 04:25:59,895][33226] Updated weights for policy 1, policy_version 85170 (0.0007) [2023-10-14 04:26:00,262][33226] Updated weights for policy 1, policy_version 85180 (0.0009) [2023-10-14 04:26:03,553][33201] Updated weights for policy 0, policy_version 84420 (0.0009) [2023-10-14 04:26:03,914][33201] Updated weights for policy 0, policy_version 84430 (0.0009) [2023-10-14 04:26:04,127][33226] Updated weights for policy 1, policy_version 85190 (0.0009) [2023-10-14 04:26:04,280][33201] Updated weights for policy 0, policy_version 84440 (0.0008) [2023-10-14 04:26:04,496][33226] Updated weights for policy 1, policy_version 85200 (0.0007) [2023-10-14 04:26:04,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 173670400. Throughput: 0: 1794.7, 1: 1772.5. Samples: 43433574. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-10-14 04:26:04,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.980')] [2023-10-14 04:26:04,856][33226] Updated weights for policy 1, policy_version 85210 (0.0009) [2023-10-14 04:26:07,996][33201] Updated weights for policy 0, policy_version 84450 (0.0009) [2023-10-14 04:26:08,363][33201] Updated weights for policy 0, policy_version 84460 (0.0007) [2023-10-14 04:26:08,704][33226] Updated weights for policy 1, policy_version 85220 (0.0009) [2023-10-14 04:26:08,738][33201] Updated weights for policy 0, policy_version 84470 (0.0008) [2023-10-14 04:26:09,063][33226] Updated weights for policy 1, policy_version 85230 (0.0009) [2023-10-14 04:26:09,109][33201] Updated weights for policy 0, policy_version 84480 (0.0008) [2023-10-14 04:26:09,437][33226] Updated weights for policy 1, policy_version 85240 (0.0007) [2023-10-14 04:26:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173768704. Throughput: 0: 1774.0, 1: 1785.2. Samples: 43453584. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:09,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 04:26:12,827][33201] Updated weights for policy 0, policy_version 84490 (0.0009) [2023-10-14 04:26:13,198][33226] Updated weights for policy 1, policy_version 85250 (0.0008) [2023-10-14 04:26:13,204][33201] Updated weights for policy 0, policy_version 84500 (0.0009) [2023-10-14 04:26:13,568][33226] Updated weights for policy 1, policy_version 85260 (0.0009) [2023-10-14 04:26:13,573][33201] Updated weights for policy 0, policy_version 84510 (0.0008) [2023-10-14 04:26:13,925][33226] Updated weights for policy 1, policy_version 85270 (0.0010) [2023-10-14 04:26:14,295][33226] Updated weights for policy 1, policy_version 85280 (0.0009) [2023-10-14 04:26:14,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 173867008. Throughput: 0: 1784.3, 1: 1764.5. 
Samples: 43465082. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:17,582][33201] Updated weights for policy 0, policy_version 84520 (0.0008) [2023-10-14 04:26:17,949][33201] Updated weights for policy 0, policy_version 84530 (0.0009) [2023-10-14 04:26:18,187][33226] Updated weights for policy 1, policy_version 85290 (0.0007) [2023-10-14 04:26:18,324][33201] Updated weights for policy 0, policy_version 84540 (0.0008) [2023-10-14 04:26:18,555][33226] Updated weights for policy 1, policy_version 85300 (0.0009) [2023-10-14 04:26:18,923][33226] Updated weights for policy 1, policy_version 85310 (0.0008) [2023-10-14 04:26:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 173932544. Throughput: 0: 1769.4, 1: 1800.1. Samples: 43485932. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:19,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:22,283][33201] Updated weights for policy 0, policy_version 84550 (0.0008) [2023-10-14 04:26:22,630][33226] Updated weights for policy 1, policy_version 85320 (0.0008) [2023-10-14 04:26:22,644][33201] Updated weights for policy 0, policy_version 84560 (0.0007) [2023-10-14 04:26:22,990][33226] Updated weights for policy 1, policy_version 85330 (0.0008) [2023-10-14 04:26:23,019][33201] Updated weights for policy 0, policy_version 84570 (0.0009) [2023-10-14 04:26:23,363][33226] Updated weights for policy 1, policy_version 85340 (0.0008) [2023-10-14 04:26:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 173998080. Throughput: 0: 1752.4, 1: 1770.1. Samples: 43506144. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:26,857][33201] Updated weights for policy 0, policy_version 84580 (0.0008) [2023-10-14 04:26:27,229][33226] Updated weights for policy 1, policy_version 85350 (0.0009) [2023-10-14 04:26:27,243][33201] Updated weights for policy 0, policy_version 84590 (0.0008) [2023-10-14 04:26:27,604][33201] Updated weights for policy 0, policy_version 84600 (0.0007) [2023-10-14 04:26:27,607][33226] Updated weights for policy 1, policy_version 85360 (0.0007) [2023-10-14 04:26:27,975][33226] Updated weights for policy 1, policy_version 85370 (0.0009) [2023-10-14 04:26:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 174063616. Throughput: 0: 1772.3, 1: 1798.5. Samples: 43518030. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:29,559][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:31,561][33201] Updated weights for policy 0, policy_version 84610 (0.0007) [2023-10-14 04:26:31,865][33226] Updated weights for policy 1, policy_version 85380 (0.0010) [2023-10-14 04:26:31,930][33201] Updated weights for policy 0, policy_version 84620 (0.0009) [2023-10-14 04:26:32,233][33226] Updated weights for policy 1, policy_version 85390 (0.0009) [2023-10-14 04:26:32,316][33201] Updated weights for policy 0, policy_version 84630 (0.0008) [2023-10-14 04:26:32,590][33226] Updated weights for policy 1, policy_version 85400 (0.0009) [2023-10-14 04:26:32,680][33201] Updated weights for policy 0, policy_version 84640 (0.0007) [2023-10-14 04:26:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 174129152. Throughput: 0: 1746.0, 1: 1765.8. Samples: 43537292. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:34,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:36,401][33226] Updated weights for policy 1, policy_version 85410 (0.0011) [2023-10-14 04:26:36,759][33226] Updated weights for policy 1, policy_version 85420 (0.0007) [2023-10-14 04:26:36,814][33201] Updated weights for policy 0, policy_version 84650 (0.0008) [2023-10-14 04:26:37,125][33226] Updated weights for policy 1, policy_version 85430 (0.0008) [2023-10-14 04:26:37,182][33201] Updated weights for policy 0, policy_version 84660 (0.0009) [2023-10-14 04:26:37,477][33226] Updated weights for policy 1, policy_version 85440 (0.0009) [2023-10-14 04:26:37,549][33201] Updated weights for policy 0, policy_version 84670 (0.0009) [2023-10-14 04:26:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 174194688. Throughput: 0: 1736.5, 1: 1766.8. Samples: 43559410. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.960')] [2023-10-14 04:26:41,257][33201] Updated weights for policy 0, policy_version 84680 (0.0009) [2023-10-14 04:26:41,435][33226] Updated weights for policy 1, policy_version 85450 (0.0007) [2023-10-14 04:26:41,627][33201] Updated weights for policy 0, policy_version 84690 (0.0009) [2023-10-14 04:26:41,794][33226] Updated weights for policy 1, policy_version 85460 (0.0007) [2023-10-14 04:26:41,992][33201] Updated weights for policy 0, policy_version 84700 (0.0007) [2023-10-14 04:26:42,153][33226] Updated weights for policy 1, policy_version 85470 (0.0008) [2023-10-14 04:26:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 174260224. Throughput: 0: 1741.2, 1: 1770.2. Samples: 43569502. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:44,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.960')] [2023-10-14 04:26:45,692][33201] Updated weights for policy 0, policy_version 84710 (0.0008) [2023-10-14 04:26:45,939][33226] Updated weights for policy 1, policy_version 85480 (0.0009) [2023-10-14 04:26:46,054][33201] Updated weights for policy 0, policy_version 84720 (0.0007) [2023-10-14 04:26:46,300][33226] Updated weights for policy 1, policy_version 85490 (0.0008) [2023-10-14 04:26:46,420][33201] Updated weights for policy 0, policy_version 84730 (0.0010) [2023-10-14 04:26:46,663][33226] Updated weights for policy 1, policy_version 85500 (0.0008) [2023-10-14 04:26:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 174325760. Throughput: 0: 1743.1, 1: 1757.2. Samples: 43591084. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:49,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.960')] [2023-10-14 04:26:50,210][33201] Updated weights for policy 0, policy_version 84740 (0.0010) [2023-10-14 04:26:50,572][33226] Updated weights for policy 1, policy_version 85510 (0.0007) [2023-10-14 04:26:50,581][33201] Updated weights for policy 0, policy_version 84750 (0.0009) [2023-10-14 04:26:50,940][33226] Updated weights for policy 1, policy_version 85520 (0.0007) [2023-10-14 04:26:50,955][33201] Updated weights for policy 0, policy_version 84760 (0.0008) [2023-10-14 04:26:51,305][33226] Updated weights for policy 1, policy_version 85530 (0.0010) [2023-10-14 04:26:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 174391296. Throughput: 0: 1772.3, 1: 1769.2. Samples: 43612954. 
Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.970')] [2023-10-14 04:26:54,709][33201] Updated weights for policy 0, policy_version 84770 (0.0008) [2023-10-14 04:26:55,049][33226] Updated weights for policy 1, policy_version 85540 (0.0011) [2023-10-14 04:26:55,082][33201] Updated weights for policy 0, policy_version 84780 (0.0008) [2023-10-14 04:26:55,426][33226] Updated weights for policy 1, policy_version 85550 (0.0008) [2023-10-14 04:26:55,450][33201] Updated weights for policy 0, policy_version 84790 (0.0008) [2023-10-14 04:26:55,790][33226] Updated weights for policy 1, policy_version 85560 (0.0008) [2023-10-14 04:26:55,822][33201] Updated weights for policy 0, policy_version 84800 (0.0009) [2023-10-14 04:26:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 174456832. Throughput: 0: 1739.3, 1: 1757.2. Samples: 43622424. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:26:59,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 04:26:59,584][33226] Updated weights for policy 1, policy_version 85570 (0.0010) [2023-10-14 04:26:59,828][33201] Updated weights for policy 0, policy_version 84810 (0.0009) [2023-10-14 04:26:59,941][33226] Updated weights for policy 1, policy_version 85580 (0.0008) [2023-10-14 04:27:00,200][33201] Updated weights for policy 0, policy_version 84820 (0.0008) [2023-10-14 04:27:00,303][33226] Updated weights for policy 1, policy_version 85590 (0.0010) [2023-10-14 04:27:00,566][33201] Updated weights for policy 0, policy_version 84830 (0.0008) [2023-10-14 04:27:00,665][33226] Updated weights for policy 1, policy_version 85600 (0.0007) [2023-10-14 04:27:04,477][33226] Updated weights for policy 1, policy_version 85610 (0.0010) [2023-10-14 04:27:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 174522368. Throughput: 0: 1757.2, 1: 1763.6. 
Samples: 43644366. Policy #0 lag: (min: 31.0, avg: 41.8, max: 63.0) [2023-10-14 04:27:04,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.950')] [2023-10-14 04:27:04,576][33201] Updated weights for policy 0, policy_version 84840 (0.0007) [2023-10-14 04:27:04,842][33226] Updated weights for policy 1, policy_version 85620 (0.0008) [2023-10-14 04:27:04,941][33201] Updated weights for policy 0, policy_version 84850 (0.0009) [2023-10-14 04:27:05,208][33226] Updated weights for policy 1, policy_version 85630 (0.0008) [2023-10-14 04:27:05,303][33201] Updated weights for policy 0, policy_version 84860 (0.0007) [2023-10-14 04:27:08,994][33226] Updated weights for policy 1, policy_version 85640 (0.0008) [2023-10-14 04:27:09,139][33201] Updated weights for policy 0, policy_version 84870 (0.0008) [2023-10-14 04:27:09,363][33226] Updated weights for policy 1, policy_version 85650 (0.0007) [2023-10-14 04:27:09,506][33201] Updated weights for policy 0, policy_version 84880 (0.0009) [2023-10-14 04:27:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 174587904. Throughput: 0: 1765.7, 1: 1783.3. Samples: 43665850. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:09,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.950')] [2023-10-14 04:27:09,721][33226] Updated weights for policy 1, policy_version 85660 (0.0010) [2023-10-14 04:27:09,874][33201] Updated weights for policy 0, policy_version 84890 (0.0009) [2023-10-14 04:27:13,625][33226] Updated weights for policy 1, policy_version 85670 (0.0009) [2023-10-14 04:27:13,664][33201] Updated weights for policy 0, policy_version 84900 (0.0008) [2023-10-14 04:27:14,011][33226] Updated weights for policy 1, policy_version 85680 (0.0008) [2023-10-14 04:27:14,049][33201] Updated weights for policy 0, policy_version 84910 (0.0008) [2023-10-14 04:27:14,374][33226] Updated weights for policy 1, policy_version 85690 (0.0008) [2023-10-14 04:27:14,424][33201] Updated weights for policy 0, policy_version 84920 (0.0007) [2023-10-14 04:27:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 14106.9). Total num frames: 174653440. Throughput: 0: 1747.3, 1: 1761.5. Samples: 43675924. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:14,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 04:27:18,132][33226] Updated weights for policy 1, policy_version 85700 (0.0008) [2023-10-14 04:27:18,312][33201] Updated weights for policy 0, policy_version 84930 (0.0007) [2023-10-14 04:27:18,492][33226] Updated weights for policy 1, policy_version 85710 (0.0009) [2023-10-14 04:27:18,681][33201] Updated weights for policy 0, policy_version 84940 (0.0009) [2023-10-14 04:27:18,861][33226] Updated weights for policy 1, policy_version 85720 (0.0010) [2023-10-14 04:27:19,051][33201] Updated weights for policy 0, policy_version 84950 (0.0009) [2023-10-14 04:27:19,421][33201] Updated weights for policy 0, policy_version 84960 (0.0007) [2023-10-14 04:27:19,557][31953] Fps is (10 sec: 19661.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 174784512. Throughput: 0: 1773.4, 1: 1794.9. 
Samples: 43697868. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:19,558][31953] Avg episode reward: [(0, '20.920'), (1, '20.950')] [2023-10-14 04:27:22,645][33226] Updated weights for policy 1, policy_version 85730 (0.0009) [2023-10-14 04:27:23,011][33226] Updated weights for policy 1, policy_version 85740 (0.0009) [2023-10-14 04:27:23,381][33226] Updated weights for policy 1, policy_version 85750 (0.0007) [2023-10-14 04:27:23,459][33201] Updated weights for policy 0, policy_version 84970 (0.0008) [2023-10-14 04:27:23,757][33226] Updated weights for policy 1, policy_version 85760 (0.0008) [2023-10-14 04:27:23,832][33201] Updated weights for policy 0, policy_version 84980 (0.0007) [2023-10-14 04:27:24,211][33201] Updated weights for policy 0, policy_version 84990 (0.0009) [2023-10-14 04:27:24,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 174850048. Throughput: 0: 1745.8, 1: 1757.5. Samples: 43717056. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:24,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.950')] [2023-10-14 04:27:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000084992_87031808.pth... [2023-10-14 04:27:24,566][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000085760_87818240.pth... 
[2023-10-14 04:27:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000083328_85327872.pth [2023-10-14 04:27:24,605][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000084096_86114304.pth [2023-10-14 04:27:27,695][33226] Updated weights for policy 1, policy_version 85770 (0.0007) [2023-10-14 04:27:28,054][33226] Updated weights for policy 1, policy_version 85780 (0.0008) [2023-10-14 04:27:28,177][33201] Updated weights for policy 0, policy_version 85000 (0.0008) [2023-10-14 04:27:28,422][33226] Updated weights for policy 1, policy_version 85790 (0.0008) [2023-10-14 04:27:28,541][33201] Updated weights for policy 0, policy_version 85010 (0.0009) [2023-10-14 04:27:28,910][33201] Updated weights for policy 0, policy_version 85020 (0.0010) [2023-10-14 04:27:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 174915584. Throughput: 0: 1762.5, 1: 1781.1. Samples: 43728964. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.950')] [2023-10-14 04:27:32,158][33226] Updated weights for policy 1, policy_version 85800 (0.0009) [2023-10-14 04:27:32,521][33226] Updated weights for policy 1, policy_version 85810 (0.0007) [2023-10-14 04:27:32,756][33201] Updated weights for policy 0, policy_version 85030 (0.0010) [2023-10-14 04:27:32,881][33226] Updated weights for policy 1, policy_version 85820 (0.0008) [2023-10-14 04:27:33,113][33201] Updated weights for policy 0, policy_version 85040 (0.0009) [2023-10-14 04:27:33,493][33201] Updated weights for policy 0, policy_version 85050 (0.0009) [2023-10-14 04:27:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 174981120. Throughput: 0: 1745.2, 1: 1767.0. Samples: 43749134. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:34,559][31953] Avg episode reward: [(0, '20.950'), (1, '20.950')] [2023-10-14 04:27:36,707][33226] Updated weights for policy 1, policy_version 85830 (0.0010) [2023-10-14 04:27:37,083][33226] Updated weights for policy 1, policy_version 85840 (0.0009) [2023-10-14 04:27:37,170][33201] Updated weights for policy 0, policy_version 85060 (0.0008) [2023-10-14 04:27:37,445][33226] Updated weights for policy 1, policy_version 85850 (0.0008) [2023-10-14 04:27:37,532][33201] Updated weights for policy 0, policy_version 85070 (0.0009) [2023-10-14 04:27:37,908][33201] Updated weights for policy 0, policy_version 85080 (0.0009) [2023-10-14 04:27:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175046656. Throughput: 0: 1737.8, 1: 1764.9. Samples: 43770572. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:39,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.960')] [2023-10-14 04:27:41,277][33226] Updated weights for policy 1, policy_version 85860 (0.0008) [2023-10-14 04:27:41,639][33226] Updated weights for policy 1, policy_version 85870 (0.0007) [2023-10-14 04:27:41,680][33201] Updated weights for policy 0, policy_version 85090 (0.0007) [2023-10-14 04:27:42,007][33226] Updated weights for policy 1, policy_version 85880 (0.0008) [2023-10-14 04:27:42,044][33201] Updated weights for policy 0, policy_version 85100 (0.0009) [2023-10-14 04:27:42,408][33201] Updated weights for policy 0, policy_version 85110 (0.0007) [2023-10-14 04:27:42,775][33201] Updated weights for policy 0, policy_version 85120 (0.0008) [2023-10-14 04:27:44,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175112192. Throughput: 0: 1762.2, 1: 1774.5. Samples: 43781576. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.960')] [2023-10-14 04:27:45,819][33226] Updated weights for policy 1, policy_version 85890 (0.0007) [2023-10-14 04:27:46,195][33226] Updated weights for policy 1, policy_version 85900 (0.0010) [2023-10-14 04:27:46,559][33226] Updated weights for policy 1, policy_version 85910 (0.0009) [2023-10-14 04:27:46,587][33201] Updated weights for policy 0, policy_version 85130 (0.0010) [2023-10-14 04:27:46,929][33226] Updated weights for policy 1, policy_version 85920 (0.0008) [2023-10-14 04:27:46,954][33201] Updated weights for policy 0, policy_version 85140 (0.0009) [2023-10-14 04:27:47,319][33201] Updated weights for policy 0, policy_version 85150 (0.0009) [2023-10-14 04:27:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175177728. Throughput: 0: 1744.4, 1: 1761.2. Samples: 43802122. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.960')] [2023-10-14 04:27:50,884][33226] Updated weights for policy 1, policy_version 85930 (0.0007) [2023-10-14 04:27:51,113][33201] Updated weights for policy 0, policy_version 85160 (0.0008) [2023-10-14 04:27:51,244][33226] Updated weights for policy 1, policy_version 85940 (0.0009) [2023-10-14 04:27:51,489][33201] Updated weights for policy 0, policy_version 85170 (0.0008) [2023-10-14 04:27:51,606][33226] Updated weights for policy 1, policy_version 85950 (0.0007) [2023-10-14 04:27:51,866][33201] Updated weights for policy 0, policy_version 85180 (0.0008) [2023-10-14 04:27:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 175243264. Throughput: 0: 1752.4, 1: 1767.8. Samples: 43824256. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 04:27:55,443][33226] Updated weights for policy 1, policy_version 85960 (0.0007) [2023-10-14 04:27:55,762][33201] Updated weights for policy 0, policy_version 85190 (0.0008) [2023-10-14 04:27:55,812][33226] Updated weights for policy 1, policy_version 85970 (0.0007) [2023-10-14 04:27:56,123][33201] Updated weights for policy 0, policy_version 85200 (0.0009) [2023-10-14 04:27:56,174][33226] Updated weights for policy 1, policy_version 85980 (0.0008) [2023-10-14 04:27:56,498][33201] Updated weights for policy 0, policy_version 85210 (0.0008) [2023-10-14 04:27:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 175308800. Throughput: 0: 1749.6, 1: 1756.2. Samples: 43833686. Policy #0 lag: (min: 10.0, avg: 10.0, max: 11.0) [2023-10-14 04:27:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 04:28:00,094][33226] Updated weights for policy 1, policy_version 85990 (0.0008) [2023-10-14 04:28:00,418][33201] Updated weights for policy 0, policy_version 85220 (0.0009) [2023-10-14 04:28:00,464][33226] Updated weights for policy 1, policy_version 86000 (0.0009) [2023-10-14 04:28:00,798][33201] Updated weights for policy 0, policy_version 85230 (0.0009) [2023-10-14 04:28:00,837][33226] Updated weights for policy 1, policy_version 86010 (0.0008) [2023-10-14 04:28:01,167][33201] Updated weights for policy 0, policy_version 85240 (0.0009) [2023-10-14 04:28:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 175374336. Throughput: 0: 1749.1, 1: 1754.2. Samples: 43855518. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 04:28:04,799][33226] Updated weights for policy 1, policy_version 86020 (0.0010) [2023-10-14 04:28:05,020][33201] Updated weights for policy 0, policy_version 85250 (0.0010) [2023-10-14 04:28:05,165][33226] Updated weights for policy 1, policy_version 86030 (0.0008) [2023-10-14 04:28:05,415][33201] Updated weights for policy 0, policy_version 85260 (0.0009) [2023-10-14 04:28:05,527][33226] Updated weights for policy 1, policy_version 86040 (0.0009) [2023-10-14 04:28:05,797][33201] Updated weights for policy 0, policy_version 85270 (0.0008) [2023-10-14 04:28:06,172][33201] Updated weights for policy 0, policy_version 85280 (0.0008) [2023-10-14 04:28:09,319][33226] Updated weights for policy 1, policy_version 86050 (0.0010) [2023-10-14 04:28:09,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 175439872. Throughput: 0: 1774.3, 1: 1790.8. Samples: 43877484. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 04:28:09,680][33226] Updated weights for policy 1, policy_version 86060 (0.0008) [2023-10-14 04:28:09,994][33201] Updated weights for policy 0, policy_version 85290 (0.0009) [2023-10-14 04:28:10,043][33226] Updated weights for policy 1, policy_version 86070 (0.0008) [2023-10-14 04:28:10,354][33201] Updated weights for policy 0, policy_version 85300 (0.0009) [2023-10-14 04:28:10,414][33226] Updated weights for policy 1, policy_version 86080 (0.0008) [2023-10-14 04:28:10,726][33201] Updated weights for policy 0, policy_version 85310 (0.0009) [2023-10-14 04:28:14,135][33226] Updated weights for policy 1, policy_version 86090 (0.0011) [2023-10-14 04:28:14,474][33201] Updated weights for policy 0, policy_version 85320 (0.0007) [2023-10-14 04:28:14,502][33226] Updated weights for policy 1, policy_version 86100 (0.0009) [2023-10-14 04:28:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 175505408. Throughput: 0: 1749.4, 1: 1763.9. Samples: 43887064. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 04:28:14,842][33201] Updated weights for policy 0, policy_version 85330 (0.0007) [2023-10-14 04:28:14,872][33226] Updated weights for policy 1, policy_version 86110 (0.0008) [2023-10-14 04:28:15,211][33201] Updated weights for policy 0, policy_version 85340 (0.0007) [2023-10-14 04:28:18,940][33226] Updated weights for policy 1, policy_version 86120 (0.0009) [2023-10-14 04:28:18,978][33201] Updated weights for policy 0, policy_version 85350 (0.0007) [2023-10-14 04:28:19,310][33226] Updated weights for policy 1, policy_version 86130 (0.0009) [2023-10-14 04:28:19,349][33201] Updated weights for policy 0, policy_version 85360 (0.0007) [2023-10-14 04:28:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13995.8). Total num frames: 175570944. Throughput: 0: 1774.1, 1: 1780.5. Samples: 43909092. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 04:28:19,677][33226] Updated weights for policy 1, policy_version 86140 (0.0008) [2023-10-14 04:28:19,723][33201] Updated weights for policy 0, policy_version 85370 (0.0008) [2023-10-14 04:28:23,437][33226] Updated weights for policy 1, policy_version 86150 (0.0009) [2023-10-14 04:28:23,518][33201] Updated weights for policy 0, policy_version 85380 (0.0009) [2023-10-14 04:28:23,804][33226] Updated weights for policy 1, policy_version 86160 (0.0009) [2023-10-14 04:28:23,882][33201] Updated weights for policy 0, policy_version 85390 (0.0007) [2023-10-14 04:28:24,172][33226] Updated weights for policy 1, policy_version 86170 (0.0008) [2023-10-14 04:28:24,260][33201] Updated weights for policy 0, policy_version 85400 (0.0007) [2023-10-14 04:28:24,557][31953] Fps is (10 sec: 19661.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175702016. Throughput: 0: 1768.7, 1: 1761.2. 
Samples: 43929414. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.980')] [2023-10-14 04:28:27,946][33226] Updated weights for policy 1, policy_version 86180 (0.0008) [2023-10-14 04:28:28,064][33201] Updated weights for policy 0, policy_version 85410 (0.0009) [2023-10-14 04:28:28,320][33226] Updated weights for policy 1, policy_version 86190 (0.0008) [2023-10-14 04:28:28,441][33201] Updated weights for policy 0, policy_version 85420 (0.0010) [2023-10-14 04:28:28,681][33226] Updated weights for policy 1, policy_version 86200 (0.0007) [2023-10-14 04:28:28,805][33201] Updated weights for policy 0, policy_version 85430 (0.0009) [2023-10-14 04:28:29,174][33201] Updated weights for policy 0, policy_version 85440 (0.0009) [2023-10-14 04:28:29,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175767552. Throughput: 0: 1765.1, 1: 1773.8. Samples: 43940824. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:29,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.980')] [2023-10-14 04:28:32,505][33226] Updated weights for policy 1, policy_version 86210 (0.0009) [2023-10-14 04:28:32,870][33226] Updated weights for policy 1, policy_version 86220 (0.0009) [2023-10-14 04:28:33,028][33201] Updated weights for policy 0, policy_version 85450 (0.0009) [2023-10-14 04:28:33,229][33226] Updated weights for policy 1, policy_version 86230 (0.0009) [2023-10-14 04:28:33,394][33201] Updated weights for policy 0, policy_version 85460 (0.0008) [2023-10-14 04:28:33,595][33226] Updated weights for policy 1, policy_version 86240 (0.0008) [2023-10-14 04:28:33,763][33201] Updated weights for policy 0, policy_version 85470 (0.0007) [2023-10-14 04:28:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 175833088. Throughput: 0: 1776.2, 1: 1770.5. Samples: 43961724. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:34,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.970')] [2023-10-14 04:28:37,529][33226] Updated weights for policy 1, policy_version 86250 (0.0009) [2023-10-14 04:28:37,533][33201] Updated weights for policy 0, policy_version 85480 (0.0009) [2023-10-14 04:28:37,901][33201] Updated weights for policy 0, policy_version 85490 (0.0009) [2023-10-14 04:28:37,904][33226] Updated weights for policy 1, policy_version 86260 (0.0009) [2023-10-14 04:28:38,265][33226] Updated weights for policy 1, policy_version 86270 (0.0007) [2023-10-14 04:28:38,270][33201] Updated weights for policy 0, policy_version 85500 (0.0008) [2023-10-14 04:28:39,558][31953] Fps is (10 sec: 13106.5, 60 sec: 14199.3, 300 sec: 14218.0). Total num frames: 175898624. Throughput: 0: 1755.4, 1: 1748.8. Samples: 43981948. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:39,559][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:28:41,915][33226] Updated weights for policy 1, policy_version 86280 (0.0007) [2023-10-14 04:28:42,060][33201] Updated weights for policy 0, policy_version 85510 (0.0008) [2023-10-14 04:28:42,282][33226] Updated weights for policy 1, policy_version 86290 (0.0007) [2023-10-14 04:28:42,436][33201] Updated weights for policy 0, policy_version 85520 (0.0007) [2023-10-14 04:28:42,649][33226] Updated weights for policy 1, policy_version 86300 (0.0008) [2023-10-14 04:28:42,806][33201] Updated weights for policy 0, policy_version 85530 (0.0008) [2023-10-14 04:28:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 175964160. Throughput: 0: 1785.2, 1: 1773.8. Samples: 43993842. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:44,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.990')] [2023-10-14 04:28:46,478][33201] Updated weights for policy 0, policy_version 85540 (0.0007) [2023-10-14 04:28:46,512][33226] Updated weights for policy 1, policy_version 86310 (0.0009) [2023-10-14 04:28:46,847][33201] Updated weights for policy 0, policy_version 85550 (0.0009) [2023-10-14 04:28:46,894][33226] Updated weights for policy 1, policy_version 86320 (0.0007) [2023-10-14 04:28:47,206][33201] Updated weights for policy 0, policy_version 85560 (0.0008) [2023-10-14 04:28:47,258][33226] Updated weights for policy 1, policy_version 86330 (0.0007) [2023-10-14 04:28:49,557][31953] Fps is (10 sec: 13107.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 176029696. Throughput: 0: 1760.9, 1: 1751.7. Samples: 44013586. Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:49,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:28:50,889][33226] Updated weights for policy 1, policy_version 86340 (0.0009) [2023-10-14 04:28:51,185][33201] Updated weights for policy 0, policy_version 85570 (0.0008) [2023-10-14 04:28:51,251][33226] Updated weights for policy 1, policy_version 86350 (0.0009) [2023-10-14 04:28:51,583][33201] Updated weights for policy 0, policy_version 85580 (0.0010) [2023-10-14 04:28:51,625][33226] Updated weights for policy 1, policy_version 86360 (0.0007) [2023-10-14 04:28:51,941][33201] Updated weights for policy 0, policy_version 85590 (0.0009) [2023-10-14 04:28:52,315][33201] Updated weights for policy 0, policy_version 85600 (0.0007) [2023-10-14 04:28:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 176095232. Throughput: 0: 1761.2, 1: 1755.8. Samples: 44035750. 
Policy #0 lag: (min: 31.0, avg: 33.2, max: 63.0) [2023-10-14 04:28:54,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:28:55,475][33226] Updated weights for policy 1, policy_version 86370 (0.0007) [2023-10-14 04:28:55,845][33226] Updated weights for policy 1, policy_version 86380 (0.0008) [2023-10-14 04:28:56,143][33201] Updated weights for policy 0, policy_version 85610 (0.0007) [2023-10-14 04:28:56,207][33226] Updated weights for policy 1, policy_version 86390 (0.0009) [2023-10-14 04:28:56,518][33201] Updated weights for policy 0, policy_version 85620 (0.0007) [2023-10-14 04:28:56,570][33226] Updated weights for policy 1, policy_version 86400 (0.0008) [2023-10-14 04:28:56,891][33201] Updated weights for policy 0, policy_version 85630 (0.0009) [2023-10-14 04:28:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 176160768. Throughput: 0: 1761.8, 1: 1752.4. Samples: 44045200. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:28:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:29:00,403][33226] Updated weights for policy 1, policy_version 86410 (0.0008) [2023-10-14 04:29:00,627][33201] Updated weights for policy 0, policy_version 85640 (0.0007) [2023-10-14 04:29:00,772][33226] Updated weights for policy 1, policy_version 86420 (0.0008) [2023-10-14 04:29:00,987][33201] Updated weights for policy 0, policy_version 85650 (0.0008) [2023-10-14 04:29:01,135][33226] Updated weights for policy 1, policy_version 86430 (0.0008) [2023-10-14 04:29:01,358][33201] Updated weights for policy 0, policy_version 85660 (0.0008) [2023-10-14 04:29:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 176226304. Throughput: 0: 1759.5, 1: 1764.7. Samples: 44067684. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:29:04,888][33226] Updated weights for policy 1, policy_version 86440 (0.0009) [2023-10-14 04:29:05,227][33201] Updated weights for policy 0, policy_version 85670 (0.0008) [2023-10-14 04:29:05,264][33226] Updated weights for policy 1, policy_version 86450 (0.0007) [2023-10-14 04:29:05,594][33201] Updated weights for policy 0, policy_version 85680 (0.0007) [2023-10-14 04:29:05,636][33226] Updated weights for policy 1, policy_version 86460 (0.0007) [2023-10-14 04:29:05,956][33201] Updated weights for policy 0, policy_version 85690 (0.0009) [2023-10-14 04:29:09,511][33226] Updated weights for policy 1, policy_version 86470 (0.0010) [2023-10-14 04:29:09,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 176291840. Throughput: 0: 1777.0, 1: 1783.5. Samples: 44089636. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.970')] [2023-10-14 04:29:09,798][33201] Updated weights for policy 0, policy_version 85700 (0.0009) [2023-10-14 04:29:09,870][33226] Updated weights for policy 1, policy_version 86480 (0.0009) [2023-10-14 04:29:10,167][33201] Updated weights for policy 0, policy_version 85710 (0.0010) [2023-10-14 04:29:10,239][33226] Updated weights for policy 1, policy_version 86490 (0.0009) [2023-10-14 04:29:10,529][33201] Updated weights for policy 0, policy_version 85720 (0.0009) [2023-10-14 04:29:13,993][33226] Updated weights for policy 1, policy_version 86500 (0.0008) [2023-10-14 04:29:14,354][33226] Updated weights for policy 1, policy_version 86510 (0.0010) [2023-10-14 04:29:14,366][33201] Updated weights for policy 0, policy_version 85730 (0.0008) [2023-10-14 04:29:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 176357376. Throughput: 0: 1756.5, 1: 1764.4. 
Samples: 44099266. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:29:14,730][33226] Updated weights for policy 1, policy_version 86520 (0.0007) [2023-10-14 04:29:14,732][33201] Updated weights for policy 0, policy_version 85740 (0.0008) [2023-10-14 04:29:15,100][33201] Updated weights for policy 0, policy_version 85750 (0.0007) [2023-10-14 04:29:15,469][33201] Updated weights for policy 0, policy_version 85760 (0.0008) [2023-10-14 04:29:18,564][33226] Updated weights for policy 1, policy_version 86530 (0.0008) [2023-10-14 04:29:18,936][33226] Updated weights for policy 1, policy_version 86540 (0.0009) [2023-10-14 04:29:19,194][33201] Updated weights for policy 0, policy_version 85770 (0.0009) [2023-10-14 04:29:19,293][33226] Updated weights for policy 1, policy_version 86550 (0.0008) [2023-10-14 04:29:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 176422912. Throughput: 0: 1774.4, 1: 1778.4. Samples: 44121600. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:29:19,561][33201] Updated weights for policy 0, policy_version 85780 (0.0007) [2023-10-14 04:29:19,660][33226] Updated weights for policy 1, policy_version 86560 (0.0008) [2023-10-14 04:29:19,928][33201] Updated weights for policy 0, policy_version 85790 (0.0007) [2023-10-14 04:29:23,423][33226] Updated weights for policy 1, policy_version 86570 (0.0009) [2023-10-14 04:29:23,779][33226] Updated weights for policy 1, policy_version 86580 (0.0009) [2023-10-14 04:29:23,780][33201] Updated weights for policy 0, policy_version 85800 (0.0009) [2023-10-14 04:29:24,141][33226] Updated weights for policy 1, policy_version 86590 (0.0009) [2023-10-14 04:29:24,162][33201] Updated weights for policy 0, policy_version 85810 (0.0007) [2023-10-14 04:29:24,526][33201] Updated weights for policy 0, policy_version 85820 (0.0008) [2023-10-14 04:29:24,557][31953] Fps is (10 sec: 16383.6, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 176521216. Throughput: 0: 1778.5, 1: 1777.5. Samples: 44141970. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:29:24,572][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000086592_88670208.pth... [2023-10-14 04:29:24,609][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000084928_86966272.pth [2023-10-14 04:29:24,668][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000085824_87883776.pth... 
[2023-10-14 04:29:24,711][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000084160_86179840.pth [2023-10-14 04:29:27,949][33226] Updated weights for policy 1, policy_version 86600 (0.0008) [2023-10-14 04:29:28,312][33226] Updated weights for policy 1, policy_version 86610 (0.0008) [2023-10-14 04:29:28,370][33201] Updated weights for policy 0, policy_version 85830 (0.0009) [2023-10-14 04:29:28,668][33226] Updated weights for policy 1, policy_version 86620 (0.0007) [2023-10-14 04:29:28,740][33201] Updated weights for policy 0, policy_version 85840 (0.0008) [2023-10-14 04:29:29,107][33201] Updated weights for policy 0, policy_version 85850 (0.0008) [2023-10-14 04:29:29,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 176619520. Throughput: 0: 1766.4, 1: 1778.1. Samples: 44153346. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:29:32,483][33226] Updated weights for policy 1, policy_version 86630 (0.0010) [2023-10-14 04:29:32,837][33226] Updated weights for policy 1, policy_version 86640 (0.0008) [2023-10-14 04:29:33,034][33201] Updated weights for policy 0, policy_version 85860 (0.0008) [2023-10-14 04:29:33,205][33226] Updated weights for policy 1, policy_version 86650 (0.0009) [2023-10-14 04:29:33,402][33201] Updated weights for policy 0, policy_version 85870 (0.0007) [2023-10-14 04:29:33,765][33201] Updated weights for policy 0, policy_version 85880 (0.0009) [2023-10-14 04:29:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 176685056. Throughput: 0: 1783.4, 1: 1786.2. Samples: 44174216. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.960')] [2023-10-14 04:29:37,133][33226] Updated weights for policy 1, policy_version 86660 (0.0008) [2023-10-14 04:29:37,500][33226] Updated weights for policy 1, policy_version 86670 (0.0009) [2023-10-14 04:29:37,821][33201] Updated weights for policy 0, policy_version 85890 (0.0008) [2023-10-14 04:29:37,866][33226] Updated weights for policy 1, policy_version 86680 (0.0008) [2023-10-14 04:29:38,232][33201] Updated weights for policy 0, policy_version 85900 (0.0008) [2023-10-14 04:29:38,612][33201] Updated weights for policy 0, policy_version 85910 (0.0010) [2023-10-14 04:29:38,982][33201] Updated weights for policy 0, policy_version 85920 (0.0010) [2023-10-14 04:29:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 176750592. Throughput: 0: 1753.1, 1: 1767.1. Samples: 44194156. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.960')] [2023-10-14 04:29:41,645][33226] Updated weights for policy 1, policy_version 86690 (0.0007) [2023-10-14 04:29:42,023][33226] Updated weights for policy 1, policy_version 86700 (0.0008) [2023-10-14 04:29:42,391][33226] Updated weights for policy 1, policy_version 86710 (0.0009) [2023-10-14 04:29:42,747][33226] Updated weights for policy 1, policy_version 86720 (0.0007) [2023-10-14 04:29:42,768][33201] Updated weights for policy 0, policy_version 85930 (0.0007) [2023-10-14 04:29:43,144][33201] Updated weights for policy 0, policy_version 85940 (0.0008) [2023-10-14 04:29:43,518][33201] Updated weights for policy 0, policy_version 85950 (0.0011) [2023-10-14 04:29:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 176816128. Throughput: 0: 1786.3, 1: 1789.7. Samples: 44206122. 
Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.950')] [2023-10-14 04:29:46,438][33226] Updated weights for policy 1, policy_version 86730 (0.0008) [2023-10-14 04:29:46,809][33226] Updated weights for policy 1, policy_version 86740 (0.0008) [2023-10-14 04:29:47,167][33226] Updated weights for policy 1, policy_version 86750 (0.0007) [2023-10-14 04:29:47,286][33201] Updated weights for policy 0, policy_version 85960 (0.0009) [2023-10-14 04:29:47,659][33201] Updated weights for policy 0, policy_version 85970 (0.0007) [2023-10-14 04:29:48,026][33201] Updated weights for policy 0, policy_version 85980 (0.0009) [2023-10-14 04:29:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 176881664. Throughput: 0: 1753.6, 1: 1769.0. Samples: 44226202. Policy #0 lag: (min: 31.0, avg: 45.9, max: 63.0) [2023-10-14 04:29:49,560][31953] Avg episode reward: [(0, '20.700'), (1, '20.950')] [2023-10-14 04:29:50,953][33226] Updated weights for policy 1, policy_version 86760 (0.0007) [2023-10-14 04:29:51,314][33226] Updated weights for policy 1, policy_version 86770 (0.0011) [2023-10-14 04:29:51,673][33226] Updated weights for policy 1, policy_version 86780 (0.0009) [2023-10-14 04:29:51,813][33201] Updated weights for policy 0, policy_version 85990 (0.0007) [2023-10-14 04:29:52,182][33201] Updated weights for policy 0, policy_version 86000 (0.0009) [2023-10-14 04:29:52,555][33201] Updated weights for policy 0, policy_version 86010 (0.0008) [2023-10-14 04:29:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 176947200. Throughput: 0: 1747.7, 1: 1776.6. Samples: 44248230. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:29:54,557][31953] Avg episode reward: [(0, '20.700'), (1, '20.950')] [2023-10-14 04:29:55,283][33226] Updated weights for policy 1, policy_version 86790 (0.0008) [2023-10-14 04:29:55,656][33226] Updated weights for policy 1, policy_version 86800 (0.0008) [2023-10-14 04:29:56,020][33226] Updated weights for policy 1, policy_version 86810 (0.0010) [2023-10-14 04:29:56,464][33201] Updated weights for policy 0, policy_version 86020 (0.0008) [2023-10-14 04:29:56,832][33201] Updated weights for policy 0, policy_version 86030 (0.0010) [2023-10-14 04:29:57,205][33201] Updated weights for policy 0, policy_version 86040 (0.0010) [2023-10-14 04:29:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 177012736. Throughput: 0: 1761.8, 1: 1776.0. Samples: 44258468. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:29:59,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:29:59,740][33226] Updated weights for policy 1, policy_version 86820 (0.0008) [2023-10-14 04:30:00,108][33226] Updated weights for policy 1, policy_version 86830 (0.0008) [2023-10-14 04:30:00,470][33226] Updated weights for policy 1, policy_version 86840 (0.0008) [2023-10-14 04:30:01,019][33201] Updated weights for policy 0, policy_version 86050 (0.0007) [2023-10-14 04:30:01,385][33201] Updated weights for policy 0, policy_version 86060 (0.0010) [2023-10-14 04:30:01,769][33201] Updated weights for policy 0, policy_version 86070 (0.0010) [2023-10-14 04:30:02,136][33201] Updated weights for policy 0, policy_version 86080 (0.0009) [2023-10-14 04:30:04,178][33226] Updated weights for policy 1, policy_version 86850 (0.0008) [2023-10-14 04:30:04,541][33226] Updated weights for policy 1, policy_version 86860 (0.0009) [2023-10-14 04:30:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 177078272. Throughput: 0: 1743.8, 1: 1781.5. 
Samples: 44280238. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:04,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:04,912][33226] Updated weights for policy 1, policy_version 86870 (0.0010) [2023-10-14 04:30:05,274][33226] Updated weights for policy 1, policy_version 86880 (0.0009) [2023-10-14 04:30:05,885][33201] Updated weights for policy 0, policy_version 86090 (0.0011) [2023-10-14 04:30:06,251][33201] Updated weights for policy 0, policy_version 86100 (0.0011) [2023-10-14 04:30:06,627][33201] Updated weights for policy 0, policy_version 86110 (0.0010) [2023-10-14 04:30:09,025][33226] Updated weights for policy 1, policy_version 86890 (0.0007) [2023-10-14 04:30:09,389][33226] Updated weights for policy 1, policy_version 86900 (0.0007) [2023-10-14 04:30:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 177143808. Throughput: 0: 1757.2, 1: 1799.2. Samples: 44302006. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:09,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:09,758][33226] Updated weights for policy 1, policy_version 86910 (0.0008) [2023-10-14 04:30:10,662][33201] Updated weights for policy 0, policy_version 86120 (0.0008) [2023-10-14 04:30:11,031][33201] Updated weights for policy 0, policy_version 86130 (0.0008) [2023-10-14 04:30:11,404][33201] Updated weights for policy 0, policy_version 86140 (0.0008) [2023-10-14 04:30:13,574][33226] Updated weights for policy 1, policy_version 86920 (0.0010) [2023-10-14 04:30:13,932][33226] Updated weights for policy 1, policy_version 86930 (0.0009) [2023-10-14 04:30:14,305][33226] Updated weights for policy 1, policy_version 86940 (0.0010) [2023-10-14 04:30:14,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 177242112. Throughput: 0: 1739.1, 1: 1784.4. Samples: 44311900. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:14,557][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:15,003][33201] Updated weights for policy 0, policy_version 86150 (0.0009) [2023-10-14 04:30:15,373][33201] Updated weights for policy 0, policy_version 86160 (0.0009) [2023-10-14 04:30:15,743][33201] Updated weights for policy 0, policy_version 86170 (0.0007) [2023-10-14 04:30:18,308][33226] Updated weights for policy 1, policy_version 86950 (0.0010) [2023-10-14 04:30:18,684][33226] Updated weights for policy 1, policy_version 86960 (0.0008) [2023-10-14 04:30:19,044][33226] Updated weights for policy 1, policy_version 86970 (0.0008) [2023-10-14 04:30:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 177307648. Throughput: 0: 1745.7, 1: 1803.0. Samples: 44333910. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:19,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:19,652][33201] Updated weights for policy 0, policy_version 86180 (0.0009) [2023-10-14 04:30:20,013][33201] Updated weights for policy 0, policy_version 86190 (0.0007) [2023-10-14 04:30:20,388][33201] Updated weights for policy 0, policy_version 86200 (0.0007) [2023-10-14 04:30:23,011][33226] Updated weights for policy 1, policy_version 86980 (0.0008) [2023-10-14 04:30:23,377][33226] Updated weights for policy 1, policy_version 86990 (0.0007) [2023-10-14 04:30:23,746][33226] Updated weights for policy 1, policy_version 87000 (0.0007) [2023-10-14 04:30:24,287][33201] Updated weights for policy 0, policy_version 86210 (0.0007) [2023-10-14 04:30:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 177373184. Throughput: 0: 1782.1, 1: 1783.6. Samples: 44354612. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:24,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:24,661][33201] Updated weights for policy 0, policy_version 86220 (0.0008) [2023-10-14 04:30:25,046][33201] Updated weights for policy 0, policy_version 86230 (0.0008) [2023-10-14 04:30:25,408][33201] Updated weights for policy 0, policy_version 86240 (0.0008) [2023-10-14 04:30:27,554][33226] Updated weights for policy 1, policy_version 87010 (0.0007) [2023-10-14 04:30:27,921][33226] Updated weights for policy 1, policy_version 87020 (0.0009) [2023-10-14 04:30:28,286][33226] Updated weights for policy 1, policy_version 87030 (0.0009) [2023-10-14 04:30:28,655][33226] Updated weights for policy 1, policy_version 87040 (0.0008) [2023-10-14 04:30:29,113][33201] Updated weights for policy 0, policy_version 86250 (0.0008) [2023-10-14 04:30:29,491][33201] Updated weights for policy 0, policy_version 86260 (0.0009) [2023-10-14 04:30:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 177438720. Throughput: 0: 1748.9, 1: 1794.4. Samples: 44365568. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:29,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:29,858][33201] Updated weights for policy 0, policy_version 86270 (0.0008) [2023-10-14 04:30:32,434][33226] Updated weights for policy 1, policy_version 87050 (0.0009) [2023-10-14 04:30:32,801][33226] Updated weights for policy 1, policy_version 87060 (0.0008) [2023-10-14 04:30:33,172][33226] Updated weights for policy 1, policy_version 87070 (0.0008) [2023-10-14 04:30:33,593][33201] Updated weights for policy 0, policy_version 86280 (0.0007) [2023-10-14 04:30:33,961][33201] Updated weights for policy 0, policy_version 86290 (0.0008) [2023-10-14 04:30:34,340][33201] Updated weights for policy 0, policy_version 86300 (0.0009) [2023-10-14 04:30:34,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 177537024. Throughput: 0: 1783.8, 1: 1787.1. Samples: 44386892. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:34,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:36,963][33226] Updated weights for policy 1, policy_version 87080 (0.0008) [2023-10-14 04:30:37,342][33226] Updated weights for policy 1, policy_version 87090 (0.0009) [2023-10-14 04:30:37,703][33226] Updated weights for policy 1, policy_version 87100 (0.0008) [2023-10-14 04:30:38,274][33201] Updated weights for policy 0, policy_version 86310 (0.0010) [2023-10-14 04:30:38,642][33201] Updated weights for policy 0, policy_version 86320 (0.0007) [2023-10-14 04:30:39,007][33201] Updated weights for policy 0, policy_version 86330 (0.0009) [2023-10-14 04:30:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 177602560. Throughput: 0: 1759.0, 1: 1775.3. Samples: 44407274. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:39,557][31953] Avg episode reward: [(0, '20.680'), (1, '20.960')] [2023-10-14 04:30:41,420][33226] Updated weights for policy 1, policy_version 87110 (0.0008) [2023-10-14 04:30:41,781][33226] Updated weights for policy 1, policy_version 87120 (0.0009) [2023-10-14 04:30:42,140][33226] Updated weights for policy 1, policy_version 87130 (0.0008) [2023-10-14 04:30:42,866][33201] Updated weights for policy 0, policy_version 86340 (0.0009) [2023-10-14 04:30:43,235][33201] Updated weights for policy 0, policy_version 86350 (0.0010) [2023-10-14 04:30:43,609][33201] Updated weights for policy 0, policy_version 86360 (0.0008) [2023-10-14 04:30:44,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 177668096. Throughput: 0: 1772.0, 1: 1785.4. Samples: 44418548. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-10-14 04:30:44,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:30:46,004][33226] Updated weights for policy 1, policy_version 87140 (0.0007) [2023-10-14 04:30:46,376][33226] Updated weights for policy 1, policy_version 87150 (0.0007) [2023-10-14 04:30:46,741][33226] Updated weights for policy 1, policy_version 87160 (0.0008) [2023-10-14 04:30:47,344][33201] Updated weights for policy 0, policy_version 86370 (0.0008) [2023-10-14 04:30:47,711][33201] Updated weights for policy 0, policy_version 86380 (0.0010) [2023-10-14 04:30:48,077][33201] Updated weights for policy 0, policy_version 86390 (0.0011) [2023-10-14 04:30:48,454][33201] Updated weights for policy 0, policy_version 86400 (0.0009) [2023-10-14 04:30:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 177733632. Throughput: 0: 1765.8, 1: 1764.5. Samples: 44439100. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:30:49,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.930')] [2023-10-14 04:30:50,694][33226] Updated weights for policy 1, policy_version 87170 (0.0008) [2023-10-14 04:30:51,059][33226] Updated weights for policy 1, policy_version 87180 (0.0007) [2023-10-14 04:30:51,427][33226] Updated weights for policy 1, policy_version 87190 (0.0009) [2023-10-14 04:30:51,798][33226] Updated weights for policy 1, policy_version 87200 (0.0007) [2023-10-14 04:30:52,334][33201] Updated weights for policy 0, policy_version 86410 (0.0008) [2023-10-14 04:30:52,700][33201] Updated weights for policy 0, policy_version 86420 (0.0009) [2023-10-14 04:30:53,072][33201] Updated weights for policy 0, policy_version 86430 (0.0009) [2023-10-14 04:30:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 177799168. Throughput: 0: 1759.8, 1: 1773.1. Samples: 44460986. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:30:54,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.930')] [2023-10-14 04:30:55,460][33226] Updated weights for policy 1, policy_version 87210 (0.0008) [2023-10-14 04:30:55,836][33226] Updated weights for policy 1, policy_version 87220 (0.0008) [2023-10-14 04:30:56,200][33226] Updated weights for policy 1, policy_version 87230 (0.0008) [2023-10-14 04:30:56,882][33201] Updated weights for policy 0, policy_version 86440 (0.0009) [2023-10-14 04:30:57,253][33201] Updated weights for policy 0, policy_version 86450 (0.0008) [2023-10-14 04:30:57,622][33201] Updated weights for policy 0, policy_version 86460 (0.0009) [2023-10-14 04:30:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 177864704. Throughput: 0: 1782.8, 1: 1764.3. Samples: 44471516. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:30:59,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.930')] [2023-10-14 04:30:59,903][33226] Updated weights for policy 1, policy_version 87240 (0.0010) [2023-10-14 04:31:00,273][33226] Updated weights for policy 1, policy_version 87250 (0.0011) [2023-10-14 04:31:00,644][33226] Updated weights for policy 1, policy_version 87260 (0.0009) [2023-10-14 04:31:01,231][33201] Updated weights for policy 0, policy_version 86470 (0.0009) [2023-10-14 04:31:01,598][33201] Updated weights for policy 0, policy_version 86480 (0.0008) [2023-10-14 04:31:01,963][33201] Updated weights for policy 0, policy_version 86490 (0.0008) [2023-10-14 04:31:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 177930240. Throughput: 0: 1771.1, 1: 1774.5. Samples: 44493462. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:04,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.940')] [2023-10-14 04:31:04,606][33226] Updated weights for policy 1, policy_version 87270 (0.0007) [2023-10-14 04:31:04,988][33226] Updated weights for policy 1, policy_version 87280 (0.0007) [2023-10-14 04:31:05,353][33226] Updated weights for policy 1, policy_version 87290 (0.0009) [2023-10-14 04:31:05,718][33201] Updated weights for policy 0, policy_version 86500 (0.0010) [2023-10-14 04:31:06,083][33201] Updated weights for policy 0, policy_version 86510 (0.0010) [2023-10-14 04:31:06,455][33201] Updated weights for policy 0, policy_version 86520 (0.0009) [2023-10-14 04:31:09,142][33226] Updated weights for policy 1, policy_version 87300 (0.0009) [2023-10-14 04:31:09,516][33226] Updated weights for policy 1, policy_version 87310 (0.0008) [2023-10-14 04:31:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 13995.8). Total num frames: 177995776. Throughput: 0: 1776.7, 1: 1800.8. Samples: 44515600. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:09,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.940')] [2023-10-14 04:31:09,882][33226] Updated weights for policy 1, policy_version 87320 (0.0008) [2023-10-14 04:31:10,149][33201] Updated weights for policy 0, policy_version 86530 (0.0007) [2023-10-14 04:31:10,546][33201] Updated weights for policy 0, policy_version 86540 (0.0007) [2023-10-14 04:31:10,909][33201] Updated weights for policy 0, policy_version 86550 (0.0007) [2023-10-14 04:31:11,284][33201] Updated weights for policy 0, policy_version 86560 (0.0008) [2023-10-14 04:31:13,644][33226] Updated weights for policy 1, policy_version 87330 (0.0008) [2023-10-14 04:31:14,008][33226] Updated weights for policy 1, policy_version 87340 (0.0010) [2023-10-14 04:31:14,376][33226] Updated weights for policy 1, policy_version 87350 (0.0009) [2023-10-14 04:31:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 178061312. Throughput: 0: 1779.6, 1: 1768.8. Samples: 44525244. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:14,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.940')] [2023-10-14 04:31:14,744][33226] Updated weights for policy 1, policy_version 87360 (0.0009) [2023-10-14 04:31:14,976][33201] Updated weights for policy 0, policy_version 86570 (0.0009) [2023-10-14 04:31:15,352][33201] Updated weights for policy 0, policy_version 86580 (0.0009) [2023-10-14 04:31:15,722][33201] Updated weights for policy 0, policy_version 86590 (0.0009) [2023-10-14 04:31:18,441][33226] Updated weights for policy 1, policy_version 87370 (0.0008) [2023-10-14 04:31:18,806][33226] Updated weights for policy 1, policy_version 87380 (0.0008) [2023-10-14 04:31:19,175][33226] Updated weights for policy 1, policy_version 87390 (0.0008) [2023-10-14 04:31:19,537][33201] Updated weights for policy 0, policy_version 86600 (0.0007) [2023-10-14 04:31:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 178159616. Throughput: 0: 1765.6, 1: 1801.1. Samples: 44547392. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:19,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.940')] [2023-10-14 04:31:19,900][33201] Updated weights for policy 0, policy_version 86610 (0.0009) [2023-10-14 04:31:20,281][33201] Updated weights for policy 0, policy_version 86620 (0.0010) [2023-10-14 04:31:22,943][33226] Updated weights for policy 1, policy_version 87400 (0.0011) [2023-10-14 04:31:23,315][33226] Updated weights for policy 1, policy_version 87410 (0.0009) [2023-10-14 04:31:23,682][33226] Updated weights for policy 1, policy_version 87420 (0.0011) [2023-10-14 04:31:24,171][33201] Updated weights for policy 0, policy_version 86630 (0.0009) [2023-10-14 04:31:24,547][33201] Updated weights for policy 0, policy_version 86640 (0.0009) [2023-10-14 04:31:24,557][31953] Fps is (10 sec: 16383.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 178225152. Throughput: 0: 1797.6, 1: 1777.2. 
Samples: 44568142. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:24,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 04:31:24,569][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000087424_89522176.pth... [2023-10-14 04:31:24,602][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000085760_87818240.pth [2023-10-14 04:31:24,911][33201] Updated weights for policy 0, policy_version 86650 (0.0008) [2023-10-14 04:31:25,134][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000086656_88735744.pth... [2023-10-14 04:31:25,172][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000084992_87031808.pth [2023-10-14 04:31:27,484][33226] Updated weights for policy 1, policy_version 87430 (0.0010) [2023-10-14 04:31:27,852][33226] Updated weights for policy 1, policy_version 87440 (0.0007) [2023-10-14 04:31:28,225][33226] Updated weights for policy 1, policy_version 87450 (0.0008) [2023-10-14 04:31:28,865][33201] Updated weights for policy 0, policy_version 86660 (0.0009) [2023-10-14 04:31:29,230][33201] Updated weights for policy 0, policy_version 86670 (0.0007) [2023-10-14 04:31:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 178290688. Throughput: 0: 1776.7, 1: 1801.3. Samples: 44579558. 
Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:29,560][31953] Avg episode reward: [(0, '20.880'), (1, '20.940')] [2023-10-14 04:31:29,596][33201] Updated weights for policy 0, policy_version 86680 (0.0008) [2023-10-14 04:31:31,984][33226] Updated weights for policy 1, policy_version 87460 (0.0010) [2023-10-14 04:31:32,360][33226] Updated weights for policy 1, policy_version 87470 (0.0011) [2023-10-14 04:31:32,723][33226] Updated weights for policy 1, policy_version 87480 (0.0008) [2023-10-14 04:31:33,439][33201] Updated weights for policy 0, policy_version 86690 (0.0007) [2023-10-14 04:31:33,798][33201] Updated weights for policy 0, policy_version 86700 (0.0009) [2023-10-14 04:31:34,168][33201] Updated weights for policy 0, policy_version 86710 (0.0009) [2023-10-14 04:31:34,544][33201] Updated weights for policy 0, policy_version 86720 (0.0007) [2023-10-14 04:31:34,557][31953] Fps is (10 sec: 16384.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178388992. Throughput: 0: 1794.7, 1: 1787.9. Samples: 44600318. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:34,557][31953] Avg episode reward: [(0, '20.880'), (1, '20.950')] [2023-10-14 04:31:36,403][33226] Updated weights for policy 1, policy_version 87490 (0.0008) [2023-10-14 04:31:36,762][33226] Updated weights for policy 1, policy_version 87500 (0.0007) [2023-10-14 04:31:37,133][33226] Updated weights for policy 1, policy_version 87510 (0.0007) [2023-10-14 04:31:37,493][33226] Updated weights for policy 1, policy_version 87520 (0.0008) [2023-10-14 04:31:38,345][33201] Updated weights for policy 0, policy_version 86730 (0.0011) [2023-10-14 04:31:38,718][33201] Updated weights for policy 0, policy_version 86740 (0.0011) [2023-10-14 04:31:39,093][33201] Updated weights for policy 0, policy_version 86750 (0.0010) [2023-10-14 04:31:39,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 178454528. Throughput: 0: 1770.1, 1: 1786.4. 
Samples: 44621028. Policy #0 lag: (min: 26.0, avg: 26.6, max: 42.0) [2023-10-14 04:31:39,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.950')] [2023-10-14 04:31:41,528][33226] Updated weights for policy 1, policy_version 87530 (0.0009) [2023-10-14 04:31:41,890][33226] Updated weights for policy 1, policy_version 87540 (0.0009) [2023-10-14 04:31:42,260][33226] Updated weights for policy 1, policy_version 87550 (0.0007) [2023-10-14 04:31:42,949][33201] Updated weights for policy 0, policy_version 86760 (0.0007) [2023-10-14 04:31:43,317][33201] Updated weights for policy 0, policy_version 86770 (0.0008) [2023-10-14 04:31:43,682][33201] Updated weights for policy 0, policy_version 86780 (0.0007) [2023-10-14 04:31:44,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 178520064. Throughput: 0: 1776.7, 1: 1795.2. Samples: 44632252. Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:31:44,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:31:45,933][33226] Updated weights for policy 1, policy_version 87560 (0.0008) [2023-10-14 04:31:46,292][33226] Updated weights for policy 1, policy_version 87570 (0.0011) [2023-10-14 04:31:46,665][33226] Updated weights for policy 1, policy_version 87580 (0.0010) [2023-10-14 04:31:47,515][33201] Updated weights for policy 0, policy_version 86790 (0.0009) [2023-10-14 04:31:47,883][33201] Updated weights for policy 0, policy_version 86800 (0.0009) [2023-10-14 04:31:48,255][33201] Updated weights for policy 0, policy_version 86810 (0.0007) [2023-10-14 04:31:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178585600. Throughput: 0: 1763.0, 1: 1782.7. Samples: 44653020. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:31:49,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.910')] [2023-10-14 04:31:50,459][33226] Updated weights for policy 1, policy_version 87590 (0.0008) [2023-10-14 04:31:50,850][33226] Updated weights for policy 1, policy_version 87600 (0.0008) [2023-10-14 04:31:51,222][33226] Updated weights for policy 1, policy_version 87610 (0.0009) [2023-10-14 04:31:52,205][33201] Updated weights for policy 0, policy_version 86820 (0.0010) [2023-10-14 04:31:52,578][33201] Updated weights for policy 0, policy_version 86830 (0.0010) [2023-10-14 04:31:52,939][33201] Updated weights for policy 0, policy_version 86840 (0.0008) [2023-10-14 04:31:54,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178651136. Throughput: 0: 1742.7, 1: 1786.9. Samples: 44674434. Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:31:54,558][31953] Avg episode reward: [(0, '20.700'), (1, '20.910')] [2023-10-14 04:31:54,845][33226] Updated weights for policy 1, policy_version 87620 (0.0011) [2023-10-14 04:31:55,215][33226] Updated weights for policy 1, policy_version 87630 (0.0010) [2023-10-14 04:31:55,585][33226] Updated weights for policy 1, policy_version 87640 (0.0010) [2023-10-14 04:31:56,772][33201] Updated weights for policy 0, policy_version 86850 (0.0008) [2023-10-14 04:31:57,172][33201] Updated weights for policy 0, policy_version 86860 (0.0009) [2023-10-14 04:31:57,540][33201] Updated weights for policy 0, policy_version 86870 (0.0011) [2023-10-14 04:31:57,916][33201] Updated weights for policy 0, policy_version 86880 (0.0010) [2023-10-14 04:31:59,347][33226] Updated weights for policy 1, policy_version 87650 (0.0009) [2023-10-14 04:31:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178716672. Throughput: 0: 1766.9, 1: 1786.8. Samples: 44685162. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:31:59,557][31953] Avg episode reward: [(0, '20.700'), (1, '20.910')] [2023-10-14 04:31:59,719][33226] Updated weights for policy 1, policy_version 87660 (0.0009) [2023-10-14 04:32:00,089][33226] Updated weights for policy 1, policy_version 87670 (0.0008) [2023-10-14 04:32:00,456][33226] Updated weights for policy 1, policy_version 87680 (0.0008) [2023-10-14 04:32:01,651][33201] Updated weights for policy 0, policy_version 86890 (0.0009) [2023-10-14 04:32:02,026][33201] Updated weights for policy 0, policy_version 86900 (0.0007) [2023-10-14 04:32:02,395][33201] Updated weights for policy 0, policy_version 86910 (0.0008) [2023-10-14 04:32:04,145][33226] Updated weights for policy 1, policy_version 87690 (0.0011) [2023-10-14 04:32:04,516][33226] Updated weights for policy 1, policy_version 87700 (0.0009) [2023-10-14 04:32:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178782208. Throughput: 0: 1755.6, 1: 1779.8. Samples: 44706486. Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:04,557][31953] Avg episode reward: [(0, '20.280'), (1, '20.920')] [2023-10-14 04:32:04,873][33226] Updated weights for policy 1, policy_version 87710 (0.0009) [2023-10-14 04:32:06,020][33201] Updated weights for policy 0, policy_version 86920 (0.0010) [2023-10-14 04:32:06,388][33201] Updated weights for policy 0, policy_version 86930 (0.0010) [2023-10-14 04:32:06,766][33201] Updated weights for policy 0, policy_version 86940 (0.0008) [2023-10-14 04:32:08,882][33226] Updated weights for policy 1, policy_version 87720 (0.0009) [2023-10-14 04:32:09,245][33226] Updated weights for policy 1, policy_version 87730 (0.0011) [2023-10-14 04:32:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 178847744. Throughput: 0: 1757.2, 1: 1795.1. Samples: 44727992. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:09,558][31953] Avg episode reward: [(0, '20.260'), (1, '20.900')] [2023-10-14 04:32:09,616][33226] Updated weights for policy 1, policy_version 87740 (0.0009) [2023-10-14 04:32:10,651][33201] Updated weights for policy 0, policy_version 86950 (0.0009) [2023-10-14 04:32:11,017][33201] Updated weights for policy 0, policy_version 86960 (0.0009) [2023-10-14 04:32:11,381][33201] Updated weights for policy 0, policy_version 86970 (0.0010) [2023-10-14 04:32:13,232][33226] Updated weights for policy 1, policy_version 87750 (0.0008) [2023-10-14 04:32:13,606][33226] Updated weights for policy 1, policy_version 87760 (0.0009) [2023-10-14 04:32:13,972][33226] Updated weights for policy 1, policy_version 87770 (0.0009) [2023-10-14 04:32:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 178946048. Throughput: 0: 1755.7, 1: 1770.6. Samples: 44738242. Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:14,557][31953] Avg episode reward: [(0, '20.260'), (1, '20.900')] [2023-10-14 04:32:15,144][33201] Updated weights for policy 0, policy_version 86980 (0.0009) [2023-10-14 04:32:15,508][33201] Updated weights for policy 0, policy_version 86990 (0.0008) [2023-10-14 04:32:15,870][33201] Updated weights for policy 0, policy_version 87000 (0.0007) [2023-10-14 04:32:17,671][33226] Updated weights for policy 1, policy_version 87780 (0.0009) [2023-10-14 04:32:18,034][33226] Updated weights for policy 1, policy_version 87790 (0.0009) [2023-10-14 04:32:18,401][33226] Updated weights for policy 1, policy_version 87800 (0.0008) [2023-10-14 04:32:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 179011584. Throughput: 0: 1757.5, 1: 1791.2. Samples: 44760010. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:19,558][31953] Avg episode reward: [(0, '19.930'), (1, '20.880')] [2023-10-14 04:32:19,641][33201] Updated weights for policy 0, policy_version 87010 (0.0009) [2023-10-14 04:32:20,018][33201] Updated weights for policy 0, policy_version 87020 (0.0008) [2023-10-14 04:32:20,381][33201] Updated weights for policy 0, policy_version 87030 (0.0008) [2023-10-14 04:32:20,747][33201] Updated weights for policy 0, policy_version 87040 (0.0007) [2023-10-14 04:32:22,363][33226] Updated weights for policy 1, policy_version 87810 (0.0008) [2023-10-14 04:32:22,734][33226] Updated weights for policy 1, policy_version 87820 (0.0009) [2023-10-14 04:32:23,107][33226] Updated weights for policy 1, policy_version 87830 (0.0009) [2023-10-14 04:32:23,484][33226] Updated weights for policy 1, policy_version 87840 (0.0007) [2023-10-14 04:32:24,499][33201] Updated weights for policy 0, policy_version 87050 (0.0007) [2023-10-14 04:32:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 179077120. Throughput: 0: 1791.7, 1: 1764.9. Samples: 44781076. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:24,558][31953] Avg episode reward: [(0, '19.860'), (1, '20.870')] [2023-10-14 04:32:24,868][33201] Updated weights for policy 0, policy_version 87060 (0.0009) [2023-10-14 04:32:25,236][33201] Updated weights for policy 0, policy_version 87070 (0.0008) [2023-10-14 04:32:27,213][33226] Updated weights for policy 1, policy_version 87850 (0.0011) [2023-10-14 04:32:27,567][33226] Updated weights for policy 1, policy_version 87860 (0.0011) [2023-10-14 04:32:27,932][33226] Updated weights for policy 1, policy_version 87870 (0.0008) [2023-10-14 04:32:29,160][33201] Updated weights for policy 0, policy_version 87080 (0.0009) [2023-10-14 04:32:29,529][33201] Updated weights for policy 0, policy_version 87090 (0.0009) [2023-10-14 04:32:29,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 179142656. Throughput: 0: 1764.0, 1: 1786.2. Samples: 44792008. Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:29,558][31953] Avg episode reward: [(0, '19.860'), (1, '20.870')] [2023-10-14 04:32:29,903][33201] Updated weights for policy 0, policy_version 87100 (0.0008) [2023-10-14 04:32:31,771][33226] Updated weights for policy 1, policy_version 87880 (0.0008) [2023-10-14 04:32:32,146][33226] Updated weights for policy 1, policy_version 87890 (0.0008) [2023-10-14 04:32:32,500][33226] Updated weights for policy 1, policy_version 87900 (0.0007) [2023-10-14 04:32:33,748][33201] Updated weights for policy 0, policy_version 87110 (0.0007) [2023-10-14 04:32:34,124][33201] Updated weights for policy 0, policy_version 87120 (0.0008) [2023-10-14 04:32:34,496][33201] Updated weights for policy 0, policy_version 87130 (0.0007) [2023-10-14 04:32:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 179208192. Throughput: 0: 1794.4, 1: 1761.7. Samples: 44813046. 
Policy #0 lag: (min: 7.0, avg: 13.3, max: 39.0) [2023-10-14 04:32:34,558][31953] Avg episode reward: [(0, '19.770'), (1, '20.900')] [2023-10-14 04:32:36,388][33226] Updated weights for policy 1, policy_version 87910 (0.0010) [2023-10-14 04:32:36,766][33226] Updated weights for policy 1, policy_version 87920 (0.0009) [2023-10-14 04:32:37,139][33226] Updated weights for policy 1, policy_version 87930 (0.0010) [2023-10-14 04:32:38,300][33201] Updated weights for policy 0, policy_version 87140 (0.0008) [2023-10-14 04:32:38,671][33201] Updated weights for policy 0, policy_version 87150 (0.0009) [2023-10-14 04:32:39,046][33201] Updated weights for policy 0, policy_version 87160 (0.0007) [2023-10-14 04:32:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 179306496. Throughput: 0: 1779.9, 1: 1764.4. Samples: 44833928. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:32:39,558][31953] Avg episode reward: [(0, '19.770'), (1, '20.900')] [2023-10-14 04:32:40,925][33226] Updated weights for policy 1, policy_version 87940 (0.0008) [2023-10-14 04:32:41,303][33226] Updated weights for policy 1, policy_version 87950 (0.0009) [2023-10-14 04:32:41,666][33226] Updated weights for policy 1, policy_version 87960 (0.0011) [2023-10-14 04:32:42,843][33201] Updated weights for policy 0, policy_version 87170 (0.0009) [2023-10-14 04:32:43,255][33201] Updated weights for policy 0, policy_version 87180 (0.0011) [2023-10-14 04:32:43,638][33201] Updated weights for policy 0, policy_version 87190 (0.0007) [2023-10-14 04:32:44,001][33201] Updated weights for policy 0, policy_version 87200 (0.0009) [2023-10-14 04:32:44,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 179372032. Throughput: 0: 1780.7, 1: 1764.8. Samples: 44844710. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:32:44,558][31953] Avg episode reward: [(0, '19.770'), (1, '20.900')] [2023-10-14 04:32:45,462][33226] Updated weights for policy 1, policy_version 87970 (0.0008) [2023-10-14 04:32:45,826][33226] Updated weights for policy 1, policy_version 87980 (0.0007) [2023-10-14 04:32:46,187][33226] Updated weights for policy 1, policy_version 87990 (0.0010) [2023-10-14 04:32:46,550][33226] Updated weights for policy 1, policy_version 88000 (0.0008) [2023-10-14 04:32:47,816][33201] Updated weights for policy 0, policy_version 87210 (0.0010) [2023-10-14 04:32:48,191][33201] Updated weights for policy 0, policy_version 87220 (0.0009) [2023-10-14 04:32:48,557][33201] Updated weights for policy 0, policy_version 87230 (0.0009) [2023-10-14 04:32:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 179437568. Throughput: 0: 1777.0, 1: 1764.4. Samples: 44865850. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:32:49,558][31953] Avg episode reward: [(0, '19.750'), (1, '20.900')] [2023-10-14 04:32:50,319][33226] Updated weights for policy 1, policy_version 88010 (0.0007) [2023-10-14 04:32:50,696][33226] Updated weights for policy 1, policy_version 88020 (0.0007) [2023-10-14 04:32:51,067][33226] Updated weights for policy 1, policy_version 88030 (0.0008) [2023-10-14 04:32:52,390][33201] Updated weights for policy 0, policy_version 87240 (0.0007) [2023-10-14 04:32:52,773][33201] Updated weights for policy 0, policy_version 87250 (0.0009) [2023-10-14 04:32:53,137][33201] Updated weights for policy 0, policy_version 87260 (0.0007) [2023-10-14 04:32:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 179503104. Throughput: 0: 1759.0, 1: 1781.8. Samples: 44887326. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:32:54,558][31953] Avg episode reward: [(0, '19.770'), (1, '20.900')] [2023-10-14 04:32:54,918][33226] Updated weights for policy 1, policy_version 88040 (0.0009) [2023-10-14 04:32:55,289][33226] Updated weights for policy 1, policy_version 88050 (0.0010) [2023-10-14 04:32:55,654][33226] Updated weights for policy 1, policy_version 88060 (0.0008) [2023-10-14 04:32:56,842][33201] Updated weights for policy 0, policy_version 87270 (0.0010) [2023-10-14 04:32:57,223][33201] Updated weights for policy 0, policy_version 87280 (0.0008) [2023-10-14 04:32:57,588][33201] Updated weights for policy 0, policy_version 87290 (0.0010) [2023-10-14 04:32:59,239][33226] Updated weights for policy 1, policy_version 88070 (0.0009) [2023-10-14 04:32:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 179568640. Throughput: 0: 1778.0, 1: 1771.1. Samples: 44897952. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:32:59,558][31953] Avg episode reward: [(0, '19.810'), (1, '20.900')] [2023-10-14 04:32:59,614][33226] Updated weights for policy 1, policy_version 88080 (0.0008) [2023-10-14 04:32:59,994][33226] Updated weights for policy 1, policy_version 88090 (0.0008) [2023-10-14 04:33:01,613][33201] Updated weights for policy 0, policy_version 87300 (0.0010) [2023-10-14 04:33:01,987][33201] Updated weights for policy 0, policy_version 87310 (0.0008) [2023-10-14 04:33:02,352][33201] Updated weights for policy 0, policy_version 87320 (0.0009) [2023-10-14 04:33:03,795][33226] Updated weights for policy 1, policy_version 88100 (0.0009) [2023-10-14 04:33:04,153][33226] Updated weights for policy 1, policy_version 88110 (0.0010) [2023-10-14 04:33:04,520][33226] Updated weights for policy 1, policy_version 88120 (0.0009) [2023-10-14 04:33:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 179634176. Throughput: 0: 1752.1, 1: 1784.2. 
Samples: 44919142. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:04,558][31953] Avg episode reward: [(0, '19.810'), (1, '20.900')] [2023-10-14 04:33:06,169][33201] Updated weights for policy 0, policy_version 87330 (0.0009) [2023-10-14 04:33:06,529][33201] Updated weights for policy 0, policy_version 87340 (0.0011) [2023-10-14 04:33:06,903][33201] Updated weights for policy 0, policy_version 87350 (0.0011) [2023-10-14 04:33:07,273][33201] Updated weights for policy 0, policy_version 87360 (0.0010) [2023-10-14 04:33:08,353][33226] Updated weights for policy 1, policy_version 88130 (0.0010) [2023-10-14 04:33:08,723][33226] Updated weights for policy 1, policy_version 88140 (0.0010) [2023-10-14 04:33:09,082][33226] Updated weights for policy 1, policy_version 88150 (0.0009) [2023-10-14 04:33:09,442][33226] Updated weights for policy 1, policy_version 88160 (0.0010) [2023-10-14 04:33:09,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 179732480. Throughput: 0: 1751.6, 1: 1788.2. Samples: 44940366. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:09,558][31953] Avg episode reward: [(0, '19.830'), (1, '20.890')] [2023-10-14 04:33:11,150][33201] Updated weights for policy 0, policy_version 87370 (0.0011) [2023-10-14 04:33:11,521][33201] Updated weights for policy 0, policy_version 87380 (0.0010) [2023-10-14 04:33:11,890][33201] Updated weights for policy 0, policy_version 87390 (0.0009) [2023-10-14 04:33:13,430][33226] Updated weights for policy 1, policy_version 88170 (0.0007) [2023-10-14 04:33:13,797][33226] Updated weights for policy 1, policy_version 88180 (0.0008) [2023-10-14 04:33:14,157][33226] Updated weights for policy 1, policy_version 88190 (0.0008) [2023-10-14 04:33:14,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 179798016. Throughput: 0: 1749.4, 1: 1772.4. Samples: 44950488. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:14,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.880')] [2023-10-14 04:33:15,623][33201] Updated weights for policy 0, policy_version 87400 (0.0009) [2023-10-14 04:33:15,987][33201] Updated weights for policy 0, policy_version 87410 (0.0008) [2023-10-14 04:33:16,353][33201] Updated weights for policy 0, policy_version 87420 (0.0008) [2023-10-14 04:33:17,833][33226] Updated weights for policy 1, policy_version 88200 (0.0008) [2023-10-14 04:33:18,197][33226] Updated weights for policy 1, policy_version 88210 (0.0007) [2023-10-14 04:33:18,568][33226] Updated weights for policy 1, policy_version 88220 (0.0009) [2023-10-14 04:33:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 179863552. Throughput: 0: 1750.5, 1: 1796.1. Samples: 44972646. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:19,558][31953] Avg episode reward: [(0, '19.850'), (1, '20.870')] [2023-10-14 04:33:20,162][33201] Updated weights for policy 0, policy_version 87430 (0.0008) [2023-10-14 04:33:20,541][33201] Updated weights for policy 0, policy_version 87440 (0.0008) [2023-10-14 04:33:20,911][33201] Updated weights for policy 0, policy_version 87450 (0.0007) [2023-10-14 04:33:22,455][33226] Updated weights for policy 1, policy_version 88230 (0.0007) [2023-10-14 04:33:22,837][33226] Updated weights for policy 1, policy_version 88240 (0.0009) [2023-10-14 04:33:23,196][33226] Updated weights for policy 1, policy_version 88250 (0.0008) [2023-10-14 04:33:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 179929088. Throughput: 0: 1776.4, 1: 1770.8. Samples: 44993552. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:24,558][31953] Avg episode reward: [(0, '19.980'), (1, '20.870')] [2023-10-14 04:33:24,571][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000088256_90374144.pth... 
[2023-10-14 04:33:24,612][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000086592_88670208.pth [2023-10-14 04:33:24,780][33201] Updated weights for policy 0, policy_version 87460 (0.0009) [2023-10-14 04:33:25,149][33201] Updated weights for policy 0, policy_version 87470 (0.0010) [2023-10-14 04:33:25,519][33201] Updated weights for policy 0, policy_version 87480 (0.0007) [2023-10-14 04:33:25,810][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000087488_89587712.pth... [2023-10-14 04:33:25,839][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000085824_87883776.pth [2023-10-14 04:33:26,910][33226] Updated weights for policy 1, policy_version 88260 (0.0007) [2023-10-14 04:33:27,277][33226] Updated weights for policy 1, policy_version 88270 (0.0008) [2023-10-14 04:33:27,645][33226] Updated weights for policy 1, policy_version 88280 (0.0008) [2023-10-14 04:33:29,454][33201] Updated weights for policy 0, policy_version 87490 (0.0008) [2023-10-14 04:33:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 179994624. Throughput: 0: 1751.8, 1: 1797.8. Samples: 45004442. 
Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:29,558][31953] Avg episode reward: [(0, '19.960'), (1, '20.900')] [2023-10-14 04:33:29,831][33201] Updated weights for policy 0, policy_version 87500 (0.0008) [2023-10-14 04:33:30,202][33201] Updated weights for policy 0, policy_version 87510 (0.0010) [2023-10-14 04:33:30,574][33201] Updated weights for policy 0, policy_version 87520 (0.0009) [2023-10-14 04:33:31,571][33226] Updated weights for policy 1, policy_version 88290 (0.0010) [2023-10-14 04:33:31,940][33226] Updated weights for policy 1, policy_version 88300 (0.0008) [2023-10-14 04:33:32,312][33226] Updated weights for policy 1, policy_version 88310 (0.0008) [2023-10-14 04:33:32,672][33226] Updated weights for policy 1, policy_version 88320 (0.0008) [2023-10-14 04:33:34,339][33201] Updated weights for policy 0, policy_version 87530 (0.0011) [2023-10-14 04:33:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 180060160. Throughput: 0: 1773.2, 1: 1767.0. Samples: 45025160. Policy #0 lag: (min: 31.0, avg: 39.4, max: 63.0) [2023-10-14 04:33:34,558][31953] Avg episode reward: [(0, '19.960'), (1, '20.900')] [2023-10-14 04:33:34,713][33201] Updated weights for policy 0, policy_version 87540 (0.0007) [2023-10-14 04:33:35,085][33201] Updated weights for policy 0, policy_version 87550 (0.0007) [2023-10-14 04:33:36,412][33226] Updated weights for policy 1, policy_version 88330 (0.0009) [2023-10-14 04:33:36,788][33226] Updated weights for policy 1, policy_version 88340 (0.0008) [2023-10-14 04:33:37,155][33226] Updated weights for policy 1, policy_version 88350 (0.0007) [2023-10-14 04:33:38,984][33201] Updated weights for policy 0, policy_version 87560 (0.0008) [2023-10-14 04:33:39,345][33201] Updated weights for policy 0, policy_version 87570 (0.0007) [2023-10-14 04:33:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 180125696. Throughput: 0: 1772.6, 1: 1766.3. 
Samples: 45046578. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:33:39,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.900')] [2023-10-14 04:33:39,721][33201] Updated weights for policy 0, policy_version 87580 (0.0011) [2023-10-14 04:33:40,978][33226] Updated weights for policy 1, policy_version 88360 (0.0008) [2023-10-14 04:33:41,341][33226] Updated weights for policy 1, policy_version 88370 (0.0009) [2023-10-14 04:33:41,712][33226] Updated weights for policy 1, policy_version 88380 (0.0007) [2023-10-14 04:33:43,586][33201] Updated weights for policy 0, policy_version 87590 (0.0010) [2023-10-14 04:33:43,956][33201] Updated weights for policy 0, policy_version 87600 (0.0008) [2023-10-14 04:33:44,326][33201] Updated weights for policy 0, policy_version 87610 (0.0009) [2023-10-14 04:33:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 180224000. Throughput: 0: 1762.0, 1: 1766.4. Samples: 45056726. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:33:44,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.900')] [2023-10-14 04:33:45,423][33226] Updated weights for policy 1, policy_version 88390 (0.0009) [2023-10-14 04:33:45,786][33226] Updated weights for policy 1, policy_version 88400 (0.0010) [2023-10-14 04:33:46,149][33226] Updated weights for policy 1, policy_version 88410 (0.0009) [2023-10-14 04:33:48,134][33201] Updated weights for policy 0, policy_version 87620 (0.0007) [2023-10-14 04:33:48,507][33201] Updated weights for policy 0, policy_version 87630 (0.0008) [2023-10-14 04:33:48,873][33201] Updated weights for policy 0, policy_version 87640 (0.0008) [2023-10-14 04:33:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 180289536. Throughput: 0: 1784.7, 1: 1769.1. Samples: 45079066. 
Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:33:49,558][31953] Avg episode reward: [(0, '20.400'), (1, '20.880')] [2023-10-14 04:33:49,818][33226] Updated weights for policy 1, policy_version 88420 (0.0008) [2023-10-14 04:33:50,186][33226] Updated weights for policy 1, policy_version 88430 (0.0007) [2023-10-14 04:33:50,549][33226] Updated weights for policy 1, policy_version 88440 (0.0009) [2023-10-14 04:33:52,739][33201] Updated weights for policy 0, policy_version 87650 (0.0008) [2023-10-14 04:33:53,107][33201] Updated weights for policy 0, policy_version 87660 (0.0007) [2023-10-14 04:33:53,486][33201] Updated weights for policy 0, policy_version 87670 (0.0007) [2023-10-14 04:33:53,850][33201] Updated weights for policy 0, policy_version 87680 (0.0008) [2023-10-14 04:33:54,285][33226] Updated weights for policy 1, policy_version 88450 (0.0010) [2023-10-14 04:33:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 180355072. Throughput: 0: 1754.6, 1: 1795.2. Samples: 45100106. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:33:54,558][31953] Avg episode reward: [(0, '20.690'), (1, '20.900')] [2023-10-14 04:33:54,652][33226] Updated weights for policy 1, policy_version 88460 (0.0007) [2023-10-14 04:33:55,009][33226] Updated weights for policy 1, policy_version 88470 (0.0008) [2023-10-14 04:33:55,379][33226] Updated weights for policy 1, policy_version 88480 (0.0008) [2023-10-14 04:33:57,584][33201] Updated weights for policy 0, policy_version 87690 (0.0008) [2023-10-14 04:33:57,944][33201] Updated weights for policy 0, policy_version 87700 (0.0007) [2023-10-14 04:33:58,317][33201] Updated weights for policy 0, policy_version 87710 (0.0007) [2023-10-14 04:33:59,274][33226] Updated weights for policy 1, policy_version 88490 (0.0011) [2023-10-14 04:33:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 180420608. Throughput: 0: 1790.8, 1: 1777.9. 
Samples: 45111078. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:33:59,558][31953] Avg episode reward: [(0, '20.650'), (1, '20.920')] [2023-10-14 04:33:59,636][33226] Updated weights for policy 1, policy_version 88500 (0.0008) [2023-10-14 04:34:00,008][33226] Updated weights for policy 1, policy_version 88510 (0.0007) [2023-10-14 04:34:02,225][33201] Updated weights for policy 0, policy_version 87720 (0.0011) [2023-10-14 04:34:02,589][33201] Updated weights for policy 0, policy_version 87730 (0.0010) [2023-10-14 04:34:02,965][33201] Updated weights for policy 0, policy_version 87740 (0.0009) [2023-10-14 04:34:03,785][33226] Updated weights for policy 1, policy_version 88520 (0.0008) [2023-10-14 04:34:04,151][33226] Updated weights for policy 1, policy_version 88530 (0.0008) [2023-10-14 04:34:04,517][33226] Updated weights for policy 1, policy_version 88540 (0.0008) [2023-10-14 04:34:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 180486144. Throughput: 0: 1749.8, 1: 1784.1. Samples: 45131672. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:04,558][31953] Avg episode reward: [(0, '20.660'), (1, '20.920')] [2023-10-14 04:34:06,743][33201] Updated weights for policy 0, policy_version 87750 (0.0009) [2023-10-14 04:34:07,113][33201] Updated weights for policy 0, policy_version 87760 (0.0008) [2023-10-14 04:34:07,490][33201] Updated weights for policy 0, policy_version 87770 (0.0007) [2023-10-14 04:34:08,330][33226] Updated weights for policy 1, policy_version 88550 (0.0010) [2023-10-14 04:34:08,699][33226] Updated weights for policy 1, policy_version 88560 (0.0008) [2023-10-14 04:34:09,060][33226] Updated weights for policy 1, policy_version 88570 (0.0007) [2023-10-14 04:34:09,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 180584448. Throughput: 0: 1755.7, 1: 1784.7. Samples: 45152872. 
Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:09,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.930')] [2023-10-14 04:34:11,254][33201] Updated weights for policy 0, policy_version 87780 (0.0008) [2023-10-14 04:34:11,615][33201] Updated weights for policy 0, policy_version 87790 (0.0009) [2023-10-14 04:34:11,983][33201] Updated weights for policy 0, policy_version 87800 (0.0008) [2023-10-14 04:34:12,911][33226] Updated weights for policy 1, policy_version 88580 (0.0007) [2023-10-14 04:34:13,278][33226] Updated weights for policy 1, policy_version 88590 (0.0009) [2023-10-14 04:34:13,646][33226] Updated weights for policy 1, policy_version 88600 (0.0008) [2023-10-14 04:34:14,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 180649984. Throughput: 0: 1757.9, 1: 1775.7. Samples: 45163456. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:14,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.910')] [2023-10-14 04:34:15,717][33201] Updated weights for policy 0, policy_version 87810 (0.0007) [2023-10-14 04:34:16,108][33201] Updated weights for policy 0, policy_version 87820 (0.0009) [2023-10-14 04:34:16,491][33201] Updated weights for policy 0, policy_version 87830 (0.0007) [2023-10-14 04:34:16,856][33201] Updated weights for policy 0, policy_version 87840 (0.0009) [2023-10-14 04:34:17,586][33226] Updated weights for policy 1, policy_version 88610 (0.0009) [2023-10-14 04:34:17,947][33226] Updated weights for policy 1, policy_version 88620 (0.0010) [2023-10-14 04:34:18,318][33226] Updated weights for policy 1, policy_version 88630 (0.0010) [2023-10-14 04:34:18,685][33226] Updated weights for policy 1, policy_version 88640 (0.0011) [2023-10-14 04:34:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 180715520. Throughput: 0: 1758.9, 1: 1793.8. Samples: 45185030. 
Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:19,558][31953] Avg episode reward: [(0, '20.680'), (1, '20.910')] [2023-10-14 04:34:20,495][33201] Updated weights for policy 0, policy_version 87850 (0.0008) [2023-10-14 04:34:20,862][33201] Updated weights for policy 0, policy_version 87860 (0.0009) [2023-10-14 04:34:21,235][33201] Updated weights for policy 0, policy_version 87870 (0.0007) [2023-10-14 04:34:22,445][33226] Updated weights for policy 1, policy_version 88650 (0.0009) [2023-10-14 04:34:22,815][33226] Updated weights for policy 1, policy_version 88660 (0.0011) [2023-10-14 04:34:23,183][33226] Updated weights for policy 1, policy_version 88670 (0.0011) [2023-10-14 04:34:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 180781056. Throughput: 0: 1775.7, 1: 1773.2. Samples: 45206282. Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:24,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:34:25,022][33201] Updated weights for policy 0, policy_version 87880 (0.0008) [2023-10-14 04:34:25,401][33201] Updated weights for policy 0, policy_version 87890 (0.0008) [2023-10-14 04:34:25,780][33201] Updated weights for policy 0, policy_version 87900 (0.0009) [2023-10-14 04:34:27,147][33226] Updated weights for policy 1, policy_version 88680 (0.0011) [2023-10-14 04:34:27,517][33226] Updated weights for policy 1, policy_version 88690 (0.0008) [2023-10-14 04:34:27,879][33226] Updated weights for policy 1, policy_version 88700 (0.0007) [2023-10-14 04:34:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 180846592. Throughput: 0: 1762.7, 1: 1798.1. Samples: 45216964. 
Policy #0 lag: (min: 17.0, avg: 26.7, max: 49.0) [2023-10-14 04:34:29,557][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:34:29,647][33201] Updated weights for policy 0, policy_version 87910 (0.0009) [2023-10-14 04:34:30,014][33201] Updated weights for policy 0, policy_version 87920 (0.0008) [2023-10-14 04:34:30,396][33201] Updated weights for policy 0, policy_version 87930 (0.0007) [2023-10-14 04:34:31,482][33226] Updated weights for policy 1, policy_version 88710 (0.0007) [2023-10-14 04:34:31,859][33226] Updated weights for policy 1, policy_version 88720 (0.0007) [2023-10-14 04:34:32,221][33226] Updated weights for policy 1, policy_version 88730 (0.0007) [2023-10-14 04:34:34,242][33201] Updated weights for policy 0, policy_version 87940 (0.0007) [2023-10-14 04:34:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 180912128. Throughput: 0: 1765.4, 1: 1769.5. Samples: 45238136. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:34,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:34:34,614][33201] Updated weights for policy 0, policy_version 87950 (0.0007) [2023-10-14 04:34:34,983][33201] Updated weights for policy 0, policy_version 87960 (0.0008) [2023-10-14 04:34:35,870][33226] Updated weights for policy 1, policy_version 88740 (0.0008) [2023-10-14 04:34:36,243][33226] Updated weights for policy 1, policy_version 88750 (0.0008) [2023-10-14 04:34:36,608][33226] Updated weights for policy 1, policy_version 88760 (0.0009) [2023-10-14 04:34:38,777][33201] Updated weights for policy 0, policy_version 87970 (0.0007) [2023-10-14 04:34:39,150][33201] Updated weights for policy 0, policy_version 87980 (0.0008) [2023-10-14 04:34:39,515][33201] Updated weights for policy 0, policy_version 87990 (0.0008) [2023-10-14 04:34:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 180977664. Throughput: 0: 1785.6, 1: 1771.2. 
Samples: 45260164. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:39,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:34:39,877][33201] Updated weights for policy 0, policy_version 88000 (0.0008) [2023-10-14 04:34:40,428][33226] Updated weights for policy 1, policy_version 88770 (0.0007) [2023-10-14 04:34:40,796][33226] Updated weights for policy 1, policy_version 88780 (0.0008) [2023-10-14 04:34:41,172][33226] Updated weights for policy 1, policy_version 88790 (0.0008) [2023-10-14 04:34:41,537][33226] Updated weights for policy 1, policy_version 88800 (0.0008) [2023-10-14 04:34:43,650][33201] Updated weights for policy 0, policy_version 88010 (0.0007) [2023-10-14 04:34:44,028][33201] Updated weights for policy 0, policy_version 88020 (0.0007) [2023-10-14 04:34:44,401][33201] Updated weights for policy 0, policy_version 88030 (0.0010) [2023-10-14 04:34:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181075968. Throughput: 0: 1760.7, 1: 1776.0. Samples: 45270228. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:44,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.910')] [2023-10-14 04:34:45,375][33226] Updated weights for policy 1, policy_version 88810 (0.0008) [2023-10-14 04:34:45,738][33226] Updated weights for policy 1, policy_version 88820 (0.0007) [2023-10-14 04:34:46,107][33226] Updated weights for policy 1, policy_version 88830 (0.0008) [2023-10-14 04:34:48,233][33201] Updated weights for policy 0, policy_version 88040 (0.0009) [2023-10-14 04:34:48,602][33201] Updated weights for policy 0, policy_version 88050 (0.0010) [2023-10-14 04:34:48,970][33201] Updated weights for policy 0, policy_version 88060 (0.0007) [2023-10-14 04:34:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181141504. Throughput: 0: 1794.5, 1: 1772.9. Samples: 45292204. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:49,557][31953] Avg episode reward: [(0, '20.720'), (1, '20.880')] [2023-10-14 04:34:49,824][33226] Updated weights for policy 1, policy_version 88840 (0.0008) [2023-10-14 04:34:50,193][33226] Updated weights for policy 1, policy_version 88850 (0.0007) [2023-10-14 04:34:50,555][33226] Updated weights for policy 1, policy_version 88860 (0.0008) [2023-10-14 04:34:52,865][33201] Updated weights for policy 0, policy_version 88070 (0.0008) [2023-10-14 04:34:53,237][33201] Updated weights for policy 0, policy_version 88080 (0.0009) [2023-10-14 04:34:53,607][33201] Updated weights for policy 0, policy_version 88090 (0.0007) [2023-10-14 04:34:54,338][33226] Updated weights for policy 1, policy_version 88870 (0.0009) [2023-10-14 04:34:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181207040. Throughput: 0: 1762.5, 1: 1802.6. Samples: 45313300. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:54,558][31953] Avg episode reward: [(0, '20.720'), (1, '20.890')] [2023-10-14 04:34:54,723][33226] Updated weights for policy 1, policy_version 88880 (0.0008) [2023-10-14 04:34:55,090][33226] Updated weights for policy 1, policy_version 88890 (0.0008) [2023-10-14 04:34:57,371][33201] Updated weights for policy 0, policy_version 88100 (0.0008) [2023-10-14 04:34:57,736][33201] Updated weights for policy 0, policy_version 88110 (0.0007) [2023-10-14 04:34:58,103][33201] Updated weights for policy 0, policy_version 88120 (0.0008) [2023-10-14 04:34:58,884][33226] Updated weights for policy 1, policy_version 88900 (0.0008) [2023-10-14 04:34:59,252][33226] Updated weights for policy 1, policy_version 88910 (0.0008) [2023-10-14 04:34:59,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181272576. Throughput: 0: 1797.7, 1: 1778.8. Samples: 45324398. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:34:59,558][31953] Avg episode reward: [(0, '20.730'), (1, '20.890')] [2023-10-14 04:34:59,614][33226] Updated weights for policy 1, policy_version 88920 (0.0009) [2023-10-14 04:35:02,017][33201] Updated weights for policy 0, policy_version 88130 (0.0010) [2023-10-14 04:35:02,413][33201] Updated weights for policy 0, policy_version 88140 (0.0008) [2023-10-14 04:35:02,789][33201] Updated weights for policy 0, policy_version 88150 (0.0008) [2023-10-14 04:35:03,156][33201] Updated weights for policy 0, policy_version 88160 (0.0009) [2023-10-14 04:35:03,498][33226] Updated weights for policy 1, policy_version 88930 (0.0008) [2023-10-14 04:35:03,863][33226] Updated weights for policy 1, policy_version 88940 (0.0011) [2023-10-14 04:35:04,224][33226] Updated weights for policy 1, policy_version 88950 (0.0009) [2023-10-14 04:35:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 181338112. Throughput: 0: 1768.4, 1: 1785.6. Samples: 45344964. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:35:04,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.890')] [2023-10-14 04:35:04,596][33226] Updated weights for policy 1, policy_version 88960 (0.0007) [2023-10-14 04:35:06,942][33201] Updated weights for policy 0, policy_version 88170 (0.0010) [2023-10-14 04:35:07,313][33201] Updated weights for policy 0, policy_version 88180 (0.0010) [2023-10-14 04:35:07,694][33201] Updated weights for policy 0, policy_version 88190 (0.0009) [2023-10-14 04:35:08,414][33226] Updated weights for policy 1, policy_version 88970 (0.0011) [2023-10-14 04:35:08,781][33226] Updated weights for policy 1, policy_version 88980 (0.0010) [2023-10-14 04:35:09,138][33226] Updated weights for policy 1, policy_version 88990 (0.0011) [2023-10-14 04:35:09,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181436416. Throughput: 0: 1764.9, 1: 1781.3. 
Samples: 45365862. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:35:09,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.890')] [2023-10-14 04:35:11,394][33201] Updated weights for policy 0, policy_version 88200 (0.0009) [2023-10-14 04:35:11,770][33201] Updated weights for policy 0, policy_version 88210 (0.0009) [2023-10-14 04:35:12,142][33201] Updated weights for policy 0, policy_version 88220 (0.0008) [2023-10-14 04:35:12,893][33226] Updated weights for policy 1, policy_version 89000 (0.0008) [2023-10-14 04:35:13,258][33226] Updated weights for policy 1, policy_version 89010 (0.0009) [2023-10-14 04:35:13,620][33226] Updated weights for policy 1, policy_version 89020 (0.0008) [2023-10-14 04:35:14,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181501952. Throughput: 0: 1774.4, 1: 1784.4. Samples: 45377112. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:35:14,558][31953] Avg episode reward: [(0, '20.770'), (1, '20.870')] [2023-10-14 04:35:15,880][33201] Updated weights for policy 0, policy_version 88230 (0.0009) [2023-10-14 04:35:16,248][33201] Updated weights for policy 0, policy_version 88240 (0.0008) [2023-10-14 04:35:16,624][33201] Updated weights for policy 0, policy_version 88250 (0.0007) [2023-10-14 04:35:17,387][33226] Updated weights for policy 1, policy_version 89030 (0.0007) [2023-10-14 04:35:17,757][33226] Updated weights for policy 1, policy_version 89040 (0.0008) [2023-10-14 04:35:18,131][33226] Updated weights for policy 1, policy_version 89050 (0.0007) [2023-10-14 04:35:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181567488. Throughput: 0: 1770.9, 1: 1784.9. Samples: 45398148. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:35:19,559][31953] Avg episode reward: [(0, '20.770'), (1, '20.890')] [2023-10-14 04:35:20,346][33201] Updated weights for policy 0, policy_version 88260 (0.0010) [2023-10-14 04:35:20,720][33201] Updated weights for policy 0, policy_version 88270 (0.0008) [2023-10-14 04:35:21,082][33201] Updated weights for policy 0, policy_version 88280 (0.0008) [2023-10-14 04:35:22,079][33226] Updated weights for policy 1, policy_version 89060 (0.0009) [2023-10-14 04:35:22,449][33226] Updated weights for policy 1, policy_version 89070 (0.0010) [2023-10-14 04:35:22,807][33226] Updated weights for policy 1, policy_version 89080 (0.0007) [2023-10-14 04:35:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181633024. Throughput: 0: 1777.8, 1: 1769.5. Samples: 45419794. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:35:24,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 04:35:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000089088_91226112.pth... [2023-10-14 04:35:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000088288_90406912.pth... 
[2023-10-14 04:35:24,598][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000086656_88735744.pth [2023-10-14 04:35:24,606][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000087424_89522176.pth [2023-10-14 04:35:24,992][33201] Updated weights for policy 0, policy_version 88290 (0.0008) [2023-10-14 04:35:25,360][33201] Updated weights for policy 0, policy_version 88300 (0.0008) [2023-10-14 04:35:25,728][33201] Updated weights for policy 0, policy_version 88310 (0.0009) [2023-10-14 04:35:26,105][33201] Updated weights for policy 0, policy_version 88320 (0.0010) [2023-10-14 04:35:26,576][33226] Updated weights for policy 1, policy_version 89090 (0.0007) [2023-10-14 04:35:26,945][33226] Updated weights for policy 1, policy_version 89100 (0.0007) [2023-10-14 04:35:27,305][33226] Updated weights for policy 1, policy_version 89110 (0.0008) [2023-10-14 04:35:27,676][33226] Updated weights for policy 1, policy_version 89120 (0.0009) [2023-10-14 04:35:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 181698560. Throughput: 0: 1770.0, 1: 1786.5. Samples: 45430272. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:29,558][31953] Avg episode reward: [(0, '20.810'), (1, '20.890')] [2023-10-14 04:35:29,986][33201] Updated weights for policy 0, policy_version 88330 (0.0011) [2023-10-14 04:35:30,364][33201] Updated weights for policy 0, policy_version 88340 (0.0010) [2023-10-14 04:35:30,724][33201] Updated weights for policy 0, policy_version 88350 (0.0008) [2023-10-14 04:35:31,498][33226] Updated weights for policy 1, policy_version 89130 (0.0007) [2023-10-14 04:35:31,868][33226] Updated weights for policy 1, policy_version 89140 (0.0008) [2023-10-14 04:35:32,227][33226] Updated weights for policy 1, policy_version 89150 (0.0010) [2023-10-14 04:35:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 181764096. 
Throughput: 0: 1767.2, 1: 1769.7. Samples: 45451366. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.880')] [2023-10-14 04:35:34,653][33201] Updated weights for policy 0, policy_version 88360 (0.0009) [2023-10-14 04:35:35,021][33201] Updated weights for policy 0, policy_version 88370 (0.0010) [2023-10-14 04:35:35,392][33201] Updated weights for policy 0, policy_version 88380 (0.0011) [2023-10-14 04:35:36,014][33226] Updated weights for policy 1, policy_version 89160 (0.0008) [2023-10-14 04:35:36,378][33226] Updated weights for policy 1, policy_version 89170 (0.0010) [2023-10-14 04:35:36,740][33226] Updated weights for policy 1, policy_version 89180 (0.0010) [2023-10-14 04:35:39,317][33201] Updated weights for policy 0, policy_version 88390 (0.0008) [2023-10-14 04:35:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 181829632. Throughput: 0: 1792.1, 1: 1765.6. Samples: 45473398. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:39,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.880')] [2023-10-14 04:35:39,687][33201] Updated weights for policy 0, policy_version 88400 (0.0009) [2023-10-14 04:35:40,058][33201] Updated weights for policy 0, policy_version 88410 (0.0010) [2023-10-14 04:35:40,575][33226] Updated weights for policy 1, policy_version 89190 (0.0010) [2023-10-14 04:35:40,958][33226] Updated weights for policy 1, policy_version 89200 (0.0008) [2023-10-14 04:35:41,331][33226] Updated weights for policy 1, policy_version 89210 (0.0010) [2023-10-14 04:35:43,650][33201] Updated weights for policy 0, policy_version 88420 (0.0010) [2023-10-14 04:35:44,022][33201] Updated weights for policy 0, policy_version 88430 (0.0008) [2023-10-14 04:35:44,380][33201] Updated weights for policy 0, policy_version 88440 (0.0007) [2023-10-14 04:35:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). 
Total num frames: 181895168. Throughput: 0: 1754.6, 1: 1768.8. Samples: 45482948. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:44,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.880')] [2023-10-14 04:35:45,046][33226] Updated weights for policy 1, policy_version 89220 (0.0009) [2023-10-14 04:35:45,404][33226] Updated weights for policy 1, policy_version 89230 (0.0007) [2023-10-14 04:35:45,768][33226] Updated weights for policy 1, policy_version 89240 (0.0007) [2023-10-14 04:35:48,165][33201] Updated weights for policy 0, policy_version 88450 (0.0008) [2023-10-14 04:35:48,564][33201] Updated weights for policy 0, policy_version 88460 (0.0008) [2023-10-14 04:35:48,931][33201] Updated weights for policy 0, policy_version 88470 (0.0007) [2023-10-14 04:35:49,308][33201] Updated weights for policy 0, policy_version 88480 (0.0008) [2023-10-14 04:35:49,511][33226] Updated weights for policy 1, policy_version 89250 (0.0010) [2023-10-14 04:35:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 181993472. Throughput: 0: 1786.5, 1: 1775.7. Samples: 45505260. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 04:35:49,870][33226] Updated weights for policy 1, policy_version 89260 (0.0007) [2023-10-14 04:35:50,237][33226] Updated weights for policy 1, policy_version 89270 (0.0009) [2023-10-14 04:35:50,613][33226] Updated weights for policy 1, policy_version 89280 (0.0007) [2023-10-14 04:35:53,155][33201] Updated weights for policy 0, policy_version 88490 (0.0010) [2023-10-14 04:35:53,532][33201] Updated weights for policy 0, policy_version 88500 (0.0007) [2023-10-14 04:35:53,892][33201] Updated weights for policy 0, policy_version 88510 (0.0007) [2023-10-14 04:35:54,439][33226] Updated weights for policy 1, policy_version 89290 (0.0010) [2023-10-14 04:35:54,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 182059008. Throughput: 0: 1753.4, 1: 1799.1. Samples: 45525726. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:54,559][31953] Avg episode reward: [(0, '20.980'), (1, '20.880')] [2023-10-14 04:35:54,804][33226] Updated weights for policy 1, policy_version 89300 (0.0008) [2023-10-14 04:35:55,162][33226] Updated weights for policy 1, policy_version 89310 (0.0007) [2023-10-14 04:35:57,768][33201] Updated weights for policy 0, policy_version 88520 (0.0008) [2023-10-14 04:35:58,129][33201] Updated weights for policy 0, policy_version 88530 (0.0009) [2023-10-14 04:35:58,501][33201] Updated weights for policy 0, policy_version 88540 (0.0007) [2023-10-14 04:35:58,905][33226] Updated weights for policy 1, policy_version 89320 (0.0008) [2023-10-14 04:35:59,277][33226] Updated weights for policy 1, policy_version 89330 (0.0007) [2023-10-14 04:35:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 182124544. Throughput: 0: 1777.6, 1: 1770.0. Samples: 45536756. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:35:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 04:35:59,638][33226] Updated weights for policy 1, policy_version 89340 (0.0007) [2023-10-14 04:36:02,254][33201] Updated weights for policy 0, policy_version 88550 (0.0007) [2023-10-14 04:36:02,631][33201] Updated weights for policy 0, policy_version 88560 (0.0008) [2023-10-14 04:36:03,004][33201] Updated weights for policy 0, policy_version 88570 (0.0007) [2023-10-14 04:36:03,468][33226] Updated weights for policy 1, policy_version 89350 (0.0007) [2023-10-14 04:36:03,831][33226] Updated weights for policy 1, policy_version 89360 (0.0009) [2023-10-14 04:36:04,200][33226] Updated weights for policy 1, policy_version 89370 (0.0007) [2023-10-14 04:36:04,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 182222848. Throughput: 0: 1754.4, 1: 1791.9. Samples: 45557734. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:36:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 04:36:06,933][33201] Updated weights for policy 0, policy_version 88580 (0.0007) [2023-10-14 04:36:07,297][33201] Updated weights for policy 0, policy_version 88590 (0.0008) [2023-10-14 04:36:07,674][33201] Updated weights for policy 0, policy_version 88600 (0.0009) [2023-10-14 04:36:08,025][33226] Updated weights for policy 1, policy_version 89380 (0.0007) [2023-10-14 04:36:08,394][33226] Updated weights for policy 1, policy_version 89390 (0.0008) [2023-10-14 04:36:08,758][33226] Updated weights for policy 1, policy_version 89400 (0.0007) [2023-10-14 04:36:09,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 182288384. Throughput: 0: 1751.8, 1: 1771.9. Samples: 45578362. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:36:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 04:36:11,422][33201] Updated weights for policy 0, policy_version 88610 (0.0008) [2023-10-14 04:36:11,780][33201] Updated weights for policy 0, policy_version 88620 (0.0009) [2023-10-14 04:36:12,158][33201] Updated weights for policy 0, policy_version 88630 (0.0007) [2023-10-14 04:36:12,517][33201] Updated weights for policy 0, policy_version 88640 (0.0007) [2023-10-14 04:36:12,574][33226] Updated weights for policy 1, policy_version 89410 (0.0007) [2023-10-14 04:36:12,949][33226] Updated weights for policy 1, policy_version 89420 (0.0010) [2023-10-14 04:36:13,318][33226] Updated weights for policy 1, policy_version 89430 (0.0011) [2023-10-14 04:36:13,692][33226] Updated weights for policy 1, policy_version 89440 (0.0009) [2023-10-14 04:36:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 182353920. Throughput: 0: 1763.6, 1: 1777.6. Samples: 45589630. Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:36:14,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.900')] [2023-10-14 04:36:16,407][33201] Updated weights for policy 0, policy_version 88650 (0.0011) [2023-10-14 04:36:16,769][33201] Updated weights for policy 0, policy_version 88660 (0.0009) [2023-10-14 04:36:17,143][33201] Updated weights for policy 0, policy_version 88670 (0.0009) [2023-10-14 04:36:17,562][33226] Updated weights for policy 1, policy_version 89450 (0.0008) [2023-10-14 04:36:17,922][33226] Updated weights for policy 1, policy_version 89460 (0.0008) [2023-10-14 04:36:18,296][33226] Updated weights for policy 1, policy_version 89470 (0.0009) [2023-10-14 04:36:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 182419456. Throughput: 0: 1752.7, 1: 1774.1. Samples: 45610072. 
Policy #0 lag: (min: 31.0, avg: 32.3, max: 56.0) [2023-10-14 04:36:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:36:20,915][33201] Updated weights for policy 0, policy_version 88680 (0.0009) [2023-10-14 04:36:21,290][33201] Updated weights for policy 0, policy_version 88690 (0.0010) [2023-10-14 04:36:21,660][33201] Updated weights for policy 0, policy_version 88700 (0.0009) [2023-10-14 04:36:21,969][33226] Updated weights for policy 1, policy_version 89480 (0.0010) [2023-10-14 04:36:22,335][33226] Updated weights for policy 1, policy_version 89490 (0.0009) [2023-10-14 04:36:22,713][33226] Updated weights for policy 1, policy_version 89500 (0.0007) [2023-10-14 04:36:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 182484992. Throughput: 0: 1758.8, 1: 1767.6. Samples: 45632086. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:24,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:36:25,609][33201] Updated weights for policy 0, policy_version 88710 (0.0007) [2023-10-14 04:36:25,981][33201] Updated weights for policy 0, policy_version 88720 (0.0007) [2023-10-14 04:36:26,353][33201] Updated weights for policy 0, policy_version 88730 (0.0007) [2023-10-14 04:36:26,686][33226] Updated weights for policy 1, policy_version 89510 (0.0008) [2023-10-14 04:36:27,084][33226] Updated weights for policy 1, policy_version 89520 (0.0009) [2023-10-14 04:36:27,447][33226] Updated weights for policy 1, policy_version 89530 (0.0008) [2023-10-14 04:36:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 182550528. Throughput: 0: 1755.6, 1: 1785.0. Samples: 45642278. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:36:30,179][33201] Updated weights for policy 0, policy_version 88740 (0.0009) [2023-10-14 04:36:30,544][33201] Updated weights for policy 0, policy_version 88750 (0.0010) [2023-10-14 04:36:30,912][33201] Updated weights for policy 0, policy_version 88760 (0.0010) [2023-10-14 04:36:31,213][33226] Updated weights for policy 1, policy_version 89540 (0.0008) [2023-10-14 04:36:31,569][33226] Updated weights for policy 1, policy_version 89550 (0.0008) [2023-10-14 04:36:31,938][33226] Updated weights for policy 1, policy_version 89560 (0.0011) [2023-10-14 04:36:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 182616064. Throughput: 0: 1758.1, 1: 1765.4. Samples: 45663816. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:36:34,644][33201] Updated weights for policy 0, policy_version 88770 (0.0009) [2023-10-14 04:36:35,004][33201] Updated weights for policy 0, policy_version 88780 (0.0009) [2023-10-14 04:36:35,376][33201] Updated weights for policy 0, policy_version 88790 (0.0008) [2023-10-14 04:36:35,728][33226] Updated weights for policy 1, policy_version 89570 (0.0008) [2023-10-14 04:36:35,736][33201] Updated weights for policy 0, policy_version 88800 (0.0007) [2023-10-14 04:36:36,101][33226] Updated weights for policy 1, policy_version 89580 (0.0009) [2023-10-14 04:36:36,472][33226] Updated weights for policy 1, policy_version 89590 (0.0009) [2023-10-14 04:36:36,842][33226] Updated weights for policy 1, policy_version 89600 (0.0009) [2023-10-14 04:36:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 182681600. Throughput: 0: 1788.5, 1: 1771.0. Samples: 45685900. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.890')] [2023-10-14 04:36:39,805][33201] Updated weights for policy 0, policy_version 88810 (0.0008) [2023-10-14 04:36:40,185][33201] Updated weights for policy 0, policy_version 88820 (0.0007) [2023-10-14 04:36:40,556][33201] Updated weights for policy 0, policy_version 88830 (0.0009) [2023-10-14 04:36:40,591][33226] Updated weights for policy 1, policy_version 89610 (0.0008) [2023-10-14 04:36:40,948][33226] Updated weights for policy 1, policy_version 89620 (0.0008) [2023-10-14 04:36:41,317][33226] Updated weights for policy 1, policy_version 89630 (0.0008) [2023-10-14 04:36:44,357][33201] Updated weights for policy 0, policy_version 88840 (0.0008) [2023-10-14 04:36:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 182747136. Throughput: 0: 1753.6, 1: 1773.6. Samples: 45695478. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.890')] [2023-10-14 04:36:44,730][33201] Updated weights for policy 0, policy_version 88850 (0.0009) [2023-10-14 04:36:45,068][33226] Updated weights for policy 1, policy_version 89640 (0.0007) [2023-10-14 04:36:45,098][33201] Updated weights for policy 0, policy_version 88860 (0.0008) [2023-10-14 04:36:45,431][33226] Updated weights for policy 1, policy_version 89650 (0.0007) [2023-10-14 04:36:45,795][33226] Updated weights for policy 1, policy_version 89660 (0.0007) [2023-10-14 04:36:48,915][33201] Updated weights for policy 0, policy_version 88870 (0.0007) [2023-10-14 04:36:49,283][33201] Updated weights for policy 0, policy_version 88880 (0.0009) [2023-10-14 04:36:49,470][33226] Updated weights for policy 1, policy_version 89670 (0.0008) [2023-10-14 04:36:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 182812672. Throughput: 0: 1777.7, 1: 1772.8. 
Samples: 45717508. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:36:49,646][33201] Updated weights for policy 0, policy_version 88890 (0.0007) [2023-10-14 04:36:49,837][33226] Updated weights for policy 1, policy_version 89680 (0.0009) [2023-10-14 04:36:50,207][33226] Updated weights for policy 1, policy_version 89690 (0.0007) [2023-10-14 04:36:53,409][33201] Updated weights for policy 0, policy_version 88900 (0.0008) [2023-10-14 04:36:53,772][33201] Updated weights for policy 0, policy_version 88910 (0.0009) [2023-10-14 04:36:54,083][33226] Updated weights for policy 1, policy_version 89700 (0.0007) [2023-10-14 04:36:54,133][33201] Updated weights for policy 0, policy_version 88920 (0.0008) [2023-10-14 04:36:54,454][33226] Updated weights for policy 1, policy_version 89710 (0.0007) [2023-10-14 04:36:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 182910976. Throughput: 0: 1765.0, 1: 1801.8. Samples: 45738868. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:36:54,810][33226] Updated weights for policy 1, policy_version 89720 (0.0008) [2023-10-14 04:36:58,032][33201] Updated weights for policy 0, policy_version 88930 (0.0008) [2023-10-14 04:36:58,402][33201] Updated weights for policy 0, policy_version 88940 (0.0012) [2023-10-14 04:36:58,556][33226] Updated weights for policy 1, policy_version 89730 (0.0007) [2023-10-14 04:36:58,769][33201] Updated weights for policy 0, policy_version 88950 (0.0008) [2023-10-14 04:36:58,908][33226] Updated weights for policy 1, policy_version 89740 (0.0008) [2023-10-14 04:36:59,138][33201] Updated weights for policy 0, policy_version 88960 (0.0007) [2023-10-14 04:36:59,272][33226] Updated weights for policy 1, policy_version 89750 (0.0010) [2023-10-14 04:36:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 182976512. Throughput: 0: 1772.6, 1: 1779.2. Samples: 45749462. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:36:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:36:59,640][33226] Updated weights for policy 1, policy_version 89760 (0.0011) [2023-10-14 04:37:03,080][33201] Updated weights for policy 0, policy_version 88970 (0.0009) [2023-10-14 04:37:03,442][33226] Updated weights for policy 1, policy_version 89770 (0.0007) [2023-10-14 04:37:03,465][33201] Updated weights for policy 0, policy_version 88980 (0.0009) [2023-10-14 04:37:03,803][33226] Updated weights for policy 1, policy_version 89780 (0.0007) [2023-10-14 04:37:03,842][33201] Updated weights for policy 0, policy_version 88990 (0.0009) [2023-10-14 04:37:04,169][33226] Updated weights for policy 1, policy_version 89790 (0.0007) [2023-10-14 04:37:04,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14329.0). Total num frames: 183074816. Throughput: 0: 1772.4, 1: 1801.4. 
Samples: 45770892. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:37:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:37:07,797][33201] Updated weights for policy 0, policy_version 89000 (0.0010) [2023-10-14 04:37:08,080][33226] Updated weights for policy 1, policy_version 89800 (0.0009) [2023-10-14 04:37:08,171][33201] Updated weights for policy 0, policy_version 89010 (0.0009) [2023-10-14 04:37:08,448][33226] Updated weights for policy 1, policy_version 89810 (0.0008) [2023-10-14 04:37:08,532][33201] Updated weights for policy 0, policy_version 89020 (0.0008) [2023-10-14 04:37:08,815][33226] Updated weights for policy 1, policy_version 89820 (0.0008) [2023-10-14 04:37:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 183140352. Throughput: 0: 1741.3, 1: 1776.5. Samples: 45790386. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:37:09,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:37:12,359][33201] Updated weights for policy 0, policy_version 89030 (0.0008) [2023-10-14 04:37:12,671][33226] Updated weights for policy 1, policy_version 89830 (0.0008) [2023-10-14 04:37:12,726][33201] Updated weights for policy 0, policy_version 89040 (0.0008) [2023-10-14 04:37:13,039][33226] Updated weights for policy 1, policy_version 89840 (0.0007) [2023-10-14 04:37:13,094][33201] Updated weights for policy 0, policy_version 89050 (0.0008) [2023-10-14 04:37:13,412][33226] Updated weights for policy 1, policy_version 89850 (0.0008) [2023-10-14 04:37:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 183205888. Throughput: 0: 1770.7, 1: 1796.1. Samples: 45802784. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:37:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:37:16,899][33201] Updated weights for policy 0, policy_version 89060 (0.0009) [2023-10-14 04:37:17,136][33226] Updated weights for policy 1, policy_version 89860 (0.0010) [2023-10-14 04:37:17,264][33201] Updated weights for policy 0, policy_version 89070 (0.0008) [2023-10-14 04:37:17,494][33226] Updated weights for policy 1, policy_version 89870 (0.0009) [2023-10-14 04:37:17,636][33201] Updated weights for policy 0, policy_version 89080 (0.0007) [2023-10-14 04:37:17,858][33226] Updated weights for policy 1, policy_version 89880 (0.0008) [2023-10-14 04:37:19,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 183271424. Throughput: 0: 1732.4, 1: 1786.3. Samples: 45822154. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.940')] [2023-10-14 04:37:21,675][33201] Updated weights for policy 0, policy_version 89090 (0.0008) [2023-10-14 04:37:21,725][33226] Updated weights for policy 1, policy_version 89890 (0.0008) [2023-10-14 04:37:22,078][33201] Updated weights for policy 0, policy_version 89100 (0.0007) [2023-10-14 04:37:22,093][33226] Updated weights for policy 1, policy_version 89900 (0.0007) [2023-10-14 04:37:22,453][33201] Updated weights for policy 0, policy_version 89110 (0.0008) [2023-10-14 04:37:22,456][33226] Updated weights for policy 1, policy_version 89910 (0.0008) [2023-10-14 04:37:22,817][33201] Updated weights for policy 0, policy_version 89120 (0.0007) [2023-10-14 04:37:22,826][33226] Updated weights for policy 1, policy_version 89920 (0.0007) [2023-10-14 04:37:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 183336960. Throughput: 0: 1729.3, 1: 1775.5. Samples: 45843620. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.920')] [2023-10-14 04:37:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000089920_92078080.pth... [2023-10-14 04:37:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000089120_91258880.pth... [2023-10-14 04:37:24,598][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000088256_90374144.pth [2023-10-14 04:37:24,601][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000087488_89587712.pth [2023-10-14 04:37:26,532][33201] Updated weights for policy 0, policy_version 89130 (0.0009) [2023-10-14 04:37:26,567][33226] Updated weights for policy 1, policy_version 89930 (0.0009) [2023-10-14 04:37:26,899][33201] Updated weights for policy 0, policy_version 89140 (0.0008) [2023-10-14 04:37:26,921][33226] Updated weights for policy 1, policy_version 89940 (0.0008) [2023-10-14 04:37:27,268][33201] Updated weights for policy 0, policy_version 89150 (0.0007) [2023-10-14 04:37:27,291][33226] Updated weights for policy 1, policy_version 89950 (0.0009) [2023-10-14 04:37:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 183402496. Throughput: 0: 1743.2, 1: 1788.4. Samples: 45854400. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:29,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 04:37:31,047][33226] Updated weights for policy 1, policy_version 89960 (0.0009) [2023-10-14 04:37:31,078][33201] Updated weights for policy 0, policy_version 89160 (0.0007) [2023-10-14 04:37:31,410][33226] Updated weights for policy 1, policy_version 89970 (0.0009) [2023-10-14 04:37:31,444][33201] Updated weights for policy 0, policy_version 89170 (0.0007) [2023-10-14 04:37:31,780][33226] Updated weights for policy 1, policy_version 89980 (0.0007) [2023-10-14 04:37:31,822][33201] Updated weights for policy 0, policy_version 89180 (0.0007) [2023-10-14 04:37:34,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 183468032. Throughput: 0: 1740.4, 1: 1776.0. Samples: 45875746. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 04:37:35,609][33226] Updated weights for policy 1, policy_version 89990 (0.0009) [2023-10-14 04:37:35,765][33201] Updated weights for policy 0, policy_version 89190 (0.0008) [2023-10-14 04:37:35,975][33226] Updated weights for policy 1, policy_version 90000 (0.0007) [2023-10-14 04:37:36,135][33201] Updated weights for policy 0, policy_version 89200 (0.0008) [2023-10-14 04:37:36,357][33226] Updated weights for policy 1, policy_version 90010 (0.0007) [2023-10-14 04:37:36,499][33201] Updated weights for policy 0, policy_version 89210 (0.0008) [2023-10-14 04:37:39,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 183533568. Throughput: 0: 1758.9, 1: 1773.6. Samples: 45897834. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 04:37:40,153][33226] Updated weights for policy 1, policy_version 90020 (0.0008) [2023-10-14 04:37:40,260][33201] Updated weights for policy 0, policy_version 89220 (0.0008) [2023-10-14 04:37:40,523][33226] Updated weights for policy 1, policy_version 90030 (0.0007) [2023-10-14 04:37:40,635][33201] Updated weights for policy 0, policy_version 89230 (0.0008) [2023-10-14 04:37:40,887][33226] Updated weights for policy 1, policy_version 90040 (0.0009) [2023-10-14 04:37:40,992][33201] Updated weights for policy 0, policy_version 89240 (0.0008) [2023-10-14 04:37:44,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 183599104. Throughput: 0: 1737.8, 1: 1775.1. Samples: 45907540. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.870')] [2023-10-14 04:37:44,794][33226] Updated weights for policy 1, policy_version 90050 (0.0008) [2023-10-14 04:37:44,932][33201] Updated weights for policy 0, policy_version 89250 (0.0009) [2023-10-14 04:37:45,149][33226] Updated weights for policy 1, policy_version 90060 (0.0008) [2023-10-14 04:37:45,300][33201] Updated weights for policy 0, policy_version 89260 (0.0009) [2023-10-14 04:37:45,516][33226] Updated weights for policy 1, policy_version 90070 (0.0007) [2023-10-14 04:37:45,673][33201] Updated weights for policy 0, policy_version 89270 (0.0008) [2023-10-14 04:37:45,883][33226] Updated weights for policy 1, policy_version 90080 (0.0007) [2023-10-14 04:37:46,033][33201] Updated weights for policy 0, policy_version 89280 (0.0007) [2023-10-14 04:37:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 183664640. Throughput: 0: 1748.9, 1: 1771.0. Samples: 45929288. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:37:49,772][33226] Updated weights for policy 1, policy_version 90090 (0.0007) [2023-10-14 04:37:49,938][33201] Updated weights for policy 0, policy_version 89290 (0.0007) [2023-10-14 04:37:50,136][33226] Updated weights for policy 1, policy_version 90100 (0.0007) [2023-10-14 04:37:50,309][33201] Updated weights for policy 0, policy_version 89300 (0.0008) [2023-10-14 04:37:50,494][33226] Updated weights for policy 1, policy_version 90110 (0.0007) [2023-10-14 04:37:50,686][33201] Updated weights for policy 0, policy_version 89310 (0.0007) [2023-10-14 04:37:54,308][33226] Updated weights for policy 1, policy_version 90120 (0.0007) [2023-10-14 04:37:54,458][33201] Updated weights for policy 0, policy_version 89320 (0.0007) [2023-10-14 04:37:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 183730176. Throughput: 0: 1779.5, 1: 1796.6. Samples: 45951308. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:54,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:37:54,664][33226] Updated weights for policy 1, policy_version 90130 (0.0008) [2023-10-14 04:37:54,830][33201] Updated weights for policy 0, policy_version 89330 (0.0007) [2023-10-14 04:37:55,022][33226] Updated weights for policy 1, policy_version 90140 (0.0007) [2023-10-14 04:37:55,193][33201] Updated weights for policy 0, policy_version 89340 (0.0007) [2023-10-14 04:37:58,916][33226] Updated weights for policy 1, policy_version 90150 (0.0008) [2023-10-14 04:37:58,934][33201] Updated weights for policy 0, policy_version 89350 (0.0007) [2023-10-14 04:37:59,293][33201] Updated weights for policy 0, policy_version 89360 (0.0008) [2023-10-14 04:37:59,301][33226] Updated weights for policy 1, policy_version 90160 (0.0008) [2023-10-14 04:37:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 183795712. Throughput: 0: 1754.8, 1: 1765.0. Samples: 45961176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:37:59,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:37:59,662][33226] Updated weights for policy 1, policy_version 90170 (0.0008) [2023-10-14 04:37:59,663][33201] Updated weights for policy 0, policy_version 89370 (0.0007) [2023-10-14 04:38:03,504][33226] Updated weights for policy 1, policy_version 90180 (0.0009) [2023-10-14 04:38:03,581][33201] Updated weights for policy 0, policy_version 89380 (0.0007) [2023-10-14 04:38:03,862][33226] Updated weights for policy 1, policy_version 90190 (0.0008) [2023-10-14 04:38:03,954][33201] Updated weights for policy 0, policy_version 89390 (0.0008) [2023-10-14 04:38:04,226][33226] Updated weights for policy 1, policy_version 90200 (0.0009) [2023-10-14 04:38:04,318][33201] Updated weights for policy 0, policy_version 89400 (0.0008) [2023-10-14 04:38:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 183894016. Throughput: 0: 1783.6, 1: 1786.4. Samples: 45982806. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:38:04,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:38:07,938][33226] Updated weights for policy 1, policy_version 90210 (0.0008) [2023-10-14 04:38:08,154][33201] Updated weights for policy 0, policy_version 89410 (0.0008) [2023-10-14 04:38:08,306][33226] Updated weights for policy 1, policy_version 90220 (0.0009) [2023-10-14 04:38:08,536][33201] Updated weights for policy 0, policy_version 89420 (0.0009) [2023-10-14 04:38:08,670][33226] Updated weights for policy 1, policy_version 90230 (0.0008) [2023-10-14 04:38:08,899][33201] Updated weights for policy 0, policy_version 89430 (0.0008) [2023-10-14 04:38:09,044][33226] Updated weights for policy 1, policy_version 90240 (0.0009) [2023-10-14 04:38:09,268][33201] Updated weights for policy 0, policy_version 89440 (0.0009) [2023-10-14 04:38:09,557][31953] Fps is (10 sec: 19660.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 183992320. Throughput: 0: 1763.5, 1: 1767.2. Samples: 46002500. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:38:09,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:38:13,060][33201] Updated weights for policy 0, policy_version 89450 (0.0007) [2023-10-14 04:38:13,087][33226] Updated weights for policy 1, policy_version 90250 (0.0007) [2023-10-14 04:38:13,428][33201] Updated weights for policy 0, policy_version 89460 (0.0007) [2023-10-14 04:38:13,462][33226] Updated weights for policy 1, policy_version 90260 (0.0008) [2023-10-14 04:38:13,803][33201] Updated weights for policy 0, policy_version 89470 (0.0007) [2023-10-14 04:38:13,825][33226] Updated weights for policy 1, policy_version 90270 (0.0007) [2023-10-14 04:38:14,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 184057856. Throughput: 0: 1774.9, 1: 1777.0. Samples: 46014238. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:38:17,664][33226] Updated weights for policy 1, policy_version 90280 (0.0009) [2023-10-14 04:38:17,676][33201] Updated weights for policy 0, policy_version 89480 (0.0008) [2023-10-14 04:38:18,032][33226] Updated weights for policy 1, policy_version 90290 (0.0008) [2023-10-14 04:38:18,042][33201] Updated weights for policy 0, policy_version 89490 (0.0008) [2023-10-14 04:38:18,380][33226] Updated weights for policy 1, policy_version 90300 (0.0008) [2023-10-14 04:38:18,420][33201] Updated weights for policy 0, policy_version 89500 (0.0010) [2023-10-14 04:38:19,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 184123392. Throughput: 0: 1759.0, 1: 1765.7. Samples: 46034358. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:38:22,072][33226] Updated weights for policy 1, policy_version 90310 (0.0007) [2023-10-14 04:38:22,369][33201] Updated weights for policy 0, policy_version 89510 (0.0009) [2023-10-14 04:38:22,439][33226] Updated weights for policy 1, policy_version 90320 (0.0008) [2023-10-14 04:38:22,741][33201] Updated weights for policy 0, policy_version 89520 (0.0008) [2023-10-14 04:38:22,809][33226] Updated weights for policy 1, policy_version 90330 (0.0007) [2023-10-14 04:38:23,109][33201] Updated weights for policy 0, policy_version 89530 (0.0008) [2023-10-14 04:38:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 184188928. Throughput: 0: 1741.1, 1: 1750.3. Samples: 46054946. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.870')] [2023-10-14 04:38:26,595][33226] Updated weights for policy 1, policy_version 90340 (0.0008) [2023-10-14 04:38:26,915][33201] Updated weights for policy 0, policy_version 89540 (0.0009) [2023-10-14 04:38:26,949][33226] Updated weights for policy 1, policy_version 90350 (0.0008) [2023-10-14 04:38:27,281][33201] Updated weights for policy 0, policy_version 89550 (0.0007) [2023-10-14 04:38:27,308][33226] Updated weights for policy 1, policy_version 90360 (0.0007) [2023-10-14 04:38:27,653][33201] Updated weights for policy 0, policy_version 89560 (0.0008) [2023-10-14 04:38:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 184254464. Throughput: 0: 1763.6, 1: 1764.1. Samples: 46066290. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:29,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:38:31,179][33226] Updated weights for policy 1, policy_version 90370 (0.0008) [2023-10-14 04:38:31,542][33201] Updated weights for policy 0, policy_version 89570 (0.0008) [2023-10-14 04:38:31,544][33226] Updated weights for policy 1, policy_version 90380 (0.0011) [2023-10-14 04:38:31,900][33201] Updated weights for policy 0, policy_version 89580 (0.0007) [2023-10-14 04:38:31,910][33226] Updated weights for policy 1, policy_version 90390 (0.0008) [2023-10-14 04:38:32,271][33201] Updated weights for policy 0, policy_version 89590 (0.0008) [2023-10-14 04:38:32,273][33226] Updated weights for policy 1, policy_version 90400 (0.0007) [2023-10-14 04:38:32,637][33201] Updated weights for policy 0, policy_version 89600 (0.0007) [2023-10-14 04:38:34,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 184320000. Throughput: 0: 1740.1, 1: 1754.3. Samples: 46086536. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:38:36,086][33226] Updated weights for policy 1, policy_version 90410 (0.0008) [2023-10-14 04:38:36,452][33226] Updated weights for policy 1, policy_version 90420 (0.0009) [2023-10-14 04:38:36,569][33201] Updated weights for policy 0, policy_version 89610 (0.0008) [2023-10-14 04:38:36,823][33226] Updated weights for policy 1, policy_version 90430 (0.0008) [2023-10-14 04:38:36,939][33201] Updated weights for policy 0, policy_version 89620 (0.0008) [2023-10-14 04:38:37,308][33201] Updated weights for policy 0, policy_version 89630 (0.0007) [2023-10-14 04:38:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 184385536. Throughput: 0: 1740.8, 1: 1759.8. Samples: 46108832. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:39,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:38:40,515][33226] Updated weights for policy 1, policy_version 90440 (0.0009) [2023-10-14 04:38:40,873][33226] Updated weights for policy 1, policy_version 90450 (0.0007) [2023-10-14 04:38:40,989][33201] Updated weights for policy 0, policy_version 89640 (0.0009) [2023-10-14 04:38:41,238][33226] Updated weights for policy 1, policy_version 90460 (0.0007) [2023-10-14 04:38:41,362][33201] Updated weights for policy 0, policy_version 89650 (0.0008) [2023-10-14 04:38:41,729][33201] Updated weights for policy 0, policy_version 89660 (0.0010) [2023-10-14 04:38:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 184451072. Throughput: 0: 1737.8, 1: 1759.0. Samples: 46118530. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:44,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:38:45,179][33226] Updated weights for policy 1, policy_version 90470 (0.0008) [2023-10-14 04:38:45,486][33201] Updated weights for policy 0, policy_version 89670 (0.0008) [2023-10-14 04:38:45,563][33226] Updated weights for policy 1, policy_version 90480 (0.0008) [2023-10-14 04:38:45,860][33201] Updated weights for policy 0, policy_version 89680 (0.0008) [2023-10-14 04:38:45,934][33226] Updated weights for policy 1, policy_version 90490 (0.0007) [2023-10-14 04:38:46,233][33201] Updated weights for policy 0, policy_version 89690 (0.0010) [2023-10-14 04:38:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 184516608. Throughput: 0: 1741.4, 1: 1756.3. Samples: 46140204. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:49,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:38:49,761][33226] Updated weights for policy 1, policy_version 90500 (0.0009) [2023-10-14 04:38:50,138][33226] Updated weights for policy 1, policy_version 90510 (0.0008) [2023-10-14 04:38:50,274][33201] Updated weights for policy 0, policy_version 89700 (0.0008) [2023-10-14 04:38:50,504][33226] Updated weights for policy 1, policy_version 90520 (0.0008) [2023-10-14 04:38:50,653][33201] Updated weights for policy 0, policy_version 89710 (0.0008) [2023-10-14 04:38:51,022][33201] Updated weights for policy 0, policy_version 89720 (0.0008) [2023-10-14 04:38:54,321][33226] Updated weights for policy 1, policy_version 90530 (0.0010) [2023-10-14 04:38:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 184582144. Throughput: 0: 1770.1, 1: 1778.6. Samples: 46162192. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:54,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:38:54,684][33226] Updated weights for policy 1, policy_version 90540 (0.0008) [2023-10-14 04:38:54,711][33201] Updated weights for policy 0, policy_version 89730 (0.0008) [2023-10-14 04:38:55,050][33226] Updated weights for policy 1, policy_version 90550 (0.0009) [2023-10-14 04:38:55,093][33201] Updated weights for policy 0, policy_version 89740 (0.0007) [2023-10-14 04:38:55,414][33226] Updated weights for policy 1, policy_version 90560 (0.0007) [2023-10-14 04:38:55,462][33201] Updated weights for policy 0, policy_version 89750 (0.0007) [2023-10-14 04:38:55,828][33201] Updated weights for policy 0, policy_version 89760 (0.0009) [2023-10-14 04:38:59,164][33226] Updated weights for policy 1, policy_version 90570 (0.0010) [2023-10-14 04:38:59,531][33226] Updated weights for policy 1, policy_version 90580 (0.0009) [2023-10-14 04:38:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 184647680. Throughput: 0: 1744.8, 1: 1756.3. Samples: 46171788. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:38:59,557][31953] Avg episode reward: [(0, '20.860'), (1, '20.920')] [2023-10-14 04:38:59,767][33201] Updated weights for policy 0, policy_version 89770 (0.0008) [2023-10-14 04:38:59,910][33226] Updated weights for policy 1, policy_version 90590 (0.0009) [2023-10-14 04:39:00,135][33201] Updated weights for policy 0, policy_version 89780 (0.0009) [2023-10-14 04:39:00,517][33201] Updated weights for policy 0, policy_version 89790 (0.0009) [2023-10-14 04:39:03,672][33226] Updated weights for policy 1, policy_version 90600 (0.0010) [2023-10-14 04:39:04,043][33226] Updated weights for policy 1, policy_version 90610 (0.0011) [2023-10-14 04:39:04,366][33201] Updated weights for policy 0, policy_version 89800 (0.0008) [2023-10-14 04:39:04,411][33226] Updated weights for policy 1, policy_version 90620 (0.0009) [2023-10-14 04:39:04,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 184745984. Throughput: 0: 1758.3, 1: 1779.9. Samples: 46193578. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:39:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.930')] [2023-10-14 04:39:04,741][33201] Updated weights for policy 0, policy_version 89810 (0.0008) [2023-10-14 04:39:05,115][33201] Updated weights for policy 0, policy_version 89820 (0.0008) [2023-10-14 04:39:08,191][33226] Updated weights for policy 1, policy_version 90630 (0.0007) [2023-10-14 04:39:08,562][33226] Updated weights for policy 1, policy_version 90640 (0.0008) [2023-10-14 04:39:08,926][33226] Updated weights for policy 1, policy_version 90650 (0.0009) [2023-10-14 04:39:08,945][33201] Updated weights for policy 0, policy_version 89830 (0.0007) [2023-10-14 04:39:09,311][33201] Updated weights for policy 0, policy_version 89840 (0.0008) [2023-10-14 04:39:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 184811520. Throughput: 0: 1763.8, 1: 1774.2. 
Samples: 46214158. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:39:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.940')] [2023-10-14 04:39:09,688][33201] Updated weights for policy 0, policy_version 89850 (0.0009) [2023-10-14 04:39:12,678][33226] Updated weights for policy 1, policy_version 90660 (0.0008) [2023-10-14 04:39:13,044][33226] Updated weights for policy 1, policy_version 90670 (0.0007) [2023-10-14 04:39:13,408][33226] Updated weights for policy 1, policy_version 90680 (0.0009) [2023-10-14 04:39:13,530][33201] Updated weights for policy 0, policy_version 89860 (0.0007) [2023-10-14 04:39:13,899][33201] Updated weights for policy 0, policy_version 89870 (0.0007) [2023-10-14 04:39:14,266][33201] Updated weights for policy 0, policy_version 89880 (0.0007) [2023-10-14 04:39:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 184877056. Throughput: 0: 1750.8, 1: 1785.8. Samples: 46225434. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '21.000')] [2023-10-14 04:39:17,214][33226] Updated weights for policy 1, policy_version 90690 (0.0007) [2023-10-14 04:39:17,582][33226] Updated weights for policy 1, policy_version 90700 (0.0008) [2023-10-14 04:39:17,948][33226] Updated weights for policy 1, policy_version 90710 (0.0011) [2023-10-14 04:39:18,258][33201] Updated weights for policy 0, policy_version 89890 (0.0008) [2023-10-14 04:39:18,315][33226] Updated weights for policy 1, policy_version 90720 (0.0008) [2023-10-14 04:39:18,627][33201] Updated weights for policy 0, policy_version 89900 (0.0008) [2023-10-14 04:39:18,997][33201] Updated weights for policy 0, policy_version 89910 (0.0008) [2023-10-14 04:39:19,365][33201] Updated weights for policy 0, policy_version 89920 (0.0010) [2023-10-14 04:39:19,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 184975360. 
Throughput: 0: 1776.3, 1: 1781.1. Samples: 46246616. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '21.000')] [2023-10-14 04:39:22,149][33226] Updated weights for policy 1, policy_version 90730 (0.0007) [2023-10-14 04:39:22,522][33226] Updated weights for policy 1, policy_version 90740 (0.0009) [2023-10-14 04:39:22,884][33226] Updated weights for policy 1, policy_version 90750 (0.0009) [2023-10-14 04:39:23,270][33201] Updated weights for policy 0, policy_version 89930 (0.0007) [2023-10-14 04:39:23,637][33201] Updated weights for policy 0, policy_version 89940 (0.0007) [2023-10-14 04:39:23,998][33201] Updated weights for policy 0, policy_version 89950 (0.0009) [2023-10-14 04:39:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 185040896. Throughput: 0: 1741.1, 1: 1770.6. Samples: 46266858. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:24,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.650')] [2023-10-14 04:39:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000090752_92930048.pth... [2023-10-14 04:39:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000089952_92110848.pth... 
[2023-10-14 04:39:24,617][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000089088_91226112.pth [2023-10-14 04:39:24,617][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000088288_90406912.pth [2023-10-14 04:39:26,487][33226] Updated weights for policy 1, policy_version 90760 (0.0007) [2023-10-14 04:39:26,860][33226] Updated weights for policy 1, policy_version 90770 (0.0007) [2023-10-14 04:39:27,238][33226] Updated weights for policy 1, policy_version 90780 (0.0010) [2023-10-14 04:39:27,948][33201] Updated weights for policy 0, policy_version 89960 (0.0010) [2023-10-14 04:39:28,317][33201] Updated weights for policy 0, policy_version 89970 (0.0010) [2023-10-14 04:39:28,693][33201] Updated weights for policy 0, policy_version 89980 (0.0008) [2023-10-14 04:39:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 185106432. Throughput: 0: 1770.7, 1: 1783.7. Samples: 46278480. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:29,559][31953] Avg episode reward: [(0, '20.970'), (1, '20.650')] [2023-10-14 04:39:31,023][33226] Updated weights for policy 1, policy_version 90790 (0.0009) [2023-10-14 04:39:31,388][33226] Updated weights for policy 1, policy_version 90800 (0.0010) [2023-10-14 04:39:31,759][33226] Updated weights for policy 1, policy_version 90810 (0.0007) [2023-10-14 04:39:32,468][33201] Updated weights for policy 0, policy_version 89990 (0.0010) [2023-10-14 04:39:32,844][33201] Updated weights for policy 0, policy_version 90000 (0.0007) [2023-10-14 04:39:33,218][33201] Updated weights for policy 0, policy_version 90010 (0.0010) [2023-10-14 04:39:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 185171968. Throughput: 0: 1752.8, 1: 1780.9. Samples: 46299220. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.650')] [2023-10-14 04:39:35,730][33226] Updated weights for policy 1, policy_version 90820 (0.0007) [2023-10-14 04:39:36,107][33226] Updated weights for policy 1, policy_version 90830 (0.0007) [2023-10-14 04:39:36,459][33226] Updated weights for policy 1, policy_version 90840 (0.0007) [2023-10-14 04:39:36,995][33201] Updated weights for policy 0, policy_version 90020 (0.0010) [2023-10-14 04:39:37,357][33201] Updated weights for policy 0, policy_version 90030 (0.0008) [2023-10-14 04:39:37,725][33201] Updated weights for policy 0, policy_version 90040 (0.0008) [2023-10-14 04:39:39,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 185237504. Throughput: 0: 1744.2, 1: 1785.8. Samples: 46321042. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.650')] [2023-10-14 04:39:40,087][33226] Updated weights for policy 1, policy_version 90850 (0.0009) [2023-10-14 04:39:40,458][33226] Updated weights for policy 1, policy_version 90860 (0.0008) [2023-10-14 04:39:40,815][33226] Updated weights for policy 1, policy_version 90870 (0.0007) [2023-10-14 04:39:41,181][33226] Updated weights for policy 1, policy_version 90880 (0.0009) [2023-10-14 04:39:41,406][33201] Updated weights for policy 0, policy_version 90050 (0.0008) [2023-10-14 04:39:41,797][33201] Updated weights for policy 0, policy_version 90060 (0.0010) [2023-10-14 04:39:42,165][33201] Updated weights for policy 0, policy_version 90070 (0.0009) [2023-10-14 04:39:42,535][33201] Updated weights for policy 0, policy_version 90080 (0.0010) [2023-10-14 04:39:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 185303040. Throughput: 0: 1761.0, 1: 1782.7. Samples: 46331252. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.650')] [2023-10-14 04:39:45,039][33226] Updated weights for policy 1, policy_version 90890 (0.0008) [2023-10-14 04:39:45,403][33226] Updated weights for policy 1, policy_version 90900 (0.0007) [2023-10-14 04:39:45,762][33226] Updated weights for policy 1, policy_version 90910 (0.0007) [2023-10-14 04:39:46,284][33201] Updated weights for policy 0, policy_version 90090 (0.0008) [2023-10-14 04:39:46,649][33201] Updated weights for policy 0, policy_version 90100 (0.0007) [2023-10-14 04:39:47,017][33201] Updated weights for policy 0, policy_version 90110 (0.0008) [2023-10-14 04:39:49,551][33226] Updated weights for policy 1, policy_version 90920 (0.0008) [2023-10-14 04:39:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 185368576. Throughput: 0: 1754.4, 1: 1783.4. Samples: 46352778. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 04:39:49,920][33226] Updated weights for policy 1, policy_version 90930 (0.0010) [2023-10-14 04:39:50,284][33226] Updated weights for policy 1, policy_version 90940 (0.0007) [2023-10-14 04:39:50,780][33201] Updated weights for policy 0, policy_version 90120 (0.0007) [2023-10-14 04:39:51,149][33201] Updated weights for policy 0, policy_version 90130 (0.0009) [2023-10-14 04:39:51,519][33201] Updated weights for policy 0, policy_version 90140 (0.0007) [2023-10-14 04:39:54,082][33226] Updated weights for policy 1, policy_version 90950 (0.0009) [2023-10-14 04:39:54,450][33226] Updated weights for policy 1, policy_version 90960 (0.0008) [2023-10-14 04:39:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 185434112. Throughput: 0: 1768.2, 1: 1807.2. Samples: 46375052. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 04:39:54,814][33226] Updated weights for policy 1, policy_version 90970 (0.0009) [2023-10-14 04:39:55,232][33201] Updated weights for policy 0, policy_version 90150 (0.0007) [2023-10-14 04:39:55,609][33201] Updated weights for policy 0, policy_version 90160 (0.0010) [2023-10-14 04:39:55,977][33201] Updated weights for policy 0, policy_version 90170 (0.0009) [2023-10-14 04:39:58,433][33226] Updated weights for policy 1, policy_version 90980 (0.0007) [2023-10-14 04:39:58,797][33226] Updated weights for policy 1, policy_version 90990 (0.0008) [2023-10-14 04:39:59,159][33226] Updated weights for policy 1, policy_version 91000 (0.0010) [2023-10-14 04:39:59,557][31953] Fps is (10 sec: 16384.4, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 185532416. Throughput: 0: 1759.1, 1: 1783.6. Samples: 46384854. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:39:59,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 04:39:59,871][33201] Updated weights for policy 0, policy_version 90180 (0.0009) [2023-10-14 04:40:00,249][33201] Updated weights for policy 0, policy_version 90190 (0.0007) [2023-10-14 04:40:00,611][33201] Updated weights for policy 0, policy_version 90200 (0.0007) [2023-10-14 04:40:02,916][33226] Updated weights for policy 1, policy_version 91010 (0.0010) [2023-10-14 04:40:03,287][33226] Updated weights for policy 1, policy_version 91020 (0.0008) [2023-10-14 04:40:03,653][33226] Updated weights for policy 1, policy_version 91030 (0.0008) [2023-10-14 04:40:04,018][33226] Updated weights for policy 1, policy_version 91040 (0.0008) [2023-10-14 04:40:04,360][33201] Updated weights for policy 0, policy_version 90210 (0.0009) [2023-10-14 04:40:04,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 185597952. Throughput: 0: 1760.4, 1: 1796.2. 
Samples: 46406664. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:40:04,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 04:40:04,738][33201] Updated weights for policy 0, policy_version 90220 (0.0007) [2023-10-14 04:40:05,105][33201] Updated weights for policy 0, policy_version 90230 (0.0007) [2023-10-14 04:40:05,479][33201] Updated weights for policy 0, policy_version 90240 (0.0008) [2023-10-14 04:40:07,823][33226] Updated weights for policy 1, policy_version 91050 (0.0009) [2023-10-14 04:40:08,196][33226] Updated weights for policy 1, policy_version 91060 (0.0007) [2023-10-14 04:40:08,566][33226] Updated weights for policy 1, policy_version 91070 (0.0011) [2023-10-14 04:40:09,442][33201] Updated weights for policy 0, policy_version 90250 (0.0007) [2023-10-14 04:40:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 185663488. Throughput: 0: 1791.8, 1: 1779.3. Samples: 46427558. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.650')] [2023-10-14 04:40:09,813][33201] Updated weights for policy 0, policy_version 90260 (0.0010) [2023-10-14 04:40:10,188][33201] Updated weights for policy 0, policy_version 90270 (0.0010) [2023-10-14 04:40:12,279][33226] Updated weights for policy 1, policy_version 91080 (0.0008) [2023-10-14 04:40:12,649][33226] Updated weights for policy 1, policy_version 91090 (0.0007) [2023-10-14 04:40:13,012][33226] Updated weights for policy 1, policy_version 91100 (0.0008) [2023-10-14 04:40:13,918][33201] Updated weights for policy 0, policy_version 90280 (0.0008) [2023-10-14 04:40:14,292][33201] Updated weights for policy 0, policy_version 90290 (0.0010) [2023-10-14 04:40:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 185729024. Throughput: 0: 1762.2, 1: 1800.1. Samples: 46438784. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.640')] [2023-10-14 04:40:14,663][33201] Updated weights for policy 0, policy_version 90300 (0.0008) [2023-10-14 04:40:16,895][33226] Updated weights for policy 1, policy_version 91110 (0.0010) [2023-10-14 04:40:17,261][33226] Updated weights for policy 1, policy_version 91120 (0.0010) [2023-10-14 04:40:17,635][33226] Updated weights for policy 1, policy_version 91130 (0.0008) [2023-10-14 04:40:18,566][33201] Updated weights for policy 0, policy_version 90310 (0.0007) [2023-10-14 04:40:18,932][33201] Updated weights for policy 0, policy_version 90320 (0.0008) [2023-10-14 04:40:19,306][33201] Updated weights for policy 0, policy_version 90330 (0.0007) [2023-10-14 04:40:19,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 185827328. Throughput: 0: 1783.6, 1: 1777.7. Samples: 46459478. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.640')] [2023-10-14 04:40:21,408][33226] Updated weights for policy 1, policy_version 91140 (0.0009) [2023-10-14 04:40:21,773][33226] Updated weights for policy 1, policy_version 91150 (0.0009) [2023-10-14 04:40:22,144][33226] Updated weights for policy 1, policy_version 91160 (0.0007) [2023-10-14 04:40:23,204][33201] Updated weights for policy 0, policy_version 90340 (0.0008) [2023-10-14 04:40:23,565][33201] Updated weights for policy 0, policy_version 90350 (0.0009) [2023-10-14 04:40:23,939][33201] Updated weights for policy 0, policy_version 90360 (0.0009) [2023-10-14 04:40:24,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 185892864. Throughput: 0: 1764.2, 1: 1777.9. Samples: 46480436. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:24,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.640')] [2023-10-14 04:40:26,024][33226] Updated weights for policy 1, policy_version 91170 (0.0009) [2023-10-14 04:40:26,389][33226] Updated weights for policy 1, policy_version 91180 (0.0009) [2023-10-14 04:40:26,749][33226] Updated weights for policy 1, policy_version 91190 (0.0008) [2023-10-14 04:40:27,112][33226] Updated weights for policy 1, policy_version 91200 (0.0008) [2023-10-14 04:40:27,760][33201] Updated weights for policy 0, policy_version 90370 (0.0008) [2023-10-14 04:40:28,172][33201] Updated weights for policy 0, policy_version 90380 (0.0007) [2023-10-14 04:40:28,540][33201] Updated weights for policy 0, policy_version 90390 (0.0007) [2023-10-14 04:40:28,913][33201] Updated weights for policy 0, policy_version 90400 (0.0009) [2023-10-14 04:40:29,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 185958400. Throughput: 0: 1777.1, 1: 1782.9. Samples: 46491454. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:29,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.640')] [2023-10-14 04:40:30,952][33226] Updated weights for policy 1, policy_version 91210 (0.0007) [2023-10-14 04:40:31,313][33226] Updated weights for policy 1, policy_version 91220 (0.0008) [2023-10-14 04:40:31,687][33226] Updated weights for policy 1, policy_version 91230 (0.0009) [2023-10-14 04:40:32,837][33201] Updated weights for policy 0, policy_version 90410 (0.0009) [2023-10-14 04:40:33,214][33201] Updated weights for policy 0, policy_version 90420 (0.0011) [2023-10-14 04:40:33,578][33201] Updated weights for policy 0, policy_version 90430 (0.0008) [2023-10-14 04:40:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 186023936. Throughput: 0: 1765.3, 1: 1779.1. Samples: 46512276. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:34,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.640')] [2023-10-14 04:40:35,310][33226] Updated weights for policy 1, policy_version 91240 (0.0008) [2023-10-14 04:40:35,668][33226] Updated weights for policy 1, policy_version 91250 (0.0007) [2023-10-14 04:40:36,045][33226] Updated weights for policy 1, policy_version 91260 (0.0008) [2023-10-14 04:40:37,311][33201] Updated weights for policy 0, policy_version 90440 (0.0007) [2023-10-14 04:40:37,690][33201] Updated weights for policy 0, policy_version 90450 (0.0008) [2023-10-14 04:40:38,053][33201] Updated weights for policy 0, policy_version 90460 (0.0007) [2023-10-14 04:40:39,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 186089472. Throughput: 0: 1746.9, 1: 1780.9. Samples: 46533802. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:39,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.640')] [2023-10-14 04:40:39,930][33226] Updated weights for policy 1, policy_version 91270 (0.0009) [2023-10-14 04:40:40,294][33226] Updated weights for policy 1, policy_version 91280 (0.0007) [2023-10-14 04:40:40,673][33226] Updated weights for policy 1, policy_version 91290 (0.0009) [2023-10-14 04:40:41,963][33201] Updated weights for policy 0, policy_version 90470 (0.0008) [2023-10-14 04:40:42,330][33201] Updated weights for policy 0, policy_version 90480 (0.0011) [2023-10-14 04:40:42,704][33201] Updated weights for policy 0, policy_version 90490 (0.0008) [2023-10-14 04:40:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 186155008. Throughput: 0: 1768.8, 1: 1771.9. Samples: 46544186. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.640')] [2023-10-14 04:40:44,601][33226] Updated weights for policy 1, policy_version 91300 (0.0009) [2023-10-14 04:40:44,967][33226] Updated weights for policy 1, policy_version 91310 (0.0008) [2023-10-14 04:40:45,333][33226] Updated weights for policy 1, policy_version 91320 (0.0009) [2023-10-14 04:40:46,676][33201] Updated weights for policy 0, policy_version 90500 (0.0008) [2023-10-14 04:40:47,046][33201] Updated weights for policy 0, policy_version 90510 (0.0007) [2023-10-14 04:40:47,413][33201] Updated weights for policy 0, policy_version 90520 (0.0007) [2023-10-14 04:40:49,142][33226] Updated weights for policy 1, policy_version 91330 (0.0008) [2023-10-14 04:40:49,505][33226] Updated weights for policy 1, policy_version 91340 (0.0008) [2023-10-14 04:40:49,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 186220544. Throughput: 0: 1744.0, 1: 1775.9. Samples: 46565056. 
Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.640')] [2023-10-14 04:40:49,868][33226] Updated weights for policy 1, policy_version 91350 (0.0010) [2023-10-14 04:40:50,231][33226] Updated weights for policy 1, policy_version 91360 (0.0007) [2023-10-14 04:40:51,240][33201] Updated weights for policy 0, policy_version 90530 (0.0007) [2023-10-14 04:40:51,604][33201] Updated weights for policy 0, policy_version 90540 (0.0008) [2023-10-14 04:40:51,972][33201] Updated weights for policy 0, policy_version 90550 (0.0007) [2023-10-14 04:40:52,337][33201] Updated weights for policy 0, policy_version 90560 (0.0010) [2023-10-14 04:40:54,006][33226] Updated weights for policy 1, policy_version 91370 (0.0008) [2023-10-14 04:40:54,384][33226] Updated weights for policy 1, policy_version 91380 (0.0008) [2023-10-14 04:40:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 186286080. Throughput: 0: 1741.9, 1: 1795.4. Samples: 46586738. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.640')] [2023-10-14 04:40:54,749][33226] Updated weights for policy 1, policy_version 91390 (0.0009) [2023-10-14 04:40:56,238][33201] Updated weights for policy 0, policy_version 90570 (0.0011) [2023-10-14 04:40:56,603][33201] Updated weights for policy 0, policy_version 90580 (0.0009) [2023-10-14 04:40:56,967][33201] Updated weights for policy 0, policy_version 90590 (0.0008) [2023-10-14 04:40:58,483][33226] Updated weights for policy 1, policy_version 91400 (0.0008) [2023-10-14 04:40:58,851][33226] Updated weights for policy 1, policy_version 91410 (0.0007) [2023-10-14 04:40:59,212][33226] Updated weights for policy 1, policy_version 91420 (0.0009) [2023-10-14 04:40:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 186384384. Throughput: 0: 1743.1, 1: 1766.4. 
Samples: 46596714. Policy #0 lag: (min: 31.0, avg: 38.9, max: 63.0) [2023-10-14 04:40:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.640')] [2023-10-14 04:41:00,714][33201] Updated weights for policy 0, policy_version 90600 (0.0007) [2023-10-14 04:41:01,090][33201] Updated weights for policy 0, policy_version 90610 (0.0007) [2023-10-14 04:41:01,451][33201] Updated weights for policy 0, policy_version 90620 (0.0010) [2023-10-14 04:41:03,079][33226] Updated weights for policy 1, policy_version 91430 (0.0010) [2023-10-14 04:41:03,445][33226] Updated weights for policy 1, policy_version 91440 (0.0009) [2023-10-14 04:41:03,810][33226] Updated weights for policy 1, policy_version 91450 (0.0008) [2023-10-14 04:41:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 186449920. Throughput: 0: 1742.4, 1: 1799.7. Samples: 46618870. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.640')] [2023-10-14 04:41:05,168][33201] Updated weights for policy 0, policy_version 90630 (0.0008) [2023-10-14 04:41:05,544][33201] Updated weights for policy 0, policy_version 90640 (0.0007) [2023-10-14 04:41:05,907][33201] Updated weights for policy 0, policy_version 90650 (0.0009) [2023-10-14 04:41:07,778][33226] Updated weights for policy 1, policy_version 91460 (0.0008) [2023-10-14 04:41:08,179][33226] Updated weights for policy 1, policy_version 91470 (0.0008) [2023-10-14 04:41:08,547][33226] Updated weights for policy 1, policy_version 91480 (0.0008) [2023-10-14 04:41:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 186515456. Throughput: 0: 1771.1, 1: 1761.4. Samples: 46639398. 
Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 04:41:09,762][33201] Updated weights for policy 0, policy_version 90660 (0.0008) [2023-10-14 04:41:10,128][33201] Updated weights for policy 0, policy_version 90670 (0.0007) [2023-10-14 04:41:10,496][33201] Updated weights for policy 0, policy_version 90680 (0.0007) [2023-10-14 04:41:12,235][33226] Updated weights for policy 1, policy_version 91490 (0.0009) [2023-10-14 04:41:12,602][33226] Updated weights for policy 1, policy_version 91500 (0.0008) [2023-10-14 04:41:12,970][33226] Updated weights for policy 1, policy_version 91510 (0.0008) [2023-10-14 04:41:13,325][33226] Updated weights for policy 1, policy_version 91520 (0.0008) [2023-10-14 04:41:14,342][33201] Updated weights for policy 0, policy_version 90690 (0.0008) [2023-10-14 04:41:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 186580992. Throughput: 0: 1741.0, 1: 1790.2. Samples: 46650356. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.990')] [2023-10-14 04:41:14,755][33201] Updated weights for policy 0, policy_version 90700 (0.0009) [2023-10-14 04:41:15,123][33201] Updated weights for policy 0, policy_version 90710 (0.0007) [2023-10-14 04:41:15,490][33201] Updated weights for policy 0, policy_version 90720 (0.0007) [2023-10-14 04:41:17,121][33226] Updated weights for policy 1, policy_version 91530 (0.0009) [2023-10-14 04:41:17,491][33226] Updated weights for policy 1, policy_version 91540 (0.0009) [2023-10-14 04:41:17,849][33226] Updated weights for policy 1, policy_version 91550 (0.0007) [2023-10-14 04:41:19,221][33201] Updated weights for policy 0, policy_version 90730 (0.0008) [2023-10-14 04:41:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 186646528. Throughput: 0: 1768.2, 1: 1763.3. 
Samples: 46671194. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:19,595][33201] Updated weights for policy 0, policy_version 90740 (0.0011) [2023-10-14 04:41:19,959][33201] Updated weights for policy 0, policy_version 90750 (0.0010) [2023-10-14 04:41:21,589][33226] Updated weights for policy 1, policy_version 91560 (0.0011) [2023-10-14 04:41:21,956][33226] Updated weights for policy 1, policy_version 91570 (0.0007) [2023-10-14 04:41:22,327][33226] Updated weights for policy 1, policy_version 91580 (0.0010) [2023-10-14 04:41:23,865][33201] Updated weights for policy 0, policy_version 90760 (0.0009) [2023-10-14 04:41:24,231][33201] Updated weights for policy 0, policy_version 90770 (0.0009) [2023-10-14 04:41:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 186712064. Throughput: 0: 1771.9, 1: 1760.4. Samples: 46692756. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000091584_93782016.pth... [2023-10-14 04:41:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000089920_92078080.pth [2023-10-14 04:41:24,606][32895] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p1/milestones/checkpoint_000091584_93782016.pth [2023-10-14 04:41:24,613][33201] Updated weights for policy 0, policy_version 90780 (0.0007) [2023-10-14 04:41:24,749][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000090784_92962816.pth... 
[2023-10-14 04:41:24,777][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000089120_91258880.pth [2023-10-14 04:41:24,781][32837] Saving a milestone ./train_atari/atari_pong_APPO/checkpoint_p0/milestones/checkpoint_000090784_92962816.pth [2023-10-14 04:41:26,018][33226] Updated weights for policy 1, policy_version 91590 (0.0007) [2023-10-14 04:41:26,384][33226] Updated weights for policy 1, policy_version 91600 (0.0010) [2023-10-14 04:41:26,760][33226] Updated weights for policy 1, policy_version 91610 (0.0008) [2023-10-14 04:41:28,431][33201] Updated weights for policy 0, policy_version 90790 (0.0008) [2023-10-14 04:41:28,802][33201] Updated weights for policy 0, policy_version 90800 (0.0007) [2023-10-14 04:41:29,182][33201] Updated weights for policy 0, policy_version 90810 (0.0008) [2023-10-14 04:41:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 186810368. Throughput: 0: 1765.4, 1: 1767.3. Samples: 46703158. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:30,582][33226] Updated weights for policy 1, policy_version 91620 (0.0009) [2023-10-14 04:41:30,955][33226] Updated weights for policy 1, policy_version 91630 (0.0009) [2023-10-14 04:41:31,325][33226] Updated weights for policy 1, policy_version 91640 (0.0011) [2023-10-14 04:41:32,972][33201] Updated weights for policy 0, policy_version 90820 (0.0011) [2023-10-14 04:41:33,338][33201] Updated weights for policy 0, policy_version 90830 (0.0009) [2023-10-14 04:41:33,703][33201] Updated weights for policy 0, policy_version 90840 (0.0008) [2023-10-14 04:41:34,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 186875904. Throughput: 0: 1781.2, 1: 1773.5. Samples: 46725018. 
Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:34,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:35,131][33226] Updated weights for policy 1, policy_version 91650 (0.0010) [2023-10-14 04:41:35,502][33226] Updated weights for policy 1, policy_version 91660 (0.0007) [2023-10-14 04:41:35,859][33226] Updated weights for policy 1, policy_version 91670 (0.0007) [2023-10-14 04:41:36,216][33226] Updated weights for policy 1, policy_version 91680 (0.0008) [2023-10-14 04:41:37,542][33201] Updated weights for policy 0, policy_version 90850 (0.0010) [2023-10-14 04:41:37,911][33201] Updated weights for policy 0, policy_version 90860 (0.0007) [2023-10-14 04:41:38,274][33201] Updated weights for policy 0, policy_version 90870 (0.0007) [2023-10-14 04:41:38,647][33201] Updated weights for policy 0, policy_version 90880 (0.0008) [2023-10-14 04:41:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 186941440. Throughput: 0: 1763.9, 1: 1780.9. Samples: 46746252. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:40,131][33226] Updated weights for policy 1, policy_version 91690 (0.0007) [2023-10-14 04:41:40,498][33226] Updated weights for policy 1, policy_version 91700 (0.0008) [2023-10-14 04:41:40,869][33226] Updated weights for policy 1, policy_version 91710 (0.0010) [2023-10-14 04:41:42,251][33201] Updated weights for policy 0, policy_version 90890 (0.0010) [2023-10-14 04:41:42,627][33201] Updated weights for policy 0, policy_version 90900 (0.0010) [2023-10-14 04:41:43,000][33201] Updated weights for policy 0, policy_version 90910 (0.0008) [2023-10-14 04:41:44,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 187006976. Throughput: 0: 1796.0, 1: 1767.9. Samples: 46757088. 
Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:44,731][33226] Updated weights for policy 1, policy_version 91720 (0.0009) [2023-10-14 04:41:45,089][33226] Updated weights for policy 1, policy_version 91730 (0.0009) [2023-10-14 04:41:45,463][33226] Updated weights for policy 1, policy_version 91740 (0.0008) [2023-10-14 04:41:46,832][33201] Updated weights for policy 0, policy_version 90920 (0.0008) [2023-10-14 04:41:47,211][33201] Updated weights for policy 0, policy_version 90930 (0.0009) [2023-10-14 04:41:47,593][33201] Updated weights for policy 0, policy_version 90940 (0.0008) [2023-10-14 04:41:49,340][33226] Updated weights for policy 1, policy_version 91750 (0.0010) [2023-10-14 04:41:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 187072512. Throughput: 0: 1764.4, 1: 1771.9. Samples: 46778002. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:49,706][33226] Updated weights for policy 1, policy_version 91760 (0.0010) [2023-10-14 04:41:50,073][33226] Updated weights for policy 1, policy_version 91770 (0.0009) [2023-10-14 04:41:51,324][33201] Updated weights for policy 0, policy_version 90950 (0.0010) [2023-10-14 04:41:51,700][33201] Updated weights for policy 0, policy_version 90960 (0.0008) [2023-10-14 04:41:52,071][33201] Updated weights for policy 0, policy_version 90970 (0.0008) [2023-10-14 04:41:53,810][33226] Updated weights for policy 1, policy_version 91780 (0.0007) [2023-10-14 04:41:54,212][33226] Updated weights for policy 1, policy_version 91790 (0.0007) [2023-10-14 04:41:54,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 187138048. Throughput: 0: 1761.8, 1: 1803.2. Samples: 46799822. 
Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:54,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.990')] [2023-10-14 04:41:54,579][33226] Updated weights for policy 1, policy_version 91800 (0.0010) [2023-10-14 04:41:55,974][33201] Updated weights for policy 0, policy_version 90980 (0.0008) [2023-10-14 04:41:56,342][33201] Updated weights for policy 0, policy_version 90990 (0.0009) [2023-10-14 04:41:56,723][33201] Updated weights for policy 0, policy_version 91000 (0.0009) [2023-10-14 04:41:58,420][33226] Updated weights for policy 1, policy_version 91810 (0.0007) [2023-10-14 04:41:58,796][33226] Updated weights for policy 1, policy_version 91820 (0.0008) [2023-10-14 04:41:59,166][33226] Updated weights for policy 1, policy_version 91830 (0.0008) [2023-10-14 04:41:59,526][33226] Updated weights for policy 1, policy_version 91840 (0.0010) [2023-10-14 04:41:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 187236352. Throughput: 0: 1764.4, 1: 1774.4. Samples: 46809604. Policy #0 lag: (min: 20.0, avg: 30.6, max: 52.0) [2023-10-14 04:41:59,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:00,624][33201] Updated weights for policy 0, policy_version 91010 (0.0009) [2023-10-14 04:42:00,995][33201] Updated weights for policy 0, policy_version 91020 (0.0008) [2023-10-14 04:42:01,360][33201] Updated weights for policy 0, policy_version 91030 (0.0008) [2023-10-14 04:42:01,731][33201] Updated weights for policy 0, policy_version 91040 (0.0010) [2023-10-14 04:42:03,253][33226] Updated weights for policy 1, policy_version 91850 (0.0010) [2023-10-14 04:42:03,616][33226] Updated weights for policy 1, policy_version 91860 (0.0009) [2023-10-14 04:42:03,991][33226] Updated weights for policy 1, policy_version 91870 (0.0008) [2023-10-14 04:42:04,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 187301888. Throughput: 0: 1767.1, 1: 1801.6. 
Samples: 46831784. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:04,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:05,666][33201] Updated weights for policy 0, policy_version 91050 (0.0009) [2023-10-14 04:42:06,026][33201] Updated weights for policy 0, policy_version 91060 (0.0010) [2023-10-14 04:42:06,388][33201] Updated weights for policy 0, policy_version 91070 (0.0010) [2023-10-14 04:42:07,804][33226] Updated weights for policy 1, policy_version 91880 (0.0009) [2023-10-14 04:42:08,175][33226] Updated weights for policy 1, policy_version 91890 (0.0011) [2023-10-14 04:42:08,537][33226] Updated weights for policy 1, policy_version 91900 (0.0010) [2023-10-14 04:42:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 187367424. Throughput: 0: 1777.9, 1: 1774.4. Samples: 46852612. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:09,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:10,084][33201] Updated weights for policy 0, policy_version 91080 (0.0008) [2023-10-14 04:42:10,454][33201] Updated weights for policy 0, policy_version 91090 (0.0007) [2023-10-14 04:42:10,824][33201] Updated weights for policy 0, policy_version 91100 (0.0008) [2023-10-14 04:42:12,226][33226] Updated weights for policy 1, policy_version 91910 (0.0007) [2023-10-14 04:42:12,597][33226] Updated weights for policy 1, policy_version 91920 (0.0007) [2023-10-14 04:42:12,955][33226] Updated weights for policy 1, policy_version 91930 (0.0009) [2023-10-14 04:42:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 187432960. Throughput: 0: 1761.8, 1: 1805.0. Samples: 46863664. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:14,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:14,694][33201] Updated weights for policy 0, policy_version 91110 (0.0009) [2023-10-14 04:42:15,065][33201] Updated weights for policy 0, policy_version 91120 (0.0009) [2023-10-14 04:42:15,432][33201] Updated weights for policy 0, policy_version 91130 (0.0010) [2023-10-14 04:42:16,824][33226] Updated weights for policy 1, policy_version 91940 (0.0008) [2023-10-14 04:42:17,202][33226] Updated weights for policy 1, policy_version 91950 (0.0009) [2023-10-14 04:42:17,573][33226] Updated weights for policy 1, policy_version 91960 (0.0010) [2023-10-14 04:42:19,261][33201] Updated weights for policy 0, policy_version 91140 (0.0009) [2023-10-14 04:42:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 187498496. Throughput: 0: 1772.6, 1: 1770.8. Samples: 46884472. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:19,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:19,633][33201] Updated weights for policy 0, policy_version 91150 (0.0007) [2023-10-14 04:42:20,002][33201] Updated weights for policy 0, policy_version 91160 (0.0007) [2023-10-14 04:42:21,286][33226] Updated weights for policy 1, policy_version 91970 (0.0010) [2023-10-14 04:42:21,645][33226] Updated weights for policy 1, policy_version 91980 (0.0009) [2023-10-14 04:42:22,016][33226] Updated weights for policy 1, policy_version 91990 (0.0007) [2023-10-14 04:42:22,383][33226] Updated weights for policy 1, policy_version 92000 (0.0007) [2023-10-14 04:42:23,855][33201] Updated weights for policy 0, policy_version 91170 (0.0008) [2023-10-14 04:42:24,219][33201] Updated weights for policy 0, policy_version 91180 (0.0010) [2023-10-14 04:42:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 187564032. Throughput: 0: 1778.9, 1: 1773.0. 
Samples: 46906088. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:24,595][33201] Updated weights for policy 0, policy_version 91190 (0.0010) [2023-10-14 04:42:24,964][33201] Updated weights for policy 0, policy_version 91200 (0.0009) [2023-10-14 04:42:26,005][33226] Updated weights for policy 1, policy_version 92010 (0.0010) [2023-10-14 04:42:26,376][33226] Updated weights for policy 1, policy_version 92020 (0.0009) [2023-10-14 04:42:26,733][33226] Updated weights for policy 1, policy_version 92030 (0.0008) [2023-10-14 04:42:28,898][33201] Updated weights for policy 0, policy_version 91210 (0.0011) [2023-10-14 04:42:29,272][33201] Updated weights for policy 0, policy_version 91220 (0.0011) [2023-10-14 04:42:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 187629568. Throughput: 0: 1754.3, 1: 1782.6. Samples: 46916250. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:29,643][33201] Updated weights for policy 0, policy_version 91230 (0.0010) [2023-10-14 04:42:30,335][33226] Updated weights for policy 1, policy_version 92040 (0.0009) [2023-10-14 04:42:30,706][33226] Updated weights for policy 1, policy_version 92050 (0.0008) [2023-10-14 04:42:31,083][33226] Updated weights for policy 1, policy_version 92060 (0.0010) [2023-10-14 04:42:33,538][33201] Updated weights for policy 0, policy_version 91240 (0.0009) [2023-10-14 04:42:33,907][33201] Updated weights for policy 0, policy_version 91250 (0.0007) [2023-10-14 04:42:34,280][33201] Updated weights for policy 0, policy_version 91260 (0.0007) [2023-10-14 04:42:34,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 187727872. Throughput: 0: 1782.8, 1: 1778.6. Samples: 46938266. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:34,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:35,004][33226] Updated weights for policy 1, policy_version 92070 (0.0009) [2023-10-14 04:42:35,369][33226] Updated weights for policy 1, policy_version 92080 (0.0008) [2023-10-14 04:42:35,732][33226] Updated weights for policy 1, policy_version 92090 (0.0010) [2023-10-14 04:42:38,010][33201] Updated weights for policy 0, policy_version 91270 (0.0008) [2023-10-14 04:42:38,374][33201] Updated weights for policy 0, policy_version 91280 (0.0009) [2023-10-14 04:42:38,749][33201] Updated weights for policy 0, policy_version 91290 (0.0009) [2023-10-14 04:42:39,424][33226] Updated weights for policy 1, policy_version 92100 (0.0008) [2023-10-14 04:42:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 187793408. Throughput: 0: 1751.6, 1: 1791.3. Samples: 46959250. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:39,558][31953] Avg episode reward: [(0, '21.000'), (1, '21.000')] [2023-10-14 04:42:39,793][33226] Updated weights for policy 1, policy_version 92110 (0.0011) [2023-10-14 04:42:40,156][33226] Updated weights for policy 1, policy_version 92120 (0.0009) [2023-10-14 04:42:42,606][33201] Updated weights for policy 0, policy_version 91300 (0.0008) [2023-10-14 04:42:42,968][33201] Updated weights for policy 0, policy_version 91310 (0.0009) [2023-10-14 04:42:43,329][33201] Updated weights for policy 0, policy_version 91320 (0.0008) [2023-10-14 04:42:43,817][33226] Updated weights for policy 1, policy_version 92130 (0.0007) [2023-10-14 04:42:44,186][33226] Updated weights for policy 1, policy_version 92140 (0.0007) [2023-10-14 04:42:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 187858944. Throughput: 0: 1782.9, 1: 1787.5. Samples: 46970274. 
Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:44,558][31953] Avg episode reward: [(0, '20.990'), (1, '21.000')] [2023-10-14 04:42:44,559][33226] Updated weights for policy 1, policy_version 92150 (0.0008) [2023-10-14 04:42:44,923][33226] Updated weights for policy 1, policy_version 92160 (0.0010) [2023-10-14 04:42:47,239][33201] Updated weights for policy 0, policy_version 91330 (0.0009) [2023-10-14 04:42:47,613][33201] Updated weights for policy 0, policy_version 91340 (0.0009) [2023-10-14 04:42:47,973][33201] Updated weights for policy 0, policy_version 91350 (0.0008) [2023-10-14 04:42:48,348][33201] Updated weights for policy 0, policy_version 91360 (0.0011) [2023-10-14 04:42:48,757][33226] Updated weights for policy 1, policy_version 92170 (0.0007) [2023-10-14 04:42:49,121][33226] Updated weights for policy 1, policy_version 92180 (0.0009) [2023-10-14 04:42:49,488][33226] Updated weights for policy 1, policy_version 92190 (0.0010) [2023-10-14 04:42:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 187924480. Throughput: 0: 1753.5, 1: 1789.3. Samples: 46991210. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:49,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:42:52,174][33201] Updated weights for policy 0, policy_version 91370 (0.0007) [2023-10-14 04:42:52,558][33201] Updated weights for policy 0, policy_version 91380 (0.0009) [2023-10-14 04:42:52,921][33201] Updated weights for policy 0, policy_version 91390 (0.0008) [2023-10-14 04:42:53,233][33226] Updated weights for policy 1, policy_version 92200 (0.0009) [2023-10-14 04:42:53,605][33226] Updated weights for policy 1, policy_version 92210 (0.0007) [2023-10-14 04:42:53,975][33226] Updated weights for policy 1, policy_version 92220 (0.0007) [2023-10-14 04:42:54,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.5, 300 sec: 14329.1). Total num frames: 188022784. Throughput: 0: 1741.9, 1: 1793.1. 
Samples: 47011688. Policy #0 lag: (min: 19.0, avg: 19.0, max: 19.0) [2023-10-14 04:42:54,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:42:56,664][33201] Updated weights for policy 0, policy_version 91400 (0.0009) [2023-10-14 04:42:57,030][33201] Updated weights for policy 0, policy_version 91410 (0.0008) [2023-10-14 04:42:57,395][33201] Updated weights for policy 0, policy_version 91420 (0.0011) [2023-10-14 04:42:57,699][33226] Updated weights for policy 1, policy_version 92230 (0.0009) [2023-10-14 04:42:58,064][33226] Updated weights for policy 1, policy_version 92240 (0.0010) [2023-10-14 04:42:58,431][33226] Updated weights for policy 1, policy_version 92250 (0.0011) [2023-10-14 04:42:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 188088320. Throughput: 0: 1757.9, 1: 1784.4. Samples: 47023068. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:42:59,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:01,277][33201] Updated weights for policy 0, policy_version 91430 (0.0008) [2023-10-14 04:43:01,642][33201] Updated weights for policy 0, policy_version 91440 (0.0009) [2023-10-14 04:43:02,010][33201] Updated weights for policy 0, policy_version 91450 (0.0007) [2023-10-14 04:43:02,318][33226] Updated weights for policy 1, policy_version 92260 (0.0010) [2023-10-14 04:43:02,675][33226] Updated weights for policy 1, policy_version 92270 (0.0007) [2023-10-14 04:43:03,045][33226] Updated weights for policy 1, policy_version 92280 (0.0009) [2023-10-14 04:43:04,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 188153856. Throughput: 0: 1746.5, 1: 1796.9. Samples: 47043924. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:04,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:05,697][33201] Updated weights for policy 0, policy_version 91460 (0.0008) [2023-10-14 04:43:06,070][33201] Updated weights for policy 0, policy_version 91470 (0.0010) [2023-10-14 04:43:06,442][33201] Updated weights for policy 0, policy_version 91480 (0.0009) [2023-10-14 04:43:06,858][33226] Updated weights for policy 1, policy_version 92290 (0.0009) [2023-10-14 04:43:07,225][33226] Updated weights for policy 1, policy_version 92300 (0.0010) [2023-10-14 04:43:07,587][33226] Updated weights for policy 1, policy_version 92310 (0.0009) [2023-10-14 04:43:07,952][33226] Updated weights for policy 1, policy_version 92320 (0.0007) [2023-10-14 04:43:09,557][31953] Fps is (10 sec: 13106.6, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 188219392. Throughput: 0: 1761.0, 1: 1786.5. Samples: 47065726. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:09,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:10,322][33201] Updated weights for policy 0, policy_version 91490 (0.0009) [2023-10-14 04:43:10,690][33201] Updated weights for policy 0, policy_version 91500 (0.0008) [2023-10-14 04:43:11,062][33201] Updated weights for policy 0, policy_version 91510 (0.0007) [2023-10-14 04:43:11,425][33201] Updated weights for policy 0, policy_version 91520 (0.0009) [2023-10-14 04:43:11,594][33226] Updated weights for policy 1, policy_version 92330 (0.0008) [2023-10-14 04:43:11,961][33226] Updated weights for policy 1, policy_version 92340 (0.0010) [2023-10-14 04:43:12,330][33226] Updated weights for policy 1, policy_version 92350 (0.0010) [2023-10-14 04:43:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 188284928. Throughput: 0: 1751.3, 1: 1798.3. Samples: 47075982. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:14,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:15,324][33201] Updated weights for policy 0, policy_version 91530 (0.0011) [2023-10-14 04:43:15,693][33201] Updated weights for policy 0, policy_version 91540 (0.0007) [2023-10-14 04:43:16,065][33201] Updated weights for policy 0, policy_version 91550 (0.0007) [2023-10-14 04:43:16,088][33226] Updated weights for policy 1, policy_version 92360 (0.0008) [2023-10-14 04:43:16,458][33226] Updated weights for policy 1, policy_version 92370 (0.0010) [2023-10-14 04:43:16,826][33226] Updated weights for policy 1, policy_version 92380 (0.0010) [2023-10-14 04:43:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 188350464. Throughput: 0: 1750.7, 1: 1790.5. Samples: 47097620. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:19,877][33201] Updated weights for policy 0, policy_version 91560 (0.0008) [2023-10-14 04:43:20,251][33201] Updated weights for policy 0, policy_version 91570 (0.0007) [2023-10-14 04:43:20,555][33226] Updated weights for policy 1, policy_version 92390 (0.0007) [2023-10-14 04:43:20,623][33201] Updated weights for policy 0, policy_version 91580 (0.0008) [2023-10-14 04:43:20,935][33226] Updated weights for policy 1, policy_version 92400 (0.0010) [2023-10-14 04:43:21,298][33226] Updated weights for policy 1, policy_version 92410 (0.0008) [2023-10-14 04:43:24,392][33201] Updated weights for policy 0, policy_version 91590 (0.0009) [2023-10-14 04:43:24,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 188416000. Throughput: 0: 1779.5, 1: 1787.3. Samples: 47119756. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:24,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000092416_94633984.pth... [2023-10-14 04:43:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000090752_92930048.pth [2023-10-14 04:43:24,757][33201] Updated weights for policy 0, policy_version 91600 (0.0012) [2023-10-14 04:43:25,039][33226] Updated weights for policy 1, policy_version 92420 (0.0010) [2023-10-14 04:43:25,121][33201] Updated weights for policy 0, policy_version 91610 (0.0009) [2023-10-14 04:43:25,344][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000091616_93814784.pth... [2023-10-14 04:43:25,372][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000089952_92110848.pth [2023-10-14 04:43:25,429][33226] Updated weights for policy 1, policy_version 92430 (0.0009) [2023-10-14 04:43:25,792][33226] Updated weights for policy 1, policy_version 92440 (0.0007) [2023-10-14 04:43:28,961][33201] Updated weights for policy 0, policy_version 91620 (0.0007) [2023-10-14 04:43:29,328][33201] Updated weights for policy 0, policy_version 91630 (0.0007) [2023-10-14 04:43:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 188481536. Throughput: 0: 1748.6, 1: 1783.1. Samples: 47129202. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:29,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:29,663][33226] Updated weights for policy 1, policy_version 92450 (0.0007) [2023-10-14 04:43:29,701][33201] Updated weights for policy 0, policy_version 91640 (0.0007) [2023-10-14 04:43:30,031][33226] Updated weights for policy 1, policy_version 92460 (0.0010) [2023-10-14 04:43:30,395][33226] Updated weights for policy 1, policy_version 92470 (0.0008) [2023-10-14 04:43:30,769][33226] Updated weights for policy 1, policy_version 92480 (0.0009) [2023-10-14 04:43:33,487][33201] Updated weights for policy 0, policy_version 91650 (0.0007) [2023-10-14 04:43:33,857][33201] Updated weights for policy 0, policy_version 91660 (0.0010) [2023-10-14 04:43:34,232][33201] Updated weights for policy 0, policy_version 91670 (0.0007) [2023-10-14 04:43:34,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 188547072. Throughput: 0: 1776.5, 1: 1782.1. Samples: 47151346. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:34,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:34,579][33226] Updated weights for policy 1, policy_version 92490 (0.0008) [2023-10-14 04:43:34,592][33201] Updated weights for policy 0, policy_version 91680 (0.0010) [2023-10-14 04:43:34,941][33226] Updated weights for policy 1, policy_version 92500 (0.0008) [2023-10-14 04:43:35,314][33226] Updated weights for policy 1, policy_version 92510 (0.0011) [2023-10-14 04:43:38,577][33201] Updated weights for policy 0, policy_version 91690 (0.0007) [2023-10-14 04:43:38,944][33201] Updated weights for policy 0, policy_version 91700 (0.0008) [2023-10-14 04:43:39,091][33226] Updated weights for policy 1, policy_version 92520 (0.0009) [2023-10-14 04:43:39,307][33201] Updated weights for policy 0, policy_version 91710 (0.0008) [2023-10-14 04:43:39,459][33226] Updated weights for policy 1, policy_version 92530 (0.0007) [2023-10-14 04:43:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 188645376. Throughput: 0: 1758.9, 1: 1802.4. Samples: 47171948. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:39,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:39,818][33226] Updated weights for policy 1, policy_version 92540 (0.0009) [2023-10-14 04:43:43,341][33201] Updated weights for policy 0, policy_version 91720 (0.0007) [2023-10-14 04:43:43,614][33226] Updated weights for policy 1, policy_version 92550 (0.0007) [2023-10-14 04:43:43,718][33201] Updated weights for policy 0, policy_version 91730 (0.0008) [2023-10-14 04:43:43,985][33226] Updated weights for policy 1, policy_version 92560 (0.0008) [2023-10-14 04:43:44,086][33201] Updated weights for policy 0, policy_version 91740 (0.0009) [2023-10-14 04:43:44,351][33226] Updated weights for policy 1, policy_version 92570 (0.0009) [2023-10-14 04:43:44,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 188710912. Throughput: 0: 1768.0, 1: 1781.3. Samples: 47182788. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:44,557][31953] Avg episode reward: [(0, '20.990'), (1, '20.830')] [2023-10-14 04:43:48,035][33201] Updated weights for policy 0, policy_version 91750 (0.0009) [2023-10-14 04:43:48,335][33226] Updated weights for policy 1, policy_version 92580 (0.0010) [2023-10-14 04:43:48,402][33201] Updated weights for policy 0, policy_version 91760 (0.0007) [2023-10-14 04:43:48,699][33226] Updated weights for policy 1, policy_version 92590 (0.0008) [2023-10-14 04:43:48,773][33201] Updated weights for policy 0, policy_version 91770 (0.0007) [2023-10-14 04:43:49,068][33226] Updated weights for policy 1, policy_version 92600 (0.0009) [2023-10-14 04:43:49,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 188809216. Throughput: 0: 1772.3, 1: 1793.6. Samples: 47204388. 
Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:49,557][31953] Avg episode reward: [(0, '20.930'), (1, '20.830')] [2023-10-14 04:43:52,439][33201] Updated weights for policy 0, policy_version 91780 (0.0008) [2023-10-14 04:43:52,799][33201] Updated weights for policy 0, policy_version 91790 (0.0008) [2023-10-14 04:43:52,883][33226] Updated weights for policy 1, policy_version 92610 (0.0009) [2023-10-14 04:43:53,171][33201] Updated weights for policy 0, policy_version 91800 (0.0008) [2023-10-14 04:43:53,258][33226] Updated weights for policy 1, policy_version 92620 (0.0010) [2023-10-14 04:43:53,617][33226] Updated weights for policy 1, policy_version 92630 (0.0009) [2023-10-14 04:43:53,987][33226] Updated weights for policy 1, policy_version 92640 (0.0008) [2023-10-14 04:43:54,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 188874752. Throughput: 0: 1743.6, 1: 1768.7. Samples: 47223778. Policy #0 lag: (min: 31.0, avg: 31.1, max: 39.0) [2023-10-14 04:43:54,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.830')] [2023-10-14 04:43:57,001][33201] Updated weights for policy 0, policy_version 91810 (0.0009) [2023-10-14 04:43:57,370][33201] Updated weights for policy 0, policy_version 91820 (0.0007) [2023-10-14 04:43:57,733][33201] Updated weights for policy 0, policy_version 91830 (0.0007) [2023-10-14 04:43:57,735][33226] Updated weights for policy 1, policy_version 92650 (0.0008) [2023-10-14 04:43:58,104][33226] Updated weights for policy 1, policy_version 92660 (0.0009) [2023-10-14 04:43:58,104][33201] Updated weights for policy 0, policy_version 91840 (0.0008) [2023-10-14 04:43:58,474][33226] Updated weights for policy 1, policy_version 92670 (0.0007) [2023-10-14 04:43:59,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 188940288. Throughput: 0: 1771.7, 1: 1785.2. Samples: 47236046. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:43:59,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.830')] [2023-10-14 04:44:02,004][33201] Updated weights for policy 0, policy_version 91850 (0.0010) [2023-10-14 04:44:02,378][33201] Updated weights for policy 0, policy_version 91860 (0.0009) [2023-10-14 04:44:02,397][33226] Updated weights for policy 1, policy_version 92680 (0.0008) [2023-10-14 04:44:02,751][33201] Updated weights for policy 0, policy_version 91870 (0.0008) [2023-10-14 04:44:02,758][33226] Updated weights for policy 1, policy_version 92690 (0.0008) [2023-10-14 04:44:03,124][33226] Updated weights for policy 1, policy_version 92700 (0.0010) [2023-10-14 04:44:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 189005824. Throughput: 0: 1746.6, 1: 1768.7. Samples: 47255810. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:04,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.810')] [2023-10-14 04:44:06,545][33201] Updated weights for policy 0, policy_version 91880 (0.0008) [2023-10-14 04:44:06,889][33226] Updated weights for policy 1, policy_version 92710 (0.0010) [2023-10-14 04:44:06,916][33201] Updated weights for policy 0, policy_version 91890 (0.0008) [2023-10-14 04:44:07,250][33226] Updated weights for policy 1, policy_version 92720 (0.0007) [2023-10-14 04:44:07,289][33201] Updated weights for policy 0, policy_version 91900 (0.0007) [2023-10-14 04:44:07,614][33226] Updated weights for policy 1, policy_version 92730 (0.0010) [2023-10-14 04:44:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 189071360. Throughput: 0: 1747.8, 1: 1756.1. Samples: 47277434. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:09,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.810')] [2023-10-14 04:44:11,351][33201] Updated weights for policy 0, policy_version 91910 (0.0008) [2023-10-14 04:44:11,664][33226] Updated weights for policy 1, policy_version 92740 (0.0008) [2023-10-14 04:44:11,728][33201] Updated weights for policy 0, policy_version 91920 (0.0007) [2023-10-14 04:44:12,066][33226] Updated weights for policy 1, policy_version 92750 (0.0008) [2023-10-14 04:44:12,091][33201] Updated weights for policy 0, policy_version 91930 (0.0007) [2023-10-14 04:44:12,440][33226] Updated weights for policy 1, policy_version 92760 (0.0008) [2023-10-14 04:44:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 189136896. Throughput: 0: 1752.0, 1: 1772.0. Samples: 47287786. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:14,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.810')] [2023-10-14 04:44:15,929][33201] Updated weights for policy 0, policy_version 91940 (0.0007) [2023-10-14 04:44:16,298][33226] Updated weights for policy 1, policy_version 92770 (0.0007) [2023-10-14 04:44:16,299][33201] Updated weights for policy 0, policy_version 91950 (0.0007) [2023-10-14 04:44:16,661][33226] Updated weights for policy 1, policy_version 92780 (0.0009) [2023-10-14 04:44:16,661][33201] Updated weights for policy 0, policy_version 91960 (0.0010) [2023-10-14 04:44:17,027][33226] Updated weights for policy 1, policy_version 92790 (0.0008) [2023-10-14 04:44:17,388][33226] Updated weights for policy 1, policy_version 92800 (0.0007) [2023-10-14 04:44:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 189202432. Throughput: 0: 1738.7, 1: 1746.4. Samples: 47308176. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:19,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.810')] [2023-10-14 04:44:20,370][33201] Updated weights for policy 0, policy_version 91970 (0.0010) [2023-10-14 04:44:20,730][33201] Updated weights for policy 0, policy_version 91980 (0.0011) [2023-10-14 04:44:21,100][33201] Updated weights for policy 0, policy_version 91990 (0.0008) [2023-10-14 04:44:21,239][33226] Updated weights for policy 1, policy_version 92810 (0.0007) [2023-10-14 04:44:21,469][33201] Updated weights for policy 0, policy_version 92000 (0.0008) [2023-10-14 04:44:21,594][33226] Updated weights for policy 1, policy_version 92820 (0.0008) [2023-10-14 04:44:21,963][33226] Updated weights for policy 1, policy_version 92830 (0.0007) [2023-10-14 04:44:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 189267968. Throughput: 0: 1771.3, 1: 1749.2. Samples: 47330368. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:24,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.810')] [2023-10-14 04:44:25,307][33201] Updated weights for policy 0, policy_version 92010 (0.0008) [2023-10-14 04:44:25,678][33201] Updated weights for policy 0, policy_version 92020 (0.0009) [2023-10-14 04:44:25,818][33226] Updated weights for policy 1, policy_version 92840 (0.0009) [2023-10-14 04:44:26,036][33201] Updated weights for policy 0, policy_version 92030 (0.0010) [2023-10-14 04:44:26,181][33226] Updated weights for policy 1, policy_version 92850 (0.0009) [2023-10-14 04:44:26,545][33226] Updated weights for policy 1, policy_version 92860 (0.0010) [2023-10-14 04:44:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 189333504. Throughput: 0: 1747.7, 1: 1746.7. Samples: 47340038. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:29,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.770')] [2023-10-14 04:44:29,870][33201] Updated weights for policy 0, policy_version 92040 (0.0010) [2023-10-14 04:44:30,242][33201] Updated weights for policy 0, policy_version 92050 (0.0010) [2023-10-14 04:44:30,516][33226] Updated weights for policy 1, policy_version 92870 (0.0008) [2023-10-14 04:44:30,611][33201] Updated weights for policy 0, policy_version 92060 (0.0007) [2023-10-14 04:44:30,878][33226] Updated weights for policy 1, policy_version 92880 (0.0009) [2023-10-14 04:44:31,247][33226] Updated weights for policy 1, policy_version 92890 (0.0009) [2023-10-14 04:44:34,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 189399040. Throughput: 0: 1749.9, 1: 1746.2. Samples: 47361712. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:34,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:44:34,562][33201] Updated weights for policy 0, policy_version 92070 (0.0008) [2023-10-14 04:44:34,931][33201] Updated weights for policy 0, policy_version 92080 (0.0009) [2023-10-14 04:44:35,074][33226] Updated weights for policy 1, policy_version 92900 (0.0009) [2023-10-14 04:44:35,306][33201] Updated weights for policy 0, policy_version 92090 (0.0008) [2023-10-14 04:44:35,444][33226] Updated weights for policy 1, policy_version 92910 (0.0008) [2023-10-14 04:44:35,807][33226] Updated weights for policy 1, policy_version 92920 (0.0009) [2023-10-14 04:44:39,070][33201] Updated weights for policy 0, policy_version 92100 (0.0008) [2023-10-14 04:44:39,447][33201] Updated weights for policy 0, policy_version 92110 (0.0010) [2023-10-14 04:44:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 189464576. Throughput: 0: 1773.7, 1: 1774.4. Samples: 47383444. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:39,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:44:39,672][33226] Updated weights for policy 1, policy_version 92930 (0.0009) [2023-10-14 04:44:39,816][33201] Updated weights for policy 0, policy_version 92120 (0.0008) [2023-10-14 04:44:40,032][33226] Updated weights for policy 1, policy_version 92940 (0.0007) [2023-10-14 04:44:40,392][33226] Updated weights for policy 1, policy_version 92950 (0.0007) [2023-10-14 04:44:40,754][33226] Updated weights for policy 1, policy_version 92960 (0.0008) [2023-10-14 04:44:43,647][33201] Updated weights for policy 0, policy_version 92130 (0.0007) [2023-10-14 04:44:44,012][33201] Updated weights for policy 0, policy_version 92140 (0.0010) [2023-10-14 04:44:44,375][33201] Updated weights for policy 0, policy_version 92150 (0.0007) [2023-10-14 04:44:44,500][33226] Updated weights for policy 1, policy_version 92970 (0.0007) [2023-10-14 04:44:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 189530112. Throughput: 0: 1753.0, 1: 1742.3. Samples: 47393334. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:44,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:44:44,747][33201] Updated weights for policy 0, policy_version 92160 (0.0008) [2023-10-14 04:44:44,857][33226] Updated weights for policy 1, policy_version 92980 (0.0009) [2023-10-14 04:44:45,220][33226] Updated weights for policy 1, policy_version 92990 (0.0008) [2023-10-14 04:44:48,424][33201] Updated weights for policy 0, policy_version 92170 (0.0008) [2023-10-14 04:44:48,793][33201] Updated weights for policy 0, policy_version 92180 (0.0007) [2023-10-14 04:44:49,149][33226] Updated weights for policy 1, policy_version 93000 (0.0008) [2023-10-14 04:44:49,156][33201] Updated weights for policy 0, policy_version 92190 (0.0007) [2023-10-14 04:44:49,520][33226] Updated weights for policy 1, policy_version 93010 (0.0009) [2023-10-14 04:44:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 13653.3, 300 sec: 14218.0). Total num frames: 189628416. Throughput: 0: 1784.3, 1: 1765.9. Samples: 47415568. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:49,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:44:49,894][33226] Updated weights for policy 1, policy_version 93020 (0.0007) [2023-10-14 04:44:53,017][33201] Updated weights for policy 0, policy_version 92200 (0.0007) [2023-10-14 04:44:53,391][33201] Updated weights for policy 0, policy_version 92210 (0.0009) [2023-10-14 04:44:53,585][33226] Updated weights for policy 1, policy_version 93030 (0.0007) [2023-10-14 04:44:53,764][33201] Updated weights for policy 0, policy_version 92220 (0.0009) [2023-10-14 04:44:53,946][33226] Updated weights for policy 1, policy_version 93040 (0.0007) [2023-10-14 04:44:54,313][33226] Updated weights for policy 1, policy_version 93050 (0.0007) [2023-10-14 04:44:54,557][31953] Fps is (10 sec: 19660.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 189726720. Throughput: 0: 1755.8, 1: 1763.4. 
Samples: 47435800. Policy #0 lag: (min: 31.0, avg: 31.0, max: 35.0) [2023-10-14 04:44:54,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:44:57,612][33201] Updated weights for policy 0, policy_version 92230 (0.0008) [2023-10-14 04:44:57,988][33201] Updated weights for policy 0, policy_version 92240 (0.0008) [2023-10-14 04:44:58,136][33226] Updated weights for policy 1, policy_version 93060 (0.0009) [2023-10-14 04:44:58,348][33201] Updated weights for policy 0, policy_version 92250 (0.0007) [2023-10-14 04:44:58,532][33226] Updated weights for policy 1, policy_version 93070 (0.0009) [2023-10-14 04:44:58,903][33226] Updated weights for policy 1, policy_version 93080 (0.0007) [2023-10-14 04:44:59,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 189792256. Throughput: 0: 1784.3, 1: 1769.4. Samples: 47447702. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:44:59,557][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:45:02,023][33201] Updated weights for policy 0, policy_version 92260 (0.0007) [2023-10-14 04:45:02,392][33201] Updated weights for policy 0, policy_version 92270 (0.0007) [2023-10-14 04:45:02,737][33226] Updated weights for policy 1, policy_version 93090 (0.0009) [2023-10-14 04:45:02,766][33201] Updated weights for policy 0, policy_version 92280 (0.0007) [2023-10-14 04:45:03,113][33226] Updated weights for policy 1, policy_version 93100 (0.0008) [2023-10-14 04:45:03,483][33226] Updated weights for policy 1, policy_version 93110 (0.0009) [2023-10-14 04:45:03,848][33226] Updated weights for policy 1, policy_version 93120 (0.0009) [2023-10-14 04:45:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 189857792. Throughput: 0: 1766.8, 1: 1782.0. Samples: 47467874. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:04,558][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:45:06,603][33201] Updated weights for policy 0, policy_version 92290 (0.0008) [2023-10-14 04:45:06,969][33201] Updated weights for policy 0, policy_version 92300 (0.0010) [2023-10-14 04:45:07,352][33201] Updated weights for policy 0, policy_version 92310 (0.0008) [2023-10-14 04:45:07,618][33226] Updated weights for policy 1, policy_version 93130 (0.0008) [2023-10-14 04:45:07,717][33201] Updated weights for policy 0, policy_version 92320 (0.0008) [2023-10-14 04:45:07,989][33226] Updated weights for policy 1, policy_version 93140 (0.0007) [2023-10-14 04:45:08,354][33226] Updated weights for policy 1, policy_version 93150 (0.0007) [2023-10-14 04:45:09,558][31953] Fps is (10 sec: 13106.5, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 189923328. Throughput: 0: 1764.8, 1: 1760.7. Samples: 47489018. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:09,559][31953] Avg episode reward: [(0, '20.830'), (1, '20.940')] [2023-10-14 04:45:11,556][33201] Updated weights for policy 0, policy_version 92330 (0.0010) [2023-10-14 04:45:11,925][33201] Updated weights for policy 0, policy_version 92340 (0.0010) [2023-10-14 04:45:12,151][33226] Updated weights for policy 1, policy_version 93160 (0.0008) [2023-10-14 04:45:12,297][33201] Updated weights for policy 0, policy_version 92350 (0.0009) [2023-10-14 04:45:12,510][33226] Updated weights for policy 1, policy_version 93170 (0.0010) [2023-10-14 04:45:12,879][33226] Updated weights for policy 1, policy_version 93180 (0.0007) [2023-10-14 04:45:14,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 189988864. Throughput: 0: 1771.2, 1: 1786.6. Samples: 47500136. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:14,558][31953] Avg episode reward: [(0, '20.740'), (1, '20.940')] [2023-10-14 04:45:16,163][33201] Updated weights for policy 0, policy_version 92360 (0.0009) [2023-10-14 04:45:16,537][33201] Updated weights for policy 0, policy_version 92370 (0.0009) [2023-10-14 04:45:16,793][33226] Updated weights for policy 1, policy_version 93190 (0.0007) [2023-10-14 04:45:16,903][33201] Updated weights for policy 0, policy_version 92380 (0.0009) [2023-10-14 04:45:17,162][33226] Updated weights for policy 1, policy_version 93200 (0.0008) [2023-10-14 04:45:17,527][33226] Updated weights for policy 1, policy_version 93210 (0.0008) [2023-10-14 04:45:19,557][31953] Fps is (10 sec: 13107.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 190054400. Throughput: 0: 1767.2, 1: 1763.3. Samples: 47520580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:19,559][31953] Avg episode reward: [(0, '20.800'), (1, '20.940')] [2023-10-14 04:45:20,689][33201] Updated weights for policy 0, policy_version 92390 (0.0009) [2023-10-14 04:45:21,047][33201] Updated weights for policy 0, policy_version 92400 (0.0009) [2023-10-14 04:45:21,235][33226] Updated weights for policy 1, policy_version 93220 (0.0010) [2023-10-14 04:45:21,424][33201] Updated weights for policy 0, policy_version 92410 (0.0008) [2023-10-14 04:45:21,596][33226] Updated weights for policy 1, policy_version 93230 (0.0008) [2023-10-14 04:45:21,967][33226] Updated weights for policy 1, policy_version 93240 (0.0009) [2023-10-14 04:45:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 190119936. Throughput: 0: 1771.4, 1: 1772.6. Samples: 47542922. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:24,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.940')] [2023-10-14 04:45:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000092416_94633984.pth... 
[2023-10-14 04:45:24,567][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000093248_95485952.pth... [2023-10-14 04:45:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000090784_92962816.pth [2023-10-14 04:45:24,606][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000091584_93782016.pth [2023-10-14 04:45:25,163][33201] Updated weights for policy 0, policy_version 92420 (0.0009) [2023-10-14 04:45:25,534][33201] Updated weights for policy 0, policy_version 92430 (0.0007) [2023-10-14 04:45:25,771][33226] Updated weights for policy 1, policy_version 93250 (0.0008) [2023-10-14 04:45:25,905][33201] Updated weights for policy 0, policy_version 92440 (0.0008) [2023-10-14 04:45:26,137][33226] Updated weights for policy 1, policy_version 93260 (0.0009) [2023-10-14 04:45:26,499][33226] Updated weights for policy 1, policy_version 93270 (0.0008) [2023-10-14 04:45:26,865][33226] Updated weights for policy 1, policy_version 93280 (0.0007) [2023-10-14 04:45:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 190185472. Throughput: 0: 1764.3, 1: 1770.5. Samples: 47552400. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:29,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.940')] [2023-10-14 04:45:29,794][33201] Updated weights for policy 0, policy_version 92450 (0.0009) [2023-10-14 04:45:30,168][33201] Updated weights for policy 0, policy_version 92460 (0.0008) [2023-10-14 04:45:30,533][33201] Updated weights for policy 0, policy_version 92470 (0.0010) [2023-10-14 04:45:30,802][33226] Updated weights for policy 1, policy_version 93290 (0.0007) [2023-10-14 04:45:30,898][33201] Updated weights for policy 0, policy_version 92480 (0.0007) [2023-10-14 04:45:31,166][33226] Updated weights for policy 1, policy_version 93300 (0.0008) [2023-10-14 04:45:31,537][33226] Updated weights for policy 1, policy_version 93310 (0.0010) [2023-10-14 04:45:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 190251008. Throughput: 0: 1760.0, 1: 1767.8. Samples: 47574318. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:34,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.890')] [2023-10-14 04:45:34,724][33201] Updated weights for policy 0, policy_version 92490 (0.0008) [2023-10-14 04:45:35,102][33201] Updated weights for policy 0, policy_version 92500 (0.0009) [2023-10-14 04:45:35,368][33226] Updated weights for policy 1, policy_version 93320 (0.0007) [2023-10-14 04:45:35,469][33201] Updated weights for policy 0, policy_version 92510 (0.0009) [2023-10-14 04:45:35,729][33226] Updated weights for policy 1, policy_version 93330 (0.0009) [2023-10-14 04:45:36,094][33226] Updated weights for policy 1, policy_version 93340 (0.0011) [2023-10-14 04:45:39,420][33201] Updated weights for policy 0, policy_version 92520 (0.0009) [2023-10-14 04:45:39,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 190316544. Throughput: 0: 1787.2, 1: 1781.6. Samples: 47596396. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:39,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.890')] [2023-10-14 04:45:39,775][33226] Updated weights for policy 1, policy_version 93350 (0.0008) [2023-10-14 04:45:39,793][33201] Updated weights for policy 0, policy_version 92530 (0.0009) [2023-10-14 04:45:40,137][33226] Updated weights for policy 1, policy_version 93360 (0.0007) [2023-10-14 04:45:40,155][33201] Updated weights for policy 0, policy_version 92540 (0.0009) [2023-10-14 04:45:40,500][33226] Updated weights for policy 1, policy_version 93370 (0.0009) [2023-10-14 04:45:43,973][33201] Updated weights for policy 0, policy_version 92550 (0.0009) [2023-10-14 04:45:44,339][33201] Updated weights for policy 0, policy_version 92560 (0.0007) [2023-10-14 04:45:44,351][33226] Updated weights for policy 1, policy_version 93380 (0.0008) [2023-10-14 04:45:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 190382080. Throughput: 0: 1756.0, 1: 1763.6. Samples: 47606082. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:44,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.890')] [2023-10-14 04:45:44,707][33201] Updated weights for policy 0, policy_version 92570 (0.0009) [2023-10-14 04:45:44,749][33226] Updated weights for policy 1, policy_version 93390 (0.0008) [2023-10-14 04:45:45,111][33226] Updated weights for policy 1, policy_version 93400 (0.0008) [2023-10-14 04:45:48,444][33201] Updated weights for policy 0, policy_version 92580 (0.0008) [2023-10-14 04:45:48,815][33201] Updated weights for policy 0, policy_version 92590 (0.0007) [2023-10-14 04:45:48,850][33226] Updated weights for policy 1, policy_version 93410 (0.0008) [2023-10-14 04:45:49,191][33201] Updated weights for policy 0, policy_version 92600 (0.0008) [2023-10-14 04:45:49,229][33226] Updated weights for policy 1, policy_version 93420 (0.0008) [2023-10-14 04:45:49,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 190480384. Throughput: 0: 1787.2, 1: 1779.7. Samples: 47628386. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:49,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 04:45:49,587][33226] Updated weights for policy 1, policy_version 93430 (0.0009) [2023-10-14 04:45:49,953][33226] Updated weights for policy 1, policy_version 93440 (0.0012) [2023-10-14 04:45:53,100][33201] Updated weights for policy 0, policy_version 92610 (0.0007) [2023-10-14 04:45:53,471][33201] Updated weights for policy 0, policy_version 92620 (0.0008) [2023-10-14 04:45:53,672][33226] Updated weights for policy 1, policy_version 93450 (0.0009) [2023-10-14 04:45:53,844][33201] Updated weights for policy 0, policy_version 92630 (0.0008) [2023-10-14 04:45:54,046][33226] Updated weights for policy 1, policy_version 93460 (0.0009) [2023-10-14 04:45:54,212][33201] Updated weights for policy 0, policy_version 92640 (0.0007) [2023-10-14 04:45:54,409][33226] Updated weights for policy 1, policy_version 93470 (0.0008) [2023-10-14 04:45:54,557][31953] Fps is (10 sec: 19660.7, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 190578688. Throughput: 0: 1760.1, 1: 1791.2. Samples: 47648824. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-10-14 04:45:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 04:45:58,044][33201] Updated weights for policy 0, policy_version 92650 (0.0008) [2023-10-14 04:45:58,257][33226] Updated weights for policy 1, policy_version 93480 (0.0009) [2023-10-14 04:45:58,411][33201] Updated weights for policy 0, policy_version 92660 (0.0008) [2023-10-14 04:45:58,627][33226] Updated weights for policy 1, policy_version 93490 (0.0009) [2023-10-14 04:45:58,785][33201] Updated weights for policy 0, policy_version 92670 (0.0008) [2023-10-14 04:45:58,995][33226] Updated weights for policy 1, policy_version 93500 (0.0009) [2023-10-14 04:45:59,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 190644224. Throughput: 0: 1781.1, 1: 1783.0. 
Samples: 47660518. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:45:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 04:46:02,665][33201] Updated weights for policy 0, policy_version 92680 (0.0008) [2023-10-14 04:46:02,717][33226] Updated weights for policy 1, policy_version 93510 (0.0007) [2023-10-14 04:46:03,035][33201] Updated weights for policy 0, policy_version 92690 (0.0009) [2023-10-14 04:46:03,095][33226] Updated weights for policy 1, policy_version 93520 (0.0007) [2023-10-14 04:46:03,392][33201] Updated weights for policy 0, policy_version 92700 (0.0008) [2023-10-14 04:46:03,452][33226] Updated weights for policy 1, policy_version 93530 (0.0007) [2023-10-14 04:46:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 190709760. Throughput: 0: 1763.4, 1: 1798.6. Samples: 47680870. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:04,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.910')] [2023-10-14 04:46:07,162][33226] Updated weights for policy 1, policy_version 93540 (0.0009) [2023-10-14 04:46:07,298][33201] Updated weights for policy 0, policy_version 92710 (0.0007) [2023-10-14 04:46:07,528][33226] Updated weights for policy 1, policy_version 93550 (0.0010) [2023-10-14 04:46:07,664][33201] Updated weights for policy 0, policy_version 92720 (0.0008) [2023-10-14 04:46:07,907][33226] Updated weights for policy 1, policy_version 93560 (0.0010) [2023-10-14 04:46:08,031][33201] Updated weights for policy 0, policy_version 92730 (0.0007) [2023-10-14 04:46:09,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 190775296. Throughput: 0: 1744.4, 1: 1781.7. Samples: 47701598. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:09,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:11,581][33226] Updated weights for policy 1, policy_version 93570 (0.0008) [2023-10-14 04:46:11,743][33201] Updated weights for policy 0, policy_version 92740 (0.0008) [2023-10-14 04:46:11,945][33226] Updated weights for policy 1, policy_version 93580 (0.0007) [2023-10-14 04:46:12,105][33201] Updated weights for policy 0, policy_version 92750 (0.0008) [2023-10-14 04:46:12,317][33226] Updated weights for policy 1, policy_version 93590 (0.0008) [2023-10-14 04:46:12,474][33201] Updated weights for policy 0, policy_version 92760 (0.0009) [2023-10-14 04:46:12,691][33226] Updated weights for policy 1, policy_version 93600 (0.0007) [2023-10-14 04:46:14,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 190840832. Throughput: 0: 1766.8, 1: 1807.2. Samples: 47713228. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:14,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:16,192][33201] Updated weights for policy 0, policy_version 92770 (0.0007) [2023-10-14 04:46:16,533][33226] Updated weights for policy 1, policy_version 93610 (0.0009) [2023-10-14 04:46:16,574][33201] Updated weights for policy 0, policy_version 92780 (0.0009) [2023-10-14 04:46:16,893][33226] Updated weights for policy 1, policy_version 93620 (0.0008) [2023-10-14 04:46:16,945][33201] Updated weights for policy 0, policy_version 92790 (0.0009) [2023-10-14 04:46:17,272][33226] Updated weights for policy 1, policy_version 93630 (0.0008) [2023-10-14 04:46:17,312][33201] Updated weights for policy 0, policy_version 92800 (0.0009) [2023-10-14 04:46:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 190906368. Throughput: 0: 1751.2, 1: 1790.7. Samples: 47733704. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:19,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:20,901][33226] Updated weights for policy 1, policy_version 93640 (0.0007) [2023-10-14 04:46:21,190][33201] Updated weights for policy 0, policy_version 92810 (0.0008) [2023-10-14 04:46:21,263][33226] Updated weights for policy 1, policy_version 93650 (0.0007) [2023-10-14 04:46:21,558][33201] Updated weights for policy 0, policy_version 92820 (0.0008) [2023-10-14 04:46:21,625][33226] Updated weights for policy 1, policy_version 93660 (0.0008) [2023-10-14 04:46:21,917][33201] Updated weights for policy 0, policy_version 92830 (0.0009) [2023-10-14 04:46:24,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 190971904. Throughput: 0: 1757.9, 1: 1789.7. Samples: 47756038. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:25,378][33226] Updated weights for policy 1, policy_version 93670 (0.0008) [2023-10-14 04:46:25,684][33201] Updated weights for policy 0, policy_version 92840 (0.0007) [2023-10-14 04:46:25,743][33226] Updated weights for policy 1, policy_version 93680 (0.0007) [2023-10-14 04:46:26,050][33201] Updated weights for policy 0, policy_version 92850 (0.0008) [2023-10-14 04:46:26,114][33226] Updated weights for policy 1, policy_version 93690 (0.0007) [2023-10-14 04:46:26,421][33201] Updated weights for policy 0, policy_version 92860 (0.0007) [2023-10-14 04:46:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 191037440. Throughput: 0: 1754.2, 1: 1791.5. Samples: 47765640. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:29,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:29,966][33226] Updated weights for policy 1, policy_version 93700 (0.0008) [2023-10-14 04:46:30,327][33226] Updated weights for policy 1, policy_version 93710 (0.0008) [2023-10-14 04:46:30,497][33201] Updated weights for policy 0, policy_version 92870 (0.0008) [2023-10-14 04:46:30,698][33226] Updated weights for policy 1, policy_version 93720 (0.0010) [2023-10-14 04:46:30,866][33201] Updated weights for policy 0, policy_version 92880 (0.0007) [2023-10-14 04:46:31,240][33201] Updated weights for policy 0, policy_version 92890 (0.0010) [2023-10-14 04:46:34,546][33226] Updated weights for policy 1, policy_version 93730 (0.0009) [2023-10-14 04:46:34,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 191102976. Throughput: 0: 1746.6, 1: 1788.5. Samples: 47787464. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:34,913][33226] Updated weights for policy 1, policy_version 93740 (0.0007) [2023-10-14 04:46:35,018][33201] Updated weights for policy 0, policy_version 92900 (0.0007) [2023-10-14 04:46:35,270][33226] Updated weights for policy 1, policy_version 93750 (0.0009) [2023-10-14 04:46:35,401][33201] Updated weights for policy 0, policy_version 92910 (0.0009) [2023-10-14 04:46:35,644][33226] Updated weights for policy 1, policy_version 93760 (0.0007) [2023-10-14 04:46:35,764][33201] Updated weights for policy 0, policy_version 92920 (0.0007) [2023-10-14 04:46:39,418][33226] Updated weights for policy 1, policy_version 93770 (0.0010) [2023-10-14 04:46:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 191168512. Throughput: 0: 1767.2, 1: 1799.0. Samples: 47809304. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:39,557][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:39,655][33201] Updated weights for policy 0, policy_version 92930 (0.0008) [2023-10-14 04:46:39,784][33226] Updated weights for policy 1, policy_version 93780 (0.0009) [2023-10-14 04:46:40,020][33201] Updated weights for policy 0, policy_version 92940 (0.0008) [2023-10-14 04:46:40,143][33226] Updated weights for policy 1, policy_version 93790 (0.0007) [2023-10-14 04:46:40,379][33201] Updated weights for policy 0, policy_version 92950 (0.0009) [2023-10-14 04:46:40,755][33201] Updated weights for policy 0, policy_version 92960 (0.0008) [2023-10-14 04:46:43,858][33226] Updated weights for policy 1, policy_version 93800 (0.0010) [2023-10-14 04:46:44,222][33226] Updated weights for policy 1, policy_version 93810 (0.0008) [2023-10-14 04:46:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 191234048. Throughput: 0: 1740.9, 1: 1781.6. Samples: 47819030. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:44,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.910')] [2023-10-14 04:46:44,580][33201] Updated weights for policy 0, policy_version 92970 (0.0008) [2023-10-14 04:46:44,584][33226] Updated weights for policy 1, policy_version 93820 (0.0008) [2023-10-14 04:46:44,947][33201] Updated weights for policy 0, policy_version 92980 (0.0007) [2023-10-14 04:46:45,317][33201] Updated weights for policy 0, policy_version 92990 (0.0008) [2023-10-14 04:46:48,418][33226] Updated weights for policy 1, policy_version 93830 (0.0008) [2023-10-14 04:46:48,783][33226] Updated weights for policy 1, policy_version 93840 (0.0008) [2023-10-14 04:46:49,107][33201] Updated weights for policy 0, policy_version 93000 (0.0009) [2023-10-14 04:46:49,136][33226] Updated weights for policy 1, policy_version 93850 (0.0008) [2023-10-14 04:46:49,468][33201] Updated weights for policy 0, policy_version 93010 (0.0008) [2023-10-14 04:46:49,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191332352. Throughput: 0: 1768.3, 1: 1793.6. Samples: 47841156. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:49,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.910')] [2023-10-14 04:46:49,845][33201] Updated weights for policy 0, policy_version 93020 (0.0010) [2023-10-14 04:46:52,971][33226] Updated weights for policy 1, policy_version 93860 (0.0008) [2023-10-14 04:46:53,334][33226] Updated weights for policy 1, policy_version 93870 (0.0008) [2023-10-14 04:46:53,700][33226] Updated weights for policy 1, policy_version 93880 (0.0007) [2023-10-14 04:46:53,741][33201] Updated weights for policy 0, policy_version 93030 (0.0009) [2023-10-14 04:46:54,108][33201] Updated weights for policy 0, policy_version 93040 (0.0008) [2023-10-14 04:46:54,486][33201] Updated weights for policy 0, policy_version 93050 (0.0008) [2023-10-14 04:46:54,557][31953] Fps is (10 sec: 16384.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 191397888. Throughput: 0: 1776.4, 1: 1774.6. Samples: 47861394. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-10-14 04:46:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:46:57,434][33226] Updated weights for policy 1, policy_version 93890 (0.0008) [2023-10-14 04:46:57,816][33226] Updated weights for policy 1, policy_version 93900 (0.0011) [2023-10-14 04:46:58,166][33201] Updated weights for policy 0, policy_version 93060 (0.0007) [2023-10-14 04:46:58,185][33226] Updated weights for policy 1, policy_version 93910 (0.0010) [2023-10-14 04:46:58,537][33201] Updated weights for policy 0, policy_version 93070 (0.0008) [2023-10-14 04:46:58,540][33226] Updated weights for policy 1, policy_version 93920 (0.0008) [2023-10-14 04:46:58,896][33201] Updated weights for policy 0, policy_version 93080 (0.0011) [2023-10-14 04:46:59,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191496192. Throughput: 0: 1773.4, 1: 1780.8. Samples: 47873166. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:46:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:47:02,519][33226] Updated weights for policy 1, policy_version 93930 (0.0010) [2023-10-14 04:47:02,701][33201] Updated weights for policy 0, policy_version 93090 (0.0009) [2023-10-14 04:47:02,883][33226] Updated weights for policy 1, policy_version 93940 (0.0008) [2023-10-14 04:47:03,077][33201] Updated weights for policy 0, policy_version 93100 (0.0008) [2023-10-14 04:47:03,246][33226] Updated weights for policy 1, policy_version 93950 (0.0008) [2023-10-14 04:47:03,448][33201] Updated weights for policy 0, policy_version 93110 (0.0008) [2023-10-14 04:47:03,812][33201] Updated weights for policy 0, policy_version 93120 (0.0009) [2023-10-14 04:47:04,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 191561728. Throughput: 0: 1778.6, 1: 1778.5. Samples: 47893774. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:47:06,926][33226] Updated weights for policy 1, policy_version 93960 (0.0007) [2023-10-14 04:47:07,293][33226] Updated weights for policy 1, policy_version 93970 (0.0007) [2023-10-14 04:47:07,593][33201] Updated weights for policy 0, policy_version 93130 (0.0008) [2023-10-14 04:47:07,657][33226] Updated weights for policy 1, policy_version 93980 (0.0007) [2023-10-14 04:47:07,969][33201] Updated weights for policy 0, policy_version 93140 (0.0008) [2023-10-14 04:47:08,343][33201] Updated weights for policy 0, policy_version 93150 (0.0007) [2023-10-14 04:47:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191627264. Throughput: 0: 1755.2, 1: 1772.8. Samples: 47914794. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:47:11,470][33226] Updated weights for policy 1, policy_version 93990 (0.0007) [2023-10-14 04:47:11,835][33226] Updated weights for policy 1, policy_version 94000 (0.0007) [2023-10-14 04:47:12,138][33201] Updated weights for policy 0, policy_version 93160 (0.0009) [2023-10-14 04:47:12,198][33226] Updated weights for policy 1, policy_version 94010 (0.0007) [2023-10-14 04:47:12,510][33201] Updated weights for policy 0, policy_version 93170 (0.0008) [2023-10-14 04:47:12,875][33201] Updated weights for policy 0, policy_version 93180 (0.0011) [2023-10-14 04:47:14,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191692800. Throughput: 0: 1783.1, 1: 1786.7. Samples: 47926280. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:47:16,200][33226] Updated weights for policy 1, policy_version 94020 (0.0008) [2023-10-14 04:47:16,570][33201] Updated weights for policy 0, policy_version 93190 (0.0009) [2023-10-14 04:47:16,596][33226] Updated weights for policy 1, policy_version 94030 (0.0009) [2023-10-14 04:47:16,936][33201] Updated weights for policy 0, policy_version 93200 (0.0008) [2023-10-14 04:47:16,967][33226] Updated weights for policy 1, policy_version 94040 (0.0007) [2023-10-14 04:47:17,309][33201] Updated weights for policy 0, policy_version 93210 (0.0007) [2023-10-14 04:47:19,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191758336. Throughput: 0: 1764.3, 1: 1767.7. Samples: 47946404. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:19,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:47:20,797][33226] Updated weights for policy 1, policy_version 94050 (0.0008) [2023-10-14 04:47:20,930][33201] Updated weights for policy 0, policy_version 93220 (0.0009) [2023-10-14 04:47:21,153][33226] Updated weights for policy 1, policy_version 94060 (0.0008) [2023-10-14 04:47:21,296][33201] Updated weights for policy 0, policy_version 93230 (0.0010) [2023-10-14 04:47:21,521][33226] Updated weights for policy 1, policy_version 94070 (0.0008) [2023-10-14 04:47:21,675][33201] Updated weights for policy 0, policy_version 93240 (0.0008) [2023-10-14 04:47:21,884][33226] Updated weights for policy 1, policy_version 94080 (0.0009) [2023-10-14 04:47:24,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 191823872. Throughput: 0: 1778.3, 1: 1763.0. Samples: 47968660. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:24,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.900')] [2023-10-14 04:47:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000093248_95485952.pth... [2023-10-14 04:47:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000094080_96337920.pth... 
[2023-10-14 04:47:24,607][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000092416_94633984.pth [2023-10-14 04:47:24,607][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000091616_93814784.pth [2023-10-14 04:47:25,538][33201] Updated weights for policy 0, policy_version 93250 (0.0009) [2023-10-14 04:47:25,794][33226] Updated weights for policy 1, policy_version 94090 (0.0009) [2023-10-14 04:47:25,911][33201] Updated weights for policy 0, policy_version 93260 (0.0007) [2023-10-14 04:47:26,157][33226] Updated weights for policy 1, policy_version 94100 (0.0008) [2023-10-14 04:47:26,283][33201] Updated weights for policy 0, policy_version 93270 (0.0009) [2023-10-14 04:47:26,523][33226] Updated weights for policy 1, policy_version 94110 (0.0009) [2023-10-14 04:47:26,652][33201] Updated weights for policy 0, policy_version 93280 (0.0010) [2023-10-14 04:47:29,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 191889408. Throughput: 0: 1775.2, 1: 1764.0. Samples: 47978294. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:29,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:47:30,322][33226] Updated weights for policy 1, policy_version 94120 (0.0009) [2023-10-14 04:47:30,451][33201] Updated weights for policy 0, policy_version 93290 (0.0007) [2023-10-14 04:47:30,691][33226] Updated weights for policy 1, policy_version 94130 (0.0008) [2023-10-14 04:47:30,822][33201] Updated weights for policy 0, policy_version 93300 (0.0009) [2023-10-14 04:47:31,060][33226] Updated weights for policy 1, policy_version 94140 (0.0008) [2023-10-14 04:47:31,196][33201] Updated weights for policy 0, policy_version 93310 (0.0008) [2023-10-14 04:47:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 191954944. Throughput: 0: 1768.0, 1: 1765.0. Samples: 48000138. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:34,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.880')] [2023-10-14 04:47:34,735][33226] Updated weights for policy 1, policy_version 94150 (0.0007) [2023-10-14 04:47:35,099][33226] Updated weights for policy 1, policy_version 94160 (0.0007) [2023-10-14 04:47:35,227][33201] Updated weights for policy 0, policy_version 93320 (0.0009) [2023-10-14 04:47:35,465][33226] Updated weights for policy 1, policy_version 94170 (0.0009) [2023-10-14 04:47:35,591][33201] Updated weights for policy 0, policy_version 93330 (0.0008) [2023-10-14 04:47:35,951][33201] Updated weights for policy 0, policy_version 93340 (0.0009) [2023-10-14 04:47:39,026][33226] Updated weights for policy 1, policy_version 94180 (0.0008) [2023-10-14 04:47:39,393][33226] Updated weights for policy 1, policy_version 94190 (0.0008) [2023-10-14 04:47:39,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 192020480. Throughput: 0: 1775.4, 1: 1795.1. Samples: 48022068. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:39,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.860')] [2023-10-14 04:47:39,754][33201] Updated weights for policy 0, policy_version 93350 (0.0008) [2023-10-14 04:47:39,763][33226] Updated weights for policy 1, policy_version 94200 (0.0009) [2023-10-14 04:47:40,123][33201] Updated weights for policy 0, policy_version 93360 (0.0008) [2023-10-14 04:47:40,489][33201] Updated weights for policy 0, policy_version 93370 (0.0007) [2023-10-14 04:47:43,749][33226] Updated weights for policy 1, policy_version 94210 (0.0008) [2023-10-14 04:47:44,111][33226] Updated weights for policy 1, policy_version 94220 (0.0009) [2023-10-14 04:47:44,309][33201] Updated weights for policy 0, policy_version 93380 (0.0007) [2023-10-14 04:47:44,489][33226] Updated weights for policy 1, policy_version 94230 (0.0007) [2023-10-14 04:47:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 192086016. Throughput: 0: 1759.9, 1: 1763.3. Samples: 48031710. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:44,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.810')] [2023-10-14 04:47:44,680][33201] Updated weights for policy 0, policy_version 93390 (0.0008) [2023-10-14 04:47:44,856][33226] Updated weights for policy 1, policy_version 94240 (0.0008) [2023-10-14 04:47:45,041][33201] Updated weights for policy 0, policy_version 93400 (0.0007) [2023-10-14 04:47:48,680][33226] Updated weights for policy 1, policy_version 94250 (0.0007) [2023-10-14 04:47:48,823][33201] Updated weights for policy 0, policy_version 93410 (0.0011) [2023-10-14 04:47:49,041][33226] Updated weights for policy 1, policy_version 94260 (0.0010) [2023-10-14 04:47:49,191][33201] Updated weights for policy 0, policy_version 93420 (0.0007) [2023-10-14 04:47:49,404][33226] Updated weights for policy 1, policy_version 94270 (0.0010) [2023-10-14 04:47:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 192184320. Throughput: 0: 1772.0, 1: 1784.4. Samples: 48053812. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:49,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.810')] [2023-10-14 04:47:49,562][33201] Updated weights for policy 0, policy_version 93430 (0.0009) [2023-10-14 04:47:49,927][33201] Updated weights for policy 0, policy_version 93440 (0.0009) [2023-10-14 04:47:53,232][33226] Updated weights for policy 1, policy_version 94280 (0.0009) [2023-10-14 04:47:53,595][33226] Updated weights for policy 1, policy_version 94290 (0.0007) [2023-10-14 04:47:53,743][33201] Updated weights for policy 0, policy_version 93450 (0.0007) [2023-10-14 04:47:53,961][33226] Updated weights for policy 1, policy_version 94300 (0.0008) [2023-10-14 04:47:54,105][33201] Updated weights for policy 0, policy_version 93460 (0.0009) [2023-10-14 04:47:54,479][33201] Updated weights for policy 0, policy_version 93470 (0.0010) [2023-10-14 04:47:54,558][31953] Fps is (10 sec: 19660.4, 60 sec: 14745.5, 300 sec: 14218.0). Total num frames: 192282624. Throughput: 0: 1775.4, 1: 1758.6. Samples: 48073824. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-10-14 04:47:54,559][31953] Avg episode reward: [(0, '20.950'), (1, '20.810')] [2023-10-14 04:47:57,729][33226] Updated weights for policy 1, policy_version 94310 (0.0009) [2023-10-14 04:47:58,086][33226] Updated weights for policy 1, policy_version 94320 (0.0007) [2023-10-14 04:47:58,308][33201] Updated weights for policy 0, policy_version 93480 (0.0008) [2023-10-14 04:47:58,453][33226] Updated weights for policy 1, policy_version 94330 (0.0008) [2023-10-14 04:47:58,680][33201] Updated weights for policy 0, policy_version 93490 (0.0007) [2023-10-14 04:47:59,055][33201] Updated weights for policy 0, policy_version 93500 (0.0007) [2023-10-14 04:47:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 192348160. Throughput: 0: 1770.7, 1: 1773.0. Samples: 48085746. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:47:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.810')] [2023-10-14 04:48:02,452][33226] Updated weights for policy 1, policy_version 94340 (0.0008) [2023-10-14 04:48:02,855][33226] Updated weights for policy 1, policy_version 94350 (0.0007) [2023-10-14 04:48:02,902][33201] Updated weights for policy 0, policy_version 93510 (0.0009) [2023-10-14 04:48:03,224][33226] Updated weights for policy 1, policy_version 94360 (0.0008) [2023-10-14 04:48:03,279][33201] Updated weights for policy 0, policy_version 93520 (0.0008) [2023-10-14 04:48:03,640][33201] Updated weights for policy 0, policy_version 93530 (0.0008) [2023-10-14 04:48:04,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 192413696. Throughput: 0: 1777.9, 1: 1770.3. Samples: 48106074. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.790')] [2023-10-14 04:48:07,017][33226] Updated weights for policy 1, policy_version 94370 (0.0007) [2023-10-14 04:48:07,386][33226] Updated weights for policy 1, policy_version 94380 (0.0008) [2023-10-14 04:48:07,556][33201] Updated weights for policy 0, policy_version 93540 (0.0007) [2023-10-14 04:48:07,756][33226] Updated weights for policy 1, policy_version 94390 (0.0009) [2023-10-14 04:48:07,920][33201] Updated weights for policy 0, policy_version 93550 (0.0007) [2023-10-14 04:48:08,114][33226] Updated weights for policy 1, policy_version 94400 (0.0007) [2023-10-14 04:48:08,296][33201] Updated weights for policy 0, policy_version 93560 (0.0008) [2023-10-14 04:48:09,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 192479232. Throughput: 0: 1745.4, 1: 1761.1. Samples: 48126452. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:09,557][31953] Avg episode reward: [(0, '20.950'), (1, '20.830')] [2023-10-14 04:48:11,969][33226] Updated weights for policy 1, policy_version 94410 (0.0008) [2023-10-14 04:48:12,146][33201] Updated weights for policy 0, policy_version 93570 (0.0007) [2023-10-14 04:48:12,336][33226] Updated weights for policy 1, policy_version 94420 (0.0008) [2023-10-14 04:48:12,515][33201] Updated weights for policy 0, policy_version 93580 (0.0007) [2023-10-14 04:48:12,694][33226] Updated weights for policy 1, policy_version 94430 (0.0009) [2023-10-14 04:48:12,882][33201] Updated weights for policy 0, policy_version 93590 (0.0010) [2023-10-14 04:48:13,255][33201] Updated weights for policy 0, policy_version 93600 (0.0007) [2023-10-14 04:48:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 192544768. Throughput: 0: 1778.7, 1: 1775.9. Samples: 48138250. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.830')] [2023-10-14 04:48:16,574][33226] Updated weights for policy 1, policy_version 94440 (0.0007) [2023-10-14 04:48:16,936][33226] Updated weights for policy 1, policy_version 94450 (0.0007) [2023-10-14 04:48:17,098][33201] Updated weights for policy 0, policy_version 93610 (0.0007) [2023-10-14 04:48:17,305][33226] Updated weights for policy 1, policy_version 94460 (0.0007) [2023-10-14 04:48:17,462][33201] Updated weights for policy 0, policy_version 93620 (0.0008) [2023-10-14 04:48:17,833][33201] Updated weights for policy 0, policy_version 93630 (0.0007) [2023-10-14 04:48:19,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 192610304. Throughput: 0: 1750.3, 1: 1752.9. Samples: 48157780. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.830')] [2023-10-14 04:48:21,107][33226] Updated weights for policy 1, policy_version 94470 (0.0008) [2023-10-14 04:48:21,465][33226] Updated weights for policy 1, policy_version 94480 (0.0008) [2023-10-14 04:48:21,824][33201] Updated weights for policy 0, policy_version 93640 (0.0008) [2023-10-14 04:48:21,828][33226] Updated weights for policy 1, policy_version 94490 (0.0008) [2023-10-14 04:48:22,198][33201] Updated weights for policy 0, policy_version 93650 (0.0008) [2023-10-14 04:48:22,573][33201] Updated weights for policy 0, policy_version 93660 (0.0008) [2023-10-14 04:48:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 192675840. Throughput: 0: 1747.6, 1: 1762.9. Samples: 48180044. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:24,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.830')] [2023-10-14 04:48:25,522][33226] Updated weights for policy 1, policy_version 94500 (0.0007) [2023-10-14 04:48:25,889][33226] Updated weights for policy 1, policy_version 94510 (0.0007) [2023-10-14 04:48:26,258][33226] Updated weights for policy 1, policy_version 94520 (0.0009) [2023-10-14 04:48:26,499][33201] Updated weights for policy 0, policy_version 93670 (0.0008) [2023-10-14 04:48:26,862][33201] Updated weights for policy 0, policy_version 93680 (0.0007) [2023-10-14 04:48:27,235][33201] Updated weights for policy 0, policy_version 93690 (0.0009) [2023-10-14 04:48:29,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 192741376. Throughput: 0: 1754.6, 1: 1765.7. Samples: 48190126. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:29,558][31953] Avg episode reward: [(0, '21.000'), (1, '20.810')] [2023-10-14 04:48:30,033][33226] Updated weights for policy 1, policy_version 94530 (0.0008) [2023-10-14 04:48:30,402][33226] Updated weights for policy 1, policy_version 94540 (0.0008) [2023-10-14 04:48:30,772][33226] Updated weights for policy 1, policy_version 94550 (0.0009) [2023-10-14 04:48:30,989][33201] Updated weights for policy 0, policy_version 93700 (0.0009) [2023-10-14 04:48:31,134][33226] Updated weights for policy 1, policy_version 94560 (0.0007) [2023-10-14 04:48:31,352][33201] Updated weights for policy 0, policy_version 93710 (0.0009) [2023-10-14 04:48:31,724][33201] Updated weights for policy 0, policy_version 93720 (0.0008) [2023-10-14 04:48:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 192806912. Throughput: 0: 1738.1, 1: 1771.3. Samples: 48211738. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.810')] [2023-10-14 04:48:34,898][33226] Updated weights for policy 1, policy_version 94570 (0.0010) [2023-10-14 04:48:35,264][33226] Updated weights for policy 1, policy_version 94580 (0.0009) [2023-10-14 04:48:35,630][33226] Updated weights for policy 1, policy_version 94590 (0.0009) [2023-10-14 04:48:35,760][33201] Updated weights for policy 0, policy_version 93730 (0.0007) [2023-10-14 04:48:36,127][33201] Updated weights for policy 0, policy_version 93740 (0.0009) [2023-10-14 04:48:36,503][33201] Updated weights for policy 0, policy_version 93750 (0.0008) [2023-10-14 04:48:36,873][33201] Updated weights for policy 0, policy_version 93760 (0.0010) [2023-10-14 04:48:39,414][33226] Updated weights for policy 1, policy_version 94600 (0.0007) [2023-10-14 04:48:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 192872448. Throughput: 0: 1758.4, 1: 1803.2. 
Samples: 48234094. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.810')] [2023-10-14 04:48:39,785][33226] Updated weights for policy 1, policy_version 94610 (0.0007) [2023-10-14 04:48:40,146][33226] Updated weights for policy 1, policy_version 94620 (0.0009) [2023-10-14 04:48:40,597][33201] Updated weights for policy 0, policy_version 93770 (0.0007) [2023-10-14 04:48:40,968][33201] Updated weights for policy 0, policy_version 93780 (0.0008) [2023-10-14 04:48:41,341][33201] Updated weights for policy 0, policy_version 93790 (0.0008) [2023-10-14 04:48:43,924][33226] Updated weights for policy 1, policy_version 94630 (0.0009) [2023-10-14 04:48:44,291][33226] Updated weights for policy 1, policy_version 94640 (0.0007) [2023-10-14 04:48:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 13995.8). Total num frames: 192937984. Throughput: 0: 1737.9, 1: 1775.9. Samples: 48243866. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.820')] [2023-10-14 04:48:44,646][33226] Updated weights for policy 1, policy_version 94650 (0.0007) [2023-10-14 04:48:45,170][33201] Updated weights for policy 0, policy_version 93800 (0.0009) [2023-10-14 04:48:45,542][33201] Updated weights for policy 0, policy_version 93810 (0.0012) [2023-10-14 04:48:45,914][33201] Updated weights for policy 0, policy_version 93820 (0.0011) [2023-10-14 04:48:48,574][33226] Updated weights for policy 1, policy_version 94660 (0.0008) [2023-10-14 04:48:48,978][33226] Updated weights for policy 1, policy_version 94670 (0.0009) [2023-10-14 04:48:49,342][33226] Updated weights for policy 1, policy_version 94680 (0.0007) [2023-10-14 04:48:49,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 13995.8). Total num frames: 193003520. Throughput: 0: 1754.4, 1: 1800.5. Samples: 48266044. 
Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.820')] [2023-10-14 04:48:49,739][33201] Updated weights for policy 0, policy_version 93830 (0.0009) [2023-10-14 04:48:50,114][33201] Updated weights for policy 0, policy_version 93840 (0.0008) [2023-10-14 04:48:50,484][33201] Updated weights for policy 0, policy_version 93850 (0.0008) [2023-10-14 04:48:53,102][33226] Updated weights for policy 1, policy_version 94690 (0.0008) [2023-10-14 04:48:53,462][33226] Updated weights for policy 1, policy_version 94700 (0.0008) [2023-10-14 04:48:53,826][33226] Updated weights for policy 1, policy_version 94710 (0.0010) [2023-10-14 04:48:54,192][33226] Updated weights for policy 1, policy_version 94720 (0.0009) [2023-10-14 04:48:54,316][33201] Updated weights for policy 0, policy_version 93860 (0.0007) [2023-10-14 04:48:54,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 193101824. Throughput: 0: 1779.2, 1: 1783.3. Samples: 48286762. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:54,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.820')] [2023-10-14 04:48:54,686][33201] Updated weights for policy 0, policy_version 93870 (0.0008) [2023-10-14 04:48:55,046][33201] Updated weights for policy 0, policy_version 93880 (0.0008) [2023-10-14 04:48:57,982][33226] Updated weights for policy 1, policy_version 94730 (0.0008) [2023-10-14 04:48:58,354][33226] Updated weights for policy 1, policy_version 94740 (0.0007) [2023-10-14 04:48:58,712][33226] Updated weights for policy 1, policy_version 94750 (0.0008) [2023-10-14 04:48:58,911][33201] Updated weights for policy 0, policy_version 93890 (0.0008) [2023-10-14 04:48:59,280][33201] Updated weights for policy 0, policy_version 93900 (0.0009) [2023-10-14 04:48:59,557][31953] Fps is (10 sec: 16384.0, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 193167360. Throughput: 0: 1745.7, 1: 1793.2. 
Samples: 48297498. Policy #0 lag: (min: 31.0, avg: 35.6, max: 63.0) [2023-10-14 04:48:59,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.830')] [2023-10-14 04:48:59,648][33201] Updated weights for policy 0, policy_version 93910 (0.0008) [2023-10-14 04:49:00,021][33201] Updated weights for policy 0, policy_version 93920 (0.0009) [2023-10-14 04:49:02,522][33226] Updated weights for policy 1, policy_version 94760 (0.0009) [2023-10-14 04:49:02,890][33226] Updated weights for policy 1, policy_version 94770 (0.0010) [2023-10-14 04:49:03,264][33226] Updated weights for policy 1, policy_version 94780 (0.0010) [2023-10-14 04:49:03,879][33201] Updated weights for policy 0, policy_version 93930 (0.0007) [2023-10-14 04:49:04,252][33201] Updated weights for policy 0, policy_version 93940 (0.0008) [2023-10-14 04:49:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 193232896. Throughput: 0: 1782.9, 1: 1798.3. Samples: 48318932. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.830')] [2023-10-14 04:49:04,615][33201] Updated weights for policy 0, policy_version 93950 (0.0008) [2023-10-14 04:49:06,933][33226] Updated weights for policy 1, policy_version 94790 (0.0008) [2023-10-14 04:49:07,303][33226] Updated weights for policy 1, policy_version 94800 (0.0010) [2023-10-14 04:49:07,664][33226] Updated weights for policy 1, policy_version 94810 (0.0009) [2023-10-14 04:49:08,625][33201] Updated weights for policy 0, policy_version 93960 (0.0008) [2023-10-14 04:49:09,008][33201] Updated weights for policy 0, policy_version 93970 (0.0008) [2023-10-14 04:49:09,378][33201] Updated weights for policy 0, policy_version 93980 (0.0007) [2023-10-14 04:49:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 193331200. Throughput: 0: 1761.9, 1: 1781.7. Samples: 48339508. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:09,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.830')] [2023-10-14 04:49:11,329][33226] Updated weights for policy 1, policy_version 94820 (0.0009) [2023-10-14 04:49:11,698][33226] Updated weights for policy 1, policy_version 94830 (0.0007) [2023-10-14 04:49:12,070][33226] Updated weights for policy 1, policy_version 94840 (0.0009) [2023-10-14 04:49:13,062][33201] Updated weights for policy 0, policy_version 93990 (0.0009) [2023-10-14 04:49:13,431][33201] Updated weights for policy 0, policy_version 94000 (0.0007) [2023-10-14 04:49:13,811][33201] Updated weights for policy 0, policy_version 94010 (0.0008) [2023-10-14 04:49:14,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 193396736. Throughput: 0: 1774.6, 1: 1792.4. Samples: 48350640. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:14,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.870')] [2023-10-14 04:49:15,769][33226] Updated weights for policy 1, policy_version 94850 (0.0009) [2023-10-14 04:49:16,144][33226] Updated weights for policy 1, policy_version 94860 (0.0007) [2023-10-14 04:49:16,506][33226] Updated weights for policy 1, policy_version 94870 (0.0008) [2023-10-14 04:49:16,868][33226] Updated weights for policy 1, policy_version 94880 (0.0008) [2023-10-14 04:49:17,421][33201] Updated weights for policy 0, policy_version 94020 (0.0007) [2023-10-14 04:49:17,797][33201] Updated weights for policy 0, policy_version 94030 (0.0010) [2023-10-14 04:49:18,173][33201] Updated weights for policy 0, policy_version 94040 (0.0008) [2023-10-14 04:49:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 193462272. Throughput: 0: 1772.3, 1: 1778.7. Samples: 48371532. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:19,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.870')] [2023-10-14 04:49:20,799][33226] Updated weights for policy 1, policy_version 94890 (0.0009) [2023-10-14 04:49:21,164][33226] Updated weights for policy 1, policy_version 94900 (0.0009) [2023-10-14 04:49:21,529][33226] Updated weights for policy 1, policy_version 94910 (0.0010) [2023-10-14 04:49:21,939][33201] Updated weights for policy 0, policy_version 94050 (0.0010) [2023-10-14 04:49:22,319][33201] Updated weights for policy 0, policy_version 94060 (0.0008) [2023-10-14 04:49:22,681][33201] Updated weights for policy 0, policy_version 94070 (0.0007) [2023-10-14 04:49:23,045][33201] Updated weights for policy 0, policy_version 94080 (0.0007) [2023-10-14 04:49:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 193527808. Throughput: 0: 1758.1, 1: 1778.6. Samples: 48393246. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.890')] [2023-10-14 04:49:24,564][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000094912_97189888.pth... [2023-10-14 04:49:24,564][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000094080_96337920.pth... 
[2023-10-14 04:49:24,603][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000093248_95485952.pth [2023-10-14 04:49:24,603][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000092416_94633984.pth [2023-10-14 04:49:25,391][33226] Updated weights for policy 1, policy_version 94920 (0.0008) [2023-10-14 04:49:25,754][33226] Updated weights for policy 1, policy_version 94930 (0.0008) [2023-10-14 04:49:26,122][33226] Updated weights for policy 1, policy_version 94940 (0.0008) [2023-10-14 04:49:26,932][33201] Updated weights for policy 0, policy_version 94090 (0.0010) [2023-10-14 04:49:27,302][33201] Updated weights for policy 0, policy_version 94100 (0.0009) [2023-10-14 04:49:27,674][33201] Updated weights for policy 0, policy_version 94110 (0.0009) [2023-10-14 04:49:29,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 193593344. Throughput: 0: 1776.8, 1: 1774.1. Samples: 48403656. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:49:29,811][33226] Updated weights for policy 1, policy_version 94950 (0.0009) [2023-10-14 04:49:30,171][33226] Updated weights for policy 1, policy_version 94960 (0.0008) [2023-10-14 04:49:30,549][33226] Updated weights for policy 1, policy_version 94970 (0.0007) [2023-10-14 04:49:31,320][33201] Updated weights for policy 0, policy_version 94120 (0.0008) [2023-10-14 04:49:31,700][33201] Updated weights for policy 0, policy_version 94130 (0.0008) [2023-10-14 04:49:32,067][33201] Updated weights for policy 0, policy_version 94140 (0.0011) [2023-10-14 04:49:34,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 193658880. Throughput: 0: 1766.3, 1: 1772.9. Samples: 48425308. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:34,558][33226] Updated weights for policy 1, policy_version 94980 (0.0010) [2023-10-14 04:49:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.940')] [2023-10-14 04:49:34,947][33226] Updated weights for policy 1, policy_version 94990 (0.0008) [2023-10-14 04:49:35,308][33226] Updated weights for policy 1, policy_version 95000 (0.0007) [2023-10-14 04:49:35,814][33201] Updated weights for policy 0, policy_version 94150 (0.0008) [2023-10-14 04:49:36,184][33201] Updated weights for policy 0, policy_version 94160 (0.0007) [2023-10-14 04:49:36,547][33201] Updated weights for policy 0, policy_version 94170 (0.0008) [2023-10-14 04:49:38,988][33226] Updated weights for policy 1, policy_version 95010 (0.0007) [2023-10-14 04:49:39,345][33226] Updated weights for policy 1, policy_version 95020 (0.0007) [2023-10-14 04:49:39,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 193724416. Throughput: 0: 1763.9, 1: 1800.0. Samples: 48447140. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:39,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.910')] [2023-10-14 04:49:39,709][33226] Updated weights for policy 1, policy_version 95030 (0.0009) [2023-10-14 04:49:40,082][33226] Updated weights for policy 1, policy_version 95040 (0.0010) [2023-10-14 04:49:40,444][33201] Updated weights for policy 0, policy_version 94180 (0.0009) [2023-10-14 04:49:40,808][33201] Updated weights for policy 0, policy_version 94190 (0.0009) [2023-10-14 04:49:41,184][33201] Updated weights for policy 0, policy_version 94200 (0.0010) [2023-10-14 04:49:43,892][33226] Updated weights for policy 1, policy_version 95050 (0.0009) [2023-10-14 04:49:44,259][33226] Updated weights for policy 1, policy_version 95060 (0.0007) [2023-10-14 04:49:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 193789952. Throughput: 0: 1765.1, 1: 1775.1. 
Samples: 48456804. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:49:44,621][33226] Updated weights for policy 1, policy_version 95070 (0.0007) [2023-10-14 04:49:45,010][33201] Updated weights for policy 0, policy_version 94210 (0.0007) [2023-10-14 04:49:45,377][33201] Updated weights for policy 0, policy_version 94220 (0.0008) [2023-10-14 04:49:45,753][33201] Updated weights for policy 0, policy_version 94230 (0.0008) [2023-10-14 04:49:46,127][33201] Updated weights for policy 0, policy_version 94240 (0.0008) [2023-10-14 04:49:48,250][33226] Updated weights for policy 1, policy_version 95080 (0.0008) [2023-10-14 04:49:48,619][33226] Updated weights for policy 1, policy_version 95090 (0.0009) [2023-10-14 04:49:48,983][33226] Updated weights for policy 1, policy_version 95100 (0.0008) [2023-10-14 04:49:49,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14745.6, 300 sec: 14106.9). Total num frames: 193888256. Throughput: 0: 1760.0, 1: 1800.5. Samples: 48479154. Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:49,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:49:49,991][33201] Updated weights for policy 0, policy_version 94250 (0.0008) [2023-10-14 04:49:50,365][33201] Updated weights for policy 0, policy_version 94260 (0.0007) [2023-10-14 04:49:50,739][33201] Updated weights for policy 0, policy_version 94270 (0.0007) [2023-10-14 04:49:52,702][33226] Updated weights for policy 1, policy_version 95110 (0.0007) [2023-10-14 04:49:53,070][33226] Updated weights for policy 1, policy_version 95120 (0.0007) [2023-10-14 04:49:53,425][33226] Updated weights for policy 1, policy_version 95130 (0.0010) [2023-10-14 04:49:54,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 193953792. Throughput: 0: 1790.5, 1: 1778.6. Samples: 48500118. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:49:54,674][33201] Updated weights for policy 0, policy_version 94280 (0.0008) [2023-10-14 04:49:55,040][33201] Updated weights for policy 0, policy_version 94290 (0.0008) [2023-10-14 04:49:55,407][33201] Updated weights for policy 0, policy_version 94300 (0.0007) [2023-10-14 04:49:57,270][33226] Updated weights for policy 1, policy_version 95140 (0.0010) [2023-10-14 04:49:57,633][33226] Updated weights for policy 1, policy_version 95150 (0.0009) [2023-10-14 04:49:58,000][33226] Updated weights for policy 1, policy_version 95160 (0.0008) [2023-10-14 04:49:59,089][33201] Updated weights for policy 0, policy_version 94310 (0.0008) [2023-10-14 04:49:59,453][33201] Updated weights for policy 0, policy_version 94320 (0.0007) [2023-10-14 04:49:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194019328. Throughput: 0: 1765.4, 1: 1800.4. Samples: 48511100. 
Policy #0 lag: (min: 31.0, avg: 35.5, max: 63.0) [2023-10-14 04:49:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:49:59,822][33201] Updated weights for policy 0, policy_version 94330 (0.0007) [2023-10-14 04:50:01,873][33226] Updated weights for policy 1, policy_version 95170 (0.0008) [2023-10-14 04:50:02,232][33226] Updated weights for policy 1, policy_version 95180 (0.0007) [2023-10-14 04:50:02,607][33226] Updated weights for policy 1, policy_version 95190 (0.0007) [2023-10-14 04:50:02,965][33226] Updated weights for policy 1, policy_version 95200 (0.0008) [2023-10-14 04:50:03,557][33201] Updated weights for policy 0, policy_version 94340 (0.0008) [2023-10-14 04:50:03,922][33201] Updated weights for policy 0, policy_version 94350 (0.0010) [2023-10-14 04:50:04,291][33201] Updated weights for policy 0, policy_version 94360 (0.0009) [2023-10-14 04:50:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194084864. Throughput: 0: 1788.0, 1: 1777.1. Samples: 48531964. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:04,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:50:06,775][33226] Updated weights for policy 1, policy_version 95210 (0.0008) [2023-10-14 04:50:07,139][33226] Updated weights for policy 1, policy_version 95220 (0.0008) [2023-10-14 04:50:07,508][33226] Updated weights for policy 1, policy_version 95230 (0.0010) [2023-10-14 04:50:08,272][33201] Updated weights for policy 0, policy_version 94370 (0.0008) [2023-10-14 04:50:08,631][33201] Updated weights for policy 0, policy_version 94380 (0.0008) [2023-10-14 04:50:09,004][33201] Updated weights for policy 0, policy_version 94390 (0.0007) [2023-10-14 04:50:09,381][33201] Updated weights for policy 0, policy_version 94400 (0.0007) [2023-10-14 04:50:09,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 194183168. Throughput: 0: 1772.2, 1: 1776.8. 
Samples: 48552948. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:09,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.910')] [2023-10-14 04:50:11,232][33226] Updated weights for policy 1, policy_version 95240 (0.0010) [2023-10-14 04:50:11,600][33226] Updated weights for policy 1, policy_version 95250 (0.0010) [2023-10-14 04:50:11,967][33226] Updated weights for policy 1, policy_version 95260 (0.0010) [2023-10-14 04:50:13,150][33201] Updated weights for policy 0, policy_version 94410 (0.0008) [2023-10-14 04:50:13,521][33201] Updated weights for policy 0, policy_version 94420 (0.0007) [2023-10-14 04:50:13,889][33201] Updated weights for policy 0, policy_version 94430 (0.0008) [2023-10-14 04:50:14,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 194248704. Throughput: 0: 1777.5, 1: 1785.0. Samples: 48563968. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:14,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:15,774][33226] Updated weights for policy 1, policy_version 95270 (0.0007) [2023-10-14 04:50:16,138][33226] Updated weights for policy 1, policy_version 95280 (0.0010) [2023-10-14 04:50:16,514][33226] Updated weights for policy 1, policy_version 95290 (0.0011) [2023-10-14 04:50:17,813][33201] Updated weights for policy 0, policy_version 94440 (0.0010) [2023-10-14 04:50:18,175][33201] Updated weights for policy 0, policy_version 94450 (0.0007) [2023-10-14 04:50:18,550][33201] Updated weights for policy 0, policy_version 94460 (0.0009) [2023-10-14 04:50:19,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 194314240. Throughput: 0: 1775.2, 1: 1779.4. Samples: 48585268. 
Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:19,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:20,375][33226] Updated weights for policy 1, policy_version 95300 (0.0009) [2023-10-14 04:50:20,776][33226] Updated weights for policy 1, policy_version 95310 (0.0008) [2023-10-14 04:50:21,153][33226] Updated weights for policy 1, policy_version 95320 (0.0008) [2023-10-14 04:50:22,284][33201] Updated weights for policy 0, policy_version 94470 (0.0008) [2023-10-14 04:50:22,654][33201] Updated weights for policy 0, policy_version 94480 (0.0007) [2023-10-14 04:50:23,016][33201] Updated weights for policy 0, policy_version 94490 (0.0007) [2023-10-14 04:50:24,557][31953] Fps is (10 sec: 13106.9, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 194379776. Throughput: 0: 1759.6, 1: 1783.1. Samples: 48606562. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:24,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:24,899][33226] Updated weights for policy 1, policy_version 95330 (0.0009) [2023-10-14 04:50:25,267][33226] Updated weights for policy 1, policy_version 95340 (0.0007) [2023-10-14 04:50:25,634][33226] Updated weights for policy 1, policy_version 95350 (0.0009) [2023-10-14 04:50:25,998][33226] Updated weights for policy 1, policy_version 95360 (0.0009) [2023-10-14 04:50:26,890][33201] Updated weights for policy 0, policy_version 94500 (0.0009) [2023-10-14 04:50:27,266][33201] Updated weights for policy 0, policy_version 94510 (0.0008) [2023-10-14 04:50:27,640][33201] Updated weights for policy 0, policy_version 94520 (0.0008) [2023-10-14 04:50:29,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 194445312. Throughput: 0: 1783.5, 1: 1781.1. Samples: 48617210. 
Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:29,557][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:29,664][33226] Updated weights for policy 1, policy_version 95370 (0.0009) [2023-10-14 04:50:30,032][33226] Updated weights for policy 1, policy_version 95380 (0.0007) [2023-10-14 04:50:30,401][33226] Updated weights for policy 1, policy_version 95390 (0.0011) [2023-10-14 04:50:31,491][33201] Updated weights for policy 0, policy_version 94530 (0.0009) [2023-10-14 04:50:31,856][33201] Updated weights for policy 0, policy_version 94540 (0.0009) [2023-10-14 04:50:32,230][33201] Updated weights for policy 0, policy_version 94550 (0.0008) [2023-10-14 04:50:32,596][33201] Updated weights for policy 0, policy_version 94560 (0.0008) [2023-10-14 04:50:34,108][33226] Updated weights for policy 1, policy_version 95400 (0.0010) [2023-10-14 04:50:34,473][33226] Updated weights for policy 1, policy_version 95410 (0.0008) [2023-10-14 04:50:34,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 194510848. Throughput: 0: 1759.0, 1: 1779.6. Samples: 48638394. 
Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:34,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:34,843][33226] Updated weights for policy 1, policy_version 95420 (0.0007) [2023-10-14 04:50:36,329][33201] Updated weights for policy 0, policy_version 94570 (0.0009) [2023-10-14 04:50:36,695][33201] Updated weights for policy 0, policy_version 94580 (0.0010) [2023-10-14 04:50:37,074][33201] Updated weights for policy 0, policy_version 94590 (0.0011) [2023-10-14 04:50:38,605][33226] Updated weights for policy 1, policy_version 95430 (0.0007) [2023-10-14 04:50:38,971][33226] Updated weights for policy 1, policy_version 95440 (0.0007) [2023-10-14 04:50:39,338][33226] Updated weights for policy 1, policy_version 95450 (0.0007) [2023-10-14 04:50:39,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 194609152. Throughput: 0: 1757.2, 1: 1794.1. Samples: 48659928. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:39,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:40,958][33201] Updated weights for policy 0, policy_version 94600 (0.0010) [2023-10-14 04:50:41,342][33201] Updated weights for policy 0, policy_version 94610 (0.0010) [2023-10-14 04:50:41,702][33201] Updated weights for policy 0, policy_version 94620 (0.0010) [2023-10-14 04:50:43,106][33226] Updated weights for policy 1, policy_version 95460 (0.0008) [2023-10-14 04:50:43,479][33226] Updated weights for policy 1, policy_version 95470 (0.0008) [2023-10-14 04:50:43,847][33226] Updated weights for policy 1, policy_version 95480 (0.0008) [2023-10-14 04:50:44,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14745.6, 300 sec: 14218.0). Total num frames: 194674688. Throughput: 0: 1757.0, 1: 1773.2. Samples: 48669956. 
Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:44,558][31953] Avg episode reward: [(0, '20.870'), (1, '20.930')] [2023-10-14 04:50:45,492][33201] Updated weights for policy 0, policy_version 94630 (0.0010) [2023-10-14 04:50:45,863][33201] Updated weights for policy 0, policy_version 94640 (0.0010) [2023-10-14 04:50:46,239][33201] Updated weights for policy 0, policy_version 94650 (0.0009) [2023-10-14 04:50:47,668][33226] Updated weights for policy 1, policy_version 95490 (0.0010) [2023-10-14 04:50:48,038][33226] Updated weights for policy 1, policy_version 95500 (0.0010) [2023-10-14 04:50:48,399][33226] Updated weights for policy 1, policy_version 95510 (0.0010) [2023-10-14 04:50:48,770][33226] Updated weights for policy 1, policy_version 95520 (0.0009) [2023-10-14 04:50:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194740224. Throughput: 0: 1754.7, 1: 1799.2. Samples: 48691886. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:49,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:50:49,939][33201] Updated weights for policy 0, policy_version 94660 (0.0010) [2023-10-14 04:50:50,305][33201] Updated weights for policy 0, policy_version 94670 (0.0009) [2023-10-14 04:50:50,664][33201] Updated weights for policy 0, policy_version 94680 (0.0008) [2023-10-14 04:50:52,675][33226] Updated weights for policy 1, policy_version 95530 (0.0008) [2023-10-14 04:50:53,046][33226] Updated weights for policy 1, policy_version 95540 (0.0008) [2023-10-14 04:50:53,414][33226] Updated weights for policy 1, policy_version 95550 (0.0007) [2023-10-14 04:50:54,503][33201] Updated weights for policy 0, policy_version 94690 (0.0009) [2023-10-14 04:50:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194805760. Throughput: 0: 1782.1, 1: 1772.0. Samples: 48712884. 
Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:54,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:50:54,878][33201] Updated weights for policy 0, policy_version 94700 (0.0010) [2023-10-14 04:50:55,258][33201] Updated weights for policy 0, policy_version 94710 (0.0008) [2023-10-14 04:50:55,632][33201] Updated weights for policy 0, policy_version 94720 (0.0009) [2023-10-14 04:50:57,280][33226] Updated weights for policy 1, policy_version 95560 (0.0007) [2023-10-14 04:50:57,643][33226] Updated weights for policy 1, policy_version 95570 (0.0008) [2023-10-14 04:50:58,011][33226] Updated weights for policy 1, policy_version 95580 (0.0008) [2023-10-14 04:50:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194871296. Throughput: 0: 1754.5, 1: 1795.6. Samples: 48723722. Policy #0 lag: (min: 17.0, avg: 22.7, max: 49.0) [2023-10-14 04:50:59,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:50:59,623][33201] Updated weights for policy 0, policy_version 94730 (0.0009) [2023-10-14 04:51:00,002][33201] Updated weights for policy 0, policy_version 94740 (0.0008) [2023-10-14 04:51:00,382][33201] Updated weights for policy 0, policy_version 94750 (0.0008) [2023-10-14 04:51:01,881][33226] Updated weights for policy 1, policy_version 95590 (0.0009) [2023-10-14 04:51:02,249][33226] Updated weights for policy 1, policy_version 95600 (0.0008) [2023-10-14 04:51:02,615][33226] Updated weights for policy 1, policy_version 95610 (0.0009) [2023-10-14 04:51:04,103][33201] Updated weights for policy 0, policy_version 94760 (0.0010) [2023-10-14 04:51:04,474][33201] Updated weights for policy 0, policy_version 94770 (0.0010) [2023-10-14 04:51:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 194936832. Throughput: 0: 1765.9, 1: 1763.1. Samples: 48744072. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:04,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:51:04,849][33201] Updated weights for policy 0, policy_version 94780 (0.0011) [2023-10-14 04:51:06,462][33226] Updated weights for policy 1, policy_version 95620 (0.0008) [2023-10-14 04:51:06,846][33226] Updated weights for policy 1, policy_version 95630 (0.0009) [2023-10-14 04:51:07,207][33226] Updated weights for policy 1, policy_version 95640 (0.0009) [2023-10-14 04:51:08,634][33201] Updated weights for policy 0, policy_version 94790 (0.0009) [2023-10-14 04:51:08,998][33201] Updated weights for policy 0, policy_version 94800 (0.0008) [2023-10-14 04:51:09,364][33201] Updated weights for policy 0, policy_version 94810 (0.0008) [2023-10-14 04:51:09,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 195002368. Throughput: 0: 1764.8, 1: 1762.9. Samples: 48765304. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:09,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:51:10,894][33226] Updated weights for policy 1, policy_version 95650 (0.0011) [2023-10-14 04:51:11,270][33226] Updated weights for policy 1, policy_version 95660 (0.0009) [2023-10-14 04:51:11,638][33226] Updated weights for policy 1, policy_version 95670 (0.0010) [2023-10-14 04:51:12,009][33226] Updated weights for policy 1, policy_version 95680 (0.0007) [2023-10-14 04:51:13,143][33201] Updated weights for policy 0, policy_version 94820 (0.0011) [2023-10-14 04:51:13,520][33201] Updated weights for policy 0, policy_version 94830 (0.0008) [2023-10-14 04:51:13,892][33201] Updated weights for policy 0, policy_version 94840 (0.0009) [2023-10-14 04:51:14,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195100672. Throughput: 0: 1762.9, 1: 1767.9. Samples: 48776094. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:14,558][31953] Avg episode reward: [(0, '20.850'), (1, '20.930')] [2023-10-14 04:51:15,712][33226] Updated weights for policy 1, policy_version 95690 (0.0007) [2023-10-14 04:51:16,082][33226] Updated weights for policy 1, policy_version 95700 (0.0007) [2023-10-14 04:51:16,443][33226] Updated weights for policy 1, policy_version 95710 (0.0009) [2023-10-14 04:51:17,750][33201] Updated weights for policy 0, policy_version 94850 (0.0008) [2023-10-14 04:51:18,115][33201] Updated weights for policy 0, policy_version 94860 (0.0009) [2023-10-14 04:51:18,480][33201] Updated weights for policy 0, policy_version 94870 (0.0008) [2023-10-14 04:51:18,853][33201] Updated weights for policy 0, policy_version 94880 (0.0009) [2023-10-14 04:51:19,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195166208. Throughput: 0: 1774.9, 1: 1760.4. Samples: 48797486. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.930')] [2023-10-14 04:51:20,229][33226] Updated weights for policy 1, policy_version 95720 (0.0010) [2023-10-14 04:51:20,601][33226] Updated weights for policy 1, policy_version 95730 (0.0010) [2023-10-14 04:51:20,980][33226] Updated weights for policy 1, policy_version 95740 (0.0010) [2023-10-14 04:51:22,627][33201] Updated weights for policy 0, policy_version 94890 (0.0008) [2023-10-14 04:51:22,999][33201] Updated weights for policy 0, policy_version 94900 (0.0010) [2023-10-14 04:51:23,376][33201] Updated weights for policy 0, policy_version 94910 (0.0008) [2023-10-14 04:51:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195231744. Throughput: 0: 1751.0, 1: 1776.0. Samples: 48818642. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.930')] [2023-10-14 04:51:24,568][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000095744_98041856.pth... [2023-10-14 04:51:24,568][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000094912_97189888.pth... [2023-10-14 04:51:24,604][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000093248_95485952.pth [2023-10-14 04:51:24,608][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000094080_96337920.pth [2023-10-14 04:51:24,941][33226] Updated weights for policy 1, policy_version 95750 (0.0010) [2023-10-14 04:51:25,307][33226] Updated weights for policy 1, policy_version 95760 (0.0008) [2023-10-14 04:51:25,678][33226] Updated weights for policy 1, policy_version 95770 (0.0008) [2023-10-14 04:51:27,300][33201] Updated weights for policy 0, policy_version 94920 (0.0007) [2023-10-14 04:51:27,672][33201] Updated weights for policy 0, policy_version 94930 (0.0008) [2023-10-14 04:51:28,049][33201] Updated weights for policy 0, policy_version 94940 (0.0009) [2023-10-14 04:51:29,391][33226] Updated weights for policy 1, policy_version 95780 (0.0008) [2023-10-14 04:51:29,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195297280. Throughput: 0: 1782.9, 1: 1764.0. Samples: 48829568. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:29,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:51:29,760][33226] Updated weights for policy 1, policy_version 95790 (0.0008) [2023-10-14 04:51:30,136][33226] Updated weights for policy 1, policy_version 95800 (0.0008) [2023-10-14 04:51:31,892][33201] Updated weights for policy 0, policy_version 94950 (0.0009) [2023-10-14 04:51:32,258][33201] Updated weights for policy 0, policy_version 94960 (0.0008) [2023-10-14 04:51:32,632][33201] Updated weights for policy 0, policy_version 94970 (0.0008) [2023-10-14 04:51:33,783][33226] Updated weights for policy 1, policy_version 95810 (0.0009) [2023-10-14 04:51:34,154][33226] Updated weights for policy 1, policy_version 95820 (0.0007) [2023-10-14 04:51:34,519][33226] Updated weights for policy 1, policy_version 95830 (0.0007) [2023-10-14 04:51:34,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195362816. Throughput: 0: 1748.1, 1: 1777.7. Samples: 48850548. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.830')] [2023-10-14 04:51:34,883][33226] Updated weights for policy 1, policy_version 95840 (0.0007) [2023-10-14 04:51:36,388][33201] Updated weights for policy 0, policy_version 94980 (0.0009) [2023-10-14 04:51:36,751][33201] Updated weights for policy 0, policy_version 94990 (0.0009) [2023-10-14 04:51:37,123][33201] Updated weights for policy 0, policy_version 95000 (0.0011) [2023-10-14 04:51:38,628][33226] Updated weights for policy 1, policy_version 95850 (0.0009) [2023-10-14 04:51:39,002][33226] Updated weights for policy 1, policy_version 95860 (0.0007) [2023-10-14 04:51:39,373][33226] Updated weights for policy 1, policy_version 95870 (0.0009) [2023-10-14 04:51:39,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 195461120. Throughput: 0: 1756.4, 1: 1792.7. 
Samples: 48872594. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.830')] [2023-10-14 04:51:40,852][33201] Updated weights for policy 0, policy_version 95010 (0.0010) [2023-10-14 04:51:41,219][33201] Updated weights for policy 0, policy_version 95020 (0.0010) [2023-10-14 04:51:41,587][33201] Updated weights for policy 0, policy_version 95030 (0.0010) [2023-10-14 04:51:41,949][33201] Updated weights for policy 0, policy_version 95040 (0.0008) [2023-10-14 04:51:43,201][33226] Updated weights for policy 1, policy_version 95880 (0.0010) [2023-10-14 04:51:43,566][33226] Updated weights for policy 1, policy_version 95890 (0.0010) [2023-10-14 04:51:43,930][33226] Updated weights for policy 1, policy_version 95900 (0.0011) [2023-10-14 04:51:44,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 195526656. Throughput: 0: 1759.9, 1: 1779.4. Samples: 48882992. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.830')] [2023-10-14 04:51:45,687][33201] Updated weights for policy 0, policy_version 95050 (0.0007) [2023-10-14 04:51:46,053][33201] Updated weights for policy 0, policy_version 95060 (0.0009) [2023-10-14 04:51:46,427][33201] Updated weights for policy 0, policy_version 95070 (0.0010) [2023-10-14 04:51:47,610][33226] Updated weights for policy 1, policy_version 95910 (0.0009) [2023-10-14 04:51:47,980][33226] Updated weights for policy 1, policy_version 95920 (0.0008) [2023-10-14 04:51:48,335][33226] Updated weights for policy 1, policy_version 95930 (0.0010) [2023-10-14 04:51:49,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 195592192. Throughput: 0: 1760.6, 1: 1807.6. Samples: 48904638. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:49,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.830')] [2023-10-14 04:51:50,402][33201] Updated weights for policy 0, policy_version 95080 (0.0010) [2023-10-14 04:51:50,776][33201] Updated weights for policy 0, policy_version 95090 (0.0007) [2023-10-14 04:51:51,145][33201] Updated weights for policy 0, policy_version 95100 (0.0011) [2023-10-14 04:51:52,151][33226] Updated weights for policy 1, policy_version 95940 (0.0008) [2023-10-14 04:51:52,545][33226] Updated weights for policy 1, policy_version 95950 (0.0007) [2023-10-14 04:51:52,906][33226] Updated weights for policy 1, policy_version 95960 (0.0007) [2023-10-14 04:51:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 195657728. Throughput: 0: 1779.3, 1: 1787.0. Samples: 48925788. Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:54,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.830')] [2023-10-14 04:51:54,879][33201] Updated weights for policy 0, policy_version 95110 (0.0008) [2023-10-14 04:51:55,258][33201] Updated weights for policy 0, policy_version 95120 (0.0007) [2023-10-14 04:51:55,627][33201] Updated weights for policy 0, policy_version 95130 (0.0007) [2023-10-14 04:51:56,512][33226] Updated weights for policy 1, policy_version 95970 (0.0008) [2023-10-14 04:51:56,870][33226] Updated weights for policy 1, policy_version 95980 (0.0010) [2023-10-14 04:51:57,237][33226] Updated weights for policy 1, policy_version 95990 (0.0011) [2023-10-14 04:51:57,609][33226] Updated weights for policy 1, policy_version 96000 (0.0007) [2023-10-14 04:51:59,545][33201] Updated weights for policy 0, policy_version 95140 (0.0008) [2023-10-14 04:51:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 195723264. Throughput: 0: 1756.0, 1: 1804.8. Samples: 48936332. 
Policy #0 lag: (min: 31.0, avg: 36.6, max: 63.0) [2023-10-14 04:51:59,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 04:51:59,909][33201] Updated weights for policy 0, policy_version 95150 (0.0009) [2023-10-14 04:52:00,293][33201] Updated weights for policy 0, policy_version 95160 (0.0010) [2023-10-14 04:52:01,488][33226] Updated weights for policy 1, policy_version 96010 (0.0010) [2023-10-14 04:52:01,844][33226] Updated weights for policy 1, policy_version 96020 (0.0007) [2023-10-14 04:52:02,215][33226] Updated weights for policy 1, policy_version 96030 (0.0009) [2023-10-14 04:52:04,075][33201] Updated weights for policy 0, policy_version 95170 (0.0008) [2023-10-14 04:52:04,441][33201] Updated weights for policy 0, policy_version 95180 (0.0007) [2023-10-14 04:52:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 195788800. Throughput: 0: 1768.8, 1: 1790.0. Samples: 48957630. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 04:52:04,813][33201] Updated weights for policy 0, policy_version 95190 (0.0008) [2023-10-14 04:52:05,179][33201] Updated weights for policy 0, policy_version 95200 (0.0008) [2023-10-14 04:52:06,039][33226] Updated weights for policy 1, policy_version 96040 (0.0009) [2023-10-14 04:52:06,406][33226] Updated weights for policy 1, policy_version 96050 (0.0009) [2023-10-14 04:52:06,777][33226] Updated weights for policy 1, policy_version 96060 (0.0009) [2023-10-14 04:52:09,124][33201] Updated weights for policy 0, policy_version 95210 (0.0008) [2023-10-14 04:52:09,498][33201] Updated weights for policy 0, policy_version 95220 (0.0007) [2023-10-14 04:52:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 195854336. Throughput: 0: 1781.1, 1: 1788.4. Samples: 48979268. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 04:52:09,865][33201] Updated weights for policy 0, policy_version 95230 (0.0011) [2023-10-14 04:52:10,644][33226] Updated weights for policy 1, policy_version 96070 (0.0007) [2023-10-14 04:52:11,013][33226] Updated weights for policy 1, policy_version 96080 (0.0007) [2023-10-14 04:52:11,379][33226] Updated weights for policy 1, policy_version 96090 (0.0008) [2023-10-14 04:52:13,903][33201] Updated weights for policy 0, policy_version 95240 (0.0009) [2023-10-14 04:52:14,284][33201] Updated weights for policy 0, policy_version 95250 (0.0008) [2023-10-14 04:52:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 195919872. Throughput: 0: 1760.4, 1: 1789.6. Samples: 48989320. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.850')] [2023-10-14 04:52:14,662][33201] Updated weights for policy 0, policy_version 95260 (0.0009) [2023-10-14 04:52:14,936][33226] Updated weights for policy 1, policy_version 96100 (0.0007) [2023-10-14 04:52:15,308][33226] Updated weights for policy 1, policy_version 96110 (0.0009) [2023-10-14 04:52:15,671][33226] Updated weights for policy 1, policy_version 96120 (0.0007) [2023-10-14 04:52:18,459][33201] Updated weights for policy 0, policy_version 95270 (0.0009) [2023-10-14 04:52:18,827][33201] Updated weights for policy 0, policy_version 95280 (0.0009) [2023-10-14 04:52:19,199][33201] Updated weights for policy 0, policy_version 95290 (0.0007) [2023-10-14 04:52:19,450][33226] Updated weights for policy 1, policy_version 96130 (0.0007) [2023-10-14 04:52:19,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 196018176. Throughput: 0: 1789.2, 1: 1792.5. Samples: 49011724. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:19,558][31953] Avg episode reward: [(0, '20.990'), (1, '20.870')] [2023-10-14 04:52:19,823][33226] Updated weights for policy 1, policy_version 96140 (0.0008) [2023-10-14 04:52:20,191][33226] Updated weights for policy 1, policy_version 96150 (0.0008) [2023-10-14 04:52:20,556][33226] Updated weights for policy 1, policy_version 96160 (0.0008) [2023-10-14 04:52:23,035][33201] Updated weights for policy 0, policy_version 95300 (0.0007) [2023-10-14 04:52:23,412][33201] Updated weights for policy 0, policy_version 95310 (0.0008) [2023-10-14 04:52:23,771][33201] Updated weights for policy 0, policy_version 95320 (0.0009) [2023-10-14 04:52:24,299][33226] Updated weights for policy 1, policy_version 96170 (0.0008) [2023-10-14 04:52:24,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 196083712. Throughput: 0: 1750.3, 1: 1803.1. Samples: 49032498. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:24,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 04:52:24,661][33226] Updated weights for policy 1, policy_version 96180 (0.0009) [2023-10-14 04:52:25,037][33226] Updated weights for policy 1, policy_version 96190 (0.0007) [2023-10-14 04:52:27,553][33201] Updated weights for policy 0, policy_version 95330 (0.0008) [2023-10-14 04:52:27,930][33201] Updated weights for policy 0, policy_version 95340 (0.0011) [2023-10-14 04:52:28,298][33201] Updated weights for policy 0, policy_version 95350 (0.0007) [2023-10-14 04:52:28,667][33201] Updated weights for policy 0, policy_version 95360 (0.0008) [2023-10-14 04:52:28,794][33226] Updated weights for policy 1, policy_version 96200 (0.0010) [2023-10-14 04:52:29,159][33226] Updated weights for policy 1, policy_version 96210 (0.0010) [2023-10-14 04:52:29,522][33226] Updated weights for policy 1, policy_version 96220 (0.0010) [2023-10-14 04:52:29,557][31953] Fps is (10 sec: 13106.8, 60 sec: 
14199.4, 300 sec: 14218.0). Total num frames: 196149248. Throughput: 0: 1781.2, 1: 1784.0. Samples: 49043426. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:29,560][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 04:52:32,263][33201] Updated weights for policy 0, policy_version 95370 (0.0011) [2023-10-14 04:52:32,634][33201] Updated weights for policy 0, policy_version 95380 (0.0008) [2023-10-14 04:52:33,002][33201] Updated weights for policy 0, policy_version 95390 (0.0007) [2023-10-14 04:52:33,286][33226] Updated weights for policy 1, policy_version 96230 (0.0010) [2023-10-14 04:52:33,657][33226] Updated weights for policy 1, policy_version 96240 (0.0008) [2023-10-14 04:52:34,021][33226] Updated weights for policy 1, policy_version 96250 (0.0010) [2023-10-14 04:52:34,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14745.6, 300 sec: 14329.1). Total num frames: 196247552. Throughput: 0: 1754.5, 1: 1798.7. Samples: 49064532. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:34,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 04:52:36,833][33201] Updated weights for policy 0, policy_version 95400 (0.0007) [2023-10-14 04:52:37,195][33201] Updated weights for policy 0, policy_version 95410 (0.0009) [2023-10-14 04:52:37,568][33201] Updated weights for policy 0, policy_version 95420 (0.0011) [2023-10-14 04:52:37,973][33226] Updated weights for policy 1, policy_version 96260 (0.0009) [2023-10-14 04:52:38,362][33226] Updated weights for policy 1, policy_version 96270 (0.0009) [2023-10-14 04:52:38,727][33226] Updated weights for policy 1, policy_version 96280 (0.0010) [2023-10-14 04:52:39,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 196313088. Throughput: 0: 1755.6, 1: 1782.2. Samples: 49084990. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:39,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 04:52:41,524][33201] Updated weights for policy 0, policy_version 95430 (0.0009) [2023-10-14 04:52:41,896][33201] Updated weights for policy 0, policy_version 95440 (0.0009) [2023-10-14 04:52:42,268][33201] Updated weights for policy 0, policy_version 95450 (0.0008) [2023-10-14 04:52:42,541][33226] Updated weights for policy 1, policy_version 96290 (0.0008) [2023-10-14 04:52:42,908][33226] Updated weights for policy 1, policy_version 96300 (0.0008) [2023-10-14 04:52:43,277][33226] Updated weights for policy 1, policy_version 96310 (0.0008) [2023-10-14 04:52:43,650][33226] Updated weights for policy 1, policy_version 96320 (0.0010) [2023-10-14 04:52:44,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 196378624. Throughput: 0: 1768.2, 1: 1788.3. Samples: 49096372. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:44,558][31953] Avg episode reward: [(0, '20.910'), (1, '20.870')] [2023-10-14 04:52:46,170][33201] Updated weights for policy 0, policy_version 95460 (0.0008) [2023-10-14 04:52:46,539][33201] Updated weights for policy 0, policy_version 95470 (0.0008) [2023-10-14 04:52:46,904][33201] Updated weights for policy 0, policy_version 95480 (0.0008) [2023-10-14 04:52:47,478][33226] Updated weights for policy 1, policy_version 96330 (0.0008) [2023-10-14 04:52:47,846][33226] Updated weights for policy 1, policy_version 96340 (0.0009) [2023-10-14 04:52:48,201][33226] Updated weights for policy 1, policy_version 96350 (0.0008) [2023-10-14 04:52:49,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 196444160. Throughput: 0: 1755.8, 1: 1783.2. Samples: 49116886. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:49,559][31953] Avg episode reward: [(0, '20.910'), (1, '20.850')] [2023-10-14 04:52:50,700][33201] Updated weights for policy 0, policy_version 95490 (0.0008) [2023-10-14 04:52:51,081][33201] Updated weights for policy 0, policy_version 95500 (0.0008) [2023-10-14 04:52:51,454][33201] Updated weights for policy 0, policy_version 95510 (0.0009) [2023-10-14 04:52:51,817][33201] Updated weights for policy 0, policy_version 95520 (0.0008) [2023-10-14 04:52:51,980][33226] Updated weights for policy 1, policy_version 96360 (0.0007) [2023-10-14 04:52:52,345][33226] Updated weights for policy 1, policy_version 96370 (0.0010) [2023-10-14 04:52:52,712][33226] Updated weights for policy 1, policy_version 96380 (0.0009) [2023-10-14 04:52:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 196509696. Throughput: 0: 1769.3, 1: 1772.0. Samples: 49138630. Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:54,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.850')] [2023-10-14 04:52:55,623][33201] Updated weights for policy 0, policy_version 95530 (0.0010) [2023-10-14 04:52:55,993][33201] Updated weights for policy 0, policy_version 95540 (0.0011) [2023-10-14 04:52:56,365][33201] Updated weights for policy 0, policy_version 95550 (0.0009) [2023-10-14 04:52:56,445][33226] Updated weights for policy 1, policy_version 96390 (0.0009) [2023-10-14 04:52:56,814][33226] Updated weights for policy 1, policy_version 96400 (0.0010) [2023-10-14 04:52:57,173][33226] Updated weights for policy 1, policy_version 96410 (0.0008) [2023-10-14 04:52:59,557][31953] Fps is (10 sec: 13107.7, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 196575232. Throughput: 0: 1760.4, 1: 1783.2. Samples: 49148782. 
Policy #0 lag: (min: 31.0, avg: 38.7, max: 63.0) [2023-10-14 04:52:59,558][31953] Avg episode reward: [(0, '20.900'), (1, '20.850')] [2023-10-14 04:53:00,044][33201] Updated weights for policy 0, policy_version 95560 (0.0009) [2023-10-14 04:53:00,416][33201] Updated weights for policy 0, policy_version 95570 (0.0009) [2023-10-14 04:53:00,783][33201] Updated weights for policy 0, policy_version 95580 (0.0008) [2023-10-14 04:53:01,130][33226] Updated weights for policy 1, policy_version 96420 (0.0007) [2023-10-14 04:53:01,502][33226] Updated weights for policy 1, policy_version 96430 (0.0009) [2023-10-14 04:53:01,867][33226] Updated weights for policy 1, policy_version 96440 (0.0008) [2023-10-14 04:53:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 196640768. Throughput: 0: 1764.9, 1: 1766.7. Samples: 49170650. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:04,558][31953] Avg episode reward: [(0, '20.860'), (1, '20.850')] [2023-10-14 04:53:04,763][33201] Updated weights for policy 0, policy_version 95590 (0.0010) [2023-10-14 04:53:05,152][33201] Updated weights for policy 0, policy_version 95600 (0.0009) [2023-10-14 04:53:05,523][33201] Updated weights for policy 0, policy_version 95610 (0.0008) [2023-10-14 04:53:05,650][33226] Updated weights for policy 1, policy_version 96450 (0.0009) [2023-10-14 04:53:06,013][33226] Updated weights for policy 1, policy_version 96460 (0.0010) [2023-10-14 04:53:06,372][33226] Updated weights for policy 1, policy_version 96470 (0.0008) [2023-10-14 04:53:06,744][33226] Updated weights for policy 1, policy_version 96480 (0.0010) [2023-10-14 04:53:09,224][33201] Updated weights for policy 0, policy_version 95620 (0.0008) [2023-10-14 04:53:09,557][31953] Fps is (10 sec: 13106.8, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 196706304. Throughput: 0: 1789.9, 1: 1766.3. Samples: 49192528. 
Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:09,558][31953] Avg episode reward: [(0, '20.840'), (1, '20.850')] [2023-10-14 04:53:09,592][33201] Updated weights for policy 0, policy_version 95630 (0.0007) [2023-10-14 04:53:09,963][33201] Updated weights for policy 0, policy_version 95640 (0.0007) [2023-10-14 04:53:10,618][33226] Updated weights for policy 1, policy_version 96490 (0.0009) [2023-10-14 04:53:10,982][33226] Updated weights for policy 1, policy_version 96500 (0.0007) [2023-10-14 04:53:11,342][33226] Updated weights for policy 1, policy_version 96510 (0.0008) [2023-10-14 04:53:13,678][33201] Updated weights for policy 0, policy_version 95650 (0.0007) [2023-10-14 04:53:14,046][33201] Updated weights for policy 0, policy_version 95660 (0.0008) [2023-10-14 04:53:14,406][33201] Updated weights for policy 0, policy_version 95670 (0.0009) [2023-10-14 04:53:14,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 196771840. Throughput: 0: 1759.7, 1: 1767.9. Samples: 49202168. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:14,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.850')] [2023-10-14 04:53:14,775][33201] Updated weights for policy 0, policy_version 95680 (0.0010) [2023-10-14 04:53:15,109][33226] Updated weights for policy 1, policy_version 96520 (0.0009) [2023-10-14 04:53:15,469][33226] Updated weights for policy 1, policy_version 96530 (0.0007) [2023-10-14 04:53:15,825][33226] Updated weights for policy 1, policy_version 96540 (0.0009) [2023-10-14 04:53:18,551][33201] Updated weights for policy 0, policy_version 95690 (0.0011) [2023-10-14 04:53:18,928][33201] Updated weights for policy 0, policy_version 95700 (0.0007) [2023-10-14 04:53:19,290][33201] Updated weights for policy 0, policy_version 95710 (0.0008) [2023-10-14 04:53:19,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 196870144. Throughput: 0: 1791.3, 1: 1761.6. 
Samples: 49224416. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:19,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.850')] [2023-10-14 04:53:19,618][33226] Updated weights for policy 1, policy_version 96550 (0.0010) [2023-10-14 04:53:19,996][33226] Updated weights for policy 1, policy_version 96560 (0.0010) [2023-10-14 04:53:20,367][33226] Updated weights for policy 1, policy_version 96570 (0.0007) [2023-10-14 04:53:23,005][33201] Updated weights for policy 0, policy_version 95720 (0.0007) [2023-10-14 04:53:23,378][33201] Updated weights for policy 0, policy_version 95730 (0.0008) [2023-10-14 04:53:23,756][33201] Updated weights for policy 0, policy_version 95740 (0.0007) [2023-10-14 04:53:24,074][33226] Updated weights for policy 1, policy_version 96580 (0.0008) [2023-10-14 04:53:24,451][33226] Updated weights for policy 1, policy_version 96590 (0.0008) [2023-10-14 04:53:24,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 196935680. Throughput: 0: 1762.3, 1: 1802.3. Samples: 49245396. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:24,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.980')] [2023-10-14 04:53:24,566][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth... [2023-10-14 04:53:24,597][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000094080_96337920.pth [2023-10-14 04:53:24,818][33226] Updated weights for policy 1, policy_version 96600 (0.0007) [2023-10-14 04:53:25,114][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000096608_98926592.pth... 
[2023-10-14 04:53:25,144][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000094912_97189888.pth [2023-10-14 04:53:27,692][33201] Updated weights for policy 0, policy_version 95750 (0.0007) [2023-10-14 04:53:28,063][33201] Updated weights for policy 0, policy_version 95760 (0.0007) [2023-10-14 04:53:28,436][33201] Updated weights for policy 0, policy_version 95770 (0.0007) [2023-10-14 04:53:28,744][33226] Updated weights for policy 1, policy_version 96610 (0.0010) [2023-10-14 04:53:29,099][33226] Updated weights for policy 1, policy_version 96620 (0.0009) [2023-10-14 04:53:29,474][33226] Updated weights for policy 1, policy_version 96630 (0.0009) [2023-10-14 04:53:29,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197001216. Throughput: 0: 1784.8, 1: 1771.2. Samples: 49256390. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:29,558][31953] Avg episode reward: [(0, '20.820'), (1, '20.980')] [2023-10-14 04:53:29,830][33226] Updated weights for policy 1, policy_version 96640 (0.0011) [2023-10-14 04:53:32,361][33201] Updated weights for policy 0, policy_version 95780 (0.0008) [2023-10-14 04:53:32,723][33201] Updated weights for policy 0, policy_version 95790 (0.0008) [2023-10-14 04:53:33,093][33201] Updated weights for policy 0, policy_version 95800 (0.0008) [2023-10-14 04:53:33,467][33226] Updated weights for policy 1, policy_version 96650 (0.0008) [2023-10-14 04:53:33,843][33226] Updated weights for policy 1, policy_version 96660 (0.0007) [2023-10-14 04:53:34,201][33226] Updated weights for policy 1, policy_version 96670 (0.0008) [2023-10-14 04:53:34,557][31953] Fps is (10 sec: 16384.3, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 197099520. Throughput: 0: 1770.8, 1: 1798.1. Samples: 49277484. 
Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:34,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.980')] [2023-10-14 04:53:36,968][33201] Updated weights for policy 0, policy_version 95810 (0.0008) [2023-10-14 04:53:37,336][33201] Updated weights for policy 0, policy_version 95820 (0.0008) [2023-10-14 04:53:37,704][33201] Updated weights for policy 0, policy_version 95830 (0.0007) [2023-10-14 04:53:38,023][33226] Updated weights for policy 1, policy_version 96680 (0.0008) [2023-10-14 04:53:38,073][33201] Updated weights for policy 0, policy_version 95840 (0.0008) [2023-10-14 04:53:38,402][33226] Updated weights for policy 1, policy_version 96690 (0.0008) [2023-10-14 04:53:38,767][33226] Updated weights for policy 1, policy_version 96700 (0.0010) [2023-10-14 04:53:39,557][31953] Fps is (10 sec: 16383.7, 60 sec: 14199.5, 300 sec: 14329.0). Total num frames: 197165056. Throughput: 0: 1762.0, 1: 1773.2. Samples: 49297714. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:39,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.980')] [2023-10-14 04:53:41,772][33201] Updated weights for policy 0, policy_version 95850 (0.0008) [2023-10-14 04:53:42,139][33201] Updated weights for policy 0, policy_version 95860 (0.0007) [2023-10-14 04:53:42,512][33201] Updated weights for policy 0, policy_version 95870 (0.0007) [2023-10-14 04:53:42,758][33226] Updated weights for policy 1, policy_version 96710 (0.0008) [2023-10-14 04:53:43,128][33226] Updated weights for policy 1, policy_version 96720 (0.0008) [2023-10-14 04:53:43,497][33226] Updated weights for policy 1, policy_version 96730 (0.0008) [2023-10-14 04:53:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 197230592. Throughput: 0: 1777.2, 1: 1789.9. Samples: 49309302. 
Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:44,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.980')] [2023-10-14 04:53:46,431][33201] Updated weights for policy 0, policy_version 95880 (0.0010) [2023-10-14 04:53:46,799][33201] Updated weights for policy 0, policy_version 95890 (0.0008) [2023-10-14 04:53:47,172][33201] Updated weights for policy 0, policy_version 95900 (0.0010) [2023-10-14 04:53:47,329][33226] Updated weights for policy 1, policy_version 96740 (0.0009) [2023-10-14 04:53:47,694][33226] Updated weights for policy 1, policy_version 96750 (0.0010) [2023-10-14 04:53:48,069][33226] Updated weights for policy 1, policy_version 96760 (0.0010) [2023-10-14 04:53:49,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197296128. Throughput: 0: 1758.7, 1: 1779.5. Samples: 49329866. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:49,558][31953] Avg episode reward: [(0, '20.800'), (1, '20.960')] [2023-10-14 04:53:51,178][33201] Updated weights for policy 0, policy_version 95910 (0.0009) [2023-10-14 04:53:51,560][33201] Updated weights for policy 0, policy_version 95920 (0.0008) [2023-10-14 04:53:51,646][33226] Updated weights for policy 1, policy_version 96770 (0.0011) [2023-10-14 04:53:51,934][33201] Updated weights for policy 0, policy_version 95930 (0.0009) [2023-10-14 04:53:52,016][33226] Updated weights for policy 1, policy_version 96780 (0.0008) [2023-10-14 04:53:52,380][33226] Updated weights for policy 1, policy_version 96790 (0.0008) [2023-10-14 04:53:52,742][33226] Updated weights for policy 1, policy_version 96800 (0.0007) [2023-10-14 04:53:54,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197361664. Throughput: 0: 1755.7, 1: 1775.9. Samples: 49351448. 
Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:54,557][31953] Avg episode reward: [(0, '20.800'), (1, '20.960')] [2023-10-14 04:53:55,773][33201] Updated weights for policy 0, policy_version 95940 (0.0010) [2023-10-14 04:53:56,141][33201] Updated weights for policy 0, policy_version 95950 (0.0010) [2023-10-14 04:53:56,437][33226] Updated weights for policy 1, policy_version 96810 (0.0008) [2023-10-14 04:53:56,508][33201] Updated weights for policy 0, policy_version 95960 (0.0009) [2023-10-14 04:53:56,808][33226] Updated weights for policy 1, policy_version 96820 (0.0009) [2023-10-14 04:53:57,168][33226] Updated weights for policy 1, policy_version 96830 (0.0008) [2023-10-14 04:53:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197427200. Throughput: 0: 1753.1, 1: 1787.3. Samples: 49361486. Policy #0 lag: (min: 2.0, avg: 10.3, max: 34.0) [2023-10-14 04:53:59,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:00,387][33201] Updated weights for policy 0, policy_version 95970 (0.0009) [2023-10-14 04:54:00,762][33201] Updated weights for policy 0, policy_version 95980 (0.0007) [2023-10-14 04:54:00,997][33226] Updated weights for policy 1, policy_version 96840 (0.0009) [2023-10-14 04:54:01,137][33201] Updated weights for policy 0, policy_version 95990 (0.0007) [2023-10-14 04:54:01,369][33226] Updated weights for policy 1, policy_version 96850 (0.0007) [2023-10-14 04:54:01,505][33201] Updated weights for policy 0, policy_version 96000 (0.0009) [2023-10-14 04:54:01,737][33226] Updated weights for policy 1, policy_version 96860 (0.0008) [2023-10-14 04:54:04,557][31953] Fps is (10 sec: 13106.7, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 197492736. Throughput: 0: 1751.6, 1: 1780.8. Samples: 49383374. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:04,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:05,291][33201] Updated weights for policy 0, policy_version 96010 (0.0008) [2023-10-14 04:54:05,568][33226] Updated weights for policy 1, policy_version 96870 (0.0009) [2023-10-14 04:54:05,658][33201] Updated weights for policy 0, policy_version 96020 (0.0008) [2023-10-14 04:54:05,932][33226] Updated weights for policy 1, policy_version 96880 (0.0009) [2023-10-14 04:54:06,032][33201] Updated weights for policy 0, policy_version 96030 (0.0009) [2023-10-14 04:54:06,285][33226] Updated weights for policy 1, policy_version 96890 (0.0010) [2023-10-14 04:54:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 197558272. Throughput: 0: 1783.3, 1: 1774.8. Samples: 49405510. Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:09,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:09,809][33201] Updated weights for policy 0, policy_version 96040 (0.0009) [2023-10-14 04:54:10,179][33201] Updated weights for policy 0, policy_version 96050 (0.0009) [2023-10-14 04:54:10,206][33226] Updated weights for policy 1, policy_version 96900 (0.0008) [2023-10-14 04:54:10,545][33201] Updated weights for policy 0, policy_version 96060 (0.0008) [2023-10-14 04:54:10,602][33226] Updated weights for policy 1, policy_version 96910 (0.0007) [2023-10-14 04:54:10,971][33226] Updated weights for policy 1, policy_version 96920 (0.0008) [2023-10-14 04:54:14,291][33201] Updated weights for policy 0, policy_version 96070 (0.0008) [2023-10-14 04:54:14,557][31953] Fps is (10 sec: 13107.6, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 197623808. Throughput: 0: 1751.4, 1: 1773.2. Samples: 49414998. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:14,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:14,664][33201] Updated weights for policy 0, policy_version 96080 (0.0009) [2023-10-14 04:54:14,888][33226] Updated weights for policy 1, policy_version 96930 (0.0011) [2023-10-14 04:54:15,032][33201] Updated weights for policy 0, policy_version 96090 (0.0007) [2023-10-14 04:54:15,255][33226] Updated weights for policy 1, policy_version 96940 (0.0008) [2023-10-14 04:54:15,617][33226] Updated weights for policy 1, policy_version 96950 (0.0009) [2023-10-14 04:54:15,981][33226] Updated weights for policy 1, policy_version 96960 (0.0008) [2023-10-14 04:54:18,894][33201] Updated weights for policy 0, policy_version 96100 (0.0008) [2023-10-14 04:54:19,257][33201] Updated weights for policy 0, policy_version 96110 (0.0008) [2023-10-14 04:54:19,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 197689344. Throughput: 0: 1778.1, 1: 1764.9. Samples: 49436922. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:19,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:19,638][33201] Updated weights for policy 0, policy_version 96120 (0.0008) [2023-10-14 04:54:19,796][33226] Updated weights for policy 1, policy_version 96970 (0.0009) [2023-10-14 04:54:20,159][33226] Updated weights for policy 1, policy_version 96980 (0.0011) [2023-10-14 04:54:20,525][33226] Updated weights for policy 1, policy_version 96990 (0.0011) [2023-10-14 04:54:23,215][33201] Updated weights for policy 0, policy_version 96130 (0.0009) [2023-10-14 04:54:23,590][33201] Updated weights for policy 0, policy_version 96140 (0.0010) [2023-10-14 04:54:23,964][33201] Updated weights for policy 0, policy_version 96150 (0.0010) [2023-10-14 04:54:24,293][33226] Updated weights for policy 1, policy_version 97000 (0.0008) [2023-10-14 04:54:24,334][33201] Updated weights for policy 0, policy_version 96160 (0.0008) [2023-10-14 04:54:24,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197787648. Throughput: 0: 1763.7, 1: 1800.5. Samples: 49458104. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:24,558][31953] Avg episode reward: [(0, '20.880'), (1, '20.960')] [2023-10-14 04:54:24,661][33226] Updated weights for policy 1, policy_version 97010 (0.0009) [2023-10-14 04:54:25,024][33226] Updated weights for policy 1, policy_version 97020 (0.0010) [2023-10-14 04:54:28,244][33201] Updated weights for policy 0, policy_version 96170 (0.0008) [2023-10-14 04:54:28,602][33201] Updated weights for policy 0, policy_version 96180 (0.0009) [2023-10-14 04:54:28,802][33226] Updated weights for policy 1, policy_version 97030 (0.0010) [2023-10-14 04:54:28,974][33201] Updated weights for policy 0, policy_version 96190 (0.0008) [2023-10-14 04:54:29,160][33226] Updated weights for policy 1, policy_version 97040 (0.0010) [2023-10-14 04:54:29,528][33226] Updated weights for policy 1, policy_version 97050 (0.0009) [2023-10-14 04:54:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 197853184. Throughput: 0: 1772.9, 1: 1768.1. Samples: 49468646. Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:29,557][31953] Avg episode reward: [(0, '20.900'), (1, '20.960')] [2023-10-14 04:54:33,000][33201] Updated weights for policy 0, policy_version 96200 (0.0008) [2023-10-14 04:54:33,299][33226] Updated weights for policy 1, policy_version 97060 (0.0007) [2023-10-14 04:54:33,369][33201] Updated weights for policy 0, policy_version 96210 (0.0009) [2023-10-14 04:54:33,662][33226] Updated weights for policy 1, policy_version 97070 (0.0007) [2023-10-14 04:54:33,734][33201] Updated weights for policy 0, policy_version 96220 (0.0009) [2023-10-14 04:54:34,028][33226] Updated weights for policy 1, policy_version 97080 (0.0007) [2023-10-14 04:54:34,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.4, 300 sec: 14329.1). Total num frames: 197951488. Throughput: 0: 1772.8, 1: 1790.6. Samples: 49490220. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:34,558][31953] Avg episode reward: [(0, '20.930'), (1, '20.980')] [2023-10-14 04:54:37,575][33201] Updated weights for policy 0, policy_version 96230 (0.0008) [2023-10-14 04:54:37,877][33226] Updated weights for policy 1, policy_version 97090 (0.0008) [2023-10-14 04:54:37,961][33201] Updated weights for policy 0, policy_version 96240 (0.0010) [2023-10-14 04:54:38,248][33226] Updated weights for policy 1, policy_version 97100 (0.0008) [2023-10-14 04:54:38,335][33201] Updated weights for policy 0, policy_version 96250 (0.0007) [2023-10-14 04:54:38,609][33226] Updated weights for policy 1, policy_version 97110 (0.0008) [2023-10-14 04:54:38,968][33226] Updated weights for policy 1, policy_version 97120 (0.0007) [2023-10-14 04:54:39,557][31953] Fps is (10 sec: 16383.6, 60 sec: 14199.5, 300 sec: 14329.1). Total num frames: 198017024. Throughput: 0: 1757.0, 1: 1760.1. Samples: 49509718. Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:39,558][31953] Avg episode reward: [(0, '20.940'), (1, '20.980')] [2023-10-14 04:54:42,176][33201] Updated weights for policy 0, policy_version 96260 (0.0007) [2023-10-14 04:54:42,537][33201] Updated weights for policy 0, policy_version 96270 (0.0007) [2023-10-14 04:54:42,770][33226] Updated weights for policy 1, policy_version 97130 (0.0007) [2023-10-14 04:54:42,901][33201] Updated weights for policy 0, policy_version 96280 (0.0008) [2023-10-14 04:54:43,129][33226] Updated weights for policy 1, policy_version 97140 (0.0008) [2023-10-14 04:54:43,501][33226] Updated weights for policy 1, policy_version 97150 (0.0008) [2023-10-14 04:54:44,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 198082560. Throughput: 0: 1787.3, 1: 1782.2. Samples: 49522114. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:44,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:54:46,787][33201] Updated weights for policy 0, policy_version 96290 (0.0008) [2023-10-14 04:54:47,158][33201] Updated weights for policy 0, policy_version 96300 (0.0007) [2023-10-14 04:54:47,230][33226] Updated weights for policy 1, policy_version 97160 (0.0007) [2023-10-14 04:54:47,539][33201] Updated weights for policy 0, policy_version 96310 (0.0008) [2023-10-14 04:54:47,592][33226] Updated weights for policy 1, policy_version 97170 (0.0008) [2023-10-14 04:54:47,904][33201] Updated weights for policy 0, policy_version 96320 (0.0007) [2023-10-14 04:54:47,954][33226] Updated weights for policy 1, policy_version 97180 (0.0008) [2023-10-14 04:54:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 198148096. Throughput: 0: 1756.2, 1: 1765.3. Samples: 49541838. Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.980')] [2023-10-14 04:54:51,553][33201] Updated weights for policy 0, policy_version 96330 (0.0008) [2023-10-14 04:54:51,914][33226] Updated weights for policy 1, policy_version 97190 (0.0008) [2023-10-14 04:54:51,918][33201] Updated weights for policy 0, policy_version 96340 (0.0007) [2023-10-14 04:54:52,281][33226] Updated weights for policy 1, policy_version 97200 (0.0009) [2023-10-14 04:54:52,285][33201] Updated weights for policy 0, policy_version 96350 (0.0007) [2023-10-14 04:54:52,652][33226] Updated weights for policy 1, policy_version 97210 (0.0007) [2023-10-14 04:54:54,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 198213632. Throughput: 0: 1755.4, 1: 1758.8. Samples: 49563650. 
Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:54,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.980')] [2023-10-14 04:54:56,095][33201] Updated weights for policy 0, policy_version 96360 (0.0008) [2023-10-14 04:54:56,451][33201] Updated weights for policy 0, policy_version 96370 (0.0009) [2023-10-14 04:54:56,492][33226] Updated weights for policy 1, policy_version 97220 (0.0007) [2023-10-14 04:54:56,819][33201] Updated weights for policy 0, policy_version 96380 (0.0007) [2023-10-14 04:54:56,887][33226] Updated weights for policy 1, policy_version 97230 (0.0008) [2023-10-14 04:54:57,256][33226] Updated weights for policy 1, policy_version 97240 (0.0008) [2023-10-14 04:54:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 198279168. Throughput: 0: 1754.9, 1: 1778.0. Samples: 49573982. Policy #0 lag: (min: 25.0, avg: 40.1, max: 57.0) [2023-10-14 04:54:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.980')] [2023-10-14 04:55:00,714][33201] Updated weights for policy 0, policy_version 96390 (0.0007) [2023-10-14 04:55:00,943][33226] Updated weights for policy 1, policy_version 97250 (0.0008) [2023-10-14 04:55:01,077][33201] Updated weights for policy 0, policy_version 96400 (0.0008) [2023-10-14 04:55:01,304][33226] Updated weights for policy 1, policy_version 97260 (0.0008) [2023-10-14 04:55:01,444][33201] Updated weights for policy 0, policy_version 96410 (0.0009) [2023-10-14 04:55:01,659][33226] Updated weights for policy 1, policy_version 97270 (0.0008) [2023-10-14 04:55:02,036][33226] Updated weights for policy 1, policy_version 97280 (0.0010) [2023-10-14 04:55:04,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 198344704. Throughput: 0: 1752.5, 1: 1764.8. Samples: 49595202. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:04,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:55:05,389][33201] Updated weights for policy 0, policy_version 96420 (0.0009) [2023-10-14 04:55:05,771][33201] Updated weights for policy 0, policy_version 96430 (0.0009) [2023-10-14 04:55:05,908][33226] Updated weights for policy 1, policy_version 97290 (0.0007) [2023-10-14 04:55:06,132][33201] Updated weights for policy 0, policy_version 96440 (0.0008) [2023-10-14 04:55:06,280][33226] Updated weights for policy 1, policy_version 97300 (0.0008) [2023-10-14 04:55:06,646][33226] Updated weights for policy 1, policy_version 97310 (0.0010) [2023-10-14 04:55:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 198410240. Throughput: 0: 1770.2, 1: 1770.5. Samples: 49617434. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:09,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:55:10,041][33201] Updated weights for policy 0, policy_version 96450 (0.0007) [2023-10-14 04:55:10,351][33226] Updated weights for policy 1, policy_version 97320 (0.0008) [2023-10-14 04:55:10,418][33201] Updated weights for policy 0, policy_version 96460 (0.0007) [2023-10-14 04:55:10,719][33226] Updated weights for policy 1, policy_version 97330 (0.0008) [2023-10-14 04:55:10,784][33201] Updated weights for policy 0, policy_version 96470 (0.0009) [2023-10-14 04:55:11,073][33226] Updated weights for policy 1, policy_version 97340 (0.0007) [2023-10-14 04:55:11,146][33201] Updated weights for policy 0, policy_version 96480 (0.0009) [2023-10-14 04:55:14,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 198475776. Throughput: 0: 1742.3, 1: 1775.7. Samples: 49626956. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:14,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.960')] [2023-10-14 04:55:14,809][33226] Updated weights for policy 1, policy_version 97350 (0.0008) [2023-10-14 04:55:15,081][33201] Updated weights for policy 0, policy_version 96490 (0.0007) [2023-10-14 04:55:15,163][33226] Updated weights for policy 1, policy_version 97360 (0.0008) [2023-10-14 04:55:15,452][33201] Updated weights for policy 0, policy_version 96500 (0.0007) [2023-10-14 04:55:15,529][33226] Updated weights for policy 1, policy_version 97370 (0.0008) [2023-10-14 04:55:15,818][33201] Updated weights for policy 0, policy_version 96510 (0.0008) [2023-10-14 04:55:19,251][33226] Updated weights for policy 1, policy_version 97380 (0.0008) [2023-10-14 04:55:19,537][33201] Updated weights for policy 0, policy_version 96520 (0.0008) [2023-10-14 04:55:19,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 198541312. Throughput: 0: 1761.4, 1: 1772.6. Samples: 49649248. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:19,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 04:55:19,617][33226] Updated weights for policy 1, policy_version 97390 (0.0008) [2023-10-14 04:55:19,907][33201] Updated weights for policy 0, policy_version 96530 (0.0008) [2023-10-14 04:55:19,985][33226] Updated weights for policy 1, policy_version 97400 (0.0009) [2023-10-14 04:55:20,281][33201] Updated weights for policy 0, policy_version 96540 (0.0009) [2023-10-14 04:55:23,756][33226] Updated weights for policy 1, policy_version 97410 (0.0009) [2023-10-14 04:55:24,121][33226] Updated weights for policy 1, policy_version 97420 (0.0008) [2023-10-14 04:55:24,206][33201] Updated weights for policy 0, policy_version 96550 (0.0010) [2023-10-14 04:55:24,491][33226] Updated weights for policy 1, policy_version 97430 (0.0008) [2023-10-14 04:55:24,557][31953] Fps is (10 sec: 13107.2, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 198606848. Throughput: 0: 1774.3, 1: 1805.5. Samples: 49670806. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 04:55:24,599][33201] Updated weights for policy 0, policy_version 96560 (0.0009) [2023-10-14 04:55:24,855][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000097440_99778560.pth... [2023-10-14 04:55:24,859][33226] Updated weights for policy 1, policy_version 97440 (0.0009) [2023-10-14 04:55:24,893][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000095744_98041856.pth [2023-10-14 04:55:24,970][33201] Updated weights for policy 0, policy_version 96570 (0.0008) [2023-10-14 04:55:25,186][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000096576_98893824.pth... 
[2023-10-14 04:55:25,215][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000094912_97189888.pth [2023-10-14 04:55:28,620][33226] Updated weights for policy 1, policy_version 97450 (0.0007) [2023-10-14 04:55:28,683][33201] Updated weights for policy 0, policy_version 96580 (0.0008) [2023-10-14 04:55:28,985][33226] Updated weights for policy 1, policy_version 97460 (0.0007) [2023-10-14 04:55:29,053][33201] Updated weights for policy 0, policy_version 96590 (0.0007) [2023-10-14 04:55:29,357][33226] Updated weights for policy 1, policy_version 97470 (0.0008) [2023-10-14 04:55:29,420][33201] Updated weights for policy 0, policy_version 96600 (0.0008) [2023-10-14 04:55:29,557][31953] Fps is (10 sec: 16384.1, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 198705152. Throughput: 0: 1750.7, 1: 1780.1. Samples: 49681000. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:29,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 04:55:33,116][33226] Updated weights for policy 1, policy_version 97480 (0.0009) [2023-10-14 04:55:33,343][33201] Updated weights for policy 0, policy_version 96610 (0.0007) [2023-10-14 04:55:33,491][33226] Updated weights for policy 1, policy_version 97490 (0.0009) [2023-10-14 04:55:33,697][33201] Updated weights for policy 0, policy_version 96620 (0.0009) [2023-10-14 04:55:33,856][33226] Updated weights for policy 1, policy_version 97500 (0.0008) [2023-10-14 04:55:34,065][33201] Updated weights for policy 0, policy_version 96630 (0.0009) [2023-10-14 04:55:34,437][33201] Updated weights for policy 0, policy_version 96640 (0.0007) [2023-10-14 04:55:34,557][31953] Fps is (10 sec: 19660.6, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 198803456. Throughput: 0: 1779.6, 1: 1801.3. Samples: 49702978. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:34,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.910')] [2023-10-14 04:55:37,792][33226] Updated weights for policy 1, policy_version 97510 (0.0009) [2023-10-14 04:55:38,088][33201] Updated weights for policy 0, policy_version 96650 (0.0008) [2023-10-14 04:55:38,146][33226] Updated weights for policy 1, policy_version 97520 (0.0007) [2023-10-14 04:55:38,453][33201] Updated weights for policy 0, policy_version 96660 (0.0008) [2023-10-14 04:55:38,506][33226] Updated weights for policy 1, policy_version 97530 (0.0007) [2023-10-14 04:55:38,812][33201] Updated weights for policy 0, policy_version 96670 (0.0008) [2023-10-14 04:55:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 198868992. Throughput: 0: 1748.3, 1: 1778.8. Samples: 49722366. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:39,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 04:55:42,377][33226] Updated weights for policy 1, policy_version 97540 (0.0008) [2023-10-14 04:55:42,556][33201] Updated weights for policy 0, policy_version 96680 (0.0008) [2023-10-14 04:55:42,768][33226] Updated weights for policy 1, policy_version 97550 (0.0009) [2023-10-14 04:55:42,930][33201] Updated weights for policy 0, policy_version 96690 (0.0009) [2023-10-14 04:55:43,129][33226] Updated weights for policy 1, policy_version 97560 (0.0009) [2023-10-14 04:55:43,293][33201] Updated weights for policy 0, policy_version 96700 (0.0009) [2023-10-14 04:55:44,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 198934528. Throughput: 0: 1781.7, 1: 1794.0. Samples: 49734888. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:44,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.900')] [2023-10-14 04:55:46,783][33226] Updated weights for policy 1, policy_version 97570 (0.0008) [2023-10-14 04:55:47,146][33226] Updated weights for policy 1, policy_version 97580 (0.0008) [2023-10-14 04:55:47,234][33201] Updated weights for policy 0, policy_version 96710 (0.0010) [2023-10-14 04:55:47,516][33226] Updated weights for policy 1, policy_version 97590 (0.0008) [2023-10-14 04:55:47,602][33201] Updated weights for policy 0, policy_version 96720 (0.0007) [2023-10-14 04:55:47,878][33226] Updated weights for policy 1, policy_version 97600 (0.0007) [2023-10-14 04:55:47,977][33201] Updated weights for policy 0, policy_version 96730 (0.0007) [2023-10-14 04:55:49,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 199000064. Throughput: 0: 1754.8, 1: 1779.0. Samples: 49754220. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:49,557][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 04:55:51,788][33226] Updated weights for policy 1, policy_version 97610 (0.0008) [2023-10-14 04:55:51,869][33201] Updated weights for policy 0, policy_version 96740 (0.0008) [2023-10-14 04:55:52,157][33226] Updated weights for policy 1, policy_version 97620 (0.0009) [2023-10-14 04:55:52,240][33201] Updated weights for policy 0, policy_version 96750 (0.0009) [2023-10-14 04:55:52,521][33226] Updated weights for policy 1, policy_version 97630 (0.0008) [2023-10-14 04:55:52,610][33201] Updated weights for policy 0, policy_version 96760 (0.0007) [2023-10-14 04:55:54,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 199065600. Throughput: 0: 1745.1, 1: 1775.8. Samples: 49775872. 
Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:54,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:55:56,303][33226] Updated weights for policy 1, policy_version 97640 (0.0008) [2023-10-14 04:55:56,380][33201] Updated weights for policy 0, policy_version 96770 (0.0008) [2023-10-14 04:55:56,664][33226] Updated weights for policy 1, policy_version 97650 (0.0010) [2023-10-14 04:55:56,747][33201] Updated weights for policy 0, policy_version 96780 (0.0007) [2023-10-14 04:55:57,037][33226] Updated weights for policy 1, policy_version 97660 (0.0009) [2023-10-14 04:55:57,110][33201] Updated weights for policy 0, policy_version 96790 (0.0008) [2023-10-14 04:55:57,482][33201] Updated weights for policy 0, policy_version 96800 (0.0008) [2023-10-14 04:55:59,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 199131136. Throughput: 0: 1761.5, 1: 1779.6. Samples: 49786306. Policy #0 lag: (min: 30.0, avg: 32.4, max: 62.0) [2023-10-14 04:55:59,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:56:00,868][33226] Updated weights for policy 1, policy_version 97670 (0.0010) [2023-10-14 04:56:01,237][33226] Updated weights for policy 1, policy_version 97680 (0.0009) [2023-10-14 04:56:01,330][33201] Updated weights for policy 0, policy_version 96810 (0.0009) [2023-10-14 04:56:01,598][33226] Updated weights for policy 1, policy_version 97690 (0.0010) [2023-10-14 04:56:01,704][33201] Updated weights for policy 0, policy_version 96820 (0.0007) [2023-10-14 04:56:02,085][33201] Updated weights for policy 0, policy_version 96830 (0.0007) [2023-10-14 04:56:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199196672. Throughput: 0: 1741.4, 1: 1770.9. Samples: 49807302. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:04,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:56:05,372][33226] Updated weights for policy 1, policy_version 97700 (0.0008) [2023-10-14 04:56:05,732][33226] Updated weights for policy 1, policy_version 97710 (0.0009) [2023-10-14 04:56:05,897][33201] Updated weights for policy 0, policy_version 96840 (0.0008) [2023-10-14 04:56:06,095][33226] Updated weights for policy 1, policy_version 97720 (0.0008) [2023-10-14 04:56:06,268][33201] Updated weights for policy 0, policy_version 96850 (0.0010) [2023-10-14 04:56:06,629][33201] Updated weights for policy 0, policy_version 96860 (0.0007) [2023-10-14 04:56:09,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 199262208. Throughput: 0: 1756.8, 1: 1772.3. Samples: 49829618. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:09,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:56:09,973][33226] Updated weights for policy 1, policy_version 97730 (0.0009) [2023-10-14 04:56:10,345][33226] Updated weights for policy 1, policy_version 97740 (0.0009) [2023-10-14 04:56:10,487][33201] Updated weights for policy 0, policy_version 96870 (0.0007) [2023-10-14 04:56:10,704][33226] Updated weights for policy 1, policy_version 97750 (0.0008) [2023-10-14 04:56:10,874][33201] Updated weights for policy 0, policy_version 96880 (0.0008) [2023-10-14 04:56:11,070][33226] Updated weights for policy 1, policy_version 97760 (0.0008) [2023-10-14 04:56:11,248][33201] Updated weights for policy 0, policy_version 96890 (0.0009) [2023-10-14 04:56:14,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 199327744. Throughput: 0: 1750.6, 1: 1763.6. Samples: 49839138. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:14,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:56:14,898][33226] Updated weights for policy 1, policy_version 97770 (0.0009) [2023-10-14 04:56:15,277][33226] Updated weights for policy 1, policy_version 97780 (0.0008) [2023-10-14 04:56:15,289][33201] Updated weights for policy 0, policy_version 96900 (0.0008) [2023-10-14 04:56:15,643][33226] Updated weights for policy 1, policy_version 97790 (0.0008) [2023-10-14 04:56:15,664][33201] Updated weights for policy 0, policy_version 96910 (0.0007) [2023-10-14 04:56:16,030][33201] Updated weights for policy 0, policy_version 96920 (0.0010) [2023-10-14 04:56:19,364][33226] Updated weights for policy 1, policy_version 97800 (0.0007) [2023-10-14 04:56:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 199393280. Throughput: 0: 1750.3, 1: 1762.4. Samples: 49861050. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:19,558][31953] Avg episode reward: [(0, '20.950'), (1, '20.870')] [2023-10-14 04:56:19,729][33226] Updated weights for policy 1, policy_version 97810 (0.0008) [2023-10-14 04:56:19,804][33201] Updated weights for policy 0, policy_version 96930 (0.0011) [2023-10-14 04:56:20,099][33226] Updated weights for policy 1, policy_version 97820 (0.0009) [2023-10-14 04:56:20,178][33201] Updated weights for policy 0, policy_version 96940 (0.0008) [2023-10-14 04:56:20,543][33201] Updated weights for policy 0, policy_version 96950 (0.0009) [2023-10-14 04:56:20,916][33201] Updated weights for policy 0, policy_version 96960 (0.0010) [2023-10-14 04:56:23,893][33226] Updated weights for policy 1, policy_version 97830 (0.0007) [2023-10-14 04:56:24,256][33226] Updated weights for policy 1, policy_version 97840 (0.0007) [2023-10-14 04:56:24,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 199458816. Throughput: 0: 1776.2, 1: 1791.0. 
Samples: 49882892. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:24,558][31953] Avg episode reward: [(0, '20.970'), (1, '20.870')] [2023-10-14 04:56:24,631][33226] Updated weights for policy 1, policy_version 97850 (0.0007) [2023-10-14 04:56:24,689][33201] Updated weights for policy 0, policy_version 96970 (0.0007) [2023-10-14 04:56:25,060][33201] Updated weights for policy 0, policy_version 96980 (0.0009) [2023-10-14 04:56:25,427][33201] Updated weights for policy 0, policy_version 96990 (0.0010) [2023-10-14 04:56:28,468][33226] Updated weights for policy 1, policy_version 97860 (0.0008) [2023-10-14 04:56:28,861][33226] Updated weights for policy 1, policy_version 97870 (0.0007) [2023-10-14 04:56:29,101][33201] Updated weights for policy 0, policy_version 97000 (0.0008) [2023-10-14 04:56:29,234][33226] Updated weights for policy 1, policy_version 97880 (0.0008) [2023-10-14 04:56:29,471][33201] Updated weights for policy 0, policy_version 97010 (0.0007) [2023-10-14 04:56:29,557][31953] Fps is (10 sec: 16384.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199557120. Throughput: 0: 1740.7, 1: 1769.2. Samples: 49892832. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:29,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.870')] [2023-10-14 04:56:29,840][33201] Updated weights for policy 0, policy_version 97020 (0.0010) [2023-10-14 04:56:33,012][33226] Updated weights for policy 1, policy_version 97890 (0.0008) [2023-10-14 04:56:33,382][33226] Updated weights for policy 1, policy_version 97900 (0.0008) [2023-10-14 04:56:33,687][33201] Updated weights for policy 0, policy_version 97030 (0.0007) [2023-10-14 04:56:33,748][33226] Updated weights for policy 1, policy_version 97910 (0.0008) [2023-10-14 04:56:34,060][33201] Updated weights for policy 0, policy_version 97040 (0.0007) [2023-10-14 04:56:34,104][33226] Updated weights for policy 1, policy_version 97920 (0.0008) [2023-10-14 04:56:34,440][33201] Updated weights for policy 0, policy_version 97050 (0.0009) [2023-10-14 04:56:34,557][31953] Fps is (10 sec: 16383.8, 60 sec: 13653.3, 300 sec: 14106.9). Total num frames: 199622656. Throughput: 0: 1771.5, 1: 1794.8. Samples: 49914708. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:34,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.870')] [2023-10-14 04:56:38,018][33226] Updated weights for policy 1, policy_version 97930 (0.0007) [2023-10-14 04:56:38,331][33201] Updated weights for policy 0, policy_version 97060 (0.0008) [2023-10-14 04:56:38,391][33226] Updated weights for policy 1, policy_version 97940 (0.0008) [2023-10-14 04:56:38,706][33201] Updated weights for policy 0, policy_version 97070 (0.0007) [2023-10-14 04:56:38,762][33226] Updated weights for policy 1, policy_version 97950 (0.0008) [2023-10-14 04:56:39,074][33201] Updated weights for policy 0, policy_version 97080 (0.0007) [2023-10-14 04:56:39,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199720960. Throughput: 0: 1763.0, 1: 1759.3. Samples: 49934376. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:39,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.870')] [2023-10-14 04:56:42,376][33226] Updated weights for policy 1, policy_version 97960 (0.0009) [2023-10-14 04:56:42,745][33226] Updated weights for policy 1, policy_version 97970 (0.0008) [2023-10-14 04:56:42,912][33201] Updated weights for policy 0, policy_version 97090 (0.0009) [2023-10-14 04:56:43,111][33226] Updated weights for policy 1, policy_version 97980 (0.0009) [2023-10-14 04:56:43,284][33201] Updated weights for policy 0, policy_version 97100 (0.0009) [2023-10-14 04:56:43,654][33201] Updated weights for policy 0, policy_version 97110 (0.0009) [2023-10-14 04:56:44,015][33201] Updated weights for policy 0, policy_version 97120 (0.0008) [2023-10-14 04:56:44,557][31953] Fps is (10 sec: 16384.2, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 199786496. Throughput: 0: 1771.9, 1: 1792.8. Samples: 49946716. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:44,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.830')] [2023-10-14 04:56:46,864][33226] Updated weights for policy 1, policy_version 97990 (0.0007) [2023-10-14 04:56:47,228][33226] Updated weights for policy 1, policy_version 98000 (0.0007) [2023-10-14 04:56:47,594][33226] Updated weights for policy 1, policy_version 98010 (0.0008) [2023-10-14 04:56:48,044][33201] Updated weights for policy 0, policy_version 97130 (0.0008) [2023-10-14 04:56:48,414][33201] Updated weights for policy 0, policy_version 97140 (0.0008) [2023-10-14 04:56:48,800][33201] Updated weights for policy 0, policy_version 97150 (0.0010) [2023-10-14 04:56:49,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199852032. Throughput: 0: 1775.9, 1: 1769.1. Samples: 49966826. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:49,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.850')] [2023-10-14 04:56:51,543][33226] Updated weights for policy 1, policy_version 98020 (0.0009) [2023-10-14 04:56:51,908][33226] Updated weights for policy 1, policy_version 98030 (0.0010) [2023-10-14 04:56:52,280][33226] Updated weights for policy 1, policy_version 98040 (0.0011) [2023-10-14 04:56:52,459][33201] Updated weights for policy 0, policy_version 97160 (0.0009) [2023-10-14 04:56:52,839][33201] Updated weights for policy 0, policy_version 97170 (0.0008) [2023-10-14 04:56:53,209][33201] Updated weights for policy 0, policy_version 97180 (0.0009) [2023-10-14 04:56:54,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199917568. Throughput: 0: 1748.5, 1: 1763.4. Samples: 49987654. Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:54,559][31953] Avg episode reward: [(0, '20.960'), (1, '20.850')] [2023-10-14 04:56:56,079][33226] Updated weights for policy 1, policy_version 98050 (0.0007) [2023-10-14 04:56:56,450][33226] Updated weights for policy 1, policy_version 98060 (0.0007) [2023-10-14 04:56:56,817][33226] Updated weights for policy 1, policy_version 98070 (0.0007) [2023-10-14 04:56:56,979][33201] Updated weights for policy 0, policy_version 97190 (0.0009) [2023-10-14 04:56:57,179][33226] Updated weights for policy 1, policy_version 98080 (0.0007) [2023-10-14 04:56:57,359][33201] Updated weights for policy 0, policy_version 97200 (0.0009) [2023-10-14 04:56:57,728][33201] Updated weights for policy 0, policy_version 97210 (0.0010) [2023-10-14 04:56:59,557][31953] Fps is (10 sec: 13107.1, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 199983104. Throughput: 0: 1772.5, 1: 1770.7. Samples: 49998586. 
Policy #0 lag: (min: 22.0, avg: 22.0, max: 22.0) [2023-10-14 04:56:59,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.850')] [2023-10-14 04:57:00,866][33226] Updated weights for policy 1, policy_version 98090 (0.0010) [2023-10-14 04:57:01,230][33226] Updated weights for policy 1, policy_version 98100 (0.0009) [2023-10-14 04:57:01,483][33201] Updated weights for policy 0, policy_version 97220 (0.0010) [2023-10-14 04:57:01,599][33226] Updated weights for policy 1, policy_version 98110 (0.0008) [2023-10-14 04:57:01,862][33201] Updated weights for policy 0, policy_version 97230 (0.0009) [2023-10-14 04:57:02,246][33201] Updated weights for policy 0, policy_version 97240 (0.0008) [2023-10-14 04:57:04,557][31953] Fps is (10 sec: 13107.0, 60 sec: 14199.4, 300 sec: 14218.0). Total num frames: 200048640. Throughput: 0: 1749.0, 1: 1775.7. Samples: 50019664. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:04,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:57:05,367][33226] Updated weights for policy 1, policy_version 98120 (0.0011) [2023-10-14 04:57:05,747][33226] Updated weights for policy 1, policy_version 98130 (0.0008) [2023-10-14 04:57:06,110][33226] Updated weights for policy 1, policy_version 98140 (0.0007) [2023-10-14 04:57:06,204][33201] Updated weights for policy 0, policy_version 97250 (0.0007) [2023-10-14 04:57:06,571][33201] Updated weights for policy 0, policy_version 97260 (0.0008) [2023-10-14 04:57:06,946][33201] Updated weights for policy 0, policy_version 97270 (0.0008) [2023-10-14 04:57:07,318][33201] Updated weights for policy 0, policy_version 97280 (0.0008) [2023-10-14 04:57:09,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 200114176. Throughput: 0: 1754.2, 1: 1774.8. Samples: 50041694. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:09,557][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:57:10,006][33226] Updated weights for policy 1, policy_version 98150 (0.0009) [2023-10-14 04:57:10,361][33226] Updated weights for policy 1, policy_version 98160 (0.0008) [2023-10-14 04:57:10,739][33226] Updated weights for policy 1, policy_version 98170 (0.0007) [2023-10-14 04:57:11,212][33201] Updated weights for policy 0, policy_version 97290 (0.0008) [2023-10-14 04:57:11,583][33201] Updated weights for policy 0, policy_version 97300 (0.0009) [2023-10-14 04:57:11,948][33201] Updated weights for policy 0, policy_version 97310 (0.0008) [2023-10-14 04:57:14,450][33226] Updated weights for policy 1, policy_version 98180 (0.0010) [2023-10-14 04:57:14,557][31953] Fps is (10 sec: 13107.4, 60 sec: 14199.4, 300 sec: 14106.9). Total num frames: 200179712. Throughput: 0: 1755.0, 1: 1770.7. Samples: 50051486. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:14,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:57:14,823][33226] Updated weights for policy 1, policy_version 98190 (0.0009) [2023-10-14 04:57:15,192][33226] Updated weights for policy 1, policy_version 98200 (0.0008) [2023-10-14 04:57:15,668][33201] Updated weights for policy 0, policy_version 97320 (0.0009) [2023-10-14 04:57:16,037][33201] Updated weights for policy 0, policy_version 97330 (0.0007) [2023-10-14 04:57:16,413][33201] Updated weights for policy 0, policy_version 97340 (0.0009) [2023-10-14 04:57:18,883][33226] Updated weights for policy 1, policy_version 98210 (0.0008) [2023-10-14 04:57:19,250][33226] Updated weights for policy 1, policy_version 98220 (0.0007) [2023-10-14 04:57:19,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 200245248. Throughput: 0: 1754.2, 1: 1776.0. Samples: 50073566. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:19,558][31953] Avg episode reward: [(0, '20.960'), (1, '20.900')] [2023-10-14 04:57:19,625][33226] Updated weights for policy 1, policy_version 98230 (0.0008) [2023-10-14 04:57:19,984][33226] Updated weights for policy 1, policy_version 98240 (0.0009) [2023-10-14 04:57:20,304][33201] Updated weights for policy 0, policy_version 97350 (0.0010) [2023-10-14 04:57:20,672][33201] Updated weights for policy 0, policy_version 97360 (0.0010) [2023-10-14 04:57:21,040][33201] Updated weights for policy 0, policy_version 97370 (0.0008) [2023-10-14 04:57:23,778][33226] Updated weights for policy 1, policy_version 98250 (0.0008) [2023-10-14 04:57:24,136][33226] Updated weights for policy 1, policy_version 98260 (0.0008) [2023-10-14 04:57:24,506][33226] Updated weights for policy 1, policy_version 98270 (0.0009) [2023-10-14 04:57:24,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 200310784. Throughput: 0: 1778.1, 1: 1802.1. Samples: 50095484. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:24,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:57:24,567][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000097376_99713024.pth... [2023-10-14 04:57:24,574][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000098272_100630528.pth... 
[2023-10-14 04:57:24,612][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000096608_98926592.pth [2023-10-14 04:57:24,616][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000095744_98041856.pth [2023-10-14 04:57:24,901][33201] Updated weights for policy 0, policy_version 97380 (0.0008) [2023-10-14 04:57:25,275][33201] Updated weights for policy 0, policy_version 97390 (0.0007) [2023-10-14 04:57:25,643][33201] Updated weights for policy 0, policy_version 97400 (0.0008) [2023-10-14 04:57:28,378][33226] Updated weights for policy 1, policy_version 98280 (0.0007) [2023-10-14 04:57:28,742][33226] Updated weights for policy 1, policy_version 98290 (0.0007) [2023-10-14 04:57:29,108][33226] Updated weights for policy 1, policy_version 98300 (0.0010) [2023-10-14 04:57:29,429][33201] Updated weights for policy 0, policy_version 97410 (0.0008) [2023-10-14 04:57:29,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 200409088. Throughput: 0: 1756.4, 1: 1776.6. Samples: 50105700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:29,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:57:29,793][33201] Updated weights for policy 0, policy_version 97420 (0.0008) [2023-10-14 04:57:30,169][33201] Updated weights for policy 0, policy_version 97430 (0.0009) [2023-10-14 04:57:30,542][33201] Updated weights for policy 0, policy_version 97440 (0.0008) [2023-10-14 04:57:32,838][33226] Updated weights for policy 1, policy_version 98310 (0.0010) [2023-10-14 04:57:33,200][33226] Updated weights for policy 1, policy_version 98320 (0.0008) [2023-10-14 04:57:33,577][33226] Updated weights for policy 1, policy_version 98330 (0.0009) [2023-10-14 04:57:34,394][33201] Updated weights for policy 0, policy_version 97450 (0.0008) [2023-10-14 04:57:34,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14106.9). Total num frames: 200474624. 
Throughput: 0: 1771.5, 1: 1798.4. Samples: 50127470. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:34,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.930')] [2023-10-14 04:57:34,778][33201] Updated weights for policy 0, policy_version 97460 (0.0008) [2023-10-14 04:57:35,140][33201] Updated weights for policy 0, policy_version 97470 (0.0009) [2023-10-14 04:57:37,403][33226] Updated weights for policy 1, policy_version 98340 (0.0008) [2023-10-14 04:57:37,764][33226] Updated weights for policy 1, policy_version 98350 (0.0008) [2023-10-14 04:57:38,127][33226] Updated weights for policy 1, policy_version 98360 (0.0007) [2023-10-14 04:57:38,718][33201] Updated weights for policy 0, policy_version 97480 (0.0008) [2023-10-14 04:57:39,092][33201] Updated weights for policy 0, policy_version 97490 (0.0007) [2023-10-14 04:57:39,459][33201] Updated weights for policy 0, policy_version 97500 (0.0009) [2023-10-14 04:57:39,557][31953] Fps is (10 sec: 13107.3, 60 sec: 13653.4, 300 sec: 14106.9). Total num frames: 200540160. Throughput: 0: 1780.2, 1: 1781.0. Samples: 50147908. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:39,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.960')] [2023-10-14 04:57:41,928][33226] Updated weights for policy 1, policy_version 98370 (0.0009) [2023-10-14 04:57:42,300][33226] Updated weights for policy 1, policy_version 98380 (0.0009) [2023-10-14 04:57:42,664][33226] Updated weights for policy 1, policy_version 98390 (0.0007) [2023-10-14 04:57:43,033][33226] Updated weights for policy 1, policy_version 98400 (0.0007) [2023-10-14 04:57:43,388][33201] Updated weights for policy 0, policy_version 97510 (0.0009) [2023-10-14 04:57:43,776][33201] Updated weights for policy 0, policy_version 97520 (0.0009) [2023-10-14 04:57:44,154][33201] Updated weights for policy 0, policy_version 97530 (0.0010) [2023-10-14 04:57:44,557][31953] Fps is (10 sec: 16383.8, 60 sec: 14199.4, 300 sec: 14218.0). 
Total num frames: 200638464. Throughput: 0: 1775.5, 1: 1802.6. Samples: 50159600. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:44,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 04:57:46,907][33226] Updated weights for policy 1, policy_version 98410 (0.0010) [2023-10-14 04:57:47,271][33226] Updated weights for policy 1, policy_version 98420 (0.0008) [2023-10-14 04:57:47,633][33226] Updated weights for policy 1, policy_version 98430 (0.0010) [2023-10-14 04:57:47,785][33201] Updated weights for policy 0, policy_version 97540 (0.0009) [2023-10-14 04:57:48,165][33201] Updated weights for policy 0, policy_version 97550 (0.0009) [2023-10-14 04:57:48,536][33201] Updated weights for policy 0, policy_version 97560 (0.0009) [2023-10-14 04:57:49,557][31953] Fps is (10 sec: 16383.9, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 200704000. Throughput: 0: 1786.1, 1: 1770.3. Samples: 50179700. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:49,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 04:57:51,365][33226] Updated weights for policy 1, policy_version 98440 (0.0009) [2023-10-14 04:57:51,734][33226] Updated weights for policy 1, policy_version 98450 (0.0009) [2023-10-14 04:57:52,094][33226] Updated weights for policy 1, policy_version 98460 (0.0007) [2023-10-14 04:57:52,484][33201] Updated weights for policy 0, policy_version 97570 (0.0008) [2023-10-14 04:57:52,856][33201] Updated weights for policy 0, policy_version 97580 (0.0007) [2023-10-14 04:57:53,222][33201] Updated weights for policy 0, policy_version 97590 (0.0008) [2023-10-14 04:57:53,593][33201] Updated weights for policy 0, policy_version 97600 (0.0007) [2023-10-14 04:57:54,557][31953] Fps is (10 sec: 13107.5, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 200769536. Throughput: 0: 1763.0, 1: 1781.9. Samples: 50201216. 
Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:54,557][31953] Avg episode reward: [(0, '20.980'), (1, '20.920')] [2023-10-14 04:57:55,872][33226] Updated weights for policy 1, policy_version 98470 (0.0008) [2023-10-14 04:57:56,231][33226] Updated weights for policy 1, policy_version 98480 (0.0009) [2023-10-14 04:57:56,607][33226] Updated weights for policy 1, policy_version 98490 (0.0009) [2023-10-14 04:57:57,433][33201] Updated weights for policy 0, policy_version 97610 (0.0007) [2023-10-14 04:57:57,813][33201] Updated weights for policy 0, policy_version 97620 (0.0009) [2023-10-14 04:57:58,184][33201] Updated weights for policy 0, policy_version 97630 (0.0009) [2023-10-14 04:57:59,557][31953] Fps is (10 sec: 13107.3, 60 sec: 14199.5, 300 sec: 14218.0). Total num frames: 200835072. Throughput: 0: 1798.5, 1: 1775.7. Samples: 50212326. Policy #0 lag: (min: 31.0, avg: 31.0, max: 31.0) [2023-10-14 04:57:59,557][31953] Avg episode reward: [(0, '21.000'), (1, '20.900')] [2023-10-14 04:58:00,349][33226] Updated weights for policy 1, policy_version 98500 (0.0008) [2023-10-14 04:58:00,718][33226] Updated weights for policy 1, policy_version 98510 (0.0007) [2023-10-14 04:58:01,081][33226] Updated weights for policy 1, policy_version 98520 (0.0009) [2023-10-14 04:58:01,837][33201] Updated weights for policy 0, policy_version 97640 (0.0008) [2023-10-14 04:58:02,216][33201] Updated weights for policy 0, policy_version 97650 (0.0010) [2023-10-14 04:58:02,587][33201] Updated weights for policy 0, policy_version 97660 (0.0009) [2023-10-14 04:58:04,557][31953] Fps is (10 sec: 13107.2, 60 sec: 14199.6, 300 sec: 14218.0). Total num frames: 200900608. Throughput: 0: 1765.3, 1: 1787.6. Samples: 50233446. 
Policy #0 lag: (min: 31.0, avg: 37.5, max: 63.0) [2023-10-14 04:58:04,558][31953] Avg episode reward: [(0, '20.980'), (1, '20.900')] [2023-10-14 04:58:04,720][33226] Updated weights for policy 1, policy_version 98530 (0.0009) [2023-10-14 04:58:05,089][33226] Updated weights for policy 1, policy_version 98540 (0.0008) [2023-10-14 04:58:05,450][33226] Updated weights for policy 1, policy_version 98550 (0.0010) [2023-10-14 04:58:05,811][33244] Stopping RolloutWorker_w8... [2023-10-14 04:58:05,811][33249] Stopping RolloutWorker_w12... [2023-10-14 04:58:05,811][33238] Stopping RolloutWorker_w3... [2023-10-14 04:58:05,811][33240] Stopping RolloutWorker_w4... [2023-10-14 04:58:05,811][33245] Stopping RolloutWorker_w9... [2023-10-14 04:58:05,811][33244] Loop rollout_proc8_evt_loop terminating... [2023-10-14 04:58:05,811][33249] Loop rollout_proc12_evt_loop terminating... [2023-10-14 04:58:05,811][32837] Stopping Batcher_0... [2023-10-14 04:58:05,812][33238] Loop rollout_proc3_evt_loop terminating... [2023-10-14 04:58:05,812][33240] Loop rollout_proc4_evt_loop terminating... [2023-10-14 04:58:05,811][31953] Component RolloutWorker_w8 stopped! [2023-10-14 04:58:05,812][33245] Loop rollout_proc9_evt_loop terminating... [2023-10-14 04:58:05,812][32837] Loop batcher_evt_loop terminating... [2023-10-14 04:58:05,812][32895] Stopping Batcher_1... [2023-10-14 04:58:05,812][33226] Updated weights for policy 1, policy_version 98560 (0.0011) [2023-10-14 04:58:05,812][31953] Component RolloutWorker_w12 stopped! [2023-10-14 04:58:05,812][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth... [2023-10-14 04:58:05,812][31953] Component RolloutWorker_w4 stopped! [2023-10-14 04:58:05,813][33813] Stopping RolloutWorker_w14... [2023-10-14 04:58:05,812][32895] Loop batcher_evt_loop terminating... [2023-10-14 04:58:05,813][33813] Loop rollout_proc14_evt_loop terminating... 
[2023-10-14 04:58:05,813][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000098560_100925440.pth... [2023-10-14 04:58:05,813][31953] Component RolloutWorker_w3 stopped! [2023-10-14 04:58:05,813][33246] Stopping RolloutWorker_w10... [2023-10-14 04:58:05,814][33246] Loop rollout_proc10_evt_loop terminating... [2023-10-14 04:58:05,813][31953] Component RolloutWorker_w9 stopped! [2023-10-14 04:58:05,814][31953] Component Batcher_0 stopped! [2023-10-14 04:58:05,814][33239] Stopping RolloutWorker_w2... [2023-10-14 04:58:05,814][31953] Component Batcher_1 stopped! [2023-10-14 04:58:05,814][33239] Loop rollout_proc2_evt_loop terminating... [2023-10-14 04:58:05,815][31953] Component RolloutWorker_w14 stopped! [2023-10-14 04:58:05,815][33242] Stopping RolloutWorker_w6... [2023-10-14 04:58:05,815][31953] Component RolloutWorker_w10 stopped! [2023-10-14 04:58:05,815][33242] Loop rollout_proc6_evt_loop terminating... [2023-10-14 04:58:05,815][31953] Component RolloutWorker_w2 stopped! [2023-10-14 04:58:05,815][33234] Stopping RolloutWorker_w0... [2023-10-14 04:58:05,816][33234] Loop rollout_proc0_evt_loop terminating... [2023-10-14 04:58:05,816][31953] Component RolloutWorker_w6 stopped! [2023-10-14 04:58:05,816][31953] Component RolloutWorker_w0 stopped! [2023-10-14 04:58:05,817][33241] Stopping RolloutWorker_w5... [2023-10-14 04:58:05,817][31953] Component RolloutWorker_w5 stopped! [2023-10-14 04:58:05,817][33241] Loop rollout_proc5_evt_loop terminating... [2023-10-14 04:58:05,818][33235] Stopping RolloutWorker_w1... [2023-10-14 04:58:05,818][31953] Component RolloutWorker_w1 stopped! [2023-10-14 04:58:05,818][33243] Stopping RolloutWorker_w7... [2023-10-14 04:58:05,818][33850] Stopping RolloutWorker_w15... [2023-10-14 04:58:05,818][33235] Loop rollout_proc1_evt_loop terminating... [2023-10-14 04:58:05,818][33243] Loop rollout_proc7_evt_loop terminating... [2023-10-14 04:58:05,818][31953] Component RolloutWorker_w7 stopped! 
[2023-10-14 04:58:05,818][33850] Loop rollout_proc15_evt_loop terminating... [2023-10-14 04:58:05,819][31953] Component RolloutWorker_w15 stopped! [2023-10-14 04:58:05,819][33248] Stopping RolloutWorker_w13... [2023-10-14 04:58:05,819][31953] Component RolloutWorker_w13 stopped! [2023-10-14 04:58:05,819][33247] Stopping RolloutWorker_w11... [2023-10-14 04:58:05,819][33248] Loop rollout_proc13_evt_loop terminating... [2023-10-14 04:58:05,819][31953] Component RolloutWorker_w11 stopped! [2023-10-14 04:58:05,819][33247] Loop rollout_proc11_evt_loop terminating... [2023-10-14 04:58:05,832][33201] Weights refcount: 2 0 [2023-10-14 04:58:05,834][33201] Stopping InferenceWorker_p0-w0... [2023-10-14 04:58:05,834][33201] Loop inference_proc0-0_evt_loop terminating... [2023-10-14 04:58:05,834][31953] Component InferenceWorker_p0-w0 stopped! [2023-10-14 04:58:05,845][33226] Weights refcount: 2 0 [2023-10-14 04:58:05,847][33226] Stopping InferenceWorker_p1-w0... [2023-10-14 04:58:05,848][33226] Loop inference_proc1-0_evt_loop terminating... [2023-10-14 04:58:05,847][31953] Component InferenceWorker_p1-w0 stopped! [2023-10-14 04:58:05,864][32837] Removing ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000096576_98893824.pth [2023-10-14 04:58:05,864][32895] Removing ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000097440_99778560.pth [2023-10-14 04:58:05,870][32837] Saving ./train_atari/atari_pong_APPO/checkpoint_p0/checkpoint_000097664_100007936.pth... [2023-10-14 04:58:05,871][32895] Saving ./train_atari/atari_pong_APPO/checkpoint_p1/checkpoint_000098560_100925440.pth... [2023-10-14 04:58:05,928][32837] Stopping LearnerWorker_p0... [2023-10-14 04:58:05,929][32837] Loop learner_proc0_evt_loop terminating... [2023-10-14 04:58:05,928][31953] Component LearnerWorker_p0 stopped! [2023-10-14 04:58:05,931][32895] Stopping LearnerWorker_p1... [2023-10-14 04:58:05,931][31953] Component LearnerWorker_p1 stopped! 
[2023-10-14 04:58:05,932][32895] Loop learner_proc1_evt_loop terminating... [2023-10-14 04:58:05,932][31953] Waiting for process learner_proc0 to stop... [2023-10-14 04:58:06,865][31953] Waiting for process learner_proc1 to stop... [2023-10-14 04:58:06,866][31953] Waiting for process inference_proc0-0 to join... [2023-10-14 04:58:06,867][31953] Waiting for process inference_proc1-0 to join... [2023-10-14 04:58:06,868][31953] Waiting for process rollout_proc0 to join... [2023-10-14 04:58:06,868][31953] Waiting for process rollout_proc1 to join... [2023-10-14 04:58:06,869][31953] Waiting for process rollout_proc2 to join... [2023-10-14 04:58:06,870][31953] Waiting for process rollout_proc3 to join... [2023-10-14 04:58:06,871][31953] Waiting for process rollout_proc4 to join... [2023-10-14 04:58:06,871][31953] Waiting for process rollout_proc5 to join... [2023-10-14 04:58:06,872][31953] Waiting for process rollout_proc6 to join... [2023-10-14 04:58:06,872][31953] Waiting for process rollout_proc7 to join... [2023-10-14 04:58:06,873][31953] Waiting for process rollout_proc8 to join... [2023-10-14 04:58:06,873][31953] Waiting for process rollout_proc9 to join... [2023-10-14 04:58:06,873][31953] Waiting for process rollout_proc10 to join... [2023-10-14 04:58:06,874][31953] Waiting for process rollout_proc11 to join... [2023-10-14 04:58:06,874][31953] Waiting for process rollout_proc12 to join... [2023-10-14 04:58:06,875][31953] Waiting for process rollout_proc13 to join... [2023-10-14 04:58:06,875][31953] Waiting for process rollout_proc14 to join... [2023-10-14 04:58:06,876][31953] Waiting for process rollout_proc15 to join... 
[2023-10-14 04:58:06,876][31953] Batcher 0 profile tree view:
batching: 169.5953, releasing_batches: 0.0909
[2023-10-14 04:58:06,876][31953] Batcher 1 profile tree view:
batching: 170.3080, releasing_batches: 0.0954
[2023-10-14 04:58:06,877][31953] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0003
wait_policy_total: 2103.1985
update_model: 202.1913
weight_update: 0.0010
one_step: 0.0018
handle_policy_step: 11218.2094
deserialize: 63.8218, stack: 190.5480, obs_to_device_normalize: 2504.5395, forward: 5067.2460, prepare_outputs: 2446.7400, send_messages: 456.0617
[2023-10-14 04:58:06,877][31953] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0011
wait_policy_total: 2072.4268
update_model: 202.4535
weight_update: 0.0011
one_step: 0.0028
handle_policy_step: 11238.8010
deserialize: 64.0955, stack: 192.8157, obs_to_device_normalize: 2514.4213, forward: 5074.2147, prepare_outputs: 2442.6844, send_messages: 458.5085
[2023-10-14 04:58:06,877][31953] Learner 0 profile tree view:
misc: 0.0181, prepare_batch: 268.7395
train: 3636.1835
epoch_init: 0.1856, minibatch_init: 13.0857, losses_postprocess: 895.6047, kl_divergence: 31.9793, update: 392.0949, after_optimizer: 2116.3303
calculate_losses: 170.0528
losses_init: 0.3954, forward_head: 59.7844, bptt_initial: 1.4408, bptt: 1.7949, tail: 38.3284, advantages_returns: 11.1296, losses: 43.7156
[2023-10-14 04:58:06,878][31953] Learner 1 profile tree view:
misc: 0.0203, prepare_batch: 271.7637
train: 3637.7231
epoch_init: 0.1878, minibatch_init: 13.1317, losses_postprocess: 893.1074, kl_divergence: 31.1892, update: 389.1712, after_optimizer: 2126.8825
calculate_losses: 167.1765
losses_init: 0.3672, forward_head: 55.9487, bptt_initial: 1.4588, bptt: 1.9702, tail: 38.4078, advantages_returns: 11.2502, losses: 44.0787
[2023-10-14 04:58:06,878][31953] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.2107, enqueue_policy_requests: 408.3426, process_policy_outputs: 190.5016, env_step: 6854.7324, finalize_trajectories: 3.5041, complete_rollouts: 2.9433
post_env_step: 375.4834
process_env_step: 84.0017
[2023-10-14 04:58:06,878][31953] RolloutWorker_w15 profile tree view:
wait_for_trajectories: 1.2567, enqueue_policy_requests: 410.4064, process_policy_outputs: 192.2133, env_step: 6859.6086, finalize_trajectories: 3.5536, complete_rollouts: 2.9243
post_env_step: 375.5434
process_env_step: 84.2165
[2023-10-14 04:58:06,879][31953] Loop Runner_EvtLoop terminating...
[2023-10-14 04:58:06,879][31953] Runner profile tree view:
main_loop: 14197.7697
[2023-10-14 04:58:06,880][31953] Collected {0: 100007936, 1: 100925440}, FPS: 14152.5