[2023-09-22 16:56:31,205][106676] Saving configuration to ./train_atari/Asteroids/config.json... [2023-09-22 16:56:31,471][106676] Rollout worker 0 uses device cpu [2023-09-22 16:56:31,471][106676] Rollout worker 1 uses device cpu [2023-09-22 16:56:31,472][106676] Rollout worker 2 uses device cpu [2023-09-22 16:56:31,472][106676] Rollout worker 3 uses device cpu [2023-09-22 16:56:31,472][106676] Rollout worker 4 uses device cpu [2023-09-22 16:56:31,472][106676] Rollout worker 5 uses device cpu [2023-09-22 16:56:31,472][106676] Rollout worker 6 uses device cpu [2023-09-22 16:56:31,473][106676] Rollout worker 7 uses device cpu [2023-09-22 16:56:31,473][106676] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 [2023-09-22 16:56:31,523][106676] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:56:31,523][106676] InferenceWorker_p0-w0: min num requests: 2 [2023-09-22 16:56:31,548][106676] Starting all processes... [2023-09-22 16:56:31,548][106676] Starting process learner_proc0 [2023-09-22 16:56:33,286][106676] Starting all processes... 
[2023-09-22 16:56:33,289][107070] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:56:33,289][107070] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-09-22 16:56:33,292][106676] Starting process inference_proc0-0 [2023-09-22 16:56:33,293][106676] Starting process rollout_proc0 [2023-09-22 16:56:33,293][106676] Starting process rollout_proc1 [2023-09-22 16:56:33,293][106676] Starting process rollout_proc2 [2023-09-22 16:56:33,294][106676] Starting process rollout_proc3 [2023-09-22 16:56:33,298][106676] Starting process rollout_proc4 [2023-09-22 16:56:33,301][106676] Starting process rollout_proc5 [2023-09-22 16:56:33,303][106676] Starting process rollout_proc6 [2023-09-22 16:56:33,305][106676] Starting process rollout_proc7 [2023-09-22 16:56:33,331][107070] Num visible devices: 1 [2023-09-22 16:56:33,400][107070] Starting seed is not provided [2023-09-22 16:56:33,400][107070] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:56:33,401][107070] Initializing actor-critic model on device cuda:0 [2023-09-22 16:56:33,401][107070] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:56:33,402][107070] RunningMeanStd input shape: (1,) [2023-09-22 16:56:33,427][107070] ConvEncoder: input_channels=4 [2023-09-22 16:56:33,745][107070] Conv encoder output size: 512 [2023-09-22 16:56:33,747][107070] Created Actor Critic model with architecture: [2023-09-22 16:56:33,747][107070] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): 
RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=14, bias=True) ) ) [2023-09-22 16:56:34,333][107070] Using optimizer [2023-09-22 16:56:34,334][107070] No checkpoints found [2023-09-22 16:56:34,334][107070] Did not load from checkpoint, starting from scratch! [2023-09-22 16:56:34,334][107070] Initialized policy 0 weights for model version 0 [2023-09-22 16:56:34,335][107070] LearnerWorker_p0 finished initialization! 
[2023-09-22 16:56:34,336][107070] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:56:35,211][107219] Worker 7 uses CPU cores [28, 29, 30, 31] [2023-09-22 16:56:35,218][107217] Worker 2 uses CPU cores [8, 9, 10, 11] [2023-09-22 16:56:35,224][107218] Worker 5 uses CPU cores [20, 21, 22, 23] [2023-09-22 16:56:35,226][107220] Worker 4 uses CPU cores [16, 17, 18, 19] [2023-09-22 16:56:35,233][107222] Worker 6 uses CPU cores [24, 25, 26, 27] [2023-09-22 16:56:35,239][107216] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:56:35,239][107216] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-09-22 16:56:35,254][107214] Worker 1 uses CPU cores [4, 5, 6, 7] [2023-09-22 16:56:35,263][107215] Worker 0 uses CPU cores [0, 1, 2, 3] [2023-09-22 16:56:35,271][107221] Worker 3 uses CPU cores [12, 13, 14, 15] [2023-09-22 16:56:35,281][107216] Num visible devices: 1 [2023-09-22 16:56:35,925][107216] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:56:35,925][107216] RunningMeanStd input shape: (1,) [2023-09-22 16:56:35,937][107216] ConvEncoder: input_channels=4 [2023-09-22 16:56:36,045][107216] Conv encoder output size: 512 [2023-09-22 16:56:36,052][106676] Inference worker 0-0 is ready! [2023-09-22 16:56:36,052][106676] All inference workers are ready! Signal rollout workers to start! [2023-09-22 16:56:36,505][107220] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,508][107214] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,511][107219] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,513][107218] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,514][107217] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,516][107221] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,518][107222] Decorrelating experience for 0 frames... [2023-09-22 16:56:36,712][107215] Decorrelating experience for 0 frames... 
[2023-09-22 16:56:37,135][106676] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-09-22 16:56:42,135][106676] Fps is (10 sec: 1638.3, 60 sec: 1638.3, 300 sec: 1638.3). Total num frames: 8192. Throughput: 0: 559.4. Samples: 2797. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 16:56:42,136][106676] Avg episode reward: [(0, '3.357')] [2023-09-22 16:56:43,948][106676] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 106676], exiting... [2023-09-22 16:56:43,949][107222] Stopping RolloutWorker_w6... [2023-09-22 16:56:43,949][107217] Stopping RolloutWorker_w2... [2023-09-22 16:56:43,949][107221] Stopping RolloutWorker_w3... [2023-09-22 16:56:43,949][107222] Loop rollout_proc6_evt_loop terminating... [2023-09-22 16:56:43,949][107220] Stopping RolloutWorker_w4... [2023-09-22 16:56:43,949][106676] Runner profile tree view: main_loop: 12.4010 [2023-09-22 16:56:43,949][107217] Loop rollout_proc2_evt_loop terminating... [2023-09-22 16:56:43,949][107220] Loop rollout_proc4_evt_loop terminating... [2023-09-22 16:56:43,949][107221] Loop rollout_proc3_evt_loop terminating... [2023-09-22 16:56:43,949][106676] Collected {0: 16384}, FPS: 1321.2 [2023-09-22 16:56:43,949][107219] Stopping RolloutWorker_w7... [2023-09-22 16:56:43,949][107214] Stopping RolloutWorker_w1... [2023-09-22 16:56:43,949][107070] Stopping Batcher_0... [2023-09-22 16:56:43,950][107219] Loop rollout_proc7_evt_loop terminating... [2023-09-22 16:56:43,950][107214] Loop rollout_proc1_evt_loop terminating... [2023-09-22 16:56:43,950][107070] Loop batcher_evt_loop terminating... [2023-09-22 16:56:43,951][107070] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000000064_16384.pth... [2023-09-22 16:56:43,952][107218] Stopping RolloutWorker_w5... [2023-09-22 16:56:43,952][107215] Stopping RolloutWorker_w0... 
[2023-09-22 16:56:43,952][107218] Loop rollout_proc5_evt_loop terminating... [2023-09-22 16:56:43,953][107215] Loop rollout_proc0_evt_loop terminating... [2023-09-22 16:56:43,965][107216] Weights refcount: 2 0 [2023-09-22 16:56:43,966][107216] Stopping InferenceWorker_p0-w0... [2023-09-22 16:56:43,967][107216] Loop inference_proc0-0_evt_loop terminating... [2023-09-22 16:56:43,987][107070] Stopping LearnerWorker_p0... [2023-09-22 16:56:43,987][107070] Loop learner_proc0_evt_loop terminating... [2023-09-22 16:57:48,678][111881] Saving configuration to ./train_atari/Asteroids/config.json... [2023-09-22 16:57:48,953][111881] Rollout worker 0 uses device cpu [2023-09-22 16:57:48,954][111881] Rollout worker 1 uses device cpu [2023-09-22 16:57:48,954][111881] Rollout worker 2 uses device cpu [2023-09-22 16:57:48,955][111881] Rollout worker 3 uses device cpu [2023-09-22 16:57:48,955][111881] Rollout worker 4 uses device cpu [2023-09-22 16:57:48,956][111881] Rollout worker 5 uses device cpu [2023-09-22 16:57:48,956][111881] Rollout worker 6 uses device cpu [2023-09-22 16:57:48,957][111881] Rollout worker 7 uses device cpu [2023-09-22 16:57:48,957][111881] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1 [2023-09-22 16:57:49,002][111881] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:57:49,003][111881] InferenceWorker_p0-w0: min num requests: 1 [2023-09-22 16:57:49,006][111881] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-09-22 16:57:49,006][111881] InferenceWorker_p1-w0: min num requests: 1 [2023-09-22 16:57:49,031][111881] Starting all processes... 
[2023-09-22 16:57:49,032][111881] Starting process learner_proc0 [2023-09-22 16:57:50,753][111881] Starting process learner_proc1 [2023-09-22 16:57:50,756][112639] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:57:50,756][112639] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-09-22 16:57:50,794][112639] Num visible devices: 1 [2023-09-22 16:57:50,915][112639] Starting seed is not provided [2023-09-22 16:57:50,916][112639] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:57:50,916][112639] Initializing actor-critic model on device cuda:0 [2023-09-22 16:57:50,916][112639] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:57:50,917][112639] RunningMeanStd input shape: (1,) [2023-09-22 16:57:50,936][112639] ConvEncoder: input_channels=4 [2023-09-22 16:57:51,070][112639] Conv encoder output size: 512 [2023-09-22 16:57:51,071][112639] Created Actor Critic model with architecture: [2023-09-22 16:57:51,072][112639] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): 
Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=14, bias=True) ) ) [2023-09-22 16:57:51,629][112639] Using optimizer [2023-09-22 16:57:51,629][112639] Loading state from checkpoint ./train_atari/Asteroids/checkpoint_p0/checkpoint_000000064_16384.pth... [2023-09-22 16:57:51,648][112639] Loading model from checkpoint [2023-09-22 16:57:51,652][112639] Loaded experiment state at self.train_step=64, self.env_steps=16384 [2023-09-22 16:57:51,653][112639] Initialized policy 0 weights for model version 64 [2023-09-22 16:57:51,655][112639] LearnerWorker_p0 finished initialization! [2023-09-22 16:57:51,655][112639] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:57:52,609][111881] Starting all processes... [2023-09-22 16:57:52,613][112735] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-09-22 16:57:52,613][112735] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1 [2023-09-22 16:57:52,613][111881] Starting process inference_proc0-0 [2023-09-22 16:57:52,613][111881] Starting process inference_proc1-0 [2023-09-22 16:57:52,614][111881] Starting process rollout_proc0 [2023-09-22 16:57:52,614][111881] Starting process rollout_proc1 [2023-09-22 16:57:52,614][111881] Starting process rollout_proc2 [2023-09-22 16:57:52,615][111881] Starting process rollout_proc3 [2023-09-22 16:57:52,616][111881] Starting process rollout_proc4 [2023-09-22 16:57:52,617][111881] Starting process rollout_proc5 [2023-09-22 16:57:52,618][111881] Starting process rollout_proc6 [2023-09-22 16:57:52,618][111881] Starting process rollout_proc7 [2023-09-22 16:57:52,649][112735] Num visible devices: 1 [2023-09-22 16:57:52,707][112735] Starting seed is not provided [2023-09-22 16:57:52,708][112735] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-09-22 16:57:52,708][112735] 
Initializing actor-critic model on device cuda:0 [2023-09-22 16:57:52,708][112735] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:57:52,710][112735] RunningMeanStd input shape: (1,) [2023-09-22 16:57:52,806][112735] ConvEncoder: input_channels=4 [2023-09-22 16:57:53,083][112735] Conv encoder output size: 512 [2023-09-22 16:57:53,084][112735] Created Actor Critic model with architecture: [2023-09-22 16:57:53,085][112735] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): MultiInputEncoder( (encoders): ModuleDict( (obs): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ReLU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ReLU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ReLU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ReLU) ) ) ) ) ) (core): ModelCoreIdentity() (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=14, bias=True) ) ) [2023-09-22 16:57:53,660][112735] Using optimizer [2023-09-22 16:57:53,661][112735] No checkpoints found [2023-09-22 16:57:53,661][112735] Did not load from checkpoint, starting from scratch! [2023-09-22 16:57:53,661][112735] Initialized policy 1 weights for model version 0 [2023-09-22 16:57:53,663][112735] LearnerWorker_p1 finished initialization! 
[2023-09-22 16:57:53,663][112735] Using GPUs [0] for process 1 (actually maps to GPUs [1]) [2023-09-22 16:57:54,559][112946] Worker 7 uses CPU cores [28, 29, 30, 31] [2023-09-22 16:57:54,572][112939] Worker 0 uses CPU cores [0, 1, 2, 3] [2023-09-22 16:57:54,574][112937] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-09-22 16:57:54,574][112937] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-09-22 16:57:54,591][112938] Using GPUs [1] for process 1 (actually maps to GPUs [1]) [2023-09-22 16:57:54,591][112938] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1 [2023-09-22 16:57:54,591][112945] Worker 5 uses CPU cores [20, 21, 22, 23] [2023-09-22 16:57:54,592][112937] Num visible devices: 1 [2023-09-22 16:57:54,609][112938] Num visible devices: 1 [2023-09-22 16:57:54,616][112942] Worker 2 uses CPU cores [8, 9, 10, 11] [2023-09-22 16:57:54,616][112941] Worker 1 uses CPU cores [4, 5, 6, 7] [2023-09-22 16:57:54,668][112947] Worker 6 uses CPU cores [24, 25, 26, 27] [2023-09-22 16:57:54,675][112943] Worker 3 uses CPU cores [12, 13, 14, 15] [2023-09-22 16:57:54,682][111881] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 16384. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-09-22 16:57:54,687][112944] Worker 4 uses CPU cores [16, 17, 18, 19] [2023-09-22 16:57:55,205][112937] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:57:55,206][112937] RunningMeanStd input shape: (1,) [2023-09-22 16:57:55,210][112938] RunningMeanStd input shape: (4, 84, 84) [2023-09-22 16:57:55,210][112938] RunningMeanStd input shape: (1,) [2023-09-22 16:57:55,217][112937] ConvEncoder: input_channels=4 [2023-09-22 16:57:55,221][112938] ConvEncoder: input_channels=4 [2023-09-22 16:57:55,323][112937] Conv encoder output size: 512 [2023-09-22 16:57:55,329][111881] Inference worker 0-0 is ready! 
[2023-09-22 16:57:55,361][112938] Conv encoder output size: 512 [2023-09-22 16:57:55,367][111881] Inference worker 1-0 is ready! [2023-09-22 16:57:55,367][111881] All inference workers are ready! Signal rollout workers to start! [2023-09-22 16:57:55,826][112941] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,829][112944] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,829][112939] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,830][112942] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,832][112945] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,838][112947] Decorrelating experience for 0 frames... [2023-09-22 16:57:55,852][112946] Decorrelating experience for 0 frames... [2023-09-22 16:57:56,181][112943] Decorrelating experience for 0 frames... [2023-09-22 16:57:59,672][111881] Fps is (10 sec: 1641.4, 60 sec: 1641.4, 300 sec: 1641.4). Total num frames: 24576. Throughput: 0: 205.2, 1: 205.2. Samples: 2048. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 16:57:59,673][111881] Avg episode reward: [(0, '3.167'), (1, '1.600')] [2023-09-22 16:58:04,673][111881] Fps is (10 sec: 2459.8, 60 sec: 2459.8, 300 sec: 2459.8). Total num frames: 40960. Throughput: 0: 397.5, 1: 399.2. Samples: 7959. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 16:58:04,674][111881] Avg episode reward: [(0, '4.500'), (1, '4.000')] [2023-09-22 16:58:08,989][111881] Heartbeat connected on Batcher_0 [2023-09-22 16:58:08,993][111881] Heartbeat connected on LearnerWorker_p0 [2023-09-22 16:58:08,995][111881] Heartbeat connected on Batcher_1 [2023-09-22 16:58:08,998][111881] Heartbeat connected on LearnerWorker_p1 [2023-09-22 16:58:09,004][111881] Heartbeat connected on InferenceWorker_p0-w0 [2023-09-22 16:58:09,008][111881] Heartbeat connected on InferenceWorker_p1-w0 [2023-09-22 16:58:09,011][111881] Heartbeat connected on RolloutWorker_w0 [2023-09-22 16:58:09,013][111881] Heartbeat connected on RolloutWorker_w1 [2023-09-22 16:58:09,017][111881] Heartbeat connected on RolloutWorker_w2 [2023-09-22 16:58:09,021][111881] Heartbeat connected on RolloutWorker_w3 [2023-09-22 16:58:09,021][111881] Heartbeat connected on RolloutWorker_w4 [2023-09-22 16:58:09,026][111881] Heartbeat connected on RolloutWorker_w5 [2023-09-22 16:58:09,027][111881] Heartbeat connected on RolloutWorker_w6 [2023-09-22 16:58:09,031][111881] Heartbeat connected on RolloutWorker_w7 [2023-09-22 16:58:09,672][111881] Fps is (10 sec: 4915.2, 60 sec: 3825.3, 300 sec: 3825.3). Total num frames: 73728. Throughput: 0: 410.1, 1: 410.2. Samples: 12297. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:58:09,673][111881] Avg episode reward: [(0, '4.091'), (1, '4.250')] [2023-09-22 16:58:12,591][112937] Updated weights for policy 0, policy_version 224 (0.0016) [2023-09-22 16:58:12,591][112938] Updated weights for policy 1, policy_version 160 (0.0015) [2023-09-22 16:58:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 4507.6, 300 sec: 4507.6). Total num frames: 106496. Throughput: 0: 549.5, 1: 549.6. Samples: 21973. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 16:58:14,674][111881] Avg episode reward: [(0, '4.300'), (1, '4.375')] [2023-09-22 16:58:19,673][111881] Fps is (10 sec: 6553.4, 60 sec: 4917.0, 300 sec: 4917.0). Total num frames: 139264. Throughput: 0: 621.0, 1: 622.6. Samples: 31078. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 16:58:19,674][111881] Avg episode reward: [(0, '4.177'), (1, '4.458')] [2023-09-22 16:58:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 5189.9, 300 sec: 5189.9). Total num frames: 172032. Throughput: 0: 597.7, 1: 599.1. Samples: 35894. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 16:58:24,673][111881] Avg episode reward: [(0, '4.093'), (1, '4.295')] [2023-09-22 16:58:25,672][112937] Updated weights for policy 0, policy_version 384 (0.0015) [2023-09-22 16:58:25,672][112938] Updated weights for policy 1, policy_version 320 (0.0017) [2023-09-22 16:58:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 5384.7, 300 sec: 5384.7). Total num frames: 204800. Throughput: 0: 644.3, 1: 645.4. Samples: 45128. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 16:58:29,674][111881] Avg episode reward: [(0, '4.190'), (1, '4.300')] [2023-09-22 16:58:29,681][112639] Saving new best policy, reward=4.190! [2023-09-22 16:58:34,673][111881] Fps is (10 sec: 5734.3, 60 sec: 5326.0, 300 sec: 5326.0). Total num frames: 229376. Throughput: 0: 686.4, 1: 687.4. Samples: 54940. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:58:34,674][111881] Avg episode reward: [(0, '4.080'), (1, '4.230')] [2023-09-22 16:58:34,746][112735] Saving new best policy, reward=4.230! [2023-09-22 16:58:38,663][112937] Updated weights for policy 0, policy_version 544 (0.0020) [2023-09-22 16:58:38,663][112938] Updated weights for policy 1, policy_version 480 (0.0021) [2023-09-22 16:58:39,672][111881] Fps is (10 sec: 5734.5, 60 sec: 5462.5, 300 sec: 5462.5). Total num frames: 262144. Throughput: 0: 660.1, 1: 660.3. Samples: 59407. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:58:39,673][111881] Avg episode reward: [(0, '4.230'), (1, '4.260')] [2023-09-22 16:58:39,674][112639] Saving new best policy, reward=4.230! [2023-09-22 16:58:39,674][112735] Saving new best policy, reward=4.260! [2023-09-22 16:58:44,673][111881] Fps is (10 sec: 6553.5, 60 sec: 5571.6, 300 sec: 5571.6). Total num frames: 294912. Throughput: 0: 741.2, 1: 741.9. Samples: 68787. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:58:44,674][111881] Avg episode reward: [(0, '4.320'), (1, '4.540')] [2023-09-22 16:58:44,679][112639] Saving new best policy, reward=4.320! [2023-09-22 16:58:44,679][112735] Saving new best policy, reward=4.540! [2023-09-22 16:58:49,672][111881] Fps is (10 sec: 6553.6, 60 sec: 5660.9, 300 sec: 5660.9). Total num frames: 327680. Throughput: 0: 776.5, 1: 776.2. Samples: 77828. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 16:58:49,673][111881] Avg episode reward: [(0, '4.440'), (1, '4.770')] [2023-09-22 16:58:49,674][112639] Saving new best policy, reward=4.440! [2023-09-22 16:58:49,674][112735] Saving new best policy, reward=4.770! [2023-09-22 16:58:51,989][112937] Updated weights for policy 0, policy_version 704 (0.0016) [2023-09-22 16:58:51,990][112938] Updated weights for policy 1, policy_version 640 (0.0019) [2023-09-22 16:58:54,672][111881] Fps is (10 sec: 6553.8, 60 sec: 5735.3, 300 sec: 5735.3). Total num frames: 360448. Throughput: 0: 780.9, 1: 781.6. Samples: 82613. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 16:58:54,673][111881] Avg episode reward: [(0, '4.730'), (1, '5.170')] [2023-09-22 16:58:54,674][112639] Saving new best policy, reward=4.730! [2023-09-22 16:58:54,674][112735] Saving new best policy, reward=5.170! [2023-09-22 16:58:59,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5672.2). Total num frames: 385024. Throughput: 0: 779.9, 1: 779.8. Samples: 92160. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:58:59,674][111881] Avg episode reward: [(0, '4.690'), (1, '5.340')] [2023-09-22 16:58:59,825][112735] Saving new best policy, reward=5.340! [2023-09-22 16:59:04,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 5735.1). Total num frames: 417792. Throughput: 0: 779.6, 1: 779.6. Samples: 101239. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 16:59:04,674][111881] Avg episode reward: [(0, '4.960'), (1, '5.190')] [2023-09-22 16:59:04,675][112639] Saving new best policy, reward=4.960! [2023-09-22 16:59:05,163][112937] Updated weights for policy 0, policy_version 864 (0.0018) [2023-09-22 16:59:05,164][112938] Updated weights for policy 1, policy_version 800 (0.0017) [2023-09-22 16:59:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5789.7). Total num frames: 450560. Throughput: 0: 779.1, 1: 779.5. Samples: 106031. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:09,673][111881] Avg episode reward: [(0, '4.870'), (1, '5.160')] [2023-09-22 16:59:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5837.5). Total num frames: 483328. Throughput: 0: 777.4, 1: 777.3. Samples: 115091. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 16:59:14,674][111881] Avg episode reward: [(0, '4.930'), (1, '5.210')] [2023-09-22 16:59:18,507][112937] Updated weights for policy 0, policy_version 1024 (0.0017) [2023-09-22 16:59:18,507][112938] Updated weights for policy 1, policy_version 960 (0.0016) [2023-09-22 16:59:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5783.2). Total num frames: 507904. Throughput: 0: 771.9, 1: 772.8. Samples: 124450. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:19,673][111881] Avg episode reward: [(0, '4.860'), (1, '5.170')] [2023-09-22 16:59:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5826.0). Total num frames: 540672. Throughput: 0: 773.6, 1: 773.4. 
Samples: 129024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:24,673][111881] Avg episode reward: [(0, '4.720'), (1, '5.120')] [2023-09-22 16:59:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5864.3). Total num frames: 573440. Throughput: 0: 773.8, 1: 773.9. Samples: 138432. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 16:59:29,674][111881] Avg episode reward: [(0, '5.030'), (1, '5.170')] [2023-09-22 16:59:29,680][112639] Saving new best policy, reward=5.030! [2023-09-22 16:59:31,766][112937] Updated weights for policy 0, policy_version 1184 (0.0017) [2023-09-22 16:59:31,766][112938] Updated weights for policy 1, policy_version 1120 (0.0021) [2023-09-22 16:59:34,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 5898.8). Total num frames: 606208. Throughput: 0: 773.8, 1: 773.7. Samples: 147466. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:34,674][111881] Avg episode reward: [(0, '5.060'), (1, '5.540')] [2023-09-22 16:59:34,675][112639] Saving new best policy, reward=5.060! [2023-09-22 16:59:34,675][112735] Saving new best policy, reward=5.540! [2023-09-22 16:59:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5851.9). Total num frames: 630784. Throughput: 0: 771.1, 1: 771.0. Samples: 152006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:39,674][111881] Avg episode reward: [(0, '4.970'), (1, '5.550')] [2023-09-22 16:59:39,740][112735] Saving new best policy, reward=5.550! [2023-09-22 16:59:44,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5883.8). Total num frames: 663552. Throughput: 0: 770.0, 1: 770.9. Samples: 161499. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 16:59:44,674][111881] Avg episode reward: [(0, '5.350'), (1, '5.360')] [2023-09-22 16:59:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000001264_323584.pth... 
[2023-09-22 16:59:44,684][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000001328_339968.pth... [2023-09-22 16:59:44,731][112639] Saving new best policy, reward=5.350! [2023-09-22 16:59:45,056][112937] Updated weights for policy 0, policy_version 1344 (0.0015) [2023-09-22 16:59:45,056][112938] Updated weights for policy 1, policy_version 1280 (0.0016) [2023-09-22 16:59:49,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 5913.0). Total num frames: 696320. Throughput: 0: 769.4, 1: 768.9. Samples: 170461. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 16:59:49,673][111881] Avg episode reward: [(0, '4.920'), (1, '5.200')] [2023-09-22 16:59:54,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 5939.7). Total num frames: 729088. Throughput: 0: 769.4, 1: 768.4. Samples: 175229. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 16:59:54,673][111881] Avg episode reward: [(0, '5.050'), (1, '5.370')] [2023-09-22 16:59:58,299][112938] Updated weights for policy 1, policy_version 1440 (0.0016) [2023-09-22 16:59:58,300][112937] Updated weights for policy 0, policy_version 1504 (0.0016) [2023-09-22 16:59:59,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 5964.2). Total num frames: 761856. Throughput: 0: 770.1, 1: 770.2. Samples: 184401. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 16:59:59,673][111881] Avg episode reward: [(0, '4.950'), (1, '4.820')] [2023-09-22 17:00:04,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5923.9). Total num frames: 786432. Throughput: 0: 775.3, 1: 774.4. Samples: 194184. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:04,673][111881] Avg episode reward: [(0, '4.950'), (1, '5.120')] [2023-09-22 17:00:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5947.2). Total num frames: 819200. Throughput: 0: 773.8, 1: 774.2. Samples: 198682. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:00:09,674][111881] Avg episode reward: [(0, '5.190'), (1, '5.240')] [2023-09-22 17:00:11,281][112937] Updated weights for policy 0, policy_version 1664 (0.0019) [2023-09-22 17:00:11,281][112938] Updated weights for policy 1, policy_version 1600 (0.0018) [2023-09-22 17:00:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5968.8). Total num frames: 851968. Throughput: 0: 776.6, 1: 776.3. Samples: 208312. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:14,674][111881] Avg episode reward: [(0, '5.140'), (1, '5.470')] [2023-09-22 17:00:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5989.0). Total num frames: 884736. Throughput: 0: 778.1, 1: 778.8. Samples: 217524. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:00:19,673][111881] Avg episode reward: [(0, '5.320'), (1, '5.150')] [2023-09-22 17:00:24,396][112937] Updated weights for policy 0, policy_version 1824 (0.0015) [2023-09-22 17:00:24,397][112938] Updated weights for policy 1, policy_version 1760 (0.0018) [2023-09-22 17:00:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6007.8). Total num frames: 917504. Throughput: 0: 780.3, 1: 781.0. Samples: 222262. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:24,674][111881] Avg episode reward: [(0, '5.520'), (1, '5.250')] [2023-09-22 17:00:24,675][112639] Saving new best policy, reward=5.520! [2023-09-22 17:00:29,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5972.6). Total num frames: 942080. Throughput: 0: 777.4, 1: 776.5. Samples: 231425. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:00:29,673][111881] Avg episode reward: [(0, '5.380'), (1, '4.970')] [2023-09-22 17:00:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5990.7). Total num frames: 974848. Throughput: 0: 781.1, 1: 781.2. Samples: 240763. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:34,674][111881] Avg episode reward: [(0, '5.130'), (1, '4.970')] [2023-09-22 17:00:37,727][112938] Updated weights for policy 1, policy_version 1920 (0.0017) [2023-09-22 17:00:37,727][112937] Updated weights for policy 0, policy_version 1984 (0.0017) [2023-09-22 17:00:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6007.8). Total num frames: 1007616. Throughput: 0: 780.2, 1: 780.4. Samples: 245456. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:00:39,674][111881] Avg episode reward: [(0, '4.800'), (1, '5.220')] [2023-09-22 17:00:44,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6023.8). Total num frames: 1040384. Throughput: 0: 781.4, 1: 781.2. Samples: 254719. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:00:44,674][111881] Avg episode reward: [(0, '4.720'), (1, '5.720')] [2023-09-22 17:00:44,683][112735] Saving new best policy, reward=5.720! [2023-09-22 17:00:49,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6039.0). Total num frames: 1073152. Throughput: 0: 778.3, 1: 777.4. Samples: 264192. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:00:49,673][111881] Avg episode reward: [(0, '4.510'), (1, '5.370')] [2023-09-22 17:00:50,786][112937] Updated weights for policy 0, policy_version 2144 (0.0018) [2023-09-22 17:00:50,786][112938] Updated weights for policy 1, policy_version 2080 (0.0017) [2023-09-22 17:00:54,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6007.8). Total num frames: 1097728. Throughput: 0: 777.6, 1: 778.1. Samples: 268689. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:54,674][111881] Avg episode reward: [(0, '4.650'), (1, '5.450')] [2023-09-22 17:00:59,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6022.5). Total num frames: 1130496. Throughput: 0: 776.4, 1: 777.4. Samples: 278235. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:00:59,673][111881] Avg episode reward: [(0, '5.070'), (1, '5.450')] [2023-09-22 17:01:03,911][112937] Updated weights for policy 0, policy_version 2304 (0.0017) [2023-09-22 17:01:03,911][112938] Updated weights for policy 1, policy_version 2240 (0.0016) [2023-09-22 17:01:04,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6036.5). Total num frames: 1163264. Throughput: 0: 778.1, 1: 777.8. Samples: 287539. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:04,674][111881] Avg episode reward: [(0, '5.300'), (1, '5.280')] [2023-09-22 17:01:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6049.8). Total num frames: 1196032. Throughput: 0: 779.2, 1: 778.7. Samples: 292369. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:01:09,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.010')] [2023-09-22 17:01:09,673][112639] Saving new best policy, reward=5.550! [2023-09-22 17:01:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6062.4). Total num frames: 1228800. Throughput: 0: 777.8, 1: 778.8. Samples: 301468. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:14,674][111881] Avg episode reward: [(0, '5.440'), (1, '5.090')] [2023-09-22 17:01:17,103][112937] Updated weights for policy 0, policy_version 2464 (0.0018) [2023-09-22 17:01:17,103][112938] Updated weights for policy 1, policy_version 2400 (0.0015) [2023-09-22 17:01:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6034.4). Total num frames: 1253376. Throughput: 0: 778.0, 1: 780.4. Samples: 310890. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:01:19,674][111881] Avg episode reward: [(0, '4.940'), (1, '5.000')] [2023-09-22 17:01:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6046.7). Total num frames: 1286144. Throughput: 0: 777.3, 1: 776.8. Samples: 315392. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:24,673][111881] Avg episode reward: [(0, '5.080'), (1, '4.930')] [2023-09-22 17:01:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6058.5). Total num frames: 1318912. Throughput: 0: 781.3, 1: 780.5. Samples: 324998. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:29,673][111881] Avg episode reward: [(0, '5.180'), (1, '5.290')] [2023-09-22 17:01:30,269][112937] Updated weights for policy 0, policy_version 2624 (0.0013) [2023-09-22 17:01:30,269][112938] Updated weights for policy 1, policy_version 2560 (0.0017) [2023-09-22 17:01:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6069.8). Total num frames: 1351680. Throughput: 0: 776.1, 1: 776.6. Samples: 334066. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:34,673][111881] Avg episode reward: [(0, '4.650'), (1, '4.920')] [2023-09-22 17:01:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6080.5). Total num frames: 1384448. Throughput: 0: 779.8, 1: 779.8. Samples: 338872. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:39,674][111881] Avg episode reward: [(0, '5.010'), (1, '4.490')] [2023-09-22 17:01:43,583][112937] Updated weights for policy 0, policy_version 2784 (0.0016) [2023-09-22 17:01:43,583][112938] Updated weights for policy 1, policy_version 2720 (0.0019) [2023-09-22 17:01:44,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6055.2). Total num frames: 1409024. Throughput: 0: 777.4, 1: 776.2. Samples: 348150. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 17:01:44,674][111881] Avg episode reward: [(0, '5.390'), (1, '4.990')] [2023-09-22 17:01:44,683][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000002720_696320.pth... [2023-09-22 17:01:44,874][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000002800_716800.pth... 
[2023-09-22 17:01:44,903][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000000064_16384.pth [2023-09-22 17:01:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6065.8). Total num frames: 1441792. Throughput: 0: 772.9, 1: 773.5. Samples: 357125. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:01:49,673][111881] Avg episode reward: [(0, '5.340'), (1, '4.900')] [2023-09-22 17:01:54,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6076.0). Total num frames: 1474560. Throughput: 0: 770.7, 1: 770.4. Samples: 361721. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:01:54,674][111881] Avg episode reward: [(0, '5.660'), (1, '5.260')] [2023-09-22 17:01:54,675][112639] Saving new best policy, reward=5.660! [2023-09-22 17:01:56,917][112938] Updated weights for policy 1, policy_version 2880 (0.0014) [2023-09-22 17:01:56,918][112937] Updated weights for policy 0, policy_version 2944 (0.0018) [2023-09-22 17:01:59,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6085.7). Total num frames: 1507328. Throughput: 0: 770.4, 1: 770.2. Samples: 370793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:01:59,673][111881] Avg episode reward: [(0, '5.940'), (1, '5.570')] [2023-09-22 17:01:59,679][112639] Saving new best policy, reward=5.940! [2023-09-22 17:02:04,672][111881] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6062.3). Total num frames: 1531904. Throughput: 0: 772.5, 1: 769.4. Samples: 380277. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:04,673][111881] Avg episode reward: [(0, '5.870'), (1, '5.390')] [2023-09-22 17:02:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6071.9). Total num frames: 1564672. Throughput: 0: 773.7, 1: 773.7. Samples: 385024. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:09,674][111881] Avg episode reward: [(0, '5.440'), (1, '5.040')] [2023-09-22 17:02:10,121][112937] Updated weights for policy 0, policy_version 3104 (0.0014) [2023-09-22 17:02:10,122][112938] Updated weights for policy 1, policy_version 3040 (0.0016) [2023-09-22 17:02:14,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6081.2). Total num frames: 1597440. Throughput: 0: 768.0, 1: 768.6. Samples: 394147. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:14,674][111881] Avg episode reward: [(0, '5.550'), (1, '5.110')] [2023-09-22 17:02:19,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6090.1). Total num frames: 1630208. Throughput: 0: 771.2, 1: 770.8. Samples: 403457. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:19,674][111881] Avg episode reward: [(0, '5.490'), (1, '5.060')] [2023-09-22 17:02:23,410][112937] Updated weights for policy 0, policy_version 3264 (0.0013) [2023-09-22 17:02:23,411][112938] Updated weights for policy 1, policy_version 3200 (0.0018) [2023-09-22 17:02:24,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6068.4). Total num frames: 1654784. Throughput: 0: 768.2, 1: 767.7. Samples: 407986. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:24,674][111881] Avg episode reward: [(0, '5.330'), (1, '5.220')] [2023-09-22 17:02:29,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6077.2). Total num frames: 1687552. Throughput: 0: 773.0, 1: 773.7. Samples: 417752. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:02:29,673][111881] Avg episode reward: [(0, '5.280'), (1, '5.060')] [2023-09-22 17:02:34,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6085.7). Total num frames: 1720320. Throughput: 0: 771.4, 1: 771.2. Samples: 426541. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:34,674][111881] Avg episode reward: [(0, '5.720'), (1, '4.500')] [2023-09-22 17:02:36,651][112938] Updated weights for policy 1, policy_version 3360 (0.0016) [2023-09-22 17:02:36,652][112937] Updated weights for policy 0, policy_version 3424 (0.0016) [2023-09-22 17:02:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6093.9). Total num frames: 1753088. Throughput: 0: 774.9, 1: 775.0. Samples: 431463. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:39,674][111881] Avg episode reward: [(0, '5.510'), (1, '4.620')] [2023-09-22 17:02:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6101.8). Total num frames: 1785856. Throughput: 0: 776.5, 1: 776.4. Samples: 440673. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:44,673][111881] Avg episode reward: [(0, '5.630'), (1, '4.620')] [2023-09-22 17:02:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6081.7). Total num frames: 1810432. Throughput: 0: 778.0, 1: 778.5. Samples: 450321. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:49,674][111881] Avg episode reward: [(0, '5.540'), (1, '4.560')] [2023-09-22 17:02:49,709][112937] Updated weights for policy 0, policy_version 3584 (0.0017) [2023-09-22 17:02:49,709][112938] Updated weights for policy 1, policy_version 3520 (0.0017) [2023-09-22 17:02:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 1843200. Throughput: 0: 773.6, 1: 773.7. Samples: 454651. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:54,674][111881] Avg episode reward: [(0, '5.640'), (1, '4.620')] [2023-09-22 17:02:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 1875968. Throughput: 0: 770.4, 1: 771.0. Samples: 463510. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:02:59,673][111881] Avg episode reward: [(0, '5.540'), (1, '4.670')] [2023-09-22 17:03:03,237][112937] Updated weights for policy 0, policy_version 3744 (0.0014) [2023-09-22 17:03:03,237][112938] Updated weights for policy 1, policy_version 3680 (0.0016) [2023-09-22 17:03:04,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 1908736. Throughput: 0: 773.7, 1: 773.7. Samples: 473088. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:03:04,674][111881] Avg episode reward: [(0, '6.030'), (1, '4.810')] [2023-09-22 17:03:04,675][112639] Saving new best policy, reward=6.030! [2023-09-22 17:03:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1933312. Throughput: 0: 771.6, 1: 771.8. Samples: 477437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:09,674][111881] Avg episode reward: [(0, '6.140'), (1, '4.820')] [2023-09-22 17:03:09,675][112639] Saving new best policy, reward=6.140! [2023-09-22 17:03:14,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1966080. Throughput: 0: 767.8, 1: 768.1. Samples: 486868. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:03:14,673][111881] Avg episode reward: [(0, '6.330'), (1, '5.250')] [2023-09-22 17:03:14,679][112639] Saving new best policy, reward=6.330! [2023-09-22 17:03:16,548][112937] Updated weights for policy 0, policy_version 3904 (0.0016) [2023-09-22 17:03:16,548][112938] Updated weights for policy 1, policy_version 3840 (0.0016) [2023-09-22 17:03:19,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1998848. Throughput: 0: 770.4, 1: 770.5. Samples: 495883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:19,674][111881] Avg episode reward: [(0, '6.340'), (1, '5.380')] [2023-09-22 17:03:19,675][112639] Saving new best policy, reward=6.340! 
[2023-09-22 17:03:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 2031616. Throughput: 0: 766.2, 1: 766.7. Samples: 500445. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:24,673][111881] Avg episode reward: [(0, '5.860'), (1, '5.440')] [2023-09-22 17:03:29,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2056192. Throughput: 0: 770.1, 1: 769.4. Samples: 509952. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:29,674][111881] Avg episode reward: [(0, '5.570'), (1, '5.280')] [2023-09-22 17:03:29,899][112937] Updated weights for policy 0, policy_version 4064 (0.0016) [2023-09-22 17:03:29,900][112938] Updated weights for policy 1, policy_version 4000 (0.0017) [2023-09-22 17:03:34,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2088960. Throughput: 0: 765.3, 1: 766.0. Samples: 519231. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:34,673][111881] Avg episode reward: [(0, '5.140'), (1, '5.240')] [2023-09-22 17:03:39,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2121728. Throughput: 0: 769.7, 1: 770.1. Samples: 523943. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:39,674][111881] Avg episode reward: [(0, '5.030'), (1, '5.070')] [2023-09-22 17:03:42,979][112938] Updated weights for policy 1, policy_version 4160 (0.0016) [2023-09-22 17:03:42,979][112937] Updated weights for policy 0, policy_version 4224 (0.0017) [2023-09-22 17:03:44,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2154496. Throughput: 0: 773.5, 1: 773.8. Samples: 533141. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:44,674][111881] Avg episode reward: [(0, '5.410'), (1, '5.300')] [2023-09-22 17:03:44,684][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000004176_1069056.pth... 
[2023-09-22 17:03:44,684][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000004240_1085440.pth... [2023-09-22 17:03:44,720][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000001264_323584.pth [2023-09-22 17:03:44,722][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000001328_339968.pth [2023-09-22 17:03:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 2187264. Throughput: 0: 772.8, 1: 772.6. Samples: 542630. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:03:49,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.210')] [2023-09-22 17:03:54,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2211840. Throughput: 0: 773.1, 1: 773.2. Samples: 547023. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:54,674][111881] Avg episode reward: [(0, '5.990'), (1, '5.430')] [2023-09-22 17:03:56,340][112937] Updated weights for policy 0, policy_version 4384 (0.0016) [2023-09-22 17:03:56,340][112938] Updated weights for policy 1, policy_version 4320 (0.0016) [2023-09-22 17:03:59,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2244608. Throughput: 0: 771.6, 1: 771.9. Samples: 556325. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:03:59,673][111881] Avg episode reward: [(0, '6.160'), (1, '5.180')] [2023-09-22 17:04:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2277376. Throughput: 0: 771.6, 1: 771.4. Samples: 565317. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:04:04,673][111881] Avg episode reward: [(0, '6.370'), (1, '5.070')] [2023-09-22 17:04:04,674][112639] Saving new best policy, reward=6.370! [2023-09-22 17:04:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2301952. Throughput: 0: 769.5, 1: 769.4. Samples: 569694. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:04:09,674][111881] Avg episode reward: [(0, '6.800'), (1, '5.200')] [2023-09-22 17:04:09,675][112639] Saving new best policy, reward=6.800! [2023-09-22 17:04:09,919][112937] Updated weights for policy 0, policy_version 4544 (0.0015) [2023-09-22 17:04:09,919][112938] Updated weights for policy 1, policy_version 4480 (0.0018) [2023-09-22 17:04:14,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2334720. Throughput: 0: 763.4, 1: 763.9. Samples: 578678. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:04:14,673][111881] Avg episode reward: [(0, '6.760'), (1, '4.920')] [2023-09-22 17:04:19,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2359296. Throughput: 0: 757.9, 1: 759.0. Samples: 587492. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:04:19,673][111881] Avg episode reward: [(0, '5.910'), (1, '4.850')] [2023-09-22 17:04:23,994][112937] Updated weights for policy 0, policy_version 4704 (0.0016) [2023-09-22 17:04:23,994][112938] Updated weights for policy 1, policy_version 4640 (0.0014) [2023-09-22 17:04:24,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6164.8). Total num frames: 2392064. Throughput: 0: 754.8, 1: 754.5. Samples: 591865. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:04:24,674][111881] Avg episode reward: [(0, '5.610'), (1, '5.200')] [2023-09-22 17:04:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2424832. Throughput: 0: 750.4, 1: 749.9. Samples: 600654. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:04:29,674][111881] Avg episode reward: [(0, '5.440'), (1, '4.670')] [2023-09-22 17:04:34,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2449408. Throughput: 0: 745.6, 1: 747.8. Samples: 609832. 
Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:04:34,673][111881] Avg episode reward: [(0, '5.030'), (1, '4.690')] [2023-09-22 17:04:37,561][112938] Updated weights for policy 1, policy_version 4800 (0.0017) [2023-09-22 17:04:37,562][112937] Updated weights for policy 0, policy_version 4864 (0.0019) [2023-09-22 17:04:39,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2482176. Throughput: 0: 749.0, 1: 748.3. Samples: 614400. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:04:39,673][111881] Avg episode reward: [(0, '5.040'), (1, '4.650')] [2023-09-22 17:04:44,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2514944. Throughput: 0: 738.8, 1: 738.0. Samples: 622783. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:04:44,674][111881] Avg episode reward: [(0, '5.100'), (1, '4.510')] [2023-09-22 17:04:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6137.1). Total num frames: 2539520. Throughput: 0: 736.3, 1: 735.2. Samples: 631537. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:04:49,674][111881] Avg episode reward: [(0, '5.190'), (1, '4.610')] [2023-09-22 17:04:51,705][112938] Updated weights for policy 1, policy_version 4960 (0.0012) [2023-09-22 17:04:51,705][112937] Updated weights for policy 0, policy_version 5024 (0.0016) [2023-09-22 17:04:54,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 2572288. Throughput: 0: 737.1, 1: 737.7. Samples: 636060. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:04:54,674][111881] Avg episode reward: [(0, '5.480'), (1, '4.760')] [2023-09-22 17:04:59,673][111881] Fps is (10 sec: 5734.4, 60 sec: 5870.9, 300 sec: 6137.1). Total num frames: 2596864. Throughput: 0: 738.5, 1: 738.0. Samples: 645120. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:04:59,674][111881] Avg episode reward: [(0, '5.640'), (1, '4.950')] [2023-09-22 17:05:04,672][111881] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 6137.1). Total num frames: 2629632. Throughput: 0: 736.2, 1: 734.5. Samples: 653674. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:05:04,673][111881] Avg episode reward: [(0, '5.860'), (1, '4.600')] [2023-09-22 17:05:05,466][112937] Updated weights for policy 0, policy_version 5184 (0.0016) [2023-09-22 17:05:05,467][112938] Updated weights for policy 1, policy_version 5120 (0.0019) [2023-09-22 17:05:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 2662400. Throughput: 0: 738.5, 1: 739.3. Samples: 658368. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:05:09,673][111881] Avg episode reward: [(0, '5.770'), (1, '4.400')] [2023-09-22 17:05:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 5870.9, 300 sec: 6109.3). Total num frames: 2686976. Throughput: 0: 744.9, 1: 743.9. Samples: 667648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:14,674][111881] Avg episode reward: [(0, '5.720'), (1, '4.400')] [2023-09-22 17:05:18,823][112938] Updated weights for policy 1, policy_version 5280 (0.0015) [2023-09-22 17:05:18,823][112937] Updated weights for policy 0, policy_version 5344 (0.0016) [2023-09-22 17:05:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 2719744. Throughput: 0: 744.0, 1: 742.9. Samples: 676744. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:19,674][111881] Avg episode reward: [(0, '5.660'), (1, '3.970')] [2023-09-22 17:05:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 2752512. Throughput: 0: 741.3, 1: 742.0. Samples: 681152. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:24,674][111881] Avg episode reward: [(0, '5.840'), (1, '4.190')] [2023-09-22 17:05:29,673][111881] Fps is (10 sec: 6143.9, 60 sec: 5939.2, 300 sec: 6123.2). Total num frames: 2781184. Throughput: 0: 749.6, 1: 749.7. Samples: 690251. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:29,674][111881] Avg episode reward: [(0, '6.290'), (1, '4.490')] [2023-09-22 17:05:32,333][112937] Updated weights for policy 0, policy_version 5504 (0.0018) [2023-09-22 17:05:32,333][112938] Updated weights for policy 1, policy_version 5440 (0.0018) [2023-09-22 17:05:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6109.3). Total num frames: 2809856. Throughput: 0: 756.6, 1: 757.4. Samples: 699668. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:05:34,674][111881] Avg episode reward: [(0, '6.090'), (1, '4.800')] [2023-09-22 17:05:39,673][111881] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 2842624. Throughput: 0: 757.5, 1: 756.8. Samples: 704206. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:39,674][111881] Avg episode reward: [(0, '5.790'), (1, '5.050')] [2023-09-22 17:05:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 2875392. Throughput: 0: 753.8, 1: 754.4. Samples: 712990. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:44,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.130')] [2023-09-22 17:05:44,684][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000005584_1429504.pth... [2023-09-22 17:05:44,684][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000005648_1445888.pth... 
[2023-09-22 17:05:44,717][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000002720_696320.pth [2023-09-22 17:05:44,720][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000002800_716800.pth [2023-09-22 17:05:46,056][112938] Updated weights for policy 1, policy_version 5600 (0.0017) [2023-09-22 17:05:46,057][112937] Updated weights for policy 0, policy_version 5664 (0.0017) [2023-09-22 17:05:49,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 2899968. Throughput: 0: 754.6, 1: 755.4. Samples: 721623. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:05:49,673][111881] Avg episode reward: [(0, '5.930'), (1, '5.230')] [2023-09-22 17:05:54,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 2932736. Throughput: 0: 754.3, 1: 754.4. Samples: 726261. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:05:54,673][111881] Avg episode reward: [(0, '5.340'), (1, '5.120')] [2023-09-22 17:05:59,523][112937] Updated weights for policy 0, policy_version 5824 (0.0013) [2023-09-22 17:05:59,524][112938] Updated weights for policy 1, policy_version 5760 (0.0018) [2023-09-22 17:05:59,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 2965504. Throughput: 0: 753.7, 1: 754.8. Samples: 735533. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:05:59,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.790')] [2023-09-22 17:06:04,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 2990080. Throughput: 0: 758.8, 1: 760.1. Samples: 745096. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:04,674][111881] Avg episode reward: [(0, '5.850'), (1, '4.730')] [2023-09-22 17:06:09,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 3022848. Throughput: 0: 760.5, 1: 759.8. Samples: 749568. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:06:09,674][111881] Avg episode reward: [(0, '5.950'), (1, '4.290')] [2023-09-22 17:06:12,830][112937] Updated weights for policy 0, policy_version 5984 (0.0016) [2023-09-22 17:06:12,830][112938] Updated weights for policy 1, policy_version 5920 (0.0016) [2023-09-22 17:06:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 3055616. Throughput: 0: 760.6, 1: 760.8. Samples: 758714. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:14,673][111881] Avg episode reward: [(0, '5.440'), (1, '4.090')] [2023-09-22 17:06:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 3088384. Throughput: 0: 759.5, 1: 759.0. Samples: 768000. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:19,673][111881] Avg episode reward: [(0, '5.700'), (1, '4.360')] [2023-09-22 17:06:24,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 3112960. Throughput: 0: 757.5, 1: 757.4. Samples: 772378. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:24,674][111881] Avg episode reward: [(0, '5.900'), (1, '4.310')] [2023-09-22 17:06:26,381][112937] Updated weights for policy 0, policy_version 6144 (0.0024) [2023-09-22 17:06:26,382][112938] Updated weights for policy 1, policy_version 6080 (0.0017) [2023-09-22 17:06:29,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6075.7, 300 sec: 6081.5). Total num frames: 3145728. Throughput: 0: 761.4, 1: 761.5. Samples: 781521. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:06:29,674][111881] Avg episode reward: [(0, '6.070'), (1, '4.550')] [2023-09-22 17:06:34,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3178496. Throughput: 0: 766.3, 1: 764.9. Samples: 790528. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:06:34,674][111881] Avg episode reward: [(0, '6.110'), (1, '4.630')] [2023-09-22 17:06:39,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 3203072. Throughput: 0: 762.8, 1: 762.8. Samples: 794913. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:06:39,674][111881] Avg episode reward: [(0, '6.250'), (1, '4.690')] [2023-09-22 17:06:39,930][112938] Updated weights for policy 1, policy_version 6240 (0.0016) [2023-09-22 17:06:39,931][112937] Updated weights for policy 0, policy_version 6304 (0.0016) [2023-09-22 17:06:44,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 3235840. Throughput: 0: 762.2, 1: 762.2. Samples: 804131. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:06:44,673][111881] Avg episode reward: [(0, '6.620'), (1, '5.000')] [2023-09-22 17:06:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3268608. Throughput: 0: 756.2, 1: 754.0. Samples: 813056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:49,674][111881] Avg episode reward: [(0, '6.760'), (1, '5.170')] [2023-09-22 17:06:53,435][112938] Updated weights for policy 1, policy_version 6400 (0.0016) [2023-09-22 17:06:53,436][112937] Updated weights for policy 0, policy_version 6464 (0.0019) [2023-09-22 17:06:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6053.7). Total num frames: 3293184. Throughput: 0: 755.0, 1: 756.1. Samples: 817566. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:54,674][111881] Avg episode reward: [(0, '6.300'), (1, '5.390')] [2023-09-22 17:06:59,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.4, 300 sec: 6081.5). Total num frames: 3325952. Throughput: 0: 760.9, 1: 760.1. Samples: 827158. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:06:59,674][111881] Avg episode reward: [(0, '6.430'), (1, '5.180')] [2023-09-22 17:07:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3358720. Throughput: 0: 757.7, 1: 758.2. Samples: 836215. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:04,673][111881] Avg episode reward: [(0, '6.290'), (1, '4.840')] [2023-09-22 17:07:06,710][112937] Updated weights for policy 0, policy_version 6624 (0.0017) [2023-09-22 17:07:06,711][112938] Updated weights for policy 1, policy_version 6560 (0.0015) [2023-09-22 17:07:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3391488. Throughput: 0: 761.9, 1: 761.8. Samples: 840946. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:07:09,673][111881] Avg episode reward: [(0, '5.960'), (1, '4.620')] [2023-09-22 17:07:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3424256. Throughput: 0: 762.2, 1: 762.3. Samples: 850120. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:14,674][111881] Avg episode reward: [(0, '5.700'), (1, '4.750')] [2023-09-22 17:07:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6081.5). Total num frames: 3448832. Throughput: 0: 765.6, 1: 765.7. Samples: 859435. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:19,673][111881] Avg episode reward: [(0, '5.810'), (1, '4.670')] [2023-09-22 17:07:19,986][112937] Updated weights for policy 0, policy_version 6784 (0.0016) [2023-09-22 17:07:19,986][112938] Updated weights for policy 1, policy_version 6720 (0.0017) [2023-09-22 17:07:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3481600. Throughput: 0: 767.8, 1: 767.7. Samples: 864007. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:24,673][111881] Avg episode reward: [(0, '5.590'), (1, '4.670')] [2023-09-22 17:07:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3514368. Throughput: 0: 763.2, 1: 763.1. Samples: 872817. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:07:29,674][111881] Avg episode reward: [(0, '5.040'), (1, '4.730')] [2023-09-22 17:07:33,582][112937] Updated weights for policy 0, policy_version 6944 (0.0016) [2023-09-22 17:07:33,582][112938] Updated weights for policy 1, policy_version 6880 (0.0016) [2023-09-22 17:07:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.8). Total num frames: 3538944. Throughput: 0: 766.4, 1: 767.3. Samples: 882070. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:07:34,673][111881] Avg episode reward: [(0, '5.250'), (1, '4.910')] [2023-09-22 17:07:39,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3571712. Throughput: 0: 768.5, 1: 767.5. Samples: 886686. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:07:39,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.940')] [2023-09-22 17:07:44,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3604480. Throughput: 0: 759.5, 1: 760.2. Samples: 895546. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:07:44,674][111881] Avg episode reward: [(0, '5.340'), (1, '4.920')] [2023-09-22 17:07:44,685][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000007072_1810432.pth... [2023-09-22 17:07:44,685][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000007008_1794048.pth... 
[2023-09-22 17:07:44,723][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000004176_1069056.pth [2023-09-22 17:07:44,724][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000004240_1085440.pth [2023-09-22 17:07:47,107][112937] Updated weights for policy 0, policy_version 7104 (0.0015) [2023-09-22 17:07:47,107][112938] Updated weights for policy 1, policy_version 7040 (0.0015) [2023-09-22 17:07:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3629056. Throughput: 0: 763.7, 1: 764.5. Samples: 904983. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:49,674][111881] Avg episode reward: [(0, '5.270'), (1, '5.200')] [2023-09-22 17:07:54,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3661824. Throughput: 0: 761.0, 1: 761.0. Samples: 909437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:54,673][111881] Avg episode reward: [(0, '5.500'), (1, '5.270')] [2023-09-22 17:07:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 3694592. Throughput: 0: 766.4, 1: 765.3. Samples: 919047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:07:59,673][111881] Avg episode reward: [(0, '5.640'), (1, '5.250')] [2023-09-22 17:08:00,300][112937] Updated weights for policy 0, policy_version 7264 (0.0017) [2023-09-22 17:08:00,300][112938] Updated weights for policy 1, policy_version 7200 (0.0016) [2023-09-22 17:08:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3727360. Throughput: 0: 759.1, 1: 759.6. Samples: 927776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:04,673][111881] Avg episode reward: [(0, '5.260'), (1, '5.360')] [2023-09-22 17:08:09,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3751936. Throughput: 0: 759.5, 1: 759.8. Samples: 932376. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:08:09,673][111881] Avg episode reward: [(0, '5.330'), (1, '5.230')] [2023-09-22 17:08:13,846][112938] Updated weights for policy 1, policy_version 7360 (0.0017) [2023-09-22 17:08:13,846][112937] Updated weights for policy 0, policy_version 7424 (0.0017) [2023-09-22 17:08:14,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6007.5, 300 sec: 6053.7). Total num frames: 3784704. Throughput: 0: 764.0, 1: 763.4. Samples: 941547. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:08:14,674][111881] Avg episode reward: [(0, '5.510'), (1, '4.890')] [2023-09-22 17:08:19,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3817472. Throughput: 0: 763.2, 1: 762.9. Samples: 950741. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:19,673][111881] Avg episode reward: [(0, '5.100'), (1, '4.980')] [2023-09-22 17:08:24,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3850240. Throughput: 0: 764.3, 1: 765.7. Samples: 955533. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:24,673][111881] Avg episode reward: [(0, '5.160'), (1, '4.960')] [2023-09-22 17:08:27,032][112938] Updated weights for policy 1, policy_version 7520 (0.0019) [2023-09-22 17:08:27,032][112937] Updated weights for policy 0, policy_version 7584 (0.0018) [2023-09-22 17:08:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 3883008. Throughput: 0: 768.0, 1: 768.1. Samples: 964671. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:29,673][111881] Avg episode reward: [(0, '5.200'), (1, '4.910')] [2023-09-22 17:08:34,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 3907584. Throughput: 0: 769.7, 1: 770.0. Samples: 974272. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:34,674][111881] Avg episode reward: [(0, '5.220'), (1, '4.690')] [2023-09-22 17:08:39,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 3940352. Throughput: 0: 772.0, 1: 772.0. Samples: 978915. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:39,674][111881] Avg episode reward: [(0, '4.980'), (1, '4.820')] [2023-09-22 17:08:40,310][112937] Updated weights for policy 0, policy_version 7744 (0.0014) [2023-09-22 17:08:40,312][112938] Updated weights for policy 1, policy_version 7680 (0.0017) [2023-09-22 17:08:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6053.8). Total num frames: 3973120. Throughput: 0: 762.5, 1: 764.1. Samples: 987745. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:44,673][111881] Avg episode reward: [(0, '4.900'), (1, '4.560')] [2023-09-22 17:08:49,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6081.5). Total num frames: 4005888. Throughput: 0: 773.6, 1: 772.5. Samples: 997351. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:49,674][111881] Avg episode reward: [(0, '4.760'), (1, '4.390')] [2023-09-22 17:08:53,722][112937] Updated weights for policy 0, policy_version 7904 (0.0019) [2023-09-22 17:08:53,722][112938] Updated weights for policy 1, policy_version 7840 (0.0017) [2023-09-22 17:08:54,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 4030464. Throughput: 0: 768.4, 1: 767.3. Samples: 1001482. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:54,674][111881] Avg episode reward: [(0, '4.590'), (1, '4.540')] [2023-09-22 17:08:59,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 4063232. Throughput: 0: 771.6, 1: 772.2. Samples: 1011021. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:08:59,674][111881] Avg episode reward: [(0, '4.440'), (1, '4.600')] [2023-09-22 17:09:04,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4096000. Throughput: 0: 768.8, 1: 768.2. Samples: 1019904. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 17:09:04,673][111881] Avg episode reward: [(0, '4.650'), (1, '4.830')] [2023-09-22 17:09:07,175][112937] Updated weights for policy 0, policy_version 8064 (0.0017) [2023-09-22 17:09:07,175][112938] Updated weights for policy 1, policy_version 8000 (0.0017) [2023-09-22 17:09:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6053.7). Total num frames: 4120576. Throughput: 0: 764.4, 1: 764.1. Samples: 1024317. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 17:09:09,674][111881] Avg episode reward: [(0, '5.190'), (1, '5.200')] [2023-09-22 17:09:14,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4153344. Throughput: 0: 767.4, 1: 767.1. Samples: 1033726. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:14,673][111881] Avg episode reward: [(0, '5.380'), (1, '4.990')] [2023-09-22 17:09:19,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4186112. Throughput: 0: 763.8, 1: 762.9. Samples: 1042973. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:09:19,674][111881] Avg episode reward: [(0, '5.620'), (1, '5.120')] [2023-09-22 17:09:20,386][112937] Updated weights for policy 0, policy_version 8224 (0.0016) [2023-09-22 17:09:20,386][112938] Updated weights for policy 1, policy_version 8160 (0.0016) [2023-09-22 17:09:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4218880. Throughput: 0: 765.0, 1: 764.8. Samples: 1047755. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:09:24,673][111881] Avg episode reward: [(0, '5.540'), (1, '4.870')] [2023-09-22 17:09:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6143.9, 300 sec: 6109.3). Total num frames: 4251648. Throughput: 0: 770.6, 1: 770.3. Samples: 1057088. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:29,674][111881] Avg episode reward: [(0, '5.130'), (1, '4.500')] [2023-09-22 17:09:33,432][112938] Updated weights for policy 1, policy_version 8320 (0.0015) [2023-09-22 17:09:33,432][112937] Updated weights for policy 0, policy_version 8384 (0.0017) [2023-09-22 17:09:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4276224. Throughput: 0: 770.3, 1: 771.4. Samples: 1066727. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:34,673][111881] Avg episode reward: [(0, '4.870'), (1, '4.970')] [2023-09-22 17:09:39,672][111881] Fps is (10 sec: 5734.7, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 4308992. Throughput: 0: 774.7, 1: 775.4. Samples: 1071239. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:39,673][111881] Avg episode reward: [(0, '4.770'), (1, '5.040')] [2023-09-22 17:09:44,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 4341760. Throughput: 0: 775.6, 1: 775.1. Samples: 1080800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:44,674][111881] Avg episode reward: [(0, '5.210'), (1, '5.090')] [2023-09-22 17:09:44,686][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000008448_2162688.pth... [2023-09-22 17:09:44,686][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000008512_2179072.pth... 
[2023-09-22 17:09:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000005584_1429504.pth [2023-09-22 17:09:44,721][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000005648_1445888.pth [2023-09-22 17:09:46,517][112937] Updated weights for policy 0, policy_version 8544 (0.0014) [2023-09-22 17:09:46,517][112938] Updated weights for policy 1, policy_version 8480 (0.0017) [2023-09-22 17:09:49,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 4374528. Throughput: 0: 777.4, 1: 778.5. Samples: 1089916. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:49,673][111881] Avg episode reward: [(0, '5.260'), (1, '5.370')] [2023-09-22 17:09:54,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6137.1). Total num frames: 4407296. Throughput: 0: 783.4, 1: 782.2. Samples: 1094767. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:54,673][111881] Avg episode reward: [(0, '5.640'), (1, '5.080')] [2023-09-22 17:09:59,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6123.2). Total num frames: 4435968. Throughput: 0: 780.3, 1: 780.4. Samples: 1103960. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:09:59,674][111881] Avg episode reward: [(0, '5.930'), (1, '5.150')] [2023-09-22 17:09:59,694][112937] Updated weights for policy 0, policy_version 8704 (0.0018) [2023-09-22 17:09:59,694][112938] Updated weights for policy 1, policy_version 8640 (0.0020) [2023-09-22 17:10:04,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 4464640. Throughput: 0: 784.8, 1: 784.2. Samples: 1113576. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:10:04,674][111881] Avg episode reward: [(0, '6.110'), (1, '5.540')] [2023-09-22 17:10:09,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 4497408. Throughput: 0: 782.6, 1: 782.5. Samples: 1118189. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:10:09,674][111881] Avg episode reward: [(0, '6.060'), (1, '5.320')] [2023-09-22 17:10:12,939][112937] Updated weights for policy 0, policy_version 8864 (0.0017) [2023-09-22 17:10:12,940][112938] Updated weights for policy 1, policy_version 8800 (0.0016) [2023-09-22 17:10:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 4530176. Throughput: 0: 778.7, 1: 778.8. Samples: 1127176. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:10:14,674][111881] Avg episode reward: [(0, '6.170'), (1, '5.490')] [2023-09-22 17:10:19,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 4554752. Throughput: 0: 773.3, 1: 774.6. Samples: 1136383. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:10:19,674][111881] Avg episode reward: [(0, '6.100'), (1, '4.960')] [2023-09-22 17:10:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6123.2). Total num frames: 4587520. Throughput: 0: 772.6, 1: 772.0. Samples: 1140746. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:10:24,673][111881] Avg episode reward: [(0, '5.630'), (1, '4.670')] [2023-09-22 17:10:26,272][112937] Updated weights for policy 0, policy_version 9024 (0.0017) [2023-09-22 17:10:26,272][112938] Updated weights for policy 1, policy_version 8960 (0.0017) [2023-09-22 17:10:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4620288. Throughput: 0: 773.4, 1: 773.0. Samples: 1150390. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:10:29,673][111881] Avg episode reward: [(0, '5.710'), (1, '4.960')] [2023-09-22 17:10:34,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 4653056. Throughput: 0: 770.1, 1: 769.0. Samples: 1159174. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:10:34,674][111881] Avg episode reward: [(0, '5.720'), (1, '4.500')] [2023-09-22 17:10:39,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 4677632. Throughput: 0: 763.8, 1: 765.2. Samples: 1163573. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:10:39,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.620')] [2023-09-22 17:10:39,888][112938] Updated weights for policy 1, policy_version 9120 (0.0016) [2023-09-22 17:10:39,888][112937] Updated weights for policy 0, policy_version 9184 (0.0018) [2023-09-22 17:10:44,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4710400. Throughput: 0: 766.6, 1: 766.2. Samples: 1172932. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:10:44,674][111881] Avg episode reward: [(0, '6.050'), (1, '4.300')] [2023-09-22 17:10:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4743168. Throughput: 0: 757.1, 1: 757.3. Samples: 1181723. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:10:49,674][111881] Avg episode reward: [(0, '5.680'), (1, '4.370')] [2023-09-22 17:10:53,421][112937] Updated weights for policy 0, policy_version 9344 (0.0015) [2023-09-22 17:10:53,422][112938] Updated weights for policy 1, policy_version 9280 (0.0017) [2023-09-22 17:10:54,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 4767744. Throughput: 0: 756.7, 1: 757.4. Samples: 1186322. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:10:54,673][111881] Avg episode reward: [(0, '5.750'), (1, '4.450')] [2023-09-22 17:10:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6075.8, 300 sec: 6137.1). Total num frames: 4800512. Throughput: 0: 761.7, 1: 761.4. Samples: 1195713. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:10:59,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.590')] [2023-09-22 17:11:04,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4833280. Throughput: 0: 762.1, 1: 761.2. Samples: 1204935. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:11:04,674][111881] Avg episode reward: [(0, '5.770'), (1, '5.070')] [2023-09-22 17:11:06,658][112938] Updated weights for policy 1, policy_version 9440 (0.0016) [2023-09-22 17:11:06,658][112937] Updated weights for policy 0, policy_version 9504 (0.0017) [2023-09-22 17:11:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4866048. Throughput: 0: 765.3, 1: 765.9. Samples: 1209652. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:11:09,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.040')] [2023-09-22 17:11:14,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6109.3). Total num frames: 4890624. Throughput: 0: 757.5, 1: 757.4. Samples: 1218560. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:11:14,674][111881] Avg episode reward: [(0, '5.160'), (1, '5.360')] [2023-09-22 17:11:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4923392. Throughput: 0: 757.9, 1: 758.5. Samples: 1227411. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:19,674][111881] Avg episode reward: [(0, '5.000'), (1, '5.450')] [2023-09-22 17:11:20,315][112937] Updated weights for policy 0, policy_version 9664 (0.0017) [2023-09-22 17:11:20,315][112938] Updated weights for policy 1, policy_version 9600 (0.0018) [2023-09-22 17:11:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4956160. Throughput: 0: 762.0, 1: 761.3. Samples: 1232120. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:24,674][111881] Avg episode reward: [(0, '4.910'), (1, '4.940')] [2023-09-22 17:11:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 4988928. Throughput: 0: 758.8, 1: 759.6. Samples: 1241261. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:29,674][111881] Avg episode reward: [(0, '5.160'), (1, '5.090')] [2023-09-22 17:11:33,690][112938] Updated weights for policy 1, policy_version 9760 (0.0014) [2023-09-22 17:11:33,691][112937] Updated weights for policy 0, policy_version 9824 (0.0017) [2023-09-22 17:11:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 5013504. Throughput: 0: 764.6, 1: 765.0. Samples: 1250557. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:34,674][111881] Avg episode reward: [(0, '5.350'), (1, '4.790')] [2023-09-22 17:11:39,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5046272. Throughput: 0: 766.7, 1: 767.0. Samples: 1255339. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:39,673][111881] Avg episode reward: [(0, '6.300'), (1, '5.080')] [2023-09-22 17:11:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5079040. Throughput: 0: 762.8, 1: 763.5. Samples: 1264397. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:11:44,673][111881] Avg episode reward: [(0, '6.250'), (1, '5.430')] [2023-09-22 17:11:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000009888_2531328.pth... [2023-09-22 17:11:44,682][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000009952_2547712.pth... 
[2023-09-22 17:11:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000007008_1794048.pth [2023-09-22 17:11:44,720][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000007072_1810432.pth [2023-09-22 17:11:46,996][112938] Updated weights for policy 1, policy_version 9920 (0.0016) [2023-09-22 17:11:46,997][112937] Updated weights for policy 0, policy_version 9984 (0.0017) [2023-09-22 17:11:49,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5111808. Throughput: 0: 765.3, 1: 765.1. Samples: 1273803. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:49,673][111881] Avg episode reward: [(0, '6.340'), (1, '5.470')] [2023-09-22 17:11:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5136384. Throughput: 0: 761.2, 1: 761.8. Samples: 1278187. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:54,674][111881] Avg episode reward: [(0, '6.100'), (1, '5.340')] [2023-09-22 17:11:59,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5169152. Throughput: 0: 767.9, 1: 767.8. Samples: 1287668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:11:59,673][111881] Avg episode reward: [(0, '6.040'), (1, '5.780')] [2023-09-22 17:11:59,681][112735] Saving new best policy, reward=5.780! [2023-09-22 17:12:00,352][112937] Updated weights for policy 0, policy_version 10144 (0.0018) [2023-09-22 17:12:00,353][112938] Updated weights for policy 1, policy_version 10080 (0.0018) [2023-09-22 17:12:04,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5201920. Throughput: 0: 766.8, 1: 767.0. Samples: 1296435. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:12:04,674][111881] Avg episode reward: [(0, '5.580'), (1, '5.330')] [2023-09-22 17:12:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5234688. 
Throughput: 0: 768.7, 1: 769.1. Samples: 1301321. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:12:09,673][111881] Avg episode reward: [(0, '5.380'), (1, '5.220')] [2023-09-22 17:12:13,446][112937] Updated weights for policy 0, policy_version 10304 (0.0017) [2023-09-22 17:12:13,446][112938] Updated weights for policy 1, policy_version 10240 (0.0017) [2023-09-22 17:12:14,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5259264. Throughput: 0: 772.3, 1: 771.3. Samples: 1310720. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:12:14,674][111881] Avg episode reward: [(0, '5.030'), (1, '5.110')] [2023-09-22 17:12:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5292032. Throughput: 0: 771.0, 1: 771.6. Samples: 1319973. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:12:19,673][111881] Avg episode reward: [(0, '5.130'), (1, '4.980')] [2023-09-22 17:12:24,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5324800. Throughput: 0: 769.8, 1: 769.2. Samples: 1324596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:12:24,673][111881] Avg episode reward: [(0, '5.150'), (1, '4.950')] [2023-09-22 17:12:26,854][112938] Updated weights for policy 1, policy_version 10400 (0.0015) [2023-09-22 17:12:26,854][112937] Updated weights for policy 0, policy_version 10464 (0.0015) [2023-09-22 17:12:29,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5357568. Throughput: 0: 767.9, 1: 767.4. Samples: 1333488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:12:29,673][111881] Avg episode reward: [(0, '5.270'), (1, '5.120')] [2023-09-22 17:12:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5382144. Throughput: 0: 766.8, 1: 767.7. Samples: 1342856. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:12:34,673][111881] Avg episode reward: [(0, '5.420'), (1, '5.290')] [2023-09-22 17:12:39,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5414912. Throughput: 0: 771.7, 1: 770.4. Samples: 1347583. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:12:39,673][111881] Avg episode reward: [(0, '5.410'), (1, '5.670')] [2023-09-22 17:12:40,203][112937] Updated weights for policy 0, policy_version 10624 (0.0015) [2023-09-22 17:12:40,203][112938] Updated weights for policy 1, policy_version 10560 (0.0017) [2023-09-22 17:12:44,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5447680. Throughput: 0: 766.9, 1: 768.2. Samples: 1356750. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:12:44,674][111881] Avg episode reward: [(0, '5.760'), (1, '5.530')] [2023-09-22 17:12:49,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5480448. Throughput: 0: 773.5, 1: 772.7. Samples: 1366016. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:12:49,673][111881] Avg episode reward: [(0, '5.830'), (1, '5.220')] [2023-09-22 17:12:53,476][112938] Updated weights for policy 1, policy_version 10720 (0.0016) [2023-09-22 17:12:53,476][112937] Updated weights for policy 0, policy_version 10784 (0.0017) [2023-09-22 17:12:54,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5505024. Throughput: 0: 768.0, 1: 768.0. Samples: 1370440. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:12:54,674][111881] Avg episode reward: [(0, '5.640'), (1, '4.910')] [2023-09-22 17:12:59,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5537792. Throughput: 0: 771.0, 1: 771.8. Samples: 1380149. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:12:59,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.770')] [2023-09-22 17:13:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5570560. Throughput: 0: 765.6, 1: 765.0. Samples: 1388853. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:13:04,673][111881] Avg episode reward: [(0, '5.740'), (1, '5.060')] [2023-09-22 17:13:06,847][112937] Updated weights for policy 0, policy_version 10944 (0.0015) [2023-09-22 17:13:06,847][112938] Updated weights for policy 1, policy_version 10880 (0.0015) [2023-09-22 17:13:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 5603328. Throughput: 0: 766.7, 1: 766.6. Samples: 1393595. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:13:09,673][111881] Avg episode reward: [(0, '5.250'), (1, '5.180')] [2023-09-22 17:13:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5627904. Throughput: 0: 771.4, 1: 770.6. Samples: 1402880. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:13:14,674][111881] Avg episode reward: [(0, '5.110'), (1, '5.430')] [2023-09-22 17:13:19,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5660672. Throughput: 0: 767.2, 1: 766.6. Samples: 1411876. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:13:19,674][111881] Avg episode reward: [(0, '4.670'), (1, '5.540')] [2023-09-22 17:13:20,323][112937] Updated weights for policy 0, policy_version 11104 (0.0014) [2023-09-22 17:13:20,324][112938] Updated weights for policy 1, policy_version 11040 (0.0016) [2023-09-22 17:13:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5693440. Throughput: 0: 761.7, 1: 762.9. Samples: 1416191. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:24,673][111881] Avg episode reward: [(0, '4.520'), (1, '5.440')] [2023-09-22 17:13:29,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.4, 300 sec: 6137.1). Total num frames: 5718016. Throughput: 0: 763.5, 1: 762.2. Samples: 1425408. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:29,674][111881] Avg episode reward: [(0, '4.810'), (1, '5.540')] [2023-09-22 17:13:34,090][112937] Updated weights for policy 0, policy_version 11264 (0.0015) [2023-09-22 17:13:34,091][112938] Updated weights for policy 1, policy_version 11200 (0.0017) [2023-09-22 17:13:34,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5750784. Throughput: 0: 756.1, 1: 756.8. Samples: 1434097. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:34,674][111881] Avg episode reward: [(0, '4.760'), (1, '5.280')] [2023-09-22 17:13:39,673][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5783552. Throughput: 0: 761.2, 1: 761.6. Samples: 1438963. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:39,674][111881] Avg episode reward: [(0, '5.050'), (1, '5.000')] [2023-09-22 17:13:44,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6075.8, 300 sec: 6123.2). Total num frames: 5812224. Throughput: 0: 754.4, 1: 754.3. Samples: 1448039. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:13:44,673][111881] Avg episode reward: [(0, '5.170'), (1, '4.890')] [2023-09-22 17:13:44,684][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000011328_2899968.pth... [2023-09-22 17:13:44,696][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000011392_2916352.pth... 
[2023-09-22 17:13:44,720][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000008448_2162688.pth [2023-09-22 17:13:44,735][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000008512_2179072.pth [2023-09-22 17:13:47,466][112937] Updated weights for policy 0, policy_version 11424 (0.0015) [2023-09-22 17:13:47,466][112938] Updated weights for policy 1, policy_version 11360 (0.0015) [2023-09-22 17:13:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6137.1). Total num frames: 5840896. Throughput: 0: 757.1, 1: 756.8. Samples: 1456978. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:13:49,674][111881] Avg episode reward: [(0, '5.150'), (1, '4.410')] [2023-09-22 17:13:54,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5873664. Throughput: 0: 755.0, 1: 756.4. Samples: 1461608. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:54,674][111881] Avg episode reward: [(0, '5.330'), (1, '4.630')] [2023-09-22 17:13:59,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5906432. Throughput: 0: 752.6, 1: 753.6. Samples: 1470657. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:13:59,674][111881] Avg episode reward: [(0, '4.900'), (1, '4.380')] [2023-09-22 17:14:00,915][112937] Updated weights for policy 0, policy_version 11584 (0.0019) [2023-09-22 17:14:00,916][112938] Updated weights for policy 1, policy_version 11520 (0.0017) [2023-09-22 17:14:04,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 5931008. Throughput: 0: 759.0, 1: 758.7. Samples: 1480172. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:14:04,673][111881] Avg episode reward: [(0, '4.930'), (1, '4.360')] [2023-09-22 17:14:09,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 5963776. Throughput: 0: 763.0, 1: 761.7. Samples: 1484800. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:14:09,673][111881] Avg episode reward: [(0, '4.810'), (1, '4.670')] [2023-09-22 17:14:14,134][112937] Updated weights for policy 0, policy_version 11744 (0.0015) [2023-09-22 17:14:14,135][112938] Updated weights for policy 1, policy_version 11680 (0.0017) [2023-09-22 17:14:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 5996544. Throughput: 0: 761.2, 1: 761.9. Samples: 1493949. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:14,674][111881] Avg episode reward: [(0, '4.730'), (1, '4.840')] [2023-09-22 17:14:19,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6029312. Throughput: 0: 768.5, 1: 767.8. Samples: 1503232. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:19,673][111881] Avg episode reward: [(0, '4.750'), (1, '5.120')] [2023-09-22 17:14:24,673][111881] Fps is (10 sec: 6144.0, 60 sec: 6075.7, 300 sec: 6123.2). Total num frames: 6057984. Throughput: 0: 766.4, 1: 765.8. Samples: 1507912. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:24,674][111881] Avg episode reward: [(0, '4.870'), (1, '5.370')] [2023-09-22 17:14:27,343][112937] Updated weights for policy 0, policy_version 11904 (0.0017) [2023-09-22 17:14:27,343][112938] Updated weights for policy 1, policy_version 11840 (0.0017) [2023-09-22 17:14:29,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6086656. Throughput: 0: 772.1, 1: 771.7. Samples: 1517513. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:29,674][111881] Avg episode reward: [(0, '5.060'), (1, '5.440')] [2023-09-22 17:14:34,673][111881] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6119424. Throughput: 0: 772.4, 1: 772.9. Samples: 1526516. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:34,673][111881] Avg episode reward: [(0, '4.820'), (1, '5.270')] [2023-09-22 17:14:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6152192. Throughput: 0: 774.8, 1: 774.0. Samples: 1531306. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:39,673][111881] Avg episode reward: [(0, '5.140'), (1, '4.910')] [2023-09-22 17:14:40,533][112938] Updated weights for policy 1, policy_version 12000 (0.0015) [2023-09-22 17:14:40,534][112937] Updated weights for policy 0, policy_version 12064 (0.0017) [2023-09-22 17:14:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6137.1). Total num frames: 6184960. Throughput: 0: 774.9, 1: 774.5. Samples: 1540381. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:44,673][111881] Avg episode reward: [(0, '5.140'), (1, '4.760')] [2023-09-22 17:14:49,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 6209536. Throughput: 0: 777.0, 1: 775.6. Samples: 1550041. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:49,673][111881] Avg episode reward: [(0, '4.990'), (1, '4.960')] [2023-09-22 17:14:53,684][112938] Updated weights for policy 1, policy_version 12160 (0.0014) [2023-09-22 17:14:53,685][112937] Updated weights for policy 0, policy_version 12224 (0.0018) [2023-09-22 17:14:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6123.2). Total num frames: 6242304. Throughput: 0: 773.7, 1: 773.7. Samples: 1554433. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:54,673][111881] Avg episode reward: [(0, '4.950'), (1, '4.970')] [2023-09-22 17:14:59,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6275072. Throughput: 0: 777.5, 1: 778.9. Samples: 1563987. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:14:59,673][111881] Avg episode reward: [(0, '5.240'), (1, '5.120')] [2023-09-22 17:15:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 6307840. Throughput: 0: 773.8, 1: 773.9. Samples: 1572876. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:15:04,673][111881] Avg episode reward: [(0, '5.280'), (1, '5.030')] [2023-09-22 17:15:06,964][112938] Updated weights for policy 1, policy_version 12320 (0.0018) [2023-09-22 17:15:06,964][112937] Updated weights for policy 0, policy_version 12384 (0.0019) [2023-09-22 17:15:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 6340608. Throughput: 0: 775.6, 1: 775.3. Samples: 1577701. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:15:09,673][111881] Avg episode reward: [(0, '5.670'), (1, '4.750')] [2023-09-22 17:15:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6365184. Throughput: 0: 774.5, 1: 774.1. Samples: 1587200. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:15:14,674][111881] Avg episode reward: [(0, '5.710'), (1, '4.890')] [2023-09-22 17:15:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6397952. Throughput: 0: 775.7, 1: 775.0. Samples: 1596301. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:15:19,673][111881] Avg episode reward: [(0, '5.540'), (1, '4.600')] [2023-09-22 17:15:20,197][112938] Updated weights for policy 1, policy_version 12480 (0.0015) [2023-09-22 17:15:20,197][112937] Updated weights for policy 0, policy_version 12544 (0.0015) [2023-09-22 17:15:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6137.1). Total num frames: 6430720. Throughput: 0: 774.5, 1: 773.9. Samples: 1600986. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:15:24,674][111881] Avg episode reward: [(0, '5.330'), (1, '4.690')] [2023-09-22 17:15:29,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6137.1). Total num frames: 6463488. Throughput: 0: 776.4, 1: 776.5. Samples: 1610262. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:15:29,673][111881] Avg episode reward: [(0, '5.030'), (1, '4.930')] [2023-09-22 17:15:33,428][112937] Updated weights for policy 0, policy_version 12704 (0.0017) [2023-09-22 17:15:33,429][112938] Updated weights for policy 1, policy_version 12640 (0.0017) [2023-09-22 17:15:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6488064. Throughput: 0: 771.7, 1: 773.6. Samples: 1619580. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:15:34,674][111881] Avg episode reward: [(0, '5.090'), (1, '4.630')] [2023-09-22 17:15:39,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6520832. Throughput: 0: 773.7, 1: 773.7. Samples: 1624064. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:15:39,673][111881] Avg episode reward: [(0, '5.330'), (1, '4.970')] [2023-09-22 17:15:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6553600. Throughput: 0: 772.6, 1: 770.6. Samples: 1633432. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:15:44,673][111881] Avg episode reward: [(0, '4.900'), (1, '5.150')] [2023-09-22 17:15:44,681][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000012768_3268608.pth... [2023-09-22 17:15:44,682][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000012832_3284992.pth... 
[2023-09-22 17:15:44,720][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000009952_2547712.pth [2023-09-22 17:15:44,721][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000009888_2531328.pth [2023-09-22 17:15:46,685][112937] Updated weights for policy 0, policy_version 12864 (0.0020) [2023-09-22 17:15:46,685][112938] Updated weights for policy 1, policy_version 12800 (0.0018) [2023-09-22 17:15:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6586368. Throughput: 0: 773.7, 1: 774.2. Samples: 1642530. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:15:49,674][111881] Avg episode reward: [(0, '4.990'), (1, '5.030')] [2023-09-22 17:15:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6619136. Throughput: 0: 773.0, 1: 773.9. Samples: 1647311. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:15:54,673][111881] Avg episode reward: [(0, '5.330'), (1, '5.370')] [2023-09-22 17:15:59,672][111881] Fps is (10 sec: 6144.2, 60 sec: 6212.3, 300 sec: 6150.9). Total num frames: 6647808. Throughput: 0: 773.8, 1: 773.9. Samples: 1656846. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:15:59,673][111881] Avg episode reward: [(0, '5.380'), (1, '5.470')] [2023-09-22 17:15:59,680][112937] Updated weights for policy 0, policy_version 13024 (0.0017) [2023-09-22 17:15:59,680][112938] Updated weights for policy 1, policy_version 12960 (0.0016) [2023-09-22 17:16:04,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6676480. Throughput: 0: 781.2, 1: 782.6. Samples: 1666674. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:04,674][111881] Avg episode reward: [(0, '5.580'), (1, '5.340')] [2023-09-22 17:16:09,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 6709248. Throughput: 0: 780.1, 1: 779.5. Samples: 1671168. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:09,674][111881] Avg episode reward: [(0, '5.570'), (1, '5.310')] [2023-09-22 17:16:12,742][112937] Updated weights for policy 0, policy_version 13184 (0.0016) [2023-09-22 17:16:12,742][112938] Updated weights for policy 1, policy_version 13120 (0.0015) [2023-09-22 17:16:14,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6742016. Throughput: 0: 780.3, 1: 781.0. Samples: 1680520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:16:14,673][111881] Avg episode reward: [(0, '5.510'), (1, '5.220')] [2023-09-22 17:16:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6774784. Throughput: 0: 778.6, 1: 777.4. Samples: 1689600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:16:19,673][111881] Avg episode reward: [(0, '5.670'), (1, '5.270')] [2023-09-22 17:16:24,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 6799360. Throughput: 0: 779.6, 1: 780.0. Samples: 1694246. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:24,674][111881] Avg episode reward: [(0, '5.550'), (1, '5.310')] [2023-09-22 17:16:26,140][112937] Updated weights for policy 0, policy_version 13344 (0.0017) [2023-09-22 17:16:26,140][112938] Updated weights for policy 1, policy_version 13280 (0.0017) [2023-09-22 17:16:29,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 6832128. Throughput: 0: 779.6, 1: 781.0. Samples: 1703655. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:29,674][111881] Avg episode reward: [(0, '5.350'), (1, '5.280')] [2023-09-22 17:16:34,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6164.8). Total num frames: 6864896. Throughput: 0: 779.7, 1: 779.9. Samples: 1712710. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:34,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.200')] [2023-09-22 17:16:39,256][112938] Updated weights for policy 1, policy_version 13440 (0.0013) [2023-09-22 17:16:39,256][112937] Updated weights for policy 0, policy_version 13504 (0.0016) [2023-09-22 17:16:39,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6897664. Throughput: 0: 781.4, 1: 780.7. Samples: 1717604. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:39,673][111881] Avg episode reward: [(0, '6.130'), (1, '5.090')] [2023-09-22 17:16:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 6930432. Throughput: 0: 775.4, 1: 775.9. Samples: 1726654. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:44,673][111881] Avg episode reward: [(0, '5.910'), (1, '5.350')] [2023-09-22 17:16:49,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 6955008. Throughput: 0: 770.9, 1: 770.7. Samples: 1736046. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:49,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.050')] [2023-09-22 17:16:52,678][112937] Updated weights for policy 0, policy_version 13664 (0.0015) [2023-09-22 17:16:52,678][112938] Updated weights for policy 1, policy_version 13600 (0.0015) [2023-09-22 17:16:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 6987776. Throughput: 0: 772.1, 1: 772.2. Samples: 1740660. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:54,674][111881] Avg episode reward: [(0, '6.170'), (1, '4.940')] [2023-09-22 17:16:59,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6164.8). Total num frames: 7020544. Throughput: 0: 767.7, 1: 767.3. Samples: 1749595. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:16:59,673][111881] Avg episode reward: [(0, '6.030'), (1, '4.770')] [2023-09-22 17:17:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 7053312. Throughput: 0: 771.9, 1: 773.4. Samples: 1759135. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:04,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.650')] [2023-09-22 17:17:06,020][112937] Updated weights for policy 0, policy_version 13824 (0.0015) [2023-09-22 17:17:06,020][112938] Updated weights for policy 1, policy_version 13760 (0.0017) [2023-09-22 17:17:09,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7077888. Throughput: 0: 767.9, 1: 767.5. Samples: 1763337. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:17:09,673][111881] Avg episode reward: [(0, '5.620'), (1, '5.220')] [2023-09-22 17:17:14,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7110656. Throughput: 0: 768.5, 1: 768.6. Samples: 1772823. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:17:14,674][111881] Avg episode reward: [(0, '5.070'), (1, '5.220')] [2023-09-22 17:17:19,338][112938] Updated weights for policy 1, policy_version 13920 (0.0017) [2023-09-22 17:17:19,338][112937] Updated weights for policy 0, policy_version 13984 (0.0016) [2023-09-22 17:17:19,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7143424. Throughput: 0: 767.7, 1: 767.3. Samples: 1781785. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:17:19,674][111881] Avg episode reward: [(0, '4.330'), (1, '5.360')] [2023-09-22 17:17:24,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 7176192. Throughput: 0: 764.7, 1: 764.7. Samples: 1786427. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:17:24,674][111881] Avg episode reward: [(0, '4.210'), (1, '5.510')] [2023-09-22 17:17:29,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7200768. Throughput: 0: 770.7, 1: 771.2. Samples: 1796037. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:29,673][111881] Avg episode reward: [(0, '4.540'), (1, '5.500')] [2023-09-22 17:17:32,895][112937] Updated weights for policy 0, policy_version 14144 (0.0016) [2023-09-22 17:17:32,896][112938] Updated weights for policy 1, policy_version 14080 (0.0017) [2023-09-22 17:17:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7233536. Throughput: 0: 761.8, 1: 761.3. Samples: 1804586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:34,674][111881] Avg episode reward: [(0, '4.820'), (1, '5.090')] [2023-09-22 17:17:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7266304. Throughput: 0: 763.2, 1: 764.0. Samples: 1809382. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:39,674][111881] Avg episode reward: [(0, '5.150'), (1, '5.270')] [2023-09-22 17:17:44,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7299072. Throughput: 0: 768.0, 1: 767.8. Samples: 1818706. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:44,674][111881] Avg episode reward: [(0, '5.230'), (1, '5.070')] [2023-09-22 17:17:44,687][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000014224_3641344.pth... [2023-09-22 17:17:44,687][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000014288_3657728.pth... 
[2023-09-22 17:17:44,720][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000011328_2899968.pth [2023-09-22 17:17:44,722][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000011392_2916352.pth [2023-09-22 17:17:45,940][112937] Updated weights for policy 0, policy_version 14304 (0.0016) [2023-09-22 17:17:45,941][112938] Updated weights for policy 1, policy_version 14240 (0.0016) [2023-09-22 17:17:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7323648. Throughput: 0: 766.9, 1: 766.4. Samples: 1828133. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:49,674][111881] Avg episode reward: [(0, '4.970'), (1, '4.790')] [2023-09-22 17:17:54,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7356416. Throughput: 0: 768.0, 1: 770.7. Samples: 1832579. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:54,674][111881] Avg episode reward: [(0, '4.950'), (1, '4.760')] [2023-09-22 17:17:59,555][112938] Updated weights for policy 1, policy_version 14400 (0.0016) [2023-09-22 17:17:59,556][112937] Updated weights for policy 0, policy_version 14464 (0.0016) [2023-09-22 17:17:59,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7389184. Throughput: 0: 762.5, 1: 761.6. Samples: 1841404. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:17:59,674][111881] Avg episode reward: [(0, '4.880'), (1, '4.590')] [2023-09-22 17:18:04,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 7413760. Throughput: 0: 767.7, 1: 767.7. Samples: 1850878. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:04,674][111881] Avg episode reward: [(0, '5.160'), (1, '4.670')] [2023-09-22 17:18:09,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7446528. Throughput: 0: 766.6, 1: 767.0. Samples: 1855438. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:09,673][111881] Avg episode reward: [(0, '5.530'), (1, '4.430')] [2023-09-22 17:18:13,153][112938] Updated weights for policy 1, policy_version 14560 (0.0018) [2023-09-22 17:18:13,153][112937] Updated weights for policy 0, policy_version 14624 (0.0018) [2023-09-22 17:18:14,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7479296. Throughput: 0: 755.7, 1: 755.6. Samples: 1864045. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 17:18:14,673][111881] Avg episode reward: [(0, '5.670'), (1, '4.720')] [2023-09-22 17:18:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 7503872. Throughput: 0: 762.5, 1: 763.0. Samples: 1873233. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0) [2023-09-22 17:18:19,673][111881] Avg episode reward: [(0, '5.410'), (1, '4.960')] [2023-09-22 17:18:24,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 7536640. Throughput: 0: 760.9, 1: 759.9. Samples: 1877819. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:24,674][111881] Avg episode reward: [(0, '5.370'), (1, '5.160')] [2023-09-22 17:18:26,756][112937] Updated weights for policy 0, policy_version 14784 (0.0018) [2023-09-22 17:18:26,756][112938] Updated weights for policy 1, policy_version 14720 (0.0016) [2023-09-22 17:18:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7569408. Throughput: 0: 755.6, 1: 755.3. Samples: 1886696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:29,674][111881] Avg episode reward: [(0, '4.940'), (1, '5.080')] [2023-09-22 17:18:34,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 7593984. Throughput: 0: 753.7, 1: 753.5. Samples: 1895959. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:34,674][111881] Avg episode reward: [(0, '4.880'), (1, '5.080')] [2023-09-22 17:18:39,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6150.9). Total num frames: 7626752. Throughput: 0: 756.2, 1: 753.8. Samples: 1900529. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:18:39,673][111881] Avg episode reward: [(0, '5.200'), (1, '4.990')] [2023-09-22 17:18:40,339][112937] Updated weights for policy 0, policy_version 14944 (0.0018) [2023-09-22 17:18:40,340][112938] Updated weights for policy 1, policy_version 14880 (0.0016) [2023-09-22 17:18:44,672][111881] Fps is (10 sec: 6144.1, 60 sec: 5939.2, 300 sec: 6150.9). Total num frames: 7655424. Throughput: 0: 748.9, 1: 749.0. Samples: 1908807. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:18:44,673][111881] Avg episode reward: [(0, '5.210'), (1, '5.170')] [2023-09-22 17:18:49,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 7684096. Throughput: 0: 750.4, 1: 750.6. Samples: 1918423. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:18:49,673][111881] Avg episode reward: [(0, '5.250'), (1, '5.240')] [2023-09-22 17:18:53,759][112937] Updated weights for policy 0, policy_version 15104 (0.0016) [2023-09-22 17:18:53,759][112938] Updated weights for policy 1, policy_version 15040 (0.0020) [2023-09-22 17:18:54,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 7716864. Throughput: 0: 752.0, 1: 750.9. Samples: 1923072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:18:54,674][111881] Avg episode reward: [(0, '5.570'), (1, '5.170')] [2023-09-22 17:18:59,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 7749632. Throughput: 0: 764.2, 1: 764.1. Samples: 1932822. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:18:59,674][111881] Avg episode reward: [(0, '5.290'), (1, '5.840')] [2023-09-22 17:18:59,688][112735] Saving new best policy, reward=5.840! [2023-09-22 17:19:04,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7782400. Throughput: 0: 767.6, 1: 767.5. Samples: 1942312. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:19:04,674][111881] Avg episode reward: [(0, '4.960'), (1, '5.640')] [2023-09-22 17:19:06,509][112937] Updated weights for policy 0, policy_version 15264 (0.0016) [2023-09-22 17:19:06,509][112938] Updated weights for policy 1, policy_version 15200 (0.0017) [2023-09-22 17:19:09,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7815168. Throughput: 0: 771.0, 1: 771.4. Samples: 1947227. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:19:09,673][111881] Avg episode reward: [(0, '4.940'), (1, '5.270')] [2023-09-22 17:19:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7847936. Throughput: 0: 775.6, 1: 776.0. Samples: 1956519. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:19:14,674][111881] Avg episode reward: [(0, '4.490'), (1, '4.970')] [2023-09-22 17:19:19,460][112937] Updated weights for policy 0, policy_version 15424 (0.0018) [2023-09-22 17:19:19,460][112938] Updated weights for policy 1, policy_version 15360 (0.0019) [2023-09-22 17:19:19,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6178.7). Total num frames: 7880704. Throughput: 0: 779.5, 1: 778.7. Samples: 1966080. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:19,674][111881] Avg episode reward: [(0, '4.670'), (1, '5.590')] [2023-09-22 17:19:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 7913472. Throughput: 0: 780.5, 1: 781.2. Samples: 1970804. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:24,673][111881] Avg episode reward: [(0, '5.200'), (1, '5.000')] [2023-09-22 17:19:29,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 7938048. Throughput: 0: 793.8, 1: 794.0. Samples: 1980258. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:29,673][111881] Avg episode reward: [(0, '5.580'), (1, '5.210')] [2023-09-22 17:19:32,941][112937] Updated weights for policy 0, policy_version 15584 (0.0017) [2023-09-22 17:19:32,941][112938] Updated weights for policy 1, policy_version 15520 (0.0018) [2023-09-22 17:19:34,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6164.8). Total num frames: 7970816. Throughput: 0: 780.1, 1: 779.7. Samples: 1988611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:34,673][111881] Avg episode reward: [(0, '5.630'), (1, '5.260')] [2023-09-22 17:19:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 8003584. Throughput: 0: 780.0, 1: 780.7. Samples: 1993305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:39,674][111881] Avg episode reward: [(0, '5.850'), (1, '5.110')] [2023-09-22 17:19:44,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6212.2, 300 sec: 6164.8). Total num frames: 8028160. Throughput: 0: 779.6, 1: 778.7. Samples: 2002944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:44,674][111881] Avg episode reward: [(0, '5.760'), (1, '4.460')] [2023-09-22 17:19:44,778][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000015728_4026368.pth... [2023-09-22 17:19:44,805][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000012832_3284992.pth [2023-09-22 17:19:44,808][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000015664_4009984.pth... 
[2023-09-22 17:19:44,840][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000012768_3268608.pth [2023-09-22 17:19:46,144][112937] Updated weights for policy 0, policy_version 15744 (0.0015) [2023-09-22 17:19:46,144][112938] Updated weights for policy 1, policy_version 15680 (0.0016) [2023-09-22 17:19:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 8060928. Throughput: 0: 777.0, 1: 776.8. Samples: 2012235. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:49,674][111881] Avg episode reward: [(0, '5.680'), (1, '4.360')] [2023-09-22 17:19:54,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6164.8). Total num frames: 8093696. Throughput: 0: 774.4, 1: 775.4. Samples: 2016971. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:54,673][111881] Avg episode reward: [(0, '5.370'), (1, '4.260')] [2023-09-22 17:19:59,308][112938] Updated weights for policy 1, policy_version 15840 (0.0017) [2023-09-22 17:19:59,308][112937] Updated weights for policy 0, policy_version 15904 (0.0017) [2023-09-22 17:19:59,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 8126464. Throughput: 0: 773.0, 1: 772.6. Samples: 2026072. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:19:59,674][111881] Avg episode reward: [(0, '5.240'), (1, '4.530')] [2023-09-22 17:20:04,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 8159232. Throughput: 0: 773.7, 1: 773.8. Samples: 2035716. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:20:04,674][111881] Avg episode reward: [(0, '5.290'), (1, '4.680')] [2023-09-22 17:20:09,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 8192000. Throughput: 0: 775.1, 1: 775.1. Samples: 2040565. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:20:09,673][111881] Avg episode reward: [(0, '5.290'), (1, '4.850')] [2023-09-22 17:20:12,089][112937] Updated weights for policy 0, policy_version 16064 (0.0016) [2023-09-22 17:20:12,089][112938] Updated weights for policy 1, policy_version 16000 (0.0016) [2023-09-22 17:20:14,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 8224768. Throughput: 0: 776.0, 1: 775.2. Samples: 2050061. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:20:14,674][111881] Avg episode reward: [(0, '5.240'), (1, '5.030')] [2023-09-22 17:20:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 8249344. Throughput: 0: 782.4, 1: 782.5. Samples: 2059031. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:20:19,673][111881] Avg episode reward: [(0, '5.240'), (1, '4.820')] [2023-09-22 17:20:24,672][111881] Fps is (10 sec: 4096.1, 60 sec: 5870.9, 300 sec: 6109.3). Total num frames: 8265728. Throughput: 0: 767.4, 1: 766.7. Samples: 2062339. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:20:24,673][111881] Avg episode reward: [(0, '5.300'), (1, '5.150')] [2023-09-22 17:20:29,672][111881] Fps is (10 sec: 3276.8, 60 sec: 5734.4, 300 sec: 6081.5). Total num frames: 8282112. Throughput: 0: 700.6, 1: 703.0. Samples: 2066106. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:20:29,674][111881] Avg episode reward: [(0, '5.400'), (1, '5.380')] [2023-09-22 17:20:32,301][112937] Updated weights for policy 0, policy_version 16224 (0.0013) [2023-09-22 17:20:32,302][112938] Updated weights for policy 1, policy_version 16160 (0.0015) [2023-09-22 17:20:34,672][111881] Fps is (10 sec: 2867.2, 60 sec: 5393.1, 300 sec: 6012.1). Total num frames: 8294400. Throughput: 0: 646.7, 1: 646.7. Samples: 2070436. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:20:34,673][111881] Avg episode reward: [(0, '5.350'), (1, '5.270')] [2023-09-22 17:20:39,672][111881] Fps is (10 sec: 3276.8, 60 sec: 5188.3, 300 sec: 5970.4). Total num frames: 8314880. Throughput: 0: 633.9, 1: 633.3. Samples: 2073992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:20:39,673][111881] Avg episode reward: [(0, '5.300'), (1, '5.300')] [2023-09-22 17:20:44,672][111881] Fps is (10 sec: 5324.8, 60 sec: 5324.8, 300 sec: 5970.4). Total num frames: 8347648. Throughput: 0: 614.4, 1: 614.4. Samples: 2081367. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:20:44,673][111881] Avg episode reward: [(0, '5.260'), (1, '5.340')] [2023-09-22 17:20:49,513][112938] Updated weights for policy 1, policy_version 16320 (0.0013) [2023-09-22 17:20:49,513][112937] Updated weights for policy 0, policy_version 16384 (0.0014) [2023-09-22 17:20:49,672][111881] Fps is (10 sec: 5734.4, 60 sec: 5188.3, 300 sec: 5942.7). Total num frames: 8372224. Throughput: 0: 591.6, 1: 591.6. Samples: 2088960. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:20:49,673][111881] Avg episode reward: [(0, '5.240'), (1, '5.500')] [2023-09-22 17:20:54,672][111881] Fps is (10 sec: 4915.3, 60 sec: 5051.7, 300 sec: 5928.8). Total num frames: 8396800. Throughput: 0: 581.8, 1: 581.4. Samples: 2092908. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:20:54,673][111881] Avg episode reward: [(0, '5.200'), (1, '5.570')] [2023-09-22 17:20:59,672][111881] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 5914.9). Total num frames: 8421376. Throughput: 0: 561.9, 1: 562.4. Samples: 2100653. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:20:59,673][111881] Avg episode reward: [(0, '4.990'), (1, '5.160')] [2023-09-22 17:21:04,672][111881] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 5887.1). Total num frames: 8445952. Throughput: 0: 553.8, 1: 553.2. Samples: 2108844. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:21:04,673][111881] Avg episode reward: [(0, '4.970'), (1, '5.200')] [2023-09-22 17:21:05,005][112938] Updated weights for policy 1, policy_version 16480 (0.0013) [2023-09-22 17:21:05,006][112937] Updated weights for policy 0, policy_version 16544 (0.0012) [2023-09-22 17:21:09,672][111881] Fps is (10 sec: 5734.5, 60 sec: 4778.7, 300 sec: 5887.1). Total num frames: 8478720. Throughput: 0: 562.3, 1: 563.9. Samples: 2113015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:21:09,699][111881] Avg episode reward: [(0, '5.170'), (1, '4.950')] [2023-09-22 17:21:14,672][111881] Fps is (10 sec: 5734.3, 60 sec: 4642.1, 300 sec: 5859.4). Total num frames: 8503296. Throughput: 0: 619.2, 1: 616.8. Samples: 2121729. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:21:14,699][111881] Avg episode reward: [(0, '5.280'), (1, '4.930')] [2023-09-22 17:21:19,016][112937] Updated weights for policy 0, policy_version 16704 (0.0011) [2023-09-22 17:21:19,023][112938] Updated weights for policy 1, policy_version 16640 (0.0012) [2023-09-22 17:21:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 4778.7, 300 sec: 5887.1). Total num frames: 8536064. Throughput: 0: 667.6, 1: 667.4. Samples: 2130511. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:21:19,673][111881] Avg episode reward: [(0, '5.460'), (1, '4.730')] [2023-09-22 17:21:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 5051.7, 300 sec: 5887.1). Total num frames: 8568832. Throughput: 0: 676.9, 1: 677.0. Samples: 2134918. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:21:24,699][111881] Avg episode reward: [(0, '5.260'), (1, '4.990')] [2023-09-22 17:21:29,673][111881] Fps is (10 sec: 5734.2, 60 sec: 5188.3, 300 sec: 5859.4). Total num frames: 8593408. Throughput: 0: 695.0, 1: 695.4. Samples: 2143934. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:21:29,674][111881] Avg episode reward: [(0, '5.670'), (1, '5.100')] [2023-09-22 17:21:32,825][112937] Updated weights for policy 0, policy_version 16864 (0.0014) [2023-09-22 17:21:32,827][112938] Updated weights for policy 1, policy_version 16800 (0.0017) [2023-09-22 17:21:34,673][111881] Fps is (10 sec: 5734.3, 60 sec: 5529.6, 300 sec: 5859.4). Total num frames: 8626176. Throughput: 0: 708.6, 1: 709.3. Samples: 2152767. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:21:34,673][111881] Avg episode reward: [(0, '5.710'), (1, '5.230')] [2023-09-22 17:21:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 5597.8, 300 sec: 5831.6). Total num frames: 8650752. Throughput: 0: 715.6, 1: 715.7. Samples: 2157316. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:21:39,674][111881] Avg episode reward: [(0, '5.730'), (1, '5.230')] [2023-09-22 17:21:44,673][111881] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5859.4). Total num frames: 8683520. Throughput: 0: 730.2, 1: 730.2. Samples: 2166370. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:21:44,673][111881] Avg episode reward: [(0, '5.540'), (1, '5.230')] [2023-09-22 17:21:44,684][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000016928_4333568.pth... [2023-09-22 17:21:44,684][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000016992_4349952.pth... [2023-09-22 17:21:44,719][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000014224_3641344.pth [2023-09-22 17:21:44,719][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000014288_3657728.pth [2023-09-22 17:21:46,525][112937] Updated weights for policy 0, policy_version 17024 (0.0014) [2023-09-22 17:21:46,525][112938] Updated weights for policy 1, policy_version 16960 (0.0015) [2023-09-22 17:21:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 5734.4, 300 sec: 5859.4). Total num frames: 8716288. 
Throughput: 0: 736.2, 1: 737.7. Samples: 2175168. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:21:49,673][111881] Avg episode reward: [(0, '5.620'), (1, '5.220')] [2023-09-22 17:21:54,673][111881] Fps is (10 sec: 6144.0, 60 sec: 5802.6, 300 sec: 5845.5). Total num frames: 8744960. Throughput: 0: 742.5, 1: 741.9. Samples: 2179812. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:21:54,673][111881] Avg episode reward: [(0, '5.660'), (1, '4.990')] [2023-09-22 17:21:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 5870.9, 300 sec: 5831.6). Total num frames: 8773632. Throughput: 0: 746.3, 1: 747.9. Samples: 2188968. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:21:59,673][111881] Avg episode reward: [(0, '5.410'), (1, '4.740')] [2023-09-22 17:22:00,150][112937] Updated weights for policy 0, policy_version 17184 (0.0017) [2023-09-22 17:22:00,150][112938] Updated weights for policy 1, policy_version 17120 (0.0017) [2023-09-22 17:22:04,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 8806400. Throughput: 0: 749.9, 1: 749.8. Samples: 2197997. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:22:04,673][111881] Avg episode reward: [(0, '5.370'), (1, '4.980')] [2023-09-22 17:22:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 8839168. Throughput: 0: 753.5, 1: 753.1. Samples: 2202715. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:22:09,673][111881] Avg episode reward: [(0, '5.510'), (1, '4.950')] [2023-09-22 17:22:13,479][112938] Updated weights for policy 1, policy_version 17280 (0.0017) [2023-09-22 17:22:13,479][112937] Updated weights for policy 0, policy_version 17344 (0.0017) [2023-09-22 17:22:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 8863744. Throughput: 0: 755.0, 1: 754.0. Samples: 2211840. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:22:14,673][111881] Avg episode reward: [(0, '5.410'), (1, '5.320')] [2023-09-22 17:22:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 8896512. Throughput: 0: 759.7, 1: 759.5. Samples: 2221131. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:22:19,673][111881] Avg episode reward: [(0, '5.430'), (1, '5.640')] [2023-09-22 17:22:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6007.4, 300 sec: 5859.4). Total num frames: 8929280. Throughput: 0: 763.4, 1: 763.1. Samples: 2226012. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:22:24,673][111881] Avg episode reward: [(0, '5.640'), (1, '5.330')] [2023-09-22 17:22:26,566][112938] Updated weights for policy 1, policy_version 17440 (0.0015) [2023-09-22 17:22:26,567][112937] Updated weights for policy 0, policy_version 17504 (0.0016) [2023-09-22 17:22:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 8962048. Throughput: 0: 765.2, 1: 764.8. Samples: 2235218. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:22:29,674][111881] Avg episode reward: [(0, '5.300'), (1, '5.210')] [2023-09-22 17:22:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 8994816. Throughput: 0: 772.0, 1: 771.1. Samples: 2244608. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:22:34,673][111881] Avg episode reward: [(0, '5.490'), (1, '5.410')] [2023-09-22 17:22:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 9019392. Throughput: 0: 770.6, 1: 770.8. Samples: 2249176. 
Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:22:39,674][111881] Avg episode reward: [(0, '5.180'), (1, '5.030')] [2023-09-22 17:22:39,900][112938] Updated weights for policy 1, policy_version 17600 (0.0016) [2023-09-22 17:22:39,901][112937] Updated weights for policy 0, policy_version 17664 (0.0017) [2023-09-22 17:22:44,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9052160. Throughput: 0: 769.9, 1: 768.8. Samples: 2258212. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:22:44,674][111881] Avg episode reward: [(0, '5.020'), (1, '5.020')] [2023-09-22 17:22:49,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9084928. Throughput: 0: 769.6, 1: 770.0. Samples: 2267278. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:22:49,674][111881] Avg episode reward: [(0, '5.520'), (1, '4.590')] [2023-09-22 17:22:53,287][112937] Updated weights for policy 0, policy_version 17824 (0.0016) [2023-09-22 17:22:53,288][112938] Updated weights for policy 1, policy_version 17760 (0.0016) [2023-09-22 17:22:54,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6212.3, 300 sec: 5859.4). Total num frames: 9117696. Throughput: 0: 769.6, 1: 769.7. Samples: 2271984. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:22:54,673][111881] Avg episode reward: [(0, '5.620'), (1, '4.300')] [2023-09-22 17:22:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9142272. Throughput: 0: 773.3, 1: 773.7. Samples: 2281456. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:22:59,673][111881] Avg episode reward: [(0, '6.010'), (1, '4.270')] [2023-09-22 17:23:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9175040. Throughput: 0: 771.2, 1: 771.1. Samples: 2290533. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:04,673][111881] Avg episode reward: [(0, '5.930'), (1, '4.370')] [2023-09-22 17:23:06,533][112938] Updated weights for policy 1, policy_version 17920 (0.0016) [2023-09-22 17:23:06,533][112937] Updated weights for policy 0, policy_version 17984 (0.0017) [2023-09-22 17:23:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9207808. Throughput: 0: 769.8, 1: 769.4. Samples: 2295278. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:09,673][111881] Avg episode reward: [(0, '6.260'), (1, '4.600')] [2023-09-22 17:23:14,672][111881] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 5873.2). Total num frames: 9236480. Throughput: 0: 765.2, 1: 765.7. Samples: 2304111. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:14,673][111881] Avg episode reward: [(0, '5.720'), (1, '4.640')] [2023-09-22 17:23:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9265152. Throughput: 0: 760.2, 1: 761.6. Samples: 2313087. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:19,674][111881] Avg episode reward: [(0, '5.950'), (1, '4.800')] [2023-09-22 17:23:20,266][112938] Updated weights for policy 1, policy_version 18080 (0.0016) [2023-09-22 17:23:20,266][112937] Updated weights for policy 0, policy_version 18144 (0.0016) [2023-09-22 17:23:24,672][111881] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9297920. Throughput: 0: 762.5, 1: 762.5. Samples: 2317800. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:24,673][111881] Avg episode reward: [(0, '5.610'), (1, '4.940')] [2023-09-22 17:23:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 9330688. Throughput: 0: 764.6, 1: 764.9. Samples: 2327041. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:23:29,674][111881] Avg episode reward: [(0, '5.800'), (1, '5.090')] [2023-09-22 17:23:33,343][112937] Updated weights for policy 0, policy_version 18304 (0.0014) [2023-09-22 17:23:33,343][112938] Updated weights for policy 1, policy_version 18240 (0.0018) [2023-09-22 17:23:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 9363456. Throughput: 0: 770.6, 1: 770.9. Samples: 2336645. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:23:34,673][111881] Avg episode reward: [(0, '5.460'), (1, '5.310')] [2023-09-22 17:23:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5873.2). Total num frames: 9388032. Throughput: 0: 767.0, 1: 767.1. Samples: 2341018. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:23:39,674][111881] Avg episode reward: [(0, '5.730'), (1, '5.210')] [2023-09-22 17:23:44,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 9420800. Throughput: 0: 765.4, 1: 766.6. Samples: 2350396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:44,674][111881] Avg episode reward: [(0, '5.150'), (1, '5.190')] [2023-09-22 17:23:44,685][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000018368_4702208.pth... [2023-09-22 17:23:44,686][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000018432_4718592.pth... [2023-09-22 17:23:44,722][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000015664_4009984.pth [2023-09-22 17:23:44,730][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000015728_4026368.pth [2023-09-22 17:23:46,554][112937] Updated weights for policy 0, policy_version 18464 (0.0018) [2023-09-22 17:23:46,555][112938] Updated weights for policy 1, policy_version 18400 (0.0017) [2023-09-22 17:23:49,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 9453568. 
Throughput: 0: 764.3, 1: 763.9. Samples: 2359305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:49,674][111881] Avg episode reward: [(0, '5.310'), (1, '5.390')] [2023-09-22 17:23:54,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 5859.4). Total num frames: 9478144. Throughput: 0: 758.2, 1: 758.9. Samples: 2363545. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:23:54,673][111881] Avg episode reward: [(0, '5.130'), (1, '5.440')] [2023-09-22 17:23:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9510912. Throughput: 0: 762.7, 1: 763.6. Samples: 2372796. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:23:59,673][111881] Avg episode reward: [(0, '4.880'), (1, '5.430')] [2023-09-22 17:24:00,438][112938] Updated weights for policy 1, policy_version 18560 (0.0016) [2023-09-22 17:24:00,438][112937] Updated weights for policy 0, policy_version 18624 (0.0016) [2023-09-22 17:24:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9543680. Throughput: 0: 765.0, 1: 764.4. Samples: 2381913. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:24:04,673][111881] Avg episode reward: [(0, '5.050'), (1, '5.400')] [2023-09-22 17:24:09,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9576448. Throughput: 0: 767.1, 1: 766.7. Samples: 2386823. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:24:09,674][111881] Avg episode reward: [(0, '4.760'), (1, '5.540')] [2023-09-22 17:24:13,400][112937] Updated weights for policy 0, policy_version 18784 (0.0018) [2023-09-22 17:24:13,400][112938] Updated weights for policy 1, policy_version 18720 (0.0017) [2023-09-22 17:24:14,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6075.7, 300 sec: 5831.6). Total num frames: 9601024. Throughput: 0: 768.4, 1: 767.7. Samples: 2396164. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:24:14,673][111881] Avg episode reward: [(0, '4.750'), (1, '5.230')] [2023-09-22 17:24:19,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 9633792. Throughput: 0: 759.3, 1: 758.7. Samples: 2404953. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:24:19,674][111881] Avg episode reward: [(0, '5.010'), (1, '5.010')] [2023-09-22 17:24:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9666560. Throughput: 0: 763.4, 1: 762.9. Samples: 2409703. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:24,673][111881] Avg episode reward: [(0, '5.180'), (1, '4.790')] [2023-09-22 17:24:26,913][112937] Updated weights for policy 0, policy_version 18944 (0.0014) [2023-09-22 17:24:26,913][112938] Updated weights for policy 1, policy_version 18880 (0.0016) [2023-09-22 17:24:29,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9699328. Throughput: 0: 760.1, 1: 759.4. Samples: 2418775. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:29,673][111881] Avg episode reward: [(0, '5.000'), (1, '4.630')] [2023-09-22 17:24:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5831.6). Total num frames: 9723904. Throughput: 0: 769.4, 1: 770.5. Samples: 2428600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:34,673][111881] Avg episode reward: [(0, '5.060'), (1, '4.750')] [2023-09-22 17:24:39,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9756672. Throughput: 0: 772.7, 1: 772.7. Samples: 2433090. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:39,673][111881] Avg episode reward: [(0, '4.830'), (1, '4.990')] [2023-09-22 17:24:39,963][112937] Updated weights for policy 0, policy_version 19104 (0.0015) [2023-09-22 17:24:39,964][112938] Updated weights for policy 1, policy_version 19040 (0.0018) [2023-09-22 17:24:44,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9789440. Throughput: 0: 777.6, 1: 776.1. Samples: 2442710. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:44,673][111881] Avg episode reward: [(0, '4.670'), (1, '5.220')] [2023-09-22 17:24:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5859.4). Total num frames: 9822208. Throughput: 0: 776.1, 1: 776.7. Samples: 2451788. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:24:49,674][111881] Avg episode reward: [(0, '4.550'), (1, '5.220')] [2023-09-22 17:24:53,221][112938] Updated weights for policy 1, policy_version 19200 (0.0016) [2023-09-22 17:24:53,221][112937] Updated weights for policy 0, policy_version 19264 (0.0017) [2023-09-22 17:24:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5859.4). Total num frames: 9854976. Throughput: 0: 773.7, 1: 774.7. Samples: 2456501. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:24:54,673][111881] Avg episode reward: [(0, '4.490'), (1, '4.790')] [2023-09-22 17:24:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 9879552. Throughput: 0: 772.8, 1: 772.4. Samples: 2465696. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:24:59,673][111881] Avg episode reward: [(0, '4.740'), (1, '4.870')] [2023-09-22 17:25:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 9912320. Throughput: 0: 776.5, 1: 776.8. Samples: 2474850. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:25:04,673][111881] Avg episode reward: [(0, '4.890'), (1, '4.710')] [2023-09-22 17:25:06,519][112937] Updated weights for policy 0, policy_version 19424 (0.0014) [2023-09-22 17:25:06,520][112938] Updated weights for policy 1, policy_version 19360 (0.0015) [2023-09-22 17:25:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 5831.6). Total num frames: 9945088. Throughput: 0: 777.5, 1: 778.2. Samples: 2479709. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:25:09,673][111881] Avg episode reward: [(0, '5.020'), (1, '4.780')] [2023-09-22 17:25:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 5859.4). Total num frames: 9977856. Throughput: 0: 781.1, 1: 781.4. Samples: 2489090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:25:14,673][111881] Avg episode reward: [(0, '5.560'), (1, '5.090')] [2023-09-22 17:25:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5887.1). Total num frames: 10002432. Throughput: 0: 774.3, 1: 775.0. Samples: 2498316. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:25:19,674][111881] Avg episode reward: [(0, '5.450'), (1, '4.870')] [2023-09-22 17:25:19,759][112937] Updated weights for policy 0, policy_version 19584 (0.0017) [2023-09-22 17:25:19,760][112938] Updated weights for policy 1, policy_version 19520 (0.0015) [2023-09-22 17:25:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5942.7). Total num frames: 10035200. Throughput: 0: 774.1, 1: 774.1. Samples: 2502758. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:25:24,673][111881] Avg episode reward: [(0, '5.580'), (1, '5.230')] [2023-09-22 17:25:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6012.1). Total num frames: 10067968. Throughput: 0: 773.4, 1: 773.5. Samples: 2512319. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:25:29,674][111881] Avg episode reward: [(0, '5.310'), (1, '5.150')] [2023-09-22 17:25:32,738][112938] Updated weights for policy 1, policy_version 19680 (0.0017) [2023-09-22 17:25:32,738][112937] Updated weights for policy 0, policy_version 19744 (0.0017) [2023-09-22 17:25:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6053.7). Total num frames: 10100736. Throughput: 0: 777.6, 1: 777.2. Samples: 2521756. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:25:34,673][111881] Avg episode reward: [(0, '5.090'), (1, '4.900')] [2023-09-22 17:25:39,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6053.8). Total num frames: 10133504. Throughput: 0: 781.0, 1: 780.1. Samples: 2526753. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:25:39,673][111881] Avg episode reward: [(0, '5.000'), (1, '5.040')] [2023-09-22 17:25:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6081.5). Total num frames: 10166272. Throughput: 0: 779.0, 1: 780.3. Samples: 2535865. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:25:44,673][111881] Avg episode reward: [(0, '4.840'), (1, '5.010')] [2023-09-22 17:25:44,683][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000019888_5091328.pth... [2023-09-22 17:25:44,683][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000019824_5074944.pth... [2023-09-22 17:25:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000016928_4333568.pth [2023-09-22 17:25:44,726][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000016992_4349952.pth [2023-09-22 17:25:45,725][112937] Updated weights for policy 0, policy_version 19904 (0.0018) [2023-09-22 17:25:45,725][112938] Updated weights for policy 1, policy_version 19840 (0.0019) [2023-09-22 17:25:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 10190848. 
Throughput: 0: 783.3, 1: 783.1. Samples: 2545336. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:25:49,674][111881] Avg episode reward: [(0, '5.090'), (1, '5.140')] [2023-09-22 17:25:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6109.3). Total num frames: 10223616. Throughput: 0: 778.9, 1: 778.3. Samples: 2549780. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:25:54,673][111881] Avg episode reward: [(0, '5.650'), (1, '5.450')] [2023-09-22 17:25:58,832][112937] Updated weights for policy 0, policy_version 20064 (0.0016) [2023-09-22 17:25:58,832][112938] Updated weights for policy 1, policy_version 20000 (0.0018) [2023-09-22 17:25:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 10256384. Throughput: 0: 783.1, 1: 782.2. Samples: 2559530. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0) [2023-09-22 17:25:59,673][111881] Avg episode reward: [(0, '5.440'), (1, '5.770')] [2023-09-22 17:26:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 10289152. Throughput: 0: 784.9, 1: 784.1. Samples: 2568920. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:26:04,673][111881] Avg episode reward: [(0, '5.670'), (1, '6.030')] [2023-09-22 17:26:04,674][112735] Saving new best policy, reward=6.030! [2023-09-22 17:26:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 10321920. Throughput: 0: 788.2, 1: 788.8. Samples: 2573722. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:26:09,673][111881] Avg episode reward: [(0, '5.780'), (1, '5.270')] [2023-09-22 17:26:11,879][112938] Updated weights for policy 1, policy_version 20160 (0.0015) [2023-09-22 17:26:11,880][112937] Updated weights for policy 0, policy_version 20224 (0.0016) [2023-09-22 17:26:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 10354688. Throughput: 0: 783.3, 1: 783.5. 
Samples: 2582826. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:26:14,673][111881] Avg episode reward: [(0, '5.270'), (1, '4.900')] [2023-09-22 17:26:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 10379264. Throughput: 0: 784.3, 1: 784.0. Samples: 2592327. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:26:19,674][111881] Avg episode reward: [(0, '5.480'), (1, '5.030')] [2023-09-22 17:26:24,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 10412032. Throughput: 0: 777.4, 1: 777.8. Samples: 2596739. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:26:24,673][111881] Avg episode reward: [(0, '5.710'), (1, '4.680')] [2023-09-22 17:26:25,352][112938] Updated weights for policy 1, policy_version 20320 (0.0016) [2023-09-22 17:26:25,353][112937] Updated weights for policy 0, policy_version 20384 (0.0017) [2023-09-22 17:26:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 10444800. Throughput: 0: 778.0, 1: 778.1. Samples: 2605886. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:26:29,674][111881] Avg episode reward: [(0, '5.640'), (1, '4.680')] [2023-09-22 17:26:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10477568. Throughput: 0: 777.7, 1: 776.9. Samples: 2615296. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:26:34,673][111881] Avg episode reward: [(0, '5.520'), (1, '4.620')] [2023-09-22 17:26:38,526][112937] Updated weights for policy 0, policy_version 20544 (0.0015) [2023-09-22 17:26:38,526][112938] Updated weights for policy 1, policy_version 20480 (0.0015) [2023-09-22 17:26:39,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 10502144. Throughput: 0: 776.9, 1: 777.5. Samples: 2619729. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:26:39,674][111881] Avg episode reward: [(0, '5.570'), (1, '4.420')] [2023-09-22 17:26:44,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 10534912. Throughput: 0: 776.3, 1: 776.3. Samples: 2629397. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:26:44,673][111881] Avg episode reward: [(0, '5.460'), (1, '4.350')] [2023-09-22 17:26:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6178.7). Total num frames: 10567680. Throughput: 0: 770.3, 1: 770.0. Samples: 2638235. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:26:49,673][111881] Avg episode reward: [(0, '5.660'), (1, '4.530')] [2023-09-22 17:26:51,718][112937] Updated weights for policy 0, policy_version 20704 (0.0016) [2023-09-22 17:26:51,718][112938] Updated weights for policy 1, policy_version 20640 (0.0018) [2023-09-22 17:26:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10600448. Throughput: 0: 772.0, 1: 771.7. Samples: 2643185. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:26:54,673][111881] Avg episode reward: [(0, '5.900'), (1, '4.890')] [2023-09-22 17:26:59,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10633216. Throughput: 0: 773.4, 1: 773.3. Samples: 2652428. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:26:59,674][111881] Avg episode reward: [(0, '5.630'), (1, '4.780')] [2023-09-22 17:27:04,552][112938] Updated weights for policy 1, policy_version 20800 (0.0017) [2023-09-22 17:27:04,552][112937] Updated weights for policy 0, policy_version 20864 (0.0016) [2023-09-22 17:27:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10665984. Throughput: 0: 778.9, 1: 778.2. Samples: 2662400. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:27:04,673][111881] Avg episode reward: [(0, '6.100'), (1, '4.550')] [2023-09-22 17:27:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10690560. Throughput: 0: 777.0, 1: 776.2. Samples: 2666635. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:27:09,674][111881] Avg episode reward: [(0, '6.440'), (1, '4.680')] [2023-09-22 17:27:14,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10723328. Throughput: 0: 779.6, 1: 779.7. Samples: 2676051. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:27:14,673][111881] Avg episode reward: [(0, '6.290'), (1, '4.520')] [2023-09-22 17:27:17,889][112937] Updated weights for policy 0, policy_version 21024 (0.0017) [2023-09-22 17:27:17,889][112938] Updated weights for policy 1, policy_version 20960 (0.0017) [2023-09-22 17:27:19,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 10756096. Throughput: 0: 774.3, 1: 775.0. Samples: 2685018. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:19,673][111881] Avg episode reward: [(0, '6.170'), (1, '4.800')] [2023-09-22 17:27:24,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 10780672. Throughput: 0: 775.6, 1: 775.5. Samples: 2689527. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:24,673][111881] Avg episode reward: [(0, '6.130'), (1, '4.950')] [2023-09-22 17:27:29,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 10813440. Throughput: 0: 772.8, 1: 772.9. Samples: 2698954. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:29,673][111881] Avg episode reward: [(0, '5.770'), (1, '5.140')] [2023-09-22 17:27:31,264][112937] Updated weights for policy 0, policy_version 21184 (0.0017) [2023-09-22 17:27:31,264][112938] Updated weights for policy 1, policy_version 21120 (0.0015) [2023-09-22 17:27:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10846208. Throughput: 0: 781.5, 1: 781.6. Samples: 2708572. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:34,673][111881] Avg episode reward: [(0, '5.600'), (1, '5.200')] [2023-09-22 17:27:39,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 10878976. Throughput: 0: 781.1, 1: 780.5. Samples: 2713456. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:39,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.240')] [2023-09-22 17:27:44,373][112937] Updated weights for policy 0, policy_version 21344 (0.0015) [2023-09-22 17:27:44,373][112938] Updated weights for policy 1, policy_version 21280 (0.0014) [2023-09-22 17:27:44,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10911744. Throughput: 0: 778.2, 1: 778.2. Samples: 2722470. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:27:44,673][111881] Avg episode reward: [(0, '5.800'), (1, '5.310')] [2023-09-22 17:27:44,682][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000021344_5464064.pth... [2023-09-22 17:27:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000021280_5447680.pth... [2023-09-22 17:27:44,717][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000018368_4702208.pth [2023-09-22 17:27:44,718][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000018432_4718592.pth [2023-09-22 17:27:49,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 10944512. 
Throughput: 0: 773.6, 1: 773.7. Samples: 2732030. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:27:49,673][111881] Avg episode reward: [(0, '6.110'), (1, '5.570')] [2023-09-22 17:27:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 10969088. Throughput: 0: 776.3, 1: 776.8. Samples: 2736527. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:27:54,673][111881] Avg episode reward: [(0, '6.130'), (1, '5.280')] [2023-09-22 17:27:57,329][112937] Updated weights for policy 0, policy_version 21504 (0.0015) [2023-09-22 17:27:57,330][112938] Updated weights for policy 1, policy_version 21440 (0.0017) [2023-09-22 17:27:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11001856. Throughput: 0: 779.1, 1: 779.4. Samples: 2746182. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:27:59,673][111881] Avg episode reward: [(0, '6.350'), (1, '5.460')] [2023-09-22 17:28:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11034624. Throughput: 0: 782.0, 1: 781.6. Samples: 2755381. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:04,673][111881] Avg episode reward: [(0, '6.240'), (1, '5.570')] [2023-09-22 17:28:09,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6206.5). Total num frames: 11067392. Throughput: 0: 785.3, 1: 785.2. Samples: 2760200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:09,674][111881] Avg episode reward: [(0, '5.700'), (1, '5.270')] [2023-09-22 17:28:10,547][112938] Updated weights for policy 1, policy_version 21600 (0.0018) [2023-09-22 17:28:10,547][112937] Updated weights for policy 0, policy_version 21664 (0.0018) [2023-09-22 17:28:14,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11100160. Throughput: 0: 780.4, 1: 781.0. Samples: 2769217. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:14,673][111881] Avg episode reward: [(0, '5.970'), (1, '5.120')] [2023-09-22 17:28:19,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11124736. Throughput: 0: 780.1, 1: 781.0. Samples: 2778822. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:19,673][111881] Avg episode reward: [(0, '5.850'), (1, '4.800')] [2023-09-22 17:28:23,667][112938] Updated weights for policy 1, policy_version 21760 (0.0017) [2023-09-22 17:28:23,667][112937] Updated weights for policy 0, policy_version 21824 (0.0017) [2023-09-22 17:28:24,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 11157504. Throughput: 0: 775.6, 1: 775.1. Samples: 2783237. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:24,673][111881] Avg episode reward: [(0, '5.750'), (1, '4.800')] [2023-09-22 17:28:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 11190272. Throughput: 0: 781.5, 1: 782.4. Samples: 2792847. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:29,674][111881] Avg episode reward: [(0, '5.760'), (1, '4.960')] [2023-09-22 17:28:34,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11223040. Throughput: 0: 780.6, 1: 781.5. Samples: 2802326. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:34,673][111881] Avg episode reward: [(0, '5.910'), (1, '4.980')] [2023-09-22 17:28:36,561][112937] Updated weights for policy 0, policy_version 21984 (0.0016) [2023-09-22 17:28:36,561][112938] Updated weights for policy 1, policy_version 21920 (0.0018) [2023-09-22 17:28:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11255808. Throughput: 0: 786.9, 1: 786.5. Samples: 2807329. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:39,673][111881] Avg episode reward: [(0, '5.810'), (1, '5.480')] [2023-09-22 17:28:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 11288576. Throughput: 0: 779.5, 1: 779.0. Samples: 2816314. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:44,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.040')] [2023-09-22 17:28:49,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11313152. Throughput: 0: 781.8, 1: 782.6. Samples: 2825779. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:49,673][111881] Avg episode reward: [(0, '5.300'), (1, '5.250')] [2023-09-22 17:28:49,828][112937] Updated weights for policy 0, policy_version 22144 (0.0015) [2023-09-22 17:28:49,828][112938] Updated weights for policy 1, policy_version 22080 (0.0014) [2023-09-22 17:28:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11345920. Throughput: 0: 779.6, 1: 779.0. Samples: 2830337. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:54,673][111881] Avg episode reward: [(0, '5.370'), (1, '5.120')] [2023-09-22 17:28:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11378688. Throughput: 0: 784.2, 1: 784.5. Samples: 2839808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:28:59,673][111881] Avg episode reward: [(0, '5.430'), (1, '4.900')] [2023-09-22 17:29:02,911][112937] Updated weights for policy 0, policy_version 22304 (0.0016) [2023-09-22 17:29:02,911][112938] Updated weights for policy 1, policy_version 22240 (0.0017) [2023-09-22 17:29:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11411456. Throughput: 0: 780.8, 1: 780.0. Samples: 2849060. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:04,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.040')] [2023-09-22 17:29:09,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11444224. Throughput: 0: 786.1, 1: 786.8. Samples: 2854017. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:09,674][111881] Avg episode reward: [(0, '5.710'), (1, '5.160')] [2023-09-22 17:29:14,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11476992. Throughput: 0: 781.4, 1: 780.7. Samples: 2863143. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:14,673][111881] Avg episode reward: [(0, '5.730'), (1, '5.390')] [2023-09-22 17:29:15,888][112938] Updated weights for policy 1, policy_version 22400 (0.0015) [2023-09-22 17:29:15,889][112937] Updated weights for policy 0, policy_version 22464 (0.0016) [2023-09-22 17:29:19,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11501568. Throughput: 0: 784.0, 1: 783.7. Samples: 2872874. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:19,673][111881] Avg episode reward: [(0, '5.630'), (1, '5.370')] [2023-09-22 17:29:24,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11534336. Throughput: 0: 779.3, 1: 778.7. Samples: 2877440. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:24,673][111881] Avg episode reward: [(0, '5.350'), (1, '5.400')] [2023-09-22 17:29:28,957][112937] Updated weights for policy 0, policy_version 22624 (0.0017) [2023-09-22 17:29:28,958][112938] Updated weights for policy 1, policy_version 22560 (0.0017) [2023-09-22 17:29:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11567104. Throughput: 0: 784.4, 1: 784.6. Samples: 2886921. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:29,674][111881] Avg episode reward: [(0, '5.520'), (1, '5.160')] [2023-09-22 17:29:34,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11599872. Throughput: 0: 784.4, 1: 783.9. Samples: 2896352. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:34,673][111881] Avg episode reward: [(0, '5.390'), (1, '4.860')] [2023-09-22 17:29:39,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11632640. Throughput: 0: 788.9, 1: 789.2. Samples: 2901351. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:29:39,674][111881] Avg episode reward: [(0, '5.500'), (1, '5.200')] [2023-09-22 17:29:41,795][112938] Updated weights for policy 1, policy_version 22720 (0.0015) [2023-09-22 17:29:41,795][112937] Updated weights for policy 0, policy_version 22784 (0.0016) [2023-09-22 17:29:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11665408. Throughput: 0: 789.0, 1: 788.5. Samples: 2910797. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:29:44,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.290')] [2023-09-22 17:29:44,679][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000022816_5840896.pth... [2023-09-22 17:29:44,679][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000022752_5824512.pth... [2023-09-22 17:29:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000019824_5074944.pth [2023-09-22 17:29:44,718][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000019888_5091328.pth [2023-09-22 17:29:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 11689984. Throughput: 0: 789.7, 1: 789.8. Samples: 2920135. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:29:49,674][111881] Avg episode reward: [(0, '5.600'), (1, '5.460')] [2023-09-22 17:29:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11722752. Throughput: 0: 784.0, 1: 783.4. Samples: 2924548. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:54,673][111881] Avg episode reward: [(0, '5.500'), (1, '5.490')] [2023-09-22 17:29:55,150][112938] Updated weights for policy 1, policy_version 22880 (0.0016) [2023-09-22 17:29:55,150][112937] Updated weights for policy 0, policy_version 22944 (0.0017) [2023-09-22 17:29:59,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11755520. Throughput: 0: 785.4, 1: 784.7. Samples: 2933798. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:29:59,674][111881] Avg episode reward: [(0, '5.710'), (1, '5.070')] [2023-09-22 17:30:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11788288. Throughput: 0: 779.2, 1: 778.6. Samples: 2942978. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:04,673][111881] Avg episode reward: [(0, '5.910'), (1, '5.080')] [2023-09-22 17:30:08,250][112937] Updated weights for policy 0, policy_version 23104 (0.0016) [2023-09-22 17:30:08,251][112938] Updated weights for policy 1, policy_version 23040 (0.0017) [2023-09-22 17:30:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11821056. Throughput: 0: 781.9, 1: 782.5. Samples: 2947840. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:09,673][111881] Avg episode reward: [(0, '5.790'), (1, '4.620')] [2023-09-22 17:30:14,672][111881] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 11849728. Throughput: 0: 782.8, 1: 781.6. Samples: 2957317. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:14,673][111881] Avg episode reward: [(0, '5.630'), (1, '4.810')] [2023-09-22 17:30:19,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11878400. Throughput: 0: 783.0, 1: 782.8. Samples: 2966816. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:19,673][111881] Avg episode reward: [(0, '5.620'), (1, '5.260')] [2023-09-22 17:30:21,257][112938] Updated weights for policy 1, policy_version 23200 (0.0017) [2023-09-22 17:30:21,257][112937] Updated weights for policy 0, policy_version 23264 (0.0014) [2023-09-22 17:30:24,672][111881] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11911168. Throughput: 0: 780.7, 1: 780.9. Samples: 2971622. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:30:24,673][111881] Avg episode reward: [(0, '5.210'), (1, '5.130')] [2023-09-22 17:30:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11943936. Throughput: 0: 774.3, 1: 774.4. Samples: 2980489. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:30:29,674][111881] Avg episode reward: [(0, '5.150'), (1, '5.270')] [2023-09-22 17:30:34,510][112938] Updated weights for policy 1, policy_version 23360 (0.0016) [2023-09-22 17:30:34,511][112937] Updated weights for policy 0, policy_version 23424 (0.0017) [2023-09-22 17:30:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11976704. Throughput: 0: 777.6, 1: 776.7. Samples: 2990080. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:30:34,673][111881] Avg episode reward: [(0, '5.190'), (1, '5.030')] [2023-09-22 17:30:39,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12009472. Throughput: 0: 779.7, 1: 780.4. Samples: 2994750. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:39,673][111881] Avg episode reward: [(0, '5.510'), (1, '4.580')] [2023-09-22 17:30:44,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12034048. Throughput: 0: 783.4, 1: 784.7. Samples: 3004364. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:44,673][111881] Avg episode reward: [(0, '5.390'), (1, '4.450')] [2023-09-22 17:30:47,588][112938] Updated weights for policy 1, policy_version 23520 (0.0015) [2023-09-22 17:30:47,588][112937] Updated weights for policy 0, policy_version 23584 (0.0015) [2023-09-22 17:30:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12066816. Throughput: 0: 782.2, 1: 781.8. Samples: 3013355. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:49,674][111881] Avg episode reward: [(0, '5.430'), (1, '4.270')] [2023-09-22 17:30:54,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12099584. Throughput: 0: 781.2, 1: 781.1. Samples: 3018144. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:30:54,673][111881] Avg episode reward: [(0, '5.290'), (1, '4.720')] [2023-09-22 17:30:59,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12132352. Throughput: 0: 778.4, 1: 779.0. Samples: 3027398. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:30:59,673][111881] Avg episode reward: [(0, '5.540'), (1, '4.990')] [2023-09-22 17:31:00,716][112937] Updated weights for policy 0, policy_version 23744 (0.0016) [2023-09-22 17:31:00,716][112938] Updated weights for policy 1, policy_version 23680 (0.0015) [2023-09-22 17:31:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12165120. Throughput: 0: 782.1, 1: 781.5. Samples: 3037181. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:04,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.090')] [2023-09-22 17:31:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 12189696. Throughput: 0: 776.3, 1: 776.8. Samples: 3041509. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:09,673][111881] Avg episode reward: [(0, '5.620'), (1, '5.130')] [2023-09-22 17:31:13,831][112937] Updated weights for policy 0, policy_version 23904 (0.0017) [2023-09-22 17:31:13,831][112938] Updated weights for policy 1, policy_version 23840 (0.0017) [2023-09-22 17:31:14,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 12222464. Throughput: 0: 784.3, 1: 784.6. Samples: 3051090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:31:14,673][111881] Avg episode reward: [(0, '5.650'), (1, '5.240')] [2023-09-22 17:31:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12255232. Throughput: 0: 777.9, 1: 778.4. Samples: 3060115. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:31:19,673][111881] Avg episode reward: [(0, '5.850'), (1, '5.080')] [2023-09-22 17:31:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12288000. Throughput: 0: 779.6, 1: 779.4. Samples: 3064903. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:31:24,673][111881] Avg episode reward: [(0, '5.670'), (1, '5.320')] [2023-09-22 17:31:26,897][112937] Updated weights for policy 0, policy_version 24064 (0.0018) [2023-09-22 17:31:26,897][112938] Updated weights for policy 1, policy_version 24000 (0.0018) [2023-09-22 17:31:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12320768. Throughput: 0: 777.2, 1: 776.9. Samples: 3074297. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:31:29,673][111881] Avg episode reward: [(0, '5.710'), (1, '5.460')] [2023-09-22 17:31:34,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12345344. Throughput: 0: 784.7, 1: 785.7. Samples: 3084023. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:34,673][111881] Avg episode reward: [(0, '5.830'), (1, '5.820')] [2023-09-22 17:31:39,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12378112. Throughput: 0: 781.7, 1: 782.1. Samples: 3088515. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:39,674][111881] Avg episode reward: [(0, '6.150'), (1, '5.420')] [2023-09-22 17:31:39,865][112938] Updated weights for policy 1, policy_version 24160 (0.0016) [2023-09-22 17:31:39,865][112937] Updated weights for policy 0, policy_version 24224 (0.0015) [2023-09-22 17:31:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12410880. Throughput: 0: 787.2, 1: 786.4. Samples: 3098211. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:44,673][111881] Avg episode reward: [(0, '6.010'), (1, '6.000')] [2023-09-22 17:31:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000024208_6197248.pth... [2023-09-22 17:31:44,682][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000024272_6213632.pth... [2023-09-22 17:31:44,717][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000021280_5447680.pth [2023-09-22 17:31:44,722][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000021344_5464064.pth [2023-09-22 17:31:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12443648. Throughput: 0: 780.0, 1: 780.5. Samples: 3107402. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:49,673][111881] Avg episode reward: [(0, '6.080'), (1, '5.660')] [2023-09-22 17:31:52,975][112937] Updated weights for policy 0, policy_version 24384 (0.0017) [2023-09-22 17:31:52,975][112938] Updated weights for policy 1, policy_version 24320 (0.0018) [2023-09-22 17:31:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12476416. Throughput: 0: 785.2, 1: 785.4. Samples: 3112187. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:54,673][111881] Avg episode reward: [(0, '5.990'), (1, '5.650')] [2023-09-22 17:31:59,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12509184. Throughput: 0: 779.0, 1: 778.0. Samples: 3121156. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:31:59,674][111881] Avg episode reward: [(0, '5.770'), (1, '5.450')] [2023-09-22 17:32:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12533760. Throughput: 0: 786.4, 1: 786.6. Samples: 3130899. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:32:04,673][111881] Avg episode reward: [(0, '6.070'), (1, '5.200')] [2023-09-22 17:32:06,177][112938] Updated weights for policy 1, policy_version 24480 (0.0018) [2023-09-22 17:32:06,177][112937] Updated weights for policy 0, policy_version 24544 (0.0016) [2023-09-22 17:32:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12566528. Throughput: 0: 784.6, 1: 784.0. Samples: 3135488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:32:09,674][111881] Avg episode reward: [(0, '5.980'), (1, '5.300')] [2023-09-22 17:32:14,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12599296. Throughput: 0: 781.5, 1: 781.4. Samples: 3144629. 
Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:32:14,673][111881] Avg episode reward: [(0, '6.390'), (1, '5.410')] [2023-09-22 17:32:19,376][112937] Updated weights for policy 0, policy_version 24704 (0.0017) [2023-09-22 17:32:19,376][112938] Updated weights for policy 1, policy_version 24640 (0.0019) [2023-09-22 17:32:19,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 12632064. Throughput: 0: 777.0, 1: 776.4. Samples: 3153925. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0) [2023-09-22 17:32:19,673][111881] Avg episode reward: [(0, '6.280'), (1, '5.400')] [2023-09-22 17:32:24,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12656640. Throughput: 0: 779.2, 1: 779.4. Samples: 3158653. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:32:24,674][111881] Avg episode reward: [(0, '6.510'), (1, '5.260')] [2023-09-22 17:32:29,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12689408. Throughput: 0: 775.0, 1: 776.4. Samples: 3168024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:32:29,673][111881] Avg episode reward: [(0, '6.710'), (1, '5.040')] [2023-09-22 17:32:32,494][112937] Updated weights for policy 0, policy_version 24864 (0.0015) [2023-09-22 17:32:32,493][112938] Updated weights for policy 1, policy_version 24800 (0.0017) [2023-09-22 17:32:34,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12722176. Throughput: 0: 778.8, 1: 779.0. Samples: 3177504. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:32:34,673][111881] Avg episode reward: [(0, '6.110'), (1, '5.220')] [2023-09-22 17:32:39,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12754944. Throughput: 0: 781.6, 1: 781.1. Samples: 3182507. 
Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:32:39,674][111881] Avg episode reward: [(0, '6.260'), (1, '5.020')] [2023-09-22 17:32:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12787712. Throughput: 0: 782.4, 1: 783.5. Samples: 3191622. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:32:44,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.040')] [2023-09-22 17:32:45,582][112938] Updated weights for policy 1, policy_version 24960 (0.0017) [2023-09-22 17:32:45,582][112937] Updated weights for policy 0, policy_version 25024 (0.0017) [2023-09-22 17:32:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 12820480. Throughput: 0: 779.5, 1: 778.8. Samples: 3201024. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:32:49,673][111881] Avg episode reward: [(0, '5.820'), (1, '5.050')] [2023-09-22 17:32:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12845056. Throughput: 0: 778.0, 1: 778.8. Samples: 3205541. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0) [2023-09-22 17:32:54,673][111881] Avg episode reward: [(0, '5.730'), (1, '4.840')] [2023-09-22 17:32:58,731][112938] Updated weights for policy 1, policy_version 25120 (0.0017) [2023-09-22 17:32:58,731][112937] Updated weights for policy 0, policy_version 25184 (0.0016) [2023-09-22 17:32:59,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 12877824. Throughput: 0: 782.3, 1: 783.1. Samples: 3215075. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:32:59,673][111881] Avg episode reward: [(0, '5.870'), (1, '5.220')] [2023-09-22 17:33:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 12910592. Throughput: 0: 779.0, 1: 779.8. Samples: 3224072. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:04,673][111881] Avg episode reward: [(0, '6.320'), (1, '5.200')] [2023-09-22 17:33:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12943360. Throughput: 0: 781.7, 1: 781.3. Samples: 3228989. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:09,673][111881] Avg episode reward: [(0, '6.290'), (1, '5.380')] [2023-09-22 17:33:11,936][112937] Updated weights for policy 0, policy_version 25344 (0.0016) [2023-09-22 17:33:11,937][112938] Updated weights for policy 1, policy_version 25280 (0.0016) [2023-09-22 17:33:14,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 12976128. Throughput: 0: 778.4, 1: 778.0. Samples: 3238060. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:14,673][111881] Avg episode reward: [(0, '5.670'), (1, '5.650')] [2023-09-22 17:33:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13000704. Throughput: 0: 777.5, 1: 778.0. Samples: 3247499. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:19,673][111881] Avg episode reward: [(0, '5.590'), (1, '5.730')] [2023-09-22 17:33:24,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13033472. Throughput: 0: 773.9, 1: 773.9. Samples: 3252156. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:24,673][111881] Avg episode reward: [(0, '5.670'), (1, '5.800')] [2023-09-22 17:33:25,196][112937] Updated weights for policy 0, policy_version 25504 (0.0017) [2023-09-22 17:33:25,196][112938] Updated weights for policy 1, policy_version 25440 (0.0015) [2023-09-22 17:33:29,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13066240. Throughput: 0: 774.5, 1: 774.6. Samples: 3261333. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:33:29,673][111881] Avg episode reward: [(0, '5.330'), (1, '5.900')] [2023-09-22 17:33:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13099008. Throughput: 0: 773.7, 1: 773.8. Samples: 3270662. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:33:34,673][111881] Avg episode reward: [(0, '5.630'), (1, '5.880')] [2023-09-22 17:33:38,345][112938] Updated weights for policy 1, policy_version 25600 (0.0017) [2023-09-22 17:33:38,346][112937] Updated weights for policy 0, policy_version 25664 (0.0017) [2023-09-22 17:33:39,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 13127680. Throughput: 0: 777.0, 1: 776.6. Samples: 3275453. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:33:39,674][111881] Avg episode reward: [(0, '5.640'), (1, '6.010')] [2023-09-22 17:33:44,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13156352. Throughput: 0: 776.5, 1: 774.3. Samples: 3284862. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:33:44,673][111881] Avg episode reward: [(0, '5.780'), (1, '5.750')] [2023-09-22 17:33:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000025664_6569984.pth... [2023-09-22 17:33:44,682][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000025728_6586368.pth... [2023-09-22 17:33:44,715][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000022752_5824512.pth [2023-09-22 17:33:44,719][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000022816_5840896.pth [2023-09-22 17:33:49,673][111881] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13189120. Throughput: 0: 770.7, 1: 770.7. Samples: 3293435. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:33:49,674][111881] Avg episode reward: [(0, '6.360'), (1, '5.700')] [2023-09-22 17:33:51,880][112938] Updated weights for policy 1, policy_version 25760 (0.0018) [2023-09-22 17:33:51,880][112937] Updated weights for policy 0, policy_version 25824 (0.0018) [2023-09-22 17:33:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13221888. Throughput: 0: 768.7, 1: 768.9. Samples: 3298180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:33:54,673][111881] Avg episode reward: [(0, '6.230'), (1, '5.220')] [2023-09-22 17:33:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13254656. Throughput: 0: 772.3, 1: 771.4. Samples: 3307525. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:33:59,673][111881] Avg episode reward: [(0, '5.470'), (1, '5.250')] [2023-09-22 17:34:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13279232. Throughput: 0: 777.1, 1: 776.6. Samples: 3317414. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:34:04,673][111881] Avg episode reward: [(0, '5.580'), (1, '5.180')] [2023-09-22 17:34:04,760][112937] Updated weights for policy 0, policy_version 25984 (0.0016) [2023-09-22 17:34:04,760][112938] Updated weights for policy 1, policy_version 25920 (0.0015) [2023-09-22 17:34:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13312000. Throughput: 0: 774.9, 1: 774.5. Samples: 3321876. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:09,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.130')] [2023-09-22 17:34:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13344768. Throughput: 0: 781.5, 1: 780.6. Samples: 3331628. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:14,673][111881] Avg episode reward: [(0, '5.260'), (1, '5.490')] [2023-09-22 17:34:17,848][112937] Updated weights for policy 0, policy_version 26144 (0.0018) [2023-09-22 17:34:17,848][112938] Updated weights for policy 1, policy_version 26080 (0.0014) [2023-09-22 17:34:19,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13377536. Throughput: 0: 775.4, 1: 776.0. Samples: 3340476. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:19,674][111881] Avg episode reward: [(0, '5.370'), (1, '5.320')] [2023-09-22 17:34:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13410304. Throughput: 0: 774.8, 1: 775.2. Samples: 3345205. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:24,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.440')] [2023-09-22 17:34:29,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13443072. Throughput: 0: 774.8, 1: 775.6. Samples: 3354630. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:29,673][111881] Avg episode reward: [(0, '5.840'), (1, '5.950')] [2023-09-22 17:34:30,925][112937] Updated weights for policy 0, policy_version 26304 (0.0017) [2023-09-22 17:34:30,926][112938] Updated weights for policy 1, policy_version 26240 (0.0018) [2023-09-22 17:34:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13467648. Throughput: 0: 789.5, 1: 789.1. Samples: 3364473. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:34,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.660')] [2023-09-22 17:34:39,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 13500416. Throughput: 0: 786.9, 1: 786.6. Samples: 3368988. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:34:39,674][111881] Avg episode reward: [(0, '6.010'), (1, '5.980')] [2023-09-22 17:34:43,757][112937] Updated weights for policy 0, policy_version 26464 (0.0019) [2023-09-22 17:34:43,758][112938] Updated weights for policy 1, policy_version 26400 (0.0020) [2023-09-22 17:34:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13533184. Throughput: 0: 792.3, 1: 793.2. Samples: 3378872. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:34:44,673][111881] Avg episode reward: [(0, '6.060'), (1, '6.040')] [2023-09-22 17:34:44,682][112735] Saving new best policy, reward=6.040! [2023-09-22 17:34:49,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13565952. Throughput: 0: 786.3, 1: 786.1. Samples: 3388173. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:34:49,674][111881] Avg episode reward: [(0, '6.220'), (1, '5.840')] [2023-09-22 17:34:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13598720. Throughput: 0: 791.5, 1: 792.2. Samples: 3393143. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:34:54,673][111881] Avg episode reward: [(0, '6.050'), (1, '5.840')] [2023-09-22 17:34:56,654][112937] Updated weights for policy 0, policy_version 26624 (0.0015) [2023-09-22 17:34:56,655][112938] Updated weights for policy 1, policy_version 26560 (0.0016) [2023-09-22 17:34:59,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13631488. Throughput: 0: 787.6, 1: 787.6. Samples: 3402509. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:34:59,674][111881] Avg episode reward: [(0, '6.220'), (1, '5.950')] [2023-09-22 17:35:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6248.1). Total num frames: 13664256. Throughput: 0: 794.7, 1: 794.1. Samples: 3411968. 
Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:35:04,673][111881] Avg episode reward: [(0, '5.620'), (1, '5.620')] [2023-09-22 17:35:09,673][111881] Fps is (10 sec: 6144.0, 60 sec: 6348.8, 300 sec: 6248.1). Total num frames: 13692928. Throughput: 0: 794.0, 1: 794.1. Samples: 3416671. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:35:09,673][111881] Avg episode reward: [(0, '5.350'), (1, '5.410')] [2023-09-22 17:35:09,677][112937] Updated weights for policy 0, policy_version 26784 (0.0017) [2023-09-22 17:35:09,677][112938] Updated weights for policy 1, policy_version 26720 (0.0016) [2023-09-22 17:35:14,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13721600. Throughput: 0: 796.4, 1: 796.3. Samples: 3426304. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0) [2023-09-22 17:35:14,673][111881] Avg episode reward: [(0, '4.990'), (1, '5.580')] [2023-09-22 17:35:19,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13754368. Throughput: 0: 791.2, 1: 791.6. Samples: 3435698. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:19,673][111881] Avg episode reward: [(0, '5.060'), (1, '5.640')] [2023-09-22 17:35:22,686][112937] Updated weights for policy 0, policy_version 26944 (0.0016) [2023-09-22 17:35:22,686][112938] Updated weights for policy 1, policy_version 26880 (0.0016) [2023-09-22 17:35:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13787136. Throughput: 0: 793.7, 1: 795.6. Samples: 3440510. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:24,673][111881] Avg episode reward: [(0, '5.290'), (1, '5.500')] [2023-09-22 17:35:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13819904. Throughput: 0: 785.0, 1: 784.9. Samples: 3449514. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:29,674][111881] Avg episode reward: [(0, '5.140'), (1, '5.700')] [2023-09-22 17:35:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6248.1). Total num frames: 13852672. Throughput: 0: 788.0, 1: 787.5. Samples: 3459072. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:34,673][111881] Avg episode reward: [(0, '5.300'), (1, '5.630')] [2023-09-22 17:35:35,793][112937] Updated weights for policy 0, policy_version 27104 (0.0015) [2023-09-22 17:35:35,793][112938] Updated weights for policy 1, policy_version 27040 (0.0015) [2023-09-22 17:35:39,673][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13877248. Throughput: 0: 783.4, 1: 783.3. Samples: 3463643. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:39,674][111881] Avg episode reward: [(0, '5.550'), (1, '5.080')] [2023-09-22 17:35:44,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13910016. Throughput: 0: 785.3, 1: 785.4. Samples: 3473190. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:44,674][111881] Avg episode reward: [(0, '5.180'), (1, '4.940')] [2023-09-22 17:35:44,683][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000027136_6946816.pth... [2023-09-22 17:35:44,683][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000027200_6963200.pth... [2023-09-22 17:35:44,716][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000024272_6213632.pth [2023-09-22 17:35:44,719][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000024208_6197248.pth [2023-09-22 17:35:49,075][112938] Updated weights for policy 1, policy_version 27200 (0.0016) [2023-09-22 17:35:49,075][112937] Updated weights for policy 0, policy_version 27264 (0.0015) [2023-09-22 17:35:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13942784. 
Throughput: 0: 779.1, 1: 779.6. Samples: 3482110. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:49,675][111881] Avg episode reward: [(0, '5.310'), (1, '4.760')] [2023-09-22 17:35:54,672][111881] Fps is (10 sec: 6553.9, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13975552. Throughput: 0: 780.2, 1: 779.5. Samples: 3486857. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:54,673][111881] Avg episode reward: [(0, '5.490'), (1, '4.680')] [2023-09-22 17:35:59,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14008320. Throughput: 0: 774.4, 1: 775.3. Samples: 3496043. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:35:59,674][111881] Avg episode reward: [(0, '5.760'), (1, '4.870')] [2023-09-22 17:36:02,202][112937] Updated weights for policy 0, policy_version 27424 (0.0017) [2023-09-22 17:36:02,202][112938] Updated weights for policy 1, policy_version 27360 (0.0017) [2023-09-22 17:36:04,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14032896. Throughput: 0: 777.2, 1: 777.9. Samples: 3505678. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:04,673][111881] Avg episode reward: [(0, '5.290'), (1, '5.140')] [2023-09-22 17:36:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 14065664. Throughput: 0: 776.3, 1: 774.0. Samples: 3510272. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:09,674][111881] Avg episode reward: [(0, '6.150'), (1, '5.020')] [2023-09-22 17:36:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14098432. Throughput: 0: 780.5, 1: 780.2. Samples: 3519747. 
Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:36:14,673][111881] Avg episode reward: [(0, '6.160'), (1, '5.120')] [2023-09-22 17:36:15,304][112937] Updated weights for policy 0, policy_version 27584 (0.0016) [2023-09-22 17:36:15,304][112938] Updated weights for policy 1, policy_version 27520 (0.0017) [2023-09-22 17:36:19,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14131200. Throughput: 0: 777.0, 1: 777.7. Samples: 3529035. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:36:19,674][111881] Avg episode reward: [(0, '6.040'), (1, '5.490')] [2023-09-22 17:36:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14163968. Throughput: 0: 779.7, 1: 779.7. Samples: 3533818. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:36:24,673][111881] Avg episode reward: [(0, '6.080'), (1, '5.110')] [2023-09-22 17:36:28,373][112937] Updated weights for policy 0, policy_version 27744 (0.0015) [2023-09-22 17:36:28,373][112938] Updated weights for policy 1, policy_version 27680 (0.0015) [2023-09-22 17:36:29,673][111881] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 14192640. Throughput: 0: 776.4, 1: 775.9. Samples: 3543046. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:36:29,674][111881] Avg episode reward: [(0, '5.500'), (1, '5.460')] [2023-09-22 17:36:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14221312. Throughput: 0: 779.9, 1: 779.3. Samples: 3552275. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:34,673][111881] Avg episode reward: [(0, '5.310'), (1, '5.270')] [2023-09-22 17:36:39,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14254080. Throughput: 0: 777.5, 1: 779.9. Samples: 3556939. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:39,673][111881] Avg episode reward: [(0, '5.470'), (1, '5.600')] [2023-09-22 17:36:41,799][112937] Updated weights for policy 0, policy_version 27904 (0.0016) [2023-09-22 17:36:41,799][112938] Updated weights for policy 1, policy_version 27840 (0.0016) [2023-09-22 17:36:44,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14286848. Throughput: 0: 776.7, 1: 776.9. Samples: 3565953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:44,673][111881] Avg episode reward: [(0, '5.450'), (1, '5.720')] [2023-09-22 17:36:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14319616. Throughput: 0: 777.5, 1: 777.3. Samples: 3575644. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:49,673][111881] Avg episode reward: [(0, '5.860'), (1, '6.040')] [2023-09-22 17:36:54,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14344192. Throughput: 0: 775.2, 1: 776.0. Samples: 3580075. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:54,673][111881] Avg episode reward: [(0, '6.110'), (1, '6.040')] [2023-09-22 17:36:54,809][112937] Updated weights for policy 0, policy_version 28064 (0.0018) [2023-09-22 17:36:54,809][112938] Updated weights for policy 1, policy_version 28000 (0.0016) [2023-09-22 17:36:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14376960. Throughput: 0: 779.7, 1: 780.6. Samples: 3589961. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:36:59,673][111881] Avg episode reward: [(0, '6.240'), (1, '5.700')] [2023-09-22 17:37:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14409728. Throughput: 0: 781.9, 1: 782.2. Samples: 3599419. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:04,673][111881] Avg episode reward: [(0, '6.330'), (1, '5.470')] [2023-09-22 17:37:07,709][112937] Updated weights for policy 0, policy_version 28224 (0.0017) [2023-09-22 17:37:07,709][112938] Updated weights for policy 1, policy_version 28160 (0.0016) [2023-09-22 17:37:09,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14442496. Throughput: 0: 782.2, 1: 782.1. Samples: 3604209. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:09,674][111881] Avg episode reward: [(0, '6.550'), (1, '5.120')] [2023-09-22 17:37:14,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14475264. Throughput: 0: 783.0, 1: 783.8. Samples: 3613552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:14,673][111881] Avg episode reward: [(0, '6.320'), (1, '4.720')] [2023-09-22 17:37:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 14508032. Throughput: 0: 785.0, 1: 785.8. Samples: 3622963. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:19,673][111881] Avg episode reward: [(0, '6.150'), (1, '4.560')] [2023-09-22 17:37:20,618][112937] Updated weights for policy 0, policy_version 28384 (0.0017) [2023-09-22 17:37:20,618][112938] Updated weights for policy 1, policy_version 28320 (0.0017) [2023-09-22 17:37:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14540800. Throughput: 0: 786.3, 1: 784.5. Samples: 3627625. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:24,675][111881] Avg episode reward: [(0, '6.220'), (1, '4.430')] [2023-09-22 17:37:29,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 14565376. Throughput: 0: 792.4, 1: 791.6. Samples: 3637233. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:29,674][111881] Avg episode reward: [(0, '5.860'), (1, '4.720')] [2023-09-22 17:37:33,738][112937] Updated weights for policy 0, policy_version 28544 (0.0017) [2023-09-22 17:37:33,738][112938] Updated weights for policy 1, policy_version 28480 (0.0017) [2023-09-22 17:37:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14598144. Throughput: 0: 788.6, 1: 788.4. Samples: 3646610. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:37:34,673][111881] Avg episode reward: [(0, '5.880'), (1, '4.730')] [2023-09-22 17:37:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14630912. Throughput: 0: 790.8, 1: 791.6. Samples: 3651283. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:37:39,673][111881] Avg episode reward: [(0, '5.680'), (1, '4.770')] [2023-09-22 17:37:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14663680. Throughput: 0: 777.8, 1: 777.0. Samples: 3659927. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:37:44,673][111881] Avg episode reward: [(0, '5.900'), (1, '5.030')] [2023-09-22 17:37:44,679][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000028608_7323648.pth... [2023-09-22 17:37:44,679][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000028672_7340032.pth... [2023-09-22 17:37:44,722][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000025728_6586368.pth [2023-09-22 17:37:44,722][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000025664_6569984.pth [2023-09-22 17:37:47,274][112937] Updated weights for policy 0, policy_version 28704 (0.0019) [2023-09-22 17:37:47,274][112938] Updated weights for policy 1, policy_version 28640 (0.0020) [2023-09-22 17:37:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14688256. 
Throughput: 0: 777.6, 1: 777.9. Samples: 3669415. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0) [2023-09-22 17:37:49,673][111881] Avg episode reward: [(0, '6.080'), (1, '5.000')] [2023-09-22 17:37:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14721024. Throughput: 0: 777.1, 1: 776.3. Samples: 3674112. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:54,673][111881] Avg episode reward: [(0, '6.190'), (1, '5.190')] [2023-09-22 17:37:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14753792. Throughput: 0: 776.4, 1: 776.6. Samples: 3683437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:37:59,673][111881] Avg episode reward: [(0, '6.420'), (1, '5.260')] [2023-09-22 17:38:00,333][112937] Updated weights for policy 0, policy_version 28864 (0.0016) [2023-09-22 17:38:00,333][112938] Updated weights for policy 1, policy_version 28800 (0.0018) [2023-09-22 17:38:04,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14786560. Throughput: 0: 775.5, 1: 775.8. Samples: 3692772. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:04,673][111881] Avg episode reward: [(0, '6.110'), (1, '5.570')] [2023-09-22 17:38:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14819328. Throughput: 0: 779.2, 1: 779.3. Samples: 3697760. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:09,673][111881] Avg episode reward: [(0, '5.870'), (1, '5.360')] [2023-09-22 17:38:13,232][112937] Updated weights for policy 0, policy_version 29024 (0.0014) [2023-09-22 17:38:13,233][112938] Updated weights for policy 1, policy_version 28960 (0.0017) [2023-09-22 17:38:14,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 14852096. Throughput: 0: 776.1, 1: 776.5. Samples: 3707100. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:14,673][111881] Avg episode reward: [(0, '5.730'), (1, '5.360')] [2023-09-22 17:38:19,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14876672. Throughput: 0: 779.8, 1: 780.3. Samples: 3716814. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:19,673][111881] Avg episode reward: [(0, '5.790'), (1, '5.130')] [2023-09-22 17:38:24,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14909440. Throughput: 0: 777.9, 1: 776.5. Samples: 3721230. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:24,673][111881] Avg episode reward: [(0, '5.900'), (1, '5.170')] [2023-09-22 17:38:26,263][112937] Updated weights for policy 0, policy_version 29184 (0.0017) [2023-09-22 17:38:26,263][112938] Updated weights for policy 1, policy_version 29120 (0.0017) [2023-09-22 17:38:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14942208. Throughput: 0: 787.4, 1: 787.4. Samples: 3730789. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:38:29,673][111881] Avg episode reward: [(0, '5.990'), (1, '5.070')] [2023-09-22 17:38:34,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 14974976. Throughput: 0: 785.1, 1: 784.7. Samples: 3740057. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:38:34,673][111881] Avg episode reward: [(0, '6.010'), (1, '4.720')] [2023-09-22 17:38:39,407][112938] Updated weights for policy 1, policy_version 29280 (0.0017) [2023-09-22 17:38:39,407][112937] Updated weights for policy 0, policy_version 29344 (0.0017) [2023-09-22 17:38:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15007744. Throughput: 0: 786.6, 1: 787.8. Samples: 3744960. 
Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:38:39,673][111881] Avg episode reward: [(0, '5.960'), (1, '4.800')] [2023-09-22 17:38:44,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15032320. Throughput: 0: 784.4, 1: 783.4. Samples: 3753986. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0) [2023-09-22 17:38:44,673][111881] Avg episode reward: [(0, '5.480'), (1, '4.620')] [2023-09-22 17:38:49,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15065088. Throughput: 0: 784.2, 1: 783.7. Samples: 3763327. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:49,674][111881] Avg episode reward: [(0, '5.630'), (1, '4.790')] [2023-09-22 17:38:52,737][112938] Updated weights for policy 1, policy_version 29440 (0.0015) [2023-09-22 17:38:52,737][112937] Updated weights for policy 0, policy_version 29504 (0.0015) [2023-09-22 17:38:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15097856. Throughput: 0: 782.2, 1: 780.7. Samples: 3768089. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:54,673][111881] Avg episode reward: [(0, '5.750'), (1, '4.930')] [2023-09-22 17:38:59,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15130624. Throughput: 0: 781.7, 1: 781.6. Samples: 3777450. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:38:59,674][111881] Avg episode reward: [(0, '5.770'), (1, '5.110')] [2023-09-22 17:39:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 15163392. Throughput: 0: 777.9, 1: 777.2. Samples: 3786792. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:04,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.120')] [2023-09-22 17:39:05,602][112938] Updated weights for policy 1, policy_version 29600 (0.0016) [2023-09-22 17:39:05,602][112937] Updated weights for policy 0, policy_version 29664 (0.0015) [2023-09-22 17:39:09,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15196160. Throughput: 0: 783.8, 1: 784.3. Samples: 3791792. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:09,673][111881] Avg episode reward: [(0, '5.630'), (1, '5.820')] [2023-09-22 17:39:14,672][111881] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 15224832. Throughput: 0: 781.5, 1: 780.8. Samples: 3801093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:14,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.670')] [2023-09-22 17:39:18,759][112937] Updated weights for policy 0, policy_version 29824 (0.0017) [2023-09-22 17:39:18,759][112938] Updated weights for policy 1, policy_version 29760 (0.0016) [2023-09-22 17:39:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15253504. Throughput: 0: 782.2, 1: 781.8. Samples: 3810438. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:19,674][111881] Avg episode reward: [(0, '5.660'), (1, '4.970')] [2023-09-22 17:39:24,672][111881] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15286272. Throughput: 0: 782.8, 1: 782.0. Samples: 3815374. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:24,673][111881] Avg episode reward: [(0, '5.380'), (1, '4.890')] [2023-09-22 17:39:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15319040. Throughput: 0: 783.3, 1: 783.6. Samples: 3824498. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:29,673][111881] Avg episode reward: [(0, '5.550'), (1, '4.450')] [2023-09-22 17:39:31,796][112937] Updated weights for policy 0, policy_version 29984 (0.0017) [2023-09-22 17:39:31,797][112938] Updated weights for policy 1, policy_version 29920 (0.0017) [2023-09-22 17:39:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15351808. Throughput: 0: 784.6, 1: 785.0. Samples: 3833960. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:34,673][111881] Avg episode reward: [(0, '5.870'), (1, '4.930')] [2023-09-22 17:39:39,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15384576. Throughput: 0: 784.4, 1: 785.5. Samples: 3838735. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:39,674][111881] Avg episode reward: [(0, '5.920'), (1, '5.430')] [2023-09-22 17:39:44,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15409152. Throughput: 0: 786.4, 1: 785.7. Samples: 3848192. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:39:44,673][111881] Avg episode reward: [(0, '5.930'), (1, '5.450')] [2023-09-22 17:39:44,797][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000030080_7700480.pth... [2023-09-22 17:39:44,810][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000030144_7716864.pth... [2023-09-22 17:39:44,812][112937] Updated weights for policy 0, policy_version 30144 (0.0015) [2023-09-22 17:39:44,812][112938] Updated weights for policy 1, policy_version 30080 (0.0018) [2023-09-22 17:39:44,827][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000027136_6946816.pth [2023-09-22 17:39:44,841][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000027200_6963200.pth [2023-09-22 17:39:49,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15441920. 
Throughput: 0: 781.2, 1: 781.3. Samples: 3857104. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:39:49,673][111881] Avg episode reward: [(0, '5.500'), (1, '6.160')] [2023-09-22 17:39:49,674][112735] Saving new best policy, reward=6.160! [2023-09-22 17:39:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15474688. Throughput: 0: 778.6, 1: 778.1. Samples: 3861846. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:39:54,673][111881] Avg episode reward: [(0, '5.010'), (1, '5.950')] [2023-09-22 17:39:58,238][112937] Updated weights for policy 0, policy_version 30304 (0.0017) [2023-09-22 17:39:58,239][112938] Updated weights for policy 1, policy_version 30240 (0.0017) [2023-09-22 17:39:59,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15507456. Throughput: 0: 775.6, 1: 776.1. Samples: 3870916. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:39:59,674][111881] Avg episode reward: [(0, '5.020'), (1, '5.890')] [2023-09-22 17:40:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 15532032. Throughput: 0: 776.5, 1: 777.1. Samples: 3880350. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0) [2023-09-22 17:40:04,673][111881] Avg episode reward: [(0, '5.130'), (1, '5.530')] [2023-09-22 17:40:09,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15564800. Throughput: 0: 774.4, 1: 774.1. Samples: 3885056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:40:09,673][111881] Avg episode reward: [(0, '5.150'), (1, '4.780')] [2023-09-22 17:40:11,369][112937] Updated weights for policy 0, policy_version 30464 (0.0017) [2023-09-22 17:40:11,369][112938] Updated weights for policy 1, policy_version 30400 (0.0017) [2023-09-22 17:40:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 15597568. Throughput: 0: 779.2, 1: 779.8. Samples: 3894651. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:40:14,673][111881] Avg episode reward: [(0, '5.330'), (1, '4.540')] [2023-09-22 17:40:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15630336. Throughput: 0: 777.7, 1: 777.7. Samples: 3903953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:40:19,673][111881] Avg episode reward: [(0, '5.700'), (1, '4.580')] [2023-09-22 17:40:24,272][112938] Updated weights for policy 1, policy_version 30560 (0.0018) [2023-09-22 17:40:24,273][112937] Updated weights for policy 0, policy_version 30624 (0.0018) [2023-09-22 17:40:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15663104. Throughput: 0: 778.6, 1: 779.3. Samples: 3908841. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:40:24,673][111881] Avg episode reward: [(0, '5.770'), (1, '4.670')] [2023-09-22 17:40:29,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15695872. Throughput: 0: 779.0, 1: 780.1. Samples: 3918355. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:40:29,673][111881] Avg episode reward: [(0, '5.710'), (1, '4.840')] [2023-09-22 17:40:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 15728640. Throughput: 0: 788.8, 1: 788.0. Samples: 3928064. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:40:34,673][111881] Avg episode reward: [(0, '5.770'), (1, '4.860')] [2023-09-22 17:40:37,107][112937] Updated weights for policy 0, policy_version 30784 (0.0016) [2023-09-22 17:40:37,107][112938] Updated weights for policy 1, policy_version 30720 (0.0017) [2023-09-22 17:40:39,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 15753216. Throughput: 0: 785.9, 1: 786.6. Samples: 3932609. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:40:39,674][111881] Avg episode reward: [(0, '5.850'), (1, '4.650')] [2023-09-22 17:40:44,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15785984. Throughput: 0: 793.6, 1: 793.9. Samples: 3942356. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:40:44,673][111881] Avg episode reward: [(0, '5.730'), (1, '4.840')] [2023-09-22 17:40:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15818752. Throughput: 0: 792.3, 1: 791.8. Samples: 3951639. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:40:49,674][111881] Avg episode reward: [(0, '6.010'), (1, '5.040')] [2023-09-22 17:40:50,085][112938] Updated weights for policy 1, policy_version 30880 (0.0019) [2023-09-22 17:40:50,085][112937] Updated weights for policy 0, policy_version 30944 (0.0017) [2023-09-22 17:40:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15851520. Throughput: 0: 794.1, 1: 794.0. Samples: 3956522. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:40:54,673][111881] Avg episode reward: [(0, '5.960'), (1, '5.050')] [2023-09-22 17:40:59,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 15884288. Throughput: 0: 786.0, 1: 786.3. Samples: 3965402. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:40:59,673][111881] Avg episode reward: [(0, '5.690'), (1, '5.310')] [2023-09-22 17:41:03,295][112937] Updated weights for policy 0, policy_version 31104 (0.0017) [2023-09-22 17:41:03,295][112938] Updated weights for policy 1, policy_version 31040 (0.0018) [2023-09-22 17:41:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 15917056. Throughput: 0: 791.5, 1: 790.8. Samples: 3975158. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:41:04,673][111881] Avg episode reward: [(0, '5.390'), (1, '5.460')] [2023-09-22 17:41:09,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15941632. Throughput: 0: 787.2, 1: 786.9. Samples: 3979677. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:41:09,673][111881] Avg episode reward: [(0, '5.630'), (1, '5.430')] [2023-09-22 17:41:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15974400. Throughput: 0: 790.3, 1: 790.0. Samples: 3989468. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:14,674][111881] Avg episode reward: [(0, '5.580'), (1, '5.200')] [2023-09-22 17:41:16,214][112937] Updated weights for policy 0, policy_version 31264 (0.0017) [2023-09-22 17:41:16,214][112938] Updated weights for policy 1, policy_version 31200 (0.0019) [2023-09-22 17:41:19,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16007168. Throughput: 0: 783.6, 1: 784.6. Samples: 3998633. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:19,673][111881] Avg episode reward: [(0, '5.650'), (1, '4.690')] [2023-09-22 17:41:24,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 16039936. Throughput: 0: 787.9, 1: 788.1. Samples: 4003529. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:24,674][111881] Avg episode reward: [(0, '5.900'), (1, '4.270')] [2023-09-22 17:41:29,211][112937] Updated weights for policy 0, policy_version 31424 (0.0017) [2023-09-22 17:41:29,211][112938] Updated weights for policy 1, policy_version 31360 (0.0017) [2023-09-22 17:41:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16072704. Throughput: 0: 783.6, 1: 784.2. Samples: 4012907. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:29,674][111881] Avg episode reward: [(0, '5.990'), (1, '4.290')] [2023-09-22 17:41:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16105472. Throughput: 0: 785.2, 1: 785.2. Samples: 4022310. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:34,673][111881] Avg episode reward: [(0, '5.650'), (1, '4.710')] [2023-09-22 17:41:39,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 16138240. Throughput: 0: 786.4, 1: 786.1. Samples: 4027285. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:39,674][111881] Avg episode reward: [(0, '5.870'), (1, '4.970')] [2023-09-22 17:41:42,132][112937] Updated weights for policy 0, policy_version 31584 (0.0017) [2023-09-22 17:41:42,132][112938] Updated weights for policy 1, policy_version 31520 (0.0014) [2023-09-22 17:41:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 16171008. Throughput: 0: 791.8, 1: 790.6. Samples: 4036610. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:44,673][111881] Avg episode reward: [(0, '5.850'), (1, '5.270')] [2023-09-22 17:41:44,686][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000031616_8093696.pth... [2023-09-22 17:41:44,686][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000031552_8077312.pth... [2023-09-22 17:41:44,716][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000028672_7340032.pth [2023-09-22 17:41:44,722][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000028608_7323648.pth [2023-09-22 17:41:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16195584. Throughput: 0: 788.8, 1: 788.0. Samples: 4046110. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:41:49,674][111881] Avg episode reward: [(0, '5.900'), (1, '5.630')] [2023-09-22 17:41:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16228352. Throughput: 0: 792.2, 1: 791.5. Samples: 4050944. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:41:54,674][111881] Avg episode reward: [(0, '5.940'), (1, '5.240')] [2023-09-22 17:41:55,099][112937] Updated weights for policy 0, policy_version 31744 (0.0015) [2023-09-22 17:41:55,100][112938] Updated weights for policy 1, policy_version 31680 (0.0018) [2023-09-22 17:41:59,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16261120. Throughput: 0: 788.0, 1: 788.2. Samples: 4060401. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:41:59,674][111881] Avg episode reward: [(0, '6.090'), (1, '5.300')] [2023-09-22 17:42:04,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16293888. Throughput: 0: 787.8, 1: 787.7. Samples: 4069530. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:42:04,673][111881] Avg episode reward: [(0, '5.950'), (1, '4.980')] [2023-09-22 17:42:08,278][112937] Updated weights for policy 0, policy_version 31904 (0.0017) [2023-09-22 17:42:08,278][112938] Updated weights for policy 1, policy_version 31840 (0.0018) [2023-09-22 17:42:09,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6275.9). Total num frames: 16326656. Throughput: 0: 785.6, 1: 785.0. Samples: 4074209. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:42:09,673][111881] Avg episode reward: [(0, '5.900'), (1, '4.960')] [2023-09-22 17:42:14,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16351232. Throughput: 0: 787.4, 1: 786.0. Samples: 4083712. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:14,673][111881] Avg episode reward: [(0, '5.940'), (1, '5.350')] [2023-09-22 17:42:19,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16384000. Throughput: 0: 787.5, 1: 788.3. Samples: 4093219. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:19,673][111881] Avg episode reward: [(0, '6.070'), (1, '5.470')] [2023-09-22 17:42:21,200][112938] Updated weights for policy 1, policy_version 32000 (0.0022) [2023-09-22 17:42:21,200][112937] Updated weights for policy 0, policy_version 32064 (0.0020) [2023-09-22 17:42:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16416768. Throughput: 0: 786.1, 1: 786.4. Samples: 4098048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:24,673][111881] Avg episode reward: [(0, '5.840'), (1, '5.770')] [2023-09-22 17:42:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16449536. Throughput: 0: 789.1, 1: 789.6. Samples: 4107654. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:29,674][111881] Avg episode reward: [(0, '5.640'), (1, '6.070')] [2023-09-22 17:42:34,047][112937] Updated weights for policy 0, policy_version 32224 (0.0017) [2023-09-22 17:42:34,047][112938] Updated weights for policy 1, policy_version 32160 (0.0019) [2023-09-22 17:42:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16482304. Throughput: 0: 787.9, 1: 789.1. Samples: 4117074. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:34,673][111881] Avg episode reward: [(0, '5.840'), (1, '5.980')] [2023-09-22 17:42:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16515072. Throughput: 0: 787.5, 1: 789.9. Samples: 4121929. 
Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:42:39,673][111881] Avg episode reward: [(0, '5.770'), (1, '5.910')] [2023-09-22 17:42:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 16547840. Throughput: 0: 787.7, 1: 787.3. Samples: 4131276. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:42:44,673][111881] Avg episode reward: [(0, '5.740'), (1, '5.620')] [2023-09-22 17:42:46,901][112937] Updated weights for policy 0, policy_version 32384 (0.0016) [2023-09-22 17:42:46,901][112938] Updated weights for policy 1, policy_version 32320 (0.0017) [2023-09-22 17:42:49,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6417.1, 300 sec: 6303.7). Total num frames: 16580608. Throughput: 0: 795.2, 1: 794.3. Samples: 4141056. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:42:49,674][111881] Avg episode reward: [(0, '5.740'), (1, '5.020')] [2023-09-22 17:42:54,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16605184. Throughput: 0: 792.2, 1: 792.7. Samples: 4145531. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0) [2023-09-22 17:42:54,674][111881] Avg episode reward: [(0, '5.640'), (1, '4.320')] [2023-09-22 17:42:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16637952. Throughput: 0: 788.3, 1: 788.9. Samples: 4154686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:42:59,673][111881] Avg episode reward: [(0, '5.760'), (1, '4.830')] [2023-09-22 17:43:00,177][112937] Updated weights for policy 0, policy_version 32544 (0.0016) [2023-09-22 17:43:00,177][112938] Updated weights for policy 1, policy_version 32480 (0.0018) [2023-09-22 17:43:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16670720. Throughput: 0: 786.3, 1: 786.0. Samples: 4163969. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:04,673][111881] Avg episode reward: [(0, '5.550'), (1, '4.650')] [2023-09-22 17:43:09,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16703488. Throughput: 0: 785.4, 1: 785.6. Samples: 4168746. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:09,673][111881] Avg episode reward: [(0, '5.960'), (1, '4.840')] [2023-09-22 17:43:13,473][112937] Updated weights for policy 0, policy_version 32704 (0.0015) [2023-09-22 17:43:13,475][112938] Updated weights for policy 1, policy_version 32640 (0.0016) [2023-09-22 17:43:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16728064. Throughput: 0: 781.0, 1: 780.5. Samples: 4177920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:14,673][111881] Avg episode reward: [(0, '5.960'), (1, '5.300')] [2023-09-22 17:43:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16760832. Throughput: 0: 782.1, 1: 782.3. Samples: 4187472. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:19,673][111881] Avg episode reward: [(0, '6.190'), (1, '5.800')] [2023-09-22 17:43:24,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16793600. Throughput: 0: 782.6, 1: 780.2. Samples: 4192256. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:24,673][111881] Avg episode reward: [(0, '6.230'), (1, '5.520')] [2023-09-22 17:43:26,339][112938] Updated weights for policy 1, policy_version 32800 (0.0015) [2023-09-22 17:43:26,340][112937] Updated weights for policy 0, policy_version 32864 (0.0017) [2023-09-22 17:43:29,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 16826368. Throughput: 0: 781.4, 1: 781.8. Samples: 4201616. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:29,673][111881] Avg episode reward: [(0, '6.220'), (1, '5.430')] [2023-09-22 17:43:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16859136. Throughput: 0: 778.8, 1: 779.9. Samples: 4211200. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:34,673][111881] Avg episode reward: [(0, '6.230'), (1, '5.100')] [2023-09-22 17:43:39,197][112937] Updated weights for policy 0, policy_version 33024 (0.0016) [2023-09-22 17:43:39,197][112938] Updated weights for policy 1, policy_version 32960 (0.0016) [2023-09-22 17:43:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 16891904. Throughput: 0: 785.1, 1: 784.6. Samples: 4216171. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:39,673][111881] Avg episode reward: [(0, '5.860'), (1, '4.950')] [2023-09-22 17:43:44,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 16924672. Throughput: 0: 788.2, 1: 788.2. Samples: 4225628. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:44,673][111881] Avg episode reward: [(0, '6.240'), (1, '4.980')] [2023-09-22 17:43:44,680][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000033088_8470528.pth... [2023-09-22 17:43:44,681][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000033024_8454144.pth... [2023-09-22 17:43:44,715][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000030144_7716864.pth [2023-09-22 17:43:44,716][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000030080_7700480.pth [2023-09-22 17:43:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6303.7). Total num frames: 16957440. Throughput: 0: 792.7, 1: 791.3. Samples: 4235249. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:49,673][111881] Avg episode reward: [(0, '6.370'), (1, '5.060')] [2023-09-22 17:43:52,333][112937] Updated weights for policy 0, policy_version 33184 (0.0018) [2023-09-22 17:43:52,333][112938] Updated weights for policy 1, policy_version 33120 (0.0017) [2023-09-22 17:43:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 16982016. Throughput: 0: 784.8, 1: 785.1. Samples: 4239390. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:54,673][111881] Avg episode reward: [(0, '6.170'), (1, '5.390')] [2023-09-22 17:43:59,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17014784. Throughput: 0: 791.1, 1: 792.4. Samples: 4249174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:43:59,674][111881] Avg episode reward: [(0, '5.870'), (1, '5.580')] [2023-09-22 17:44:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17047552. Throughput: 0: 788.7, 1: 789.6. Samples: 4258496. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:04,673][111881] Avg episode reward: [(0, '5.970'), (1, '5.530')] [2023-09-22 17:44:05,341][112937] Updated weights for policy 0, policy_version 33344 (0.0016) [2023-09-22 17:44:05,341][112938] Updated weights for policy 1, policy_version 33280 (0.0016) [2023-09-22 17:44:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6289.8). Total num frames: 17080320. Throughput: 0: 787.1, 1: 787.5. Samples: 4263111. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:09,673][111881] Avg episode reward: [(0, '5.490'), (1, '5.700')] [2023-09-22 17:44:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6417.1, 300 sec: 6303.7). Total num frames: 17113088. Throughput: 0: 785.6, 1: 785.4. Samples: 4272311. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:14,673][111881] Avg episode reward: [(0, '5.010'), (1, '5.770')] [2023-09-22 17:44:18,571][112938] Updated weights for policy 1, policy_version 33440 (0.0016) [2023-09-22 17:44:18,572][112937] Updated weights for policy 0, policy_version 33504 (0.0015) [2023-09-22 17:44:19,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17137664. Throughput: 0: 783.9, 1: 783.5. Samples: 4281733. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:19,674][111881] Avg episode reward: [(0, '4.920'), (1, '5.760')] [2023-09-22 17:44:24,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 17170432. Throughput: 0: 781.3, 1: 780.8. Samples: 4286464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:24,673][111881] Avg episode reward: [(0, '4.750'), (1, '5.370')] [2023-09-22 17:44:29,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17203200. Throughput: 0: 780.4, 1: 780.5. Samples: 4295871. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:29,674][111881] Avg episode reward: [(0, '4.700'), (1, '5.260')] [2023-09-22 17:44:31,600][112937] Updated weights for policy 0, policy_version 33664 (0.0019) [2023-09-22 17:44:31,600][112938] Updated weights for policy 1, policy_version 33600 (0.0018) [2023-09-22 17:44:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17235968. Throughput: 0: 778.4, 1: 779.4. Samples: 4305348. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:34,673][111881] Avg episode reward: [(0, '4.860'), (1, '5.430')] [2023-09-22 17:44:39,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17268736. Throughput: 0: 788.4, 1: 788.4. Samples: 4310343. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:39,673][111881] Avg episode reward: [(0, '5.250'), (1, '5.160')] [2023-09-22 17:44:44,489][112938] Updated weights for policy 1, policy_version 33760 (0.0015) [2023-09-22 17:44:44,489][112937] Updated weights for policy 0, policy_version 33824 (0.0014) [2023-09-22 17:44:44,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17301504. Throughput: 0: 781.4, 1: 781.2. Samples: 4319491. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:44:44,673][111881] Avg episode reward: [(0, '5.470'), (1, '5.390')] [2023-09-22 17:44:49,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6289.8). Total num frames: 17330176. Throughput: 0: 787.2, 1: 785.4. Samples: 4329263. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:44:49,673][111881] Avg episode reward: [(0, '5.720'), (1, '5.580')] [2023-09-22 17:44:54,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17358848. Throughput: 0: 783.7, 1: 784.1. Samples: 4333660. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:44:54,673][111881] Avg episode reward: [(0, '5.830'), (1, '4.960')] [2023-09-22 17:44:57,516][112937] Updated weights for policy 0, policy_version 33984 (0.0014) [2023-09-22 17:44:57,518][112938] Updated weights for policy 1, policy_version 33920 (0.0016) [2023-09-22 17:44:59,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17391616. Throughput: 0: 788.7, 1: 788.9. Samples: 4343302. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:44:59,673][111881] Avg episode reward: [(0, '5.890'), (1, '4.870')] [2023-09-22 17:45:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17424384. Throughput: 0: 789.6, 1: 789.4. Samples: 4352791. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:04,673][111881] Avg episode reward: [(0, '5.670'), (1, '4.990')] [2023-09-22 17:45:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17457152. Throughput: 0: 791.2, 1: 792.2. Samples: 4357718. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:09,673][111881] Avg episode reward: [(0, '5.480'), (1, '5.030')] [2023-09-22 17:45:10,541][112938] Updated weights for policy 1, policy_version 34080 (0.0015) [2023-09-22 17:45:10,543][112937] Updated weights for policy 0, policy_version 34144 (0.0016) [2023-09-22 17:45:14,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17489920. Throughput: 0: 787.3, 1: 787.6. Samples: 4366742. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:14,673][111881] Avg episode reward: [(0, '5.550'), (1, '5.050')] [2023-09-22 17:45:19,673][111881] Fps is (10 sec: 6143.9, 60 sec: 6348.8, 300 sec: 6289.8). Total num frames: 17518592. Throughput: 0: 789.2, 1: 789.3. Samples: 4376382. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:19,673][111881] Avg episode reward: [(0, '5.690'), (1, '5.020')] [2023-09-22 17:45:23,712][112938] Updated weights for policy 1, policy_version 34240 (0.0017) [2023-09-22 17:45:23,712][112937] Updated weights for policy 0, policy_version 34304 (0.0017) [2023-09-22 17:45:24,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17547264. Throughput: 0: 781.6, 1: 781.2. Samples: 4380673. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:24,673][111881] Avg episode reward: [(0, '5.720'), (1, '5.110')] [2023-09-22 17:45:29,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 17580032. Throughput: 0: 783.7, 1: 783.4. Samples: 4390008. 
Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:45:29,673][111881] Avg episode reward: [(0, '5.450'), (1, '4.880')] [2023-09-22 17:45:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17612800. Throughput: 0: 776.2, 1: 776.8. Samples: 4399152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:45:34,673][111881] Avg episode reward: [(0, '5.570'), (1, '4.990')] [2023-09-22 17:45:36,923][112938] Updated weights for policy 1, policy_version 34400 (0.0017) [2023-09-22 17:45:36,923][112937] Updated weights for policy 0, policy_version 34464 (0.0017) [2023-09-22 17:45:39,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17645568. Throughput: 0: 782.1, 1: 781.9. Samples: 4404042. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:45:39,673][111881] Avg episode reward: [(0, '5.670'), (1, '5.360')] [2023-09-22 17:45:44,673][111881] Fps is (10 sec: 6143.8, 60 sec: 6212.2, 300 sec: 6289.8). Total num frames: 17674240. Throughput: 0: 779.9, 1: 778.9. Samples: 4413449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:45:44,674][111881] Avg episode reward: [(0, '5.980'), (1, '5.230')] [2023-09-22 17:45:44,687][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000034496_8830976.pth... [2023-09-22 17:45:44,687][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000034560_8847360.pth... [2023-09-22 17:45:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000031552_8077312.pth [2023-09-22 17:45:44,723][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000031616_8093696.pth [2023-09-22 17:45:49,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6212.2, 300 sec: 6275.9). Total num frames: 17702912. Throughput: 0: 780.6, 1: 781.3. Samples: 4423075. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:45:49,674][111881] Avg episode reward: [(0, '6.040'), (1, '5.420')] [2023-09-22 17:45:49,901][112938] Updated weights for policy 1, policy_version 34560 (0.0018) [2023-09-22 17:45:49,901][112937] Updated weights for policy 0, policy_version 34624 (0.0018) [2023-09-22 17:45:54,673][111881] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17735680. Throughput: 0: 779.0, 1: 777.9. Samples: 4427776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:45:54,673][111881] Avg episode reward: [(0, '6.610'), (1, '5.440')] [2023-09-22 17:45:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17768448. Throughput: 0: 779.6, 1: 779.4. Samples: 4436899. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:45:59,673][111881] Avg episode reward: [(0, '6.660'), (1, '5.460')] [2023-09-22 17:46:03,060][112937] Updated weights for policy 0, policy_version 34784 (0.0018) [2023-09-22 17:46:03,060][112938] Updated weights for policy 1, policy_version 34720 (0.0016) [2023-09-22 17:46:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17801216. Throughput: 0: 776.3, 1: 775.5. Samples: 4446212. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:46:04,673][111881] Avg episode reward: [(0, '6.690'), (1, '5.530')] [2023-09-22 17:46:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17833984. Throughput: 0: 782.0, 1: 782.9. Samples: 4451095. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:46:09,673][111881] Avg episode reward: [(0, '6.970'), (1, '5.730')] [2023-09-22 17:46:09,674][112639] Saving new best policy, reward=6.970! [2023-09-22 17:46:14,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 17866752. Throughput: 0: 784.4, 1: 784.5. Samples: 4460608. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:46:14,673][111881] Avg episode reward: [(0, '6.950'), (1, '5.990')] [2023-09-22 17:46:15,868][112938] Updated weights for policy 1, policy_version 34880 (0.0016) [2023-09-22 17:46:15,868][112937] Updated weights for policy 0, policy_version 34944 (0.0019) [2023-09-22 17:46:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6275.9). Total num frames: 17891328. Throughput: 0: 790.4, 1: 789.8. Samples: 4470259. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:46:19,673][111881] Avg episode reward: [(0, '7.100'), (1, '6.320')] [2023-09-22 17:46:19,863][112639] Saving new best policy, reward=7.100! [2023-09-22 17:46:19,865][112735] Saving new best policy, reward=6.320! [2023-09-22 17:46:24,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17924096. Throughput: 0: 787.4, 1: 786.8. Samples: 4474880. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:24,673][111881] Avg episode reward: [(0, '7.020'), (1, '6.450')] [2023-09-22 17:46:24,674][112735] Saving new best policy, reward=6.450! [2023-09-22 17:46:29,190][112938] Updated weights for policy 1, policy_version 35040 (0.0016) [2023-09-22 17:46:29,190][112937] Updated weights for policy 0, policy_version 35104 (0.0017) [2023-09-22 17:46:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17956864. Throughput: 0: 784.4, 1: 784.8. Samples: 4484063. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:29,674][111881] Avg episode reward: [(0, '6.720'), (1, '6.280')] [2023-09-22 17:46:34,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17989632. Throughput: 0: 781.0, 1: 779.8. Samples: 4493312. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:34,673][111881] Avg episode reward: [(0, '6.470'), (1, '6.020')] [2023-09-22 17:46:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18022400. Throughput: 0: 780.1, 1: 780.7. Samples: 4498012. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:39,673][111881] Avg episode reward: [(0, '6.300'), (1, '5.760')] [2023-09-22 17:46:42,144][112938] Updated weights for policy 1, policy_version 35200 (0.0017) [2023-09-22 17:46:42,144][112937] Updated weights for policy 0, policy_version 35264 (0.0016) [2023-09-22 17:46:44,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6275.9). Total num frames: 18046976. Throughput: 0: 786.4, 1: 785.8. Samples: 4507648. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:44,673][111881] Avg episode reward: [(0, '6.420'), (1, '5.550')] [2023-09-22 17:46:49,673][111881] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18079744. Throughput: 0: 788.3, 1: 789.1. Samples: 4517195. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:49,674][111881] Avg episode reward: [(0, '6.260'), (1, '5.530')] [2023-09-22 17:46:54,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18112512. Throughput: 0: 788.1, 1: 787.2. Samples: 4521984. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:54,673][111881] Avg episode reward: [(0, '6.370'), (1, '5.590')] [2023-09-22 17:46:55,099][112937] Updated weights for policy 0, policy_version 35424 (0.0016) [2023-09-22 17:46:55,099][112938] Updated weights for policy 1, policy_version 35360 (0.0017) [2023-09-22 17:46:59,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18145280. Throughput: 0: 786.6, 1: 786.0. Samples: 4531375. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:46:59,673][111881] Avg episode reward: [(0, '6.540'), (1, '5.630')] [2023-09-22 17:47:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18178048. Throughput: 0: 780.4, 1: 781.2. Samples: 4540531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:47:04,673][111881] Avg episode reward: [(0, '6.310'), (1, '5.420')] [2023-09-22 17:47:08,228][112938] Updated weights for policy 1, policy_version 35520 (0.0017) [2023-09-22 17:47:08,229][112937] Updated weights for policy 0, policy_version 35584 (0.0016) [2023-09-22 17:47:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 18210816. Throughput: 0: 782.8, 1: 783.9. Samples: 4545380. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:47:09,673][111881] Avg episode reward: [(0, '6.440'), (1, '5.330')] [2023-09-22 17:47:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 18243584. Throughput: 0: 785.7, 1: 785.9. Samples: 4554785. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:47:14,673][111881] Avg episode reward: [(0, '6.620'), (1, '5.140')] [2023-09-22 17:47:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18268160. Throughput: 0: 789.2, 1: 790.4. Samples: 4564394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:47:19,673][111881] Avg episode reward: [(0, '6.290'), (1, '4.970')] [2023-09-22 17:47:21,204][112937] Updated weights for policy 0, policy_version 35744 (0.0018) [2023-09-22 17:47:21,204][112938] Updated weights for policy 1, policy_version 35680 (0.0016) [2023-09-22 17:47:24,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18300928. Throughput: 0: 790.0, 1: 789.4. Samples: 4569088. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:47:24,673][111881] Avg episode reward: [(0, '6.400'), (1, '4.930')] [2023-09-22 17:47:29,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18333696. Throughput: 0: 787.0, 1: 788.1. Samples: 4578526. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-09-22 17:47:29,673][111881] Avg episode reward: [(0, '6.510'), (1, '4.960')] [2023-09-22 17:47:34,322][112938] Updated weights for policy 1, policy_version 35840 (0.0018) [2023-09-22 17:47:34,322][112937] Updated weights for policy 0, policy_version 35904 (0.0019) [2023-09-22 17:47:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18366464. Throughput: 0: 782.0, 1: 781.8. Samples: 4587567. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:47:34,673][111881] Avg episode reward: [(0, '6.600'), (1, '4.940')] [2023-09-22 17:47:39,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18399232. Throughput: 0: 783.2, 1: 784.4. Samples: 4592525. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:47:39,674][111881] Avg episode reward: [(0, '6.490'), (1, '5.120')] [2023-09-22 17:47:44,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18423808. Throughput: 0: 783.3, 1: 782.9. Samples: 4601856. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:47:44,673][111881] Avg episode reward: [(0, '6.360'), (1, '5.260')] [2023-09-22 17:47:44,746][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000036032_9224192.pth... [2023-09-22 17:47:44,763][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000035968_9207808.pth... 
[2023-09-22 17:47:44,776][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000033088_8470528.pth [2023-09-22 17:47:44,792][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000033024_8454144.pth [2023-09-22 17:47:47,317][112938] Updated weights for policy 1, policy_version 36000 (0.0018) [2023-09-22 17:47:47,317][112937] Updated weights for policy 0, policy_version 36064 (0.0018) [2023-09-22 17:47:49,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18456576. Throughput: 0: 786.9, 1: 786.5. Samples: 4611335. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0) [2023-09-22 17:47:49,673][111881] Avg episode reward: [(0, '6.750'), (1, '5.180')] [2023-09-22 17:47:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18489344. Throughput: 0: 787.0, 1: 786.2. Samples: 4616175. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:47:54,673][111881] Avg episode reward: [(0, '6.330'), (1, '5.260')] [2023-09-22 17:47:59,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18522112. Throughput: 0: 786.7, 1: 787.8. Samples: 4625638. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:47:59,673][111881] Avg episode reward: [(0, '6.390'), (1, '5.480')] [2023-09-22 17:48:00,256][112938] Updated weights for policy 1, policy_version 36160 (0.0016) [2023-09-22 17:48:00,256][112937] Updated weights for policy 0, policy_version 36224 (0.0015) [2023-09-22 17:48:04,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18554880. Throughput: 0: 784.8, 1: 784.6. Samples: 4635017. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:48:04,673][111881] Avg episode reward: [(0, '6.720'), (1, '5.600')] [2023-09-22 17:48:09,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6303.7). Total num frames: 18587648. Throughput: 0: 784.5, 1: 785.3. Samples: 4639726. 
Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:48:09,673][111881] Avg episode reward: [(0, '6.670'), (1, '5.640')] [2023-09-22 17:48:13,352][112938] Updated weights for policy 1, policy_version 36320 (0.0017) [2023-09-22 17:48:13,352][112937] Updated weights for policy 0, policy_version 36384 (0.0016) [2023-09-22 17:48:14,672][111881] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6275.9). Total num frames: 18612224. Throughput: 0: 783.2, 1: 782.3. Samples: 4648977. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0) [2023-09-22 17:48:14,673][111881] Avg episode reward: [(0, '6.620'), (1, '5.660')] [2023-09-22 17:48:19,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18644992. Throughput: 0: 782.6, 1: 782.9. Samples: 4658015. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:48:19,673][111881] Avg episode reward: [(0, '6.190'), (1, '5.550')] [2023-09-22 17:48:24,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18677760. Throughput: 0: 781.8, 1: 781.6. Samples: 4662878. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:48:24,673][111881] Avg episode reward: [(0, '6.380'), (1, '5.400')] [2023-09-22 17:48:26,636][112937] Updated weights for policy 0, policy_version 36544 (0.0016) [2023-09-22 17:48:26,636][112938] Updated weights for policy 1, policy_version 36480 (0.0014) [2023-09-22 17:48:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18710528. Throughput: 0: 782.4, 1: 783.7. Samples: 4672330. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:48:29,674][111881] Avg episode reward: [(0, '6.080'), (1, '5.760')] [2023-09-22 17:48:34,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18743296. Throughput: 0: 783.1, 1: 783.4. Samples: 4681828. 
Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:48:34,673][111881] Avg episode reward: [(0, '6.230'), (1, '5.370')] [2023-09-22 17:48:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18767872. Throughput: 0: 781.4, 1: 782.5. Samples: 4686550. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0) [2023-09-22 17:48:39,674][111881] Avg episode reward: [(0, '6.520'), (1, '5.400')] [2023-09-22 17:48:39,701][112937] Updated weights for policy 0, policy_version 36704 (0.0017) [2023-09-22 17:48:39,702][112938] Updated weights for policy 1, policy_version 36640 (0.0015) [2023-09-22 17:48:44,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18800640. Throughput: 0: 782.8, 1: 781.3. Samples: 4696021. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:48:44,673][111881] Avg episode reward: [(0, '6.820'), (1, '5.290')] [2023-09-22 17:48:49,673][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18833408. Throughput: 0: 783.6, 1: 781.8. Samples: 4705464. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:48:49,674][111881] Avg episode reward: [(0, '7.080'), (1, '5.080')] [2023-09-22 17:48:52,795][112937] Updated weights for policy 0, policy_version 36864 (0.0015) [2023-09-22 17:48:52,796][112938] Updated weights for policy 1, policy_version 36800 (0.0018) [2023-09-22 17:48:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18866176. Throughput: 0: 780.7, 1: 780.2. Samples: 4709968. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:48:54,673][111881] Avg episode reward: [(0, '7.100'), (1, '5.300')] [2023-09-22 17:48:59,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18898944. Throughput: 0: 780.3, 1: 780.4. Samples: 4719211. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:48:59,673][111881] Avg episode reward: [(0, '7.010'), (1, '5.350')] [2023-09-22 17:49:04,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18931712. Throughput: 0: 787.3, 1: 786.4. Samples: 4728832. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:04,673][111881] Avg episode reward: [(0, '6.950'), (1, '5.210')] [2023-09-22 17:49:05,855][112938] Updated weights for policy 1, policy_version 36960 (0.0019) [2023-09-22 17:49:05,855][112937] Updated weights for policy 0, policy_version 37024 (0.0018) [2023-09-22 17:49:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18956288. Throughput: 0: 783.4, 1: 783.2. Samples: 4733374. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:09,673][111881] Avg episode reward: [(0, '6.770'), (1, '5.490')] [2023-09-22 17:49:14,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18989056. Throughput: 0: 787.1, 1: 786.4. Samples: 4743140. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:14,674][111881] Avg episode reward: [(0, '6.460'), (1, '5.720')] [2023-09-22 17:49:18,869][112938] Updated weights for policy 1, policy_version 37120 (0.0017) [2023-09-22 17:49:18,870][112937] Updated weights for policy 0, policy_version 37184 (0.0015) [2023-09-22 17:49:19,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19021824. Throughput: 0: 782.9, 1: 783.1. Samples: 4752298. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:19,673][111881] Avg episode reward: [(0, '6.100'), (1, '5.740')] [2023-09-22 17:49:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19054592. Throughput: 0: 784.9, 1: 784.1. Samples: 4757154. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:24,673][111881] Avg episode reward: [(0, '5.950'), (1, '6.040')] [2023-09-22 17:49:29,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19087360. Throughput: 0: 777.4, 1: 778.5. Samples: 4766037. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:29,674][111881] Avg episode reward: [(0, '5.830'), (1, '5.760')] [2023-09-22 17:49:32,002][112937] Updated weights for policy 0, policy_version 37344 (0.0016) [2023-09-22 17:49:32,002][112938] Updated weights for policy 1, policy_version 37280 (0.0017) [2023-09-22 17:49:34,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19120128. Throughput: 0: 781.3, 1: 783.4. Samples: 4775875. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:34,673][111881] Avg episode reward: [(0, '5.570'), (1, '5.620')] [2023-09-22 17:49:39,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19144704. Throughput: 0: 782.3, 1: 782.9. Samples: 4780401. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:39,674][111881] Avg episode reward: [(0, '5.600'), (1, '5.850')] [2023-09-22 17:49:44,673][111881] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 19177472. Throughput: 0: 788.1, 1: 788.5. Samples: 4790158. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:44,673][111881] Avg episode reward: [(0, '5.500'), (1, '5.740')] [2023-09-22 17:49:44,684][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000037488_9596928.pth... [2023-09-22 17:49:44,685][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000037424_9580544.pth... 
[2023-09-22 17:49:44,715][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000034560_8847360.pth [2023-09-22 17:49:44,718][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000034496_8830976.pth [2023-09-22 17:49:44,952][112938] Updated weights for policy 1, policy_version 37440 (0.0013) [2023-09-22 17:49:44,954][112937] Updated weights for policy 0, policy_version 37504 (0.0017) [2023-09-22 17:49:49,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 19210240. Throughput: 0: 783.0, 1: 783.2. Samples: 4799312. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:49,673][111881] Avg episode reward: [(0, '5.700'), (1, '5.620')] [2023-09-22 17:49:54,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19243008. Throughput: 0: 786.1, 1: 783.8. Samples: 4804022. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:54,673][111881] Avg episode reward: [(0, '5.950'), (1, '5.690')] [2023-09-22 17:49:58,224][112938] Updated weights for policy 1, policy_version 37600 (0.0016) [2023-09-22 17:49:58,225][112937] Updated weights for policy 0, policy_version 37664 (0.0016) [2023-09-22 17:49:59,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19275776. Throughput: 0: 776.5, 1: 776.9. Samples: 4813041. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:49:59,674][111881] Avg episode reward: [(0, '5.940'), (1, '5.800')] [2023-09-22 17:50:04,672][111881] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6262.0). Total num frames: 19304448. Throughput: 0: 784.0, 1: 783.9. Samples: 4822853. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:04,673][111881] Avg episode reward: [(0, '6.200'), (1, '5.650')] [2023-09-22 17:50:09,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19333120. Throughput: 0: 779.4, 1: 779.5. Samples: 4827302. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:09,674][111881] Avg episode reward: [(0, '6.510'), (1, '5.530')] [2023-09-22 17:50:11,280][112937] Updated weights for policy 0, policy_version 37824 (0.0016) [2023-09-22 17:50:11,280][112938] Updated weights for policy 1, policy_version 37760 (0.0017) [2023-09-22 17:50:14,672][111881] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 19365888. Throughput: 0: 785.3, 1: 784.9. Samples: 4836696. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:14,673][111881] Avg episode reward: [(0, '6.510'), (1, '5.460')] [2023-09-22 17:50:19,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19398656. Throughput: 0: 777.9, 1: 777.2. Samples: 4845853. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:19,673][111881] Avg episode reward: [(0, '6.660'), (1, '5.180')] [2023-09-22 17:50:24,328][112938] Updated weights for policy 1, policy_version 37920 (0.0016) [2023-09-22 17:50:24,328][112937] Updated weights for policy 0, policy_version 37984 (0.0017) [2023-09-22 17:50:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19431424. Throughput: 0: 782.4, 1: 782.3. Samples: 4850811. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:24,673][111881] Avg episode reward: [(0, '6.380'), (1, '4.970')] [2023-09-22 17:50:29,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19464192. Throughput: 0: 777.1, 1: 777.3. Samples: 4860108. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:29,674][111881] Avg episode reward: [(0, '5.790'), (1, '5.100')] [2023-09-22 17:50:34,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19496960. Throughput: 0: 786.2, 1: 786.4. Samples: 4870082. 
Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:34,673][111881] Avg episode reward: [(0, '5.450'), (1, '5.180')] [2023-09-22 17:50:37,445][112937] Updated weights for policy 0, policy_version 38144 (0.0018) [2023-09-22 17:50:37,445][112938] Updated weights for policy 1, policy_version 38080 (0.0017) [2023-09-22 17:50:39,672][111881] Fps is (10 sec: 5734.6, 60 sec: 6280.6, 300 sec: 6262.0). Total num frames: 19521536. Throughput: 0: 779.5, 1: 780.9. Samples: 4874240. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:39,673][111881] Avg episode reward: [(0, '4.920'), (1, '5.010')] [2023-09-22 17:50:44,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19554304. Throughput: 0: 787.5, 1: 786.8. Samples: 4883885. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:50:44,673][111881] Avg episode reward: [(0, '5.110'), (1, '5.400')] [2023-09-22 17:50:49,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19587072. Throughput: 0: 781.8, 1: 781.9. Samples: 4893217. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:50:49,674][111881] Avg episode reward: [(0, '5.650'), (1, '5.570')] [2023-09-22 17:50:50,373][112938] Updated weights for policy 1, policy_version 38240 (0.0017) [2023-09-22 17:50:50,373][112937] Updated weights for policy 0, policy_version 38304 (0.0017) [2023-09-22 17:50:54,673][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19619840. Throughput: 0: 785.3, 1: 785.8. Samples: 4898004. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:50:54,673][111881] Avg episode reward: [(0, '6.010'), (1, '5.450')] [2023-09-22 17:50:59,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19644416. Throughput: 0: 781.7, 1: 781.0. Samples: 4907016. 
Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:50:59,674][111881] Avg episode reward: [(0, '6.210'), (1, '5.510')] [2023-09-22 17:51:03,605][112938] Updated weights for policy 1, policy_version 38400 (0.0017) [2023-09-22 17:51:03,605][112937] Updated weights for policy 0, policy_version 38464 (0.0016) [2023-09-22 17:51:04,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 19677184. Throughput: 0: 785.7, 1: 786.8. Samples: 4916616. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:51:04,673][111881] Avg episode reward: [(0, '6.440'), (1, '5.170')] [2023-09-22 17:51:09,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19709952. Throughput: 0: 784.1, 1: 783.3. Samples: 4921344. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0) [2023-09-22 17:51:09,673][111881] Avg episode reward: [(0, '6.250'), (1, '5.050')] [2023-09-22 17:51:14,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19742720. Throughput: 0: 786.4, 1: 785.4. Samples: 4930838. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:51:14,673][111881] Avg episode reward: [(0, '6.180'), (1, '5.180')] [2023-09-22 17:51:16,593][112938] Updated weights for policy 1, policy_version 38560 (0.0015) [2023-09-22 17:51:16,593][112937] Updated weights for policy 0, policy_version 38624 (0.0015) [2023-09-22 17:51:19,673][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19775488. Throughput: 0: 776.5, 1: 777.1. Samples: 4939996. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:51:19,674][111881] Avg episode reward: [(0, '5.790'), (1, '5.310')] [2023-09-22 17:51:24,672][111881] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19808256. Throughput: 0: 784.0, 1: 784.9. Samples: 4944841. 
Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:51:24,673][111881] Avg episode reward: [(0, '5.870'), (1, '5.400')] [2023-09-22 17:51:29,673][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19832832. Throughput: 0: 780.5, 1: 780.2. Samples: 4954114. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:51:29,674][111881] Avg episode reward: [(0, '5.870'), (1, '5.470')] [2023-09-22 17:51:29,784][112937] Updated weights for policy 0, policy_version 38784 (0.0018) [2023-09-22 17:51:29,784][112938] Updated weights for policy 1, policy_version 38720 (0.0018) [2023-09-22 17:51:34,672][111881] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19865600. Throughput: 0: 783.0, 1: 782.6. Samples: 4963665. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0) [2023-09-22 17:51:34,673][111881] Avg episode reward: [(0, '6.050'), (1, '5.380')] [2023-09-22 17:51:39,672][111881] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19898368. Throughput: 0: 783.4, 1: 782.0. Samples: 4968448. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:51:39,673][111881] Avg episode reward: [(0, '5.940'), (1, '5.190')] [2023-09-22 17:51:42,698][112937] Updated weights for policy 0, policy_version 38944 (0.0016) [2023-09-22 17:51:42,698][112938] Updated weights for policy 1, policy_version 38880 (0.0017) [2023-09-22 17:51:44,672][111881] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19931136. Throughput: 0: 787.2, 1: 788.4. Samples: 4977915. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:51:44,673][111881] Avg episode reward: [(0, '6.470'), (1, '5.190')] [2023-09-22 17:51:44,682][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000038896_9957376.pth... [2023-09-22 17:51:44,683][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000038960_9973760.pth... 
[2023-09-22 17:51:44,716][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000035968_9207808.pth [2023-09-22 17:51:44,718][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000036032_9224192.pth [2023-09-22 17:51:49,673][111881] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19963904. Throughput: 0: 784.8, 1: 783.9. Samples: 4987208. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:51:49,674][111881] Avg episode reward: [(0, '6.790'), (1, '5.080')] [2023-09-22 17:51:54,672][111881] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19996672. Throughput: 0: 785.9, 1: 785.9. Samples: 4992075. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:51:54,673][111881] Avg episode reward: [(0, '6.720'), (1, '4.830')] [2023-09-22 17:51:55,705][112937] Updated weights for policy 0, policy_version 39104 (0.0015) [2023-09-22 17:51:55,705][112938] Updated weights for policy 1, policy_version 39040 (0.0016) [2023-09-22 17:51:59,672][111881] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 20021248. Throughput: 0: 781.9, 1: 782.1. Samples: 5001217. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0) [2023-09-22 17:51:59,673][111881] Avg episode reward: [(0, '6.640'), (1, '4.740')] [2023-09-22 17:51:59,693][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000039152_10022912.pth... [2023-09-22 17:51:59,694][112945] Stopping RolloutWorker_w5... [2023-09-22 17:51:59,694][112939] Stopping RolloutWorker_w0... [2023-09-22 17:51:59,694][112947] Stopping RolloutWorker_w6... [2023-09-22 17:51:59,694][112946] Stopping RolloutWorker_w7... [2023-09-22 17:51:59,694][112943] Stopping RolloutWorker_w3... [2023-09-22 17:51:59,694][112942] Stopping RolloutWorker_w2... [2023-09-22 17:51:59,694][112945] Loop rollout_proc5_evt_loop terminating... [2023-09-22 17:51:59,694][112944] Stopping RolloutWorker_w4... 
[2023-09-22 17:51:59,694][112941] Stopping RolloutWorker_w1... [2023-09-22 17:51:59,695][112939] Loop rollout_proc0_evt_loop terminating... [2023-09-22 17:51:59,695][112947] Loop rollout_proc6_evt_loop terminating... [2023-09-22 17:51:59,695][112946] Loop rollout_proc7_evt_loop terminating... [2023-09-22 17:51:59,695][112943] Loop rollout_proc3_evt_loop terminating... [2023-09-22 17:51:59,695][112942] Loop rollout_proc2_evt_loop terminating... [2023-09-22 17:51:59,695][112944] Loop rollout_proc4_evt_loop terminating... [2023-09-22 17:51:59,694][111881] Component RolloutWorker_w0 stopped! [2023-09-22 17:51:59,695][112941] Loop rollout_proc1_evt_loop terminating... [2023-09-22 17:51:59,695][112735] Stopping Batcher_1... [2023-09-22 17:51:59,696][111881] Component RolloutWorker_w5 stopped! [2023-09-22 17:51:59,696][112735] Loop batcher_evt_loop terminating... [2023-09-22 17:51:59,696][111881] Component RolloutWorker_w6 stopped! [2023-09-22 17:51:59,697][111881] Component RolloutWorker_w7 stopped! [2023-09-22 17:51:59,698][111881] Component RolloutWorker_w4 stopped! [2023-09-22 17:51:59,700][111881] Component RolloutWorker_w3 stopped! [2023-09-22 17:51:59,701][111881] Component RolloutWorker_w2 stopped! [2023-09-22 17:51:59,701][111881] Component RolloutWorker_w1 stopped! [2023-09-22 17:51:59,702][111881] Component Batcher_1 stopped! [2023-09-22 17:51:59,702][111881] Component Batcher_0 stopped! [2023-09-22 17:51:59,699][112639] Stopping Batcher_0... [2023-09-22 17:51:59,726][112639] Loop batcher_evt_loop terminating... [2023-09-22 17:51:59,738][112639] Removing ./train_atari/Asteroids/checkpoint_p0/checkpoint_000037488_9596928.pth [2023-09-22 17:51:59,744][112639] Saving ./train_atari/Asteroids/checkpoint_p0/checkpoint_000039152_10022912.pth... [2023-09-22 17:51:59,746][112938] Weights refcount: 2 0 [2023-09-22 17:51:59,747][112938] Stopping InferenceWorker_p1-w0... [2023-09-22 17:51:59,748][112938] Loop inference_proc1-0_evt_loop terminating... 
[2023-09-22 17:51:59,748][111881] Component InferenceWorker_p1-w0 stopped! [2023-09-22 17:51:59,756][112937] Weights refcount: 2 0 [2023-09-22 17:51:59,757][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000039088_10006528.pth... [2023-09-22 17:51:59,757][112937] Stopping InferenceWorker_p0-w0... [2023-09-22 17:51:59,758][112937] Loop inference_proc0-0_evt_loop terminating... [2023-09-22 17:51:59,757][111881] Component InferenceWorker_p0-w0 stopped! [2023-09-22 17:51:59,796][112735] Removing ./train_atari/Asteroids/checkpoint_p1/checkpoint_000037424_9580544.pth [2023-09-22 17:51:59,800][112735] Saving ./train_atari/Asteroids/checkpoint_p1/checkpoint_000039088_10006528.pth... [2023-09-22 17:51:59,803][112639] Stopping LearnerWorker_p0... [2023-09-22 17:51:59,803][112639] Loop learner_proc0_evt_loop terminating... [2023-09-22 17:51:59,804][111881] Component LearnerWorker_p0 stopped! [2023-09-22 17:51:59,837][112735] Stopping LearnerWorker_p1... [2023-09-22 17:51:59,837][112735] Loop learner_proc1_evt_loop terminating... [2023-09-22 17:51:59,837][111881] Component LearnerWorker_p1 stopped! [2023-09-22 17:51:59,838][111881] Waiting for process learner_proc0 to stop... [2023-09-22 17:52:00,542][111881] Waiting for process learner_proc1 to stop... [2023-09-22 17:52:00,542][111881] Waiting for process inference_proc0-0 to join... [2023-09-22 17:52:00,543][111881] Waiting for process inference_proc1-0 to join... [2023-09-22 17:52:00,544][111881] Waiting for process rollout_proc0 to join... [2023-09-22 17:52:00,544][111881] Waiting for process rollout_proc1 to join... [2023-09-22 17:52:00,545][111881] Waiting for process rollout_proc2 to join... [2023-09-22 17:52:00,546][111881] Waiting for process rollout_proc3 to join... [2023-09-22 17:52:00,546][111881] Waiting for process rollout_proc4 to join... [2023-09-22 17:52:00,547][111881] Waiting for process rollout_proc5 to join... [2023-09-22 17:52:00,547][111881] Waiting for process rollout_proc6 to join... 
[2023-09-22 17:52:00,548][111881] Waiting for process rollout_proc7 to join...
[2023-09-22 17:52:00,549][111881] Batcher 0 profile tree view:
batching: 21.2721, releasing_batches: 1.7736
[2023-09-22 17:52:00,549][111881] Batcher 1 profile tree view:
batching: 21.0684, releasing_batches: 1.7681
[2023-09-22 17:52:00,549][111881] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0051
  wait_policy_total: 667.5353
update_model: 37.5844
  weight_update: 0.0014
one_step: 0.0012
  handle_policy_step: 2327.8999
    deserialize: 68.7588, stack: 16.4020, obs_to_device_normalize: 566.0973, forward: 1122.0235, send_messages: 96.3770
    prepare_outputs: 309.1293
      to_cpu: 156.3570
[2023-09-22 17:52:00,550][111881] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0052
  wait_policy_total: 673.0897
update_model: 39.2699
  weight_update: 0.0017
one_step: 0.0014
  handle_policy_step: 2325.7635
    deserialize: 69.2501, stack: 16.3985, obs_to_device_normalize: 561.8172, forward: 1126.5358, send_messages: 96.4687
    prepare_outputs: 308.1529
      to_cpu: 155.2783
[2023-09-22 17:52:00,550][111881] Learner 0 profile tree view:
misc: 0.0164, prepare_batch: 31.9383
train: 468.2595
  epoch_init: 0.1119, minibatch_init: 3.4668, losses_postprocess: 58.2648, kl_divergence: 5.9840, after_optimizer: 10.7143
  calculate_losses: 49.1894
    losses_init: 0.1189, forward_head: 15.0096, bptt_initial: 0.4655, bptt: 0.4836, tail: 11.5957, advantages_returns: 3.4075, losses: 14.1916
  update: 336.1996
    clip: 163.5737
[2023-09-22 17:52:00,550][111881] Learner 1 profile tree view:
misc: 0.0153, prepare_batch: 31.9400
train: 457.3194
  epoch_init: 0.1169, minibatch_init: 3.4758, losses_postprocess: 59.5395, kl_divergence: 5.8346, after_optimizer: 20.5176
  calculate_losses: 48.1619
    losses_init: 0.1125, forward_head: 14.6147, bptt_initial: 0.4654, bptt: 0.5470, tail: 11.3017, advantages_returns: 3.3467, losses: 13.8646
  update: 315.2555
    clip: 164.6686
[2023-09-22 17:52:00,551][111881] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4025, enqueue_policy_requests: 46.3733, env_step: 1108.2481, overhead: 31.9089, complete_rollouts: 1.1364
save_policy_outputs: 58.2084
  split_output_tensors: 20.1749
[2023-09-22 17:52:00,551][111881] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4062, enqueue_policy_requests: 45.6476, env_step: 1078.2113, overhead: 30.2215, complete_rollouts: 1.1347
save_policy_outputs: 56.0724
  split_output_tensors: 19.0281
[2023-09-22 17:52:00,551][111881] Loop Runner_EvtLoop terminating...
[2023-09-22 17:52:00,552][111881] Runner profile tree view:
main_loop: 3251.5209
[2023-09-22 17:52:00,552][111881] Collected {0: 10022912, 1: 10006528}, FPS: 6155.0