[2023-02-22 21:22:50,811][24717] Saving configuration to /home/flahoud/studies/collab/train_dir/default_experiment/config.json...
[2023-02-22 21:22:50,812][24717] Rollout worker 0 uses device cpu
[2023-02-22 21:22:50,813][24717] Rollout worker 1 uses device cpu
[2023-02-22 21:22:50,813][24717] Rollout worker 2 uses device cpu
[2023-02-22 21:22:50,814][24717] Rollout worker 3 uses device cpu
[2023-02-22 21:22:50,815][24717] Rollout worker 4 uses device cpu
[2023-02-22 21:22:50,815][24717] Rollout worker 5 uses device cpu
[2023-02-22 21:22:50,816][24717] Rollout worker 6 uses device cpu
[2023-02-22 21:22:50,817][24717] Rollout worker 7 uses device cpu
[2023-02-22 21:22:50,874][24717] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:22:50,875][24717] InferenceWorker_p0-w0: min num requests: 2
[2023-02-22 21:22:50,907][24717] Starting all processes...
[2023-02-22 21:22:50,908][24717] Starting process learner_proc0
[2023-02-22 21:22:50,957][24717] Starting all processes...
[2023-02-22 21:22:50,964][24717] Starting process inference_proc0-0
[2023-02-22 21:22:50,965][24717] Starting process rollout_proc0
[2023-02-22 21:22:50,965][24717] Starting process rollout_proc1
[2023-02-22 21:22:50,966][24717] Starting process rollout_proc2
[2023-02-22 21:22:50,967][24717] Starting process rollout_proc3
[2023-02-22 21:22:50,967][24717] Starting process rollout_proc4
[2023-02-22 21:22:50,968][24717] Starting process rollout_proc5
[2023-02-22 21:22:50,968][24717] Starting process rollout_proc6
[2023-02-22 21:22:50,968][24717] Starting process rollout_proc7
[2023-02-22 21:22:52,699][32247] Worker 1 uses CPU cores [1]
[2023-02-22 21:22:52,745][32230] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:22:52,745][32230] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-22 21:22:52,758][32230] Num visible devices: 1
[2023-02-22 21:22:52,803][32230] Starting seed is not provided
[2023-02-22 21:22:52,803][32230] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:22:52,803][32230] Initializing actor-critic model on device cuda:0
[2023-02-22 21:22:52,803][32230] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 21:22:52,804][32230] RunningMeanStd input shape: (1,)
[2023-02-22 21:22:52,815][32230] ConvEncoder: input_channels=3
[2023-02-22 21:22:52,855][32246] Worker 0 uses CPU cores [0]
[2023-02-22 21:22:52,864][32245] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:22:52,864][32245] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-22 21:22:52,878][32245] Num visible devices: 1
[2023-02-22 21:22:52,943][32249] Worker 3 uses CPU cores [3]
[2023-02-22 21:22:52,955][32253] Worker 4 uses CPU cores [4]
[2023-02-22 21:22:52,969][32230] Conv encoder output size: 512
[2023-02-22 21:22:52,970][32230] Policy head output size: 512
[2023-02-22 21:22:52,983][32230] Created Actor Critic model with architecture:
[2023-02-22 21:22:52,984][32230] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-22 21:22:52,989][32262] Worker 6 uses CPU cores [6]
[2023-02-22 21:22:53,089][32248] Worker 2 uses CPU cores [2]
[2023-02-22 21:22:53,090][32252] Worker 5 uses CPU cores [5]
[2023-02-22 21:22:53,203][32263] Worker 7 uses CPU cores [7]
[2023-02-22 21:22:55,724][32230] Using optimizer
[2023-02-22 21:22:55,725][32230] No checkpoints found
[2023-02-22 21:22:55,725][32230] Did not load from checkpoint, starting from scratch!
[2023-02-22 21:22:55,725][32230] Initialized policy 0 weights for model version 0
[2023-02-22 21:22:55,727][32230] LearnerWorker_p0 finished initialization!
[2023-02-22 21:22:55,728][32230] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:22:55,918][32245] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 21:22:55,919][32245] RunningMeanStd input shape: (1,)
[2023-02-22 21:22:55,929][32245] ConvEncoder: input_channels=3
[2023-02-22 21:22:56,020][32245] Conv encoder output size: 512
[2023-02-22 21:22:56,021][32245] Policy head output size: 512
[2023-02-22 21:22:58,156][24717] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 21:22:58,538][24717] Inference worker 0-0 is ready!
[2023-02-22 21:22:58,539][24717] All inference workers are ready! Signal rollout workers to start!
[2023-02-22 21:22:58,557][32247] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,558][32262] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,558][32263] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,558][32248] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,558][32246] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,559][32253] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,560][32249] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:58,579][32252] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 21:22:59,188][32247] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,191][32246] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,192][32249] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,193][32252] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,194][32262] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,195][32263] Decorrelating experience for 0 frames...
[2023-02-22 21:22:59,738][32249] Decorrelating experience for 32 frames...
[2023-02-22 21:22:59,739][32263] Decorrelating experience for 32 frames...
[2023-02-22 21:22:59,739][32246] Decorrelating experience for 32 frames...
[2023-02-22 21:22:59,740][32253] Decorrelating experience for 0 frames...
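As a reading aid for the module tree printed above, the following is a rough plain-PyTorch stand-in assembled only from what the log shows: a (3, 72, 128) observation, three Conv2d+ELU blocks, a 512-unit encoder MLP, a GRU(512, 512) core, a 1-unit value head and a 5-way action head. It is not sample-factory code, and the conv kernel/stride choices are illustrative guesses (the printout does not show them).

# Illustrative approximation of the ActorCriticSharedWeights tree above.
# NOT sample-factory code; conv hyperparameters are assumptions.
import torch
from torch import nn

class SketchActorCritic(nn.Module):
    def __init__(self, num_actions: int = 5):
        super().__init__()
        # encoder: three Conv2d+ELU blocks followed by a Linear+ELU "mlp_layers"
        self.conv_head = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():  # size the Linear from a dummy (3, 72, 128) observation
            flat = self.conv_head(torch.zeros(1, 3, 72, 128)).flatten(1).shape[1]
        self.mlp_layers = nn.Sequential(nn.Linear(flat, 512), nn.ELU())
        # core: GRU(512, 512), as in the printout
        self.core = nn.GRU(512, 512)
        # value head and 5-way action logits, matching critic_linear / distribution_linear
        self.critic_linear = nn.Linear(512, 1)
        self.distribution_linear = nn.Linear(512, num_actions)

    def forward(self, obs, rnn_state=None):
        x = self.mlp_layers(self.conv_head(obs).flatten(1))         # "Conv encoder output size: 512"
        core_out, rnn_state = self.core(x.unsqueeze(0), rnn_state)  # recurrent core
        core_out = core_out.squeeze(0)
        return self.distribution_linear(core_out), self.critic_linear(core_out), rnn_state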
[2023-02-22 21:22:59,744][32248] Decorrelating experience for 0 frames... [2023-02-22 21:22:59,745][32247] Decorrelating experience for 32 frames... [2023-02-22 21:23:00,186][32253] Decorrelating experience for 32 frames... [2023-02-22 21:23:00,320][32252] Decorrelating experience for 32 frames... [2023-02-22 21:23:00,322][32248] Decorrelating experience for 32 frames... [2023-02-22 21:23:00,323][32263] Decorrelating experience for 64 frames... [2023-02-22 21:23:00,324][32246] Decorrelating experience for 64 frames... [2023-02-22 21:23:00,324][32262] Decorrelating experience for 32 frames... [2023-02-22 21:23:00,527][32249] Decorrelating experience for 64 frames... [2023-02-22 21:23:00,770][32246] Decorrelating experience for 96 frames... [2023-02-22 21:23:00,863][32253] Decorrelating experience for 64 frames... [2023-02-22 21:23:00,863][32263] Decorrelating experience for 96 frames... [2023-02-22 21:23:00,863][32252] Decorrelating experience for 64 frames... [2023-02-22 21:23:00,865][32247] Decorrelating experience for 64 frames... [2023-02-22 21:23:01,383][32252] Decorrelating experience for 96 frames... [2023-02-22 21:23:01,384][32253] Decorrelating experience for 96 frames... [2023-02-22 21:23:01,384][32247] Decorrelating experience for 96 frames... [2023-02-22 21:23:01,386][32249] Decorrelating experience for 96 frames... [2023-02-22 21:23:01,386][32248] Decorrelating experience for 64 frames... [2023-02-22 21:23:01,776][32262] Decorrelating experience for 64 frames... [2023-02-22 21:23:01,777][32248] Decorrelating experience for 96 frames... [2023-02-22 21:23:02,120][32262] Decorrelating experience for 96 frames... [2023-02-22 21:23:02,338][32230] Signal inference workers to stop experience collection... [2023-02-22 21:23:02,342][32245] InferenceWorker_p0-w0: stopping experience collection [2023-02-22 21:23:03,156][24717] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 7.2. Samples: 36. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 21:23:03,158][24717] Avg episode reward: [(0, '2.456')] [2023-02-22 21:23:04,256][32230] Signal inference workers to resume experience collection... [2023-02-22 21:23:04,256][32245] InferenceWorker_p0-w0: resuming experience collection [2023-02-22 21:23:06,289][32245] Updated weights for policy 0, policy_version 10 (0.0008) [2023-02-22 21:23:07,913][32245] Updated weights for policy 0, policy_version 20 (0.0010) [2023-02-22 21:23:08,156][24717] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8601.6). Total num frames: 86016. Throughput: 0: 1899.2. Samples: 18992. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 21:23:08,157][24717] Avg episode reward: [(0, '4.545')] [2023-02-22 21:23:09,552][32245] Updated weights for policy 0, policy_version 30 (0.0007) [2023-02-22 21:23:10,866][24717] Heartbeat connected on Batcher_0 [2023-02-22 21:23:10,869][24717] Heartbeat connected on LearnerWorker_p0 [2023-02-22 21:23:10,881][24717] Heartbeat connected on InferenceWorker_p0-w0 [2023-02-22 21:23:10,887][24717] Heartbeat connected on RolloutWorker_w2 [2023-02-22 21:23:10,890][24717] Heartbeat connected on RolloutWorker_w0 [2023-02-22 21:23:10,892][24717] Heartbeat connected on RolloutWorker_w3 [2023-02-22 21:23:10,893][24717] Heartbeat connected on RolloutWorker_w1 [2023-02-22 21:23:10,898][24717] Heartbeat connected on RolloutWorker_w4 [2023-02-22 21:23:10,900][24717] Heartbeat connected on RolloutWorker_w5 [2023-02-22 21:23:10,905][24717] Heartbeat connected on RolloutWorker_w6 [2023-02-22 21:23:10,907][24717] Heartbeat connected on RolloutWorker_w7 [2023-02-22 21:23:11,206][32245] Updated weights for policy 0, policy_version 40 (0.0007) [2023-02-22 21:23:12,868][32245] Updated weights for policy 0, policy_version 50 (0.0008) [2023-02-22 21:23:13,156][24717] Fps is (10 sec: 20889.3, 60 sec: 13926.3, 300 sec: 13926.3). Total num frames: 208896. Throughput: 0: 2502.7. Samples: 37540. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:23:13,158][24717] Avg episode reward: [(0, '4.397')] [2023-02-22 21:23:13,159][32230] Saving new best policy, reward=4.397! [2023-02-22 21:23:14,551][32245] Updated weights for policy 0, policy_version 60 (0.0007) [2023-02-22 21:23:16,232][32245] Updated weights for policy 0, policy_version 70 (0.0006) [2023-02-22 21:23:17,952][32245] Updated weights for policy 0, policy_version 80 (0.0007) [2023-02-22 21:23:18,156][24717] Fps is (10 sec: 24575.9, 60 sec: 16588.8, 300 sec: 16588.8). Total num frames: 331776. Throughput: 0: 3717.4. Samples: 74348. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:23:18,157][24717] Avg episode reward: [(0, '4.329')] [2023-02-22 21:23:19,740][32245] Updated weights for policy 0, policy_version 90 (0.0010) [2023-02-22 21:23:21,483][32245] Updated weights for policy 0, policy_version 100 (0.0007) [2023-02-22 21:23:23,156][24717] Fps is (10 sec: 23757.1, 60 sec: 17858.6, 300 sec: 17858.6). Total num frames: 446464. Throughput: 0: 4375.5. Samples: 109386. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:23:23,157][24717] Avg episode reward: [(0, '4.485')] [2023-02-22 21:23:23,159][32230] Saving new best policy, reward=4.485! [2023-02-22 21:23:23,248][32245] Updated weights for policy 0, policy_version 110 (0.0009) [2023-02-22 21:23:25,020][32245] Updated weights for policy 0, policy_version 120 (0.0008) [2023-02-22 21:23:26,756][32245] Updated weights for policy 0, policy_version 130 (0.0007) [2023-02-22 21:23:28,156][24717] Fps is (10 sec: 23347.1, 60 sec: 18841.6, 300 sec: 18841.6). Total num frames: 565248. Throughput: 0: 4225.1. Samples: 126752. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:23:28,158][24717] Avg episode reward: [(0, '4.570')] [2023-02-22 21:23:28,162][32230] Saving new best policy, reward=4.570! 
[2023-02-22 21:23:28,520][32245] Updated weights for policy 0, policy_version 140 (0.0008) [2023-02-22 21:23:30,204][32245] Updated weights for policy 0, policy_version 150 (0.0007) [2023-02-22 21:23:31,950][32245] Updated weights for policy 0, policy_version 160 (0.0008) [2023-02-22 21:23:33,156][24717] Fps is (10 sec: 23756.7, 60 sec: 19543.8, 300 sec: 19543.8). Total num frames: 684032. Throughput: 0: 4632.2. Samples: 162128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:23:33,157][24717] Avg episode reward: [(0, '4.758')] [2023-02-22 21:23:33,159][32230] Saving new best policy, reward=4.758! [2023-02-22 21:23:33,637][32245] Updated weights for policy 0, policy_version 170 (0.0006) [2023-02-22 21:23:35,345][32245] Updated weights for policy 0, policy_version 180 (0.0008) [2023-02-22 21:23:37,053][32245] Updated weights for policy 0, policy_version 190 (0.0007) [2023-02-22 21:23:38,156][24717] Fps is (10 sec: 23757.0, 60 sec: 20070.4, 300 sec: 20070.4). Total num frames: 802816. Throughput: 0: 4956.4. Samples: 198256. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:23:38,158][24717] Avg episode reward: [(0, '5.231')] [2023-02-22 21:23:38,163][32230] Saving new best policy, reward=5.231! [2023-02-22 21:23:38,776][32245] Updated weights for policy 0, policy_version 200 (0.0006) [2023-02-22 21:23:40,492][32245] Updated weights for policy 0, policy_version 210 (0.0008) [2023-02-22 21:23:42,206][32245] Updated weights for policy 0, policy_version 220 (0.0007) [2023-02-22 21:23:43,156][24717] Fps is (10 sec: 23756.8, 60 sec: 20480.0, 300 sec: 20480.0). Total num frames: 921600. Throughput: 0: 4802.7. Samples: 216120. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:23:43,159][24717] Avg episode reward: [(0, '6.078')] [2023-02-22 21:23:43,160][32230] Saving new best policy, reward=6.078! [2023-02-22 21:23:43,956][32245] Updated weights for policy 0, policy_version 230 (0.0010) [2023-02-22 21:23:45,660][32245] Updated weights for policy 0, policy_version 240 (0.0008) [2023-02-22 21:23:47,448][32245] Updated weights for policy 0, policy_version 250 (0.0006) [2023-02-22 21:23:48,156][24717] Fps is (10 sec: 23346.9, 60 sec: 20725.7, 300 sec: 20725.7). Total num frames: 1036288. Throughput: 0: 5590.7. Samples: 251620. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:23:48,161][24717] Avg episode reward: [(0, '6.450')] [2023-02-22 21:23:48,167][32230] Saving new best policy, reward=6.450! [2023-02-22 21:23:49,218][32245] Updated weights for policy 0, policy_version 260 (0.0008) [2023-02-22 21:23:50,966][32245] Updated weights for policy 0, policy_version 270 (0.0012) [2023-02-22 21:23:52,686][32245] Updated weights for policy 0, policy_version 280 (0.0008) [2023-02-22 21:23:53,156][24717] Fps is (10 sec: 23347.1, 60 sec: 21001.3, 300 sec: 21001.3). Total num frames: 1155072. Throughput: 0: 5946.3. Samples: 286574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:23:53,159][24717] Avg episode reward: [(0, '7.235')] [2023-02-22 21:23:53,160][32230] Saving new best policy, reward=7.235! [2023-02-22 21:23:54,405][32245] Updated weights for policy 0, policy_version 290 (0.0008) [2023-02-22 21:23:56,162][32245] Updated weights for policy 0, policy_version 300 (0.0008) [2023-02-22 21:23:57,856][32245] Updated weights for policy 0, policy_version 310 (0.0007) [2023-02-22 21:23:58,156][24717] Fps is (10 sec: 23757.1, 60 sec: 21230.9, 300 sec: 21230.9). Total num frames: 1273856. Throughput: 0: 5927.5. Samples: 304276. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:23:58,160][24717] Avg episode reward: [(0, '7.943')] [2023-02-22 21:23:58,164][32230] Saving new best policy, reward=7.943! [2023-02-22 21:23:59,584][32245] Updated weights for policy 0, policy_version 320 (0.0006) [2023-02-22 21:24:01,399][32245] Updated weights for policy 0, policy_version 330 (0.0007) [2023-02-22 21:24:03,081][32245] Updated weights for policy 0, policy_version 340 (0.0008) [2023-02-22 21:24:03,156][24717] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 21425.2). Total num frames: 1392640. Throughput: 0: 5894.8. Samples: 339612. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:24:03,158][24717] Avg episode reward: [(0, '9.597')] [2023-02-22 21:24:03,160][32230] Saving new best policy, reward=9.597! [2023-02-22 21:24:04,774][32245] Updated weights for policy 0, policy_version 350 (0.0006) [2023-02-22 21:24:06,472][32245] Updated weights for policy 0, policy_version 360 (0.0007) [2023-02-22 21:24:08,079][32245] Updated weights for policy 0, policy_version 370 (0.0008) [2023-02-22 21:24:08,156][24717] Fps is (10 sec: 24166.4, 60 sec: 23825.1, 300 sec: 21650.3). Total num frames: 1515520. Throughput: 0: 5926.3. Samples: 376070. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:24:08,159][24717] Avg episode reward: [(0, '11.741')] [2023-02-22 21:24:08,164][32230] Saving new best policy, reward=11.741! [2023-02-22 21:24:09,745][32245] Updated weights for policy 0, policy_version 380 (0.0006) [2023-02-22 21:24:11,395][32245] Updated weights for policy 0, policy_version 390 (0.0006) [2023-02-22 21:24:13,047][32245] Updated weights for policy 0, policy_version 400 (0.0007) [2023-02-22 21:24:13,156][24717] Fps is (10 sec: 24576.1, 60 sec: 23825.1, 300 sec: 21845.4). Total num frames: 1638400. Throughput: 0: 5958.9. Samples: 394900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:13,157][24717] Avg episode reward: [(0, '17.552')] [2023-02-22 21:24:13,158][32230] Saving new best policy, reward=17.552! [2023-02-22 21:24:14,722][32245] Updated weights for policy 0, policy_version 410 (0.0006) [2023-02-22 21:24:16,491][32245] Updated weights for policy 0, policy_version 420 (0.0007) [2023-02-22 21:24:18,156][24717] Fps is (10 sec: 24166.4, 60 sec: 23756.8, 300 sec: 21964.8). Total num frames: 1757184. Throughput: 0: 5979.0. Samples: 431182. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:18,158][24717] Avg episode reward: [(0, '17.099')] [2023-02-22 21:24:18,175][32245] Updated weights for policy 0, policy_version 430 (0.0008) [2023-02-22 21:24:19,921][32245] Updated weights for policy 0, policy_version 440 (0.0006) [2023-02-22 21:24:21,535][32245] Updated weights for policy 0, policy_version 450 (0.0008) [2023-02-22 21:24:23,156][24717] Fps is (10 sec: 24166.4, 60 sec: 23893.3, 300 sec: 22118.4). Total num frames: 1880064. Throughput: 0: 5985.9. Samples: 467622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:24:23,157][24717] Avg episode reward: [(0, '16.734')] [2023-02-22 21:24:23,244][32245] Updated weights for policy 0, policy_version 460 (0.0007) [2023-02-22 21:24:24,887][32245] Updated weights for policy 0, policy_version 470 (0.0006) [2023-02-22 21:24:26,622][32245] Updated weights for policy 0, policy_version 480 (0.0006) [2023-02-22 21:24:28,157][24717] Fps is (10 sec: 24575.5, 60 sec: 23961.5, 300 sec: 22254.9). Total num frames: 2002944. Throughput: 0: 5995.7. Samples: 485926. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:28,158][24717] Avg episode reward: [(0, '15.807')] [2023-02-22 21:24:28,351][32245] Updated weights for policy 0, policy_version 490 (0.0007) [2023-02-22 21:24:30,011][32245] Updated weights for policy 0, policy_version 500 (0.0008) [2023-02-22 21:24:31,794][32245] Updated weights for policy 0, policy_version 510 (0.0008) [2023-02-22 21:24:33,156][24717] Fps is (10 sec: 24166.3, 60 sec: 23961.6, 300 sec: 22334.0). Total num frames: 2121728. Throughput: 0: 5997.7. Samples: 521516. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:33,157][24717] Avg episode reward: [(0, '20.357')] [2023-02-22 21:24:33,159][32230] Saving new best policy, reward=20.357! [2023-02-22 21:24:33,468][32245] Updated weights for policy 0, policy_version 520 (0.0006) [2023-02-22 21:24:35,160][32245] Updated weights for policy 0, policy_version 530 (0.0006) [2023-02-22 21:24:36,867][32245] Updated weights for policy 0, policy_version 540 (0.0007) [2023-02-22 21:24:38,156][24717] Fps is (10 sec: 23757.3, 60 sec: 23961.6, 300 sec: 22405.1). Total num frames: 2240512. Throughput: 0: 6027.9. Samples: 557828. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:24:38,157][24717] Avg episode reward: [(0, '18.501')] [2023-02-22 21:24:38,559][32245] Updated weights for policy 0, policy_version 550 (0.0008) [2023-02-22 21:24:40,271][32245] Updated weights for policy 0, policy_version 560 (0.0008) [2023-02-22 21:24:41,996][32245] Updated weights for policy 0, policy_version 570 (0.0007) [2023-02-22 21:24:43,156][24717] Fps is (10 sec: 24166.5, 60 sec: 24029.9, 300 sec: 22508.5). Total num frames: 2363392. Throughput: 0: 6032.4. Samples: 575736. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:43,157][24717] Avg episode reward: [(0, '20.253')] [2023-02-22 21:24:43,651][32245] Updated weights for policy 0, policy_version 580 (0.0006) [2023-02-22 21:24:45,447][32245] Updated weights for policy 0, policy_version 590 (0.0008) [2023-02-22 21:24:47,173][32245] Updated weights for policy 0, policy_version 600 (0.0007) [2023-02-22 21:24:48,156][24717] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 22528.0). Total num frames: 2478080. Throughput: 0: 6041.4. Samples: 611476. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:24:48,160][24717] Avg episode reward: [(0, '19.851')] [2023-02-22 21:24:48,164][32230] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000605_2478080.pth... [2023-02-22 21:24:48,870][32245] Updated weights for policy 0, policy_version 610 (0.0008) [2023-02-22 21:24:50,606][32245] Updated weights for policy 0, policy_version 620 (0.0009) [2023-02-22 21:24:52,395][32245] Updated weights for policy 0, policy_version 630 (0.0009) [2023-02-22 21:24:53,156][24717] Fps is (10 sec: 23347.2, 60 sec: 24029.9, 300 sec: 22581.4). Total num frames: 2596864. Throughput: 0: 6013.9. Samples: 646694. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:24:53,157][24717] Avg episode reward: [(0, '22.850')] [2023-02-22 21:24:53,160][32230] Saving new best policy, reward=22.850! [2023-02-22 21:24:54,156][32245] Updated weights for policy 0, policy_version 640 (0.0008) [2023-02-22 21:24:55,879][32245] Updated weights for policy 0, policy_version 650 (0.0007) [2023-02-22 21:24:57,621][32245] Updated weights for policy 0, policy_version 660 (0.0007) [2023-02-22 21:24:58,156][24717] Fps is (10 sec: 23756.8, 60 sec: 24029.9, 300 sec: 22630.4). Total num frames: 2715648. 
Throughput: 0: 5991.9. Samples: 664536. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:24:58,157][24717] Avg episode reward: [(0, '23.369')] [2023-02-22 21:24:58,161][32230] Saving new best policy, reward=23.369! [2023-02-22 21:24:59,373][32245] Updated weights for policy 0, policy_version 670 (0.0008) [2023-02-22 21:25:01,221][32245] Updated weights for policy 0, policy_version 680 (0.0009) [2023-02-22 21:25:02,970][32245] Updated weights for policy 0, policy_version 690 (0.0007) [2023-02-22 21:25:03,156][24717] Fps is (10 sec: 23347.2, 60 sec: 23961.6, 300 sec: 22642.7). Total num frames: 2830336. Throughput: 0: 5951.6. Samples: 699006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:25:03,157][24717] Avg episode reward: [(0, '21.691')] [2023-02-22 21:25:04,706][32245] Updated weights for policy 0, policy_version 700 (0.0007) [2023-02-22 21:25:06,440][32245] Updated weights for policy 0, policy_version 710 (0.0007) [2023-02-22 21:25:08,156][24717] Fps is (10 sec: 22937.6, 60 sec: 23825.1, 300 sec: 22654.0). Total num frames: 2945024. Throughput: 0: 5924.6. Samples: 734228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:25:08,159][24717] Avg episode reward: [(0, '20.888')] [2023-02-22 21:25:08,168][32245] Updated weights for policy 0, policy_version 720 (0.0008) [2023-02-22 21:25:09,866][32245] Updated weights for policy 0, policy_version 730 (0.0009) [2023-02-22 21:25:11,496][32245] Updated weights for policy 0, policy_version 740 (0.0008) [2023-02-22 21:25:13,140][32245] Updated weights for policy 0, policy_version 750 (0.0007) [2023-02-22 21:25:13,156][24717] Fps is (10 sec: 24166.4, 60 sec: 23893.3, 300 sec: 22755.6). Total num frames: 3072000. Throughput: 0: 5921.3. Samples: 752384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:25:13,157][24717] Avg episode reward: [(0, '22.374')] [2023-02-22 21:25:14,875][32245] Updated weights for policy 0, policy_version 760 (0.0008) [2023-02-22 21:25:16,543][32245] Updated weights for policy 0, policy_version 770 (0.0007) [2023-02-22 21:25:18,156][24717] Fps is (10 sec: 24576.0, 60 sec: 23893.3, 300 sec: 22791.3). Total num frames: 3190784. Throughput: 0: 5945.4. Samples: 789058. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:25:18,157][24717] Avg episode reward: [(0, '22.226')] [2023-02-22 21:25:18,258][32245] Updated weights for policy 0, policy_version 780 (0.0006) [2023-02-22 21:25:19,995][32245] Updated weights for policy 0, policy_version 790 (0.0007) [2023-02-22 21:25:21,683][32245] Updated weights for policy 0, policy_version 800 (0.0006) [2023-02-22 21:25:23,156][24717] Fps is (10 sec: 23756.7, 60 sec: 23825.0, 300 sec: 22824.6). Total num frames: 3309568. Throughput: 0: 5944.3. Samples: 825322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:25:23,158][24717] Avg episode reward: [(0, '23.568')] [2023-02-22 21:25:23,159][32230] Saving new best policy, reward=23.568! [2023-02-22 21:25:23,385][32245] Updated weights for policy 0, policy_version 810 (0.0007) [2023-02-22 21:25:25,018][32245] Updated weights for policy 0, policy_version 820 (0.0006) [2023-02-22 21:25:26,705][32245] Updated weights for policy 0, policy_version 830 (0.0006) [2023-02-22 21:25:28,156][24717] Fps is (10 sec: 24166.4, 60 sec: 23825.2, 300 sec: 22883.0). Total num frames: 3432448. Throughput: 0: 5950.8. Samples: 843522. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:25:28,157][24717] Avg episode reward: [(0, '21.152')] [2023-02-22 21:25:28,392][32245] Updated weights for policy 0, policy_version 840 (0.0008) [2023-02-22 21:25:30,067][32245] Updated weights for policy 0, policy_version 850 (0.0006) [2023-02-22 21:25:31,809][32245] Updated weights for policy 0, policy_version 860 (0.0009) [2023-02-22 21:25:33,156][24717] Fps is (10 sec: 24576.2, 60 sec: 23893.4, 300 sec: 22937.6). Total num frames: 3555328. Throughput: 0: 5961.2. Samples: 879730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:25:33,157][24717] Avg episode reward: [(0, '25.351')] [2023-02-22 21:25:33,159][32230] Saving new best policy, reward=25.351! [2023-02-22 21:25:33,498][32245] Updated weights for policy 0, policy_version 870 (0.0006) [2023-02-22 21:25:35,212][32245] Updated weights for policy 0, policy_version 880 (0.0007) [2023-02-22 21:25:37,001][32245] Updated weights for policy 0, policy_version 890 (0.0012) [2023-02-22 21:25:38,156][24717] Fps is (10 sec: 23756.8, 60 sec: 23825.1, 300 sec: 22937.6). Total num frames: 3670016. Throughput: 0: 5960.4. Samples: 914910. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:25:38,158][24717] Avg episode reward: [(0, '24.882')] [2023-02-22 21:25:38,807][32245] Updated weights for policy 0, policy_version 900 (0.0010) [2023-02-22 21:25:40,496][32245] Updated weights for policy 0, policy_version 910 (0.0006) [2023-02-22 21:25:42,288][32245] Updated weights for policy 0, policy_version 920 (0.0008) [2023-02-22 21:25:43,157][24717] Fps is (10 sec: 22935.2, 60 sec: 23688.1, 300 sec: 22937.5). Total num frames: 3784704. Throughput: 0: 5959.6. Samples: 932726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:25:43,158][24717] Avg episode reward: [(0, '23.635')] [2023-02-22 21:25:44,045][32245] Updated weights for policy 0, policy_version 930 (0.0008) [2023-02-22 21:25:45,787][32245] Updated weights for policy 0, policy_version 940 (0.0008) [2023-02-22 21:25:47,486][32245] Updated weights for policy 0, policy_version 950 (0.0007) [2023-02-22 21:25:48,156][24717] Fps is (10 sec: 23347.0, 60 sec: 23756.8, 300 sec: 22961.7). Total num frames: 3903488. Throughput: 0: 5975.9. Samples: 967920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:25:48,158][24717] Avg episode reward: [(0, '25.987')] [2023-02-22 21:25:48,180][32230] Saving new best policy, reward=25.987! [2023-02-22 21:25:49,224][32245] Updated weights for policy 0, policy_version 960 (0.0008) [2023-02-22 21:25:50,941][32245] Updated weights for policy 0, policy_version 970 (0.0008) [2023-02-22 21:25:52,319][32230] Stopping Batcher_0... [2023-02-22 21:25:52,319][32230] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-22 21:25:52,319][24717] Component Batcher_0 stopped! [2023-02-22 21:25:52,319][32230] Loop batcher_evt_loop terminating... [2023-02-22 21:25:52,333][24717] Component RolloutWorker_w5 stopped! [2023-02-22 21:25:52,334][32253] Stopping RolloutWorker_w4... [2023-02-22 21:25:52,335][32253] Loop rollout_proc4_evt_loop terminating... [2023-02-22 21:25:52,335][24717] Component RolloutWorker_w4 stopped! [2023-02-22 21:25:52,335][32252] Stopping RolloutWorker_w5... [2023-02-22 21:25:52,336][32252] Loop rollout_proc5_evt_loop terminating... [2023-02-22 21:25:52,337][32248] Stopping RolloutWorker_w2... [2023-02-22 21:25:52,337][32248] Loop rollout_proc2_evt_loop terminating... 
[2023-02-22 21:25:52,337][24717] Component RolloutWorker_w2 stopped! [2023-02-22 21:25:52,338][32247] Stopping RolloutWorker_w1... [2023-02-22 21:25:52,338][24717] Component RolloutWorker_w1 stopped! [2023-02-22 21:25:52,341][32262] Stopping RolloutWorker_w6... [2023-02-22 21:25:52,341][24717] Component RolloutWorker_w6 stopped! [2023-02-22 21:25:52,342][32262] Loop rollout_proc6_evt_loop terminating... [2023-02-22 21:25:52,339][32247] Loop rollout_proc1_evt_loop terminating... [2023-02-22 21:25:52,344][32245] Weights refcount: 2 0 [2023-02-22 21:25:52,345][32245] Stopping InferenceWorker_p0-w0... [2023-02-22 21:25:52,345][32245] Loop inference_proc0-0_evt_loop terminating... [2023-02-22 21:25:52,345][24717] Component InferenceWorker_p0-w0 stopped! [2023-02-22 21:25:52,348][24717] Component RolloutWorker_w0 stopped! [2023-02-22 21:25:52,348][32246] Stopping RolloutWorker_w0... [2023-02-22 21:25:52,349][32246] Loop rollout_proc0_evt_loop terminating... [2023-02-22 21:25:52,353][32263] Stopping RolloutWorker_w7... [2023-02-22 21:25:52,353][32263] Loop rollout_proc7_evt_loop terminating... [2023-02-22 21:25:52,353][24717] Component RolloutWorker_w7 stopped! [2023-02-22 21:25:52,380][32230] Saving new best policy, reward=28.129! [2023-02-22 21:25:52,411][32249] Stopping RolloutWorker_w3... [2023-02-22 21:25:52,411][32249] Loop rollout_proc3_evt_loop terminating... [2023-02-22 21:25:52,411][24717] Component RolloutWorker_w3 stopped! [2023-02-22 21:25:52,447][32230] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-22 21:25:52,520][32230] Stopping LearnerWorker_p0... [2023-02-22 21:25:52,521][32230] Loop learner_proc0_evt_loop terminating... [2023-02-22 21:25:52,520][24717] Component LearnerWorker_p0 stopped! [2023-02-22 21:25:52,522][24717] Waiting for process learner_proc0 to stop... [2023-02-22 21:25:53,288][24717] Waiting for process inference_proc0-0 to join... [2023-02-22 21:25:53,289][24717] Waiting for process rollout_proc0 to join... [2023-02-22 21:25:53,290][24717] Waiting for process rollout_proc1 to join... [2023-02-22 21:25:53,290][24717] Waiting for process rollout_proc2 to join... [2023-02-22 21:25:53,291][24717] Waiting for process rollout_proc3 to join... [2023-02-22 21:25:53,292][24717] Waiting for process rollout_proc4 to join... [2023-02-22 21:25:53,293][24717] Waiting for process rollout_proc5 to join... [2023-02-22 21:25:53,293][24717] Waiting for process rollout_proc6 to join... [2023-02-22 21:25:53,294][24717] Waiting for process rollout_proc7 to join... 
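Before the profile summary, a note on the checkpoint files saved during this run: a name such as checkpoint_000000978_4005888.pth encodes the train step (978) and the total number of environment steps (4005888), which the resume later in this log confirms ("Loaded experiment state at self.train_step=978, self.env_steps=4005888"). A small, purely hypothetical helper to split such a name:

# Hypothetical helper (not part of sample-factory) that splits a checkpoint file
# name like "checkpoint_000000978_4005888.pth" into (train step, env steps).
import re

def parse_checkpoint_name(name: str) -> tuple[int, int]:
    m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", name)
    if m is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    return int(m.group(1)), int(m.group(2))

# parse_checkpoint_name("checkpoint_000000978_4005888.pth") -> (978, 4005888)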
[2023-02-22 21:25:53,295][24717] Batcher 0 profile tree view:
batching: 13.1577, releasing_batches: 0.0181
[2023-02-22 21:25:53,295][24717] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 4.3703
update_model: 2.3692
  weight_update: 0.0008
one_step: 0.0019
  handle_policy_step: 155.6174
    deserialize: 7.8731, stack: 0.8396, obs_to_device_normalize: 39.5270, forward: 65.6351, send_messages: 13.4852
    prepare_outputs: 21.2746
      to_cpu: 12.7880
[2023-02-22 21:25:53,296][24717] Learner 0 profile tree view:
misc: 0.0051, prepare_batch: 6.8617
train: 17.8979
  epoch_init: 0.0044, minibatch_init: 0.0044, losses_postprocess: 0.2348, kl_divergence: 0.2579, after_optimizer: 3.0676
  calculate_losses: 7.3319
    losses_init: 0.0024, forward_head: 0.7714, bptt_initial: 4.3619, tail: 0.4261, advantages_returns: 0.1152, losses: 0.7021
    bptt: 0.8259
      bptt_forward_core: 0.7938
  update: 6.7190
    clip: 0.8093
[2023-02-22 21:25:53,297][24717] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.0989, enqueue_policy_requests: 5.3262, env_step: 71.2475, overhead: 6.5361, complete_rollouts: 0.5137
save_policy_outputs: 5.6181
  split_output_tensors: 2.7734
[2023-02-22 21:25:53,297][24717] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.1168, enqueue_policy_requests: 5.4305, env_step: 72.9827, overhead: 6.5238, complete_rollouts: 0.4607
save_policy_outputs: 5.7456
  split_output_tensors: 2.8446
[2023-02-22 21:25:53,299][24717] Loop Runner_EvtLoop terminating...
[2023-02-22 21:25:53,300][24717] Runner profile tree view:
main_loop: 182.3930
[2023-02-22 21:25:53,301][24717] Collected {0: 4005888}, FPS: 21962.9
[2023-02-22 21:26:34,224][24717] Loading existing experiment configuration from /home/flahoud/studies/collab/train_dir/default_experiment/config.json
[2023-02-22 21:26:34,225][24717] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 21:26:34,226][24717] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 21:26:34,226][24717] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 21:26:34,227][24717] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 21:26:34,228][24717] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 21:26:34,228][24717] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 21:26:34,229][24717] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 21:26:34,229][24717] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-22 21:26:34,230][24717] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-22 21:26:34,230][24717] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 21:26:34,231][24717] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 21:26:34,231][24717] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 21:26:34,232][24717] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-22 21:26:34,233][24717] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 21:26:34,249][24717] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:26:34,251][24717] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 21:26:34,252][24717] RunningMeanStd input shape: (1,) [2023-02-22 21:26:34,264][24717] ConvEncoder: input_channels=3 [2023-02-22 21:26:34,367][24717] Conv encoder output size: 512 [2023-02-22 21:26:34,369][24717] Policy head output size: 512 [2023-02-22 21:26:37,138][24717] Loading state from checkpoint /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-22 21:26:38,361][24717] Num frames 100... [2023-02-22 21:26:38,459][24717] Num frames 200... [2023-02-22 21:26:38,556][24717] Num frames 300... [2023-02-22 21:26:38,658][24717] Num frames 400... [2023-02-22 21:26:38,753][24717] Num frames 500... [2023-02-22 21:26:38,849][24717] Num frames 600... [2023-02-22 21:26:38,945][24717] Num frames 700... [2023-02-22 21:26:39,045][24717] Num frames 800... [2023-02-22 21:26:39,097][24717] Avg episode rewards: #0: 16.000, true rewards: #0: 8.000 [2023-02-22 21:26:39,098][24717] Avg episode reward: 16.000, avg true_objective: 8.000 [2023-02-22 21:26:39,202][24717] Num frames 900... [2023-02-22 21:26:39,299][24717] Num frames 1000... [2023-02-22 21:26:39,395][24717] Num frames 1100... [2023-02-22 21:26:39,491][24717] Num frames 1200... [2023-02-22 21:26:39,588][24717] Num frames 1300... [2023-02-22 21:26:39,690][24717] Num frames 1400... [2023-02-22 21:26:39,803][24717] Avg episode rewards: #0: 14.805, true rewards: #0: 7.305 [2023-02-22 21:26:39,804][24717] Avg episode reward: 14.805, avg true_objective: 7.305 [2023-02-22 21:26:39,843][24717] Num frames 1500... [2023-02-22 21:26:39,946][24717] Num frames 1600... [2023-02-22 21:26:40,046][24717] Num frames 1700... [2023-02-22 21:26:40,148][24717] Num frames 1800... [2023-02-22 21:26:40,247][24717] Num frames 1900... [2023-02-22 21:26:40,346][24717] Num frames 2000... [2023-02-22 21:26:40,445][24717] Num frames 2100... [2023-02-22 21:26:40,542][24717] Num frames 2200... [2023-02-22 21:26:40,689][24717] Avg episode rewards: #0: 14.977, true rewards: #0: 7.643 [2023-02-22 21:26:40,690][24717] Avg episode reward: 14.977, avg true_objective: 7.643 [2023-02-22 21:26:40,698][24717] Num frames 2300... [2023-02-22 21:26:40,798][24717] Num frames 2400... [2023-02-22 21:26:40,901][24717] Num frames 2500... [2023-02-22 21:26:40,999][24717] Num frames 2600... [2023-02-22 21:26:41,095][24717] Num frames 2700... [2023-02-22 21:26:41,199][24717] Num frames 2800... [2023-02-22 21:26:41,293][24717] Num frames 2900... [2023-02-22 21:26:41,384][24717] Num frames 3000... [2023-02-22 21:26:41,475][24717] Num frames 3100... [2023-02-22 21:26:41,570][24717] Num frames 3200... [2023-02-22 21:26:41,668][24717] Num frames 3300... [2023-02-22 21:26:41,772][24717] Num frames 3400... [2023-02-22 21:26:41,875][24717] Num frames 3500... [2023-02-22 21:26:42,000][24717] Avg episode rewards: #0: 18.183, true rewards: #0: 8.932 [2023-02-22 21:26:42,001][24717] Avg episode reward: 18.183, avg true_objective: 8.932 [2023-02-22 21:26:42,034][24717] Num frames 3600... [2023-02-22 21:26:42,140][24717] Num frames 3700... [2023-02-22 21:26:42,243][24717] Num frames 3800... [2023-02-22 21:26:42,342][24717] Num frames 3900... [2023-02-22 21:26:42,446][24717] Num frames 4000... [2023-02-22 21:26:42,548][24717] Num frames 4100... 
[2023-02-22 21:26:42,648][24717] Num frames 4200... [2023-02-22 21:26:42,776][24717] Avg episode rewards: #0: 16.954, true rewards: #0: 8.554 [2023-02-22 21:26:42,777][24717] Avg episode reward: 16.954, avg true_objective: 8.554 [2023-02-22 21:26:42,800][24717] Num frames 4300... [2023-02-22 21:26:42,898][24717] Num frames 4400... [2023-02-22 21:26:42,995][24717] Num frames 4500... [2023-02-22 21:26:43,090][24717] Num frames 4600... [2023-02-22 21:26:43,194][24717] Num frames 4700... [2023-02-22 21:26:43,294][24717] Num frames 4800... [2023-02-22 21:26:43,391][24717] Num frames 4900... [2023-02-22 21:26:43,491][24717] Num frames 5000... [2023-02-22 21:26:43,590][24717] Num frames 5100... [2023-02-22 21:26:43,687][24717] Num frames 5200... [2023-02-22 21:26:43,789][24717] Num frames 5300... [2023-02-22 21:26:43,889][24717] Num frames 5400... [2023-02-22 21:26:43,992][24717] Num frames 5500... [2023-02-22 21:26:44,093][24717] Num frames 5600... [2023-02-22 21:26:44,171][24717] Avg episode rewards: #0: 19.535, true rewards: #0: 9.368 [2023-02-22 21:26:44,172][24717] Avg episode reward: 19.535, avg true_objective: 9.368 [2023-02-22 21:26:44,254][24717] Num frames 5700... [2023-02-22 21:26:44,359][24717] Num frames 5800... [2023-02-22 21:26:44,461][24717] Num frames 5900... [2023-02-22 21:26:44,563][24717] Num frames 6000... [2023-02-22 21:26:44,663][24717] Num frames 6100... [2023-02-22 21:26:44,815][24717] Avg episode rewards: #0: 18.139, true rewards: #0: 8.853 [2023-02-22 21:26:44,817][24717] Avg episode reward: 18.139, avg true_objective: 8.853 [2023-02-22 21:26:44,820][24717] Num frames 6200... [2023-02-22 21:26:44,921][24717] Num frames 6300... [2023-02-22 21:26:45,024][24717] Num frames 6400... [2023-02-22 21:26:45,122][24717] Num frames 6500... [2023-02-22 21:26:45,222][24717] Num frames 6600... [2023-02-22 21:26:45,317][24717] Num frames 6700... [2023-02-22 21:26:45,413][24717] Num frames 6800... [2023-02-22 21:26:45,510][24717] Num frames 6900... [2023-02-22 21:26:45,604][24717] Num frames 7000... [2023-02-22 21:26:45,703][24717] Num frames 7100... [2023-02-22 21:26:45,813][24717] Num frames 7200... [2023-02-22 21:26:45,923][24717] Avg episode rewards: #0: 18.571, true rewards: #0: 9.071 [2023-02-22 21:26:45,924][24717] Avg episode reward: 18.571, avg true_objective: 9.071 [2023-02-22 21:26:45,967][24717] Num frames 7300... [2023-02-22 21:26:46,066][24717] Num frames 7400... [2023-02-22 21:26:46,160][24717] Num frames 7500... [2023-02-22 21:26:46,259][24717] Num frames 7600... [2023-02-22 21:26:46,352][24717] Num frames 7700... [2023-02-22 21:26:46,451][24717] Num frames 7800... [2023-02-22 21:26:46,548][24717] Num frames 7900... [2023-02-22 21:26:46,646][24717] Num frames 8000... [2023-02-22 21:26:46,726][24717] Avg episode rewards: #0: 18.250, true rewards: #0: 8.917 [2023-02-22 21:26:46,727][24717] Avg episode reward: 18.250, avg true_objective: 8.917 [2023-02-22 21:26:46,803][24717] Num frames 8100... [2023-02-22 21:26:46,897][24717] Num frames 8200... [2023-02-22 21:26:46,991][24717] Num frames 8300... [2023-02-22 21:26:47,084][24717] Num frames 8400... [2023-02-22 21:26:47,177][24717] Num frames 8500... [2023-02-22 21:26:47,270][24717] Num frames 8600... [2023-02-22 21:26:47,383][24717] Avg episode rewards: #0: 17.462, true rewards: #0: 8.662 [2023-02-22 21:26:47,384][24717] Avg episode reward: 17.462, avg true_objective: 8.662 [2023-02-22 21:27:02,989][24717] Replay video saved to /home/flahoud/studies/collab/train_dir/default_experiment/replay.mp4! 
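The evaluation run above was driven by sample-factory's enjoy/evaluation entry point with the overrides recorded in the log (num_workers=1, no_render, save_video, max_num_episodes=10). Below is a rough reconstruction of that invocation; the module path is an assumption (the stock VizDoom example script), while the flag names are taken from the "Overriding arg" / "Adding new argument" lines earlier in this log. The second evaluation run further down additionally passes --push_to_hub and --hf_repository=GrimReaperSam/rl_course_vizdoom_health_gathering_supreme.

# Rough reconstruction of the evaluation command implied by the log above.
# "sf_examples.vizdoom.enjoy_vizdoom" is an assumed entry point, not confirmed by this log.
import subprocess

subprocess.run(
    [
        "python", "-m", "sf_examples.vizdoom.enjoy_vizdoom",  # assumed module path
        "--env=doom_health_gathering_supreme",
        "--train_dir=/home/flahoud/studies/collab/train_dir",
        "--experiment=default_experiment",
        "--num_workers=1",
        "--no_render",
        "--save_video",
        "--max_num_episodes=10",
    ],
    check=True,
)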
[2023-02-22 21:32:31,282][24717] Loading existing experiment configuration from /home/flahoud/studies/collab/train_dir/default_experiment/config.json [2023-02-22 21:32:31,283][24717] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-22 21:32:31,284][24717] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-22 21:32:31,285][24717] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-22 21:32:31,286][24717] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-22 21:32:31,286][24717] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-22 21:32:31,286][24717] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-22 21:32:31,287][24717] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-22 21:32:31,287][24717] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-22 21:32:31,288][24717] Adding new argument 'hf_repository'='GrimReaperSam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-22 21:32:31,288][24717] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-22 21:32:31,289][24717] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-22 21:32:31,289][24717] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-22 21:32:31,291][24717] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-22 21:32:31,292][24717] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 21:32:31,303][24717] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 21:32:31,304][24717] RunningMeanStd input shape: (1,) [2023-02-22 21:32:31,315][24717] ConvEncoder: input_channels=3 [2023-02-22 21:32:31,371][24717] Conv encoder output size: 512 [2023-02-22 21:32:31,372][24717] Policy head output size: 512 [2023-02-22 21:32:31,403][24717] Loading state from checkpoint /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-22 21:32:31,890][24717] Num frames 100... [2023-02-22 21:32:31,981][24717] Num frames 200... [2023-02-22 21:32:32,072][24717] Num frames 300... [2023-02-22 21:32:32,172][24717] Num frames 400... [2023-02-22 21:32:32,320][24717] Avg episode rewards: #0: 7.980, true rewards: #0: 4.980 [2023-02-22 21:32:32,321][24717] Avg episode reward: 7.980, avg true_objective: 4.980 [2023-02-22 21:32:32,323][24717] Num frames 500... [2023-02-22 21:32:32,420][24717] Num frames 600... [2023-02-22 21:32:32,518][24717] Num frames 700... [2023-02-22 21:32:32,623][24717] Num frames 800... [2023-02-22 21:32:32,718][24717] Num frames 900... [2023-02-22 21:32:32,811][24717] Num frames 1000... [2023-02-22 21:32:32,928][24717] Num frames 1100... [2023-02-22 21:32:33,028][24717] Num frames 1200... [2023-02-22 21:32:33,125][24717] Num frames 1300... [2023-02-22 21:32:33,222][24717] Num frames 1400... [2023-02-22 21:32:33,324][24717] Num frames 1500... [2023-02-22 21:32:33,427][24717] Num frames 1600... [2023-02-22 21:32:33,531][24717] Num frames 1700... [2023-02-22 21:32:33,631][24717] Num frames 1800... [2023-02-22 21:32:33,730][24717] Num frames 1900... [2023-02-22 21:32:33,835][24717] Num frames 2000... [2023-02-22 21:32:33,937][24717] Num frames 2100... [2023-02-22 21:32:34,048][24717] Num frames 2200... 
[2023-02-22 21:32:34,193][24717] Avg episode rewards: #0: 26.950, true rewards: #0: 11.450 [2023-02-22 21:32:34,194][24717] Avg episode reward: 26.950, avg true_objective: 11.450 [2023-02-22 21:32:34,206][24717] Num frames 2300... [2023-02-22 21:32:34,315][24717] Num frames 2400... [2023-02-22 21:32:34,424][24717] Num frames 2500... [2023-02-22 21:32:34,528][24717] Num frames 2600... [2023-02-22 21:32:34,632][24717] Num frames 2700... [2023-02-22 21:32:34,736][24717] Num frames 2800... [2023-02-22 21:32:34,843][24717] Num frames 2900... [2023-02-22 21:32:34,943][24717] Num frames 3000... [2023-02-22 21:32:35,049][24717] Num frames 3100... [2023-02-22 21:32:35,153][24717] Num frames 3200... [2023-02-22 21:32:35,254][24717] Num frames 3300... [2023-02-22 21:32:35,356][24717] Num frames 3400... [2023-02-22 21:32:35,457][24717] Num frames 3500... [2023-02-22 21:32:35,558][24717] Num frames 3600... [2023-02-22 21:32:35,655][24717] Num frames 3700... [2023-02-22 21:32:35,741][24717] Avg episode rewards: #0: 30.766, true rewards: #0: 12.433 [2023-02-22 21:32:35,742][24717] Avg episode reward: 30.766, avg true_objective: 12.433 [2023-02-22 21:32:35,816][24717] Num frames 3800... [2023-02-22 21:32:35,916][24717] Num frames 3900... [2023-02-22 21:32:36,017][24717] Num frames 4000... [2023-02-22 21:32:36,125][24717] Num frames 4100... [2023-02-22 21:32:36,232][24717] Num frames 4200... [2023-02-22 21:32:36,342][24717] Num frames 4300... [2023-02-22 21:32:36,445][24717] Num frames 4400... [2023-02-22 21:32:36,553][24717] Num frames 4500... [2023-02-22 21:32:36,664][24717] Num frames 4600... [2023-02-22 21:32:36,734][24717] Avg episode rewards: #0: 28.032, true rewards: #0: 11.532 [2023-02-22 21:32:36,735][24717] Avg episode reward: 28.032, avg true_objective: 11.532 [2023-02-22 21:32:36,823][24717] Num frames 4700... [2023-02-22 21:32:36,929][24717] Num frames 4800... [2023-02-22 21:32:37,027][24717] Num frames 4900... [2023-02-22 21:32:37,131][24717] Num frames 5000... [2023-02-22 21:32:37,231][24717] Num frames 5100... [2023-02-22 21:32:37,331][24717] Num frames 5200... [2023-02-22 21:32:37,428][24717] Num frames 5300... [2023-02-22 21:32:37,530][24717] Num frames 5400... [2023-02-22 21:32:37,643][24717] Num frames 5500... [2023-02-22 21:32:37,747][24717] Num frames 5600... [2023-02-22 21:32:37,817][24717] Avg episode rewards: #0: 27.428, true rewards: #0: 11.228 [2023-02-22 21:32:37,818][24717] Avg episode reward: 27.428, avg true_objective: 11.228 [2023-02-22 21:32:37,907][24717] Num frames 5700... [2023-02-22 21:32:38,007][24717] Num frames 5800... [2023-02-22 21:32:38,106][24717] Num frames 5900... [2023-02-22 21:32:38,202][24717] Num frames 6000... [2023-02-22 21:32:38,296][24717] Num frames 6100... [2023-02-22 21:32:38,398][24717] Num frames 6200... [2023-02-22 21:32:38,494][24717] Num frames 6300... [2023-02-22 21:32:38,593][24717] Num frames 6400... [2023-02-22 21:32:38,689][24717] Num frames 6500... [2023-02-22 21:32:38,755][24717] Avg episode rewards: #0: 25.850, true rewards: #0: 10.850 [2023-02-22 21:32:38,757][24717] Avg episode reward: 25.850, avg true_objective: 10.850 [2023-02-22 21:32:38,845][24717] Num frames 6600... [2023-02-22 21:32:38,943][24717] Num frames 6700... [2023-02-22 21:32:39,043][24717] Num frames 6800... [2023-02-22 21:32:39,187][24717] Avg episode rewards: #0: 23.134, true rewards: #0: 9.849 [2023-02-22 21:32:39,188][24717] Avg episode reward: 23.134, avg true_objective: 9.849 [2023-02-22 21:32:39,195][24717] Num frames 6900... 
[2023-02-22 21:32:39,302][24717] Num frames 7000... [2023-02-22 21:32:39,399][24717] Num frames 7100... [2023-02-22 21:32:39,501][24717] Num frames 7200... [2023-02-22 21:32:39,600][24717] Num frames 7300... [2023-02-22 21:32:39,696][24717] Num frames 7400... [2023-02-22 21:32:39,790][24717] Num frames 7500... [2023-02-22 21:32:39,884][24717] Num frames 7600... [2023-02-22 21:32:39,985][24717] Num frames 7700... [2023-02-22 21:32:40,084][24717] Num frames 7800... [2023-02-22 21:32:40,187][24717] Num frames 7900... [2023-02-22 21:32:40,292][24717] Avg episode rewards: #0: 22.937, true rewards: #0: 9.937 [2023-02-22 21:32:40,294][24717] Avg episode reward: 22.937, avg true_objective: 9.937 [2023-02-22 21:32:40,349][24717] Num frames 8000... [2023-02-22 21:32:40,445][24717] Num frames 8100... [2023-02-22 21:32:40,550][24717] Num frames 8200... [2023-02-22 21:32:40,657][24717] Num frames 8300... [2023-02-22 21:32:40,805][24717] Avg episode rewards: #0: 20.998, true rewards: #0: 9.331 [2023-02-22 21:32:40,806][24717] Avg episode reward: 20.998, avg true_objective: 9.331 [2023-02-22 21:32:40,809][24717] Num frames 8400... [2023-02-22 21:32:40,907][24717] Num frames 8500... [2023-02-22 21:32:41,012][24717] Num frames 8600... [2023-02-22 21:32:41,114][24717] Num frames 8700... [2023-02-22 21:32:41,255][24717] Avg episode rewards: #0: 19.282, true rewards: #0: 8.782 [2023-02-22 21:32:41,256][24717] Avg episode reward: 19.282, avg true_objective: 8.782 [2023-02-22 21:32:57,417][24717] Replay video saved to /home/flahoud/studies/collab/train_dir/default_experiment/replay.mp4! [2023-02-22 21:33:05,970][24717] The model has been pushed to https://huggingface.co/GrimReaperSam/rl_course_vizdoom_health_gathering_supreme [2023-02-22 21:35:47,214][24717] Environment doom_basic already registered, overwriting... [2023-02-22 21:35:47,215][24717] Environment doom_two_colors_easy already registered, overwriting... [2023-02-22 21:35:47,216][24717] Environment doom_two_colors_hard already registered, overwriting... [2023-02-22 21:35:47,217][24717] Environment doom_dm already registered, overwriting... [2023-02-22 21:35:47,218][24717] Environment doom_dwango5 already registered, overwriting... [2023-02-22 21:35:47,218][24717] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-22 21:35:47,219][24717] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-22 21:35:47,220][24717] Environment doom_my_way_home already registered, overwriting... [2023-02-22 21:35:47,221][24717] Environment doom_deadly_corridor already registered, overwriting... [2023-02-22 21:35:47,221][24717] Environment doom_defend_the_center already registered, overwriting... [2023-02-22 21:35:47,222][24717] Environment doom_defend_the_line already registered, overwriting... [2023-02-22 21:35:47,222][24717] Environment doom_health_gathering already registered, overwriting... [2023-02-22 21:35:47,223][24717] Environment doom_health_gathering_supreme already registered, overwriting... [2023-02-22 21:35:47,224][24717] Environment doom_battle already registered, overwriting... [2023-02-22 21:35:47,224][24717] Environment doom_battle2 already registered, overwriting... [2023-02-22 21:35:47,225][24717] Environment doom_duel_bots already registered, overwriting... [2023-02-22 21:35:47,225][24717] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-22 21:35:47,226][24717] Environment doom_duel already registered, overwriting... 
[2023-02-22 21:35:47,227][24717] Environment doom_deathmatch_full already registered, overwriting...
[2023-02-22 21:35:47,227][24717] Environment doom_benchmark already registered, overwriting...
[2023-02-22 21:35:47,228][24717] register_encoder_factory:
[2023-02-22 21:35:47,235][24717] Loading existing experiment configuration from /home/flahoud/studies/collab/train_dir/default_experiment/config.json
[2023-02-22 21:35:47,236][24717] Overriding arg 'train_for_env_steps' with value 40000000 passed from command line
[2023-02-22 21:35:47,241][24717] Experiment dir /home/flahoud/studies/collab/train_dir/default_experiment already exists!
[2023-02-22 21:35:47,241][24717] Resuming existing experiment from /home/flahoud/studies/collab/train_dir/default_experiment...
[2023-02-22 21:35:47,242][24717] Weights and Biases integration disabled
[2023-02-22 21:35:47,244][24717] Environment var CUDA_VISIBLE_DEVICES is 0
[2023-02-22 21:35:48,838][24717] Starting experiment with the following configuration:
help=False
algo=APPO
env=doom_health_gathering_supreme
experiment=default_experiment
train_dir=/home/flahoud/studies/collab/train_dir
restart_behavior=resume
device=gpu
seed=None
num_policies=1
async_rl=True
serial_mode=False
batched_sampling=False
num_batches_to_accumulate=2
worker_num_splits=2
policy_workers_per_policy=1
max_policy_lag=1000
num_workers=8
num_envs_per_worker=4
batch_size=1024
num_batches_per_epoch=1
num_epochs=1
rollout=32
recurrence=32
shuffle_minibatches=False
gamma=0.99
reward_scale=1.0
reward_clip=1000.0
value_bootstrap=False
normalize_returns=True
exploration_loss_coeff=0.001
value_loss_coeff=0.5
kl_loss_coeff=0.0
exploration_loss=symmetric_kl
gae_lambda=0.95
ppo_clip_ratio=0.1
ppo_clip_value=0.2
with_vtrace=False
vtrace_rho=1.0
vtrace_c=1.0
optimizer=adam
adam_eps=1e-06
adam_beta1=0.9
adam_beta2=0.999
max_grad_norm=4.0
learning_rate=0.0001
lr_schedule=constant
lr_schedule_kl_threshold=0.008
lr_adaptive_min=1e-06
lr_adaptive_max=0.01
obs_subtract_mean=0.0
obs_scale=255.0
normalize_input=True
normalize_input_keys=None
decorrelate_experience_max_seconds=0
decorrelate_envs_on_one_worker=True
actor_worker_gpus=[]
set_workers_cpu_affinity=True
force_envs_single_thread=False
default_niceness=0
log_to_file=True
experiment_summaries_interval=10
flush_summaries_interval=30
stats_avg=100
summaries_use_frameskip=True
heartbeat_interval=20
heartbeat_reporting_interval=600
train_for_env_steps=40000000
train_for_seconds=10000000000
save_every_sec=120
keep_checkpoints=2
load_checkpoint_kind=latest
save_milestones_sec=-1
save_best_every_sec=5
save_best_metric=reward
save_best_after=100000
benchmark=False
encoder_mlp_layers=[512, 512]
encoder_conv_architecture=convnet_simple
encoder_conv_mlp_layers=[512]
use_rnn=True
rnn_size=512
rnn_type=gru
rnn_num_layers=1
decoder_mlp_layers=[]
nonlinearity=elu
policy_initialization=orthogonal
policy_init_gain=1.0
actor_critic_share_weights=True
adaptive_stddev=True
continuous_tanh_scale=0.0
initial_stddev=1.0
use_env_info_cache=False
env_gpu_actions=False
env_gpu_observations=True
env_frameskip=4
env_framestack=1
pixel_format=CHW
use_record_episode_statistics=False
with_wandb=False
wandb_user=None
wandb_project=sample_factory
wandb_group=None
wandb_job_type=SF
wandb_tags=[]
with_pbt=False
pbt_mix_policies_in_one_env=True
pbt_period_env_steps=5000000
pbt_start_mutation=20000000
pbt_replace_fraction=0.3
pbt_mutation_rate=0.15
pbt_replace_reward_gap=0.1
pbt_replace_reward_gap_absolute=1e-06
pbt_optimize_gamma=False
pbt_target_objective=true_objective
pbt_perturb_min=1.1
pbt_perturb_max=1.5
num_agents=-1
num_humans=0
num_bots=-1
start_bot_difficulty=None
timelimit=None
res_w=128
res_h=72
wide_aspect_ratio=False
eval_env_frameskip=1
fps=35
command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000000
cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000000}
git_hash=unknown
git_repo_name=not a git repository
[2023-02-22 21:35:48,840][24717] Saving configuration to /home/flahoud/studies/collab/train_dir/default_experiment/config.json...
[2023-02-22 21:35:48,841][24717] Rollout worker 0 uses device cpu
[2023-02-22 21:35:48,842][24717] Rollout worker 1 uses device cpu
[2023-02-22 21:35:48,843][24717] Rollout worker 2 uses device cpu
[2023-02-22 21:35:48,843][24717] Rollout worker 3 uses device cpu
[2023-02-22 21:35:48,844][24717] Rollout worker 4 uses device cpu
[2023-02-22 21:35:48,845][24717] Rollout worker 5 uses device cpu
[2023-02-22 21:35:48,845][24717] Rollout worker 6 uses device cpu
[2023-02-22 21:35:48,846][24717] Rollout worker 7 uses device cpu
[2023-02-22 21:35:48,898][24717] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:35:48,899][24717] InferenceWorker_p0-w0: min num requests: 2
[2023-02-22 21:35:48,930][24717] Starting all processes...
[2023-02-22 21:35:48,931][24717] Starting process learner_proc0
[2023-02-22 21:35:49,034][24717] Starting all processes...
[2023-02-22 21:35:49,039][24717] Starting process inference_proc0-0
[2023-02-22 21:35:49,040][24717] Starting process rollout_proc0
[2023-02-22 21:35:49,040][24717] Starting process rollout_proc1
[2023-02-22 21:35:49,040][24717] Starting process rollout_proc2
[2023-02-22 21:35:49,040][24717] Starting process rollout_proc3
[2023-02-22 21:35:49,042][24717] Starting process rollout_proc4
[2023-02-22 21:35:49,042][24717] Starting process rollout_proc5
[2023-02-22 21:35:49,042][24717] Starting process rollout_proc6
[2023-02-22 21:35:49,042][24717] Starting process rollout_proc7
[2023-02-22 21:35:50,848][37090] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:35:50,848][37090] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-22 21:35:50,875][37091] Worker 0 uses CPU cores [0]
[2023-02-22 21:35:50,880][37090] Num visible devices: 1
[2023-02-22 21:35:50,948][37101] Worker 2 uses CPU cores [2]
[2023-02-22 21:35:50,961][37076] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:35:50,962][37076] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-22 21:35:50,976][37076] Num visible devices: 1
[2023-02-22 21:35:50,993][37076] Starting seed is not provided
[2023-02-22 21:35:50,993][37076] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 21:35:50,993][37076] Initializing actor-critic model on device cuda:0
[2023-02-22 21:35:50,993][37076] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 21:35:50,994][37076] RunningMeanStd input shape: (1,)
[2023-02-22 21:35:51,005][37076] ConvEncoder: input_channels=3
[2023-02-22 21:35:51,133][37076] Conv encoder output size: 512
[2023-02-22 21:35:51,133][37076] Policy head output size: 512
[2023-02-22 21:35:51,144][37103] Worker 3 uses CPU cores [3]
[2023-02-22 21:35:51,146][37076] Created Actor Critic model with architecture:
[2023-02-22 21:35:51,146][37076] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) [2023-02-22 21:35:51,286][37106] Worker 5 uses CPU cores [5] [2023-02-22 21:35:51,312][37104] Worker 6 uses CPU cores [6] [2023-02-22 21:35:51,349][37111] Worker 7 uses CPU cores [7] [2023-02-22 21:35:51,446][37109] Worker 4 uses CPU cores [4] [2023-02-22 21:35:51,454][37092] Worker 1 uses CPU cores [1] [2023-02-22 21:35:53,896][37076] Using optimizer [2023-02-22 21:35:53,896][37076] Loading state from checkpoint /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-22 21:35:53,925][37076] Loading model from checkpoint [2023-02-22 21:35:53,928][37076] Loaded experiment state at self.train_step=978, self.env_steps=4005888 [2023-02-22 21:35:53,929][37076] Initialized policy 0 weights for model version 978 [2023-02-22 21:35:53,930][37076] LearnerWorker_p0 finished initialization! [2023-02-22 21:35:53,931][37076] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-22 21:35:54,112][37090] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 21:35:54,113][37090] RunningMeanStd input shape: (1,) [2023-02-22 21:35:54,123][37090] ConvEncoder: input_channels=3 [2023-02-22 21:35:54,209][37090] Conv encoder output size: 512 [2023-02-22 21:35:54,209][37090] Policy head output size: 512 [2023-02-22 21:35:56,778][24717] Inference worker 0-0 is ready! [2023-02-22 21:35:56,780][24717] All inference workers are ready! Signal rollout workers to start! [2023-02-22 21:35:56,802][37092] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,810][37104] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,816][37091] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,818][37109] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,821][37101] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,825][37106] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,827][37111] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:56,829][37103] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 21:35:57,244][24717] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 21:35:57,271][37092] Decorrelating experience for 0 frames... 
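The checkpoint the learner restores above encodes both counters it then reports: 000000978 is the train step (policy version) and 4005888 the environment step count, consistent with 978 × 1024 (batch_size) × 4 (env_frameskip) = 4,005,888 frames; the later checkpoints in this log follow the same 4096-frames-per-version relation (1593 × 4096 = 6,524,928; 2245 × 4096 = 9,195,520). A minimal sketch of inspecting such a file offline, assuming only that the .pth holds a plain dict, as the train_step/env_steps messages suggest (the key names are not guaranteed, so the sketch lists whatever is actually stored):

# Minimal sketch: peek inside the checkpoint the learner just restored.
# The path is copied from the log; dict keys are inspected rather than assumed.
import torch

ckpt_path = ("/home/flahoud/studies/collab/train_dir/default_experiment/"
             "checkpoint_p0/checkpoint_000000978_4005888.pth")
checkpoint = torch.load(ckpt_path, map_location="cpu")

print("stored keys:", sorted(checkpoint.keys()))
for key in ("train_step", "env_steps"):
    if key in checkpoint:
        print(key, "=", checkpoint[key])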
[2023-02-22 21:35:57,306][37104] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,359][37111] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,359][37106] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,360][37109] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,766][37092] Decorrelating experience for 32 frames... [2023-02-22 21:35:57,879][37103] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,880][37109] Decorrelating experience for 32 frames... [2023-02-22 21:35:57,881][37104] Decorrelating experience for 32 frames... [2023-02-22 21:35:57,882][37091] Decorrelating experience for 0 frames... [2023-02-22 21:35:57,882][37111] Decorrelating experience for 32 frames... [2023-02-22 21:35:58,126][37092] Decorrelating experience for 64 frames... [2023-02-22 21:35:58,376][37103] Decorrelating experience for 32 frames... [2023-02-22 21:35:58,376][37111] Decorrelating experience for 64 frames... [2023-02-22 21:35:58,379][37091] Decorrelating experience for 32 frames... [2023-02-22 21:35:58,380][37101] Decorrelating experience for 0 frames... [2023-02-22 21:35:58,411][37104] Decorrelating experience for 64 frames... [2023-02-22 21:35:58,835][37106] Decorrelating experience for 32 frames... [2023-02-22 21:35:58,921][37111] Decorrelating experience for 96 frames... [2023-02-22 21:35:58,923][37091] Decorrelating experience for 64 frames... [2023-02-22 21:35:58,924][37092] Decorrelating experience for 96 frames... [2023-02-22 21:35:58,924][37109] Decorrelating experience for 64 frames... [2023-02-22 21:35:59,163][37103] Decorrelating experience for 64 frames... [2023-02-22 21:35:59,267][37106] Decorrelating experience for 64 frames... [2023-02-22 21:35:59,392][37109] Decorrelating experience for 96 frames... [2023-02-22 21:35:59,450][37104] Decorrelating experience for 96 frames... [2023-02-22 21:35:59,450][37101] Decorrelating experience for 32 frames... [2023-02-22 21:35:59,454][37091] Decorrelating experience for 96 frames... [2023-02-22 21:35:59,615][37106] Decorrelating experience for 96 frames... [2023-02-22 21:35:59,849][37101] Decorrelating experience for 64 frames... [2023-02-22 21:35:59,882][37103] Decorrelating experience for 96 frames... [2023-02-22 21:36:00,294][37101] Decorrelating experience for 96 frames... [2023-02-22 21:36:00,397][37076] Signal inference workers to stop experience collection... [2023-02-22 21:36:00,406][37090] InferenceWorker_p0-w0: stopping experience collection [2023-02-22 21:36:01,673][37076] Signal inference workers to resume experience collection... [2023-02-22 21:36:01,674][37090] InferenceWorker_p0-w0: resuming experience collection [2023-02-22 21:36:02,244][24717] Fps is (10 sec: 819.2, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 4009984. Throughput: 0: 14.4. Samples: 72. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-22 21:36:02,245][24717] Avg episode reward: [(0, '3.894')] [2023-02-22 21:36:03,733][37090] Updated weights for policy 0, policy_version 988 (0.0292) [2023-02-22 21:36:05,540][37090] Updated weights for policy 0, policy_version 998 (0.0007) [2023-02-22 21:36:07,244][24717] Fps is (10 sec: 11878.4, 60 sec: 11878.4, 300 sec: 11878.4). Total num frames: 4124672. Throughput: 0: 2271.4. Samples: 22714. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:36:07,245][24717] Avg episode reward: [(0, '21.648')] [2023-02-22 21:36:07,254][37090] Updated weights for policy 0, policy_version 1008 (0.0008) [2023-02-22 21:36:08,890][24717] Heartbeat connected on Batcher_0 [2023-02-22 21:36:08,894][24717] Heartbeat connected on LearnerWorker_p0 [2023-02-22 21:36:08,901][24717] Heartbeat connected on InferenceWorker_p0-w0 [2023-02-22 21:36:08,907][24717] Heartbeat connected on RolloutWorker_w1 [2023-02-22 21:36:08,911][24717] Heartbeat connected on RolloutWorker_w2 [2023-02-22 21:36:08,915][24717] Heartbeat connected on RolloutWorker_w3 [2023-02-22 21:36:08,917][24717] Heartbeat connected on RolloutWorker_w0 [2023-02-22 21:36:08,920][24717] Heartbeat connected on RolloutWorker_w4 [2023-02-22 21:36:08,923][24717] Heartbeat connected on RolloutWorker_w5 [2023-02-22 21:36:08,925][24717] Heartbeat connected on RolloutWorker_w6 [2023-02-22 21:36:08,930][24717] Heartbeat connected on RolloutWorker_w7 [2023-02-22 21:36:08,981][37090] Updated weights for policy 0, policy_version 1018 (0.0007) [2023-02-22 21:36:10,619][37090] Updated weights for policy 0, policy_version 1028 (0.0008) [2023-02-22 21:36:12,245][24717] Fps is (10 sec: 23754.8, 60 sec: 16110.0, 300 sec: 16110.0). Total num frames: 4247552. Throughput: 0: 3945.2. Samples: 59182. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:36:12,246][24717] Avg episode reward: [(0, '26.987')] [2023-02-22 21:36:12,262][37090] Updated weights for policy 0, policy_version 1038 (0.0008) [2023-02-22 21:36:13,912][37090] Updated weights for policy 0, policy_version 1048 (0.0007) [2023-02-22 21:36:15,567][37090] Updated weights for policy 0, policy_version 1058 (0.0008) [2023-02-22 21:36:17,224][37090] Updated weights for policy 0, policy_version 1068 (0.0006) [2023-02-22 21:36:17,244][24717] Fps is (10 sec: 24985.6, 60 sec: 18432.0, 300 sec: 18432.0). Total num frames: 4374528. Throughput: 0: 3892.3. Samples: 77846. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:36:17,245][24717] Avg episode reward: [(0, '27.007')] [2023-02-22 21:36:18,849][37090] Updated weights for policy 0, policy_version 1078 (0.0008) [2023-02-22 21:36:20,568][37090] Updated weights for policy 0, policy_version 1088 (0.0008) [2023-02-22 21:36:22,245][24717] Fps is (10 sec: 24575.4, 60 sec: 19496.1, 300 sec: 19496.1). Total num frames: 4493312. Throughput: 0: 4591.2. Samples: 114786. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:36:22,248][24717] Avg episode reward: [(0, '28.049')] [2023-02-22 21:36:22,261][37090] Updated weights for policy 0, policy_version 1098 (0.0007) [2023-02-22 21:36:23,993][37090] Updated weights for policy 0, policy_version 1108 (0.0007) [2023-02-22 21:36:25,673][37090] Updated weights for policy 0, policy_version 1118 (0.0007) [2023-02-22 21:36:27,244][24717] Fps is (10 sec: 24166.5, 60 sec: 20343.5, 300 sec: 20343.5). Total num frames: 4616192. Throughput: 0: 5028.5. Samples: 150854. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:36:27,246][24717] Avg episode reward: [(0, '21.155')] [2023-02-22 21:36:27,402][37090] Updated weights for policy 0, policy_version 1128 (0.0008) [2023-02-22 21:36:29,190][37090] Updated weights for policy 0, policy_version 1138 (0.0009) [2023-02-22 21:36:30,927][37090] Updated weights for policy 0, policy_version 1148 (0.0007) [2023-02-22 21:36:32,245][24717] Fps is (10 sec: 23757.0, 60 sec: 20713.5, 300 sec: 20713.5). Total num frames: 4730880. Throughput: 0: 4802.6. 
Samples: 168096. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:36:32,246][24717] Avg episode reward: [(0, '23.643')] [2023-02-22 21:36:32,702][37090] Updated weights for policy 0, policy_version 1158 (0.0008) [2023-02-22 21:36:34,466][37090] Updated weights for policy 0, policy_version 1168 (0.0009) [2023-02-22 21:36:36,167][37090] Updated weights for policy 0, policy_version 1178 (0.0009) [2023-02-22 21:36:37,244][24717] Fps is (10 sec: 23347.2, 60 sec: 21094.4, 300 sec: 21094.4). Total num frames: 4849664. Throughput: 0: 5080.9. Samples: 203236. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:36:37,245][24717] Avg episode reward: [(0, '27.936')] [2023-02-22 21:36:37,877][37090] Updated weights for policy 0, policy_version 1188 (0.0008) [2023-02-22 21:36:39,601][37090] Updated weights for policy 0, policy_version 1198 (0.0008) [2023-02-22 21:36:41,289][37090] Updated weights for policy 0, policy_version 1208 (0.0009) [2023-02-22 21:36:42,244][24717] Fps is (10 sec: 23759.0, 60 sec: 21390.2, 300 sec: 21390.2). Total num frames: 4968448. Throughput: 0: 5313.3. Samples: 239098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:36:42,245][24717] Avg episode reward: [(0, '25.803')] [2023-02-22 21:36:43,009][37090] Updated weights for policy 0, policy_version 1218 (0.0006) [2023-02-22 21:36:44,720][37090] Updated weights for policy 0, policy_version 1228 (0.0009) [2023-02-22 21:36:46,397][37090] Updated weights for policy 0, policy_version 1238 (0.0009) [2023-02-22 21:36:47,244][24717] Fps is (10 sec: 24166.4, 60 sec: 21708.8, 300 sec: 21708.8). Total num frames: 5091328. Throughput: 0: 5713.4. Samples: 257174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:36:47,245][24717] Avg episode reward: [(0, '28.074')] [2023-02-22 21:36:48,018][37090] Updated weights for policy 0, policy_version 1248 (0.0007) [2023-02-22 21:36:49,721][37090] Updated weights for policy 0, policy_version 1258 (0.0007) [2023-02-22 21:36:51,434][37090] Updated weights for policy 0, policy_version 1268 (0.0006) [2023-02-22 21:36:52,244][24717] Fps is (10 sec: 24166.5, 60 sec: 21895.0, 300 sec: 21895.0). Total num frames: 5210112. Throughput: 0: 6024.4. Samples: 293812. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:36:52,245][24717] Avg episode reward: [(0, '25.116')] [2023-02-22 21:36:53,104][37090] Updated weights for policy 0, policy_version 1278 (0.0008) [2023-02-22 21:36:54,833][37090] Updated weights for policy 0, policy_version 1288 (0.0007) [2023-02-22 21:36:56,571][37090] Updated weights for policy 0, policy_version 1298 (0.0009) [2023-02-22 21:36:57,244][24717] Fps is (10 sec: 23756.7, 60 sec: 22050.1, 300 sec: 22050.1). Total num frames: 5328896. Throughput: 0: 6010.9. Samples: 329668. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:36:57,245][24717] Avg episode reward: [(0, '23.664')] [2023-02-22 21:36:58,287][37090] Updated weights for policy 0, policy_version 1308 (0.0009) [2023-02-22 21:37:00,084][37090] Updated weights for policy 0, policy_version 1318 (0.0007) [2023-02-22 21:37:01,894][37090] Updated weights for policy 0, policy_version 1328 (0.0010) [2023-02-22 21:37:02,244][24717] Fps is (10 sec: 23347.3, 60 sec: 23893.3, 300 sec: 22118.4). Total num frames: 5443584. Throughput: 0: 5988.8. Samples: 347342. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:02,245][24717] Avg episode reward: [(0, '24.494')] [2023-02-22 21:37:03,706][37090] Updated weights for policy 0, policy_version 1338 (0.0008) [2023-02-22 21:37:05,427][37090] Updated weights for policy 0, policy_version 1348 (0.0009) [2023-02-22 21:37:07,215][37090] Updated weights for policy 0, policy_version 1358 (0.0007) [2023-02-22 21:37:07,245][24717] Fps is (10 sec: 23346.8, 60 sec: 23961.5, 300 sec: 22235.4). Total num frames: 5562368. Throughput: 0: 5926.2. Samples: 381458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:37:07,246][24717] Avg episode reward: [(0, '22.373')] [2023-02-22 21:37:08,935][37090] Updated weights for policy 0, policy_version 1368 (0.0007) [2023-02-22 21:37:10,597][37090] Updated weights for policy 0, policy_version 1378 (0.0006) [2023-02-22 21:37:12,244][24717] Fps is (10 sec: 23756.9, 60 sec: 23893.7, 300 sec: 22336.9). Total num frames: 5681152. Throughput: 0: 5927.1. Samples: 417574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:37:12,247][24717] Avg episode reward: [(0, '28.823')] [2023-02-22 21:37:12,260][37076] Saving new best policy, reward=28.823! [2023-02-22 21:37:12,264][37090] Updated weights for policy 0, policy_version 1388 (0.0007) [2023-02-22 21:37:13,885][37090] Updated weights for policy 0, policy_version 1398 (0.0007) [2023-02-22 21:37:15,557][37090] Updated weights for policy 0, policy_version 1408 (0.0007) [2023-02-22 21:37:17,182][37090] Updated weights for policy 0, policy_version 1418 (0.0008) [2023-02-22 21:37:17,244][24717] Fps is (10 sec: 24576.5, 60 sec: 23893.3, 300 sec: 22528.0). Total num frames: 5808128. Throughput: 0: 5958.9. Samples: 436240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:37:17,245][24717] Avg episode reward: [(0, '26.787')] [2023-02-22 21:37:18,814][37090] Updated weights for policy 0, policy_version 1428 (0.0007) [2023-02-22 21:37:20,509][37090] Updated weights for policy 0, policy_version 1438 (0.0006) [2023-02-22 21:37:22,173][37090] Updated weights for policy 0, policy_version 1448 (0.0006) [2023-02-22 21:37:22,244][24717] Fps is (10 sec: 24985.2, 60 sec: 23962.0, 300 sec: 22648.4). Total num frames: 5931008. Throughput: 0: 6003.6. Samples: 473398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:37:22,246][24717] Avg episode reward: [(0, '29.462')] [2023-02-22 21:37:22,248][37076] Saving new best policy, reward=29.462! [2023-02-22 21:37:23,909][37090] Updated weights for policy 0, policy_version 1458 (0.0007) [2023-02-22 21:37:25,716][37090] Updated weights for policy 0, policy_version 1468 (0.0009) [2023-02-22 21:37:27,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23825.1, 300 sec: 22664.5). Total num frames: 6045696. Throughput: 0: 5992.0. Samples: 508736. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:27,248][24717] Avg episode reward: [(0, '28.907')] [2023-02-22 21:37:27,441][37090] Updated weights for policy 0, policy_version 1478 (0.0008) [2023-02-22 21:37:29,177][37090] Updated weights for policy 0, policy_version 1488 (0.0006) [2023-02-22 21:37:30,930][37090] Updated weights for policy 0, policy_version 1498 (0.0007) [2023-02-22 21:37:32,244][24717] Fps is (10 sec: 23347.6, 60 sec: 23893.7, 300 sec: 22722.0). Total num frames: 6164480. Throughput: 0: 5984.1. Samples: 526458. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:32,245][24717] Avg episode reward: [(0, '28.522')] [2023-02-22 21:37:32,649][37090] Updated weights for policy 0, policy_version 1508 (0.0007) [2023-02-22 21:37:34,393][37090] Updated weights for policy 0, policy_version 1518 (0.0007) [2023-02-22 21:37:36,129][37090] Updated weights for policy 0, policy_version 1528 (0.0007) [2023-02-22 21:37:37,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23893.3, 300 sec: 22773.8). Total num frames: 6283264. Throughput: 0: 5957.6. Samples: 561904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:37,246][24717] Avg episode reward: [(0, '26.927')] [2023-02-22 21:37:37,855][37090] Updated weights for policy 0, policy_version 1538 (0.0007) [2023-02-22 21:37:39,575][37090] Updated weights for policy 0, policy_version 1548 (0.0007) [2023-02-22 21:37:41,280][37090] Updated weights for policy 0, policy_version 1558 (0.0006) [2023-02-22 21:37:42,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23893.4, 300 sec: 22820.6). Total num frames: 6402048. Throughput: 0: 5954.9. Samples: 597636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:37:42,245][24717] Avg episode reward: [(0, '26.315')] [2023-02-22 21:37:42,969][37090] Updated weights for policy 0, policy_version 1568 (0.0007) [2023-02-22 21:37:44,630][37090] Updated weights for policy 0, policy_version 1578 (0.0007) [2023-02-22 21:37:46,326][37090] Updated weights for policy 0, policy_version 1588 (0.0007) [2023-02-22 21:37:47,244][24717] Fps is (10 sec: 24166.3, 60 sec: 23893.3, 300 sec: 22900.4). Total num frames: 6524928. Throughput: 0: 5969.8. Samples: 615984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:47,245][24717] Avg episode reward: [(0, '24.430')] [2023-02-22 21:37:47,252][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000001593_6524928.pth... [2023-02-22 21:37:47,301][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000605_2478080.pth [2023-02-22 21:37:48,005][37090] Updated weights for policy 0, policy_version 1598 (0.0007) [2023-02-22 21:37:49,700][37090] Updated weights for policy 0, policy_version 1608 (0.0006) [2023-02-22 21:37:51,390][37090] Updated weights for policy 0, policy_version 1618 (0.0008) [2023-02-22 21:37:52,244][24717] Fps is (10 sec: 24575.8, 60 sec: 23961.6, 300 sec: 22973.2). Total num frames: 6647808. Throughput: 0: 6022.7. Samples: 652478. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:37:52,248][24717] Avg episode reward: [(0, '25.157')] [2023-02-22 21:37:53,060][37090] Updated weights for policy 0, policy_version 1628 (0.0007) [2023-02-22 21:37:54,844][37090] Updated weights for policy 0, policy_version 1638 (0.0008) [2023-02-22 21:37:56,529][37090] Updated weights for policy 0, policy_version 1648 (0.0007) [2023-02-22 21:37:57,244][24717] Fps is (10 sec: 24166.5, 60 sec: 23961.6, 300 sec: 23005.9). Total num frames: 6766592. Throughput: 0: 6020.6. Samples: 688502. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:37:57,245][24717] Avg episode reward: [(0, '26.791')] [2023-02-22 21:37:58,241][37090] Updated weights for policy 0, policy_version 1658 (0.0006) [2023-02-22 21:37:59,990][37090] Updated weights for policy 0, policy_version 1668 (0.0008) [2023-02-22 21:38:01,762][37090] Updated weights for policy 0, policy_version 1678 (0.0008) [2023-02-22 21:38:02,244][24717] Fps is (10 sec: 23347.4, 60 sec: 23961.6, 300 sec: 23003.1). 
Total num frames: 6881280. Throughput: 0: 5995.4. Samples: 706034. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:38:02,245][24717] Avg episode reward: [(0, '26.820')] [2023-02-22 21:38:03,466][37090] Updated weights for policy 0, policy_version 1688 (0.0006) [2023-02-22 21:38:05,198][37090] Updated weights for policy 0, policy_version 1698 (0.0007) [2023-02-22 21:38:06,907][37090] Updated weights for policy 0, policy_version 1708 (0.0007) [2023-02-22 21:38:07,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23961.7, 300 sec: 23032.1). Total num frames: 7000064. Throughput: 0: 5959.7. Samples: 741582. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:38:07,246][24717] Avg episode reward: [(0, '28.493')] [2023-02-22 21:38:08,598][37090] Updated weights for policy 0, policy_version 1718 (0.0010) [2023-02-22 21:38:10,258][37090] Updated weights for policy 0, policy_version 1728 (0.0006) [2023-02-22 21:38:11,893][37090] Updated weights for policy 0, policy_version 1738 (0.0007) [2023-02-22 21:38:12,244][24717] Fps is (10 sec: 24576.0, 60 sec: 24098.1, 300 sec: 23119.6). Total num frames: 7127040. Throughput: 0: 5991.5. Samples: 778354. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:38:12,245][24717] Avg episode reward: [(0, '25.683')] [2023-02-22 21:38:13,489][37090] Updated weights for policy 0, policy_version 1748 (0.0008) [2023-02-22 21:38:15,152][37090] Updated weights for policy 0, policy_version 1758 (0.0008) [2023-02-22 21:38:16,761][37090] Updated weights for policy 0, policy_version 1768 (0.0009) [2023-02-22 21:38:17,244][24717] Fps is (10 sec: 24985.4, 60 sec: 24029.8, 300 sec: 23171.6). Total num frames: 7249920. Throughput: 0: 6016.5. Samples: 797202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:38:17,246][24717] Avg episode reward: [(0, '26.001')] [2023-02-22 21:38:18,402][37090] Updated weights for policy 0, policy_version 1778 (0.0007) [2023-02-22 21:38:20,088][37090] Updated weights for policy 0, policy_version 1788 (0.0008) [2023-02-22 21:38:21,929][37090] Updated weights for policy 0, policy_version 1798 (0.0007) [2023-02-22 21:38:22,245][24717] Fps is (10 sec: 24165.2, 60 sec: 23961.5, 300 sec: 23191.8). Total num frames: 7368704. Throughput: 0: 6051.0. Samples: 834202. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:38:22,247][24717] Avg episode reward: [(0, '30.189')] [2023-02-22 21:38:22,252][37076] Saving new best policy, reward=30.189! [2023-02-22 21:38:23,740][37090] Updated weights for policy 0, policy_version 1808 (0.0009) [2023-02-22 21:38:25,443][37090] Updated weights for policy 0, policy_version 1818 (0.0008) [2023-02-22 21:38:27,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23961.6, 300 sec: 23183.3). Total num frames: 7483392. Throughput: 0: 6020.4. Samples: 868554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:38:27,246][24717] Avg episode reward: [(0, '28.317')] [2023-02-22 21:38:27,294][37090] Updated weights for policy 0, policy_version 1828 (0.0010) [2023-02-22 21:38:29,325][37090] Updated weights for policy 0, policy_version 1838 (0.0008) [2023-02-22 21:38:31,321][37090] Updated weights for policy 0, policy_version 1848 (0.0010) [2023-02-22 21:38:32,244][24717] Fps is (10 sec: 22119.4, 60 sec: 23756.8, 300 sec: 23122.6). Total num frames: 7589888. Throughput: 0: 5948.2. Samples: 883654. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:38:32,246][24717] Avg episode reward: [(0, '28.656')] [2023-02-22 21:38:33,152][37090] Updated weights for policy 0, policy_version 1858 (0.0007) [2023-02-22 21:38:34,978][37090] Updated weights for policy 0, policy_version 1868 (0.0007) [2023-02-22 21:38:36,772][37090] Updated weights for policy 0, policy_version 1878 (0.0009) [2023-02-22 21:38:37,244][24717] Fps is (10 sec: 21709.0, 60 sec: 23620.3, 300 sec: 23091.2). Total num frames: 7700480. Throughput: 0: 5870.9. Samples: 916666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:38:37,246][24717] Avg episode reward: [(0, '26.280')] [2023-02-22 21:38:38,560][37090] Updated weights for policy 0, policy_version 1888 (0.0008) [2023-02-22 21:38:40,530][37090] Updated weights for policy 0, policy_version 1898 (0.0007) [2023-02-22 21:38:42,244][24717] Fps is (10 sec: 21708.9, 60 sec: 23415.5, 300 sec: 23036.9). Total num frames: 7806976. Throughput: 0: 5803.7. Samples: 949668. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:38:42,245][24717] Avg episode reward: [(0, '25.265')] [2023-02-22 21:38:42,448][37090] Updated weights for policy 0, policy_version 1908 (0.0007) [2023-02-22 21:38:44,448][37090] Updated weights for policy 0, policy_version 1918 (0.0007) [2023-02-22 21:38:46,495][37090] Updated weights for policy 0, policy_version 1928 (0.0011) [2023-02-22 21:38:47,244][24717] Fps is (10 sec: 20889.6, 60 sec: 23074.1, 300 sec: 22961.7). Total num frames: 7909376. Throughput: 0: 5753.5. Samples: 964940. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:38:47,245][24717] Avg episode reward: [(0, '28.922')] [2023-02-22 21:38:48,437][37090] Updated weights for policy 0, policy_version 1938 (0.0007) [2023-02-22 21:38:50,279][37090] Updated weights for policy 0, policy_version 1948 (0.0010) [2023-02-22 21:38:52,046][37090] Updated weights for policy 0, policy_version 1958 (0.0007) [2023-02-22 21:38:52,244][24717] Fps is (10 sec: 21708.8, 60 sec: 22937.6, 300 sec: 22961.0). Total num frames: 8024064. Throughput: 0: 5672.3. Samples: 996834. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:38:52,245][24717] Avg episode reward: [(0, '30.228')] [2023-02-22 21:38:52,247][37076] Saving new best policy, reward=30.228! [2023-02-22 21:38:53,877][37090] Updated weights for policy 0, policy_version 1968 (0.0009) [2023-02-22 21:38:55,691][37090] Updated weights for policy 0, policy_version 1978 (0.0008) [2023-02-22 21:38:57,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22801.1, 300 sec: 22937.6). Total num frames: 8134656. Throughput: 0: 5613.7. Samples: 1030972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:38:57,245][24717] Avg episode reward: [(0, '27.696')] [2023-02-22 21:38:57,447][37090] Updated weights for policy 0, policy_version 1988 (0.0008) [2023-02-22 21:38:59,282][37090] Updated weights for policy 0, policy_version 1998 (0.0007) [2023-02-22 21:39:01,194][37090] Updated weights for policy 0, policy_version 2008 (0.0008) [2023-02-22 21:39:02,244][24717] Fps is (10 sec: 22118.4, 60 sec: 22732.8, 300 sec: 22915.5). Total num frames: 8245248. Throughput: 0: 5570.1. Samples: 1047858. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:39:02,246][24717] Avg episode reward: [(0, '25.381')] [2023-02-22 21:39:02,958][37090] Updated weights for policy 0, policy_version 2018 (0.0008) [2023-02-22 21:39:04,788][37090] Updated weights for policy 0, policy_version 2028 (0.0009) [2023-02-22 21:39:06,681][37090] Updated weights for policy 0, policy_version 2038 (0.0007) [2023-02-22 21:39:07,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22664.5, 300 sec: 22916.0). Total num frames: 8359936. Throughput: 0: 5496.8. Samples: 1081556. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:39:07,245][24717] Avg episode reward: [(0, '28.126')] [2023-02-22 21:39:08,514][37090] Updated weights for policy 0, policy_version 2048 (0.0008) [2023-02-22 21:39:10,269][37090] Updated weights for policy 0, policy_version 2058 (0.0008) [2023-02-22 21:39:12,007][37090] Updated weights for policy 0, policy_version 2068 (0.0006) [2023-02-22 21:39:12,244][24717] Fps is (10 sec: 22937.7, 60 sec: 22459.7, 300 sec: 22916.6). Total num frames: 8474624. Throughput: 0: 5489.7. Samples: 1115590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:39:12,246][24717] Avg episode reward: [(0, '27.021')] [2023-02-22 21:39:13,691][37090] Updated weights for policy 0, policy_version 2078 (0.0006) [2023-02-22 21:39:15,502][37090] Updated weights for policy 0, policy_version 2088 (0.0008) [2023-02-22 21:39:17,244][24717] Fps is (10 sec: 22937.5, 60 sec: 22323.2, 300 sec: 22917.1). Total num frames: 8589312. Throughput: 0: 5548.8. Samples: 1133350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:39:17,246][24717] Avg episode reward: [(0, '30.461')] [2023-02-22 21:39:17,251][37076] Saving new best policy, reward=30.461! [2023-02-22 21:39:17,391][37090] Updated weights for policy 0, policy_version 2098 (0.0007) [2023-02-22 21:39:19,234][37090] Updated weights for policy 0, policy_version 2108 (0.0008) [2023-02-22 21:39:21,217][37090] Updated weights for policy 0, policy_version 2118 (0.0008) [2023-02-22 21:39:22,244][24717] Fps is (10 sec: 22118.4, 60 sec: 22118.6, 300 sec: 22877.7). Total num frames: 8695808. Throughput: 0: 5537.9. Samples: 1165872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:39:22,245][24717] Avg episode reward: [(0, '26.340')] [2023-02-22 21:39:23,154][37090] Updated weights for policy 0, policy_version 2128 (0.0009) [2023-02-22 21:39:25,317][37090] Updated weights for policy 0, policy_version 2138 (0.0007) [2023-02-22 21:39:27,244][24717] Fps is (10 sec: 20480.0, 60 sec: 21845.4, 300 sec: 22801.1). Total num frames: 8794112. Throughput: 0: 5474.9. Samples: 1196038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:39:27,245][24717] Avg episode reward: [(0, '28.516')] [2023-02-22 21:39:27,270][37090] Updated weights for policy 0, policy_version 2148 (0.0008) [2023-02-22 21:39:29,396][37090] Updated weights for policy 0, policy_version 2158 (0.0008) [2023-02-22 21:39:31,410][37090] Updated weights for policy 0, policy_version 2168 (0.0007) [2023-02-22 21:39:32,244][24717] Fps is (10 sec: 20070.4, 60 sec: 21777.1, 300 sec: 22747.1). Total num frames: 8896512. Throughput: 0: 5463.8. Samples: 1210812. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:39:32,245][24717] Avg episode reward: [(0, '26.192')] [2023-02-22 21:39:33,467][37090] Updated weights for policy 0, policy_version 2178 (0.0009) [2023-02-22 21:39:35,788][37090] Updated weights for policy 0, policy_version 2188 (0.0008) [2023-02-22 21:39:37,245][24717] Fps is (10 sec: 19660.4, 60 sec: 21503.9, 300 sec: 22658.3). Total num frames: 8990720. Throughput: 0: 5399.9. Samples: 1239830. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 21:39:37,246][24717] Avg episode reward: [(0, '27.843')] [2023-02-22 21:39:37,749][37090] Updated weights for policy 0, policy_version 2198 (0.0008) [2023-02-22 21:39:39,996][37090] Updated weights for policy 0, policy_version 2208 (0.0008) [2023-02-22 21:39:42,021][37090] Updated weights for policy 0, policy_version 2218 (0.0009) [2023-02-22 21:39:42,244][24717] Fps is (10 sec: 19251.2, 60 sec: 21367.5, 300 sec: 22591.7). Total num frames: 9089024. Throughput: 0: 5292.4. Samples: 1269132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:39:42,245][24717] Avg episode reward: [(0, '29.069')] [2023-02-22 21:39:43,930][37090] Updated weights for policy 0, policy_version 2228 (0.0007) [2023-02-22 21:39:45,889][37090] Updated weights for policy 0, policy_version 2238 (0.0007) [2023-02-22 21:39:47,245][24717] Fps is (10 sec: 20480.0, 60 sec: 21435.7, 300 sec: 22563.6). Total num frames: 9195520. Throughput: 0: 5278.9. Samples: 1285408. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:39:47,246][24717] Avg episode reward: [(0, '32.051')] [2023-02-22 21:39:47,250][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000002245_9195520.pth... [2023-02-22 21:39:47,307][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth [2023-02-22 21:39:47,316][37076] Saving new best policy, reward=32.051! [2023-02-22 21:39:47,776][37090] Updated weights for policy 0, policy_version 2248 (0.0007) [2023-02-22 21:39:49,695][37090] Updated weights for policy 0, policy_version 2258 (0.0008) [2023-02-22 21:39:51,590][37090] Updated weights for policy 0, policy_version 2268 (0.0007) [2023-02-22 21:39:52,245][24717] Fps is (10 sec: 21298.8, 60 sec: 21299.1, 300 sec: 22536.7). Total num frames: 9302016. Throughput: 0: 5241.8. Samples: 1317436. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:39:52,246][24717] Avg episode reward: [(0, '30.927')] [2023-02-22 21:39:53,504][37090] Updated weights for policy 0, policy_version 2278 (0.0008) [2023-02-22 21:39:55,423][37090] Updated weights for policy 0, policy_version 2288 (0.0007) [2023-02-22 21:39:57,244][24717] Fps is (10 sec: 21299.6, 60 sec: 21230.9, 300 sec: 22510.9). Total num frames: 9408512. Throughput: 0: 5192.2. Samples: 1349240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:39:57,246][24717] Avg episode reward: [(0, '28.586')] [2023-02-22 21:39:57,376][37090] Updated weights for policy 0, policy_version 2298 (0.0008) [2023-02-22 21:39:59,294][37090] Updated weights for policy 0, policy_version 2308 (0.0010) [2023-02-22 21:40:01,255][37090] Updated weights for policy 0, policy_version 2318 (0.0009) [2023-02-22 21:40:02,244][24717] Fps is (10 sec: 21299.5, 60 sec: 21162.7, 300 sec: 22486.2). Total num frames: 9515008. Throughput: 0: 5152.0. Samples: 1365188. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:40:02,245][24717] Avg episode reward: [(0, '27.079')] [2023-02-22 21:40:03,102][37090] Updated weights for policy 0, policy_version 2328 (0.0009) [2023-02-22 21:40:05,111][37090] Updated weights for policy 0, policy_version 2338 (0.0007) [2023-02-22 21:40:06,979][37090] Updated weights for policy 0, policy_version 2348 (0.0006) [2023-02-22 21:40:07,244][24717] Fps is (10 sec: 21299.0, 60 sec: 21026.1, 300 sec: 22462.5). Total num frames: 9621504. Throughput: 0: 5137.1. Samples: 1397040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:40:07,245][24717] Avg episode reward: [(0, '25.967')] [2023-02-22 21:40:08,810][37090] Updated weights for policy 0, policy_version 2358 (0.0008) [2023-02-22 21:40:10,615][37090] Updated weights for policy 0, policy_version 2368 (0.0009) [2023-02-22 21:40:12,244][24717] Fps is (10 sec: 21708.6, 60 sec: 20957.8, 300 sec: 22455.7). Total num frames: 9732096. Throughput: 0: 5212.7. Samples: 1430610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:40:12,246][24717] Avg episode reward: [(0, '26.212')] [2023-02-22 21:40:12,435][37090] Updated weights for policy 0, policy_version 2378 (0.0007) [2023-02-22 21:40:14,185][37090] Updated weights for policy 0, policy_version 2388 (0.0008) [2023-02-22 21:40:15,891][37090] Updated weights for policy 0, policy_version 2398 (0.0007) [2023-02-22 21:40:17,244][24717] Fps is (10 sec: 22937.8, 60 sec: 21026.1, 300 sec: 22480.7). Total num frames: 9850880. Throughput: 0: 5275.4. Samples: 1448206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:40:17,245][24717] Avg episode reward: [(0, '32.827')] [2023-02-22 21:40:17,257][37076] Saving new best policy, reward=32.827! [2023-02-22 21:40:17,623][37090] Updated weights for policy 0, policy_version 2408 (0.0007) [2023-02-22 21:40:19,324][37090] Updated weights for policy 0, policy_version 2418 (0.0010) [2023-02-22 21:40:21,060][37090] Updated weights for policy 0, policy_version 2428 (0.0010) [2023-02-22 21:40:22,244][24717] Fps is (10 sec: 23757.0, 60 sec: 21230.9, 300 sec: 22504.8). Total num frames: 9969664. Throughput: 0: 5423.5. Samples: 1483886. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:40:22,246][24717] Avg episode reward: [(0, '34.547')] [2023-02-22 21:40:22,248][37076] Saving new best policy, reward=34.547! [2023-02-22 21:40:22,859][37090] Updated weights for policy 0, policy_version 2438 (0.0007) [2023-02-22 21:40:24,626][37090] Updated weights for policy 0, policy_version 2448 (0.0009) [2023-02-22 21:40:26,352][37090] Updated weights for policy 0, policy_version 2458 (0.0008) [2023-02-22 21:40:27,244][24717] Fps is (10 sec: 23756.8, 60 sec: 21572.3, 300 sec: 22528.0). Total num frames: 10088448. Throughput: 0: 5545.2. Samples: 1518664. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:40:27,246][24717] Avg episode reward: [(0, '30.083')] [2023-02-22 21:40:28,120][37090] Updated weights for policy 0, policy_version 2468 (0.0006) [2023-02-22 21:40:29,919][37090] Updated weights for policy 0, policy_version 2478 (0.0008) [2023-02-22 21:40:31,732][37090] Updated weights for policy 0, policy_version 2488 (0.0008) [2023-02-22 21:40:32,244][24717] Fps is (10 sec: 22937.6, 60 sec: 21708.8, 300 sec: 22520.6). Total num frames: 10199040. Throughput: 0: 5567.8. Samples: 1535958. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:40:32,245][24717] Avg episode reward: [(0, '31.499')] [2023-02-22 21:40:33,488][37090] Updated weights for policy 0, policy_version 2498 (0.0007) [2023-02-22 21:40:35,377][37090] Updated weights for policy 0, policy_version 2508 (0.0008) [2023-02-22 21:40:37,206][37090] Updated weights for policy 0, policy_version 2518 (0.0007) [2023-02-22 21:40:37,244][24717] Fps is (10 sec: 22527.8, 60 sec: 22050.2, 300 sec: 22528.0). Total num frames: 10313728. Throughput: 0: 5602.7. Samples: 1569558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:40:37,245][24717] Avg episode reward: [(0, '29.469')] [2023-02-22 21:40:39,042][37090] Updated weights for policy 0, policy_version 2528 (0.0006) [2023-02-22 21:40:40,823][37090] Updated weights for policy 0, policy_version 2538 (0.0007) [2023-02-22 21:40:42,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22254.9, 300 sec: 22520.8). Total num frames: 10424320. Throughput: 0: 5650.0. Samples: 1603492. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:40:42,245][24717] Avg episode reward: [(0, '27.655')] [2023-02-22 21:40:42,663][37090] Updated weights for policy 0, policy_version 2548 (0.0008) [2023-02-22 21:40:44,452][37090] Updated weights for policy 0, policy_version 2558 (0.0008) [2023-02-22 21:40:46,374][37090] Updated weights for policy 0, policy_version 2568 (0.0007) [2023-02-22 21:40:47,244][24717] Fps is (10 sec: 22118.3, 60 sec: 22323.2, 300 sec: 22513.9). Total num frames: 10534912. Throughput: 0: 5673.5. Samples: 1620496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:40:47,246][24717] Avg episode reward: [(0, '29.679')] [2023-02-22 21:40:48,220][37090] Updated weights for policy 0, policy_version 2578 (0.0009) [2023-02-22 21:40:50,008][37090] Updated weights for policy 0, policy_version 2588 (0.0006) [2023-02-22 21:40:51,795][37090] Updated weights for policy 0, policy_version 2598 (0.0006) [2023-02-22 21:40:52,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22459.8, 300 sec: 22521.1). Total num frames: 10649600. Throughput: 0: 5704.9. Samples: 1653760. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:40:52,245][24717] Avg episode reward: [(0, '27.111')] [2023-02-22 21:40:53,541][37090] Updated weights for policy 0, policy_version 2608 (0.0007) [2023-02-22 21:40:55,316][37090] Updated weights for policy 0, policy_version 2618 (0.0007) [2023-02-22 21:40:57,074][37090] Updated weights for policy 0, policy_version 2628 (0.0007) [2023-02-22 21:40:57,244][24717] Fps is (10 sec: 22937.8, 60 sec: 22596.3, 300 sec: 22895.9). Total num frames: 10764288. Throughput: 0: 5733.3. Samples: 1688606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:40:57,245][24717] Avg episode reward: [(0, '29.312')] [2023-02-22 21:40:58,851][37090] Updated weights for policy 0, policy_version 2638 (0.0007) [2023-02-22 21:41:00,626][37090] Updated weights for policy 0, policy_version 2648 (0.0007) [2023-02-22 21:41:02,244][24717] Fps is (10 sec: 23347.2, 60 sec: 22801.1, 300 sec: 22909.8). Total num frames: 10883072. Throughput: 0: 5732.5. Samples: 1706168. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:41:02,249][24717] Avg episode reward: [(0, '27.535')] [2023-02-22 21:41:02,361][37090] Updated weights for policy 0, policy_version 2658 (0.0007) [2023-02-22 21:41:04,165][37090] Updated weights for policy 0, policy_version 2668 (0.0009) [2023-02-22 21:41:06,050][37090] Updated weights for policy 0, policy_version 2678 (0.0008) [2023-02-22 21:41:07,244][24717] Fps is (10 sec: 22937.6, 60 sec: 22869.4, 300 sec: 22868.2). Total num frames: 10993664. Throughput: 0: 5696.8. Samples: 1740244. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:41:07,245][24717] Avg episode reward: [(0, '30.581')] [2023-02-22 21:41:07,856][37090] Updated weights for policy 0, policy_version 2688 (0.0007) [2023-02-22 21:41:09,614][37090] Updated weights for policy 0, policy_version 2698 (0.0007) [2023-02-22 21:41:11,250][37090] Updated weights for policy 0, policy_version 2708 (0.0007) [2023-02-22 21:41:12,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23005.9, 300 sec: 22840.4). Total num frames: 11112448. Throughput: 0: 5706.8. Samples: 1775468. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:41:12,246][24717] Avg episode reward: [(0, '24.777')] [2023-02-22 21:41:12,955][37090] Updated weights for policy 0, policy_version 2718 (0.0006) [2023-02-22 21:41:14,708][37090] Updated weights for policy 0, policy_version 2728 (0.0007) [2023-02-22 21:41:16,370][37090] Updated weights for policy 0, policy_version 2738 (0.0008) [2023-02-22 21:41:17,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23074.1, 300 sec: 22854.4). Total num frames: 11235328. Throughput: 0: 5715.8. Samples: 1793170. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:41:17,245][24717] Avg episode reward: [(0, '29.624')] [2023-02-22 21:41:18,040][37090] Updated weights for policy 0, policy_version 2748 (0.0006) [2023-02-22 21:41:19,705][37090] Updated weights for policy 0, policy_version 2758 (0.0007) [2023-02-22 21:41:21,325][37090] Updated weights for policy 0, policy_version 2768 (0.0006) [2023-02-22 21:41:22,244][24717] Fps is (10 sec: 24576.0, 60 sec: 23142.4, 300 sec: 22854.3). Total num frames: 11358208. Throughput: 0: 5790.7. Samples: 1830140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:41:22,245][24717] Avg episode reward: [(0, '29.753')] [2023-02-22 21:41:23,091][37090] Updated weights for policy 0, policy_version 2778 (0.0008) [2023-02-22 21:41:24,837][37090] Updated weights for policy 0, policy_version 2788 (0.0008) [2023-02-22 21:41:26,509][37090] Updated weights for policy 0, policy_version 2798 (0.0007) [2023-02-22 21:41:27,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23142.4, 300 sec: 22868.3). Total num frames: 11476992. Throughput: 0: 5832.0. Samples: 1865932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:41:27,245][24717] Avg episode reward: [(0, '29.617')] [2023-02-22 21:41:28,213][37090] Updated weights for policy 0, policy_version 2808 (0.0009) [2023-02-22 21:41:29,951][37090] Updated weights for policy 0, policy_version 2818 (0.0009) [2023-02-22 21:41:31,669][37090] Updated weights for policy 0, policy_version 2828 (0.0008) [2023-02-22 21:41:32,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23278.9, 300 sec: 22868.2). Total num frames: 11595776. Throughput: 0: 5853.2. Samples: 1883890. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:41:32,246][24717] Avg episode reward: [(0, '30.919')] [2023-02-22 21:41:33,376][37090] Updated weights for policy 0, policy_version 2838 (0.0007) [2023-02-22 21:41:35,123][37090] Updated weights for policy 0, policy_version 2848 (0.0007) [2023-02-22 21:41:36,889][37090] Updated weights for policy 0, policy_version 2858 (0.0006) [2023-02-22 21:41:37,244][24717] Fps is (10 sec: 23756.6, 60 sec: 23347.2, 300 sec: 22868.2). Total num frames: 11714560. Throughput: 0: 5900.0. Samples: 1919262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 21:41:37,246][24717] Avg episode reward: [(0, '27.486')] [2023-02-22 21:41:38,669][37090] Updated weights for policy 0, policy_version 2868 (0.0009) [2023-02-22 21:41:40,403][37090] Updated weights for policy 0, policy_version 2878 (0.0007) [2023-02-22 21:41:42,133][37090] Updated weights for policy 0, policy_version 2888 (0.0007) [2023-02-22 21:41:42,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23415.5, 300 sec: 22840.4). Total num frames: 11829248. Throughput: 0: 5907.0. Samples: 1954420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:41:42,245][24717] Avg episode reward: [(0, '27.919')] [2023-02-22 21:41:43,887][37090] Updated weights for policy 0, policy_version 2898 (0.0006) [2023-02-22 21:41:45,599][37090] Updated weights for policy 0, policy_version 2908 (0.0006) [2023-02-22 21:41:47,244][24717] Fps is (10 sec: 23347.3, 60 sec: 23552.0, 300 sec: 22840.4). Total num frames: 11948032. Throughput: 0: 5910.6. Samples: 1972144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:41:47,246][24717] Avg episode reward: [(0, '29.504')] [2023-02-22 21:41:47,251][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000002917_11948032.pth... [2023-02-22 21:41:47,301][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000001593_6524928.pth [2023-02-22 21:41:47,346][37090] Updated weights for policy 0, policy_version 2918 (0.0006) [2023-02-22 21:41:49,050][37090] Updated weights for policy 0, policy_version 2928 (0.0007) [2023-02-22 21:41:50,764][37090] Updated weights for policy 0, policy_version 2938 (0.0007) [2023-02-22 21:41:52,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23620.3, 300 sec: 22840.4). Total num frames: 12066816. Throughput: 0: 5947.9. Samples: 2007898. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:41:52,245][24717] Avg episode reward: [(0, '29.961')] [2023-02-22 21:41:52,454][37090] Updated weights for policy 0, policy_version 2948 (0.0006) [2023-02-22 21:41:54,229][37090] Updated weights for policy 0, policy_version 2958 (0.0008) [2023-02-22 21:41:55,936][37090] Updated weights for policy 0, policy_version 2968 (0.0009) [2023-02-22 21:41:57,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23688.5, 300 sec: 22854.3). Total num frames: 12185600. Throughput: 0: 5957.7. Samples: 2043564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:41:57,246][24717] Avg episode reward: [(0, '29.759')] [2023-02-22 21:41:57,627][37090] Updated weights for policy 0, policy_version 2978 (0.0006) [2023-02-22 21:41:59,334][37090] Updated weights for policy 0, policy_version 2988 (0.0007) [2023-02-22 21:42:01,119][37090] Updated weights for policy 0, policy_version 2998 (0.0009) [2023-02-22 21:42:02,244][24717] Fps is (10 sec: 23756.4, 60 sec: 23688.5, 300 sec: 22854.3). Total num frames: 12304384. Throughput: 0: 5964.0. Samples: 2061552. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:42:02,247][24717] Avg episode reward: [(0, '28.041')] [2023-02-22 21:42:02,834][37090] Updated weights for policy 0, policy_version 3008 (0.0009) [2023-02-22 21:42:04,585][37090] Updated weights for policy 0, policy_version 3018 (0.0007) [2023-02-22 21:42:06,405][37090] Updated weights for policy 0, policy_version 3028 (0.0007) [2023-02-22 21:42:07,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23756.8, 300 sec: 22840.4). Total num frames: 12419072. Throughput: 0: 5916.2. Samples: 2096370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:42:07,247][24717] Avg episode reward: [(0, '33.405')] [2023-02-22 21:42:08,141][37090] Updated weights for policy 0, policy_version 3038 (0.0006) [2023-02-22 21:42:09,882][37090] Updated weights for policy 0, policy_version 3048 (0.0009) [2023-02-22 21:42:11,585][37090] Updated weights for policy 0, policy_version 3058 (0.0006) [2023-02-22 21:42:12,244][24717] Fps is (10 sec: 23757.2, 60 sec: 23825.1, 300 sec: 22826.5). Total num frames: 12541952. Throughput: 0: 5910.1. Samples: 2131888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:42:12,245][24717] Avg episode reward: [(0, '28.672')] [2023-02-22 21:42:13,269][37090] Updated weights for policy 0, policy_version 3068 (0.0006) [2023-02-22 21:42:14,920][37090] Updated weights for policy 0, policy_version 3078 (0.0007) [2023-02-22 21:42:16,574][37090] Updated weights for policy 0, policy_version 3088 (0.0007) [2023-02-22 21:42:17,244][24717] Fps is (10 sec: 24576.1, 60 sec: 23825.1, 300 sec: 22826.5). Total num frames: 12664832. Throughput: 0: 5918.7. Samples: 2150230. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:42:17,245][24717] Avg episode reward: [(0, '29.226')] [2023-02-22 21:42:18,276][37090] Updated weights for policy 0, policy_version 3098 (0.0006) [2023-02-22 21:42:19,927][37090] Updated weights for policy 0, policy_version 3108 (0.0007) [2023-02-22 21:42:21,760][37090] Updated weights for policy 0, policy_version 3118 (0.0009) [2023-02-22 21:42:22,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23688.5, 300 sec: 22826.5). Total num frames: 12779520. Throughput: 0: 5940.9. Samples: 2186604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:42:22,245][24717] Avg episode reward: [(0, '28.209')] [2023-02-22 21:42:23,524][37090] Updated weights for policy 0, policy_version 3128 (0.0009) [2023-02-22 21:42:25,226][37090] Updated weights for policy 0, policy_version 3138 (0.0007) [2023-02-22 21:42:26,989][37090] Updated weights for policy 0, policy_version 3148 (0.0008) [2023-02-22 21:42:27,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23688.5, 300 sec: 22826.5). Total num frames: 12898304. Throughput: 0: 5937.7. Samples: 2221616. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:42:27,246][24717] Avg episode reward: [(0, '29.555')] [2023-02-22 21:42:28,797][37090] Updated weights for policy 0, policy_version 3158 (0.0008) [2023-02-22 21:42:30,640][37090] Updated weights for policy 0, policy_version 3168 (0.0009) [2023-02-22 21:42:32,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23552.0, 300 sec: 22798.8). Total num frames: 13008896. Throughput: 0: 5920.5. Samples: 2238566. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:42:32,245][24717] Avg episode reward: [(0, '26.220')] [2023-02-22 21:42:32,477][37090] Updated weights for policy 0, policy_version 3178 (0.0009) [2023-02-22 21:42:34,418][37090] Updated weights for policy 0, policy_version 3188 (0.0007) [2023-02-22 21:42:36,389][37090] Updated weights for policy 0, policy_version 3198 (0.0009) [2023-02-22 21:42:37,244][24717] Fps is (10 sec: 21708.8, 60 sec: 23347.2, 300 sec: 22757.1). Total num frames: 13115392. Throughput: 0: 5839.6. Samples: 2270682. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:42:37,245][24717] Avg episode reward: [(0, '31.145')] [2023-02-22 21:42:38,247][37090] Updated weights for policy 0, policy_version 3208 (0.0010) [2023-02-22 21:42:40,088][37090] Updated weights for policy 0, policy_version 3218 (0.0011) [2023-02-22 21:42:41,913][37090] Updated weights for policy 0, policy_version 3228 (0.0007) [2023-02-22 21:42:42,244][24717] Fps is (10 sec: 21708.8, 60 sec: 23278.9, 300 sec: 22715.4). Total num frames: 13225984. Throughput: 0: 5783.5. Samples: 2303822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:42:42,245][24717] Avg episode reward: [(0, '31.793')] [2023-02-22 21:42:43,820][37090] Updated weights for policy 0, policy_version 3238 (0.0009) [2023-02-22 21:42:45,636][37090] Updated weights for policy 0, policy_version 3248 (0.0007) [2023-02-22 21:42:47,244][24717] Fps is (10 sec: 22528.0, 60 sec: 23210.7, 300 sec: 22687.7). Total num frames: 13340672. Throughput: 0: 5746.5. Samples: 2320144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:42:47,245][24717] Avg episode reward: [(0, '31.939')] [2023-02-22 21:42:47,382][37090] Updated weights for policy 0, policy_version 3258 (0.0008) [2023-02-22 21:42:49,105][37090] Updated weights for policy 0, policy_version 3268 (0.0008) [2023-02-22 21:42:50,804][37090] Updated weights for policy 0, policy_version 3278 (0.0007) [2023-02-22 21:42:52,244][24717] Fps is (10 sec: 23346.9, 60 sec: 23210.6, 300 sec: 22687.7). Total num frames: 13459456. Throughput: 0: 5765.6. Samples: 2355824. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:42:52,247][24717] Avg episode reward: [(0, '34.782')] [2023-02-22 21:42:52,248][37076] Saving new best policy, reward=34.782! [2023-02-22 21:42:52,540][37090] Updated weights for policy 0, policy_version 3288 (0.0008) [2023-02-22 21:42:54,287][37090] Updated weights for policy 0, policy_version 3298 (0.0006) [2023-02-22 21:42:55,956][37090] Updated weights for policy 0, policy_version 3308 (0.0007) [2023-02-22 21:42:57,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23210.7, 300 sec: 22701.6). Total num frames: 13578240. Throughput: 0: 5773.8. Samples: 2391710. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:42:57,245][24717] Avg episode reward: [(0, '31.893')] [2023-02-22 21:42:57,605][37090] Updated weights for policy 0, policy_version 3318 (0.0006) [2023-02-22 21:42:59,319][37090] Updated weights for policy 0, policy_version 3328 (0.0009) [2023-02-22 21:43:01,083][37090] Updated weights for policy 0, policy_version 3338 (0.0007) [2023-02-22 21:43:02,244][24717] Fps is (10 sec: 23757.0, 60 sec: 23210.7, 300 sec: 22701.6). Total num frames: 13697024. Throughput: 0: 5766.3. Samples: 2409714. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:43:02,245][24717] Avg episode reward: [(0, '29.135')] [2023-02-22 21:43:02,836][37090] Updated weights for policy 0, policy_version 3348 (0.0009) [2023-02-22 21:43:04,520][37090] Updated weights for policy 0, policy_version 3358 (0.0006) [2023-02-22 21:43:06,333][37090] Updated weights for policy 0, policy_version 3368 (0.0007) [2023-02-22 21:43:07,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23279.0, 300 sec: 22673.8). Total num frames: 13815808. Throughput: 0: 5737.8. Samples: 2444806. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:43:07,245][24717] Avg episode reward: [(0, '32.198')] [2023-02-22 21:43:08,060][37090] Updated weights for policy 0, policy_version 3378 (0.0007) [2023-02-22 21:43:09,805][37090] Updated weights for policy 0, policy_version 3388 (0.0007) [2023-02-22 21:43:11,517][37090] Updated weights for policy 0, policy_version 3398 (0.0007) [2023-02-22 21:43:12,244][24717] Fps is (10 sec: 23756.9, 60 sec: 23210.7, 300 sec: 22659.9). Total num frames: 13934592. Throughput: 0: 5752.6. Samples: 2480484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:43:12,246][24717] Avg episode reward: [(0, '34.694')] [2023-02-22 21:43:13,170][37090] Updated weights for policy 0, policy_version 3408 (0.0007) [2023-02-22 21:43:14,830][37090] Updated weights for policy 0, policy_version 3418 (0.0006) [2023-02-22 21:43:16,453][37090] Updated weights for policy 0, policy_version 3428 (0.0007) [2023-02-22 21:43:17,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23210.7, 300 sec: 22673.8). Total num frames: 14057472. Throughput: 0: 5789.1. Samples: 2499074. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:43:17,245][24717] Avg episode reward: [(0, '29.529')] [2023-02-22 21:43:18,110][37090] Updated weights for policy 0, policy_version 3438 (0.0006) [2023-02-22 21:43:19,780][37090] Updated weights for policy 0, policy_version 3448 (0.0007) [2023-02-22 21:43:21,468][37090] Updated weights for policy 0, policy_version 3458 (0.0006) [2023-02-22 21:43:22,244][24717] Fps is (10 sec: 24576.0, 60 sec: 23347.2, 300 sec: 22701.6). Total num frames: 14180352. Throughput: 0: 5899.1. Samples: 2536142. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:43:22,246][24717] Avg episode reward: [(0, '33.344')] [2023-02-22 21:43:23,211][37090] Updated weights for policy 0, policy_version 3468 (0.0008) [2023-02-22 21:43:24,999][37090] Updated weights for policy 0, policy_version 3478 (0.0007) [2023-02-22 21:43:26,701][37090] Updated weights for policy 0, policy_version 3488 (0.0008) [2023-02-22 21:43:27,244][24717] Fps is (10 sec: 24166.3, 60 sec: 23347.2, 300 sec: 22743.2). Total num frames: 14299136. Throughput: 0: 5946.3. Samples: 2571406. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:43:27,246][24717] Avg episode reward: [(0, '30.518')] [2023-02-22 21:43:28,485][37090] Updated weights for policy 0, policy_version 3498 (0.0007) [2023-02-22 21:43:30,170][37090] Updated weights for policy 0, policy_version 3508 (0.0008) [2023-02-22 21:43:31,982][37090] Updated weights for policy 0, policy_version 3518 (0.0008) [2023-02-22 21:43:32,244][24717] Fps is (10 sec: 23347.3, 60 sec: 23415.5, 300 sec: 22757.1). Total num frames: 14413824. Throughput: 0: 5975.5. Samples: 2589042. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 21:43:32,245][24717] Avg episode reward: [(0, '30.464')] [2023-02-22 21:43:33,685][37090] Updated weights for policy 0, policy_version 3528 (0.0006) [2023-02-22 21:43:35,567][37090] Updated weights for policy 0, policy_version 3538 (0.0007) [2023-02-22 21:43:37,244][24717] Fps is (10 sec: 22937.7, 60 sec: 23552.0, 300 sec: 22784.9). Total num frames: 14528512. Throughput: 0: 5947.5. Samples: 2623460. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:43:37,246][24717] Avg episode reward: [(0, '29.456')] [2023-02-22 21:43:37,326][37090] Updated weights for policy 0, policy_version 3548 (0.0007) [2023-02-22 21:43:39,081][37090] Updated weights for policy 0, policy_version 3558 (0.0007) [2023-02-22 21:43:40,815][37090] Updated weights for policy 0, policy_version 3568 (0.0007) [2023-02-22 21:43:42,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23688.5, 300 sec: 22840.4). Total num frames: 14647296. Throughput: 0: 5930.7. Samples: 2658592. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:43:42,245][24717] Avg episode reward: [(0, '34.081')] [2023-02-22 21:43:42,548][37090] Updated weights for policy 0, policy_version 3578 (0.0009) [2023-02-22 21:43:44,282][37090] Updated weights for policy 0, policy_version 3588 (0.0006) [2023-02-22 21:43:46,055][37090] Updated weights for policy 0, policy_version 3598 (0.0008) [2023-02-22 21:43:47,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23688.5, 300 sec: 22840.4). Total num frames: 14761984. Throughput: 0: 5924.9. Samples: 2676336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:43:47,246][24717] Avg episode reward: [(0, '30.688')] [2023-02-22 21:43:47,252][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000003604_14761984.pth... [2023-02-22 21:43:47,312][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000002245_9195520.pth [2023-02-22 21:43:47,913][37090] Updated weights for policy 0, policy_version 3608 (0.0008) [2023-02-22 21:43:49,618][37090] Updated weights for policy 0, policy_version 3618 (0.0006) [2023-02-22 21:43:51,363][37090] Updated weights for policy 0, policy_version 3628 (0.0007) [2023-02-22 21:43:52,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23620.3, 300 sec: 22854.3). Total num frames: 14876672. Throughput: 0: 5908.7. Samples: 2710696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:43:52,245][24717] Avg episode reward: [(0, '29.969')] [2023-02-22 21:43:53,073][37090] Updated weights for policy 0, policy_version 3638 (0.0009) [2023-02-22 21:43:54,851][37090] Updated weights for policy 0, policy_version 3648 (0.0007) [2023-02-22 21:43:56,547][37090] Updated weights for policy 0, policy_version 3658 (0.0006) [2023-02-22 21:43:57,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23620.3, 300 sec: 22882.1). Total num frames: 14995456. Throughput: 0: 5907.1. Samples: 2746304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:43:57,245][24717] Avg episode reward: [(0, '30.404')] [2023-02-22 21:43:58,273][37090] Updated weights for policy 0, policy_version 3668 (0.0006) [2023-02-22 21:44:00,045][37090] Updated weights for policy 0, policy_version 3678 (0.0007) [2023-02-22 21:44:01,893][37090] Updated weights for policy 0, policy_version 3688 (0.0007) [2023-02-22 21:44:02,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23552.0, 300 sec: 22882.1). Total num frames: 15110144. Throughput: 0: 5889.0. Samples: 2764080. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:02,246][24717] Avg episode reward: [(0, '30.644')] [2023-02-22 21:44:03,682][37090] Updated weights for policy 0, policy_version 3698 (0.0008) [2023-02-22 21:44:05,534][37090] Updated weights for policy 0, policy_version 3708 (0.0006) [2023-02-22 21:44:07,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23483.7, 300 sec: 22882.1). Total num frames: 15224832. Throughput: 0: 5811.7. Samples: 2797668. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:07,245][24717] Avg episode reward: [(0, '32.530')] [2023-02-22 21:44:07,314][37090] Updated weights for policy 0, policy_version 3718 (0.0007) [2023-02-22 21:44:09,104][37090] Updated weights for policy 0, policy_version 3728 (0.0009) [2023-02-22 21:44:10,811][37090] Updated weights for policy 0, policy_version 3738 (0.0008) [2023-02-22 21:44:12,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23483.7, 300 sec: 22896.0). Total num frames: 15343616. Throughput: 0: 5807.4. Samples: 2832738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:44:12,245][24717] Avg episode reward: [(0, '33.090')] [2023-02-22 21:44:12,500][37090] Updated weights for policy 0, policy_version 3748 (0.0008) [2023-02-22 21:44:14,209][37090] Updated weights for policy 0, policy_version 3758 (0.0007) [2023-02-22 21:44:15,851][37090] Updated weights for policy 0, policy_version 3768 (0.0008) [2023-02-22 21:44:17,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23483.7, 300 sec: 22951.5). Total num frames: 15466496. Throughput: 0: 5820.1. Samples: 2850948. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:17,245][24717] Avg episode reward: [(0, '32.261')] [2023-02-22 21:44:17,508][37090] Updated weights for policy 0, policy_version 3778 (0.0007) [2023-02-22 21:44:19,133][37090] Updated weights for policy 0, policy_version 3788 (0.0009) [2023-02-22 21:44:20,801][37090] Updated weights for policy 0, policy_version 3798 (0.0008) [2023-02-22 21:44:22,244][24717] Fps is (10 sec: 24576.0, 60 sec: 23483.7, 300 sec: 23034.8). Total num frames: 15589376. Throughput: 0: 5886.3. Samples: 2888344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:44:22,246][24717] Avg episode reward: [(0, '32.146')] [2023-02-22 21:44:22,462][37090] Updated weights for policy 0, policy_version 3808 (0.0007) [2023-02-22 21:44:24,236][37090] Updated weights for policy 0, policy_version 3818 (0.0007) [2023-02-22 21:44:25,972][37090] Updated weights for policy 0, policy_version 3828 (0.0007) [2023-02-22 21:44:27,244][24717] Fps is (10 sec: 24166.0, 60 sec: 23483.7, 300 sec: 23090.3). Total num frames: 15708160. Throughput: 0: 5892.7. Samples: 2923766. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:27,246][24717] Avg episode reward: [(0, '32.395')] [2023-02-22 21:44:27,751][37090] Updated weights for policy 0, policy_version 3838 (0.0009) [2023-02-22 21:44:29,486][37090] Updated weights for policy 0, policy_version 3848 (0.0007) [2023-02-22 21:44:31,215][37090] Updated weights for policy 0, policy_version 3858 (0.0007) [2023-02-22 21:44:32,244][24717] Fps is (10 sec: 23347.2, 60 sec: 23483.7, 300 sec: 23159.8). Total num frames: 15822848. Throughput: 0: 5890.7. Samples: 2941416. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:44:32,245][24717] Avg episode reward: [(0, '33.336')] [2023-02-22 21:44:33,013][37090] Updated weights for policy 0, policy_version 3868 (0.0008) [2023-02-22 21:44:34,841][37090] Updated weights for policy 0, policy_version 3878 (0.0007) [2023-02-22 21:44:36,731][37090] Updated weights for policy 0, policy_version 3888 (0.0008) [2023-02-22 21:44:37,244][24717] Fps is (10 sec: 22528.3, 60 sec: 23415.5, 300 sec: 23201.4). Total num frames: 15933440. Throughput: 0: 5882.2. Samples: 2975394. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:37,246][24717] Avg episode reward: [(0, '32.868')] [2023-02-22 21:44:38,486][37090] Updated weights for policy 0, policy_version 3898 (0.0007) [2023-02-22 21:44:40,309][37090] Updated weights for policy 0, policy_version 3908 (0.0006) [2023-02-22 21:44:42,080][37090] Updated weights for policy 0, policy_version 3918 (0.0007) [2023-02-22 21:44:42,244][24717] Fps is (10 sec: 22937.4, 60 sec: 23415.4, 300 sec: 23243.1). Total num frames: 16052224. Throughput: 0: 5849.4. Samples: 3009526. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 21:44:42,246][24717] Avg episode reward: [(0, '33.703')] [2023-02-22 21:44:43,837][37090] Updated weights for policy 0, policy_version 3928 (0.0006) [2023-02-22 21:44:45,605][37090] Updated weights for policy 0, policy_version 3938 (0.0009) [2023-02-22 21:44:47,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23415.5, 300 sec: 23270.8). Total num frames: 16166912. Throughput: 0: 5843.5. Samples: 3027038. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:44:47,246][24717] Avg episode reward: [(0, '29.202')] [2023-02-22 21:44:47,285][37090] Updated weights for policy 0, policy_version 3948 (0.0006) [2023-02-22 21:44:48,985][37090] Updated weights for policy 0, policy_version 3958 (0.0006) [2023-02-22 21:44:50,702][37090] Updated weights for policy 0, policy_version 3968 (0.0008) [2023-02-22 21:44:52,244][24717] Fps is (10 sec: 23347.4, 60 sec: 23483.7, 300 sec: 23312.5). Total num frames: 16285696. Throughput: 0: 5897.1. Samples: 3063036. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:44:52,245][24717] Avg episode reward: [(0, '35.020')] [2023-02-22 21:44:52,247][37076] Saving new best policy, reward=35.020! [2023-02-22 21:44:52,445][37090] Updated weights for policy 0, policy_version 3978 (0.0007) [2023-02-22 21:44:54,299][37090] Updated weights for policy 0, policy_version 3988 (0.0007) [2023-02-22 21:44:56,035][37090] Updated weights for policy 0, policy_version 3998 (0.0008) [2023-02-22 21:44:57,244][24717] Fps is (10 sec: 23756.9, 60 sec: 23483.7, 300 sec: 23354.1). Total num frames: 16404480. Throughput: 0: 5889.1. Samples: 3097748. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:44:57,245][24717] Avg episode reward: [(0, '35.818')] [2023-02-22 21:44:57,250][37076] Saving new best policy, reward=35.818! [2023-02-22 21:44:57,782][37090] Updated weights for policy 0, policy_version 4008 (0.0006) [2023-02-22 21:44:59,479][37090] Updated weights for policy 0, policy_version 4018 (0.0007) [2023-02-22 21:45:01,238][37090] Updated weights for policy 0, policy_version 4028 (0.0008) [2023-02-22 21:45:02,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23483.7, 300 sec: 23381.9). Total num frames: 16519168. Throughput: 0: 5878.7. Samples: 3115488. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:45:02,245][24717] Avg episode reward: [(0, '32.874')] [2023-02-22 21:45:02,983][37090] Updated weights for policy 0, policy_version 4038 (0.0008) [2023-02-22 21:45:04,678][37090] Updated weights for policy 0, policy_version 4048 (0.0007) [2023-02-22 21:45:06,461][37090] Updated weights for policy 0, policy_version 4058 (0.0008) [2023-02-22 21:45:07,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23552.0, 300 sec: 23409.7). Total num frames: 16637952. Throughput: 0: 5829.4. Samples: 3150668. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:45:07,245][24717] Avg episode reward: [(0, '34.412')] [2023-02-22 21:45:08,223][37090] Updated weights for policy 0, policy_version 4068 (0.0008) [2023-02-22 21:45:09,952][37090] Updated weights for policy 0, policy_version 4078 (0.0009) [2023-02-22 21:45:11,648][37090] Updated weights for policy 0, policy_version 4088 (0.0006) [2023-02-22 21:45:12,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23552.0, 300 sec: 23409.7). Total num frames: 16756736. Throughput: 0: 5832.4. Samples: 3186222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:45:12,245][24717] Avg episode reward: [(0, '30.671')] [2023-02-22 21:45:13,316][37090] Updated weights for policy 0, policy_version 4098 (0.0008) [2023-02-22 21:45:14,981][37090] Updated weights for policy 0, policy_version 4108 (0.0007) [2023-02-22 21:45:16,680][37090] Updated weights for policy 0, policy_version 4118 (0.0008) [2023-02-22 21:45:17,244][24717] Fps is (10 sec: 24166.3, 60 sec: 23552.0, 300 sec: 23423.6). Total num frames: 16879616. Throughput: 0: 5848.8. Samples: 3204612. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:45:17,245][24717] Avg episode reward: [(0, '29.507')] [2023-02-22 21:45:18,365][37090] Updated weights for policy 0, policy_version 4128 (0.0006) [2023-02-22 21:45:20,012][37090] Updated weights for policy 0, policy_version 4138 (0.0008) [2023-02-22 21:45:21,656][37090] Updated weights for policy 0, policy_version 4148 (0.0006) [2023-02-22 21:45:22,244][24717] Fps is (10 sec: 24576.0, 60 sec: 23552.0, 300 sec: 23437.5). Total num frames: 17002496. Throughput: 0: 5906.4. Samples: 3241184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:45:22,245][24717] Avg episode reward: [(0, '28.206')] [2023-02-22 21:45:23,316][37090] Updated weights for policy 0, policy_version 4158 (0.0006) [2023-02-22 21:45:25,040][37090] Updated weights for policy 0, policy_version 4168 (0.0007) [2023-02-22 21:45:26,717][37090] Updated weights for policy 0, policy_version 4178 (0.0006) [2023-02-22 21:45:27,244][24717] Fps is (10 sec: 24576.1, 60 sec: 23620.3, 300 sec: 23479.1). Total num frames: 17125376. Throughput: 0: 5962.2. Samples: 3277826. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:45:27,245][24717] Avg episode reward: [(0, '27.928')] [2023-02-22 21:45:28,431][37090] Updated weights for policy 0, policy_version 4188 (0.0007) [2023-02-22 21:45:30,146][37090] Updated weights for policy 0, policy_version 4198 (0.0006) [2023-02-22 21:45:31,849][37090] Updated weights for policy 0, policy_version 4208 (0.0007) [2023-02-22 21:45:32,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23688.5, 300 sec: 23493.0). Total num frames: 17244160. Throughput: 0: 5976.3. Samples: 3295972. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:45:32,246][24717] Avg episode reward: [(0, '32.893')] [2023-02-22 21:45:33,540][37090] Updated weights for policy 0, policy_version 4218 (0.0007) [2023-02-22 21:45:35,278][37090] Updated weights for policy 0, policy_version 4228 (0.0006) [2023-02-22 21:45:37,039][37090] Updated weights for policy 0, policy_version 4238 (0.0008) [2023-02-22 21:45:37,244][24717] Fps is (10 sec: 23756.7, 60 sec: 23825.1, 300 sec: 23520.8). Total num frames: 17362944. Throughput: 0: 5958.7. Samples: 3331176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:45:37,245][24717] Avg episode reward: [(0, '33.336')] [2023-02-22 21:45:38,829][37090] Updated weights for policy 0, policy_version 4248 (0.0008) [2023-02-22 21:45:40,653][37090] Updated weights for policy 0, policy_version 4258 (0.0006) [2023-02-22 21:45:42,244][24717] Fps is (10 sec: 22937.2, 60 sec: 23688.5, 300 sec: 23520.8). Total num frames: 17473536. Throughput: 0: 5960.1. Samples: 3365952. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:45:42,245][24717] Avg episode reward: [(0, '32.648')] [2023-02-22 21:45:42,453][37090] Updated weights for policy 0, policy_version 4268 (0.0008) [2023-02-22 21:45:44,192][37090] Updated weights for policy 0, policy_version 4278 (0.0007) [2023-02-22 21:45:45,914][37090] Updated weights for policy 0, policy_version 4288 (0.0007) [2023-02-22 21:45:47,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23756.8, 300 sec: 23534.6). Total num frames: 17592320. Throughput: 0: 5952.6. Samples: 3383354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:45:47,245][24717] Avg episode reward: [(0, '30.249')] [2023-02-22 21:45:47,251][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000004295_17592320.pth... [2023-02-22 21:45:47,304][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000002917_11948032.pth [2023-02-22 21:45:47,674][37090] Updated weights for policy 0, policy_version 4298 (0.0006) [2023-02-22 21:45:49,455][37090] Updated weights for policy 0, policy_version 4308 (0.0010) [2023-02-22 21:45:51,249][37090] Updated weights for policy 0, policy_version 4318 (0.0007) [2023-02-22 21:45:52,244][24717] Fps is (10 sec: 23347.3, 60 sec: 23688.5, 300 sec: 23534.6). Total num frames: 17707008. Throughput: 0: 5947.2. Samples: 3418294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:45:52,245][24717] Avg episode reward: [(0, '31.271')] [2023-02-22 21:45:52,948][37090] Updated weights for policy 0, policy_version 4328 (0.0007) [2023-02-22 21:45:54,639][37090] Updated weights for policy 0, policy_version 4338 (0.0007) [2023-02-22 21:45:56,312][37090] Updated weights for policy 0, policy_version 4348 (0.0006) [2023-02-22 21:45:57,244][24717] Fps is (10 sec: 23756.7, 60 sec: 23756.8, 300 sec: 23548.5). Total num frames: 17829888. Throughput: 0: 5960.2. Samples: 3454432. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:45:57,246][24717] Avg episode reward: [(0, '31.587')] [2023-02-22 21:45:58,042][37090] Updated weights for policy 0, policy_version 4358 (0.0007) [2023-02-22 21:45:59,681][37090] Updated weights for policy 0, policy_version 4368 (0.0007) [2023-02-22 21:46:01,463][37090] Updated weights for policy 0, policy_version 4378 (0.0007) [2023-02-22 21:46:02,244][24717] Fps is (10 sec: 24166.7, 60 sec: 23825.1, 300 sec: 23576.3). Total num frames: 17948672. Throughput: 0: 5955.3. Samples: 3472602. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:46:02,245][24717] Avg episode reward: [(0, '31.087')] [2023-02-22 21:46:03,132][37090] Updated weights for policy 0, policy_version 4388 (0.0007) [2023-02-22 21:46:04,848][37090] Updated weights for policy 0, policy_version 4398 (0.0006) [2023-02-22 21:46:06,588][37090] Updated weights for policy 0, policy_version 4408 (0.0008) [2023-02-22 21:46:07,244][24717] Fps is (10 sec: 23757.0, 60 sec: 23825.1, 300 sec: 23576.3). Total num frames: 18067456. Throughput: 0: 5935.5. Samples: 3508280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:46:07,245][24717] Avg episode reward: [(0, '31.142')] [2023-02-22 21:46:08,248][37090] Updated weights for policy 0, policy_version 4418 (0.0008) [2023-02-22 21:46:09,978][37090] Updated weights for policy 0, policy_version 4428 (0.0006) [2023-02-22 21:46:11,628][37090] Updated weights for policy 0, policy_version 4438 (0.0008) [2023-02-22 21:46:12,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23893.3, 300 sec: 23576.3). Total num frames: 18190336. Throughput: 0: 5931.7. Samples: 3544752. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:46:12,248][24717] Avg episode reward: [(0, '31.574')] [2023-02-22 21:46:13,268][37090] Updated weights for policy 0, policy_version 4448 (0.0008) [2023-02-22 21:46:14,934][37090] Updated weights for policy 0, policy_version 4458 (0.0007) [2023-02-22 21:46:16,582][37090] Updated weights for policy 0, policy_version 4468 (0.0007) [2023-02-22 21:46:17,244][24717] Fps is (10 sec: 24986.0, 60 sec: 23961.7, 300 sec: 23590.2). Total num frames: 18317312. Throughput: 0: 5942.8. Samples: 3563398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:46:17,245][24717] Avg episode reward: [(0, '31.565')] [2023-02-22 21:46:18,266][37090] Updated weights for policy 0, policy_version 4478 (0.0006) [2023-02-22 21:46:19,939][37090] Updated weights for policy 0, policy_version 4488 (0.0006) [2023-02-22 21:46:21,651][37090] Updated weights for policy 0, policy_version 4498 (0.0007) [2023-02-22 21:46:22,244][24717] Fps is (10 sec: 24576.0, 60 sec: 23893.3, 300 sec: 23590.2). Total num frames: 18436096. Throughput: 0: 5972.3. Samples: 3599930. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:46:22,246][24717] Avg episode reward: [(0, '36.527')] [2023-02-22 21:46:22,247][37076] Saving new best policy, reward=36.527! [2023-02-22 21:46:23,382][37090] Updated weights for policy 0, policy_version 4508 (0.0007) [2023-02-22 21:46:25,145][37090] Updated weights for policy 0, policy_version 4518 (0.0009) [2023-02-22 21:46:26,813][37090] Updated weights for policy 0, policy_version 4528 (0.0006) [2023-02-22 21:46:27,244][24717] Fps is (10 sec: 23756.4, 60 sec: 23825.0, 300 sec: 23590.2). Total num frames: 18554880. Throughput: 0: 5994.9. Samples: 3635720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:46:27,246][24717] Avg episode reward: [(0, '29.675')] [2023-02-22 21:46:28,577][37090] Updated weights for policy 0, policy_version 4538 (0.0009) [2023-02-22 21:46:30,287][37090] Updated weights for policy 0, policy_version 4548 (0.0008) [2023-02-22 21:46:31,956][37090] Updated weights for policy 0, policy_version 4558 (0.0006) [2023-02-22 21:46:32,244][24717] Fps is (10 sec: 23756.7, 60 sec: 23825.1, 300 sec: 23590.2). Total num frames: 18673664. Throughput: 0: 6005.8. Samples: 3653614. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:46:32,245][24717] Avg episode reward: [(0, '33.243')] [2023-02-22 21:46:33,661][37090] Updated weights for policy 0, policy_version 4568 (0.0007) [2023-02-22 21:46:35,320][37090] Updated weights for policy 0, policy_version 4578 (0.0006) [2023-02-22 21:46:37,078][37090] Updated weights for policy 0, policy_version 4588 (0.0008) [2023-02-22 21:46:37,244][24717] Fps is (10 sec: 23756.9, 60 sec: 23825.1, 300 sec: 23604.1). Total num frames: 18792448. Throughput: 0: 6028.4. Samples: 3689570. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 21:46:37,246][24717] Avg episode reward: [(0, '31.311')] [2023-02-22 21:46:38,844][37090] Updated weights for policy 0, policy_version 4598 (0.0011) [2023-02-22 21:46:40,507][37090] Updated weights for policy 0, policy_version 4608 (0.0008) [2023-02-22 21:46:42,219][37090] Updated weights for policy 0, policy_version 4618 (0.0008) [2023-02-22 21:46:42,244][24717] Fps is (10 sec: 24166.4, 60 sec: 24029.9, 300 sec: 23618.0). Total num frames: 18915328. Throughput: 0: 6023.1. Samples: 3725470. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:46:42,246][24717] Avg episode reward: [(0, '29.411')] [2023-02-22 21:46:43,907][37090] Updated weights for policy 0, policy_version 4628 (0.0008) [2023-02-22 21:46:45,616][37090] Updated weights for policy 0, policy_version 4638 (0.0007) [2023-02-22 21:46:47,244][24717] Fps is (10 sec: 24166.4, 60 sec: 24029.9, 300 sec: 23618.0). Total num frames: 19034112. Throughput: 0: 6018.2. Samples: 3743422. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:46:47,245][24717] Avg episode reward: [(0, '29.349')] [2023-02-22 21:46:47,296][37090] Updated weights for policy 0, policy_version 4648 (0.0007) [2023-02-22 21:46:48,996][37090] Updated weights for policy 0, policy_version 4658 (0.0007) [2023-02-22 21:46:50,718][37090] Updated weights for policy 0, policy_version 4668 (0.0006) [2023-02-22 21:46:52,244][24717] Fps is (10 sec: 24166.3, 60 sec: 24166.4, 300 sec: 23631.8). Total num frames: 19156992. Throughput: 0: 6032.7. Samples: 3779754. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:46:52,245][24717] Avg episode reward: [(0, '32.512')] [2023-02-22 21:46:52,430][37090] Updated weights for policy 0, policy_version 4678 (0.0007) [2023-02-22 21:46:54,082][37090] Updated weights for policy 0, policy_version 4688 (0.0008) [2023-02-22 21:46:55,822][37090] Updated weights for policy 0, policy_version 4698 (0.0007) [2023-02-22 21:46:57,244][24717] Fps is (10 sec: 24166.1, 60 sec: 24098.1, 300 sec: 23631.8). Total num frames: 19275776. Throughput: 0: 6023.5. Samples: 3815810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 21:46:57,245][24717] Avg episode reward: [(0, '34.294')] [2023-02-22 21:46:57,492][37090] Updated weights for policy 0, policy_version 4708 (0.0008) [2023-02-22 21:46:59,215][37090] Updated weights for policy 0, policy_version 4718 (0.0007) [2023-02-22 21:47:00,931][37090] Updated weights for policy 0, policy_version 4728 (0.0007) [2023-02-22 21:47:02,244][24717] Fps is (10 sec: 23757.0, 60 sec: 24098.1, 300 sec: 23645.7). Total num frames: 19394560. Throughput: 0: 6009.1. Samples: 3833808. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:02,245][24717] Avg episode reward: [(0, '36.072')] [2023-02-22 21:47:02,612][37090] Updated weights for policy 0, policy_version 4738 (0.0008) [2023-02-22 21:47:04,380][37090] Updated weights for policy 0, policy_version 4748 (0.0008) [2023-02-22 21:47:06,195][37090] Updated weights for policy 0, policy_version 4758 (0.0008) [2023-02-22 21:47:07,244][24717] Fps is (10 sec: 23756.7, 60 sec: 24098.1, 300 sec: 23631.8). Total num frames: 19513344. Throughput: 0: 5981.6. Samples: 3869102. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:07,246][24717] Avg episode reward: [(0, '33.176')] [2023-02-22 21:47:07,910][37090] Updated weights for policy 0, policy_version 4768 (0.0006) [2023-02-22 21:47:09,603][37090] Updated weights for policy 0, policy_version 4778 (0.0007) [2023-02-22 21:47:11,242][37090] Updated weights for policy 0, policy_version 4788 (0.0006) [2023-02-22 21:47:12,244][24717] Fps is (10 sec: 24166.4, 60 sec: 24098.1, 300 sec: 23631.8). Total num frames: 19636224. Throughput: 0: 5995.9. Samples: 3905534. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:12,245][24717] Avg episode reward: [(0, '34.559')] [2023-02-22 21:47:12,849][37090] Updated weights for policy 0, policy_version 4798 (0.0009) [2023-02-22 21:47:14,481][37090] Updated weights for policy 0, policy_version 4808 (0.0007) [2023-02-22 21:47:16,107][37090] Updated weights for policy 0, policy_version 4818 (0.0007) [2023-02-22 21:47:17,244][24717] Fps is (10 sec: 24576.4, 60 sec: 24029.8, 300 sec: 23659.6). Total num frames: 19759104. Throughput: 0: 6022.4. Samples: 3924622. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:47:17,245][24717] Avg episode reward: [(0, '37.397')] [2023-02-22 21:47:17,252][37076] Saving new best policy, reward=37.397! [2023-02-22 21:47:17,750][37090] Updated weights for policy 0, policy_version 4828 (0.0007) [2023-02-22 21:47:19,387][37090] Updated weights for policy 0, policy_version 4838 (0.0009) [2023-02-22 21:47:21,029][37090] Updated weights for policy 0, policy_version 4848 (0.0007) [2023-02-22 21:47:22,244][24717] Fps is (10 sec: 24985.6, 60 sec: 24166.4, 300 sec: 23687.4). Total num frames: 19886080. Throughput: 0: 6055.8. Samples: 3962080. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:22,246][24717] Avg episode reward: [(0, '33.229')] [2023-02-22 21:47:22,748][37090] Updated weights for policy 0, policy_version 4858 (0.0007) [2023-02-22 21:47:24,468][37090] Updated weights for policy 0, policy_version 4868 (0.0008) [2023-02-22 21:47:26,154][37090] Updated weights for policy 0, policy_version 4878 (0.0009) [2023-02-22 21:47:27,244][24717] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 23715.1). Total num frames: 20004864. Throughput: 0: 6059.5. Samples: 3998148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:47:27,246][24717] Avg episode reward: [(0, '31.405')] [2023-02-22 21:47:27,850][37090] Updated weights for policy 0, policy_version 4888 (0.0007) [2023-02-22 21:47:29,505][37090] Updated weights for policy 0, policy_version 4898 (0.0008) [2023-02-22 21:47:31,187][37090] Updated weights for policy 0, policy_version 4908 (0.0008) [2023-02-22 21:47:32,244][24717] Fps is (10 sec: 24166.2, 60 sec: 24234.6, 300 sec: 23770.7). Total num frames: 20127744. Throughput: 0: 6068.8. Samples: 4016520. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:32,246][24717] Avg episode reward: [(0, '33.718')] [2023-02-22 21:47:32,869][37090] Updated weights for policy 0, policy_version 4918 (0.0007) [2023-02-22 21:47:34,544][37090] Updated weights for policy 0, policy_version 4928 (0.0008) [2023-02-22 21:47:36,247][37090] Updated weights for policy 0, policy_version 4938 (0.0007) [2023-02-22 21:47:37,244][24717] Fps is (10 sec: 24166.4, 60 sec: 24234.7, 300 sec: 23798.5). Total num frames: 20246528. Throughput: 0: 6075.8. Samples: 4053166. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 21:47:37,246][24717] Avg episode reward: [(0, '32.848')] [2023-02-22 21:47:37,925][37090] Updated weights for policy 0, policy_version 4948 (0.0007) [2023-02-22 21:47:39,734][37090] Updated weights for policy 0, policy_version 4958 (0.0008) [2023-02-22 21:47:41,562][37090] Updated weights for policy 0, policy_version 4968 (0.0008) [2023-02-22 21:47:42,244][24717] Fps is (10 sec: 23347.4, 60 sec: 24098.1, 300 sec: 23798.5). Total num frames: 20361216. Throughput: 0: 6040.2. Samples: 4087618. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:42,246][24717] Avg episode reward: [(0, '31.145')] [2023-02-22 21:47:43,448][37090] Updated weights for policy 0, policy_version 4978 (0.0007) [2023-02-22 21:47:45,344][37090] Updated weights for policy 0, policy_version 4988 (0.0009) [2023-02-22 21:47:47,190][37090] Updated weights for policy 0, policy_version 4998 (0.0007) [2023-02-22 21:47:47,244][24717] Fps is (10 sec: 22527.9, 60 sec: 23961.6, 300 sec: 23770.7). Total num frames: 20471808. Throughput: 0: 6001.5. Samples: 4103874. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:47,246][24717] Avg episode reward: [(0, '33.785')] [2023-02-22 21:47:47,252][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000004998_20471808.pth... [2023-02-22 21:47:47,304][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000003604_14761984.pth [2023-02-22 21:47:48,998][37090] Updated weights for policy 0, policy_version 5008 (0.0009) [2023-02-22 21:47:50,698][37090] Updated weights for policy 0, policy_version 5018 (0.0006) [2023-02-22 21:47:52,244][24717] Fps is (10 sec: 22937.5, 60 sec: 23893.3, 300 sec: 23770.7). Total num frames: 20590592. Throughput: 0: 5982.3. Samples: 4138304. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:47:52,246][24717] Avg episode reward: [(0, '32.499')] [2023-02-22 21:47:52,364][37090] Updated weights for policy 0, policy_version 5028 (0.0008) [2023-02-22 21:47:54,073][37090] Updated weights for policy 0, policy_version 5038 (0.0008) [2023-02-22 21:47:55,790][37090] Updated weights for policy 0, policy_version 5048 (0.0009) [2023-02-22 21:47:57,244][24717] Fps is (10 sec: 23756.9, 60 sec: 23893.4, 300 sec: 23770.7). Total num frames: 20709376. Throughput: 0: 5978.6. Samples: 4174570. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:47:57,246][24717] Avg episode reward: [(0, '32.433')] [2023-02-22 21:47:57,451][37090] Updated weights for policy 0, policy_version 5058 (0.0008) [2023-02-22 21:47:59,179][37090] Updated weights for policy 0, policy_version 5068 (0.0007) [2023-02-22 21:48:00,852][37090] Updated weights for policy 0, policy_version 5078 (0.0008) [2023-02-22 21:48:02,244][24717] Fps is (10 sec: 24166.5, 60 sec: 23961.6, 300 sec: 23784.6). Total num frames: 20832256. Throughput: 0: 5955.3. Samples: 4192612. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:02,245][24717] Avg episode reward: [(0, '35.511')] [2023-02-22 21:48:02,530][37090] Updated weights for policy 0, policy_version 5088 (0.0007) [2023-02-22 21:48:04,245][37090] Updated weights for policy 0, policy_version 5098 (0.0007) [2023-02-22 21:48:05,951][37090] Updated weights for policy 0, policy_version 5108 (0.0006) [2023-02-22 21:48:07,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23961.7, 300 sec: 23784.6). Total num frames: 20951040. Throughput: 0: 5928.0. Samples: 4228842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:07,245][24717] Avg episode reward: [(0, '33.585')] [2023-02-22 21:48:07,637][37090] Updated weights for policy 0, policy_version 5118 (0.0007) [2023-02-22 21:48:09,301][37090] Updated weights for policy 0, policy_version 5128 (0.0007) [2023-02-22 21:48:10,950][37090] Updated weights for policy 0, policy_version 5138 (0.0009) [2023-02-22 21:48:12,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23961.6, 300 sec: 23784.6). Total num frames: 21073920. Throughput: 0: 5948.5. Samples: 4265830. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:48:12,246][24717] Avg episode reward: [(0, '29.845')] [2023-02-22 21:48:12,598][37090] Updated weights for policy 0, policy_version 5148 (0.0007) [2023-02-22 21:48:14,188][37090] Updated weights for policy 0, policy_version 5158 (0.0008) [2023-02-22 21:48:15,769][37090] Updated weights for policy 0, policy_version 5168 (0.0006) [2023-02-22 21:48:17,245][24717] Fps is (10 sec: 25394.7, 60 sec: 24098.1, 300 sec: 23812.3). Total num frames: 21204992. Throughput: 0: 5964.0. Samples: 4284902. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:17,245][24717] Avg episode reward: [(0, '29.943')] [2023-02-22 21:48:17,360][37090] Updated weights for policy 0, policy_version 5178 (0.0007) [2023-02-22 21:48:18,969][37090] Updated weights for policy 0, policy_version 5188 (0.0008) [2023-02-22 21:48:20,646][37090] Updated weights for policy 0, policy_version 5198 (0.0007) [2023-02-22 21:48:22,244][24717] Fps is (10 sec: 25395.2, 60 sec: 24029.9, 300 sec: 23826.2). Total num frames: 21327872. Throughput: 0: 5990.5. Samples: 4322738. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:22,245][24717] Avg episode reward: [(0, '31.515')] [2023-02-22 21:48:22,412][37090] Updated weights for policy 0, policy_version 5208 (0.0007) [2023-02-22 21:48:24,133][37090] Updated weights for policy 0, policy_version 5218 (0.0008) [2023-02-22 21:48:25,847][37090] Updated weights for policy 0, policy_version 5228 (0.0007) [2023-02-22 21:48:27,244][24717] Fps is (10 sec: 24166.9, 60 sec: 24029.9, 300 sec: 23840.1). Total num frames: 21446656. Throughput: 0: 6012.5. Samples: 4358180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:48:27,246][24717] Avg episode reward: [(0, '33.253')] [2023-02-22 21:48:27,595][37090] Updated weights for policy 0, policy_version 5238 (0.0007) [2023-02-22 21:48:29,242][37090] Updated weights for policy 0, policy_version 5248 (0.0007) [2023-02-22 21:48:30,950][37090] Updated weights for policy 0, policy_version 5258 (0.0009) [2023-02-22 21:48:32,244][24717] Fps is (10 sec: 23756.8, 60 sec: 23961.6, 300 sec: 23854.0). Total num frames: 21565440. Throughput: 0: 6054.1. Samples: 4376310. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:32,246][24717] Avg episode reward: [(0, '32.805')] [2023-02-22 21:48:32,615][37090] Updated weights for policy 0, policy_version 5268 (0.0009) [2023-02-22 21:48:34,286][37090] Updated weights for policy 0, policy_version 5278 (0.0008) [2023-02-22 21:48:36,028][37090] Updated weights for policy 0, policy_version 5288 (0.0008) [2023-02-22 21:48:37,244][24717] Fps is (10 sec: 24166.4, 60 sec: 24029.9, 300 sec: 23867.9). Total num frames: 21688320. Throughput: 0: 6103.9. Samples: 4412978. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:48:37,245][24717] Avg episode reward: [(0, '32.532')] [2023-02-22 21:48:37,686][37090] Updated weights for policy 0, policy_version 5298 (0.0008) [2023-02-22 21:48:39,363][37090] Updated weights for policy 0, policy_version 5308 (0.0007) [2023-02-22 21:48:40,994][37090] Updated weights for policy 0, policy_version 5318 (0.0007) [2023-02-22 21:48:42,244][24717] Fps is (10 sec: 24576.0, 60 sec: 24166.4, 300 sec: 23895.6). Total num frames: 21811200. Throughput: 0: 6118.0. Samples: 4449880. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:42,245][24717] Avg episode reward: [(0, '31.019')] [2023-02-22 21:48:42,637][37090] Updated weights for policy 0, policy_version 5328 (0.0009) [2023-02-22 21:48:44,313][37090] Updated weights for policy 0, policy_version 5338 (0.0007) [2023-02-22 21:48:45,978][37090] Updated weights for policy 0, policy_version 5348 (0.0008) [2023-02-22 21:48:47,244][24717] Fps is (10 sec: 24575.5, 60 sec: 24371.1, 300 sec: 23923.4). Total num frames: 21934080. Throughput: 0: 6125.4. Samples: 4468258. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:47,246][24717] Avg episode reward: [(0, '35.713')] [2023-02-22 21:48:47,608][37090] Updated weights for policy 0, policy_version 5358 (0.0008) [2023-02-22 21:48:49,297][37090] Updated weights for policy 0, policy_version 5368 (0.0006) [2023-02-22 21:48:50,931][37090] Updated weights for policy 0, policy_version 5378 (0.0006) [2023-02-22 21:48:52,244][24717] Fps is (10 sec: 24575.9, 60 sec: 24439.5, 300 sec: 23937.3). Total num frames: 22056960. Throughput: 0: 6143.9. Samples: 4505316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:48:52,245][24717] Avg episode reward: [(0, '33.994')] [2023-02-22 21:48:52,604][37090] Updated weights for policy 0, policy_version 5388 (0.0008) [2023-02-22 21:48:54,265][37090] Updated weights for policy 0, policy_version 5398 (0.0006) [2023-02-22 21:48:56,028][37090] Updated weights for policy 0, policy_version 5408 (0.0008) [2023-02-22 21:48:57,244][24717] Fps is (10 sec: 24166.8, 60 sec: 24439.5, 300 sec: 23951.2). Total num frames: 22175744. Throughput: 0: 6121.9. Samples: 4541316. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:48:57,245][24717] Avg episode reward: [(0, '34.886')] [2023-02-22 21:48:57,838][37090] Updated weights for policy 0, policy_version 5418 (0.0007) [2023-02-22 21:48:59,637][37090] Updated weights for policy 0, policy_version 5428 (0.0007) [2023-02-22 21:49:01,442][37090] Updated weights for policy 0, policy_version 5438 (0.0007) [2023-02-22 21:49:02,244][24717] Fps is (10 sec: 23347.3, 60 sec: 24302.9, 300 sec: 23951.2). Total num frames: 22290432. Throughput: 0: 6077.3. Samples: 4558380. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:49:02,245][24717] Avg episode reward: [(0, '32.700')] [2023-02-22 21:49:03,330][37090] Updated weights for policy 0, policy_version 5448 (0.0008) [2023-02-22 21:49:05,092][37090] Updated weights for policy 0, policy_version 5458 (0.0007) [2023-02-22 21:49:06,933][37090] Updated weights for policy 0, policy_version 5468 (0.0008) [2023-02-22 21:49:07,244][24717] Fps is (10 sec: 22528.0, 60 sec: 24166.4, 300 sec: 23923.4). Total num frames: 22401024. Throughput: 0: 5990.8. Samples: 4592324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:07,245][24717] Avg episode reward: [(0, '31.272')] [2023-02-22 21:49:08,701][37090] Updated weights for policy 0, policy_version 5478 (0.0008) [2023-02-22 21:49:10,495][37090] Updated weights for policy 0, policy_version 5488 (0.0008) [2023-02-22 21:49:12,190][37090] Updated weights for policy 0, policy_version 5498 (0.0007) [2023-02-22 21:49:12,244][24717] Fps is (10 sec: 22937.6, 60 sec: 24098.1, 300 sec: 23909.5). Total num frames: 22519808. Throughput: 0: 5965.6. Samples: 4626634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:49:12,245][24717] Avg episode reward: [(0, '35.978')] [2023-02-22 21:49:13,849][37090] Updated weights for policy 0, policy_version 5508 (0.0008) [2023-02-22 21:49:15,519][37090] Updated weights for policy 0, policy_version 5518 (0.0007) [2023-02-22 21:49:17,191][37090] Updated weights for policy 0, policy_version 5528 (0.0007) [2023-02-22 21:49:17,244][24717] Fps is (10 sec: 24166.4, 60 sec: 23961.7, 300 sec: 23909.5). Total num frames: 22642688. Throughput: 0: 5971.2. Samples: 4645014. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:17,245][24717] Avg episode reward: [(0, '34.798')] [2023-02-22 21:49:18,829][37090] Updated weights for policy 0, policy_version 5538 (0.0006) [2023-02-22 21:49:20,615][37090] Updated weights for policy 0, policy_version 5548 (0.0007) [2023-02-22 21:49:22,244][24717] Fps is (10 sec: 24166.5, 60 sec: 23893.3, 300 sec: 23909.5). Total num frames: 22761472. Throughput: 0: 5966.7. Samples: 4681478. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:22,245][24717] Avg episode reward: [(0, '33.505')] [2023-02-22 21:49:22,382][37090] Updated weights for policy 0, policy_version 5558 (0.0006) [2023-02-22 21:49:24,149][37090] Updated weights for policy 0, policy_version 5568 (0.0009) [2023-02-22 21:49:25,914][37090] Updated weights for policy 0, policy_version 5578 (0.0008) [2023-02-22 21:49:27,244][24717] Fps is (10 sec: 23347.1, 60 sec: 23825.0, 300 sec: 23909.5). Total num frames: 22876160. Throughput: 0: 5911.5. Samples: 4715896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:27,246][24717] Avg episode reward: [(0, '34.393')] [2023-02-22 21:49:27,758][37090] Updated weights for policy 0, policy_version 5588 (0.0009) [2023-02-22 21:49:29,515][37090] Updated weights for policy 0, policy_version 5598 (0.0008) [2023-02-22 21:49:31,252][37090] Updated weights for policy 0, policy_version 5608 (0.0009) [2023-02-22 21:49:32,244][24717] Fps is (10 sec: 22937.3, 60 sec: 23756.8, 300 sec: 23923.4). Total num frames: 22990848. Throughput: 0: 5886.5. Samples: 4733150. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:49:32,246][24717] Avg episode reward: [(0, '33.922')] [2023-02-22 21:49:33,084][37090] Updated weights for policy 0, policy_version 5618 (0.0007) [2023-02-22 21:49:34,945][37090] Updated weights for policy 0, policy_version 5628 (0.0009) [2023-02-22 21:49:36,758][37090] Updated weights for policy 0, policy_version 5638 (0.0007) [2023-02-22 21:49:37,244][24717] Fps is (10 sec: 22528.2, 60 sec: 23552.0, 300 sec: 23895.7). Total num frames: 23101440. Throughput: 0: 5816.7. Samples: 4767066. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:37,246][24717] Avg episode reward: [(0, '38.366')] [2023-02-22 21:49:37,252][37076] Saving new best policy, reward=38.366! [2023-02-22 21:49:38,593][37090] Updated weights for policy 0, policy_version 5648 (0.0007) [2023-02-22 21:49:40,398][37090] Updated weights for policy 0, policy_version 5658 (0.0006) [2023-02-22 21:49:42,244][37090] Updated weights for policy 0, policy_version 5668 (0.0007) [2023-02-22 21:49:42,244][24717] Fps is (10 sec: 22527.9, 60 sec: 23415.4, 300 sec: 23895.6). Total num frames: 23216128. Throughput: 0: 5765.5. Samples: 4800762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:49:42,245][24717] Avg episode reward: [(0, '38.838')] [2023-02-22 21:49:42,247][37076] Saving new best policy, reward=38.838! [2023-02-22 21:49:44,037][37090] Updated weights for policy 0, policy_version 5678 (0.0007) [2023-02-22 21:49:45,852][37090] Updated weights for policy 0, policy_version 5688 (0.0007) [2023-02-22 21:49:47,244][24717] Fps is (10 sec: 22527.9, 60 sec: 23210.7, 300 sec: 23867.9). Total num frames: 23326720. Throughput: 0: 5760.5. Samples: 4817602. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:47,245][24717] Avg episode reward: [(0, '36.303')] [2023-02-22 21:49:47,264][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000005696_23330816.pth... [2023-02-22 21:49:47,316][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000004295_17592320.pth [2023-02-22 21:49:47,647][37090] Updated weights for policy 0, policy_version 5698 (0.0008) [2023-02-22 21:49:49,408][37090] Updated weights for policy 0, policy_version 5708 (0.0007) [2023-02-22 21:49:51,162][37090] Updated weights for policy 0, policy_version 5718 (0.0007) [2023-02-22 21:49:52,244][24717] Fps is (10 sec: 22937.8, 60 sec: 23142.4, 300 sec: 23867.9). Total num frames: 23445504. Throughput: 0: 5774.3. Samples: 4852168. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:49:52,245][24717] Avg episode reward: [(0, '32.662')] [2023-02-22 21:49:52,960][37090] Updated weights for policy 0, policy_version 5728 (0.0008) [2023-02-22 21:49:54,789][37090] Updated weights for policy 0, policy_version 5738 (0.0009) [2023-02-22 21:49:56,540][37090] Updated weights for policy 0, policy_version 5748 (0.0009) [2023-02-22 21:49:57,244][24717] Fps is (10 sec: 22937.6, 60 sec: 23005.9, 300 sec: 23854.0). Total num frames: 23556096. Throughput: 0: 5773.2. Samples: 4886426. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:49:57,249][24717] Avg episode reward: [(0, '36.159')] [2023-02-22 21:49:58,355][37090] Updated weights for policy 0, policy_version 5758 (0.0008) [2023-02-22 21:50:00,134][37090] Updated weights for policy 0, policy_version 5768 (0.0007) [2023-02-22 21:50:01,938][37090] Updated weights for policy 0, policy_version 5778 (0.0008) [2023-02-22 21:50:02,244][24717] Fps is (10 sec: 22528.0, 60 sec: 23005.9, 300 sec: 23840.1). Total num frames: 23670784. Throughput: 0: 5746.2. Samples: 4903594. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:02,246][24717] Avg episode reward: [(0, '35.128')] [2023-02-22 21:50:03,694][37090] Updated weights for policy 0, policy_version 5788 (0.0007) [2023-02-22 21:50:05,486][37090] Updated weights for policy 0, policy_version 5798 (0.0009) [2023-02-22 21:50:07,244][24717] Fps is (10 sec: 22937.7, 60 sec: 23074.1, 300 sec: 23826.2). Total num frames: 23785472. Throughput: 0: 5703.4. Samples: 4938130. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:07,247][24717] Avg episode reward: [(0, '34.357')] [2023-02-22 21:50:07,329][37090] Updated weights for policy 0, policy_version 5808 (0.0006) [2023-02-22 21:50:09,139][37090] Updated weights for policy 0, policy_version 5818 (0.0010) [2023-02-22 21:50:10,952][37090] Updated weights for policy 0, policy_version 5828 (0.0007) [2023-02-22 21:50:12,244][24717] Fps is (10 sec: 22937.7, 60 sec: 23005.9, 300 sec: 23798.5). Total num frames: 23900160. Throughput: 0: 5689.0. Samples: 4971900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:12,246][24717] Avg episode reward: [(0, '33.547')] [2023-02-22 21:50:12,743][37090] Updated weights for policy 0, policy_version 5838 (0.0007) [2023-02-22 21:50:14,425][37090] Updated weights for policy 0, policy_version 5848 (0.0006) [2023-02-22 21:50:16,157][37090] Updated weights for policy 0, policy_version 5858 (0.0008) [2023-02-22 21:50:17,244][24717] Fps is (10 sec: 23347.1, 60 sec: 22937.6, 300 sec: 23784.6). Total num frames: 24018944. Throughput: 0: 5699.2. Samples: 4989612. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:17,245][24717] Avg episode reward: [(0, '33.276')] [2023-02-22 21:50:17,836][37090] Updated weights for policy 0, policy_version 5868 (0.0006) [2023-02-22 21:50:19,520][37090] Updated weights for policy 0, policy_version 5878 (0.0007) [2023-02-22 21:50:21,285][37090] Updated weights for policy 0, policy_version 5888 (0.0007) [2023-02-22 21:50:22,244][24717] Fps is (10 sec: 23756.7, 60 sec: 22937.6, 300 sec: 23770.7). Total num frames: 24137728. Throughput: 0: 5748.4. Samples: 5025746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:50:22,245][24717] Avg episode reward: [(0, '34.917')] [2023-02-22 21:50:23,049][37090] Updated weights for policy 0, policy_version 5898 (0.0008) [2023-02-22 21:50:24,902][37090] Updated weights for policy 0, policy_version 5908 (0.0008) [2023-02-22 21:50:26,702][37090] Updated weights for policy 0, policy_version 5918 (0.0009) [2023-02-22 21:50:27,244][24717] Fps is (10 sec: 22937.7, 60 sec: 22869.4, 300 sec: 23742.9). Total num frames: 24248320. Throughput: 0: 5753.5. Samples: 5059668. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:50:27,245][24717] Avg episode reward: [(0, '35.374')] [2023-02-22 21:50:28,533][37090] Updated weights for policy 0, policy_version 5928 (0.0007) [2023-02-22 21:50:30,304][37090] Updated weights for policy 0, policy_version 5938 (0.0008) [2023-02-22 21:50:32,094][37090] Updated weights for policy 0, policy_version 5948 (0.0007) [2023-02-22 21:50:32,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22869.4, 300 sec: 23729.0). Total num frames: 24363008. Throughput: 0: 5757.7. Samples: 5076700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:50:32,245][24717] Avg episode reward: [(0, '34.994')] [2023-02-22 21:50:33,909][37090] Updated weights for policy 0, policy_version 5958 (0.0008) [2023-02-22 21:50:35,689][37090] Updated weights for policy 0, policy_version 5968 (0.0007) [2023-02-22 21:50:37,244][24717] Fps is (10 sec: 22937.6, 60 sec: 22937.6, 300 sec: 23742.9). Total num frames: 24477696. Throughput: 0: 5751.3. Samples: 5110978. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:50:37,246][24717] Avg episode reward: [(0, '30.797')] [2023-02-22 21:50:37,524][37090] Updated weights for policy 0, policy_version 5978 (0.0009) [2023-02-22 21:50:39,314][37090] Updated weights for policy 0, policy_version 5988 (0.0009) [2023-02-22 21:50:41,088][37090] Updated weights for policy 0, policy_version 5998 (0.0007) [2023-02-22 21:50:42,244][24717] Fps is (10 sec: 22937.6, 60 sec: 22937.7, 300 sec: 23729.0). Total num frames: 24592384. Throughput: 0: 5747.7. Samples: 5145070. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:42,245][24717] Avg episode reward: [(0, '30.243')] [2023-02-22 21:50:42,884][37090] Updated weights for policy 0, policy_version 6008 (0.0009) [2023-02-22 21:50:44,780][37090] Updated weights for policy 0, policy_version 6018 (0.0007) [2023-02-22 21:50:46,682][37090] Updated weights for policy 0, policy_version 6028 (0.0008) [2023-02-22 21:50:47,244][24717] Fps is (10 sec: 22118.3, 60 sec: 22869.3, 300 sec: 23701.3). Total num frames: 24698880. Throughput: 0: 5732.1. Samples: 5161540. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:50:47,246][24717] Avg episode reward: [(0, '32.176')] [2023-02-22 21:50:48,845][37090] Updated weights for policy 0, policy_version 6038 (0.0008) [2023-02-22 21:50:50,866][37090] Updated weights for policy 0, policy_version 6048 (0.0009) [2023-02-22 21:50:52,245][24717] Fps is (10 sec: 20889.1, 60 sec: 22596.2, 300 sec: 23631.8). Total num frames: 24801280. Throughput: 0: 5638.5. Samples: 5191864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:50:52,246][24717] Avg episode reward: [(0, '33.712')] [2023-02-22 21:50:52,729][37090] Updated weights for policy 0, policy_version 6058 (0.0008) [2023-02-22 21:50:54,679][37090] Updated weights for policy 0, policy_version 6068 (0.0007) [2023-02-22 21:50:56,527][37090] Updated weights for policy 0, policy_version 6078 (0.0009) [2023-02-22 21:50:57,244][24717] Fps is (10 sec: 20889.4, 60 sec: 22528.0, 300 sec: 23590.2). Total num frames: 24907776. Throughput: 0: 5613.9. Samples: 5224526. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:50:57,246][24717] Avg episode reward: [(0, '34.385')] [2023-02-22 21:50:58,413][37090] Updated weights for policy 0, policy_version 6088 (0.0007) [2023-02-22 21:51:00,266][37090] Updated weights for policy 0, policy_version 6098 (0.0008) [2023-02-22 21:51:02,148][37090] Updated weights for policy 0, policy_version 6108 (0.0010) [2023-02-22 21:51:02,244][24717] Fps is (10 sec: 21708.9, 60 sec: 22459.7, 300 sec: 23562.4). Total num frames: 25018368. Throughput: 0: 5581.9. Samples: 5240800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:51:02,246][24717] Avg episode reward: [(0, '38.397')] [2023-02-22 21:51:04,062][37090] Updated weights for policy 0, policy_version 6118 (0.0009) [2023-02-22 21:51:05,925][37090] Updated weights for policy 0, policy_version 6128 (0.0008) [2023-02-22 21:51:07,244][24717] Fps is (10 sec: 21709.1, 60 sec: 22323.2, 300 sec: 23506.9). Total num frames: 25124864. Throughput: 0: 5504.5. Samples: 5273450. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:51:07,246][24717] Avg episode reward: [(0, '34.811')] [2023-02-22 21:51:07,862][37090] Updated weights for policy 0, policy_version 6138 (0.0009) [2023-02-22 21:51:09,794][37090] Updated weights for policy 0, policy_version 6148 (0.0008) [2023-02-22 21:51:11,717][37090] Updated weights for policy 0, policy_version 6158 (0.0008) [2023-02-22 21:51:12,244][24717] Fps is (10 sec: 21299.2, 60 sec: 22186.6, 300 sec: 23437.4). Total num frames: 25231360. Throughput: 0: 5457.8. Samples: 5305272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:51:12,247][24717] Avg episode reward: [(0, '38.092')] [2023-02-22 21:51:13,629][37090] Updated weights for policy 0, policy_version 6168 (0.0007) [2023-02-22 21:51:15,368][37090] Updated weights for policy 0, policy_version 6178 (0.0006) [2023-02-22 21:51:17,116][37090] Updated weights for policy 0, policy_version 6188 (0.0007) [2023-02-22 21:51:17,244][24717] Fps is (10 sec: 22118.4, 60 sec: 22118.4, 300 sec: 23423.6). Total num frames: 25346048. Throughput: 0: 5453.9. Samples: 5322124. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-22 21:51:17,245][24717] Avg episode reward: [(0, '37.017')] [2023-02-22 21:51:18,828][37090] Updated weights for policy 0, policy_version 6198 (0.0006) [2023-02-22 21:51:20,600][37090] Updated weights for policy 0, policy_version 6208 (0.0008) [2023-02-22 21:51:22,244][24717] Fps is (10 sec: 22937.8, 60 sec: 22050.1, 300 sec: 23409.7). Total num frames: 25460736. Throughput: 0: 5468.0. Samples: 5357040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:51:22,246][24717] Avg episode reward: [(0, '36.279')] [2023-02-22 21:51:22,510][37090] Updated weights for policy 0, policy_version 6218 (0.0008) [2023-02-22 21:51:24,408][37090] Updated weights for policy 0, policy_version 6228 (0.0007) [2023-02-22 21:51:26,301][37090] Updated weights for policy 0, policy_version 6238 (0.0008) [2023-02-22 21:51:27,244][24717] Fps is (10 sec: 22528.0, 60 sec: 22050.1, 300 sec: 23381.9). Total num frames: 25571328. Throughput: 0: 5432.7. Samples: 5389540. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:51:27,247][24717] Avg episode reward: [(0, '33.523')] [2023-02-22 21:51:28,177][37090] Updated weights for policy 0, policy_version 6248 (0.0011) [2023-02-22 21:51:30,049][37090] Updated weights for policy 0, policy_version 6258 (0.0009) [2023-02-22 21:51:31,929][37090] Updated weights for policy 0, policy_version 6268 (0.0008) [2023-02-22 21:51:32,244][24717] Fps is (10 sec: 21709.0, 60 sec: 21913.6, 300 sec: 23340.3). Total num frames: 25677824. Throughput: 0: 5432.4. Samples: 5405998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:51:32,245][24717] Avg episode reward: [(0, '33.259')] [2023-02-22 21:51:33,798][37090] Updated weights for policy 0, policy_version 6278 (0.0007) [2023-02-22 21:51:35,732][37090] Updated weights for policy 0, policy_version 6288 (0.0008) [2023-02-22 21:51:37,244][24717] Fps is (10 sec: 21299.0, 60 sec: 21777.0, 300 sec: 23284.7). Total num frames: 25784320. Throughput: 0: 5478.1. Samples: 5438380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:51:37,245][24717] Avg episode reward: [(0, '34.549')] [2023-02-22 21:51:37,645][37090] Updated weights for policy 0, policy_version 6298 (0.0007) [2023-02-22 21:51:39,528][37090] Updated weights for policy 0, policy_version 6308 (0.0008) [2023-02-22 21:51:41,398][37090] Updated weights for policy 0, policy_version 6318 (0.0006) [2023-02-22 21:51:42,244][24717] Fps is (10 sec: 21708.8, 60 sec: 21708.8, 300 sec: 23256.9). Total num frames: 25894912. Throughput: 0: 5476.7. Samples: 5470978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:51:42,245][24717] Avg episode reward: [(0, '36.516')] [2023-02-22 21:51:43,229][37090] Updated weights for policy 0, policy_version 6328 (0.0007) [2023-02-22 21:51:45,123][37090] Updated weights for policy 0, policy_version 6338 (0.0009) [2023-02-22 21:51:47,012][37090] Updated weights for policy 0, policy_version 6348 (0.0006) [2023-02-22 21:51:47,244][24717] Fps is (10 sec: 22118.5, 60 sec: 21777.1, 300 sec: 23215.3). Total num frames: 26005504. Throughput: 0: 5483.4. Samples: 5487552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:51:47,246][24717] Avg episode reward: [(0, '33.788')] [2023-02-22 21:51:47,250][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000006349_26005504.pth... [2023-02-22 21:51:47,300][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000004998_20471808.pth [2023-02-22 21:51:48,921][37090] Updated weights for policy 0, policy_version 6358 (0.0007) [2023-02-22 21:51:50,797][37090] Updated weights for policy 0, policy_version 6368 (0.0010) [2023-02-22 21:51:52,244][24717] Fps is (10 sec: 21708.9, 60 sec: 21845.4, 300 sec: 23173.7). Total num frames: 26112000. Throughput: 0: 5479.3. Samples: 5520018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:51:52,245][24717] Avg episode reward: [(0, '33.463')] [2023-02-22 21:51:52,696][37090] Updated weights for policy 0, policy_version 6378 (0.0008) [2023-02-22 21:51:54,577][37090] Updated weights for policy 0, policy_version 6388 (0.0008) [2023-02-22 21:51:56,413][37090] Updated weights for policy 0, policy_version 6398 (0.0008) [2023-02-22 21:51:57,244][24717] Fps is (10 sec: 21708.7, 60 sec: 21913.6, 300 sec: 23145.9). Total num frames: 26222592. Throughput: 0: 5494.6. Samples: 5552530. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:51:57,246][24717] Avg episode reward: [(0, '35.017')] [2023-02-22 21:51:58,339][37090] Updated weights for policy 0, policy_version 6408 (0.0009) [2023-02-22 21:52:00,253][37090] Updated weights for policy 0, policy_version 6418 (0.0007) [2023-02-22 21:52:02,148][37090] Updated weights for policy 0, policy_version 6428 (0.0007) [2023-02-22 21:52:02,244][24717] Fps is (10 sec: 21708.7, 60 sec: 21845.4, 300 sec: 23104.2). Total num frames: 26329088. Throughput: 0: 5482.2. Samples: 5568824. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:02,245][24717] Avg episode reward: [(0, '32.604')] [2023-02-22 21:52:04,058][37090] Updated weights for policy 0, policy_version 6438 (0.0007) [2023-02-22 21:52:05,953][37090] Updated weights for policy 0, policy_version 6448 (0.0008) [2023-02-22 21:52:07,244][24717] Fps is (10 sec: 21299.3, 60 sec: 21845.3, 300 sec: 23048.7). Total num frames: 26435584. Throughput: 0: 5420.1. Samples: 5600942. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:52:07,245][24717] Avg episode reward: [(0, '37.959')] [2023-02-22 21:52:07,881][37090] Updated weights for policy 0, policy_version 6458 (0.0007) [2023-02-22 21:52:09,735][37090] Updated weights for policy 0, policy_version 6468 (0.0007) [2023-02-22 21:52:11,645][37090] Updated weights for policy 0, policy_version 6478 (0.0007) [2023-02-22 21:52:12,244][24717] Fps is (10 sec: 21708.6, 60 sec: 21913.6, 300 sec: 23007.0). Total num frames: 26546176. Throughput: 0: 5420.9. Samples: 5633480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:12,245][24717] Avg episode reward: [(0, '36.452')] [2023-02-22 21:52:13,461][37090] Updated weights for policy 0, policy_version 6488 (0.0009) [2023-02-22 21:52:15,237][37090] Updated weights for policy 0, policy_version 6498 (0.0007) [2023-02-22 21:52:16,991][37090] Updated weights for policy 0, policy_version 6508 (0.0008) [2023-02-22 21:52:17,244][24717] Fps is (10 sec: 22528.0, 60 sec: 21913.6, 300 sec: 22965.4). Total num frames: 26660864. Throughput: 0: 5436.8. Samples: 5650654. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:17,245][24717] Avg episode reward: [(0, '35.487')] [2023-02-22 21:52:18,767][37090] Updated weights for policy 0, policy_version 6518 (0.0006) [2023-02-22 21:52:20,542][37090] Updated weights for policy 0, policy_version 6528 (0.0007) [2023-02-22 21:52:22,244][24717] Fps is (10 sec: 22937.8, 60 sec: 21913.6, 300 sec: 22951.5). Total num frames: 26775552. Throughput: 0: 5486.4. Samples: 5685266. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:22,246][24717] Avg episode reward: [(0, '36.427')] [2023-02-22 21:52:22,449][37090] Updated weights for policy 0, policy_version 6538 (0.0007) [2023-02-22 21:52:24,452][37090] Updated weights for policy 0, policy_version 6548 (0.0007) [2023-02-22 21:52:26,365][37090] Updated weights for policy 0, policy_version 6558 (0.0007) [2023-02-22 21:52:27,244][24717] Fps is (10 sec: 21708.8, 60 sec: 21777.1, 300 sec: 22882.1). Total num frames: 26877952. Throughput: 0: 5463.5. Samples: 5716834. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:52:27,246][24717] Avg episode reward: [(0, '37.701')] [2023-02-22 21:52:28,228][37090] Updated weights for policy 0, policy_version 6568 (0.0007) [2023-02-22 21:52:30,096][37090] Updated weights for policy 0, policy_version 6578 (0.0007) [2023-02-22 21:52:32,080][37090] Updated weights for policy 0, policy_version 6588 (0.0009) [2023-02-22 21:52:32,244][24717] Fps is (10 sec: 20889.3, 60 sec: 21777.0, 300 sec: 22840.4). Total num frames: 26984448. Throughput: 0: 5460.9. Samples: 5733294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:52:32,245][24717] Avg episode reward: [(0, '38.170')] [2023-02-22 21:52:34,091][37090] Updated weights for policy 0, policy_version 6598 (0.0008) [2023-02-22 21:52:36,011][37090] Updated weights for policy 0, policy_version 6608 (0.0008) [2023-02-22 21:52:37,244][24717] Fps is (10 sec: 21299.0, 60 sec: 21777.1, 300 sec: 22812.6). Total num frames: 27090944. Throughput: 0: 5437.8. Samples: 5764720. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:52:37,245][24717] Avg episode reward: [(0, '37.633')] [2023-02-22 21:52:37,969][37090] Updated weights for policy 0, policy_version 6618 (0.0008) [2023-02-22 21:52:39,892][37090] Updated weights for policy 0, policy_version 6628 (0.0009) [2023-02-22 21:52:41,847][37090] Updated weights for policy 0, policy_version 6638 (0.0008) [2023-02-22 21:52:42,244][24717] Fps is (10 sec: 21299.4, 60 sec: 21708.8, 300 sec: 22798.8). Total num frames: 27197440. Throughput: 0: 5418.3. Samples: 5796354. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:52:42,245][24717] Avg episode reward: [(0, '39.632')] [2023-02-22 21:52:42,247][37076] Saving new best policy, reward=39.632! [2023-02-22 21:52:43,828][37090] Updated weights for policy 0, policy_version 6648 (0.0008) [2023-02-22 21:52:45,793][37090] Updated weights for policy 0, policy_version 6658 (0.0007) [2023-02-22 21:52:47,245][24717] Fps is (10 sec: 20889.3, 60 sec: 21572.2, 300 sec: 22743.2). Total num frames: 27299840. Throughput: 0: 5404.2. Samples: 5812014. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:52:47,246][24717] Avg episode reward: [(0, '38.729')] [2023-02-22 21:52:47,763][37090] Updated weights for policy 0, policy_version 6668 (0.0009) [2023-02-22 21:52:49,712][37090] Updated weights for policy 0, policy_version 6678 (0.0007) [2023-02-22 21:52:51,619][37090] Updated weights for policy 0, policy_version 6688 (0.0007) [2023-02-22 21:52:52,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21572.3, 300 sec: 22701.6). Total num frames: 27406336. Throughput: 0: 5387.9. Samples: 5843396. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:52,246][24717] Avg episode reward: [(0, '39.753')] [2023-02-22 21:52:52,247][37076] Saving new best policy, reward=39.753! [2023-02-22 21:52:53,589][37090] Updated weights for policy 0, policy_version 6698 (0.0009) [2023-02-22 21:52:55,485][37090] Updated weights for policy 0, policy_version 6708 (0.0007) [2023-02-22 21:52:57,244][24717] Fps is (10 sec: 21299.6, 60 sec: 21504.0, 300 sec: 22646.0). Total num frames: 27512832. Throughput: 0: 5372.6. Samples: 5875246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:52:57,246][24717] Avg episode reward: [(0, '39.898')] [2023-02-22 21:52:57,253][37076] Saving new best policy, reward=39.898! 
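The entries above show the learner's checkpoint bookkeeping: every couple of minutes it writes a numbered checkpoint (checkpoint_<policy_version>_<env_frames>.pth) and removes the oldest rolling checkpoint, and whenever the reported average episode reward exceeds the previous maximum it additionally logs "Saving new best policy". The sketch below only illustrates that bookkeeping pattern; it is not Sample Factory's actual implementation, and CheckpointKeeper, save_periodic, and maybe_save_best are names invented here for illustration.

```python
# Hypothetical sketch of the checkpoint bookkeeping suggested by the log above:
# rolling checkpoints named checkpoint_<policy_version>_<env_frames>.pth are
# written and the oldest ones pruned, and a separate "best" snapshot is kept
# whenever the average episode reward improves. Illustration only, not
# Sample Factory's real code.
import os
from pathlib import Path

import torch


class CheckpointKeeper:
    def __init__(self, checkpoint_dir: str, keep_last: int = 2):
        self.dir = Path(checkpoint_dir)
        self.dir.mkdir(parents=True, exist_ok=True)
        self.keep_last = keep_last          # how many rolling checkpoints to retain
        self.best_reward = float("-inf")    # best average episode reward seen so far

    def save_periodic(self, model, policy_version: int, env_frames: int):
        """Write a rolling checkpoint and delete older ones beyond keep_last."""
        name = f"checkpoint_{policy_version:09d}_{env_frames}.pth"
        torch.save(model.state_dict(), self.dir / name)
        checkpoints = sorted(self.dir.glob("checkpoint_*.pth"))
        for old in checkpoints[:-self.keep_last]:
            os.remove(old)                  # mirrors the paired "Removing .../checkpoint_*.pth" lines

    def maybe_save_best(self, model, avg_episode_reward: float):
        """Snapshot the policy whenever the reward metric reaches a new maximum."""
        if avg_episode_reward > self.best_reward:
            self.best_reward = avg_episode_reward
            torch.save(model.state_dict(), self.dir / "best.pth")
            print(f"Saving new best policy, reward={avg_episode_reward:.3f}!")
```

Keeping only the last couple of rolling checkpoints, as the paired Saving/Removing lines in the log suggest, bounds disk usage while the separate best-policy snapshot is never pruned.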
[2023-02-22 21:52:57,434][37090] Updated weights for policy 0, policy_version 6718 (0.0007) [2023-02-22 21:52:59,384][37090] Updated weights for policy 0, policy_version 6728 (0.0008) [2023-02-22 21:53:01,401][37090] Updated weights for policy 0, policy_version 6738 (0.0010) [2023-02-22 21:53:02,244][24717] Fps is (10 sec: 20889.4, 60 sec: 21435.7, 300 sec: 22590.5). Total num frames: 27615232. Throughput: 0: 5335.5. Samples: 5890750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:53:02,246][24717] Avg episode reward: [(0, '36.971')] [2023-02-22 21:53:03,264][37090] Updated weights for policy 0, policy_version 6748 (0.0007) [2023-02-22 21:53:05,203][37090] Updated weights for policy 0, policy_version 6758 (0.0008) [2023-02-22 21:53:07,181][37090] Updated weights for policy 0, policy_version 6768 (0.0007) [2023-02-22 21:53:07,244][24717] Fps is (10 sec: 20889.7, 60 sec: 21435.7, 300 sec: 22534.9). Total num frames: 27721728. Throughput: 0: 5267.8. Samples: 5922318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 21:53:07,245][24717] Avg episode reward: [(0, '36.080')] [2023-02-22 21:53:09,176][37090] Updated weights for policy 0, policy_version 6778 (0.0010) [2023-02-22 21:53:11,126][37090] Updated weights for policy 0, policy_version 6788 (0.0008) [2023-02-22 21:53:12,245][24717] Fps is (10 sec: 20888.9, 60 sec: 21299.1, 300 sec: 22437.7). Total num frames: 27824128. Throughput: 0: 5261.5. Samples: 5953604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:53:12,247][24717] Avg episode reward: [(0, '37.634')] [2023-02-22 21:53:13,048][37090] Updated weights for policy 0, policy_version 6798 (0.0008) [2023-02-22 21:53:15,019][37090] Updated weights for policy 0, policy_version 6808 (0.0008) [2023-02-22 21:53:16,799][37090] Updated weights for policy 0, policy_version 6818 (0.0006) [2023-02-22 21:53:17,244][24717] Fps is (10 sec: 21299.1, 60 sec: 21230.9, 300 sec: 22396.1). Total num frames: 27934720. Throughput: 0: 5246.2. Samples: 5969374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:53:17,245][24717] Avg episode reward: [(0, '37.782')] [2023-02-22 21:53:18,577][37090] Updated weights for policy 0, policy_version 6828 (0.0006) [2023-02-22 21:53:20,328][37090] Updated weights for policy 0, policy_version 6838 (0.0006) [2023-02-22 21:53:22,244][24717] Fps is (10 sec: 22119.4, 60 sec: 21162.7, 300 sec: 22368.3). Total num frames: 28045312. Throughput: 0: 5310.5. Samples: 6003694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:53:22,245][24717] Avg episode reward: [(0, '36.550')] [2023-02-22 21:53:22,294][37090] Updated weights for policy 0, policy_version 6848 (0.0008) [2023-02-22 21:53:24,288][37090] Updated weights for policy 0, policy_version 6858 (0.0008) [2023-02-22 21:53:26,279][37090] Updated weights for policy 0, policy_version 6868 (0.0007) [2023-02-22 21:53:27,244][24717] Fps is (10 sec: 21708.9, 60 sec: 21230.9, 300 sec: 22326.7). Total num frames: 28151808. Throughput: 0: 5299.6. Samples: 6034836. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:53:27,245][24717] Avg episode reward: [(0, '33.535')] [2023-02-22 21:53:28,188][37090] Updated weights for policy 0, policy_version 6878 (0.0009) [2023-02-22 21:53:30,143][37090] Updated weights for policy 0, policy_version 6888 (0.0007) [2023-02-22 21:53:32,091][37090] Updated weights for policy 0, policy_version 6898 (0.0011) [2023-02-22 21:53:32,245][24717] Fps is (10 sec: 20888.9, 60 sec: 21162.6, 300 sec: 22257.2). Total num frames: 28254208. 
Throughput: 0: 5302.3. Samples: 6050616. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 21:53:32,246][24717] Avg episode reward: [(0, '34.724')] [2023-02-22 21:53:34,114][37090] Updated weights for policy 0, policy_version 6908 (0.0009) [2023-02-22 21:53:36,070][37090] Updated weights for policy 0, policy_version 6918 (0.0008) [2023-02-22 21:53:37,245][24717] Fps is (10 sec: 20479.5, 60 sec: 21094.3, 300 sec: 22187.8). Total num frames: 28356608. Throughput: 0: 5295.3. Samples: 6081688. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:53:37,246][24717] Avg episode reward: [(0, '37.462')] [2023-02-22 21:53:38,050][37090] Updated weights for policy 0, policy_version 6928 (0.0008) [2023-02-22 21:53:40,171][37090] Updated weights for policy 0, policy_version 6938 (0.0009) [2023-02-22 21:53:42,244][24717] Fps is (10 sec: 20070.9, 60 sec: 20957.9, 300 sec: 22104.5). Total num frames: 28454912. Throughput: 0: 5245.7. Samples: 6111304. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:53:42,245][24717] Avg episode reward: [(0, '36.234')] [2023-02-22 21:53:42,324][37090] Updated weights for policy 0, policy_version 6948 (0.0008) [2023-02-22 21:53:44,545][37090] Updated weights for policy 0, policy_version 6958 (0.0012) [2023-02-22 21:53:46,640][37090] Updated weights for policy 0, policy_version 6968 (0.0007) [2023-02-22 21:53:47,244][24717] Fps is (10 sec: 19251.5, 60 sec: 20821.4, 300 sec: 22007.3). Total num frames: 28549120. Throughput: 0: 5216.1. Samples: 6125474. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:53:47,246][24717] Avg episode reward: [(0, '39.423')] [2023-02-22 21:53:47,266][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000006971_28553216.pth... [2023-02-22 21:53:47,321][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000005696_23330816.pth [2023-02-22 21:53:48,828][37090] Updated weights for policy 0, policy_version 6978 (0.0010) [2023-02-22 21:53:50,910][37090] Updated weights for policy 0, policy_version 6988 (0.0009) [2023-02-22 21:53:52,244][24717] Fps is (10 sec: 19660.9, 60 sec: 20753.1, 300 sec: 21951.8). Total num frames: 28651520. Throughput: 0: 5152.8. Samples: 6154192. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:53:52,245][24717] Avg episode reward: [(0, '37.699')] [2023-02-22 21:53:52,896][37090] Updated weights for policy 0, policy_version 6998 (0.0007) [2023-02-22 21:53:54,928][37090] Updated weights for policy 0, policy_version 7008 (0.0008) [2023-02-22 21:53:56,825][37090] Updated weights for policy 0, policy_version 7018 (0.0009) [2023-02-22 21:53:57,244][24717] Fps is (10 sec: 20480.2, 60 sec: 20684.8, 300 sec: 21910.1). Total num frames: 28753920. Throughput: 0: 5149.2. Samples: 6185314. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:53:57,246][24717] Avg episode reward: [(0, '40.311')] [2023-02-22 21:53:57,253][37076] Saving new best policy, reward=40.311! [2023-02-22 21:53:58,812][37090] Updated weights for policy 0, policy_version 7028 (0.0007) [2023-02-22 21:54:00,750][37090] Updated weights for policy 0, policy_version 7038 (0.0007) [2023-02-22 21:54:02,244][24717] Fps is (10 sec: 20479.8, 60 sec: 20684.8, 300 sec: 21882.4). Total num frames: 28856320. Throughput: 0: 5148.1. Samples: 6201038. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:02,246][24717] Avg episode reward: [(0, '37.248')] [2023-02-22 21:54:02,651][37090] Updated weights for policy 0, policy_version 7048 (0.0011) [2023-02-22 21:54:04,550][37090] Updated weights for policy 0, policy_version 7058 (0.0007) [2023-02-22 21:54:06,482][37090] Updated weights for policy 0, policy_version 7068 (0.0008) [2023-02-22 21:54:07,244][24717] Fps is (10 sec: 20889.2, 60 sec: 20684.7, 300 sec: 21840.7). Total num frames: 28962816. Throughput: 0: 5100.4. Samples: 6233214. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:07,246][24717] Avg episode reward: [(0, '33.218')] [2023-02-22 21:54:08,508][37090] Updated weights for policy 0, policy_version 7078 (0.0007) [2023-02-22 21:54:10,473][37090] Updated weights for policy 0, policy_version 7088 (0.0007) [2023-02-22 21:54:12,244][24717] Fps is (10 sec: 21299.5, 60 sec: 20753.2, 300 sec: 21785.2). Total num frames: 29069312. Throughput: 0: 5098.4. Samples: 6264264. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:12,245][24717] Avg episode reward: [(0, '34.803')] [2023-02-22 21:54:12,458][37090] Updated weights for policy 0, policy_version 7098 (0.0007) [2023-02-22 21:54:14,355][37090] Updated weights for policy 0, policy_version 7108 (0.0008) [2023-02-22 21:54:16,308][37090] Updated weights for policy 0, policy_version 7118 (0.0007) [2023-02-22 21:54:17,244][24717] Fps is (10 sec: 21299.7, 60 sec: 20684.8, 300 sec: 21743.5). Total num frames: 29175808. Throughput: 0: 5098.3. Samples: 6280038. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:17,245][24717] Avg episode reward: [(0, '33.866')] [2023-02-22 21:54:18,082][37090] Updated weights for policy 0, policy_version 7128 (0.0009) [2023-02-22 21:54:19,889][37090] Updated weights for policy 0, policy_version 7138 (0.0008) [2023-02-22 21:54:21,816][37090] Updated weights for policy 0, policy_version 7148 (0.0008) [2023-02-22 21:54:22,244][24717] Fps is (10 sec: 21708.6, 60 sec: 20684.8, 300 sec: 21729.6). Total num frames: 29286400. Throughput: 0: 5154.2. Samples: 6313624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:54:22,246][24717] Avg episode reward: [(0, '33.652')] [2023-02-22 21:54:23,768][37090] Updated weights for policy 0, policy_version 7158 (0.0009) [2023-02-22 21:54:25,711][37090] Updated weights for policy 0, policy_version 7168 (0.0007) [2023-02-22 21:54:27,244][24717] Fps is (10 sec: 21708.7, 60 sec: 20684.8, 300 sec: 21701.9). Total num frames: 29392896. Throughput: 0: 5195.3. Samples: 6345092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:54:27,247][24717] Avg episode reward: [(0, '32.173')] [2023-02-22 21:54:27,660][37090] Updated weights for policy 0, policy_version 7178 (0.0007) [2023-02-22 21:54:29,592][37090] Updated weights for policy 0, policy_version 7188 (0.0011) [2023-02-22 21:54:31,466][37090] Updated weights for policy 0, policy_version 7198 (0.0008) [2023-02-22 21:54:32,244][24717] Fps is (10 sec: 21299.4, 60 sec: 20753.2, 300 sec: 21688.0). Total num frames: 29499392. Throughput: 0: 5229.1. Samples: 6360782. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:54:32,246][24717] Avg episode reward: [(0, '32.556')] [2023-02-22 21:54:33,389][37090] Updated weights for policy 0, policy_version 7208 (0.0009) [2023-02-22 21:54:35,368][37090] Updated weights for policy 0, policy_version 7218 (0.0010) [2023-02-22 21:54:37,232][37090] Updated weights for policy 0, policy_version 7228 (0.0008) [2023-02-22 21:54:37,244][24717] Fps is (10 sec: 21299.2, 60 sec: 20821.4, 300 sec: 21660.2). Total num frames: 29605888. Throughput: 0: 5303.7. Samples: 6392860. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:54:37,245][24717] Avg episode reward: [(0, '32.974')] [2023-02-22 21:54:39,270][37090] Updated weights for policy 0, policy_version 7238 (0.0008) [2023-02-22 21:54:41,295][37090] Updated weights for policy 0, policy_version 7248 (0.0007) [2023-02-22 21:54:42,244][24717] Fps is (10 sec: 20480.0, 60 sec: 20821.4, 300 sec: 21618.6). Total num frames: 29704192. Throughput: 0: 5294.1. Samples: 6423550. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:54:42,248][24717] Avg episode reward: [(0, '37.922')] [2023-02-22 21:54:43,277][37090] Updated weights for policy 0, policy_version 7258 (0.0008) [2023-02-22 21:54:45,177][37090] Updated weights for policy 0, policy_version 7268 (0.0008) [2023-02-22 21:54:47,144][37090] Updated weights for policy 0, policy_version 7278 (0.0009) [2023-02-22 21:54:47,244][24717] Fps is (10 sec: 20479.9, 60 sec: 21026.2, 300 sec: 21576.9). Total num frames: 29810688. Throughput: 0: 5301.3. Samples: 6439598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:54:47,245][24717] Avg episode reward: [(0, '35.923')] [2023-02-22 21:54:49,116][37090] Updated weights for policy 0, policy_version 7288 (0.0009) [2023-02-22 21:54:51,084][37090] Updated weights for policy 0, policy_version 7298 (0.0007) [2023-02-22 21:54:52,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21026.1, 300 sec: 21549.1). Total num frames: 29913088. Throughput: 0: 5281.4. Samples: 6470876. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:52,246][24717] Avg episode reward: [(0, '36.168')] [2023-02-22 21:54:53,113][37090] Updated weights for policy 0, policy_version 7308 (0.0009) [2023-02-22 21:54:55,139][37090] Updated weights for policy 0, policy_version 7318 (0.0007) [2023-02-22 21:54:57,128][37090] Updated weights for policy 0, policy_version 7328 (0.0011) [2023-02-22 21:54:57,245][24717] Fps is (10 sec: 20479.4, 60 sec: 21026.0, 300 sec: 21507.4). Total num frames: 30015488. Throughput: 0: 5269.4. Samples: 6501390. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:54:57,246][24717] Avg episode reward: [(0, '38.886')] [2023-02-22 21:54:59,073][37090] Updated weights for policy 0, policy_version 7338 (0.0007) [2023-02-22 21:55:01,063][37090] Updated weights for policy 0, policy_version 7348 (0.0007) [2023-02-22 21:55:02,245][24717] Fps is (10 sec: 20479.1, 60 sec: 21026.0, 300 sec: 21465.8). Total num frames: 30117888. Throughput: 0: 5269.4. Samples: 6517162. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:02,247][24717] Avg episode reward: [(0, '35.511')] [2023-02-22 21:55:03,050][37090] Updated weights for policy 0, policy_version 7358 (0.0007) [2023-02-22 21:55:04,996][37090] Updated weights for policy 0, policy_version 7368 (0.0009) [2023-02-22 21:55:06,930][37090] Updated weights for policy 0, policy_version 7378 (0.0012) [2023-02-22 21:55:07,244][24717] Fps is (10 sec: 20890.0, 60 sec: 21026.1, 300 sec: 21438.0). 
Total num frames: 30224384. Throughput: 0: 5214.8. Samples: 6548292. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:55:07,246][24717] Avg episode reward: [(0, '38.440')] [2023-02-22 21:55:08,954][37090] Updated weights for policy 0, policy_version 7388 (0.0008) [2023-02-22 21:55:10,895][37090] Updated weights for policy 0, policy_version 7398 (0.0010) [2023-02-22 21:55:12,244][24717] Fps is (10 sec: 20890.4, 60 sec: 20957.9, 300 sec: 21382.5). Total num frames: 30326784. Throughput: 0: 5206.4. Samples: 6579382. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:12,245][24717] Avg episode reward: [(0, '36.593')] [2023-02-22 21:55:12,864][37090] Updated weights for policy 0, policy_version 7408 (0.0010) [2023-02-22 21:55:14,847][37090] Updated weights for policy 0, policy_version 7418 (0.0009) [2023-02-22 21:55:16,702][37090] Updated weights for policy 0, policy_version 7428 (0.0009) [2023-02-22 21:55:17,244][24717] Fps is (10 sec: 21299.5, 60 sec: 21026.1, 300 sec: 21354.7). Total num frames: 30437376. Throughput: 0: 5202.5. Samples: 6594894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:55:17,245][24717] Avg episode reward: [(0, '34.447')] [2023-02-22 21:55:18,468][37090] Updated weights for policy 0, policy_version 7438 (0.0008) [2023-02-22 21:55:20,308][37090] Updated weights for policy 0, policy_version 7448 (0.0008) [2023-02-22 21:55:22,245][24717] Fps is (10 sec: 21707.6, 60 sec: 20957.7, 300 sec: 21340.8). Total num frames: 30543872. Throughput: 0: 5238.0. Samples: 6628574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:55:22,247][24717] Avg episode reward: [(0, '37.744')] [2023-02-22 21:55:22,309][37090] Updated weights for policy 0, policy_version 7458 (0.0008) [2023-02-22 21:55:24,314][37090] Updated weights for policy 0, policy_version 7468 (0.0009) [2023-02-22 21:55:26,236][37090] Updated weights for policy 0, policy_version 7478 (0.0008) [2023-02-22 21:55:27,244][24717] Fps is (10 sec: 21299.3, 60 sec: 20957.9, 300 sec: 21313.1). Total num frames: 30650368. Throughput: 0: 5244.8. Samples: 6659566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:55:27,245][24717] Avg episode reward: [(0, '40.960')] [2023-02-22 21:55:27,251][37076] Saving new best policy, reward=40.960! [2023-02-22 21:55:28,227][37090] Updated weights for policy 0, policy_version 7488 (0.0008) [2023-02-22 21:55:30,154][37090] Updated weights for policy 0, policy_version 7498 (0.0008) [2023-02-22 21:55:32,129][37090] Updated weights for policy 0, policy_version 7508 (0.0009) [2023-02-22 21:55:32,244][24717] Fps is (10 sec: 20890.7, 60 sec: 20889.6, 300 sec: 21271.4). Total num frames: 30752768. Throughput: 0: 5236.3. Samples: 6675230. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:32,246][24717] Avg episode reward: [(0, '40.301')] [2023-02-22 21:55:34,066][37090] Updated weights for policy 0, policy_version 7518 (0.0008) [2023-02-22 21:55:36,002][37090] Updated weights for policy 0, policy_version 7528 (0.0008) [2023-02-22 21:55:37,244][24717] Fps is (10 sec: 20889.5, 60 sec: 20889.6, 300 sec: 21243.7). Total num frames: 30859264. Throughput: 0: 5242.6. Samples: 6706792. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:37,245][24717] Avg episode reward: [(0, '34.946')] [2023-02-22 21:55:37,967][37090] Updated weights for policy 0, policy_version 7538 (0.0007) [2023-02-22 21:55:39,924][37090] Updated weights for policy 0, policy_version 7548 (0.0007) [2023-02-22 21:55:41,829][37090] Updated weights for policy 0, policy_version 7558 (0.0006) [2023-02-22 21:55:42,245][24717] Fps is (10 sec: 21298.4, 60 sec: 21026.0, 300 sec: 21243.6). Total num frames: 30965760. Throughput: 0: 5261.8. Samples: 6738172. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:42,246][24717] Avg episode reward: [(0, '35.026')] [2023-02-22 21:55:43,907][37090] Updated weights for policy 0, policy_version 7568 (0.0009) [2023-02-22 21:55:45,896][37090] Updated weights for policy 0, policy_version 7578 (0.0011) [2023-02-22 21:55:47,244][24717] Fps is (10 sec: 20480.0, 60 sec: 20889.6, 300 sec: 21229.8). Total num frames: 31064064. Throughput: 0: 5252.3. Samples: 6753512. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:55:47,248][24717] Avg episode reward: [(0, '37.971')] [2023-02-22 21:55:47,269][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000007585_31068160.pth... [2023-02-22 21:55:47,322][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000006349_26005504.pth [2023-02-22 21:55:47,945][37090] Updated weights for policy 0, policy_version 7588 (0.0007) [2023-02-22 21:55:49,850][37090] Updated weights for policy 0, policy_version 7598 (0.0007) [2023-02-22 21:55:51,778][37090] Updated weights for policy 0, policy_version 7608 (0.0009) [2023-02-22 21:55:52,244][24717] Fps is (10 sec: 20480.6, 60 sec: 20957.8, 300 sec: 21229.8). Total num frames: 31170560. Throughput: 0: 5247.9. Samples: 6784448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:55:52,246][24717] Avg episode reward: [(0, '35.031')] [2023-02-22 21:55:53,851][37090] Updated weights for policy 0, policy_version 7618 (0.0009) [2023-02-22 21:55:55,811][37090] Updated weights for policy 0, policy_version 7628 (0.0008) [2023-02-22 21:55:57,244][24717] Fps is (10 sec: 20889.6, 60 sec: 20958.0, 300 sec: 21202.0). Total num frames: 31272960. Throughput: 0: 5244.9. Samples: 6815404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:55:57,246][24717] Avg episode reward: [(0, '36.348')] [2023-02-22 21:55:57,753][37090] Updated weights for policy 0, policy_version 7638 (0.0007) [2023-02-22 21:55:59,702][37090] Updated weights for policy 0, policy_version 7648 (0.0009) [2023-02-22 21:56:01,740][37090] Updated weights for policy 0, policy_version 7658 (0.0007) [2023-02-22 21:56:02,244][24717] Fps is (10 sec: 20479.9, 60 sec: 20958.0, 300 sec: 21188.1). Total num frames: 31375360. Throughput: 0: 5247.9. Samples: 6831050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 21:56:02,246][24717] Avg episode reward: [(0, '38.978')] [2023-02-22 21:56:03,731][37090] Updated weights for policy 0, policy_version 7668 (0.0008) [2023-02-22 21:56:05,687][37090] Updated weights for policy 0, policy_version 7678 (0.0010) [2023-02-22 21:56:07,244][24717] Fps is (10 sec: 20479.7, 60 sec: 20889.6, 300 sec: 21174.2). Total num frames: 31477760. Throughput: 0: 5187.5. Samples: 6862010. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:56:07,247][24717] Avg episode reward: [(0, '40.202')] [2023-02-22 21:56:07,698][37090] Updated weights for policy 0, policy_version 7688 (0.0007) [2023-02-22 21:56:09,686][37090] Updated weights for policy 0, policy_version 7698 (0.0007) [2023-02-22 21:56:11,689][37090] Updated weights for policy 0, policy_version 7708 (0.0008) [2023-02-22 21:56:12,244][24717] Fps is (10 sec: 20480.3, 60 sec: 20889.6, 300 sec: 21132.6). Total num frames: 31580160. Throughput: 0: 5186.5. Samples: 6892958. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:56:12,245][24717] Avg episode reward: [(0, '39.097')] [2023-02-22 21:56:13,617][37090] Updated weights for policy 0, policy_version 7718 (0.0009) [2023-02-22 21:56:15,559][37090] Updated weights for policy 0, policy_version 7728 (0.0009) [2023-02-22 21:56:17,244][24717] Fps is (10 sec: 21299.6, 60 sec: 20889.6, 300 sec: 21118.7). Total num frames: 31690752. Throughput: 0: 5187.9. Samples: 6908684. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:56:17,245][24717] Avg episode reward: [(0, '37.494')] [2023-02-22 21:56:17,374][37090] Updated weights for policy 0, policy_version 7738 (0.0007) [2023-02-22 21:56:19,163][37090] Updated weights for policy 0, policy_version 7748 (0.0008) [2023-02-22 21:56:21,054][37090] Updated weights for policy 0, policy_version 7758 (0.0007) [2023-02-22 21:56:22,244][24717] Fps is (10 sec: 22118.1, 60 sec: 20958.0, 300 sec: 21118.7). Total num frames: 31801344. Throughput: 0: 5229.3. Samples: 6942110. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:56:22,246][24717] Avg episode reward: [(0, '35.287')] [2023-02-22 21:56:22,930][37090] Updated weights for policy 0, policy_version 7768 (0.0008) [2023-02-22 21:56:24,927][37090] Updated weights for policy 0, policy_version 7778 (0.0007) [2023-02-22 21:56:26,840][37090] Updated weights for policy 0, policy_version 7788 (0.0006) [2023-02-22 21:56:27,244][24717] Fps is (10 sec: 21708.5, 60 sec: 20957.8, 300 sec: 21118.7). Total num frames: 31907840. Throughput: 0: 5239.5. Samples: 6973948. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:56:27,246][24717] Avg episode reward: [(0, '35.991')] [2023-02-22 21:56:28,874][37090] Updated weights for policy 0, policy_version 7798 (0.0008) [2023-02-22 21:56:30,764][37090] Updated weights for policy 0, policy_version 7808 (0.0008) [2023-02-22 21:56:32,244][24717] Fps is (10 sec: 20889.4, 60 sec: 20957.8, 300 sec: 21104.8). Total num frames: 32010240. Throughput: 0: 5241.2. Samples: 6989368. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:56:32,246][24717] Avg episode reward: [(0, '34.098')] [2023-02-22 21:56:32,708][37090] Updated weights for policy 0, policy_version 7818 (0.0008) [2023-02-22 21:56:34,677][37090] Updated weights for policy 0, policy_version 7828 (0.0007) [2023-02-22 21:56:36,660][37090] Updated weights for policy 0, policy_version 7838 (0.0008) [2023-02-22 21:56:37,244][24717] Fps is (10 sec: 20889.6, 60 sec: 20957.8, 300 sec: 21090.9). Total num frames: 32116736. Throughput: 0: 5253.1. Samples: 7020836. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:56:37,247][24717] Avg episode reward: [(0, '35.385')] [2023-02-22 21:56:38,650][37090] Updated weights for policy 0, policy_version 7848 (0.0007) [2023-02-22 21:56:40,600][37090] Updated weights for policy 0, policy_version 7858 (0.0008) [2023-02-22 21:56:42,244][24717] Fps is (10 sec: 20889.9, 60 sec: 20889.7, 300 sec: 21063.2). 
Total num frames: 32219136. Throughput: 0: 5263.2. Samples: 7052246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:56:42,245][24717] Avg episode reward: [(0, '37.778')] [2023-02-22 21:56:42,542][37090] Updated weights for policy 0, policy_version 7868 (0.0009) [2023-02-22 21:56:44,511][37090] Updated weights for policy 0, policy_version 7878 (0.0007) [2023-02-22 21:56:46,414][37090] Updated weights for policy 0, policy_version 7888 (0.0007) [2023-02-22 21:56:47,246][24717] Fps is (10 sec: 20886.5, 60 sec: 21025.6, 300 sec: 21063.0). Total num frames: 32325632. Throughput: 0: 5260.3. Samples: 7067772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:56:47,247][24717] Avg episode reward: [(0, '39.554')] [2023-02-22 21:56:48,349][37090] Updated weights for policy 0, policy_version 7898 (0.0008) [2023-02-22 21:56:50,276][37090] Updated weights for policy 0, policy_version 7908 (0.0008) [2023-02-22 21:56:52,150][37090] Updated weights for policy 0, policy_version 7918 (0.0007) [2023-02-22 21:56:52,244][24717] Fps is (10 sec: 21299.2, 60 sec: 21026.2, 300 sec: 21049.3). Total num frames: 32432128. Throughput: 0: 5284.8. Samples: 7099826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:56:52,245][24717] Avg episode reward: [(0, '39.206')] [2023-02-22 21:56:54,219][37090] Updated weights for policy 0, policy_version 7928 (0.0009) [2023-02-22 21:56:56,176][37090] Updated weights for policy 0, policy_version 7938 (0.0009) [2023-02-22 21:56:57,244][24717] Fps is (10 sec: 20892.8, 60 sec: 21026.1, 300 sec: 21035.4). Total num frames: 32534528. Throughput: 0: 5292.7. Samples: 7131132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:56:57,246][24717] Avg episode reward: [(0, '36.772')] [2023-02-22 21:56:58,098][37090] Updated weights for policy 0, policy_version 7948 (0.0007) [2023-02-22 21:57:00,085][37090] Updated weights for policy 0, policy_version 7958 (0.0008) [2023-02-22 21:57:02,077][37090] Updated weights for policy 0, policy_version 7968 (0.0010) [2023-02-22 21:57:02,244][24717] Fps is (10 sec: 20480.0, 60 sec: 21026.2, 300 sec: 21021.5). Total num frames: 32636928. Throughput: 0: 5288.4. Samples: 7146660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:02,245][24717] Avg episode reward: [(0, '35.287')] [2023-02-22 21:57:04,032][37090] Updated weights for policy 0, policy_version 7978 (0.0008) [2023-02-22 21:57:05,933][37090] Updated weights for policy 0, policy_version 7988 (0.0008) [2023-02-22 21:57:07,244][24717] Fps is (10 sec: 20889.7, 60 sec: 21094.5, 300 sec: 21007.6). Total num frames: 32743424. Throughput: 0: 5243.4. Samples: 7178064. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:57:07,245][24717] Avg episode reward: [(0, '35.184')] [2023-02-22 21:57:07,868][37090] Updated weights for policy 0, policy_version 7998 (0.0010) [2023-02-22 21:57:09,862][37090] Updated weights for policy 0, policy_version 8008 (0.0008) [2023-02-22 21:57:11,873][37090] Updated weights for policy 0, policy_version 8018 (0.0008) [2023-02-22 21:57:12,244][24717] Fps is (10 sec: 20889.5, 60 sec: 21094.4, 300 sec: 20966.0). Total num frames: 32845824. Throughput: 0: 5229.3. Samples: 7209268. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:57:12,245][24717] Avg episode reward: [(0, '35.488')] [2023-02-22 21:57:13,862][37090] Updated weights for policy 0, policy_version 8028 (0.0010) [2023-02-22 21:57:15,760][37090] Updated weights for policy 0, policy_version 8038 (0.0008) [2023-02-22 21:57:17,244][24717] Fps is (10 sec: 21299.2, 60 sec: 21094.4, 300 sec: 20952.1). Total num frames: 32956416. Throughput: 0: 5235.4. Samples: 7224962. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:57:17,246][24717] Avg episode reward: [(0, '36.530')] [2023-02-22 21:57:17,538][37090] Updated weights for policy 0, policy_version 8048 (0.0007) [2023-02-22 21:57:19,309][37090] Updated weights for policy 0, policy_version 8058 (0.0008) [2023-02-22 21:57:21,112][37090] Updated weights for policy 0, policy_version 8068 (0.0007) [2023-02-22 21:57:22,244][24717] Fps is (10 sec: 22118.5, 60 sec: 21094.4, 300 sec: 20979.9). Total num frames: 33067008. Throughput: 0: 5295.2. Samples: 7259118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:22,246][24717] Avg episode reward: [(0, '38.866')] [2023-02-22 21:57:23,023][37090] Updated weights for policy 0, policy_version 8078 (0.0007) [2023-02-22 21:57:25,008][37090] Updated weights for policy 0, policy_version 8088 (0.0009) [2023-02-22 21:57:26,896][37090] Updated weights for policy 0, policy_version 8098 (0.0009) [2023-02-22 21:57:27,244][24717] Fps is (10 sec: 21708.8, 60 sec: 21094.5, 300 sec: 20979.9). Total num frames: 33173504. Throughput: 0: 5304.0. Samples: 7290926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:27,245][24717] Avg episode reward: [(0, '36.248')] [2023-02-22 21:57:28,903][37090] Updated weights for policy 0, policy_version 8108 (0.0009) [2023-02-22 21:57:30,821][37090] Updated weights for policy 0, policy_version 8118 (0.0007) [2023-02-22 21:57:32,244][24717] Fps is (10 sec: 21299.2, 60 sec: 21162.7, 300 sec: 20979.9). Total num frames: 33280000. Throughput: 0: 5305.6. Samples: 7306516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:32,245][24717] Avg episode reward: [(0, '36.272')] [2023-02-22 21:57:32,750][37090] Updated weights for policy 0, policy_version 8128 (0.0010) [2023-02-22 21:57:34,694][37090] Updated weights for policy 0, policy_version 8138 (0.0009) [2023-02-22 21:57:36,666][37090] Updated weights for policy 0, policy_version 8148 (0.0012) [2023-02-22 21:57:37,244][24717] Fps is (10 sec: 21299.2, 60 sec: 21162.7, 300 sec: 20979.9). Total num frames: 33386496. Throughput: 0: 5301.4. Samples: 7338390. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:57:37,245][24717] Avg episode reward: [(0, '35.909')] [2023-02-22 21:57:38,570][37090] Updated weights for policy 0, policy_version 8158 (0.0007) [2023-02-22 21:57:40,542][37090] Updated weights for policy 0, policy_version 8168 (0.0007) [2023-02-22 21:57:42,244][24717] Fps is (10 sec: 20889.5, 60 sec: 21162.6, 300 sec: 20979.9). Total num frames: 33488896. Throughput: 0: 5303.4. Samples: 7369784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:42,246][24717] Avg episode reward: [(0, '35.230')] [2023-02-22 21:57:42,506][37090] Updated weights for policy 0, policy_version 8178 (0.0009) [2023-02-22 21:57:44,452][37090] Updated weights for policy 0, policy_version 8188 (0.0006) [2023-02-22 21:57:46,462][37090] Updated weights for policy 0, policy_version 8198 (0.0009) [2023-02-22 21:57:47,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21163.2, 300 sec: 20979.9). 
Total num frames: 33595392. Throughput: 0: 5304.0. Samples: 7385338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:57:47,245][24717] Avg episode reward: [(0, '35.981')] [2023-02-22 21:57:47,252][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000008202_33595392.pth... [2023-02-22 21:57:47,300][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000006971_28553216.pth [2023-02-22 21:57:48,456][37090] Updated weights for policy 0, policy_version 8208 (0.0008) [2023-02-22 21:57:50,407][37090] Updated weights for policy 0, policy_version 8218 (0.0009) [2023-02-22 21:57:52,244][24717] Fps is (10 sec: 20889.8, 60 sec: 21094.4, 300 sec: 20966.0). Total num frames: 33697792. Throughput: 0: 5294.2. Samples: 7416302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:57:52,245][24717] Avg episode reward: [(0, '35.534')] [2023-02-22 21:57:52,405][37090] Updated weights for policy 0, policy_version 8228 (0.0010) [2023-02-22 21:57:54,395][37090] Updated weights for policy 0, policy_version 8238 (0.0008) [2023-02-22 21:57:56,310][37090] Updated weights for policy 0, policy_version 8248 (0.0007) [2023-02-22 21:57:57,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21162.7, 300 sec: 20979.9). Total num frames: 33804288. Throughput: 0: 5301.2. Samples: 7447820. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:57:57,246][24717] Avg episode reward: [(0, '36.552')] [2023-02-22 21:57:58,217][37090] Updated weights for policy 0, policy_version 8258 (0.0010) [2023-02-22 21:58:00,131][37090] Updated weights for policy 0, policy_version 8268 (0.0007) [2023-02-22 21:58:02,136][37090] Updated weights for policy 0, policy_version 8278 (0.0008) [2023-02-22 21:58:02,244][24717] Fps is (10 sec: 20889.5, 60 sec: 21162.7, 300 sec: 20966.0). Total num frames: 33906688. Throughput: 0: 5307.1. Samples: 7463782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:58:02,245][24717] Avg episode reward: [(0, '36.321')] [2023-02-22 21:58:04,098][37090] Updated weights for policy 0, policy_version 8288 (0.0008) [2023-02-22 21:58:06,047][37090] Updated weights for policy 0, policy_version 8298 (0.0009) [2023-02-22 21:58:07,245][24717] Fps is (10 sec: 20889.1, 60 sec: 21162.6, 300 sec: 20979.9). Total num frames: 34013184. Throughput: 0: 5239.5. Samples: 7494896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:58:07,247][24717] Avg episode reward: [(0, '38.408')] [2023-02-22 21:58:07,983][37090] Updated weights for policy 0, policy_version 8308 (0.0009) [2023-02-22 21:58:10,005][37090] Updated weights for policy 0, policy_version 8318 (0.0007) [2023-02-22 21:58:11,894][37090] Updated weights for policy 0, policy_version 8328 (0.0008) [2023-02-22 21:58:12,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21162.7, 300 sec: 20952.1). Total num frames: 34115584. Throughput: 0: 5232.0. Samples: 7526366. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:58:12,247][24717] Avg episode reward: [(0, '37.307')] [2023-02-22 21:58:13,881][37090] Updated weights for policy 0, policy_version 8338 (0.0008) [2023-02-22 21:58:15,820][37090] Updated weights for policy 0, policy_version 8348 (0.0010) [2023-02-22 21:58:17,244][24717] Fps is (10 sec: 21299.8, 60 sec: 21162.7, 300 sec: 20952.1). Total num frames: 34226176. Throughput: 0: 5235.2. Samples: 7542102. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:58:17,245][24717] Avg episode reward: [(0, '35.014')] [2023-02-22 21:58:17,549][37090] Updated weights for policy 0, policy_version 8358 (0.0006) [2023-02-22 21:58:19,287][37090] Updated weights for policy 0, policy_version 8368 (0.0008) [2023-02-22 21:58:21,052][37090] Updated weights for policy 0, policy_version 8378 (0.0010) [2023-02-22 21:58:22,244][24717] Fps is (10 sec: 22528.1, 60 sec: 21230.9, 300 sec: 20979.9). Total num frames: 34340864. Throughput: 0: 5300.7. Samples: 7576920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:58:22,247][24717] Avg episode reward: [(0, '34.319')] [2023-02-22 21:58:22,995][37090] Updated weights for policy 0, policy_version 8388 (0.0008) [2023-02-22 21:58:24,956][37090] Updated weights for policy 0, policy_version 8398 (0.0008) [2023-02-22 21:58:26,894][37090] Updated weights for policy 0, policy_version 8408 (0.0008) [2023-02-22 21:58:27,245][24717] Fps is (10 sec: 21708.0, 60 sec: 21162.5, 300 sec: 20979.8). Total num frames: 34443264. Throughput: 0: 5304.0. Samples: 7608466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:58:27,246][24717] Avg episode reward: [(0, '35.208')] [2023-02-22 21:58:28,872][37090] Updated weights for policy 0, policy_version 8418 (0.0008) [2023-02-22 21:58:30,848][37090] Updated weights for policy 0, policy_version 8428 (0.0009) [2023-02-22 21:58:32,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21162.7, 300 sec: 20993.8). Total num frames: 34549760. Throughput: 0: 5303.5. Samples: 7623994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 21:58:32,245][24717] Avg episode reward: [(0, '36.165')] [2023-02-22 21:58:32,830][37090] Updated weights for policy 0, policy_version 8438 (0.0007) [2023-02-22 21:58:34,861][37090] Updated weights for policy 0, policy_version 8448 (0.0008) [2023-02-22 21:58:36,847][37090] Updated weights for policy 0, policy_version 8458 (0.0009) [2023-02-22 21:58:37,244][24717] Fps is (10 sec: 20480.7, 60 sec: 21026.1, 300 sec: 20993.7). Total num frames: 34648064. Throughput: 0: 5298.4. Samples: 7654730. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:58:37,246][24717] Avg episode reward: [(0, '40.022')] [2023-02-22 21:58:38,948][37090] Updated weights for policy 0, policy_version 8468 (0.0009) [2023-02-22 21:58:40,916][37090] Updated weights for policy 0, policy_version 8478 (0.0007) [2023-02-22 21:58:42,244][24717] Fps is (10 sec: 20070.3, 60 sec: 21026.1, 300 sec: 21021.5). Total num frames: 34750464. Throughput: 0: 5271.8. Samples: 7685052. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:58:42,245][24717] Avg episode reward: [(0, '39.379')] [2023-02-22 21:58:42,884][37090] Updated weights for policy 0, policy_version 8488 (0.0010) [2023-02-22 21:58:44,796][37090] Updated weights for policy 0, policy_version 8498 (0.0006) [2023-02-22 21:58:46,792][37090] Updated weights for policy 0, policy_version 8508 (0.0007) [2023-02-22 21:58:47,244][24717] Fps is (10 sec: 20889.6, 60 sec: 21026.1, 300 sec: 21035.4). Total num frames: 34856960. Throughput: 0: 5275.2. Samples: 7701166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 21:58:47,246][24717] Avg episode reward: [(0, '39.240')] [2023-02-22 21:58:48,730][37090] Updated weights for policy 0, policy_version 8518 (0.0011) [2023-02-22 21:58:50,748][37090] Updated weights for policy 0, policy_version 8528 (0.0007) [2023-02-22 21:58:52,244][24717] Fps is (10 sec: 20889.7, 60 sec: 21026.1, 300 sec: 21035.4). 
Total num frames: 34959360. Throughput: 0: 5272.1. Samples: 7732140. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:58:52,245][24717] Avg episode reward: [(0, '35.279')] [2023-02-22 21:58:52,816][37090] Updated weights for policy 0, policy_version 8538 (0.0008) [2023-02-22 21:58:54,842][37090] Updated weights for policy 0, policy_version 8548 (0.0008) [2023-02-22 21:58:56,860][37090] Updated weights for policy 0, policy_version 8558 (0.0008) [2023-02-22 21:58:57,244][24717] Fps is (10 sec: 20070.5, 60 sec: 20889.6, 300 sec: 21021.5). Total num frames: 35057664. Throughput: 0: 5240.7. Samples: 7762198. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:58:57,245][24717] Avg episode reward: [(0, '38.694')] [2023-02-22 21:58:58,947][37090] Updated weights for policy 0, policy_version 8568 (0.0007) [2023-02-22 21:59:00,999][37090] Updated weights for policy 0, policy_version 8578 (0.0007) [2023-02-22 21:59:02,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20889.6, 300 sec: 21007.6). Total num frames: 35160064. Throughput: 0: 5218.2. Samples: 7776920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:02,246][24717] Avg episode reward: [(0, '37.629')] [2023-02-22 21:59:02,993][37090] Updated weights for policy 0, policy_version 8588 (0.0007) [2023-02-22 21:59:05,062][37090] Updated weights for policy 0, policy_version 8598 (0.0007) [2023-02-22 21:59:07,138][37090] Updated weights for policy 0, policy_version 8608 (0.0011) [2023-02-22 21:59:07,245][24717] Fps is (10 sec: 20069.9, 60 sec: 20753.1, 300 sec: 20979.8). Total num frames: 35258368. Throughput: 0: 5114.7. Samples: 7807084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:07,246][24717] Avg episode reward: [(0, '39.114')] [2023-02-22 21:59:09,242][37090] Updated weights for policy 0, policy_version 8618 (0.0010) [2023-02-22 21:59:11,275][37090] Updated weights for policy 0, policy_version 8628 (0.0007) [2023-02-22 21:59:12,244][24717] Fps is (10 sec: 19660.8, 60 sec: 20684.8, 300 sec: 20952.1). Total num frames: 35356672. Throughput: 0: 5076.9. Samples: 7836924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:12,246][24717] Avg episode reward: [(0, '39.532')] [2023-02-22 21:59:13,265][37090] Updated weights for policy 0, policy_version 8638 (0.0008) [2023-02-22 21:59:15,256][37090] Updated weights for policy 0, policy_version 8648 (0.0009) [2023-02-22 21:59:17,160][37090] Updated weights for policy 0, policy_version 8658 (0.0007) [2023-02-22 21:59:17,244][24717] Fps is (10 sec: 20480.5, 60 sec: 20616.5, 300 sec: 20938.2). Total num frames: 35463168. Throughput: 0: 5078.9. Samples: 7852544. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 21:59:17,245][24717] Avg episode reward: [(0, '35.131')] [2023-02-22 21:59:19,094][37090] Updated weights for policy 0, policy_version 8668 (0.0010) [2023-02-22 21:59:20,974][37090] Updated weights for policy 0, policy_version 8678 (0.0009) [2023-02-22 21:59:22,244][24717] Fps is (10 sec: 21299.1, 60 sec: 20480.0, 300 sec: 20938.2). Total num frames: 35569664. Throughput: 0: 5105.1. Samples: 7884460. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:22,246][24717] Avg episode reward: [(0, '38.003')] [2023-02-22 21:59:22,928][37090] Updated weights for policy 0, policy_version 8688 (0.0008) [2023-02-22 21:59:24,963][37090] Updated weights for policy 0, policy_version 8698 (0.0007) [2023-02-22 21:59:26,944][37090] Updated weights for policy 0, policy_version 8708 (0.0007) [2023-02-22 21:59:27,244][24717] Fps is (10 sec: 20889.5, 60 sec: 20480.1, 300 sec: 20924.3). Total num frames: 35672064. Throughput: 0: 5116.1. Samples: 7915276. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:59:27,245][24717] Avg episode reward: [(0, '38.334')] [2023-02-22 21:59:29,028][37090] Updated weights for policy 0, policy_version 8718 (0.0010) [2023-02-22 21:59:31,020][37090] Updated weights for policy 0, policy_version 8728 (0.0010) [2023-02-22 21:59:32,244][24717] Fps is (10 sec: 20480.3, 60 sec: 20411.8, 300 sec: 20910.4). Total num frames: 35774464. Throughput: 0: 5091.4. Samples: 7930280. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:59:32,246][24717] Avg episode reward: [(0, '38.252')] [2023-02-22 21:59:32,949][37090] Updated weights for policy 0, policy_version 8738 (0.0008) [2023-02-22 21:59:34,975][37090] Updated weights for policy 0, policy_version 8748 (0.0010) [2023-02-22 21:59:36,923][37090] Updated weights for policy 0, policy_version 8758 (0.0011) [2023-02-22 21:59:37,245][24717] Fps is (10 sec: 20479.0, 60 sec: 20479.8, 300 sec: 20924.3). Total num frames: 35876864. Throughput: 0: 5094.4. Samples: 7961392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:37,246][24717] Avg episode reward: [(0, '37.942')] [2023-02-22 21:59:39,002][37090] Updated weights for policy 0, policy_version 8768 (0.0010) [2023-02-22 21:59:41,101][37090] Updated weights for policy 0, policy_version 8778 (0.0010) [2023-02-22 21:59:42,244][24717] Fps is (10 sec: 20070.2, 60 sec: 20411.7, 300 sec: 20896.5). Total num frames: 35975168. Throughput: 0: 5086.8. Samples: 7991102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:42,245][24717] Avg episode reward: [(0, '38.350')] [2023-02-22 21:59:43,126][37090] Updated weights for policy 0, policy_version 8788 (0.0009) [2023-02-22 21:59:45,163][37090] Updated weights for policy 0, policy_version 8798 (0.0008) [2023-02-22 21:59:47,232][37090] Updated weights for policy 0, policy_version 8808 (0.0009) [2023-02-22 21:59:47,244][24717] Fps is (10 sec: 20071.5, 60 sec: 20343.5, 300 sec: 20896.5). Total num frames: 36077568. Throughput: 0: 5103.6. Samples: 8006582. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 21:59:47,248][24717] Avg episode reward: [(0, '39.254')] [2023-02-22 21:59:47,253][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000008808_36077568.pth... [2023-02-22 21:59:47,313][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000007585_31068160.pth [2023-02-22 21:59:49,224][37090] Updated weights for policy 0, policy_version 8818 (0.0008) [2023-02-22 21:59:51,225][37090] Updated weights for policy 0, policy_version 8828 (0.0009) [2023-02-22 21:59:52,244][24717] Fps is (10 sec: 20070.3, 60 sec: 20275.2, 300 sec: 20882.7). Total num frames: 36175872. Throughput: 0: 5109.1. Samples: 8036992. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:59:52,246][24717] Avg episode reward: [(0, '37.823')] [2023-02-22 21:59:53,264][37090] Updated weights for policy 0, policy_version 8838 (0.0010) [2023-02-22 21:59:55,323][37090] Updated weights for policy 0, policy_version 8848 (0.0007) [2023-02-22 21:59:57,244][24717] Fps is (10 sec: 20070.3, 60 sec: 20343.4, 300 sec: 20882.7). Total num frames: 36278272. Throughput: 0: 5120.2. Samples: 8067332. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 21:59:57,246][24717] Avg episode reward: [(0, '39.022')] [2023-02-22 21:59:57,296][37090] Updated weights for policy 0, policy_version 8858 (0.0006) [2023-02-22 21:59:59,286][37090] Updated weights for policy 0, policy_version 8868 (0.0007) [2023-02-22 22:00:01,432][37090] Updated weights for policy 0, policy_version 8878 (0.0010) [2023-02-22 22:00:02,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20854.9). Total num frames: 36376576. Throughput: 0: 5107.5. Samples: 8082380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:00:02,246][24717] Avg episode reward: [(0, '38.919')] [2023-02-22 22:00:03,609][37090] Updated weights for policy 0, policy_version 8888 (0.0009) [2023-02-22 22:00:05,658][37090] Updated weights for policy 0, policy_version 8898 (0.0009) [2023-02-22 22:00:07,245][24717] Fps is (10 sec: 19660.7, 60 sec: 20275.2, 300 sec: 20841.0). Total num frames: 36474880. Throughput: 0: 5048.2. Samples: 8111628. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:00:07,246][24717] Avg episode reward: [(0, '37.904')] [2023-02-22 22:00:07,697][37090] Updated weights for policy 0, policy_version 8908 (0.0007) [2023-02-22 22:00:09,695][37090] Updated weights for policy 0, policy_version 8918 (0.0009) [2023-02-22 22:00:11,704][37090] Updated weights for policy 0, policy_version 8928 (0.0009) [2023-02-22 22:00:12,244][24717] Fps is (10 sec: 20070.2, 60 sec: 20343.4, 300 sec: 20813.2). Total num frames: 36577280. Throughput: 0: 5035.3. Samples: 8141864. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:00:12,246][24717] Avg episode reward: [(0, '37.412')] [2023-02-22 22:00:13,758][37090] Updated weights for policy 0, policy_version 8938 (0.0008) [2023-02-22 22:00:15,730][37090] Updated weights for policy 0, policy_version 8948 (0.0008) [2023-02-22 22:00:17,244][24717] Fps is (10 sec: 20889.7, 60 sec: 20343.5, 300 sec: 20813.3). Total num frames: 36683776. Throughput: 0: 5042.8. Samples: 8157206. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:00:17,245][24717] Avg episode reward: [(0, '39.926')] [2023-02-22 22:00:17,608][37090] Updated weights for policy 0, policy_version 8958 (0.0009) [2023-02-22 22:00:19,438][37090] Updated weights for policy 0, policy_version 8968 (0.0009) [2023-02-22 22:00:21,352][37090] Updated weights for policy 0, policy_version 8978 (0.0008) [2023-02-22 22:00:22,244][24717] Fps is (10 sec: 21299.4, 60 sec: 20343.4, 300 sec: 20813.2). Total num frames: 36790272. Throughput: 0: 5077.1. Samples: 8189858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:00:22,246][24717] Avg episode reward: [(0, '41.158')] [2023-02-22 22:00:22,247][37076] Saving new best policy, reward=41.158! [2023-02-22 22:00:23,488][37090] Updated weights for policy 0, policy_version 8988 (0.0008) [2023-02-22 22:00:25,575][37090] Updated weights for policy 0, policy_version 8998 (0.0009) [2023-02-22 22:00:27,244][24717] Fps is (10 sec: 20479.9, 60 sec: 20275.2, 300 sec: 20799.3). Total num frames: 36888576. 
Throughput: 0: 5065.8. Samples: 8219064. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:00:27,246][24717] Avg episode reward: [(0, '37.182')] [2023-02-22 22:00:27,708][37090] Updated weights for policy 0, policy_version 9008 (0.0009) [2023-02-22 22:00:29,747][37090] Updated weights for policy 0, policy_version 9018 (0.0007) [2023-02-22 22:00:31,754][37090] Updated weights for policy 0, policy_version 9028 (0.0008) [2023-02-22 22:00:32,244][24717] Fps is (10 sec: 19661.0, 60 sec: 20206.9, 300 sec: 20771.6). Total num frames: 36986880. Throughput: 0: 5052.1. Samples: 8233928. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:00:32,246][24717] Avg episode reward: [(0, '38.178')] [2023-02-22 22:00:33,768][37090] Updated weights for policy 0, policy_version 9038 (0.0008) [2023-02-22 22:00:35,917][37090] Updated weights for policy 0, policy_version 9048 (0.0009) [2023-02-22 22:00:37,244][24717] Fps is (10 sec: 19660.9, 60 sec: 20138.9, 300 sec: 20743.8). Total num frames: 37085184. Throughput: 0: 5038.7. Samples: 8263732. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:00:37,245][24717] Avg episode reward: [(0, '42.013')] [2023-02-22 22:00:37,250][37076] Saving new best policy, reward=42.013! [2023-02-22 22:00:38,044][37090] Updated weights for policy 0, policy_version 9058 (0.0010) [2023-02-22 22:00:40,167][37090] Updated weights for policy 0, policy_version 9068 (0.0009) [2023-02-22 22:00:42,244][24717] Fps is (10 sec: 19251.1, 60 sec: 20070.4, 300 sec: 20729.9). Total num frames: 37179392. Throughput: 0: 5012.7. Samples: 8292904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:00:42,245][24717] Avg episode reward: [(0, '40.203')] [2023-02-22 22:00:42,263][37090] Updated weights for policy 0, policy_version 9078 (0.0012) [2023-02-22 22:00:44,336][37090] Updated weights for policy 0, policy_version 9088 (0.0007) [2023-02-22 22:00:46,446][37090] Updated weights for policy 0, policy_version 9098 (0.0009) [2023-02-22 22:00:47,244][24717] Fps is (10 sec: 19251.0, 60 sec: 20002.1, 300 sec: 20702.2). Total num frames: 37277696. Throughput: 0: 4998.8. Samples: 8307324. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:00:47,246][24717] Avg episode reward: [(0, '40.347')] [2023-02-22 22:00:48,554][37090] Updated weights for policy 0, policy_version 9108 (0.0008) [2023-02-22 22:00:50,540][37090] Updated weights for policy 0, policy_version 9118 (0.0008) [2023-02-22 22:00:52,245][24717] Fps is (10 sec: 20068.9, 60 sec: 20070.1, 300 sec: 20702.1). Total num frames: 37380096. Throughput: 0: 5016.7. Samples: 8337382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:00:52,246][24717] Avg episode reward: [(0, '39.892')] [2023-02-22 22:00:52,580][37090] Updated weights for policy 0, policy_version 9128 (0.0009) [2023-02-22 22:00:54,573][37090] Updated weights for policy 0, policy_version 9138 (0.0007) [2023-02-22 22:00:56,532][37090] Updated weights for policy 0, policy_version 9148 (0.0008) [2023-02-22 22:00:57,244][24717] Fps is (10 sec: 20480.2, 60 sec: 20070.4, 300 sec: 20702.2). Total num frames: 37482496. Throughput: 0: 5028.1. Samples: 8368126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:00:57,246][24717] Avg episode reward: [(0, '40.967')] [2023-02-22 22:00:58,495][37090] Updated weights for policy 0, policy_version 9158 (0.0011) [2023-02-22 22:01:00,512][37090] Updated weights for policy 0, policy_version 9168 (0.0008) [2023-02-22 22:01:02,244][24717] Fps is (10 sec: 20891.4, 60 sec: 20207.0, 300 sec: 20716.1). 
Total num frames: 37588992. Throughput: 0: 5032.9. Samples: 8383688. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:01:02,245][24717] Avg episode reward: [(0, '41.323')] [2023-02-22 22:01:02,463][37090] Updated weights for policy 0, policy_version 9178 (0.0008) [2023-02-22 22:01:04,523][37090] Updated weights for policy 0, policy_version 9188 (0.0007) [2023-02-22 22:01:06,598][37090] Updated weights for policy 0, policy_version 9198 (0.0010) [2023-02-22 22:01:07,244][24717] Fps is (10 sec: 20070.3, 60 sec: 20138.7, 300 sec: 20688.3). Total num frames: 37683200. Throughput: 0: 4982.6. Samples: 8414074. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:01:07,246][24717] Avg episode reward: [(0, '38.768')] [2023-02-22 22:01:08,675][37090] Updated weights for policy 0, policy_version 9208 (0.0009) [2023-02-22 22:01:10,733][37090] Updated weights for policy 0, policy_version 9218 (0.0008) [2023-02-22 22:01:12,244][24717] Fps is (10 sec: 19660.7, 60 sec: 20138.7, 300 sec: 20660.5). Total num frames: 37785600. Throughput: 0: 4993.5. Samples: 8443770. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:01:12,246][24717] Avg episode reward: [(0, '37.214')] [2023-02-22 22:01:12,748][37090] Updated weights for policy 0, policy_version 9228 (0.0008) [2023-02-22 22:01:14,773][37090] Updated weights for policy 0, policy_version 9238 (0.0009) [2023-02-22 22:01:16,750][37090] Updated weights for policy 0, policy_version 9248 (0.0008) [2023-02-22 22:01:17,245][24717] Fps is (10 sec: 20479.7, 60 sec: 20070.4, 300 sec: 20632.7). Total num frames: 37888000. Throughput: 0: 5001.8. Samples: 8459008. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:01:17,246][24717] Avg episode reward: [(0, '38.972')] [2023-02-22 22:01:18,561][37090] Updated weights for policy 0, policy_version 9258 (0.0009) [2023-02-22 22:01:20,377][37090] Updated weights for policy 0, policy_version 9268 (0.0006) [2023-02-22 22:01:22,244][24717] Fps is (10 sec: 21299.0, 60 sec: 20138.7, 300 sec: 20646.6). Total num frames: 37998592. Throughput: 0: 5068.3. Samples: 8491808. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:01:22,246][24717] Avg episode reward: [(0, '40.067')] [2023-02-22 22:01:22,280][37090] Updated weights for policy 0, policy_version 9278 (0.0009) [2023-02-22 22:01:24,335][37090] Updated weights for policy 0, policy_version 9288 (0.0008) [2023-02-22 22:01:26,407][37090] Updated weights for policy 0, policy_version 9298 (0.0008) [2023-02-22 22:01:27,244][24717] Fps is (10 sec: 21299.5, 60 sec: 20206.9, 300 sec: 20646.6). Total num frames: 38100992. Throughput: 0: 5095.3. Samples: 8522192. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:01:27,246][24717] Avg episode reward: [(0, '38.752')] [2023-02-22 22:01:28,524][37090] Updated weights for policy 0, policy_version 9308 (0.0008) [2023-02-22 22:01:30,550][37090] Updated weights for policy 0, policy_version 9318 (0.0010) [2023-02-22 22:01:32,244][24717] Fps is (10 sec: 20070.6, 60 sec: 20206.9, 300 sec: 20618.9). Total num frames: 38199296. Throughput: 0: 5104.6. Samples: 8537032. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:32,245][24717] Avg episode reward: [(0, '39.974')] [2023-02-22 22:01:32,579][37090] Updated weights for policy 0, policy_version 9328 (0.0007) [2023-02-22 22:01:34,658][37090] Updated weights for policy 0, policy_version 9338 (0.0008) [2023-02-22 22:01:36,671][37090] Updated weights for policy 0, policy_version 9348 (0.0009) [2023-02-22 22:01:37,244][24717] Fps is (10 sec: 19660.8, 60 sec: 20206.9, 300 sec: 20605.0). Total num frames: 38297600. Throughput: 0: 5103.1. Samples: 8567016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:37,246][24717] Avg episode reward: [(0, '44.299')] [2023-02-22 22:01:37,251][37076] Saving new best policy, reward=44.299! [2023-02-22 22:01:38,701][37090] Updated weights for policy 0, policy_version 9358 (0.0008) [2023-02-22 22:01:40,773][37090] Updated weights for policy 0, policy_version 9368 (0.0009) [2023-02-22 22:01:42,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20591.2). Total num frames: 38400000. Throughput: 0: 5088.4. Samples: 8597102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:42,246][24717] Avg episode reward: [(0, '44.782')] [2023-02-22 22:01:42,247][37076] Saving new best policy, reward=44.782! [2023-02-22 22:01:42,835][37090] Updated weights for policy 0, policy_version 9378 (0.0011) [2023-02-22 22:01:44,824][37090] Updated weights for policy 0, policy_version 9388 (0.0007) [2023-02-22 22:01:46,901][37090] Updated weights for policy 0, policy_version 9398 (0.0011) [2023-02-22 22:01:47,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20343.5, 300 sec: 20563.3). Total num frames: 38498304. Throughput: 0: 5080.9. Samples: 8612328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:47,245][24717] Avg episode reward: [(0, '38.855')] [2023-02-22 22:01:47,253][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000009399_38498304.pth... [2023-02-22 22:01:47,307][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000008202_33595392.pth [2023-02-22 22:01:49,143][37090] Updated weights for policy 0, policy_version 9408 (0.0011) [2023-02-22 22:01:51,117][37090] Updated weights for policy 0, policy_version 9418 (0.0009) [2023-02-22 22:01:52,244][24717] Fps is (10 sec: 19660.8, 60 sec: 20275.5, 300 sec: 20549.4). Total num frames: 38596608. Throughput: 0: 5050.6. Samples: 8641352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:52,246][24717] Avg episode reward: [(0, '38.023')] [2023-02-22 22:01:53,091][37090] Updated weights for policy 0, policy_version 9428 (0.0009) [2023-02-22 22:01:55,149][37090] Updated weights for policy 0, policy_version 9438 (0.0007) [2023-02-22 22:01:57,170][37090] Updated weights for policy 0, policy_version 9448 (0.0007) [2023-02-22 22:01:57,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20549.4). Total num frames: 38699008. Throughput: 0: 5070.6. Samples: 8671948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:01:57,246][24717] Avg episode reward: [(0, '38.836')] [2023-02-22 22:01:59,173][37090] Updated weights for policy 0, policy_version 9458 (0.0010) [2023-02-22 22:02:01,197][37090] Updated weights for policy 0, policy_version 9468 (0.0007) [2023-02-22 22:02:02,244][24717] Fps is (10 sec: 20480.1, 60 sec: 20206.9, 300 sec: 20535.5). Total num frames: 38801408. Throughput: 0: 5071.1. Samples: 8687208. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:02:02,245][24717] Avg episode reward: [(0, '38.702')] [2023-02-22 22:02:03,184][37090] Updated weights for policy 0, policy_version 9478 (0.0008) [2023-02-22 22:02:05,168][37090] Updated weights for policy 0, policy_version 9488 (0.0009) [2023-02-22 22:02:07,144][37090] Updated weights for policy 0, policy_version 9498 (0.0009) [2023-02-22 22:02:07,244][24717] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20535.5). Total num frames: 38903808. Throughput: 0: 5032.3. Samples: 8718262. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:07,245][24717] Avg episode reward: [(0, '40.366')] [2023-02-22 22:02:09,214][37090] Updated weights for policy 0, policy_version 9508 (0.0009) [2023-02-22 22:02:11,400][37090] Updated weights for policy 0, policy_version 9518 (0.0009) [2023-02-22 22:02:12,244][24717] Fps is (10 sec: 20070.3, 60 sec: 20275.2, 300 sec: 20493.9). Total num frames: 39002112. Throughput: 0: 5009.6. Samples: 8747622. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:12,245][24717] Avg episode reward: [(0, '42.879')] [2023-02-22 22:02:13,399][37090] Updated weights for policy 0, policy_version 9528 (0.0010) [2023-02-22 22:02:15,381][37090] Updated weights for policy 0, policy_version 9538 (0.0009) [2023-02-22 22:02:17,244][24717] Fps is (10 sec: 20070.2, 60 sec: 20275.2, 300 sec: 20466.1). Total num frames: 39104512. Throughput: 0: 5021.5. Samples: 8763000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:17,245][24717] Avg episode reward: [(0, '43.718')] [2023-02-22 22:02:17,370][37090] Updated weights for policy 0, policy_version 9548 (0.0009) [2023-02-22 22:02:19,205][37090] Updated weights for policy 0, policy_version 9558 (0.0006) [2023-02-22 22:02:21,022][37090] Updated weights for policy 0, policy_version 9568 (0.0009) [2023-02-22 22:02:22,244][24717] Fps is (10 sec: 21299.3, 60 sec: 20275.2, 300 sec: 20480.0). Total num frames: 39215104. Throughput: 0: 5077.3. Samples: 8795492. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:22,245][24717] Avg episode reward: [(0, '39.766')] [2023-02-22 22:02:23,013][37090] Updated weights for policy 0, policy_version 9578 (0.0007) [2023-02-22 22:02:25,013][37090] Updated weights for policy 0, policy_version 9588 (0.0009) [2023-02-22 22:02:27,042][37090] Updated weights for policy 0, policy_version 9598 (0.0010) [2023-02-22 22:02:27,244][24717] Fps is (10 sec: 21299.5, 60 sec: 20275.2, 300 sec: 20466.1). Total num frames: 39317504. Throughput: 0: 5096.1. Samples: 8826426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:27,246][24717] Avg episode reward: [(0, '35.518')] [2023-02-22 22:02:29,177][37090] Updated weights for policy 0, policy_version 9608 (0.0010) [2023-02-22 22:02:31,272][37090] Updated weights for policy 0, policy_version 9618 (0.0009) [2023-02-22 22:02:32,245][24717] Fps is (10 sec: 20069.9, 60 sec: 20275.1, 300 sec: 20438.3). Total num frames: 39415808. Throughput: 0: 5072.7. Samples: 8840600. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:32,247][24717] Avg episode reward: [(0, '38.648')] [2023-02-22 22:02:33,301][37090] Updated weights for policy 0, policy_version 9628 (0.0010) [2023-02-22 22:02:35,412][37090] Updated weights for policy 0, policy_version 9638 (0.0007) [2023-02-22 22:02:37,244][24717] Fps is (10 sec: 19660.7, 60 sec: 20275.2, 300 sec: 20424.5). Total num frames: 39514112. Throughput: 0: 5092.6. Samples: 8870520. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 22:02:37,245][24717] Avg episode reward: [(0, '40.827')] [2023-02-22 22:02:37,387][37090] Updated weights for policy 0, policy_version 9648 (0.0008) [2023-02-22 22:02:39,429][37090] Updated weights for policy 0, policy_version 9658 (0.0007) [2023-02-22 22:02:41,462][37090] Updated weights for policy 0, policy_version 9668 (0.0008) [2023-02-22 22:02:42,244][24717] Fps is (10 sec: 19661.2, 60 sec: 20206.9, 300 sec: 20396.7). Total num frames: 39612416. Throughput: 0: 5087.7. Samples: 8900896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:42,245][24717] Avg episode reward: [(0, '40.484')] [2023-02-22 22:02:43,446][37090] Updated weights for policy 0, policy_version 9678 (0.0009) [2023-02-22 22:02:45,473][37090] Updated weights for policy 0, policy_version 9688 (0.0009) [2023-02-22 22:02:47,244][24717] Fps is (10 sec: 20070.4, 60 sec: 20275.2, 300 sec: 20396.7). Total num frames: 39714816. Throughput: 0: 5090.7. Samples: 8916292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:47,246][24717] Avg episode reward: [(0, '39.314')] [2023-02-22 22:02:47,559][37090] Updated weights for policy 0, policy_version 9698 (0.0008) [2023-02-22 22:02:49,577][37090] Updated weights for policy 0, policy_version 9708 (0.0007) [2023-02-22 22:02:51,559][37090] Updated weights for policy 0, policy_version 9718 (0.0009) [2023-02-22 22:02:52,244][24717] Fps is (10 sec: 20480.0, 60 sec: 20343.5, 300 sec: 20382.8). Total num frames: 39817216. Throughput: 0: 5069.7. Samples: 8946398. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 22:02:52,248][24717] Avg episode reward: [(0, '38.136')] [2023-02-22 22:02:53,666][37090] Updated weights for policy 0, policy_version 9728 (0.0008) [2023-02-22 22:02:55,733][37090] Updated weights for policy 0, policy_version 9738 (0.0010) [2023-02-22 22:02:57,245][24717] Fps is (10 sec: 20069.0, 60 sec: 20275.0, 300 sec: 20368.9). Total num frames: 39915520. Throughput: 0: 5084.3. Samples: 8976418. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 22:02:57,247][24717] Avg episode reward: [(0, '38.180')] [2023-02-22 22:02:57,733][37090] Updated weights for policy 0, policy_version 9748 (0.0007) [2023-02-22 22:02:59,731][37090] Updated weights for policy 0, policy_version 9758 (0.0008) [2023-02-22 22:03:01,520][37076] Stopping Batcher_0... [2023-02-22 22:03:01,520][37076] Loop batcher_evt_loop terminating... [2023-02-22 22:03:01,522][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-02-22 22:03:01,520][24717] Component Batcher_0 stopped! [2023-02-22 22:03:01,540][37091] Stopping RolloutWorker_w0... [2023-02-22 22:03:01,541][37091] Loop rollout_proc0_evt_loop terminating... [2023-02-22 22:03:01,541][24717] Component RolloutWorker_w0 stopped! [2023-02-22 22:03:01,543][37090] Weights refcount: 2 0 [2023-02-22 22:03:01,543][37101] Stopping RolloutWorker_w2... [2023-02-22 22:03:01,544][24717] Component RolloutWorker_w2 stopped! [2023-02-22 22:03:01,544][37101] Loop rollout_proc2_evt_loop terminating... [2023-02-22 22:03:01,545][37090] Stopping InferenceWorker_p0-w0... [2023-02-22 22:03:01,546][37090] Loop inference_proc0-0_evt_loop terminating... [2023-02-22 22:03:01,546][24717] Component InferenceWorker_p0-w0 stopped! [2023-02-22 22:03:01,553][37111] Stopping RolloutWorker_w7... [2023-02-22 22:03:01,554][37111] Loop rollout_proc7_evt_loop terminating... 
[2023-02-22 22:03:01,553][24717] Component RolloutWorker_w7 stopped! [2023-02-22 22:03:01,556][37092] Stopping RolloutWorker_w1... [2023-02-22 22:03:01,556][37103] Stopping RolloutWorker_w3... [2023-02-22 22:03:01,556][37092] Loop rollout_proc1_evt_loop terminating... [2023-02-22 22:03:01,556][37103] Loop rollout_proc3_evt_loop terminating... [2023-02-22 22:03:01,557][24717] Component RolloutWorker_w1 stopped! [2023-02-22 22:03:01,559][37104] Stopping RolloutWorker_w6... [2023-02-22 22:03:01,559][37104] Loop rollout_proc6_evt_loop terminating... [2023-02-22 22:03:01,558][24717] Component RolloutWorker_w3 stopped! [2023-02-22 22:03:01,560][24717] Component RolloutWorker_w6 stopped! [2023-02-22 22:03:01,577][37106] Stopping RolloutWorker_w5... [2023-02-22 22:03:01,578][37106] Loop rollout_proc5_evt_loop terminating... [2023-02-22 22:03:01,577][24717] Component RolloutWorker_w5 stopped! [2023-02-22 22:03:01,594][37076] Removing /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000008808_36077568.pth [2023-02-22 22:03:01,602][37076] Saving /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-02-22 22:03:01,626][37109] Stopping RolloutWorker_w4... [2023-02-22 22:03:01,626][24717] Component RolloutWorker_w4 stopped! [2023-02-22 22:03:01,627][37109] Loop rollout_proc4_evt_loop terminating... [2023-02-22 22:03:01,703][37076] Stopping LearnerWorker_p0... [2023-02-22 22:03:01,704][37076] Loop learner_proc0_evt_loop terminating... [2023-02-22 22:03:01,703][24717] Component LearnerWorker_p0 stopped! [2023-02-22 22:03:01,704][24717] Waiting for process learner_proc0 to stop... [2023-02-22 22:03:02,665][24717] Waiting for process inference_proc0-0 to join... [2023-02-22 22:03:02,666][24717] Waiting for process rollout_proc0 to join... [2023-02-22 22:03:02,667][24717] Waiting for process rollout_proc1 to join... [2023-02-22 22:03:02,668][24717] Waiting for process rollout_proc2 to join... [2023-02-22 22:03:02,669][24717] Waiting for process rollout_proc3 to join... [2023-02-22 22:03:02,670][24717] Waiting for process rollout_proc4 to join... [2023-02-22 22:03:02,671][24717] Waiting for process rollout_proc5 to join... [2023-02-22 22:03:02,672][24717] Waiting for process rollout_proc6 to join... [2023-02-22 22:03:02,673][24717] Waiting for process rollout_proc7 to join... 
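The shutdown sequence above ends with the learner writing one last checkpoint (checkpoint_000009767_40005632.pth) and deleting an older one (checkpoint_000008808_36077568.pth), the same save-then-prune pattern that appears periodically during training. A minimal sketch of that rotation is shown below; it only mirrors the checkpoint_{version:09d}_{env_steps}.pth naming visible in the log, and save_checkpoint together with keep_last are hypothetical helpers, not Sample Factory's actual implementation.

    # Minimal sketch of the save-then-prune checkpoint rotation seen in the log.
    # The file naming mirrors the paths printed above; the helper itself and
    # keep_last are assumptions, not Sample Factory's real code.
    import glob
    import os

    import torch


    def save_checkpoint(model, checkpoint_dir, policy_version, env_steps, keep_last=2):
        os.makedirs(checkpoint_dir, exist_ok=True)
        path = os.path.join(checkpoint_dir, f"checkpoint_{policy_version:09d}_{env_steps}.pth")
        torch.save(
            {"model": model.state_dict(), "policy_version": policy_version, "env_steps": env_steps},
            path,
        )
        # Zero-padded version numbers sort lexicographically, so the oldest files
        # come first; removing them reproduces the "Removing ..." log entries.
        checkpoints = sorted(glob.glob(os.path.join(checkpoint_dir, "checkpoint_*.pth")))
        for old in checkpoints[:-keep_last]:
            os.remove(old)
        return path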
[2023-02-22 22:03:02,674][24717] Batcher 0 profile tree view:
batching: 125.2472, releasing_batches: 0.1732
[2023-02-22 22:03:02,675][24717] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 22.1416
update_model: 22.4268
  weight_update: 0.0009
one_step: 0.0128
  handle_policy_step: 1495.7790
    deserialize: 75.2391, stack: 8.3199, obs_to_device_normalize: 384.8088, forward: 622.6954, send_messages: 128.0487
    prepare_outputs: 211.0680
      to_cpu: 130.5185
[2023-02-22 22:03:02,676][24717] Learner 0 profile tree view:
misc: 0.0391, prepare_batch: 46.5270
train: 178.5467
  epoch_init: 0.0384, minibatch_init: 0.0426, losses_postprocess: 2.6196, kl_divergence: 2.8618, after_optimizer: 2.5086
  calculate_losses: 70.1534
    losses_init: 0.0209, forward_head: 6.9000, bptt_initial: 42.2469, tail: 3.9804, advantages_returns: 1.1027, losses: 6.8977
    bptt: 7.7898
      bptt_forward_core: 7.4842
  update: 97.7856
    clip: 7.5269
[2023-02-22 22:03:02,677][24717] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.1946, enqueue_policy_requests: 51.0600, env_step: 692.4753, overhead: 60.1363, complete_rollouts: 4.0683
save_policy_outputs: 53.5847
  split_output_tensors: 26.3238
[2023-02-22 22:03:02,678][24717] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 1.2092, enqueue_policy_requests: 51.0892, env_step: 698.7843, overhead: 61.1465, complete_rollouts: 4.7165
save_policy_outputs: 53.0975
  split_output_tensors: 25.9738
[2023-02-22 22:03:02,679][24717] Loop Runner_EvtLoop terminating...
[2023-02-22 22:03:02,682][24717] Runner profile tree view:
main_loop: 1633.7512
[2023-02-22 22:03:02,683][24717] Collected {0: 40005632}, FPS: 22035.0
[2023-02-22 22:03:17,333][24717] Loading existing experiment configuration from /home/flahoud/studies/collab/train_dir/default_experiment/config.json
[2023-02-22 22:03:17,334][24717] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 22:03:17,335][24717] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 22:03:17,336][24717] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 22:03:17,337][24717] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 22:03:17,337][24717] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 22:03:17,338][24717] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 22:03:17,339][24717] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 22:03:17,339][24717] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-22 22:03:17,340][24717] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-22 22:03:17,341][24717] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 22:03:17,341][24717] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 22:03:17,342][24717] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 22:03:17,342][24717] Adding new argument 'enjoy_script'=None that is not in the saved config file!
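Right before the evaluation starts, the saved training configuration is re-loaded from config.json, num_workers is overridden from the command line, and a set of evaluation-only arguments (no_render, save_video, max_num_episodes, and so on) is added because they were never part of the training config. The sketch below illustrates that merge logic; load_eval_config and its arguments are made up for illustration and are not the actual Sample Factory code that produced the messages above.

    # Illustrative merge of a saved training config with command-line overrides and
    # evaluation-only defaults, reproducing the "Overriding arg ..." and
    # "Adding new argument ..." messages above. All names here are assumptions.
    import json


    def load_eval_config(config_path, cli_overrides, eval_defaults):
        with open(config_path) as f:
            cfg = json.load(f)
        for key, value in cli_overrides.items():
            print(f"Overriding arg '{key}' with value {value} passed from command line")
            cfg[key] = value
        for key, value in eval_defaults.items():
            if key not in cfg:
                print(f"Adding new argument '{key}'={value} that is not in the saved config file!")
                cfg[key] = value
        return cfg


    # Example mirroring the log (only a few of the evaluation-only arguments shown):
    # cfg = load_eval_config(
    #     "train_dir/default_experiment/config.json",
    #     cli_overrides={"num_workers": 1},
    #     eval_defaults={"no_render": True, "save_video": True, "max_num_episodes": 10},
    # )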
[2023-02-22 22:03:17,343][24717] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 22:03:17,361][24717] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 22:03:17,362][24717] RunningMeanStd input shape: (1,) [2023-02-22 22:03:17,372][24717] ConvEncoder: input_channels=3 [2023-02-22 22:03:17,417][24717] Conv encoder output size: 512 [2023-02-22 22:03:17,419][24717] Policy head output size: 512 [2023-02-22 22:03:17,450][24717] Loading state from checkpoint /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-02-22 22:03:17,959][24717] Num frames 100... [2023-02-22 22:03:18,060][24717] Num frames 200... [2023-02-22 22:03:18,168][24717] Num frames 300... [2023-02-22 22:03:18,275][24717] Num frames 400... [2023-02-22 22:03:18,379][24717] Num frames 500... [2023-02-22 22:03:18,493][24717] Num frames 600... [2023-02-22 22:03:18,598][24717] Num frames 700... [2023-02-22 22:03:18,704][24717] Num frames 800... [2023-02-22 22:03:18,807][24717] Num frames 900... [2023-02-22 22:03:18,927][24717] Num frames 1000... [2023-02-22 22:03:19,032][24717] Num frames 1100... [2023-02-22 22:03:19,140][24717] Num frames 1200... [2023-02-22 22:03:19,244][24717] Num frames 1300... [2023-02-22 22:03:19,349][24717] Num frames 1400... [2023-02-22 22:03:19,460][24717] Num frames 1500... [2023-02-22 22:03:19,519][24717] Avg episode rewards: #0: 39.040, true rewards: #0: 15.040 [2023-02-22 22:03:19,520][24717] Avg episode reward: 39.040, avg true_objective: 15.040 [2023-02-22 22:03:19,624][24717] Num frames 1600... [2023-02-22 22:03:19,723][24717] Num frames 1700... [2023-02-22 22:03:19,822][24717] Num frames 1800... [2023-02-22 22:03:19,923][24717] Num frames 1900... [2023-02-22 22:03:20,023][24717] Num frames 2000... [2023-02-22 22:03:20,127][24717] Num frames 2100... [2023-02-22 22:03:20,230][24717] Num frames 2200... [2023-02-22 22:03:20,332][24717] Num frames 2300... [2023-02-22 22:03:20,436][24717] Num frames 2400... [2023-02-22 22:03:20,528][24717] Avg episode rewards: #0: 31.160, true rewards: #0: 12.160 [2023-02-22 22:03:20,529][24717] Avg episode reward: 31.160, avg true_objective: 12.160 [2023-02-22 22:03:20,601][24717] Num frames 2500... [2023-02-22 22:03:20,706][24717] Num frames 2600... [2023-02-22 22:03:20,812][24717] Num frames 2700... [2023-02-22 22:03:20,913][24717] Num frames 2800... [2023-02-22 22:03:21,014][24717] Num frames 2900... [2023-02-22 22:03:21,124][24717] Num frames 3000... [2023-02-22 22:03:21,258][24717] Num frames 3100... [2023-02-22 22:03:21,372][24717] Num frames 3200... [2023-02-22 22:03:21,485][24717] Num frames 3300... [2023-02-22 22:03:21,602][24717] Num frames 3400... [2023-02-22 22:03:21,712][24717] Num frames 3500... [2023-02-22 22:03:21,827][24717] Num frames 3600... [2023-02-22 22:03:21,933][24717] Num frames 3700... [2023-02-22 22:03:22,048][24717] Num frames 3800... [2023-02-22 22:03:22,163][24717] Num frames 3900... [2023-02-22 22:03:22,277][24717] Num frames 4000... [2023-02-22 22:03:22,403][24717] Num frames 4100... [2023-02-22 22:03:22,517][24717] Num frames 4200... [2023-02-22 22:03:22,637][24717] Num frames 4300... [2023-02-22 22:03:22,748][24717] Num frames 4400... [2023-02-22 22:03:22,866][24717] Num frames 4500... [2023-02-22 22:03:22,954][24717] Avg episode rewards: #0: 41.106, true rewards: #0: 15.107 [2023-02-22 22:03:22,956][24717] Avg episode reward: 41.106, avg true_objective: 15.107 [2023-02-22 22:03:23,037][24717] Num frames 4600... 
[2023-02-22 22:03:23,159][24717] Num frames 4700... [2023-02-22 22:03:23,270][24717] Num frames 4800... [2023-02-22 22:03:23,392][24717] Num frames 4900... [2023-02-22 22:03:23,511][24717] Num frames 5000... [2023-02-22 22:03:23,619][24717] Num frames 5100... [2023-02-22 22:03:23,727][24717] Num frames 5200... [2023-02-22 22:03:23,841][24717] Num frames 5300... [2023-02-22 22:03:23,965][24717] Num frames 5400... [2023-02-22 22:03:24,077][24717] Num frames 5500... [2023-02-22 22:03:24,196][24717] Num frames 5600... [2023-02-22 22:03:24,302][24717] Num frames 5700... [2023-02-22 22:03:24,417][24717] Num frames 5800... [2023-02-22 22:03:24,532][24717] Num frames 5900... [2023-02-22 22:03:24,658][24717] Num frames 6000... [2023-02-22 22:03:24,800][24717] Num frames 6100... [2023-02-22 22:03:24,940][24717] Num frames 6200... [2023-02-22 22:03:25,054][24717] Num frames 6300... [2023-02-22 22:03:25,171][24717] Num frames 6400... [2023-02-22 22:03:25,302][24717] Num frames 6500... [2023-02-22 22:03:25,430][24717] Num frames 6600... [2023-02-22 22:03:25,529][24717] Avg episode rewards: #0: 44.829, true rewards: #0: 16.580 [2023-02-22 22:03:25,530][24717] Avg episode reward: 44.829, avg true_objective: 16.580 [2023-02-22 22:03:25,617][24717] Num frames 6700... [2023-02-22 22:03:25,728][24717] Num frames 6800... [2023-02-22 22:03:25,853][24717] Num frames 6900... [2023-02-22 22:03:25,965][24717] Num frames 7000... [2023-02-22 22:03:26,072][24717] Num frames 7100... [2023-02-22 22:03:26,194][24717] Num frames 7200... [2023-02-22 22:03:26,304][24717] Num frames 7300... [2023-02-22 22:03:26,441][24717] Num frames 7400... [2023-02-22 22:03:26,551][24717] Num frames 7500... [2023-02-22 22:03:26,678][24717] Num frames 7600... [2023-02-22 22:03:26,795][24717] Num frames 7700... [2023-02-22 22:03:26,912][24717] Num frames 7800... [2023-02-22 22:03:27,037][24717] Num frames 7900... [2023-02-22 22:03:27,154][24717] Num frames 8000... [2023-02-22 22:03:27,280][24717] Num frames 8100... [2023-02-22 22:03:27,404][24717] Num frames 8200... [2023-02-22 22:03:27,501][24717] Avg episode rewards: #0: 42.864, true rewards: #0: 16.464 [2023-02-22 22:03:27,502][24717] Avg episode reward: 42.864, avg true_objective: 16.464 [2023-02-22 22:03:27,578][24717] Num frames 8300... [2023-02-22 22:03:27,689][24717] Num frames 8400... [2023-02-22 22:03:27,805][24717] Num frames 8500... [2023-02-22 22:03:27,914][24717] Num frames 8600... [2023-02-22 22:03:28,039][24717] Num frames 8700... [2023-02-22 22:03:28,146][24717] Num frames 8800... [2023-02-22 22:03:28,259][24717] Num frames 8900... [2023-02-22 22:03:28,372][24717] Num frames 9000... [2023-02-22 22:03:28,494][24717] Num frames 9100... [2023-02-22 22:03:28,622][24717] Num frames 9200... [2023-02-22 22:03:28,753][24717] Num frames 9300... [2023-02-22 22:03:28,879][24717] Num frames 9400... [2023-02-22 22:03:29,034][24717] Num frames 9500... [2023-02-22 22:03:29,175][24717] Num frames 9600... [2023-02-22 22:03:29,301][24717] Num frames 9700... [2023-02-22 22:03:29,454][24717] Num frames 9800... [2023-02-22 22:03:29,573][24717] Num frames 9900... [2023-02-22 22:03:29,696][24717] Num frames 10000... [2023-02-22 22:03:29,817][24717] Num frames 10100... [2023-02-22 22:03:29,946][24717] Num frames 10200... [2023-02-22 22:03:30,020][24717] Avg episode rewards: #0: 44.526, true rewards: #0: 17.027 [2023-02-22 22:03:30,021][24717] Avg episode reward: 44.526, avg true_objective: 17.027 [2023-02-22 22:03:30,122][24717] Num frames 10300... 
[2023-02-22 22:03:30,238][24717] Num frames 10400... [2023-02-22 22:03:30,355][24717] Num frames 10500... [2023-02-22 22:03:30,470][24717] Num frames 10600... [2023-02-22 22:03:30,599][24717] Num frames 10700... [2023-02-22 22:03:30,735][24717] Num frames 10800... [2023-02-22 22:03:30,854][24717] Num frames 10900... [2023-02-22 22:03:30,932][24717] Avg episode rewards: #0: 39.885, true rewards: #0: 15.600 [2023-02-22 22:03:30,934][24717] Avg episode reward: 39.885, avg true_objective: 15.600 [2023-02-22 22:03:31,038][24717] Num frames 11000... [2023-02-22 22:03:31,166][24717] Num frames 11100... [2023-02-22 22:03:31,294][24717] Num frames 11200... [2023-02-22 22:03:31,424][24717] Num frames 11300... [2023-02-22 22:03:31,574][24717] Num frames 11400... [2023-02-22 22:03:31,694][24717] Num frames 11500... [2023-02-22 22:03:31,815][24717] Num frames 11600... [2023-02-22 22:03:31,936][24717] Num frames 11700... [2023-02-22 22:03:32,054][24717] Num frames 11800... [2023-02-22 22:03:32,168][24717] Num frames 11900... [2023-02-22 22:03:32,285][24717] Num frames 12000... [2023-02-22 22:03:32,414][24717] Num frames 12100... [2023-02-22 22:03:32,536][24717] Num frames 12200... [2023-02-22 22:03:32,660][24717] Num frames 12300... [2023-02-22 22:03:32,790][24717] Num frames 12400... [2023-02-22 22:03:32,909][24717] Num frames 12500... [2023-02-22 22:03:33,029][24717] Num frames 12600... [2023-02-22 22:03:33,151][24717] Num frames 12700... [2023-02-22 22:03:33,290][24717] Num frames 12800... [2023-02-22 22:03:33,415][24717] Num frames 12900... [2023-02-22 22:03:33,530][24717] Num frames 13000... [2023-02-22 22:03:33,608][24717] Avg episode rewards: #0: 41.774, true rewards: #0: 16.275 [2023-02-22 22:03:33,609][24717] Avg episode reward: 41.774, avg true_objective: 16.275 [2023-02-22 22:03:33,715][24717] Num frames 13100... [2023-02-22 22:03:33,830][24717] Num frames 13200... [2023-02-22 22:03:33,941][24717] Num frames 13300... [2023-02-22 22:03:34,054][24717] Num frames 13400... [2023-02-22 22:03:34,169][24717] Num frames 13500... [2023-02-22 22:03:34,289][24717] Num frames 13600... [2023-02-22 22:03:34,411][24717] Num frames 13700... [2023-02-22 22:03:34,527][24717] Num frames 13800... [2023-02-22 22:03:34,641][24717] Num frames 13900... [2023-02-22 22:03:34,753][24717] Num frames 14000... [2023-02-22 22:03:34,864][24717] Num frames 14100... [2023-02-22 22:03:34,996][24717] Num frames 14200... [2023-02-22 22:03:35,113][24717] Num frames 14300... [2023-02-22 22:03:35,231][24717] Num frames 14400... [2023-02-22 22:03:35,343][24717] Num frames 14500... [2023-02-22 22:03:35,465][24717] Num frames 14600... [2023-02-22 22:03:35,575][24717] Num frames 14700... [2023-02-22 22:03:35,696][24717] Num frames 14800... [2023-02-22 22:03:35,805][24717] Num frames 14900... [2023-02-22 22:03:35,947][24717] Num frames 15000... [2023-02-22 22:03:36,047][24717] Avg episode rewards: #0: 42.150, true rewards: #0: 16.707 [2023-02-22 22:03:36,048][24717] Avg episode reward: 42.150, avg true_objective: 16.707 [2023-02-22 22:03:36,128][24717] Num frames 15100... [2023-02-22 22:03:36,239][24717] Num frames 15200... [2023-02-22 22:03:36,348][24717] Num frames 15300... [2023-02-22 22:03:36,456][24717] Num frames 15400... [2023-02-22 22:03:36,571][24717] Num frames 15500... [2023-02-22 22:03:36,691][24717] Num frames 15600... 
[2023-02-22 22:03:36,761][24717] Avg episode rewards: #0: 38.911, true rewards: #0: 15.612 [2023-02-22 22:03:36,763][24717] Avg episode reward: 38.911, avg true_objective: 15.612 [2023-02-22 22:04:08,623][24717] Replay video saved to /home/flahoud/studies/collab/train_dir/default_experiment/replay.mp4! [2023-02-22 22:04:18,600][24717] Loading existing experiment configuration from /home/flahoud/studies/collab/train_dir/default_experiment/config.json [2023-02-22 22:04:18,601][24717] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-22 22:04:18,602][24717] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-22 22:04:18,602][24717] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-22 22:04:18,603][24717] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-22 22:04:18,603][24717] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-22 22:04:18,604][24717] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-22 22:04:18,604][24717] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-22 22:04:18,605][24717] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-22 22:04:18,605][24717] Adding new argument 'hf_repository'='GrimReaperSam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-22 22:04:18,607][24717] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-22 22:04:18,608][24717] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-22 22:04:18,608][24717] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-22 22:04:18,609][24717] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-22 22:04:18,609][24717] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 22:04:18,621][24717] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 22:04:18,622][24717] RunningMeanStd input shape: (1,) [2023-02-22 22:04:18,642][24717] ConvEncoder: input_channels=3 [2023-02-22 22:04:18,719][24717] Conv encoder output size: 512 [2023-02-22 22:04:18,720][24717] Policy head output size: 512 [2023-02-22 22:04:18,750][24717] Loading state from checkpoint /home/flahoud/studies/collab/train_dir/default_experiment/checkpoint_p0/checkpoint_000009767_40005632.pth... [2023-02-22 22:04:19,263][24717] Num frames 100... [2023-02-22 22:04:19,386][24717] Num frames 200... [2023-02-22 22:04:19,494][24717] Num frames 300... [2023-02-22 22:04:19,598][24717] Num frames 400... [2023-02-22 22:04:19,698][24717] Num frames 500... [2023-02-22 22:04:19,802][24717] Num frames 600... [2023-02-22 22:04:19,907][24717] Num frames 700... [2023-02-22 22:04:20,018][24717] Num frames 800... [2023-02-22 22:04:20,118][24717] Num frames 900... [2023-02-22 22:04:20,209][24717] Avg episode rewards: #0: 17.350, true rewards: #0: 9.350 [2023-02-22 22:04:20,211][24717] Avg episode reward: 17.350, avg true_objective: 9.350 [2023-02-22 22:04:20,289][24717] Num frames 1000... [2023-02-22 22:04:20,393][24717] Num frames 1100... [2023-02-22 22:04:20,494][24717] Num frames 1200... [2023-02-22 22:04:20,594][24717] Num frames 1300... [2023-02-22 22:04:20,692][24717] Num frames 1400... [2023-02-22 22:04:20,786][24717] Num frames 1500... [2023-02-22 22:04:20,890][24717] Num frames 1600... 
[2023-02-22 22:04:20,992][24717] Num frames 1700... [2023-02-22 22:04:21,102][24717] Num frames 1800... [2023-02-22 22:04:21,211][24717] Num frames 1900... [2023-02-22 22:04:21,311][24717] Num frames 2000... [2023-02-22 22:04:21,409][24717] Num frames 2100... [2023-02-22 22:04:21,505][24717] Num frames 2200... [2023-02-22 22:04:21,613][24717] Num frames 2300... [2023-02-22 22:04:21,713][24717] Num frames 2400... [2023-02-22 22:04:21,828][24717] Num frames 2500... [2023-02-22 22:04:21,950][24717] Num frames 2600... [2023-02-22 22:04:22,073][24717] Num frames 2700... [2023-02-22 22:04:22,196][24717] Num frames 2800... [2023-02-22 22:04:22,300][24717] Num frames 2900... [2023-02-22 22:04:22,404][24717] Num frames 3000... [2023-02-22 22:04:22,499][24717] Avg episode rewards: #0: 37.675, true rewards: #0: 15.175 [2023-02-22 22:04:22,501][24717] Avg episode reward: 37.675, avg true_objective: 15.175 [2023-02-22 22:04:22,579][24717] Num frames 3100... [2023-02-22 22:04:22,683][24717] Num frames 3200... [2023-02-22 22:04:22,788][24717] Num frames 3300... [2023-02-22 22:04:22,891][24717] Num frames 3400... [2023-02-22 22:04:22,994][24717] Num frames 3500... [2023-02-22 22:04:23,105][24717] Num frames 3600... [2023-02-22 22:04:23,216][24717] Avg episode rewards: #0: 29.176, true rewards: #0: 12.177 [2023-02-22 22:04:23,218][24717] Avg episode reward: 29.176, avg true_objective: 12.177 [2023-02-22 22:04:23,279][24717] Num frames 3700... [2023-02-22 22:04:23,401][24717] Num frames 3800... [2023-02-22 22:04:23,524][24717] Num frames 3900... [2023-02-22 22:04:23,650][24717] Num frames 4000... [2023-02-22 22:04:23,775][24717] Num frames 4100... [2023-02-22 22:04:23,900][24717] Num frames 4200... [2023-02-22 22:04:24,037][24717] Num frames 4300... [2023-02-22 22:04:24,151][24717] Num frames 4400... [2023-02-22 22:04:24,252][24717] Num frames 4500... [2023-02-22 22:04:24,362][24717] Num frames 4600... [2023-02-22 22:04:24,462][24717] Num frames 4700... [2023-02-22 22:04:24,571][24717] Num frames 4800... [2023-02-22 22:04:24,680][24717] Num frames 4900... [2023-02-22 22:04:24,787][24717] Num frames 5000... [2023-02-22 22:04:24,942][24717] Avg episode rewards: #0: 30.482, true rewards: #0: 12.732 [2023-02-22 22:04:24,943][24717] Avg episode reward: 30.482, avg true_objective: 12.732 [2023-02-22 22:04:24,951][24717] Num frames 5100... [2023-02-22 22:04:25,086][24717] Num frames 5200... [2023-02-22 22:04:25,194][24717] Num frames 5300... [2023-02-22 22:04:25,330][24717] Num frames 5400... [2023-02-22 22:04:25,468][24717] Num frames 5500... [2023-02-22 22:04:25,608][24717] Num frames 5600... [2023-02-22 22:04:25,729][24717] Num frames 5700... [2023-02-22 22:04:25,848][24717] Num frames 5800... [2023-02-22 22:04:25,969][24717] Num frames 5900... [2023-02-22 22:04:26,089][24717] Num frames 6000... [2023-02-22 22:04:26,222][24717] Num frames 6100... [2023-02-22 22:04:26,328][24717] Num frames 6200... [2023-02-22 22:04:26,441][24717] Num frames 6300... [2023-02-22 22:04:26,578][24717] Num frames 6400... [2023-02-22 22:04:26,641][24717] Avg episode rewards: #0: 30.210, true rewards: #0: 12.810 [2023-02-22 22:04:26,643][24717] Avg episode reward: 30.210, avg true_objective: 12.810 [2023-02-22 22:04:26,756][24717] Num frames 6500... [2023-02-22 22:04:26,864][24717] Num frames 6600... [2023-02-22 22:04:26,982][24717] Num frames 6700... [2023-02-22 22:04:27,104][24717] Num frames 6800... [2023-02-22 22:04:27,216][24717] Num frames 6900... [2023-02-22 22:04:27,322][24717] Num frames 7000... 
[2023-02-22 22:04:27,453][24717] Num frames 7100... [2023-02-22 22:04:27,573][24717] Num frames 7200... [2023-02-22 22:04:27,714][24717] Num frames 7300... [2023-02-22 22:04:27,846][24717] Num frames 7400... [2023-02-22 22:04:27,990][24717] Num frames 7500... [2023-02-22 22:04:28,084][24717] Avg episode rewards: #0: 29.042, true rewards: #0: 12.542 [2023-02-22 22:04:28,085][24717] Avg episode reward: 29.042, avg true_objective: 12.542 [2023-02-22 22:04:28,180][24717] Num frames 7600... [2023-02-22 22:04:28,294][24717] Num frames 7700... [2023-02-22 22:04:28,402][24717] Num frames 7800... [2023-02-22 22:04:28,519][24717] Num frames 7900... [2023-02-22 22:04:28,640][24717] Num frames 8000... [2023-02-22 22:04:28,764][24717] Num frames 8100... [2023-02-22 22:04:28,879][24717] Num frames 8200... [2023-02-22 22:04:29,003][24717] Num frames 8300... [2023-02-22 22:04:29,135][24717] Num frames 8400... [2023-02-22 22:04:29,243][24717] Num frames 8500... [2023-02-22 22:04:29,346][24717] Num frames 8600... [2023-02-22 22:04:29,449][24717] Num frames 8700... [2023-02-22 22:04:29,555][24717] Num frames 8800... [2023-02-22 22:04:29,657][24717] Num frames 8900... [2023-02-22 22:04:29,773][24717] Num frames 9000... [2023-02-22 22:04:29,882][24717] Num frames 9100... [2023-02-22 22:04:29,987][24717] Num frames 9200... [2023-02-22 22:04:30,105][24717] Num frames 9300... [2023-02-22 22:04:30,216][24717] Num frames 9400... [2023-02-22 22:04:30,328][24717] Num frames 9500... [2023-02-22 22:04:30,457][24717] Num frames 9600... [2023-02-22 22:04:30,543][24717] Avg episode rewards: #0: 32.464, true rewards: #0: 13.750 [2023-02-22 22:04:30,544][24717] Avg episode reward: 32.464, avg true_objective: 13.750 [2023-02-22 22:04:30,627][24717] Num frames 9700... [2023-02-22 22:04:30,745][24717] Num frames 9800... [2023-02-22 22:04:30,858][24717] Num frames 9900... [2023-02-22 22:04:30,985][24717] Num frames 10000... [2023-02-22 22:04:31,106][24717] Num frames 10100... [2023-02-22 22:04:31,227][24717] Num frames 10200... [2023-02-22 22:04:31,361][24717] Num frames 10300... [2023-02-22 22:04:31,492][24717] Num frames 10400... [2023-02-22 22:04:31,633][24717] Num frames 10500... [2023-02-22 22:04:31,755][24717] Avg episode rewards: #0: 30.941, true rewards: #0: 13.191 [2023-02-22 22:04:31,756][24717] Avg episode reward: 30.941, avg true_objective: 13.191 [2023-02-22 22:04:31,814][24717] Num frames 10600... [2023-02-22 22:04:31,927][24717] Num frames 10700... [2023-02-22 22:04:32,042][24717] Num frames 10800... [2023-02-22 22:04:32,168][24717] Num frames 10900... [2023-02-22 22:04:32,299][24717] Num frames 11000... [2023-02-22 22:04:32,420][24717] Num frames 11100... [2023-02-22 22:04:32,532][24717] Num frames 11200... [2023-02-22 22:04:32,654][24717] Num frames 11300... [2023-02-22 22:04:32,773][24717] Num frames 11400... [2023-02-22 22:04:32,894][24717] Avg episode rewards: #0: 29.721, true rewards: #0: 12.721 [2023-02-22 22:04:32,895][24717] Avg episode reward: 29.721, avg true_objective: 12.721 [2023-02-22 22:04:32,952][24717] Num frames 11500... [2023-02-22 22:04:33,078][24717] Num frames 11600... [2023-02-22 22:04:33,208][24717] Num frames 11700... [2023-02-22 22:04:33,353][24717] Num frames 11800... [2023-02-22 22:04:33,490][24717] Num frames 11900... [2023-02-22 22:04:33,628][24717] Num frames 12000... [2023-02-22 22:04:33,742][24717] Num frames 12100... [2023-02-22 22:04:33,858][24717] Num frames 12200... [2023-02-22 22:04:33,970][24717] Num frames 12300... [2023-02-22 22:04:34,076][24717] Num frames 12400... 
[2023-02-22 22:04:34,188][24717] Num frames 12500... [2023-02-22 22:04:34,301][24717] Num frames 12600... [2023-02-22 22:04:34,412][24717] Num frames 12700... [2023-02-22 22:04:34,539][24717] Num frames 12800... [2023-02-22 22:04:34,658][24717] Num frames 12900... [2023-02-22 22:04:34,777][24717] Num frames 13000... [2023-02-22 22:04:34,898][24717] Num frames 13100... [2023-02-22 22:04:35,011][24717] Num frames 13200... [2023-02-22 22:04:35,137][24717] Num frames 13300... [2023-02-22 22:04:35,256][24717] Num frames 13400... [2023-02-22 22:04:35,381][24717] Num frames 13500... [2023-02-22 22:04:35,503][24717] Avg episode rewards: #0: 32.549, true rewards: #0: 13.549 [2023-02-22 22:04:35,504][24717] Avg episode reward: 32.549, avg true_objective: 13.549 [2023-02-22 22:05:02,988][24717] Replay video saved to /home/flahoud/studies/collab/train_dir/default_experiment/replay.mp4!
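This second evaluation repeats the first one, but with push_to_hub=True and hf_repository='GrimReaperSam/rl_course_vizdoom_health_gathering_supreme', so after the replay video is written the experiment directory (config.json, the checkpoint_p0 checkpoints, replay.mp4) is uploaded to the Hugging Face Hub. Below is a minimal, hand-rolled equivalent of that upload using huggingface_hub; it assumes a prior huggingface-cli login and does not reproduce the exact file selection or model-card generation that Sample Factory performs.

    # Minimal sketch of pushing the experiment directory to the Hub with
    # huggingface_hub; an illustration, not Sample Factory's own upload logic.
    from huggingface_hub import HfApi

    api = HfApi()
    repo_id = "GrimReaperSam/rl_course_vizdoom_health_gathering_supreme"  # from the log

    # Create the model repo if it does not exist yet.
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)

    # Upload config.json, checkpoint_p0/*.pth, replay.mp4, etc. in one call.
    api.upload_folder(
        folder_path="/home/flahoud/studies/collab/train_dir/default_experiment",
        repo_id=repo_id,
        repo_type="model",
    )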