diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,3992 @@
+[2024-08-24 20:04:17,014][01192] Saving configuration to /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/config.json...
+[2024-08-24 20:04:17,014][01192] Rollout worker 0 uses device cpu
+[2024-08-24 20:04:17,015][01192] Rollout worker 1 uses device cpu
+[2024-08-24 20:04:17,015][01192] Rollout worker 2 uses device cpu
+[2024-08-24 20:04:17,016][01192] Rollout worker 3 uses device cpu
+[2024-08-24 20:04:17,016][01192] Rollout worker 4 uses device cpu
+[2024-08-24 20:04:17,016][01192] Rollout worker 5 uses device cpu
+[2024-08-24 20:04:17,017][01192] Rollout worker 6 uses device cpu
+[2024-08-24 20:04:17,017][01192] Rollout worker 7 uses device cpu
+[2024-08-24 20:04:17,054][01192] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-08-24 20:04:17,055][01192] InferenceWorker_p0-w0: min num requests: 2
+[2024-08-24 20:04:17,066][01192] Starting all processes...
+[2024-08-24 20:04:17,067][01192] Starting process learner_proc0
+[2024-08-24 20:04:17,116][01192] Starting all processes...
+[2024-08-24 20:04:17,120][01192] Starting process inference_proc0-0
+[2024-08-24 20:04:17,120][01192] Starting process rollout_proc0
+[2024-08-24 20:04:17,120][01192] Starting process rollout_proc1
+[2024-08-24 20:04:17,121][01192] Starting process rollout_proc2
+[2024-08-24 20:04:17,121][01192] Starting process rollout_proc3
+[2024-08-24 20:04:17,121][01192] Starting process rollout_proc4
+[2024-08-24 20:04:17,122][01192] Starting process rollout_proc5
+[2024-08-24 20:04:17,122][01192] Starting process rollout_proc6
+[2024-08-24 20:04:17,122][01192] Starting process rollout_proc7
+[2024-08-24 20:04:17,908][03430] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-08-24 20:04:17,908][03430] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2024-08-24 20:04:17,908][03417] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-08-24 20:04:17,908][03417] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2024-08-24 20:04:17,926][03417] Num visible devices: 1
+[2024-08-24 20:04:17,926][03430] Num visible devices: 1
+[2024-08-24 20:04:17,960][03463] Worker 0 uses CPU cores [0, 1, 2, 3]
+[2024-08-24 20:04:17,978][03417] Starting seed is not provided
+[2024-08-24 20:04:17,978][03417] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-08-24 20:04:17,978][03417] Initializing actor-critic model on device cuda:0
+[2024-08-24 20:04:17,978][03417] RunningMeanStd input shape: (3, 72, 128)
+[2024-08-24 20:04:17,979][03417] RunningMeanStd input shape: (1,)
+[2024-08-24 20:04:17,984][03417] ConvEncoder: input_channels=3
+[2024-08-24 20:04:17,992][03469] Worker 6 uses CPU cores [24, 25, 26, 27]
+[2024-08-24 20:04:17,999][03467] Worker 4 uses CPU cores [16, 17, 18, 19]
+[2024-08-24 20:04:18,002][03470] Worker 7 uses CPU cores [28, 29, 30, 31]
+[2024-08-24 20:04:18,038][03466] Worker 3 uses CPU cores [12, 13, 14, 15]
+[2024-08-24 20:04:18,052][03464] Worker 1 uses CPU cores [4, 5, 6, 7]
+[2024-08-24 20:04:18,052][03468] Worker 5 uses CPU cores [20, 21, 22, 23]
+[2024-08-24 20:04:18,097][03465] Worker 2 uses CPU cores [8, 9, 10, 11]
+[2024-08-24 20:04:18,111][03417] Conv encoder output size: 512
+[2024-08-24 20:04:18,111][03417] Policy head output size: 512
+[2024-08-24 20:04:18,130][03417] Created Actor Critic model with architecture:
+[2024-08-24 20:04:18,131][03417] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): VizdoomEncoder(
+    (basic_encoder): ConvEncoder(
+      (enc): RecursiveScriptModule(
+        original_name=ConvEncoderImpl
+        (conv_head): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Conv2d)
+          (1): RecursiveScriptModule(original_name=ELU)
+          (2): RecursiveScriptModule(original_name=Conv2d)
+          (3): RecursiveScriptModule(original_name=ELU)
+          (4): RecursiveScriptModule(original_name=Conv2d)
+          (5): RecursiveScriptModule(original_name=ELU)
+        )
+        (mlp_layers): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Linear)
+          (1): RecursiveScriptModule(original_name=ELU)
+        )
+      )
+    )
+  )
+  (core): ModelCoreRNN(
+    (core): GRU(512, 512)
+  )
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationDefault(
+    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
+  )
+)
+[2024-08-24 20:04:19,933][03417] Using optimizer
+[2024-08-24 20:04:19,934][03417] No checkpoints found
+[2024-08-24 20:04:19,934][03417] Did not load from checkpoint, starting from scratch!
+[2024-08-24 20:04:19,934][03417] Initialized policy 0 weights for model version 0
+[2024-08-24 20:04:19,937][03417] LearnerWorker_p0 finished initialization!
+[2024-08-24 20:04:19,937][03417] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2024-08-24 20:04:20,040][03430] RunningMeanStd input shape: (3, 72, 128)
+[2024-08-24 20:04:20,040][03430] RunningMeanStd input shape: (1,)
+[2024-08-24 20:04:20,045][03430] ConvEncoder: input_channels=3
+[2024-08-24 20:04:20,084][03430] Conv encoder output size: 512
+[2024-08-24 20:04:20,084][03430] Policy head output size: 512
+[2024-08-24 20:04:20,812][01192] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2024-08-24 20:04:20,822][01192] Inference worker 0-0 is ready!
+[2024-08-24 20:04:20,822][01192] All inference workers are ready! Signal rollout workers to start!
+[2024-08-24 20:04:20,834][03463] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,834][03470] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,834][03468] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,834][03465] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,834][03469] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,835][03467] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,835][03464] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:20,835][03466] Doom resolution: 160x120, resize resolution: (128, 72)
+[2024-08-24 20:04:21,087][03466] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,087][03463] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,087][03468] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,087][03469] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,087][03470] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,087][03465] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,195][03463] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,196][03469] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,196][03465] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,204][03464] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,204][03468] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,217][03467] Decorrelating experience for 0 frames...
+[2024-08-24 20:04:21,318][03464] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,329][03467] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,329][03465] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,330][03469] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,332][03468] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,356][03466] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,383][03463] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,450][03464] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,459][03469] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,460][03468] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,468][03465] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,492][03466] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,506][03470] Decorrelating experience for 32 frames...
+[2024-08-24 20:04:21,513][03467] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,582][03464] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,621][03463] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,635][03470] Decorrelating experience for 64 frames...
+[2024-08-24 20:04:21,643][03467] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,774][03470] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:21,780][03466] Decorrelating experience for 96 frames...
+[2024-08-24 20:04:22,485][03417] Signal inference workers to stop experience collection...
+[2024-08-24 20:04:22,489][03430] InferenceWorker_p0-w0: stopping experience collection
+[2024-08-24 20:04:23,253][03417] Signal inference workers to resume experience collection...
+[2024-08-24 20:04:23,254][03430] InferenceWorker_p0-w0: resuming experience collection
+[2024-08-24 20:04:24,340][03430] Updated weights for policy 0, policy_version 10 (0.0117)
+[2024-08-24 20:04:25,433][03430] Updated weights for policy 0, policy_version 20 (0.0006)
+[2024-08-24 20:04:25,812][01192] Fps is (10 sec: 18841.7, 60 sec: 18841.7, 300 sec: 18841.7). Total num frames: 94208. Throughput: 0: 1124.8. Samples: 5624. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2024-08-24 20:04:25,813][01192] Avg episode reward: [(0, '4.344')]
+[2024-08-24 20:04:26,541][03430] Updated weights for policy 0, policy_version 30 (0.0006)
+[2024-08-24 20:04:27,697][03430] Updated weights for policy 0, policy_version 40 (0.0006)
+[2024-08-24 20:04:28,840][03430] Updated weights for policy 0, policy_version 50 (0.0005)
+[2024-08-24 20:04:29,947][03430] Updated weights for policy 0, policy_version 60 (0.0005)
+[2024-08-24 20:04:30,812][01192] Fps is (10 sec: 27443.3, 60 sec: 27443.3, 300 sec: 27443.3). Total num frames: 274432. Throughput: 0: 6098.8. Samples: 60988. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:04:30,813][01192] Avg episode reward: [(0, '4.438')]
+[2024-08-24 20:04:30,813][03417] Saving new best policy, reward=4.438!
+[2024-08-24 20:04:31,082][03430] Updated weights for policy 0, policy_version 70 (0.0005)
+[2024-08-24 20:04:32,227][03430] Updated weights for policy 0, policy_version 80 (0.0005)
+[2024-08-24 20:04:33,357][03430] Updated weights for policy 0, policy_version 90 (0.0006)
+[2024-08-24 20:04:34,536][03430] Updated weights for policy 0, policy_version 100 (0.0006)
+[2024-08-24 20:04:35,636][03430] Updated weights for policy 0, policy_version 110 (0.0006)
+[2024-08-24 20:04:35,812][01192] Fps is (10 sec: 36044.8, 60 sec: 30310.4, 300 sec: 30310.4). Total num frames: 454656. Throughput: 0: 5877.2. Samples: 88158. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:04:35,813][01192] Avg episode reward: [(0, '4.413')]
+[2024-08-24 20:04:36,751][03430] Updated weights for policy 0, policy_version 120 (0.0005)
+[2024-08-24 20:04:37,051][01192] Heartbeat connected on Batcher_0
+[2024-08-24 20:04:37,053][01192] Heartbeat connected on LearnerWorker_p0
+[2024-08-24 20:04:37,057][01192] Heartbeat connected on InferenceWorker_p0-w0
+[2024-08-24 20:04:37,058][01192] Heartbeat connected on RolloutWorker_w1
+[2024-08-24 20:04:37,059][01192] Heartbeat connected on RolloutWorker_w0
+[2024-08-24 20:04:37,060][01192] Heartbeat connected on RolloutWorker_w2
+[2024-08-24 20:04:37,061][01192] Heartbeat connected on RolloutWorker_w3
+[2024-08-24 20:04:37,062][01192] Heartbeat connected on RolloutWorker_w4
+[2024-08-24 20:04:37,065][01192] Heartbeat connected on RolloutWorker_w6
+[2024-08-24 20:04:37,065][01192] Heartbeat connected on RolloutWorker_w5
+[2024-08-24 20:04:37,066][01192] Heartbeat connected on RolloutWorker_w7
+[2024-08-24 20:04:37,882][03430] Updated weights for policy 0, policy_version 130 (0.0006)
+[2024-08-24 20:04:39,020][03430] Updated weights for policy 0, policy_version 140 (0.0005)
+[2024-08-24 20:04:40,101][03430] Updated weights for policy 0, policy_version 150 (0.0006)
+[2024-08-24 20:04:40,812][01192] Fps is (10 sec: 36454.6, 60 sec: 31949.0, 300 sec: 31949.0). Total num frames: 638976. Throughput: 0: 7116.0. Samples: 142320. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:04:40,813][01192] Avg episode reward: [(0, '4.511')]
+[2024-08-24 20:04:40,813][03417] Saving new best policy, reward=4.511!
+[2024-08-24 20:04:41,227][03430] Updated weights for policy 0, policy_version 160 (0.0006)
+[2024-08-24 20:04:42,366][03430] Updated weights for policy 0, policy_version 170 (0.0006)
+[2024-08-24 20:04:43,493][03430] Updated weights for policy 0, policy_version 180 (0.0005)
+[2024-08-24 20:04:44,622][03430] Updated weights for policy 0, policy_version 190 (0.0006)
+[2024-08-24 20:04:45,756][03430] Updated weights for policy 0, policy_version 200 (0.0006)
+[2024-08-24 20:04:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 32768.0, 300 sec: 32768.0). Total num frames: 819200. Throughput: 0: 7891.4. Samples: 197286. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:04:45,813][01192] Avg episode reward: [(0, '4.407')]
+[2024-08-24 20:04:46,858][03430] Updated weights for policy 0, policy_version 210 (0.0005)
+[2024-08-24 20:04:47,966][03430] Updated weights for policy 0, policy_version 220 (0.0005)
+[2024-08-24 20:04:49,061][03430] Updated weights for policy 0, policy_version 230 (0.0006)
+[2024-08-24 20:04:50,168][03430] Updated weights for policy 0, policy_version 240 (0.0006)
+[2024-08-24 20:04:50,812][01192] Fps is (10 sec: 36454.0, 60 sec: 33450.6, 300 sec: 33450.6). Total num frames: 1003520. Throughput: 0: 7495.5. Samples: 224866. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:04:50,813][01192] Avg episode reward: [(0, '4.626')]
+[2024-08-24 20:04:50,830][03417] Saving new best policy, reward=4.626!
+[2024-08-24 20:04:51,353][03430] Updated weights for policy 0, policy_version 250 (0.0006)
+[2024-08-24 20:04:52,537][03430] Updated weights for policy 0, policy_version 260 (0.0006)
+[2024-08-24 20:04:53,668][03430] Updated weights for policy 0, policy_version 270 (0.0006)
+[2024-08-24 20:04:54,758][03430] Updated weights for policy 0, policy_version 280 (0.0006)
+[2024-08-24 20:04:55,812][01192] Fps is (10 sec: 36454.1, 60 sec: 33821.2, 300 sec: 33821.2). Total num frames: 1183744. Throughput: 0: 7970.8. Samples: 278980. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:04:55,813][01192] Avg episode reward: [(0, '4.396')]
+[2024-08-24 20:04:55,891][03430] Updated weights for policy 0, policy_version 290 (0.0006)
+[2024-08-24 20:04:57,022][03430] Updated weights for policy 0, policy_version 300 (0.0006)
+[2024-08-24 20:04:58,113][03430] Updated weights for policy 0, policy_version 310 (0.0005)
+[2024-08-24 20:04:59,221][03430] Updated weights for policy 0, policy_version 320 (0.0006)
+[2024-08-24 20:05:00,307][03430] Updated weights for policy 0, policy_version 330 (0.0006)
+[2024-08-24 20:05:00,812][01192] Fps is (10 sec: 36454.5, 60 sec: 34201.6, 300 sec: 34201.6). Total num frames: 1368064. Throughput: 0: 8356.0. Samples: 334240. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:05:00,813][01192] Avg episode reward: [(0, '4.701')]
+[2024-08-24 20:05:00,814][03417] Saving new best policy, reward=4.701!
+[2024-08-24 20:05:01,405][03430] Updated weights for policy 0, policy_version 340 (0.0005)
+[2024-08-24 20:05:02,517][03430] Updated weights for policy 0, policy_version 350 (0.0005)
+[2024-08-24 20:05:03,602][03430] Updated weights for policy 0, policy_version 360 (0.0005)
+[2024-08-24 20:05:04,713][03430] Updated weights for policy 0, policy_version 370 (0.0006)
+[2024-08-24 20:05:05,810][03430] Updated weights for policy 0, policy_version 380 (0.0005)
+[2024-08-24 20:05:05,812][01192] Fps is (10 sec: 37274.0, 60 sec: 34588.5, 300 sec: 34588.5). Total num frames: 1556480. Throughput: 0: 8043.8. Samples: 361970. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:05:05,813][01192] Avg episode reward: [(0, '4.375')]
+[2024-08-24 20:05:06,910][03430] Updated weights for policy 0, policy_version 390 (0.0006)
+[2024-08-24 20:05:08,056][03430] Updated weights for policy 0, policy_version 400 (0.0006)
+[2024-08-24 20:05:09,184][03430] Updated weights for policy 0, policy_version 410 (0.0006)
+[2024-08-24 20:05:10,317][03430] Updated weights for policy 0, policy_version 420 (0.0007)
+[2024-08-24 20:05:10,812][01192] Fps is (10 sec: 36864.0, 60 sec: 34734.1, 300 sec: 34734.1). Total num frames: 1736704. Throughput: 0: 9151.3. Samples: 417432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:05:10,813][01192] Avg episode reward: [(0, '4.566')]
+[2024-08-24 20:05:11,409][03430] Updated weights for policy 0, policy_version 430 (0.0005)
+[2024-08-24 20:05:12,516][03430] Updated weights for policy 0, policy_version 440 (0.0005)
+[2024-08-24 20:05:13,619][03430] Updated weights for policy 0, policy_version 450 (0.0006)
+[2024-08-24 20:05:14,713][03430] Updated weights for policy 0, policy_version 460 (0.0006)
+[2024-08-24 20:05:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 34927.7, 300 sec: 34927.7). Total num frames: 1921024. Throughput: 0: 9154.8. Samples: 472952. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:05:15,812][01192] Avg episode reward: [(0, '4.353')]
+[2024-08-24 20:05:15,816][03430] Updated weights for policy 0, policy_version 470 (0.0005)
+[2024-08-24 20:05:16,937][03430] Updated weights for policy 0, policy_version 480 (0.0006)
+[2024-08-24 20:05:18,025][03430] Updated weights for policy 0, policy_version 490 (0.0005)
+[2024-08-24 20:05:19,102][03430] Updated weights for policy 0, policy_version 500 (0.0005)
+[2024-08-24 20:05:20,237][03430] Updated weights for policy 0, policy_version 510 (0.0005)
+[2024-08-24 20:05:20,812][01192] Fps is (10 sec: 37273.2, 60 sec: 35157.3, 300 sec: 35157.3). Total num frames: 2109440. Throughput: 0: 9166.1. Samples: 500634. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:05:20,813][01192] Avg episode reward: [(0, '4.357')]
+[2024-08-24 20:05:21,349][03430] Updated weights for policy 0, policy_version 520 (0.0006)
+[2024-08-24 20:05:22,452][03430] Updated weights for policy 0, policy_version 530 (0.0006)
+[2024-08-24 20:05:23,595][03430] Updated weights for policy 0, policy_version 540 (0.0005)
+[2024-08-24 20:05:24,724][03430] Updated weights for policy 0, policy_version 550 (0.0006)
+[2024-08-24 20:05:25,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36590.9, 300 sec: 35225.6). Total num frames: 2289664. Throughput: 0: 9191.0. Samples: 555916. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:05:25,813][01192] Avg episode reward: [(0, '4.409')]
+[2024-08-24 20:05:25,854][03430] Updated weights for policy 0, policy_version 560 (0.0005)
+[2024-08-24 20:05:26,935][03430] Updated weights for policy 0, policy_version 570 (0.0005)
+[2024-08-24 20:05:28,076][03430] Updated weights for policy 0, policy_version 580 (0.0006)
+[2024-08-24 20:05:29,183][03430] Updated weights for policy 0, policy_version 590 (0.0006)
+[2024-08-24 20:05:30,294][03430] Updated weights for policy 0, policy_version 600 (0.0006)
+[2024-08-24 20:05:30,812][01192] Fps is (10 sec: 36454.8, 60 sec: 36659.2, 300 sec: 35342.6). Total num frames: 2473984. Throughput: 0: 9190.8. Samples: 610870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2024-08-24 20:05:30,813][01192] Avg episode reward: [(0, '4.324')]
+[2024-08-24 20:05:31,430][03430] Updated weights for policy 0, policy_version 610 (0.0006)
+[2024-08-24 20:05:32,544][03430] Updated weights for policy 0, policy_version 620 (0.0006)
+[2024-08-24 20:05:33,655][03430] Updated weights for policy 0, policy_version 630 (0.0005)
+[2024-08-24 20:05:34,726][03430] Updated weights for policy 0, policy_version 640 (0.0005)
+[2024-08-24 20:05:35,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36727.5, 300 sec: 35444.1). Total num frames: 2658304. Throughput: 0: 9183.4. Samples: 638120. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:05:35,813][01192] Avg episode reward: [(0, '4.479')]
+[2024-08-24 20:05:35,814][03430] Updated weights for policy 0, policy_version 650 (0.0005)
+[2024-08-24 20:05:36,877][03430] Updated weights for policy 0, policy_version 660 (0.0006)
+[2024-08-24 20:05:37,994][03430] Updated weights for policy 0, policy_version 670 (0.0007)
+[2024-08-24 20:05:39,098][03430] Updated weights for policy 0, policy_version 680 (0.0005)
+[2024-08-24 20:05:40,214][03430] Updated weights for policy 0, policy_version 690 (0.0006)
+[2024-08-24 20:05:40,812][01192] Fps is (10 sec: 37273.7, 60 sec: 36795.7, 300 sec: 35584.0). Total num frames: 2846720. Throughput: 0: 9238.5. Samples: 694710. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:05:40,813][01192] Avg episode reward: [(0, '4.481')]
+[2024-08-24 20:05:41,308][03430] Updated weights for policy 0, policy_version 700 (0.0005)
+[2024-08-24 20:05:42,401][03430] Updated weights for policy 0, policy_version 710 (0.0005)
+[2024-08-24 20:05:43,500][03430] Updated weights for policy 0, policy_version 720 (0.0006)
+[2024-08-24 20:05:44,599][03430] Updated weights for policy 0, policy_version 730 (0.0005)
+[2024-08-24 20:05:45,685][03430] Updated weights for policy 0, policy_version 740 (0.0006)
+[2024-08-24 20:05:45,812][01192] Fps is (10 sec: 37683.3, 60 sec: 36932.3, 300 sec: 35707.5). Total num frames: 3035136. Throughput: 0: 9249.2. Samples: 750456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:05:45,813][01192] Avg episode reward: [(0, '4.375')]
+[2024-08-24 20:05:46,798][03430] Updated weights for policy 0, policy_version 750 (0.0006)
+[2024-08-24 20:05:47,919][03430] Updated weights for policy 0, policy_version 760 (0.0005)
+[2024-08-24 20:05:49,029][03430] Updated weights for policy 0, policy_version 770 (0.0006)
+[2024-08-24 20:05:50,145][03430] Updated weights for policy 0, policy_version 780 (0.0006)
+[2024-08-24 20:05:50,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36864.0, 300 sec: 35726.2). Total num frames: 3215360. Throughput: 0: 9248.3. Samples: 778144. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:05:50,813][01192] Avg episode reward: [(0, '4.352')]
+[2024-08-24 20:05:51,263][03430] Updated weights for policy 0, policy_version 790 (0.0006)
+[2024-08-24 20:05:52,368][03430] Updated weights for policy 0, policy_version 800 (0.0005)
+[2024-08-24 20:05:53,432][03430] Updated weights for policy 0, policy_version 810 (0.0006)
+[2024-08-24 20:05:54,505][03430] Updated weights for policy 0, policy_version 820 (0.0006)
+[2024-08-24 20:05:55,618][03430] Updated weights for policy 0, policy_version 830 (0.0006)
+[2024-08-24 20:05:55,812][01192] Fps is (10 sec: 36863.5, 60 sec: 37000.5, 300 sec: 35829.2). Total num frames: 3403776. Throughput: 0: 9259.3. Samples: 834102. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:05:55,813][01192] Avg episode reward: [(0, '4.379')]
+[2024-08-24 20:05:56,727][03430] Updated weights for policy 0, policy_version 840 (0.0005)
+[2024-08-24 20:05:57,832][03430] Updated weights for policy 0, policy_version 850 (0.0005)
+[2024-08-24 20:05:58,930][03430] Updated weights for policy 0, policy_version 860 (0.0006)
+[2024-08-24 20:06:00,039][03430] Updated weights for policy 0, policy_version 870 (0.0005)
+[2024-08-24 20:06:00,812][01192] Fps is (10 sec: 37683.2, 60 sec: 37068.8, 300 sec: 35921.9). Total num frames: 3592192. Throughput: 0: 9260.4. Samples: 889670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:06:00,813][01192] Avg episode reward: [(0, '4.255')]
+[2024-08-24 20:06:01,155][03430] Updated weights for policy 0, policy_version 880 (0.0006)
+[2024-08-24 20:06:02,245][03430] Updated weights for policy 0, policy_version 890 (0.0006)
+[2024-08-24 20:06:03,377][03430] Updated weights for policy 0, policy_version 900 (0.0006)
+[2024-08-24 20:06:04,478][03430] Updated weights for policy 0, policy_version 910 (0.0006)
+[2024-08-24 20:06:05,583][03430] Updated weights for policy 0, policy_version 920 (0.0005)
+[2024-08-24 20:06:05,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37000.5, 300 sec: 35966.8). Total num frames: 3776512. Throughput: 0: 9261.0. Samples: 917378. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:05,813][01192] Avg episode reward: [(0, '4.351')]
+[2024-08-24 20:06:06,690][03430] Updated weights for policy 0, policy_version 930 (0.0004)
+[2024-08-24 20:06:07,737][03430] Updated weights for policy 0, policy_version 940 (0.0006)
+[2024-08-24 20:06:08,814][03430] Updated weights for policy 0, policy_version 950 (0.0006)
+[2024-08-24 20:06:09,917][03430] Updated weights for policy 0, policy_version 960 (0.0004)
+[2024-08-24 20:06:10,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.1, 300 sec: 36044.8). Total num frames: 3964928. Throughput: 0: 9283.6. Samples: 973680. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:10,813][01192] Avg episode reward: [(0, '4.447')]
+[2024-08-24 20:06:11,019][03430] Updated weights for policy 0, policy_version 970 (0.0005)
+[2024-08-24 20:06:12,143][03430] Updated weights for policy 0, policy_version 980 (0.0005)
+[2024-08-24 20:06:13,230][03430] Updated weights for policy 0, policy_version 990 (0.0005)
+[2024-08-24 20:06:14,345][03430] Updated weights for policy 0, policy_version 1000 (0.0005)
+[2024-08-24 20:06:15,411][03430] Updated weights for policy 0, policy_version 1010 (0.0005)
+[2024-08-24 20:06:15,812][01192] Fps is (10 sec: 37274.3, 60 sec: 37137.1, 300 sec: 36080.5). Total num frames: 4149248. Throughput: 0: 9303.5. Samples: 1029528. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:15,813][01192] Avg episode reward: [(0, '4.382')]
+[2024-08-24 20:06:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth...
+[2024-08-24 20:06:16,499][03430] Updated weights for policy 0, policy_version 1020 (0.0005)
+[2024-08-24 20:06:17,624][03430] Updated weights for policy 0, policy_version 1030 (0.0005)
+[2024-08-24 20:06:18,743][03430] Updated weights for policy 0, policy_version 1040 (0.0005)
+[2024-08-24 20:06:19,828][03430] Updated weights for policy 0, policy_version 1050 (0.0006)
+[2024-08-24 20:06:20,812][01192] Fps is (10 sec: 36864.1, 60 sec: 37068.9, 300 sec: 36113.1). Total num frames: 4333568. Throughput: 0: 9318.5. Samples: 1057452. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:20,813][01192] Avg episode reward: [(0, '4.388')]
+[2024-08-24 20:06:20,967][03430] Updated weights for policy 0, policy_version 1060 (0.0006)
+[2024-08-24 20:06:22,064][03430] Updated weights for policy 0, policy_version 1070 (0.0007)
+[2024-08-24 20:06:23,171][03430] Updated weights for policy 0, policy_version 1080 (0.0005)
+[2024-08-24 20:06:24,273][03430] Updated weights for policy 0, policy_version 1090 (0.0005)
+[2024-08-24 20:06:25,383][03430] Updated weights for policy 0, policy_version 1100 (0.0006)
+[2024-08-24 20:06:25,812][01192] Fps is (10 sec: 36863.6, 60 sec: 37137.1, 300 sec: 36143.1). Total num frames: 4517888. Throughput: 0: 9295.0. Samples: 1112984. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:06:25,813][01192] Avg episode reward: [(0, '4.526')]
+[2024-08-24 20:06:26,496][03430] Updated weights for policy 0, policy_version 1110 (0.0005)
+[2024-08-24 20:06:27,629][03430] Updated weights for policy 0, policy_version 1120 (0.0006)
+[2024-08-24 20:06:28,712][03430] Updated weights for policy 0, policy_version 1130 (0.0006)
+[2024-08-24 20:06:29,794][03430] Updated weights for policy 0, policy_version 1140 (0.0005)
+[2024-08-24 20:06:30,812][01192] Fps is (10 sec: 37273.5, 60 sec: 37205.3, 300 sec: 36202.3). Total num frames: 4706304. Throughput: 0: 9292.1. Samples: 1168602. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:30,813][01192] Avg episode reward: [(0, '4.416')]
+[2024-08-24 20:06:30,871][03430] Updated weights for policy 0, policy_version 1150 (0.0004)
+[2024-08-24 20:06:31,964][03430] Updated weights for policy 0, policy_version 1160 (0.0005)
+[2024-08-24 20:06:33,066][03430] Updated weights for policy 0, policy_version 1170 (0.0006)
+[2024-08-24 20:06:34,172][03430] Updated weights for policy 0, policy_version 1180 (0.0005)
+[2024-08-24 20:06:35,286][03430] Updated weights for policy 0, policy_version 1190 (0.0006)
+[2024-08-24 20:06:35,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37205.3, 300 sec: 36226.8). Total num frames: 4890624. Throughput: 0: 9303.1. Samples: 1196784. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:35,813][01192] Avg episode reward: [(0, '4.568')]
+[2024-08-24 20:06:36,378][03430] Updated weights for policy 0, policy_version 1200 (0.0005)
+[2024-08-24 20:06:37,507][03430] Updated weights for policy 0, policy_version 1210 (0.0006)
+[2024-08-24 20:06:38,646][03430] Updated weights for policy 0, policy_version 1220 (0.0006)
+[2024-08-24 20:06:39,725][03430] Updated weights for policy 0, policy_version 1230 (0.0006)
+[2024-08-24 20:06:40,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37137.0, 300 sec: 36249.6). Total num frames: 5074944. Throughput: 0: 9285.4. Samples: 1251946. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:06:40,813][01192] Avg episode reward: [(0, '4.419')]
+[2024-08-24 20:06:40,846][03430] Updated weights for policy 0, policy_version 1240 (0.0006)
+[2024-08-24 20:06:41,933][03430] Updated weights for policy 0, policy_version 1250 (0.0007)
+[2024-08-24 20:06:42,995][03430] Updated weights for policy 0, policy_version 1260 (0.0005)
+[2024-08-24 20:06:44,111][03430] Updated weights for policy 0, policy_version 1270 (0.0005)
+[2024-08-24 20:06:45,256][03430] Updated weights for policy 0, policy_version 1280 (0.0006)
+[2024-08-24 20:06:45,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.0, 300 sec: 36299.0). Total num frames: 5263360. Throughput: 0: 9289.8. Samples: 1307712. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:06:45,819][01192] Avg episode reward: [(0, '4.578')]
+[2024-08-24 20:06:46,353][03430] Updated weights for policy 0, policy_version 1290 (0.0006)
+[2024-08-24 20:06:47,451][03430] Updated weights for policy 0, policy_version 1300 (0.0005)
+[2024-08-24 20:06:48,521][03430] Updated weights for policy 0, policy_version 1310 (0.0005)
+[2024-08-24 20:06:49,599][03430] Updated weights for policy 0, policy_version 1320 (0.0006)
+[2024-08-24 20:06:50,711][03430] Updated weights for policy 0, policy_version 1330 (0.0005)
+[2024-08-24 20:06:50,812][01192] Fps is (10 sec: 37273.3, 60 sec: 37205.3, 300 sec: 36317.9). Total num frames: 5447680. Throughput: 0: 9295.6. Samples: 1335678. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2024-08-24 20:06:50,813][01192] Avg episode reward: [(0, '4.422')]
+[2024-08-24 20:06:51,804][03430] Updated weights for policy 0, policy_version 1340 (0.0006)
+[2024-08-24 20:06:52,896][03430] Updated weights for policy 0, policy_version 1350 (0.0005)
+[2024-08-24 20:06:54,062][03430] Updated weights for policy 0, policy_version 1360 (0.0006)
+[2024-08-24 20:06:55,235][03430] Updated weights for policy 0, policy_version 1370 (0.0006)
+[2024-08-24 20:06:55,812][01192] Fps is (10 sec: 36864.1, 60 sec: 37137.2, 300 sec: 36335.5). Total num frames: 5632000. Throughput: 0: 9279.2. Samples: 1391242. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:06:55,812][01192] Avg episode reward: [(0, '4.425')]
+[2024-08-24 20:06:56,327][03430] Updated weights for policy 0, policy_version 1380 (0.0005)
+[2024-08-24 20:06:57,414][03430] Updated weights for policy 0, policy_version 1390 (0.0005)
+[2024-08-24 20:06:58,483][03430] Updated weights for policy 0, policy_version 1400 (0.0006)
+[2024-08-24 20:06:59,571][03430] Updated weights for policy 0, policy_version 1410 (0.0005)
+[2024-08-24 20:07:00,685][03430] Updated weights for policy 0, policy_version 1420 (0.0006)
+[2024-08-24 20:07:00,812][01192] Fps is (10 sec: 37274.0, 60 sec: 37137.1, 300 sec: 36377.6). Total num frames: 5820416. Throughput: 0: 9272.3. Samples: 1446784. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2024-08-24 20:07:00,812][01192] Avg episode reward: [(0, '4.387')]
+[2024-08-24 20:07:01,789][03430] Updated weights for policy 0, policy_version 1430 (0.0005)
+[2024-08-24 20:07:02,879][03430] Updated weights for policy 0, policy_version 1440 (0.0006)
+[2024-08-24 20:07:03,966][03430] Updated weights for policy 0, policy_version 1450 (0.0006)
+[2024-08-24 20:07:05,044][03430] Updated weights for policy 0, policy_version 1460 (0.0005)
+[2024-08-24 20:07:05,812][01192] Fps is (10 sec: 37273.5, 60 sec: 37137.1, 300 sec: 36392.3). Total num frames: 6004736. Throughput: 0: 9278.0. Samples: 1474960. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:07:05,813][01192] Avg episode reward: [(0, '4.282')]
+[2024-08-24 20:07:06,139][03430] Updated weights for policy 0, policy_version 1470 (0.0007)
+[2024-08-24 20:07:07,200][03430] Updated weights for policy 0, policy_version 1480 (0.0005)
+[2024-08-24 20:07:08,290][03430] Updated weights for policy 0, policy_version 1490 (0.0005)
+[2024-08-24 20:07:09,392][03430] Updated weights for policy 0, policy_version 1500 (0.0005)
+[2024-08-24 20:07:10,491][03430] Updated weights for policy 0, policy_version 1510 (0.0005)
+[2024-08-24 20:07:10,812][01192] Fps is (10 sec: 37682.8, 60 sec: 37205.3, 300 sec: 36454.4). Total num frames: 6197248. Throughput: 0: 9302.6. Samples: 1531600. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:07:10,813][01192] Avg episode reward: [(0, '4.385')]
+[2024-08-24 20:07:11,545][03430] Updated weights for policy 0, policy_version 1520 (0.0005)
+[2024-08-24 20:07:12,631][03430] Updated weights for policy 0, policy_version 1530 (0.0006)
+[2024-08-24 20:07:13,747][03430] Updated weights for policy 0, policy_version 1540 (0.0006)
+[2024-08-24 20:07:14,921][03430] Updated weights for policy 0, policy_version 1550 (0.0005)
+[2024-08-24 20:07:15,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.0, 300 sec: 36442.7). Total num frames: 6377472. Throughput: 0: 9300.4. Samples: 1587120. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2024-08-24 20:07:15,813][01192] Avg episode reward: [(0, '4.420')]
+[2024-08-24 20:07:16,069][03430] Updated weights for policy 0, policy_version 1560 (0.0006)
+[2024-08-24 20:07:17,192][03430] Updated weights for policy 0, policy_version 1570 (0.0005)
+[2024-08-24 20:07:18,276][03430] Updated weights for policy 0, policy_version 1580 (0.0005)
+[2024-08-24 20:07:19,379][03430] Updated weights for policy 0, policy_version 1590 (0.0006)
+[2024-08-24 20:07:20,462][03430] Updated weights for policy 0, policy_version 1600 (0.0005)
+[2024-08-24 20:07:20,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37205.3, 300 sec: 36477.1). Total num frames: 6565888. Throughput: 0: 9284.6. Samples: 1614592. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0)
+[2024-08-24 20:07:20,813][01192] Avg episode reward: [(0, '4.309')]
+[2024-08-24 20:07:21,566][03430] Updated weights for policy 0, policy_version 1610 (0.0006)
+[2024-08-24 20:07:22,693][03430] Updated weights for policy 0, policy_version 1620 (0.0005)
+[2024-08-24 20:07:23,805][03430] Updated weights for policy 0, policy_version 1630 (0.0005)
+[2024-08-24 20:07:24,902][03430] Updated weights for policy 0, policy_version 1640 (0.0005)
+[2024-08-24 20:07:25,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37205.3, 300 sec: 36487.6). Total num frames: 6750208. Throughput: 0: 9295.7. Samples: 1670254.
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:07:25,813][01192] Avg episode reward: [(0, '4.267')] +[2024-08-24 20:07:26,032][03430] Updated weights for policy 0, policy_version 1650 (0.0005) +[2024-08-24 20:07:27,170][03430] Updated weights for policy 0, policy_version 1660 (0.0006) +[2024-08-24 20:07:28,268][03430] Updated weights for policy 0, policy_version 1670 (0.0005) +[2024-08-24 20:07:29,318][03430] Updated weights for policy 0, policy_version 1680 (0.0006) +[2024-08-24 20:07:30,410][03430] Updated weights for policy 0, policy_version 1690 (0.0005) +[2024-08-24 20:07:30,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37137.0, 300 sec: 36497.5). Total num frames: 6934528. Throughput: 0: 9292.7. Samples: 1725882. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:07:30,813][01192] Avg episode reward: [(0, '4.377')] +[2024-08-24 20:07:31,524][03430] Updated weights for policy 0, policy_version 1700 (0.0005) +[2024-08-24 20:07:32,639][03430] Updated weights for policy 0, policy_version 1710 (0.0005) +[2024-08-24 20:07:33,729][03430] Updated weights for policy 0, policy_version 1720 (0.0006) +[2024-08-24 20:07:34,808][03430] Updated weights for policy 0, policy_version 1730 (0.0005) +[2024-08-24 20:07:35,812][01192] Fps is (10 sec: 37273.3, 60 sec: 37205.3, 300 sec: 36527.9). Total num frames: 7122944. Throughput: 0: 9292.2. Samples: 1753826. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:07:35,813][01192] Avg episode reward: [(0, '4.549')] +[2024-08-24 20:07:35,868][03430] Updated weights for policy 0, policy_version 1740 (0.0006) +[2024-08-24 20:07:36,942][03430] Updated weights for policy 0, policy_version 1750 (0.0006) +[2024-08-24 20:07:38,052][03430] Updated weights for policy 0, policy_version 1760 (0.0005) +[2024-08-24 20:07:39,186][03430] Updated weights for policy 0, policy_version 1770 (0.0005) +[2024-08-24 20:07:40,289][03430] Updated weights for policy 0, policy_version 1780 (0.0006) +[2024-08-24 20:07:40,812][01192] Fps is (10 sec: 37273.8, 60 sec: 37205.3, 300 sec: 36536.3). Total num frames: 7307264. Throughput: 0: 9303.2. Samples: 1809884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:07:40,813][01192] Avg episode reward: [(0, '4.528')] +[2024-08-24 20:07:41,407][03430] Updated weights for policy 0, policy_version 1790 (0.0005) +[2024-08-24 20:07:42,518][03430] Updated weights for policy 0, policy_version 1800 (0.0006) +[2024-08-24 20:07:43,644][03430] Updated weights for policy 0, policy_version 1810 (0.0006) +[2024-08-24 20:07:44,759][03430] Updated weights for policy 0, policy_version 1820 (0.0006) +[2024-08-24 20:07:45,812][01192] Fps is (10 sec: 36864.2, 60 sec: 37137.1, 300 sec: 36544.3). Total num frames: 7491584. Throughput: 0: 9298.6. Samples: 1865220. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:07:45,813][01192] Avg episode reward: [(0, '4.388')] +[2024-08-24 20:07:45,842][03430] Updated weights for policy 0, policy_version 1830 (0.0005) +[2024-08-24 20:07:46,938][03430] Updated weights for policy 0, policy_version 1840 (0.0005) +[2024-08-24 20:07:48,058][03430] Updated weights for policy 0, policy_version 1850 (0.0006) +[2024-08-24 20:07:49,175][03430] Updated weights for policy 0, policy_version 1860 (0.0005) +[2024-08-24 20:07:50,298][03430] Updated weights for policy 0, policy_version 1870 (0.0007) +[2024-08-24 20:07:50,812][01192] Fps is (10 sec: 36863.8, 60 sec: 37137.1, 300 sec: 36551.9). Total num frames: 7675904. Throughput: 0: 9295.0. Samples: 1893236. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:07:50,813][01192] Avg episode reward: [(0, '4.370')] +[2024-08-24 20:07:51,401][03430] Updated weights for policy 0, policy_version 1880 (0.0005) +[2024-08-24 20:07:52,504][03430] Updated weights for policy 0, policy_version 1890 (0.0005) +[2024-08-24 20:07:53,643][03430] Updated weights for policy 0, policy_version 1900 (0.0005) +[2024-08-24 20:07:54,754][03430] Updated weights for policy 0, policy_version 1910 (0.0005) +[2024-08-24 20:07:55,812][01192] Fps is (10 sec: 36863.7, 60 sec: 37137.0, 300 sec: 36559.2). Total num frames: 7860224. Throughput: 0: 9251.6. Samples: 1947924. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:07:55,813][01192] Avg episode reward: [(0, '4.537')] +[2024-08-24 20:07:55,858][03430] Updated weights for policy 0, policy_version 1920 (0.0005) +[2024-08-24 20:07:56,982][03430] Updated weights for policy 0, policy_version 1930 (0.0006) +[2024-08-24 20:07:58,087][03430] Updated weights for policy 0, policy_version 1940 (0.0007) +[2024-08-24 20:07:59,191][03430] Updated weights for policy 0, policy_version 1950 (0.0006) +[2024-08-24 20:08:00,276][03430] Updated weights for policy 0, policy_version 1960 (0.0005) +[2024-08-24 20:08:00,812][01192] Fps is (10 sec: 36864.2, 60 sec: 37068.8, 300 sec: 36566.1). Total num frames: 8044544. Throughput: 0: 9255.6. Samples: 2003620. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:00,813][01192] Avg episode reward: [(0, '4.445')] +[2024-08-24 20:08:01,369][03430] Updated weights for policy 0, policy_version 1970 (0.0006) +[2024-08-24 20:08:02,498][03430] Updated weights for policy 0, policy_version 1980 (0.0006) +[2024-08-24 20:08:03,614][03430] Updated weights for policy 0, policy_version 1990 (0.0005) +[2024-08-24 20:08:04,746][03430] Updated weights for policy 0, policy_version 2000 (0.0006) +[2024-08-24 20:08:05,812][01192] Fps is (10 sec: 36864.3, 60 sec: 37068.8, 300 sec: 36572.7). Total num frames: 8228864. Throughput: 0: 9258.9. Samples: 2031244. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:05,813][01192] Avg episode reward: [(0, '4.340')] +[2024-08-24 20:08:05,846][03430] Updated weights for policy 0, policy_version 2010 (0.0005) +[2024-08-24 20:08:06,947][03430] Updated weights for policy 0, policy_version 2020 (0.0006) +[2024-08-24 20:08:08,065][03430] Updated weights for policy 0, policy_version 2030 (0.0005) +[2024-08-24 20:08:09,171][03430] Updated weights for policy 0, policy_version 2040 (0.0005) +[2024-08-24 20:08:10,279][03430] Updated weights for policy 0, policy_version 2050 (0.0006) +[2024-08-24 20:08:10,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36932.3, 300 sec: 36579.1). Total num frames: 8413184. Throughput: 0: 9249.1. Samples: 2086464. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:08:10,813][01192] Avg episode reward: [(0, '4.606')] +[2024-08-24 20:08:11,399][03430] Updated weights for policy 0, policy_version 2060 (0.0006) +[2024-08-24 20:08:12,492][03430] Updated weights for policy 0, policy_version 2070 (0.0005) +[2024-08-24 20:08:13,610][03430] Updated weights for policy 0, policy_version 2080 (0.0005) +[2024-08-24 20:08:14,727][03430] Updated weights for policy 0, policy_version 2090 (0.0006) +[2024-08-24 20:08:15,812][01192] Fps is (10 sec: 36863.6, 60 sec: 37000.5, 300 sec: 36585.1). Total num frames: 8597504. Throughput: 0: 9246.5. Samples: 2141974. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:15,813][01192] Avg episode reward: [(0, '4.292')] +[2024-08-24 20:08:15,836][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002100_8601600.pth... 
+[2024-08-24 20:08:15,837][03430] Updated weights for policy 0, policy_version 2100 (0.0005) +[2024-08-24 20:08:16,968][03430] Updated weights for policy 0, policy_version 2110 (0.0006) +[2024-08-24 20:08:18,077][03430] Updated weights for policy 0, policy_version 2120 (0.0006) +[2024-08-24 20:08:19,188][03430] Updated weights for policy 0, policy_version 2130 (0.0005) +[2024-08-24 20:08:20,303][03430] Updated weights for policy 0, policy_version 2140 (0.0005) +[2024-08-24 20:08:20,812][01192] Fps is (10 sec: 36863.4, 60 sec: 36932.2, 300 sec: 36590.9). Total num frames: 8781824. Throughput: 0: 9238.4. Samples: 2169556. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:08:20,813][01192] Avg episode reward: [(0, '4.362')] +[2024-08-24 20:08:21,404][03430] Updated weights for policy 0, policy_version 2150 (0.0006) +[2024-08-24 20:08:22,513][03430] Updated weights for policy 0, policy_version 2160 (0.0006) +[2024-08-24 20:08:23,621][03430] Updated weights for policy 0, policy_version 2170 (0.0005) +[2024-08-24 20:08:24,740][03430] Updated weights for policy 0, policy_version 2180 (0.0006) +[2024-08-24 20:08:25,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36932.1, 300 sec: 36596.5). Total num frames: 8966144. Throughput: 0: 9218.9. Samples: 2224738. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:25,814][01192] Avg episode reward: [(0, '4.486')] +[2024-08-24 20:08:25,841][03430] Updated weights for policy 0, policy_version 2190 (0.0006) +[2024-08-24 20:08:26,941][03430] Updated weights for policy 0, policy_version 2200 (0.0005) +[2024-08-24 20:08:28,047][03430] Updated weights for policy 0, policy_version 2210 (0.0005) +[2024-08-24 20:08:29,182][03430] Updated weights for policy 0, policy_version 2220 (0.0007) +[2024-08-24 20:08:30,294][03430] Updated weights for policy 0, policy_version 2230 (0.0005) +[2024-08-24 20:08:30,812][01192] Fps is (10 sec: 36864.6, 60 sec: 36932.3, 300 sec: 36601.9). Total num frames: 9150464. 
Throughput: 0: 9217.1. Samples: 2279990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:08:30,813][01192] Avg episode reward: [(0, '4.329')] +[2024-08-24 20:08:31,385][03430] Updated weights for policy 0, policy_version 2240 (0.0006) +[2024-08-24 20:08:32,495][03430] Updated weights for policy 0, policy_version 2250 (0.0006) +[2024-08-24 20:08:33,615][03430] Updated weights for policy 0, policy_version 2260 (0.0006) +[2024-08-24 20:08:34,708][03430] Updated weights for policy 0, policy_version 2270 (0.0004) +[2024-08-24 20:08:35,811][03430] Updated weights for policy 0, policy_version 2280 (0.0006) +[2024-08-24 20:08:35,812][01192] Fps is (10 sec: 37274.4, 60 sec: 36932.3, 300 sec: 36623.1). Total num frames: 9338880. Throughput: 0: 9211.9. Samples: 2307772. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:35,812][01192] Avg episode reward: [(0, '4.426')] +[2024-08-24 20:08:36,921][03430] Updated weights for policy 0, policy_version 2290 (0.0006) +[2024-08-24 20:08:38,042][03430] Updated weights for policy 0, policy_version 2300 (0.0006) +[2024-08-24 20:08:39,171][03430] Updated weights for policy 0, policy_version 2310 (0.0005) +[2024-08-24 20:08:40,305][03430] Updated weights for policy 0, policy_version 2320 (0.0006) +[2024-08-24 20:08:40,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36864.0, 300 sec: 36611.9). Total num frames: 9519104. Throughput: 0: 9225.4. Samples: 2363068. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:40,813][01192] Avg episode reward: [(0, '4.505')] +[2024-08-24 20:08:41,443][03430] Updated weights for policy 0, policy_version 2330 (0.0006) +[2024-08-24 20:08:42,603][03430] Updated weights for policy 0, policy_version 2340 (0.0006) +[2024-08-24 20:08:43,717][03430] Updated weights for policy 0, policy_version 2350 (0.0006) +[2024-08-24 20:08:44,856][03430] Updated weights for policy 0, policy_version 2360 (0.0006) +[2024-08-24 20:08:45,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36795.7, 300 sec: 36601.2). Total num frames: 9699328. Throughput: 0: 9196.1. Samples: 2417446. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:08:45,813][01192] Avg episode reward: [(0, '4.284')] +[2024-08-24 20:08:45,945][03430] Updated weights for policy 0, policy_version 2370 (0.0005) +[2024-08-24 20:08:47,018][03430] Updated weights for policy 0, policy_version 2380 (0.0005) +[2024-08-24 20:08:48,139][03430] Updated weights for policy 0, policy_version 2390 (0.0006) +[2024-08-24 20:08:49,249][03430] Updated weights for policy 0, policy_version 2400 (0.0004) +[2024-08-24 20:08:50,348][03430] Updated weights for policy 0, policy_version 2410 (0.0006) +[2024-08-24 20:08:50,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36864.0, 300 sec: 36621.3). Total num frames: 9887744. Throughput: 0: 9201.5. Samples: 2445312. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:08:50,812][01192] Avg episode reward: [(0, '4.192')] +[2024-08-24 20:08:51,474][03430] Updated weights for policy 0, policy_version 2420 (0.0006) +[2024-08-24 20:08:52,567][03430] Updated weights for policy 0, policy_version 2430 (0.0005) +[2024-08-24 20:08:53,666][03430] Updated weights for policy 0, policy_version 2440 (0.0005) +[2024-08-24 20:08:54,786][03430] Updated weights for policy 0, policy_version 2450 (0.0005) +[2024-08-24 20:08:55,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36795.8, 300 sec: 36610.8). 
Total num frames: 10067968. Throughput: 0: 9213.9. Samples: 2501088. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:08:55,813][01192] Avg episode reward: [(0, '4.523')] +[2024-08-24 20:08:55,946][03430] Updated weights for policy 0, policy_version 2460 (0.0006) +[2024-08-24 20:08:57,111][03430] Updated weights for policy 0, policy_version 2470 (0.0006) +[2024-08-24 20:08:58,235][03430] Updated weights for policy 0, policy_version 2480 (0.0006) +[2024-08-24 20:08:59,325][03430] Updated weights for policy 0, policy_version 2490 (0.0005) +[2024-08-24 20:09:00,423][03430] Updated weights for policy 0, policy_version 2500 (0.0006) +[2024-08-24 20:09:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36795.7, 300 sec: 36615.3). Total num frames: 10252288. Throughput: 0: 9188.0. Samples: 2555432. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:09:00,813][01192] Avg episode reward: [(0, '4.391')] +[2024-08-24 20:09:01,521][03430] Updated weights for policy 0, policy_version 2510 (0.0006) +[2024-08-24 20:09:02,614][03430] Updated weights for policy 0, policy_version 2520 (0.0005) +[2024-08-24 20:09:03,723][03430] Updated weights for policy 0, policy_version 2530 (0.0005) +[2024-08-24 20:09:04,855][03430] Updated weights for policy 0, policy_version 2540 (0.0006) +[2024-08-24 20:09:05,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36795.7, 300 sec: 36619.7). Total num frames: 10436608. Throughput: 0: 9193.6. Samples: 2583266. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:09:05,813][01192] Avg episode reward: [(0, '4.330')] +[2024-08-24 20:09:05,978][03430] Updated weights for policy 0, policy_version 2550 (0.0007) +[2024-08-24 20:09:07,115][03430] Updated weights for policy 0, policy_version 2560 (0.0006) +[2024-08-24 20:09:08,238][03430] Updated weights for policy 0, policy_version 2570 (0.0005) +[2024-08-24 20:09:09,379][03430] Updated weights for policy 0, policy_version 2580 (0.0005) +[2024-08-24 20:09:10,506][03430] Updated weights for policy 0, policy_version 2590 (0.0005) +[2024-08-24 20:09:10,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36727.5, 300 sec: 36609.8). Total num frames: 10616832. Throughput: 0: 9179.9. Samples: 2637830. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:09:10,813][01192] Avg episode reward: [(0, '4.486')] +[2024-08-24 20:09:11,607][03430] Updated weights for policy 0, policy_version 2600 (0.0006) +[2024-08-24 20:09:12,728][03430] Updated weights for policy 0, policy_version 2610 (0.0005) +[2024-08-24 20:09:13,867][03430] Updated weights for policy 0, policy_version 2620 (0.0005) +[2024-08-24 20:09:14,986][03430] Updated weights for policy 0, policy_version 2630 (0.0006) +[2024-08-24 20:09:15,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36727.5, 300 sec: 36614.1). Total num frames: 10801152. Throughput: 0: 9168.2. Samples: 2692560. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:09:15,813][01192] Avg episode reward: [(0, '4.448')] +[2024-08-24 20:09:16,123][03430] Updated weights for policy 0, policy_version 2640 (0.0007) +[2024-08-24 20:09:17,252][03430] Updated weights for policy 0, policy_version 2650 (0.0005) +[2024-08-24 20:09:18,362][03430] Updated weights for policy 0, policy_version 2660 (0.0005) +[2024-08-24 20:09:19,494][03430] Updated weights for policy 0, policy_version 2670 (0.0006) +[2024-08-24 20:09:20,632][03430] Updated weights for policy 0, policy_version 2680 (0.0005) +[2024-08-24 20:09:20,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36659.3, 300 sec: 36905.7). Total num frames: 10981376. Throughput: 0: 9152.4. Samples: 2719632. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:09:20,813][01192] Avg episode reward: [(0, '4.521')] +[2024-08-24 20:09:21,769][03430] Updated weights for policy 0, policy_version 2690 (0.0006) +[2024-08-24 20:09:22,898][03430] Updated weights for policy 0, policy_version 2700 (0.0006) +[2024-08-24 20:09:24,023][03430] Updated weights for policy 0, policy_version 2710 (0.0005) +[2024-08-24 20:09:25,142][03430] Updated weights for policy 0, policy_version 2720 (0.0005) +[2024-08-24 20:09:25,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36591.1, 300 sec: 36905.7). Total num frames: 11161600. Throughput: 0: 9129.1. Samples: 2773878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:09:25,813][01192] Avg episode reward: [(0, '4.348')] +[2024-08-24 20:09:26,295][03430] Updated weights for policy 0, policy_version 2730 (0.0005) +[2024-08-24 20:09:27,442][03430] Updated weights for policy 0, policy_version 2740 (0.0005) +[2024-08-24 20:09:28,587][03430] Updated weights for policy 0, policy_version 2750 (0.0007) +[2024-08-24 20:09:29,717][03430] Updated weights for policy 0, policy_version 2760 (0.0006) +[2024-08-24 20:09:30,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36522.7, 300 sec: 36905.7). 
Total num frames: 11341824. Throughput: 0: 9121.6. Samples: 2827918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:09:30,813][01192] Avg episode reward: [(0, '4.551')] +[2024-08-24 20:09:30,843][03430] Updated weights for policy 0, policy_version 2770 (0.0005) +[2024-08-24 20:09:31,976][03430] Updated weights for policy 0, policy_version 2780 (0.0006) +[2024-08-24 20:09:33,121][03430] Updated weights for policy 0, policy_version 2790 (0.0006) +[2024-08-24 20:09:34,263][03430] Updated weights for policy 0, policy_version 2800 (0.0005) +[2024-08-24 20:09:35,422][03430] Updated weights for policy 0, policy_version 2810 (0.0006) +[2024-08-24 20:09:35,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36891.8). Total num frames: 11522048. Throughput: 0: 9106.3. Samples: 2855094. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:09:35,813][01192] Avg episode reward: [(0, '4.227')] +[2024-08-24 20:09:36,572][03430] Updated weights for policy 0, policy_version 2820 (0.0005) +[2024-08-24 20:09:37,701][03430] Updated weights for policy 0, policy_version 2830 (0.0005) +[2024-08-24 20:09:38,833][03430] Updated weights for policy 0, policy_version 2840 (0.0006) +[2024-08-24 20:09:39,952][03430] Updated weights for policy 0, policy_version 2850 (0.0006) +[2024-08-24 20:09:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36891.8). Total num frames: 11702272. Throughput: 0: 9064.0. Samples: 2908966. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:09:40,813][01192] Avg episode reward: [(0, '4.517')] +[2024-08-24 20:09:41,065][03430] Updated weights for policy 0, policy_version 2860 (0.0004) +[2024-08-24 20:09:42,158][03430] Updated weights for policy 0, policy_version 2870 (0.0005) +[2024-08-24 20:09:43,291][03430] Updated weights for policy 0, policy_version 2880 (0.0005) +[2024-08-24 20:09:44,429][03430] Updated weights for policy 0, policy_version 2890 (0.0005) +[2024-08-24 20:09:45,566][03430] Updated weights for policy 0, policy_version 2900 (0.0006) +[2024-08-24 20:09:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36891.8). Total num frames: 11886592. Throughput: 0: 9071.2. Samples: 2963634. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:09:45,813][01192] Avg episode reward: [(0, '4.323')] +[2024-08-24 20:09:46,685][03430] Updated weights for policy 0, policy_version 2910 (0.0005) +[2024-08-24 20:09:47,774][03430] Updated weights for policy 0, policy_version 2920 (0.0006) +[2024-08-24 20:09:48,895][03430] Updated weights for policy 0, policy_version 2930 (0.0006) +[2024-08-24 20:09:50,025][03430] Updated weights for policy 0, policy_version 2940 (0.0006) +[2024-08-24 20:09:50,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36386.1, 300 sec: 36905.7). Total num frames: 12070912. Throughput: 0: 9067.3. Samples: 2991294. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:09:50,813][01192] Avg episode reward: [(0, '4.348')] +[2024-08-24 20:09:51,125][03430] Updated weights for policy 0, policy_version 2950 (0.0006) +[2024-08-24 20:09:52,259][03430] Updated weights for policy 0, policy_version 2960 (0.0005) +[2024-08-24 20:09:53,379][03430] Updated weights for policy 0, policy_version 2970 (0.0006) +[2024-08-24 20:09:54,474][03430] Updated weights for policy 0, policy_version 2980 (0.0006) +[2024-08-24 20:09:55,614][03430] Updated weights for policy 0, policy_version 2990 (0.0005) +[2024-08-24 20:09:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36891.8). Total num frames: 12251136. Throughput: 0: 9073.6. Samples: 3046142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:09:55,812][01192] Avg episode reward: [(0, '4.247')] +[2024-08-24 20:09:56,742][03430] Updated weights for policy 0, policy_version 3000 (0.0004) +[2024-08-24 20:09:57,852][03430] Updated weights for policy 0, policy_version 3010 (0.0006) +[2024-08-24 20:09:58,982][03430] Updated weights for policy 0, policy_version 3020 (0.0005) +[2024-08-24 20:10:00,088][03430] Updated weights for policy 0, policy_version 3030 (0.0005) +[2024-08-24 20:10:00,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36386.1, 300 sec: 36877.9). Total num frames: 12435456. Throughput: 0: 9078.5. Samples: 3101094. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:10:00,813][01192] Avg episode reward: [(0, '4.651')] +[2024-08-24 20:10:01,219][03430] Updated weights for policy 0, policy_version 3040 (0.0006) +[2024-08-24 20:10:02,366][03430] Updated weights for policy 0, policy_version 3050 (0.0006) +[2024-08-24 20:10:03,488][03430] Updated weights for policy 0, policy_version 3060 (0.0006) +[2024-08-24 20:10:04,605][03430] Updated weights for policy 0, policy_version 3070 (0.0005) +[2024-08-24 20:10:05,730][03430] Updated weights for policy 0, policy_version 3080 (0.0007) +[2024-08-24 20:10:05,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36877.9). Total num frames: 12615680. Throughput: 0: 9077.8. Samples: 3128134. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:05,813][01192] Avg episode reward: [(0, '4.375')] +[2024-08-24 20:10:06,843][03430] Updated weights for policy 0, policy_version 3090 (0.0005) +[2024-08-24 20:10:07,947][03430] Updated weights for policy 0, policy_version 3100 (0.0005) +[2024-08-24 20:10:09,060][03430] Updated weights for policy 0, policy_version 3110 (0.0005) +[2024-08-24 20:10:10,146][03430] Updated weights for policy 0, policy_version 3120 (0.0005) +[2024-08-24 20:10:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.2, 300 sec: 36877.9). Total num frames: 12800000. Throughput: 0: 9100.2. Samples: 3183386. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:10,812][01192] Avg episode reward: [(0, '4.355')] +[2024-08-24 20:10:11,264][03430] Updated weights for policy 0, policy_version 3130 (0.0006) +[2024-08-24 20:10:12,375][03430] Updated weights for policy 0, policy_version 3140 (0.0006) +[2024-08-24 20:10:13,487][03430] Updated weights for policy 0, policy_version 3150 (0.0006) +[2024-08-24 20:10:14,591][03430] Updated weights for policy 0, policy_version 3160 (0.0006) +[2024-08-24 20:10:15,742][03430] Updated weights for policy 0, policy_version 3170 (0.0006) +[2024-08-24 20:10:15,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36386.2, 300 sec: 36864.0). Total num frames: 12984320. Throughput: 0: 9126.9. Samples: 3238630. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:10:15,813][01192] Avg episode reward: [(0, '4.384')] +[2024-08-24 20:10:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003170_12984320.pth... +[2024-08-24 20:10:15,843][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth +[2024-08-24 20:10:16,882][03430] Updated weights for policy 0, policy_version 3180 (0.0005) +[2024-08-24 20:10:18,012][03430] Updated weights for policy 0, policy_version 3190 (0.0005) +[2024-08-24 20:10:19,122][03430] Updated weights for policy 0, policy_version 3200 (0.0006) +[2024-08-24 20:10:20,237][03430] Updated weights for policy 0, policy_version 3210 (0.0007) +[2024-08-24 20:10:20,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36454.4, 300 sec: 36877.9). Total num frames: 13168640. Throughput: 0: 9123.9. Samples: 3265670. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:10:20,814][01192] Avg episode reward: [(0, '4.493')] +[2024-08-24 20:10:21,356][03430] Updated weights for policy 0, policy_version 3220 (0.0005) +[2024-08-24 20:10:22,469][03430] Updated weights for policy 0, policy_version 3230 (0.0005) +[2024-08-24 20:10:23,570][03430] Updated weights for policy 0, policy_version 3240 (0.0006) +[2024-08-24 20:10:24,688][03430] Updated weights for policy 0, policy_version 3250 (0.0005) +[2024-08-24 20:10:25,812][01192] Fps is (10 sec: 36453.3, 60 sec: 36454.2, 300 sec: 36864.0). Total num frames: 13348864. Throughput: 0: 9150.0. Samples: 3320720. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:10:25,813][01192] Avg episode reward: [(0, '4.404')] +[2024-08-24 20:10:25,828][03430] Updated weights for policy 0, policy_version 3260 (0.0006) +[2024-08-24 20:10:26,974][03430] Updated weights for policy 0, policy_version 3270 (0.0005) +[2024-08-24 20:10:28,078][03430] Updated weights for policy 0, policy_version 3280 (0.0005) +[2024-08-24 20:10:29,195][03430] Updated weights for policy 0, policy_version 3290 (0.0005) +[2024-08-24 20:10:30,325][03430] Updated weights for policy 0, policy_version 3300 (0.0005) +[2024-08-24 20:10:30,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36522.6, 300 sec: 36864.0). Total num frames: 13533184. Throughput: 0: 9152.4. Samples: 3375492. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:30,813][01192] Avg episode reward: [(0, '4.348')] +[2024-08-24 20:10:31,446][03430] Updated weights for policy 0, policy_version 3310 (0.0005) +[2024-08-24 20:10:32,550][03430] Updated weights for policy 0, policy_version 3320 (0.0006) +[2024-08-24 20:10:33,664][03430] Updated weights for policy 0, policy_version 3330 (0.0005) +[2024-08-24 20:10:34,771][03430] Updated weights for policy 0, policy_version 3340 (0.0005) +[2024-08-24 20:10:35,812][01192] Fps is (10 sec: 36865.0, 60 sec: 36590.9, 300 sec: 36850.1). 
Total num frames: 13717504. Throughput: 0: 9148.1. Samples: 3402958. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:35,813][01192] Avg episode reward: [(0, '4.363')] +[2024-08-24 20:10:35,869][03430] Updated weights for policy 0, policy_version 3350 (0.0006) +[2024-08-24 20:10:36,995][03430] Updated weights for policy 0, policy_version 3360 (0.0006) +[2024-08-24 20:10:38,097][03430] Updated weights for policy 0, policy_version 3370 (0.0006) +[2024-08-24 20:10:39,207][03430] Updated weights for policy 0, policy_version 3380 (0.0006) +[2024-08-24 20:10:40,308][03430] Updated weights for policy 0, policy_version 3390 (0.0005) +[2024-08-24 20:10:40,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36659.2, 300 sec: 36836.2). Total num frames: 13901824. Throughput: 0: 9162.1. Samples: 3458436. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:10:40,813][01192] Avg episode reward: [(0, '4.507')] +[2024-08-24 20:10:41,458][03430] Updated weights for policy 0, policy_version 3400 (0.0007) +[2024-08-24 20:10:42,572][03430] Updated weights for policy 0, policy_version 3410 (0.0006) +[2024-08-24 20:10:43,713][03430] Updated weights for policy 0, policy_version 3420 (0.0006) +[2024-08-24 20:10:44,826][03430] Updated weights for policy 0, policy_version 3430 (0.0006) +[2024-08-24 20:10:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36836.2). Total num frames: 14082048. Throughput: 0: 9155.8. Samples: 3513104. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:45,819][01192] Avg episode reward: [(0, '4.547')] +[2024-08-24 20:10:45,938][03430] Updated weights for policy 0, policy_version 3440 (0.0006) +[2024-08-24 20:10:47,109][03430] Updated weights for policy 0, policy_version 3450 (0.0005) +[2024-08-24 20:10:48,219][03430] Updated weights for policy 0, policy_version 3460 (0.0005) +[2024-08-24 20:10:49,335][03430] Updated weights for policy 0, policy_version 3470 (0.0006) +[2024-08-24 20:10:50,503][03430] Updated weights for policy 0, policy_version 3480 (0.0006) +[2024-08-24 20:10:50,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36522.7, 300 sec: 36808.5). Total num frames: 14262272. Throughput: 0: 9158.2. Samples: 3540254. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:10:50,813][01192] Avg episode reward: [(0, '4.301')] +[2024-08-24 20:10:51,629][03430] Updated weights for policy 0, policy_version 3490 (0.0006) +[2024-08-24 20:10:52,766][03430] Updated weights for policy 0, policy_version 3500 (0.0005) +[2024-08-24 20:10:53,918][03430] Updated weights for policy 0, policy_version 3510 (0.0006) +[2024-08-24 20:10:55,042][03430] Updated weights for policy 0, policy_version 3520 (0.0006) +[2024-08-24 20:10:55,812][01192] Fps is (10 sec: 36044.5, 60 sec: 36522.6, 300 sec: 36780.7). Total num frames: 14442496. Throughput: 0: 9128.6. Samples: 3594176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:10:55,813][01192] Avg episode reward: [(0, '4.368')] +[2024-08-24 20:10:56,210][03430] Updated weights for policy 0, policy_version 3530 (0.0006) +[2024-08-24 20:10:57,424][03430] Updated weights for policy 0, policy_version 3540 (0.0006) +[2024-08-24 20:10:58,563][03430] Updated weights for policy 0, policy_version 3550 (0.0006) +[2024-08-24 20:10:59,729][03430] Updated weights for policy 0, policy_version 3560 (0.0005) +[2024-08-24 20:11:00,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36386.1, 300 sec: 36752.9). 
Total num frames: 14618624. Throughput: 0: 9080.1. Samples: 3647234. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:00,813][01192] Avg episode reward: [(0, '4.457')] +[2024-08-24 20:11:00,861][03430] Updated weights for policy 0, policy_version 3570 (0.0006) +[2024-08-24 20:11:01,980][03430] Updated weights for policy 0, policy_version 3580 (0.0005) +[2024-08-24 20:11:03,087][03430] Updated weights for policy 0, policy_version 3590 (0.0006) +[2024-08-24 20:11:04,231][03430] Updated weights for policy 0, policy_version 3600 (0.0006) +[2024-08-24 20:11:05,364][03430] Updated weights for policy 0, policy_version 3610 (0.0006) +[2024-08-24 20:11:05,812][01192] Fps is (10 sec: 36045.2, 60 sec: 36454.4, 300 sec: 36739.0). Total num frames: 14802944. Throughput: 0: 9082.1. Samples: 3674362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:11:05,813][01192] Avg episode reward: [(0, '4.605')] +[2024-08-24 20:11:06,451][03430] Updated weights for policy 0, policy_version 3620 (0.0005) +[2024-08-24 20:11:07,567][03430] Updated weights for policy 0, policy_version 3630 (0.0006) +[2024-08-24 20:11:08,697][03430] Updated weights for policy 0, policy_version 3640 (0.0005) +[2024-08-24 20:11:09,843][03430] Updated weights for policy 0, policy_version 3650 (0.0006) +[2024-08-24 20:11:10,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36386.1, 300 sec: 36725.1). Total num frames: 14983168. Throughput: 0: 9077.3. Samples: 3729196. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:11:10,813][01192] Avg episode reward: [(0, '4.726')] +[2024-08-24 20:11:10,819][03417] Saving new best policy, reward=4.726! 
+[2024-08-24 20:11:10,991][03430] Updated weights for policy 0, policy_version 3660 (0.0006) +[2024-08-24 20:11:12,111][03430] Updated weights for policy 0, policy_version 3670 (0.0006) +[2024-08-24 20:11:13,220][03430] Updated weights for policy 0, policy_version 3680 (0.0006) +[2024-08-24 20:11:14,325][03430] Updated weights for policy 0, policy_version 3690 (0.0005) +[2024-08-24 20:11:15,433][03430] Updated weights for policy 0, policy_version 3700 (0.0005) +[2024-08-24 20:11:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36725.1). Total num frames: 15167488. Throughput: 0: 9080.4. Samples: 3784108. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:15,813][01192] Avg episode reward: [(0, '4.244')] +[2024-08-24 20:11:16,545][03430] Updated weights for policy 0, policy_version 3710 (0.0005) +[2024-08-24 20:11:17,650][03430] Updated weights for policy 0, policy_version 3720 (0.0006) +[2024-08-24 20:11:18,777][03430] Updated weights for policy 0, policy_version 3730 (0.0006) +[2024-08-24 20:11:19,888][03430] Updated weights for policy 0, policy_version 3740 (0.0006) +[2024-08-24 20:11:20,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36386.2, 300 sec: 36725.2). Total num frames: 15351808. Throughput: 0: 9083.5. Samples: 3811714. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:20,813][01192] Avg episode reward: [(0, '4.328')] +[2024-08-24 20:11:21,020][03430] Updated weights for policy 0, policy_version 3750 (0.0005) +[2024-08-24 20:11:22,128][03430] Updated weights for policy 0, policy_version 3760 (0.0006) +[2024-08-24 20:11:23,235][03430] Updated weights for policy 0, policy_version 3770 (0.0006) +[2024-08-24 20:11:24,324][03430] Updated weights for policy 0, policy_version 3780 (0.0006) +[2024-08-24 20:11:25,463][03430] Updated weights for policy 0, policy_version 3790 (0.0006) +[2024-08-24 20:11:25,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36454.6, 300 sec: 36711.3). Total num frames: 15536128. 
Throughput: 0: 9078.3. Samples: 3866960. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:25,813][01192] Avg episode reward: [(0, '4.557')] +[2024-08-24 20:11:26,588][03430] Updated weights for policy 0, policy_version 3800 (0.0006) +[2024-08-24 20:11:27,700][03430] Updated weights for policy 0, policy_version 3810 (0.0005) +[2024-08-24 20:11:28,818][03430] Updated weights for policy 0, policy_version 3820 (0.0006) +[2024-08-24 20:11:29,929][03430] Updated weights for policy 0, policy_version 3830 (0.0006) +[2024-08-24 20:11:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.2, 300 sec: 36697.4). Total num frames: 15716352. Throughput: 0: 9081.4. Samples: 3921766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:11:30,812][01192] Avg episode reward: [(0, '4.318')] +[2024-08-24 20:11:31,063][03430] Updated weights for policy 0, policy_version 3840 (0.0005) +[2024-08-24 20:11:32,193][03430] Updated weights for policy 0, policy_version 3850 (0.0006) +[2024-08-24 20:11:33,310][03430] Updated weights for policy 0, policy_version 3860 (0.0007) +[2024-08-24 20:11:34,427][03430] Updated weights for policy 0, policy_version 3870 (0.0005) +[2024-08-24 20:11:35,540][03430] Updated weights for policy 0, policy_version 3880 (0.0005) +[2024-08-24 20:11:35,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36697.4). Total num frames: 15900672. Throughput: 0: 9082.4. Samples: 3948960. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:11:35,813][01192] Avg episode reward: [(0, '4.472')] +[2024-08-24 20:11:36,649][03430] Updated weights for policy 0, policy_version 3890 (0.0006) +[2024-08-24 20:11:37,777][03430] Updated weights for policy 0, policy_version 3900 (0.0005) +[2024-08-24 20:11:38,913][03430] Updated weights for policy 0, policy_version 3910 (0.0006) +[2024-08-24 20:11:40,051][03430] Updated weights for policy 0, policy_version 3920 (0.0006) +[2024-08-24 20:11:40,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36669.6). Total num frames: 16080896. Throughput: 0: 9098.4. Samples: 4003602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:11:40,812][01192] Avg episode reward: [(0, '4.153')] +[2024-08-24 20:11:41,192][03430] Updated weights for policy 0, policy_version 3930 (0.0005) +[2024-08-24 20:11:42,309][03430] Updated weights for policy 0, policy_version 3940 (0.0005) +[2024-08-24 20:11:43,407][03430] Updated weights for policy 0, policy_version 3950 (0.0005) +[2024-08-24 20:11:44,501][03430] Updated weights for policy 0, policy_version 3960 (0.0006) +[2024-08-24 20:11:45,619][03430] Updated weights for policy 0, policy_version 3970 (0.0006) +[2024-08-24 20:11:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36669.6). Total num frames: 16265216. Throughput: 0: 9144.5. Samples: 4058734. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:11:45,813][01192] Avg episode reward: [(0, '4.648')] +[2024-08-24 20:11:46,732][03430] Updated weights for policy 0, policy_version 3980 (0.0006) +[2024-08-24 20:11:47,863][03430] Updated weights for policy 0, policy_version 3990 (0.0006) +[2024-08-24 20:11:48,955][03430] Updated weights for policy 0, policy_version 4000 (0.0006) +[2024-08-24 20:11:50,078][03430] Updated weights for policy 0, policy_version 4010 (0.0005) +[2024-08-24 20:11:50,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36454.4, 300 sec: 36669.6). 
Total num frames: 16449536. Throughput: 0: 9154.1. Samples: 4086298. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:50,813][01192] Avg episode reward: [(0, '4.616')] +[2024-08-24 20:11:51,215][03430] Updated weights for policy 0, policy_version 4020 (0.0006) +[2024-08-24 20:11:52,342][03430] Updated weights for policy 0, policy_version 4030 (0.0006) +[2024-08-24 20:11:53,486][03430] Updated weights for policy 0, policy_version 4040 (0.0006) +[2024-08-24 20:11:54,614][03430] Updated weights for policy 0, policy_version 4050 (0.0006) +[2024-08-24 20:11:55,745][03430] Updated weights for policy 0, policy_version 4060 (0.0005) +[2024-08-24 20:11:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36641.8). Total num frames: 16629760. Throughput: 0: 9150.2. Samples: 4140954. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:11:55,813][01192] Avg episode reward: [(0, '4.473')] +[2024-08-24 20:11:56,845][03430] Updated weights for policy 0, policy_version 4070 (0.0006) +[2024-08-24 20:11:57,978][03430] Updated weights for policy 0, policy_version 4080 (0.0006) +[2024-08-24 20:11:59,084][03430] Updated weights for policy 0, policy_version 4090 (0.0006) +[2024-08-24 20:12:00,186][03430] Updated weights for policy 0, policy_version 4100 (0.0005) +[2024-08-24 20:12:00,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36591.0, 300 sec: 36641.8). Total num frames: 16814080. Throughput: 0: 9152.1. Samples: 4195952. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:12:00,813][01192] Avg episode reward: [(0, '4.328')] +[2024-08-24 20:12:01,301][03430] Updated weights for policy 0, policy_version 4110 (0.0005) +[2024-08-24 20:12:02,423][03430] Updated weights for policy 0, policy_version 4120 (0.0006) +[2024-08-24 20:12:03,527][03430] Updated weights for policy 0, policy_version 4130 (0.0005) +[2024-08-24 20:12:04,668][03430] Updated weights for policy 0, policy_version 4140 (0.0006) +[2024-08-24 20:12:05,769][03430] Updated weights for policy 0, policy_version 4150 (0.0005) +[2024-08-24 20:12:05,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36590.9, 300 sec: 36614.1). Total num frames: 16998400. Throughput: 0: 9150.9. Samples: 4223504. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:12:05,812][01192] Avg episode reward: [(0, '4.293')] +[2024-08-24 20:12:06,901][03430] Updated weights for policy 0, policy_version 4160 (0.0006) +[2024-08-24 20:12:08,036][03430] Updated weights for policy 0, policy_version 4170 (0.0006) +[2024-08-24 20:12:09,145][03430] Updated weights for policy 0, policy_version 4180 (0.0006) +[2024-08-24 20:12:10,274][03430] Updated weights for policy 0, policy_version 4190 (0.0005) +[2024-08-24 20:12:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36614.1). Total num frames: 17178624. Throughput: 0: 9142.9. Samples: 4278388. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:12:10,813][01192] Avg episode reward: [(0, '4.438')] +[2024-08-24 20:12:11,382][03430] Updated weights for policy 0, policy_version 4200 (0.0006) +[2024-08-24 20:12:12,521][03430] Updated weights for policy 0, policy_version 4210 (0.0006) +[2024-08-24 20:12:13,615][03430] Updated weights for policy 0, policy_version 4220 (0.0005) +[2024-08-24 20:12:14,729][03430] Updated weights for policy 0, policy_version 4230 (0.0006) +[2024-08-24 20:12:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36600.2). 
Total num frames: 17362944. Throughput: 0: 9147.1. Samples: 4333386. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:12:15,813][01192] Avg episode reward: [(0, '4.461')] +[2024-08-24 20:12:15,832][03430] Updated weights for policy 0, policy_version 4240 (0.0005) +[2024-08-24 20:12:15,832][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000004240_17367040.pth... +[2024-08-24 20:12:15,857][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000002100_8601600.pth +[2024-08-24 20:12:16,947][03430] Updated weights for policy 0, policy_version 4250 (0.0006) +[2024-08-24 20:12:18,066][03430] Updated weights for policy 0, policy_version 4260 (0.0006) +[2024-08-24 20:12:19,180][03430] Updated weights for policy 0, policy_version 4270 (0.0006) +[2024-08-24 20:12:20,278][03430] Updated weights for policy 0, policy_version 4280 (0.0005) +[2024-08-24 20:12:20,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36659.2, 300 sec: 36614.1). Total num frames: 17551360. Throughput: 0: 9154.5. Samples: 4360912. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:12:20,813][01192] Avg episode reward: [(0, '4.395')] +[2024-08-24 20:12:21,366][03430] Updated weights for policy 0, policy_version 4290 (0.0005) +[2024-08-24 20:12:22,483][03430] Updated weights for policy 0, policy_version 4300 (0.0006) +[2024-08-24 20:12:23,621][03430] Updated weights for policy 0, policy_version 4310 (0.0006) +[2024-08-24 20:12:24,775][03430] Updated weights for policy 0, policy_version 4320 (0.0007) +[2024-08-24 20:12:25,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36590.9, 300 sec: 36600.2). Total num frames: 17731584. Throughput: 0: 9163.8. Samples: 4415972. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:12:25,813][01192] Avg episode reward: [(0, '4.351')] +[2024-08-24 20:12:25,914][03430] Updated weights for policy 0, policy_version 4330 (0.0006) +[2024-08-24 20:12:27,030][03430] Updated weights for policy 0, policy_version 4340 (0.0005) +[2024-08-24 20:12:28,150][03430] Updated weights for policy 0, policy_version 4350 (0.0006) +[2024-08-24 20:12:29,275][03430] Updated weights for policy 0, policy_version 4360 (0.0005) +[2024-08-24 20:12:30,411][03430] Updated weights for policy 0, policy_version 4370 (0.0006) +[2024-08-24 20:12:30,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36590.9, 300 sec: 36572.4). Total num frames: 17911808. Throughput: 0: 9150.3. Samples: 4470496. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:12:30,813][01192] Avg episode reward: [(0, '4.435')] +[2024-08-24 20:12:31,518][03430] Updated weights for policy 0, policy_version 4380 (0.0005) +[2024-08-24 20:12:32,612][03430] Updated weights for policy 0, policy_version 4390 (0.0005) +[2024-08-24 20:12:33,722][03430] Updated weights for policy 0, policy_version 4400 (0.0006) +[2024-08-24 20:12:34,840][03430] Updated weights for policy 0, policy_version 4410 (0.0006) +[2024-08-24 20:12:35,812][01192] Fps is (10 sec: 36454.9, 60 sec: 36591.0, 300 sec: 36572.4). Total num frames: 18096128. Throughput: 0: 9154.1. Samples: 4498232. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:12:35,813][01192] Avg episode reward: [(0, '4.456')] +[2024-08-24 20:12:35,949][03430] Updated weights for policy 0, policy_version 4420 (0.0005) +[2024-08-24 20:12:37,062][03430] Updated weights for policy 0, policy_version 4430 (0.0005) +[2024-08-24 20:12:38,184][03430] Updated weights for policy 0, policy_version 4440 (0.0006) +[2024-08-24 20:12:39,307][03430] Updated weights for policy 0, policy_version 4450 (0.0006) +[2024-08-24 20:12:40,423][03430] Updated weights for policy 0, policy_version 4460 (0.0006) +[2024-08-24 20:12:40,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36659.2, 300 sec: 36572.4). Total num frames: 18280448. Throughput: 0: 9167.0. Samples: 4553470. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:12:40,813][01192] Avg episode reward: [(0, '4.388')] +[2024-08-24 20:12:41,531][03430] Updated weights for policy 0, policy_version 4470 (0.0005) +[2024-08-24 20:12:42,653][03430] Updated weights for policy 0, policy_version 4480 (0.0006) +[2024-08-24 20:12:43,767][03430] Updated weights for policy 0, policy_version 4490 (0.0006) +[2024-08-24 20:12:44,880][03430] Updated weights for policy 0, policy_version 4500 (0.0005) +[2024-08-24 20:12:45,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36659.2, 300 sec: 36572.4). Total num frames: 18464768. Throughput: 0: 9167.9. Samples: 4608508. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:12:45,813][01192] Avg episode reward: [(0, '4.293')] +[2024-08-24 20:12:45,978][03430] Updated weights for policy 0, policy_version 4510 (0.0005) +[2024-08-24 20:12:47,117][03430] Updated weights for policy 0, policy_version 4520 (0.0005) +[2024-08-24 20:12:48,168][03430] Updated weights for policy 0, policy_version 4530 (0.0005) +[2024-08-24 20:12:49,247][03430] Updated weights for policy 0, policy_version 4540 (0.0005) +[2024-08-24 20:12:50,350][03430] Updated weights for policy 0, policy_version 4550 (0.0006) +[2024-08-24 20:12:50,812][01192] Fps is (10 sec: 37273.4, 60 sec: 36727.4, 300 sec: 36586.3). Total num frames: 18653184. Throughput: 0: 9171.1. Samples: 4636204. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:12:50,813][01192] Avg episode reward: [(0, '4.480')] +[2024-08-24 20:12:51,473][03430] Updated weights for policy 0, policy_version 4560 (0.0006) +[2024-08-24 20:12:52,609][03430] Updated weights for policy 0, policy_version 4570 (0.0006) +[2024-08-24 20:12:53,712][03430] Updated weights for policy 0, policy_version 4580 (0.0005) +[2024-08-24 20:12:54,808][03430] Updated weights for policy 0, policy_version 4590 (0.0006) +[2024-08-24 20:12:55,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36727.5, 300 sec: 36572.4). Total num frames: 18833408. Throughput: 0: 9188.0. Samples: 4691850. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:12:55,819][01192] Avg episode reward: [(0, '4.174')] +[2024-08-24 20:12:55,957][03430] Updated weights for policy 0, policy_version 4600 (0.0006) +[2024-08-24 20:12:57,094][03430] Updated weights for policy 0, policy_version 4610 (0.0006) +[2024-08-24 20:12:58,206][03430] Updated weights for policy 0, policy_version 4620 (0.0006) +[2024-08-24 20:12:59,388][03430] Updated weights for policy 0, policy_version 4630 (0.0006) +[2024-08-24 20:13:00,542][03430] Updated weights for policy 0, policy_version 4640 (0.0007) +[2024-08-24 20:13:00,812][01192] Fps is (10 sec: 36045.2, 60 sec: 36659.2, 300 sec: 36558.5). Total num frames: 19013632. Throughput: 0: 9164.3. Samples: 4745780. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:13:00,812][01192] Avg episode reward: [(0, '4.490')] +[2024-08-24 20:13:01,658][03430] Updated weights for policy 0, policy_version 4650 (0.0006) +[2024-08-24 20:13:02,785][03430] Updated weights for policy 0, policy_version 4660 (0.0006) +[2024-08-24 20:13:03,896][03430] Updated weights for policy 0, policy_version 4670 (0.0004) +[2024-08-24 20:13:05,008][03430] Updated weights for policy 0, policy_version 4680 (0.0005) +[2024-08-24 20:13:05,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36659.2, 300 sec: 36558.5). Total num frames: 19197952. Throughput: 0: 9155.2. Samples: 4772896. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:13:05,813][01192] Avg episode reward: [(0, '4.323')] +[2024-08-24 20:13:06,101][03430] Updated weights for policy 0, policy_version 4690 (0.0005) +[2024-08-24 20:13:07,215][03430] Updated weights for policy 0, policy_version 4700 (0.0007) +[2024-08-24 20:13:08,325][03430] Updated weights for policy 0, policy_version 4710 (0.0006) +[2024-08-24 20:13:09,429][03430] Updated weights for policy 0, policy_version 4720 (0.0006) +[2024-08-24 20:13:10,513][03430] Updated weights for policy 0, policy_version 4730 (0.0005) +[2024-08-24 20:13:10,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36727.5, 300 sec: 36558.6). Total num frames: 19382272. Throughput: 0: 9174.9. Samples: 4828842. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:13:10,812][01192] Avg episode reward: [(0, '4.452')] +[2024-08-24 20:13:11,585][03430] Updated weights for policy 0, policy_version 4740 (0.0005) +[2024-08-24 20:13:12,689][03430] Updated weights for policy 0, policy_version 4750 (0.0006) +[2024-08-24 20:13:13,789][03430] Updated weights for policy 0, policy_version 4760 (0.0007) +[2024-08-24 20:13:14,872][03430] Updated weights for policy 0, policy_version 4770 (0.0005) +[2024-08-24 20:13:15,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36795.7, 300 sec: 36572.4). Total num frames: 19570688. Throughput: 0: 9209.0. Samples: 4884902. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:13:15,813][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:13:15,998][03430] Updated weights for policy 0, policy_version 4780 (0.0006) +[2024-08-24 20:13:17,122][03430] Updated weights for policy 0, policy_version 4790 (0.0005) +[2024-08-24 20:13:18,171][03430] Updated weights for policy 0, policy_version 4800 (0.0005) +[2024-08-24 20:13:19,270][03430] Updated weights for policy 0, policy_version 4810 (0.0006) +[2024-08-24 20:13:20,361][03430] Updated weights for policy 0, policy_version 4820 (0.0006) +[2024-08-24 20:13:20,812][01192] Fps is (10 sec: 37683.1, 60 sec: 36795.7, 300 sec: 36586.3). Total num frames: 19759104. Throughput: 0: 9212.9. Samples: 4912814. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:13:20,813][01192] Avg episode reward: [(0, '4.398')] +[2024-08-24 20:13:21,461][03430] Updated weights for policy 0, policy_version 4830 (0.0005) +[2024-08-24 20:13:22,561][03430] Updated weights for policy 0, policy_version 4840 (0.0005) +[2024-08-24 20:13:23,678][03430] Updated weights for policy 0, policy_version 4850 (0.0005) +[2024-08-24 20:13:24,749][03430] Updated weights for policy 0, policy_version 4860 (0.0005) +[2024-08-24 20:13:25,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36864.1, 300 sec: 36586.3). Total num frames: 19943424. Throughput: 0: 9227.7. Samples: 4968718. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:13:25,813][01192] Avg episode reward: [(0, '4.545')] +[2024-08-24 20:13:25,886][03430] Updated weights for policy 0, policy_version 4870 (0.0005) +[2024-08-24 20:13:26,973][03430] Updated weights for policy 0, policy_version 4880 (0.0004) +[2024-08-24 20:13:28,057][03430] Updated weights for policy 0, policy_version 4890 (0.0006) +[2024-08-24 20:13:29,116][03430] Updated weights for policy 0, policy_version 4900 (0.0005) +[2024-08-24 20:13:30,236][03430] Updated weights for policy 0, policy_version 4910 (0.0006) +[2024-08-24 20:13:30,812][01192] Fps is (10 sec: 37274.0, 60 sec: 37000.6, 300 sec: 36586.3). Total num frames: 20131840. Throughput: 0: 9256.3. Samples: 5025040. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:13:30,812][01192] Avg episode reward: [(0, '4.308')] +[2024-08-24 20:13:31,361][03430] Updated weights for policy 0, policy_version 4920 (0.0005) +[2024-08-24 20:13:32,446][03430] Updated weights for policy 0, policy_version 4930 (0.0006) +[2024-08-24 20:13:33,524][03430] Updated weights for policy 0, policy_version 4940 (0.0006) +[2024-08-24 20:13:34,628][03430] Updated weights for policy 0, policy_version 4950 (0.0006) +[2024-08-24 20:13:35,782][03430] Updated weights for policy 0, policy_version 4960 (0.0005) +[2024-08-24 20:13:35,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37000.5, 300 sec: 36600.2). Total num frames: 20316160. Throughput: 0: 9260.5. Samples: 5052924. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:13:35,813][01192] Avg episode reward: [(0, '4.278')] +[2024-08-24 20:13:36,900][03430] Updated weights for policy 0, policy_version 4970 (0.0006) +[2024-08-24 20:13:37,996][03430] Updated weights for policy 0, policy_version 4980 (0.0005) +[2024-08-24 20:13:39,081][03430] Updated weights for policy 0, policy_version 4990 (0.0005) +[2024-08-24 20:13:40,203][03430] Updated weights for policy 0, policy_version 5000 (0.0006) +[2024-08-24 20:13:40,812][01192] Fps is (10 sec: 36863.5, 60 sec: 37000.5, 300 sec: 36614.1). Total num frames: 20500480. Throughput: 0: 9253.4. Samples: 5108254. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:13:40,813][01192] Avg episode reward: [(0, '4.589')] +[2024-08-24 20:13:41,347][03430] Updated weights for policy 0, policy_version 5010 (0.0006) +[2024-08-24 20:13:42,447][03430] Updated weights for policy 0, policy_version 5020 (0.0006) +[2024-08-24 20:13:43,531][03430] Updated weights for policy 0, policy_version 5030 (0.0005) +[2024-08-24 20:13:44,637][03430] Updated weights for policy 0, policy_version 5040 (0.0007) +[2024-08-24 20:13:45,764][03430] Updated weights for policy 0, policy_version 5050 (0.0005) +[2024-08-24 20:13:45,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37000.5, 300 sec: 36600.2). Total num frames: 20684800. Throughput: 0: 9281.7. Samples: 5163458. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:13:45,813][01192] Avg episode reward: [(0, '4.369')] +[2024-08-24 20:13:46,882][03430] Updated weights for policy 0, policy_version 5060 (0.0006) +[2024-08-24 20:13:47,976][03430] Updated weights for policy 0, policy_version 5070 (0.0006) +[2024-08-24 20:13:49,080][03430] Updated weights for policy 0, policy_version 5080 (0.0005) +[2024-08-24 20:13:50,193][03430] Updated weights for policy 0, policy_version 5090 (0.0006) +[2024-08-24 20:13:50,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36932.3, 300 sec: 36614.1). 
Total num frames: 20869120. Throughput: 0: 9292.1. Samples: 5191040. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:13:50,813][01192] Avg episode reward: [(0, '4.407')] +[2024-08-24 20:13:51,287][03430] Updated weights for policy 0, policy_version 5100 (0.0006) +[2024-08-24 20:13:52,401][03430] Updated weights for policy 0, policy_version 5110 (0.0006) +[2024-08-24 20:13:53,534][03430] Updated weights for policy 0, policy_version 5120 (0.0007) +[2024-08-24 20:13:54,658][03430] Updated weights for policy 0, policy_version 5130 (0.0006) +[2024-08-24 20:13:55,772][03430] Updated weights for policy 0, policy_version 5140 (0.0006) +[2024-08-24 20:13:55,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37000.5, 300 sec: 36614.1). Total num frames: 21053440. Throughput: 0: 9282.6. Samples: 5246558. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:13:55,813][01192] Avg episode reward: [(0, '4.318')] +[2024-08-24 20:13:56,916][03430] Updated weights for policy 0, policy_version 5150 (0.0006) +[2024-08-24 20:13:58,046][03430] Updated weights for policy 0, policy_version 5160 (0.0006) +[2024-08-24 20:13:59,163][03430] Updated weights for policy 0, policy_version 5170 (0.0005) +[2024-08-24 20:14:00,273][03430] Updated weights for policy 0, policy_version 5180 (0.0005) +[2024-08-24 20:14:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 37000.5, 300 sec: 36600.2). Total num frames: 21233664. Throughput: 0: 9251.6. Samples: 5301222. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:14:00,813][01192] Avg episode reward: [(0, '4.289')] +[2024-08-24 20:14:01,398][03430] Updated weights for policy 0, policy_version 5190 (0.0006) +[2024-08-24 20:14:02,520][03430] Updated weights for policy 0, policy_version 5200 (0.0005) +[2024-08-24 20:14:03,632][03430] Updated weights for policy 0, policy_version 5210 (0.0006) +[2024-08-24 20:14:04,762][03430] Updated weights for policy 0, policy_version 5220 (0.0004) +[2024-08-24 20:14:05,812][01192] Fps is (10 sec: 36454.2, 60 sec: 37000.5, 300 sec: 36614.1). Total num frames: 21417984. Throughput: 0: 9240.6. Samples: 5328642. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:14:05,813][01192] Avg episode reward: [(0, '4.330')] +[2024-08-24 20:14:05,859][03430] Updated weights for policy 0, policy_version 5230 (0.0005) +[2024-08-24 20:14:06,947][03430] Updated weights for policy 0, policy_version 5240 (0.0006) +[2024-08-24 20:14:08,029][03430] Updated weights for policy 0, policy_version 5250 (0.0005) +[2024-08-24 20:14:09,103][03430] Updated weights for policy 0, policy_version 5260 (0.0005) +[2024-08-24 20:14:10,206][03430] Updated weights for policy 0, policy_version 5270 (0.0005) +[2024-08-24 20:14:10,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37068.8, 300 sec: 36628.0). Total num frames: 21606400. Throughput: 0: 9230.3. Samples: 5384080. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2024-08-24 20:14:10,813][01192] Avg episode reward: [(0, '4.533')] +[2024-08-24 20:14:11,295][03430] Updated weights for policy 0, policy_version 5280 (0.0006) +[2024-08-24 20:14:12,392][03430] Updated weights for policy 0, policy_version 5290 (0.0005) +[2024-08-24 20:14:13,507][03430] Updated weights for policy 0, policy_version 5300 (0.0006) +[2024-08-24 20:14:14,591][03430] Updated weights for policy 0, policy_version 5310 (0.0006) +[2024-08-24 20:14:15,682][03430] Updated weights for policy 0, policy_version 5320 (0.0005) +[2024-08-24 20:14:15,812][01192] Fps is (10 sec: 37683.3, 60 sec: 37068.8, 300 sec: 36655.7). Total num frames: 21794816. Throughput: 0: 9225.0. Samples: 5440168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2024-08-24 20:14:15,813][01192] Avg episode reward: [(0, '4.467')] +[2024-08-24 20:14:15,820][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005321_21794816.pth... +[2024-08-24 20:14:15,845][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000003170_12984320.pth +[2024-08-24 20:14:16,801][03430] Updated weights for policy 0, policy_version 5330 (0.0005) +[2024-08-24 20:14:17,854][03430] Updated weights for policy 0, policy_version 5340 (0.0005) +[2024-08-24 20:14:18,948][03430] Updated weights for policy 0, policy_version 5350 (0.0006) +[2024-08-24 20:14:20,062][03430] Updated weights for policy 0, policy_version 5360 (0.0005) +[2024-08-24 20:14:20,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37000.5, 300 sec: 36669.6). Total num frames: 21979136. Throughput: 0: 9237.4. Samples: 5468608. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:14:20,813][01192] Avg episode reward: [(0, '4.566')] +[2024-08-24 20:14:21,183][03430] Updated weights for policy 0, policy_version 5370 (0.0006) +[2024-08-24 20:14:22,266][03430] Updated weights for policy 0, policy_version 5380 (0.0005) +[2024-08-24 20:14:23,361][03430] Updated weights for policy 0, policy_version 5390 (0.0005) +[2024-08-24 20:14:24,475][03430] Updated weights for policy 0, policy_version 5400 (0.0005) +[2024-08-24 20:14:25,603][03430] Updated weights for policy 0, policy_version 5410 (0.0005) +[2024-08-24 20:14:25,812][01192] Fps is (10 sec: 36863.7, 60 sec: 37000.5, 300 sec: 36683.5). Total num frames: 22163456. Throughput: 0: 9242.6. Samples: 5524174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:14:25,813][01192] Avg episode reward: [(0, '4.510')] +[2024-08-24 20:14:26,717][03430] Updated weights for policy 0, policy_version 5420 (0.0006) +[2024-08-24 20:14:27,815][03430] Updated weights for policy 0, policy_version 5430 (0.0005) +[2024-08-24 20:14:28,930][03430] Updated weights for policy 0, policy_version 5440 (0.0005) +[2024-08-24 20:14:30,036][03430] Updated weights for policy 0, policy_version 5450 (0.0006) +[2024-08-24 20:14:30,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37000.5, 300 sec: 36711.3). Total num frames: 22351872. Throughput: 0: 9245.9. Samples: 5579522. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:14:30,813][01192] Avg episode reward: [(0, '4.485')] +[2024-08-24 20:14:31,129][03430] Updated weights for policy 0, policy_version 5460 (0.0006) +[2024-08-24 20:14:32,237][03430] Updated weights for policy 0, policy_version 5470 (0.0005) +[2024-08-24 20:14:33,340][03430] Updated weights for policy 0, policy_version 5480 (0.0006) +[2024-08-24 20:14:34,478][03430] Updated weights for policy 0, policy_version 5490 (0.0005) +[2024-08-24 20:14:35,564][03430] Updated weights for policy 0, policy_version 5500 (0.0005) +[2024-08-24 20:14:35,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37000.5, 300 sec: 36725.1). Total num frames: 22536192. Throughput: 0: 9253.1. Samples: 5607430. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:14:35,813][01192] Avg episode reward: [(0, '4.444')] +[2024-08-24 20:14:36,706][03430] Updated weights for policy 0, policy_version 5510 (0.0006) +[2024-08-24 20:14:37,820][03430] Updated weights for policy 0, policy_version 5520 (0.0006) +[2024-08-24 20:14:38,967][03430] Updated weights for policy 0, policy_version 5530 (0.0006) +[2024-08-24 20:14:40,076][03430] Updated weights for policy 0, policy_version 5540 (0.0005) +[2024-08-24 20:14:40,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36932.3, 300 sec: 36711.3). Total num frames: 22716416. Throughput: 0: 9239.0. Samples: 5662314. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:14:40,813][01192] Avg episode reward: [(0, '4.491')] +[2024-08-24 20:14:41,199][03430] Updated weights for policy 0, policy_version 5550 (0.0005) +[2024-08-24 20:14:42,323][03430] Updated weights for policy 0, policy_version 5560 (0.0006) +[2024-08-24 20:14:43,459][03430] Updated weights for policy 0, policy_version 5570 (0.0005) +[2024-08-24 20:14:44,565][03430] Updated weights for policy 0, policy_version 5580 (0.0005) +[2024-08-24 20:14:45,666][03430] Updated weights for policy 0, policy_version 5590 (0.0005) +[2024-08-24 20:14:45,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36932.3, 300 sec: 36711.3). Total num frames: 22900736. Throughput: 0: 9244.8. Samples: 5717240. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:14:45,819][01192] Avg episode reward: [(0, '4.417')] +[2024-08-24 20:14:46,779][03430] Updated weights for policy 0, policy_version 5600 (0.0005) +[2024-08-24 20:14:47,912][03430] Updated weights for policy 0, policy_version 5610 (0.0006) +[2024-08-24 20:14:49,037][03430] Updated weights for policy 0, policy_version 5620 (0.0005) +[2024-08-24 20:14:50,181][03430] Updated weights for policy 0, policy_version 5630 (0.0007) +[2024-08-24 20:14:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 23080960. Throughput: 0: 9245.3. Samples: 5744682. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:14:50,813][01192] Avg episode reward: [(0, '4.385')] +[2024-08-24 20:14:51,271][03430] Updated weights for policy 0, policy_version 5640 (0.0006) +[2024-08-24 20:14:52,384][03430] Updated weights for policy 0, policy_version 5650 (0.0006) +[2024-08-24 20:14:53,510][03430] Updated weights for policy 0, policy_version 5660 (0.0006) +[2024-08-24 20:14:54,658][03430] Updated weights for policy 0, policy_version 5670 (0.0006) +[2024-08-24 20:14:55,799][03430] Updated weights for policy 0, policy_version 5680 (0.0006) +[2024-08-24 20:14:55,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36863.9, 300 sec: 36711.3). Total num frames: 23265280. Throughput: 0: 9231.5. Samples: 5799498. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:14:55,813][01192] Avg episode reward: [(0, '4.522')] +[2024-08-24 20:14:56,943][03430] Updated weights for policy 0, policy_version 5690 (0.0006) +[2024-08-24 20:14:58,074][03430] Updated weights for policy 0, policy_version 5700 (0.0006) +[2024-08-24 20:14:59,191][03430] Updated weights for policy 0, policy_version 5710 (0.0006) +[2024-08-24 20:15:00,324][03430] Updated weights for policy 0, policy_version 5720 (0.0005) +[2024-08-24 20:15:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36864.0, 300 sec: 36711.3). Total num frames: 23445504. Throughput: 0: 9191.2. Samples: 5853772. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:15:00,813][01192] Avg episode reward: [(0, '4.290')] +[2024-08-24 20:15:01,511][03430] Updated weights for policy 0, policy_version 5730 (0.0005) +[2024-08-24 20:15:02,678][03430] Updated weights for policy 0, policy_version 5740 (0.0006) +[2024-08-24 20:15:03,800][03430] Updated weights for policy 0, policy_version 5750 (0.0006) +[2024-08-24 20:15:04,929][03430] Updated weights for policy 0, policy_version 5760 (0.0005) +[2024-08-24 20:15:05,812][01192] Fps is (10 sec: 35635.5, 60 sec: 36727.5, 300 sec: 36683.5). 
Total num frames: 23621632. Throughput: 0: 9139.7. Samples: 5879896. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:05,813][01192] Avg episode reward: [(0, '4.355')] +[2024-08-24 20:15:06,041][03430] Updated weights for policy 0, policy_version 5770 (0.0006) +[2024-08-24 20:15:07,170][03430] Updated weights for policy 0, policy_version 5780 (0.0006) +[2024-08-24 20:15:08,295][03430] Updated weights for policy 0, policy_version 5790 (0.0006) +[2024-08-24 20:15:09,421][03430] Updated weights for policy 0, policy_version 5800 (0.0006) +[2024-08-24 20:15:10,542][03430] Updated weights for policy 0, policy_version 5810 (0.0005) +[2024-08-24 20:15:10,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36659.2, 300 sec: 36683.5). Total num frames: 23805952. Throughput: 0: 9121.9. Samples: 5934658. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:10,813][01192] Avg episode reward: [(0, '4.309')] +[2024-08-24 20:15:11,643][03430] Updated weights for policy 0, policy_version 5820 (0.0007) +[2024-08-24 20:15:12,758][03430] Updated weights for policy 0, policy_version 5830 (0.0006) +[2024-08-24 20:15:13,869][03430] Updated weights for policy 0, policy_version 5840 (0.0007) +[2024-08-24 20:15:15,012][03430] Updated weights for policy 0, policy_version 5850 (0.0006) +[2024-08-24 20:15:15,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36590.9, 300 sec: 36683.5). Total num frames: 23990272. Throughput: 0: 9114.2. Samples: 5989664. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:15,813][01192] Avg episode reward: [(0, '4.360')] +[2024-08-24 20:15:16,134][03430] Updated weights for policy 0, policy_version 5860 (0.0006) +[2024-08-24 20:15:17,272][03430] Updated weights for policy 0, policy_version 5870 (0.0006) +[2024-08-24 20:15:18,388][03430] Updated weights for policy 0, policy_version 5880 (0.0006) +[2024-08-24 20:15:19,502][03430] Updated weights for policy 0, policy_version 5890 (0.0005) +[2024-08-24 20:15:20,626][03430] Updated weights for policy 0, policy_version 5900 (0.0006) +[2024-08-24 20:15:20,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.7, 300 sec: 36683.5). Total num frames: 24170496. Throughput: 0: 9100.2. Samples: 6016936. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:20,812][01192] Avg episode reward: [(0, '4.295')] +[2024-08-24 20:15:21,742][03430] Updated weights for policy 0, policy_version 5910 (0.0006) +[2024-08-24 20:15:22,869][03430] Updated weights for policy 0, policy_version 5920 (0.0005) +[2024-08-24 20:15:23,977][03430] Updated weights for policy 0, policy_version 5930 (0.0006) +[2024-08-24 20:15:25,086][03430] Updated weights for policy 0, policy_version 5940 (0.0006) +[2024-08-24 20:15:25,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.7, 300 sec: 36683.5). Total num frames: 24354816. Throughput: 0: 9099.2. Samples: 6071780. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:15:25,813][01192] Avg episode reward: [(0, '4.282')] +[2024-08-24 20:15:26,226][03430] Updated weights for policy 0, policy_version 5950 (0.0005) +[2024-08-24 20:15:27,348][03430] Updated weights for policy 0, policy_version 5960 (0.0006) +[2024-08-24 20:15:28,469][03430] Updated weights for policy 0, policy_version 5970 (0.0005) +[2024-08-24 20:15:29,606][03430] Updated weights for policy 0, policy_version 5980 (0.0006) +[2024-08-24 20:15:30,742][03430] Updated weights for policy 0, policy_version 5990 (0.0005) +[2024-08-24 20:15:30,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36386.1, 300 sec: 36669.6). Total num frames: 24535040. Throughput: 0: 9088.0. Samples: 6126200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:15:30,813][01192] Avg episode reward: [(0, '4.612')] +[2024-08-24 20:15:31,872][03430] Updated weights for policy 0, policy_version 6000 (0.0006) +[2024-08-24 20:15:33,030][03430] Updated weights for policy 0, policy_version 6010 (0.0006) +[2024-08-24 20:15:34,142][03430] Updated weights for policy 0, policy_version 6020 (0.0006) +[2024-08-24 20:15:35,280][03430] Updated weights for policy 0, policy_version 6030 (0.0005) +[2024-08-24 20:15:35,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36317.9, 300 sec: 36655.7). Total num frames: 24715264. Throughput: 0: 9078.9. Samples: 6153232. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:15:35,813][01192] Avg episode reward: [(0, '4.471')] +[2024-08-24 20:15:36,389][03430] Updated weights for policy 0, policy_version 6040 (0.0007) +[2024-08-24 20:15:37,488][03430] Updated weights for policy 0, policy_version 6050 (0.0006) +[2024-08-24 20:15:38,618][03430] Updated weights for policy 0, policy_version 6060 (0.0006) +[2024-08-24 20:15:39,733][03430] Updated weights for policy 0, policy_version 6070 (0.0005) +[2024-08-24 20:15:40,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36386.1, 300 sec: 36669.6). 
Total num frames: 24899584. Throughput: 0: 9084.0. Samples: 6208278. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:40,813][01192] Avg episode reward: [(0, '4.540')] +[2024-08-24 20:15:40,835][03430] Updated weights for policy 0, policy_version 6080 (0.0005) +[2024-08-24 20:15:41,938][03430] Updated weights for policy 0, policy_version 6090 (0.0006) +[2024-08-24 20:15:43,040][03430] Updated weights for policy 0, policy_version 6100 (0.0006) +[2024-08-24 20:15:44,168][03430] Updated weights for policy 0, policy_version 6110 (0.0005) +[2024-08-24 20:15:45,298][03430] Updated weights for policy 0, policy_version 6120 (0.0006) +[2024-08-24 20:15:45,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36386.1, 300 sec: 36683.5). Total num frames: 25083904. Throughput: 0: 9102.0. Samples: 6263360. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:15:45,813][01192] Avg episode reward: [(0, '4.601')] +[2024-08-24 20:15:46,431][03430] Updated weights for policy 0, policy_version 6130 (0.0005) +[2024-08-24 20:15:47,562][03430] Updated weights for policy 0, policy_version 6140 (0.0005) +[2024-08-24 20:15:48,673][03430] Updated weights for policy 0, policy_version 6150 (0.0006) +[2024-08-24 20:15:49,768][03430] Updated weights for policy 0, policy_version 6160 (0.0005) +[2024-08-24 20:15:50,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36454.4, 300 sec: 36697.4). Total num frames: 25268224. Throughput: 0: 9126.2. Samples: 6290574. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:15:50,813][01192] Avg episode reward: [(0, '4.473')] +[2024-08-24 20:15:50,867][03430] Updated weights for policy 0, policy_version 6170 (0.0006) +[2024-08-24 20:15:51,991][03430] Updated weights for policy 0, policy_version 6180 (0.0006) +[2024-08-24 20:15:53,096][03430] Updated weights for policy 0, policy_version 6190 (0.0006) +[2024-08-24 20:15:54,187][03430] Updated weights for policy 0, policy_version 6200 (0.0005) +[2024-08-24 20:15:55,298][03430] Updated weights for policy 0, policy_version 6210 (0.0005) +[2024-08-24 20:15:55,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36454.5, 300 sec: 36725.2). Total num frames: 25452544. Throughput: 0: 9142.3. Samples: 6346060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:15:55,813][01192] Avg episode reward: [(0, '4.737')] +[2024-08-24 20:15:55,822][03417] Saving new best policy, reward=4.737! +[2024-08-24 20:15:56,436][03430] Updated weights for policy 0, policy_version 6220 (0.0006) +[2024-08-24 20:15:57,574][03430] Updated weights for policy 0, policy_version 6230 (0.0006) +[2024-08-24 20:15:58,713][03430] Updated weights for policy 0, policy_version 6240 (0.0006) +[2024-08-24 20:15:59,822][03430] Updated weights for policy 0, policy_version 6250 (0.0005) +[2024-08-24 20:16:00,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36454.4, 300 sec: 36711.3). Total num frames: 25632768. Throughput: 0: 9135.4. Samples: 6400756. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:16:00,813][01192] Avg episode reward: [(0, '4.452')] +[2024-08-24 20:16:00,943][03430] Updated weights for policy 0, policy_version 6260 (0.0005) +[2024-08-24 20:16:02,061][03430] Updated weights for policy 0, policy_version 6270 (0.0006) +[2024-08-24 20:16:03,178][03430] Updated weights for policy 0, policy_version 6280 (0.0006) +[2024-08-24 20:16:04,309][03430] Updated weights for policy 0, policy_version 6290 (0.0006) +[2024-08-24 20:16:05,416][03430] Updated weights for policy 0, policy_version 6300 (0.0006) +[2024-08-24 20:16:05,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36590.9, 300 sec: 36725.1). Total num frames: 25817088. Throughput: 0: 9143.2. Samples: 6428382. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:16:05,813][01192] Avg episode reward: [(0, '4.268')] +[2024-08-24 20:16:06,532][03430] Updated weights for policy 0, policy_version 6310 (0.0005) +[2024-08-24 20:16:07,650][03430] Updated weights for policy 0, policy_version 6320 (0.0006) +[2024-08-24 20:16:08,764][03430] Updated weights for policy 0, policy_version 6330 (0.0005) +[2024-08-24 20:16:09,864][03430] Updated weights for policy 0, policy_version 6340 (0.0005) +[2024-08-24 20:16:10,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36590.9, 300 sec: 36725.1). Total num frames: 26001408. Throughput: 0: 9148.8. Samples: 6483478. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:16:10,813][01192] Avg episode reward: [(0, '4.335')] +[2024-08-24 20:16:10,983][03430] Updated weights for policy 0, policy_version 6350 (0.0007) +[2024-08-24 20:16:12,094][03430] Updated weights for policy 0, policy_version 6360 (0.0005) +[2024-08-24 20:16:13,191][03430] Updated weights for policy 0, policy_version 6370 (0.0005) +[2024-08-24 20:16:14,312][03430] Updated weights for policy 0, policy_version 6380 (0.0005) +[2024-08-24 20:16:15,437][03430] Updated weights for policy 0, policy_version 6390 (0.0006) +[2024-08-24 20:16:15,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36591.0, 300 sec: 36725.2). Total num frames: 26185728. Throughput: 0: 9167.2. Samples: 6538726. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:16:15,813][01192] Avg episode reward: [(0, '4.502')] +[2024-08-24 20:16:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006393_26185728.pth... +[2024-08-24 20:16:15,841][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000004240_17367040.pth +[2024-08-24 20:16:16,549][03430] Updated weights for policy 0, policy_version 6400 (0.0006) +[2024-08-24 20:16:17,665][03430] Updated weights for policy 0, policy_version 6410 (0.0007) +[2024-08-24 20:16:18,791][03430] Updated weights for policy 0, policy_version 6420 (0.0006) +[2024-08-24 20:16:19,939][03430] Updated weights for policy 0, policy_version 6430 (0.0006) +[2024-08-24 20:16:20,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36590.9, 300 sec: 36711.3). Total num frames: 26365952. Throughput: 0: 9173.8. Samples: 6566054. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:16:20,812][01192] Avg episode reward: [(0, '4.308')] +[2024-08-24 20:16:21,043][03430] Updated weights for policy 0, policy_version 6440 (0.0006) +[2024-08-24 20:16:22,154][03430] Updated weights for policy 0, policy_version 6450 (0.0005) +[2024-08-24 20:16:23,257][03430] Updated weights for policy 0, policy_version 6460 (0.0005) +[2024-08-24 20:16:24,392][03430] Updated weights for policy 0, policy_version 6470 (0.0006) +[2024-08-24 20:16:25,471][03430] Updated weights for policy 0, policy_version 6480 (0.0005) +[2024-08-24 20:16:25,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36659.2, 300 sec: 36739.0). Total num frames: 26554368. Throughput: 0: 9173.0. Samples: 6621064. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:16:25,813][01192] Avg episode reward: [(0, '4.312')] +[2024-08-24 20:16:26,589][03430] Updated weights for policy 0, policy_version 6490 (0.0005) +[2024-08-24 20:16:27,699][03430] Updated weights for policy 0, policy_version 6500 (0.0005) +[2024-08-24 20:16:28,817][03430] Updated weights for policy 0, policy_version 6510 (0.0005) +[2024-08-24 20:16:29,952][03430] Updated weights for policy 0, policy_version 6520 (0.0006) +[2024-08-24 20:16:30,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36659.2, 300 sec: 36725.2). Total num frames: 26734592. Throughput: 0: 9175.4. Samples: 6676254. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:16:30,813][01192] Avg episode reward: [(0, '4.656')] +[2024-08-24 20:16:31,060][03430] Updated weights for policy 0, policy_version 6530 (0.0006) +[2024-08-24 20:16:32,142][03430] Updated weights for policy 0, policy_version 6540 (0.0007) +[2024-08-24 20:16:33,239][03430] Updated weights for policy 0, policy_version 6550 (0.0006) +[2024-08-24 20:16:34,371][03430] Updated weights for policy 0, policy_version 6560 (0.0006) +[2024-08-24 20:16:35,495][03430] Updated weights for policy 0, policy_version 6570 (0.0006) +[2024-08-24 20:16:35,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36727.5, 300 sec: 36739.0). Total num frames: 26918912. Throughput: 0: 9189.2. Samples: 6704088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:16:35,813][01192] Avg episode reward: [(0, '4.511')] +[2024-08-24 20:16:36,621][03430] Updated weights for policy 0, policy_version 6580 (0.0006) +[2024-08-24 20:16:37,732][03430] Updated weights for policy 0, policy_version 6590 (0.0005) +[2024-08-24 20:16:38,867][03430] Updated weights for policy 0, policy_version 6600 (0.0005) +[2024-08-24 20:16:39,989][03430] Updated weights for policy 0, policy_version 6610 (0.0006) +[2024-08-24 20:16:40,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36727.4, 300 sec: 36739.0). Total num frames: 27103232. Throughput: 0: 9177.6. Samples: 6759052. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:16:40,813][01192] Avg episode reward: [(0, '4.418')] +[2024-08-24 20:16:41,100][03430] Updated weights for policy 0, policy_version 6620 (0.0006) +[2024-08-24 20:16:42,212][03430] Updated weights for policy 0, policy_version 6630 (0.0006) +[2024-08-24 20:16:43,345][03430] Updated weights for policy 0, policy_version 6640 (0.0006) +[2024-08-24 20:16:44,462][03430] Updated weights for policy 0, policy_version 6650 (0.0006) +[2024-08-24 20:16:45,595][03430] Updated weights for policy 0, policy_version 6660 (0.0006) +[2024-08-24 20:16:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36659.2, 300 sec: 36725.2). Total num frames: 27283456. Throughput: 0: 9177.0. Samples: 6813720. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:16:45,813][01192] Avg episode reward: [(0, '4.463')] +[2024-08-24 20:16:46,690][03430] Updated weights for policy 0, policy_version 6670 (0.0005) +[2024-08-24 20:16:47,758][03430] Updated weights for policy 0, policy_version 6680 (0.0006) +[2024-08-24 20:16:48,828][03430] Updated weights for policy 0, policy_version 6690 (0.0006) +[2024-08-24 20:16:49,906][03430] Updated weights for policy 0, policy_version 6700 (0.0005) +[2024-08-24 20:16:50,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36727.5, 300 sec: 36752.9). Total num frames: 27471872. Throughput: 0: 9187.3. Samples: 6841812. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:16:50,812][01192] Avg episode reward: [(0, '4.610')] +[2024-08-24 20:16:51,041][03430] Updated weights for policy 0, policy_version 6710 (0.0006) +[2024-08-24 20:16:52,150][03430] Updated weights for policy 0, policy_version 6720 (0.0006) +[2024-08-24 20:16:53,249][03430] Updated weights for policy 0, policy_version 6730 (0.0005) +[2024-08-24 20:16:54,364][03430] Updated weights for policy 0, policy_version 6740 (0.0005) +[2024-08-24 20:16:55,466][03430] Updated weights for policy 0, policy_version 6750 (0.0005) +[2024-08-24 20:16:55,812][01192] Fps is (10 sec: 37683.3, 60 sec: 36795.7, 300 sec: 36766.8). Total num frames: 27660288. Throughput: 0: 9204.3. Samples: 6897670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:16:55,813][01192] Avg episode reward: [(0, '4.479')] +[2024-08-24 20:16:56,573][03430] Updated weights for policy 0, policy_version 6760 (0.0006) +[2024-08-24 20:16:57,696][03430] Updated weights for policy 0, policy_version 6770 (0.0005) +[2024-08-24 20:16:58,817][03430] Updated weights for policy 0, policy_version 6780 (0.0006) +[2024-08-24 20:16:59,924][03430] Updated weights for policy 0, policy_version 6790 (0.0006) +[2024-08-24 20:17:00,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36795.7, 300 sec: 36752.9). Total num frames: 27840512. Throughput: 0: 9205.1. Samples: 6952956. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:17:00,812][01192] Avg episode reward: [(0, '4.411')] +[2024-08-24 20:17:01,047][03430] Updated weights for policy 0, policy_version 6800 (0.0006) +[2024-08-24 20:17:02,188][03430] Updated weights for policy 0, policy_version 6810 (0.0006) +[2024-08-24 20:17:03,366][03430] Updated weights for policy 0, policy_version 6820 (0.0006) +[2024-08-24 20:17:04,511][03430] Updated weights for policy 0, policy_version 6830 (0.0005) +[2024-08-24 20:17:05,615][03430] Updated weights for policy 0, policy_version 6840 (0.0006) +[2024-08-24 20:17:05,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36727.4, 300 sec: 36752.9). Total num frames: 28020736. Throughput: 0: 9193.5. Samples: 6979762. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:17:05,813][01192] Avg episode reward: [(0, '4.319')] +[2024-08-24 20:17:06,710][03430] Updated weights for policy 0, policy_version 6850 (0.0006) +[2024-08-24 20:17:07,811][03430] Updated weights for policy 0, policy_version 6860 (0.0006) +[2024-08-24 20:17:08,936][03430] Updated weights for policy 0, policy_version 6870 (0.0006) +[2024-08-24 20:17:10,039][03430] Updated weights for policy 0, policy_version 6880 (0.0006) +[2024-08-24 20:17:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36727.5, 300 sec: 36752.9). Total num frames: 28205056. Throughput: 0: 9193.6. Samples: 7034776. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:17:10,812][01192] Avg episode reward: [(0, '4.642')] +[2024-08-24 20:17:11,153][03430] Updated weights for policy 0, policy_version 6890 (0.0006) +[2024-08-24 20:17:12,237][03430] Updated weights for policy 0, policy_version 6900 (0.0005) +[2024-08-24 20:17:13,359][03430] Updated weights for policy 0, policy_version 6910 (0.0006) +[2024-08-24 20:17:14,459][03430] Updated weights for policy 0, policy_version 6920 (0.0006) +[2024-08-24 20:17:15,577][03430] Updated weights for policy 0, policy_version 6930 (0.0007) +[2024-08-24 20:17:15,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36795.7, 300 sec: 36752.9). Total num frames: 28393472. Throughput: 0: 9205.6. Samples: 7090506. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:17:15,813][01192] Avg episode reward: [(0, '4.412')] +[2024-08-24 20:17:16,694][03430] Updated weights for policy 0, policy_version 6940 (0.0007) +[2024-08-24 20:17:17,753][03430] Updated weights for policy 0, policy_version 6950 (0.0005) +[2024-08-24 20:17:18,807][03430] Updated weights for policy 0, policy_version 6960 (0.0006) +[2024-08-24 20:17:19,880][03430] Updated weights for policy 0, policy_version 6970 (0.0005) +[2024-08-24 20:17:20,812][01192] Fps is (10 sec: 37682.9, 60 sec: 36932.2, 300 sec: 36780.7). Total num frames: 28581888. Throughput: 0: 9205.9. Samples: 7118354. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:17:20,813][01192] Avg episode reward: [(0, '4.309')] +[2024-08-24 20:17:20,986][03430] Updated weights for policy 0, policy_version 6980 (0.0005) +[2024-08-24 20:17:22,073][03430] Updated weights for policy 0, policy_version 6990 (0.0005) +[2024-08-24 20:17:23,173][03430] Updated weights for policy 0, policy_version 7000 (0.0005) +[2024-08-24 20:17:24,279][03430] Updated weights for policy 0, policy_version 7010 (0.0006) +[2024-08-24 20:17:25,368][03430] Updated weights for policy 0, policy_version 7020 (0.0006) +[2024-08-24 20:17:25,812][01192] Fps is (10 sec: 37273.7, 60 sec: 36864.0, 300 sec: 36794.6). Total num frames: 28766208. Throughput: 0: 9237.9. Samples: 7174758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:17:25,813][01192] Avg episode reward: [(0, '4.544')] +[2024-08-24 20:17:26,508][03430] Updated weights for policy 0, policy_version 7030 (0.0006) +[2024-08-24 20:17:27,606][03430] Updated weights for policy 0, policy_version 7040 (0.0005) +[2024-08-24 20:17:28,718][03430] Updated weights for policy 0, policy_version 7050 (0.0006) +[2024-08-24 20:17:29,789][03430] Updated weights for policy 0, policy_version 7060 (0.0005) +[2024-08-24 20:17:30,812][01192] Fps is (10 sec: 37274.0, 60 sec: 37000.5, 300 sec: 36808.5). Total num frames: 28954624. Throughput: 0: 9264.5. Samples: 7230624. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:17:30,813][01192] Avg episode reward: [(0, '4.507')] +[2024-08-24 20:17:30,900][03430] Updated weights for policy 0, policy_version 7070 (0.0006) +[2024-08-24 20:17:32,012][03430] Updated weights for policy 0, policy_version 7080 (0.0006) +[2024-08-24 20:17:33,117][03430] Updated weights for policy 0, policy_version 7090 (0.0006) +[2024-08-24 20:17:34,221][03430] Updated weights for policy 0, policy_version 7100 (0.0007) +[2024-08-24 20:17:35,333][03430] Updated weights for policy 0, policy_version 7110 (0.0005) +[2024-08-24 20:17:35,812][01192] Fps is (10 sec: 37273.8, 60 sec: 37000.5, 300 sec: 36808.5). Total num frames: 29138944. Throughput: 0: 9254.4. Samples: 7258262. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:17:35,813][01192] Avg episode reward: [(0, '4.286')] +[2024-08-24 20:17:36,452][03430] Updated weights for policy 0, policy_version 7120 (0.0006) +[2024-08-24 20:17:37,555][03430] Updated weights for policy 0, policy_version 7130 (0.0006) +[2024-08-24 20:17:38,654][03430] Updated weights for policy 0, policy_version 7140 (0.0005) +[2024-08-24 20:17:39,769][03430] Updated weights for policy 0, policy_version 7150 (0.0006) +[2024-08-24 20:17:40,812][01192] Fps is (10 sec: 36863.9, 60 sec: 37000.6, 300 sec: 36808.5). Total num frames: 29323264. Throughput: 0: 9245.7. Samples: 7313728. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:17:40,813][01192] Avg episode reward: [(0, '4.293')] +[2024-08-24 20:17:40,875][03430] Updated weights for policy 0, policy_version 7160 (0.0007) +[2024-08-24 20:17:41,991][03430] Updated weights for policy 0, policy_version 7170 (0.0005) +[2024-08-24 20:17:43,110][03430] Updated weights for policy 0, policy_version 7180 (0.0005) +[2024-08-24 20:17:44,210][03430] Updated weights for policy 0, policy_version 7190 (0.0006) +[2024-08-24 20:17:45,311][03430] Updated weights for policy 0, policy_version 7200 (0.0005) +[2024-08-24 20:17:45,812][01192] Fps is (10 sec: 36864.1, 60 sec: 37068.8, 300 sec: 36794.6). Total num frames: 29507584. Throughput: 0: 9249.3. Samples: 7369174. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:17:45,813][01192] Avg episode reward: [(0, '4.395')] +[2024-08-24 20:17:46,420][03430] Updated weights for policy 0, policy_version 7210 (0.0006) +[2024-08-24 20:17:47,526][03430] Updated weights for policy 0, policy_version 7220 (0.0006) +[2024-08-24 20:17:48,646][03430] Updated weights for policy 0, policy_version 7230 (0.0005) +[2024-08-24 20:17:49,736][03430] Updated weights for policy 0, policy_version 7240 (0.0005) +[2024-08-24 20:17:50,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37000.5, 300 sec: 36808.5). Total num frames: 29691904. Throughput: 0: 9262.0. Samples: 7396550. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:17:50,812][01192] Avg episode reward: [(0, '4.470')] +[2024-08-24 20:17:50,834][03430] Updated weights for policy 0, policy_version 7250 (0.0005) +[2024-08-24 20:17:51,963][03430] Updated weights for policy 0, policy_version 7260 (0.0005) +[2024-08-24 20:17:53,083][03430] Updated weights for policy 0, policy_version 7270 (0.0006) +[2024-08-24 20:17:54,177][03430] Updated weights for policy 0, policy_version 7280 (0.0005) +[2024-08-24 20:17:55,296][03430] Updated weights for policy 0, policy_version 7290 (0.0006) +[2024-08-24 20:17:55,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36932.3, 300 sec: 36822.3). Total num frames: 29876224. Throughput: 0: 9280.0. Samples: 7452376. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:17:55,813][01192] Avg episode reward: [(0, '4.436')] +[2024-08-24 20:17:56,380][03430] Updated weights for policy 0, policy_version 7300 (0.0006) +[2024-08-24 20:17:57,454][03430] Updated weights for policy 0, policy_version 7310 (0.0005) +[2024-08-24 20:17:58,539][03430] Updated weights for policy 0, policy_version 7320 (0.0005) +[2024-08-24 20:17:59,657][03430] Updated weights for policy 0, policy_version 7330 (0.0005) +[2024-08-24 20:18:00,754][03430] Updated weights for policy 0, policy_version 7340 (0.0006) +[2024-08-24 20:18:00,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37068.8, 300 sec: 36836.2). Total num frames: 30064640. Throughput: 0: 9287.7. Samples: 7508452. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:18:00,819][01192] Avg episode reward: [(0, '4.565')] +[2024-08-24 20:18:01,836][03430] Updated weights for policy 0, policy_version 7350 (0.0005) +[2024-08-24 20:18:02,931][03430] Updated weights for policy 0, policy_version 7360 (0.0005) +[2024-08-24 20:18:04,053][03430] Updated weights for policy 0, policy_version 7370 (0.0006) +[2024-08-24 20:18:05,146][03430] Updated weights for policy 0, policy_version 7380 (0.0006) +[2024-08-24 20:18:05,812][01192] Fps is (10 sec: 37683.1, 60 sec: 37205.4, 300 sec: 36850.1). Total num frames: 30253056. Throughput: 0: 9287.9. Samples: 7536310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:18:05,813][01192] Avg episode reward: [(0, '4.168')] +[2024-08-24 20:18:06,262][03430] Updated weights for policy 0, policy_version 7390 (0.0006) +[2024-08-24 20:18:07,347][03430] Updated weights for policy 0, policy_version 7400 (0.0005) +[2024-08-24 20:18:08,451][03430] Updated weights for policy 0, policy_version 7410 (0.0006) +[2024-08-24 20:18:09,552][03430] Updated weights for policy 0, policy_version 7420 (0.0006) +[2024-08-24 20:18:10,682][03430] Updated weights for policy 0, policy_version 7430 (0.0006) +[2024-08-24 20:18:10,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37205.3, 300 sec: 36836.2). Total num frames: 30437376. Throughput: 0: 9274.9. Samples: 7592130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:18:10,813][01192] Avg episode reward: [(0, '4.458')] +[2024-08-24 20:18:11,740][03430] Updated weights for policy 0, policy_version 7440 (0.0005) +[2024-08-24 20:18:12,794][03430] Updated weights for policy 0, policy_version 7450 (0.0005) +[2024-08-24 20:18:13,929][03430] Updated weights for policy 0, policy_version 7460 (0.0005) +[2024-08-24 20:18:15,016][03430] Updated weights for policy 0, policy_version 7470 (0.0005) +[2024-08-24 20:18:15,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37205.4, 300 sec: 36836.2). 
Total num frames: 30625792. Throughput: 0: 9280.9. Samples: 7648264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:18:15,813][01192] Avg episode reward: [(0, '4.422')] +[2024-08-24 20:18:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000007477_30625792.pth... +[2024-08-24 20:18:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000005321_21794816.pth +[2024-08-24 20:18:16,121][03430] Updated weights for policy 0, policy_version 7480 (0.0006) +[2024-08-24 20:18:17,216][03430] Updated weights for policy 0, policy_version 7490 (0.0005) +[2024-08-24 20:18:18,289][03430] Updated weights for policy 0, policy_version 7500 (0.0005) +[2024-08-24 20:18:19,336][03430] Updated weights for policy 0, policy_version 7510 (0.0006) +[2024-08-24 20:18:20,434][03430] Updated weights for policy 0, policy_version 7520 (0.0005) +[2024-08-24 20:18:20,812][01192] Fps is (10 sec: 37683.1, 60 sec: 37205.4, 300 sec: 36850.1). Total num frames: 30814208. Throughput: 0: 9292.5. Samples: 7676424. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:18:20,813][01192] Avg episode reward: [(0, '4.352')] +[2024-08-24 20:18:21,534][03430] Updated weights for policy 0, policy_version 7530 (0.0006) +[2024-08-24 20:18:22,621][03430] Updated weights for policy 0, policy_version 7540 (0.0005) +[2024-08-24 20:18:23,729][03430] Updated weights for policy 0, policy_version 7550 (0.0005) +[2024-08-24 20:18:24,824][03430] Updated weights for policy 0, policy_version 7560 (0.0005) +[2024-08-24 20:18:25,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37205.4, 300 sec: 36836.2). Total num frames: 30998528. Throughput: 0: 9312.6. Samples: 7732796. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:18:25,813][01192] Avg episode reward: [(0, '4.546')] +[2024-08-24 20:18:25,943][03430] Updated weights for policy 0, policy_version 7570 (0.0006) +[2024-08-24 20:18:27,059][03430] Updated weights for policy 0, policy_version 7580 (0.0005) +[2024-08-24 20:18:28,175][03430] Updated weights for policy 0, policy_version 7590 (0.0006) +[2024-08-24 20:18:29,283][03430] Updated weights for policy 0, policy_version 7600 (0.0006) +[2024-08-24 20:18:30,381][03430] Updated weights for policy 0, policy_version 7610 (0.0005) +[2024-08-24 20:18:30,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37137.1, 300 sec: 36836.2). Total num frames: 31182848. Throughput: 0: 9316.0. Samples: 7788394. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:18:30,813][01192] Avg episode reward: [(0, '4.456')] +[2024-08-24 20:18:31,493][03430] Updated weights for policy 0, policy_version 7620 (0.0005) +[2024-08-24 20:18:32,580][03430] Updated weights for policy 0, policy_version 7630 (0.0006) +[2024-08-24 20:18:33,676][03430] Updated weights for policy 0, policy_version 7640 (0.0005) +[2024-08-24 20:18:34,765][03430] Updated weights for policy 0, policy_version 7650 (0.0006) +[2024-08-24 20:18:35,812][01192] Fps is (10 sec: 37273.3, 60 sec: 37205.3, 300 sec: 36850.1). Total num frames: 31371264. Throughput: 0: 9326.3. Samples: 7816234. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:18:35,813][01192] Avg episode reward: [(0, '4.487')] +[2024-08-24 20:18:35,888][03430] Updated weights for policy 0, policy_version 7660 (0.0006) +[2024-08-24 20:18:36,990][03430] Updated weights for policy 0, policy_version 7670 (0.0005) +[2024-08-24 20:18:38,080][03430] Updated weights for policy 0, policy_version 7680 (0.0005) +[2024-08-24 20:18:39,179][03430] Updated weights for policy 0, policy_version 7690 (0.0005) +[2024-08-24 20:18:40,297][03430] Updated weights for policy 0, policy_version 7700 (0.0006) +[2024-08-24 20:18:40,812][01192] Fps is (10 sec: 37273.5, 60 sec: 37205.3, 300 sec: 36850.1). Total num frames: 31555584. Throughput: 0: 9325.4. Samples: 7872018. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:18:40,813][01192] Avg episode reward: [(0, '4.496')] +[2024-08-24 20:18:41,403][03430] Updated weights for policy 0, policy_version 7710 (0.0005) +[2024-08-24 20:18:42,506][03430] Updated weights for policy 0, policy_version 7720 (0.0006) +[2024-08-24 20:18:43,599][03430] Updated weights for policy 0, policy_version 7730 (0.0006) +[2024-08-24 20:18:44,713][03430] Updated weights for policy 0, policy_version 7740 (0.0006) +[2024-08-24 20:18:45,792][03430] Updated weights for policy 0, policy_version 7750 (0.0005) +[2024-08-24 20:18:45,812][01192] Fps is (10 sec: 37273.9, 60 sec: 37273.6, 300 sec: 36864.0). Total num frames: 31744000. Throughput: 0: 9317.5. Samples: 7927740. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:18:45,813][01192] Avg episode reward: [(0, '4.411')] +[2024-08-24 20:18:46,895][03430] Updated weights for policy 0, policy_version 7760 (0.0006) +[2024-08-24 20:18:47,995][03430] Updated weights for policy 0, policy_version 7770 (0.0005) +[2024-08-24 20:18:49,085][03430] Updated weights for policy 0, policy_version 7780 (0.0006) +[2024-08-24 20:18:50,181][03430] Updated weights for policy 0, policy_version 7790 (0.0005) +[2024-08-24 20:18:50,812][01192] Fps is (10 sec: 37273.8, 60 sec: 37273.6, 300 sec: 36864.0). Total num frames: 31928320. Throughput: 0: 9321.3. Samples: 7955766. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:18:50,813][01192] Avg episode reward: [(0, '4.420')] +[2024-08-24 20:18:51,261][03430] Updated weights for policy 0, policy_version 7800 (0.0006) +[2024-08-24 20:18:52,335][03430] Updated weights for policy 0, policy_version 7810 (0.0005) +[2024-08-24 20:18:53,435][03430] Updated weights for policy 0, policy_version 7820 (0.0005) +[2024-08-24 20:18:54,532][03430] Updated weights for policy 0, policy_version 7830 (0.0004) +[2024-08-24 20:18:55,623][03430] Updated weights for policy 0, policy_version 7840 (0.0005) +[2024-08-24 20:18:55,812][01192] Fps is (10 sec: 37273.5, 60 sec: 37341.8, 300 sec: 36891.8). Total num frames: 32116736. Throughput: 0: 9334.9. Samples: 8012202. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:18:55,813][01192] Avg episode reward: [(0, '4.306')] +[2024-08-24 20:18:56,728][03430] Updated weights for policy 0, policy_version 7850 (0.0005) +[2024-08-24 20:18:57,856][03430] Updated weights for policy 0, policy_version 7860 (0.0006) +[2024-08-24 20:18:58,978][03430] Updated weights for policy 0, policy_version 7870 (0.0006) +[2024-08-24 20:19:00,105][03430] Updated weights for policy 0, policy_version 7880 (0.0005) +[2024-08-24 20:19:00,812][01192] Fps is (10 sec: 37273.5, 60 sec: 37273.6, 300 sec: 36891.8). 
Total num frames: 32301056. Throughput: 0: 9315.1. Samples: 8067442. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:19:00,813][01192] Avg episode reward: [(0, '4.518')] +[2024-08-24 20:19:01,191][03430] Updated weights for policy 0, policy_version 7890 (0.0005) +[2024-08-24 20:19:02,274][03430] Updated weights for policy 0, policy_version 7900 (0.0005) +[2024-08-24 20:19:03,407][03430] Updated weights for policy 0, policy_version 7910 (0.0006) +[2024-08-24 20:19:04,734][03430] Updated weights for policy 0, policy_version 7920 (0.0006) +[2024-08-24 20:19:05,812][01192] Fps is (10 sec: 35635.1, 60 sec: 37000.5, 300 sec: 36836.2). Total num frames: 32473088. Throughput: 0: 9308.4. Samples: 8095304. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:19:05,813][01192] Avg episode reward: [(0, '4.407')] +[2024-08-24 20:19:06,030][03430] Updated weights for policy 0, policy_version 7930 (0.0007) +[2024-08-24 20:19:07,195][03430] Updated weights for policy 0, policy_version 7940 (0.0006) +[2024-08-24 20:19:08,330][03430] Updated weights for policy 0, policy_version 7950 (0.0006) +[2024-08-24 20:19:09,441][03430] Updated weights for policy 0, policy_version 7960 (0.0005) +[2024-08-24 20:19:10,528][03430] Updated weights for policy 0, policy_version 7970 (0.0004) +[2024-08-24 20:19:10,812][01192] Fps is (10 sec: 35225.6, 60 sec: 36932.3, 300 sec: 36808.5). Total num frames: 32653312. Throughput: 0: 9184.8. Samples: 8146110. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:19:10,812][01192] Avg episode reward: [(0, '4.337')] +[2024-08-24 20:19:11,638][03430] Updated weights for policy 0, policy_version 7980 (0.0006) +[2024-08-24 20:19:12,738][03430] Updated weights for policy 0, policy_version 7990 (0.0005) +[2024-08-24 20:19:13,809][03430] Updated weights for policy 0, policy_version 8000 (0.0005) +[2024-08-24 20:19:14,881][03430] Updated weights for policy 0, policy_version 8010 (0.0006) +[2024-08-24 20:19:15,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36932.2, 300 sec: 36822.3). Total num frames: 32841728. Throughput: 0: 9204.0. Samples: 8202574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:19:15,813][01192] Avg episode reward: [(0, '4.364')] +[2024-08-24 20:19:15,976][03430] Updated weights for policy 0, policy_version 8020 (0.0006) +[2024-08-24 20:19:17,037][03430] Updated weights for policy 0, policy_version 8030 (0.0006) +[2024-08-24 20:19:18,122][03430] Updated weights for policy 0, policy_version 8040 (0.0005) +[2024-08-24 20:19:19,214][03430] Updated weights for policy 0, policy_version 8050 (0.0005) +[2024-08-24 20:19:20,350][03430] Updated weights for policy 0, policy_version 8060 (0.0006) +[2024-08-24 20:19:20,812][01192] Fps is (10 sec: 37682.9, 60 sec: 36932.2, 300 sec: 36836.2). Total num frames: 33030144. Throughput: 0: 9213.8. Samples: 8230854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:19:20,813][01192] Avg episode reward: [(0, '4.426')] +[2024-08-24 20:19:21,444][03430] Updated weights for policy 0, policy_version 8070 (0.0005) +[2024-08-24 20:19:22,533][03430] Updated weights for policy 0, policy_version 8080 (0.0006) +[2024-08-24 20:19:23,628][03430] Updated weights for policy 0, policy_version 8090 (0.0006) +[2024-08-24 20:19:24,721][03430] Updated weights for policy 0, policy_version 8100 (0.0005) +[2024-08-24 20:19:25,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36932.2, 300 sec: 36822.3). 
Total num frames: 33214464. Throughput: 0: 9216.2. Samples: 8286746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:19:25,813][01192] Avg episode reward: [(0, '4.425')] +[2024-08-24 20:19:25,823][03430] Updated weights for policy 0, policy_version 8110 (0.0006) +[2024-08-24 20:19:26,926][03430] Updated weights for policy 0, policy_version 8120 (0.0005) +[2024-08-24 20:19:28,018][03430] Updated weights for policy 0, policy_version 8130 (0.0006) +[2024-08-24 20:19:29,132][03430] Updated weights for policy 0, policy_version 8140 (0.0006) +[2024-08-24 20:19:30,308][03430] Updated weights for policy 0, policy_version 8150 (0.0006) +[2024-08-24 20:19:30,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36932.2, 300 sec: 36822.3). Total num frames: 33398784. Throughput: 0: 9211.1. Samples: 8342240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:19:30,813][01192] Avg episode reward: [(0, '4.500')] +[2024-08-24 20:19:31,460][03430] Updated weights for policy 0, policy_version 8160 (0.0006) +[2024-08-24 20:19:32,571][03430] Updated weights for policy 0, policy_version 8170 (0.0005) +[2024-08-24 20:19:33,641][03430] Updated weights for policy 0, policy_version 8180 (0.0006) +[2024-08-24 20:19:34,743][03430] Updated weights for policy 0, policy_version 8190 (0.0006) +[2024-08-24 20:19:35,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36864.1, 300 sec: 36836.2). Total num frames: 33583104. Throughput: 0: 9187.1. Samples: 8369184. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:19:35,813][01192] Avg episode reward: [(0, '4.309')] +[2024-08-24 20:19:35,838][03430] Updated weights for policy 0, policy_version 8200 (0.0005) +[2024-08-24 20:19:36,968][03430] Updated weights for policy 0, policy_version 8210 (0.0005) +[2024-08-24 20:19:38,071][03430] Updated weights for policy 0, policy_version 8220 (0.0005) +[2024-08-24 20:19:39,190][03430] Updated weights for policy 0, policy_version 8230 (0.0005) +[2024-08-24 20:19:40,283][03430] Updated weights for policy 0, policy_version 8240 (0.0006) +[2024-08-24 20:19:40,812][01192] Fps is (10 sec: 36864.4, 60 sec: 36864.0, 300 sec: 36836.2). Total num frames: 33767424. Throughput: 0: 9172.9. Samples: 8424982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:19:40,813][01192] Avg episode reward: [(0, '4.476')] +[2024-08-24 20:19:41,394][03430] Updated weights for policy 0, policy_version 8250 (0.0005) +[2024-08-24 20:19:42,475][03430] Updated weights for policy 0, policy_version 8260 (0.0005) +[2024-08-24 20:19:43,598][03430] Updated weights for policy 0, policy_version 8270 (0.0005) +[2024-08-24 20:19:44,698][03430] Updated weights for policy 0, policy_version 8280 (0.0006) +[2024-08-24 20:19:45,811][03430] Updated weights for policy 0, policy_version 8290 (0.0005) +[2024-08-24 20:19:45,812][01192] Fps is (10 sec: 37273.5, 60 sec: 36864.0, 300 sec: 36864.0). Total num frames: 33955840. Throughput: 0: 9185.3. Samples: 8480782. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:19:45,816][01192] Avg episode reward: [(0, '4.285')] +[2024-08-24 20:19:46,890][03430] Updated weights for policy 0, policy_version 8300 (0.0005) +[2024-08-24 20:19:48,012][03430] Updated weights for policy 0, policy_version 8310 (0.0006) +[2024-08-24 20:19:49,105][03430] Updated weights for policy 0, policy_version 8320 (0.0005) +[2024-08-24 20:19:50,205][03430] Updated weights for policy 0, policy_version 8330 (0.0005) +[2024-08-24 20:19:50,812][01192] Fps is (10 sec: 37273.2, 60 sec: 36863.9, 300 sec: 36864.0). Total num frames: 34140160. Throughput: 0: 9181.6. Samples: 8508476. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:19:50,813][01192] Avg episode reward: [(0, '4.493')] +[2024-08-24 20:19:51,345][03430] Updated weights for policy 0, policy_version 8340 (0.0007) +[2024-08-24 20:19:52,467][03430] Updated weights for policy 0, policy_version 8350 (0.0006) +[2024-08-24 20:19:53,580][03430] Updated weights for policy 0, policy_version 8360 (0.0006) +[2024-08-24 20:19:54,664][03430] Updated weights for policy 0, policy_version 8370 (0.0005) +[2024-08-24 20:19:55,768][03430] Updated weights for policy 0, policy_version 8380 (0.0005) +[2024-08-24 20:19:55,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36795.8, 300 sec: 36877.9). Total num frames: 34324480. Throughput: 0: 9280.8. Samples: 8563748. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:19:55,813][01192] Avg episode reward: [(0, '4.378')] +[2024-08-24 20:19:56,816][03430] Updated weights for policy 0, policy_version 8390 (0.0005) +[2024-08-24 20:19:57,882][03430] Updated weights for policy 0, policy_version 8400 (0.0005) +[2024-08-24 20:19:59,009][03430] Updated weights for policy 0, policy_version 8410 (0.0006) +[2024-08-24 20:20:00,132][03430] Updated weights for policy 0, policy_version 8420 (0.0005) +[2024-08-24 20:20:00,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36863.9, 300 sec: 36919.5). 
Total num frames: 34512896. Throughput: 0: 9280.7. Samples: 8620208. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:20:00,813][01192] Avg episode reward: [(0, '4.320')] +[2024-08-24 20:20:01,231][03430] Updated weights for policy 0, policy_version 8430 (0.0006) +[2024-08-24 20:20:02,318][03430] Updated weights for policy 0, policy_version 8440 (0.0006) +[2024-08-24 20:20:03,441][03430] Updated weights for policy 0, policy_version 8450 (0.0005) +[2024-08-24 20:20:04,574][03430] Updated weights for policy 0, policy_version 8460 (0.0006) +[2024-08-24 20:20:05,652][03430] Updated weights for policy 0, policy_version 8470 (0.0005) +[2024-08-24 20:20:05,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37068.8, 300 sec: 36919.5). Total num frames: 34697216. Throughput: 0: 9273.3. Samples: 8648150. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:20:05,813][01192] Avg episode reward: [(0, '4.511')] +[2024-08-24 20:20:06,789][03430] Updated weights for policy 0, policy_version 8480 (0.0006) +[2024-08-24 20:20:07,900][03430] Updated weights for policy 0, policy_version 8490 (0.0005) +[2024-08-24 20:20:09,009][03430] Updated weights for policy 0, policy_version 8500 (0.0005) +[2024-08-24 20:20:10,108][03430] Updated weights for policy 0, policy_version 8510 (0.0006) +[2024-08-24 20:20:10,812][01192] Fps is (10 sec: 36864.4, 60 sec: 37137.1, 300 sec: 36919.6). Total num frames: 34881536. Throughput: 0: 9255.6. Samples: 8703248. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:20:10,813][01192] Avg episode reward: [(0, '4.316')] +[2024-08-24 20:20:11,186][03430] Updated weights for policy 0, policy_version 8520 (0.0006) +[2024-08-24 20:20:12,291][03430] Updated weights for policy 0, policy_version 8530 (0.0006) +[2024-08-24 20:20:13,392][03430] Updated weights for policy 0, policy_version 8540 (0.0005) +[2024-08-24 20:20:14,500][03430] Updated weights for policy 0, policy_version 8550 (0.0006) +[2024-08-24 20:20:15,586][03430] Updated weights for policy 0, policy_version 8560 (0.0005) +[2024-08-24 20:20:15,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.1, 300 sec: 36947.3). Total num frames: 35069952. Throughput: 0: 9259.5. Samples: 8758916. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:20:15,813][01192] Avg episode reward: [(0, '4.612')] +[2024-08-24 20:20:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008562_35069952.pth... +[2024-08-24 20:20:15,850][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000006393_26185728.pth +[2024-08-24 20:20:16,695][03430] Updated weights for policy 0, policy_version 8570 (0.0005) +[2024-08-24 20:20:17,784][03430] Updated weights for policy 0, policy_version 8580 (0.0005) +[2024-08-24 20:20:18,875][03430] Updated weights for policy 0, policy_version 8590 (0.0005) +[2024-08-24 20:20:19,968][03430] Updated weights for policy 0, policy_version 8600 (0.0005) +[2024-08-24 20:20:20,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37068.9, 300 sec: 36947.3). Total num frames: 35254272. Throughput: 0: 9283.6. Samples: 8786948. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:20:20,813][01192] Avg episode reward: [(0, '4.361')] +[2024-08-24 20:20:21,063][03430] Updated weights for policy 0, policy_version 8610 (0.0005) +[2024-08-24 20:20:22,175][03430] Updated weights for policy 0, policy_version 8620 (0.0005) +[2024-08-24 20:20:23,277][03430] Updated weights for policy 0, policy_version 8630 (0.0006) +[2024-08-24 20:20:24,403][03430] Updated weights for policy 0, policy_version 8640 (0.0005) +[2024-08-24 20:20:25,503][03430] Updated weights for policy 0, policy_version 8650 (0.0005) +[2024-08-24 20:20:25,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37068.8, 300 sec: 36961.2). Total num frames: 35438592. Throughput: 0: 9289.8. Samples: 8843024. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:20:25,813][01192] Avg episode reward: [(0, '4.371')] +[2024-08-24 20:20:26,609][03430] Updated weights for policy 0, policy_version 8660 (0.0005) +[2024-08-24 20:20:27,704][03430] Updated weights for policy 0, policy_version 8670 (0.0005) +[2024-08-24 20:20:28,787][03430] Updated weights for policy 0, policy_version 8680 (0.0005) +[2024-08-24 20:20:29,887][03430] Updated weights for policy 0, policy_version 8690 (0.0005) +[2024-08-24 20:20:30,812][01192] Fps is (10 sec: 37273.7, 60 sec: 37137.1, 300 sec: 36989.0). Total num frames: 35627008. Throughput: 0: 9289.4. Samples: 8898804. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:20:30,813][01192] Avg episode reward: [(0, '4.778')] +[2024-08-24 20:20:30,819][03417] Saving new best policy, reward=4.778! 
+[2024-08-24 20:20:30,997][03430] Updated weights for policy 0, policy_version 8700 (0.0006) +[2024-08-24 20:20:32,093][03430] Updated weights for policy 0, policy_version 8710 (0.0006) +[2024-08-24 20:20:33,188][03430] Updated weights for policy 0, policy_version 8720 (0.0005) +[2024-08-24 20:20:34,301][03430] Updated weights for policy 0, policy_version 8730 (0.0005) +[2024-08-24 20:20:35,389][03430] Updated weights for policy 0, policy_version 8740 (0.0006) +[2024-08-24 20:20:35,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.1, 300 sec: 36989.0). Total num frames: 35811328. Throughput: 0: 9294.5. Samples: 8926726. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:20:35,813][01192] Avg episode reward: [(0, '4.303')] +[2024-08-24 20:20:36,519][03430] Updated weights for policy 0, policy_version 8750 (0.0005) +[2024-08-24 20:20:37,638][03430] Updated weights for policy 0, policy_version 8760 (0.0005) +[2024-08-24 20:20:38,745][03430] Updated weights for policy 0, policy_version 8770 (0.0005) +[2024-08-24 20:20:39,861][03430] Updated weights for policy 0, policy_version 8780 (0.0006) +[2024-08-24 20:20:40,812][01192] Fps is (10 sec: 36864.0, 60 sec: 37137.1, 300 sec: 36989.0). Total num frames: 35995648. Throughput: 0: 9299.5. Samples: 8982226. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:20:40,813][01192] Avg episode reward: [(0, '4.294')] +[2024-08-24 20:20:40,957][03430] Updated weights for policy 0, policy_version 8790 (0.0005) +[2024-08-24 20:20:42,042][03430] Updated weights for policy 0, policy_version 8800 (0.0006) +[2024-08-24 20:20:43,140][03430] Updated weights for policy 0, policy_version 8810 (0.0006) +[2024-08-24 20:20:44,247][03430] Updated weights for policy 0, policy_version 8820 (0.0005) +[2024-08-24 20:20:45,339][03430] Updated weights for policy 0, policy_version 8830 (0.0006) +[2024-08-24 20:20:45,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37137.1, 300 sec: 37002.8). Total num frames: 36184064. 
Throughput: 0: 9282.6. Samples: 9037924. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:20:45,812][01192] Avg episode reward: [(0, '4.470')] +[2024-08-24 20:20:46,467][03430] Updated weights for policy 0, policy_version 8840 (0.0005) +[2024-08-24 20:20:47,576][03430] Updated weights for policy 0, policy_version 8850 (0.0006) +[2024-08-24 20:20:48,678][03430] Updated weights for policy 0, policy_version 8860 (0.0005) +[2024-08-24 20:20:49,761][03430] Updated weights for policy 0, policy_version 8870 (0.0006) +[2024-08-24 20:20:50,812][01192] Fps is (10 sec: 37273.3, 60 sec: 37137.1, 300 sec: 37002.8). Total num frames: 36368384. Throughput: 0: 9275.4. Samples: 9065544. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:20:50,813][01192] Avg episode reward: [(0, '4.667')] +[2024-08-24 20:20:50,888][03430] Updated weights for policy 0, policy_version 8880 (0.0006) +[2024-08-24 20:20:52,001][03430] Updated weights for policy 0, policy_version 8890 (0.0005) +[2024-08-24 20:20:53,109][03430] Updated weights for policy 0, policy_version 8900 (0.0005) +[2024-08-24 20:20:54,209][03430] Updated weights for policy 0, policy_version 8910 (0.0005) +[2024-08-24 20:20:55,303][03430] Updated weights for policy 0, policy_version 8920 (0.0005) +[2024-08-24 20:20:55,812][01192] Fps is (10 sec: 36863.8, 60 sec: 37137.0, 300 sec: 37016.7). Total num frames: 36552704. Throughput: 0: 9286.8. Samples: 9121154. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:20:55,813][01192] Avg episode reward: [(0, '4.585')] +[2024-08-24 20:20:56,391][03430] Updated weights for policy 0, policy_version 8930 (0.0005) +[2024-08-24 20:20:57,501][03430] Updated weights for policy 0, policy_version 8940 (0.0006) +[2024-08-24 20:20:58,592][03430] Updated weights for policy 0, policy_version 8950 (0.0005) +[2024-08-24 20:20:59,682][03430] Updated weights for policy 0, policy_version 8960 (0.0005) +[2024-08-24 20:21:00,812][01192] Fps is (10 sec: 36864.2, 60 sec: 37068.9, 300 sec: 37016.7). Total num frames: 36737024. Throughput: 0: 9292.0. Samples: 9177054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:21:00,812][01192] Avg episode reward: [(0, '4.522')] +[2024-08-24 20:21:00,815][03430] Updated weights for policy 0, policy_version 8970 (0.0006) +[2024-08-24 20:21:02,005][03430] Updated weights for policy 0, policy_version 8980 (0.0005) +[2024-08-24 20:21:03,104][03430] Updated weights for policy 0, policy_version 8990 (0.0006) +[2024-08-24 20:21:04,203][03430] Updated weights for policy 0, policy_version 9000 (0.0006) +[2024-08-24 20:21:05,303][03430] Updated weights for policy 0, policy_version 9010 (0.0005) +[2024-08-24 20:21:05,812][01192] Fps is (10 sec: 36864.2, 60 sec: 37068.8, 300 sec: 37016.7). Total num frames: 36921344. Throughput: 0: 9267.0. Samples: 9203964. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:21:05,813][01192] Avg episode reward: [(0, '4.563')] +[2024-08-24 20:21:06,503][03430] Updated weights for policy 0, policy_version 9020 (0.0006) +[2024-08-24 20:21:07,680][03430] Updated weights for policy 0, policy_version 9030 (0.0006) +[2024-08-24 20:21:08,789][03430] Updated weights for policy 0, policy_version 9040 (0.0006) +[2024-08-24 20:21:09,890][03430] Updated weights for policy 0, policy_version 9050 (0.0005) +[2024-08-24 20:21:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 37000.5, 300 sec: 37002.8). 
Total num frames: 37101568. Throughput: 0: 9224.5. Samples: 9258128. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:21:10,813][01192] Avg episode reward: [(0, '4.377')] +[2024-08-24 20:21:10,993][03430] Updated weights for policy 0, policy_version 9060 (0.0006) +[2024-08-24 20:21:12,101][03430] Updated weights for policy 0, policy_version 9070 (0.0006) +[2024-08-24 20:21:13,207][03430] Updated weights for policy 0, policy_version 9080 (0.0006) +[2024-08-24 20:21:14,321][03430] Updated weights for policy 0, policy_version 9090 (0.0006) +[2024-08-24 20:21:15,444][03430] Updated weights for policy 0, policy_version 9100 (0.0006) +[2024-08-24 20:21:15,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36932.2, 300 sec: 37016.7). Total num frames: 37285888. Throughput: 0: 9220.7. Samples: 9313738. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:21:15,813][01192] Avg episode reward: [(0, '4.497')] +[2024-08-24 20:21:16,567][03430] Updated weights for policy 0, policy_version 9110 (0.0006) +[2024-08-24 20:21:17,650][03430] Updated weights for policy 0, policy_version 9120 (0.0005) +[2024-08-24 20:21:18,723][03430] Updated weights for policy 0, policy_version 9130 (0.0005) +[2024-08-24 20:21:19,827][03430] Updated weights for policy 0, policy_version 9140 (0.0005) +[2024-08-24 20:21:20,812][01192] Fps is (10 sec: 37273.6, 60 sec: 37000.5, 300 sec: 37016.7). Total num frames: 37474304. Throughput: 0: 9215.7. Samples: 9341432. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:21:20,812][01192] Avg episode reward: [(0, '4.519')] +[2024-08-24 20:21:20,926][03430] Updated weights for policy 0, policy_version 9150 (0.0005) +[2024-08-24 20:21:22,013][03430] Updated weights for policy 0, policy_version 9160 (0.0005) +[2024-08-24 20:21:23,116][03430] Updated weights for policy 0, policy_version 9170 (0.0005) +[2024-08-24 20:21:24,197][03430] Updated weights for policy 0, policy_version 9180 (0.0006) +[2024-08-24 20:21:25,317][03430] Updated weights for policy 0, policy_version 9190 (0.0005) +[2024-08-24 20:21:25,812][01192] Fps is (10 sec: 37273.8, 60 sec: 37000.5, 300 sec: 37030.6). Total num frames: 37658624. Throughput: 0: 9232.6. Samples: 9397694. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:21:25,813][01192] Avg episode reward: [(0, '4.476')] +[2024-08-24 20:21:26,429][03430] Updated weights for policy 0, policy_version 9200 (0.0006) +[2024-08-24 20:21:27,555][03430] Updated weights for policy 0, policy_version 9210 (0.0005) +[2024-08-24 20:21:28,668][03430] Updated weights for policy 0, policy_version 9220 (0.0005) +[2024-08-24 20:21:29,849][03430] Updated weights for policy 0, policy_version 9230 (0.0006) +[2024-08-24 20:21:30,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36864.0, 300 sec: 37016.7). Total num frames: 37838848. Throughput: 0: 9207.7. Samples: 9452270. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:21:30,813][01192] Avg episode reward: [(0, '4.712')] +[2024-08-24 20:21:30,982][03430] Updated weights for policy 0, policy_version 9240 (0.0005) +[2024-08-24 20:21:32,115][03430] Updated weights for policy 0, policy_version 9250 (0.0005) +[2024-08-24 20:21:33,273][03430] Updated weights for policy 0, policy_version 9260 (0.0006) +[2024-08-24 20:21:34,393][03430] Updated weights for policy 0, policy_version 9270 (0.0006) +[2024-08-24 20:21:35,552][03430] Updated weights for policy 0, policy_version 9280 (0.0005) +[2024-08-24 20:21:35,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36795.7, 300 sec: 37002.9). Total num frames: 38019072. Throughput: 0: 9186.1. Samples: 9478916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:21:35,819][01192] Avg episode reward: [(0, '4.602')] +[2024-08-24 20:21:36,695][03430] Updated weights for policy 0, policy_version 9290 (0.0007) +[2024-08-24 20:21:37,819][03430] Updated weights for policy 0, policy_version 9300 (0.0005) +[2024-08-24 20:21:38,945][03430] Updated weights for policy 0, policy_version 9310 (0.0005) +[2024-08-24 20:21:40,049][03430] Updated weights for policy 0, policy_version 9320 (0.0005) +[2024-08-24 20:21:40,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36727.5, 300 sec: 37002.8). Total num frames: 38199296. Throughput: 0: 9156.5. Samples: 9533196. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:21:40,813][01192] Avg episode reward: [(0, '4.384')] +[2024-08-24 20:21:41,182][03430] Updated weights for policy 0, policy_version 9330 (0.0006) +[2024-08-24 20:21:42,311][03430] Updated weights for policy 0, policy_version 9340 (0.0006) +[2024-08-24 20:21:43,438][03430] Updated weights for policy 0, policy_version 9350 (0.0006) +[2024-08-24 20:21:44,557][03430] Updated weights for policy 0, policy_version 9360 (0.0006) +[2024-08-24 20:21:45,662][03430] Updated weights for policy 0, policy_version 9370 (0.0006) +[2024-08-24 20:21:45,812][01192] Fps is (10 sec: 36453.6, 60 sec: 36659.1, 300 sec: 36988.9). Total num frames: 38383616. Throughput: 0: 9131.2. Samples: 9587962. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:21:45,819][01192] Avg episode reward: [(0, '4.422')] +[2024-08-24 20:21:46,802][03430] Updated weights for policy 0, policy_version 9380 (0.0006) +[2024-08-24 20:21:47,909][03430] Updated weights for policy 0, policy_version 9390 (0.0007) +[2024-08-24 20:21:49,040][03430] Updated weights for policy 0, policy_version 9400 (0.0006) +[2024-08-24 20:21:50,154][03430] Updated weights for policy 0, policy_version 9410 (0.0006) +[2024-08-24 20:21:50,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36591.0, 300 sec: 36961.2). Total num frames: 38563840. Throughput: 0: 9137.0. Samples: 9615128. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2024-08-24 20:21:50,812][01192] Avg episode reward: [(0, '4.533')] +[2024-08-24 20:21:51,294][03430] Updated weights for policy 0, policy_version 9420 (0.0006) +[2024-08-24 20:21:52,415][03430] Updated weights for policy 0, policy_version 9430 (0.0005) +[2024-08-24 20:21:53,563][03430] Updated weights for policy 0, policy_version 9440 (0.0006) +[2024-08-24 20:21:54,663][03430] Updated weights for policy 0, policy_version 9450 (0.0006) +[2024-08-24 20:21:55,796][03430] Updated weights for policy 0, policy_version 9460 (0.0006) +[2024-08-24 20:21:55,812][01192] Fps is (10 sec: 36455.0, 60 sec: 36590.9, 300 sec: 36975.1). Total num frames: 38748160. Throughput: 0: 9148.3. Samples: 9669804. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:21:55,813][01192] Avg episode reward: [(0, '4.427')] +[2024-08-24 20:21:56,906][03430] Updated weights for policy 0, policy_version 9470 (0.0005) +[2024-08-24 20:21:58,046][03430] Updated weights for policy 0, policy_version 9480 (0.0006) +[2024-08-24 20:21:59,175][03430] Updated weights for policy 0, policy_version 9490 (0.0006) +[2024-08-24 20:22:00,311][03430] Updated weights for policy 0, policy_version 9500 (0.0004) +[2024-08-24 20:22:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.6, 300 sec: 36975.1). Total num frames: 38928384. Throughput: 0: 9127.6. Samples: 9724480. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:22:00,813][01192] Avg episode reward: [(0, '4.334')] +[2024-08-24 20:22:01,435][03430] Updated weights for policy 0, policy_version 9510 (0.0006) +[2024-08-24 20:22:02,569][03430] Updated weights for policy 0, policy_version 9520 (0.0005) +[2024-08-24 20:22:03,720][03430] Updated weights for policy 0, policy_version 9530 (0.0005) +[2024-08-24 20:22:04,847][03430] Updated weights for policy 0, policy_version 9540 (0.0005) +[2024-08-24 20:22:05,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36454.4, 300 sec: 36961.2). 
Total num frames: 39108608. Throughput: 0: 9108.8. Samples: 9751328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:22:05,813][01192] Avg episode reward: [(0, '4.367')] +[2024-08-24 20:22:05,984][03430] Updated weights for policy 0, policy_version 9550 (0.0006) +[2024-08-24 20:22:07,112][03430] Updated weights for policy 0, policy_version 9560 (0.0007) +[2024-08-24 20:22:08,228][03430] Updated weights for policy 0, policy_version 9570 (0.0006) +[2024-08-24 20:22:09,366][03430] Updated weights for policy 0, policy_version 9580 (0.0006) +[2024-08-24 20:22:10,514][03430] Updated weights for policy 0, policy_version 9590 (0.0005) +[2024-08-24 20:22:10,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36454.4, 300 sec: 36933.4). Total num frames: 39288832. Throughput: 0: 9064.7. Samples: 9805608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:22:10,813][01192] Avg episode reward: [(0, '4.444')] +[2024-08-24 20:22:11,635][03430] Updated weights for policy 0, policy_version 9600 (0.0005) +[2024-08-24 20:22:12,767][03430] Updated weights for policy 0, policy_version 9610 (0.0006) +[2024-08-24 20:22:13,888][03430] Updated weights for policy 0, policy_version 9620 (0.0005) +[2024-08-24 20:22:15,025][03430] Updated weights for policy 0, policy_version 9630 (0.0006) +[2024-08-24 20:22:15,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36454.5, 300 sec: 36919.6). Total num frames: 39473152. Throughput: 0: 9061.8. Samples: 9860052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:22:15,812][01192] Avg episode reward: [(0, '4.350')] +[2024-08-24 20:22:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009637_39473152.pth... 
+[2024-08-24 20:22:15,842][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000007477_30625792.pth +[2024-08-24 20:22:16,145][03430] Updated weights for policy 0, policy_version 9640 (0.0004) +[2024-08-24 20:22:17,284][03430] Updated weights for policy 0, policy_version 9650 (0.0005) +[2024-08-24 20:22:18,476][03430] Updated weights for policy 0, policy_version 9660 (0.0006) +[2024-08-24 20:22:19,583][03430] Updated weights for policy 0, policy_version 9670 (0.0005) +[2024-08-24 20:22:20,742][03430] Updated weights for policy 0, policy_version 9680 (0.0005) +[2024-08-24 20:22:20,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36249.6, 300 sec: 36891.8). Total num frames: 39649280. Throughput: 0: 9070.5. Samples: 9887088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:20,812][01192] Avg episode reward: [(0, '4.553')] +[2024-08-24 20:22:21,829][03430] Updated weights for policy 0, policy_version 9690 (0.0007) +[2024-08-24 20:22:22,935][03430] Updated weights for policy 0, policy_version 9700 (0.0006) +[2024-08-24 20:22:24,062][03430] Updated weights for policy 0, policy_version 9710 (0.0006) +[2024-08-24 20:22:25,193][03430] Updated weights for policy 0, policy_version 9720 (0.0006) +[2024-08-24 20:22:25,812][01192] Fps is (10 sec: 36044.5, 60 sec: 36249.6, 300 sec: 36877.9). Total num frames: 39833600. Throughput: 0: 9075.0. Samples: 9941572. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:25,813][01192] Avg episode reward: [(0, '4.472')] +[2024-08-24 20:22:26,356][03430] Updated weights for policy 0, policy_version 9730 (0.0006) +[2024-08-24 20:22:27,472][03430] Updated weights for policy 0, policy_version 9740 (0.0006) +[2024-08-24 20:22:28,583][03430] Updated weights for policy 0, policy_version 9750 (0.0006) +[2024-08-24 20:22:29,705][03430] Updated weights for policy 0, policy_version 9760 (0.0005) +[2024-08-24 20:22:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36249.6, 300 sec: 36864.0). Total num frames: 40013824. Throughput: 0: 9070.5. Samples: 9996134. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:30,812][01192] Avg episode reward: [(0, '4.412')] +[2024-08-24 20:22:30,821][03430] Updated weights for policy 0, policy_version 9770 (0.0006) +[2024-08-24 20:22:31,944][03430] Updated weights for policy 0, policy_version 9780 (0.0006) +[2024-08-24 20:22:33,092][03430] Updated weights for policy 0, policy_version 9790 (0.0005) +[2024-08-24 20:22:34,219][03430] Updated weights for policy 0, policy_version 9800 (0.0006) +[2024-08-24 20:22:35,374][03430] Updated weights for policy 0, policy_version 9810 (0.0006) +[2024-08-24 20:22:35,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36249.6, 300 sec: 36850.1). Total num frames: 40194048. Throughput: 0: 9071.4. Samples: 10023342. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:35,813][01192] Avg episode reward: [(0, '4.300')] +[2024-08-24 20:22:36,491][03430] Updated weights for policy 0, policy_version 9820 (0.0006) +[2024-08-24 20:22:37,657][03430] Updated weights for policy 0, policy_version 9830 (0.0005) +[2024-08-24 20:22:38,786][03430] Updated weights for policy 0, policy_version 9840 (0.0005) +[2024-08-24 20:22:39,891][03430] Updated weights for policy 0, policy_version 9850 (0.0005) +[2024-08-24 20:22:40,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36317.8, 300 sec: 36850.1). 
Total num frames: 40378368. Throughput: 0: 9055.3. Samples: 10077292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:40,813][01192] Avg episode reward: [(0, '4.312')] +[2024-08-24 20:22:40,956][03430] Updated weights for policy 0, policy_version 9860 (0.0005) +[2024-08-24 20:22:42,095][03430] Updated weights for policy 0, policy_version 9870 (0.0005) +[2024-08-24 20:22:43,231][03430] Updated weights for policy 0, policy_version 9880 (0.0006) +[2024-08-24 20:22:44,354][03430] Updated weights for policy 0, policy_version 9890 (0.0006) +[2024-08-24 20:22:45,470][03430] Updated weights for policy 0, policy_version 9900 (0.0007) +[2024-08-24 20:22:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36249.7, 300 sec: 36836.2). Total num frames: 40558592. Throughput: 0: 9066.4. Samples: 10132468. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:22:45,813][01192] Avg episode reward: [(0, '4.432')] +[2024-08-24 20:22:46,591][03430] Updated weights for policy 0, policy_version 9910 (0.0006) +[2024-08-24 20:22:47,708][03430] Updated weights for policy 0, policy_version 9920 (0.0006) +[2024-08-24 20:22:48,840][03430] Updated weights for policy 0, policy_version 9930 (0.0005) +[2024-08-24 20:22:49,988][03430] Updated weights for policy 0, policy_version 9940 (0.0006) +[2024-08-24 20:22:50,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36317.9, 300 sec: 36836.2). Total num frames: 40742912. Throughput: 0: 9074.0. Samples: 10159660. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:22:50,813][01192] Avg episode reward: [(0, '4.273')] +[2024-08-24 20:22:51,148][03430] Updated weights for policy 0, policy_version 9950 (0.0006) +[2024-08-24 20:22:52,283][03430] Updated weights for policy 0, policy_version 9960 (0.0006) +[2024-08-24 20:22:53,430][03430] Updated weights for policy 0, policy_version 9970 (0.0005) +[2024-08-24 20:22:54,556][03430] Updated weights for policy 0, policy_version 9980 (0.0006) +[2024-08-24 20:22:55,670][03430] Updated weights for policy 0, policy_version 9990 (0.0006) +[2024-08-24 20:22:55,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36249.6, 300 sec: 36808.5). Total num frames: 40923136. Throughput: 0: 9069.4. Samples: 10213728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:22:55,813][01192] Avg episode reward: [(0, '4.472')] +[2024-08-24 20:22:56,791][03430] Updated weights for policy 0, policy_version 10000 (0.0005) +[2024-08-24 20:22:57,939][03430] Updated weights for policy 0, policy_version 10010 (0.0005) +[2024-08-24 20:22:59,052][03430] Updated weights for policy 0, policy_version 10020 (0.0005) +[2024-08-24 20:23:00,182][03430] Updated weights for policy 0, policy_version 10030 (0.0005) +[2024-08-24 20:23:00,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36249.6, 300 sec: 36780.7). Total num frames: 41103360. Throughput: 0: 9074.8. Samples: 10268418. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:23:00,813][01192] Avg episode reward: [(0, '4.226')] +[2024-08-24 20:23:01,307][03430] Updated weights for policy 0, policy_version 10040 (0.0006) +[2024-08-24 20:23:02,445][03430] Updated weights for policy 0, policy_version 10050 (0.0006) +[2024-08-24 20:23:03,567][03430] Updated weights for policy 0, policy_version 10060 (0.0006) +[2024-08-24 20:23:04,704][03430] Updated weights for policy 0, policy_version 10070 (0.0006) +[2024-08-24 20:23:05,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36249.6, 300 sec: 36766.8). 
Total num frames: 41283584. Throughput: 0: 9074.2. Samples: 10295426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:23:05,812][01192] Avg episode reward: [(0, '4.487')] +[2024-08-24 20:23:05,820][03430] Updated weights for policy 0, policy_version 10080 (0.0006) +[2024-08-24 20:23:06,961][03430] Updated weights for policy 0, policy_version 10090 (0.0005) +[2024-08-24 20:23:08,179][03430] Updated weights for policy 0, policy_version 10100 (0.0006) +[2024-08-24 20:23:09,328][03430] Updated weights for policy 0, policy_version 10110 (0.0006) +[2024-08-24 20:23:10,476][03430] Updated weights for policy 0, policy_version 10120 (0.0005) +[2024-08-24 20:23:10,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36249.7, 300 sec: 36739.0). Total num frames: 41463808. Throughput: 0: 9050.8. Samples: 10348856. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:23:10,812][01192] Avg episode reward: [(0, '4.777')] +[2024-08-24 20:23:11,611][03430] Updated weights for policy 0, policy_version 10130 (0.0005) +[2024-08-24 20:23:12,745][03430] Updated weights for policy 0, policy_version 10140 (0.0006) +[2024-08-24 20:23:13,880][03430] Updated weights for policy 0, policy_version 10150 (0.0005) +[2024-08-24 20:23:15,055][03430] Updated weights for policy 0, policy_version 10160 (0.0007) +[2024-08-24 20:23:15,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36113.1, 300 sec: 36697.4). Total num frames: 41639936. Throughput: 0: 9037.7. Samples: 10402832. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:23:15,813][01192] Avg episode reward: [(0, '4.501')] +[2024-08-24 20:23:16,204][03430] Updated weights for policy 0, policy_version 10170 (0.0006) +[2024-08-24 20:23:17,398][03430] Updated weights for policy 0, policy_version 10180 (0.0005) +[2024-08-24 20:23:18,481][03430] Updated weights for policy 0, policy_version 10190 (0.0005) +[2024-08-24 20:23:19,596][03430] Updated weights for policy 0, policy_version 10200 (0.0007) +[2024-08-24 20:23:20,757][03430] Updated weights for policy 0, policy_version 10210 (0.0006) +[2024-08-24 20:23:20,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36181.3, 300 sec: 36683.5). Total num frames: 41820160. Throughput: 0: 9019.0. Samples: 10429196. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:23:20,813][01192] Avg episode reward: [(0, '4.410')] +[2024-08-24 20:23:21,889][03430] Updated weights for policy 0, policy_version 10220 (0.0006) +[2024-08-24 20:23:23,026][03430] Updated weights for policy 0, policy_version 10230 (0.0005) +[2024-08-24 20:23:24,163][03430] Updated weights for policy 0, policy_version 10240 (0.0005) +[2024-08-24 20:23:25,291][03430] Updated weights for policy 0, policy_version 10250 (0.0005) +[2024-08-24 20:23:25,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36113.1, 300 sec: 36669.6). Total num frames: 42000384. Throughput: 0: 9026.7. Samples: 10483494. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:23:25,813][01192] Avg episode reward: [(0, '4.637')] +[2024-08-24 20:23:26,432][03430] Updated weights for policy 0, policy_version 10260 (0.0006) +[2024-08-24 20:23:27,575][03430] Updated weights for policy 0, policy_version 10270 (0.0005) +[2024-08-24 20:23:28,710][03430] Updated weights for policy 0, policy_version 10280 (0.0005) +[2024-08-24 20:23:29,787][03430] Updated weights for policy 0, policy_version 10290 (0.0006) +[2024-08-24 20:23:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36181.3, 300 sec: 36655.7). 
Total num frames: 42184704. Throughput: 0: 9012.3. Samples: 10538020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:23:30,812][01192] Avg episode reward: [(0, '4.332')] +[2024-08-24 20:23:30,900][03430] Updated weights for policy 0, policy_version 10300 (0.0005) +[2024-08-24 20:23:32,048][03430] Updated weights for policy 0, policy_version 10310 (0.0006) +[2024-08-24 20:23:33,213][03430] Updated weights for policy 0, policy_version 10320 (0.0005) +[2024-08-24 20:23:34,335][03430] Updated weights for policy 0, policy_version 10330 (0.0005) +[2024-08-24 20:23:35,470][03430] Updated weights for policy 0, policy_version 10340 (0.0007) +[2024-08-24 20:23:35,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36181.3, 300 sec: 36641.8). Total num frames: 42364928. Throughput: 0: 9006.1. Samples: 10564938. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:23:35,813][01192] Avg episode reward: [(0, '4.349')] +[2024-08-24 20:23:36,595][03430] Updated weights for policy 0, policy_version 10350 (0.0006) +[2024-08-24 20:23:37,754][03430] Updated weights for policy 0, policy_version 10360 (0.0006) +[2024-08-24 20:23:38,877][03430] Updated weights for policy 0, policy_version 10370 (0.0006) +[2024-08-24 20:23:39,984][03430] Updated weights for policy 0, policy_version 10380 (0.0006) +[2024-08-24 20:23:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36614.1). Total num frames: 42545152. Throughput: 0: 9013.4. Samples: 10619332. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:23:40,813][01192] Avg episode reward: [(0, '4.373')] +[2024-08-24 20:23:41,152][03430] Updated weights for policy 0, policy_version 10390 (0.0005) +[2024-08-24 20:23:42,308][03430] Updated weights for policy 0, policy_version 10400 (0.0006) +[2024-08-24 20:23:43,435][03430] Updated weights for policy 0, policy_version 10410 (0.0006) +[2024-08-24 20:23:44,569][03430] Updated weights for policy 0, policy_version 10420 (0.0005) +[2024-08-24 20:23:45,703][03430] Updated weights for policy 0, policy_version 10430 (0.0005) +[2024-08-24 20:23:45,812][01192] Fps is (10 sec: 35635.7, 60 sec: 36044.8, 300 sec: 36586.3). Total num frames: 42721280. Throughput: 0: 8994.1. Samples: 10673152. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:23:45,812][01192] Avg episode reward: [(0, '4.273')] +[2024-08-24 20:23:46,804][03430] Updated weights for policy 0, policy_version 10440 (0.0006) +[2024-08-24 20:23:47,907][03430] Updated weights for policy 0, policy_version 10450 (0.0005) +[2024-08-24 20:23:49,029][03430] Updated weights for policy 0, policy_version 10460 (0.0005) +[2024-08-24 20:23:50,174][03430] Updated weights for policy 0, policy_version 10470 (0.0006) +[2024-08-24 20:23:50,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36044.8, 300 sec: 36572.4). Total num frames: 42905600. Throughput: 0: 9010.5. Samples: 10700896. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:23:50,813][01192] Avg episode reward: [(0, '4.531')] +[2024-08-24 20:23:51,289][03430] Updated weights for policy 0, policy_version 10480 (0.0006) +[2024-08-24 20:23:52,405][03430] Updated weights for policy 0, policy_version 10490 (0.0005) +[2024-08-24 20:23:53,544][03430] Updated weights for policy 0, policy_version 10500 (0.0005) +[2024-08-24 20:23:54,686][03430] Updated weights for policy 0, policy_version 10510 (0.0006) +[2024-08-24 20:23:55,792][03430] Updated weights for policy 0, policy_version 10520 (0.0006) +[2024-08-24 20:23:55,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36113.0, 300 sec: 36572.4). Total num frames: 43089920. Throughput: 0: 9033.9. Samples: 10755384. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:23:55,813][01192] Avg episode reward: [(0, '4.450')] +[2024-08-24 20:23:56,924][03430] Updated weights for policy 0, policy_version 10530 (0.0005) +[2024-08-24 20:23:58,053][03430] Updated weights for policy 0, policy_version 10540 (0.0007) +[2024-08-24 20:23:59,159][03430] Updated weights for policy 0, policy_version 10550 (0.0005) +[2024-08-24 20:24:00,283][03430] Updated weights for policy 0, policy_version 10560 (0.0006) +[2024-08-24 20:24:00,812][01192] Fps is (10 sec: 36454.0, 60 sec: 36113.0, 300 sec: 36600.2). Total num frames: 43270144. Throughput: 0: 9052.4. Samples: 10810192. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:24:00,813][01192] Avg episode reward: [(0, '4.285')] +[2024-08-24 20:24:01,400][03430] Updated weights for policy 0, policy_version 10570 (0.0006) +[2024-08-24 20:24:02,549][03430] Updated weights for policy 0, policy_version 10580 (0.0005) +[2024-08-24 20:24:03,675][03430] Updated weights for policy 0, policy_version 10590 (0.0005) +[2024-08-24 20:24:04,791][03430] Updated weights for policy 0, policy_version 10600 (0.0006) +[2024-08-24 20:24:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36600.2). 
Total num frames: 43450368. Throughput: 0: 9066.3. Samples: 10837180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:24:05,813][01192] Avg episode reward: [(0, '4.292')] +[2024-08-24 20:24:05,943][03430] Updated weights for policy 0, policy_version 10610 (0.0006) +[2024-08-24 20:24:07,169][03430] Updated weights for policy 0, policy_version 10620 (0.0006) +[2024-08-24 20:24:08,301][03430] Updated weights for policy 0, policy_version 10630 (0.0005) +[2024-08-24 20:24:09,375][03430] Updated weights for policy 0, policy_version 10640 (0.0006) +[2024-08-24 20:24:10,485][03430] Updated weights for policy 0, policy_version 10650 (0.0005) +[2024-08-24 20:24:10,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.1, 300 sec: 36572.4). Total num frames: 43630592. Throughput: 0: 9053.4. Samples: 10890898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:24:10,812][01192] Avg episode reward: [(0, '4.435')] +[2024-08-24 20:24:11,619][03430] Updated weights for policy 0, policy_version 10660 (0.0004) +[2024-08-24 20:24:12,774][03430] Updated weights for policy 0, policy_version 10670 (0.0006) +[2024-08-24 20:24:13,881][03430] Updated weights for policy 0, policy_version 10680 (0.0005) +[2024-08-24 20:24:15,001][03430] Updated weights for policy 0, policy_version 10690 (0.0006) +[2024-08-24 20:24:15,812][01192] Fps is (10 sec: 36454.0, 60 sec: 36249.5, 300 sec: 36558.5). Total num frames: 43814912. Throughput: 0: 9059.0. Samples: 10945678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:24:15,813][01192] Avg episode reward: [(0, '4.361')] +[2024-08-24 20:24:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000010697_43814912.pth... 
+[2024-08-24 20:24:15,840][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000008562_35069952.pth +[2024-08-24 20:24:16,135][03430] Updated weights for policy 0, policy_version 10700 (0.0006) +[2024-08-24 20:24:17,282][03430] Updated weights for policy 0, policy_version 10710 (0.0006) +[2024-08-24 20:24:18,397][03430] Updated weights for policy 0, policy_version 10720 (0.0005) +[2024-08-24 20:24:19,485][03430] Updated weights for policy 0, policy_version 10730 (0.0006) +[2024-08-24 20:24:20,623][03430] Updated weights for policy 0, policy_version 10740 (0.0006) +[2024-08-24 20:24:20,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36249.6, 300 sec: 36544.7). Total num frames: 43995136. Throughput: 0: 9062.9. Samples: 10972766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:24:20,813][01192] Avg episode reward: [(0, '4.496')] +[2024-08-24 20:24:21,771][03430] Updated weights for policy 0, policy_version 10750 (0.0005) +[2024-08-24 20:24:22,914][03430] Updated weights for policy 0, policy_version 10760 (0.0006) +[2024-08-24 20:24:24,063][03430] Updated weights for policy 0, policy_version 10770 (0.0005) +[2024-08-24 20:24:25,226][03430] Updated weights for policy 0, policy_version 10780 (0.0005) +[2024-08-24 20:24:25,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36249.6, 300 sec: 36530.8). Total num frames: 44175360. Throughput: 0: 9062.2. Samples: 11027130. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:24:25,819][01192] Avg episode reward: [(0, '4.394')] +[2024-08-24 20:24:26,371][03430] Updated weights for policy 0, policy_version 10790 (0.0005) +[2024-08-24 20:24:27,479][03430] Updated weights for policy 0, policy_version 10800 (0.0006) +[2024-08-24 20:24:28,618][03430] Updated weights for policy 0, policy_version 10810 (0.0006) +[2024-08-24 20:24:29,763][03430] Updated weights for policy 0, policy_version 10820 (0.0005) +[2024-08-24 20:24:30,812][01192] Fps is (10 sec: 35635.0, 60 sec: 36113.0, 300 sec: 36503.0). Total num frames: 44351488. Throughput: 0: 9066.3. Samples: 11081136. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:24:30,813][01192] Avg episode reward: [(0, '4.325')] +[2024-08-24 20:24:30,943][03430] Updated weights for policy 0, policy_version 10830 (0.0006) +[2024-08-24 20:24:32,115][03430] Updated weights for policy 0, policy_version 10840 (0.0006) +[2024-08-24 20:24:33,247][03430] Updated weights for policy 0, policy_version 10850 (0.0006) +[2024-08-24 20:24:34,388][03430] Updated weights for policy 0, policy_version 10860 (0.0005) +[2024-08-24 20:24:35,458][03430] Updated weights for policy 0, policy_version 10870 (0.0006) +[2024-08-24 20:24:35,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36181.4, 300 sec: 36503.0). Total num frames: 44535808. Throughput: 0: 9032.8. Samples: 11107372. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:24:35,813][01192] Avg episode reward: [(0, '4.306')] +[2024-08-24 20:24:36,576][03430] Updated weights for policy 0, policy_version 10880 (0.0006) +[2024-08-24 20:24:37,705][03430] Updated weights for policy 0, policy_version 10890 (0.0007) +[2024-08-24 20:24:38,854][03430] Updated weights for policy 0, policy_version 10900 (0.0005) +[2024-08-24 20:24:39,993][03430] Updated weights for policy 0, policy_version 10910 (0.0006) +[2024-08-24 20:24:40,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36181.3, 300 sec: 36475.2). 
Total num frames: 44716032. Throughput: 0: 9041.9. Samples: 11162270. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:24:40,813][01192] Avg episode reward: [(0, '4.459')] +[2024-08-24 20:24:41,126][03430] Updated weights for policy 0, policy_version 10920 (0.0005) +[2024-08-24 20:24:42,242][03430] Updated weights for policy 0, policy_version 10930 (0.0006) +[2024-08-24 20:24:43,385][03430] Updated weights for policy 0, policy_version 10940 (0.0005) +[2024-08-24 20:24:44,529][03430] Updated weights for policy 0, policy_version 10950 (0.0005) +[2024-08-24 20:24:45,657][03430] Updated weights for policy 0, policy_version 10960 (0.0005) +[2024-08-24 20:24:45,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36249.5, 300 sec: 36461.3). Total num frames: 44896256. Throughput: 0: 9028.4. Samples: 11216468. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:24:45,813][01192] Avg episode reward: [(0, '4.307')] +[2024-08-24 20:24:46,788][03430] Updated weights for policy 0, policy_version 10970 (0.0005) +[2024-08-24 20:24:47,912][03430] Updated weights for policy 0, policy_version 10980 (0.0005) +[2024-08-24 20:24:49,022][03430] Updated weights for policy 0, policy_version 10990 (0.0006) +[2024-08-24 20:24:50,142][03430] Updated weights for policy 0, policy_version 11000 (0.0006) +[2024-08-24 20:24:50,812][01192] Fps is (10 sec: 36454.0, 60 sec: 36249.5, 300 sec: 36461.3). Total num frames: 45080576. Throughput: 0: 9030.9. Samples: 11243570. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:24:50,813][01192] Avg episode reward: [(0, '4.352')] +[2024-08-24 20:24:51,279][03430] Updated weights for policy 0, policy_version 11010 (0.0006) +[2024-08-24 20:24:52,416][03430] Updated weights for policy 0, policy_version 11020 (0.0006) +[2024-08-24 20:24:53,533][03430] Updated weights for policy 0, policy_version 11030 (0.0005) +[2024-08-24 20:24:54,655][03430] Updated weights for policy 0, policy_version 11040 (0.0005) +[2024-08-24 20:24:55,800][03430] Updated weights for policy 0, policy_version 11050 (0.0006) +[2024-08-24 20:24:55,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36181.3, 300 sec: 36433.6). Total num frames: 45260800. Throughput: 0: 9049.8. Samples: 11298140. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:24:55,813][01192] Avg episode reward: [(0, '4.408')] +[2024-08-24 20:24:56,932][03430] Updated weights for policy 0, policy_version 11060 (0.0005) +[2024-08-24 20:24:58,053][03430] Updated weights for policy 0, policy_version 11070 (0.0006) +[2024-08-24 20:24:59,175][03430] Updated weights for policy 0, policy_version 11080 (0.0006) +[2024-08-24 20:25:00,289][03430] Updated weights for policy 0, policy_version 11090 (0.0006) +[2024-08-24 20:25:00,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36181.4, 300 sec: 36419.7). Total num frames: 45441024. Throughput: 0: 9048.6. Samples: 11352864. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:00,813][01192] Avg episode reward: [(0, '4.542')] +[2024-08-24 20:25:01,414][03430] Updated weights for policy 0, policy_version 11100 (0.0006) +[2024-08-24 20:25:02,526][03430] Updated weights for policy 0, policy_version 11110 (0.0005) +[2024-08-24 20:25:03,669][03430] Updated weights for policy 0, policy_version 11120 (0.0006) +[2024-08-24 20:25:04,779][03430] Updated weights for policy 0, policy_version 11130 (0.0006) +[2024-08-24 20:25:05,812][01192] Fps is (10 sec: 36454.0, 60 sec: 36249.5, 300 sec: 36419.7). 
Total num frames: 45625344. Throughput: 0: 9051.4. Samples: 11380082. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:05,813][01192] Avg episode reward: [(0, '4.461')] +[2024-08-24 20:25:05,874][03430] Updated weights for policy 0, policy_version 11140 (0.0005) +[2024-08-24 20:25:06,997][03430] Updated weights for policy 0, policy_version 11150 (0.0006) +[2024-08-24 20:25:08,120][03430] Updated weights for policy 0, policy_version 11160 (0.0004) +[2024-08-24 20:25:09,241][03430] Updated weights for policy 0, policy_version 11170 (0.0006) +[2024-08-24 20:25:10,417][03430] Updated weights for policy 0, policy_version 11180 (0.0006) +[2024-08-24 20:25:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36249.6, 300 sec: 36391.9). Total num frames: 45805568. Throughput: 0: 9069.7. Samples: 11435266. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:10,813][01192] Avg episode reward: [(0, '4.508')] +[2024-08-24 20:25:11,527][03430] Updated weights for policy 0, policy_version 11190 (0.0006) +[2024-08-24 20:25:12,597][03430] Updated weights for policy 0, policy_version 11200 (0.0005) +[2024-08-24 20:25:13,707][03430] Updated weights for policy 0, policy_version 11210 (0.0005) +[2024-08-24 20:25:14,794][03430] Updated weights for policy 0, policy_version 11220 (0.0005) +[2024-08-24 20:25:15,812][01192] Fps is (10 sec: 36864.5, 60 sec: 36317.9, 300 sec: 36405.8). Total num frames: 45993984. Throughput: 0: 9096.2. Samples: 11490464. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:15,813][01192] Avg episode reward: [(0, '4.449')] +[2024-08-24 20:25:15,922][03430] Updated weights for policy 0, policy_version 11230 (0.0005) +[2024-08-24 20:25:17,061][03430] Updated weights for policy 0, policy_version 11240 (0.0006) +[2024-08-24 20:25:18,151][03430] Updated weights for policy 0, policy_version 11250 (0.0005) +[2024-08-24 20:25:19,237][03430] Updated weights for policy 0, policy_version 11260 (0.0006) +[2024-08-24 20:25:20,342][03430] Updated weights for policy 0, policy_version 11270 (0.0006) +[2024-08-24 20:25:20,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36386.1, 300 sec: 36405.8). Total num frames: 46178304. Throughput: 0: 9119.1. Samples: 11517732. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:20,813][01192] Avg episode reward: [(0, '4.284')] +[2024-08-24 20:25:21,429][03430] Updated weights for policy 0, policy_version 11280 (0.0006) +[2024-08-24 20:25:22,533][03430] Updated weights for policy 0, policy_version 11290 (0.0005) +[2024-08-24 20:25:23,593][03430] Updated weights for policy 0, policy_version 11300 (0.0005) +[2024-08-24 20:25:24,666][03430] Updated weights for policy 0, policy_version 11310 (0.0005) +[2024-08-24 20:25:25,741][03430] Updated weights for policy 0, policy_version 11320 (0.0005) +[2024-08-24 20:25:25,812][01192] Fps is (10 sec: 37273.7, 60 sec: 36522.7, 300 sec: 36405.8). Total num frames: 46366720. Throughput: 0: 9148.3. Samples: 11573942. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:25:25,813][01192] Avg episode reward: [(0, '4.393')] +[2024-08-24 20:25:26,879][03430] Updated weights for policy 0, policy_version 11330 (0.0005) +[2024-08-24 20:25:28,007][03430] Updated weights for policy 0, policy_version 11340 (0.0006) +[2024-08-24 20:25:29,100][03430] Updated weights for policy 0, policy_version 11350 (0.0005) +[2024-08-24 20:25:30,310][03430] Updated weights for policy 0, policy_version 11360 (0.0006) +[2024-08-24 20:25:30,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36590.9, 300 sec: 36391.9). Total num frames: 46546944. Throughput: 0: 9176.4. Samples: 11629406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:25:30,813][01192] Avg episode reward: [(0, '4.462')] +[2024-08-24 20:25:31,433][03430] Updated weights for policy 0, policy_version 11370 (0.0005) +[2024-08-24 20:25:32,542][03430] Updated weights for policy 0, policy_version 11380 (0.0006) +[2024-08-24 20:25:33,655][03430] Updated weights for policy 0, policy_version 11390 (0.0005) +[2024-08-24 20:25:34,754][03430] Updated weights for policy 0, policy_version 11400 (0.0005) +[2024-08-24 20:25:35,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36391.9). Total num frames: 46731264. Throughput: 0: 9181.4. Samples: 11656732. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:25:35,812][01192] Avg episode reward: [(0, '4.389')] +[2024-08-24 20:25:35,867][03430] Updated weights for policy 0, policy_version 11410 (0.0006) +[2024-08-24 20:25:37,005][03430] Updated weights for policy 0, policy_version 11420 (0.0005) +[2024-08-24 20:25:38,167][03430] Updated weights for policy 0, policy_version 11430 (0.0006) +[2024-08-24 20:25:39,295][03430] Updated weights for policy 0, policy_version 11440 (0.0005) +[2024-08-24 20:25:40,427][03430] Updated weights for policy 0, policy_version 11450 (0.0006) +[2024-08-24 20:25:40,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36590.9, 300 sec: 36364.2). 
Total num frames: 46911488. Throughput: 0: 9179.3. Samples: 11711208. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:25:40,813][01192] Avg episode reward: [(0, '4.513')] +[2024-08-24 20:25:41,561][03430] Updated weights for policy 0, policy_version 11460 (0.0005) +[2024-08-24 20:25:42,705][03430] Updated weights for policy 0, policy_version 11470 (0.0005) +[2024-08-24 20:25:43,838][03430] Updated weights for policy 0, policy_version 11480 (0.0005) +[2024-08-24 20:25:44,983][03430] Updated weights for policy 0, policy_version 11490 (0.0006) +[2024-08-24 20:25:45,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36590.9, 300 sec: 36350.3). Total num frames: 47091712. Throughput: 0: 9164.4. Samples: 11765262. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:25:45,813][01192] Avg episode reward: [(0, '4.292')] +[2024-08-24 20:25:46,115][03430] Updated weights for policy 0, policy_version 11500 (0.0006) +[2024-08-24 20:25:47,250][03430] Updated weights for policy 0, policy_version 11510 (0.0006) +[2024-08-24 20:25:48,345][03430] Updated weights for policy 0, policy_version 11520 (0.0006) +[2024-08-24 20:25:49,463][03430] Updated weights for policy 0, policy_version 11530 (0.0007) +[2024-08-24 20:25:50,601][03430] Updated weights for policy 0, policy_version 11540 (0.0005) +[2024-08-24 20:25:50,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36522.7, 300 sec: 36336.4). Total num frames: 47271936. Throughput: 0: 9162.5. Samples: 11792392. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:25:50,812][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:25:51,760][03430] Updated weights for policy 0, policy_version 11550 (0.0005) +[2024-08-24 20:25:52,903][03430] Updated weights for policy 0, policy_version 11560 (0.0006) +[2024-08-24 20:25:54,039][03430] Updated weights for policy 0, policy_version 11570 (0.0006) +[2024-08-24 20:25:55,148][03430] Updated weights for policy 0, policy_version 11580 (0.0006) +[2024-08-24 20:25:55,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36522.7, 300 sec: 36322.5). Total num frames: 47452160. Throughput: 0: 9139.2. Samples: 11846530. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:25:55,813][01192] Avg episode reward: [(0, '4.255')] +[2024-08-24 20:25:56,278][03430] Updated weights for policy 0, policy_version 11590 (0.0007) +[2024-08-24 20:25:57,382][03430] Updated weights for policy 0, policy_version 11600 (0.0005) +[2024-08-24 20:25:58,443][03430] Updated weights for policy 0, policy_version 11610 (0.0005) +[2024-08-24 20:25:59,584][03430] Updated weights for policy 0, policy_version 11620 (0.0005) +[2024-08-24 20:26:00,688][03430] Updated weights for policy 0, policy_version 11630 (0.0005) +[2024-08-24 20:26:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36322.5). Total num frames: 47636480. Throughput: 0: 9144.8. Samples: 11901980. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:26:00,812][01192] Avg episode reward: [(0, '4.367')] +[2024-08-24 20:26:01,812][03430] Updated weights for policy 0, policy_version 11640 (0.0006) +[2024-08-24 20:26:02,919][03430] Updated weights for policy 0, policy_version 11650 (0.0006) +[2024-08-24 20:26:04,031][03430] Updated weights for policy 0, policy_version 11660 (0.0005) +[2024-08-24 20:26:05,116][03430] Updated weights for policy 0, policy_version 11670 (0.0005) +[2024-08-24 20:26:05,812][01192] Fps is (10 sec: 37273.7, 60 sec: 36659.3, 300 sec: 36350.3). 
Total num frames: 47824896. Throughput: 0: 9153.3. Samples: 11929632. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:26:05,813][01192] Avg episode reward: [(0, '4.417')] +[2024-08-24 20:26:06,243][03430] Updated weights for policy 0, policy_version 11680 (0.0007) +[2024-08-24 20:26:07,382][03430] Updated weights for policy 0, policy_version 11690 (0.0005) +[2024-08-24 20:26:08,506][03430] Updated weights for policy 0, policy_version 11700 (0.0006) +[2024-08-24 20:26:09,649][03430] Updated weights for policy 0, policy_version 11710 (0.0005) +[2024-08-24 20:26:10,745][03430] Updated weights for policy 0, policy_version 11720 (0.0005) +[2024-08-24 20:26:10,812][01192] Fps is (10 sec: 36863.4, 60 sec: 36659.1, 300 sec: 36336.4). Total num frames: 48005120. Throughput: 0: 9120.9. Samples: 11984386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:26:10,813][01192] Avg episode reward: [(0, '4.545')] +[2024-08-24 20:26:11,829][03430] Updated weights for policy 0, policy_version 11730 (0.0005) +[2024-08-24 20:26:12,919][03430] Updated weights for policy 0, policy_version 11740 (0.0006) +[2024-08-24 20:26:14,028][03430] Updated weights for policy 0, policy_version 11750 (0.0006) +[2024-08-24 20:26:15,135][03430] Updated weights for policy 0, policy_version 11760 (0.0005) +[2024-08-24 20:26:15,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36590.9, 300 sec: 36322.5). Total num frames: 48189440. Throughput: 0: 9128.4. Samples: 12040186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:26:15,812][01192] Avg episode reward: [(0, '4.521')] +[2024-08-24 20:26:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000011766_48193536.pth... 
+[2024-08-24 20:26:15,841][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000009637_39473152.pth +[2024-08-24 20:26:16,281][03430] Updated weights for policy 0, policy_version 11770 (0.0005) +[2024-08-24 20:26:17,420][03430] Updated weights for policy 0, policy_version 11780 (0.0006) +[2024-08-24 20:26:18,524][03430] Updated weights for policy 0, policy_version 11790 (0.0005) +[2024-08-24 20:26:19,614][03430] Updated weights for policy 0, policy_version 11800 (0.0006) +[2024-08-24 20:26:20,745][03430] Updated weights for policy 0, policy_version 11810 (0.0005) +[2024-08-24 20:26:20,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36590.9, 300 sec: 36322.5). Total num frames: 48373760. Throughput: 0: 9128.0. Samples: 12067492. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:26:20,813][01192] Avg episode reward: [(0, '4.310')] +[2024-08-24 20:26:21,858][03430] Updated weights for policy 0, policy_version 11820 (0.0005) +[2024-08-24 20:26:22,970][03430] Updated weights for policy 0, policy_version 11830 (0.0005) +[2024-08-24 20:26:24,087][03430] Updated weights for policy 0, policy_version 11840 (0.0006) +[2024-08-24 20:26:25,205][03430] Updated weights for policy 0, policy_version 11850 (0.0006) +[2024-08-24 20:26:25,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36522.6, 300 sec: 36336.4). Total num frames: 48558080. Throughput: 0: 9144.2. Samples: 12122696. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:26:25,813][01192] Avg episode reward: [(0, '4.373')] +[2024-08-24 20:26:26,329][03430] Updated weights for policy 0, policy_version 11860 (0.0005) +[2024-08-24 20:26:27,472][03430] Updated weights for policy 0, policy_version 11870 (0.0005) +[2024-08-24 20:26:28,571][03430] Updated weights for policy 0, policy_version 11880 (0.0006) +[2024-08-24 20:26:29,644][03430] Updated weights for policy 0, policy_version 11890 (0.0006) +[2024-08-24 20:26:30,749][03430] Updated weights for policy 0, policy_version 11900 (0.0005) +[2024-08-24 20:26:30,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36590.9, 300 sec: 36350.3). Total num frames: 48742400. Throughput: 0: 9166.5. Samples: 12177752. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:26:30,813][01192] Avg episode reward: [(0, '4.255')] +[2024-08-24 20:26:31,854][03430] Updated weights for policy 0, policy_version 11910 (0.0006) +[2024-08-24 20:26:32,973][03430] Updated weights for policy 0, policy_version 11920 (0.0006) +[2024-08-24 20:26:34,067][03430] Updated weights for policy 0, policy_version 11930 (0.0005) +[2024-08-24 20:26:35,209][03430] Updated weights for policy 0, policy_version 11940 (0.0006) +[2024-08-24 20:26:35,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36590.9, 300 sec: 36364.1). Total num frames: 48926720. Throughput: 0: 9179.5. Samples: 12205468. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:26:35,813][01192] Avg episode reward: [(0, '4.566')] +[2024-08-24 20:26:36,345][03430] Updated weights for policy 0, policy_version 11950 (0.0006) +[2024-08-24 20:26:37,458][03430] Updated weights for policy 0, policy_version 11960 (0.0006) +[2024-08-24 20:26:38,580][03430] Updated weights for policy 0, policy_version 11970 (0.0006) +[2024-08-24 20:26:39,652][03430] Updated weights for policy 0, policy_version 11980 (0.0006) +[2024-08-24 20:26:40,797][03430] Updated weights for policy 0, policy_version 11990 (0.0006) +[2024-08-24 20:26:40,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36659.2, 300 sec: 36364.2). Total num frames: 49111040. Throughput: 0: 9198.4. Samples: 12260458. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:26:40,813][01192] Avg episode reward: [(0, '4.290')] +[2024-08-24 20:26:41,929][03430] Updated weights for policy 0, policy_version 12000 (0.0006) +[2024-08-24 20:26:43,059][03430] Updated weights for policy 0, policy_version 12010 (0.0005) +[2024-08-24 20:26:44,173][03430] Updated weights for policy 0, policy_version 12020 (0.0006) +[2024-08-24 20:26:45,299][03430] Updated weights for policy 0, policy_version 12030 (0.0006) +[2024-08-24 20:26:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36659.3, 300 sec: 36364.2). Total num frames: 49291264. Throughput: 0: 9185.7. Samples: 12315338. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:26:45,813][01192] Avg episode reward: [(0, '4.597')] +[2024-08-24 20:26:46,397][03430] Updated weights for policy 0, policy_version 12040 (0.0005) +[2024-08-24 20:26:47,504][03430] Updated weights for policy 0, policy_version 12050 (0.0005) +[2024-08-24 20:26:48,643][03430] Updated weights for policy 0, policy_version 12060 (0.0005) +[2024-08-24 20:26:49,735][03430] Updated weights for policy 0, policy_version 12070 (0.0006) +[2024-08-24 20:26:50,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36727.4, 300 sec: 36364.1). 
Total num frames: 49475584. Throughput: 0: 9185.1. Samples: 12342964. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:26:50,813][01192] Avg episode reward: [(0, '4.310')] +[2024-08-24 20:26:50,857][03430] Updated weights for policy 0, policy_version 12080 (0.0006) +[2024-08-24 20:26:51,949][03430] Updated weights for policy 0, policy_version 12090 (0.0005) +[2024-08-24 20:26:53,046][03430] Updated weights for policy 0, policy_version 12100 (0.0006) +[2024-08-24 20:26:54,157][03430] Updated weights for policy 0, policy_version 12110 (0.0005) +[2024-08-24 20:26:55,257][03430] Updated weights for policy 0, policy_version 12120 (0.0005) +[2024-08-24 20:26:55,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36795.7, 300 sec: 36378.0). Total num frames: 49659904. Throughput: 0: 9198.7. Samples: 12398326. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:26:55,813][01192] Avg episode reward: [(0, '4.507')] +[2024-08-24 20:26:56,391][03430] Updated weights for policy 0, policy_version 12130 (0.0005) +[2024-08-24 20:26:57,526][03430] Updated weights for policy 0, policy_version 12140 (0.0005) +[2024-08-24 20:26:58,671][03430] Updated weights for policy 0, policy_version 12150 (0.0006) +[2024-08-24 20:26:59,805][03430] Updated weights for policy 0, policy_version 12160 (0.0006) +[2024-08-24 20:27:00,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36727.5, 300 sec: 36378.0). Total num frames: 49840128. Throughput: 0: 9168.9. Samples: 12452788. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:27:00,813][01192] Avg episode reward: [(0, '4.415')] +[2024-08-24 20:27:00,934][03430] Updated weights for policy 0, policy_version 12170 (0.0006) +[2024-08-24 20:27:02,049][03430] Updated weights for policy 0, policy_version 12180 (0.0005) +[2024-08-24 20:27:03,146][03430] Updated weights for policy 0, policy_version 12190 (0.0006) +[2024-08-24 20:27:04,310][03430] Updated weights for policy 0, policy_version 12200 (0.0005) +[2024-08-24 20:27:05,450][03430] Updated weights for policy 0, policy_version 12210 (0.0005) +[2024-08-24 20:27:05,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36659.2, 300 sec: 36391.9). Total num frames: 50024448. Throughput: 0: 9173.2. Samples: 12480286. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:27:05,813][01192] Avg episode reward: [(0, '4.322')] +[2024-08-24 20:27:06,589][03430] Updated weights for policy 0, policy_version 12220 (0.0007) +[2024-08-24 20:27:07,693][03430] Updated weights for policy 0, policy_version 12230 (0.0006) +[2024-08-24 20:27:08,857][03430] Updated weights for policy 0, policy_version 12240 (0.0005) +[2024-08-24 20:27:09,960][03430] Updated weights for policy 0, policy_version 12250 (0.0006) +[2024-08-24 20:27:10,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36659.3, 300 sec: 36378.0). Total num frames: 50204672. Throughput: 0: 9150.2. Samples: 12534456. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:27:10,813][01192] Avg episode reward: [(0, '4.506')] +[2024-08-24 20:27:11,091][03430] Updated weights for policy 0, policy_version 12260 (0.0005) +[2024-08-24 20:27:12,293][03430] Updated weights for policy 0, policy_version 12270 (0.0006) +[2024-08-24 20:27:13,422][03430] Updated weights for policy 0, policy_version 12280 (0.0005) +[2024-08-24 20:27:14,538][03430] Updated weights for policy 0, policy_version 12290 (0.0005) +[2024-08-24 20:27:15,674][03430] Updated weights for policy 0, policy_version 12300 (0.0006) +[2024-08-24 20:27:15,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36590.9, 300 sec: 36391.9). Total num frames: 50384896. Throughput: 0: 9125.1. Samples: 12588382. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:27:15,813][01192] Avg episode reward: [(0, '4.625')] +[2024-08-24 20:27:16,780][03430] Updated weights for policy 0, policy_version 12310 (0.0005) +[2024-08-24 20:27:17,913][03430] Updated weights for policy 0, policy_version 12320 (0.0006) +[2024-08-24 20:27:19,027][03430] Updated weights for policy 0, policy_version 12330 (0.0005) +[2024-08-24 20:27:20,133][03430] Updated weights for policy 0, policy_version 12340 (0.0005) +[2024-08-24 20:27:20,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36522.7, 300 sec: 36378.0). Total num frames: 50565120. Throughput: 0: 9113.6. Samples: 12615578. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:27:20,813][01192] Avg episode reward: [(0, '4.356')] +[2024-08-24 20:27:21,263][03430] Updated weights for policy 0, policy_version 12350 (0.0006) +[2024-08-24 20:27:22,367][03430] Updated weights for policy 0, policy_version 12360 (0.0006) +[2024-08-24 20:27:23,463][03430] Updated weights for policy 0, policy_version 12370 (0.0006) +[2024-08-24 20:27:24,566][03430] Updated weights for policy 0, policy_version 12380 (0.0006) +[2024-08-24 20:27:25,694][03430] Updated weights for policy 0, policy_version 12390 (0.0005) +[2024-08-24 20:27:25,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36591.0, 300 sec: 36405.8). Total num frames: 50753536. Throughput: 0: 9124.1. Samples: 12671042. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:27:25,813][01192] Avg episode reward: [(0, '4.591')] +[2024-08-24 20:27:26,840][03430] Updated weights for policy 0, policy_version 12400 (0.0006) +[2024-08-24 20:27:27,978][03430] Updated weights for policy 0, policy_version 12410 (0.0005) +[2024-08-24 20:27:29,110][03430] Updated weights for policy 0, policy_version 12420 (0.0006) +[2024-08-24 20:27:30,240][03430] Updated weights for policy 0, policy_version 12430 (0.0007) +[2024-08-24 20:27:30,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36454.3, 300 sec: 36391.9). Total num frames: 50929664. Throughput: 0: 9115.0. Samples: 12725512. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:27:30,813][01192] Avg episode reward: [(0, '4.341')] +[2024-08-24 20:27:31,387][03430] Updated weights for policy 0, policy_version 12440 (0.0006) +[2024-08-24 20:27:32,491][03430] Updated weights for policy 0, policy_version 12450 (0.0005) +[2024-08-24 20:27:33,610][03430] Updated weights for policy 0, policy_version 12460 (0.0006) +[2024-08-24 20:27:34,735][03430] Updated weights for policy 0, policy_version 12470 (0.0005) +[2024-08-24 20:27:35,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36454.4, 300 sec: 36391.9). 
Total num frames: 51113984. Throughput: 0: 9104.9. Samples: 12752682. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:27:35,813][01192] Avg episode reward: [(0, '4.272')] +[2024-08-24 20:27:35,848][03430] Updated weights for policy 0, policy_version 12480 (0.0007) +[2024-08-24 20:27:36,972][03430] Updated weights for policy 0, policy_version 12490 (0.0006) +[2024-08-24 20:27:38,109][03430] Updated weights for policy 0, policy_version 12500 (0.0004) +[2024-08-24 20:27:39,229][03430] Updated weights for policy 0, policy_version 12510 (0.0005) +[2024-08-24 20:27:40,340][03430] Updated weights for policy 0, policy_version 12520 (0.0006) +[2024-08-24 20:27:40,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36454.4, 300 sec: 36405.8). Total num frames: 51298304. Throughput: 0: 9091.4. Samples: 12807440. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:27:40,813][01192] Avg episode reward: [(0, '4.454')] +[2024-08-24 20:27:41,442][03430] Updated weights for policy 0, policy_version 12530 (0.0006) +[2024-08-24 20:27:42,570][03430] Updated weights for policy 0, policy_version 12540 (0.0006) +[2024-08-24 20:27:43,686][03430] Updated weights for policy 0, policy_version 12550 (0.0005) +[2024-08-24 20:27:44,805][03430] Updated weights for policy 0, policy_version 12560 (0.0005) +[2024-08-24 20:27:45,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36522.6, 300 sec: 36405.8). Total num frames: 51482624. Throughput: 0: 9101.5. Samples: 12862358. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:27:45,813][01192] Avg episode reward: [(0, '4.536')] +[2024-08-24 20:27:45,922][03430] Updated weights for policy 0, policy_version 12570 (0.0006) +[2024-08-24 20:27:47,050][03430] Updated weights for policy 0, policy_version 12580 (0.0006) +[2024-08-24 20:27:48,178][03430] Updated weights for policy 0, policy_version 12590 (0.0005) +[2024-08-24 20:27:49,311][03430] Updated weights for policy 0, policy_version 12600 (0.0007) +[2024-08-24 20:27:50,402][03430] Updated weights for policy 0, policy_version 12610 (0.0005) +[2024-08-24 20:27:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36405.8). Total num frames: 51662848. Throughput: 0: 9098.7. Samples: 12889726. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:27:50,812][01192] Avg episode reward: [(0, '4.481')] +[2024-08-24 20:27:51,538][03430] Updated weights for policy 0, policy_version 12620 (0.0006) +[2024-08-24 20:27:52,641][03430] Updated weights for policy 0, policy_version 12630 (0.0005) +[2024-08-24 20:27:53,765][03430] Updated weights for policy 0, policy_version 12640 (0.0006) +[2024-08-24 20:27:54,871][03430] Updated weights for policy 0, policy_version 12650 (0.0005) +[2024-08-24 20:27:55,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36454.4, 300 sec: 36419.7). Total num frames: 51847168. Throughput: 0: 9113.6. Samples: 12944570. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:27:55,813][01192] Avg episode reward: [(0, '4.354')] +[2024-08-24 20:27:55,996][03430] Updated weights for policy 0, policy_version 12660 (0.0005) +[2024-08-24 20:27:57,129][03430] Updated weights for policy 0, policy_version 12670 (0.0005) +[2024-08-24 20:27:58,250][03430] Updated weights for policy 0, policy_version 12680 (0.0006) +[2024-08-24 20:27:59,363][03430] Updated weights for policy 0, policy_version 12690 (0.0005) +[2024-08-24 20:28:00,476][03430] Updated weights for policy 0, policy_version 12700 (0.0006) +[2024-08-24 20:28:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36419.7). Total num frames: 52027392. Throughput: 0: 9135.5. Samples: 12999480. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:28:00,812][01192] Avg episode reward: [(0, '4.410')] +[2024-08-24 20:28:01,606][03430] Updated weights for policy 0, policy_version 12710 (0.0005) +[2024-08-24 20:28:02,730][03430] Updated weights for policy 0, policy_version 12720 (0.0006) +[2024-08-24 20:28:03,859][03430] Updated weights for policy 0, policy_version 12730 (0.0006) +[2024-08-24 20:28:05,005][03430] Updated weights for policy 0, policy_version 12740 (0.0005) +[2024-08-24 20:28:05,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36433.6). Total num frames: 52211712. Throughput: 0: 9141.9. Samples: 13026962. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:28:05,813][01192] Avg episode reward: [(0, '4.317')] +[2024-08-24 20:28:06,122][03430] Updated weights for policy 0, policy_version 12750 (0.0005) +[2024-08-24 20:28:07,231][03430] Updated weights for policy 0, policy_version 12760 (0.0006) +[2024-08-24 20:28:08,360][03430] Updated weights for policy 0, policy_version 12770 (0.0005) +[2024-08-24 20:28:09,516][03430] Updated weights for policy 0, policy_version 12780 (0.0005) +[2024-08-24 20:28:10,631][03430] Updated weights for policy 0, policy_version 12790 (0.0006) +[2024-08-24 20:28:10,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36454.3, 300 sec: 36447.4). Total num frames: 52391936. Throughput: 0: 9121.6. Samples: 13081516. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:28:10,813][01192] Avg episode reward: [(0, '4.583')] +[2024-08-24 20:28:11,786][03430] Updated weights for policy 0, policy_version 12800 (0.0006) +[2024-08-24 20:28:12,918][03430] Updated weights for policy 0, policy_version 12810 (0.0005) +[2024-08-24 20:28:14,029][03430] Updated weights for policy 0, policy_version 12820 (0.0006) +[2024-08-24 20:28:15,140][03430] Updated weights for policy 0, policy_version 12830 (0.0005) +[2024-08-24 20:28:15,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36454.3, 300 sec: 36447.4). Total num frames: 52572160. Throughput: 0: 9117.6. Samples: 13135802. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:28:15,813][01192] Avg episode reward: [(0, '4.591')] +[2024-08-24 20:28:15,817][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000012836_52576256.pth... 
+[2024-08-24 20:28:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000010697_43814912.pth +[2024-08-24 20:28:16,284][03430] Updated weights for policy 0, policy_version 12840 (0.0006) +[2024-08-24 20:28:17,403][03430] Updated weights for policy 0, policy_version 12850 (0.0005) +[2024-08-24 20:28:18,530][03430] Updated weights for policy 0, policy_version 12860 (0.0005) +[2024-08-24 20:28:19,659][03430] Updated weights for policy 0, policy_version 12870 (0.0005) +[2024-08-24 20:28:20,773][03430] Updated weights for policy 0, policy_version 12880 (0.0005) +[2024-08-24 20:28:20,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36522.6, 300 sec: 36461.3). Total num frames: 52756480. Throughput: 0: 9117.4. Samples: 13162966. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:28:20,813][01192] Avg episode reward: [(0, '4.285')] +[2024-08-24 20:28:21,909][03430] Updated weights for policy 0, policy_version 12890 (0.0006) +[2024-08-24 20:28:23,022][03430] Updated weights for policy 0, policy_version 12900 (0.0006) +[2024-08-24 20:28:24,134][03430] Updated weights for policy 0, policy_version 12910 (0.0006) +[2024-08-24 20:28:25,250][03430] Updated weights for policy 0, policy_version 12920 (0.0007) +[2024-08-24 20:28:25,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36454.4, 300 sec: 36461.3). Total num frames: 52940800. Throughput: 0: 9121.4. Samples: 13217902. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:28:25,813][01192] Avg episode reward: [(0, '4.760')] +[2024-08-24 20:28:26,374][03430] Updated weights for policy 0, policy_version 12930 (0.0006) +[2024-08-24 20:28:27,479][03430] Updated weights for policy 0, policy_version 12940 (0.0005) +[2024-08-24 20:28:28,552][03430] Updated weights for policy 0, policy_version 12950 (0.0005) +[2024-08-24 20:28:29,666][03430] Updated weights for policy 0, policy_version 12960 (0.0005) +[2024-08-24 20:28:30,784][03430] Updated weights for policy 0, policy_version 12970 (0.0006) +[2024-08-24 20:28:30,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36591.0, 300 sec: 36475.2). Total num frames: 53125120. Throughput: 0: 9129.3. Samples: 13273176. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:28:30,813][01192] Avg episode reward: [(0, '4.445')] +[2024-08-24 20:28:31,924][03430] Updated weights for policy 0, policy_version 12980 (0.0005) +[2024-08-24 20:28:33,059][03430] Updated weights for policy 0, policy_version 12990 (0.0005) +[2024-08-24 20:28:34,219][03430] Updated weights for policy 0, policy_version 13000 (0.0005) +[2024-08-24 20:28:35,321][03430] Updated weights for policy 0, policy_version 13010 (0.0006) +[2024-08-24 20:28:35,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36522.6, 300 sec: 36475.2). Total num frames: 53305344. Throughput: 0: 9124.9. Samples: 13300348. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:28:35,813][01192] Avg episode reward: [(0, '4.399')] +[2024-08-24 20:28:36,465][03430] Updated weights for policy 0, policy_version 13020 (0.0006) +[2024-08-24 20:28:37,607][03430] Updated weights for policy 0, policy_version 13030 (0.0007) +[2024-08-24 20:28:38,733][03430] Updated weights for policy 0, policy_version 13040 (0.0006) +[2024-08-24 20:28:39,854][03430] Updated weights for policy 0, policy_version 13050 (0.0006) +[2024-08-24 20:28:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36454.4, 300 sec: 36489.1). 
Total num frames: 53485568. Throughput: 0: 9107.8. Samples: 13354420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:28:40,812][01192] Avg episode reward: [(0, '4.404')] +[2024-08-24 20:28:40,960][03430] Updated weights for policy 0, policy_version 13060 (0.0006) +[2024-08-24 20:28:42,088][03430] Updated weights for policy 0, policy_version 13070 (0.0006) +[2024-08-24 20:28:43,213][03430] Updated weights for policy 0, policy_version 13080 (0.0005) +[2024-08-24 20:28:44,318][03430] Updated weights for policy 0, policy_version 13090 (0.0005) +[2024-08-24 20:28:45,444][03430] Updated weights for policy 0, policy_version 13100 (0.0005) +[2024-08-24 20:28:45,812][01192] Fps is (10 sec: 36454.8, 60 sec: 36454.5, 300 sec: 36489.1). Total num frames: 53669888. Throughput: 0: 9115.0. Samples: 13409656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:28:45,813][01192] Avg episode reward: [(0, '4.555')] +[2024-08-24 20:28:46,581][03430] Updated weights for policy 0, policy_version 13110 (0.0006) +[2024-08-24 20:28:47,707][03430] Updated weights for policy 0, policy_version 13120 (0.0007) +[2024-08-24 20:28:48,842][03430] Updated weights for policy 0, policy_version 13130 (0.0006) +[2024-08-24 20:28:49,964][03430] Updated weights for policy 0, policy_version 13140 (0.0006) +[2024-08-24 20:28:50,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36454.4, 300 sec: 36475.2). Total num frames: 53850112. Throughput: 0: 9105.6. Samples: 13436712. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:28:50,812][01192] Avg episode reward: [(0, '4.663')] +[2024-08-24 20:28:51,086][03430] Updated weights for policy 0, policy_version 13150 (0.0006) +[2024-08-24 20:28:52,197][03430] Updated weights for policy 0, policy_version 13160 (0.0006) +[2024-08-24 20:28:53,330][03430] Updated weights for policy 0, policy_version 13170 (0.0006) +[2024-08-24 20:28:54,445][03430] Updated weights for policy 0, policy_version 13180 (0.0005) +[2024-08-24 20:28:55,560][03430] Updated weights for policy 0, policy_version 13190 (0.0004) +[2024-08-24 20:28:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 54034432. Throughput: 0: 9107.9. Samples: 13491372. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:28:55,813][01192] Avg episode reward: [(0, '4.636')] +[2024-08-24 20:28:56,676][03430] Updated weights for policy 0, policy_version 13200 (0.0005) +[2024-08-24 20:28:57,784][03430] Updated weights for policy 0, policy_version 13210 (0.0005) +[2024-08-24 20:28:58,901][03430] Updated weights for policy 0, policy_version 13220 (0.0006) +[2024-08-24 20:29:00,014][03430] Updated weights for policy 0, policy_version 13230 (0.0006) +[2024-08-24 20:29:00,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36522.7, 300 sec: 36503.0). Total num frames: 54218752. Throughput: 0: 9126.4. Samples: 13546488. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:29:00,813][01192] Avg episode reward: [(0, '4.401')] +[2024-08-24 20:29:01,151][03430] Updated weights for policy 0, policy_version 13240 (0.0005) +[2024-08-24 20:29:02,259][03430] Updated weights for policy 0, policy_version 13250 (0.0006) +[2024-08-24 20:29:03,357][03430] Updated weights for policy 0, policy_version 13260 (0.0006) +[2024-08-24 20:29:04,465][03430] Updated weights for policy 0, policy_version 13270 (0.0006) +[2024-08-24 20:29:05,591][03430] Updated weights for policy 0, policy_version 13280 (0.0006) +[2024-08-24 20:29:05,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36522.7, 300 sec: 36516.9). Total num frames: 54403072. Throughput: 0: 9135.3. Samples: 13574056. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:29:05,813][01192] Avg episode reward: [(0, '4.550')] +[2024-08-24 20:29:06,712][03430] Updated weights for policy 0, policy_version 13290 (0.0007) +[2024-08-24 20:29:07,837][03430] Updated weights for policy 0, policy_version 13300 (0.0006) +[2024-08-24 20:29:08,992][03430] Updated weights for policy 0, policy_version 13310 (0.0006) +[2024-08-24 20:29:10,118][03430] Updated weights for policy 0, policy_version 13320 (0.0006) +[2024-08-24 20:29:10,812][01192] Fps is (10 sec: 36044.3, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 54579200. Throughput: 0: 9129.1. Samples: 13628714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:29:10,813][01192] Avg episode reward: [(0, '4.544')] +[2024-08-24 20:29:11,263][03430] Updated weights for policy 0, policy_version 13330 (0.0006) +[2024-08-24 20:29:12,418][03430] Updated weights for policy 0, policy_version 13340 (0.0006) +[2024-08-24 20:29:13,620][03430] Updated weights for policy 0, policy_version 13350 (0.0006) +[2024-08-24 20:29:14,766][03430] Updated weights for policy 0, policy_version 13360 (0.0006) +[2024-08-24 20:29:15,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36454.5, 300 sec: 36489.1). 
Total num frames: 54759424. Throughput: 0: 9087.2. Samples: 13682098. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:29:15,813][01192] Avg episode reward: [(0, '4.436')] +[2024-08-24 20:29:15,904][03430] Updated weights for policy 0, policy_version 13370 (0.0007) +[2024-08-24 20:29:17,036][03430] Updated weights for policy 0, policy_version 13380 (0.0005) +[2024-08-24 20:29:18,157][03430] Updated weights for policy 0, policy_version 13390 (0.0006) +[2024-08-24 20:29:19,232][03430] Updated weights for policy 0, policy_version 13400 (0.0006) +[2024-08-24 20:29:20,332][03430] Updated weights for policy 0, policy_version 13410 (0.0006) +[2024-08-24 20:29:20,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36503.0). Total num frames: 54943744. Throughput: 0: 9080.7. Samples: 13708980. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:29:20,813][01192] Avg episode reward: [(0, '4.348')] +[2024-08-24 20:29:21,453][03430] Updated weights for policy 0, policy_version 13420 (0.0005) +[2024-08-24 20:29:22,587][03430] Updated weights for policy 0, policy_version 13430 (0.0006) +[2024-08-24 20:29:23,726][03430] Updated weights for policy 0, policy_version 13440 (0.0006) +[2024-08-24 20:29:24,848][03430] Updated weights for policy 0, policy_version 13450 (0.0006) +[2024-08-24 20:29:25,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36386.1, 300 sec: 36516.9). Total num frames: 55123968. Throughput: 0: 9110.1. Samples: 13764374. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:29:25,813][01192] Avg episode reward: [(0, '4.411')] +[2024-08-24 20:29:25,972][03430] Updated weights for policy 0, policy_version 13460 (0.0006) +[2024-08-24 20:29:27,066][03430] Updated weights for policy 0, policy_version 13470 (0.0005) +[2024-08-24 20:29:28,173][03430] Updated weights for policy 0, policy_version 13480 (0.0005) +[2024-08-24 20:29:29,293][03430] Updated weights for policy 0, policy_version 13490 (0.0005) +[2024-08-24 20:29:30,412][03430] Updated weights for policy 0, policy_version 13500 (0.0005) +[2024-08-24 20:29:30,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36386.1, 300 sec: 36516.9). Total num frames: 55308288. Throughput: 0: 9104.5. Samples: 13819358. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:29:30,813][01192] Avg episode reward: [(0, '4.557')] +[2024-08-24 20:29:31,558][03430] Updated weights for policy 0, policy_version 13510 (0.0006) +[2024-08-24 20:29:32,686][03430] Updated weights for policy 0, policy_version 13520 (0.0006) +[2024-08-24 20:29:33,810][03430] Updated weights for policy 0, policy_version 13530 (0.0006) +[2024-08-24 20:29:34,969][03430] Updated weights for policy 0, policy_version 13540 (0.0005) +[2024-08-24 20:29:35,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36386.1, 300 sec: 36516.9). Total num frames: 55488512. Throughput: 0: 9105.6. Samples: 13846466. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:29:35,813][01192] Avg episode reward: [(0, '4.378')] +[2024-08-24 20:29:36,096][03430] Updated weights for policy 0, policy_version 13550 (0.0006) +[2024-08-24 20:29:37,227][03430] Updated weights for policy 0, policy_version 13560 (0.0006) +[2024-08-24 20:29:38,315][03430] Updated weights for policy 0, policy_version 13570 (0.0005) +[2024-08-24 20:29:39,400][03430] Updated weights for policy 0, policy_version 13580 (0.0005) +[2024-08-24 20:29:40,514][03430] Updated weights for policy 0, policy_version 13590 (0.0006) +[2024-08-24 20:29:40,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36530.8). Total num frames: 55672832. Throughput: 0: 9109.5. Samples: 13901300. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:29:40,813][01192] Avg episode reward: [(0, '4.423')] +[2024-08-24 20:29:41,630][03430] Updated weights for policy 0, policy_version 13600 (0.0006) +[2024-08-24 20:29:42,738][03430] Updated weights for policy 0, policy_version 13610 (0.0006) +[2024-08-24 20:29:43,858][03430] Updated weights for policy 0, policy_version 13620 (0.0005) +[2024-08-24 20:29:44,962][03430] Updated weights for policy 0, policy_version 13630 (0.0006) +[2024-08-24 20:29:45,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36454.4, 300 sec: 36530.8). Total num frames: 55857152. Throughput: 0: 9115.1. Samples: 13956668. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:29:45,813][01192] Avg episode reward: [(0, '4.143')] +[2024-08-24 20:29:46,065][03430] Updated weights for policy 0, policy_version 13640 (0.0006) +[2024-08-24 20:29:47,195][03430] Updated weights for policy 0, policy_version 13650 (0.0006) +[2024-08-24 20:29:48,337][03430] Updated weights for policy 0, policy_version 13660 (0.0006) +[2024-08-24 20:29:49,496][03430] Updated weights for policy 0, policy_version 13670 (0.0005) +[2024-08-24 20:29:50,625][03430] Updated weights for policy 0, policy_version 13680 (0.0005) +[2024-08-24 20:29:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36530.8). Total num frames: 56037376. Throughput: 0: 9107.2. Samples: 13983882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:29:50,813][01192] Avg episode reward: [(0, '4.358')] +[2024-08-24 20:29:51,742][03430] Updated weights for policy 0, policy_version 13690 (0.0006) +[2024-08-24 20:29:52,882][03430] Updated weights for policy 0, policy_version 13700 (0.0005) +[2024-08-24 20:29:53,998][03430] Updated weights for policy 0, policy_version 13710 (0.0005) +[2024-08-24 20:29:55,093][03430] Updated weights for policy 0, policy_version 13720 (0.0006) +[2024-08-24 20:29:55,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36454.4, 300 sec: 36544.7). Total num frames: 56221696. Throughput: 0: 9104.1. Samples: 14038396. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:29:55,813][01192] Avg episode reward: [(0, '4.372')] +[2024-08-24 20:29:56,228][03430] Updated weights for policy 0, policy_version 13730 (0.0005) +[2024-08-24 20:29:57,356][03430] Updated weights for policy 0, policy_version 13740 (0.0006) +[2024-08-24 20:29:58,489][03430] Updated weights for policy 0, policy_version 13750 (0.0007) +[2024-08-24 20:29:59,641][03430] Updated weights for policy 0, policy_version 13760 (0.0006) +[2024-08-24 20:30:00,752][03430] Updated weights for policy 0, policy_version 13770 (0.0005) +[2024-08-24 20:30:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36530.8). Total num frames: 56401920. Throughput: 0: 9128.5. Samples: 14092882. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:30:00,813][01192] Avg episode reward: [(0, '4.597')] +[2024-08-24 20:30:01,875][03430] Updated weights for policy 0, policy_version 13780 (0.0006) +[2024-08-24 20:30:02,976][03430] Updated weights for policy 0, policy_version 13790 (0.0006) +[2024-08-24 20:30:04,096][03430] Updated weights for policy 0, policy_version 13800 (0.0005) +[2024-08-24 20:30:05,231][03430] Updated weights for policy 0, policy_version 13810 (0.0005) +[2024-08-24 20:30:05,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36386.1, 300 sec: 36544.7). Total num frames: 56586240. Throughput: 0: 9143.8. Samples: 14120448. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:30:05,813][01192] Avg episode reward: [(0, '4.558')] +[2024-08-24 20:30:06,345][03430] Updated weights for policy 0, policy_version 13820 (0.0005) +[2024-08-24 20:30:07,452][03430] Updated weights for policy 0, policy_version 13830 (0.0007) +[2024-08-24 20:30:08,559][03430] Updated weights for policy 0, policy_version 13840 (0.0006) +[2024-08-24 20:30:09,683][03430] Updated weights for policy 0, policy_version 13850 (0.0005) +[2024-08-24 20:30:10,788][03430] Updated weights for policy 0, policy_version 13860 (0.0006) +[2024-08-24 20:30:10,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36522.7, 300 sec: 36530.8). Total num frames: 56770560. Throughput: 0: 9133.8. Samples: 14175394. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:30:10,813][01192] Avg episode reward: [(0, '4.277')] +[2024-08-24 20:30:11,901][03430] Updated weights for policy 0, policy_version 13870 (0.0006) +[2024-08-24 20:30:13,027][03430] Updated weights for policy 0, policy_version 13880 (0.0006) +[2024-08-24 20:30:14,156][03430] Updated weights for policy 0, policy_version 13890 (0.0005) +[2024-08-24 20:30:15,287][03430] Updated weights for policy 0, policy_version 13900 (0.0006) +[2024-08-24 20:30:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.7, 300 sec: 36516.9). Total num frames: 56950784. Throughput: 0: 9130.4. Samples: 14230224. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:30:15,813][01192] Avg episode reward: [(0, '4.539')] +[2024-08-24 20:30:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000013904_56950784.pth... 
+[2024-08-24 20:30:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000011766_48193536.pth +[2024-08-24 20:30:16,423][03430] Updated weights for policy 0, policy_version 13910 (0.0005) +[2024-08-24 20:30:17,554][03430] Updated weights for policy 0, policy_version 13920 (0.0006) +[2024-08-24 20:30:18,677][03430] Updated weights for policy 0, policy_version 13930 (0.0006) +[2024-08-24 20:30:19,813][03430] Updated weights for policy 0, policy_version 13940 (0.0006) +[2024-08-24 20:30:20,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36454.5, 300 sec: 36489.1). Total num frames: 57131008. Throughput: 0: 9130.2. Samples: 14257324. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:30:20,813][01192] Avg episode reward: [(0, '4.423')] +[2024-08-24 20:30:20,947][03430] Updated weights for policy 0, policy_version 13950 (0.0006) +[2024-08-24 20:30:22,077][03430] Updated weights for policy 0, policy_version 13960 (0.0005) +[2024-08-24 20:30:23,194][03430] Updated weights for policy 0, policy_version 13970 (0.0005) +[2024-08-24 20:30:24,296][03430] Updated weights for policy 0, policy_version 13980 (0.0005) +[2024-08-24 20:30:25,418][03430] Updated weights for policy 0, policy_version 13990 (0.0006) +[2024-08-24 20:30:25,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.7, 300 sec: 36503.0). Total num frames: 57315328. Throughput: 0: 9127.0. Samples: 14312016. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:30:25,812][01192] Avg episode reward: [(0, '4.380')] +[2024-08-24 20:30:26,540][03430] Updated weights for policy 0, policy_version 14000 (0.0006) +[2024-08-24 20:30:27,668][03430] Updated weights for policy 0, policy_version 14010 (0.0005) +[2024-08-24 20:30:28,813][03430] Updated weights for policy 0, policy_version 14020 (0.0006) +[2024-08-24 20:30:29,950][03430] Updated weights for policy 0, policy_version 14030 (0.0006) +[2024-08-24 20:30:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 57495552. Throughput: 0: 9107.0. Samples: 14366484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:30:30,813][01192] Avg episode reward: [(0, '4.696')] +[2024-08-24 20:30:31,073][03430] Updated weights for policy 0, policy_version 14040 (0.0005) +[2024-08-24 20:30:32,202][03430] Updated weights for policy 0, policy_version 14050 (0.0006) +[2024-08-24 20:30:33,325][03430] Updated weights for policy 0, policy_version 14060 (0.0006) +[2024-08-24 20:30:34,413][03430] Updated weights for policy 0, policy_version 14070 (0.0006) +[2024-08-24 20:30:35,536][03430] Updated weights for policy 0, policy_version 14080 (0.0005) +[2024-08-24 20:30:35,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36522.7, 300 sec: 36503.0). Total num frames: 57679872. Throughput: 0: 9105.8. Samples: 14393644. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:30:35,813][01192] Avg episode reward: [(0, '4.396')] +[2024-08-24 20:30:36,687][03430] Updated weights for policy 0, policy_version 14090 (0.0007) +[2024-08-24 20:30:37,825][03430] Updated weights for policy 0, policy_version 14100 (0.0005) +[2024-08-24 20:30:38,983][03430] Updated weights for policy 0, policy_version 14110 (0.0006) +[2024-08-24 20:30:40,105][03430] Updated weights for policy 0, policy_version 14120 (0.0004) +[2024-08-24 20:30:40,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36386.1, 300 sec: 36489.1). 
Total num frames: 57856000. Throughput: 0: 9103.9. Samples: 14448074. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:30:40,813][01192] Avg episode reward: [(0, '4.401')] +[2024-08-24 20:30:41,293][03430] Updated weights for policy 0, policy_version 14130 (0.0005) +[2024-08-24 20:30:42,395][03430] Updated weights for policy 0, policy_version 14140 (0.0005) +[2024-08-24 20:30:43,531][03430] Updated weights for policy 0, policy_version 14150 (0.0005) +[2024-08-24 20:30:44,665][03430] Updated weights for policy 0, policy_version 14160 (0.0006) +[2024-08-24 20:30:45,812][01192] Fps is (10 sec: 35635.3, 60 sec: 36317.9, 300 sec: 36489.1). Total num frames: 58036224. Throughput: 0: 9091.6. Samples: 14502004. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:30:45,812][01192] Avg episode reward: [(0, '4.469')] +[2024-08-24 20:30:45,819][03430] Updated weights for policy 0, policy_version 14170 (0.0006) +[2024-08-24 20:30:46,967][03430] Updated weights for policy 0, policy_version 14180 (0.0006) +[2024-08-24 20:30:48,060][03430] Updated weights for policy 0, policy_version 14190 (0.0006) +[2024-08-24 20:30:49,174][03430] Updated weights for policy 0, policy_version 14200 (0.0006) +[2024-08-24 20:30:50,294][03430] Updated weights for policy 0, policy_version 14210 (0.0005) +[2024-08-24 20:30:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36503.0). Total num frames: 58220544. Throughput: 0: 9086.2. Samples: 14529328. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:30:50,813][01192] Avg episode reward: [(0, '4.341')] +[2024-08-24 20:30:51,440][03430] Updated weights for policy 0, policy_version 14220 (0.0006) +[2024-08-24 20:30:52,568][03430] Updated weights for policy 0, policy_version 14230 (0.0006) +[2024-08-24 20:30:53,715][03430] Updated weights for policy 0, policy_version 14240 (0.0005) +[2024-08-24 20:30:54,835][03430] Updated weights for policy 0, policy_version 14250 (0.0006) +[2024-08-24 20:30:55,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36317.9, 300 sec: 36489.1). Total num frames: 58400768. Throughput: 0: 9073.4. Samples: 14583696. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:30:55,813][01192] Avg episode reward: [(0, '4.315')] +[2024-08-24 20:30:55,965][03430] Updated weights for policy 0, policy_version 14260 (0.0006) +[2024-08-24 20:30:57,131][03430] Updated weights for policy 0, policy_version 14270 (0.0006) +[2024-08-24 20:30:58,247][03430] Updated weights for policy 0, policy_version 14280 (0.0005) +[2024-08-24 20:30:59,377][03430] Updated weights for policy 0, policy_version 14290 (0.0006) +[2024-08-24 20:31:00,510][03430] Updated weights for policy 0, policy_version 14300 (0.0007) +[2024-08-24 20:31:00,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36317.8, 300 sec: 36461.3). Total num frames: 58580992. Throughput: 0: 9059.9. Samples: 14637922. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:00,813][01192] Avg episode reward: [(0, '4.328')] +[2024-08-24 20:31:01,636][03430] Updated weights for policy 0, policy_version 14310 (0.0005) +[2024-08-24 20:31:02,766][03430] Updated weights for policy 0, policy_version 14320 (0.0005) +[2024-08-24 20:31:03,892][03430] Updated weights for policy 0, policy_version 14330 (0.0006) +[2024-08-24 20:31:04,974][03430] Updated weights for policy 0, policy_version 14340 (0.0005) +[2024-08-24 20:31:05,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36475.2). 
Total num frames: 58765312. Throughput: 0: 9062.2. Samples: 14665124. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:05,813][01192] Avg episode reward: [(0, '4.436')] +[2024-08-24 20:31:06,069][03430] Updated weights for policy 0, policy_version 14350 (0.0005) +[2024-08-24 20:31:07,171][03430] Updated weights for policy 0, policy_version 14360 (0.0006) +[2024-08-24 20:31:08,295][03430] Updated weights for policy 0, policy_version 14370 (0.0006) +[2024-08-24 20:31:09,388][03430] Updated weights for policy 0, policy_version 14380 (0.0006) +[2024-08-24 20:31:10,504][03430] Updated weights for policy 0, policy_version 14390 (0.0006) +[2024-08-24 20:31:10,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36317.9, 300 sec: 36475.2). Total num frames: 58949632. Throughput: 0: 9084.0. Samples: 14720798. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:31:10,813][01192] Avg episode reward: [(0, '4.457')] +[2024-08-24 20:31:11,620][03430] Updated weights for policy 0, policy_version 14400 (0.0005) +[2024-08-24 20:31:12,713][03430] Updated weights for policy 0, policy_version 14410 (0.0005) +[2024-08-24 20:31:13,829][03430] Updated weights for policy 0, policy_version 14420 (0.0005) +[2024-08-24 20:31:14,986][03430] Updated weights for policy 0, policy_version 14430 (0.0008) +[2024-08-24 20:31:15,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36317.8, 300 sec: 36461.3). Total num frames: 59129856. Throughput: 0: 9096.0. Samples: 14775806. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:31:15,813][01192] Avg episode reward: [(0, '4.310')] +[2024-08-24 20:31:16,170][03430] Updated weights for policy 0, policy_version 14440 (0.0007) +[2024-08-24 20:31:17,290][03430] Updated weights for policy 0, policy_version 14450 (0.0006) +[2024-08-24 20:31:18,416][03430] Updated weights for policy 0, policy_version 14460 (0.0005) +[2024-08-24 20:31:19,517][03430] Updated weights for policy 0, policy_version 14470 (0.0006) +[2024-08-24 20:31:20,633][03430] Updated weights for policy 0, policy_version 14480 (0.0006) +[2024-08-24 20:31:20,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36386.1, 300 sec: 36461.3). Total num frames: 59314176. Throughput: 0: 9088.8. Samples: 14802642. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:20,813][01192] Avg episode reward: [(0, '4.347')] +[2024-08-24 20:31:21,721][03430] Updated weights for policy 0, policy_version 14490 (0.0005) +[2024-08-24 20:31:22,792][03430] Updated weights for policy 0, policy_version 14500 (0.0005) +[2024-08-24 20:31:23,889][03430] Updated weights for policy 0, policy_version 14510 (0.0005) +[2024-08-24 20:31:24,984][03430] Updated weights for policy 0, policy_version 14520 (0.0006) +[2024-08-24 20:31:25,812][01192] Fps is (10 sec: 37274.0, 60 sec: 36454.4, 300 sec: 36475.2). Total num frames: 59502592. Throughput: 0: 9119.7. Samples: 14858458. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:25,813][01192] Avg episode reward: [(0, '4.555')] +[2024-08-24 20:31:26,098][03430] Updated weights for policy 0, policy_version 14530 (0.0005) +[2024-08-24 20:31:27,184][03430] Updated weights for policy 0, policy_version 14540 (0.0005) +[2024-08-24 20:31:28,276][03430] Updated weights for policy 0, policy_version 14550 (0.0005) +[2024-08-24 20:31:29,384][03430] Updated weights for policy 0, policy_version 14560 (0.0005) +[2024-08-24 20:31:30,468][03430] Updated weights for policy 0, policy_version 14570 (0.0006) +[2024-08-24 20:31:30,812][01192] Fps is (10 sec: 37683.5, 60 sec: 36590.9, 300 sec: 36489.1). Total num frames: 59691008. Throughput: 0: 9164.8. Samples: 14914418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:31:30,813][01192] Avg episode reward: [(0, '4.396')] +[2024-08-24 20:31:31,559][03430] Updated weights for policy 0, policy_version 14580 (0.0005) +[2024-08-24 20:31:32,648][03430] Updated weights for policy 0, policy_version 14590 (0.0006) +[2024-08-24 20:31:33,755][03430] Updated weights for policy 0, policy_version 14600 (0.0006) +[2024-08-24 20:31:34,892][03430] Updated weights for policy 0, policy_version 14610 (0.0006) +[2024-08-24 20:31:35,812][01192] Fps is (10 sec: 37273.4, 60 sec: 36590.9, 300 sec: 36489.1). Total num frames: 59875328. Throughput: 0: 9185.2. Samples: 14942660. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:31:35,813][01192] Avg episode reward: [(0, '4.299')] +[2024-08-24 20:31:36,023][03430] Updated weights for policy 0, policy_version 14620 (0.0005) +[2024-08-24 20:31:37,138][03430] Updated weights for policy 0, policy_version 14630 (0.0006) +[2024-08-24 20:31:38,265][03430] Updated weights for policy 0, policy_version 14640 (0.0006) +[2024-08-24 20:31:39,380][03430] Updated weights for policy 0, policy_version 14650 (0.0005) +[2024-08-24 20:31:40,513][03430] Updated weights for policy 0, policy_version 14660 (0.0006) +[2024-08-24 20:31:40,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36659.2, 300 sec: 36489.1). Total num frames: 60055552. Throughput: 0: 9192.7. Samples: 14997366. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:31:40,813][01192] Avg episode reward: [(0, '4.395')] +[2024-08-24 20:31:41,631][03430] Updated weights for policy 0, policy_version 14670 (0.0006) +[2024-08-24 20:31:42,736][03430] Updated weights for policy 0, policy_version 14680 (0.0006) +[2024-08-24 20:31:43,866][03430] Updated weights for policy 0, policy_version 14690 (0.0006) +[2024-08-24 20:31:45,012][03430] Updated weights for policy 0, policy_version 14700 (0.0007) +[2024-08-24 20:31:45,812][01192] Fps is (10 sec: 36454.6, 60 sec: 36727.5, 300 sec: 36489.1). Total num frames: 60239872. Throughput: 0: 9198.3. Samples: 15051846. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:31:45,813][01192] Avg episode reward: [(0, '4.616')] +[2024-08-24 20:31:46,127][03430] Updated weights for policy 0, policy_version 14710 (0.0005) +[2024-08-24 20:31:47,259][03430] Updated weights for policy 0, policy_version 14720 (0.0005) +[2024-08-24 20:31:48,323][03430] Updated weights for policy 0, policy_version 14730 (0.0006) +[2024-08-24 20:31:49,427][03430] Updated weights for policy 0, policy_version 14740 (0.0006) +[2024-08-24 20:31:50,562][03430] Updated weights for policy 0, policy_version 14750 (0.0005) +[2024-08-24 20:31:50,812][01192] Fps is (10 sec: 36864.1, 60 sec: 36727.5, 300 sec: 36489.1). Total num frames: 60424192. Throughput: 0: 9207.5. Samples: 15079460. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:50,813][01192] Avg episode reward: [(0, '4.341')] +[2024-08-24 20:31:51,685][03430] Updated weights for policy 0, policy_version 14760 (0.0006) +[2024-08-24 20:31:52,786][03430] Updated weights for policy 0, policy_version 14770 (0.0005) +[2024-08-24 20:31:53,909][03430] Updated weights for policy 0, policy_version 14780 (0.0006) +[2024-08-24 20:31:55,031][03430] Updated weights for policy 0, policy_version 14790 (0.0005) +[2024-08-24 20:31:55,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36727.4, 300 sec: 36489.1). Total num frames: 60604416. Throughput: 0: 9199.6. Samples: 15134782. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:31:55,813][01192] Avg episode reward: [(0, '4.415')] +[2024-08-24 20:31:56,160][03430] Updated weights for policy 0, policy_version 14800 (0.0006) +[2024-08-24 20:31:57,282][03430] Updated weights for policy 0, policy_version 14810 (0.0007) +[2024-08-24 20:31:58,395][03430] Updated weights for policy 0, policy_version 14820 (0.0005) +[2024-08-24 20:31:59,524][03430] Updated weights for policy 0, policy_version 14830 (0.0006) +[2024-08-24 20:32:00,631][03430] Updated weights for policy 0, policy_version 14840 (0.0005) +[2024-08-24 20:32:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36795.8, 300 sec: 36489.1). Total num frames: 60788736. Throughput: 0: 9197.2. Samples: 15189678. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:32:00,813][01192] Avg episode reward: [(0, '4.647')] +[2024-08-24 20:32:01,750][03430] Updated weights for policy 0, policy_version 14850 (0.0006) +[2024-08-24 20:32:02,869][03430] Updated weights for policy 0, policy_version 14860 (0.0006) +[2024-08-24 20:32:03,988][03430] Updated weights for policy 0, policy_version 14870 (0.0006) +[2024-08-24 20:32:05,094][03430] Updated weights for policy 0, policy_version 14880 (0.0006) +[2024-08-24 20:32:05,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36795.7, 300 sec: 36503.0). Total num frames: 60973056. Throughput: 0: 9210.3. Samples: 15217104. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:32:05,813][01192] Avg episode reward: [(0, '4.289')] +[2024-08-24 20:32:06,222][03430] Updated weights for policy 0, policy_version 14890 (0.0006) +[2024-08-24 20:32:07,348][03430] Updated weights for policy 0, policy_version 14900 (0.0005) +[2024-08-24 20:32:08,480][03430] Updated weights for policy 0, policy_version 14910 (0.0006) +[2024-08-24 20:32:09,621][03430] Updated weights for policy 0, policy_version 14920 (0.0006) +[2024-08-24 20:32:10,756][03430] Updated weights for policy 0, policy_version 14930 (0.0006) +[2024-08-24 20:32:10,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36727.5, 300 sec: 36503.0). Total num frames: 61153280. Throughput: 0: 9186.0. Samples: 15271830. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:32:10,813][01192] Avg episode reward: [(0, '4.238')] +[2024-08-24 20:32:11,886][03430] Updated weights for policy 0, policy_version 14940 (0.0006) +[2024-08-24 20:32:13,014][03430] Updated weights for policy 0, policy_version 14950 (0.0005) +[2024-08-24 20:32:14,171][03430] Updated weights for policy 0, policy_version 14960 (0.0006) +[2024-08-24 20:32:15,299][03430] Updated weights for policy 0, policy_version 14970 (0.0005) +[2024-08-24 20:32:15,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36727.5, 300 sec: 36503.0). Total num frames: 61333504. Throughput: 0: 9144.4. Samples: 15325916. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:32:15,812][01192] Avg episode reward: [(0, '4.293')] +[2024-08-24 20:32:15,821][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000014974_61333504.pth... 
+[2024-08-24 20:32:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000012836_52576256.pth +[2024-08-24 20:32:16,442][03430] Updated weights for policy 0, policy_version 14980 (0.0006) +[2024-08-24 20:32:17,544][03430] Updated weights for policy 0, policy_version 14990 (0.0005) +[2024-08-24 20:32:18,659][03430] Updated weights for policy 0, policy_version 15000 (0.0005) +[2024-08-24 20:32:19,779][03430] Updated weights for policy 0, policy_version 15010 (0.0005) +[2024-08-24 20:32:20,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36727.5, 300 sec: 36489.1). Total num frames: 61517824. Throughput: 0: 9122.6. Samples: 15353178. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:32:20,813][01192] Avg episode reward: [(0, '4.406')] +[2024-08-24 20:32:20,911][03430] Updated weights for policy 0, policy_version 15020 (0.0005) +[2024-08-24 20:32:22,004][03430] Updated weights for policy 0, policy_version 15030 (0.0006) +[2024-08-24 20:32:23,142][03430] Updated weights for policy 0, policy_version 15040 (0.0006) +[2024-08-24 20:32:24,255][03430] Updated weights for policy 0, policy_version 15050 (0.0006) +[2024-08-24 20:32:25,388][03430] Updated weights for policy 0, policy_version 15060 (0.0005) +[2024-08-24 20:32:25,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36590.9, 300 sec: 36503.0). Total num frames: 61698048. Throughput: 0: 9124.4. Samples: 15407964. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:32:25,812][01192] Avg episode reward: [(0, '4.335')] +[2024-08-24 20:32:26,510][03430] Updated weights for policy 0, policy_version 15070 (0.0006) +[2024-08-24 20:32:27,632][03430] Updated weights for policy 0, policy_version 15080 (0.0006) +[2024-08-24 20:32:28,756][03430] Updated weights for policy 0, policy_version 15090 (0.0006) +[2024-08-24 20:32:29,874][03430] Updated weights for policy 0, policy_version 15100 (0.0005) +[2024-08-24 20:32:30,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36522.7, 300 sec: 36503.0). Total num frames: 61882368. Throughput: 0: 9134.0. Samples: 15462876. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:32:30,813][01192] Avg episode reward: [(0, '4.554')] +[2024-08-24 20:32:30,997][03430] Updated weights for policy 0, policy_version 15110 (0.0007) +[2024-08-24 20:32:32,103][03430] Updated weights for policy 0, policy_version 15120 (0.0005) +[2024-08-24 20:32:33,232][03430] Updated weights for policy 0, policy_version 15130 (0.0006) +[2024-08-24 20:32:34,352][03430] Updated weights for policy 0, policy_version 15140 (0.0005) +[2024-08-24 20:32:35,462][03430] Updated weights for policy 0, policy_version 15150 (0.0005) +[2024-08-24 20:32:35,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36522.7, 300 sec: 36503.0). Total num frames: 62066688. Throughput: 0: 9131.8. Samples: 15490390. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:32:35,819][01192] Avg episode reward: [(0, '4.368')] +[2024-08-24 20:32:36,613][03430] Updated weights for policy 0, policy_version 15160 (0.0005) +[2024-08-24 20:32:37,706][03430] Updated weights for policy 0, policy_version 15170 (0.0006) +[2024-08-24 20:32:38,830][03430] Updated weights for policy 0, policy_version 15180 (0.0006) +[2024-08-24 20:32:39,957][03430] Updated weights for policy 0, policy_version 15190 (0.0006) +[2024-08-24 20:32:40,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36522.6, 300 sec: 36489.1). 
Total num frames: 62246912. Throughput: 0: 9119.2. Samples: 15545144. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:32:40,813][01192] Avg episode reward: [(0, '4.592')] +[2024-08-24 20:32:41,090][03430] Updated weights for policy 0, policy_version 15200 (0.0005) +[2024-08-24 20:32:42,215][03430] Updated weights for policy 0, policy_version 15210 (0.0005) +[2024-08-24 20:32:43,369][03430] Updated weights for policy 0, policy_version 15220 (0.0006) +[2024-08-24 20:32:44,474][03430] Updated weights for policy 0, policy_version 15230 (0.0006) +[2024-08-24 20:32:45,599][03430] Updated weights for policy 0, policy_version 15240 (0.0007) +[2024-08-24 20:32:45,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 62427136. Throughput: 0: 9105.3. Samples: 15599418. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:32:45,813][01192] Avg episode reward: [(0, '4.325')] +[2024-08-24 20:32:46,736][03430] Updated weights for policy 0, policy_version 15250 (0.0006) +[2024-08-24 20:32:47,870][03430] Updated weights for policy 0, policy_version 15260 (0.0006) +[2024-08-24 20:32:48,988][03430] Updated weights for policy 0, policy_version 15270 (0.0006) +[2024-08-24 20:32:50,113][03430] Updated weights for policy 0, policy_version 15280 (0.0005) +[2024-08-24 20:32:50,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 62611456. Throughput: 0: 9099.8. Samples: 15626596. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:32:50,813][01192] Avg episode reward: [(0, '4.369')] +[2024-08-24 20:32:51,241][03430] Updated weights for policy 0, policy_version 15290 (0.0005) +[2024-08-24 20:32:52,367][03430] Updated weights for policy 0, policy_version 15300 (0.0005) +[2024-08-24 20:32:53,501][03430] Updated weights for policy 0, policy_version 15310 (0.0005) +[2024-08-24 20:32:54,653][03430] Updated weights for policy 0, policy_version 15320 (0.0006) +[2024-08-24 20:32:55,772][03430] Updated weights for policy 0, policy_version 15330 (0.0006) +[2024-08-24 20:32:55,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 62791680. Throughput: 0: 9095.0. Samples: 15681104. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:32:55,813][01192] Avg episode reward: [(0, '4.473')] +[2024-08-24 20:32:56,883][03430] Updated weights for policy 0, policy_version 15340 (0.0005) +[2024-08-24 20:32:57,994][03430] Updated weights for policy 0, policy_version 15350 (0.0006) +[2024-08-24 20:32:59,129][03430] Updated weights for policy 0, policy_version 15360 (0.0006) +[2024-08-24 20:33:00,245][03430] Updated weights for policy 0, policy_version 15370 (0.0006) +[2024-08-24 20:33:00,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 62976000. Throughput: 0: 9106.8. Samples: 15735720. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:33:00,813][01192] Avg episode reward: [(0, '4.410')] +[2024-08-24 20:33:01,358][03430] Updated weights for policy 0, policy_version 15380 (0.0005) +[2024-08-24 20:33:02,493][03430] Updated weights for policy 0, policy_version 15390 (0.0006) +[2024-08-24 20:33:03,611][03430] Updated weights for policy 0, policy_version 15400 (0.0006) +[2024-08-24 20:33:04,765][03430] Updated weights for policy 0, policy_version 15410 (0.0006) +[2024-08-24 20:33:05,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36386.2, 300 sec: 36489.1). 
Total num frames: 63156224. Throughput: 0: 9109.9. Samples: 15763122. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:33:05,813][01192] Avg episode reward: [(0, '4.458')] +[2024-08-24 20:33:05,885][03430] Updated weights for policy 0, policy_version 15420 (0.0006) +[2024-08-24 20:33:07,023][03430] Updated weights for policy 0, policy_version 15430 (0.0007) +[2024-08-24 20:33:08,149][03430] Updated weights for policy 0, policy_version 15440 (0.0005) +[2024-08-24 20:33:09,290][03430] Updated weights for policy 0, policy_version 15450 (0.0006) +[2024-08-24 20:33:10,420][03430] Updated weights for policy 0, policy_version 15460 (0.0005) +[2024-08-24 20:33:10,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36386.1, 300 sec: 36489.1). Total num frames: 63336448. Throughput: 0: 9100.8. Samples: 15817500. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:10,813][01192] Avg episode reward: [(0, '4.402')] +[2024-08-24 20:33:11,551][03430] Updated weights for policy 0, policy_version 15470 (0.0005) +[2024-08-24 20:33:12,662][03430] Updated weights for policy 0, policy_version 15480 (0.0006) +[2024-08-24 20:33:13,766][03430] Updated weights for policy 0, policy_version 15490 (0.0005) +[2024-08-24 20:33:14,890][03430] Updated weights for policy 0, policy_version 15500 (0.0005) +[2024-08-24 20:33:15,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36454.4, 300 sec: 36489.1). Total num frames: 63520768. Throughput: 0: 9099.5. Samples: 15872352. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:15,813][01192] Avg episode reward: [(0, '4.223')] +[2024-08-24 20:33:16,021][03430] Updated weights for policy 0, policy_version 15510 (0.0005) +[2024-08-24 20:33:17,209][03430] Updated weights for policy 0, policy_version 15520 (0.0005) +[2024-08-24 20:33:18,367][03430] Updated weights for policy 0, policy_version 15530 (0.0006) +[2024-08-24 20:33:19,493][03430] Updated weights for policy 0, policy_version 15540 (0.0006) +[2024-08-24 20:33:20,603][03430] Updated weights for policy 0, policy_version 15550 (0.0007) +[2024-08-24 20:33:20,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36317.8, 300 sec: 36461.3). Total num frames: 63696896. Throughput: 0: 9076.1. Samples: 15898814. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:33:20,813][01192] Avg episode reward: [(0, '4.393')] +[2024-08-24 20:33:21,722][03430] Updated weights for policy 0, policy_version 15560 (0.0007) +[2024-08-24 20:33:22,839][03430] Updated weights for policy 0, policy_version 15570 (0.0005) +[2024-08-24 20:33:23,933][03430] Updated weights for policy 0, policy_version 15580 (0.0006) +[2024-08-24 20:33:25,055][03430] Updated weights for policy 0, policy_version 15590 (0.0006) +[2024-08-24 20:33:25,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36386.1, 300 sec: 36461.3). Total num frames: 63881216. Throughput: 0: 9078.2. Samples: 15953662. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:33:25,813][01192] Avg episode reward: [(0, '4.566')] +[2024-08-24 20:33:26,190][03430] Updated weights for policy 0, policy_version 15600 (0.0005) +[2024-08-24 20:33:27,306][03430] Updated weights for policy 0, policy_version 15610 (0.0005) +[2024-08-24 20:33:28,474][03430] Updated weights for policy 0, policy_version 15620 (0.0006) +[2024-08-24 20:33:29,590][03430] Updated weights for policy 0, policy_version 15630 (0.0005) +[2024-08-24 20:33:30,703][03430] Updated weights for policy 0, policy_version 15640 (0.0005) +[2024-08-24 20:33:30,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36386.1, 300 sec: 36475.2). Total num frames: 64065536. Throughput: 0: 9081.7. Samples: 16008094. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:30,813][01192] Avg episode reward: [(0, '4.278')] +[2024-08-24 20:33:31,813][03430] Updated weights for policy 0, policy_version 15650 (0.0005) +[2024-08-24 20:33:32,923][03430] Updated weights for policy 0, policy_version 15660 (0.0005) +[2024-08-24 20:33:34,038][03430] Updated weights for policy 0, policy_version 15670 (0.0005) +[2024-08-24 20:33:35,176][03430] Updated weights for policy 0, policy_version 15680 (0.0006) +[2024-08-24 20:33:35,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36317.8, 300 sec: 36475.2). Total num frames: 64245760. Throughput: 0: 9088.4. Samples: 16035574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:35,813][01192] Avg episode reward: [(0, '4.414')] +[2024-08-24 20:33:36,302][03430] Updated weights for policy 0, policy_version 15690 (0.0006) +[2024-08-24 20:33:37,411][03430] Updated weights for policy 0, policy_version 15700 (0.0005) +[2024-08-24 20:33:38,539][03430] Updated weights for policy 0, policy_version 15710 (0.0005) +[2024-08-24 20:33:39,710][03430] Updated weights for policy 0, policy_version 15720 (0.0006) +[2024-08-24 20:33:40,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36317.9, 300 sec: 36461.3). 
Total num frames: 64425984. Throughput: 0: 9092.1. Samples: 16090246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:40,812][01192] Avg episode reward: [(0, '4.242')] +[2024-08-24 20:33:40,825][03430] Updated weights for policy 0, policy_version 15730 (0.0007) +[2024-08-24 20:33:41,946][03430] Updated weights for policy 0, policy_version 15740 (0.0006) +[2024-08-24 20:33:43,085][03430] Updated weights for policy 0, policy_version 15750 (0.0006) +[2024-08-24 20:33:44,202][03430] Updated weights for policy 0, policy_version 15760 (0.0006) +[2024-08-24 20:33:45,332][03430] Updated weights for policy 0, policy_version 15770 (0.0005) +[2024-08-24 20:33:45,812][01192] Fps is (10 sec: 36454.6, 60 sec: 36386.2, 300 sec: 36475.2). Total num frames: 64610304. Throughput: 0: 9087.7. Samples: 16144668. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:45,813][01192] Avg episode reward: [(0, '4.165')] +[2024-08-24 20:33:46,493][03430] Updated weights for policy 0, policy_version 15780 (0.0005) +[2024-08-24 20:33:47,591][03430] Updated weights for policy 0, policy_version 15790 (0.0007) +[2024-08-24 20:33:48,690][03430] Updated weights for policy 0, policy_version 15800 (0.0006) +[2024-08-24 20:33:49,813][03430] Updated weights for policy 0, policy_version 15810 (0.0006) +[2024-08-24 20:33:50,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36386.1, 300 sec: 36475.2). Total num frames: 64794624. Throughput: 0: 9081.5. Samples: 16171790. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:50,813][01192] Avg episode reward: [(0, '4.518')] +[2024-08-24 20:33:50,919][03430] Updated weights for policy 0, policy_version 15820 (0.0007) +[2024-08-24 20:33:52,045][03430] Updated weights for policy 0, policy_version 15830 (0.0005) +[2024-08-24 20:33:53,163][03430] Updated weights for policy 0, policy_version 15840 (0.0006) +[2024-08-24 20:33:54,314][03430] Updated weights for policy 0, policy_version 15850 (0.0005) +[2024-08-24 20:33:55,451][03430] Updated weights for policy 0, policy_version 15860 (0.0006) +[2024-08-24 20:33:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.2, 300 sec: 36461.3). Total num frames: 64974848. Throughput: 0: 9098.5. Samples: 16226932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:33:55,813][01192] Avg episode reward: [(0, '4.403')] +[2024-08-24 20:33:56,567][03430] Updated weights for policy 0, policy_version 15870 (0.0005) +[2024-08-24 20:33:57,711][03430] Updated weights for policy 0, policy_version 15880 (0.0006) +[2024-08-24 20:33:58,853][03430] Updated weights for policy 0, policy_version 15890 (0.0006) +[2024-08-24 20:33:59,970][03430] Updated weights for policy 0, policy_version 15900 (0.0005) +[2024-08-24 20:34:00,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36317.9, 300 sec: 36447.5). Total num frames: 65155072. Throughput: 0: 9078.4. Samples: 16280878. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:34:00,813][01192] Avg episode reward: [(0, '4.596')] +[2024-08-24 20:34:01,099][03430] Updated weights for policy 0, policy_version 15910 (0.0006) +[2024-08-24 20:34:02,239][03430] Updated weights for policy 0, policy_version 15920 (0.0006) +[2024-08-24 20:34:03,353][03430] Updated weights for policy 0, policy_version 15930 (0.0005) +[2024-08-24 20:34:04,470][03430] Updated weights for policy 0, policy_version 15940 (0.0006) +[2024-08-24 20:34:05,635][03430] Updated weights for policy 0, policy_version 15950 (0.0006) +[2024-08-24 20:34:05,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36317.9, 300 sec: 36461.4). Total num frames: 65335296. Throughput: 0: 9098.3. Samples: 16308238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:34:05,812][01192] Avg episode reward: [(0, '4.531')] +[2024-08-24 20:34:06,779][03430] Updated weights for policy 0, policy_version 15960 (0.0006) +[2024-08-24 20:34:07,908][03430] Updated weights for policy 0, policy_version 15970 (0.0006) +[2024-08-24 20:34:09,060][03430] Updated weights for policy 0, policy_version 15980 (0.0006) +[2024-08-24 20:34:10,212][03430] Updated weights for policy 0, policy_version 15990 (0.0006) +[2024-08-24 20:34:10,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36249.6, 300 sec: 36447.5). Total num frames: 65511424. Throughput: 0: 9075.9. Samples: 16362078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2024-08-24 20:34:10,812][01192] Avg episode reward: [(0, '4.453')] +[2024-08-24 20:34:11,496][03430] Updated weights for policy 0, policy_version 16000 (0.0006) +[2024-08-24 20:34:12,740][03430] Updated weights for policy 0, policy_version 16010 (0.0006) +[2024-08-24 20:34:13,942][03430] Updated weights for policy 0, policy_version 16020 (0.0006) +[2024-08-24 20:34:15,134][03430] Updated weights for policy 0, policy_version 16030 (0.0006) +[2024-08-24 20:34:15,812][01192] Fps is (10 sec: 34406.4, 60 sec: 35976.6, 300 sec: 36391.9). 
Total num frames: 65679360. Throughput: 0: 8995.0. Samples: 16412868. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:34:15,813][01192] Avg episode reward: [(0, '4.368')] +[2024-08-24 20:34:15,823][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000016036_65683456.pth... +[2024-08-24 20:34:15,853][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000013904_56950784.pth +[2024-08-24 20:34:16,284][03430] Updated weights for policy 0, policy_version 16040 (0.0005) +[2024-08-24 20:34:17,426][03430] Updated weights for policy 0, policy_version 16050 (0.0005) +[2024-08-24 20:34:18,544][03430] Updated weights for policy 0, policy_version 16060 (0.0006) +[2024-08-24 20:34:19,699][03430] Updated weights for policy 0, policy_version 16070 (0.0006) +[2024-08-24 20:34:20,812][01192] Fps is (10 sec: 34816.0, 60 sec: 36044.9, 300 sec: 36391.9). Total num frames: 65859584. Throughput: 0: 8979.3. Samples: 16439644. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:34:20,813][01192] Avg episode reward: [(0, '4.486')] +[2024-08-24 20:34:20,831][03430] Updated weights for policy 0, policy_version 16080 (0.0005) +[2024-08-24 20:34:21,987][03430] Updated weights for policy 0, policy_version 16090 (0.0005) +[2024-08-24 20:34:23,129][03430] Updated weights for policy 0, policy_version 16100 (0.0005) +[2024-08-24 20:34:24,253][03430] Updated weights for policy 0, policy_version 16110 (0.0005) +[2024-08-24 20:34:25,434][03430] Updated weights for policy 0, policy_version 16120 (0.0006) +[2024-08-24 20:34:25,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35976.5, 300 sec: 36378.0). Total num frames: 66039808. Throughput: 0: 8965.5. Samples: 16493692. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:34:25,813][01192] Avg episode reward: [(0, '4.364')] +[2024-08-24 20:34:26,571][03430] Updated weights for policy 0, policy_version 16130 (0.0006) +[2024-08-24 20:34:27,717][03430] Updated weights for policy 0, policy_version 16140 (0.0006) +[2024-08-24 20:34:28,818][03430] Updated weights for policy 0, policy_version 16150 (0.0006) +[2024-08-24 20:34:29,944][03430] Updated weights for policy 0, policy_version 16160 (0.0005) +[2024-08-24 20:34:30,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35908.3, 300 sec: 36378.0). Total num frames: 66220032. Throughput: 0: 8954.4. Samples: 16547616. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:34:30,813][01192] Avg episode reward: [(0, '4.249')] +[2024-08-24 20:34:31,091][03430] Updated weights for policy 0, policy_version 16170 (0.0006) +[2024-08-24 20:34:32,190][03430] Updated weights for policy 0, policy_version 16180 (0.0005) +[2024-08-24 20:34:33,340][03430] Updated weights for policy 0, policy_version 16190 (0.0006) +[2024-08-24 20:34:34,455][03430] Updated weights for policy 0, policy_version 16200 (0.0005) +[2024-08-24 20:34:35,600][03430] Updated weights for policy 0, policy_version 16210 (0.0006) +[2024-08-24 20:34:35,812][01192] Fps is (10 sec: 36044.9, 60 sec: 35908.3, 300 sec: 36364.1). Total num frames: 66400256. Throughput: 0: 8956.3. Samples: 16574824. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:34:35,812][01192] Avg episode reward: [(0, '4.358')] +[2024-08-24 20:34:36,744][03430] Updated weights for policy 0, policy_version 16220 (0.0007) +[2024-08-24 20:34:37,866][03430] Updated weights for policy 0, policy_version 16230 (0.0005) +[2024-08-24 20:34:39,000][03430] Updated weights for policy 0, policy_version 16240 (0.0006) +[2024-08-24 20:34:40,111][03430] Updated weights for policy 0, policy_version 16250 (0.0006) +[2024-08-24 20:34:40,812][01192] Fps is (10 sec: 36454.4, 60 sec: 35976.5, 300 sec: 36364.2). 
Total num frames: 66584576. Throughput: 0: 8932.4. Samples: 16628888. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:34:40,813][01192] Avg episode reward: [(0, '4.395')] +[2024-08-24 20:34:41,246][03430] Updated weights for policy 0, policy_version 16260 (0.0005) +[2024-08-24 20:34:42,382][03430] Updated weights for policy 0, policy_version 16270 (0.0006) +[2024-08-24 20:34:43,510][03430] Updated weights for policy 0, policy_version 16280 (0.0005) +[2024-08-24 20:34:44,639][03430] Updated weights for policy 0, policy_version 16290 (0.0005) +[2024-08-24 20:34:45,754][03430] Updated weights for policy 0, policy_version 16300 (0.0006) +[2024-08-24 20:34:45,812][01192] Fps is (10 sec: 36454.5, 60 sec: 35908.3, 300 sec: 36364.2). Total num frames: 66764800. Throughput: 0: 8947.7. Samples: 16683524. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:34:45,812][01192] Avg episode reward: [(0, '4.338')] +[2024-08-24 20:34:46,882][03430] Updated weights for policy 0, policy_version 16310 (0.0006) +[2024-08-24 20:34:48,023][03430] Updated weights for policy 0, policy_version 16320 (0.0006) +[2024-08-24 20:34:49,163][03430] Updated weights for policy 0, policy_version 16330 (0.0006) +[2024-08-24 20:34:50,313][03430] Updated weights for policy 0, policy_version 16340 (0.0006) +[2024-08-24 20:34:50,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35840.0, 300 sec: 36350.3). Total num frames: 66945024. Throughput: 0: 8941.7. Samples: 16710616. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:34:50,812][01192] Avg episode reward: [(0, '4.441')] +[2024-08-24 20:34:51,480][03430] Updated weights for policy 0, policy_version 16350 (0.0005) +[2024-08-24 20:34:52,651][03430] Updated weights for policy 0, policy_version 16360 (0.0006) +[2024-08-24 20:34:53,790][03430] Updated weights for policy 0, policy_version 16370 (0.0006) +[2024-08-24 20:34:54,964][03430] Updated weights for policy 0, policy_version 16380 (0.0006) +[2024-08-24 20:34:55,812][01192] Fps is (10 sec: 35634.9, 60 sec: 35771.7, 300 sec: 36336.4). Total num frames: 67121152. Throughput: 0: 8930.2. Samples: 16763938. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:34:55,813][01192] Avg episode reward: [(0, '4.373')] +[2024-08-24 20:34:56,076][03430] Updated weights for policy 0, policy_version 16390 (0.0005) +[2024-08-24 20:34:57,215][03430] Updated weights for policy 0, policy_version 16400 (0.0006) +[2024-08-24 20:34:58,321][03430] Updated weights for policy 0, policy_version 16410 (0.0006) +[2024-08-24 20:34:59,428][03430] Updated weights for policy 0, policy_version 16420 (0.0005) +[2024-08-24 20:35:00,540][03430] Updated weights for policy 0, policy_version 16430 (0.0006) +[2024-08-24 20:35:00,812][01192] Fps is (10 sec: 36044.5, 60 sec: 35840.0, 300 sec: 36336.4). Total num frames: 67305472. Throughput: 0: 9013.0. Samples: 16818454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:35:00,813][01192] Avg episode reward: [(0, '4.575')] +[2024-08-24 20:35:01,656][03430] Updated weights for policy 0, policy_version 16440 (0.0006) +[2024-08-24 20:35:02,772][03430] Updated weights for policy 0, policy_version 16450 (0.0006) +[2024-08-24 20:35:03,920][03430] Updated weights for policy 0, policy_version 16460 (0.0006) +[2024-08-24 20:35:05,065][03430] Updated weights for policy 0, policy_version 16470 (0.0006) +[2024-08-24 20:35:05,812][01192] Fps is (10 sec: 36454.7, 60 sec: 35840.0, 300 sec: 36322.5). 
Total num frames: 67485696. Throughput: 0: 9025.8. Samples: 16845806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:35:05,813][01192] Avg episode reward: [(0, '4.477')] +[2024-08-24 20:35:06,225][03430] Updated weights for policy 0, policy_version 16480 (0.0006) +[2024-08-24 20:35:07,350][03430] Updated weights for policy 0, policy_version 16490 (0.0006) +[2024-08-24 20:35:08,522][03430] Updated weights for policy 0, policy_version 16500 (0.0006) +[2024-08-24 20:35:09,643][03430] Updated weights for policy 0, policy_version 16510 (0.0006) +[2024-08-24 20:35:10,788][03430] Updated weights for policy 0, policy_version 16520 (0.0006) +[2024-08-24 20:35:10,812][01192] Fps is (10 sec: 36045.0, 60 sec: 35908.3, 300 sec: 36322.5). Total num frames: 67665920. Throughput: 0: 9023.4. Samples: 16899746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:35:10,813][01192] Avg episode reward: [(0, '4.567')] +[2024-08-24 20:35:11,920][03430] Updated weights for policy 0, policy_version 16530 (0.0004) +[2024-08-24 20:35:13,033][03430] Updated weights for policy 0, policy_version 16540 (0.0006) +[2024-08-24 20:35:14,140][03430] Updated weights for policy 0, policy_version 16550 (0.0006) +[2024-08-24 20:35:15,273][03430] Updated weights for policy 0, policy_version 16560 (0.0005) +[2024-08-24 20:35:15,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36322.5). Total num frames: 67846144. Throughput: 0: 9034.6. Samples: 16954174. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:35:15,813][01192] Avg episode reward: [(0, '4.354')] +[2024-08-24 20:35:16,396][03430] Updated weights for policy 0, policy_version 16570 (0.0006) +[2024-08-24 20:35:17,526][03430] Updated weights for policy 0, policy_version 16580 (0.0007) +[2024-08-24 20:35:18,690][03430] Updated weights for policy 0, policy_version 16590 (0.0006) +[2024-08-24 20:35:19,899][03430] Updated weights for policy 0, policy_version 16600 (0.0006) +[2024-08-24 20:35:20,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36113.1, 300 sec: 36308.6). Total num frames: 68026368. Throughput: 0: 9037.8. Samples: 16981526. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:35:20,812][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:35:21,035][03430] Updated weights for policy 0, policy_version 16610 (0.0007) +[2024-08-24 20:35:22,192][03430] Updated weights for policy 0, policy_version 16620 (0.0005) +[2024-08-24 20:35:23,320][03430] Updated weights for policy 0, policy_version 16630 (0.0006) +[2024-08-24 20:35:24,438][03430] Updated weights for policy 0, policy_version 16640 (0.0006) +[2024-08-24 20:35:25,579][03430] Updated weights for policy 0, policy_version 16650 (0.0006) +[2024-08-24 20:35:25,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36044.8, 300 sec: 36294.7). Total num frames: 68202496. Throughput: 0: 9013.3. Samples: 17034488. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:35:25,813][01192] Avg episode reward: [(0, '4.492')] +[2024-08-24 20:35:26,733][03430] Updated weights for policy 0, policy_version 16660 (0.0006) +[2024-08-24 20:35:27,861][03430] Updated weights for policy 0, policy_version 16670 (0.0007) +[2024-08-24 20:35:28,971][03430] Updated weights for policy 0, policy_version 16680 (0.0005) +[2024-08-24 20:35:30,093][03430] Updated weights for policy 0, policy_version 16690 (0.0006) +[2024-08-24 20:35:30,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36294.7). 
Total num frames: 68386816. Throughput: 0: 9009.2. Samples: 17088940. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:35:30,813][01192] Avg episode reward: [(0, '4.705')] +[2024-08-24 20:35:31,209][03430] Updated weights for policy 0, policy_version 16700 (0.0006) +[2024-08-24 20:35:32,324][03430] Updated weights for policy 0, policy_version 16710 (0.0006) +[2024-08-24 20:35:33,447][03430] Updated weights for policy 0, policy_version 16720 (0.0006) +[2024-08-24 20:35:34,549][03430] Updated weights for policy 0, policy_version 16730 (0.0005) +[2024-08-24 20:35:35,681][03430] Updated weights for policy 0, policy_version 16740 (0.0005) +[2024-08-24 20:35:35,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36181.3, 300 sec: 36322.5). Total num frames: 68571136. Throughput: 0: 9016.8. Samples: 17116372. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:35:35,813][01192] Avg episode reward: [(0, '4.472')] +[2024-08-24 20:35:36,816][03430] Updated weights for policy 0, policy_version 16750 (0.0005) +[2024-08-24 20:35:37,927][03430] Updated weights for policy 0, policy_version 16760 (0.0006) +[2024-08-24 20:35:39,041][03430] Updated weights for policy 0, policy_version 16770 (0.0005) +[2024-08-24 20:35:40,153][03430] Updated weights for policy 0, policy_version 16780 (0.0005) +[2024-08-24 20:35:40,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36113.1, 300 sec: 36322.5). Total num frames: 68751360. Throughput: 0: 9054.0. Samples: 17171368. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:35:40,813][01192] Avg episode reward: [(0, '4.469')] +[2024-08-24 20:35:41,284][03430] Updated weights for policy 0, policy_version 16790 (0.0006) +[2024-08-24 20:35:42,391][03430] Updated weights for policy 0, policy_version 16800 (0.0006) +[2024-08-24 20:35:43,532][03430] Updated weights for policy 0, policy_version 16810 (0.0006) +[2024-08-24 20:35:44,649][03430] Updated weights for policy 0, policy_version 16820 (0.0006) +[2024-08-24 20:35:45,745][03430] Updated weights for policy 0, policy_version 16830 (0.0005) +[2024-08-24 20:35:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36181.3, 300 sec: 36322.5). Total num frames: 68935680. Throughput: 0: 9056.1. Samples: 17225980. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:35:45,813][01192] Avg episode reward: [(0, '4.370')] +[2024-08-24 20:35:46,869][03430] Updated weights for policy 0, policy_version 16840 (0.0005) +[2024-08-24 20:35:47,982][03430] Updated weights for policy 0, policy_version 16850 (0.0006) +[2024-08-24 20:35:49,085][03430] Updated weights for policy 0, policy_version 16860 (0.0006) +[2024-08-24 20:35:50,205][03430] Updated weights for policy 0, policy_version 16870 (0.0005) +[2024-08-24 20:35:50,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36249.6, 300 sec: 36336.4). Total num frames: 69120000. Throughput: 0: 9064.0. Samples: 17253686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:35:50,813][01192] Avg episode reward: [(0, '4.407')] +[2024-08-24 20:35:51,360][03430] Updated weights for policy 0, policy_version 16880 (0.0006) +[2024-08-24 20:35:52,487][03430] Updated weights for policy 0, policy_version 16890 (0.0006) +[2024-08-24 20:35:53,625][03430] Updated weights for policy 0, policy_version 16900 (0.0005) +[2024-08-24 20:35:54,751][03430] Updated weights for policy 0, policy_version 16910 (0.0006) +[2024-08-24 20:35:55,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36317.9, 300 sec: 36336.4). 
Total num frames: 69300224. Throughput: 0: 9076.2. Samples: 17308174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:35:55,813][01192] Avg episode reward: [(0, '4.493')] +[2024-08-24 20:35:55,885][03430] Updated weights for policy 0, policy_version 16920 (0.0005) +[2024-08-24 20:35:56,991][03430] Updated weights for policy 0, policy_version 16930 (0.0005) +[2024-08-24 20:35:58,109][03430] Updated weights for policy 0, policy_version 16940 (0.0005) +[2024-08-24 20:35:59,252][03430] Updated weights for policy 0, policy_version 16950 (0.0006) +[2024-08-24 20:36:00,380][03430] Updated weights for policy 0, policy_version 16960 (0.0005) +[2024-08-24 20:36:00,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36249.7, 300 sec: 36322.5). Total num frames: 69480448. Throughput: 0: 9078.7. Samples: 17362716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:36:00,813][01192] Avg episode reward: [(0, '4.572')] +[2024-08-24 20:36:01,529][03430] Updated weights for policy 0, policy_version 16970 (0.0005) +[2024-08-24 20:36:02,677][03430] Updated weights for policy 0, policy_version 16980 (0.0006) +[2024-08-24 20:36:03,799][03430] Updated weights for policy 0, policy_version 16990 (0.0006) +[2024-08-24 20:36:04,925][03430] Updated weights for policy 0, policy_version 17000 (0.0006) +[2024-08-24 20:36:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36249.6, 300 sec: 36308.6). Total num frames: 69660672. Throughput: 0: 9069.7. Samples: 17389662. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:36:05,813][01192] Avg episode reward: [(0, '4.265')] +[2024-08-24 20:36:06,083][03430] Updated weights for policy 0, policy_version 17010 (0.0006) +[2024-08-24 20:36:07,234][03430] Updated weights for policy 0, policy_version 17020 (0.0006) +[2024-08-24 20:36:08,340][03430] Updated weights for policy 0, policy_version 17030 (0.0006) +[2024-08-24 20:36:09,458][03430] Updated weights for policy 0, policy_version 17040 (0.0005) +[2024-08-24 20:36:10,574][03430] Updated weights for policy 0, policy_version 17050 (0.0006) +[2024-08-24 20:36:10,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36322.5). Total num frames: 69844992. Throughput: 0: 9100.8. Samples: 17444026. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:10,812][01192] Avg episode reward: [(0, '4.412')] +[2024-08-24 20:36:11,710][03430] Updated weights for policy 0, policy_version 17060 (0.0006) +[2024-08-24 20:36:12,817][03430] Updated weights for policy 0, policy_version 17070 (0.0005) +[2024-08-24 20:36:13,945][03430] Updated weights for policy 0, policy_version 17080 (0.0006) +[2024-08-24 20:36:15,060][03430] Updated weights for policy 0, policy_version 17090 (0.0005) +[2024-08-24 20:36:15,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36317.9, 300 sec: 36308.6). Total num frames: 70025216. Throughput: 0: 9106.0. Samples: 17498708. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:15,813][01192] Avg episode reward: [(0, '4.448')] +[2024-08-24 20:36:15,820][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000017096_70025216.pth... 
+[2024-08-24 20:36:15,849][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000014974_61333504.pth +[2024-08-24 20:36:16,204][03430] Updated weights for policy 0, policy_version 17100 (0.0006) +[2024-08-24 20:36:17,351][03430] Updated weights for policy 0, policy_version 17110 (0.0005) +[2024-08-24 20:36:18,474][03430] Updated weights for policy 0, policy_version 17120 (0.0006) +[2024-08-24 20:36:19,586][03430] Updated weights for policy 0, policy_version 17130 (0.0005) +[2024-08-24 20:36:20,721][03430] Updated weights for policy 0, policy_version 17140 (0.0005) +[2024-08-24 20:36:20,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36317.9, 300 sec: 36280.8). Total num frames: 70205440. Throughput: 0: 9099.4. Samples: 17525846. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:20,813][01192] Avg episode reward: [(0, '4.462')] +[2024-08-24 20:36:21,843][03430] Updated weights for policy 0, policy_version 17150 (0.0005) +[2024-08-24 20:36:22,965][03430] Updated weights for policy 0, policy_version 17160 (0.0006) +[2024-08-24 20:36:24,069][03430] Updated weights for policy 0, policy_version 17170 (0.0006) +[2024-08-24 20:36:25,168][03430] Updated weights for policy 0, policy_version 17180 (0.0005) +[2024-08-24 20:36:25,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36267.0). Total num frames: 70389760. Throughput: 0: 9095.8. Samples: 17580678. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:25,813][01192] Avg episode reward: [(0, '4.498')] +[2024-08-24 20:36:26,304][03430] Updated weights for policy 0, policy_version 17190 (0.0005) +[2024-08-24 20:36:27,435][03430] Updated weights for policy 0, policy_version 17200 (0.0006) +[2024-08-24 20:36:28,564][03430] Updated weights for policy 0, policy_version 17210 (0.0006) +[2024-08-24 20:36:29,684][03430] Updated weights for policy 0, policy_version 17220 (0.0005) +[2024-08-24 20:36:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36253.1). Total num frames: 70569984. Throughput: 0: 9099.4. Samples: 17635454. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:30,813][01192] Avg episode reward: [(0, '4.386')] +[2024-08-24 20:36:30,828][03430] Updated weights for policy 0, policy_version 17230 (0.0006) +[2024-08-24 20:36:31,935][03430] Updated weights for policy 0, policy_version 17240 (0.0005) +[2024-08-24 20:36:33,082][03430] Updated weights for policy 0, policy_version 17250 (0.0006) +[2024-08-24 20:36:34,232][03430] Updated weights for policy 0, policy_version 17260 (0.0006) +[2024-08-24 20:36:35,335][03430] Updated weights for policy 0, policy_version 17270 (0.0005) +[2024-08-24 20:36:35,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36386.2, 300 sec: 36267.0). Total num frames: 70754304. Throughput: 0: 9086.5. Samples: 17662578. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:35,813][01192] Avg episode reward: [(0, '4.306')] +[2024-08-24 20:36:36,455][03430] Updated weights for policy 0, policy_version 17280 (0.0005) +[2024-08-24 20:36:37,582][03430] Updated weights for policy 0, policy_version 17290 (0.0006) +[2024-08-24 20:36:38,717][03430] Updated weights for policy 0, policy_version 17300 (0.0006) +[2024-08-24 20:36:39,865][03430] Updated weights for policy 0, policy_version 17310 (0.0005) +[2024-08-24 20:36:40,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36386.1, 300 sec: 36253.1). 
Total num frames: 70934528. Throughput: 0: 9083.5. Samples: 17716930. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:40,813][01192] Avg episode reward: [(0, '4.524')] +[2024-08-24 20:36:41,028][03430] Updated weights for policy 0, policy_version 17320 (0.0006) +[2024-08-24 20:36:42,168][03430] Updated weights for policy 0, policy_version 17330 (0.0006) +[2024-08-24 20:36:43,292][03430] Updated weights for policy 0, policy_version 17340 (0.0006) +[2024-08-24 20:36:44,454][03430] Updated weights for policy 0, policy_version 17350 (0.0006) +[2024-08-24 20:36:45,600][03430] Updated weights for policy 0, policy_version 17360 (0.0005) +[2024-08-24 20:36:45,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36249.7, 300 sec: 36225.3). Total num frames: 71110656. Throughput: 0: 9064.9. Samples: 17770636. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:45,813][01192] Avg episode reward: [(0, '4.320')] +[2024-08-24 20:36:46,736][03430] Updated weights for policy 0, policy_version 17370 (0.0006) +[2024-08-24 20:36:47,870][03430] Updated weights for policy 0, policy_version 17380 (0.0006) +[2024-08-24 20:36:49,017][03430] Updated weights for policy 0, policy_version 17390 (0.0006) +[2024-08-24 20:36:50,164][03430] Updated weights for policy 0, policy_version 17400 (0.0006) +[2024-08-24 20:36:50,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36181.3, 300 sec: 36225.3). Total num frames: 71290880. Throughput: 0: 9067.5. Samples: 17797700. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:50,813][01192] Avg episode reward: [(0, '4.447')] +[2024-08-24 20:36:51,286][03430] Updated weights for policy 0, policy_version 17410 (0.0005) +[2024-08-24 20:36:52,410][03430] Updated weights for policy 0, policy_version 17420 (0.0005) +[2024-08-24 20:36:53,521][03430] Updated weights for policy 0, policy_version 17430 (0.0005) +[2024-08-24 20:36:54,649][03430] Updated weights for policy 0, policy_version 17440 (0.0006) +[2024-08-24 20:36:55,780][03430] Updated weights for policy 0, policy_version 17450 (0.0006) +[2024-08-24 20:36:55,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36249.6, 300 sec: 36225.3). Total num frames: 71475200. Throughput: 0: 9066.5. Samples: 17852018. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:36:55,813][01192] Avg episode reward: [(0, '4.301')] +[2024-08-24 20:36:56,957][03430] Updated weights for policy 0, policy_version 17460 (0.0006) +[2024-08-24 20:36:58,085][03430] Updated weights for policy 0, policy_version 17470 (0.0006) +[2024-08-24 20:36:59,205][03430] Updated weights for policy 0, policy_version 17480 (0.0005) +[2024-08-24 20:37:00,334][03430] Updated weights for policy 0, policy_version 17490 (0.0005) +[2024-08-24 20:37:00,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36249.5, 300 sec: 36211.4). Total num frames: 71655424. Throughput: 0: 9051.8. Samples: 17906040. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:37:00,813][01192] Avg episode reward: [(0, '4.290')] +[2024-08-24 20:37:01,472][03430] Updated weights for policy 0, policy_version 17500 (0.0006) +[2024-08-24 20:37:02,566][03430] Updated weights for policy 0, policy_version 17510 (0.0006) +[2024-08-24 20:37:03,701][03430] Updated weights for policy 0, policy_version 17520 (0.0005) +[2024-08-24 20:37:04,819][03430] Updated weights for policy 0, policy_version 17530 (0.0005) +[2024-08-24 20:37:05,812][01192] Fps is (10 sec: 36453.7, 60 sec: 36317.8, 300 sec: 36225.3). 
Total num frames: 71839744. Throughput: 0: 9054.5. Samples: 17933298. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:37:05,813][01192] Avg episode reward: [(0, '4.419')] +[2024-08-24 20:37:05,918][03430] Updated weights for policy 0, policy_version 17540 (0.0005) +[2024-08-24 20:37:06,987][03430] Updated weights for policy 0, policy_version 17550 (0.0005) +[2024-08-24 20:37:08,074][03430] Updated weights for policy 0, policy_version 17560 (0.0005) +[2024-08-24 20:37:09,168][03430] Updated weights for policy 0, policy_version 17570 (0.0005) +[2024-08-24 20:37:10,274][03430] Updated weights for policy 0, policy_version 17580 (0.0006) +[2024-08-24 20:37:10,812][01192] Fps is (10 sec: 36864.3, 60 sec: 36317.9, 300 sec: 36239.2). Total num frames: 72024064. Throughput: 0: 9080.3. Samples: 17989292. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:37:10,813][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:37:11,367][03430] Updated weights for policy 0, policy_version 17590 (0.0005) +[2024-08-24 20:37:12,473][03430] Updated weights for policy 0, policy_version 17600 (0.0006) +[2024-08-24 20:37:13,561][03430] Updated weights for policy 0, policy_version 17610 (0.0006) +[2024-08-24 20:37:14,671][03430] Updated weights for policy 0, policy_version 17620 (0.0005) +[2024-08-24 20:37:15,788][03430] Updated weights for policy 0, policy_version 17630 (0.0006) +[2024-08-24 20:37:15,812][01192] Fps is (10 sec: 37274.2, 60 sec: 36454.4, 300 sec: 36253.1). Total num frames: 72212480. Throughput: 0: 9099.8. Samples: 18044946. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:15,813][01192] Avg episode reward: [(0, '4.515')] +[2024-08-24 20:37:16,897][03430] Updated weights for policy 0, policy_version 17640 (0.0005) +[2024-08-24 20:37:17,981][03430] Updated weights for policy 0, policy_version 17650 (0.0006) +[2024-08-24 20:37:19,068][03430] Updated weights for policy 0, policy_version 17660 (0.0006) +[2024-08-24 20:37:20,217][03430] Updated weights for policy 0, policy_version 17670 (0.0007) +[2024-08-24 20:37:20,812][01192] Fps is (10 sec: 37273.4, 60 sec: 36522.6, 300 sec: 36267.0). Total num frames: 72396800. Throughput: 0: 9119.0. Samples: 18072932. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:20,813][01192] Avg episode reward: [(0, '4.565')] +[2024-08-24 20:37:21,386][03430] Updated weights for policy 0, policy_version 17680 (0.0006) +[2024-08-24 20:37:22,505][03430] Updated weights for policy 0, policy_version 17690 (0.0005) +[2024-08-24 20:37:23,607][03430] Updated weights for policy 0, policy_version 17700 (0.0006) +[2024-08-24 20:37:24,709][03430] Updated weights for policy 0, policy_version 17710 (0.0006) +[2024-08-24 20:37:25,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36253.1). Total num frames: 72577024. Throughput: 0: 9128.5. Samples: 18127712. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:25,812][01192] Avg episode reward: [(0, '4.356')] +[2024-08-24 20:37:25,818][03430] Updated weights for policy 0, policy_version 17720 (0.0006) +[2024-08-24 20:37:26,940][03430] Updated weights for policy 0, policy_version 17730 (0.0005) +[2024-08-24 20:37:28,006][03430] Updated weights for policy 0, policy_version 17740 (0.0004) +[2024-08-24 20:37:29,114][03430] Updated weights for policy 0, policy_version 17750 (0.0005) +[2024-08-24 20:37:30,218][03430] Updated weights for policy 0, policy_version 17760 (0.0005) +[2024-08-24 20:37:30,812][01192] Fps is (10 sec: 36864.2, 60 sec: 36590.9, 300 sec: 36267.0). 
Total num frames: 72765440. Throughput: 0: 9172.3. Samples: 18183388. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:30,813][01192] Avg episode reward: [(0, '4.446')] +[2024-08-24 20:37:31,311][03430] Updated weights for policy 0, policy_version 17770 (0.0005) +[2024-08-24 20:37:32,412][03430] Updated weights for policy 0, policy_version 17780 (0.0005) +[2024-08-24 20:37:33,516][03430] Updated weights for policy 0, policy_version 17790 (0.0005) +[2024-08-24 20:37:34,628][03430] Updated weights for policy 0, policy_version 17800 (0.0005) +[2024-08-24 20:37:35,786][03430] Updated weights for policy 0, policy_version 17810 (0.0006) +[2024-08-24 20:37:35,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36590.9, 300 sec: 36280.9). Total num frames: 72949760. Throughput: 0: 9195.6. Samples: 18211504. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:35,813][01192] Avg episode reward: [(0, '4.570')] +[2024-08-24 20:37:36,912][03430] Updated weights for policy 0, policy_version 17820 (0.0006) +[2024-08-24 20:37:38,030][03430] Updated weights for policy 0, policy_version 17830 (0.0007) +[2024-08-24 20:37:39,139][03430] Updated weights for policy 0, policy_version 17840 (0.0006) +[2024-08-24 20:37:40,247][03430] Updated weights for policy 0, policy_version 17850 (0.0008) +[2024-08-24 20:37:40,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36659.1, 300 sec: 36294.7). Total num frames: 73134080. Throughput: 0: 9203.5. Samples: 18266176. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:40,813][01192] Avg episode reward: [(0, '4.454')] +[2024-08-24 20:37:41,344][03430] Updated weights for policy 0, policy_version 17860 (0.0006) +[2024-08-24 20:37:42,461][03430] Updated weights for policy 0, policy_version 17870 (0.0005) +[2024-08-24 20:37:43,592][03430] Updated weights for policy 0, policy_version 17880 (0.0006) +[2024-08-24 20:37:44,702][03430] Updated weights for policy 0, policy_version 17890 (0.0005) +[2024-08-24 20:37:45,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36727.5, 300 sec: 36280.8). Total num frames: 73314304. Throughput: 0: 9223.1. Samples: 18321080. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:45,813][01192] Avg episode reward: [(0, '4.548')] +[2024-08-24 20:37:45,832][03430] Updated weights for policy 0, policy_version 17900 (0.0006) +[2024-08-24 20:37:46,977][03430] Updated weights for policy 0, policy_version 17910 (0.0006) +[2024-08-24 20:37:48,090][03430] Updated weights for policy 0, policy_version 17920 (0.0006) +[2024-08-24 20:37:49,177][03430] Updated weights for policy 0, policy_version 17930 (0.0006) +[2024-08-24 20:37:50,305][03430] Updated weights for policy 0, policy_version 17940 (0.0006) +[2024-08-24 20:37:50,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36795.7, 300 sec: 36294.7). Total num frames: 73498624. Throughput: 0: 9226.7. Samples: 18348496. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:50,813][01192] Avg episode reward: [(0, '4.415')] +[2024-08-24 20:37:51,435][03430] Updated weights for policy 0, policy_version 17950 (0.0005) +[2024-08-24 20:37:52,581][03430] Updated weights for policy 0, policy_version 17960 (0.0005) +[2024-08-24 20:37:53,703][03430] Updated weights for policy 0, policy_version 17970 (0.0005) +[2024-08-24 20:37:54,818][03430] Updated weights for policy 0, policy_version 17980 (0.0005) +[2024-08-24 20:37:55,812][01192] Fps is (10 sec: 36454.0, 60 sec: 36727.4, 300 sec: 36280.8). 
Total num frames: 73678848. Throughput: 0: 9197.9. Samples: 18403200. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:37:55,813][01192] Avg episode reward: [(0, '4.297')] +[2024-08-24 20:37:55,959][03430] Updated weights for policy 0, policy_version 17990 (0.0006) +[2024-08-24 20:37:57,086][03430] Updated weights for policy 0, policy_version 18000 (0.0006) +[2024-08-24 20:37:58,209][03430] Updated weights for policy 0, policy_version 18010 (0.0005) +[2024-08-24 20:37:59,330][03430] Updated weights for policy 0, policy_version 18020 (0.0006) +[2024-08-24 20:38:00,478][03430] Updated weights for policy 0, policy_version 18030 (0.0005) +[2024-08-24 20:38:00,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36727.5, 300 sec: 36280.8). Total num frames: 73859072. Throughput: 0: 9169.2. Samples: 18457558. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:00,812][01192] Avg episode reward: [(0, '4.441')] +[2024-08-24 20:38:01,596][03430] Updated weights for policy 0, policy_version 18040 (0.0005) +[2024-08-24 20:38:02,717][03430] Updated weights for policy 0, policy_version 18050 (0.0006) +[2024-08-24 20:38:03,843][03430] Updated weights for policy 0, policy_version 18060 (0.0005) +[2024-08-24 20:38:04,954][03430] Updated weights for policy 0, policy_version 18070 (0.0005) +[2024-08-24 20:38:05,812][01192] Fps is (10 sec: 36454.8, 60 sec: 36727.6, 300 sec: 36294.7). Total num frames: 74043392. Throughput: 0: 9154.6. Samples: 18484890. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:05,812][01192] Avg episode reward: [(0, '4.379')] +[2024-08-24 20:38:06,040][03430] Updated weights for policy 0, policy_version 18080 (0.0005) +[2024-08-24 20:38:07,125][03430] Updated weights for policy 0, policy_version 18090 (0.0005) +[2024-08-24 20:38:08,248][03430] Updated weights for policy 0, policy_version 18100 (0.0005) +[2024-08-24 20:38:09,373][03430] Updated weights for policy 0, policy_version 18110 (0.0006) +[2024-08-24 20:38:10,515][03430] Updated weights for policy 0, policy_version 18120 (0.0006) +[2024-08-24 20:38:10,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36727.4, 300 sec: 36294.7). Total num frames: 74227712. Throughput: 0: 9166.7. Samples: 18540214. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:10,813][01192] Avg episode reward: [(0, '4.370')] +[2024-08-24 20:38:11,639][03430] Updated weights for policy 0, policy_version 18130 (0.0005) +[2024-08-24 20:38:12,772][03430] Updated weights for policy 0, policy_version 18140 (0.0005) +[2024-08-24 20:38:13,889][03430] Updated weights for policy 0, policy_version 18150 (0.0005) +[2024-08-24 20:38:14,983][03430] Updated weights for policy 0, policy_version 18160 (0.0005) +[2024-08-24 20:38:15,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36659.1, 300 sec: 36322.5). Total num frames: 74412032. Throughput: 0: 9150.3. Samples: 18595154. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:15,813][01192] Avg episode reward: [(0, '4.576')] +[2024-08-24 20:38:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000018167_74412032.pth... 
+[2024-08-24 20:38:15,848][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000016036_65683456.pth +[2024-08-24 20:38:16,097][03430] Updated weights for policy 0, policy_version 18170 (0.0005) +[2024-08-24 20:38:17,173][03430] Updated weights for policy 0, policy_version 18180 (0.0005) +[2024-08-24 20:38:18,274][03430] Updated weights for policy 0, policy_version 18190 (0.0006) +[2024-08-24 20:38:19,349][03430] Updated weights for policy 0, policy_version 18200 (0.0005) +[2024-08-24 20:38:20,467][03430] Updated weights for policy 0, policy_version 18210 (0.0005) +[2024-08-24 20:38:20,812][01192] Fps is (10 sec: 37273.7, 60 sec: 36727.5, 300 sec: 36336.4). Total num frames: 74600448. Throughput: 0: 9145.2. Samples: 18623038. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:20,813][01192] Avg episode reward: [(0, '4.628')] +[2024-08-24 20:38:21,595][03430] Updated weights for policy 0, policy_version 18220 (0.0005) +[2024-08-24 20:38:22,687][03430] Updated weights for policy 0, policy_version 18230 (0.0006) +[2024-08-24 20:38:23,812][03430] Updated weights for policy 0, policy_version 18240 (0.0005) +[2024-08-24 20:38:24,914][03430] Updated weights for policy 0, policy_version 18250 (0.0006) +[2024-08-24 20:38:25,812][01192] Fps is (10 sec: 37274.0, 60 sec: 36795.7, 300 sec: 36336.4). Total num frames: 74784768. Throughput: 0: 9168.1. Samples: 18678740. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:25,812][01192] Avg episode reward: [(0, '4.468')] +[2024-08-24 20:38:26,032][03430] Updated weights for policy 0, policy_version 18260 (0.0005) +[2024-08-24 20:38:27,136][03430] Updated weights for policy 0, policy_version 18270 (0.0005) +[2024-08-24 20:38:28,262][03430] Updated weights for policy 0, policy_version 18280 (0.0005) +[2024-08-24 20:38:29,393][03430] Updated weights for policy 0, policy_version 18290 (0.0005) +[2024-08-24 20:38:30,511][03430] Updated weights for policy 0, policy_version 18300 (0.0006) +[2024-08-24 20:38:30,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36659.2, 300 sec: 36336.4). Total num frames: 74964992. Throughput: 0: 9172.5. Samples: 18733844. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:30,813][01192] Avg episode reward: [(0, '4.477')] +[2024-08-24 20:38:31,619][03430] Updated weights for policy 0, policy_version 18310 (0.0005) +[2024-08-24 20:38:32,740][03430] Updated weights for policy 0, policy_version 18320 (0.0006) +[2024-08-24 20:38:33,837][03430] Updated weights for policy 0, policy_version 18330 (0.0006) +[2024-08-24 20:38:34,976][03430] Updated weights for policy 0, policy_version 18340 (0.0006) +[2024-08-24 20:38:35,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36659.1, 300 sec: 36350.3). Total num frames: 75149312. Throughput: 0: 9176.4. Samples: 18761434. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:38:35,813][01192] Avg episode reward: [(0, '4.494')] +[2024-08-24 20:38:36,096][03430] Updated weights for policy 0, policy_version 18350 (0.0006) +[2024-08-24 20:38:37,242][03430] Updated weights for policy 0, policy_version 18360 (0.0006) +[2024-08-24 20:38:38,360][03430] Updated weights for policy 0, policy_version 18370 (0.0007) +[2024-08-24 20:38:39,504][03430] Updated weights for policy 0, policy_version 18380 (0.0006) +[2024-08-24 20:38:40,643][03430] Updated weights for policy 0, policy_version 18390 (0.0006) +[2024-08-24 20:38:40,812][01192] Fps is (10 sec: 36453.8, 60 sec: 36590.9, 300 sec: 36336.4). Total num frames: 75329536. Throughput: 0: 9166.6. Samples: 18815696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:38:40,813][01192] Avg episode reward: [(0, '4.456')] +[2024-08-24 20:38:41,768][03430] Updated weights for policy 0, policy_version 18400 (0.0006) +[2024-08-24 20:38:42,879][03430] Updated weights for policy 0, policy_version 18410 (0.0005) +[2024-08-24 20:38:43,998][03430] Updated weights for policy 0, policy_version 18420 (0.0006) +[2024-08-24 20:38:45,112][03430] Updated weights for policy 0, policy_version 18430 (0.0006) +[2024-08-24 20:38:45,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36659.2, 300 sec: 36336.4). Total num frames: 75513856. Throughput: 0: 9174.4. Samples: 18870404. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:38:45,813][01192] Avg episode reward: [(0, '4.398')] +[2024-08-24 20:38:46,234][03430] Updated weights for policy 0, policy_version 18440 (0.0006) +[2024-08-24 20:38:47,360][03430] Updated weights for policy 0, policy_version 18450 (0.0005) +[2024-08-24 20:38:48,483][03430] Updated weights for policy 0, policy_version 18460 (0.0005) +[2024-08-24 20:38:49,601][03430] Updated weights for policy 0, policy_version 18470 (0.0005) +[2024-08-24 20:38:50,723][03430] Updated weights for policy 0, policy_version 18480 (0.0005) +[2024-08-24 20:38:50,812][01192] Fps is (10 sec: 36455.0, 60 sec: 36590.9, 300 sec: 36336.4). Total num frames: 75694080. Throughput: 0: 9173.7. Samples: 18897706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2024-08-24 20:38:50,813][01192] Avg episode reward: [(0, '4.467')] +[2024-08-24 20:38:51,845][03430] Updated weights for policy 0, policy_version 18490 (0.0005) +[2024-08-24 20:38:52,998][03430] Updated weights for policy 0, policy_version 18500 (0.0006) +[2024-08-24 20:38:54,106][03430] Updated weights for policy 0, policy_version 18510 (0.0005) +[2024-08-24 20:38:55,203][03430] Updated weights for policy 0, policy_version 18520 (0.0006) +[2024-08-24 20:38:55,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36659.3, 300 sec: 36350.3). Total num frames: 75878400. Throughput: 0: 9156.9. Samples: 18952274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:38:55,813][01192] Avg episode reward: [(0, '4.139')] +[2024-08-24 20:38:56,336][03430] Updated weights for policy 0, policy_version 18530 (0.0006) +[2024-08-24 20:38:57,471][03430] Updated weights for policy 0, policy_version 18540 (0.0005) +[2024-08-24 20:38:58,591][03430] Updated weights for policy 0, policy_version 18550 (0.0006) +[2024-08-24 20:38:59,706][03430] Updated weights for policy 0, policy_version 18560 (0.0005) +[2024-08-24 20:39:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36659.2, 300 sec: 36350.3). 
Total num frames: 76058624. Throughput: 0: 9155.7. Samples: 19007160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:39:00,812][01192] Avg episode reward: [(0, '4.465')] +[2024-08-24 20:39:00,877][03430] Updated weights for policy 0, policy_version 18570 (0.0006) +[2024-08-24 20:39:02,020][03430] Updated weights for policy 0, policy_version 18580 (0.0006) +[2024-08-24 20:39:03,152][03430] Updated weights for policy 0, policy_version 18590 (0.0006) +[2024-08-24 20:39:04,271][03430] Updated weights for policy 0, policy_version 18600 (0.0006) +[2024-08-24 20:39:05,393][03430] Updated weights for policy 0, policy_version 18610 (0.0005) +[2024-08-24 20:39:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36590.9, 300 sec: 36364.1). Total num frames: 76238848. Throughput: 0: 9132.1. Samples: 19033984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:05,812][01192] Avg episode reward: [(0, '4.552')] +[2024-08-24 20:39:06,522][03430] Updated weights for policy 0, policy_version 18620 (0.0006) +[2024-08-24 20:39:07,659][03430] Updated weights for policy 0, policy_version 18630 (0.0006) +[2024-08-24 20:39:08,802][03430] Updated weights for policy 0, policy_version 18640 (0.0006) +[2024-08-24 20:39:09,925][03430] Updated weights for policy 0, policy_version 18650 (0.0006) +[2024-08-24 20:39:10,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36522.7, 300 sec: 36405.8). Total num frames: 76419072. Throughput: 0: 9100.4. Samples: 19088258. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:10,813][01192] Avg episode reward: [(0, '4.504')] +[2024-08-24 20:39:11,050][03430] Updated weights for policy 0, policy_version 18660 (0.0006) +[2024-08-24 20:39:12,197][03430] Updated weights for policy 0, policy_version 18670 (0.0006) +[2024-08-24 20:39:13,425][03430] Updated weights for policy 0, policy_version 18680 (0.0006) +[2024-08-24 20:39:14,549][03430] Updated weights for policy 0, policy_version 18690 (0.0005) +[2024-08-24 20:39:15,617][03430] Updated weights for policy 0, policy_version 18700 (0.0005) +[2024-08-24 20:39:15,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36454.5, 300 sec: 36405.8). Total num frames: 76599296. Throughput: 0: 9067.5. Samples: 19141882. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:15,813][01192] Avg episode reward: [(0, '4.234')] +[2024-08-24 20:39:16,718][03430] Updated weights for policy 0, policy_version 18710 (0.0005) +[2024-08-24 20:39:17,805][03430] Updated weights for policy 0, policy_version 18720 (0.0005) +[2024-08-24 20:39:18,826][03430] Updated weights for policy 0, policy_version 18730 (0.0005) +[2024-08-24 20:39:19,879][03430] Updated weights for policy 0, policy_version 18740 (0.0005) +[2024-08-24 20:39:20,812][01192] Fps is (10 sec: 37273.5, 60 sec: 36522.7, 300 sec: 36447.5). Total num frames: 76791808. Throughput: 0: 9084.8. Samples: 19170248. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0) +[2024-08-24 20:39:20,813][01192] Avg episode reward: [(0, '4.351')] +[2024-08-24 20:39:20,957][03430] Updated weights for policy 0, policy_version 18750 (0.0004) +[2024-08-24 20:39:22,053][03430] Updated weights for policy 0, policy_version 18760 (0.0005) +[2024-08-24 20:39:23,273][03430] Updated weights for policy 0, policy_version 18770 (0.0006) +[2024-08-24 20:39:24,521][03430] Updated weights for policy 0, policy_version 18780 (0.0005) +[2024-08-24 20:39:25,659][03430] Updated weights for policy 0, policy_version 18790 (0.0005) +[2024-08-24 20:39:25,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36386.1, 300 sec: 36433.6). Total num frames: 76967936. Throughput: 0: 9113.6. Samples: 19225808. Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0) +[2024-08-24 20:39:25,813][01192] Avg episode reward: [(0, '4.363')] +[2024-08-24 20:39:26,792][03430] Updated weights for policy 0, policy_version 18800 (0.0006) +[2024-08-24 20:39:27,918][03430] Updated weights for policy 0, policy_version 18810 (0.0005) +[2024-08-24 20:39:29,057][03430] Updated weights for policy 0, policy_version 18820 (0.0006) +[2024-08-24 20:39:30,170][03430] Updated weights for policy 0, policy_version 18830 (0.0006) +[2024-08-24 20:39:30,812][01192] Fps is (10 sec: 35635.3, 60 sec: 36386.1, 300 sec: 36433.6). Total num frames: 77148160. Throughput: 0: 9090.9. Samples: 19279496. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:30,813][01192] Avg episode reward: [(0, '4.403')] +[2024-08-24 20:39:31,318][03430] Updated weights for policy 0, policy_version 18840 (0.0006) +[2024-08-24 20:39:32,440][03430] Updated weights for policy 0, policy_version 18850 (0.0005) +[2024-08-24 20:39:33,557][03430] Updated weights for policy 0, policy_version 18860 (0.0006) +[2024-08-24 20:39:34,692][03430] Updated weights for policy 0, policy_version 18870 (0.0006) +[2024-08-24 20:39:35,796][03430] Updated weights for policy 0, policy_version 18880 (0.0005) +[2024-08-24 20:39:35,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36386.2, 300 sec: 36433.6). Total num frames: 77332480. Throughput: 0: 9088.7. Samples: 19306698. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:35,812][01192] Avg episode reward: [(0, '4.352')] +[2024-08-24 20:39:36,922][03430] Updated weights for policy 0, policy_version 18890 (0.0006) +[2024-08-24 20:39:38,058][03430] Updated weights for policy 0, policy_version 18900 (0.0006) +[2024-08-24 20:39:39,184][03430] Updated weights for policy 0, policy_version 18910 (0.0005) +[2024-08-24 20:39:40,323][03430] Updated weights for policy 0, policy_version 18920 (0.0006) +[2024-08-24 20:39:40,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36386.2, 300 sec: 36433.6). Total num frames: 77512704. Throughput: 0: 9094.0. Samples: 19361504. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:40,813][01192] Avg episode reward: [(0, '4.372')] +[2024-08-24 20:39:41,416][03430] Updated weights for policy 0, policy_version 18930 (0.0005) +[2024-08-24 20:39:42,546][03430] Updated weights for policy 0, policy_version 18940 (0.0005) +[2024-08-24 20:39:43,690][03430] Updated weights for policy 0, policy_version 18950 (0.0006) +[2024-08-24 20:39:44,888][03430] Updated weights for policy 0, policy_version 18960 (0.0006) +[2024-08-24 20:39:45,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36249.6, 300 sec: 36419.7). 
Total num frames: 77688832. Throughput: 0: 9067.6. Samples: 19415202. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:45,813][01192] Avg episode reward: [(0, '4.450')] +[2024-08-24 20:39:46,077][03430] Updated weights for policy 0, policy_version 18970 (0.0005) +[2024-08-24 20:39:47,196][03430] Updated weights for policy 0, policy_version 18980 (0.0006) +[2024-08-24 20:39:48,341][03430] Updated weights for policy 0, policy_version 18990 (0.0006) +[2024-08-24 20:39:49,467][03430] Updated weights for policy 0, policy_version 19000 (0.0006) +[2024-08-24 20:39:50,593][03430] Updated weights for policy 0, policy_version 19010 (0.0006) +[2024-08-24 20:39:50,812][01192] Fps is (10 sec: 35635.5, 60 sec: 36249.6, 300 sec: 36433.6). Total num frames: 77869056. Throughput: 0: 9061.3. Samples: 19441744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:50,812][01192] Avg episode reward: [(0, '4.456')] +[2024-08-24 20:39:51,746][03430] Updated weights for policy 0, policy_version 19020 (0.0007) +[2024-08-24 20:39:52,843][03430] Updated weights for policy 0, policy_version 19030 (0.0005) +[2024-08-24 20:39:53,976][03430] Updated weights for policy 0, policy_version 19040 (0.0005) +[2024-08-24 20:39:55,093][03430] Updated weights for policy 0, policy_version 19050 (0.0007) +[2024-08-24 20:39:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36249.6, 300 sec: 36433.6). Total num frames: 78053376. Throughput: 0: 9066.4. Samples: 19496248. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:39:55,813][01192] Avg episode reward: [(0, '4.672')] +[2024-08-24 20:39:56,234][03430] Updated weights for policy 0, policy_version 19060 (0.0007) +[2024-08-24 20:39:57,368][03430] Updated weights for policy 0, policy_version 19070 (0.0006) +[2024-08-24 20:39:58,472][03430] Updated weights for policy 0, policy_version 19080 (0.0005) +[2024-08-24 20:39:59,625][03430] Updated weights for policy 0, policy_version 19090 (0.0006) +[2024-08-24 20:40:00,749][03430] Updated weights for policy 0, policy_version 19100 (0.0007) +[2024-08-24 20:40:00,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36249.6, 300 sec: 36433.6). Total num frames: 78233600. Throughput: 0: 9085.9. Samples: 19550746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:00,813][01192] Avg episode reward: [(0, '4.433')] +[2024-08-24 20:40:01,886][03430] Updated weights for policy 0, policy_version 19110 (0.0006) +[2024-08-24 20:40:03,026][03430] Updated weights for policy 0, policy_version 19120 (0.0006) +[2024-08-24 20:40:04,171][03430] Updated weights for policy 0, policy_version 19130 (0.0006) +[2024-08-24 20:40:05,308][03430] Updated weights for policy 0, policy_version 19140 (0.0006) +[2024-08-24 20:40:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36249.6, 300 sec: 36433.6). Total num frames: 78413824. Throughput: 0: 9057.0. Samples: 19577814. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:05,813][01192] Avg episode reward: [(0, '4.521')] +[2024-08-24 20:40:06,406][03430] Updated weights for policy 0, policy_version 19150 (0.0005) +[2024-08-24 20:40:07,588][03430] Updated weights for policy 0, policy_version 19160 (0.0006) +[2024-08-24 20:40:08,820][03430] Updated weights for policy 0, policy_version 19170 (0.0005) +[2024-08-24 20:40:09,957][03430] Updated weights for policy 0, policy_version 19180 (0.0005) +[2024-08-24 20:40:10,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36181.3, 300 sec: 36419.7). 
Total num frames: 78589952. Throughput: 0: 9000.8. Samples: 19630842. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:40:10,813][01192] Avg episode reward: [(0, '4.281')] +[2024-08-24 20:40:11,073][03430] Updated weights for policy 0, policy_version 19190 (0.0005) +[2024-08-24 20:40:12,198][03430] Updated weights for policy 0, policy_version 19200 (0.0005) +[2024-08-24 20:40:13,325][03430] Updated weights for policy 0, policy_version 19210 (0.0005) +[2024-08-24 20:40:14,474][03430] Updated weights for policy 0, policy_version 19220 (0.0006) +[2024-08-24 20:40:15,614][03430] Updated weights for policy 0, policy_version 19230 (0.0006) +[2024-08-24 20:40:15,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36181.3, 300 sec: 36419.7). Total num frames: 78770176. Throughput: 0: 9015.8. Samples: 19685206. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:40:15,812][01192] Avg episode reward: [(0, '4.343')] +[2024-08-24 20:40:15,836][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000019232_78774272.pth... +[2024-08-24 20:40:15,861][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000017096_70025216.pth +[2024-08-24 20:40:16,723][03430] Updated weights for policy 0, policy_version 19240 (0.0006) +[2024-08-24 20:40:17,861][03430] Updated weights for policy 0, policy_version 19250 (0.0006) +[2024-08-24 20:40:18,990][03430] Updated weights for policy 0, policy_version 19260 (0.0006) +[2024-08-24 20:40:20,130][03430] Updated weights for policy 0, policy_version 19270 (0.0006) +[2024-08-24 20:40:20,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36044.8, 300 sec: 36447.5). Total num frames: 78954496. Throughput: 0: 9013.5. Samples: 19712306. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:40:20,812][01192] Avg episode reward: [(0, '4.456')] +[2024-08-24 20:40:21,268][03430] Updated weights for policy 0, policy_version 19280 (0.0005) +[2024-08-24 20:40:22,400][03430] Updated weights for policy 0, policy_version 19290 (0.0006) +[2024-08-24 20:40:23,546][03430] Updated weights for policy 0, policy_version 19300 (0.0006) +[2024-08-24 20:40:24,710][03430] Updated weights for policy 0, policy_version 19310 (0.0006) +[2024-08-24 20:40:25,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36044.8, 300 sec: 36419.7). Total num frames: 79130624. Throughput: 0: 9000.9. Samples: 19766546. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:25,813][01192] Avg episode reward: [(0, '4.638')] +[2024-08-24 20:40:25,847][03430] Updated weights for policy 0, policy_version 19320 (0.0006) +[2024-08-24 20:40:26,987][03430] Updated weights for policy 0, policy_version 19330 (0.0006) +[2024-08-24 20:40:28,133][03430] Updated weights for policy 0, policy_version 19340 (0.0005) +[2024-08-24 20:40:29,244][03430] Updated weights for policy 0, policy_version 19350 (0.0006) +[2024-08-24 20:40:30,357][03430] Updated weights for policy 0, policy_version 19360 (0.0005) +[2024-08-24 20:40:30,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36113.0, 300 sec: 36419.7). Total num frames: 79314944. Throughput: 0: 9007.5. Samples: 19820540. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:30,813][01192] Avg episode reward: [(0, '4.388')] +[2024-08-24 20:40:31,462][03430] Updated weights for policy 0, policy_version 19370 (0.0006) +[2024-08-24 20:40:32,593][03430] Updated weights for policy 0, policy_version 19380 (0.0006) +[2024-08-24 20:40:33,715][03430] Updated weights for policy 0, policy_version 19390 (0.0006) +[2024-08-24 20:40:34,853][03430] Updated weights for policy 0, policy_version 19400 (0.0005) +[2024-08-24 20:40:35,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36044.8, 300 sec: 36419.7). 
Total num frames: 79495168. Throughput: 0: 9026.8. Samples: 19847948. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:35,813][01192] Avg episode reward: [(0, '4.435')] +[2024-08-24 20:40:35,992][03430] Updated weights for policy 0, policy_version 19410 (0.0005) +[2024-08-24 20:40:37,135][03430] Updated weights for policy 0, policy_version 19420 (0.0006) +[2024-08-24 20:40:38,254][03430] Updated weights for policy 0, policy_version 19430 (0.0007) +[2024-08-24 20:40:39,387][03430] Updated weights for policy 0, policy_version 19440 (0.0005) +[2024-08-24 20:40:40,812][01192] Fps is (10 sec: 34816.1, 60 sec: 35840.0, 300 sec: 36364.2). Total num frames: 79663104. Throughput: 0: 9020.7. Samples: 19902182. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:40:40,813][01192] Avg episode reward: [(0, '4.492')] +[2024-08-24 20:40:40,911][03430] Updated weights for policy 0, policy_version 19450 (0.0005) +[2024-08-24 20:40:41,981][03430] Updated weights for policy 0, policy_version 19460 (0.0004) +[2024-08-24 20:40:43,009][03430] Updated weights for policy 0, policy_version 19470 (0.0004) +[2024-08-24 20:40:44,075][03430] Updated weights for policy 0, policy_version 19480 (0.0005) +[2024-08-24 20:40:45,166][03430] Updated weights for policy 0, policy_version 19490 (0.0004) +[2024-08-24 20:40:45,812][01192] Fps is (10 sec: 35635.1, 60 sec: 36044.8, 300 sec: 36378.0). Total num frames: 79851520. Throughput: 0: 8999.2. Samples: 19955712. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:40:45,812][01192] Avg episode reward: [(0, '4.429')] +[2024-08-24 20:40:46,273][03430] Updated weights for policy 0, policy_version 19500 (0.0005) +[2024-08-24 20:40:47,444][03430] Updated weights for policy 0, policy_version 19510 (0.0006) +[2024-08-24 20:40:48,592][03430] Updated weights for policy 0, policy_version 19520 (0.0005) +[2024-08-24 20:40:49,703][03430] Updated weights for policy 0, policy_version 19530 (0.0005) +[2024-08-24 20:40:50,812][01192] Fps is (10 sec: 36863.8, 60 sec: 36044.7, 300 sec: 36378.0). Total num frames: 80031744. Throughput: 0: 8994.6. Samples: 19982574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:50,813][01192] Avg episode reward: [(0, '4.346')] +[2024-08-24 20:40:50,827][03430] Updated weights for policy 0, policy_version 19540 (0.0006) +[2024-08-24 20:40:51,958][03430] Updated weights for policy 0, policy_version 19550 (0.0005) +[2024-08-24 20:40:53,087][03430] Updated weights for policy 0, policy_version 19560 (0.0005) +[2024-08-24 20:40:54,231][03430] Updated weights for policy 0, policy_version 19570 (0.0005) +[2024-08-24 20:40:55,381][03430] Updated weights for policy 0, policy_version 19580 (0.0006) +[2024-08-24 20:40:55,812][01192] Fps is (10 sec: 36044.6, 60 sec: 35976.5, 300 sec: 36378.0). Total num frames: 80211968. Throughput: 0: 9023.6. Samples: 20036904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:40:55,813][01192] Avg episode reward: [(0, '4.340')] +[2024-08-24 20:40:56,526][03430] Updated weights for policy 0, policy_version 19590 (0.0006) +[2024-08-24 20:40:57,668][03430] Updated weights for policy 0, policy_version 19600 (0.0006) +[2024-08-24 20:40:58,832][03430] Updated weights for policy 0, policy_version 19610 (0.0006) +[2024-08-24 20:40:59,936][03430] Updated weights for policy 0, policy_version 19620 (0.0006) +[2024-08-24 20:41:00,812][01192] Fps is (10 sec: 36045.1, 60 sec: 35976.5, 300 sec: 36378.0). 
Total num frames: 80392192. Throughput: 0: 9009.1. Samples: 20090616. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:41:00,813][01192] Avg episode reward: [(0, '4.578')] +[2024-08-24 20:41:01,094][03430] Updated weights for policy 0, policy_version 19630 (0.0005) +[2024-08-24 20:41:02,276][03430] Updated weights for policy 0, policy_version 19640 (0.0006) +[2024-08-24 20:41:03,410][03430] Updated weights for policy 0, policy_version 19650 (0.0006) +[2024-08-24 20:41:04,537][03430] Updated weights for policy 0, policy_version 19660 (0.0005) +[2024-08-24 20:41:05,671][03430] Updated weights for policy 0, policy_version 19670 (0.0005) +[2024-08-24 20:41:05,812][01192] Fps is (10 sec: 36045.1, 60 sec: 35976.5, 300 sec: 36364.1). Total num frames: 80572416. Throughput: 0: 8999.9. Samples: 20117300. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:41:05,812][01192] Avg episode reward: [(0, '4.376')] +[2024-08-24 20:41:06,793][03430] Updated weights for policy 0, policy_version 19680 (0.0005) +[2024-08-24 20:41:07,944][03430] Updated weights for policy 0, policy_version 19690 (0.0006) +[2024-08-24 20:41:09,070][03430] Updated weights for policy 0, policy_version 19700 (0.0006) +[2024-08-24 20:41:10,208][03430] Updated weights for policy 0, policy_version 19710 (0.0005) +[2024-08-24 20:41:10,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36044.8, 300 sec: 36364.2). Total num frames: 80752640. Throughput: 0: 8998.4. Samples: 20171474. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:41:10,813][01192] Avg episode reward: [(0, '4.303')] +[2024-08-24 20:41:11,346][03430] Updated weights for policy 0, policy_version 19720 (0.0006) +[2024-08-24 20:41:12,501][03430] Updated weights for policy 0, policy_version 19730 (0.0005) +[2024-08-24 20:41:13,651][03430] Updated weights for policy 0, policy_version 19740 (0.0005) +[2024-08-24 20:41:14,772][03430] Updated weights for policy 0, policy_version 19750 (0.0005) +[2024-08-24 20:41:15,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36044.8, 300 sec: 36364.1). Total num frames: 80932864. Throughput: 0: 8992.6. Samples: 20225206. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:41:15,812][01192] Avg episode reward: [(0, '4.689')] +[2024-08-24 20:41:15,902][03430] Updated weights for policy 0, policy_version 19760 (0.0005) +[2024-08-24 20:41:17,074][03430] Updated weights for policy 0, policy_version 19770 (0.0007) +[2024-08-24 20:41:18,205][03430] Updated weights for policy 0, policy_version 19780 (0.0005) +[2024-08-24 20:41:19,367][03430] Updated weights for policy 0, policy_version 19790 (0.0006) +[2024-08-24 20:41:20,497][03430] Updated weights for policy 0, policy_version 19800 (0.0005) +[2024-08-24 20:41:20,812][01192] Fps is (10 sec: 35635.1, 60 sec: 35908.3, 300 sec: 36336.4). Total num frames: 81108992. Throughput: 0: 8983.0. Samples: 20252184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:41:20,813][01192] Avg episode reward: [(0, '4.388')] +[2024-08-24 20:41:21,644][03430] Updated weights for policy 0, policy_version 19810 (0.0006) +[2024-08-24 20:41:22,764][03430] Updated weights for policy 0, policy_version 19820 (0.0005) +[2024-08-24 20:41:23,878][03430] Updated weights for policy 0, policy_version 19830 (0.0006) +[2024-08-24 20:41:25,095][03430] Updated weights for policy 0, policy_version 19840 (0.0006) +[2024-08-24 20:41:25,812][01192] Fps is (10 sec: 35225.0, 60 sec: 35908.2, 300 sec: 36322.5). 
Total num frames: 81285120. Throughput: 0: 8981.3. Samples: 20306340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:41:25,813][01192] Avg episode reward: [(0, '4.527')] +[2024-08-24 20:41:26,271][03430] Updated weights for policy 0, policy_version 19850 (0.0006) +[2024-08-24 20:41:27,403][03430] Updated weights for policy 0, policy_version 19860 (0.0005) +[2024-08-24 20:41:28,521][03430] Updated weights for policy 0, policy_version 19870 (0.0005) +[2024-08-24 20:41:29,649][03430] Updated weights for policy 0, policy_version 19880 (0.0005) +[2024-08-24 20:41:30,753][03430] Updated weights for policy 0, policy_version 19890 (0.0005) +[2024-08-24 20:41:30,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35908.3, 300 sec: 36322.5). Total num frames: 81469440. Throughput: 0: 8976.5. Samples: 20359654. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:30,813][01192] Avg episode reward: [(0, '4.466')] +[2024-08-24 20:41:31,886][03430] Updated weights for policy 0, policy_version 19900 (0.0006) +[2024-08-24 20:41:33,029][03430] Updated weights for policy 0, policy_version 19910 (0.0006) +[2024-08-24 20:41:34,175][03430] Updated weights for policy 0, policy_version 19920 (0.0006) +[2024-08-24 20:41:35,300][03430] Updated weights for policy 0, policy_version 19930 (0.0005) +[2024-08-24 20:41:35,812][01192] Fps is (10 sec: 36455.1, 60 sec: 35908.3, 300 sec: 36322.5). Total num frames: 81649664. Throughput: 0: 8982.9. Samples: 20386802. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:35,813][01192] Avg episode reward: [(0, '4.343')] +[2024-08-24 20:41:36,408][03430] Updated weights for policy 0, policy_version 19940 (0.0006) +[2024-08-24 20:41:37,514][03430] Updated weights for policy 0, policy_version 19950 (0.0006) +[2024-08-24 20:41:38,654][03430] Updated weights for policy 0, policy_version 19960 (0.0006) +[2024-08-24 20:41:39,759][03430] Updated weights for policy 0, policy_version 19970 (0.0006) +[2024-08-24 20:41:40,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36181.4, 300 sec: 36350.3). Total num frames: 81833984. Throughput: 0: 8993.0. Samples: 20441588. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:40,813][01192] Avg episode reward: [(0, '4.583')] +[2024-08-24 20:41:40,868][03430] Updated weights for policy 0, policy_version 19980 (0.0005) +[2024-08-24 20:41:41,996][03430] Updated weights for policy 0, policy_version 19990 (0.0006) +[2024-08-24 20:41:43,115][03430] Updated weights for policy 0, policy_version 20000 (0.0006) +[2024-08-24 20:41:44,256][03430] Updated weights for policy 0, policy_version 20010 (0.0006) +[2024-08-24 20:41:45,393][03430] Updated weights for policy 0, policy_version 20020 (0.0006) +[2024-08-24 20:41:45,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36044.8, 300 sec: 36350.3). Total num frames: 82014208. Throughput: 0: 9010.2. Samples: 20496074. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:45,813][01192] Avg episode reward: [(0, '4.372')] +[2024-08-24 20:41:46,517][03430] Updated weights for policy 0, policy_version 20030 (0.0006) +[2024-08-24 20:41:47,627][03430] Updated weights for policy 0, policy_version 20040 (0.0006) +[2024-08-24 20:41:48,753][03430] Updated weights for policy 0, policy_version 20050 (0.0006) +[2024-08-24 20:41:49,871][03430] Updated weights for policy 0, policy_version 20060 (0.0005) +[2024-08-24 20:41:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36113.1, 300 sec: 36350.3). 
Total num frames: 82198528. Throughput: 0: 9024.6. Samples: 20523406. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:50,813][01192] Avg episode reward: [(0, '4.449')] +[2024-08-24 20:41:51,013][03430] Updated weights for policy 0, policy_version 20070 (0.0005) +[2024-08-24 20:41:52,136][03430] Updated weights for policy 0, policy_version 20080 (0.0006) +[2024-08-24 20:41:53,274][03430] Updated weights for policy 0, policy_version 20090 (0.0006) +[2024-08-24 20:41:54,406][03430] Updated weights for policy 0, policy_version 20100 (0.0006) +[2024-08-24 20:41:55,538][03430] Updated weights for policy 0, policy_version 20110 (0.0006) +[2024-08-24 20:41:55,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36113.1, 300 sec: 36350.3). Total num frames: 82378752. Throughput: 0: 9031.2. Samples: 20577878. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:41:55,813][01192] Avg episode reward: [(0, '4.462')] +[2024-08-24 20:41:56,701][03430] Updated weights for policy 0, policy_version 20120 (0.0005) +[2024-08-24 20:41:57,845][03430] Updated weights for policy 0, policy_version 20130 (0.0007) +[2024-08-24 20:41:59,014][03430] Updated weights for policy 0, policy_version 20140 (0.0006) +[2024-08-24 20:42:00,124][03430] Updated weights for policy 0, policy_version 20150 (0.0006) +[2024-08-24 20:42:00,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36113.0, 300 sec: 36336.4). Total num frames: 82558976. Throughput: 0: 9033.9. Samples: 20631730. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:00,813][01192] Avg episode reward: [(0, '4.445')] +[2024-08-24 20:42:01,274][03430] Updated weights for policy 0, policy_version 20160 (0.0006) +[2024-08-24 20:42:02,395][03430] Updated weights for policy 0, policy_version 20170 (0.0006) +[2024-08-24 20:42:03,499][03430] Updated weights for policy 0, policy_version 20180 (0.0005) +[2024-08-24 20:42:04,639][03430] Updated weights for policy 0, policy_version 20190 (0.0005) +[2024-08-24 20:42:05,779][03430] Updated weights for policy 0, policy_version 20200 (0.0006) +[2024-08-24 20:42:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36322.5). Total num frames: 82739200. Throughput: 0: 9041.3. Samples: 20659042. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:05,813][01192] Avg episode reward: [(0, '4.441')] +[2024-08-24 20:42:06,935][03430] Updated weights for policy 0, policy_version 20210 (0.0006) +[2024-08-24 20:42:08,056][03430] Updated weights for policy 0, policy_version 20220 (0.0006) +[2024-08-24 20:42:09,192][03430] Updated weights for policy 0, policy_version 20230 (0.0005) +[2024-08-24 20:42:10,323][03430] Updated weights for policy 0, policy_version 20240 (0.0006) +[2024-08-24 20:42:10,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.1, 300 sec: 36294.7). Total num frames: 82919424. Throughput: 0: 9039.9. Samples: 20713134. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:10,812][01192] Avg episode reward: [(0, '4.573')] +[2024-08-24 20:42:11,398][03430] Updated weights for policy 0, policy_version 20250 (0.0006) +[2024-08-24 20:42:12,457][03430] Updated weights for policy 0, policy_version 20260 (0.0006) +[2024-08-24 20:42:13,561][03430] Updated weights for policy 0, policy_version 20270 (0.0005) +[2024-08-24 20:42:14,684][03430] Updated weights for policy 0, policy_version 20280 (0.0006) +[2024-08-24 20:42:15,802][03430] Updated weights for policy 0, policy_version 20290 (0.0005) +[2024-08-24 20:42:15,812][01192] Fps is (10 sec: 36863.6, 60 sec: 36249.5, 300 sec: 36308.6). Total num frames: 83107840. Throughput: 0: 9091.4. Samples: 20768770. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:42:15,813][01192] Avg episode reward: [(0, '4.615')] +[2024-08-24 20:42:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000020290_83107840.pth... +[2024-08-24 20:42:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000018167_74412032.pth +[2024-08-24 20:42:16,958][03430] Updated weights for policy 0, policy_version 20300 (0.0006) +[2024-08-24 20:42:18,104][03430] Updated weights for policy 0, policy_version 20310 (0.0006) +[2024-08-24 20:42:19,223][03430] Updated weights for policy 0, policy_version 20320 (0.0005) +[2024-08-24 20:42:20,368][03430] Updated weights for policy 0, policy_version 20330 (0.0005) +[2024-08-24 20:42:20,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36249.6, 300 sec: 36294.7). Total num frames: 83283968. Throughput: 0: 9089.3. Samples: 20795822. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:42:20,813][01192] Avg episode reward: [(0, '4.495')] +[2024-08-24 20:42:21,508][03430] Updated weights for policy 0, policy_version 20340 (0.0005) +[2024-08-24 20:42:22,639][03430] Updated weights for policy 0, policy_version 20350 (0.0006) +[2024-08-24 20:42:23,760][03430] Updated weights for policy 0, policy_version 20360 (0.0005) +[2024-08-24 20:42:24,883][03430] Updated weights for policy 0, policy_version 20370 (0.0005) +[2024-08-24 20:42:25,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36386.2, 300 sec: 36280.8). Total num frames: 83468288. Throughput: 0: 9080.2. Samples: 20850196. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:25,813][01192] Avg episode reward: [(0, '4.625')] +[2024-08-24 20:42:26,037][03430] Updated weights for policy 0, policy_version 20380 (0.0006) +[2024-08-24 20:42:27,147][03430] Updated weights for policy 0, policy_version 20390 (0.0006) +[2024-08-24 20:42:28,254][03430] Updated weights for policy 0, policy_version 20400 (0.0006) +[2024-08-24 20:42:29,375][03430] Updated weights for policy 0, policy_version 20410 (0.0006) +[2024-08-24 20:42:30,529][03430] Updated weights for policy 0, policy_version 20420 (0.0007) +[2024-08-24 20:42:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36267.0). Total num frames: 83648512. Throughput: 0: 9079.2. Samples: 20904638. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:30,813][01192] Avg episode reward: [(0, '4.468')] +[2024-08-24 20:42:31,669][03430] Updated weights for policy 0, policy_version 20430 (0.0006) +[2024-08-24 20:42:32,792][03430] Updated weights for policy 0, policy_version 20440 (0.0006) +[2024-08-24 20:42:33,954][03430] Updated weights for policy 0, policy_version 20450 (0.0005) +[2024-08-24 20:42:35,047][03430] Updated weights for policy 0, policy_version 20460 (0.0005) +[2024-08-24 20:42:35,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36317.9, 300 sec: 36253.1). 
Total num frames: 83828736. Throughput: 0: 9073.2. Samples: 20931698. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:42:35,813][01192] Avg episode reward: [(0, '4.383')] +[2024-08-24 20:42:36,199][03430] Updated weights for policy 0, policy_version 20470 (0.0006) +[2024-08-24 20:42:37,343][03430] Updated weights for policy 0, policy_version 20480 (0.0005) +[2024-08-24 20:42:38,463][03430] Updated weights for policy 0, policy_version 20490 (0.0006) +[2024-08-24 20:42:39,629][03430] Updated weights for policy 0, policy_version 20500 (0.0006) +[2024-08-24 20:42:40,753][03430] Updated weights for policy 0, policy_version 20510 (0.0006) +[2024-08-24 20:42:40,812][01192] Fps is (10 sec: 36044.5, 60 sec: 36249.6, 300 sec: 36253.1). Total num frames: 84008960. Throughput: 0: 9061.3. Samples: 20985638. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:42:40,813][01192] Avg episode reward: [(0, '4.356')] +[2024-08-24 20:42:41,897][03430] Updated weights for policy 0, policy_version 20520 (0.0006) +[2024-08-24 20:42:43,041][03430] Updated weights for policy 0, policy_version 20530 (0.0006) +[2024-08-24 20:42:44,157][03430] Updated weights for policy 0, policy_version 20540 (0.0007) +[2024-08-24 20:42:45,330][03430] Updated weights for policy 0, policy_version 20550 (0.0007) +[2024-08-24 20:42:45,812][01192] Fps is (10 sec: 36044.5, 60 sec: 36249.6, 300 sec: 36239.2). Total num frames: 84189184. Throughput: 0: 9066.8. Samples: 21039736. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:42:45,813][01192] Avg episode reward: [(0, '4.241')] +[2024-08-24 20:42:46,482][03430] Updated weights for policy 0, policy_version 20560 (0.0006) +[2024-08-24 20:42:47,597][03430] Updated weights for policy 0, policy_version 20570 (0.0005) +[2024-08-24 20:42:48,744][03430] Updated weights for policy 0, policy_version 20580 (0.0006) +[2024-08-24 20:42:49,864][03430] Updated weights for policy 0, policy_version 20590 (0.0007) +[2024-08-24 20:42:50,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36181.3, 300 sec: 36239.2). Total num frames: 84369408. Throughput: 0: 9053.7. Samples: 21066460. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:42:50,813][01192] Avg episode reward: [(0, '4.408')] +[2024-08-24 20:42:51,036][03430] Updated weights for policy 0, policy_version 20600 (0.0006) +[2024-08-24 20:42:52,191][03430] Updated weights for policy 0, policy_version 20610 (0.0005) +[2024-08-24 20:42:53,344][03430] Updated weights for policy 0, policy_version 20620 (0.0005) +[2024-08-24 20:42:54,480][03430] Updated weights for policy 0, policy_version 20630 (0.0006) +[2024-08-24 20:42:55,608][03430] Updated weights for policy 0, policy_version 20640 (0.0006) +[2024-08-24 20:42:55,812][01192] Fps is (10 sec: 35635.4, 60 sec: 36113.1, 300 sec: 36225.3). Total num frames: 84545536. Throughput: 0: 9040.8. Samples: 21119972. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:42:55,813][01192] Avg episode reward: [(0, '4.489')] +[2024-08-24 20:42:56,753][03430] Updated weights for policy 0, policy_version 20650 (0.0005) +[2024-08-24 20:42:57,877][03430] Updated weights for policy 0, policy_version 20660 (0.0006) +[2024-08-24 20:42:59,036][03430] Updated weights for policy 0, policy_version 20670 (0.0006) +[2024-08-24 20:43:00,187][03430] Updated weights for policy 0, policy_version 20680 (0.0006) +[2024-08-24 20:43:00,812][01192] Fps is (10 sec: 35635.3, 60 sec: 36113.1, 300 sec: 36211.4). 
Total num frames: 84725760. Throughput: 0: 9002.2. Samples: 21173868. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:43:00,813][01192] Avg episode reward: [(0, '4.460')] +[2024-08-24 20:43:01,348][03430] Updated weights for policy 0, policy_version 20690 (0.0006) +[2024-08-24 20:43:02,484][03430] Updated weights for policy 0, policy_version 20700 (0.0005) +[2024-08-24 20:43:03,629][03430] Updated weights for policy 0, policy_version 20710 (0.0006) +[2024-08-24 20:43:04,702][03430] Updated weights for policy 0, policy_version 20720 (0.0005) +[2024-08-24 20:43:05,771][03430] Updated weights for policy 0, policy_version 20730 (0.0005) +[2024-08-24 20:43:05,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36181.3, 300 sec: 36211.4). Total num frames: 84910080. Throughput: 0: 8997.3. Samples: 21200700. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:43:05,812][01192] Avg episode reward: [(0, '4.530')] +[2024-08-24 20:43:06,901][03430] Updated weights for policy 0, policy_version 20740 (0.0006) +[2024-08-24 20:43:07,995][03430] Updated weights for policy 0, policy_version 20750 (0.0006) +[2024-08-24 20:43:09,115][03430] Updated weights for policy 0, policy_version 20760 (0.0006) +[2024-08-24 20:43:10,253][03430] Updated weights for policy 0, policy_version 20770 (0.0006) +[2024-08-24 20:43:10,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36249.6, 300 sec: 36211.4). Total num frames: 85094400. Throughput: 0: 9029.4. Samples: 21256520. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:43:10,813][01192] Avg episode reward: [(0, '4.543')] +[2024-08-24 20:43:11,346][03430] Updated weights for policy 0, policy_version 20780 (0.0005) +[2024-08-24 20:43:12,473][03430] Updated weights for policy 0, policy_version 20790 (0.0005) +[2024-08-24 20:43:13,578][03430] Updated weights for policy 0, policy_version 20800 (0.0006) +[2024-08-24 20:43:14,643][03430] Updated weights for policy 0, policy_version 20810 (0.0005) +[2024-08-24 20:43:15,726][03430] Updated weights for policy 0, policy_version 20820 (0.0005) +[2024-08-24 20:43:15,812][01192] Fps is (10 sec: 36863.7, 60 sec: 36181.3, 300 sec: 36197.5). Total num frames: 85278720. Throughput: 0: 9058.1. Samples: 21312254. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:43:15,813][01192] Avg episode reward: [(0, '4.436')] +[2024-08-24 20:43:16,834][03430] Updated weights for policy 0, policy_version 20830 (0.0005) +[2024-08-24 20:43:17,963][03430] Updated weights for policy 0, policy_version 20840 (0.0007) +[2024-08-24 20:43:19,084][03430] Updated weights for policy 0, policy_version 20850 (0.0005) +[2024-08-24 20:43:20,196][03430] Updated weights for policy 0, policy_version 20860 (0.0005) +[2024-08-24 20:43:20,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36317.9, 300 sec: 36197.5). Total num frames: 85463040. Throughput: 0: 9068.4. Samples: 21339778. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:43:20,813][01192] Avg episode reward: [(0, '4.358')] +[2024-08-24 20:43:21,344][03430] Updated weights for policy 0, policy_version 20870 (0.0006) +[2024-08-24 20:43:22,451][03430] Updated weights for policy 0, policy_version 20880 (0.0006) +[2024-08-24 20:43:23,541][03430] Updated weights for policy 0, policy_version 20890 (0.0007) +[2024-08-24 20:43:24,659][03430] Updated weights for policy 0, policy_version 20900 (0.0006) +[2024-08-24 20:43:25,812][01192] Fps is (10 sec: 36454.7, 60 sec: 36249.6, 300 sec: 36197.5). 
Total num frames: 85643264. Throughput: 0: 9095.3. Samples: 21394924. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:43:25,813][01192] Avg episode reward: [(0, '4.417')] +[2024-08-24 20:43:25,821][03430] Updated weights for policy 0, policy_version 20910 (0.0006) +[2024-08-24 20:43:27,017][03430] Updated weights for policy 0, policy_version 20920 (0.0004) +[2024-08-24 20:43:28,150][03430] Updated weights for policy 0, policy_version 20930 (0.0006) +[2024-08-24 20:43:29,206][03430] Updated weights for policy 0, policy_version 20940 (0.0005) +[2024-08-24 20:43:30,288][03430] Updated weights for policy 0, policy_version 20950 (0.0006) +[2024-08-24 20:43:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36197.5). Total num frames: 85827584. Throughput: 0: 9098.9. Samples: 21449184. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:43:30,813][01192] Avg episode reward: [(0, '4.490')] +[2024-08-24 20:43:31,416][03430] Updated weights for policy 0, policy_version 20960 (0.0005) +[2024-08-24 20:43:32,563][03430] Updated weights for policy 0, policy_version 20970 (0.0005) +[2024-08-24 20:43:33,719][03430] Updated weights for policy 0, policy_version 20980 (0.0006) +[2024-08-24 20:43:34,831][03430] Updated weights for policy 0, policy_version 20990 (0.0005) +[2024-08-24 20:43:35,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36317.9, 300 sec: 36197.6). Total num frames: 86007808. Throughput: 0: 9107.9. Samples: 21476314. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:43:35,813][01192] Avg episode reward: [(0, '4.500')] +[2024-08-24 20:43:35,978][03430] Updated weights for policy 0, policy_version 21000 (0.0007) +[2024-08-24 20:43:37,119][03430] Updated weights for policy 0, policy_version 21010 (0.0006) +[2024-08-24 20:43:38,264][03430] Updated weights for policy 0, policy_version 21020 (0.0006) +[2024-08-24 20:43:39,413][03430] Updated weights for policy 0, policy_version 21030 (0.0005) +[2024-08-24 20:43:40,540][03430] Updated weights for policy 0, policy_version 21040 (0.0005) +[2024-08-24 20:43:40,812][01192] Fps is (10 sec: 36044.9, 60 sec: 36317.9, 300 sec: 36183.7). Total num frames: 86188032. Throughput: 0: 9118.8. Samples: 21530318. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:43:40,813][01192] Avg episode reward: [(0, '4.422')] +[2024-08-24 20:43:41,696][03430] Updated weights for policy 0, policy_version 21050 (0.0006) +[2024-08-24 20:43:42,813][03430] Updated weights for policy 0, policy_version 21060 (0.0005) +[2024-08-24 20:43:43,911][03430] Updated weights for policy 0, policy_version 21070 (0.0006) +[2024-08-24 20:43:45,033][03430] Updated weights for policy 0, policy_version 21080 (0.0006) +[2024-08-24 20:43:45,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36317.9, 300 sec: 36183.6). Total num frames: 86368256. Throughput: 0: 9134.6. Samples: 21584924. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:43:45,812][01192] Avg episode reward: [(0, '4.401')] +[2024-08-24 20:43:46,172][03430] Updated weights for policy 0, policy_version 21090 (0.0006) +[2024-08-24 20:43:47,314][03430] Updated weights for policy 0, policy_version 21100 (0.0006) +[2024-08-24 20:43:48,463][03430] Updated weights for policy 0, policy_version 21110 (0.0006) +[2024-08-24 20:43:49,570][03430] Updated weights for policy 0, policy_version 21120 (0.0006) +[2024-08-24 20:43:50,708][03430] Updated weights for policy 0, policy_version 21130 (0.0006) +[2024-08-24 20:43:50,812][01192] Fps is (10 sec: 36044.6, 60 sec: 36317.9, 300 sec: 36169.8). Total num frames: 86548480. Throughput: 0: 9135.4. Samples: 21611794. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:43:50,813][01192] Avg episode reward: [(0, '4.368')] +[2024-08-24 20:43:51,843][03430] Updated weights for policy 0, policy_version 21140 (0.0005) +[2024-08-24 20:43:52,969][03430] Updated weights for policy 0, policy_version 21150 (0.0006) +[2024-08-24 20:43:54,089][03430] Updated weights for policy 0, policy_version 21160 (0.0006) +[2024-08-24 20:43:55,250][03430] Updated weights for policy 0, policy_version 21170 (0.0006) +[2024-08-24 20:43:55,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36169.8). Total num frames: 86728704. Throughput: 0: 9110.3. Samples: 21666484. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:43:55,812][01192] Avg episode reward: [(0, '4.468')] +[2024-08-24 20:43:56,396][03430] Updated weights for policy 0, policy_version 21180 (0.0006) +[2024-08-24 20:43:57,531][03430] Updated weights for policy 0, policy_version 21190 (0.0007) +[2024-08-24 20:43:58,667][03430] Updated weights for policy 0, policy_version 21200 (0.0006) +[2024-08-24 20:43:59,771][03430] Updated weights for policy 0, policy_version 21210 (0.0005) +[2024-08-24 20:44:00,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36454.4, 300 sec: 36183.6). 
Total num frames: 86913024. Throughput: 0: 9070.0. Samples: 21720404. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:44:00,813][01192] Avg episode reward: [(0, '4.770')] +[2024-08-24 20:44:00,903][03430] Updated weights for policy 0, policy_version 21220 (0.0006) +[2024-08-24 20:44:02,056][03430] Updated weights for policy 0, policy_version 21230 (0.0007) +[2024-08-24 20:44:03,180][03430] Updated weights for policy 0, policy_version 21240 (0.0007) +[2024-08-24 20:44:04,320][03430] Updated weights for policy 0, policy_version 21250 (0.0006) +[2024-08-24 20:44:05,459][03430] Updated weights for policy 0, policy_version 21260 (0.0006) +[2024-08-24 20:44:05,812][01192] Fps is (10 sec: 36453.8, 60 sec: 36386.0, 300 sec: 36183.6). Total num frames: 87093248. Throughput: 0: 9057.2. Samples: 21747354. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:44:05,813][01192] Avg episode reward: [(0, '4.335')] +[2024-08-24 20:44:06,604][03430] Updated weights for policy 0, policy_version 21270 (0.0005) +[2024-08-24 20:44:07,725][03430] Updated weights for policy 0, policy_version 21280 (0.0006) +[2024-08-24 20:44:08,858][03430] Updated weights for policy 0, policy_version 21290 (0.0006) +[2024-08-24 20:44:09,988][03430] Updated weights for policy 0, policy_version 21300 (0.0006) +[2024-08-24 20:44:10,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36317.9, 300 sec: 36183.6). Total num frames: 87273472. Throughput: 0: 9041.9. Samples: 21801810. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:44:10,813][01192] Avg episode reward: [(0, '4.290')] +[2024-08-24 20:44:11,133][03430] Updated weights for policy 0, policy_version 21310 (0.0006) +[2024-08-24 20:44:12,251][03430] Updated weights for policy 0, policy_version 21320 (0.0005) +[2024-08-24 20:44:13,368][03430] Updated weights for policy 0, policy_version 21330 (0.0005) +[2024-08-24 20:44:14,466][03430] Updated weights for policy 0, policy_version 21340 (0.0006) +[2024-08-24 20:44:15,565][03430] Updated weights for policy 0, policy_version 21350 (0.0006) +[2024-08-24 20:44:15,812][01192] Fps is (10 sec: 36454.6, 60 sec: 36317.9, 300 sec: 36155.9). Total num frames: 87457792. Throughput: 0: 9046.3. Samples: 21856270. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:44:15,813][01192] Avg episode reward: [(0, '4.248')] +[2024-08-24 20:44:15,816][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000021352_87457792.pth... +[2024-08-24 20:44:15,839][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000019232_78774272.pth +[2024-08-24 20:44:16,713][03430] Updated weights for policy 0, policy_version 21360 (0.0006) +[2024-08-24 20:44:17,821][03430] Updated weights for policy 0, policy_version 21370 (0.0006) +[2024-08-24 20:44:18,914][03430] Updated weights for policy 0, policy_version 21380 (0.0005) +[2024-08-24 20:44:20,005][03430] Updated weights for policy 0, policy_version 21390 (0.0005) +[2024-08-24 20:44:20,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36317.9, 300 sec: 36183.7). Total num frames: 87642112. Throughput: 0: 9051.1. Samples: 21883612. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:44:20,812][01192] Avg episode reward: [(0, '4.476')] +[2024-08-24 20:44:21,130][03430] Updated weights for policy 0, policy_version 21400 (0.0006) +[2024-08-24 20:44:22,257][03430] Updated weights for policy 0, policy_version 21410 (0.0006) +[2024-08-24 20:44:23,372][03430] Updated weights for policy 0, policy_version 21420 (0.0005) +[2024-08-24 20:44:24,434][03430] Updated weights for policy 0, policy_version 21430 (0.0005) +[2024-08-24 20:44:25,567][03430] Updated weights for policy 0, policy_version 21440 (0.0005) +[2024-08-24 20:44:25,812][01192] Fps is (10 sec: 36864.4, 60 sec: 36386.1, 300 sec: 36197.5). Total num frames: 87826432. Throughput: 0: 9089.6. Samples: 21939348. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:44:25,813][01192] Avg episode reward: [(0, '4.298')] +[2024-08-24 20:44:26,667][03430] Updated weights for policy 0, policy_version 21450 (0.0005) +[2024-08-24 20:44:27,782][03430] Updated weights for policy 0, policy_version 21460 (0.0005) +[2024-08-24 20:44:28,877][03430] Updated weights for policy 0, policy_version 21470 (0.0007) +[2024-08-24 20:44:30,000][03430] Updated weights for policy 0, policy_version 21480 (0.0006) +[2024-08-24 20:44:30,812][01192] Fps is (10 sec: 36864.0, 60 sec: 36386.1, 300 sec: 36197.5). Total num frames: 88010752. Throughput: 0: 9100.5. Samples: 21994448. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:44:30,813][01192] Avg episode reward: [(0, '4.480')] +[2024-08-24 20:44:31,117][03430] Updated weights for policy 0, policy_version 21490 (0.0007) +[2024-08-24 20:44:32,248][03430] Updated weights for policy 0, policy_version 21500 (0.0006) +[2024-08-24 20:44:33,359][03430] Updated weights for policy 0, policy_version 21510 (0.0005) +[2024-08-24 20:44:34,520][03430] Updated weights for policy 0, policy_version 21520 (0.0006) +[2024-08-24 20:44:35,645][03430] Updated weights for policy 0, policy_version 21530 (0.0006) +[2024-08-24 20:44:35,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.1, 300 sec: 36197.5). Total num frames: 88190976. Throughput: 0: 9117.2. Samples: 22022068. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:44:35,813][01192] Avg episode reward: [(0, '4.324')] +[2024-08-24 20:44:36,786][03430] Updated weights for policy 0, policy_version 21540 (0.0005) +[2024-08-24 20:44:37,877][03430] Updated weights for policy 0, policy_version 21550 (0.0005) +[2024-08-24 20:44:38,989][03430] Updated weights for policy 0, policy_version 21560 (0.0006) +[2024-08-24 20:44:40,145][03430] Updated weights for policy 0, policy_version 21570 (0.0007) +[2024-08-24 20:44:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36211.4). Total num frames: 88371200. Throughput: 0: 9117.0. Samples: 22076748. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:44:40,813][01192] Avg episode reward: [(0, '4.412')] +[2024-08-24 20:44:41,294][03430] Updated weights for policy 0, policy_version 21580 (0.0005) +[2024-08-24 20:44:42,432][03430] Updated weights for policy 0, policy_version 21590 (0.0005) +[2024-08-24 20:44:43,548][03430] Updated weights for policy 0, policy_version 21600 (0.0006) +[2024-08-24 20:44:44,665][03430] Updated weights for policy 0, policy_version 21610 (0.0005) +[2024-08-24 20:44:45,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36211.4). 
Total num frames: 88551424. Throughput: 0: 9121.2. Samples: 22130860. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:44:45,812][01192] Avg episode reward: [(0, '4.628')] +[2024-08-24 20:44:45,817][03430] Updated weights for policy 0, policy_version 21620 (0.0005) +[2024-08-24 20:44:46,929][03430] Updated weights for policy 0, policy_version 21630 (0.0006) +[2024-08-24 20:44:48,053][03430] Updated weights for policy 0, policy_version 21640 (0.0005) +[2024-08-24 20:44:49,202][03430] Updated weights for policy 0, policy_version 21650 (0.0006) +[2024-08-24 20:44:50,358][03430] Updated weights for policy 0, policy_version 21660 (0.0006) +[2024-08-24 20:44:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.4, 300 sec: 36211.4). Total num frames: 88735744. Throughput: 0: 9133.4. Samples: 22158354. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:44:50,813][01192] Avg episode reward: [(0, '4.334')] +[2024-08-24 20:44:51,489][03430] Updated weights for policy 0, policy_version 21670 (0.0006) +[2024-08-24 20:44:52,630][03430] Updated weights for policy 0, policy_version 21680 (0.0006) +[2024-08-24 20:44:53,779][03430] Updated weights for policy 0, policy_version 21690 (0.0006) +[2024-08-24 20:44:54,920][03430] Updated weights for policy 0, policy_version 21700 (0.0006) +[2024-08-24 20:44:55,812][01192] Fps is (10 sec: 36454.3, 60 sec: 36454.4, 300 sec: 36211.4). Total num frames: 88915968. Throughput: 0: 9114.6. Samples: 22211966. 
Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:44:55,813][01192] Avg episode reward: [(0, '4.381')] +[2024-08-24 20:44:56,009][03430] Updated weights for policy 0, policy_version 21710 (0.0005) +[2024-08-24 20:44:57,124][03430] Updated weights for policy 0, policy_version 21720 (0.0005) +[2024-08-24 20:44:58,293][03430] Updated weights for policy 0, policy_version 21730 (0.0006) +[2024-08-24 20:44:59,394][03430] Updated weights for policy 0, policy_version 21740 (0.0006) +[2024-08-24 20:45:00,546][03430] Updated weights for policy 0, policy_version 21750 (0.0005) +[2024-08-24 20:45:00,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36211.4). Total num frames: 89096192. Throughput: 0: 9119.7. Samples: 22266654. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) +[2024-08-24 20:45:00,813][01192] Avg episode reward: [(0, '4.219')] +[2024-08-24 20:45:01,697][03430] Updated weights for policy 0, policy_version 21760 (0.0005) +[2024-08-24 20:45:02,838][03430] Updated weights for policy 0, policy_version 21770 (0.0005) +[2024-08-24 20:45:03,967][03430] Updated weights for policy 0, policy_version 21780 (0.0007) +[2024-08-24 20:45:05,119][03430] Updated weights for policy 0, policy_version 21790 (0.0006) +[2024-08-24 20:45:05,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.2, 300 sec: 36225.3). Total num frames: 89276416. Throughput: 0: 9103.6. Samples: 22293272. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:05,813][01192] Avg episode reward: [(0, '4.356')] +[2024-08-24 20:45:06,256][03430] Updated weights for policy 0, policy_version 21800 (0.0005) +[2024-08-24 20:45:07,340][03430] Updated weights for policy 0, policy_version 21810 (0.0005) +[2024-08-24 20:45:08,427][03430] Updated weights for policy 0, policy_version 21820 (0.0007) +[2024-08-24 20:45:09,555][03430] Updated weights for policy 0, policy_version 21830 (0.0005) +[2024-08-24 20:45:10,694][03430] Updated weights for policy 0, policy_version 21840 (0.0006) +[2024-08-24 20:45:10,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36386.1, 300 sec: 36225.3). Total num frames: 89456640. Throughput: 0: 9080.8. Samples: 22347982. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:10,813][01192] Avg episode reward: [(0, '4.490')] +[2024-08-24 20:45:11,829][03430] Updated weights for policy 0, policy_version 21850 (0.0005) +[2024-08-24 20:45:12,960][03430] Updated weights for policy 0, policy_version 21860 (0.0006) +[2024-08-24 20:45:14,097][03430] Updated weights for policy 0, policy_version 21870 (0.0006) +[2024-08-24 20:45:15,223][03430] Updated weights for policy 0, policy_version 21880 (0.0006) +[2024-08-24 20:45:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36386.2, 300 sec: 36225.3). Total num frames: 89640960. Throughput: 0: 9065.9. Samples: 22402412. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:15,813][01192] Avg episode reward: [(0, '4.355')] +[2024-08-24 20:45:16,351][03430] Updated weights for policy 0, policy_version 21890 (0.0005) +[2024-08-24 20:45:17,492][03430] Updated weights for policy 0, policy_version 21900 (0.0006) +[2024-08-24 20:45:18,576][03430] Updated weights for policy 0, policy_version 21910 (0.0006) +[2024-08-24 20:45:19,658][03430] Updated weights for policy 0, policy_version 21920 (0.0005) +[2024-08-24 20:45:20,790][03430] Updated weights for policy 0, policy_version 21930 (0.0005) +[2024-08-24 20:45:20,812][01192] Fps is (10 sec: 36863.3, 60 sec: 36386.0, 300 sec: 36253.1). Total num frames: 89825280. Throughput: 0: 9055.3. Samples: 22429560. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:20,813][01192] Avg episode reward: [(0, '4.611')] +[2024-08-24 20:45:21,920][03430] Updated weights for policy 0, policy_version 21940 (0.0006) +[2024-08-24 20:45:23,062][03430] Updated weights for policy 0, policy_version 21950 (0.0007) +[2024-08-24 20:45:24,196][03430] Updated weights for policy 0, policy_version 21960 (0.0006) +[2024-08-24 20:45:25,322][03430] Updated weights for policy 0, policy_version 21970 (0.0005) +[2024-08-24 20:45:25,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36317.8, 300 sec: 36239.2). Total num frames: 90005504. Throughput: 0: 9061.1. Samples: 22484498. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:25,813][01192] Avg episode reward: [(0, '4.658')] +[2024-08-24 20:45:26,496][03430] Updated weights for policy 0, policy_version 21980 (0.0006) +[2024-08-24 20:45:27,664][03430] Updated weights for policy 0, policy_version 21990 (0.0005) +[2024-08-24 20:45:28,875][03430] Updated weights for policy 0, policy_version 22000 (0.0006) +[2024-08-24 20:45:30,071][03430] Updated weights for policy 0, policy_version 22010 (0.0006) +[2024-08-24 20:45:30,812][01192] Fps is (10 sec: 35226.3, 60 sec: 36113.1, 300 sec: 36211.4). 
Total num frames: 90177536. Throughput: 0: 9022.7. Samples: 22536882. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:45:30,813][01192] Avg episode reward: [(0, '4.518')] +[2024-08-24 20:45:31,200][03430] Updated weights for policy 0, policy_version 22020 (0.0007) +[2024-08-24 20:45:32,349][03430] Updated weights for policy 0, policy_version 22030 (0.0006) +[2024-08-24 20:45:33,470][03430] Updated weights for policy 0, policy_version 22040 (0.0005) +[2024-08-24 20:45:34,609][03430] Updated weights for policy 0, policy_version 22050 (0.0005) +[2024-08-24 20:45:35,721][03430] Updated weights for policy 0, policy_version 22060 (0.0006) +[2024-08-24 20:45:35,812][01192] Fps is (10 sec: 35225.9, 60 sec: 36113.1, 300 sec: 36253.1). Total num frames: 90357760. Throughput: 0: 9007.4. Samples: 22563688. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:45:35,812][01192] Avg episode reward: [(0, '4.127')] +[2024-08-24 20:45:36,880][03430] Updated weights for policy 0, policy_version 22070 (0.0006) +[2024-08-24 20:45:38,006][03430] Updated weights for policy 0, policy_version 22080 (0.0005) +[2024-08-24 20:45:39,145][03430] Updated weights for policy 0, policy_version 22090 (0.0006) +[2024-08-24 20:45:40,294][03430] Updated weights for policy 0, policy_version 22100 (0.0007) +[2024-08-24 20:45:40,812][01192] Fps is (10 sec: 36044.5, 60 sec: 36113.0, 300 sec: 36225.3). Total num frames: 90537984. Throughput: 0: 9027.2. Samples: 22618190. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:45:40,813][01192] Avg episode reward: [(0, '4.256')] +[2024-08-24 20:45:41,423][03430] Updated weights for policy 0, policy_version 22110 (0.0007) +[2024-08-24 20:45:42,540][03430] Updated weights for policy 0, policy_version 22120 (0.0006) +[2024-08-24 20:45:43,688][03430] Updated weights for policy 0, policy_version 22130 (0.0006) +[2024-08-24 20:45:44,803][03430] Updated weights for policy 0, policy_version 22140 (0.0005) +[2024-08-24 20:45:45,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36181.3, 300 sec: 36239.2). Total num frames: 90722304. Throughput: 0: 9012.2. Samples: 22672202. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:45:45,814][01192] Avg episode reward: [(0, '4.480')] +[2024-08-24 20:45:45,939][03430] Updated weights for policy 0, policy_version 22150 (0.0005) +[2024-08-24 20:45:47,103][03430] Updated weights for policy 0, policy_version 22160 (0.0005) +[2024-08-24 20:45:48,216][03430] Updated weights for policy 0, policy_version 22170 (0.0006) +[2024-08-24 20:45:49,348][03430] Updated weights for policy 0, policy_version 22180 (0.0006) +[2024-08-24 20:45:50,468][03430] Updated weights for policy 0, policy_version 22190 (0.0006) +[2024-08-24 20:45:50,812][01192] Fps is (10 sec: 36454.6, 60 sec: 36113.1, 300 sec: 36239.2). Total num frames: 90902528. Throughput: 0: 9025.6. Samples: 22699422. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:45:50,813][01192] Avg episode reward: [(0, '4.554')] +[2024-08-24 20:45:51,568][03430] Updated weights for policy 0, policy_version 22200 (0.0005) +[2024-08-24 20:45:52,682][03430] Updated weights for policy 0, policy_version 22210 (0.0005) +[2024-08-24 20:45:53,824][03430] Updated weights for policy 0, policy_version 22220 (0.0006) +[2024-08-24 20:45:54,943][03430] Updated weights for policy 0, policy_version 22230 (0.0005) +[2024-08-24 20:45:55,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.1, 300 sec: 36239.2). 
Total num frames: 91082752. Throughput: 0: 9026.2. Samples: 22754162. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:45:55,812][01192] Avg episode reward: [(0, '4.336')] +[2024-08-24 20:45:56,071][03430] Updated weights for policy 0, policy_version 22240 (0.0005) +[2024-08-24 20:45:57,208][03430] Updated weights for policy 0, policy_version 22250 (0.0005) +[2024-08-24 20:45:58,337][03430] Updated weights for policy 0, policy_version 22260 (0.0006) +[2024-08-24 20:45:59,450][03430] Updated weights for policy 0, policy_version 22270 (0.0005) +[2024-08-24 20:46:00,577][03430] Updated weights for policy 0, policy_version 22280 (0.0005) +[2024-08-24 20:46:00,812][01192] Fps is (10 sec: 36454.1, 60 sec: 36181.3, 300 sec: 36253.1). Total num frames: 91267072. Throughput: 0: 9029.3. Samples: 22808730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:00,813][01192] Avg episode reward: [(0, '4.505')] +[2024-08-24 20:46:01,687][03430] Updated weights for policy 0, policy_version 22290 (0.0004) +[2024-08-24 20:46:02,820][03430] Updated weights for policy 0, policy_version 22300 (0.0006) +[2024-08-24 20:46:03,925][03430] Updated weights for policy 0, policy_version 22310 (0.0005) +[2024-08-24 20:46:05,050][03430] Updated weights for policy 0, policy_version 22320 (0.0006) +[2024-08-24 20:46:05,812][01192] Fps is (10 sec: 36453.9, 60 sec: 36181.3, 300 sec: 36253.1). Total num frames: 91447296. Throughput: 0: 9032.4. Samples: 22836016. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:05,813][01192] Avg episode reward: [(0, '4.607')] +[2024-08-24 20:46:06,189][03430] Updated weights for policy 0, policy_version 22330 (0.0006) +[2024-08-24 20:46:07,336][03430] Updated weights for policy 0, policy_version 22340 (0.0005) +[2024-08-24 20:46:08,474][03430] Updated weights for policy 0, policy_version 22350 (0.0006) +[2024-08-24 20:46:09,619][03430] Updated weights for policy 0, policy_version 22360 (0.0006) +[2024-08-24 20:46:10,731][03430] Updated weights for policy 0, policy_version 22370 (0.0006) +[2024-08-24 20:46:10,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36181.3, 300 sec: 36253.1). Total num frames: 91627520. Throughput: 0: 9018.9. Samples: 22890346. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:10,812][01192] Avg episode reward: [(0, '4.665')] +[2024-08-24 20:46:11,864][03430] Updated weights for policy 0, policy_version 22380 (0.0006) +[2024-08-24 20:46:12,988][03430] Updated weights for policy 0, policy_version 22390 (0.0005) +[2024-08-24 20:46:14,119][03430] Updated weights for policy 0, policy_version 22400 (0.0006) +[2024-08-24 20:46:15,253][03430] Updated weights for policy 0, policy_version 22410 (0.0006) +[2024-08-24 20:46:15,812][01192] Fps is (10 sec: 36045.0, 60 sec: 36113.1, 300 sec: 36267.0). Total num frames: 91807744. Throughput: 0: 9066.8. Samples: 22944888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:15,813][01192] Avg episode reward: [(0, '4.507')] +[2024-08-24 20:46:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000022415_91811840.pth... 
+[2024-08-24 20:46:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000020290_83107840.pth +[2024-08-24 20:46:16,380][03430] Updated weights for policy 0, policy_version 22420 (0.0006) +[2024-08-24 20:46:17,505][03430] Updated weights for policy 0, policy_version 22430 (0.0006) +[2024-08-24 20:46:18,675][03430] Updated weights for policy 0, policy_version 22440 (0.0007) +[2024-08-24 20:46:19,777][03430] Updated weights for policy 0, policy_version 22450 (0.0005) +[2024-08-24 20:46:20,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36113.2, 300 sec: 36294.7). Total num frames: 91992064. Throughput: 0: 9076.1. Samples: 22972114. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:46:20,813][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:46:20,881][03430] Updated weights for policy 0, policy_version 22460 (0.0006) +[2024-08-24 20:46:22,042][03430] Updated weights for policy 0, policy_version 22470 (0.0006) +[2024-08-24 20:46:23,181][03430] Updated weights for policy 0, policy_version 22480 (0.0005) +[2024-08-24 20:46:24,328][03430] Updated weights for policy 0, policy_version 22490 (0.0005) +[2024-08-24 20:46:25,480][03430] Updated weights for policy 0, policy_version 22500 (0.0006) +[2024-08-24 20:46:25,812][01192] Fps is (10 sec: 36044.7, 60 sec: 36044.9, 300 sec: 36267.0). Total num frames: 92168192. Throughput: 0: 9066.1. Samples: 23026162. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:46:25,812][01192] Avg episode reward: [(0, '4.463')] +[2024-08-24 20:46:26,661][03430] Updated weights for policy 0, policy_version 22510 (0.0006) +[2024-08-24 20:46:27,810][03430] Updated weights for policy 0, policy_version 22520 (0.0006) +[2024-08-24 20:46:28,938][03430] Updated weights for policy 0, policy_version 22530 (0.0006) +[2024-08-24 20:46:30,058][03430] Updated weights for policy 0, policy_version 22540 (0.0006) +[2024-08-24 20:46:30,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36181.3, 300 sec: 36267.0). Total num frames: 92348416. Throughput: 0: 9054.1. Samples: 23079638. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:46:30,813][01192] Avg episode reward: [(0, '4.418')] +[2024-08-24 20:46:31,213][03430] Updated weights for policy 0, policy_version 22550 (0.0006) +[2024-08-24 20:46:32,377][03430] Updated weights for policy 0, policy_version 22560 (0.0006) +[2024-08-24 20:46:33,549][03430] Updated weights for policy 0, policy_version 22570 (0.0006) +[2024-08-24 20:46:34,712][03430] Updated weights for policy 0, policy_version 22580 (0.0006) +[2024-08-24 20:46:35,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36113.1, 300 sec: 36239.2). Total num frames: 92524544. Throughput: 0: 9038.8. Samples: 23106170. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:35,813][01192] Avg episode reward: [(0, '4.635')] +[2024-08-24 20:46:35,843][03430] Updated weights for policy 0, policy_version 22590 (0.0006) +[2024-08-24 20:46:36,989][03430] Updated weights for policy 0, policy_version 22600 (0.0006) +[2024-08-24 20:46:38,120][03430] Updated weights for policy 0, policy_version 22610 (0.0006) +[2024-08-24 20:46:39,268][03430] Updated weights for policy 0, policy_version 22620 (0.0007) +[2024-08-24 20:46:40,360][03430] Updated weights for policy 0, policy_version 22630 (0.0005) +[2024-08-24 20:46:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36181.4, 300 sec: 36253.1). 
Total num frames: 92708864. Throughput: 0: 9008.0. Samples: 23159524. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:40,812][01192] Avg episode reward: [(0, '4.501')] +[2024-08-24 20:46:41,465][03430] Updated weights for policy 0, policy_version 22640 (0.0006) +[2024-08-24 20:46:42,574][03430] Updated weights for policy 0, policy_version 22650 (0.0006) +[2024-08-24 20:46:43,681][03430] Updated weights for policy 0, policy_version 22660 (0.0005) +[2024-08-24 20:46:44,816][03430] Updated weights for policy 0, policy_version 22670 (0.0006) +[2024-08-24 20:46:45,812][01192] Fps is (10 sec: 36454.2, 60 sec: 36113.0, 300 sec: 36239.2). Total num frames: 92889088. Throughput: 0: 9023.1. Samples: 23214768. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:45,813][01192] Avg episode reward: [(0, '4.613')] +[2024-08-24 20:46:45,961][03430] Updated weights for policy 0, policy_version 22680 (0.0005) +[2024-08-24 20:46:47,131][03430] Updated weights for policy 0, policy_version 22690 (0.0006) +[2024-08-24 20:46:48,240][03430] Updated weights for policy 0, policy_version 22700 (0.0006) +[2024-08-24 20:46:49,357][03430] Updated weights for policy 0, policy_version 22710 (0.0006) +[2024-08-24 20:46:50,500][03430] Updated weights for policy 0, policy_version 22720 (0.0005) +[2024-08-24 20:46:50,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36239.2). Total num frames: 93069312. Throughput: 0: 9016.7. Samples: 23241766. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:50,813][01192] Avg episode reward: [(0, '4.376')] +[2024-08-24 20:46:51,655][03430] Updated weights for policy 0, policy_version 22730 (0.0006) +[2024-08-24 20:46:52,794][03430] Updated weights for policy 0, policy_version 22740 (0.0006) +[2024-08-24 20:46:53,946][03430] Updated weights for policy 0, policy_version 22750 (0.0005) +[2024-08-24 20:46:55,080][03430] Updated weights for policy 0, policy_version 22760 (0.0005) +[2024-08-24 20:46:55,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.0, 300 sec: 36239.2). Total num frames: 93249536. Throughput: 0: 9011.6. Samples: 23295866. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:46:55,813][01192] Avg episode reward: [(0, '4.329')] +[2024-08-24 20:46:56,302][03430] Updated weights for policy 0, policy_version 22770 (0.0006) +[2024-08-24 20:46:57,548][03430] Updated weights for policy 0, policy_version 22780 (0.0007) +[2024-08-24 20:46:58,695][03430] Updated weights for policy 0, policy_version 22790 (0.0005) +[2024-08-24 20:46:59,845][03430] Updated weights for policy 0, policy_version 22800 (0.0006) +[2024-08-24 20:47:00,812][01192] Fps is (10 sec: 35225.6, 60 sec: 35908.3, 300 sec: 36211.4). Total num frames: 93421568. Throughput: 0: 8958.0. Samples: 23347998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:47:00,812][01192] Avg episode reward: [(0, '4.353')] +[2024-08-24 20:47:00,967][03430] Updated weights for policy 0, policy_version 22810 (0.0005) +[2024-08-24 20:47:02,102][03430] Updated weights for policy 0, policy_version 22820 (0.0007) +[2024-08-24 20:47:03,252][03430] Updated weights for policy 0, policy_version 22830 (0.0006) +[2024-08-24 20:47:04,359][03430] Updated weights for policy 0, policy_version 22840 (0.0006) +[2024-08-24 20:47:05,497][03430] Updated weights for policy 0, policy_version 22850 (0.0006) +[2024-08-24 20:47:05,812][01192] Fps is (10 sec: 35225.5, 60 sec: 35908.3, 300 sec: 36211.4). 
Total num frames: 93601792. Throughput: 0: 8952.6. Samples: 23374980. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:47:05,813][01192] Avg episode reward: [(0, '4.545')] +[2024-08-24 20:47:06,636][03430] Updated weights for policy 0, policy_version 22860 (0.0005) +[2024-08-24 20:47:07,771][03430] Updated weights for policy 0, policy_version 22870 (0.0006) +[2024-08-24 20:47:08,903][03430] Updated weights for policy 0, policy_version 22880 (0.0006) +[2024-08-24 20:47:10,031][03430] Updated weights for policy 0, policy_version 22890 (0.0006) +[2024-08-24 20:47:10,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35908.3, 300 sec: 36183.7). Total num frames: 93782016. Throughput: 0: 8957.5. Samples: 23429248. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:47:10,813][01192] Avg episode reward: [(0, '4.508')] +[2024-08-24 20:47:11,184][03430] Updated weights for policy 0, policy_version 22900 (0.0007) +[2024-08-24 20:47:12,324][03430] Updated weights for policy 0, policy_version 22910 (0.0005) +[2024-08-24 20:47:13,454][03430] Updated weights for policy 0, policy_version 22920 (0.0005) +[2024-08-24 20:47:14,584][03430] Updated weights for policy 0, policy_version 22930 (0.0005) +[2024-08-24 20:47:15,718][03430] Updated weights for policy 0, policy_version 22940 (0.0005) +[2024-08-24 20:47:15,812][01192] Fps is (10 sec: 36044.7, 60 sec: 35908.2, 300 sec: 36197.5). Total num frames: 93962240. Throughput: 0: 8965.9. Samples: 23483104. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) +[2024-08-24 20:47:15,813][01192] Avg episode reward: [(0, '4.324')] +[2024-08-24 20:47:16,841][03430] Updated weights for policy 0, policy_version 22950 (0.0005) +[2024-08-24 20:47:18,008][03430] Updated weights for policy 0, policy_version 22960 (0.0006) +[2024-08-24 20:47:19,142][03430] Updated weights for policy 0, policy_version 22970 (0.0006) +[2024-08-24 20:47:20,261][03430] Updated weights for policy 0, policy_version 22980 (0.0005) +[2024-08-24 20:47:20,812][01192] Fps is (10 sec: 36044.4, 60 sec: 35839.9, 300 sec: 36183.6). Total num frames: 94142464. Throughput: 0: 8975.6. Samples: 23510072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:47:20,814][01192] Avg episode reward: [(0, '4.429')] +[2024-08-24 20:47:21,409][03430] Updated weights for policy 0, policy_version 22990 (0.0006) +[2024-08-24 20:47:22,564][03430] Updated weights for policy 0, policy_version 23000 (0.0005) +[2024-08-24 20:47:23,692][03430] Updated weights for policy 0, policy_version 23010 (0.0006) +[2024-08-24 20:47:24,842][03430] Updated weights for policy 0, policy_version 23020 (0.0004) +[2024-08-24 20:47:25,812][01192] Fps is (10 sec: 36044.8, 60 sec: 35908.3, 300 sec: 36183.6). Total num frames: 94322688. Throughput: 0: 8993.2. Samples: 23564216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:47:25,813][01192] Avg episode reward: [(0, '4.470')] +[2024-08-24 20:47:25,969][03430] Updated weights for policy 0, policy_version 23030 (0.0005) +[2024-08-24 20:47:27,139][03430] Updated weights for policy 0, policy_version 23040 (0.0006) +[2024-08-24 20:47:28,254][03430] Updated weights for policy 0, policy_version 23050 (0.0006) +[2024-08-24 20:47:29,396][03430] Updated weights for policy 0, policy_version 23060 (0.0005) +[2024-08-24 20:47:30,643][03430] Updated weights for policy 0, policy_version 23070 (0.0005) +[2024-08-24 20:47:30,812][01192] Fps is (10 sec: 35635.6, 60 sec: 35840.0, 300 sec: 36169.8). 
Total num frames: 94498816. Throughput: 0: 8951.6. Samples: 23617588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:47:30,813][01192] Avg episode reward: [(0, '4.488')] +[2024-08-24 20:47:31,824][03430] Updated weights for policy 0, policy_version 23080 (0.0005) +[2024-08-24 20:47:32,969][03430] Updated weights for policy 0, policy_version 23090 (0.0006) +[2024-08-24 20:47:34,132][03430] Updated weights for policy 0, policy_version 23100 (0.0007) +[2024-08-24 20:47:35,226][03430] Updated weights for policy 0, policy_version 23110 (0.0006) +[2024-08-24 20:47:35,812][01192] Fps is (10 sec: 35635.2, 60 sec: 35908.3, 300 sec: 36169.8). Total num frames: 94679040. Throughput: 0: 8935.6. Samples: 23643866. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:47:35,813][01192] Avg episode reward: [(0, '4.315')] +[2024-08-24 20:47:36,343][03430] Updated weights for policy 0, policy_version 23120 (0.0005) +[2024-08-24 20:47:37,459][03430] Updated weights for policy 0, policy_version 23130 (0.0005) +[2024-08-24 20:47:38,586][03430] Updated weights for policy 0, policy_version 23140 (0.0006) +[2024-08-24 20:47:39,724][03430] Updated weights for policy 0, policy_version 23150 (0.0006) +[2024-08-24 20:47:40,812][01192] Fps is (10 sec: 36044.6, 60 sec: 35839.9, 300 sec: 36169.8). Total num frames: 94859264. Throughput: 0: 8941.7. Samples: 23698244. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:47:40,813][01192] Avg episode reward: [(0, '4.614')] +[2024-08-24 20:47:40,833][03430] Updated weights for policy 0, policy_version 23160 (0.0006) +[2024-08-24 20:47:41,966][03430] Updated weights for policy 0, policy_version 23170 (0.0005) +[2024-08-24 20:47:43,114][03430] Updated weights for policy 0, policy_version 23180 (0.0006) +[2024-08-24 20:47:44,245][03430] Updated weights for policy 0, policy_version 23190 (0.0006) +[2024-08-24 20:47:45,385][03430] Updated weights for policy 0, policy_version 23200 (0.0006) +[2024-08-24 20:47:45,812][01192] Fps is (10 sec: 36044.6, 60 sec: 35840.0, 300 sec: 36169.8). Total num frames: 95039488. Throughput: 0: 8990.4. Samples: 23752568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:47:45,813][01192] Avg episode reward: [(0, '4.376')] +[2024-08-24 20:47:46,530][03430] Updated weights for policy 0, policy_version 23210 (0.0006) +[2024-08-24 20:47:47,664][03430] Updated weights for policy 0, policy_version 23220 (0.0006) +[2024-08-24 20:47:48,788][03430] Updated weights for policy 0, policy_version 23230 (0.0006) +[2024-08-24 20:47:49,938][03430] Updated weights for policy 0, policy_version 23240 (0.0006) +[2024-08-24 20:47:50,812][01192] Fps is (10 sec: 36044.9, 60 sec: 35840.0, 300 sec: 36183.6). Total num frames: 95219712. Throughput: 0: 8990.8. Samples: 23779566. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:47:50,813][01192] Avg episode reward: [(0, '4.388')] +[2024-08-24 20:47:51,097][03430] Updated weights for policy 0, policy_version 23250 (0.0006) +[2024-08-24 20:47:52,235][03430] Updated weights for policy 0, policy_version 23260 (0.0005) +[2024-08-24 20:47:53,355][03430] Updated weights for policy 0, policy_version 23270 (0.0006) +[2024-08-24 20:47:54,454][03430] Updated weights for policy 0, policy_version 23280 (0.0006) +[2024-08-24 20:47:55,584][03430] Updated weights for policy 0, policy_version 23290 (0.0006) +[2024-08-24 20:47:55,812][01192] Fps is (10 sec: 36044.5, 60 sec: 35839.9, 300 sec: 36183.6). Total num frames: 95399936. Throughput: 0: 8984.8. Samples: 23833564. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:47:55,813][01192] Avg episode reward: [(0, '4.543')] +[2024-08-24 20:47:56,739][03430] Updated weights for policy 0, policy_version 23300 (0.0006) +[2024-08-24 20:47:57,865][03430] Updated weights for policy 0, policy_version 23310 (0.0005) +[2024-08-24 20:47:59,025][03430] Updated weights for policy 0, policy_version 23320 (0.0005) +[2024-08-24 20:48:00,147][03430] Updated weights for policy 0, policy_version 23330 (0.0005) +[2024-08-24 20:48:00,812][01192] Fps is (10 sec: 36044.9, 60 sec: 35976.5, 300 sec: 36169.8). Total num frames: 95580160. Throughput: 0: 8987.0. Samples: 23887518. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2024-08-24 20:48:00,813][01192] Avg episode reward: [(0, '4.545')] +[2024-08-24 20:48:01,286][03430] Updated weights for policy 0, policy_version 23340 (0.0006) +[2024-08-24 20:48:02,410][03430] Updated weights for policy 0, policy_version 23350 (0.0007) +[2024-08-24 20:48:03,554][03430] Updated weights for policy 0, policy_version 23360 (0.0006) +[2024-08-24 20:48:04,716][03430] Updated weights for policy 0, policy_version 23370 (0.0007) +[2024-08-24 20:48:05,812][01192] Fps is (10 sec: 36045.2, 60 sec: 35976.5, 300 sec: 36155.9). 
Total num frames: 95760384. Throughput: 0: 8997.4. Samples: 23914956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:48:05,812][01192] Avg episode reward: [(0, '3.977')] +[2024-08-24 20:48:05,876][03430] Updated weights for policy 0, policy_version 23380 (0.0006) +[2024-08-24 20:48:07,014][03430] Updated weights for policy 0, policy_version 23390 (0.0005) +[2024-08-24 20:48:08,134][03430] Updated weights for policy 0, policy_version 23400 (0.0006) +[2024-08-24 20:48:09,282][03430] Updated weights for policy 0, policy_version 23410 (0.0005) +[2024-08-24 20:48:10,438][03430] Updated weights for policy 0, policy_version 23420 (0.0005) +[2024-08-24 20:48:10,812][01192] Fps is (10 sec: 36044.9, 60 sec: 35976.5, 300 sec: 36142.0). Total num frames: 95940608. Throughput: 0: 8980.7. Samples: 23968346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:48:10,813][01192] Avg episode reward: [(0, '4.437')] +[2024-08-24 20:48:11,584][03430] Updated weights for policy 0, policy_version 23430 (0.0006) +[2024-08-24 20:48:12,712][03430] Updated weights for policy 0, policy_version 23440 (0.0005) +[2024-08-24 20:48:13,828][03430] Updated weights for policy 0, policy_version 23450 (0.0006) +[2024-08-24 20:48:14,961][03430] Updated weights for policy 0, policy_version 23460 (0.0005) +[2024-08-24 20:48:15,812][01192] Fps is (10 sec: 36044.4, 60 sec: 35976.5, 300 sec: 36128.1). Total num frames: 96120832. Throughput: 0: 8999.1. Samples: 24022550. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:48:15,813][01192] Avg episode reward: [(0, '4.264')] +[2024-08-24 20:48:15,822][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000023467_96120832.pth... 
+[2024-08-24 20:48:15,846][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000021352_87457792.pth +[2024-08-24 20:48:16,108][03430] Updated weights for policy 0, policy_version 23470 (0.0005) +[2024-08-24 20:48:17,226][03430] Updated weights for policy 0, policy_version 23480 (0.0005) +[2024-08-24 20:48:18,334][03430] Updated weights for policy 0, policy_version 23490 (0.0006) +[2024-08-24 20:48:19,481][03430] Updated weights for policy 0, policy_version 23500 (0.0006) +[2024-08-24 20:48:20,595][03430] Updated weights for policy 0, policy_version 23510 (0.0005) +[2024-08-24 20:48:20,812][01192] Fps is (10 sec: 36044.7, 60 sec: 35976.6, 300 sec: 36128.1). Total num frames: 96301056. Throughput: 0: 9021.0. Samples: 24049812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:48:20,813][01192] Avg episode reward: [(0, '4.630')] +[2024-08-24 20:48:21,741][03430] Updated weights for policy 0, policy_version 23520 (0.0006) +[2024-08-24 20:48:22,872][03430] Updated weights for policy 0, policy_version 23530 (0.0006) +[2024-08-24 20:48:24,019][03430] Updated weights for policy 0, policy_version 23540 (0.0006) +[2024-08-24 20:48:25,137][03430] Updated weights for policy 0, policy_version 23550 (0.0006) +[2024-08-24 20:48:25,812][01192] Fps is (10 sec: 36454.5, 60 sec: 36044.8, 300 sec: 36128.1). Total num frames: 96485376. Throughput: 0: 9019.3. Samples: 24104114. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:48:25,813][01192] Avg episode reward: [(0, '4.442')] +[2024-08-24 20:48:26,263][03430] Updated weights for policy 0, policy_version 23560 (0.0005) +[2024-08-24 20:48:27,392][03430] Updated weights for policy 0, policy_version 23570 (0.0006) +[2024-08-24 20:48:28,524][03430] Updated weights for policy 0, policy_version 23580 (0.0005) +[2024-08-24 20:48:29,617][03430] Updated weights for policy 0, policy_version 23590 (0.0006) +[2024-08-24 20:48:30,724][03430] Updated weights for policy 0, policy_version 23600 (0.0005) +[2024-08-24 20:48:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36113.1, 300 sec: 36128.1). Total num frames: 96665600. Throughput: 0: 9029.1. Samples: 24158878. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:48:30,813][01192] Avg episode reward: [(0, '4.356')] +[2024-08-24 20:48:31,874][03430] Updated weights for policy 0, policy_version 23610 (0.0006) +[2024-08-24 20:48:33,021][03430] Updated weights for policy 0, policy_version 23620 (0.0006) +[2024-08-24 20:48:34,127][03430] Updated weights for policy 0, policy_version 23630 (0.0005) +[2024-08-24 20:48:35,271][03430] Updated weights for policy 0, policy_version 23640 (0.0005) +[2024-08-24 20:48:35,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.1, 300 sec: 36128.1). Total num frames: 96845824. Throughput: 0: 9028.1. Samples: 24185828. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) +[2024-08-24 20:48:35,813][01192] Avg episode reward: [(0, '4.373')] +[2024-08-24 20:48:36,407][03430] Updated weights for policy 0, policy_version 23650 (0.0005) +[2024-08-24 20:48:37,545][03430] Updated weights for policy 0, policy_version 23660 (0.0005) +[2024-08-24 20:48:38,684][03430] Updated weights for policy 0, policy_version 23670 (0.0005) +[2024-08-24 20:48:39,810][03430] Updated weights for policy 0, policy_version 23680 (0.0006) +[2024-08-24 20:48:40,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36128.1). 
Total num frames: 97026048. Throughput: 0: 9035.4. Samples: 24240154. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:48:40,812][01192] Avg episode reward: [(0, '4.383')] +[2024-08-24 20:48:40,964][03430] Updated weights for policy 0, policy_version 23690 (0.0006) +[2024-08-24 20:48:42,096][03430] Updated weights for policy 0, policy_version 23700 (0.0006) +[2024-08-24 20:48:43,222][03430] Updated weights for policy 0, policy_version 23710 (0.0005) +[2024-08-24 20:48:44,338][03430] Updated weights for policy 0, policy_version 23720 (0.0006) +[2024-08-24 20:48:45,471][03430] Updated weights for policy 0, policy_version 23730 (0.0006) +[2024-08-24 20:48:45,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36128.1). Total num frames: 97206272. Throughput: 0: 9040.0. Samples: 24294316. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:48:45,812][01192] Avg episode reward: [(0, '4.569')] +[2024-08-24 20:48:46,613][03430] Updated weights for policy 0, policy_version 23740 (0.0006) +[2024-08-24 20:48:47,735][03430] Updated weights for policy 0, policy_version 23750 (0.0004) +[2024-08-24 20:48:48,857][03430] Updated weights for policy 0, policy_version 23760 (0.0006) +[2024-08-24 20:48:49,975][03430] Updated weights for policy 0, policy_version 23770 (0.0006) +[2024-08-24 20:48:50,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36181.4, 300 sec: 36142.0). Total num frames: 97390592. Throughput: 0: 9034.0. Samples: 24321488. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:48:50,813][01192] Avg episode reward: [(0, '4.525')] +[2024-08-24 20:48:51,108][03430] Updated weights for policy 0, policy_version 23780 (0.0006) +[2024-08-24 20:48:52,224][03430] Updated weights for policy 0, policy_version 23790 (0.0006) +[2024-08-24 20:48:53,366][03430] Updated weights for policy 0, policy_version 23800 (0.0005) +[2024-08-24 20:48:54,496][03430] Updated weights for policy 0, policy_version 23810 (0.0007) +[2024-08-24 20:48:55,616][03430] Updated weights for policy 0, policy_version 23820 (0.0005) +[2024-08-24 20:48:55,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36181.4, 300 sec: 36128.1). Total num frames: 97570816. Throughput: 0: 9059.2. Samples: 24376010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:48:55,813][01192] Avg episode reward: [(0, '4.411')] +[2024-08-24 20:48:56,725][03430] Updated weights for policy 0, policy_version 23830 (0.0006) +[2024-08-24 20:48:57,762][03430] Updated weights for policy 0, policy_version 23840 (0.0006) +[2024-08-24 20:48:58,879][03430] Updated weights for policy 0, policy_version 23850 (0.0006) +[2024-08-24 20:49:00,004][03430] Updated weights for policy 0, policy_version 23860 (0.0006) +[2024-08-24 20:49:00,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36317.9, 300 sec: 36155.9). Total num frames: 97759232. Throughput: 0: 9096.6. Samples: 24431896. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:49:00,813][01192] Avg episode reward: [(0, '4.490')] +[2024-08-24 20:49:01,156][03430] Updated weights for policy 0, policy_version 23870 (0.0006) +[2024-08-24 20:49:02,252][03430] Updated weights for policy 0, policy_version 23880 (0.0005) +[2024-08-24 20:49:03,373][03430] Updated weights for policy 0, policy_version 23890 (0.0006) +[2024-08-24 20:49:04,486][03430] Updated weights for policy 0, policy_version 23900 (0.0005) +[2024-08-24 20:49:05,576][03430] Updated weights for policy 0, policy_version 23910 (0.0005) +[2024-08-24 20:49:05,812][01192] Fps is (10 sec: 37273.6, 60 sec: 36386.1, 300 sec: 36169.8). Total num frames: 97943552. Throughput: 0: 9091.8. Samples: 24458942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:49:05,812][01192] Avg episode reward: [(0, '4.333')] +[2024-08-24 20:49:06,672][03430] Updated weights for policy 0, policy_version 23920 (0.0006) +[2024-08-24 20:49:07,750][03430] Updated weights for policy 0, policy_version 23930 (0.0006) +[2024-08-24 20:49:08,846][03430] Updated weights for policy 0, policy_version 23940 (0.0005) +[2024-08-24 20:49:09,987][03430] Updated weights for policy 0, policy_version 23950 (0.0006) +[2024-08-24 20:49:10,812][01192] Fps is (10 sec: 36863.9, 60 sec: 36454.4, 300 sec: 36169.8). Total num frames: 98127872. Throughput: 0: 9133.0. Samples: 24515100. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:10,813][01192] Avg episode reward: [(0, '4.380')] +[2024-08-24 20:49:11,126][03430] Updated weights for policy 0, policy_version 23960 (0.0006) +[2024-08-24 20:49:12,260][03430] Updated weights for policy 0, policy_version 23970 (0.0005) +[2024-08-24 20:49:13,371][03430] Updated weights for policy 0, policy_version 23980 (0.0006) +[2024-08-24 20:49:14,503][03430] Updated weights for policy 0, policy_version 23990 (0.0006) +[2024-08-24 20:49:15,637][03430] Updated weights for policy 0, policy_version 24000 (0.0005) +[2024-08-24 20:49:15,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36454.5, 300 sec: 36155.9). Total num frames: 98308096. Throughput: 0: 9123.9. Samples: 24569452. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:15,813][01192] Avg episode reward: [(0, '4.384')] +[2024-08-24 20:49:16,860][03430] Updated weights for policy 0, policy_version 24010 (0.0007) +[2024-08-24 20:49:18,084][03430] Updated weights for policy 0, policy_version 24020 (0.0005) +[2024-08-24 20:49:19,263][03430] Updated weights for policy 0, policy_version 24030 (0.0005) +[2024-08-24 20:49:20,432][03430] Updated weights for policy 0, policy_version 24040 (0.0006) +[2024-08-24 20:49:20,812][01192] Fps is (10 sec: 35225.8, 60 sec: 36317.9, 300 sec: 36114.2). Total num frames: 98480128. Throughput: 0: 9098.0. Samples: 24595238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:20,813][01192] Avg episode reward: [(0, '4.438')] +[2024-08-24 20:49:21,570][03430] Updated weights for policy 0, policy_version 24050 (0.0005) +[2024-08-24 20:49:22,678][03430] Updated weights for policy 0, policy_version 24060 (0.0005) +[2024-08-24 20:49:23,779][03430] Updated weights for policy 0, policy_version 24070 (0.0005) +[2024-08-24 20:49:24,876][03430] Updated weights for policy 0, policy_version 24080 (0.0006) +[2024-08-24 20:49:25,812][01192] Fps is (10 sec: 35635.2, 60 sec: 36317.9, 300 sec: 36114.2). 
Total num frames: 98664448. Throughput: 0: 9084.9. Samples: 24648974. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:25,813][01192] Avg episode reward: [(0, '4.474')] +[2024-08-24 20:49:26,009][03430] Updated weights for policy 0, policy_version 24090 (0.0005) +[2024-08-24 20:49:27,150][03430] Updated weights for policy 0, policy_version 24100 (0.0007) +[2024-08-24 20:49:28,292][03430] Updated weights for policy 0, policy_version 24110 (0.0006) +[2024-08-24 20:49:29,429][03430] Updated weights for policy 0, policy_version 24120 (0.0006) +[2024-08-24 20:49:30,534][03430] Updated weights for policy 0, policy_version 24130 (0.0005) +[2024-08-24 20:49:30,812][01192] Fps is (10 sec: 36454.4, 60 sec: 36317.9, 300 sec: 36114.2). Total num frames: 98844672. Throughput: 0: 9090.0. Samples: 24703366. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:30,813][01192] Avg episode reward: [(0, '4.397')] +[2024-08-24 20:49:31,698][03430] Updated weights for policy 0, policy_version 24140 (0.0006) +[2024-08-24 20:49:32,920][03430] Updated weights for policy 0, policy_version 24150 (0.0006) +[2024-08-24 20:49:34,097][03430] Updated weights for policy 0, policy_version 24160 (0.0006) +[2024-08-24 20:49:35,263][03430] Updated weights for policy 0, policy_version 24170 (0.0006) +[2024-08-24 20:49:35,812][01192] Fps is (10 sec: 35225.8, 60 sec: 36181.4, 300 sec: 36086.5). Total num frames: 99016704. Throughput: 0: 9068.9. Samples: 24729590. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:35,813][01192] Avg episode reward: [(0, '4.395')] +[2024-08-24 20:49:36,398][03430] Updated weights for policy 0, policy_version 24180 (0.0005) +[2024-08-24 20:49:37,530][03430] Updated weights for policy 0, policy_version 24190 (0.0006) +[2024-08-24 20:49:38,684][03430] Updated weights for policy 0, policy_version 24200 (0.0006) +[2024-08-24 20:49:39,845][03430] Updated weights for policy 0, policy_version 24210 (0.0006) +[2024-08-24 20:49:40,812][01192] Fps is (10 sec: 35225.6, 60 sec: 36181.3, 300 sec: 36086.5). Total num frames: 99196928. Throughput: 0: 9038.0. Samples: 24782720. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:40,813][01192] Avg episode reward: [(0, '4.561')] +[2024-08-24 20:49:40,967][03430] Updated weights for policy 0, policy_version 24220 (0.0005) +[2024-08-24 20:49:42,113][03430] Updated weights for policy 0, policy_version 24230 (0.0006) +[2024-08-24 20:49:43,234][03430] Updated weights for policy 0, policy_version 24240 (0.0006) +[2024-08-24 20:49:44,346][03430] Updated weights for policy 0, policy_version 24250 (0.0006) +[2024-08-24 20:49:45,504][03430] Updated weights for policy 0, policy_version 24260 (0.0006) +[2024-08-24 20:49:45,812][01192] Fps is (10 sec: 36044.4, 60 sec: 36181.3, 300 sec: 36072.6). Total num frames: 99377152. Throughput: 0: 9002.9. Samples: 24837026. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:45,813][01192] Avg episode reward: [(0, '4.451')] +[2024-08-24 20:49:46,619][03430] Updated weights for policy 0, policy_version 24270 (0.0006) +[2024-08-24 20:49:47,768][03430] Updated weights for policy 0, policy_version 24280 (0.0005) +[2024-08-24 20:49:48,898][03430] Updated weights for policy 0, policy_version 24290 (0.0006) +[2024-08-24 20:49:50,031][03430] Updated weights for policy 0, policy_version 24300 (0.0007) +[2024-08-24 20:49:50,812][01192] Fps is (10 sec: 36044.8, 60 sec: 36113.1, 300 sec: 36072.6). 
Total num frames: 99557376. Throughput: 0: 9001.6. Samples: 24864016. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:50,812][01192] Avg episode reward: [(0, '4.430')] +[2024-08-24 20:49:51,196][03430] Updated weights for policy 0, policy_version 24310 (0.0006) +[2024-08-24 20:49:52,322][03430] Updated weights for policy 0, policy_version 24320 (0.0006) +[2024-08-24 20:49:53,465][03430] Updated weights for policy 0, policy_version 24330 (0.0005) +[2024-08-24 20:49:54,611][03430] Updated weights for policy 0, policy_version 24340 (0.0006) +[2024-08-24 20:49:55,745][03430] Updated weights for policy 0, policy_version 24350 (0.0006) +[2024-08-24 20:49:55,812][01192] Fps is (10 sec: 36045.1, 60 sec: 36113.1, 300 sec: 36072.6). Total num frames: 99737600. Throughput: 0: 8952.5. Samples: 24917960. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2024-08-24 20:49:55,813][01192] Avg episode reward: [(0, '4.389')] +[2024-08-24 20:49:56,864][03430] Updated weights for policy 0, policy_version 24360 (0.0006) +[2024-08-24 20:49:57,966][03430] Updated weights for policy 0, policy_version 24370 (0.0007) +[2024-08-24 20:49:59,120][03430] Updated weights for policy 0, policy_version 24380 (0.0005) +[2024-08-24 20:50:00,274][03430] Updated weights for policy 0, policy_version 24390 (0.0006) +[2024-08-24 20:50:00,812][01192] Fps is (10 sec: 36044.7, 60 sec: 35976.5, 300 sec: 36072.6). Total num frames: 99917824. Throughput: 0: 8946.8. Samples: 24972058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2024-08-24 20:50:00,813][01192] Avg episode reward: [(0, '4.499')] +[2024-08-24 20:50:01,416][03430] Updated weights for policy 0, policy_version 24400 (0.0005) +[2024-08-24 20:50:02,573][03430] Updated weights for policy 0, policy_version 24410 (0.0006) +[2024-08-24 20:50:03,274][03417] Stopping Batcher_0... +[2024-08-24 20:50:03,274][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth... 
+[2024-08-24 20:50:03,274][01192] Component Batcher_0 stopped! +[2024-08-24 20:50:03,282][03430] Weights refcount: 2 0 +[2024-08-24 20:50:03,283][03430] Stopping InferenceWorker_p0-w0... +[2024-08-24 20:50:03,284][03430] Loop inference_proc0-0_evt_loop terminating... +[2024-08-24 20:50:03,283][01192] Component InferenceWorker_p0-w0 stopped! +[2024-08-24 20:50:03,275][03417] Loop batcher_evt_loop terminating... +[2024-08-24 20:50:03,301][03465] Stopping RolloutWorker_w2... +[2024-08-24 20:50:03,301][03465] Loop rollout_proc2_evt_loop terminating... +[2024-08-24 20:50:03,301][01192] Component RolloutWorker_w2 stopped! +[2024-08-24 20:50:03,302][03470] Stopping RolloutWorker_w7... +[2024-08-24 20:50:03,302][03470] Loop rollout_proc7_evt_loop terminating... +[2024-08-24 20:50:03,303][03469] Stopping RolloutWorker_w6... +[2024-08-24 20:50:03,303][03467] Stopping RolloutWorker_w4... +[2024-08-24 20:50:03,303][03467] Loop rollout_proc4_evt_loop terminating... +[2024-08-24 20:50:03,303][03469] Loop rollout_proc6_evt_loop terminating... +[2024-08-24 20:50:03,303][01192] Component RolloutWorker_w7 stopped! +[2024-08-24 20:50:03,303][01192] Component RolloutWorker_w6 stopped! +[2024-08-24 20:50:03,304][03468] Stopping RolloutWorker_w5... +[2024-08-24 20:50:03,304][03466] Stopping RolloutWorker_w3... +[2024-08-24 20:50:03,304][03468] Loop rollout_proc5_evt_loop terminating... +[2024-08-24 20:50:03,304][03466] Loop rollout_proc3_evt_loop terminating... +[2024-08-24 20:50:03,304][01192] Component RolloutWorker_w4 stopped! +[2024-08-24 20:50:03,305][03417] Removing /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000022415_91811840.pth +[2024-08-24 20:50:03,305][01192] Component RolloutWorker_w5 stopped! +[2024-08-24 20:50:03,305][03464] Stopping RolloutWorker_w1... +[2024-08-24 20:50:03,306][03464] Loop rollout_proc1_evt_loop terminating... +[2024-08-24 20:50:03,306][03463] Stopping RolloutWorker_w0... 
+[2024-08-24 20:50:03,306][03463] Loop rollout_proc0_evt_loop terminating... +[2024-08-24 20:50:03,305][01192] Component RolloutWorker_w3 stopped! +[2024-08-24 20:50:03,307][01192] Component RolloutWorker_w1 stopped! +[2024-08-24 20:50:03,307][03417] Saving /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth... +[2024-08-24 20:50:03,307][01192] Component RolloutWorker_w0 stopped! +[2024-08-24 20:50:03,341][03417] Stopping LearnerWorker_p0... +[2024-08-24 20:50:03,341][03417] Loop learner_proc0_evt_loop terminating... +[2024-08-24 20:50:03,341][01192] Component LearnerWorker_p0 stopped! +[2024-08-24 20:50:03,342][01192] Waiting for process learner_proc0 to stop... +[2024-08-24 20:50:03,655][01192] Waiting for process inference_proc0-0 to join... +[2024-08-24 20:50:03,656][01192] Waiting for process rollout_proc0 to join... +[2024-08-24 20:50:03,657][01192] Waiting for process rollout_proc1 to join... +[2024-08-24 20:50:03,657][01192] Waiting for process rollout_proc2 to join... +[2024-08-24 20:50:03,658][01192] Waiting for process rollout_proc3 to join... +[2024-08-24 20:50:03,659][01192] Waiting for process rollout_proc4 to join... +[2024-08-24 20:50:03,659][01192] Waiting for process rollout_proc5 to join... +[2024-08-24 20:50:03,660][01192] Waiting for process rollout_proc6 to join... +[2024-08-24 20:50:03,660][01192] Waiting for process rollout_proc7 to join... 
+[2024-08-24 20:50:03,661][01192] Batcher 0 profile tree view: +batching: 192.5574, releasing_batches: 0.2617 +[2024-08-24 20:50:03,661][01192] InferenceWorker_p0-w0 profile tree view: +wait_policy: 0.0000 + wait_policy_total: 20.6984 +update_model: 32.0875 + weight_update: 0.0006 +one_step: 0.0014 + handle_policy_step: 2598.1219 + deserialize: 72.5874, stack: 9.8404, obs_to_device_normalize: 636.5788, forward: 922.4727, send_messages: 220.8932 + prepare_outputs: 663.0986 + to_cpu: 571.0565 +[2024-08-24 20:50:03,661][01192] Learner 0 profile tree view: +misc: 0.0670, prepare_batch: 190.5836 +train: 448.4509 + epoch_init: 0.0550, minibatch_init: 0.0645, losses_postprocess: 5.7168, kl_divergence: 6.5373, after_optimizer: 144.5615 + calculate_losses: 174.0789 + losses_init: 0.0260, forward_head: 10.8965, bptt_initial: 115.1432, tail: 7.3309, advantages_returns: 2.0615, losses: 18.2923 + bptt: 18.4187 + bptt_forward_core: 17.7735 + update: 112.9407 + clip: 14.9297 +[2024-08-24 20:50:03,662][01192] RolloutWorker_w0 profile tree view: +wait_for_trajectories: 1.4997, enqueue_policy_requests: 80.8168, env_step: 1445.9170, overhead: 72.3760, complete_rollouts: 3.0570 +save_policy_outputs: 94.3165 + split_output_tensors: 33.5095 +[2024-08-24 20:50:03,662][01192] RolloutWorker_w7 profile tree view: +wait_for_trajectories: 1.4837, enqueue_policy_requests: 81.6324, env_step: 1455.8268, overhead: 73.0896, complete_rollouts: 3.0529 +save_policy_outputs: 97.4577 + split_output_tensors: 35.0521 +[2024-08-24 20:50:03,662][01192] Loop Runner_EvtLoop terminating... 
+[2024-08-24 20:50:03,663][01192] Runner profile tree view: +main_loop: 2746.5968 +[2024-08-24 20:50:03,663][01192] Collected {0: 100007936}, FPS: 36411.6 +[2024-08-24 20:56:47,071][01192] Loading existing experiment configuration from /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/config.json +[2024-08-24 20:56:47,073][01192] Overriding arg 'num_workers' with value 1 passed from command line +[2024-08-24 20:56:47,073][01192] Adding new argument 'no_render'=True that is not in the saved config file! +[2024-08-24 20:56:47,074][01192] Adding new argument 'save_video'=True that is not in the saved config file! +[2024-08-24 20:56:47,074][01192] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2024-08-24 20:56:47,074][01192] Adding new argument 'video_name'=None that is not in the saved config file! +[2024-08-24 20:56:47,075][01192] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! +[2024-08-24 20:56:47,075][01192] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2024-08-24 20:56:47,076][01192] Adding new argument 'push_to_hub'=False that is not in the saved config file! +[2024-08-24 20:56:47,076][01192] Adding new argument 'hf_repository'=None that is not in the saved config file! +[2024-08-24 20:56:47,076][01192] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2024-08-24 20:56:47,077][01192] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2024-08-24 20:56:47,077][01192] Adding new argument 'train_script'=None that is not in the saved config file! +[2024-08-24 20:56:47,078][01192] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
+[2024-08-24 20:56:47,078][01192] Using frameskip 1 and render_action_repeat=4 for evaluation +[2024-08-24 20:56:47,084][01192] Doom resolution: 160x120, resize resolution: (128, 72) +[2024-08-24 20:56:47,085][01192] RunningMeanStd input shape: (3, 72, 128) +[2024-08-24 20:56:47,086][01192] RunningMeanStd input shape: (1,) +[2024-08-24 20:56:47,094][01192] ConvEncoder: input_channels=3 +[2024-08-24 20:56:47,189][01192] Conv encoder output size: 512 +[2024-08-24 20:56:47,189][01192] Policy head output size: 512 +[2024-08-24 20:56:48,926][01192] Loading state from checkpoint /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth... +[2024-08-24 20:56:49,860][01192] Num frames 100... +[2024-08-24 20:56:49,916][01192] Num frames 200... +[2024-08-24 20:56:49,974][01192] Num frames 300... +[2024-08-24 20:56:50,028][01192] Num frames 400... +[2024-08-24 20:56:50,086][01192] Num frames 500... +[2024-08-24 20:56:50,166][01192] Avg episode rewards: #0: 7.440, true rewards: #0: 5.440 +[2024-08-24 20:56:50,166][01192] Avg episode reward: 7.440, avg true_objective: 5.440 +[2024-08-24 20:56:50,198][01192] Num frames 600... +[2024-08-24 20:56:50,257][01192] Num frames 700... +[2024-08-24 20:56:50,310][01192] Num frames 800... +[2024-08-24 20:56:50,366][01192] Num frames 900... +[2024-08-24 20:56:50,468][01192] Avg episode rewards: #0: 6.460, true rewards: #0: 4.960 +[2024-08-24 20:56:50,469][01192] Avg episode reward: 6.460, avg true_objective: 4.960 +[2024-08-24 20:56:50,475][01192] Num frames 1000... +[2024-08-24 20:56:50,535][01192] Num frames 1100... +[2024-08-24 20:56:50,596][01192] Num frames 1200... +[2024-08-24 20:56:50,655][01192] Num frames 1300... +[2024-08-24 20:56:50,710][01192] Num frames 1400... +[2024-08-24 20:56:50,769][01192] Num frames 1500... 
+[2024-08-24 20:56:50,841][01192] Avg episode rewards: #0: 6.787, true rewards: #0: 5.120 +[2024-08-24 20:56:50,842][01192] Avg episode reward: 6.787, avg true_objective: 5.120 +[2024-08-24 20:56:50,881][01192] Num frames 1600... +[2024-08-24 20:56:50,935][01192] Num frames 1700... +[2024-08-24 20:56:50,988][01192] Num frames 1800... +[2024-08-24 20:56:51,041][01192] Num frames 1900... +[2024-08-24 20:56:51,102][01192] Avg episode rewards: #0: 6.050, true rewards: #0: 4.800 +[2024-08-24 20:56:51,103][01192] Avg episode reward: 6.050, avg true_objective: 4.800 +[2024-08-24 20:56:51,148][01192] Num frames 2000... +[2024-08-24 20:56:51,202][01192] Num frames 2100... +[2024-08-24 20:56:51,258][01192] Num frames 2200... +[2024-08-24 20:56:51,319][01192] Num frames 2300... +[2024-08-24 20:56:51,374][01192] Avg episode rewards: #0: 5.608, true rewards: #0: 4.608 +[2024-08-24 20:56:51,375][01192] Avg episode reward: 5.608, avg true_objective: 4.608 +[2024-08-24 20:56:51,431][01192] Num frames 2400... +[2024-08-24 20:56:51,482][01192] Num frames 2500... +[2024-08-24 20:56:51,547][01192] Num frames 2600... +[2024-08-24 20:56:51,605][01192] Num frames 2700... +[2024-08-24 20:56:51,685][01192] Avg episode rewards: #0: 5.587, true rewards: #0: 4.587 +[2024-08-24 20:56:51,685][01192] Avg episode reward: 5.587, avg true_objective: 4.587 +[2024-08-24 20:56:51,711][01192] Num frames 2800... +[2024-08-24 20:56:51,761][01192] Num frames 2900... +[2024-08-24 20:56:51,814][01192] Num frames 3000... +[2024-08-24 20:56:51,871][01192] Num frames 3100... +[2024-08-24 20:56:51,945][01192] Avg episode rewards: #0: 5.337, true rewards: #0: 4.480 +[2024-08-24 20:56:51,946][01192] Avg episode reward: 5.337, avg true_objective: 4.480 +[2024-08-24 20:56:51,980][01192] Num frames 3200... +[2024-08-24 20:56:52,034][01192] Num frames 3300... +[2024-08-24 20:56:52,090][01192] Num frames 3400... +[2024-08-24 20:56:52,144][01192] Num frames 3500... 
+[2024-08-24 20:56:52,209][01192] Avg episode rewards: #0: 5.150, true rewards: #0: 4.400 +[2024-08-24 20:56:52,209][01192] Avg episode reward: 5.150, avg true_objective: 4.400 +[2024-08-24 20:56:52,255][01192] Num frames 3600... +[2024-08-24 20:56:52,308][01192] Num frames 3700... +[2024-08-24 20:56:52,366][01192] Num frames 3800... +[2024-08-24 20:56:52,423][01192] Num frames 3900... +[2024-08-24 20:56:52,478][01192] Avg episode rewards: #0: 5.004, true rewards: #0: 4.338 +[2024-08-24 20:56:52,478][01192] Avg episode reward: 5.004, avg true_objective: 4.338 +[2024-08-24 20:56:52,534][01192] Num frames 4000... +[2024-08-24 20:56:52,587][01192] Num frames 4100... +[2024-08-24 20:56:52,644][01192] Num frames 4200... +[2024-08-24 20:56:52,748][01192] Avg episode rewards: #0: 4.888, true rewards: #0: 4.288 +[2024-08-24 20:56:52,748][01192] Avg episode reward: 4.888, avg true_objective: 4.288 +[2024-08-24 20:56:56,379][01192] Replay video saved to /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/replay.mp4! +[2024-08-24 21:09:39,571][01192] Loading existing experiment configuration from /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/config.json +[2024-08-24 21:09:39,572][01192] Overriding arg 'num_workers' with value 1 passed from command line +[2024-08-24 21:09:39,572][01192] Adding new argument 'no_render'=True that is not in the saved config file! +[2024-08-24 21:09:39,572][01192] Adding new argument 'save_video'=True that is not in the saved config file! +[2024-08-24 21:09:39,572][01192] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2024-08-24 21:09:39,573][01192] Adding new argument 'video_name'=None that is not in the saved config file! +[2024-08-24 21:09:39,573][01192] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2024-08-24 21:09:39,573][01192] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! 
+[2024-08-24 21:09:39,574][01192] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2024-08-24 21:09:39,574][01192] Adding new argument 'hf_repository'='cpgrant/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2024-08-24 21:09:39,574][01192] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2024-08-24 21:09:39,574][01192] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2024-08-24 21:09:39,575][01192] Adding new argument 'train_script'=None that is not in the saved config file! +[2024-08-24 21:09:39,576][01192] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2024-08-24 21:09:39,576][01192] Using frameskip 1 and render_action_repeat=4 for evaluation +[2024-08-24 21:09:39,582][01192] RunningMeanStd input shape: (3, 72, 128) +[2024-08-24 21:09:39,583][01192] RunningMeanStd input shape: (1,) +[2024-08-24 21:09:39,587][01192] ConvEncoder: input_channels=3 +[2024-08-24 21:09:39,603][01192] Conv encoder output size: 512 +[2024-08-24 21:09:39,604][01192] Policy head output size: 512 +[2024-08-24 21:09:39,625][01192] Loading state from checkpoint /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/checkpoint_p0/checkpoint_000024416_100007936.pth... +[2024-08-24 21:09:39,982][01192] Num frames 100... +[2024-08-24 21:09:40,068][01192] Num frames 200... +[2024-08-24 21:09:40,153][01192] Num frames 300... +[2024-08-24 21:09:40,279][01192] Avg episode rewards: #0: 3.840, true rewards: #0: 3.840 +[2024-08-24 21:09:40,280][01192] Avg episode reward: 3.840, avg true_objective: 3.840 +[2024-08-24 21:09:40,293][01192] Num frames 400... +[2024-08-24 21:09:40,384][01192] Num frames 500... +[2024-08-24 21:09:40,459][01192] Num frames 600... +[2024-08-24 21:09:40,542][01192] Num frames 700... +[2024-08-24 21:09:40,623][01192] Num frames 800... 
+[2024-08-24 21:09:40,674][01192] Avg episode rewards: #0: 4.000, true rewards: #0: 4.000 +[2024-08-24 21:09:40,675][01192] Avg episode reward: 4.000, avg true_objective: 4.000 +[2024-08-24 21:09:40,764][01192] Num frames 900... +[2024-08-24 21:09:40,847][01192] Num frames 1000... +[2024-08-24 21:09:40,931][01192] Num frames 1100... +[2024-08-24 21:09:41,050][01192] Avg episode rewards: #0: 3.947, true rewards: #0: 3.947 +[2024-08-24 21:09:41,050][01192] Avg episode reward: 3.947, avg true_objective: 3.947 +[2024-08-24 21:09:41,063][01192] Num frames 1200... +[2024-08-24 21:09:41,154][01192] Num frames 1300... +[2024-08-24 21:09:41,230][01192] Num frames 1400... +[2024-08-24 21:09:41,310][01192] Num frames 1500... +[2024-08-24 21:09:41,394][01192] Num frames 1600... +[2024-08-24 21:09:41,499][01192] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160 +[2024-08-24 21:09:41,499][01192] Avg episode reward: 4.660, avg true_objective: 4.160 +[2024-08-24 21:09:41,532][01192] Num frames 1700... +[2024-08-24 21:09:41,620][01192] Num frames 1800... +[2024-08-24 21:09:41,695][01192] Num frames 1900... +[2024-08-24 21:09:41,780][01192] Num frames 2000... +[2024-08-24 21:09:41,856][01192] Num frames 2100... +[2024-08-24 21:09:41,922][01192] Avg episode rewards: #0: 4.824, true rewards: #0: 4.224 +[2024-08-24 21:09:41,922][01192] Avg episode reward: 4.824, avg true_objective: 4.224 +[2024-08-24 21:09:42,013][01192] Num frames 2200... +[2024-08-24 21:09:42,109][01192] Num frames 2300... +[2024-08-24 21:09:42,209][01192] Num frames 2400... +[2024-08-24 21:09:42,332][01192] Avg episode rewards: #0: 4.660, true rewards: #0: 4.160 +[2024-08-24 21:09:42,333][01192] Avg episode reward: 4.660, avg true_objective: 4.160 +[2024-08-24 21:09:42,338][01192] Num frames 2500... +[2024-08-24 21:09:42,420][01192] Num frames 2600... +[2024-08-24 21:09:42,497][01192] Num frames 2700... +[2024-08-24 21:09:42,576][01192] Num frames 2800... 
+[2024-08-24 21:09:42,697][01192] Avg episode rewards: #0: 4.543, true rewards: #0: 4.114 +[2024-08-24 21:09:42,697][01192] Avg episode reward: 4.543, avg true_objective: 4.114 +[2024-08-24 21:09:42,712][01192] Num frames 2900... +[2024-08-24 21:09:42,790][01192] Num frames 3000... +[2024-08-24 21:09:42,873][01192] Num frames 3100... +[2024-08-24 21:09:42,954][01192] Num frames 3200... +[2024-08-24 21:09:43,080][01192] Avg episode rewards: #0: 4.620, true rewards: #0: 4.120 +[2024-08-24 21:09:43,081][01192] Avg episode reward: 4.620, avg true_objective: 4.120 +[2024-08-24 21:09:43,087][01192] Num frames 3300... +[2024-08-24 21:09:43,162][01192] Num frames 3400... +[2024-08-24 21:09:43,240][01192] Num frames 3500... +[2024-08-24 21:09:43,317][01192] Num frames 3600... +[2024-08-24 21:09:43,392][01192] Num frames 3700... +[2024-08-24 21:09:43,489][01192] Num frames 3800... +[2024-08-24 21:09:43,576][01192] Avg episode rewards: #0: 4.933, true rewards: #0: 4.267 +[2024-08-24 21:09:43,576][01192] Avg episode reward: 4.933, avg true_objective: 4.267 +[2024-08-24 21:09:43,636][01192] Num frames 3900... +[2024-08-24 21:09:43,721][01192] Num frames 4000... +[2024-08-24 21:09:43,816][01192] Num frames 4100... +[2024-08-24 21:09:43,911][01192] Num frames 4200... +[2024-08-24 21:09:43,987][01192] Avg episode rewards: #0: 4.824, true rewards: #0: 4.224 +[2024-08-24 21:09:43,988][01192] Avg episode reward: 4.824, avg true_objective: 4.224 +[2024-08-24 21:09:47,454][01192] Replay video saved to /home/ai24/condaprojects/hfrl/hfrl8doom/train_dir/default_experiment/replay.mp4!