[2023-02-26 06:14:57,471][06480] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 06:14:57,475][06480] Rollout worker 0 uses device cpu
[2023-02-26 06:14:57,476][06480] Rollout worker 1 uses device cpu
[2023-02-26 06:14:57,478][06480] Rollout worker 2 uses device cpu
[2023-02-26 06:14:57,479][06480] Rollout worker 3 uses device cpu
[2023-02-26 06:14:57,480][06480] Rollout worker 4 uses device cpu
[2023-02-26 06:14:57,481][06480] Rollout worker 5 uses device cpu
[2023-02-26 06:14:57,484][06480] Rollout worker 6 uses device cpu
[2023-02-26 06:14:57,485][06480] Rollout worker 7 uses device cpu
[2023-02-26 06:14:57,682][06480] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 06:14:57,685][06480] InferenceWorker_p0-w0: min num requests: 2
[2023-02-26 06:14:57,717][06480] Starting all processes...
[2023-02-26 06:14:57,718][06480] Starting process learner_proc0
[2023-02-26 06:14:57,774][06480] Starting all processes...
[2023-02-26 06:14:57,787][06480] Starting process inference_proc0-0
[2023-02-26 06:14:57,790][06480] Starting process rollout_proc0
[2023-02-26 06:14:57,790][06480] Starting process rollout_proc1
[2023-02-26 06:14:57,790][06480] Starting process rollout_proc2
[2023-02-26 06:14:57,797][06480] Starting process rollout_proc3
[2023-02-26 06:14:57,797][06480] Starting process rollout_proc4
[2023-02-26 06:14:57,797][06480] Starting process rollout_proc5
[2023-02-26 06:14:57,797][06480] Starting process rollout_proc6
[2023-02-26 06:14:57,797][06480] Starting process rollout_proc7
[2023-02-26 06:15:09,485][13238] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 06:15:09,485][13238] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-26 06:15:09,499][13252] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 06:15:09,503][13252] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-26 06:15:09,550][13253] Worker 0 uses CPU cores [0]
[2023-02-26 06:15:09,609][13254] Worker 1 uses CPU cores [1]
[2023-02-26 06:15:09,624][13256] Worker 2 uses CPU cores [0]
[2023-02-26 06:15:09,784][13257] Worker 5 uses CPU cores [1]
[2023-02-26 06:15:09,849][13258] Worker 4 uses CPU cores [0]
[2023-02-26 06:15:09,913][13259] Worker 7 uses CPU cores [1]
[2023-02-26 06:15:09,933][13260] Worker 6 uses CPU cores [0]
[2023-02-26 06:15:10,078][13255] Worker 3 uses CPU cores [1]
[2023-02-26 06:15:10,410][13252] Num visible devices: 1
[2023-02-26 06:15:10,410][13238] Num visible devices: 1
[2023-02-26 06:15:10,431][13238] Starting seed is not provided
[2023-02-26 06:15:10,432][13238] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 06:15:10,433][13238] Initializing actor-critic model on device cuda:0
[2023-02-26 06:15:10,434][13238] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 06:15:10,437][13238] RunningMeanStd input shape: (1,)
[2023-02-26 06:15:10,457][13238] ConvEncoder: input_channels=3
[2023-02-26 06:15:10,752][13238] Conv encoder output size: 512
[2023-02-26 06:15:10,752][13238] Policy head output size: 512
[2023-02-26 06:15:10,805][13238] Created Actor Critic model with architecture:
[2023-02-26 06:15:10,805][13238] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-26 06:15:17,675][06480] Heartbeat connected on Batcher_0
[2023-02-26 06:15:17,683][06480] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-26 06:15:17,692][06480] Heartbeat connected on RolloutWorker_w0
[2023-02-26 06:15:17,696][06480] Heartbeat connected on RolloutWorker_w1
[2023-02-26 06:15:17,699][06480] Heartbeat connected on RolloutWorker_w2
[2023-02-26 06:15:17,703][06480] Heartbeat connected on RolloutWorker_w3
[2023-02-26 06:15:17,706][06480] Heartbeat connected on RolloutWorker_w4
[2023-02-26 06:15:17,709][06480] Heartbeat connected on RolloutWorker_w5
[2023-02-26 06:15:17,713][06480] Heartbeat connected on RolloutWorker_w6
[2023-02-26 06:15:17,716][06480] Heartbeat connected on RolloutWorker_w7
[2023-02-26 06:15:18,520][13238] Using optimizer
[2023-02-26 06:15:18,522][13238] No checkpoints found
[2023-02-26 06:15:18,522][13238] Did not load from checkpoint, starting from scratch!
[2023-02-26 06:15:18,522][13238] Initialized policy 0 weights for model version 0
[2023-02-26 06:15:18,525][13238] LearnerWorker_p0 finished initialization!
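The learner logs an observation shape of (3, 72, 128) and a "Conv encoder output size: 512". A minimal sketch of where that number can come from, assuming Sample Factory's default `convnet_simple` filter spec (32 filters 8x8 stride 4, then 64x4x4 stride 2, then 128x3x3 stride 2 — an assumption; this run's actual config is not shown in the log). The flattened conv output is what the final `Linear` in `mlp_layers` projects down to 512:

```python
# Illustrative only: the filter spec below is an assumed default, not read
# from this run's config.json.

def conv_out(size: int, kernel: int, stride: int) -> int:
    """Output length of an unpadded convolution along one axis."""
    return (size - kernel) // stride + 1

def encoder_flat_size(h: int, w: int, filters) -> int:
    """Flattened feature size after a stack of (out_channels, kernel, stride) convs."""
    channels = 0
    for out_ch, kernel, stride in filters:
        h = conv_out(h, kernel, stride)
        w = conv_out(w, kernel, stride)
        channels = out_ch
    return channels * h * w

FILTERS = [(32, 8, 4), (64, 4, 2), (128, 3, 2)]  # assumed convnet_simple defaults

flat = encoder_flat_size(72, 128, FILTERS)
print(flat)  # fed to the Linear layer that maps to the logged 512
```

Under these assumed filters the flat size works out to 2304 = 128 * 3 * 6 for a 72x128 input.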
[2023-02-26 06:15:18,526][06480] Heartbeat connected on LearnerWorker_p0
[2023-02-26 06:15:18,526][13238] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 06:15:18,759][13252] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 06:15:18,761][13252] RunningMeanStd input shape: (1,)
[2023-02-26 06:15:18,781][13252] ConvEncoder: input_channels=3
[2023-02-26 06:15:18,933][13252] Conv encoder output size: 512
[2023-02-26 06:15:18,934][13252] Policy head output size: 512
[2023-02-26 06:15:22,133][06480] Inference worker 0-0 is ready!
[2023-02-26 06:15:22,135][06480] All inference workers are ready! Signal rollout workers to start!
[2023-02-26 06:15:22,259][13254] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,262][13259] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,311][13257] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,314][13255] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,388][13256] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,380][13253] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,405][13258] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:22,392][13260] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 06:15:23,272][06480] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 06:15:23,671][13259] Decorrelating experience for 0 frames...
[2023-02-26 06:15:23,672][13254] Decorrelating experience for 0 frames...
[2023-02-26 06:15:23,673][13257] Decorrelating experience for 0 frames...
[2023-02-26 06:15:24,303][13258] Decorrelating experience for 0 frames...
[2023-02-26 06:15:24,306][13253] Decorrelating experience for 0 frames...
[2023-02-26 06:15:24,309][13256] Decorrelating experience for 0 frames...
[2023-02-26 06:15:24,322][13260] Decorrelating experience for 0 frames...
[2023-02-26 06:15:24,722][13259] Decorrelating experience for 32 frames...
[2023-02-26 06:15:24,724][13254] Decorrelating experience for 32 frames...
[2023-02-26 06:15:24,733][13257] Decorrelating experience for 32 frames...
[2023-02-26 06:15:25,361][13258] Decorrelating experience for 32 frames...
[2023-02-26 06:15:25,367][13256] Decorrelating experience for 32 frames...
[2023-02-26 06:15:25,378][13260] Decorrelating experience for 32 frames...
[2023-02-26 06:15:25,915][13256] Decorrelating experience for 64 frames...
[2023-02-26 06:15:26,083][13255] Decorrelating experience for 0 frames...
[2023-02-26 06:15:26,295][13257] Decorrelating experience for 64 frames...
[2023-02-26 06:15:26,299][13259] Decorrelating experience for 64 frames...
[2023-02-26 06:15:26,327][13256] Decorrelating experience for 96 frames...
[2023-02-26 06:15:26,835][13260] Decorrelating experience for 64 frames...
[2023-02-26 06:15:26,875][13254] Decorrelating experience for 64 frames...
[2023-02-26 06:15:27,469][13255] Decorrelating experience for 32 frames...
[2023-02-26 06:15:27,748][13259] Decorrelating experience for 96 frames...
[2023-02-26 06:15:27,753][13257] Decorrelating experience for 96 frames...
[2023-02-26 06:15:28,088][13253] Decorrelating experience for 32 frames...
[2023-02-26 06:15:28,191][13260] Decorrelating experience for 96 frames...
[2023-02-26 06:15:28,272][06480] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 06:15:28,590][13255] Decorrelating experience for 64 frames...
[2023-02-26 06:15:28,681][13254] Decorrelating experience for 96 frames...
[2023-02-26 06:15:29,081][13258] Decorrelating experience for 64 frames...
[2023-02-26 06:15:29,172][13253] Decorrelating experience for 64 frames...
[2023-02-26 06:15:29,652][13255] Decorrelating experience for 96 frames...
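The recurring `Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)` lines report frame throughput averaged over three sliding windows, which is why they read `nan` until enough samples accumulate. A minimal sketch of such a windowed meter (an illustration of the idea, not Sample Factory's actual implementation):

```python
from collections import deque

class WindowedFpsMeter:
    """Sliding-window FPS estimate over (timestamp, total_frames) samples."""

    def __init__(self, window_sec: float):
        self.window_sec = window_sec
        self.samples = deque()  # (timestamp, cumulative frame count) pairs

    def record(self, timestamp: float, total_frames: int) -> None:
        self.samples.append((timestamp, total_frames))
        # Drop samples that have fallen out of the window.
        while self.samples and timestamp - self.samples[0][0] > self.window_sec:
            self.samples.popleft()

    def fps(self) -> float:
        if len(self.samples) < 2:
            return float("nan")  # matches the initial 'nan' readings in the log
        (t0, f0), (t1, f1) = self.samples[0], self.samples[-1]
        return (f1 - f0) / (t1 - t0)

meter = WindowedFpsMeter(window_sec=10.0)
for t, frames in [(0.0, 0), (5.0, 2048), (10.0, 4096)]:
    meter.record(t, frames)
print(meter.fps())  # 409.6 frames/sec over this 10-second window
```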
[2023-02-26 06:15:29,901][13258] Decorrelating experience for 96 frames...
[2023-02-26 06:15:29,986][13253] Decorrelating experience for 96 frames...
[2023-02-26 06:15:33,157][13238] Signal inference workers to stop experience collection...
[2023-02-26 06:15:33,169][13252] InferenceWorker_p0-w0: stopping experience collection
[2023-02-26 06:15:33,272][06480] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 46.2. Samples: 462. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 06:15:33,275][06480] Avg episode reward: [(0, '1.582')]
[2023-02-26 06:15:36,192][13238] Signal inference workers to resume experience collection...
[2023-02-26 06:15:36,193][13252] InferenceWorker_p0-w0: resuming experience collection
[2023-02-26 06:15:38,276][06480] Fps is (10 sec: 409.4, 60 sec: 273.0, 300 sec: 273.0). Total num frames: 4096. Throughput: 0: 170.4. Samples: 2556. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-26 06:15:38,279][06480] Avg episode reward: [(0, '2.078')]
[2023-02-26 06:15:43,272][06480] Fps is (10 sec: 2457.6, 60 sec: 1228.8, 300 sec: 1228.8). Total num frames: 24576. Throughput: 0: 324.4. Samples: 6488. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-26 06:15:43,274][06480] Avg episode reward: [(0, '3.576')]
[2023-02-26 06:15:46,847][13252] Updated weights for policy 0, policy_version 10 (0.0017)
[2023-02-26 06:15:48,272][06480] Fps is (10 sec: 4097.8, 60 sec: 1802.2, 300 sec: 1802.2). Total num frames: 45056. Throughput: 0: 388.8. Samples: 9720. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-26 06:15:48,274][06480] Avg episode reward: [(0, '4.253')]
[2023-02-26 06:15:53,276][06480] Fps is (10 sec: 3684.9, 60 sec: 2047.7, 300 sec: 2047.7). Total num frames: 61440. Throughput: 0: 539.8. Samples: 16196. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:15:53,278][06480] Avg episode reward: [(0, '4.511')]
[2023-02-26 06:15:58,272][06480] Fps is (10 sec: 3276.8, 60 sec: 2223.5, 300 sec: 2223.5). Total num frames: 77824. Throughput: 0: 582.1. Samples: 20372. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:15:58,278][06480] Avg episode reward: [(0, '4.533')]
[2023-02-26 06:15:58,948][13252] Updated weights for policy 0, policy_version 20 (0.0015)
[2023-02-26 06:16:03,272][06480] Fps is (10 sec: 3278.1, 60 sec: 2355.2, 300 sec: 2355.2). Total num frames: 94208. Throughput: 0: 560.0. Samples: 22398. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:16:03,274][06480] Avg episode reward: [(0, '4.462')]
[2023-02-26 06:16:08,272][06480] Fps is (10 sec: 3686.4, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 639.0. Samples: 28754. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:16:08,274][06480] Avg episode reward: [(0, '4.427')]
[2023-02-26 06:16:08,277][13238] Saving new best policy, reward=4.427!
[2023-02-26 06:16:09,601][13252] Updated weights for policy 0, policy_version 30 (0.0013)
[2023-02-26 06:16:13,275][06480] Fps is (10 sec: 4094.7, 60 sec: 2703.2, 300 sec: 2703.2). Total num frames: 135168. Throughput: 0: 764.4. Samples: 34402. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:16:13,281][06480] Avg episode reward: [(0, '4.640')]
[2023-02-26 06:16:13,295][13238] Saving new best policy, reward=4.640!
[2023-02-26 06:16:18,274][06480] Fps is (10 sec: 3276.1, 60 sec: 2680.9, 300 sec: 2680.9). Total num frames: 147456. Throughput: 0: 799.8. Samples: 36456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:16:18,278][06480] Avg episode reward: [(0, '4.596')]
[2023-02-26 06:16:22,752][13252] Updated weights for policy 0, policy_version 40 (0.0018)
[2023-02-26 06:16:23,272][06480] Fps is (10 sec: 2868.1, 60 sec: 2730.7, 300 sec: 2730.7). Total num frames: 163840. Throughput: 0: 850.7. Samples: 40834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:16:23,275][06480] Avg episode reward: [(0, '4.562')]
[2023-02-26 06:16:28,272][06480] Fps is (10 sec: 4096.8, 60 sec: 3140.3, 300 sec: 2898.7). Total num frames: 188416. Throughput: 0: 912.8. Samples: 47564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:16:28,274][06480] Avg episode reward: [(0, '4.379')]
[2023-02-26 06:16:32,261][13252] Updated weights for policy 0, policy_version 50 (0.0012)
[2023-02-26 06:16:33,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 204800. Throughput: 0: 913.0. Samples: 50806. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:16:33,277][06480] Avg episode reward: [(0, '4.403')]
[2023-02-26 06:16:38,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3550.1, 300 sec: 2894.5). Total num frames: 217088. Throughput: 0: 865.8. Samples: 55152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:16:38,274][06480] Avg episode reward: [(0, '4.397')]
[2023-02-26 06:16:43,272][06480] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 2969.6). Total num frames: 237568. Throughput: 0: 879.8. Samples: 59964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:16:43,274][06480] Avg episode reward: [(0, '4.449')]
[2023-02-26 06:16:45,062][13252] Updated weights for policy 0, policy_version 60 (0.0018)
[2023-02-26 06:16:48,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3035.9). Total num frames: 258048. Throughput: 0: 906.3. Samples: 63180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:16:48,274][06480] Avg episode reward: [(0, '4.486')]
[2023-02-26 06:16:53,276][06480] Fps is (10 sec: 3275.5, 60 sec: 3481.6, 300 sec: 3003.6). Total num frames: 270336. Throughput: 0: 878.3. Samples: 68280. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:16:53,278][06480] Avg episode reward: [(0, '4.500')]
[2023-02-26 06:16:53,374][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth...
[2023-02-26 06:16:58,272][06480] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 2975.0). Total num frames: 282624. Throughput: 0: 822.2. Samples: 71398. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:16:58,281][06480] Avg episode reward: [(0, '4.548')]
[2023-02-26 06:16:59,198][13252] Updated weights for policy 0, policy_version 70 (0.0019)
[2023-02-26 06:17:03,273][06480] Fps is (10 sec: 2458.3, 60 sec: 3345.0, 300 sec: 2949.1). Total num frames: 294912. Throughput: 0: 813.7. Samples: 73074. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:17:03,281][06480] Avg episode reward: [(0, '4.567')]
[2023-02-26 06:17:08,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3003.7). Total num frames: 315392. Throughput: 0: 832.1. Samples: 78278. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:17:08,274][06480] Avg episode reward: [(0, '4.676')]
[2023-02-26 06:17:08,282][13238] Saving new best policy, reward=4.676!
[2023-02-26 06:17:10,769][13252] Updated weights for policy 0, policy_version 80 (0.0022)
[2023-02-26 06:17:13,272][06480] Fps is (10 sec: 4096.5, 60 sec: 3345.2, 300 sec: 3053.4). Total num frames: 335872. Throughput: 0: 830.6. Samples: 84942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:17:13,274][06480] Avg episode reward: [(0, '4.641')]
[2023-02-26 06:17:18,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3413.5, 300 sec: 3063.1). Total num frames: 352256. Throughput: 0: 819.9. Samples: 87700. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:17:18,275][06480] Avg episode reward: [(0, '4.437')]
[2023-02-26 06:17:23,109][13252] Updated weights for policy 0, policy_version 90 (0.0013)
[2023-02-26 06:17:23,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3072.0). Total num frames: 368640. Throughput: 0: 818.2. Samples: 91972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:17:23,274][06480] Avg episode reward: [(0, '4.440')]
[2023-02-26 06:17:28,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3113.0). Total num frames: 389120. Throughput: 0: 844.0. Samples: 97946. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:17:28,274][06480] Avg episode reward: [(0, '4.569')]
[2023-02-26 06:17:32,564][13252] Updated weights for policy 0, policy_version 100 (0.0028)
[2023-02-26 06:17:33,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3150.8). Total num frames: 409600. Throughput: 0: 848.7. Samples: 101372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:17:33,274][06480] Avg episode reward: [(0, '4.773')]
[2023-02-26 06:17:33,285][13238] Saving new best policy, reward=4.773!
[2023-02-26 06:17:38,275][06480] Fps is (10 sec: 3685.3, 60 sec: 3481.4, 300 sec: 3155.4). Total num frames: 425984. Throughput: 0: 858.4. Samples: 106906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:17:38,278][06480] Avg episode reward: [(0, '4.602')]
[2023-02-26 06:17:43,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3130.5). Total num frames: 438272. Throughput: 0: 882.7. Samples: 111120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:17:43,281][06480] Avg episode reward: [(0, '4.502')]
[2023-02-26 06:17:45,552][13252] Updated weights for policy 0, policy_version 110 (0.0031)
[2023-02-26 06:17:48,272][06480] Fps is (10 sec: 3687.5, 60 sec: 3413.3, 300 sec: 3192.1). Total num frames: 462848. Throughput: 0: 910.4. Samples: 114040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:17:48,281][06480] Avg episode reward: [(0, '4.371')]
[2023-02-26 06:17:53,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3550.1, 300 sec: 3222.2). Total num frames: 483328. Throughput: 0: 943.4. Samples: 120732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:17:53,279][06480] Avg episode reward: [(0, '4.391')]
[2023-02-26 06:17:54,717][13252] Updated weights for policy 0, policy_version 120 (0.0018)
[2023-02-26 06:17:58,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3223.9). Total num frames: 499712. Throughput: 0: 908.8. Samples: 125838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:17:58,278][06480] Avg episode reward: [(0, '4.418')]
[2023-02-26 06:18:03,272][06480] Fps is (10 sec: 2867.1, 60 sec: 3618.2, 300 sec: 3200.0). Total num frames: 512000. Throughput: 0: 893.1. Samples: 127892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:18:03,277][06480] Avg episode reward: [(0, '4.457')]
[2023-02-26 06:18:07,300][13252] Updated weights for policy 0, policy_version 130 (0.0019)
[2023-02-26 06:18:08,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3252.0). Total num frames: 536576. Throughput: 0: 923.6. Samples: 133532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:18:08,275][06480] Avg episode reward: [(0, '4.595')]
[2023-02-26 06:18:13,272][06480] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3276.8). Total num frames: 557056. Throughput: 0: 939.5. Samples: 140224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:18:13,274][06480] Avg episode reward: [(0, '4.613')]
[2023-02-26 06:18:18,185][13252] Updated weights for policy 0, policy_version 140 (0.0016)
[2023-02-26 06:18:18,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3276.8). Total num frames: 573440. Throughput: 0: 915.6. Samples: 142574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:18:18,274][06480] Avg episode reward: [(0, '4.654')]
[2023-02-26 06:18:23,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3254.0). Total num frames: 585728. Throughput: 0: 884.4. Samples: 146702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:18:23,275][06480] Avg episode reward: [(0, '4.567')]
[2023-02-26 06:18:28,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 606208. Throughput: 0: 927.3. Samples: 152848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:18:28,274][06480] Avg episode reward: [(0, '4.478')]
[2023-02-26 06:18:29,345][13252] Updated weights for policy 0, policy_version 150 (0.0023)
[2023-02-26 06:18:33,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3319.9). Total num frames: 630784. Throughput: 0: 934.3. Samples: 156084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:18:33,274][06480] Avg episode reward: [(0, '4.445')]
[2023-02-26 06:18:38,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3297.8). Total num frames: 643072. Throughput: 0: 900.6. Samples: 161258. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:18:38,274][06480] Avg episode reward: [(0, '4.318')]
[2023-02-26 06:18:41,892][13252] Updated weights for policy 0, policy_version 160 (0.0023)
[2023-02-26 06:18:43,272][06480] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 655360. Throughput: 0: 879.7. Samples: 165426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:18:43,280][06480] Avg episode reward: [(0, '4.402')]
[2023-02-26 06:18:48,272][06480] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3316.8). Total num frames: 679936. Throughput: 0: 899.9. Samples: 168388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:18:48,281][06480] Avg episode reward: [(0, '4.531')]
[2023-02-26 06:18:51,801][13252] Updated weights for policy 0, policy_version 170 (0.0019)
[2023-02-26 06:18:53,272][06480] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3335.3). Total num frames: 700416. Throughput: 0: 924.1. Samples: 175118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:18:53,278][06480] Avg episode reward: [(0, '4.701')]
[2023-02-26 06:18:53,288][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000171_700416.pth...
[2023-02-26 06:18:58,272][06480] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3334.0). Total num frames: 716800. Throughput: 0: 882.2. Samples: 179924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:18:58,280][06480] Avg episode reward: [(0, '4.791')]
[2023-02-26 06:18:58,285][13238] Saving new best policy, reward=4.791!
[2023-02-26 06:19:03,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3314.0). Total num frames: 729088. Throughput: 0: 875.9. Samples: 181988. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:19:03,279][06480] Avg episode reward: [(0, '4.633')]
[2023-02-26 06:19:04,634][13252] Updated weights for policy 0, policy_version 180 (0.0017)
[2023-02-26 06:19:08,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3349.6). Total num frames: 753664. Throughput: 0: 913.0. Samples: 187788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:19:08,274][06480] Avg episode reward: [(0, '4.923')]
[2023-02-26 06:19:08,277][13238] Saving new best policy, reward=4.923!
[2023-02-26 06:19:13,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3365.8). Total num frames: 774144. Throughput: 0: 926.6. Samples: 194544. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:19:13,275][06480] Avg episode reward: [(0, '5.036')]
[2023-02-26 06:19:13,284][13238] Saving new best policy, reward=5.036!
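The checkpoint filenames above (e.g. `checkpoint_000000067_274432.pth`, `checkpoint_000000171_700416.pth`) appear to encode the policy version and the cumulative environment frame count, with frames growing in 4096-frame steps per version in this run. A hedged sketch of that naming pattern as inferred from the log (the actual scheme lives in Sample Factory's source, which is not shown here):

```python
# Inferred from the log, not from the source: env_frames == policy_version * 4096
# in this run, and the version is zero-padded to 9 digits.
FRAMES_PER_VERSION = 4096

def checkpoint_name(policy_version: int) -> str:
    env_frames = policy_version * FRAMES_PER_VERSION
    return f"checkpoint_{policy_version:09d}_{env_frames}.pth"

print(checkpoint_name(67))   # checkpoint_000000067_274432.pth
print(checkpoint_name(171))  # checkpoint_000000171_700416.pth
```

This is consistent with the "Updated weights for policy 0, policy_version N" lines, which tick every 10 versions while "Total num frames" advances in multiples of 4096.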
[2023-02-26 06:19:13,694][13252] Updated weights for policy 0, policy_version 190 (0.0018)
[2023-02-26 06:19:18,274][06480] Fps is (10 sec: 3685.7, 60 sec: 3618.0, 300 sec: 3363.9). Total num frames: 790528. Throughput: 0: 903.4. Samples: 196738. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:19:18,278][06480] Avg episode reward: [(0, '5.137')]
[2023-02-26 06:19:18,285][13238] Saving new best policy, reward=5.137!
[2023-02-26 06:19:23,272][06480] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3328.0). Total num frames: 798720. Throughput: 0: 868.2. Samples: 200326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:19:23,280][06480] Avg episode reward: [(0, '5.077')]
[2023-02-26 06:19:28,272][06480] Fps is (10 sec: 2048.4, 60 sec: 3413.3, 300 sec: 3310.2). Total num frames: 811008. Throughput: 0: 858.9. Samples: 204078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:19:28,274][06480] Avg episode reward: [(0, '5.387')]
[2023-02-26 06:19:28,276][13238] Saving new best policy, reward=5.387!
[2023-02-26 06:19:30,007][13252] Updated weights for policy 0, policy_version 200 (0.0030)
[2023-02-26 06:19:33,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3326.0). Total num frames: 831488. Throughput: 0: 846.0. Samples: 206458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:19:33,274][06480] Avg episode reward: [(0, '5.289')]
[2023-02-26 06:19:38,272][06480] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3341.0). Total num frames: 851968. Throughput: 0: 843.8. Samples: 213088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:19:38,279][06480] Avg episode reward: [(0, '5.230')]
[2023-02-26 06:19:40,660][13252] Updated weights for policy 0, policy_version 210 (0.0019)
[2023-02-26 06:19:43,273][06480] Fps is (10 sec: 3276.5, 60 sec: 3481.5, 300 sec: 3324.0). Total num frames: 864256. Throughput: 0: 831.7. Samples: 217350. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:19:43,277][06480] Avg episode reward: [(0, '4.980')]
[2023-02-26 06:19:48,272][06480] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3323.2). Total num frames: 880640. Throughput: 0: 830.9. Samples: 219378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:19:48,279][06480] Avg episode reward: [(0, '4.760')]
[2023-02-26 06:19:51,964][13252] Updated weights for policy 0, policy_version 220 (0.0021)
[2023-02-26 06:19:53,272][06480] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3352.7). Total num frames: 905216. Throughput: 0: 849.7. Samples: 226024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:19:53,278][06480] Avg episode reward: [(0, '4.862')]
[2023-02-26 06:19:58,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3366.2). Total num frames: 925696. Throughput: 0: 830.7. Samples: 231926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:19:58,278][06480] Avg episode reward: [(0, '4.922')]
[2023-02-26 06:20:03,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3349.9). Total num frames: 937984. Throughput: 0: 827.9. Samples: 233992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:20:03,278][06480] Avg episode reward: [(0, '4.859')]
[2023-02-26 06:20:04,249][13252] Updated weights for policy 0, policy_version 230 (0.0019)
[2023-02-26 06:20:08,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3348.7). Total num frames: 954368. Throughput: 0: 848.8. Samples: 238520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:20:08,277][06480] Avg episode reward: [(0, '4.907')]
[2023-02-26 06:20:13,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3375.7). Total num frames: 978944. Throughput: 0: 909.6. Samples: 245008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:20:13,279][06480] Avg episode reward: [(0, '4.996')]
[2023-02-26 06:20:14,129][13252] Updated weights for policy 0, policy_version 240 (0.0018)
[2023-02-26 06:20:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3374.0). Total num frames: 995328. Throughput: 0: 929.9. Samples: 248302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:20:18,279][06480] Avg episode reward: [(0, '5.029')]
[2023-02-26 06:20:23,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1007616. Throughput: 0: 873.8. Samples: 252408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:20:23,278][06480] Avg episode reward: [(0, '5.137')]
[2023-02-26 06:20:27,511][13252] Updated weights for policy 0, policy_version 250 (0.0014)
[2023-02-26 06:20:28,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 1024000. Throughput: 0: 887.7. Samples: 257294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:20:28,279][06480] Avg episode reward: [(0, '5.182')]
[2023-02-26 06:20:33,272][06480] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3540.7). Total num frames: 1048576. Throughput: 0: 915.1. Samples: 260556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:20:33,275][06480] Avg episode reward: [(0, '5.287')]
[2023-02-26 06:20:36,996][13252] Updated weights for policy 0, policy_version 260 (0.0013)
[2023-02-26 06:20:38,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1064960. Throughput: 0: 906.8. Samples: 266832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:20:38,274][06480] Avg episode reward: [(0, '5.402')]
[2023-02-26 06:20:38,277][13238] Saving new best policy, reward=5.402!
[2023-02-26 06:20:43,277][06480] Fps is (10 sec: 3275.2, 60 sec: 3617.9, 300 sec: 3512.8). Total num frames: 1081344. Throughput: 0: 866.0. Samples: 270900. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:20:43,279][06480] Avg episode reward: [(0, '5.303')]
[2023-02-26 06:20:48,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3512.9). Total num frames: 1097728. Throughput: 0: 866.7. Samples: 272994. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:20:48,274][06480] Avg episode reward: [(0, '5.488')]
[2023-02-26 06:20:48,278][13238] Saving new best policy, reward=5.488!
[2023-02-26 06:20:50,192][13252] Updated weights for policy 0, policy_version 270 (0.0043)
[2023-02-26 06:20:53,272][06480] Fps is (10 sec: 3688.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1118208. Throughput: 0: 904.3. Samples: 279214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:20:53,288][06480] Avg episode reward: [(0, '5.539')]
[2023-02-26 06:20:53,304][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000273_1118208.pth...
[2023-02-26 06:20:53,479][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth
[2023-02-26 06:20:53,506][13238] Saving new best policy, reward=5.539!
[2023-02-26 06:20:58,272][06480] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1134592. Throughput: 0: 886.7. Samples: 284910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:20:58,276][06480] Avg episode reward: [(0, '5.594')]
[2023-02-26 06:20:58,284][13238] Saving new best policy, reward=5.594!
[2023-02-26 06:21:01,821][13252] Updated weights for policy 0, policy_version 280 (0.0012)
[2023-02-26 06:21:03,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1146880. Throughput: 0: 855.7. Samples: 286810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:03,276][06480] Avg episode reward: [(0, '5.562')]
[2023-02-26 06:21:08,272][06480] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1163264. Throughput: 0: 861.7. Samples: 291186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:21:08,279][06480] Avg episode reward: [(0, '5.819')]
[2023-02-26 06:21:08,287][13238] Saving new best policy, reward=5.819!
[2023-02-26 06:21:13,026][13252] Updated weights for policy 0, policy_version 290 (0.0019)
[2023-02-26 06:21:13,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1187840. Throughput: 0: 896.5. Samples: 297638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:21:13,274][06480] Avg episode reward: [(0, '5.861')]
[2023-02-26 06:21:13,292][13238] Saving new best policy, reward=5.861!
[2023-02-26 06:21:18,273][06480] Fps is (10 sec: 4095.3, 60 sec: 3481.5, 300 sec: 3526.7). Total num frames: 1204224. Throughput: 0: 894.1. Samples: 300792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:18,275][06480] Avg episode reward: [(0, '5.671')]
[2023-02-26 06:21:23,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 1220608. Throughput: 0: 846.7. Samples: 304932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:23,282][06480] Avg episode reward: [(0, '5.812')]
[2023-02-26 06:21:25,992][13252] Updated weights for policy 0, policy_version 300 (0.0039)
[2023-02-26 06:21:28,272][06480] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 1236992. Throughput: 0: 866.5. Samples: 309890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:21:28,277][06480] Avg episode reward: [(0, '6.061')]
[2023-02-26 06:21:28,282][13238] Saving new best policy, reward=6.061!
[2023-02-26 06:21:33,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1257472. Throughput: 0: 889.6. Samples: 313028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:21:33,278][06480] Avg episode reward: [(0, '6.405')]
[2023-02-26 06:21:33,291][13238] Saving new best policy, reward=6.405!
[2023-02-26 06:21:35,717][13252] Updated weights for policy 0, policy_version 310 (0.0012)
[2023-02-26 06:21:38,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1277952. Throughput: 0: 888.4. Samples: 319194. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 06:21:38,276][06480] Avg episode reward: [(0, '6.488')]
[2023-02-26 06:21:38,283][13238] Saving new best policy, reward=6.488!
[2023-02-26 06:21:43,273][06480] Fps is (10 sec: 3276.4, 60 sec: 3481.8, 300 sec: 3498.9). Total num frames: 1290240. Throughput: 0: 853.5. Samples: 323316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:43,277][06480] Avg episode reward: [(0, '6.629')]
[2023-02-26 06:21:43,294][13238] Saving new best policy, reward=6.629!
[2023-02-26 06:21:48,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.9). Total num frames: 1306624. Throughput: 0: 857.9. Samples: 325416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:48,276][06480] Avg episode reward: [(0, '6.930')]
[2023-02-26 06:21:48,279][13238] Saving new best policy, reward=6.930!
[2023-02-26 06:21:49,158][13252] Updated weights for policy 0, policy_version 320 (0.0011)
[2023-02-26 06:21:53,282][06480] Fps is (10 sec: 2864.7, 60 sec: 3344.5, 300 sec: 3512.7). Total num frames: 1318912. Throughput: 0: 866.4. Samples: 330184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:21:53,288][06480] Avg episode reward: [(0, '7.178')]
[2023-02-26 06:21:53,299][13238] Saving new best policy, reward=7.178!
[2023-02-26 06:21:58,272][06480] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3512.9). Total num frames: 1331200. Throughput: 0: 806.2. Samples: 333918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:21:58,275][06480] Avg episode reward: [(0, '7.361')]
[2023-02-26 06:21:58,280][13238] Saving new best policy, reward=7.361!
[2023-02-26 06:22:03,272][06480] Fps is (10 sec: 2460.0, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 1343488. Throughput: 0: 780.5. Samples: 335914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:22:03,276][06480] Avg episode reward: [(0, '7.781')]
[2023-02-26 06:22:03,352][13238] Saving new best policy, reward=7.781!
[2023-02-26 06:22:04,984][13252] Updated weights for policy 0, policy_version 330 (0.0026)
[2023-02-26 06:22:08,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3471.2). Total num frames: 1359872. Throughput: 0: 780.1. Samples: 340036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:22:08,275][06480] Avg episode reward: [(0, '8.176')]
[2023-02-26 06:22:08,279][13238] Saving new best policy, reward=8.176!
[2023-02-26 06:22:13,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3485.1). Total num frames: 1380352. Throughput: 0: 814.7. Samples: 346552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:22:13,274][06480] Avg episode reward: [(0, '8.337')]
[2023-02-26 06:22:13,285][13238] Saving new best policy, reward=8.337!
[2023-02-26 06:22:15,203][13252] Updated weights for policy 0, policy_version 340 (0.0024)
[2023-02-26 06:22:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3276.9, 300 sec: 3499.0). Total num frames: 1400832. Throughput: 0: 815.9. Samples: 349744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:22:18,276][06480] Avg episode reward: [(0, '8.028')]
[2023-02-26 06:22:23,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3471.2). Total num frames: 1413120. Throughput: 0: 775.3. Samples: 354084. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:22:23,277][06480] Avg episode reward: [(0, '8.052')]
[2023-02-26 06:22:28,270][13252] Updated weights for policy 0, policy_version 350 (0.0017)
[2023-02-26 06:22:28,272][06480] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3471.2). Total num frames: 1433600. Throughput: 0: 790.6. Samples: 358894. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:22:28,281][06480] Avg episode reward: [(0, '7.913')]
[2023-02-26 06:22:33,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 1454080. Throughput: 0: 816.5. Samples: 362160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:22:33,274][06480] Avg episode reward: [(0, '7.815')]
[2023-02-26 06:22:37,605][13252] Updated weights for policy 0, policy_version 360 (0.0017)
[2023-02-26 06:22:38,272][06480] Fps is (10 sec: 4096.2, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 1474560. Throughput: 0: 855.7. Samples: 368682. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:22:38,280][06480] Avg episode reward: [(0, '8.477')]
[2023-02-26 06:22:38,286][13238] Saving new best policy, reward=8.477!
[2023-02-26 06:22:43,275][06480] Fps is (10 sec: 3275.8, 60 sec: 3276.7, 300 sec: 3471.2). Total num frames: 1486848. Throughput: 0: 863.5. Samples: 372778. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-26 06:22:43,278][06480] Avg episode reward: [(0, '8.631')]
[2023-02-26 06:22:43,300][13238] Saving new best policy, reward=8.631!
[2023-02-26 06:22:48,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3457.3). Total num frames: 1503232. Throughput: 0: 862.9. Samples: 374744. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-26 06:22:48,279][06480] Avg episode reward: [(0, '8.606')]
[2023-02-26 06:22:50,953][13252] Updated weights for policy 0, policy_version 370 (0.0017)
[2023-02-26 06:22:53,272][06480] Fps is (10 sec: 3687.5, 60 sec: 3413.9, 300 sec: 3471.2). Total num frames: 1523712. Throughput: 0: 906.8. Samples: 380844. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:22:53,278][06480] Avg episode reward: [(0, '8.821')]
[2023-02-26 06:22:53,288][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000372_1523712.pth...
[2023-02-26 06:22:53,411][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000171_700416.pth
[2023-02-26 06:22:53,421][13238] Saving new best policy, reward=8.821!
[2023-02-26 06:22:58,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 1544192. Throughput: 0: 894.0. Samples: 386780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:22:58,274][06480] Avg episode reward: [(0, '8.741')]
[2023-02-26 06:23:02,355][13252] Updated weights for policy 0, policy_version 380 (0.0011)
[2023-02-26 06:23:03,276][06480] Fps is (10 sec: 3275.5, 60 sec: 3549.6, 300 sec: 3457.3). Total num frames: 1556480. Throughput: 0: 867.5. Samples: 388784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:23:03,287][06480] Avg episode reward: [(0, '9.003')]
[2023-02-26 06:23:03,309][13238] Saving new best policy, reward=9.003!
[2023-02-26 06:23:08,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1572864. Throughput: 0: 865.2. Samples: 393020. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:23:08,281][06480] Avg episode reward: [(0, '9.082')]
[2023-02-26 06:23:08,285][13238] Saving new best policy, reward=9.082!
[2023-02-26 06:23:13,272][06480] Fps is (10 sec: 3687.9, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 1593344. Throughput: 0: 902.2. Samples: 399494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:23:13,276][06480] Avg episode reward: [(0, '10.086')]
[2023-02-26 06:23:13,291][13238] Saving new best policy, reward=10.086!
[2023-02-26 06:23:13,709][13252] Updated weights for policy 0, policy_version 390 (0.0017)
[2023-02-26 06:23:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1613824. Throughput: 0: 900.0. Samples: 402662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:23:18,276][06480] Avg episode reward: [(0, '10.959')]
[2023-02-26 06:23:18,281][13238] Saving new best policy, reward=10.959!
[2023-02-26 06:23:23,273][06480] Fps is (10 sec: 3276.4, 60 sec: 3549.8, 300 sec: 3457.3). Total num frames: 1626112. Throughput: 0: 848.8. Samples: 406878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:23:23,280][06480] Avg episode reward: [(0, '10.642')]
[2023-02-26 06:23:26,935][13252] Updated weights for policy 0, policy_version 400 (0.0019)
[2023-02-26 06:23:28,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1642496. Throughput: 0: 864.4. Samples: 411674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:23:28,274][06480] Avg episode reward: [(0, '9.966')]
[2023-02-26 06:23:33,272][06480] Fps is (10 sec: 3686.6, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1662976. Throughput: 0: 893.3. Samples: 414944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:23:33,279][06480] Avg episode reward: [(0, '9.625')]
[2023-02-26 06:23:36,157][13252] Updated weights for policy 0, policy_version 410 (0.0012)
[2023-02-26 06:23:38,272][06480] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1683456. Throughput: 0: 898.7. Samples: 421286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:23:38,277][06480] Avg episode reward: [(0, '9.815')]
[2023-02-26 06:23:43,275][06480] Fps is (10 sec: 3275.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1695744. Throughput: 0: 855.6. Samples: 425286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:23:43,279][06480] Avg episode reward: [(0, '10.248')]
[2023-02-26 06:23:48,272][06480] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1716224. Throughput: 0: 858.2. Samples: 427400. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:23:48,277][06480] Avg episode reward: [(0, '10.448')]
[2023-02-26 06:23:49,177][13252] Updated weights for policy 0, policy_version 420 (0.0022)
[2023-02-26 06:23:53,272][06480] Fps is (10 sec: 4097.5, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 1736704. Throughput: 0: 904.1. Samples: 433704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:23:53,274][06480] Avg episode reward: [(0, '9.990')]
[2023-02-26 06:23:58,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1753088. Throughput: 0: 887.1. Samples: 439414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:23:58,276][06480] Avg episode reward: [(0, '10.032')]
[2023-02-26 06:24:00,330][13252] Updated weights for policy 0, policy_version 430 (0.0022)
[2023-02-26 06:24:03,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.8, 300 sec: 3429.5). Total num frames: 1765376. Throughput: 0: 861.3. Samples: 441422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:03,277][06480] Avg episode reward: [(0, '10.741')]
[2023-02-26 06:24:08,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1781760. Throughput: 0: 863.0. Samples: 445710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:08,278][06480] Avg episode reward: [(0, '11.157')]
[2023-02-26 06:24:08,283][13238] Saving new best policy, reward=11.157!
[2023-02-26 06:24:12,034][13252] Updated weights for policy 0, policy_version 440 (0.0019)
[2023-02-26 06:24:13,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1806336. Throughput: 0: 901.1. Samples: 452222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:24:13,274][06480] Avg episode reward: [(0, '11.962')]
[2023-02-26 06:24:13,288][13238] Saving new best policy, reward=11.962!
[2023-02-26 06:24:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1822720. Throughput: 0: 898.1. Samples: 455360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:24:18,274][06480] Avg episode reward: [(0, '11.934')]
[2023-02-26 06:24:23,274][06480] Fps is (10 sec: 2866.6, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 1835008. Throughput: 0: 837.8. Samples: 458990. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:24:23,276][06480] Avg episode reward: [(0, '11.035')]
[2023-02-26 06:24:26,674][13252] Updated weights for policy 0, policy_version 450 (0.0020)
[2023-02-26 06:24:28,273][06480] Fps is (10 sec: 2047.8, 60 sec: 3345.0, 300 sec: 3429.5). Total num frames: 1843200. Throughput: 0: 819.9. Samples: 462178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:24:28,277][06480] Avg episode reward: [(0, '11.075')]
[2023-02-26 06:24:33,272][06480] Fps is (10 sec: 2458.1, 60 sec: 3276.8, 300 sec: 3415.7). Total num frames: 1859584. Throughput: 0: 812.7. Samples: 463972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:33,278][06480] Avg episode reward: [(0, '11.480')]
[2023-02-26 06:24:38,272][06480] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 1880064. Throughput: 0: 809.7. Samples: 470140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:38,274][06480] Avg episode reward: [(0, '12.761')]
[2023-02-26 06:24:38,281][13238] Saving new best policy, reward=12.761!
[2023-02-26 06:24:38,610][13252] Updated weights for policy 0, policy_version 460 (0.0027)
[2023-02-26 06:24:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3457.3). Total num frames: 1900544. Throughput: 0: 809.2. Samples: 475828. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:43,274][06480] Avg episode reward: [(0, '12.489')]
[2023-02-26 06:24:48,274][06480] Fps is (10 sec: 3276.2, 60 sec: 3276.7, 300 sec: 3415.6). Total num frames: 1912832. Throughput: 0: 810.4. Samples: 477890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:24:48,276][06480] Avg episode reward: [(0, '12.721')]
[2023-02-26 06:24:51,589][13252] Updated weights for policy 0, policy_version 470 (0.0011)
[2023-02-26 06:24:53,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1929216. Throughput: 0: 818.0. Samples: 482520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:24:53,274][06480] Avg episode reward: [(0, '12.100')]
[2023-02-26 06:24:53,381][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000472_1933312.pth...
[2023-02-26 06:24:53,496][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000273_1118208.pth
[2023-02-26 06:24:58,272][06480] Fps is (10 sec: 4096.8, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1953792. Throughput: 0: 814.4. Samples: 488870. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:24:58,279][06480] Avg episode reward: [(0, '12.147')]
[2023-02-26 06:25:01,100][13252] Updated weights for policy 0, policy_version 480 (0.0037)
[2023-02-26 06:25:03,274][06480] Fps is (10 sec: 4095.2, 60 sec: 3413.2, 300 sec: 3443.4). Total num frames: 1970176. Throughput: 0: 813.6. Samples: 491972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:25:03,278][06480] Avg episode reward: [(0, '12.727')]
[2023-02-26 06:25:08,274][06480] Fps is (10 sec: 2866.5, 60 sec: 3344.9, 300 sec: 3401.7). Total num frames: 1982464. Throughput: 0: 823.4. Samples: 496044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:08,281][06480] Avg episode reward: [(0, '12.665')]
[2023-02-26 06:25:13,272][06480] Fps is (10 sec: 2867.8, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 1998848. Throughput: 0: 866.7. Samples: 501178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:13,282][06480] Avg episode reward: [(0, '12.999')]
[2023-02-26 06:25:13,332][13238] Saving new best policy, reward=12.999!
[2023-02-26 06:25:14,465][13252] Updated weights for policy 0, policy_version 490 (0.0015)
[2023-02-26 06:25:18,272][06480] Fps is (10 sec: 4097.0, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 2023424. Throughput: 0: 896.4. Samples: 504312. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:25:18,274][06480] Avg episode reward: [(0, '12.398')]
[2023-02-26 06:25:23,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3443.4). Total num frames: 2039808. Throughput: 0: 891.8. Samples: 510270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:25:23,274][06480] Avg episode reward: [(0, '12.724')]
[2023-02-26 06:25:25,642][13252] Updated weights for policy 0, policy_version 500 (0.0024)
[2023-02-26 06:25:28,272][06480] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 2052096. Throughput: 0: 856.3. Samples: 514362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:28,275][06480] Avg episode reward: [(0, '12.522')]
[2023-02-26 06:25:33,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2072576. Throughput: 0: 859.4. Samples: 516562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:33,275][06480] Avg episode reward: [(0, '13.707')]
[2023-02-26 06:25:33,291][13238] Saving new best policy, reward=13.707!
[2023-02-26 06:25:36,922][13252] Updated weights for policy 0, policy_version 510 (0.0015)
[2023-02-26 06:25:38,272][06480] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3429.6). Total num frames: 2093056. Throughput: 0: 900.2. Samples: 523028. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:25:38,278][06480] Avg episode reward: [(0, '14.946')]
[2023-02-26 06:25:38,284][13238] Saving new best policy, reward=14.946!
[2023-02-26 06:25:43,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2109440. Throughput: 0: 880.2. Samples: 528480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:25:43,277][06480] Avg episode reward: [(0, '16.676')]
[2023-02-26 06:25:43,286][13238] Saving new best policy, reward=16.676!
[2023-02-26 06:25:48,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3401.8). Total num frames: 2121728. Throughput: 0: 855.8. Samples: 530480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:48,277][06480] Avg episode reward: [(0, '16.354')]
[2023-02-26 06:25:50,013][13252] Updated weights for policy 0, policy_version 520 (0.0025)
[2023-02-26 06:25:53,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3415.7). Total num frames: 2142208. Throughput: 0: 870.2. Samples: 535200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:25:53,275][06480] Avg episode reward: [(0, '16.321')]
[2023-02-26 06:25:58,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2162688. Throughput: 0: 903.0. Samples: 541814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:25:58,274][06480] Avg episode reward: [(0, '15.866')]
[2023-02-26 06:25:59,521][13252] Updated weights for policy 0, policy_version 530 (0.0014)
[2023-02-26 06:26:03,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3457.3). Total num frames: 2183168. Throughput: 0: 904.0. Samples: 544992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:26:03,274][06480] Avg episode reward: [(0, '14.541')]
[2023-02-26 06:26:08,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3415.6). Total num frames: 2195456. Throughput: 0: 864.4. Samples: 549166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:26:08,277][06480] Avg episode reward: [(0, '14.902')]
[2023-02-26 06:26:12,365][13252] Updated weights for policy 0, policy_version 540 (0.0019)
[2023-02-26 06:26:13,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3415.7). Total num frames: 2211840. Throughput: 0: 892.6. Samples: 554528. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:26:13,278][06480] Avg episode reward: [(0, '15.436')]
[2023-02-26 06:26:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2236416. Throughput: 0: 916.6. Samples: 557810. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:26:18,274][06480] Avg episode reward: [(0, '16.401')]
[2023-02-26 06:26:21,845][13252] Updated weights for policy 0, policy_version 550 (0.0021)
[2023-02-26 06:26:23,272][06480] Fps is (10 sec: 4095.8, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 2252800. Throughput: 0: 907.0. Samples: 563844. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:26:23,278][06480] Avg episode reward: [(0, '16.750')]
[2023-02-26 06:26:23,386][13238] Saving new best policy, reward=16.750!
[2023-02-26 06:26:28,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3429.5). Total num frames: 2269184. Throughput: 0: 877.2. Samples: 567956. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:26:28,280][06480] Avg episode reward: [(0, '17.033')]
[2023-02-26 06:26:28,284][13238] Saving new best policy, reward=17.033!
[2023-02-26 06:26:33,272][06480] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2285568. Throughput: 0: 888.6. Samples: 570466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:26:33,277][06480] Avg episode reward: [(0, '17.097')]
[2023-02-26 06:26:33,286][13238] Saving new best policy, reward=17.097!
[2023-02-26 06:26:34,319][13252] Updated weights for policy 0, policy_version 560 (0.0018)
[2023-02-26 06:26:38,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2310144. Throughput: 0: 932.1. Samples: 577144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:26:38,277][06480] Avg episode reward: [(0, '18.273')]
[2023-02-26 06:26:38,283][13238] Saving new best policy, reward=18.273!
[2023-02-26 06:26:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2326528. Throughput: 0: 908.1. Samples: 582680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:26:43,282][06480] Avg episode reward: [(0, '18.638')] [2023-02-26 06:26:43,301][13238] Saving new best policy, reward=18.638! [2023-02-26 06:26:45,227][13252] Updated weights for policy 0, policy_version 570 (0.0012) [2023-02-26 06:26:48,272][06480] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3457.4). Total num frames: 2338816. Throughput: 0: 882.9. Samples: 584722. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:26:48,281][06480] Avg episode reward: [(0, '17.486')] [2023-02-26 06:26:53,272][06480] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2351104. Throughput: 0: 871.3. Samples: 588376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:26:53,275][06480] Avg episode reward: [(0, '17.578')] [2023-02-26 06:26:53,286][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth... [2023-02-26 06:26:53,451][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000372_1523712.pth [2023-02-26 06:26:58,272][06480] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 2363392. Throughput: 0: 843.7. Samples: 592496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:26:58,281][06480] Avg episode reward: [(0, '16.942')] [2023-02-26 06:27:00,305][13252] Updated weights for policy 0, policy_version 580 (0.0047) [2023-02-26 06:27:03,272][06480] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 2383872. Throughput: 0: 840.2. Samples: 595618. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 06:27:03,279][06480] Avg episode reward: [(0, '16.733')] [2023-02-26 06:27:08,276][06480] Fps is (10 sec: 3684.9, 60 sec: 3413.1, 300 sec: 3457.3). Total num frames: 2400256. 
Throughput: 0: 800.2. Samples: 599858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 06:27:08,280][06480] Avg episode reward: [(0, '16.715')] [2023-02-26 06:27:12,919][13252] Updated weights for policy 0, policy_version 590 (0.0018) [2023-02-26 06:27:13,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2416640. Throughput: 0: 823.2. Samples: 604998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:27:13,281][06480] Avg episode reward: [(0, '16.404')] [2023-02-26 06:27:18,272][06480] Fps is (10 sec: 3687.9, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 2437120. Throughput: 0: 842.5. Samples: 608380. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:27:18,274][06480] Avg episode reward: [(0, '18.634')] [2023-02-26 06:27:22,218][13252] Updated weights for policy 0, policy_version 600 (0.0023) [2023-02-26 06:27:23,274][06480] Fps is (10 sec: 4095.1, 60 sec: 3413.2, 300 sec: 3471.2). Total num frames: 2457600. Throughput: 0: 837.2. Samples: 614818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:27:23,278][06480] Avg episode reward: [(0, '19.932')] [2023-02-26 06:27:23,289][13238] Saving new best policy, reward=19.932! [2023-02-26 06:27:28,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 2469888. Throughput: 0: 804.6. Samples: 618886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:27:28,276][06480] Avg episode reward: [(0, '19.055')] [2023-02-26 06:27:33,272][06480] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2490368. Throughput: 0: 806.1. Samples: 620994. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:27:33,279][06480] Avg episode reward: [(0, '19.446')] [2023-02-26 06:27:35,019][13252] Updated weights for policy 0, policy_version 610 (0.0018) [2023-02-26 06:27:38,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 2510848. Throughput: 0: 873.2. 
Samples: 627668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:27:38,274][06480] Avg episode reward: [(0, '20.452')] [2023-02-26 06:27:38,276][13238] Saving new best policy, reward=20.452! [2023-02-26 06:27:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2531328. Throughput: 0: 913.1. Samples: 633586. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:27:43,280][06480] Avg episode reward: [(0, '20.137')] [2023-02-26 06:27:45,965][13252] Updated weights for policy 0, policy_version 620 (0.0030) [2023-02-26 06:27:48,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3457.3). Total num frames: 2543616. Throughput: 0: 889.3. Samples: 635636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 06:27:48,276][06480] Avg episode reward: [(0, '19.549')] [2023-02-26 06:27:53,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2564096. Throughput: 0: 898.8. Samples: 640298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:27:53,278][06480] Avg episode reward: [(0, '19.501')] [2023-02-26 06:27:57,010][13252] Updated weights for policy 0, policy_version 630 (0.0044) [2023-02-26 06:27:58,273][06480] Fps is (10 sec: 4095.6, 60 sec: 3686.3, 300 sec: 3485.1). Total num frames: 2584576. Throughput: 0: 931.8. Samples: 646930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:27:58,275][06480] Avg episode reward: [(0, '20.517')] [2023-02-26 06:27:58,277][13238] Saving new best policy, reward=20.517! [2023-02-26 06:28:03,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2600960. Throughput: 0: 927.3. Samples: 650108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:28:03,276][06480] Avg episode reward: [(0, '18.537')] [2023-02-26 06:28:08,272][06480] Fps is (10 sec: 3277.1, 60 sec: 3618.4, 300 sec: 3471.2). Total num frames: 2617344. Throughput: 0: 877.6. Samples: 654310. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:28:08,279][06480] Avg episode reward: [(0, '18.562')] [2023-02-26 06:28:09,296][13252] Updated weights for policy 0, policy_version 640 (0.0014) [2023-02-26 06:28:13,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2633728. Throughput: 0: 901.7. Samples: 659462. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 06:28:13,277][06480] Avg episode reward: [(0, '18.507')] [2023-02-26 06:28:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3499.0). Total num frames: 2658304. Throughput: 0: 928.4. Samples: 662770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:28:18,274][06480] Avg episode reward: [(0, '17.729')] [2023-02-26 06:28:19,227][13252] Updated weights for policy 0, policy_version 650 (0.0027) [2023-02-26 06:28:23,276][06480] Fps is (10 sec: 4094.3, 60 sec: 3618.0, 300 sec: 3498.9). Total num frames: 2674688. Throughput: 0: 919.0. Samples: 669028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:28:23,278][06480] Avg episode reward: [(0, '17.950')] [2023-02-26 06:28:28,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2686976. Throughput: 0: 875.7. Samples: 672994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:28:28,282][06480] Avg episode reward: [(0, '17.807')] [2023-02-26 06:28:32,225][13252] Updated weights for policy 0, policy_version 660 (0.0032) [2023-02-26 06:28:33,272][06480] Fps is (10 sec: 3278.2, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2707456. Throughput: 0: 876.9. Samples: 675098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 06:28:33,279][06480] Avg episode reward: [(0, '18.766')] [2023-02-26 06:28:38,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 2727936. Throughput: 0: 922.4. Samples: 681806. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:28:38,278][06480] Avg episode reward: [(0, '18.210')] [2023-02-26 06:28:41,744][13252] Updated weights for policy 0, policy_version 670 (0.0013) [2023-02-26 06:28:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 2748416. Throughput: 0: 906.4. Samples: 687716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 06:28:43,274][06480] Avg episode reward: [(0, '20.171')] [2023-02-26 06:28:48,273][06480] Fps is (10 sec: 3276.3, 60 sec: 3618.0, 300 sec: 3471.2). Total num frames: 2760704. Throughput: 0: 883.5. Samples: 689868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 06:28:48,278][06480] Avg episode reward: [(0, '21.242')] [2023-02-26 06:28:48,281][13238] Saving new best policy, reward=21.242! [2023-02-26 06:28:53,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2781184. Throughput: 0: 894.6. Samples: 694566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:28:53,274][06480] Avg episode reward: [(0, '22.411')] [2023-02-26 06:28:53,288][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000679_2781184.pth... [2023-02-26 06:28:53,401][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000472_1933312.pth [2023-02-26 06:28:53,408][13238] Saving new best policy, reward=22.411! [2023-02-26 06:28:54,305][13252] Updated weights for policy 0, policy_version 680 (0.0014) [2023-02-26 06:28:58,272][06480] Fps is (10 sec: 4096.5, 60 sec: 3618.2, 300 sec: 3512.8). Total num frames: 2801664. Throughput: 0: 923.8. Samples: 701032. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 06:28:58,275][06480] Avg episode reward: [(0, '22.438')] [2023-02-26 06:28:58,281][13238] Saving new best policy, reward=22.438! [2023-02-26 06:29:03,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 2818048. Throughput: 0: 922.2. 
Samples: 704268. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:29:03,277][06480] Avg episode reward: [(0, '22.316')]
[2023-02-26 06:29:05,404][13252] Updated weights for policy 0, policy_version 690 (0.0025)
[2023-02-26 06:29:08,272][06480] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2830336. Throughput: 0: 873.0. Samples: 708308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:29:08,279][06480] Avg episode reward: [(0, '20.347')]
[2023-02-26 06:29:13,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 895.9. Samples: 713310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:29:13,275][06480] Avg episode reward: [(0, '18.684')]
[2023-02-26 06:29:16,786][13252] Updated weights for policy 0, policy_version 700 (0.0019)
[2023-02-26 06:29:18,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 2871296. Throughput: 0: 919.6. Samples: 716482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:29:18,274][06480] Avg episode reward: [(0, '18.632')]
[2023-02-26 06:29:23,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3550.1, 300 sec: 3540.6). Total num frames: 2887680. Throughput: 0: 890.2. Samples: 721864. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:29:23,276][06480] Avg episode reward: [(0, '18.087')]
[2023-02-26 06:29:28,272][06480] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 2895872. Throughput: 0: 830.8. Samples: 725104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:29:28,280][06480] Avg episode reward: [(0, '17.589')]
[2023-02-26 06:29:32,583][13252] Updated weights for policy 0, policy_version 710 (0.0027)
[2023-02-26 06:29:33,272][06480] Fps is (10 sec: 2048.0, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2908160. Throughput: 0: 819.1. Samples: 726728.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:29:33,275][06480] Avg episode reward: [(0, '17.534')]
[2023-02-26 06:29:38,272][06480] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2928640. Throughput: 0: 826.4. Samples: 731752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:29:38,280][06480] Avg episode reward: [(0, '19.361')]
[2023-02-26 06:29:42,605][13252] Updated weights for policy 0, policy_version 720 (0.0012)
[2023-02-26 06:29:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3512.9). Total num frames: 2949120. Throughput: 0: 830.1. Samples: 738386. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:29:43,277][06480] Avg episode reward: [(0, '20.260')]
[2023-02-26 06:29:48,273][06480] Fps is (10 sec: 3686.1, 60 sec: 3413.4, 300 sec: 3512.8). Total num frames: 2965504. Throughput: 0: 818.7. Samples: 741112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:29:48,280][06480] Avg episode reward: [(0, '19.681')]
[2023-02-26 06:29:53,273][06480] Fps is (10 sec: 2866.9, 60 sec: 3276.7, 300 sec: 3471.2). Total num frames: 2977792. Throughput: 0: 819.5. Samples: 745186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:29:53,277][06480] Avg episode reward: [(0, '19.725')]
[2023-02-26 06:29:55,662][13252] Updated weights for policy 0, policy_version 730 (0.0019)
[2023-02-26 06:29:58,272][06480] Fps is (10 sec: 3277.0, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 2998272. Throughput: 0: 828.9. Samples: 750612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:29:58,274][06480] Avg episode reward: [(0, '20.222')]
[2023-02-26 06:30:03,272][06480] Fps is (10 sec: 4096.4, 60 sec: 3345.1, 300 sec: 3512.9). Total num frames: 3018752. Throughput: 0: 829.9. Samples: 753826.
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:30:03,278][06480] Avg episode reward: [(0, '19.873')]
[2023-02-26 06:30:06,057][13252] Updated weights for policy 0, policy_version 740 (0.0012)
[2023-02-26 06:30:08,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 3035136. Throughput: 0: 829.6. Samples: 759196. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 06:30:08,279][06480] Avg episode reward: [(0, '20.334')]
[2023-02-26 06:30:13,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3471.2). Total num frames: 3047424. Throughput: 0: 848.6. Samples: 763292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:30:13,275][06480] Avg episode reward: [(0, '21.425')]
[2023-02-26 06:30:18,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 3067904. Throughput: 0: 870.6. Samples: 765906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:30:18,278][06480] Avg episode reward: [(0, '21.415')]
[2023-02-26 06:30:18,452][13252] Updated weights for policy 0, policy_version 750 (0.0048)
[2023-02-26 06:30:23,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 3092480. Throughput: 0: 908.8. Samples: 772646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:30:23,274][06480] Avg episode reward: [(0, '22.097')]
[2023-02-26 06:30:28,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3108864. Throughput: 0: 877.2. Samples: 777860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:30:28,279][06480] Avg episode reward: [(0, '23.036')]
[2023-02-26 06:30:28,285][13238] Saving new best policy, reward=23.036!
[2023-02-26 06:30:29,827][13252] Updated weights for policy 0, policy_version 760 (0.0021)
[2023-02-26 06:30:33,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3121152. Throughput: 0: 859.7. Samples: 779798.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:30:33,280][06480] Avg episode reward: [(0, '23.180')]
[2023-02-26 06:30:33,303][13238] Saving new best policy, reward=23.180!
[2023-02-26 06:30:38,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3141632. Throughput: 0: 882.2. Samples: 784884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:30:38,281][06480] Avg episode reward: [(0, '22.062')]
[2023-02-26 06:30:40,913][13252] Updated weights for policy 0, policy_version 770 (0.0024)
[2023-02-26 06:30:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3162112. Throughput: 0: 907.3. Samples: 791440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:30:43,274][06480] Avg episode reward: [(0, '21.140')]
[2023-02-26 06:30:48,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3178496. Throughput: 0: 895.6. Samples: 794126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:30:48,277][06480] Avg episode reward: [(0, '20.583')]
[2023-02-26 06:30:53,272][06480] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3190784. Throughput: 0: 868.8. Samples: 798292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:30:53,275][06480] Avg episode reward: [(0, '19.434')]
[2023-02-26 06:30:53,287][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000779_3190784.pth...
[2023-02-26 06:30:53,475][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth
[2023-02-26 06:30:53,592][13252] Updated weights for policy 0, policy_version 780 (0.0022)
[2023-02-26 06:30:58,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3211264. Throughput: 0: 900.6. Samples: 803820.
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:30:58,279][06480] Avg episode reward: [(0, '19.900')]
[2023-02-26 06:31:03,272][06480] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3231744. Throughput: 0: 914.4. Samples: 807056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:31:03,274][06480] Avg episode reward: [(0, '20.001')]
[2023-02-26 06:31:03,605][13252] Updated weights for policy 0, policy_version 790 (0.0013)
[2023-02-26 06:31:08,274][06480] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3512.8). Total num frames: 3248128. Throughput: 0: 885.6. Samples: 812502. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 06:31:08,277][06480] Avg episode reward: [(0, '20.948')]
[2023-02-26 06:31:13,272][06480] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3260416. Throughput: 0: 859.9. Samples: 816556. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:31:13,277][06480] Avg episode reward: [(0, '22.050')]
[2023-02-26 06:31:16,737][13252] Updated weights for policy 0, policy_version 800 (0.0025)
[2023-02-26 06:31:18,272][06480] Fps is (10 sec: 3277.4, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 3280896. Throughput: 0: 875.8. Samples: 819210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:31:18,275][06480] Avg episode reward: [(0, '22.213')]
[2023-02-26 06:31:23,272][06480] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3301376. Throughput: 0: 906.2. Samples: 825662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:31:23,277][06480] Avg episode reward: [(0, '21.573')]
[2023-02-26 06:31:27,064][13252] Updated weights for policy 0, policy_version 810 (0.0012)
[2023-02-26 06:31:28,274][06480] Fps is (10 sec: 3685.8, 60 sec: 3481.5, 300 sec: 3498.9). Total num frames: 3317760. Throughput: 0: 875.1. Samples: 830820.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:31:28,276][06480] Avg episode reward: [(0, '21.304')]
[2023-02-26 06:31:33,273][06480] Fps is (10 sec: 3276.4, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3334144. Throughput: 0: 858.9. Samples: 832776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:31:33,276][06480] Avg episode reward: [(0, '21.973')]
[2023-02-26 06:31:38,272][06480] Fps is (10 sec: 3277.3, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3350528. Throughput: 0: 882.8. Samples: 838016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:31:38,277][06480] Avg episode reward: [(0, '19.576')]
[2023-02-26 06:31:39,254][13252] Updated weights for policy 0, policy_version 820 (0.0016)
[2023-02-26 06:31:43,272][06480] Fps is (10 sec: 4096.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3375104. Throughput: 0: 908.1. Samples: 844684. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:31:43,275][06480] Avg episode reward: [(0, '19.675')]
[2023-02-26 06:31:48,276][06480] Fps is (10 sec: 4094.4, 60 sec: 3549.6, 300 sec: 3526.7). Total num frames: 3391488. Throughput: 0: 895.7. Samples: 847366. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:31:48,279][06480] Avg episode reward: [(0, '20.379')]
[2023-02-26 06:31:50,889][13252] Updated weights for policy 0, policy_version 830 (0.0015)
[2023-02-26 06:31:53,272][06480] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3403776. Throughput: 0: 865.5. Samples: 851446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:31:53,274][06480] Avg episode reward: [(0, '19.535')]
[2023-02-26 06:31:58,272][06480] Fps is (10 sec: 2458.7, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 3416064. Throughput: 0: 853.7. Samples: 854974.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:31:58,274][06480] Avg episode reward: [(0, '19.374')]
[2023-02-26 06:32:03,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 3432448. Throughput: 0: 839.3. Samples: 856980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:32:03,277][06480] Avg episode reward: [(0, '20.583')]
[2023-02-26 06:32:05,233][13252] Updated weights for policy 0, policy_version 840 (0.0018)
[2023-02-26 06:32:08,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3345.2, 300 sec: 3499.0). Total num frames: 3448832. Throughput: 0: 810.3. Samples: 862126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:32:08,279][06480] Avg episode reward: [(0, '20.284')]
[2023-02-26 06:32:13,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 3461120. Throughput: 0: 786.4. Samples: 866206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:32:13,278][06480] Avg episode reward: [(0, '19.823')]
[2023-02-26 06:32:18,232][13252] Updated weights for policy 0, policy_version 850 (0.0016)
[2023-02-26 06:32:18,272][06480] Fps is (10 sec: 3276.7, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 3481600. Throughput: 0: 799.8. Samples: 868766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:32:18,274][06480] Avg episode reward: [(0, '20.445')]
[2023-02-26 06:32:23,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 3502080. Throughput: 0: 829.4. Samples: 875338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:32:23,280][06480] Avg episode reward: [(0, '21.473')]
[2023-02-26 06:32:28,272][06480] Fps is (10 sec: 3686.5, 60 sec: 3345.2, 300 sec: 3485.1). Total num frames: 3518464. Throughput: 0: 802.3. Samples: 880786.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:32:28,274][06480] Avg episode reward: [(0, '21.928')]
[2023-02-26 06:32:28,796][13252] Updated weights for policy 0, policy_version 860 (0.0017)
[2023-02-26 06:32:33,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3276.9, 300 sec: 3457.3). Total num frames: 3530752. Throughput: 0: 787.8. Samples: 882812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:32:33,275][06480] Avg episode reward: [(0, '22.125')]
[2023-02-26 06:32:38,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 3551232. Throughput: 0: 809.2. Samples: 887862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:32:38,274][06480] Avg episode reward: [(0, '23.461')]
[2023-02-26 06:32:38,281][13238] Saving new best policy, reward=23.461!
[2023-02-26 06:32:40,460][13252] Updated weights for policy 0, policy_version 870 (0.0017)
[2023-02-26 06:32:43,272][06480] Fps is (10 sec: 4505.6, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 3575808. Throughput: 0: 875.0. Samples: 894348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:32:43,280][06480] Avg episode reward: [(0, '24.011')]
[2023-02-26 06:32:43,290][13238] Saving new best policy, reward=24.011!
[2023-02-26 06:32:48,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3277.1, 300 sec: 3471.2). Total num frames: 3588096. Throughput: 0: 890.1. Samples: 897034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:32:48,277][06480] Avg episode reward: [(0, '25.016')]
[2023-02-26 06:32:48,278][13238] Saving new best policy, reward=25.016!
[2023-02-26 06:32:52,831][13252] Updated weights for policy 0, policy_version 880 (0.0014)
[2023-02-26 06:32:53,273][06480] Fps is (10 sec: 2866.9, 60 sec: 3345.0, 300 sec: 3457.3). Total num frames: 3604480. Throughput: 0: 865.9. Samples: 901092.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:32:53,279][06480] Avg episode reward: [(0, '24.956')]
[2023-02-26 06:32:53,298][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000880_3604480.pth...
[2023-02-26 06:32:53,437][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000679_2781184.pth
[2023-02-26 06:32:58,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3624960. Throughput: 0: 899.6. Samples: 906688. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 06:32:58,275][06480] Avg episode reward: [(0, '24.288')]
[2023-02-26 06:33:03,073][13252] Updated weights for policy 0, policy_version 890 (0.0020)
[2023-02-26 06:33:03,272][06480] Fps is (10 sec: 4096.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3645440. Throughput: 0: 914.7. Samples: 909928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:03,274][06480] Avg episode reward: [(0, '23.134')]
[2023-02-26 06:33:08,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3661824. Throughput: 0: 891.6. Samples: 915462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:08,279][06480] Avg episode reward: [(0, '21.925')]
[2023-02-26 06:33:13,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3674112. Throughput: 0: 862.4. Samples: 919596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:33:13,277][06480] Avg episode reward: [(0, '21.501')]
[2023-02-26 06:33:16,185][13252] Updated weights for policy 0, policy_version 900 (0.0039)
[2023-02-26 06:33:18,272][06480] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 3694592. Throughput: 0: 877.3. Samples: 922292.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:33:18,274][06480] Avg episode reward: [(0, '20.837')]
[2023-02-26 06:33:23,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3715072. Throughput: 0: 911.5. Samples: 928878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:33:23,274][06480] Avg episode reward: [(0, '21.001')]
[2023-02-26 06:33:25,819][13252] Updated weights for policy 0, policy_version 910 (0.0015)
[2023-02-26 06:33:28,273][06480] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3731456. Throughput: 0: 879.9. Samples: 933944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:28,285][06480] Avg episode reward: [(0, '22.389')]
[2023-02-26 06:33:33,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3743744. Throughput: 0: 863.8. Samples: 935906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:33,284][06480] Avg episode reward: [(0, '23.375')]
[2023-02-26 06:33:38,272][06480] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3764224. Throughput: 0: 887.0. Samples: 941004. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:33:38,274][06480] Avg episode reward: [(0, '24.938')]
[2023-02-26 06:33:38,708][13252] Updated weights for policy 0, policy_version 920 (0.0014)
[2023-02-26 06:33:43,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3784704. Throughput: 0: 909.8. Samples: 947628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:43,274][06480] Avg episode reward: [(0, '25.481')]
[2023-02-26 06:33:43,290][13238] Saving new best policy, reward=25.481!
[2023-02-26 06:33:48,273][06480] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3457.3). Total num frames: 3801088. Throughput: 0: 893.7. Samples: 950146.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:48,275][06480] Avg episode reward: [(0, '25.422')]
[2023-02-26 06:33:50,045][13252] Updated weights for policy 0, policy_version 930 (0.0015)
[2023-02-26 06:33:53,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3817472. Throughput: 0: 864.0. Samples: 954342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 06:33:53,274][06480] Avg episode reward: [(0, '25.514')]
[2023-02-26 06:33:53,289][13238] Saving new best policy, reward=25.514!
[2023-02-26 06:33:58,272][06480] Fps is (10 sec: 3686.7, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 3837952. Throughput: 0: 897.1. Samples: 959964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:33:58,274][06480] Avg episode reward: [(0, '26.394')]
[2023-02-26 06:33:58,281][13238] Saving new best policy, reward=26.394!
[2023-02-26 06:34:01,161][13252] Updated weights for policy 0, policy_version 940 (0.0015)
[2023-02-26 06:34:03,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3858432. Throughput: 0: 906.3. Samples: 963074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:34:03,280][06480] Avg episode reward: [(0, '25.180')]
[2023-02-26 06:34:08,272][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3870720. Throughput: 0: 876.3. Samples: 968312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 06:34:08,275][06480] Avg episode reward: [(0, '24.195')]
[2023-02-26 06:34:13,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3887104. Throughput: 0: 856.0. Samples: 972464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 06:34:13,282][06480] Avg episode reward: [(0, '25.719')]
[2023-02-26 06:34:14,409][13252] Updated weights for policy 0, policy_version 950 (0.0031)
[2023-02-26 06:34:18,272][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3).
Total num frames: 3907584. Throughput: 0: 876.4. Samples: 975346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:34:18,274][06480] Avg episode reward: [(0, '26.194')]
[2023-02-26 06:34:23,272][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3928064. Throughput: 0: 904.1. Samples: 981688. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 06:34:23,274][06480] Avg episode reward: [(0, '24.475')]
[2023-02-26 06:34:23,847][13252] Updated weights for policy 0, policy_version 960 (0.0027)
[2023-02-26 06:34:28,273][06480] Fps is (10 sec: 3686.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3944448. Throughput: 0: 865.8. Samples: 986592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:34:28,284][06480] Avg episode reward: [(0, '24.273')]
[2023-02-26 06:34:33,272][06480] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3952640. Throughput: 0: 845.2. Samples: 988178. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:34:33,277][06480] Avg episode reward: [(0, '24.760')]
[2023-02-26 06:34:38,275][06480] Fps is (10 sec: 2047.5, 60 sec: 3344.9, 300 sec: 3443.4). Total num frames: 3964928. Throughput: 0: 821.9. Samples: 991330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 06:34:38,278][06480] Avg episode reward: [(0, '25.315')]
[2023-02-26 06:34:40,519][13252] Updated weights for policy 0, policy_version 970 (0.0017)
[2023-02-26 06:34:43,272][06480] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 3981312. Throughput: 0: 816.6. Samples: 996712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 06:34:43,278][06480] Avg episode reward: [(0, '24.845')]
[2023-02-26 06:34:48,123][13238] Stopping Batcher_0...
[2023-02-26 06:34:48,124][13238] Loop batcher_evt_loop terminating...
[2023-02-26 06:34:48,124][06480] Component Batcher_0 stopped!
[2023-02-26 06:34:48,141][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 06:34:48,208][13252] Weights refcount: 2 0
[2023-02-26 06:34:48,218][06480] Component InferenceWorker_p0-w0 stopped!
[2023-02-26 06:34:48,233][13252] Stopping InferenceWorker_p0-w0...
[2023-02-26 06:34:48,234][13252] Loop inference_proc0-0_evt_loop terminating...
[2023-02-26 06:34:48,258][06480] Component RolloutWorker_w3 stopped!
[2023-02-26 06:34:48,265][06480] Component RolloutWorker_w5 stopped!
[2023-02-26 06:34:48,267][13257] Stopping RolloutWorker_w5...
[2023-02-26 06:34:48,268][13257] Loop rollout_proc5_evt_loop terminating...
[2023-02-26 06:34:48,260][13255] Stopping RolloutWorker_w3...
[2023-02-26 06:34:48,272][13255] Loop rollout_proc3_evt_loop terminating...
[2023-02-26 06:34:48,292][06480] Component RolloutWorker_w1 stopped!
[2023-02-26 06:34:48,294][13254] Stopping RolloutWorker_w1...
[2023-02-26 06:34:48,305][13254] Loop rollout_proc1_evt_loop terminating...
[2023-02-26 06:34:48,315][06480] Component RolloutWorker_w7 stopped!
[2023-02-26 06:34:48,319][13259] Stopping RolloutWorker_w7...
[2023-02-26 06:34:48,320][13259] Loop rollout_proc7_evt_loop terminating...
[2023-02-26 06:34:48,325][13258] Stopping RolloutWorker_w4...
[2023-02-26 06:34:48,325][06480] Component RolloutWorker_w4 stopped!
[2023-02-26 06:34:48,346][13258] Loop rollout_proc4_evt_loop terminating...
[2023-02-26 06:34:48,360][13238] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000779_3190784.pth
[2023-02-26 06:34:48,371][13260] Stopping RolloutWorker_w6...
[2023-02-26 06:34:48,371][13260] Loop rollout_proc6_evt_loop terminating...
[2023-02-26 06:34:48,371][06480] Component RolloutWorker_w6 stopped!
[2023-02-26 06:34:48,381][13238] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 06:34:48,384][13256] Stopping RolloutWorker_w2...
[2023-02-26 06:34:48,384][13256] Loop rollout_proc2_evt_loop terminating...
[2023-02-26 06:34:48,384][06480] Component RolloutWorker_w2 stopped!
[2023-02-26 06:34:48,439][13253] Stopping RolloutWorker_w0...
[2023-02-26 06:34:48,440][13253] Loop rollout_proc0_evt_loop terminating...
[2023-02-26 06:34:48,439][06480] Component RolloutWorker_w0 stopped!
[2023-02-26 06:34:48,683][13238] Stopping LearnerWorker_p0...
[2023-02-26 06:34:48,683][06480] Component LearnerWorker_p0 stopped!
[2023-02-26 06:34:48,685][06480] Waiting for process learner_proc0 to stop...
[2023-02-26 06:34:48,699][13238] Loop learner_proc0_evt_loop terminating...
[2023-02-26 06:34:50,928][06480] Waiting for process inference_proc0-0 to join...
[2023-02-26 06:34:51,761][06480] Waiting for process rollout_proc0 to join...
[2023-02-26 06:34:52,689][06480] Waiting for process rollout_proc1 to join...
[2023-02-26 06:34:52,692][06480] Waiting for process rollout_proc2 to join...
[2023-02-26 06:34:52,699][06480] Waiting for process rollout_proc3 to join...
[2023-02-26 06:34:52,700][06480] Waiting for process rollout_proc4 to join...
[2023-02-26 06:34:52,703][06480] Waiting for process rollout_proc5 to join...
[2023-02-26 06:34:52,705][06480] Waiting for process rollout_proc6 to join...
[2023-02-26 06:34:52,708][06480] Waiting for process rollout_proc7 to join...
[2023-02-26 06:34:52,714][06480] Batcher 0 profile tree view:
batching: 25.3344, releasing_batches: 0.0277
[2023-02-26 06:34:52,716][06480] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0001
  wait_policy_total: 582.9391
update_model: 7.9647
  weight_update: 0.0014
one_step: 0.0113
  handle_policy_step: 529.2481
    deserialize: 15.6504, stack: 3.1020, obs_to_device_normalize: 117.3064, forward: 254.7965, send_messages: 27.2844
    prepare_outputs: 84.1286
      to_cpu: 51.2895
[2023-02-26 06:34:52,718][06480] Learner 0 profile tree view:
misc: 0.0078, prepare_batch: 17.2218
train: 76.6284
  epoch_init: 0.0271, minibatch_init: 0.0107, losses_postprocess: 0.5949, kl_divergence: 0.5760, after_optimizer: 33.3189
  calculate_losses: 27.2007
    losses_init: 0.0035, forward_head: 1.9252, bptt_initial: 17.7151, tail: 1.1068, advantages_returns: 0.3227, losses: 3.5600
    bptt: 2.2171
      bptt_forward_core: 2.1199
  update: 14.2121
    clip: 1.4362
[2023-02-26 06:34:52,720][06480] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3924, enqueue_policy_requests: 163.6997, env_step: 867.2261, overhead: 23.9751, complete_rollouts: 7.0514
save_policy_outputs: 22.1973
  split_output_tensors: 10.7631
[2023-02-26 06:34:52,723][06480] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3423, enqueue_policy_requests: 169.2092, env_step: 861.8023, overhead: 24.6130, complete_rollouts: 7.1703
save_policy_outputs: 22.0299
  split_output_tensors: 10.8844
[2023-02-26 06:34:52,725][06480] Loop Runner_EvtLoop terminating...
[2023-02-26 06:34:52,731][06480] Runner profile tree view:
main_loop: 1195.0150
[2023-02-26 06:34:52,733][06480] Collected {0: 4005888}, FPS: 3352.2
[2023-02-26 07:01:04,608][06480] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 07:01:04,610][06480] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 07:01:04,612][06480] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 07:01:04,613][06480] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 07:01:04,615][06480] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:01:04,617][06480] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 07:01:04,619][06480] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:01:04,621][06480] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 07:01:04,623][06480] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-26 07:01:04,625][06480] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-26 07:01:04,628][06480] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 07:01:04,631][06480] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 07:01:04,634][06480] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 07:01:04,637][06480] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 07:01:04,640][06480] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 07:01:04,670][06480] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:01:04,673][06480] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:01:04,679][06480] RunningMeanStd input shape: (1,)
[2023-02-26 07:01:04,697][06480] ConvEncoder: input_channels=3
[2023-02-26 07:01:05,374][06480] Conv encoder output size: 512
[2023-02-26 07:01:05,377][06480] Policy head output size: 512
[2023-02-26 07:01:07,835][06480] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 07:01:09,126][06480] Num frames 100...
[2023-02-26 07:01:09,246][06480] Num frames 200...
[2023-02-26 07:01:09,369][06480] Num frames 300...
[2023-02-26 07:01:09,487][06480] Num frames 400...
[2023-02-26 07:01:09,606][06480] Num frames 500...
[2023-02-26 07:01:09,737][06480] Num frames 600...
[2023-02-26 07:01:09,861][06480] Num frames 700...
[2023-02-26 07:01:09,995][06480] Num frames 800...
[2023-02-26 07:01:10,112][06480] Num frames 900...
[2023-02-26 07:01:10,191][06480] Avg episode rewards: #0: 19.200, true rewards: #0: 9.200
[2023-02-26 07:01:10,193][06480] Avg episode reward: 19.200, avg true_objective: 9.200
[2023-02-26 07:01:10,289][06480] Num frames 1000...
[2023-02-26 07:01:10,404][06480] Num frames 1100...
[2023-02-26 07:01:10,518][06480] Num frames 1200...
[2023-02-26 07:01:10,642][06480] Num frames 1300...
[2023-02-26 07:01:10,787][06480] Avg episode rewards: #0: 12.340, true rewards: #0: 6.840
[2023-02-26 07:01:10,790][06480] Avg episode reward: 12.340, avg true_objective: 6.840
[2023-02-26 07:01:10,829][06480] Num frames 1400...
[2023-02-26 07:01:10,960][06480] Num frames 1500...
[2023-02-26 07:01:11,073][06480] Num frames 1600...
[2023-02-26 07:01:11,185][06480] Num frames 1700...
[2023-02-26 07:01:11,304][06480] Num frames 1800...
[2023-02-26 07:01:11,422][06480] Num frames 1900...
[2023-02-26 07:01:11,538][06480] Num frames 2000...
[2023-02-26 07:01:11,652][06480] Num frames 2100...
[2023-02-26 07:01:11,773][06480] Num frames 2200...
[2023-02-26 07:01:11,889][06480] Num frames 2300...
[2023-02-26 07:01:12,005][06480] Num frames 2400...
[2023-02-26 07:01:12,122][06480] Num frames 2500...
[2023-02-26 07:01:12,233][06480] Num frames 2600...
[2023-02-26 07:01:12,356][06480] Num frames 2700...
[2023-02-26 07:01:12,474][06480] Num frames 2800...
[2023-02-26 07:01:12,573][06480] Avg episode rewards: #0: 19.800, true rewards: #0: 9.467
[2023-02-26 07:01:12,574][06480] Avg episode reward: 19.800, avg true_objective: 9.467
[2023-02-26 07:01:12,645][06480] Num frames 2900...
[2023-02-26 07:01:12,766][06480] Num frames 3000...
[2023-02-26 07:01:12,880][06480] Num frames 3100...
[2023-02-26 07:01:13,000][06480] Num frames 3200...
[2023-02-26 07:01:13,116][06480] Num frames 3300...
[2023-02-26 07:01:13,255][06480] Num frames 3400...
[2023-02-26 07:01:13,408][06480] Num frames 3500...
[2023-02-26 07:01:13,583][06480] Avg episode rewards: #0: 18.190, true rewards: #0: 8.940
[2023-02-26 07:01:13,586][06480] Avg episode reward: 18.190, avg true_objective: 8.940
[2023-02-26 07:01:13,629][06480] Num frames 3600...
[2023-02-26 07:01:13,788][06480] Num frames 3700...
[2023-02-26 07:01:13,960][06480] Num frames 3800...
[2023-02-26 07:01:14,119][06480] Num frames 3900...
[2023-02-26 07:01:14,281][06480] Num frames 4000...
[2023-02-26 07:01:14,455][06480] Num frames 4100...
[2023-02-26 07:01:14,619][06480] Num frames 4200...
[2023-02-26 07:01:14,777][06480] Num frames 4300...
[2023-02-26 07:01:14,949][06480] Num frames 4400...
[2023-02-26 07:01:15,124][06480] Num frames 4500...
[2023-02-26 07:01:15,290][06480] Num frames 4600...
[2023-02-26 07:01:15,458][06480] Num frames 4700...
[2023-02-26 07:01:15,631][06480] Num frames 4800...
[2023-02-26 07:01:15,795][06480] Num frames 4900...
[2023-02-26 07:01:15,960][06480] Num frames 5000...
[2023-02-26 07:01:16,126][06480] Num frames 5100...
[2023-02-26 07:01:16,289][06480] Num frames 5200...
[2023-02-26 07:01:16,454][06480] Num frames 5300...
[2023-02-26 07:01:16,614][06480] Num frames 5400...
[2023-02-26 07:01:16,765][06480] Num frames 5500...
[2023-02-26 07:01:16,853][06480] Avg episode rewards: #0: 24.456, true rewards: #0: 11.056
[2023-02-26 07:01:16,854][06480] Avg episode reward: 24.456, avg true_objective: 11.056
[2023-02-26 07:01:16,953][06480] Num frames 5600...
[2023-02-26 07:01:17,073][06480] Num frames 5700...
[2023-02-26 07:01:17,201][06480] Num frames 5800...
[2023-02-26 07:01:17,324][06480] Num frames 5900...
[2023-02-26 07:01:17,436][06480] Num frames 6000...
[2023-02-26 07:01:17,549][06480] Num frames 6100...
[2023-02-26 07:01:17,667][06480] Num frames 6200...
[2023-02-26 07:01:17,787][06480] Num frames 6300...
[2023-02-26 07:01:17,900][06480] Num frames 6400...
[2023-02-26 07:01:18,015][06480] Num frames 6500...
[2023-02-26 07:01:18,127][06480] Num frames 6600...
[2023-02-26 07:01:18,243][06480] Num frames 6700...
[2023-02-26 07:01:18,359][06480] Num frames 6800...
[2023-02-26 07:01:18,469][06480] Num frames 6900...
[2023-02-26 07:01:18,541][06480] Avg episode rewards: #0: 26.022, true rewards: #0: 11.522
[2023-02-26 07:01:18,542][06480] Avg episode reward: 26.022, avg true_objective: 11.522
[2023-02-26 07:01:18,642][06480] Num frames 7000...
[2023-02-26 07:01:18,751][06480] Num frames 7100...
[2023-02-26 07:01:18,868][06480] Num frames 7200...
[2023-02-26 07:01:18,995][06480] Num frames 7300...
[2023-02-26 07:01:19,106][06480] Num frames 7400...
[2023-02-26 07:01:19,217][06480] Num frames 7500...
[2023-02-26 07:01:19,335][06480] Num frames 7600...
[2023-02-26 07:01:19,453][06480] Num frames 7700...
[2023-02-26 07:01:19,565][06480] Num frames 7800...
[2023-02-26 07:01:19,679][06480] Num frames 7900...
[2023-02-26 07:01:19,797][06480] Avg episode rewards: #0: 26.081, true rewards: #0: 11.367
[2023-02-26 07:01:19,799][06480] Avg episode reward: 26.081, avg true_objective: 11.367
[2023-02-26 07:01:19,850][06480] Num frames 8000...
[2023-02-26 07:01:19,969][06480] Num frames 8100...
[2023-02-26 07:01:20,082][06480] Num frames 8200...
[2023-02-26 07:01:20,201][06480] Num frames 8300...
[2023-02-26 07:01:20,313][06480] Num frames 8400...
[2023-02-26 07:01:20,430][06480] Num frames 8500...
[2023-02-26 07:01:20,544][06480] Num frames 8600...
[2023-02-26 07:01:20,661][06480] Num frames 8700...
[2023-02-26 07:01:20,777][06480] Num frames 8800...
[2023-02-26 07:01:20,893][06480] Num frames 8900...
[2023-02-26 07:01:21,016][06480] Num frames 9000...
[2023-02-26 07:01:21,129][06480] Num frames 9100...
[2023-02-26 07:01:21,248][06480] Num frames 9200...
[2023-02-26 07:01:21,364][06480] Num frames 9300...
[2023-02-26 07:01:21,483][06480] Num frames 9400...
[2023-02-26 07:01:21,577][06480] Avg episode rewards: #0: 27.536, true rewards: #0: 11.786
[2023-02-26 07:01:21,578][06480] Avg episode reward: 27.536, avg true_objective: 11.786
[2023-02-26 07:01:21,667][06480] Num frames 9500...
[2023-02-26 07:01:21,781][06480] Num frames 9600...
[2023-02-26 07:01:21,900][06480] Num frames 9700...
[2023-02-26 07:01:22,023][06480] Num frames 9800...
[2023-02-26 07:01:22,141][06480] Num frames 9900...
[2023-02-26 07:01:22,255][06480] Num frames 10000...
[2023-02-26 07:01:22,370][06480] Num frames 10100...
[2023-02-26 07:01:22,487][06480] Num frames 10200...
[2023-02-26 07:01:22,603][06480] Num frames 10300...
[2023-02-26 07:01:22,716][06480] Num frames 10400...
[2023-02-26 07:01:22,827][06480] Num frames 10500...
[2023-02-26 07:01:22,944][06480] Avg episode rewards: #0: 27.165, true rewards: #0: 11.721
[2023-02-26 07:01:22,946][06480] Avg episode reward: 27.165, avg true_objective: 11.721
[2023-02-26 07:01:23,014][06480] Num frames 10600...
[2023-02-26 07:01:23,126][06480] Num frames 10700...
[2023-02-26 07:01:23,239][06480] Num frames 10800...
[2023-02-26 07:01:23,349][06480] Num frames 10900...
[2023-02-26 07:01:23,457][06480] Num frames 11000...
[2023-02-26 07:01:23,573][06480] Num frames 11100...
[2023-02-26 07:01:23,692][06480] Avg episode rewards: #0: 25.357, true rewards: #0: 11.157
[2023-02-26 07:01:23,694][06480] Avg episode reward: 25.357, avg true_objective: 11.157
[2023-02-26 07:02:30,978][06480] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
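The "Avg episode rewards" entries in the evaluation log above are running means over all episodes completed so far, so the individual episode returns can be recovered from consecutive averages. A small illustrative parser (the regex, helper name, and sample string are ours, not part of Sample Factory):

```python
import re

# Matches the running-average lines as they appear in the log above.
AVG_RE = re.compile(r"Avg episode rewards: #0: ([\d.]+), true rewards: #0: ([\d.]+)")

def episode_rewards(log_text):
    """Recover per-episode rewards from the running means printed in the log."""
    avgs = [float(m.group(1)) for m in AVG_RE.finditer(log_text)]
    rewards = []
    for n, avg in enumerate(avgs, start=1):
        # Running mean: avg_n = (sum of first n rewards) / n, so
        # reward_n = n * avg_n - (n - 1) * avg_{n-1}.
        prev_avg = avgs[n - 2] if n > 1 else 0.0
        rewards.append(round(n * avg - (n - 1) * prev_avg, 3))
    return rewards

# Two consecutive averages taken from the log above.
log = (
    "Avg episode rewards: #0: 19.200, true rewards: #0: 9.200 "
    "Avg episode rewards: #0: 12.340, true rewards: #0: 6.840"
)
print(episode_rewards(log))  # [19.2, 5.48]: episode 2 scored 2*12.34 - 19.2 = 5.48
```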
[2023-02-26 07:14:21,650][06480] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 07:14:21,652][06480] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 07:14:21,654][06480] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 07:14:21,656][06480] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 07:14:21,658][06480] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:14:21,661][06480] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 07:14:21,662][06480] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-26 07:14:21,664][06480] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 07:14:21,666][06480] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-26 07:14:21,667][06480] Adding new argument 'hf_repository'='sd99/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-26 07:14:21,668][06480] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 07:14:21,670][06480] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 07:14:21,671][06480] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 07:14:21,672][06480] Adding new argument 'enjoy_script'=None that is not in the saved config file!
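The "Overriding arg" / "Adding new argument" messages above are produced when command-line arguments are merged into the saved experiment config. A minimal sketch of what that merge logic implies (the function name and structure are our illustration, not Sample Factory's actual implementation):

```python
def merge_cli_into_saved_config(saved_cfg, cli_args):
    """Merge CLI args into a loaded config dict, logging overrides vs. additions."""
    for key, value in cli_args.items():
        if key in saved_cfg:
            print(f"Overriding arg {key!r} with value {value!r} passed from command line")
        else:
            print(f"Adding new argument {key!r}={value!r} that is not in the saved config file!")
        saved_cfg[key] = value
    return saved_cfg

# Mirrors the first two messages in the log: num_workers is overridden,
# no_render is added because it was absent from the saved config.
cfg = merge_cli_into_saved_config(
    {"num_workers": 8}, {"num_workers": 1, "no_render": True}
)
```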
[2023-02-26 07:14:21,673][06480] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 07:14:21,702][06480] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:14:21,704][06480] RunningMeanStd input shape: (1,)
[2023-02-26 07:14:21,719][06480] ConvEncoder: input_channels=3
[2023-02-26 07:14:21,755][06480] Conv encoder output size: 512
[2023-02-26 07:14:21,757][06480] Policy head output size: 512
[2023-02-26 07:14:21,778][06480] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 07:14:22,441][06480] Num frames 100...
[2023-02-26 07:14:22,600][06480] Num frames 200...
[2023-02-26 07:14:22,752][06480] Num frames 300...
[2023-02-26 07:14:22,907][06480] Num frames 400...
[2023-02-26 07:14:23,059][06480] Num frames 500...
[2023-02-26 07:14:23,237][06480] Avg episode rewards: #0: 12.760, true rewards: #0: 5.760
[2023-02-26 07:14:23,239][06480] Avg episode reward: 12.760, avg true_objective: 5.760
[2023-02-26 07:14:23,281][06480] Num frames 600...
[2023-02-26 07:14:23,432][06480] Num frames 700...
[2023-02-26 07:14:23,585][06480] Num frames 800...
[2023-02-26 07:14:23,745][06480] Num frames 900...
[2023-02-26 07:14:23,900][06480] Num frames 1000...
[2023-02-26 07:14:24,054][06480] Num frames 1100...
[2023-02-26 07:14:24,161][06480] Avg episode rewards: #0: 12.160, true rewards: #0: 5.660
[2023-02-26 07:14:24,164][06480] Avg episode reward: 12.160, avg true_objective: 5.660
[2023-02-26 07:14:24,276][06480] Num frames 1200...
[2023-02-26 07:14:24,443][06480] Num frames 1300...
[2023-02-26 07:14:24,610][06480] Num frames 1400...
[2023-02-26 07:14:24,775][06480] Num frames 1500...
[2023-02-26 07:14:24,937][06480] Num frames 1600...
[2023-02-26 07:14:25,095][06480] Num frames 1700...
[2023-02-26 07:14:25,254][06480] Num frames 1800...
[2023-02-26 07:14:25,418][06480] Num frames 1900...
[2023-02-26 07:14:25,530][06480] Num frames 2000...
[2023-02-26 07:14:25,639][06480] Num frames 2100...
[2023-02-26 07:14:25,751][06480] Num frames 2200...
[2023-02-26 07:14:25,871][06480] Num frames 2300...
[2023-02-26 07:14:25,987][06480] Num frames 2400...
[2023-02-26 07:14:26,120][06480] Avg episode rewards: #0: 17.897, true rewards: #0: 8.230
[2023-02-26 07:14:26,121][06480] Avg episode reward: 17.897, avg true_objective: 8.230
[2023-02-26 07:14:26,159][06480] Num frames 2500...
[2023-02-26 07:14:26,275][06480] Num frames 2600...
[2023-02-26 07:14:26,384][06480] Num frames 2700...
[2023-02-26 07:14:26,502][06480] Num frames 2800...
[2023-02-26 07:14:26,612][06480] Num frames 2900...
[2023-02-26 07:14:26,720][06480] Num frames 3000...
[2023-02-26 07:14:26,833][06480] Num frames 3100...
[2023-02-26 07:14:26,945][06480] Num frames 3200...
[2023-02-26 07:14:27,054][06480] Num frames 3300...
[2023-02-26 07:14:27,167][06480] Num frames 3400...
[2023-02-26 07:14:27,281][06480] Num frames 3500...
[2023-02-26 07:14:27,394][06480] Num frames 3600...
[2023-02-26 07:14:27,504][06480] Num frames 3700...
[2023-02-26 07:14:27,617][06480] Num frames 3800...
[2023-02-26 07:14:27,722][06480] Avg episode rewards: #0: 21.613, true rewards: #0: 9.612
[2023-02-26 07:14:27,725][06480] Avg episode reward: 21.613, avg true_objective: 9.612
[2023-02-26 07:14:27,791][06480] Num frames 3900...
[2023-02-26 07:14:27,899][06480] Num frames 4000...
[2023-02-26 07:14:28,019][06480] Num frames 4100...
[2023-02-26 07:14:28,141][06480] Num frames 4200...
[2023-02-26 07:14:28,257][06480] Num frames 4300...
[2023-02-26 07:14:28,369][06480] Num frames 4400...
[2023-02-26 07:14:28,481][06480] Num frames 4500...
[2023-02-26 07:14:28,593][06480] Num frames 4600...
[2023-02-26 07:14:28,715][06480] Num frames 4700...
[2023-02-26 07:14:28,830][06480] Num frames 4800...
[2023-02-26 07:14:28,949][06480] Num frames 4900...
[2023-02-26 07:14:29,082][06480] Avg episode rewards: #0: 22.340, true rewards: #0: 9.940
[2023-02-26 07:14:29,084][06480] Avg episode reward: 22.340, avg true_objective: 9.940
[2023-02-26 07:14:29,122][06480] Num frames 5000...
[2023-02-26 07:14:29,241][06480] Num frames 5100...
[2023-02-26 07:14:29,352][06480] Num frames 5200...
[2023-02-26 07:14:29,465][06480] Num frames 5300...
[2023-02-26 07:14:29,581][06480] Num frames 5400...
[2023-02-26 07:14:29,691][06480] Num frames 5500...
[2023-02-26 07:14:29,809][06480] Num frames 5600...
[2023-02-26 07:14:29,923][06480] Num frames 5700...
[2023-02-26 07:14:30,021][06480] Avg episode rewards: #0: 21.230, true rewards: #0: 9.563
[2023-02-26 07:14:30,022][06480] Avg episode reward: 21.230, avg true_objective: 9.563
[2023-02-26 07:14:30,096][06480] Num frames 5800...
[2023-02-26 07:14:30,210][06480] Num frames 5900...
[2023-02-26 07:14:30,326][06480] Num frames 6000...
[2023-02-26 07:14:30,440][06480] Num frames 6100...
[2023-02-26 07:14:30,552][06480] Num frames 6200...
[2023-02-26 07:14:30,665][06480] Num frames 6300...
[2023-02-26 07:14:30,775][06480] Num frames 6400...
[2023-02-26 07:14:30,889][06480] Num frames 6500...
[2023-02-26 07:14:31,006][06480] Num frames 6600...
[2023-02-26 07:14:31,123][06480] Num frames 6700...
[2023-02-26 07:14:31,238][06480] Num frames 6800...
[2023-02-26 07:14:31,358][06480] Num frames 6900...
[2023-02-26 07:14:31,468][06480] Num frames 7000...
[2023-02-26 07:14:31,581][06480] Num frames 7100...
[2023-02-26 07:14:31,693][06480] Num frames 7200...
[2023-02-26 07:14:31,807][06480] Num frames 7300...
[2023-02-26 07:14:31,925][06480] Num frames 7400...
[2023-02-26 07:14:32,036][06480] Num frames 7500...
[2023-02-26 07:14:32,151][06480] Num frames 7600...
[2023-02-26 07:14:32,266][06480] Num frames 7700...
[2023-02-26 07:14:32,384][06480] Num frames 7800...
[2023-02-26 07:14:32,484][06480] Avg episode rewards: #0: 25.483, true rewards: #0: 11.197
[2023-02-26 07:14:32,486][06480] Avg episode reward: 25.483, avg true_objective: 11.197
[2023-02-26 07:14:32,563][06480] Num frames 7900...
[2023-02-26 07:14:32,675][06480] Num frames 8000...
[2023-02-26 07:14:32,785][06480] Num frames 8100...
[2023-02-26 07:14:32,910][06480] Num frames 8200...
[2023-02-26 07:14:32,990][06480] Avg episode rewards: #0: 23.152, true rewards: #0: 10.277
[2023-02-26 07:14:32,992][06480] Avg episode reward: 23.152, avg true_objective: 10.277
[2023-02-26 07:14:33,079][06480] Num frames 8300...
[2023-02-26 07:14:33,189][06480] Num frames 8400...
[2023-02-26 07:14:33,307][06480] Num frames 8500...
[2023-02-26 07:14:33,425][06480] Num frames 8600...
[2023-02-26 07:14:33,536][06480] Num frames 8700...
[2023-02-26 07:14:33,647][06480] Num frames 8800...
[2023-02-26 07:14:33,758][06480] Num frames 8900...
[2023-02-26 07:14:33,867][06480] Num frames 9000...
[2023-02-26 07:14:33,982][06480] Num frames 9100...
[2023-02-26 07:14:34,104][06480] Num frames 9200...
[2023-02-26 07:14:34,217][06480] Num frames 9300...
[2023-02-26 07:14:34,339][06480] Num frames 9400...
[2023-02-26 07:14:34,451][06480] Num frames 9500...
[2023-02-26 07:14:34,565][06480] Num frames 9600...
[2023-02-26 07:14:34,678][06480] Num frames 9700...
[2023-02-26 07:14:34,796][06480] Num frames 9800...
[2023-02-26 07:14:34,909][06480] Num frames 9900...
[2023-02-26 07:14:35,025][06480] Num frames 10000...
[2023-02-26 07:14:35,155][06480] Num frames 10100...
[2023-02-26 07:14:35,223][06480] Avg episode rewards: #0: 25.455, true rewards: #0: 11.233
[2023-02-26 07:14:35,225][06480] Avg episode reward: 25.455, avg true_objective: 11.233
[2023-02-26 07:14:35,330][06480] Num frames 10200...
[2023-02-26 07:14:35,465][06480] Num frames 10300...
[2023-02-26 07:14:35,623][06480] Num frames 10400...
[2023-02-26 07:14:35,775][06480] Num frames 10500...
[2023-02-26 07:14:35,913][06480] Avg episode rewards: #0: 23.750, true rewards: #0: 10.550
[2023-02-26 07:14:35,919][06480] Avg episode reward: 23.750, avg true_objective: 10.550
[2023-02-26 07:15:38,696][06480] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-26 07:15:42,696][06480] The model has been pushed to https://huggingface.co/sd99/rl_course_vizdoom_health_gathering_supreme
[2023-02-26 07:17:10,353][06480] Environment doom_basic already registered, overwriting...
[2023-02-26 07:17:10,356][06480] Environment doom_two_colors_easy already registered, overwriting...
[2023-02-26 07:17:10,357][06480] Environment doom_two_colors_hard already registered, overwriting...
[2023-02-26 07:17:10,363][06480] Environment doom_dm already registered, overwriting...
[2023-02-26 07:17:10,364][06480] Environment doom_dwango5 already registered, overwriting...
[2023-02-26 07:17:10,366][06480] Environment doom_my_way_home_flat_actions already registered, overwriting...
[2023-02-26 07:17:10,367][06480] Environment doom_defend_the_center_flat_actions already registered, overwriting...
[2023-02-26 07:17:10,368][06480] Environment doom_my_way_home already registered, overwriting...
[2023-02-26 07:17:10,373][06480] Environment doom_deadly_corridor already registered, overwriting...
[2023-02-26 07:17:10,374][06480] Environment doom_defend_the_center already registered, overwriting...
[2023-02-26 07:17:10,375][06480] Environment doom_defend_the_line already registered, overwriting...
[2023-02-26 07:17:10,376][06480] Environment doom_health_gathering already registered, overwriting...
[2023-02-26 07:17:10,377][06480] Environment doom_health_gathering_supreme already registered, overwriting...
[2023-02-26 07:17:10,391][06480] Environment doom_battle already registered, overwriting...
[2023-02-26 07:17:10,393][06480] Environment doom_battle2 already registered, overwriting...
[2023-02-26 07:17:10,394][06480] Environment doom_duel_bots already registered, overwriting...
[2023-02-26 07:17:10,397][06480] Environment doom_deathmatch_bots already registered, overwriting...
[2023-02-26 07:17:10,398][06480] Environment doom_duel already registered, overwriting...
[2023-02-26 07:17:10,402][06480] Environment doom_deathmatch_full already registered, overwriting...
[2023-02-26 07:17:10,403][06480] Environment doom_benchmark already registered, overwriting...
[2023-02-26 07:17:10,404][06480] register_encoder_factory:
[2023-02-26 07:17:10,427][06480] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 07:17:10,429][06480] Overriding arg 'train_for_env_steps' with value 10000000 passed from command line
[2023-02-26 07:17:10,437][06480] Experiment dir /content/train_dir/default_experiment already exists!
[2023-02-26 07:17:10,439][06480] Resuming existing experiment from /content/train_dir/default_experiment...
[2023-02-26 07:17:10,442][06480] Weights and Biases integration disabled
[2023-02-26 07:17:10,446][06480] Environment var CUDA_VISIBLE_DEVICES is 0
[2023-02-26 07:17:11,981][06480] Starting experiment with the following configuration: help=False algo=APPO env=doom_health_gathering_supreme experiment=default_experiment train_dir=/content/train_dir restart_behavior=resume device=gpu seed=None num_policies=1 async_rl=True serial_mode=False batched_sampling=False num_batches_to_accumulate=2 worker_num_splits=2 policy_workers_per_policy=1 max_policy_lag=1000 num_workers=8 num_envs_per_worker=4 batch_size=1024 num_batches_per_epoch=1 num_epochs=1 rollout=32 recurrence=32 shuffle_minibatches=False gamma=0.99 reward_scale=1.0 reward_clip=1000.0 value_bootstrap=False normalize_returns=True exploration_loss_coeff=0.001 value_loss_coeff=0.5 kl_loss_coeff=0.0 exploration_loss=symmetric_kl gae_lambda=0.95 ppo_clip_ratio=0.1 ppo_clip_value=0.2 with_vtrace=False vtrace_rho=1.0 vtrace_c=1.0
optimizer=adam adam_eps=1e-06 adam_beta1=0.9 adam_beta2=0.999 max_grad_norm=4.0 learning_rate=0.0001 lr_schedule=constant lr_schedule_kl_threshold=0.008 lr_adaptive_min=1e-06 lr_adaptive_max=0.01 obs_subtract_mean=0.0 obs_scale=255.0 normalize_input=True normalize_input_keys=None decorrelate_experience_max_seconds=0 decorrelate_envs_on_one_worker=True actor_worker_gpus=[] set_workers_cpu_affinity=True force_envs_single_thread=False default_niceness=0 log_to_file=True experiment_summaries_interval=10 flush_summaries_interval=30 stats_avg=100 summaries_use_frameskip=True heartbeat_interval=20 heartbeat_reporting_interval=600 train_for_env_steps=10000000 train_for_seconds=10000000000 save_every_sec=120 keep_checkpoints=2 load_checkpoint_kind=latest save_milestones_sec=-1 save_best_every_sec=5 save_best_metric=reward save_best_after=100000 benchmark=False encoder_mlp_layers=[512, 512] encoder_conv_architecture=convnet_simple encoder_conv_mlp_layers=[512] use_rnn=True rnn_size=512 rnn_type=gru rnn_num_layers=1 decoder_mlp_layers=[] nonlinearity=elu policy_initialization=orthogonal policy_init_gain=1.0 actor_critic_share_weights=True adaptive_stddev=True continuous_tanh_scale=0.0 initial_stddev=1.0 use_env_info_cache=False env_gpu_actions=False env_gpu_observations=True env_frameskip=4 env_framestack=1 pixel_format=CHW use_record_episode_statistics=False with_wandb=False wandb_user=None wandb_project=sample_factory wandb_group=None wandb_job_type=SF wandb_tags=[] with_pbt=False pbt_mix_policies_in_one_env=True pbt_period_env_steps=5000000 pbt_start_mutation=20000000 pbt_replace_fraction=0.3 pbt_mutation_rate=0.15 pbt_replace_reward_gap=0.1 pbt_replace_reward_gap_absolute=1e-06 pbt_optimize_gamma=False pbt_target_objective=true_objective pbt_perturb_min=1.1 pbt_perturb_max=1.5 num_agents=-1 num_humans=0 num_bots=-1 start_bot_difficulty=None timelimit=None res_w=128 res_h=72 wide_aspect_ratio=False eval_env_frameskip=1 fps=35 
command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000000 cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000000} git_hash=unknown git_repo_name=not a git repository
[2023-02-26 07:17:11,986][06480] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 07:17:11,991][06480] Rollout worker 0 uses device cpu
[2023-02-26 07:17:11,992][06480] Rollout worker 1 uses device cpu
[2023-02-26 07:17:11,994][06480] Rollout worker 2 uses device cpu
[2023-02-26 07:17:11,996][06480] Rollout worker 3 uses device cpu
[2023-02-26 07:17:11,997][06480] Rollout worker 4 uses device cpu
[2023-02-26 07:17:11,998][06480] Rollout worker 5 uses device cpu
[2023-02-26 07:17:12,000][06480] Rollout worker 6 uses device cpu
[2023-02-26 07:17:12,001][06480] Rollout worker 7 uses device cpu
[2023-02-26 07:17:12,115][06480] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 07:17:12,117][06480] InferenceWorker_p0-w0: min num requests: 2
[2023-02-26 07:17:12,148][06480] Starting all processes...
[2023-02-26 07:17:12,149][06480] Starting process learner_proc0
[2023-02-26 07:17:12,288][06480] Starting all processes...
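A few of the configuration values printed above can be sanity-checked against each other: with num_workers=8, num_envs_per_worker=4 and rollout=32, one full round of rollouts yields 8 × 4 × 32 = 1024 samples, which matches batch_size=1024; and since env_frameskip=4, each sample corresponds to 4 environment frames, i.e. 4096 environment frames per training batch. A quick check (plain arithmetic on the config values, nothing library-specific):

```python
# Values copied from the configuration dump above.
num_workers = 8
num_envs_per_worker = 4
rollout = 32
batch_size = 1024
env_frameskip = 4

# One round of rollouts across all envs fills exactly one batch.
samples_per_rollout_round = num_workers * num_envs_per_worker * rollout
# Each sample covers env_frameskip raw environment frames.
env_frames_per_batch = batch_size * env_frameskip

print(samples_per_rollout_round)  # 1024
print(env_frames_per_batch)       # 4096
```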
[2023-02-26 07:17:12,301][06480] Starting process inference_proc0-0
[2023-02-26 07:17:12,301][06480] Starting process rollout_proc0
[2023-02-26 07:17:12,306][06480] Starting process rollout_proc1
[2023-02-26 07:17:12,306][06480] Starting process rollout_proc2
[2023-02-26 07:17:12,306][06480] Starting process rollout_proc3
[2023-02-26 07:17:12,307][06480] Starting process rollout_proc4
[2023-02-26 07:17:12,307][06480] Starting process rollout_proc5
[2023-02-26 07:17:12,307][06480] Starting process rollout_proc6
[2023-02-26 07:17:12,307][06480] Starting process rollout_proc7
[2023-02-26 07:17:22,066][30933] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 07:17:22,066][30933] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-26 07:17:22,120][30933] Num visible devices: 1
[2023-02-26 07:17:22,176][30933] Starting seed is not provided
[2023-02-26 07:17:22,177][30933] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 07:17:22,178][30933] Initializing actor-critic model on device cuda:0
[2023-02-26 07:17:22,179][30933] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:17:22,180][30933] RunningMeanStd input shape: (1,)
[2023-02-26 07:17:22,233][30933] ConvEncoder: input_channels=3
[2023-02-26 07:17:23,022][30947] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 07:17:23,026][30947] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-26 07:17:23,104][30947] Num visible devices: 1
[2023-02-26 07:17:23,283][30933] Conv encoder output size: 512
[2023-02-26 07:17:23,290][30933] Policy head output size: 512
[2023-02-26 07:17:23,411][30933] Created Actor Critic model with architecture:
[2023-02-26 07:17:23,411][30933] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer):
RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) )
[2023-02-26 07:17:23,775][30948] Worker 0 uses CPU cores [0]
[2023-02-26 07:17:24,015][30951] Worker 2 uses CPU cores [0]
[2023-02-26 07:17:24,232][30958] Worker 3 uses CPU cores [1]
[2023-02-26 07:17:24,265][30952] Worker 1 uses CPU cores [1]
[2023-02-26 07:17:24,597][30960] Worker 5 uses CPU cores [1]
[2023-02-26 07:17:24,687][30968] Worker 7 uses CPU cores [1]
[2023-02-26 07:17:24,701][30962] Worker 4 uses CPU cores [0]
[2023-02-26 07:17:24,810][30970] Worker 6 uses CPU cores [0]
[2023-02-26 07:17:26,686][30933] Using optimizer
[2023-02-26 07:17:26,687][30933] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-26 07:17:26,715][30933] Loading model from checkpoint
[2023-02-26 07:17:26,719][30933] Loaded experiment state at self.train_step=978, self.env_steps=4005888
[2023-02-26 07:17:26,720][30933] Initialized policy 0 weights for model version 978
[2023-02-26 07:17:26,723][30933] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 07:17:26,730][30933] LearnerWorker_p0 finished initialization!
[2023-02-26 07:17:26,934][30947] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:17:26,935][30947] RunningMeanStd input shape: (1,)
[2023-02-26 07:17:26,947][30947] ConvEncoder: input_channels=3
[2023-02-26 07:17:27,045][30947] Conv encoder output size: 512
[2023-02-26 07:17:27,046][30947] Policy head output size: 512
[2023-02-26 07:17:29,984][06480] Inference worker 0-0 is ready!
[2023-02-26 07:17:29,987][06480] All inference workers are ready! Signal rollout workers to start!
[2023-02-26 07:17:30,132][30951] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,136][30948] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,149][30970] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,166][30962] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,187][30958] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,226][30968] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,220][30960] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,223][30952] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-26 07:17:30,447][06480] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 07:17:31,436][30951] Decorrelating experience for 0 frames...
[2023-02-26 07:17:31,458][30962] Decorrelating experience for 0 frames...
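The checkpoint path encodes both counters reported at load time: checkpoint_000000978_4005888.pth carries train_step=978 and env_steps=4005888, and the two are consistent with 4096 environment frames per training step (978 × 4096 = 4005888, matching batch_size=1024 × env_frameskip=4 from the config). A small parser for this naming scheme (helper name is ours):

```python
import re

def parse_checkpoint_name(path):
    """Extract (train_step, env_steps) from a checkpoint filename like the one above."""
    m = re.search(r"checkpoint_(\d+)_(\d+)\.pth$", path)
    return int(m.group(1)), int(m.group(2))

step, env_steps = parse_checkpoint_name(
    "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth"
)
print(step, env_steps)           # 978 4005888
print(step * 4096 == env_steps)  # True: 4096 env frames per train step
```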
[2023-02-26 07:17:31,799][30958] Decorrelating experience for 0 frames...
[2023-02-26 07:17:31,802][30968] Decorrelating experience for 0 frames...
[2023-02-26 07:17:31,807][30960] Decorrelating experience for 0 frames...
[2023-02-26 07:17:32,107][06480] Heartbeat connected on Batcher_0
[2023-02-26 07:17:32,112][06480] Heartbeat connected on LearnerWorker_p0
[2023-02-26 07:17:32,157][06480] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-26 07:17:32,295][30951] Decorrelating experience for 32 frames...
[2023-02-26 07:17:33,036][30952] Decorrelating experience for 0 frames...
[2023-02-26 07:17:33,068][30948] Decorrelating experience for 0 frames...
[2023-02-26 07:17:33,379][30958] Decorrelating experience for 32 frames...
[2023-02-26 07:17:33,385][30960] Decorrelating experience for 32 frames...
[2023-02-26 07:17:33,383][30968] Decorrelating experience for 32 frames...
[2023-02-26 07:17:33,949][30951] Decorrelating experience for 64 frames...
[2023-02-26 07:17:34,330][30970] Decorrelating experience for 0 frames...
[2023-02-26 07:17:34,395][30952] Decorrelating experience for 32 frames...
[2023-02-26 07:17:34,639][30948] Decorrelating experience for 32 frames...
[2023-02-26 07:17:34,879][30958] Decorrelating experience for 64 frames...
[2023-02-26 07:17:34,887][30960] Decorrelating experience for 64 frames...
[2023-02-26 07:17:35,301][30951] Decorrelating experience for 96 frames...
[2023-02-26 07:17:35,446][06480] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 07:17:35,525][06480] Heartbeat connected on RolloutWorker_w2
[2023-02-26 07:17:35,700][30952] Decorrelating experience for 64 frames...
[2023-02-26 07:17:35,904][30970] Decorrelating experience for 32 frames...
[2023-02-26 07:17:35,944][30962] Decorrelating experience for 32 frames...
[2023-02-26 07:17:36,047][30968] Decorrelating experience for 64 frames...
[2023-02-26 07:17:36,114][30960] Decorrelating experience for 96 frames...
[2023-02-26 07:17:36,390][06480] Heartbeat connected on RolloutWorker_w5
[2023-02-26 07:17:36,464][30948] Decorrelating experience for 64 frames...
[2023-02-26 07:17:37,306][30970] Decorrelating experience for 64 frames...
[2023-02-26 07:17:37,434][30958] Decorrelating experience for 96 frames...
[2023-02-26 07:17:37,668][06480] Heartbeat connected on RolloutWorker_w3
[2023-02-26 07:17:37,732][30948] Decorrelating experience for 96 frames...
[2023-02-26 07:17:38,234][06480] Heartbeat connected on RolloutWorker_w0
[2023-02-26 07:17:39,128][30968] Decorrelating experience for 96 frames...
[2023-02-26 07:17:39,624][06480] Heartbeat connected on RolloutWorker_w7
[2023-02-26 07:17:40,168][30962] Decorrelating experience for 64 frames...
[2023-02-26 07:17:40,446][06480] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 117.8. Samples: 1178. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-26 07:17:40,448][06480] Avg episode reward: [(0, '2.110')]
[2023-02-26 07:17:41,095][30970] Decorrelating experience for 96 frames...
[2023-02-26 07:17:41,481][30933] Signal inference workers to stop experience collection...
[2023-02-26 07:17:41,500][30947] InferenceWorker_p0-w0: stopping experience collection
[2023-02-26 07:17:41,603][30952] Decorrelating experience for 96 frames...
[2023-02-26 07:17:41,635][06480] Heartbeat connected on RolloutWorker_w6
[2023-02-26 07:17:41,856][06480] Heartbeat connected on RolloutWorker_w1
[2023-02-26 07:17:41,957][30962] Decorrelating experience for 96 frames...
[2023-02-26 07:17:42,014][06480] Heartbeat connected on RolloutWorker_w4
[2023-02-26 07:17:44,109][30933] Signal inference workers to resume experience collection...
[2023-02-26 07:17:44,109][30947] InferenceWorker_p0-w0: resuming experience collection
[2023-02-26 07:17:45,446][06480] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4009984.
Throughput: 0: 180.5. Samples: 2708. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-26 07:17:45,449][06480] Avg episode reward: [(0, '4.592')]
[2023-02-26 07:17:50,449][06480] Fps is (10 sec: 2047.4, 60 sec: 1023.9, 300 sec: 1023.9). Total num frames: 4026368. Throughput: 0: 211.9. Samples: 4238. Policy #0 lag: (min: 0.0, avg: 0.4, max: 3.0)
[2023-02-26 07:17:50,452][06480] Avg episode reward: [(0, '8.576')]
[2023-02-26 07:17:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 4042752. Throughput: 0: 342.3. Samples: 8556. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:17:55,453][06480] Avg episode reward: [(0, '12.684')]
[2023-02-26 07:17:56,086][30947] Updated weights for policy 0, policy_version 988 (0.0012)
[2023-02-26 07:18:00,446][06480] Fps is (10 sec: 3687.4, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 4063232. Throughput: 0: 510.3. Samples: 15310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:18:00,452][06480] Avg episode reward: [(0, '15.241')]
[2023-02-26 07:18:05,446][06480] Fps is (10 sec: 4095.9, 60 sec: 2223.6, 300 sec: 2223.6). Total num frames: 4083712. Throughput: 0: 536.2. Samples: 18768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:18:05,456][06480] Avg episode reward: [(0, '18.738')]
[2023-02-26 07:18:05,544][30947] Updated weights for policy 0, policy_version 998 (0.0014)
[2023-02-26 07:18:10,446][06480] Fps is (10 sec: 3686.3, 60 sec: 2355.2, 300 sec: 2355.2). Total num frames: 4100096. Throughput: 0: 586.5. Samples: 23460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:18:10,453][06480] Avg episode reward: [(0, '20.240')]
[2023-02-26 07:18:15,446][06480] Fps is (10 sec: 3276.8, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 4116480. Throughput: 0: 628.0. Samples: 28260.
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:18:15,454][06480] Avg episode reward: [(0, '20.983')]
[2023-02-26 07:18:17,818][30947] Updated weights for policy 0, policy_version 1008 (0.0011)
[2023-02-26 07:18:20,446][06480] Fps is (10 sec: 4096.1, 60 sec: 2703.4, 300 sec: 2703.4). Total num frames: 4141056. Throughput: 0: 702.8. Samples: 31624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:18:20,454][06480] Avg episode reward: [(0, '22.618')]
[2023-02-26 07:18:25,452][06480] Fps is (10 sec: 4503.1, 60 sec: 2829.7, 300 sec: 2829.7). Total num frames: 4161536. Throughput: 0: 827.4. Samples: 38414. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:18:25,460][06480] Avg episode reward: [(0, '24.076')]
[2023-02-26 07:18:28,204][30947] Updated weights for policy 0, policy_version 1018 (0.0022)
[2023-02-26 07:18:30,450][06480] Fps is (10 sec: 3275.6, 60 sec: 2798.8, 300 sec: 2798.8). Total num frames: 4173824. Throughput: 0: 887.3. Samples: 42642. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:18:30,452][06480] Avg episode reward: [(0, '24.339')]
[2023-02-26 07:18:35,446][06480] Fps is (10 sec: 2868.8, 60 sec: 3072.0, 300 sec: 2835.7). Total num frames: 4190208. Throughput: 0: 903.6. Samples: 44898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:18:35,454][06480] Avg episode reward: [(0, '22.478')]
[2023-02-26 07:18:40,446][06480] Fps is (10 sec: 2868.3, 60 sec: 3276.8, 300 sec: 2808.7). Total num frames: 4202496. Throughput: 0: 908.8. Samples: 49452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:18:40,452][06480] Avg episode reward: [(0, '23.040')]
[2023-02-26 07:18:42,243][30947] Updated weights for policy 0, policy_version 1028 (0.0020)
[2023-02-26 07:18:45,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 2839.9). Total num frames: 4218880. Throughput: 0: 856.7. Samples: 53860.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:18:45,449][06480] Avg episode reward: [(0, '23.201')] [2023-02-26 07:18:50,449][06480] Fps is (10 sec: 2866.5, 60 sec: 3413.4, 300 sec: 2815.9). Total num frames: 4231168. Throughput: 0: 827.2. Samples: 55996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:18:50,455][06480] Avg episode reward: [(0, '24.211')] [2023-02-26 07:18:55,328][30947] Updated weights for policy 0, policy_version 1038 (0.0022) [2023-02-26 07:18:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 2891.3). Total num frames: 4251648. Throughput: 0: 825.0. Samples: 60584. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:18:55,449][06480] Avg episode reward: [(0, '24.330')] [2023-02-26 07:19:00,446][06480] Fps is (10 sec: 4096.9, 60 sec: 3481.6, 300 sec: 2958.2). Total num frames: 4272128. Throughput: 0: 866.8. Samples: 67264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:00,453][06480] Avg episode reward: [(0, '24.813')] [2023-02-26 07:19:04,945][30947] Updated weights for policy 0, policy_version 1048 (0.0014) [2023-02-26 07:19:05,447][06480] Fps is (10 sec: 4095.7, 60 sec: 3481.6, 300 sec: 3018.1). Total num frames: 4292608. Throughput: 0: 868.4. Samples: 70704. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:19:05,453][06480] Avg episode reward: [(0, '26.808')] [2023-02-26 07:19:05,457][30933] Saving new best policy, reward=26.808! [2023-02-26 07:19:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 2990.1). Total num frames: 4304896. Throughput: 0: 819.3. Samples: 75280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:19:10,448][06480] Avg episode reward: [(0, '26.542')] [2023-02-26 07:19:10,469][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001051_4304896.pth... 
[2023-02-26 07:19:10,692][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000880_3604480.pth [2023-02-26 07:19:15,446][06480] Fps is (10 sec: 3277.1, 60 sec: 3481.6, 300 sec: 3042.8). Total num frames: 4325376. Throughput: 0: 833.7. Samples: 80156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:15,452][06480] Avg episode reward: [(0, '25.196')] [2023-02-26 07:19:17,099][30947] Updated weights for policy 0, policy_version 1058 (0.0021) [2023-02-26 07:19:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3090.6). Total num frames: 4345856. Throughput: 0: 857.7. Samples: 83494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:20,452][06480] Avg episode reward: [(0, '25.079')] [2023-02-26 07:19:25,446][06480] Fps is (10 sec: 4095.9, 60 sec: 3413.6, 300 sec: 3134.3). Total num frames: 4366336. Throughput: 0: 902.8. Samples: 90080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:19:25,449][06480] Avg episode reward: [(0, '23.154')] [2023-02-26 07:19:27,739][30947] Updated weights for policy 0, policy_version 1068 (0.0019) [2023-02-26 07:19:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3106.2). Total num frames: 4378624. Throughput: 0: 897.3. Samples: 94238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:30,452][06480] Avg episode reward: [(0, '23.155')] [2023-02-26 07:19:35,446][06480] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3145.8). Total num frames: 4399104. Throughput: 0: 897.4. Samples: 96376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:35,452][06480] Avg episode reward: [(0, '22.533')] [2023-02-26 07:19:38,761][30947] Updated weights for policy 0, policy_version 1078 (0.0014) [2023-02-26 07:19:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3182.3). Total num frames: 4419584. Throughput: 0: 944.4. Samples: 103082. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:19:40,453][06480] Avg episode reward: [(0, '22.927')] [2023-02-26 07:19:45,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3216.1). Total num frames: 4440064. Throughput: 0: 930.8. Samples: 109150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:45,449][06480] Avg episode reward: [(0, '23.907')] [2023-02-26 07:19:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3189.0). Total num frames: 4452352. Throughput: 0: 901.7. Samples: 111278. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:19:50,454][06480] Avg episode reward: [(0, '23.716')] [2023-02-26 07:19:50,563][30947] Updated weights for policy 0, policy_version 1088 (0.0022) [2023-02-26 07:19:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3220.3). Total num frames: 4472832. Throughput: 0: 905.4. Samples: 116024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:19:55,454][06480] Avg episode reward: [(0, '24.564')] [2023-02-26 07:20:00,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3249.5). Total num frames: 4493312. Throughput: 0: 938.3. Samples: 122378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:20:00,454][06480] Avg episode reward: [(0, '23.580')] [2023-02-26 07:20:00,999][30947] Updated weights for policy 0, policy_version 1098 (0.0017) [2023-02-26 07:20:05,450][06480] Fps is (10 sec: 3685.1, 60 sec: 3618.0, 300 sec: 3250.3). Total num frames: 4509696. Throughput: 0: 932.9. Samples: 125480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:20:05,457][06480] Avg episode reward: [(0, '23.591')] [2023-02-26 07:20:10,448][06480] Fps is (10 sec: 2866.8, 60 sec: 3618.0, 300 sec: 3225.6). Total num frames: 4521984. Throughput: 0: 878.1. Samples: 129596. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:20:10,453][06480] Avg episode reward: [(0, '23.469')] [2023-02-26 07:20:14,019][30947] Updated weights for policy 0, policy_version 1108 (0.0024) [2023-02-26 07:20:15,446][06480] Fps is (10 sec: 3278.0, 60 sec: 3618.1, 300 sec: 3252.0). Total num frames: 4542464. Throughput: 0: 899.6. Samples: 134718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:20:15,454][06480] Avg episode reward: [(0, '22.784')] [2023-02-26 07:20:20,447][06480] Fps is (10 sec: 4096.5, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 4562944. Throughput: 0: 924.3. Samples: 137972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:20:20,453][06480] Avg episode reward: [(0, '22.696')] [2023-02-26 07:20:24,118][30947] Updated weights for policy 0, policy_version 1118 (0.0016) [2023-02-26 07:20:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 4579328. Throughput: 0: 906.3. Samples: 143866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:20:25,453][06480] Avg episode reward: [(0, '21.762')] [2023-02-26 07:20:30,447][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 4595712. Throughput: 0: 859.9. Samples: 147848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:20:30,454][06480] Avg episode reward: [(0, '22.195')] [2023-02-26 07:20:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 4612096. Throughput: 0: 858.7. Samples: 149918. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:20:35,454][06480] Avg episode reward: [(0, '22.617')] [2023-02-26 07:20:37,073][30947] Updated weights for policy 0, policy_version 1128 (0.0033) [2023-02-26 07:20:40,446][06480] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3298.4). Total num frames: 4632576. Throughput: 0: 894.0. Samples: 156256. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:20:40,449][06480] Avg episode reward: [(0, '22.455')] [2023-02-26 07:20:45,449][06480] Fps is (10 sec: 4094.9, 60 sec: 3549.7, 300 sec: 3318.8). Total num frames: 4653056. Throughput: 0: 878.9. Samples: 161930. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:20:45,452][06480] Avg episode reward: [(0, '22.442')] [2023-02-26 07:20:48,695][30947] Updated weights for policy 0, policy_version 1138 (0.0032) [2023-02-26 07:20:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3297.3). Total num frames: 4665344. Throughput: 0: 853.7. Samples: 163892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:20:50,455][06480] Avg episode reward: [(0, '23.387')] [2023-02-26 07:20:55,446][06480] Fps is (10 sec: 2868.0, 60 sec: 3481.6, 300 sec: 3296.8). Total num frames: 4681728. Throughput: 0: 862.0. Samples: 168384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:20:55,452][06480] Avg episode reward: [(0, '24.021')] [2023-02-26 07:20:59,381][30947] Updated weights for policy 0, policy_version 1148 (0.0023) [2023-02-26 07:21:00,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3335.3). Total num frames: 4706304. Throughput: 0: 900.8. Samples: 175254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:21:00,449][06480] Avg episode reward: [(0, '23.355')] [2023-02-26 07:21:05,448][06480] Fps is (10 sec: 4504.9, 60 sec: 3618.3, 300 sec: 3353.0). Total num frames: 4726784. Throughput: 0: 904.9. Samples: 178692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:21:05,451][06480] Avg episode reward: [(0, '23.583')] [2023-02-26 07:21:10,446][06480] Fps is (10 sec: 3276.7, 60 sec: 3618.2, 300 sec: 3332.7). Total num frames: 4739072. Throughput: 0: 877.9. Samples: 183370. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:21:10,453][06480] Avg episode reward: [(0, '23.515')] [2023-02-26 07:21:10,464][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001157_4739072.pth... [2023-02-26 07:21:10,671][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth [2023-02-26 07:21:11,217][30947] Updated weights for policy 0, policy_version 1158 (0.0015) [2023-02-26 07:21:15,449][06480] Fps is (10 sec: 2457.3, 60 sec: 3481.4, 300 sec: 3313.2). Total num frames: 4751360. Throughput: 0: 875.0. Samples: 187226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:21:15,454][06480] Avg episode reward: [(0, '23.771')] [2023-02-26 07:21:20,446][06480] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3294.6). Total num frames: 4763648. Throughput: 0: 876.2. Samples: 189346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:21:20,450][06480] Avg episode reward: [(0, '23.132')] [2023-02-26 07:21:25,230][30947] Updated weights for policy 0, policy_version 1168 (0.0022) [2023-02-26 07:21:25,446][06480] Fps is (10 sec: 3277.7, 60 sec: 3413.3, 300 sec: 3311.7). Total num frames: 4784128. Throughput: 0: 845.2. Samples: 194292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:21:25,455][06480] Avg episode reward: [(0, '22.853')] [2023-02-26 07:21:30,447][06480] Fps is (10 sec: 3276.7, 60 sec: 3345.1, 300 sec: 3293.9). Total num frames: 4796416. Throughput: 0: 815.0. Samples: 198604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:21:30,449][06480] Avg episode reward: [(0, '21.906')] [2023-02-26 07:21:35,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3293.5). Total num frames: 4812800. Throughput: 0: 820.4. Samples: 200812. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:21:35,453][06480] Avg episode reward: [(0, '23.725')] [2023-02-26 07:21:37,338][30947] Updated weights for policy 0, policy_version 1178 (0.0019) [2023-02-26 07:21:40,446][06480] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3326.0). Total num frames: 4837376. Throughput: 0: 863.2. Samples: 207228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:21:40,452][06480] Avg episode reward: [(0, '25.590')] [2023-02-26 07:21:45,454][06480] Fps is (10 sec: 4502.1, 60 sec: 3413.0, 300 sec: 3341.0). Total num frames: 4857856. Throughput: 0: 854.1. Samples: 213696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:21:45,456][06480] Avg episode reward: [(0, '24.209')] [2023-02-26 07:21:47,767][30947] Updated weights for policy 0, policy_version 1188 (0.0012) [2023-02-26 07:21:50,447][06480] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3324.1). Total num frames: 4870144. Throughput: 0: 825.3. Samples: 215830. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:21:50,450][06480] Avg episode reward: [(0, '25.376')] [2023-02-26 07:21:55,450][06480] Fps is (10 sec: 3278.3, 60 sec: 3481.4, 300 sec: 3338.6). Total num frames: 4890624. Throughput: 0: 816.1. Samples: 220098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:21:55,456][06480] Avg episode reward: [(0, '25.045')] [2023-02-26 07:21:59,025][30947] Updated weights for policy 0, policy_version 1198 (0.0023) [2023-02-26 07:22:00,446][06480] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3352.7). Total num frames: 4911104. Throughput: 0: 880.8. Samples: 226860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:22:00,449][06480] Avg episode reward: [(0, '25.060')] [2023-02-26 07:22:05,446][06480] Fps is (10 sec: 4097.4, 60 sec: 3413.4, 300 sec: 3366.2). Total num frames: 4931584. Throughput: 0: 909.2. Samples: 230262. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:22:05,450][06480] Avg episode reward: [(0, '23.819')] [2023-02-26 07:22:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3350.0). Total num frames: 4943872. Throughput: 0: 903.5. Samples: 234948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:22:10,449][06480] Avg episode reward: [(0, '24.578')] [2023-02-26 07:22:10,531][30947] Updated weights for policy 0, policy_version 1208 (0.0011) [2023-02-26 07:22:15,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3363.0). Total num frames: 4964352. Throughput: 0: 917.1. Samples: 239872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:22:15,449][06480] Avg episode reward: [(0, '23.869')] [2023-02-26 07:22:20,447][06480] Fps is (10 sec: 4095.7, 60 sec: 3686.4, 300 sec: 3375.7). Total num frames: 4984832. Throughput: 0: 942.6. Samples: 243228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:22:20,458][06480] Avg episode reward: [(0, '23.727')] [2023-02-26 07:22:20,762][30947] Updated weights for policy 0, policy_version 1218 (0.0016) [2023-02-26 07:22:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3387.9). Total num frames: 5005312. Throughput: 0: 947.9. Samples: 249882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:22:25,453][06480] Avg episode reward: [(0, '23.303')] [2023-02-26 07:22:30,446][06480] Fps is (10 sec: 3277.0, 60 sec: 3686.4, 300 sec: 3429.5). Total num frames: 5017600. Throughput: 0: 900.2. Samples: 254198. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:22:30,453][06480] Avg episode reward: [(0, '23.259')] [2023-02-26 07:22:33,445][30947] Updated weights for policy 0, policy_version 1228 (0.0015) [2023-02-26 07:22:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3499.0). Total num frames: 5038080. Throughput: 0: 900.7. Samples: 256360. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 07:22:35,451][06480] Avg episode reward: [(0, '23.555')] [2023-02-26 07:22:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 5058560. Throughput: 0: 957.1. Samples: 263166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:22:40,453][06480] Avg episode reward: [(0, '23.870')] [2023-02-26 07:22:42,103][30947] Updated weights for policy 0, policy_version 1238 (0.0024) [2023-02-26 07:22:45,453][06480] Fps is (10 sec: 4093.3, 60 sec: 3686.5, 300 sec: 3568.3). Total num frames: 5079040. Throughput: 0: 943.7. Samples: 269332. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:22:45,464][06480] Avg episode reward: [(0, '23.773')] [2023-02-26 07:22:50,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 5095424. Throughput: 0: 915.9. Samples: 271476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:22:50,454][06480] Avg episode reward: [(0, '23.686')] [2023-02-26 07:22:54,557][30947] Updated weights for policy 0, policy_version 1248 (0.0021) [2023-02-26 07:22:55,446][06480] Fps is (10 sec: 3279.0, 60 sec: 3686.6, 300 sec: 3554.5). Total num frames: 5111808. Throughput: 0: 919.6. Samples: 276332. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:22:55,449][06480] Avg episode reward: [(0, '23.833')] [2023-02-26 07:23:00,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 5136384. Throughput: 0: 963.0. Samples: 283206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:00,449][06480] Avg episode reward: [(0, '23.801')] [2023-02-26 07:23:04,153][30947] Updated weights for policy 0, policy_version 1258 (0.0012) [2023-02-26 07:23:05,448][06480] Fps is (10 sec: 4095.2, 60 sec: 3686.3, 300 sec: 3568.4). Total num frames: 5152768. Throughput: 0: 960.0. Samples: 286428. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:05,455][06480] Avg episode reward: [(0, '23.005')] [2023-02-26 07:23:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 5169152. Throughput: 0: 908.0. Samples: 290740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:23:10,454][06480] Avg episode reward: [(0, '22.873')] [2023-02-26 07:23:10,468][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001262_5169152.pth... [2023-02-26 07:23:10,687][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001051_4304896.pth [2023-02-26 07:23:15,446][06480] Fps is (10 sec: 3277.5, 60 sec: 3686.4, 300 sec: 3540.6). Total num frames: 5185536. Throughput: 0: 927.2. Samples: 295922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:23:15,449][06480] Avg episode reward: [(0, '23.774')] [2023-02-26 07:23:16,487][30947] Updated weights for policy 0, policy_version 1268 (0.0012) [2023-02-26 07:23:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3554.6). Total num frames: 5210112. Throughput: 0: 953.0. Samples: 299246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:20,454][06480] Avg episode reward: [(0, '24.114')] [2023-02-26 07:23:25,450][06480] Fps is (10 sec: 4095.9, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5226496. Throughput: 0: 937.9. Samples: 305372. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:25,452][06480] Avg episode reward: [(0, '25.747')] [2023-02-26 07:23:27,330][30947] Updated weights for policy 0, policy_version 1278 (0.0012) [2023-02-26 07:23:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 5242880. Throughput: 0: 897.5. Samples: 309714. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:23:30,452][06480] Avg episode reward: [(0, '24.370')] [2023-02-26 07:23:35,446][06480] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 5259264. Throughput: 0: 900.3. Samples: 311988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:23:35,453][06480] Avg episode reward: [(0, '23.740')] [2023-02-26 07:23:38,207][30947] Updated weights for policy 0, policy_version 1288 (0.0016) [2023-02-26 07:23:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 5283840. Throughput: 0: 947.5. Samples: 318968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:23:40,454][06480] Avg episode reward: [(0, '22.615')] [2023-02-26 07:23:45,448][06480] Fps is (10 sec: 4504.7, 60 sec: 3755.0, 300 sec: 3637.8). Total num frames: 5304320. Throughput: 0: 926.0. Samples: 324878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:45,452][06480] Avg episode reward: [(0, '22.077')] [2023-02-26 07:23:49,623][30947] Updated weights for policy 0, policy_version 1298 (0.0013) [2023-02-26 07:23:50,448][06480] Fps is (10 sec: 3276.3, 60 sec: 3686.3, 300 sec: 3610.0). Total num frames: 5316608. Throughput: 0: 902.2. Samples: 327026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:23:50,455][06480] Avg episode reward: [(0, '21.586')] [2023-02-26 07:23:55,449][06480] Fps is (10 sec: 2457.4, 60 sec: 3618.0, 300 sec: 3582.2). Total num frames: 5328896. Throughput: 0: 889.8. Samples: 330784. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:23:55,452][06480] Avg episode reward: [(0, '21.124')] [2023-02-26 07:24:00,446][06480] Fps is (10 sec: 2867.6, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 5345280. Throughput: 0: 872.2. Samples: 335172. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:24:00,449][06480] Avg episode reward: [(0, '21.742')] [2023-02-26 07:24:03,308][30947] Updated weights for policy 0, policy_version 1308 (0.0028) [2023-02-26 07:24:05,450][06480] Fps is (10 sec: 3276.5, 60 sec: 3481.5, 300 sec: 3582.2). Total num frames: 5361664. Throughput: 0: 864.0. Samples: 338130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:24:05,457][06480] Avg episode reward: [(0, '22.705')] [2023-02-26 07:24:10,448][06480] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3568.4). Total num frames: 5378048. Throughput: 0: 828.3. Samples: 342646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:24:10,456][06480] Avg episode reward: [(0, '22.505')] [2023-02-26 07:24:15,446][06480] Fps is (10 sec: 3278.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5394432. Throughput: 0: 843.1. Samples: 347652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:24:15,453][06480] Avg episode reward: [(0, '23.217')] [2023-02-26 07:24:15,933][30947] Updated weights for policy 0, policy_version 1318 (0.0022) [2023-02-26 07:24:20,446][06480] Fps is (10 sec: 3687.1, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 5414912. Throughput: 0: 866.6. Samples: 350984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:24:20,452][06480] Avg episode reward: [(0, '24.884')] [2023-02-26 07:24:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 5435392. Throughput: 0: 855.4. Samples: 357462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 07:24:25,449][06480] Avg episode reward: [(0, '24.387')] [2023-02-26 07:24:26,351][30947] Updated weights for policy 0, policy_version 1328 (0.0020) [2023-02-26 07:24:30,451][06480] Fps is (10 sec: 3275.1, 60 sec: 3413.0, 300 sec: 3554.4). Total num frames: 5447680. Throughput: 0: 819.2. Samples: 361744. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 07:24:30,458][06480] Avg episode reward: [(0, '24.330')] [2023-02-26 07:24:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5468160. Throughput: 0: 819.1. Samples: 363882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:24:35,453][06480] Avg episode reward: [(0, '24.145')] [2023-02-26 07:24:37,952][30947] Updated weights for policy 0, policy_version 1338 (0.0014) [2023-02-26 07:24:40,446][06480] Fps is (10 sec: 4098.1, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 5488640. Throughput: 0: 880.6. Samples: 370408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:24:40,450][06480] Avg episode reward: [(0, '23.669')] [2023-02-26 07:24:45,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3582.3). Total num frames: 5509120. Throughput: 0: 917.9. Samples: 376478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:24:45,452][06480] Avg episode reward: [(0, '23.690')] [2023-02-26 07:24:49,148][30947] Updated weights for policy 0, policy_version 1348 (0.0030) [2023-02-26 07:24:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3554.5). Total num frames: 5521408. Throughput: 0: 899.8. Samples: 378616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:24:50,451][06480] Avg episode reward: [(0, '23.215')] [2023-02-26 07:24:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5541888. Throughput: 0: 905.9. Samples: 383410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:24:55,448][06480] Avg episode reward: [(0, '24.228')] [2023-02-26 07:24:59,483][30947] Updated weights for policy 0, policy_version 1358 (0.0013) [2023-02-26 07:25:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 5566464. Throughput: 0: 947.1. Samples: 390270. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:25:00,449][06480] Avg episode reward: [(0, '25.126')] [2023-02-26 07:25:05,450][06480] Fps is (10 sec: 4094.5, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 5582848. Throughput: 0: 944.2. Samples: 393476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:25:05,455][06480] Avg episode reward: [(0, '25.420')] [2023-02-26 07:25:10,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3618.3, 300 sec: 3568.4). Total num frames: 5595136. Throughput: 0: 895.0. Samples: 397738. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:25:10,449][06480] Avg episode reward: [(0, '25.399')] [2023-02-26 07:25:10,458][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001366_5595136.pth... [2023-02-26 07:25:10,657][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001157_4739072.pth [2023-02-26 07:25:12,264][30947] Updated weights for policy 0, policy_version 1368 (0.0026) [2023-02-26 07:25:15,446][06480] Fps is (10 sec: 3278.0, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5615616. Throughput: 0: 916.2. Samples: 402968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:25:15,448][06480] Avg episode reward: [(0, '24.072')] [2023-02-26 07:25:20,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 5640192. Throughput: 0: 944.3. Samples: 406376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:25:20,449][06480] Avg episode reward: [(0, '22.563')] [2023-02-26 07:25:21,384][30947] Updated weights for policy 0, policy_version 1378 (0.0012) [2023-02-26 07:25:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 5656576. Throughput: 0: 938.4. Samples: 412638. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:25:25,449][06480] Avg episode reward: [(0, '21.814')] [2023-02-26 07:25:30,448][06480] Fps is (10 sec: 3276.2, 60 sec: 3754.9, 300 sec: 3596.1). Total num frames: 5672960. Throughput: 0: 901.0. Samples: 417026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:25:30,451][06480] Avg episode reward: [(0, '21.538')] [2023-02-26 07:25:33,854][30947] Updated weights for policy 0, policy_version 1388 (0.0022) [2023-02-26 07:25:35,446][06480] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 5689344. Throughput: 0: 906.9. Samples: 419428. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:25:35,449][06480] Avg episode reward: [(0, '21.328')] [2023-02-26 07:25:40,446][06480] Fps is (10 sec: 4096.7, 60 sec: 3754.7, 300 sec: 3596.2). Total num frames: 5713920. Throughput: 0: 951.2. Samples: 426214. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:25:40,449][06480] Avg episode reward: [(0, '22.251')] [2023-02-26 07:25:43,099][30947] Updated weights for policy 0, policy_version 1398 (0.0012) [2023-02-26 07:25:45,446][06480] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 5730304. Throughput: 0: 927.4. Samples: 432004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:25:45,450][06480] Avg episode reward: [(0, '20.666')] [2023-02-26 07:25:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 5746688. Throughput: 0: 902.6. Samples: 434088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:25:50,449][06480] Avg episode reward: [(0, '20.609')] [2023-02-26 07:25:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 5763072. Throughput: 0: 923.7. Samples: 439304. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:25:55,456][06480] Avg episode reward: [(0, '21.633')] [2023-02-26 07:25:55,483][30947] Updated weights for policy 0, policy_version 1408 (0.0025) [2023-02-26 07:26:00,447][06480] Fps is (10 sec: 4095.7, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 5787648. Throughput: 0: 959.2. Samples: 446132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:26:00,453][06480] Avg episode reward: [(0, '21.098')] [2023-02-26 07:26:05,451][06480] Fps is (10 sec: 4094.2, 60 sec: 3686.3, 300 sec: 3610.0). Total num frames: 5804032. Throughput: 0: 949.6. Samples: 449112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:26:05,453][06480] Avg episode reward: [(0, '21.543')] [2023-02-26 07:26:05,791][30947] Updated weights for policy 0, policy_version 1418 (0.0019) [2023-02-26 07:26:10,447][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.6, 300 sec: 3623.9). Total num frames: 5820416. Throughput: 0: 905.9. Samples: 453406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:26:10,454][06480] Avg episode reward: [(0, '20.768')] [2023-02-26 07:26:15,446][06480] Fps is (10 sec: 3688.1, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 5840896. Throughput: 0: 933.2. Samples: 459020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:26:15,452][06480] Avg episode reward: [(0, '23.128')] [2023-02-26 07:26:17,051][30947] Updated weights for policy 0, policy_version 1428 (0.0011) [2023-02-26 07:26:20,446][06480] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 5861376. Throughput: 0: 955.0. Samples: 462402. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:26:20,452][06480] Avg episode reward: [(0, '24.516')] [2023-02-26 07:26:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 5877760. Throughput: 0: 937.4. Samples: 468398. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:26:25,452][06480] Avg episode reward: [(0, '24.399')]
[2023-02-26 07:26:28,354][30947] Updated weights for policy 0, policy_version 1438 (0.0012)
[2023-02-26 07:26:30,447][06480] Fps is (10 sec: 3276.7, 60 sec: 3686.5, 300 sec: 3665.6). Total num frames: 5894144. Throughput: 0: 902.5. Samples: 472616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:26:30,452][06480] Avg episode reward: [(0, '24.210')]
[2023-02-26 07:26:35,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 5906432. Throughput: 0: 893.8. Samples: 474310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:26:35,454][06480] Avg episode reward: [(0, '24.118')]
[2023-02-26 07:26:40,446][06480] Fps is (10 sec: 2457.7, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 5918720. Throughput: 0: 870.7. Samples: 478486. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 07:26:40,454][06480] Avg episode reward: [(0, '24.380')]
[2023-02-26 07:26:42,354][30947] Updated weights for policy 0, policy_version 1448 (0.0030)
[2023-02-26 07:26:45,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 5939200. Throughput: 0: 839.0. Samples: 483888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:26:45,451][06480] Avg episode reward: [(0, '24.512')]
[2023-02-26 07:26:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 5951488. Throughput: 0: 820.6. Samples: 486034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:26:50,455][06480] Avg episode reward: [(0, '24.582')]
[2023-02-26 07:26:55,038][30947] Updated weights for policy 0, policy_version 1458 (0.0022)
[2023-02-26 07:26:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3596.1). Total num frames: 5971968. Throughput: 0: 835.3. Samples: 490992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:26:55,453][06480] Avg episode reward: [(0, '24.055')]
[2023-02-26 07:27:00,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3596.1). Total num frames: 5992448. Throughput: 0: 861.2. Samples: 497772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:00,454][06480] Avg episode reward: [(0, '23.510')]
[2023-02-26 07:27:04,513][30947] Updated weights for policy 0, policy_version 1468 (0.0016)
[2023-02-26 07:27:05,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.9, 300 sec: 3623.9). Total num frames: 6012928. Throughput: 0: 861.0. Samples: 501146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:05,455][06480] Avg episode reward: [(0, '24.418')]
[2023-02-26 07:27:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3596.1). Total num frames: 6025216. Throughput: 0: 821.7. Samples: 505374. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:27:10,454][06480] Avg episode reward: [(0, '24.669')]
[2023-02-26 07:27:10,470][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001471_6025216.pth...
[2023-02-26 07:27:10,659][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001262_5169152.pth
[2023-02-26 07:27:15,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 6045696. Throughput: 0: 843.4. Samples: 510570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:15,454][06480] Avg episode reward: [(0, '24.484')]
[2023-02-26 07:27:16,812][30947] Updated weights for policy 0, policy_version 1478 (0.0019)
[2023-02-26 07:27:20,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 6070272. Throughput: 0: 881.6. Samples: 513984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:20,450][06480] Avg episode reward: [(0, '25.618')]
[2023-02-26 07:27:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 6086656. Throughput: 0: 924.3. Samples: 520080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:25,453][06480] Avg episode reward: [(0, '25.184')]
[2023-02-26 07:27:27,725][30947] Updated weights for policy 0, policy_version 1488 (0.0016)
[2023-02-26 07:27:30,447][06480] Fps is (10 sec: 2866.9, 60 sec: 3413.3, 300 sec: 3596.1). Total num frames: 6098944. Throughput: 0: 899.8. Samples: 524380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:27:30,455][06480] Avg episode reward: [(0, '24.885')]
[2023-02-26 07:27:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6119424. Throughput: 0: 910.0. Samples: 526982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:27:35,454][06480] Avg episode reward: [(0, '24.489')]
[2023-02-26 07:27:38,427][30947] Updated weights for policy 0, policy_version 1498 (0.0011)
[2023-02-26 07:27:40,446][06480] Fps is (10 sec: 4506.0, 60 sec: 3754.7, 300 sec: 3610.1). Total num frames: 6144000. Throughput: 0: 951.0. Samples: 533788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:27:40,449][06480] Avg episode reward: [(0, '24.006')]
[2023-02-26 07:27:45,448][06480] Fps is (10 sec: 4095.3, 60 sec: 3686.3, 300 sec: 3610.0). Total num frames: 6160384. Throughput: 0: 925.9. Samples: 539440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:27:45,452][06480] Avg episode reward: [(0, '23.315')]
[2023-02-26 07:27:50,244][30947] Updated weights for policy 0, policy_version 1508 (0.0011)
[2023-02-26 07:27:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 6176768. Throughput: 0: 898.1. Samples: 541562. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:27:50,457][06480] Avg episode reward: [(0, '22.595')]
[2023-02-26 07:27:55,446][06480] Fps is (10 sec: 3687.0, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 6197248. Throughput: 0: 923.9. Samples: 546948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:27:55,453][06480] Avg episode reward: [(0, '22.399')]
[2023-02-26 07:27:59,799][30947] Updated weights for policy 0, policy_version 1518 (0.0016)
[2023-02-26 07:28:00,446][06480] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3610.1). Total num frames: 6217728. Throughput: 0: 960.8. Samples: 553806. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:28:00,452][06480] Avg episode reward: [(0, '21.534')]
[2023-02-26 07:28:05,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 6234112. Throughput: 0: 947.7. Samples: 556630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:28:05,454][06480] Avg episode reward: [(0, '21.394')]
[2023-02-26 07:28:10,447][06480] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3610.0). Total num frames: 6250496. Throughput: 0: 907.1. Samples: 560902. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 07:28:10,450][06480] Avg episode reward: [(0, '20.759')]
[2023-02-26 07:28:12,416][30947] Updated weights for policy 0, policy_version 1528 (0.0027)
[2023-02-26 07:28:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3596.2). Total num frames: 6270976. Throughput: 0: 941.6. Samples: 566752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:28:15,452][06480] Avg episode reward: [(0, '21.422')]
[2023-02-26 07:28:20,446][06480] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 6291456. Throughput: 0: 958.8. Samples: 570130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:28:20,452][06480] Avg episode reward: [(0, '22.432')]
[2023-02-26 07:28:21,528][30947] Updated weights for policy 0, policy_version 1538 (0.0016)
[2023-02-26 07:28:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 6307840. Throughput: 0: 934.9. Samples: 575858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:28:25,451][06480] Avg episode reward: [(0, '21.955')]
[2023-02-26 07:28:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 6324224. Throughput: 0: 903.9. Samples: 580114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:28:30,458][06480] Avg episode reward: [(0, '22.209')]
[2023-02-26 07:28:33,957][30947] Updated weights for policy 0, policy_version 1548 (0.0026)
[2023-02-26 07:28:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 6344704. Throughput: 0: 924.3. Samples: 583156. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:28:35,449][06480] Avg episode reward: [(0, '20.356')]
[2023-02-26 07:28:40,446][06480] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3610.1). Total num frames: 6369280. Throughput: 0: 953.6. Samples: 589862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:28:40,448][06480] Avg episode reward: [(0, '20.112')]
[2023-02-26 07:28:44,523][30947] Updated weights for policy 0, policy_version 1558 (0.0013)
[2023-02-26 07:28:45,446][06480] Fps is (10 sec: 3686.3, 60 sec: 3686.5, 300 sec: 3610.1). Total num frames: 6381568. Throughput: 0: 912.2. Samples: 594854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:28:45,448][06480] Avg episode reward: [(0, '20.062')]
[2023-02-26 07:28:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3624.0). Total num frames: 6397952. Throughput: 0: 896.0. Samples: 596952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:28:50,453][06480] Avg episode reward: [(0, '20.393')]
[2023-02-26 07:28:55,446][06480] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 6418432. Throughput: 0: 929.6. Samples: 602734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:28:55,449][06480] Avg episode reward: [(0, '22.322')]
[2023-02-26 07:28:55,706][30947] Updated weights for policy 0, policy_version 1568 (0.0026)
[2023-02-26 07:29:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 6443008. Throughput: 0: 950.9. Samples: 609542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:00,452][06480] Avg episode reward: [(0, '22.021')]
[2023-02-26 07:29:05,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 6455296. Throughput: 0: 929.5. Samples: 611956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:05,449][06480] Avg episode reward: [(0, '23.273')]
[2023-02-26 07:29:07,436][30947] Updated weights for policy 0, policy_version 1578 (0.0014)
[2023-02-26 07:29:10,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 6467584. Throughput: 0: 892.5. Samples: 616020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:29:10,449][06480] Avg episode reward: [(0, '22.623')]
[2023-02-26 07:29:10,467][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001579_6467584.pth...
[2023-02-26 07:29:10,733][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001366_5595136.pth
[2023-02-26 07:29:15,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 6483968. Throughput: 0: 884.0. Samples: 619896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:29:15,449][06480] Avg episode reward: [(0, '22.691')]
[2023-02-26 07:29:20,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 6500352. Throughput: 0: 863.6. Samples: 622016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:29:20,454][06480] Avg episode reward: [(0, '21.632')]
[2023-02-26 07:29:21,000][30947] Updated weights for policy 0, policy_version 1588 (0.0015)
[2023-02-26 07:29:25,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3624.0). Total num frames: 6516736. Throughput: 0: 841.2. Samples: 627718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:25,452][06480] Avg episode reward: [(0, '21.905')]
[2023-02-26 07:29:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 6533120. Throughput: 0: 825.9. Samples: 632018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:30,448][06480] Avg episode reward: [(0, '22.487')]
[2023-02-26 07:29:33,534][30947] Updated weights for policy 0, policy_version 1598 (0.0014)
[2023-02-26 07:29:35,446][06480] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 6553600. Throughput: 0: 841.1. Samples: 634802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:35,448][06480] Avg episode reward: [(0, '22.063')]
[2023-02-26 07:29:40,446][06480] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 6574080. Throughput: 0: 860.6. Samples: 641462. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:29:40,448][06480] Avg episode reward: [(0, '24.196')]
[2023-02-26 07:29:43,418][30947] Updated weights for policy 0, policy_version 1608 (0.0012)
[2023-02-26 07:29:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 6590464. Throughput: 0: 828.0. Samples: 646804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:29:45,453][06480] Avg episode reward: [(0, '24.558')]
[2023-02-26 07:29:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3596.1). Total num frames: 6602752. Throughput: 0: 819.9. Samples: 648850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:29:50,455][06480] Avg episode reward: [(0, '24.989')]
[2023-02-26 07:29:55,288][30947] Updated weights for policy 0, policy_version 1618 (0.0018)
[2023-02-26 07:29:55,447][06480] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3596.1). Total num frames: 6627328. Throughput: 0: 853.1. Samples: 654412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:29:55,454][06480] Avg episode reward: [(0, '24.401')]
[2023-02-26 07:30:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3610.1). Total num frames: 6647808. Throughput: 0: 920.6. Samples: 661322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:30:00,449][06480] Avg episode reward: [(0, '25.400')]
[2023-02-26 07:30:05,446][06480] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 6664192. Throughput: 0: 933.2. Samples: 664010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:30:05,464][06480] Avg episode reward: [(0, '25.448')]
[2023-02-26 07:30:06,043][30947] Updated weights for policy 0, policy_version 1628 (0.0011)
[2023-02-26 07:30:10,447][06480] Fps is (10 sec: 3276.6, 60 sec: 3549.8, 300 sec: 3610.0). Total num frames: 6680576. Throughput: 0: 901.1. Samples: 668266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:30:10,450][06480] Avg episode reward: [(0, '24.827')]
[2023-02-26 07:30:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 6701056. Throughput: 0: 939.1. Samples: 674276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:30:15,452][06480] Avg episode reward: [(0, '24.071')]
[2023-02-26 07:30:17,002][30947] Updated weights for policy 0, policy_version 1638 (0.0015)
[2023-02-26 07:30:20,446][06480] Fps is (10 sec: 4505.9, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 6725632. Throughput: 0: 953.5. Samples: 677710. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:30:20,452][06480] Avg episode reward: [(0, '25.094')]
[2023-02-26 07:30:25,448][06480] Fps is (10 sec: 4095.3, 60 sec: 3754.6, 300 sec: 3623.9). Total num frames: 6742016. Throughput: 0: 933.3. Samples: 683462. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:30:25,454][06480] Avg episode reward: [(0, '23.794')]
[2023-02-26 07:30:28,194][30947] Updated weights for policy 0, policy_version 1648 (0.0013)
[2023-02-26 07:30:30,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 6754304. Throughput: 0: 911.7. Samples: 687832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:30:30,449][06480] Avg episode reward: [(0, '23.786')]
[2023-02-26 07:30:35,446][06480] Fps is (10 sec: 3277.4, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 6774784. Throughput: 0: 931.5. Samples: 690766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:30:35,453][06480] Avg episode reward: [(0, '24.316')]
[2023-02-26 07:30:38,291][30947] Updated weights for policy 0, policy_version 1658 (0.0025)
[2023-02-26 07:30:40,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 6799360. Throughput: 0: 964.9. Samples: 697834. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:30:40,453][06480] Avg episode reward: [(0, '23.834')]
[2023-02-26 07:30:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 6811648. Throughput: 0: 922.9. Samples: 702854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:30:45,452][06480] Avg episode reward: [(0, '24.319')]
[2023-02-26 07:30:50,450][06480] Fps is (10 sec: 2866.1, 60 sec: 3754.4, 300 sec: 3610.0). Total num frames: 6828032. Throughput: 0: 912.8. Samples: 705090. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:30:50,459][06480] Avg episode reward: [(0, '23.748')]
[2023-02-26 07:30:50,883][30947] Updated weights for policy 0, policy_version 1668 (0.0030)
[2023-02-26 07:30:55,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 6852608. Throughput: 0: 950.2. Samples: 711026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:30:55,451][06480] Avg episode reward: [(0, '23.870')]
[2023-02-26 07:30:59,473][30947] Updated weights for policy 0, policy_version 1678 (0.0011)
[2023-02-26 07:31:00,446][06480] Fps is (10 sec: 4917.0, 60 sec: 3822.9, 300 sec: 3637.9). Total num frames: 6877184. Throughput: 0: 970.4. Samples: 717942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:31:00,453][06480] Avg episode reward: [(0, '23.148')]
[2023-02-26 07:31:05,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 6889472. Throughput: 0: 948.1. Samples: 720376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:31:05,455][06480] Avg episode reward: [(0, '23.203')]
[2023-02-26 07:31:10,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 6905856. Throughput: 0: 914.6. Samples: 724616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:31:10,448][06480] Avg episode reward: [(0, '24.213')]
[2023-02-26 07:31:10,461][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001686_6905856.pth...
[2023-02-26 07:31:10,619][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001471_6025216.pth
[2023-02-26 07:31:12,347][30947] Updated weights for policy 0, policy_version 1688 (0.0022)
[2023-02-26 07:31:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 6926336. Throughput: 0: 950.0. Samples: 730582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:31:15,449][06480] Avg episode reward: [(0, '23.063')]
[2023-02-26 07:31:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 6946816. Throughput: 0: 960.9. Samples: 734006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:31:20,455][06480] Avg episode reward: [(0, '23.145')]
[2023-02-26 07:31:22,061][30947] Updated weights for policy 0, policy_version 1698 (0.0015)
[2023-02-26 07:31:25,448][06480] Fps is (10 sec: 3685.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 6963200. Throughput: 0: 920.2. Samples: 739244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:31:25,450][06480] Avg episode reward: [(0, '23.530')]
[2023-02-26 07:31:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 6979584. Throughput: 0: 907.8. Samples: 743706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:31:30,454][06480] Avg episode reward: [(0, '24.687')]
[2023-02-26 07:31:33,867][30947] Updated weights for policy 0, policy_version 1708 (0.0017)
[2023-02-26 07:31:35,446][06480] Fps is (10 sec: 3687.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 7000064. Throughput: 0: 935.3. Samples: 747174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:31:35,449][06480] Avg episode reward: [(0, '23.460')]
[2023-02-26 07:31:40,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 7024640. Throughput: 0: 958.9. Samples: 754178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:31:40,449][06480] Avg episode reward: [(0, '21.934')]
[2023-02-26 07:31:44,366][30947] Updated weights for policy 0, policy_version 1718 (0.0011)
[2023-02-26 07:31:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 7036928. Throughput: 0: 907.4. Samples: 758776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:31:45,456][06480] Avg episode reward: [(0, '23.191')]
[2023-02-26 07:31:50,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3686.6, 300 sec: 3651.7). Total num frames: 7049216. Throughput: 0: 894.6. Samples: 760632. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:31:50,450][06480] Avg episode reward: [(0, '23.413')]
[2023-02-26 07:31:55,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 7065600. Throughput: 0: 886.4. Samples: 764506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:31:55,453][06480] Avg episode reward: [(0, '22.811')]
[2023-02-26 07:31:58,702][30947] Updated weights for policy 0, policy_version 1728 (0.0023)
[2023-02-26 07:32:00,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 7081984. Throughput: 0: 877.3. Samples: 770062. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:00,448][06480] Avg episode reward: [(0, '22.600')]
[2023-02-26 07:32:05,450][06480] Fps is (10 sec: 3685.1, 60 sec: 3549.7, 300 sec: 3651.6). Total num frames: 7102464. Throughput: 0: 866.7. Samples: 773012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:32:05,453][06480] Avg episode reward: [(0, '22.817')]
[2023-02-26 07:32:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 7114752. Throughput: 0: 848.1. Samples: 777406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:10,452][06480] Avg episode reward: [(0, '23.679')]
[2023-02-26 07:32:11,041][30947] Updated weights for policy 0, policy_version 1738 (0.0027)
[2023-02-26 07:32:15,446][06480] Fps is (10 sec: 3278.0, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 7135232. Throughput: 0: 874.4. Samples: 783056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:15,451][06480] Avg episode reward: [(0, '22.871')]
[2023-02-26 07:32:19,866][30947] Updated weights for policy 0, policy_version 1748 (0.0014)
[2023-02-26 07:32:20,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 7159808. Throughput: 0: 878.3. Samples: 786696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:20,449][06480] Avg episode reward: [(0, '24.448')]
[2023-02-26 07:32:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3651.7). Total num frames: 7176192. Throughput: 0: 859.7. Samples: 792864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:25,450][06480] Avg episode reward: [(0, '23.525')]
[2023-02-26 07:32:30,449][06480] Fps is (10 sec: 3276.0, 60 sec: 3549.7, 300 sec: 3637.8). Total num frames: 7192576. Throughput: 0: 853.3. Samples: 797176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:30,454][06480] Avg episode reward: [(0, '23.898')]
[2023-02-26 07:32:32,376][30947] Updated weights for policy 0, policy_version 1758 (0.0020)
[2023-02-26 07:32:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 7213056. Throughput: 0: 877.6. Samples: 800126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:32:35,454][06480] Avg episode reward: [(0, '24.703')]
[2023-02-26 07:32:40,446][06480] Fps is (10 sec: 4506.6, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 7237632. Throughput: 0: 947.5. Samples: 807144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:32:40,453][06480] Avg episode reward: [(0, '25.725')]
[2023-02-26 07:32:41,040][30947] Updated weights for policy 0, policy_version 1768 (0.0013)
[2023-02-26 07:32:45,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 7254016. Throughput: 0: 939.9. Samples: 812358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:32:45,455][06480] Avg episode reward: [(0, '26.374')]
[2023-02-26 07:32:50,447][06480] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 7266304. Throughput: 0: 921.1. Samples: 814458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:32:50,448][06480] Avg episode reward: [(0, '24.860')]
[2023-02-26 07:32:53,836][30947] Updated weights for policy 0, policy_version 1778 (0.0019)
[2023-02-26 07:32:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 7286784. Throughput: 0: 947.1. Samples: 820026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:32:55,449][06480] Avg episode reward: [(0, '24.313')]
[2023-02-26 07:33:00,447][06480] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3651.7). Total num frames: 7311360. Throughput: 0: 977.9. Samples: 827060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:33:00,454][06480] Avg episode reward: [(0, '24.487')]
[2023-02-26 07:33:03,378][30947] Updated weights for policy 0, policy_version 1788 (0.0018)
[2023-02-26 07:33:05,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3651.7). Total num frames: 7327744. Throughput: 0: 956.2. Samples: 829726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:33:05,448][06480] Avg episode reward: [(0, '23.168')]
[2023-02-26 07:33:10,446][06480] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3637.8). Total num frames: 7344128. Throughput: 0: 915.6. Samples: 834064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:33:10,453][06480] Avg episode reward: [(0, '23.577')]
[2023-02-26 07:33:10,468][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001793_7344128.pth...
[2023-02-26 07:33:10,675][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001579_6467584.pth
[2023-02-26 07:33:15,148][30947] Updated weights for policy 0, policy_version 1798 (0.0019)
[2023-02-26 07:33:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3637.8). Total num frames: 7364608. Throughput: 0: 954.1. Samples: 840110. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:33:15,449][06480] Avg episode reward: [(0, '24.646')]
[2023-02-26 07:33:20,447][06480] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3651.7). Total num frames: 7385088. Throughput: 0: 965.3. Samples: 843564. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:33:20,449][06480] Avg episode reward: [(0, '25.290')]
[2023-02-26 07:33:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 7401472. Throughput: 0: 933.4. Samples: 849146. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-26 07:33:25,452][06480] Avg episode reward: [(0, '26.333')]
[2023-02-26 07:33:25,935][30947] Updated weights for policy 0, policy_version 1808 (0.0011)
[2023-02-26 07:33:30,446][06480] Fps is (10 sec: 3276.9, 60 sec: 3754.8, 300 sec: 3637.8). Total num frames: 7417856. Throughput: 0: 914.9. Samples: 853530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:33:30,449][06480] Avg episode reward: [(0, '26.124')]
[2023-02-26 07:33:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 7438336. Throughput: 0: 940.5. Samples: 856782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:33:35,448][06480] Avg episode reward: [(0, '26.769')]
[2023-02-26 07:33:36,509][30947] Updated weights for policy 0, policy_version 1818 (0.0016)
[2023-02-26 07:33:40,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 7462912. Throughput: 0: 971.6. Samples: 863750. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:33:40,452][06480] Avg episode reward: [(0, '26.870')]
[2023-02-26 07:33:40,469][30933] Saving new best policy, reward=26.870!
[2023-02-26 07:33:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 7475200. Throughput: 0: 921.5. Samples: 868528. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:33:45,451][06480] Avg episode reward: [(0, '26.141')]
[2023-02-26 07:33:48,416][30947] Updated weights for policy 0, policy_version 1828 (0.0034)
[2023-02-26 07:33:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 7491584. Throughput: 0: 909.4. Samples: 870648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:33:50,451][06480] Avg episode reward: [(0, '25.799')]
[2023-02-26 07:33:55,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3637.8). Total num frames: 7516160. Throughput: 0: 947.2. Samples: 876690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:33:55,451][06480] Avg episode reward: [(0, '24.469')]
[2023-02-26 07:33:57,916][30947] Updated weights for policy 0, policy_version 1838 (0.0015)
[2023-02-26 07:34:00,450][06480] Fps is (10 sec: 4504.0, 60 sec: 3754.5, 300 sec: 3665.5). Total num frames: 7536640. Throughput: 0: 968.3. Samples: 883688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:34:00,452][06480] Avg episode reward: [(0, '24.388')]
[2023-02-26 07:34:05,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 7553024. Throughput: 0: 941.3. Samples: 885920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:34:05,453][06480] Avg episode reward: [(0, '25.044')]
[2023-02-26 07:34:10,357][30947] Updated weights for policy 0, policy_version 1848 (0.0019)
[2023-02-26 07:34:10,446][06480] Fps is (10 sec: 3278.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 7569408. Throughput: 0: 914.5. Samples: 890300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:34:10,450][06480] Avg episode reward: [(0, '24.825')]
[2023-02-26 07:34:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 7589888. Throughput: 0: 958.1. Samples: 896644. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-26 07:34:15,449][06480] Avg episode reward: [(0, '24.808')]
[2023-02-26 07:34:19,395][30947] Updated weights for policy 0, policy_version 1858 (0.0018)
[2023-02-26 07:34:20,446][06480] Fps is (10 sec: 4505.7, 60 sec: 3823.0, 300 sec: 3721.1). Total num frames: 7614464. Throughput: 0: 962.6. Samples: 900098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:34:20,450][06480] Avg episode reward: [(0, '25.447')]
[2023-02-26 07:34:25,447][06480] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3707.2). Total num frames: 7626752. Throughput: 0: 922.1. Samples: 905246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:34:25,449][06480] Avg episode reward: [(0, '25.745')]
[2023-02-26 07:34:30,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 7639040. Throughput: 0: 895.0. Samples: 908802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:34:30,453][06480] Avg episode reward: [(0, '26.017')]
[2023-02-26 07:34:34,676][30947] Updated weights for policy 0, policy_version 1868 (0.0033)
[2023-02-26 07:34:35,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3549.8, 300 sec: 3651.7). Total num frames: 7651328. Throughput: 0: 889.7. Samples: 910684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:34:35,455][06480] Avg episode reward: [(0, '26.570')]
[2023-02-26 07:34:40,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 7671808. Throughput: 0: 876.1. Samples: 916116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:34:40,449][06480] Avg episode reward: [(0, '24.630')]
[2023-02-26 07:34:45,449][06480] Fps is (10 sec: 3685.5, 60 sec: 3549.7, 300 sec: 3679.4). Total num frames: 7688192. Throughput: 0: 842.8. Samples: 921614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:34:45,452][06480] Avg episode reward: [(0, '25.597')]
[2023-02-26 07:34:45,757][30947] Updated weights for policy 0, policy_version 1878 (0.0017)
[2023-02-26 07:34:50,448][06480] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3651.7). Total num frames: 7704576. Throughput: 0: 837.3. Samples: 923600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:34:50,451][06480] Avg episode reward: [(0, '26.227')]
[2023-02-26 07:34:55,446][06480] Fps is (10 sec: 3687.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 7725056. Throughput: 0: 859.7. Samples: 928986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:34:55,449][06480] Avg episode reward: [(0, '26.141')]
[2023-02-26 07:34:56,928][30947] Updated weights for policy 0, policy_version 1888 (0.0021)
[2023-02-26 07:35:00,446][06480] Fps is (10 sec: 4096.6, 60 sec: 3481.8, 300 sec: 3665.6). Total num frames: 7745536. Throughput: 0: 873.8. Samples: 935964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:00,448][06480] Avg episode reward: [(0, '26.378')]
[2023-02-26 07:35:05,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 7766016. Throughput: 0: 863.2. Samples: 938944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:35:05,450][06480] Avg episode reward: [(0, '25.759')]
[2023-02-26 07:35:07,922][30947] Updated weights for policy 0, policy_version 1898 (0.0016)
[2023-02-26 07:35:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 7778304. Throughput: 0: 846.8. Samples: 943350. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:35:10,457][06480] Avg episode reward: [(0, '26.604')]
[2023-02-26 07:35:10,469][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001899_7778304.pth...
[2023-02-26 07:35:10,687][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001686_6905856.pth
[2023-02-26 07:35:15,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 7798784. Throughput: 0: 891.8. Samples: 948934. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:15,448][06480] Avg episode reward: [(0, '26.011')]
[2023-02-26 07:35:18,541][30947] Updated weights for policy 0, policy_version 1908 (0.0015)
[2023-02-26 07:35:20,446][06480] Fps is (10 sec: 4505.5, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 7823360. Throughput: 0: 925.0. Samples: 952310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:35:20,454][06480] Avg episode reward: [(0, '24.774')]
[2023-02-26 07:35:25,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 7839744. Throughput: 0: 935.0. Samples: 958192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:35:25,449][06480] Avg episode reward: [(0, '24.464')]
[2023-02-26 07:35:30,448][06480] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3651.7). Total num frames: 7852032. Throughput: 0: 909.6. Samples: 962544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:30,452][06480] Avg episode reward: [(0, '24.578')]
[2023-02-26 07:35:30,908][30947] Updated weights for policy 0, policy_version 1918 (0.0019)
[2023-02-26 07:35:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 7876608. Throughput: 0: 931.3. Samples: 965508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:35,451][06480] Avg episode reward: [(0, '24.388')]
[2023-02-26 07:35:39,806][30947] Updated weights for policy 0, policy_version 1928 (0.0013)
[2023-02-26 07:35:40,446][06480] Fps is (10 sec: 4506.3, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 7897088. Throughput: 0: 964.9. Samples: 972408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:40,451][06480] Avg episode reward: [(0, '24.790')]
[2023-02-26 07:35:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3679.5). Total num frames: 7913472. Throughput: 0: 926.9. Samples: 977674. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:35:45,451][06480] Avg episode reward: [(0, '25.930')]
[2023-02-26 07:35:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 7925760. Throughput: 0: 905.6. Samples: 979694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:35:50,451][06480] Avg episode reward: [(0, '26.518')]
[2023-02-26 07:35:52,592][30947] Updated weights for policy 0, policy_version 1938 (0.0012)
[2023-02-26 07:35:55,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 7950336. Throughput: 0: 932.9. Samples: 985332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:35:55,449][06480] Avg episode reward: [(0, '27.088')]
[2023-02-26 07:35:55,454][30933] Saving new best policy, reward=27.088!
[2023-02-26 07:36:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 7970816. Throughput: 0: 961.7. Samples: 992210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:36:00,450][06480] Avg episode reward: [(0, '26.090')]
[2023-02-26 07:36:01,891][30947] Updated weights for policy 0, policy_version 1948 (0.0016)
[2023-02-26 07:36:05,452][06480] Fps is (10 sec: 3684.3, 60 sec: 3686.1, 300 sec: 3665.5). Total num frames: 7987200. Throughput: 0: 941.9. Samples: 994700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:36:05,454][06480] Avg episode reward: [(0, '26.063')]
[2023-02-26 07:36:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 8003584. Throughput: 0: 905.8. Samples: 998952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:36:10,450][06480] Avg episode reward: [(0, '26.014')]
[2023-02-26 07:36:14,156][30947] Updated weights for policy 0, policy_version 1958 (0.0018)
[2023-02-26 07:36:15,446][06480] Fps is (10 sec: 3688.5, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 8024064. Throughput: 0: 943.3. Samples: 1004992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:36:15,454][06480] Avg episode reward: [(0, '23.998')]
[2023-02-26 07:36:20,446][06480] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 8044544. Throughput: 0: 948.0. Samples: 1008168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:36:20,454][06480] Avg episode reward: [(0, '22.165')]
[2023-02-26 07:36:24,728][30947] Updated weights for policy 0, policy_version 1968 (0.0021)
[2023-02-26 07:36:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 8060928. Throughput: 0: 917.8. Samples: 1013708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:36:25,450][06480] Avg episode reward: [(0, '22.411')]
[2023-02-26 07:36:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3651.7). Total num frames: 8077312. Throughput: 0: 897.2. Samples: 1018048. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-26 07:36:30,451][06480] Avg episode reward: [(0, '23.138')]
[2023-02-26 07:36:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 8097792. Throughput: 0: 922.0. Samples: 1021182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:36:35,452][06480] Avg episode reward: [(0, '23.771')]
[2023-02-26 07:36:35,927][30947] Updated weights for policy 0, policy_version 1978 (0.0022)
[2023-02-26 07:36:40,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 8122368. Throughput: 0: 951.0. Samples: 1028126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:36:40,448][06480] Avg episode reward: [(0, '22.902')]
[2023-02-26 07:36:45,447][06480] Fps is (10 sec: 3686.1, 60 sec: 3686.3, 300 sec: 3679.4). Total num frames: 8134656. Throughput: 0: 907.1. Samples: 1033030. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-26 07:36:45,450][06480] Avg episode reward: [(0, '23.697')]
[2023-02-26 07:36:47,301][30947] Updated weights for policy 0, policy_version 1988 (0.0011)
[2023-02-26 07:36:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 8151040. Throughput: 0: 899.5. Samples: 1035174. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-26 07:36:50,449][06480] Avg episode reward: [(0, '24.005')]
[2023-02-26 07:36:55,446][06480] Fps is (10 sec: 3686.7, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 8171520. Throughput: 0: 937.2. Samples: 1041128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:36:55,453][06480] Avg episode reward: [(0, '24.759')]
[2023-02-26 07:36:57,318][30947] Updated weights for policy 0, policy_version 1998 (0.0012)
[2023-02-26 07:37:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3707.3). Total num frames: 8196096. Throughput: 0: 957.1. Samples: 1048062.
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:37:00,453][06480] Avg episode reward: [(0, '22.753')] [2023-02-26 07:37:05,449][06480] Fps is (10 sec: 3685.4, 60 sec: 3686.6, 300 sec: 3707.2). Total num frames: 8208384. Throughput: 0: 936.8. Samples: 1050328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:37:05,451][06480] Avg episode reward: [(0, '21.891')] [2023-02-26 07:37:10,447][06480] Fps is (10 sec: 2457.4, 60 sec: 3618.1, 300 sec: 3679.4). Total num frames: 8220672. Throughput: 0: 894.1. Samples: 1053942. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:37:10,452][06480] Avg episode reward: [(0, '21.987')] [2023-02-26 07:37:10,467][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002007_8220672.pth... [2023-02-26 07:37:10,681][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001793_7344128.pth [2023-02-26 07:37:11,280][30947] Updated weights for policy 0, policy_version 2008 (0.0053) [2023-02-26 07:37:15,446][06480] Fps is (10 sec: 2458.2, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 8232960. Throughput: 0: 880.4. Samples: 1057668. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:37:15,451][06480] Avg episode reward: [(0, '22.591')] [2023-02-26 07:37:20,446][06480] Fps is (10 sec: 3277.1, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 8253440. Throughput: 0: 869.0. Samples: 1060286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:37:20,451][06480] Avg episode reward: [(0, '23.436')] [2023-02-26 07:37:22,746][30947] Updated weights for policy 0, policy_version 2018 (0.0011) [2023-02-26 07:37:25,454][06480] Fps is (10 sec: 3683.6, 60 sec: 3481.2, 300 sec: 3651.6). Total num frames: 8269824. Throughput: 0: 852.7. Samples: 1066506. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:37:25,456][06480] Avg episode reward: [(0, '23.560')] [2023-02-26 07:37:30,446][06480] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 8286208. Throughput: 0: 839.8. Samples: 1070822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:37:30,449][06480] Avg episode reward: [(0, '22.737')] [2023-02-26 07:37:35,135][30947] Updated weights for policy 0, policy_version 2028 (0.0015) [2023-02-26 07:37:35,446][06480] Fps is (10 sec: 3689.2, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 8306688. Throughput: 0: 847.2. Samples: 1073296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:37:35,449][06480] Avg episode reward: [(0, '23.546')] [2023-02-26 07:37:40,446][06480] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 8331264. Throughput: 0: 868.7. Samples: 1080218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:37:40,449][06480] Avg episode reward: [(0, '24.514')] [2023-02-26 07:37:45,120][30947] Updated weights for policy 0, policy_version 2038 (0.0010) [2023-02-26 07:37:45,448][06480] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3665.6). Total num frames: 8347648. Throughput: 0: 842.0. Samples: 1085954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:37:45,454][06480] Avg episode reward: [(0, '25.337')] [2023-02-26 07:37:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 8359936. Throughput: 0: 837.6. Samples: 1088020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:37:50,452][06480] Avg episode reward: [(0, '26.003')] [2023-02-26 07:37:55,446][06480] Fps is (10 sec: 3277.3, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 8380416. Throughput: 0: 875.6. Samples: 1093344. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:37:55,452][06480] Avg episode reward: [(0, '26.212')] [2023-02-26 07:37:56,429][30947] Updated weights for policy 0, policy_version 2048 (0.0014) [2023-02-26 07:38:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 8404992. Throughput: 0: 947.0. Samples: 1100282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:38:00,454][06480] Avg episode reward: [(0, '26.809')] [2023-02-26 07:38:05,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3651.7). Total num frames: 8421376. Throughput: 0: 953.0. Samples: 1103172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:38:05,449][06480] Avg episode reward: [(0, '25.551')] [2023-02-26 07:38:07,624][30947] Updated weights for policy 0, policy_version 2058 (0.0032) [2023-02-26 07:38:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 8437760. Throughput: 0: 909.4. Samples: 1107424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:38:10,453][06480] Avg episode reward: [(0, '26.393')] [2023-02-26 07:38:15,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 8458240. Throughput: 0: 942.7. Samples: 1113242. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:38:15,448][06480] Avg episode reward: [(0, '26.528')] [2023-02-26 07:38:18,256][30947] Updated weights for policy 0, policy_version 2068 (0.0021) [2023-02-26 07:38:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 8478720. Throughput: 0: 962.2. Samples: 1116596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:38:20,449][06480] Avg episode reward: [(0, '24.914')] [2023-02-26 07:38:25,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3755.1, 300 sec: 3651.7). Total num frames: 8495104. Throughput: 0: 932.7. Samples: 1122188. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-26 07:38:25,453][06480] Avg episode reward: [(0, '22.969')] [2023-02-26 07:38:30,362][30947] Updated weights for policy 0, policy_version 2078 (0.0019) [2023-02-26 07:38:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 8511488. Throughput: 0: 903.0. Samples: 1126586. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:38:30,454][06480] Avg episode reward: [(0, '22.785')] [2023-02-26 07:38:35,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 8531968. Throughput: 0: 924.6. Samples: 1129626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:38:35,455][06480] Avg episode reward: [(0, '24.164')] [2023-02-26 07:38:39,637][30947] Updated weights for policy 0, policy_version 2088 (0.0012) [2023-02-26 07:38:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 8552448. Throughput: 0: 960.0. Samples: 1136542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:38:40,448][06480] Avg episode reward: [(0, '22.012')] [2023-02-26 07:38:45,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3651.7). Total num frames: 8568832. Throughput: 0: 920.3. Samples: 1141696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:38:45,453][06480] Avg episode reward: [(0, '22.489')] [2023-02-26 07:38:50,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 8581120. Throughput: 0: 903.4. Samples: 1143824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:38:50,452][06480] Avg episode reward: [(0, '21.695')] [2023-02-26 07:38:52,228][30947] Updated weights for policy 0, policy_version 2098 (0.0017) [2023-02-26 07:38:55,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3624.0). Total num frames: 8605696. Throughput: 0: 934.4. Samples: 1149470. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:38:55,448][06480] Avg episode reward: [(0, '22.063')] [2023-02-26 07:39:00,446][06480] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 8630272. Throughput: 0: 959.5. Samples: 1156420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:39:00,454][06480] Avg episode reward: [(0, '21.145')] [2023-02-26 07:39:01,180][30947] Updated weights for policy 0, policy_version 2108 (0.0020) [2023-02-26 07:39:05,447][06480] Fps is (10 sec: 4095.8, 60 sec: 3754.6, 300 sec: 3651.7). Total num frames: 8646656. Throughput: 0: 941.7. Samples: 1158972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:39:05,455][06480] Avg episode reward: [(0, '22.230')] [2023-02-26 07:39:10,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 8658944. Throughput: 0: 914.7. Samples: 1163348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:39:10,450][06480] Avg episode reward: [(0, '23.640')] [2023-02-26 07:39:10,465][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002114_8658944.pth... [2023-02-26 07:39:10,656][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001899_7778304.pth [2023-02-26 07:39:13,738][30947] Updated weights for policy 0, policy_version 2118 (0.0019) [2023-02-26 07:39:15,446][06480] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 8679424. Throughput: 0: 950.8. Samples: 1169374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:39:15,451][06480] Avg episode reward: [(0, '24.455')] [2023-02-26 07:39:20,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 8704000. Throughput: 0: 961.3. Samples: 1172886. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:39:20,451][06480] Avg episode reward: [(0, '25.518')] [2023-02-26 07:39:23,917][30947] Updated weights for policy 0, policy_version 2128 (0.0012) [2023-02-26 07:39:25,447][06480] Fps is (10 sec: 4095.5, 60 sec: 3754.6, 300 sec: 3665.6). Total num frames: 8720384. Throughput: 0: 925.4. Samples: 1178188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:39:25,455][06480] Avg episode reward: [(0, '25.458')] [2023-02-26 07:39:30,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 8732672. Throughput: 0: 906.8. Samples: 1182500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:39:30,454][06480] Avg episode reward: [(0, '27.245')] [2023-02-26 07:39:30,466][30933] Saving new best policy, reward=27.245! [2023-02-26 07:39:35,399][30947] Updated weights for policy 0, policy_version 2138 (0.0017) [2023-02-26 07:39:35,446][06480] Fps is (10 sec: 3686.8, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 8757248. Throughput: 0: 932.0. Samples: 1185766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:39:35,449][06480] Avg episode reward: [(0, '27.434')] [2023-02-26 07:39:35,452][30933] Saving new best policy, reward=27.434! [2023-02-26 07:39:40,447][06480] Fps is (10 sec: 4505.1, 60 sec: 3754.6, 300 sec: 3693.4). Total num frames: 8777728. Throughput: 0: 958.8. Samples: 1192618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:39:40,450][06480] Avg episode reward: [(0, '26.986')] [2023-02-26 07:39:45,446][06480] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 8790016. Throughput: 0: 901.3. Samples: 1196978. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:39:45,449][06480] Avg episode reward: [(0, '27.441')] [2023-02-26 07:39:45,462][30933] Saving new best policy, reward=27.441! 
[2023-02-26 07:39:47,908][30947] Updated weights for policy 0, policy_version 2148 (0.0014) [2023-02-26 07:39:50,447][06480] Fps is (10 sec: 2457.7, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 8802304. Throughput: 0: 881.8. Samples: 1198652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:39:50,450][06480] Avg episode reward: [(0, '26.706')] [2023-02-26 07:39:55,447][06480] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 8814592. Throughput: 0: 861.2. Samples: 1202102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:39:55,453][06480] Avg episode reward: [(0, '25.515')] [2023-02-26 07:40:00,446][06480] Fps is (10 sec: 3277.0, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 8835072. Throughput: 0: 864.6. Samples: 1208282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:40:00,454][06480] Avg episode reward: [(0, '24.726')] [2023-02-26 07:40:00,505][30947] Updated weights for policy 0, policy_version 2158 (0.0020) [2023-02-26 07:40:05,446][06480] Fps is (10 sec: 4505.8, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 8859648. Throughput: 0: 863.3. Samples: 1211736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:40:05,451][06480] Avg episode reward: [(0, '25.968')] [2023-02-26 07:40:10,447][06480] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3637.8). Total num frames: 8871936. Throughput: 0: 859.5. Samples: 1216864. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:40:10,454][06480] Avg episode reward: [(0, '25.906')] [2023-02-26 07:40:11,908][30947] Updated weights for policy 0, policy_version 2168 (0.0013) [2023-02-26 07:40:15,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3610.0). Total num frames: 8888320. Throughput: 0: 865.4. Samples: 1221442. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:40:15,449][06480] Avg episode reward: [(0, '26.557')] [2023-02-26 07:40:20,446][06480] Fps is (10 sec: 4096.5, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 8912896. Throughput: 0: 868.7. Samples: 1224858. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:40:20,451][06480] Avg episode reward: [(0, '27.134')] [2023-02-26 07:40:21,819][30947] Updated weights for policy 0, policy_version 2178 (0.0014) [2023-02-26 07:40:25,454][06480] Fps is (10 sec: 4502.2, 60 sec: 3549.5, 300 sec: 3665.5). Total num frames: 8933376. Throughput: 0: 868.0. Samples: 1231684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:40:25,456][06480] Avg episode reward: [(0, '28.060')] [2023-02-26 07:40:25,465][30933] Saving new best policy, reward=28.060! [2023-02-26 07:40:30,452][06480] Fps is (10 sec: 3684.2, 60 sec: 3617.8, 300 sec: 3637.7). Total num frames: 8949760. Throughput: 0: 870.8. Samples: 1236170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:40:30,464][06480] Avg episode reward: [(0, '27.655')] [2023-02-26 07:40:34,569][30947] Updated weights for policy 0, policy_version 2188 (0.0015) [2023-02-26 07:40:35,446][06480] Fps is (10 sec: 2869.4, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 8962048. Throughput: 0: 883.1. Samples: 1238390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:40:35,454][06480] Avg episode reward: [(0, '26.844')] [2023-02-26 07:40:40,446][06480] Fps is (10 sec: 3688.6, 60 sec: 3481.7, 300 sec: 3637.8). Total num frames: 8986624. Throughput: 0: 946.9. Samples: 1244712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:40:40,448][06480] Avg episode reward: [(0, '27.013')] [2023-02-26 07:40:43,420][30947] Updated weights for policy 0, policy_version 2198 (0.0019) [2023-02-26 07:40:45,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 9007104. Throughput: 0: 955.3. Samples: 1251272. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:40:45,452][06480] Avg episode reward: [(0, '26.685')] [2023-02-26 07:40:50,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 9023488. Throughput: 0: 926.4. Samples: 1253422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:40:50,449][06480] Avg episode reward: [(0, '26.471')] [2023-02-26 07:40:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 9039872. Throughput: 0: 911.8. Samples: 1257896. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:40:55,452][06480] Avg episode reward: [(0, '26.166')] [2023-02-26 07:40:55,917][30947] Updated weights for policy 0, policy_version 2208 (0.0019) [2023-02-26 07:41:00,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3651.8). Total num frames: 9064448. Throughput: 0: 964.4. Samples: 1264840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:41:00,451][06480] Avg episode reward: [(0, '27.488')] [2023-02-26 07:41:05,247][30947] Updated weights for policy 0, policy_version 2218 (0.0023) [2023-02-26 07:41:05,452][06480] Fps is (10 sec: 4503.1, 60 sec: 3754.3, 300 sec: 3665.5). Total num frames: 9084928. Throughput: 0: 964.9. Samples: 1268286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:41:05,454][06480] Avg episode reward: [(0, '27.305')] [2023-02-26 07:41:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 9097216. Throughput: 0: 917.4. Samples: 1272962. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-26 07:41:10,454][06480] Avg episode reward: [(0, '27.296')] [2023-02-26 07:41:10,465][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002221_9097216.pth... 
[2023-02-26 07:41:10,697][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002007_8220672.pth [2023-02-26 07:41:15,446][06480] Fps is (10 sec: 2868.8, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 9113600. Throughput: 0: 923.9. Samples: 1277740. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:41:15,449][06480] Avg episode reward: [(0, '26.732')] [2023-02-26 07:41:17,417][30947] Updated weights for policy 0, policy_version 2228 (0.0014) [2023-02-26 07:41:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 9138176. Throughput: 0: 951.1. Samples: 1281188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:41:20,449][06480] Avg episode reward: [(0, '25.947')] [2023-02-26 07:41:25,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3755.1, 300 sec: 3665.6). Total num frames: 9158656. Throughput: 0: 957.6. Samples: 1287802. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-26 07:41:25,454][06480] Avg episode reward: [(0, '26.765')] [2023-02-26 07:41:28,087][30947] Updated weights for policy 0, policy_version 2238 (0.0011) [2023-02-26 07:41:30,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.8, 300 sec: 3637.8). Total num frames: 9170944. Throughput: 0: 908.9. Samples: 1292174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:41:30,453][06480] Avg episode reward: [(0, '25.706')] [2023-02-26 07:41:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 9191424. Throughput: 0: 910.0. Samples: 1294374. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:41:35,454][06480] Avg episode reward: [(0, '25.259')] [2023-02-26 07:41:38,873][30947] Updated weights for policy 0, policy_version 2248 (0.0021) [2023-02-26 07:41:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 9211904. Throughput: 0: 960.3. Samples: 1301110. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-26 07:41:40,454][06480] Avg episode reward: [(0, '24.425')] [2023-02-26 07:41:45,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 9232384. Throughput: 0: 942.9. Samples: 1307270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:41:45,449][06480] Avg episode reward: [(0, '25.970')] [2023-02-26 07:41:50,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 9244672. Throughput: 0: 913.3. Samples: 1309378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:41:50,454][06480] Avg episode reward: [(0, '27.058')] [2023-02-26 07:41:50,673][30947] Updated weights for policy 0, policy_version 2258 (0.0014) [2023-02-26 07:41:55,446][06480] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 9265152. Throughput: 0: 917.7. Samples: 1314260. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:41:55,454][06480] Avg episode reward: [(0, '25.692')] [2023-02-26 07:42:00,411][30947] Updated weights for policy 0, policy_version 2268 (0.0012) [2023-02-26 07:42:00,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 9289728. Throughput: 0: 966.7. Samples: 1321242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:00,448][06480] Avg episode reward: [(0, '26.390')] [2023-02-26 07:42:05,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3686.7, 300 sec: 3679.5). Total num frames: 9306112. Throughput: 0: 966.9. Samples: 1324700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:42:05,453][06480] Avg episode reward: [(0, '26.578')] [2023-02-26 07:42:10,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 9322496. Throughput: 0: 916.7. Samples: 1329054. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:10,453][06480] Avg episode reward: [(0, '26.682')] [2023-02-26 07:42:12,807][30947] Updated weights for policy 0, policy_version 2278 (0.0014) [2023-02-26 07:42:15,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 9338880. Throughput: 0: 937.5. Samples: 1334360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:15,452][06480] Avg episode reward: [(0, '25.903')] [2023-02-26 07:42:20,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3707.3). Total num frames: 9363456. Throughput: 0: 965.4. Samples: 1337816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:20,448][06480] Avg episode reward: [(0, '26.157')] [2023-02-26 07:42:22,544][30947] Updated weights for policy 0, policy_version 2288 (0.0013) [2023-02-26 07:42:25,446][06480] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 9375744. Throughput: 0: 927.7. Samples: 1342856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:42:25,454][06480] Avg episode reward: [(0, '25.773')] [2023-02-26 07:42:30,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 9388032. Throughput: 0: 864.3. Samples: 1346162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:30,449][06480] Avg episode reward: [(0, '27.171')] [2023-02-26 07:42:35,446][06480] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 9400320. Throughput: 0: 855.3. Samples: 1347866. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:42:35,453][06480] Avg episode reward: [(0, '26.982')] [2023-02-26 07:42:37,994][30947] Updated weights for policy 0, policy_version 2298 (0.0024) [2023-02-26 07:42:40,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 9420800. Throughput: 0: 872.0. Samples: 1353500. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:42:40,452][06480] Avg episode reward: [(0, '27.083')] [2023-02-26 07:42:45,446][06480] Fps is (10 sec: 4505.5, 60 sec: 3549.8, 300 sec: 3679.5). Total num frames: 9445376. Throughput: 0: 872.6. Samples: 1360510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:42:45,450][06480] Avg episode reward: [(0, '26.640')] [2023-02-26 07:42:47,241][30947] Updated weights for policy 0, policy_version 2308 (0.0017) [2023-02-26 07:42:50,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 9461760. Throughput: 0: 852.1. Samples: 1363044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:42:50,449][06480] Avg episode reward: [(0, '26.232')] [2023-02-26 07:42:55,449][06480] Fps is (10 sec: 2866.5, 60 sec: 3481.5, 300 sec: 3623.9). Total num frames: 9474048. Throughput: 0: 852.0. Samples: 1367398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:42:55,453][06480] Avg episode reward: [(0, '25.788')] [2023-02-26 07:42:59,181][30947] Updated weights for policy 0, policy_version 2318 (0.0021) [2023-02-26 07:43:00,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 9498624. Throughput: 0: 871.6. Samples: 1373580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:43:00,449][06480] Avg episode reward: [(0, '26.575')] [2023-02-26 07:43:05,446][06480] Fps is (10 sec: 4506.8, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 9519104. Throughput: 0: 871.3. Samples: 1377026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:43:05,452][06480] Avg episode reward: [(0, '25.961')] [2023-02-26 07:43:09,659][30947] Updated weights for policy 0, policy_version 2328 (0.0027) [2023-02-26 07:43:10,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 9535488. Throughput: 0: 882.5. Samples: 1382570. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:43:10,449][06480] Avg episode reward: [(0, '27.589')] [2023-02-26 07:43:10,463][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002328_9535488.pth... [2023-02-26 07:43:10,668][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002114_8658944.pth [2023-02-26 07:43:15,447][06480] Fps is (10 sec: 2867.0, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 9547776. Throughput: 0: 904.4. Samples: 1386860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:43:15,451][06480] Avg episode reward: [(0, '27.340')] [2023-02-26 07:43:20,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 9572352. Throughput: 0: 938.0. Samples: 1390074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:43:20,449][06480] Avg episode reward: [(0, '26.245')] [2023-02-26 07:43:20,741][30947] Updated weights for policy 0, policy_version 2338 (0.0015) [2023-02-26 07:43:25,446][06480] Fps is (10 sec: 4915.5, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 9596928. Throughput: 0: 970.2. Samples: 1397158. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:43:25,453][06480] Avg episode reward: [(0, '25.649')] [2023-02-26 07:43:30,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 9609216. Throughput: 0: 922.4. Samples: 1402016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:43:30,449][06480] Avg episode reward: [(0, '26.424')] [2023-02-26 07:43:32,122][30947] Updated weights for policy 0, policy_version 2348 (0.0031) [2023-02-26 07:43:35,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 9625600. Throughput: 0: 914.3. Samples: 1404188. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:43:35,450][06480] Avg episode reward: [(0, '26.528')] [2023-02-26 07:43:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 9650176. Throughput: 0: 950.6. Samples: 1410172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-26 07:43:40,449][06480] Avg episode reward: [(0, '24.940')] [2023-02-26 07:43:42,019][30947] Updated weights for policy 0, policy_version 2358 (0.0018) [2023-02-26 07:43:45,447][06480] Fps is (10 sec: 4505.0, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 9670656. Throughput: 0: 968.3. Samples: 1417154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:43:45,451][06480] Avg episode reward: [(0, '25.527')] [2023-02-26 07:43:50,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 9687040. Throughput: 0: 944.6. Samples: 1419534. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-26 07:43:50,456][06480] Avg episode reward: [(0, '25.565')] [2023-02-26 07:43:54,064][30947] Updated weights for policy 0, policy_version 2368 (0.0018) [2023-02-26 07:43:55,446][06480] Fps is (10 sec: 3277.2, 60 sec: 3823.1, 300 sec: 3637.8). Total num frames: 9703424. Throughput: 0: 918.7. Samples: 1423912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:43:55,455][06480] Avg episode reward: [(0, '25.331')] [2023-02-26 07:44:00,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 9723904. Throughput: 0: 963.7. Samples: 1430226. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-26 07:44:00,453][06480] Avg episode reward: [(0, '22.780')] [2023-02-26 07:44:03,356][30947] Updated weights for policy 0, policy_version 2378 (0.0020) [2023-02-26 07:44:05,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 9748480. Throughput: 0: 970.0. Samples: 1433724. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:44:05,451][06480] Avg episode reward: [(0, '22.351')] [2023-02-26 07:44:10,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 9760768. Throughput: 0: 928.7. Samples: 1438950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-26 07:44:10,453][06480] Avg episode reward: [(0, '23.164')] [2023-02-26 07:44:15,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3637.8). Total num frames: 9777152. Throughput: 0: 918.0. Samples: 1443326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-26 07:44:15,453][06480] Avg episode reward: [(0, '22.911')] [2023-02-26 07:44:16,007][30947] Updated weights for policy 0, policy_version 2388 (0.0018) [2023-02-26 07:44:20,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 9797632. Throughput: 0: 945.9. Samples: 1446754. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-26 07:44:20,455][06480] Avg episode reward: [(0, '23.117')] [2023-02-26 07:44:24,909][30947] Updated weights for policy 0, policy_version 2398 (0.0014) [2023-02-26 07:44:25,446][06480] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 9822208. Throughput: 0: 970.0. Samples: 1453820. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-26 07:44:25,451][06480] Avg episode reward: [(0, '24.133')] [2023-02-26 07:44:30,446][06480] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 9834496. Throughput: 0: 917.6. Samples: 1458446. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-26 07:44:30,454][06480] Avg episode reward: [(0, '24.458')] [2023-02-26 07:44:35,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3651.7). Total num frames: 9854976. Throughput: 0: 913.3. Samples: 1460634. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:44:35,448][06480] Avg episode reward: [(0, '24.038')]
[2023-02-26 07:44:37,305][30947] Updated weights for policy 0, policy_version 2408 (0.0031)
[2023-02-26 07:44:40,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 9875456. Throughput: 0: 955.0. Samples: 1466888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:44:40,449][06480] Avg episode reward: [(0, '25.575')]
[2023-02-26 07:44:45,446][06480] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 9895936. Throughput: 0: 966.7. Samples: 1473726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-26 07:44:45,450][06480] Avg episode reward: [(0, '26.741')]
[2023-02-26 07:44:46,907][30947] Updated weights for policy 0, policy_version 2418 (0.0013)
[2023-02-26 07:44:50,449][06480] Fps is (10 sec: 3685.6, 60 sec: 3754.5, 300 sec: 3721.1). Total num frames: 9912320. Throughput: 0: 937.2. Samples: 1475902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:44:50,451][06480] Avg episode reward: [(0, '26.634')]
[2023-02-26 07:44:55,446][06480] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 9928704. Throughput: 0: 917.6. Samples: 1480242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:44:55,454][06480] Avg episode reward: [(0, '25.752')]
[2023-02-26 07:44:59,377][30947] Updated weights for policy 0, policy_version 2428 (0.0024)
[2023-02-26 07:45:00,446][06480] Fps is (10 sec: 3277.6, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 9945088. Throughput: 0: 942.7. Samples: 1485748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:45:00,449][06480] Avg episode reward: [(0, '25.547')]
[2023-02-26 07:45:05,449][06480] Fps is (10 sec: 3275.8, 60 sec: 3549.7, 300 sec: 3693.3). Total num frames: 9961472. Throughput: 0: 915.0. Samples: 1487930.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-26 07:45:05,453][06480] Avg episode reward: [(0, '25.025')]
[2023-02-26 07:45:10,446][06480] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 9973760. Throughput: 0: 840.3. Samples: 1491634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-26 07:45:10,449][06480] Avg episode reward: [(0, '24.159')]
[2023-02-26 07:45:10,475][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002435_9973760.pth...
[2023-02-26 07:45:10,704][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002221_9097216.pth
[2023-02-26 07:45:14,509][30947] Updated weights for policy 0, policy_version 2438 (0.0024)
[2023-02-26 07:45:15,446][06480] Fps is (10 sec: 2458.3, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 9986048. Throughput: 0: 832.8. Samples: 1495922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-26 07:45:15,449][06480] Avg episode reward: [(0, '23.011')]
[2023-02-26 07:45:19,186][30933] Stopping Batcher_0...
[2023-02-26 07:45:19,187][30933] Loop batcher_evt_loop terminating...
[2023-02-26 07:45:19,188][06480] Component Batcher_0 stopped!
[2023-02-26 07:45:19,192][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2023-02-26 07:45:19,255][30947] Weights refcount: 2 0
[2023-02-26 07:45:19,294][30947] Stopping InferenceWorker_p0-w0...
[2023-02-26 07:45:19,298][30947] Loop inference_proc0-0_evt_loop terminating...
[2023-02-26 07:45:19,294][06480] Component InferenceWorker_p0-w0 stopped!
[2023-02-26 07:45:19,330][30933] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002328_9535488.pth
[2023-02-26 07:45:19,341][30933] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2023-02-26 07:45:19,451][30933] Stopping LearnerWorker_p0...
[2023-02-26 07:45:19,452][30933] Loop learner_proc0_evt_loop terminating...
[2023-02-26 07:45:19,453][06480] Component LearnerWorker_p0 stopped!
[2023-02-26 07:45:19,557][30962] Stopping RolloutWorker_w4...
[2023-02-26 07:45:19,562][30962] Loop rollout_proc4_evt_loop terminating...
[2023-02-26 07:45:19,558][06480] Component RolloutWorker_w7 stopped!
[2023-02-26 07:45:19,566][06480] Component RolloutWorker_w4 stopped!
[2023-02-26 07:45:19,574][06480] Component RolloutWorker_w1 stopped!
[2023-02-26 07:45:19,575][30952] Stopping RolloutWorker_w1...
[2023-02-26 07:45:19,576][30952] Loop rollout_proc1_evt_loop terminating...
[2023-02-26 07:45:19,554][30968] Stopping RolloutWorker_w7...
[2023-02-26 07:45:19,582][30958] Stopping RolloutWorker_w3...
[2023-02-26 07:45:19,583][30960] Stopping RolloutWorker_w5...
[2023-02-26 07:45:19,583][30960] Loop rollout_proc5_evt_loop terminating...
[2023-02-26 07:45:19,584][30958] Loop rollout_proc3_evt_loop terminating...
[2023-02-26 07:45:19,584][30968] Loop rollout_proc7_evt_loop terminating...
[2023-02-26 07:45:19,582][06480] Component RolloutWorker_w3 stopped!
[2023-02-26 07:45:19,585][06480] Component RolloutWorker_w5 stopped!
[2023-02-26 07:45:19,592][30970] Stopping RolloutWorker_w6...
[2023-02-26 07:45:19,592][30970] Loop rollout_proc6_evt_loop terminating...
[2023-02-26 07:45:19,596][06480] Component RolloutWorker_w6 stopped!
[2023-02-26 07:45:19,616][30951] Stopping RolloutWorker_w2...
[2023-02-26 07:45:19,616][06480] Component RolloutWorker_w2 stopped!
[2023-02-26 07:45:19,617][30951] Loop rollout_proc2_evt_loop terminating...
[2023-02-26 07:45:19,650][06480] Component RolloutWorker_w0 stopped!
[2023-02-26 07:45:19,653][06480] Waiting for process learner_proc0 to stop...
[2023-02-26 07:45:19,658][30948] Stopping RolloutWorker_w0...
[2023-02-26 07:45:19,664][30948] Loop rollout_proc0_evt_loop terminating...
[2023-02-26 07:45:23,164][06480] Waiting for process inference_proc0-0 to join...
[2023-02-26 07:45:23,191][06480] Waiting for process rollout_proc0 to join...
[2023-02-26 07:45:23,193][06480] Waiting for process rollout_proc1 to join...
[2023-02-26 07:45:23,195][06480] Waiting for process rollout_proc2 to join...
[2023-02-26 07:45:23,200][06480] Waiting for process rollout_proc3 to join...
[2023-02-26 07:45:23,203][06480] Waiting for process rollout_proc4 to join...
[2023-02-26 07:45:23,204][06480] Waiting for process rollout_proc5 to join...
[2023-02-26 07:45:23,205][06480] Waiting for process rollout_proc6 to join...
[2023-02-26 07:45:23,206][06480] Waiting for process rollout_proc7 to join...
[2023-02-26 07:45:23,208][06480] Batcher 0 profile tree view:
batching: 38.0811, releasing_batches: 0.0430
[2023-02-26 07:45:23,209][06480] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 848.1106
update_model: 11.1522
  weight_update: 0.0023
one_step: 0.0032
  handle_policy_step: 748.4622
    deserialize: 21.9104, stack: 4.3503, obs_to_device_normalize: 169.2788, forward: 355.3837, send_messages: 39.5183
    prepare_outputs: 120.6791
      to_cpu: 74.4310
[2023-02-26 07:45:23,210][06480] Learner 0 profile tree view:
misc: 0.0107, prepare_batch: 23.4033
train: 117.9822
  epoch_init: 0.0165, minibatch_init: 0.0092, losses_postprocess: 0.8250, kl_divergence: 0.8165, after_optimizer: 4.6216
  calculate_losses: 39.1543
    losses_init: 0.0050, forward_head: 2.7332, bptt_initial: 25.6047, tail: 1.5578, advantages_returns: 0.4931, losses: 4.8133
    bptt: 3.4172
      bptt_forward_core: 3.2826
  update: 71.4942
    clip: 2.0844
[2023-02-26 07:45:23,212][06480] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.6495, enqueue_policy_requests: 240.7376, env_step: 1244.5267, overhead: 33.3235, complete_rollouts: 10.1732
save_policy_outputs: 32.8304
  split_output_tensors: 15.9062
[2023-02-26 07:45:23,213][06480] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4744, enqueue_policy_requests: 239.4893, env_step: 1244.9929, overhead: 32.5271, complete_rollouts: 11.3433
save_policy_outputs: 31.0280
  split_output_tensors: 15.2532
[2023-02-26 07:45:23,215][06480] Loop Runner_EvtLoop terminating...
[2023-02-26 07:45:23,216][06480] Runner profile tree view:
main_loop: 1691.0690
[2023-02-26 07:45:23,218][06480] Collected {0: 10006528}, FPS: 3548.4
[2023-02-26 07:57:00,951][06480] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 07:57:00,953][06480] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 07:57:00,955][06480] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 07:57:00,958][06480] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 07:57:00,961][06480] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:57:00,962][06480] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 07:57:00,965][06480] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:57:00,967][06480] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 07:57:00,969][06480] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-26 07:57:00,970][06480] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-26 07:57:00,974][06480] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 07:57:00,975][06480] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 07:57:00,976][06480] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 07:57:00,978][06480] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 07:57:00,979][06480] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 07:57:01,011][06480] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:57:01,020][06480] RunningMeanStd input shape: (1,)
[2023-02-26 07:57:01,049][06480] ConvEncoder: input_channels=3
[2023-02-26 07:57:01,210][06480] Conv encoder output size: 512
[2023-02-26 07:57:01,212][06480] Policy head output size: 512
[2023-02-26 07:57:01,330][06480] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2023-02-26 07:57:02,310][06480] Num frames 100...
[2023-02-26 07:57:02,426][06480] Num frames 200...
[2023-02-26 07:57:02,543][06480] Num frames 300...
[2023-02-26 07:57:02,666][06480] Num frames 400...
[2023-02-26 07:57:02,779][06480] Num frames 500...
[2023-02-26 07:57:02,898][06480] Num frames 600...
[2023-02-26 07:57:03,016][06480] Avg episode rewards: #0: 13.400, true rewards: #0: 6.400
[2023-02-26 07:57:03,020][06480] Avg episode reward: 13.400, avg true_objective: 6.400
[2023-02-26 07:57:03,092][06480] Num frames 700...
[2023-02-26 07:57:03,205][06480] Num frames 800...
[2023-02-26 07:57:03,325][06480] Num frames 900...
[2023-02-26 07:57:03,439][06480] Num frames 1000...
[2023-02-26 07:57:03,558][06480] Num frames 1100...
[2023-02-26 07:57:03,672][06480] Num frames 1200...
[2023-02-26 07:57:03,784][06480] Num frames 1300...
[2023-02-26 07:57:03,898][06480] Num frames 1400...
[2023-02-26 07:57:04,013][06480] Num frames 1500...
[2023-02-26 07:57:04,128][06480] Num frames 1600...
[2023-02-26 07:57:04,260][06480] Avg episode rewards: #0: 17.820, true rewards: #0: 8.320
[2023-02-26 07:57:04,261][06480] Avg episode reward: 17.820, avg true_objective: 8.320
[2023-02-26 07:57:04,315][06480] Num frames 1700...
[2023-02-26 07:57:04,432][06480] Num frames 1800...
[2023-02-26 07:57:04,561][06480] Num frames 1900...
[2023-02-26 07:57:04,677][06480] Num frames 2000...
[2023-02-26 07:57:04,800][06480] Num frames 2100...
[2023-02-26 07:57:04,915][06480] Num frames 2200...
[2023-02-26 07:57:05,031][06480] Num frames 2300...
[2023-02-26 07:57:05,145][06480] Num frames 2400...
[2023-02-26 07:57:05,264][06480] Num frames 2500...
[2023-02-26 07:57:05,394][06480] Num frames 2600...
[2023-02-26 07:57:05,524][06480] Num frames 2700...
[2023-02-26 07:57:05,641][06480] Num frames 2800...
[2023-02-26 07:57:05,764][06480] Num frames 2900...
[2023-02-26 07:57:05,889][06480] Num frames 3000...
[2023-02-26 07:57:06,008][06480] Num frames 3100...
[2023-02-26 07:57:06,129][06480] Num frames 3200...
[2023-02-26 07:57:06,254][06480] Avg episode rewards: #0: 24.200, true rewards: #0: 10.867
[2023-02-26 07:57:06,256][06480] Avg episode reward: 24.200, avg true_objective: 10.867
[2023-02-26 07:57:06,309][06480] Num frames 3300...
[2023-02-26 07:57:06,425][06480] Num frames 3400...
[2023-02-26 07:57:06,545][06480] Num frames 3500...
[2023-02-26 07:57:06,658][06480] Num frames 3600...
[2023-02-26 07:57:06,772][06480] Num frames 3700...
[2023-02-26 07:57:06,893][06480] Num frames 3800...
[2023-02-26 07:57:07,009][06480] Num frames 3900...
[2023-02-26 07:57:07,122][06480] Num frames 4000...
[2023-02-26 07:57:07,241][06480] Num frames 4100...
[2023-02-26 07:57:07,361][06480] Num frames 4200...
[2023-02-26 07:57:07,475][06480] Num frames 4300...
[2023-02-26 07:57:07,609][06480] Num frames 4400...
[2023-02-26 07:57:07,723][06480] Num frames 4500...
[2023-02-26 07:57:07,845][06480] Num frames 4600...
[2023-02-26 07:57:07,960][06480] Num frames 4700...
[2023-02-26 07:57:08,039][06480] Avg episode rewards: #0: 28.532, true rewards: #0: 11.782
[2023-02-26 07:57:08,041][06480] Avg episode reward: 28.532, avg true_objective: 11.782
[2023-02-26 07:57:08,143][06480] Num frames 4800...
[2023-02-26 07:57:08,259][06480] Num frames 4900...
[2023-02-26 07:57:08,376][06480] Num frames 5000...
[2023-02-26 07:57:08,504][06480] Avg episode rewards: #0: 23.730, true rewards: #0: 10.130
[2023-02-26 07:57:08,507][06480] Avg episode reward: 23.730, avg true_objective: 10.130
[2023-02-26 07:57:08,550][06480] Num frames 5100...
[2023-02-26 07:57:08,674][06480] Num frames 5200...
[2023-02-26 07:57:08,791][06480] Num frames 5300...
[2023-02-26 07:57:08,915][06480] Num frames 5400...
[2023-02-26 07:57:09,030][06480] Num frames 5500...
[2023-02-26 07:57:09,158][06480] Num frames 5600...
[2023-02-26 07:57:09,273][06480] Num frames 5700...
[2023-02-26 07:57:09,397][06480] Num frames 5800...
[2023-02-26 07:57:09,511][06480] Num frames 5900...
[2023-02-26 07:57:09,632][06480] Num frames 6000...
[2023-02-26 07:57:09,790][06480] Avg episode rewards: #0: 24.977, true rewards: #0: 10.143
[2023-02-26 07:57:09,793][06480] Avg episode reward: 24.977, avg true_objective: 10.143
[2023-02-26 07:57:09,813][06480] Num frames 6100...
[2023-02-26 07:57:09,950][06480] Num frames 6200...
[2023-02-26 07:57:10,091][06480] Num frames 6300...
[2023-02-26 07:57:10,253][06480] Num frames 6400...
[2023-02-26 07:57:10,420][06480] Num frames 6500...
[2023-02-26 07:57:10,573][06480] Num frames 6600...
[2023-02-26 07:57:10,731][06480] Num frames 6700...
[2023-02-26 07:57:10,884][06480] Avg episode rewards: #0: 23.083, true rewards: #0: 9.654
[2023-02-26 07:57:10,887][06480] Avg episode reward: 23.083, avg true_objective: 9.654
[2023-02-26 07:57:10,955][06480] Num frames 6800...
[2023-02-26 07:57:11,126][06480] Num frames 6900...
[2023-02-26 07:57:11,291][06480] Num frames 7000...
[2023-02-26 07:57:11,451][06480] Num frames 7100...
[2023-02-26 07:57:11,603][06480] Num frames 7200...
[2023-02-26 07:57:11,778][06480] Avg episode rewards: #0: 21.462, true rewards: #0: 9.087
[2023-02-26 07:57:11,780][06480] Avg episode reward: 21.462, avg true_objective: 9.087
[2023-02-26 07:57:11,835][06480] Num frames 7300...
[2023-02-26 07:57:12,003][06480] Num frames 7400...
[2023-02-26 07:57:12,172][06480] Num frames 7500...
[2023-02-26 07:57:12,333][06480] Num frames 7600...
[2023-02-26 07:57:12,494][06480] Num frames 7700...
[2023-02-26 07:57:12,656][06480] Num frames 7800...
[2023-02-26 07:57:12,821][06480] Num frames 7900...
[2023-02-26 07:57:12,977][06480] Num frames 8000...
[2023-02-26 07:57:13,138][06480] Num frames 8100...
[2023-02-26 07:57:13,303][06480] Num frames 8200...
[2023-02-26 07:57:13,467][06480] Num frames 8300...
[2023-02-26 07:57:13,633][06480] Num frames 8400...
[2023-02-26 07:57:13,751][06480] Num frames 8500...
[2023-02-26 07:57:13,867][06480] Num frames 8600...
[2023-02-26 07:57:13,981][06480] Num frames 8700...
[2023-02-26 07:57:14,095][06480] Num frames 8800...
[2023-02-26 07:57:14,206][06480] Num frames 8900...
[2023-02-26 07:57:14,324][06480] Num frames 9000...
[2023-02-26 07:57:14,442][06480] Num frames 9100...
[2023-02-26 07:57:14,553][06480] Num frames 9200...
[2023-02-26 07:57:14,666][06480] Num frames 9300...
[2023-02-26 07:57:14,808][06480] Avg episode rewards: #0: 25.522, true rewards: #0: 10.411
[2023-02-26 07:57:14,810][06480] Avg episode reward: 25.522, avg true_objective: 10.411
[2023-02-26 07:57:14,849][06480] Num frames 9400...
[2023-02-26 07:57:14,964][06480] Num frames 9500...
[2023-02-26 07:57:15,081][06480] Num frames 9600...
[2023-02-26 07:57:15,190][06480] Num frames 9700...
[2023-02-26 07:57:15,316][06480] Num frames 9800...
[2023-02-26 07:57:15,432][06480] Num frames 9900...
[2023-02-26 07:57:15,539][06480] Num frames 10000...
[2023-02-26 07:57:15,648][06480] Num frames 10100...
[2023-02-26 07:57:15,750][06480] Avg episode rewards: #0: 24.438, true rewards: #0: 10.138
[2023-02-26 07:57:15,754][06480] Avg episode reward: 24.438, avg true_objective: 10.138
[2023-02-26 07:58:17,864][06480] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-26 07:58:32,539][06480] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 07:58:32,541][06480] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 07:58:32,543][06480] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 07:58:32,545][06480] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-26 07:58:32,548][06480] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-26 07:58:32,550][06480] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-26 07:58:32,552][06480] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-26 07:58:32,553][06480] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-26 07:58:32,555][06480] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-26 07:58:32,557][06480] Adding new argument 'hf_repository'='sd99/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-26 07:58:32,558][06480] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-26 07:58:32,560][06480] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-26 07:58:32,562][06480] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-26 07:58:32,563][06480] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-26 07:58:32,565][06480] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-26 07:58:32,587][06480] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 07:58:32,591][06480] RunningMeanStd input shape: (1,)
[2023-02-26 07:58:32,606][06480] ConvEncoder: input_channels=3
[2023-02-26 07:58:32,641][06480] Conv encoder output size: 512
[2023-02-26 07:58:32,643][06480] Policy head output size: 512
[2023-02-26 07:58:32,662][06480] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002443_10006528.pth...
[2023-02-26 07:58:33,110][06480] Num frames 100...
[2023-02-26 07:58:33,232][06480] Num frames 200...
[2023-02-26 07:58:33,351][06480] Num frames 300...
[2023-02-26 07:58:33,461][06480] Num frames 400...
[2023-02-26 07:58:33,575][06480] Num frames 500...
[2023-02-26 07:58:33,695][06480] Num frames 600...
[2023-02-26 07:58:33,761][06480] Avg episode rewards: #0: 9.080, true rewards: #0: 6.080
[2023-02-26 07:58:33,764][06480] Avg episode reward: 9.080, avg true_objective: 6.080
[2023-02-26 07:58:33,877][06480] Num frames 700...
[2023-02-26 07:58:33,997][06480] Num frames 800...
[2023-02-26 07:58:34,108][06480] Num frames 900...
[2023-02-26 07:58:34,223][06480] Num frames 1000...
[2023-02-26 07:58:34,343][06480] Num frames 1100...
[2023-02-26 07:58:34,459][06480] Num frames 1200...
[2023-02-26 07:58:34,582][06480] Num frames 1300...
[2023-02-26 07:58:34,705][06480] Num frames 1400...
[2023-02-26 07:58:34,827][06480] Num frames 1500...
[2023-02-26 07:58:34,947][06480] Num frames 1600...
[2023-02-26 07:58:35,111][06480] Avg episode rewards: #0: 17.320, true rewards: #0: 8.320
[2023-02-26 07:58:35,114][06480] Avg episode reward: 17.320, avg true_objective: 8.320
[2023-02-26 07:58:35,177][06480] Num frames 1700...
[2023-02-26 07:58:35,346][06480] Num frames 1800...
[2023-02-26 07:58:35,503][06480] Num frames 1900...
[2023-02-26 07:58:35,659][06480] Num frames 2000...
[2023-02-26 07:58:35,820][06480] Num frames 2100...
[2023-02-26 07:58:35,985][06480] Num frames 2200...
[2023-02-26 07:58:36,146][06480] Num frames 2300...
[2023-02-26 07:58:36,307][06480] Num frames 2400...
[2023-02-26 07:58:36,471][06480] Num frames 2500...
[2023-02-26 07:58:36,636][06480] Num frames 2600...
[2023-02-26 07:58:36,800][06480] Num frames 2700...
[2023-02-26 07:58:36,971][06480] Num frames 2800...
[2023-02-26 07:58:37,140][06480] Num frames 2900...
[2023-02-26 07:58:37,287][06480] Avg episode rewards: #0: 21.504, true rewards: #0: 9.837
[2023-02-26 07:58:37,290][06480] Avg episode reward: 21.504, avg true_objective: 9.837
[2023-02-26 07:58:37,377][06480] Num frames 3000...
[2023-02-26 07:58:37,545][06480] Num frames 3100...
[2023-02-26 07:58:37,706][06480] Num frames 3200...
[2023-02-26 07:58:37,875][06480] Num frames 3300...
[2023-02-26 07:58:38,048][06480] Num frames 3400...
[2023-02-26 07:58:38,216][06480] Num frames 3500...
[2023-02-26 07:58:38,388][06480] Num frames 3600...
[2023-02-26 07:58:38,532][06480] Num frames 3700...
[2023-02-26 07:58:38,647][06480] Num frames 3800...
[2023-02-26 07:58:38,762][06480] Num frames 3900...
[2023-02-26 07:58:38,872][06480] Num frames 4000...
[2023-02-26 07:58:38,996][06480] Num frames 4100...
[2023-02-26 07:58:39,119][06480] Num frames 4200...
[2023-02-26 07:58:39,237][06480] Num frames 4300...
[2023-02-26 07:58:39,361][06480] Avg episode rewards: #0: 24.398, true rewards: #0: 10.897
[2023-02-26 07:58:39,363][06480] Avg episode reward: 24.398, avg true_objective: 10.897
[2023-02-26 07:58:39,413][06480] Num frames 4400...
[2023-02-26 07:58:39,526][06480] Num frames 4500...
[2023-02-26 07:58:39,646][06480] Num frames 4600...
[2023-02-26 07:58:39,758][06480] Num frames 4700...
[2023-02-26 07:58:39,876][06480] Num frames 4800...
[2023-02-26 07:58:40,011][06480] Num frames 4900...
[2023-02-26 07:58:40,123][06480] Num frames 5000...
[2023-02-26 07:58:40,238][06480] Num frames 5100...
[2023-02-26 07:58:40,355][06480] Num frames 5200...
[2023-02-26 07:58:40,467][06480] Num frames 5300...
[2023-02-26 07:58:40,579][06480] Num frames 5400...
[2023-02-26 07:58:40,695][06480] Num frames 5500...
[2023-02-26 07:58:40,814][06480] Num frames 5600...
[2023-02-26 07:58:40,968][06480] Avg episode rewards: #0: 26.380, true rewards: #0: 11.380
[2023-02-26 07:58:40,970][06480] Avg episode reward: 26.380, avg true_objective: 11.380
[2023-02-26 07:58:40,986][06480] Num frames 5700...
[2023-02-26 07:58:41,095][06480] Num frames 5800...
[2023-02-26 07:58:41,211][06480] Num frames 5900...
[2023-02-26 07:58:41,331][06480] Num frames 6000...
[2023-02-26 07:58:41,455][06480] Num frames 6100...
[2023-02-26 07:58:41,587][06480] Num frames 6200...
[2023-02-26 07:58:41,710][06480] Num frames 6300...
[2023-02-26 07:58:41,836][06480] Num frames 6400...
[2023-02-26 07:58:41,958][06480] Num frames 6500...
[2023-02-26 07:58:42,086][06480] Num frames 6600...
[2023-02-26 07:58:42,204][06480] Num frames 6700...
[2023-02-26 07:58:42,337][06480] Num frames 6800...
[2023-02-26 07:58:42,456][06480] Num frames 6900...
[2023-02-26 07:58:42,574][06480] Num frames 7000...
[2023-02-26 07:58:42,674][06480] Avg episode rewards: #0: 26.887, true rewards: #0: 11.720
[2023-02-26 07:58:42,677][06480] Avg episode reward: 26.887, avg true_objective: 11.720
[2023-02-26 07:58:42,766][06480] Num frames 7100...
[2023-02-26 07:58:42,879][06480] Num frames 7200...
[2023-02-26 07:58:42,998][06480] Num frames 7300...
[2023-02-26 07:58:43,108][06480] Num frames 7400...
[2023-02-26 07:58:43,232][06480] Num frames 7500...
[2023-02-26 07:58:43,347][06480] Num frames 7600...
[2023-02-26 07:58:43,457][06480] Num frames 7700...
[2023-02-26 07:58:43,569][06480] Num frames 7800...
[2023-02-26 07:58:43,679][06480] Num frames 7900...
[2023-02-26 07:58:43,799][06480] Num frames 8000...
[2023-02-26 07:58:43,911][06480] Num frames 8100...
[2023-02-26 07:58:44,064][06480] Avg episode rewards: #0: 26.834, true rewards: #0: 11.691
[2023-02-26 07:58:44,066][06480] Avg episode reward: 26.834, avg true_objective: 11.691
[2023-02-26 07:58:44,087][06480] Num frames 8200...
[2023-02-26 07:58:44,202][06480] Num frames 8300...
[2023-02-26 07:58:44,315][06480] Num frames 8400...
[2023-02-26 07:58:44,429][06480] Num frames 8500...
[2023-02-26 07:58:44,540][06480] Num frames 8600...
[2023-02-26 07:58:44,655][06480] Num frames 8700...
[2023-02-26 07:58:44,767][06480] Num frames 8800...
[2023-02-26 07:58:44,880][06480] Num frames 8900...
[2023-02-26 07:58:45,012][06480] Num frames 9000...
[2023-02-26 07:58:45,132][06480] Num frames 9100...
[2023-02-26 07:58:45,245][06480] Num frames 9200...
[2023-02-26 07:58:45,359][06480] Num frames 9300...
[2023-02-26 07:58:45,479][06480] Num frames 9400...
[2023-02-26 07:58:45,591][06480] Num frames 9500...
[2023-02-26 07:58:45,702][06480] Num frames 9600...
[2023-02-26 07:58:45,822][06480] Num frames 9700...
[2023-02-26 07:58:45,933][06480] Num frames 9800...
[2023-02-26 07:58:46,080][06480] Avg episode rewards: #0: 28.100, true rewards: #0: 12.350
[2023-02-26 07:58:46,081][06480] Avg episode reward: 28.100, avg true_objective: 12.350
[2023-02-26 07:58:46,110][06480] Num frames 9900...
[2023-02-26 07:58:46,226][06480] Num frames 10000...
[2023-02-26 07:58:46,340][06480] Num frames 10100...
[2023-02-26 07:58:46,451][06480] Num frames 10200...
[2023-02-26 07:58:46,543][06480] Avg episode rewards: #0: 25.480, true rewards: #0: 11.369
[2023-02-26 07:58:46,545][06480] Avg episode reward: 25.480, avg true_objective: 11.369
[2023-02-26 07:58:46,627][06480] Num frames 10300...
[2023-02-26 07:58:46,737][06480] Num frames 10400...
[2023-02-26 07:58:46,856][06480] Num frames 10500...
[2023-02-26 07:58:46,970][06480] Num frames 10600...
[2023-02-26 07:58:47,089][06480] Num frames 10700...
[2023-02-26 07:58:47,218][06480] Num frames 10800...
[2023-02-26 07:58:47,334][06480] Num frames 10900...
[2023-02-26 07:58:47,450][06480] Num frames 11000...
[2023-02-26 07:58:47,563][06480] Num frames 11100...
[2023-02-26 07:58:47,683][06480] Num frames 11200...
[2023-02-26 07:58:47,800][06480] Num frames 11300...
[2023-02-26 07:58:47,921][06480] Num frames 11400...
[2023-02-26 07:58:48,092][06480] Avg episode rewards: #0: 25.995, true rewards: #0: 11.495
[2023-02-26 07:58:48,095][06480] Avg episode reward: 25.995, avg true_objective: 11.495
[2023-02-26 07:59:56,374][06480] Replay video saved to /content/train_dir/default_experiment/replay.mp4!