[2023-02-22 15:15:47,809][00422] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-22 15:15:47,812][00422] Rollout worker 0 uses device cpu
[2023-02-22 15:15:47,815][00422] Rollout worker 1 uses device cpu
[2023-02-22 15:15:47,818][00422] Rollout worker 2 uses device cpu
[2023-02-22 15:15:47,819][00422] Rollout worker 3 uses device cpu
[2023-02-22 15:15:47,822][00422] Rollout worker 4 uses device cpu
[2023-02-22 15:15:47,823][00422] Rollout worker 5 uses device cpu
[2023-02-22 15:15:47,824][00422] Rollout worker 6 uses device cpu
[2023-02-22 15:15:47,825][00422] Rollout worker 7 uses device cpu
[2023-02-22 15:15:48,018][00422] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 15:15:48,019][00422] InferenceWorker_p0-w0: min num requests: 2
[2023-02-22 15:15:48,052][00422] Starting all processes...
[2023-02-22 15:15:48,054][00422] Starting process learner_proc0
[2023-02-22 15:15:48,113][00422] Starting all processes...
[2023-02-22 15:15:48,124][00422] Starting process inference_proc0-0
[2023-02-22 15:15:48,124][00422] Starting process rollout_proc0
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc1
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc2
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc3
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc4
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc5
[2023-02-22 15:15:48,128][00422] Starting process rollout_proc6
[2023-02-22 15:15:48,129][00422] Starting process rollout_proc7
[2023-02-22 15:16:00,282][11055] Worker 3 uses CPU cores [1]
[2023-02-22 15:16:00,623][11037] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 15:16:00,628][11037] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-22 15:16:00,720][11051] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 15:16:00,720][11051] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-22 15:16:00,785][11052] Worker 0 uses CPU cores [0]
[2023-02-22 15:16:00,858][11053] Worker 1 uses CPU cores [1]
[2023-02-22 15:16:00,869][11057] Worker 4 uses CPU cores [0]
[2023-02-22 15:16:00,886][11054] Worker 2 uses CPU cores [0]
[2023-02-22 15:16:00,934][11059] Worker 6 uses CPU cores [0]
[2023-02-22 15:16:00,944][11056] Worker 5 uses CPU cores [1]
[2023-02-22 15:16:00,961][11058] Worker 7 uses CPU cores [1]
[2023-02-22 15:16:01,409][11051] Num visible devices: 1
[2023-02-22 15:16:01,409][11037] Num visible devices: 1
[2023-02-22 15:16:01,418][11037] Starting seed is not provided
[2023-02-22 15:16:01,419][11037] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 15:16:01,419][11037] Initializing actor-critic model on device cuda:0
[2023-02-22 15:16:01,420][11037] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 15:16:01,422][11037] RunningMeanStd input shape: (1,)
[2023-02-22 15:16:01,434][11037] ConvEncoder: input_channels=3
[2023-02-22 15:16:01,706][11037] Conv encoder output size: 512
[2023-02-22 15:16:01,706][11037] Policy head output size: 512
[2023-02-22 15:16:01,752][11037] Created Actor Critic model with architecture:
[2023-02-22 15:16:01,752][11037] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-22 15:16:08,014][00422] Heartbeat connected on Batcher_0
[2023-02-22 15:16:08,018][00422] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-22 15:16:08,029][00422] Heartbeat connected on RolloutWorker_w0
[2023-02-22 15:16:08,033][00422] Heartbeat connected on RolloutWorker_w1
[2023-02-22 15:16:08,036][00422] Heartbeat connected on RolloutWorker_w2
[2023-02-22 15:16:08,039][00422] Heartbeat connected on RolloutWorker_w3
[2023-02-22 15:16:08,043][00422] Heartbeat connected on RolloutWorker_w4
[2023-02-22 15:16:08,046][00422] Heartbeat connected on RolloutWorker_w5
[2023-02-22 15:16:08,050][00422] Heartbeat connected on RolloutWorker_w6
[2023-02-22 15:16:08,054][00422] Heartbeat connected on RolloutWorker_w7
[2023-02-22 15:16:09,184][11037] Using optimizer
[2023-02-22 15:16:09,185][11037] No checkpoints found
[2023-02-22 15:16:09,185][11037] Did not load from checkpoint, starting from scratch!
[2023-02-22 15:16:09,186][11037] Initialized policy 0 weights for model version 0
[2023-02-22 15:16:09,195][11037] LearnerWorker_p0 finished initialization!
[2023-02-22 15:16:09,195][11037] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 15:16:09,201][00422] Heartbeat connected on LearnerWorker_p0
[2023-02-22 15:16:09,372][11051] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 15:16:09,374][11051] RunningMeanStd input shape: (1,)
[2023-02-22 15:16:09,394][11051] ConvEncoder: input_channels=3
[2023-02-22 15:16:09,559][11051] Conv encoder output size: 512
[2023-02-22 15:16:09,561][11051] Policy head output size: 512
[2023-02-22 15:16:12,300][00422] Inference worker 0-0 is ready!
[2023-02-22 15:16:12,301][00422] All inference workers are ready! Signal rollout workers to start!
[2023-02-22 15:16:12,405][11052] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,426][11059] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,430][11057] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,449][11058] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,461][11055] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,472][11054] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,479][11053] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:12,489][11056] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 15:16:13,331][11058] Decorrelating experience for 0 frames...
[2023-02-22 15:16:13,332][11055] Decorrelating experience for 0 frames...
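For readers skimming the module printout above, here is a minimal PyTorch sketch of what that tree corresponds to. The conv filter sizes are an assumption (Sample Factory's default three-layer "simple" convnet); the log itself only confirms the layer types (three Conv2d+ELU blocks and a Linear+ELU projection), the 512-dim encoder/core widths, the GRU(512, 512) core, and the 1-dim critic / 5-way action heads.

```python
# Hedged sketch of the printed ActorCriticSharedWeights tree.
# Assumptions: conv filters 32x8s4 / 64x4s2 / 128x3s2 (not shown in the log).
import torch
import torch.nn as nn


class ActorCriticSketch(nn.Module):
    def __init__(self, obs_shape=(3, 72, 128), hidden_size=512, num_actions=5):
        super().__init__()
        c, h, w = obs_shape
        # conv_head: three Conv2d+ELU blocks, as in the printed ConvEncoderImpl
        self.conv_head = nn.Sequential(
            nn.Conv2d(c, 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():
            conv_out = self.conv_head(torch.zeros(1, c, h, w)).flatten(1).shape[1]
        # mlp_layers: Linear+ELU projecting flattened conv features to 512
        self.mlp = nn.Sequential(nn.Linear(conv_out, hidden_size), nn.ELU())
        # core: single-layer GRU(512, 512); decoder is Identity in the printout
        self.core = nn.GRU(hidden_size, hidden_size)
        self.critic_linear = nn.Linear(hidden_size, 1)
        self.distribution_linear = nn.Linear(hidden_size, num_actions)

    def forward(self, obs, rnn_state):
        x = self.mlp(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state


# Smoke test with the logged observation shape (3, 72, 128):
model = ActorCriticSketch()
logits, value, h = model(torch.zeros(4, 3, 72, 128), torch.zeros(1, 4, 512))
print(logits.shape, value.shape)  # torch.Size([4, 5]) torch.Size([4, 1])
```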
[2023-02-22 15:16:13,584][11052] Decorrelating experience for 0 frames... [2023-02-22 15:16:13,589][11057] Decorrelating experience for 0 frames... [2023-02-22 15:16:13,593][11059] Decorrelating experience for 0 frames... [2023-02-22 15:16:13,654][00422] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 15:16:14,274][11058] Decorrelating experience for 32 frames... [2023-02-22 15:16:14,289][11053] Decorrelating experience for 0 frames... [2023-02-22 15:16:14,643][11055] Decorrelating experience for 32 frames... [2023-02-22 15:16:14,650][11054] Decorrelating experience for 0 frames... [2023-02-22 15:16:14,653][11052] Decorrelating experience for 32 frames... [2023-02-22 15:16:14,655][11059] Decorrelating experience for 32 frames... [2023-02-22 15:16:15,297][11053] Decorrelating experience for 32 frames... [2023-02-22 15:16:15,831][11054] Decorrelating experience for 32 frames... [2023-02-22 15:16:15,835][11057] Decorrelating experience for 32 frames... [2023-02-22 15:16:15,897][11058] Decorrelating experience for 64 frames... [2023-02-22 15:16:16,064][11052] Decorrelating experience for 64 frames... [2023-02-22 15:16:16,120][11055] Decorrelating experience for 64 frames... [2023-02-22 15:16:16,749][11056] Decorrelating experience for 0 frames... [2023-02-22 15:16:17,073][11059] Decorrelating experience for 64 frames... [2023-02-22 15:16:17,266][11054] Decorrelating experience for 64 frames... [2023-02-22 15:16:17,281][11053] Decorrelating experience for 64 frames... [2023-02-22 15:16:17,285][11057] Decorrelating experience for 64 frames... [2023-02-22 15:16:17,376][11058] Decorrelating experience for 96 frames... [2023-02-22 15:16:17,561][11055] Decorrelating experience for 96 frames... [2023-02-22 15:16:18,359][11059] Decorrelating experience for 96 frames... [2023-02-22 15:16:18,614][11052] Decorrelating experience for 96 frames... [2023-02-22 15:16:18,654][00422] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 15:16:18,672][11053] Decorrelating experience for 96 frames... [2023-02-22 15:16:18,692][11057] Decorrelating experience for 96 frames... [2023-02-22 15:16:18,782][11056] Decorrelating experience for 32 frames... [2023-02-22 15:16:19,250][11056] Decorrelating experience for 64 frames... [2023-02-22 15:16:19,602][11054] Decorrelating experience for 96 frames... [2023-02-22 15:16:19,678][11056] Decorrelating experience for 96 frames... [2023-02-22 15:16:23,657][00422] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 7.8. Samples: 78. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-22 15:16:23,660][00422] Avg episode reward: [(0, '1.189')] [2023-02-22 15:16:25,935][11037] Signal inference workers to stop experience collection... [2023-02-22 15:16:25,947][11051] InferenceWorker_p0-w0: stopping experience collection [2023-02-22 15:16:28,375][11037] Signal inference workers to resume experience collection... [2023-02-22 15:16:28,376][11051] InferenceWorker_p0-w0: resuming experience collection [2023-02-22 15:16:28,654][00422] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 144.7. Samples: 2170. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-22 15:16:28,660][00422] Avg episode reward: [(0, '2.029')] [2023-02-22 15:16:33,654][00422] Fps is (10 sec: 2048.6, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 20480. Throughput: 0: 294.2. Samples: 5884. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-22 15:16:33,659][00422] Avg episode reward: [(0, '3.452')] [2023-02-22 15:16:37,692][11051] Updated weights for policy 0, policy_version 10 (0.0016) [2023-02-22 15:16:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 40960. Throughput: 0: 363.8. Samples: 9096. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-22 15:16:38,656][00422] Avg episode reward: [(0, '4.180')] [2023-02-22 15:16:43,656][00422] Fps is (10 sec: 3685.4, 60 sec: 1911.3, 300 sec: 1911.3). Total num frames: 57344. Throughput: 0: 481.3. Samples: 14440. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) [2023-02-22 15:16:43,664][00422] Avg episode reward: [(0, '4.370')] [2023-02-22 15:16:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 523.6. Samples: 18326. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:16:48,657][00422] Avg episode reward: [(0, '4.307')] [2023-02-22 15:16:51,897][11051] Updated weights for policy 0, policy_version 20 (0.0027) [2023-02-22 15:16:53,654][00422] Fps is (10 sec: 2867.8, 60 sec: 2150.4, 300 sec: 2150.4). Total num frames: 86016. Throughput: 0: 512.3. Samples: 20494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:16:53,656][00422] Avg episode reward: [(0, '4.412')] [2023-02-22 15:16:58,654][00422] Fps is (10 sec: 3686.4, 60 sec: 2366.6, 300 sec: 2366.6). Total num frames: 106496. Throughput: 0: 596.0. Samples: 26820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:16:58,659][00422] Avg episode reward: [(0, '4.383')] [2023-02-22 15:16:58,715][11037] Saving new best policy, reward=4.383! [2023-02-22 15:17:02,131][11051] Updated weights for policy 0, policy_version 30 (0.0020) [2023-02-22 15:17:03,659][00422] Fps is (10 sec: 4093.8, 60 sec: 2539.2, 300 sec: 2539.2). Total num frames: 126976. Throughput: 0: 712.8. Samples: 32078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:17:03,662][00422] Avg episode reward: [(0, '4.439')] [2023-02-22 15:17:03,673][11037] Saving new best policy, reward=4.439! [2023-02-22 15:17:08,654][00422] Fps is (10 sec: 3276.8, 60 sec: 2532.1, 300 sec: 2532.1). Total num frames: 139264. Throughput: 0: 753.7. Samples: 33992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:17:08,657][00422] Avg episode reward: [(0, '4.481')] [2023-02-22 15:17:08,659][11037] Saving new best policy, reward=4.481! [2023-02-22 15:17:13,654][00422] Fps is (10 sec: 2868.8, 60 sec: 2594.1, 300 sec: 2594.1). Total num frames: 155648. Throughput: 0: 798.1. Samples: 38086. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:17:13,657][00422] Avg episode reward: [(0, '4.535')] [2023-02-22 15:17:13,671][11037] Saving new best policy, reward=4.535! [2023-02-22 15:17:15,563][11051] Updated weights for policy 0, policy_version 40 (0.0016) [2023-02-22 15:17:18,654][00422] Fps is (10 sec: 3686.4, 60 sec: 2935.5, 300 sec: 2709.7). Total num frames: 176128. Throughput: 0: 858.0. Samples: 44496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:17:18,657][00422] Avg episode reward: [(0, '4.571')] [2023-02-22 15:17:18,661][11037] Saving new best policy, reward=4.571! 
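The recurring "Fps is (10 sec: …, 60 sec: …, 300 sec: …)" lines look like sliding-window throughput averages, with the 300-second window capped at the time since the first report (15:16:13.654). This reading is inferred from the numbers themselves, not from Sample Factory's source; a quick check against the 15:17:18 report above:

```python
# Frame counts taken from the 15:17:18, 15:17:08, and 15:16:18 reports.
frames_now, frames_10s_ago, frames_60s_ago = 176128, 139264, 0
elapsed_since_start = 65.0  # 15:16:13.654 -> 15:17:18.654

print((frames_now - frames_10s_ago) / 10)   # 3686.4  -> "10 sec: 3686.4"
print((frames_now - frames_60s_ago) / 60)   # ~2935.5 -> "60 sec: 2935.5"
print(frames_now / elapsed_since_start)     # ~2709.7 -> "300 sec: 2709.7"
```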
[2023-02-22 15:17:23,655][00422] Fps is (10 sec: 3686.1, 60 sec: 3208.6, 300 sec: 2750.1). Total num frames: 192512. Throughput: 0: 855.8. Samples: 47606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:17:23,657][00422] Avg episode reward: [(0, '4.536')] [2023-02-22 15:17:27,107][11051] Updated weights for policy 0, policy_version 50 (0.0019) [2023-02-22 15:17:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2730.7). Total num frames: 204800. Throughput: 0: 829.1. Samples: 51746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:17:28,662][00422] Avg episode reward: [(0, '4.540')] [2023-02-22 15:17:33,654][00422] Fps is (10 sec: 2867.5, 60 sec: 3345.1, 300 sec: 2764.8). Total num frames: 221184. Throughput: 0: 838.2. Samples: 56044. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:17:33,658][00422] Avg episode reward: [(0, '4.474')] [2023-02-22 15:17:38,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 2843.1). Total num frames: 241664. Throughput: 0: 861.7. Samples: 59270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:17:38,661][00422] Avg episode reward: [(0, '4.546')] [2023-02-22 15:17:38,803][11051] Updated weights for policy 0, policy_version 60 (0.0028) [2023-02-22 15:17:43,656][00422] Fps is (10 sec: 3685.4, 60 sec: 3345.1, 300 sec: 2867.1). Total num frames: 258048. Throughput: 0: 843.4. Samples: 64776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:17:43,659][00422] Avg episode reward: [(0, '4.675')] [2023-02-22 15:17:43,682][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth... [2023-02-22 15:17:43,954][11037] Saving new best policy, reward=4.675! [2023-02-22 15:17:48,654][00422] Fps is (10 sec: 2048.0, 60 sec: 3208.5, 300 sec: 2759.4). Total num frames: 262144. Throughput: 0: 768.4. Samples: 66650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:17:48,656][00422] Avg episode reward: [(0, '4.670')] [2023-02-22 15:17:53,654][00422] Fps is (10 sec: 2048.5, 60 sec: 3208.5, 300 sec: 2785.3). Total num frames: 278528. Throughput: 0: 772.7. Samples: 68764. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:17:53,656][00422] Avg episode reward: [(0, '4.506')] [2023-02-22 15:17:55,342][11051] Updated weights for policy 0, policy_version 70 (0.0028) [2023-02-22 15:17:58,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3208.5, 300 sec: 2847.7). Total num frames: 299008. Throughput: 0: 804.9. Samples: 74306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:17:58,657][00422] Avg episode reward: [(0, '4.334')] [2023-02-22 15:18:03,661][00422] Fps is (10 sec: 4093.0, 60 sec: 3208.4, 300 sec: 2904.2). Total num frames: 319488. Throughput: 0: 802.2. Samples: 80602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:18:03,665][00422] Avg episode reward: [(0, '4.460')] [2023-02-22 15:18:06,082][11051] Updated weights for policy 0, policy_version 80 (0.0024) [2023-02-22 15:18:08,657][00422] Fps is (10 sec: 3275.9, 60 sec: 3208.4, 300 sec: 2884.9). Total num frames: 331776. Throughput: 0: 775.3. Samples: 82494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:08,659][00422] Avg episode reward: [(0, '4.451')] [2023-02-22 15:18:13,655][00422] Fps is (10 sec: 2459.1, 60 sec: 3140.2, 300 sec: 2867.2). Total num frames: 344064. Throughput: 0: 772.0. Samples: 86488. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:18:13,665][00422] Avg episode reward: [(0, '4.554')] [2023-02-22 15:18:18,654][00422] Fps is (10 sec: 3277.8, 60 sec: 3140.3, 300 sec: 2916.4). Total num frames: 364544. Throughput: 0: 799.1. Samples: 92004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:18:18,660][00422] Avg episode reward: [(0, '4.462')] [2023-02-22 15:18:19,090][11051] Updated weights for policy 0, policy_version 90 (0.0015) [2023-02-22 15:18:23,654][00422] Fps is (10 sec: 4096.6, 60 sec: 3208.6, 300 sec: 2961.7). Total num frames: 385024. Throughput: 0: 797.4. Samples: 95154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:23,659][00422] Avg episode reward: [(0, '4.483')] [2023-02-22 15:18:28,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3276.8, 300 sec: 2973.4). Total num frames: 401408. Throughput: 0: 792.4. Samples: 100430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:28,656][00422] Avg episode reward: [(0, '4.485')] [2023-02-22 15:18:30,924][11051] Updated weights for policy 0, policy_version 100 (0.0018) [2023-02-22 15:18:33,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2955.0). Total num frames: 413696. Throughput: 0: 842.7. Samples: 104570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:33,665][00422] Avg episode reward: [(0, '4.645')] [2023-02-22 15:18:38,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3208.5, 300 sec: 2994.3). Total num frames: 434176. Throughput: 0: 854.5. Samples: 107216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:38,659][00422] Avg episode reward: [(0, '4.736')] [2023-02-22 15:18:38,663][11037] Saving new best policy, reward=4.736! [2023-02-22 15:18:42,295][11051] Updated weights for policy 0, policy_version 110 (0.0024) [2023-02-22 15:18:43,654][00422] Fps is (10 sec: 4095.9, 60 sec: 3276.9, 300 sec: 3031.0). Total num frames: 454656. Throughput: 0: 866.4. Samples: 113292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:18:43,657][00422] Avg episode reward: [(0, '4.702')] [2023-02-22 15:18:48,658][00422] Fps is (10 sec: 3684.7, 60 sec: 3481.3, 300 sec: 3038.9). Total num frames: 471040. Throughput: 0: 843.2. Samples: 118544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:48,661][00422] Avg episode reward: [(0, '4.373')] [2023-02-22 15:18:53,654][00422] Fps is (10 sec: 2867.3, 60 sec: 3413.4, 300 sec: 3020.8). Total num frames: 483328. Throughput: 0: 848.1. Samples: 120658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:18:53,658][00422] Avg episode reward: [(0, '4.394')] [2023-02-22 15:18:55,222][11051] Updated weights for policy 0, policy_version 120 (0.0024) [2023-02-22 15:18:58,654][00422] Fps is (10 sec: 3278.3, 60 sec: 3413.3, 300 sec: 3053.4). Total num frames: 503808. Throughput: 0: 866.6. Samples: 125484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:18:58,665][00422] Avg episode reward: [(0, '4.573')] [2023-02-22 15:19:03,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3413.8, 300 sec: 3084.0). Total num frames: 524288. Throughput: 0: 888.1. Samples: 131968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:19:03,657][00422] Avg episode reward: [(0, '5.088')] [2023-02-22 15:19:03,669][11037] Saving new best policy, reward=5.088! [2023-02-22 15:19:05,051][11051] Updated weights for policy 0, policy_version 130 (0.0023) [2023-02-22 15:19:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.8, 300 sec: 3089.6). Total num frames: 540672. Throughput: 0: 884.1. 
Samples: 134938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:19:08,657][00422] Avg episode reward: [(0, '4.992')] [2023-02-22 15:19:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3094.8). Total num frames: 557056. Throughput: 0: 858.5. Samples: 139060. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:19:13,657][00422] Avg episode reward: [(0, '4.817')] [2023-02-22 15:19:18,503][11051] Updated weights for policy 0, policy_version 140 (0.0030) [2023-02-22 15:19:18,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3099.7). Total num frames: 573440. Throughput: 0: 872.0. Samples: 143808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:19:18,661][00422] Avg episode reward: [(0, '4.555')] [2023-02-22 15:19:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3125.9). Total num frames: 593920. Throughput: 0: 885.9. Samples: 147080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:19:23,662][00422] Avg episode reward: [(0, '4.655')] [2023-02-22 15:19:28,522][11051] Updated weights for policy 0, policy_version 150 (0.0028) [2023-02-22 15:19:28,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3150.8). Total num frames: 614400. Throughput: 0: 891.9. Samples: 153428. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:19:28,658][00422] Avg episode reward: [(0, '4.548')] [2023-02-22 15:19:33,658][00422] Fps is (10 sec: 3275.3, 60 sec: 3549.6, 300 sec: 3133.4). Total num frames: 626688. Throughput: 0: 867.2. Samples: 157570. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:19:33,661][00422] Avg episode reward: [(0, '4.560')] [2023-02-22 15:19:38,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3136.9). Total num frames: 643072. Throughput: 0: 866.0. Samples: 159626. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:19:38,657][00422] Avg episode reward: [(0, '4.605')] [2023-02-22 15:19:41,030][11051] Updated weights for policy 0, policy_version 160 (0.0025) [2023-02-22 15:19:43,654][00422] Fps is (10 sec: 3688.1, 60 sec: 3481.6, 300 sec: 3159.8). Total num frames: 663552. Throughput: 0: 897.9. Samples: 165890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:19:43,657][00422] Avg episode reward: [(0, '4.701')] [2023-02-22 15:19:43,670][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000162_663552.pth... [2023-02-22 15:19:48,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.1, 300 sec: 3181.5). Total num frames: 684032. Throughput: 0: 887.3. Samples: 171896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:19:48,661][00422] Avg episode reward: [(0, '4.468')] [2023-02-22 15:19:52,286][11051] Updated weights for policy 0, policy_version 170 (0.0030) [2023-02-22 15:19:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3165.1). Total num frames: 696320. Throughput: 0: 866.8. Samples: 173944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:19:53,660][00422] Avg episode reward: [(0, '4.620')] [2023-02-22 15:19:58,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3167.6). Total num frames: 712704. Throughput: 0: 865.6. Samples: 178010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:19:58,660][00422] Avg episode reward: [(0, '4.610')] [2023-02-22 15:20:03,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3187.8). Total num frames: 733184. Throughput: 0: 903.5. Samples: 184466. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:20:03,657][00422] Avg episode reward: [(0, '4.621')] [2023-02-22 15:20:03,745][11051] Updated weights for policy 0, policy_version 180 (0.0026) [2023-02-22 15:20:08,658][00422] Fps is (10 sec: 4094.1, 60 sec: 3549.6, 300 sec: 3207.0). Total num frames: 753664. Throughput: 0: 901.2. Samples: 187640. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:08,661][00422] Avg episode reward: [(0, '4.694')] [2023-02-22 15:20:13,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3208.5). Total num frames: 770048. Throughput: 0: 864.8. Samples: 192342. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:20:13,660][00422] Avg episode reward: [(0, '4.604')] [2023-02-22 15:20:16,285][11051] Updated weights for policy 0, policy_version 190 (0.0012) [2023-02-22 15:20:18,654][00422] Fps is (10 sec: 2868.5, 60 sec: 3481.6, 300 sec: 3193.2). Total num frames: 782336. Throughput: 0: 865.3. Samples: 196506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:20:18,660][00422] Avg episode reward: [(0, '4.803')] [2023-02-22 15:20:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3227.6). Total num frames: 806912. Throughput: 0: 891.5. Samples: 199742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:20:23,658][00422] Avg episode reward: [(0, '4.644')] [2023-02-22 15:20:26,373][11051] Updated weights for policy 0, policy_version 200 (0.0027) [2023-02-22 15:20:28,673][00422] Fps is (10 sec: 4496.7, 60 sec: 3548.7, 300 sec: 3244.4). Total num frames: 827392. Throughput: 0: 900.3. Samples: 206422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:20:28,684][00422] Avg episode reward: [(0, '4.665')] [2023-02-22 15:20:33,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.9, 300 sec: 3213.8). Total num frames: 835584. Throughput: 0: 848.4. Samples: 210074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:20:33,659][00422] Avg episode reward: [(0, '4.601')] [2023-02-22 15:20:38,654][00422] Fps is (10 sec: 2052.1, 60 sec: 3413.3, 300 sec: 3199.5). Total num frames: 847872. Throughput: 0: 840.8. Samples: 211780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:38,656][00422] Avg episode reward: [(0, '4.574')] [2023-02-22 15:20:43,024][11051] Updated weights for policy 0, policy_version 210 (0.0033) [2023-02-22 15:20:43,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3185.8). Total num frames: 860160. Throughput: 0: 826.9. Samples: 215220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:43,657][00422] Avg episode reward: [(0, '4.463')] [2023-02-22 15:20:48,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3202.3). Total num frames: 880640. Throughput: 0: 810.2. Samples: 220926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:48,656][00422] Avg episode reward: [(0, '4.834')] [2023-02-22 15:20:52,746][11051] Updated weights for policy 0, policy_version 220 (0.0015) [2023-02-22 15:20:53,659][00422] Fps is (10 sec: 4503.3, 60 sec: 3481.3, 300 sec: 3232.9). Total num frames: 905216. Throughput: 0: 814.1. Samples: 224276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:53,662][00422] Avg episode reward: [(0, '4.825')] [2023-02-22 15:20:58,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3219.3). Total num frames: 917504. Throughput: 0: 829.8. Samples: 229684. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:20:58,660][00422] Avg episode reward: [(0, '4.821')] [2023-02-22 15:21:03,654][00422] Fps is (10 sec: 2868.7, 60 sec: 3345.1, 300 sec: 3220.3). Total num frames: 933888. Throughput: 0: 830.0. Samples: 233858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:21:03,656][00422] Avg episode reward: [(0, '4.892')] [2023-02-22 15:21:05,865][11051] Updated weights for policy 0, policy_version 230 (0.0014) [2023-02-22 15:21:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3345.3, 300 sec: 3235.1). Total num frames: 954368. Throughput: 0: 819.1. Samples: 236602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:21:08,657][00422] Avg episode reward: [(0, '5.126')] [2023-02-22 15:21:08,662][11037] Saving new best policy, reward=5.126! [2023-02-22 15:21:13,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 815.3. Samples: 243096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:21:13,659][00422] Avg episode reward: [(0, '4.924')] [2023-02-22 15:21:15,574][11051] Updated weights for policy 0, policy_version 240 (0.0028) [2023-02-22 15:21:18,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3360.1). Total num frames: 991232. Throughput: 0: 849.6. Samples: 248306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:21:18,659][00422] Avg episode reward: [(0, '4.749')] [2023-02-22 15:21:23,654][00422] Fps is (10 sec: 2867.0, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 1003520. Throughput: 0: 856.0. Samples: 250300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:21:23,660][00422] Avg episode reward: [(0, '4.824')] [2023-02-22 15:21:28,476][11051] Updated weights for policy 0, policy_version 250 (0.0026) [2023-02-22 15:21:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3277.9, 300 sec: 3401.8). Total num frames: 1024000. Throughput: 0: 891.5. Samples: 255336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:21:28,660][00422] Avg episode reward: [(0, '4.965')] [2023-02-22 15:21:33,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1044480. Throughput: 0: 913.4. Samples: 262030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:21:33,656][00422] Avg episode reward: [(0, '4.951')] [2023-02-22 15:21:38,659][00422] Fps is (10 sec: 3684.4, 60 sec: 3549.5, 300 sec: 3401.7). Total num frames: 1060864. Throughput: 0: 905.2. Samples: 265012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:21:38,666][00422] Avg episode reward: [(0, '4.857')] [2023-02-22 15:21:38,952][11051] Updated weights for policy 0, policy_version 260 (0.0021) [2023-02-22 15:21:43,656][00422] Fps is (10 sec: 3275.9, 60 sec: 3618.0, 300 sec: 3415.6). Total num frames: 1077248. Throughput: 0: 880.3. Samples: 269302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:21:43,662][00422] Avg episode reward: [(0, '4.956')] [2023-02-22 15:21:43,679][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000263_1077248.pth... [2023-02-22 15:21:43,825][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth [2023-02-22 15:21:48,654][00422] Fps is (10 sec: 3688.4, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 1097728. Throughput: 0: 904.9. Samples: 274580. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:21:48,659][00422] Avg episode reward: [(0, '4.882')] [2023-02-22 15:21:50,357][11051] Updated weights for policy 0, policy_version 270 (0.0025) [2023-02-22 15:21:53,654][00422] Fps is (10 sec: 4097.2, 60 sec: 3550.2, 300 sec: 3429.5). Total num frames: 1118208. Throughput: 0: 917.4. Samples: 277886. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:21:53,661][00422] Avg episode reward: [(0, '5.243')] [2023-02-22 15:21:53,673][11037] Saving new best policy, reward=5.243! [2023-02-22 15:21:58,656][00422] Fps is (10 sec: 3685.6, 60 sec: 3618.0, 300 sec: 3415.7). Total num frames: 1134592. Throughput: 0: 909.9. Samples: 284042. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:21:58,659][00422] Avg episode reward: [(0, '5.340')] [2023-02-22 15:21:58,665][11037] Saving new best policy, reward=5.340! [2023-02-22 15:22:01,889][11051] Updated weights for policy 0, policy_version 280 (0.0015) [2023-02-22 15:22:03,655][00422] Fps is (10 sec: 3276.3, 60 sec: 3618.0, 300 sec: 3429.5). Total num frames: 1150976. Throughput: 0: 885.1. Samples: 288138. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:22:03,662][00422] Avg episode reward: [(0, '5.375')] [2023-02-22 15:22:03,675][11037] Saving new best policy, reward=5.375! [2023-02-22 15:22:08,654][00422] Fps is (10 sec: 3277.4, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1167360. Throughput: 0: 888.0. Samples: 290260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:22:08,657][00422] Avg episode reward: [(0, '4.933')] [2023-02-22 15:22:13,196][11051] Updated weights for policy 0, policy_version 290 (0.0018) [2023-02-22 15:22:13,654][00422] Fps is (10 sec: 3686.8, 60 sec: 3549.8, 300 sec: 3429.5). Total num frames: 1187840. Throughput: 0: 916.3. Samples: 296568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:22:13,662][00422] Avg episode reward: [(0, '5.036')] [2023-02-22 15:22:18,658][00422] Fps is (10 sec: 4094.2, 60 sec: 3617.9, 300 sec: 3443.4). Total num frames: 1208320. Throughput: 0: 903.2. Samples: 302676. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:22:18,661][00422] Avg episode reward: [(0, '5.258')] [2023-02-22 15:22:23,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3443.4). Total num frames: 1220608. Throughput: 0: 881.8. Samples: 304690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:22:23,657][00422] Avg episode reward: [(0, '5.426')] [2023-02-22 15:22:23,668][11037] Saving new best policy, reward=5.426! [2023-02-22 15:22:25,601][11051] Updated weights for policy 0, policy_version 300 (0.0025) [2023-02-22 15:22:28,654][00422] Fps is (10 sec: 2868.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1236992. Throughput: 0: 877.1. Samples: 308768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:22:28,664][00422] Avg episode reward: [(0, '5.325')] [2023-02-22 15:22:33,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1257472. Throughput: 0: 900.8. Samples: 315116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:22:33,661][00422] Avg episode reward: [(0, '5.844')] [2023-02-22 15:22:33,674][11037] Saving new best policy, reward=5.844! [2023-02-22 15:22:35,950][11051] Updated weights for policy 0, policy_version 310 (0.0022) [2023-02-22 15:22:38,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3618.5, 300 sec: 3457.3). Total num frames: 1277952. Throughput: 0: 898.8. Samples: 318330. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:22:38,661][00422] Avg episode reward: [(0, '6.309')] [2023-02-22 15:22:38,666][11037] Saving new best policy, reward=6.309! [2023-02-22 15:22:43,657][00422] Fps is (10 sec: 3275.6, 60 sec: 3549.8, 300 sec: 3485.0). Total num frames: 1290240. Throughput: 0: 867.7. Samples: 323090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:22:43,666][00422] Avg episode reward: [(0, '5.892')] [2023-02-22 15:22:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1306624. Throughput: 0: 870.8. Samples: 327324. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:22:48,657][00422] Avg episode reward: [(0, '5.811')] [2023-02-22 15:22:49,053][11051] Updated weights for policy 0, policy_version 320 (0.0030) [2023-02-22 15:22:53,654][00422] Fps is (10 sec: 3687.6, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1327104. Throughput: 0: 895.7. Samples: 330568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:22:53,661][00422] Avg episode reward: [(0, '5.746')] [2023-02-22 15:22:58,522][11051] Updated weights for policy 0, policy_version 330 (0.0020) [2023-02-22 15:22:58,654][00422] Fps is (10 sec: 4505.5, 60 sec: 3618.3, 300 sec: 3499.0). Total num frames: 1351680. Throughput: 0: 903.5. Samples: 337224. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:22:58,657][00422] Avg episode reward: [(0, '5.972')] [2023-02-22 15:23:03,654][00422] Fps is (10 sec: 3686.5, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 1363968. Throughput: 0: 875.0. Samples: 342046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:23:03,660][00422] Avg episode reward: [(0, '5.822')] [2023-02-22 15:23:08,655][00422] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3512.8). Total num frames: 1380352. Throughput: 0: 875.5. Samples: 344088. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:23:08,659][00422] Avg episode reward: [(0, '5.839')] [2023-02-22 15:23:11,583][11051] Updated weights for policy 0, policy_version 340 (0.0018) [2023-02-22 15:23:13,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1400832. Throughput: 0: 904.0. Samples: 349446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:23:13,656][00422] Avg episode reward: [(0, '6.115')] [2023-02-22 15:23:18,654][00422] Fps is (10 sec: 4096.4, 60 sec: 3550.1, 300 sec: 3512.8). Total num frames: 1421312. Throughput: 0: 911.6. Samples: 356136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:18,659][00422] Avg episode reward: [(0, '6.194')] [2023-02-22 15:23:21,640][11051] Updated weights for policy 0, policy_version 350 (0.0015) [2023-02-22 15:23:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 1437696. Throughput: 0: 895.3. Samples: 358620. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:23:23,657][00422] Avg episode reward: [(0, '6.445')] [2023-02-22 15:23:23,683][11037] Saving new best policy, reward=6.445! [2023-02-22 15:23:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1449984. Throughput: 0: 881.4. Samples: 362748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:28,661][00422] Avg episode reward: [(0, '6.216')] [2023-02-22 15:23:33,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1470464. Throughput: 0: 907.3. Samples: 368152. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:33,663][00422] Avg episode reward: [(0, '6.334')] [2023-02-22 15:23:34,292][11051] Updated weights for policy 0, policy_version 360 (0.0017) [2023-02-22 15:23:38,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1490944. Throughput: 0: 908.8. Samples: 371462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:38,664][00422] Avg episode reward: [(0, '6.513')] [2023-02-22 15:23:38,670][11037] Saving new best policy, reward=6.513! [2023-02-22 15:23:43,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3512.9). Total num frames: 1507328. Throughput: 0: 887.1. Samples: 377142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:43,661][00422] Avg episode reward: [(0, '6.665')] [2023-02-22 15:23:43,673][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000368_1507328.pth... [2023-02-22 15:23:43,824][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000162_663552.pth [2023-02-22 15:23:43,840][11037] Saving new best policy, reward=6.665! [2023-02-22 15:23:45,886][11051] Updated weights for policy 0, policy_version 370 (0.0028) [2023-02-22 15:23:48,654][00422] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3512.8). Total num frames: 1519616. Throughput: 0: 867.9. Samples: 381100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:23:48,658][00422] Avg episode reward: [(0, '6.851')] [2023-02-22 15:23:48,664][11037] Saving new best policy, reward=6.851! [2023-02-22 15:23:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1540096. Throughput: 0: 870.4. Samples: 383254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:23:53,656][00422] Avg episode reward: [(0, '7.275')] [2023-02-22 15:23:53,674][11037] Saving new best policy, reward=7.275! [2023-02-22 15:23:57,246][11051] Updated weights for policy 0, policy_version 380 (0.0018) [2023-02-22 15:23:58,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1560576. Throughput: 0: 893.3. Samples: 389644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:23:58,657][00422] Avg episode reward: [(0, '6.908')] [2023-02-22 15:24:03,660][00422] Fps is (10 sec: 3684.0, 60 sec: 3549.5, 300 sec: 3512.8). Total num frames: 1576960. Throughput: 0: 871.4. Samples: 395354. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 15:24:03,663][00422] Avg episode reward: [(0, '6.670')] [2023-02-22 15:24:08,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3499.0). Total num frames: 1589248. Throughput: 0: 862.6. Samples: 397438. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:24:08,661][00422] Avg episode reward: [(0, '6.591')] [2023-02-22 15:24:10,304][11051] Updated weights for policy 0, policy_version 390 (0.0025) [2023-02-22 15:24:13,654][00422] Fps is (10 sec: 3278.9, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1609728. Throughput: 0: 867.2. Samples: 401770. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-22 15:24:13,661][00422] Avg episode reward: [(0, '6.705')] [2023-02-22 15:24:18,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1630208. Throughput: 0: 895.7. Samples: 408460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:24:18,660][00422] Avg episode reward: [(0, '7.406')] [2023-02-22 15:24:18,665][11037] Saving new best policy, reward=7.406! 
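The checkpoint filenames in these entries appear to encode the policy version and the total environment frames at save time (e.g. checkpoint_000000368_1507328.pth, where 1507328 = 368 × 4096). A minimal, hedged way to peek inside such a file with plain PyTorch; the path is copied from the log, and the dict layout is whatever this run produced:

```python
# Inspect a saved Sample Factory checkpoint; only torch.load itself is assumed here.
import torch

ckpt_path = "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000368_1507328.pth"
ckpt = torch.load(ckpt_path, map_location="cpu")
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))  # e.g. model/optimizer state and training counters
```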
[2023-02-22 15:24:20,117][11051] Updated weights for policy 0, policy_version 400 (0.0033) [2023-02-22 15:24:23,654][00422] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1650688. Throughput: 0: 893.8. Samples: 411684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:24:23,657][00422] Avg episode reward: [(0, '7.482')] [2023-02-22 15:24:23,671][11037] Saving new best policy, reward=7.482! [2023-02-22 15:24:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 1662976. Throughput: 0: 861.3. Samples: 415902. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:24:28,656][00422] Avg episode reward: [(0, '7.526')] [2023-02-22 15:24:28,659][11037] Saving new best policy, reward=7.526! [2023-02-22 15:24:33,321][11051] Updated weights for policy 0, policy_version 410 (0.0025) [2023-02-22 15:24:33,654][00422] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1679360. Throughput: 0: 872.5. Samples: 420364. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:24:33,657][00422] Avg episode reward: [(0, '7.209')] [2023-02-22 15:24:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1699840. Throughput: 0: 898.2. Samples: 423672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:24:38,656][00422] Avg episode reward: [(0, '7.516')] [2023-02-22 15:24:42,853][11051] Updated weights for policy 0, policy_version 420 (0.0015) [2023-02-22 15:24:43,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1720320. Throughput: 0: 904.1. Samples: 430330. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:24:43,660][00422] Avg episode reward: [(0, '7.867')] [2023-02-22 15:24:43,673][11037] Saving new best policy, reward=7.867! [2023-02-22 15:24:48,658][00422] Fps is (10 sec: 3684.9, 60 sec: 3617.9, 300 sec: 3526.7). Total num frames: 1736704. Throughput: 0: 868.9. Samples: 434452. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:24:48,662][00422] Avg episode reward: [(0, '7.720')] [2023-02-22 15:24:53,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1748992. Throughput: 0: 869.4. Samples: 436562. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:24:53,656][00422] Avg episode reward: [(0, '7.577')] [2023-02-22 15:24:55,920][11051] Updated weights for policy 0, policy_version 430 (0.0017) [2023-02-22 15:24:58,659][00422] Fps is (10 sec: 3686.0, 60 sec: 3549.6, 300 sec: 3526.7). Total num frames: 1773568. Throughput: 0: 902.8. Samples: 442400. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:24:58,662][00422] Avg episode reward: [(0, '7.812')] [2023-02-22 15:25:03,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.5, 300 sec: 3526.8). Total num frames: 1794048. Throughput: 0: 900.4. Samples: 448980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:25:03,658][00422] Avg episode reward: [(0, '7.800')] [2023-02-22 15:25:06,225][11051] Updated weights for policy 0, policy_version 440 (0.0022) [2023-02-22 15:25:08,655][00422] Fps is (10 sec: 3277.9, 60 sec: 3618.0, 300 sec: 3512.8). Total num frames: 1806336. Throughput: 0: 875.6. Samples: 451086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:25:08,661][00422] Avg episode reward: [(0, '8.234')] [2023-02-22 15:25:08,664][11037] Saving new best policy, reward=8.234! [2023-02-22 15:25:13,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3526.7). 
Total num frames: 1822720. Throughput: 0: 876.0. Samples: 455324. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:25:13,657][00422] Avg episode reward: [(0, '8.142')] [2023-02-22 15:25:17,817][11051] Updated weights for policy 0, policy_version 450 (0.0017) [2023-02-22 15:25:18,654][00422] Fps is (10 sec: 3687.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1843200. Throughput: 0: 919.9. Samples: 461760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:25:18,656][00422] Avg episode reward: [(0, '9.003')] [2023-02-22 15:25:18,660][11037] Saving new best policy, reward=9.003! [2023-02-22 15:25:23,654][00422] Fps is (10 sec: 4505.4, 60 sec: 3618.1, 300 sec: 3527.0). Total num frames: 1867776. Throughput: 0: 919.5. Samples: 465050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:25:23,660][00422] Avg episode reward: [(0, '9.315')] [2023-02-22 15:25:23,678][11037] Saving new best policy, reward=9.315! [2023-02-22 15:25:28,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1880064. Throughput: 0: 886.1. Samples: 470204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:25:28,660][00422] Avg episode reward: [(0, '9.438')] [2023-02-22 15:25:28,663][11037] Saving new best policy, reward=9.438! [2023-02-22 15:25:29,192][11051] Updated weights for policy 0, policy_version 460 (0.0018) [2023-02-22 15:25:33,655][00422] Fps is (10 sec: 2457.4, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 1892352. Throughput: 0: 883.9. Samples: 474226. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:25:33,659][00422] Avg episode reward: [(0, '9.273')] [2023-02-22 15:25:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1916928. Throughput: 0: 903.3. Samples: 477212. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:25:38,662][00422] Avg episode reward: [(0, '10.313')] [2023-02-22 15:25:38,666][11037] Saving new best policy, reward=10.313! [2023-02-22 15:25:40,605][11051] Updated weights for policy 0, policy_version 470 (0.0067) [2023-02-22 15:25:43,654][00422] Fps is (10 sec: 4506.2, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1937408. Throughput: 0: 920.9. Samples: 483834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:25:43,657][00422] Avg episode reward: [(0, '10.423')] [2023-02-22 15:25:43,672][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000473_1937408.pth... [2023-02-22 15:25:43,796][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000263_1077248.pth [2023-02-22 15:25:43,819][11037] Saving new best policy, reward=10.423! [2023-02-22 15:25:48,670][00422] Fps is (10 sec: 3680.3, 60 sec: 3617.4, 300 sec: 3554.4). Total num frames: 1953792. Throughput: 0: 883.2. Samples: 488740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:25:48,679][00422] Avg episode reward: [(0, '10.244')] [2023-02-22 15:25:53,291][11051] Updated weights for policy 0, policy_version 480 (0.0026) [2023-02-22 15:25:53,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1966080. Throughput: 0: 883.8. Samples: 490854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:25:53,663][00422] Avg episode reward: [(0, '10.381')] [2023-02-22 15:25:58,654][00422] Fps is (10 sec: 2872.0, 60 sec: 3481.9, 300 sec: 3554.5). Total num frames: 1982464. Throughput: 0: 891.3. Samples: 495432. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:25:58,657][00422] Avg episode reward: [(0, '10.120')] [2023-02-22 15:26:03,654][00422] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 2002944. Throughput: 0: 892.5. Samples: 501922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:26:03,661][00422] Avg episode reward: [(0, '10.451')] [2023-02-22 15:26:03,742][11051] Updated weights for policy 0, policy_version 490 (0.0027) [2023-02-22 15:26:03,751][11037] Saving new best policy, reward=10.451! [2023-02-22 15:26:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 2019328. Throughput: 0: 877.8. Samples: 504552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:26:08,658][00422] Avg episode reward: [(0, '11.169')] [2023-02-22 15:26:08,665][11037] Saving new best policy, reward=11.169! [2023-02-22 15:26:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 2035712. Throughput: 0: 853.2. Samples: 508600. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:26:13,658][00422] Avg episode reward: [(0, '10.826')] [2023-02-22 15:26:17,305][11051] Updated weights for policy 0, policy_version 500 (0.0020) [2023-02-22 15:26:18,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 2052096. Throughput: 0: 873.5. Samples: 513532. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:26:18,657][00422] Avg episode reward: [(0, '11.712')] [2023-02-22 15:26:18,659][11037] Saving new best policy, reward=11.712! [2023-02-22 15:26:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3554.5). Total num frames: 2072576. Throughput: 0: 874.8. Samples: 516580. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:26:23,658][00422] Avg episode reward: [(0, '11.998')] [2023-02-22 15:26:23,670][11037] Saving new best policy, reward=11.998! [2023-02-22 15:26:28,056][11051] Updated weights for policy 0, policy_version 510 (0.0039) [2023-02-22 15:26:28,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 2088960. Throughput: 0: 853.4. Samples: 522236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:26:28,657][00422] Avg episode reward: [(0, '12.255')] [2023-02-22 15:26:28,659][11037] Saving new best policy, reward=12.255! [2023-02-22 15:26:33,656][00422] Fps is (10 sec: 2866.6, 60 sec: 3481.6, 300 sec: 3526.8). Total num frames: 2101248. Throughput: 0: 831.2. Samples: 526132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:26:33,659][00422] Avg episode reward: [(0, '12.418')] [2023-02-22 15:26:33,672][11037] Saving new best policy, reward=12.418! [2023-02-22 15:26:38,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3526.8). Total num frames: 2117632. Throughput: 0: 827.6. Samples: 528096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:26:38,659][00422] Avg episode reward: [(0, '12.042')] [2023-02-22 15:26:40,970][11051] Updated weights for policy 0, policy_version 520 (0.0017) [2023-02-22 15:26:43,654][00422] Fps is (10 sec: 3687.1, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 2138112. Throughput: 0: 870.6. Samples: 534608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:26:43,664][00422] Avg episode reward: [(0, '12.464')] [2023-02-22 15:26:43,708][11037] Saving new best policy, reward=12.464! [2023-02-22 15:26:48,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3414.3, 300 sec: 3526.7). Total num frames: 2158592. 
Throughput: 0: 860.9. Samples: 540664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:26:48,657][00422] Avg episode reward: [(0, '12.748')] [2023-02-22 15:26:48,660][11037] Saving new best policy, reward=12.748! [2023-02-22 15:26:52,106][11051] Updated weights for policy 0, policy_version 530 (0.0018) [2023-02-22 15:26:53,658][00422] Fps is (10 sec: 3275.4, 60 sec: 3413.1, 300 sec: 3512.8). Total num frames: 2170880. Throughput: 0: 847.7. Samples: 542702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:26:53,665][00422] Avg episode reward: [(0, '12.809')] [2023-02-22 15:26:53,676][11037] Saving new best policy, reward=12.809! [2023-02-22 15:26:58,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3512.9). Total num frames: 2187264. Throughput: 0: 848.8. Samples: 546796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:26:58,662][00422] Avg episode reward: [(0, '12.422')] [2023-02-22 15:27:03,474][11051] Updated weights for policy 0, policy_version 540 (0.0028) [2023-02-22 15:27:03,654][00422] Fps is (10 sec: 4097.9, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 2211840. Throughput: 0: 888.0. Samples: 553490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:27:03,664][00422] Avg episode reward: [(0, '11.856')] [2023-02-22 15:27:08,654][00422] Fps is (10 sec: 4505.4, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 2232320. Throughput: 0: 895.3. Samples: 556870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:27:08,661][00422] Avg episode reward: [(0, '11.255')] [2023-02-22 15:27:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3512.9). Total num frames: 2244608. Throughput: 0: 870.4. Samples: 561402. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:27:13,658][00422] Avg episode reward: [(0, '10.949')] [2023-02-22 15:27:15,837][11051] Updated weights for policy 0, policy_version 550 (0.0020) [2023-02-22 15:27:18,654][00422] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2260992. Throughput: 0: 878.1. Samples: 565646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:18,663][00422] Avg episode reward: [(0, '10.671')] [2023-02-22 15:27:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 2281472. Throughput: 0: 906.9. Samples: 568906. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:27:23,660][00422] Avg episode reward: [(0, '9.745')] [2023-02-22 15:27:25,930][11051] Updated weights for policy 0, policy_version 560 (0.0017) [2023-02-22 15:27:28,657][00422] Fps is (10 sec: 4094.5, 60 sec: 3549.6, 300 sec: 3540.6). Total num frames: 2301952. Throughput: 0: 910.9. Samples: 575602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:27:28,663][00422] Avg episode reward: [(0, '9.969')] [2023-02-22 15:27:33,656][00422] Fps is (10 sec: 3685.5, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 2318336. Throughput: 0: 874.1. Samples: 580002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:33,660][00422] Avg episode reward: [(0, '9.684')] [2023-02-22 15:27:38,654][00422] Fps is (10 sec: 2868.3, 60 sec: 3549.9, 300 sec: 3526.8). Total num frames: 2330624. Throughput: 0: 874.2. Samples: 582038. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:38,661][00422] Avg episode reward: [(0, '10.173')] [2023-02-22 15:27:39,251][11051] Updated weights for policy 0, policy_version 570 (0.0026) [2023-02-22 15:27:43,654][00422] Fps is (10 sec: 3277.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 2351104. Throughput: 0: 905.3. Samples: 587534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:43,656][00422] Avg episode reward: [(0, '10.327')] [2023-02-22 15:27:43,666][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth... [2023-02-22 15:27:43,818][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000368_1507328.pth [2023-02-22 15:27:48,657][00422] Fps is (10 sec: 3275.8, 60 sec: 3413.2, 300 sec: 3512.8). Total num frames: 2363392. Throughput: 0: 848.9. Samples: 591692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:48,663][00422] Avg episode reward: [(0, '10.355')] [2023-02-22 15:27:53,237][11051] Updated weights for policy 0, policy_version 580 (0.0017) [2023-02-22 15:27:53,656][00422] Fps is (10 sec: 2457.0, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 2375680. Throughput: 0: 811.7. Samples: 593400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:27:53,661][00422] Avg episode reward: [(0, '10.428')] [2023-02-22 15:27:58,656][00422] Fps is (10 sec: 2457.7, 60 sec: 3344.9, 300 sec: 3471.2). Total num frames: 2387968. Throughput: 0: 794.8. Samples: 597172. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:27:58,660][00422] Avg episode reward: [(0, '10.546')] [2023-02-22 15:28:03,654][00422] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3471.2). Total num frames: 2404352. Throughput: 0: 811.0. Samples: 602142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:28:03,666][00422] Avg episode reward: [(0, '11.107')] [2023-02-22 15:28:05,727][11051] Updated weights for policy 0, policy_version 590 (0.0030) [2023-02-22 15:28:08,654][00422] Fps is (10 sec: 4097.1, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 2428928. Throughput: 0: 812.7. Samples: 605476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:28:08,662][00422] Avg episode reward: [(0, '11.151')] [2023-02-22 15:28:13,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 2445312. Throughput: 0: 803.3. Samples: 611748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:28:13,656][00422] Avg episode reward: [(0, '11.422')] [2023-02-22 15:28:16,933][11051] Updated weights for policy 0, policy_version 600 (0.0014) [2023-02-22 15:28:18,654][00422] Fps is (10 sec: 3276.6, 60 sec: 3345.0, 300 sec: 3471.2). Total num frames: 2461696. Throughput: 0: 797.9. Samples: 615906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:28:18,659][00422] Avg episode reward: [(0, '10.405')] [2023-02-22 15:28:23,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 2478080. Throughput: 0: 798.8. Samples: 617982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:28:23,662][00422] Avg episode reward: [(0, '10.068')] [2023-02-22 15:28:28,116][11051] Updated weights for policy 0, policy_version 610 (0.0018) [2023-02-22 15:28:28,654][00422] Fps is (10 sec: 3686.6, 60 sec: 3277.0, 300 sec: 3485.1). Total num frames: 2498560. Throughput: 0: 814.9. Samples: 624204. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:28:28,659][00422] Avg episode reward: [(0, '10.315')] [2023-02-22 15:28:33,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3345.2, 300 sec: 3485.1). Total num frames: 2519040. Throughput: 0: 860.5. Samples: 630414. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:28:33,661][00422] Avg episode reward: [(0, '10.554')] [2023-02-22 15:28:38,655][00422] Fps is (10 sec: 3276.3, 60 sec: 3345.0, 300 sec: 3471.2). Total num frames: 2531328. Throughput: 0: 869.0. Samples: 632506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:28:38,659][00422] Avg episode reward: [(0, '10.919')] [2023-02-22 15:28:40,554][11051] Updated weights for policy 0, policy_version 620 (0.0026) [2023-02-22 15:28:43,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 2547712. Throughput: 0: 878.4. Samples: 636698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:28:43,656][00422] Avg episode reward: [(0, '12.075')] [2023-02-22 15:28:48,654][00422] Fps is (10 sec: 3687.0, 60 sec: 3413.5, 300 sec: 3485.1). Total num frames: 2568192. Throughput: 0: 907.9. Samples: 642998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:28:48,658][00422] Avg episode reward: [(0, '12.847')] [2023-02-22 15:28:48,662][11037] Saving new best policy, reward=12.847! [2023-02-22 15:28:50,848][11051] Updated weights for policy 0, policy_version 630 (0.0026) [2023-02-22 15:28:53,657][00422] Fps is (10 sec: 4094.5, 60 sec: 3549.8, 300 sec: 3485.0). Total num frames: 2588672. Throughput: 0: 905.8. Samples: 646240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:28:53,660][00422] Avg episode reward: [(0, '14.411')] [2023-02-22 15:28:53,670][11037] Saving new best policy, reward=14.411! [2023-02-22 15:28:58,658][00422] Fps is (10 sec: 3684.7, 60 sec: 3618.0, 300 sec: 3485.1). Total num frames: 2605056. Throughput: 0: 873.1. Samples: 651040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:28:58,661][00422] Avg episode reward: [(0, '14.571')] [2023-02-22 15:28:58,664][11037] Saving new best policy, reward=14.571! [2023-02-22 15:29:03,654][00422] Fps is (10 sec: 2868.1, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 2617344. Throughput: 0: 872.4. Samples: 655164. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:03,659][00422] Avg episode reward: [(0, '13.301')] [2023-02-22 15:29:04,226][11051] Updated weights for policy 0, policy_version 640 (0.0019) [2023-02-22 15:29:08,654][00422] Fps is (10 sec: 3278.3, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2637824. Throughput: 0: 895.2. Samples: 658268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:08,662][00422] Avg episode reward: [(0, '12.902')] [2023-02-22 15:29:13,505][11051] Updated weights for policy 0, policy_version 650 (0.0016) [2023-02-22 15:29:13,654][00422] Fps is (10 sec: 4505.8, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 2662400. Throughput: 0: 904.7. Samples: 664914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:13,656][00422] Avg episode reward: [(0, '12.075')] [2023-02-22 15:29:18,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2674688. Throughput: 0: 872.3. Samples: 669670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:29:18,658][00422] Avg episode reward: [(0, '11.241')] [2023-02-22 15:29:23,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2686976. Throughput: 0: 872.1. 
Samples: 671748. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:23,659][00422] Avg episode reward: [(0, '12.580')] [2023-02-22 15:29:26,730][11051] Updated weights for policy 0, policy_version 660 (0.0031) [2023-02-22 15:29:28,654][00422] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2711552. Throughput: 0: 896.6. Samples: 677046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:29:28,660][00422] Avg episode reward: [(0, '12.546')] [2023-02-22 15:29:33,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2732032. Throughput: 0: 901.9. Samples: 683582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:29:33,662][00422] Avg episode reward: [(0, '13.018')] [2023-02-22 15:29:37,015][11051] Updated weights for policy 0, policy_version 670 (0.0031) [2023-02-22 15:29:38,655][00422] Fps is (10 sec: 3685.9, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2748416. Throughput: 0: 887.0. Samples: 686152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:29:38,666][00422] Avg episode reward: [(0, '14.067')] [2023-02-22 15:29:43,654][00422] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 2760704. Throughput: 0: 872.7. Samples: 690310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:29:43,661][00422] Avg episode reward: [(0, '13.547')] [2023-02-22 15:29:43,678][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000674_2760704.pth... [2023-02-22 15:29:43,832][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000473_1937408.pth [2023-02-22 15:29:48,654][00422] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2781184. Throughput: 0: 899.9. Samples: 695660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:48,663][00422] Avg episode reward: [(0, '13.590')] [2023-02-22 15:29:49,414][11051] Updated weights for policy 0, policy_version 680 (0.0050) [2023-02-22 15:29:53,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3550.1, 300 sec: 3485.1). Total num frames: 2801664. Throughput: 0: 904.3. Samples: 698964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:29:53,657][00422] Avg episode reward: [(0, '13.381')] [2023-02-22 15:29:58,659][00422] Fps is (10 sec: 3684.3, 60 sec: 3549.8, 300 sec: 3471.1). Total num frames: 2818048. Throughput: 0: 889.1. Samples: 704928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:29:58,661][00422] Avg episode reward: [(0, '13.801')] [2023-02-22 15:30:00,380][11051] Updated weights for policy 0, policy_version 690 (0.0023) [2023-02-22 15:30:03,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3485.1). Total num frames: 2834432. Throughput: 0: 876.8. Samples: 709126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:30:03,662][00422] Avg episode reward: [(0, '14.527')] [2023-02-22 15:30:08,654][00422] Fps is (10 sec: 3278.7, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 875.6. Samples: 711150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:30:08,656][00422] Avg episode reward: [(0, '14.804')] [2023-02-22 15:30:08,659][11037] Saving new best policy, reward=14.804! [2023-02-22 15:30:11,955][11051] Updated weights for policy 0, policy_version 700 (0.0039) [2023-02-22 15:30:13,655][00422] Fps is (10 sec: 3685.9, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 2871296. Throughput: 0: 904.5. Samples: 717748. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:30:13,659][00422] Avg episode reward: [(0, '16.196')] [2023-02-22 15:30:13,677][11037] Saving new best policy, reward=16.196! [2023-02-22 15:30:18,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3471.2). Total num frames: 2891776. Throughput: 0: 886.9. Samples: 723492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:30:18,662][00422] Avg episode reward: [(0, '17.057')] [2023-02-22 15:30:18,665][11037] Saving new best policy, reward=17.057! [2023-02-22 15:30:23,654][00422] Fps is (10 sec: 3277.1, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2904064. Throughput: 0: 875.6. Samples: 725552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:30:23,657][00422] Avg episode reward: [(0, '16.393')] [2023-02-22 15:30:24,363][11051] Updated weights for policy 0, policy_version 710 (0.0016) [2023-02-22 15:30:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2920448. Throughput: 0: 879.7. Samples: 729894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:30:28,655][00422] Avg episode reward: [(0, '15.766')] [2023-02-22 15:30:33,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2945024. Throughput: 0: 912.3. Samples: 736712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:30:33,656][00422] Avg episode reward: [(0, '15.501')] [2023-02-22 15:30:34,236][11051] Updated weights for policy 0, policy_version 720 (0.0031) [2023-02-22 15:30:38,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 2961408. Throughput: 0: 912.7. Samples: 740034. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:30:38,660][00422] Avg episode reward: [(0, '14.034')] [2023-02-22 15:30:43,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3471.4). Total num frames: 2977792. Throughput: 0: 877.8. Samples: 744422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:30:43,661][00422] Avg episode reward: [(0, '14.642')] [2023-02-22 15:30:47,343][11051] Updated weights for policy 0, policy_version 730 (0.0025) [2023-02-22 15:30:48,654][00422] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 2994176. Throughput: 0: 886.6. Samples: 749024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:30:48,660][00422] Avg episode reward: [(0, '14.409')] [2023-02-22 15:30:53,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3014656. Throughput: 0: 917.9. Samples: 752456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:30:53,657][00422] Avg episode reward: [(0, '13.699')] [2023-02-22 15:30:56,632][11051] Updated weights for policy 0, policy_version 740 (0.0021) [2023-02-22 15:30:58,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3618.5, 300 sec: 3499.0). Total num frames: 3035136. Throughput: 0: 920.5. Samples: 759170. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:30:58,657][00422] Avg episode reward: [(0, '14.337')] [2023-02-22 15:31:03,654][00422] Fps is (10 sec: 3686.1, 60 sec: 3618.1, 300 sec: 3498.9). Total num frames: 3051520. Throughput: 0: 888.8. Samples: 763488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:31:03,661][00422] Avg episode reward: [(0, '13.774')] [2023-02-22 15:31:08,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3063808. Throughput: 0: 889.5. Samples: 765580. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:31:08,656][00422] Avg episode reward: [(0, '13.840')] [2023-02-22 15:31:09,850][11051] Updated weights for policy 0, policy_version 750 (0.0047) [2023-02-22 15:31:13,654][00422] Fps is (10 sec: 3686.7, 60 sec: 3618.2, 300 sec: 3512.8). Total num frames: 3088384. Throughput: 0: 922.2. Samples: 771394. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:31:13,663][00422] Avg episode reward: [(0, '15.538')] [2023-02-22 15:31:18,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 3108864. Throughput: 0: 919.4. Samples: 778086. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:31:18,660][00422] Avg episode reward: [(0, '16.334')] [2023-02-22 15:31:19,337][11051] Updated weights for policy 0, policy_version 760 (0.0014) [2023-02-22 15:31:23,654][00422] Fps is (10 sec: 3686.2, 60 sec: 3686.4, 300 sec: 3512.8). Total num frames: 3125248. Throughput: 0: 894.7. Samples: 780296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:31:23,665][00422] Avg episode reward: [(0, '16.416')] [2023-02-22 15:31:28,656][00422] Fps is (10 sec: 2866.6, 60 sec: 3618.0, 300 sec: 3512.8). Total num frames: 3137536. Throughput: 0: 894.3. Samples: 784668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:31:28,659][00422] Avg episode reward: [(0, '17.043')] [2023-02-22 15:31:31,891][11051] Updated weights for policy 0, policy_version 770 (0.0034) [2023-02-22 15:31:33,654][00422] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3158016. Throughput: 0: 923.7. Samples: 790590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:31:33,656][00422] Avg episode reward: [(0, '16.112')] [2023-02-22 15:31:38,654][00422] Fps is (10 sec: 4506.6, 60 sec: 3686.4, 300 sec: 3540.6). Total num frames: 3182592. Throughput: 0: 922.1. Samples: 793952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:31:38,656][00422] Avg episode reward: [(0, '16.560')] [2023-02-22 15:31:42,024][11051] Updated weights for policy 0, policy_version 780 (0.0012) [2023-02-22 15:31:43,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3526.7). Total num frames: 3198976. Throughput: 0: 893.1. Samples: 799360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:31:43,659][00422] Avg episode reward: [(0, '16.667')] [2023-02-22 15:31:43,674][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000781_3198976.pth... [2023-02-22 15:31:43,850][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth [2023-02-22 15:31:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3526.8). Total num frames: 3211264. Throughput: 0: 890.2. Samples: 803548. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:31:48,659][00422] Avg episode reward: [(0, '16.897')] [2023-02-22 15:31:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3231744. Throughput: 0: 906.1. Samples: 806354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:31:53,657][00422] Avg episode reward: [(0, '17.402')] [2023-02-22 15:31:53,666][11037] Saving new best policy, reward=17.402! [2023-02-22 15:31:54,314][11051] Updated weights for policy 0, policy_version 790 (0.0013) [2023-02-22 15:31:58,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 3252224. Throughput: 0: 922.8. Samples: 812922. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:31:58,656][00422] Avg episode reward: [(0, '17.442')] [2023-02-22 15:31:58,661][11037] Saving new best policy, reward=17.442! [2023-02-22 15:32:03,663][00422] Fps is (10 sec: 3683.0, 60 sec: 3617.6, 300 sec: 3512.7). Total num frames: 3268608. Throughput: 0: 891.5. Samples: 818212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:32:03,669][00422] Avg episode reward: [(0, '17.291')] [2023-02-22 15:32:05,626][11051] Updated weights for policy 0, policy_version 800 (0.0023) [2023-02-22 15:32:08,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3526.7). Total num frames: 3284992. Throughput: 0: 889.3. Samples: 820316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:32:08,663][00422] Avg episode reward: [(0, '17.961')] [2023-02-22 15:32:08,666][11037] Saving new best policy, reward=17.961! [2023-02-22 15:32:13,654][00422] Fps is (10 sec: 3279.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3301376. Throughput: 0: 902.0. Samples: 825256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:32:13,662][00422] Avg episode reward: [(0, '18.218')] [2023-02-22 15:32:13,671][11037] Saving new best policy, reward=18.218! [2023-02-22 15:32:16,636][11051] Updated weights for policy 0, policy_version 810 (0.0052) [2023-02-22 15:32:18,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3325952. Throughput: 0: 919.3. Samples: 831960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:18,660][00422] Avg episode reward: [(0, '18.219')] [2023-02-22 15:32:18,666][11037] Saving new best policy, reward=18.219! [2023-02-22 15:32:23,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3618.2, 300 sec: 3526.8). Total num frames: 3342336. Throughput: 0: 907.6. Samples: 834796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:32:23,657][00422] Avg episode reward: [(0, '19.809')] [2023-02-22 15:32:23,671][11037] Saving new best policy, reward=19.809! [2023-02-22 15:32:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.3, 300 sec: 3512.9). Total num frames: 3354624. Throughput: 0: 879.7. Samples: 838948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:28,661][00422] Avg episode reward: [(0, '20.181')] [2023-02-22 15:32:28,667][11037] Saving new best policy, reward=20.181! [2023-02-22 15:32:29,351][11051] Updated weights for policy 0, policy_version 820 (0.0023) [2023-02-22 15:32:33,654][00422] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3371008. Throughput: 0: 898.5. Samples: 843982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:33,661][00422] Avg episode reward: [(0, '21.860')] [2023-02-22 15:32:33,678][11037] Saving new best policy, reward=21.860! [2023-02-22 15:32:38,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3395584. Throughput: 0: 910.2. Samples: 847314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:32:38,656][00422] Avg episode reward: [(0, '21.950')] [2023-02-22 15:32:38,661][11037] Saving new best policy, reward=21.950! [2023-02-22 15:32:39,285][11051] Updated weights for policy 0, policy_version 830 (0.0027) [2023-02-22 15:32:43,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 3411968. Throughput: 0: 899.3. Samples: 853390. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:43,660][00422] Avg episode reward: [(0, '22.696')] [2023-02-22 15:32:43,680][11037] Saving new best policy, reward=22.696! [2023-02-22 15:32:48,658][00422] Fps is (10 sec: 3275.3, 60 sec: 3617.9, 300 sec: 3568.4). Total num frames: 3428352. Throughput: 0: 875.3. Samples: 857596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:48,661][00422] Avg episode reward: [(0, '22.684')] [2023-02-22 15:32:52,513][11051] Updated weights for policy 0, policy_version 840 (0.0024) [2023-02-22 15:32:53,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 3444736. Throughput: 0: 875.9. Samples: 859730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:32:53,657][00422] Avg episode reward: [(0, '23.048')] [2023-02-22 15:32:53,667][11037] Saving new best policy, reward=23.048! [2023-02-22 15:32:58,654][00422] Fps is (10 sec: 3688.1, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 3465216. Throughput: 0: 906.2. Samples: 866034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:32:58,663][00422] Avg episode reward: [(0, '21.968')] [2023-02-22 15:33:01,871][11051] Updated weights for policy 0, policy_version 850 (0.0016) [2023-02-22 15:33:03,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.7, 300 sec: 3582.3). Total num frames: 3485696. Throughput: 0: 893.6. Samples: 872174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:03,658][00422] Avg episode reward: [(0, '22.154')] [2023-02-22 15:33:08,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3497984. Throughput: 0: 876.7. Samples: 874246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:08,662][00422] Avg episode reward: [(0, '21.668')] [2023-02-22 15:33:13,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3514368. Throughput: 0: 877.4. Samples: 878430. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:33:13,656][00422] Avg episode reward: [(0, '22.693')] [2023-02-22 15:33:14,997][11051] Updated weights for policy 0, policy_version 860 (0.0015) [2023-02-22 15:33:18,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 3538944. Throughput: 0: 911.4. Samples: 884996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:33:18,656][00422] Avg episode reward: [(0, '22.218')] [2023-02-22 15:33:23,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 3559424. Throughput: 0: 910.1. Samples: 888270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:33:23,662][00422] Avg episode reward: [(0, '23.224')] [2023-02-22 15:33:23,671][11037] Saving new best policy, reward=23.224! [2023-02-22 15:33:25,207][11051] Updated weights for policy 0, policy_version 870 (0.0036) [2023-02-22 15:33:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 3571712. Throughput: 0: 878.2. Samples: 892908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:28,662][00422] Avg episode reward: [(0, '22.833')] [2023-02-22 15:33:33,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3584000. Throughput: 0: 875.9. Samples: 897006. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:33,664][00422] Avg episode reward: [(0, '23.155')] [2023-02-22 15:33:37,558][11051] Updated weights for policy 0, policy_version 880 (0.0016) [2023-02-22 15:33:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 3608576. Throughput: 0: 900.3. Samples: 900242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:38,661][00422] Avg episode reward: [(0, '21.636')] [2023-02-22 15:33:43,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3596.1). Total num frames: 3629056. Throughput: 0: 901.5. Samples: 906600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:43,660][00422] Avg episode reward: [(0, '21.582')] [2023-02-22 15:33:43,673][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000886_3629056.pth... [2023-02-22 15:33:43,829][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000674_2760704.pth [2023-02-22 15:33:48,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3550.1, 300 sec: 3568.4). Total num frames: 3641344. Throughput: 0: 864.4. Samples: 911074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:33:48,656][00422] Avg episode reward: [(0, '21.008')] [2023-02-22 15:33:49,498][11051] Updated weights for policy 0, policy_version 890 (0.0012) [2023-02-22 15:33:53,654][00422] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 3653632. Throughput: 0: 865.1. Samples: 913176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:53,660][00422] Avg episode reward: [(0, '20.569')] [2023-02-22 15:33:58,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 3674112. Throughput: 0: 892.5. Samples: 918592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:33:58,656][00422] Avg episode reward: [(0, '21.200')] [2023-02-22 15:34:00,701][11051] Updated weights for policy 0, policy_version 900 (0.0016) [2023-02-22 15:34:03,654][00422] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 3698688. Throughput: 0: 891.9. Samples: 925132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:34:03,656][00422] Avg episode reward: [(0, '22.020')] [2023-02-22 15:34:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3710976. Throughput: 0: 872.3. Samples: 927524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:34:08,661][00422] Avg episode reward: [(0, '23.319')] [2023-02-22 15:34:08,664][11037] Saving new best policy, reward=23.319! [2023-02-22 15:34:13,654][00422] Fps is (10 sec: 2457.5, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 3723264. Throughput: 0: 858.7. Samples: 931548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:34:13,658][00422] Avg episode reward: [(0, '24.006')] [2023-02-22 15:34:13,688][11037] Saving new best policy, reward=24.006! [2023-02-22 15:34:13,704][11051] Updated weights for policy 0, policy_version 910 (0.0012) [2023-02-22 15:34:18,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3582.3). Total num frames: 3743744. Throughput: 0: 883.6. Samples: 936766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:34:18,660][00422] Avg episode reward: [(0, '24.458')] [2023-02-22 15:34:18,663][11037] Saving new best policy, reward=24.458! [2023-02-22 15:34:23,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3764224. Throughput: 0: 882.5. Samples: 939956. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:34:23,662][00422] Avg episode reward: [(0, '23.263')] [2023-02-22 15:34:23,861][11051] Updated weights for policy 0, policy_version 920 (0.0024) [2023-02-22 15:34:28,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 3780608. Throughput: 0: 865.1. Samples: 945528. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:34:28,662][00422] Avg episode reward: [(0, '22.727')] [2023-02-22 15:34:33,656][00422] Fps is (10 sec: 3276.2, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 3796992. Throughput: 0: 856.2. Samples: 949604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:34:33,665][00422] Avg episode reward: [(0, '21.822')] [2023-02-22 15:34:37,264][11051] Updated weights for policy 0, policy_version 930 (0.0020) [2023-02-22 15:34:38,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3813376. Throughput: 0: 858.1. Samples: 951788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:34:38,657][00422] Avg episode reward: [(0, '20.474')] [2023-02-22 15:34:43,654][00422] Fps is (10 sec: 3687.1, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3833856. Throughput: 0: 878.6. Samples: 958130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:34:43,661][00422] Avg episode reward: [(0, '21.055')] [2023-02-22 15:34:47,399][11051] Updated weights for policy 0, policy_version 940 (0.0020) [2023-02-22 15:34:48,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 3850240. Throughput: 0: 856.2. Samples: 963662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:34:48,656][00422] Avg episode reward: [(0, '22.510')] [2023-02-22 15:34:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.6). Total num frames: 3866624. Throughput: 0: 848.9. Samples: 965726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:34:53,660][00422] Avg episode reward: [(0, '23.120')] [2023-02-22 15:34:58,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 3883008. Throughput: 0: 856.1. Samples: 970072. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:34:58,660][00422] Avg episode reward: [(0, '24.335')] [2023-02-22 15:35:00,360][11051] Updated weights for policy 0, policy_version 950 (0.0016) [2023-02-22 15:35:03,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3903488. Throughput: 0: 880.4. Samples: 976386. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:35:03,663][00422] Avg episode reward: [(0, '24.317')] [2023-02-22 15:35:08,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3923968. Throughput: 0: 883.4. Samples: 979710. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:35:08,656][00422] Avg episode reward: [(0, '25.901')] [2023-02-22 15:35:08,676][11037] Saving new best policy, reward=25.901! [2023-02-22 15:35:12,038][11051] Updated weights for policy 0, policy_version 960 (0.0024) [2023-02-22 15:35:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3936256. Throughput: 0: 853.4. Samples: 983930. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:35:13,661][00422] Avg episode reward: [(0, '26.351')] [2023-02-22 15:35:13,675][11037] Saving new best policy, reward=26.351! [2023-02-22 15:35:18,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 3948544. 
Throughput: 0: 860.0. Samples: 988302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:35:18,662][00422] Avg episode reward: [(0, '25.993')] [2023-02-22 15:35:23,306][11051] Updated weights for policy 0, policy_version 970 (0.0021) [2023-02-22 15:35:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 3973120. Throughput: 0: 884.4. Samples: 991588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:35:23,660][00422] Avg episode reward: [(0, '24.590')] [2023-02-22 15:35:28,655][00422] Fps is (10 sec: 3685.8, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 3985408. Throughput: 0: 857.6. Samples: 996724. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:35:28,658][00422] Avg episode reward: [(0, '23.223')] [2023-02-22 15:35:33,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3345.2, 300 sec: 3512.8). Total num frames: 3997696. Throughput: 0: 805.6. Samples: 999916. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:35:33,656][00422] Avg episode reward: [(0, '22.517')] [2023-02-22 15:35:38,654][00422] Fps is (10 sec: 2458.0, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 4009984. Throughput: 0: 797.4. Samples: 1001608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:35:38,660][00422] Avg episode reward: [(0, '22.437')] [2023-02-22 15:35:39,967][11051] Updated weights for policy 0, policy_version 980 (0.0049) [2023-02-22 15:35:43,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3499.0). Total num frames: 4026368. Throughput: 0: 798.4. Samples: 1005998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:35:43,657][00422] Avg episode reward: [(0, '22.420')] [2023-02-22 15:35:43,670][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000983_4026368.pth... [2023-02-22 15:35:43,814][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000781_3198976.pth [2023-02-22 15:35:48,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 4046848. Throughput: 0: 802.9. Samples: 1012516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:35:48,657][00422] Avg episode reward: [(0, '22.849')] [2023-02-22 15:35:49,958][11051] Updated weights for policy 0, policy_version 990 (0.0018) [2023-02-22 15:35:53,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 4067328. Throughput: 0: 802.2. Samples: 1015810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:35:53,657][00422] Avg episode reward: [(0, '22.790')] [2023-02-22 15:35:58,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3485.1). Total num frames: 4079616. Throughput: 0: 798.9. Samples: 1019882. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:35:58,657][00422] Avg episode reward: [(0, '23.950')] [2023-02-22 15:36:03,388][11051] Updated weights for policy 0, policy_version 1000 (0.0025) [2023-02-22 15:36:03,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3499.0). Total num frames: 4096000. Throughput: 0: 805.9. Samples: 1024568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:36:03,656][00422] Avg episode reward: [(0, '24.820')] [2023-02-22 15:36:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3485.1). Total num frames: 4116480. Throughput: 0: 808.4. Samples: 1027968. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:36:08,662][00422] Avg episode reward: [(0, '24.646')] [2023-02-22 15:36:12,633][11051] Updated weights for policy 0, policy_version 1010 (0.0026) [2023-02-22 15:36:13,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 4136960. Throughput: 0: 838.8. Samples: 1034470. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:36:13,658][00422] Avg episode reward: [(0, '25.693')] [2023-02-22 15:36:18,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 4149248. Throughput: 0: 860.8. Samples: 1038650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:36:18,660][00422] Avg episode reward: [(0, '24.643')] [2023-02-22 15:36:23,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3485.1). Total num frames: 4165632. Throughput: 0: 869.7. Samples: 1040746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:36:23,656][00422] Avg episode reward: [(0, '24.441')] [2023-02-22 15:36:25,754][11051] Updated weights for policy 0, policy_version 1020 (0.0015) [2023-02-22 15:36:28,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3499.0). Total num frames: 4190208. Throughput: 0: 906.0. Samples: 1046768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:36:28,663][00422] Avg episode reward: [(0, '24.786')] [2023-02-22 15:36:33,664][00422] Fps is (10 sec: 4500.8, 60 sec: 3549.2, 300 sec: 3484.9). Total num frames: 4210688. Throughput: 0: 904.1. Samples: 1053208. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:36:33,671][00422] Avg episode reward: [(0, '24.890')] [2023-02-22 15:36:36,423][11051] Updated weights for policy 0, policy_version 1030 (0.0013) [2023-02-22 15:36:38,656][00422] Fps is (10 sec: 3276.0, 60 sec: 3549.7, 300 sec: 3471.2). Total num frames: 4222976. Throughput: 0: 876.2. Samples: 1055240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:36:38,664][00422] Avg episode reward: [(0, '25.752')] [2023-02-22 15:36:43,654][00422] Fps is (10 sec: 2460.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 4235264. Throughput: 0: 878.1. Samples: 1059398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:36:43,663][00422] Avg episode reward: [(0, '25.307')] [2023-02-22 15:36:48,459][11051] Updated weights for policy 0, policy_version 1040 (0.0022) [2023-02-22 15:36:48,654][00422] Fps is (10 sec: 3687.3, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4259840. Throughput: 0: 910.7. Samples: 1065550. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:36:48,664][00422] Avg episode reward: [(0, '25.384')] [2023-02-22 15:36:53,655][00422] Fps is (10 sec: 4504.9, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 4280320. Throughput: 0: 909.5. Samples: 1068896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:36:53,661][00422] Avg episode reward: [(0, '27.000')] [2023-02-22 15:36:53,672][11037] Saving new best policy, reward=27.000! [2023-02-22 15:36:58,655][00422] Fps is (10 sec: 3685.8, 60 sec: 3618.0, 300 sec: 3485.2). Total num frames: 4296704. Throughput: 0: 877.6. Samples: 1073964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:36:58,658][00422] Avg episode reward: [(0, '26.641')] [2023-02-22 15:36:59,710][11051] Updated weights for policy 0, policy_version 1050 (0.0014) [2023-02-22 15:37:03,654][00422] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 4308992. Throughput: 0: 878.0. Samples: 1078162. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:37:03,666][00422] Avg episode reward: [(0, '25.393')] [2023-02-22 15:37:08,654][00422] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4329472. Throughput: 0: 897.8. Samples: 1081146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:37:08,663][00422] Avg episode reward: [(0, '25.559')] [2023-02-22 15:37:10,739][11051] Updated weights for policy 0, policy_version 1060 (0.0025) [2023-02-22 15:37:13,654][00422] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 4354048. Throughput: 0: 910.9. Samples: 1087760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:37:13,657][00422] Avg episode reward: [(0, '25.128')] [2023-02-22 15:37:18,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 4366336. Throughput: 0: 875.9. Samples: 1092616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:37:18,659][00422] Avg episode reward: [(0, '25.813')] [2023-02-22 15:37:23,239][11051] Updated weights for policy 0, policy_version 1070 (0.0024) [2023-02-22 15:37:23,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 4382720. Throughput: 0: 877.1. Samples: 1094708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:37:23,662][00422] Avg episode reward: [(0, '26.007')] [2023-02-22 15:37:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 4399104. Throughput: 0: 901.2. Samples: 1099954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:37:28,662][00422] Avg episode reward: [(0, '25.768')] [2023-02-22 15:37:33,367][11051] Updated weights for policy 0, policy_version 1080 (0.0016) [2023-02-22 15:37:33,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.5, 300 sec: 3485.1). Total num frames: 4423680. Throughput: 0: 912.1. Samples: 1106596. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:37:33,656][00422] Avg episode reward: [(0, '25.453')] [2023-02-22 15:37:38,655][00422] Fps is (10 sec: 4095.7, 60 sec: 3618.2, 300 sec: 3485.1). Total num frames: 4440064. Throughput: 0: 897.5. Samples: 1109284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:37:38,659][00422] Avg episode reward: [(0, '25.422')] [2023-02-22 15:37:43,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 4452352. Throughput: 0: 876.9. Samples: 1113422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:37:43,657][00422] Avg episode reward: [(0, '25.482')] [2023-02-22 15:37:43,671][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001087_4452352.pth... [2023-02-22 15:37:43,859][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000886_3629056.pth [2023-02-22 15:37:46,470][11051] Updated weights for policy 0, policy_version 1090 (0.0034) [2023-02-22 15:37:48,654][00422] Fps is (10 sec: 3277.1, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4472832. Throughput: 0: 898.9. Samples: 1118614. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:37:48,659][00422] Avg episode reward: [(0, '24.616')] [2023-02-22 15:37:53,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3485.1). Total num frames: 4493312. Throughput: 0: 906.5. Samples: 1121940. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:37:53,657][00422] Avg episode reward: [(0, '23.813')] [2023-02-22 15:37:55,974][11051] Updated weights for policy 0, policy_version 1100 (0.0026) [2023-02-22 15:37:58,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3471.2). Total num frames: 4509696. Throughput: 0: 888.1. Samples: 1127726. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-22 15:37:58,667][00422] Avg episode reward: [(0, '24.315')] [2023-02-22 15:38:03,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 4526080. Throughput: 0: 875.6. Samples: 1132020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:38:03,660][00422] Avg episode reward: [(0, '23.562')] [2023-02-22 15:38:08,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4542464. Throughput: 0: 877.9. Samples: 1134212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:38:08,662][00422] Avg episode reward: [(0, '24.223')] [2023-02-22 15:38:09,083][11051] Updated weights for policy 0, policy_version 1110 (0.0023) [2023-02-22 15:38:13,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4567040. Throughput: 0: 909.2. Samples: 1140868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:38:13,659][00422] Avg episode reward: [(0, '24.210')] [2023-02-22 15:38:18,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 4583424. Throughput: 0: 887.3. Samples: 1146526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:38:18,656][00422] Avg episode reward: [(0, '25.345')] [2023-02-22 15:38:19,533][11051] Updated weights for policy 0, policy_version 1120 (0.0024) [2023-02-22 15:38:23,660][00422] Fps is (10 sec: 2865.3, 60 sec: 3549.5, 300 sec: 3471.1). Total num frames: 4595712. Throughput: 0: 871.8. Samples: 1148522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:38:23,663][00422] Avg episode reward: [(0, '25.969')] [2023-02-22 15:38:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4612096. Throughput: 0: 873.6. Samples: 1152734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:38:28,657][00422] Avg episode reward: [(0, '26.384')] [2023-02-22 15:38:31,782][11051] Updated weights for policy 0, policy_version 1130 (0.0017) [2023-02-22 15:38:33,654][00422] Fps is (10 sec: 3688.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 4632576. Throughput: 0: 904.0. Samples: 1159292. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:38:33,662][00422] Avg episode reward: [(0, '27.041')] [2023-02-22 15:38:33,715][11037] Saving new best policy, reward=27.041! [2023-02-22 15:38:38,656][00422] Fps is (10 sec: 4095.2, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 4653056. Throughput: 0: 900.5. Samples: 1162466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:38:38,659][00422] Avg episode reward: [(0, '27.282')] [2023-02-22 15:38:38,667][11037] Saving new best policy, reward=27.282! [2023-02-22 15:38:43,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 4669440. Throughput: 0: 869.7. Samples: 1166864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:38:43,662][00422] Avg episode reward: [(0, '27.208')] [2023-02-22 15:38:43,645][11051] Updated weights for policy 0, policy_version 1140 (0.0045) [2023-02-22 15:38:48,654][00422] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3485.1). 
Total num frames: 4681728. Throughput: 0: 868.4. Samples: 1171100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:38:48,657][00422] Avg episode reward: [(0, '27.255')] [2023-02-22 15:38:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 4702208. Throughput: 0: 890.3. Samples: 1174276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:38:53,656][00422] Avg episode reward: [(0, '27.293')] [2023-02-22 15:38:53,666][11037] Saving new best policy, reward=27.293! [2023-02-22 15:38:54,849][11051] Updated weights for policy 0, policy_version 1150 (0.0031) [2023-02-22 15:38:58,656][00422] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3471.2). Total num frames: 4722688. Throughput: 0: 885.8. Samples: 1180732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:38:58,661][00422] Avg episode reward: [(0, '26.772')] [2023-02-22 15:39:03,655][00422] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 4739072. Throughput: 0: 858.1. Samples: 1185142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:39:03,663][00422] Avg episode reward: [(0, '25.618')] [2023-02-22 15:39:07,922][11051] Updated weights for policy 0, policy_version 1160 (0.0019) [2023-02-22 15:39:08,654][00422] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 4751360. Throughput: 0: 860.1. Samples: 1187220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:39:08,664][00422] Avg episode reward: [(0, '25.012')] [2023-02-22 15:39:13,654][00422] Fps is (10 sec: 3277.3, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 4771840. Throughput: 0: 890.4. Samples: 1192802. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:39:13,661][00422] Avg episode reward: [(0, '24.641')] [2023-02-22 15:39:17,841][11051] Updated weights for policy 0, policy_version 1170 (0.0026) [2023-02-22 15:39:18,656][00422] Fps is (10 sec: 4095.1, 60 sec: 3481.5, 300 sec: 3485.0). Total num frames: 4792320. Throughput: 0: 890.5. Samples: 1199368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:39:18,661][00422] Avg episode reward: [(0, '25.739')] [2023-02-22 15:39:23,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3550.2, 300 sec: 3485.1). Total num frames: 4808704. Throughput: 0: 871.2. Samples: 1201670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:39:23,661][00422] Avg episode reward: [(0, '26.134')] [2023-02-22 15:39:28,654][00422] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 4820992. Throughput: 0: 863.3. Samples: 1205714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:39:28,657][00422] Avg episode reward: [(0, '27.032')] [2023-02-22 15:39:31,076][11051] Updated weights for policy 0, policy_version 1180 (0.0020) [2023-02-22 15:39:33,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 4841472. Throughput: 0: 892.1. Samples: 1211244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:39:33,657][00422] Avg episode reward: [(0, '26.268')] [2023-02-22 15:39:38,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 4866048. Throughput: 0: 896.9. Samples: 1214638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:39:38,657][00422] Avg episode reward: [(0, '26.760')] [2023-02-22 15:39:40,784][11051] Updated weights for policy 0, policy_version 1190 (0.0018) [2023-02-22 15:39:43,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). 
Total num frames: 4882432. Throughput: 0: 875.5. Samples: 1220126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:39:43,661][00422] Avg episode reward: [(0, '27.587')] [2023-02-22 15:39:43,680][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001192_4882432.pth... [2023-02-22 15:39:43,834][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000983_4026368.pth [2023-02-22 15:39:43,859][11037] Saving new best policy, reward=27.587! [2023-02-22 15:39:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4894720. Throughput: 0: 867.1. Samples: 1224162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:39:48,659][00422] Avg episode reward: [(0, '28.229')] [2023-02-22 15:39:48,663][11037] Saving new best policy, reward=28.229! [2023-02-22 15:39:53,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 4911104. Throughput: 0: 870.0. Samples: 1226370. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:39:53,656][00422] Avg episode reward: [(0, '26.764')] [2023-02-22 15:39:54,045][11051] Updated weights for policy 0, policy_version 1200 (0.0015) [2023-02-22 15:39:58,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 4931584. Throughput: 0: 892.5. Samples: 1232964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:39:58,657][00422] Avg episode reward: [(0, '25.632')] [2023-02-22 15:40:03,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3485.1). Total num frames: 4952064. Throughput: 0: 869.8. Samples: 1238508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:40:03,659][00422] Avg episode reward: [(0, '25.462')] [2023-02-22 15:40:04,958][11051] Updated weights for policy 0, policy_version 1210 (0.0012) [2023-02-22 15:40:08,654][00422] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 4964352. Throughput: 0: 864.7. Samples: 1240580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:40:08,661][00422] Avg episode reward: [(0, '26.635')] [2023-02-22 15:40:13,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 4980736. Throughput: 0: 875.7. Samples: 1245120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:40:13,658][00422] Avg episode reward: [(0, '26.835')] [2023-02-22 15:40:16,578][11051] Updated weights for policy 0, policy_version 1220 (0.0021) [2023-02-22 15:40:18,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 5005312. Throughput: 0: 901.8. Samples: 1251824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:40:18,656][00422] Avg episode reward: [(0, '27.084')] [2023-02-22 15:40:23,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 5021696. Throughput: 0: 901.4. Samples: 1255202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:40:23,661][00422] Avg episode reward: [(0, '27.063')] [2023-02-22 15:40:28,434][11051] Updated weights for policy 0, policy_version 1230 (0.0025) [2023-02-22 15:40:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 5038080. Throughput: 0: 869.6. Samples: 1259260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:40:28,664][00422] Avg episode reward: [(0, '29.210')] [2023-02-22 15:40:28,668][11037] Saving new best policy, reward=29.210! 
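A note on reading this log: the entries above interleave throughput reports ("Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...). Total num frames: N"), average episode rewards ("Avg episode reward: [(0, 'R')]"), periodic checkpoint rotation (a new checkpoint_*.pth is saved and an older one removed), and "Saving new best policy" events whenever the running average reward improves. If you only want the learning curve (frames vs. average reward) without re-running anything, a small parser over these lines is enough. The sketch below is not part of the original run; it is a minimal, hedged example that assumes the line formats shown above and the log's original one-entry-per-line layout, and the file name train.log is a placeholder for wherever this output was captured.

    import re

    # Regexes written against the formats visible in this log (assumed stable):
    #   "Fps is (10 sec: a, 60 sec: b, 300 sec: c). Total num frames: N. Throughput: ..."
    #   "Avg episode reward: [(0, 'R')]"
    FPS_RE = re.compile(
        r"Fps is \(10 sec: ([\d.]+), 60 sec: ([\d.]+), 300 sec: ([\d.]+)\)\. "
        r"Total num frames: (\d+)"
    )
    REWARD_RE = re.compile(r"Avg episode reward: \[\(0, '(-?[\d.]+)'\)\]")

    def parse_progress(log_text):
        """Yield (total_frames, fps_10s, avg_reward) tuples in log order."""
        frames, fps = None, None
        for line in log_text.splitlines():
            m = FPS_RE.search(line)
            if m:
                # Remember the most recent frame count and short-window FPS.
                fps = float(m.group(1))
                frames = int(m.group(4))
            m = REWARD_RE.search(line)
            if m and frames is not None:
                yield frames, fps, float(m.group(1))

    if __name__ == "__main__":
        with open("train.log") as f:  # hypothetical path; point this at the captured log
            for frames, fps, reward in parse_progress(f.read()):
                print(f"{frames:>9d} frames  fps={fps:7.1f}  avg_reward={reward:.3f}")

Run over the section above, this would print a monotonically increasing frame count (roughly 2.17M to 5.98M) alongside the 10-second FPS and the reward trend, which climbs from about 12.7 to a best of 29.573 before dipping again; the same tuples can be fed straight into a plotting library if a visual curve is preferred.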
[2023-02-22 15:40:33,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5054464. Throughput: 0: 885.5. Samples: 1264008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:40:33,663][00422] Avg episode reward: [(0, '28.514')] [2023-02-22 15:40:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5074944. Throughput: 0: 910.8. Samples: 1267356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:40:38,657][00422] Avg episode reward: [(0, '29.573')] [2023-02-22 15:40:38,661][11037] Saving new best policy, reward=29.573! [2023-02-22 15:40:38,932][11051] Updated weights for policy 0, policy_version 1240 (0.0030) [2023-02-22 15:40:43,656][00422] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5095424. Throughput: 0: 906.4. Samples: 1273756. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:40:43,661][00422] Avg episode reward: [(0, '29.242')] [2023-02-22 15:40:48,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 5107712. Throughput: 0: 874.7. Samples: 1277868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:40:48,660][00422] Avg episode reward: [(0, '28.365')] [2023-02-22 15:40:52,334][11051] Updated weights for policy 0, policy_version 1250 (0.0030) [2023-02-22 15:40:53,654][00422] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5124096. Throughput: 0: 874.3. Samples: 1279924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:40:53,657][00422] Avg episode reward: [(0, '27.862')] [2023-02-22 15:40:58,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5144576. Throughput: 0: 908.3. Samples: 1285994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:40:58,658][00422] Avg episode reward: [(0, '27.403')] [2023-02-22 15:41:01,713][11051] Updated weights for policy 0, policy_version 1260 (0.0014) [2023-02-22 15:41:03,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5165056. Throughput: 0: 898.0. Samples: 1292236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:41:03,663][00422] Avg episode reward: [(0, '28.208')] [2023-02-22 15:41:08,654][00422] Fps is (10 sec: 3276.6, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 5177344. Throughput: 0: 868.7. Samples: 1294292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:41:08,658][00422] Avg episode reward: [(0, '27.083')] [2023-02-22 15:41:13,654][00422] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5193728. Throughput: 0: 871.5. Samples: 1298476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:41:13,660][00422] Avg episode reward: [(0, '25.370')] [2023-02-22 15:41:14,920][11051] Updated weights for policy 0, policy_version 1270 (0.0025) [2023-02-22 15:41:18,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 5218304. Throughput: 0: 906.2. Samples: 1304786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:41:18,657][00422] Avg episode reward: [(0, '25.751')] [2023-02-22 15:41:23,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 5238784. Throughput: 0: 904.0. Samples: 1308036. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-22 15:41:23,659][00422] Avg episode reward: [(0, '25.743')] [2023-02-22 15:41:24,689][11051] Updated weights for policy 0, policy_version 1280 (0.0027) [2023-02-22 15:41:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3526.8). Total num frames: 5251072. Throughput: 0: 872.8. Samples: 1313032. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:41:28,657][00422] Avg episode reward: [(0, '25.333')] [2023-02-22 15:41:33,656][00422] Fps is (10 sec: 2457.0, 60 sec: 3481.5, 300 sec: 3526.7). Total num frames: 5263360. Throughput: 0: 872.9. Samples: 1317150. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:41:33,661][00422] Avg episode reward: [(0, '24.984')] [2023-02-22 15:41:37,533][11051] Updated weights for policy 0, policy_version 1290 (0.0026) [2023-02-22 15:41:38,654][00422] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5287936. Throughput: 0: 893.0. Samples: 1320108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:41:38,656][00422] Avg episode reward: [(0, '25.509')] [2023-02-22 15:41:43,654][00422] Fps is (10 sec: 4506.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5308416. Throughput: 0: 905.2. Samples: 1326728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:41:43,657][00422] Avg episode reward: [(0, '27.163')] [2023-02-22 15:41:43,669][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001296_5308416.pth... [2023-02-22 15:41:43,799][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001087_4452352.pth [2023-02-22 15:41:48,658][11051] Updated weights for policy 0, policy_version 1300 (0.0016) [2023-02-22 15:41:48,654][00422] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5324800. Throughput: 0: 874.7. Samples: 1331596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:41:48,663][00422] Avg episode reward: [(0, '26.068')] [2023-02-22 15:41:53,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 5337088. Throughput: 0: 875.8. Samples: 1333702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:41:53,657][00422] Avg episode reward: [(0, '26.774')] [2023-02-22 15:41:58,654][00422] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5357568. Throughput: 0: 896.3. Samples: 1338810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:41:58,661][00422] Avg episode reward: [(0, '25.587')] [2023-02-22 15:42:00,190][11051] Updated weights for policy 0, policy_version 1310 (0.0015) [2023-02-22 15:42:03,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5378048. Throughput: 0: 900.3. Samples: 1345298. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:42:03,658][00422] Avg episode reward: [(0, '25.716')] [2023-02-22 15:42:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3526.7). Total num frames: 5394432. Throughput: 0: 888.2. Samples: 1348004. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:42:08,658][00422] Avg episode reward: [(0, '25.392')] [2023-02-22 15:42:11,944][11051] Updated weights for policy 0, policy_version 1320 (0.0021) [2023-02-22 15:42:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5410816. Throughput: 0: 873.1. Samples: 1352322. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:13,661][00422] Avg episode reward: [(0, '24.669')] [2023-02-22 15:42:18,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5427200. Throughput: 0: 903.2. Samples: 1357792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:42:18,660][00422] Avg episode reward: [(0, '25.155')] [2023-02-22 15:42:22,541][11051] Updated weights for policy 0, policy_version 1330 (0.0029) [2023-02-22 15:42:23,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5451776. Throughput: 0: 909.3. Samples: 1361028. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:23,656][00422] Avg episode reward: [(0, '26.279')] [2023-02-22 15:42:28,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 5468160. Throughput: 0: 892.3. Samples: 1366880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:28,658][00422] Avg episode reward: [(0, '25.807')] [2023-02-22 15:42:33,656][00422] Fps is (10 sec: 2866.5, 60 sec: 3618.2, 300 sec: 3526.7). Total num frames: 5480448. Throughput: 0: 876.0. Samples: 1371018. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:33,664][00422] Avg episode reward: [(0, '26.797')] [2023-02-22 15:42:35,693][11051] Updated weights for policy 0, policy_version 1340 (0.0046) [2023-02-22 15:42:38,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5496832. Throughput: 0: 877.0. Samples: 1373168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:42:38,656][00422] Avg episode reward: [(0, '26.275')] [2023-02-22 15:42:43,654][00422] Fps is (10 sec: 4096.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5521408. Throughput: 0: 905.9. Samples: 1379574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:42:43,657][00422] Avg episode reward: [(0, '27.082')] [2023-02-22 15:42:45,423][11051] Updated weights for policy 0, policy_version 1350 (0.0014) [2023-02-22 15:42:48,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5537792. Throughput: 0: 892.0. Samples: 1385436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:42:48,656][00422] Avg episode reward: [(0, '28.120')] [2023-02-22 15:42:53,661][00422] Fps is (10 sec: 3274.4, 60 sec: 3617.7, 300 sec: 3540.5). Total num frames: 5554176. Throughput: 0: 879.0. Samples: 1387566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:53,664][00422] Avg episode reward: [(0, '27.356')] [2023-02-22 15:42:58,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 5566464. Throughput: 0: 874.4. Samples: 1391672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:42:58,657][00422] Avg episode reward: [(0, '26.826')] [2023-02-22 15:42:58,693][11051] Updated weights for policy 0, policy_version 1360 (0.0013) [2023-02-22 15:43:03,654][00422] Fps is (10 sec: 3689.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5591040. Throughput: 0: 901.2. Samples: 1398346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:43:03,662][00422] Avg episode reward: [(0, '27.542')] [2023-02-22 15:43:08,389][11051] Updated weights for policy 0, policy_version 1370 (0.0032) [2023-02-22 15:43:08,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5611520. Throughput: 0: 903.3. Samples: 1401678. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:43:08,660][00422] Avg episode reward: [(0, '28.020')] [2023-02-22 15:43:13,662][00422] Fps is (10 sec: 3274.3, 60 sec: 3549.4, 300 sec: 3526.6). Total num frames: 5623808. Throughput: 0: 876.0. Samples: 1406308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:43:13,673][00422] Avg episode reward: [(0, '27.765')] [2023-02-22 15:43:18,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.7). Total num frames: 5640192. Throughput: 0: 883.7. Samples: 1410784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:43:18,657][00422] Avg episode reward: [(0, '26.900')] [2023-02-22 15:43:20,980][11051] Updated weights for policy 0, policy_version 1380 (0.0024) [2023-02-22 15:43:23,654][00422] Fps is (10 sec: 3689.3, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5660672. Throughput: 0: 909.7. Samples: 1414106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:43:23,662][00422] Avg episode reward: [(0, '28.010')] [2023-02-22 15:43:28,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5681152. Throughput: 0: 904.7. Samples: 1420284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:43:28,658][00422] Avg episode reward: [(0, '27.698')] [2023-02-22 15:43:33,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3512.9). Total num frames: 5689344. Throughput: 0: 851.2. Samples: 1423742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:43:33,656][00422] Avg episode reward: [(0, '27.242')] [2023-02-22 15:43:33,790][11051] Updated weights for policy 0, policy_version 1390 (0.0025) [2023-02-22 15:43:38,654][00422] Fps is (10 sec: 2047.9, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 5701632. Throughput: 0: 839.3. Samples: 1425328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:43:38,657][00422] Avg episode reward: [(0, '26.706')] [2023-02-22 15:43:43,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3499.0). Total num frames: 5713920. Throughput: 0: 823.9. Samples: 1428748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:43:43,661][00422] Avg episode reward: [(0, '26.361')] [2023-02-22 15:43:43,675][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001395_5713920.pth... [2023-02-22 15:43:43,813][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001192_4882432.pth [2023-02-22 15:43:47,489][11051] Updated weights for policy 0, policy_version 1400 (0.0057) [2023-02-22 15:43:48,654][00422] Fps is (10 sec: 3686.6, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 5738496. Throughput: 0: 817.2. Samples: 1435120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:43:48,656][00422] Avg episode reward: [(0, '26.479')] [2023-02-22 15:43:53,656][00422] Fps is (10 sec: 4504.7, 60 sec: 3413.6, 300 sec: 3512.8). Total num frames: 5758976. Throughput: 0: 816.6. Samples: 1438426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:43:53,660][00422] Avg episode reward: [(0, '25.448')] [2023-02-22 15:43:58,657][00422] Fps is (10 sec: 3275.7, 60 sec: 3413.1, 300 sec: 3498.9). Total num frames: 5771264. Throughput: 0: 822.6. Samples: 1443320. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:43:58,668][00422] Avg episode reward: [(0, '24.698')] [2023-02-22 15:43:59,046][11051] Updated weights for policy 0, policy_version 1410 (0.0023) [2023-02-22 15:44:03,654][00422] Fps is (10 sec: 2867.8, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 5787648. Throughput: 0: 814.9. Samples: 1447456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:44:03,660][00422] Avg episode reward: [(0, '24.198')] [2023-02-22 15:44:08,654][00422] Fps is (10 sec: 3687.7, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 5808128. Throughput: 0: 809.7. Samples: 1450542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:44:08,663][00422] Avg episode reward: [(0, '24.664')] [2023-02-22 15:44:09,908][11051] Updated weights for policy 0, policy_version 1420 (0.0015) [2023-02-22 15:44:13,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3482.1, 300 sec: 3526.8). Total num frames: 5832704. Throughput: 0: 823.6. Samples: 1457344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:44:13,661][00422] Avg episode reward: [(0, '25.822')] [2023-02-22 15:44:18,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 5844992. Throughput: 0: 854.9. Samples: 1462214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:44:18,662][00422] Avg episode reward: [(0, '24.461')] [2023-02-22 15:44:22,074][11051] Updated weights for policy 0, policy_version 1430 (0.0017) [2023-02-22 15:44:23,655][00422] Fps is (10 sec: 2866.9, 60 sec: 3345.0, 300 sec: 3526.7). Total num frames: 5861376. Throughput: 0: 865.0. Samples: 1464252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:44:23,661][00422] Avg episode reward: [(0, '22.600')] [2023-02-22 15:44:28,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 5881856. Throughput: 0: 908.7. Samples: 1469638. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-22 15:44:28,660][00422] Avg episode reward: [(0, '23.477')] [2023-02-22 15:44:32,248][11051] Updated weights for policy 0, policy_version 1440 (0.0014) [2023-02-22 15:44:33,654][00422] Fps is (10 sec: 4096.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5902336. Throughput: 0: 915.5. Samples: 1476318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:44:33,661][00422] Avg episode reward: [(0, '20.877')] [2023-02-22 15:44:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3512.8). Total num frames: 5918720. Throughput: 0: 905.6. Samples: 1479174. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:44:38,659][00422] Avg episode reward: [(0, '21.715')] [2023-02-22 15:44:43,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 5931008. Throughput: 0: 888.7. Samples: 1483310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:44:43,661][00422] Avg episode reward: [(0, '22.447')] [2023-02-22 15:44:45,417][11051] Updated weights for policy 0, policy_version 1450 (0.0014) [2023-02-22 15:44:48,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 5951488. Throughput: 0: 913.5. Samples: 1488562. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:44:48,656][00422] Avg episode reward: [(0, '24.276')] [2023-02-22 15:44:53,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 5976064. Throughput: 0: 921.0. Samples: 1491988. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:44:53,657][00422] Avg episode reward: [(0, '25.353')] [2023-02-22 15:44:54,547][11051] Updated weights for policy 0, policy_version 1460 (0.0024) [2023-02-22 15:44:58,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3686.6, 300 sec: 3526.7). Total num frames: 5992448. Throughput: 0: 902.0. Samples: 1497932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:44:58,657][00422] Avg episode reward: [(0, '26.932')] [2023-02-22 15:45:03,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 6004736. Throughput: 0: 883.3. Samples: 1501964. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-22 15:45:03,656][00422] Avg episode reward: [(0, '28.316')] [2023-02-22 15:45:08,051][11051] Updated weights for policy 0, policy_version 1470 (0.0017) [2023-02-22 15:45:08,654][00422] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 6021120. Throughput: 0: 886.1. Samples: 1504128. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:45:08,657][00422] Avg episode reward: [(0, '29.793')] [2023-02-22 15:45:08,660][11037] Saving new best policy, reward=29.793! [2023-02-22 15:45:13,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6045696. Throughput: 0: 913.1. Samples: 1510728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:45:13,656][00422] Avg episode reward: [(0, '29.667')] [2023-02-22 15:45:17,662][11051] Updated weights for policy 0, policy_version 1480 (0.0012) [2023-02-22 15:45:18,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 6062080. Throughput: 0: 895.4. Samples: 1516612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:45:18,656][00422] Avg episode reward: [(0, '28.659')] [2023-02-22 15:45:23,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3526.7). Total num frames: 6078464. Throughput: 0: 876.8. Samples: 1518630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:45:23,657][00422] Avg episode reward: [(0, '26.769')] [2023-02-22 15:45:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6094848. Throughput: 0: 878.1. Samples: 1522826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:45:28,657][00422] Avg episode reward: [(0, '24.210')] [2023-02-22 15:45:30,571][11051] Updated weights for policy 0, policy_version 1490 (0.0020) [2023-02-22 15:45:33,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6115328. Throughput: 0: 905.0. Samples: 1529288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:45:33,656][00422] Avg episode reward: [(0, '24.743')] [2023-02-22 15:45:38,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3526.8). Total num frames: 6135808. Throughput: 0: 900.5. Samples: 1532510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:45:38,668][00422] Avg episode reward: [(0, '24.180')] [2023-02-22 15:45:41,681][11051] Updated weights for policy 0, policy_version 1500 (0.0014) [2023-02-22 15:45:43,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 6148096. Throughput: 0: 871.0. Samples: 1537128. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:45:43,658][00422] Avg episode reward: [(0, '23.772')] [2023-02-22 15:45:43,679][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001501_6148096.pth... 
[2023-02-22 15:45:43,802][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001296_5308416.pth [2023-02-22 15:45:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6164480. Throughput: 0: 880.0. Samples: 1541562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:45:48,663][00422] Avg episode reward: [(0, '22.754')] [2023-02-22 15:45:52,895][11051] Updated weights for policy 0, policy_version 1510 (0.0027) [2023-02-22 15:45:53,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 6184960. Throughput: 0: 907.2. Samples: 1544952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:45:53,657][00422] Avg episode reward: [(0, '24.756')] [2023-02-22 15:45:58,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6209536. Throughput: 0: 908.6. Samples: 1551616. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:45:58,657][00422] Avg episode reward: [(0, '25.255')] [2023-02-22 15:46:03,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6221824. Throughput: 0: 878.6. Samples: 1556150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:03,664][00422] Avg episode reward: [(0, '25.560')] [2023-02-22 15:46:04,436][11051] Updated weights for policy 0, policy_version 1520 (0.0014) [2023-02-22 15:46:08,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6234112. Throughput: 0: 880.4. Samples: 1558246. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:46:08,659][00422] Avg episode reward: [(0, '25.721')] [2023-02-22 15:46:13,654][00422] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6258688. Throughput: 0: 909.9. Samples: 1563770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:46:13,656][00422] Avg episode reward: [(0, '25.613')] [2023-02-22 15:46:15,514][11051] Updated weights for policy 0, policy_version 1530 (0.0016) [2023-02-22 15:46:18,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 6279168. Throughput: 0: 913.3. Samples: 1570388. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:46:18,657][00422] Avg episode reward: [(0, '24.642')] [2023-02-22 15:46:23,654][00422] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6295552. Throughput: 0: 896.9. Samples: 1572872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:23,663][00422] Avg episode reward: [(0, '24.717')] [2023-02-22 15:46:27,862][11051] Updated weights for policy 0, policy_version 1540 (0.0039) [2023-02-22 15:46:28,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6307840. Throughput: 0: 888.5. Samples: 1577110. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:46:28,658][00422] Avg episode reward: [(0, '25.184')] [2023-02-22 15:46:33,654][00422] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6328320. Throughput: 0: 915.2. Samples: 1582748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:33,656][00422] Avg episode reward: [(0, '23.790')] [2023-02-22 15:46:37,848][11051] Updated weights for policy 0, policy_version 1550 (0.0013) [2023-02-22 15:46:38,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6348800. Throughput: 0: 913.4. Samples: 1586054. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:38,660][00422] Avg episode reward: [(0, '24.098')] [2023-02-22 15:46:43,654][00422] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 6365184. Throughput: 0: 891.9. Samples: 1591754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:43,657][00422] Avg episode reward: [(0, '24.927')] [2023-02-22 15:46:48,656][00422] Fps is (10 sec: 3276.2, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 6381568. Throughput: 0: 887.6. Samples: 1596094. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:48,658][00422] Avg episode reward: [(0, '25.676')] [2023-02-22 15:46:50,939][11051] Updated weights for policy 0, policy_version 1560 (0.0026) [2023-02-22 15:46:53,654][00422] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6402048. Throughput: 0: 893.0. Samples: 1598430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:53,657][00422] Avg episode reward: [(0, '26.208')] [2023-02-22 15:46:58,654][00422] Fps is (10 sec: 4096.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6422528. Throughput: 0: 920.3. Samples: 1605184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:46:58,657][00422] Avg episode reward: [(0, '24.629')] [2023-02-22 15:46:59,942][11051] Updated weights for policy 0, policy_version 1570 (0.0022) [2023-02-22 15:47:03,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 6438912. Throughput: 0: 900.3. Samples: 1610902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:47:03,661][00422] Avg episode reward: [(0, '24.671')] [2023-02-22 15:47:08,657][00422] Fps is (10 sec: 3275.6, 60 sec: 3686.2, 300 sec: 3540.6). Total num frames: 6455296. Throughput: 0: 889.5. Samples: 1612904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:47:08,660][00422] Avg episode reward: [(0, '24.646')] [2023-02-22 15:47:13,026][11051] Updated weights for policy 0, policy_version 1580 (0.0019) [2023-02-22 15:47:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6471680. Throughput: 0: 899.9. Samples: 1617604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:47:13,656][00422] Avg episode reward: [(0, '23.712')] [2023-02-22 15:47:18,654][00422] Fps is (10 sec: 4097.5, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6496256. Throughput: 0: 925.1. Samples: 1624376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:47:18,657][00422] Avg episode reward: [(0, '24.641')] [2023-02-22 15:47:22,515][11051] Updated weights for policy 0, policy_version 1590 (0.0021) [2023-02-22 15:47:23,655][00422] Fps is (10 sec: 4095.3, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6512640. Throughput: 0: 925.9. Samples: 1627722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:47:23,658][00422] Avg episode reward: [(0, '24.778')] [2023-02-22 15:47:28,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 6529024. Throughput: 0: 893.9. Samples: 1631980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:47:28,666][00422] Avg episode reward: [(0, '26.127')] [2023-02-22 15:47:33,654][00422] Fps is (10 sec: 3277.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6545408. Throughput: 0: 900.9. Samples: 1636632. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:47:33,661][00422] Avg episode reward: [(0, '26.145')] [2023-02-22 15:47:35,696][11051] Updated weights for policy 0, policy_version 1600 (0.0043) [2023-02-22 15:47:38,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6565888. Throughput: 0: 918.9. Samples: 1639782. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:47:38,657][00422] Avg episode reward: [(0, '28.450')] [2023-02-22 15:47:43,655][00422] Fps is (10 sec: 4095.4, 60 sec: 3686.3, 300 sec: 3554.5). Total num frames: 6586368. Throughput: 0: 918.5. Samples: 1646516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:47:43,662][00422] Avg episode reward: [(0, '28.809')] [2023-02-22 15:47:43,674][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001608_6586368.pth... [2023-02-22 15:47:43,827][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001395_5713920.pth [2023-02-22 15:47:46,364][11051] Updated weights for policy 0, policy_version 1610 (0.0020) [2023-02-22 15:47:48,654][00422] Fps is (10 sec: 3276.6, 60 sec: 3618.2, 300 sec: 3540.7). Total num frames: 6598656. Throughput: 0: 885.0. Samples: 1650726. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:47:48,661][00422] Avg episode reward: [(0, '28.620')] [2023-02-22 15:47:53,654][00422] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6615040. Throughput: 0: 886.9. Samples: 1652810. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:47:53,657][00422] Avg episode reward: [(0, '28.779')] [2023-02-22 15:47:57,784][11051] Updated weights for policy 0, policy_version 1620 (0.0014) [2023-02-22 15:47:58,654][00422] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6635520. Throughput: 0: 915.9. Samples: 1658818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:47:58,666][00422] Avg episode reward: [(0, '30.038')] [2023-02-22 15:47:58,751][11037] Saving new best policy, reward=30.038! [2023-02-22 15:48:03,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 6660096. Throughput: 0: 908.4. Samples: 1665252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:48:03,659][00422] Avg episode reward: [(0, '28.791')] [2023-02-22 15:48:08,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.4, 300 sec: 3554.6). Total num frames: 6672384. Throughput: 0: 879.5. Samples: 1667300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:48:08,659][00422] Avg episode reward: [(0, '27.207')] [2023-02-22 15:48:09,860][11051] Updated weights for policy 0, policy_version 1630 (0.0038) [2023-02-22 15:48:13,654][00422] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6684672. Throughput: 0: 880.3. Samples: 1671592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:48:13,659][00422] Avg episode reward: [(0, '27.064')] [2023-02-22 15:48:18,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6709248. Throughput: 0: 910.8. Samples: 1677620. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:48:18,662][00422] Avg episode reward: [(0, '27.782')] [2023-02-22 15:48:20,421][11051] Updated weights for policy 0, policy_version 1640 (0.0021) [2023-02-22 15:48:23,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 6729728. Throughput: 0: 915.0. Samples: 1680958. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:48:23,657][00422] Avg episode reward: [(0, '26.361')] [2023-02-22 15:48:28,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 6746112. Throughput: 0: 885.4. Samples: 1686358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:48:28,662][00422] Avg episode reward: [(0, '25.308')] [2023-02-22 15:48:32,935][11051] Updated weights for policy 0, policy_version 1650 (0.0016) [2023-02-22 15:48:33,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6758400. Throughput: 0: 886.2. Samples: 1690604. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:48:33,658][00422] Avg episode reward: [(0, '25.662')] [2023-02-22 15:48:38,654][00422] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3610.0). Total num frames: 6778880. Throughput: 0: 898.9. Samples: 1693260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:48:38,660][00422] Avg episode reward: [(0, '26.382')] [2023-02-22 15:48:42,859][11051] Updated weights for policy 0, policy_version 1660 (0.0031) [2023-02-22 15:48:43,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 6799360. Throughput: 0: 910.5. Samples: 1699792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:48:43,656][00422] Avg episode reward: [(0, '24.992')] [2023-02-22 15:48:48,655][00422] Fps is (10 sec: 3686.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 6815744. Throughput: 0: 886.4. Samples: 1705140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:48:48,661][00422] Avg episode reward: [(0, '24.999')] [2023-02-22 15:48:53,655][00422] Fps is (10 sec: 3276.3, 60 sec: 3618.0, 300 sec: 3596.2). Total num frames: 6832128. Throughput: 0: 887.3. Samples: 1707230. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-22 15:48:53,658][00422] Avg episode reward: [(0, '25.308')] [2023-02-22 15:48:55,994][11051] Updated weights for policy 0, policy_version 1670 (0.0021) [2023-02-22 15:48:58,654][00422] Fps is (10 sec: 3277.1, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6848512. Throughput: 0: 900.7. Samples: 1712122. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:48:58,657][00422] Avg episode reward: [(0, '27.265')] [2023-02-22 15:49:03,654][00422] Fps is (10 sec: 4096.6, 60 sec: 3549.9, 300 sec: 3610.0). Total num frames: 6873088. Throughput: 0: 918.4. Samples: 1718946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:49:03,657][00422] Avg episode reward: [(0, '27.182')] [2023-02-22 15:49:05,100][11051] Updated weights for policy 0, policy_version 1680 (0.0023) [2023-02-22 15:49:08,659][00422] Fps is (10 sec: 4093.9, 60 sec: 3617.8, 300 sec: 3582.2). Total num frames: 6889472. Throughput: 0: 913.6. Samples: 1722076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:49:08,662][00422] Avg episode reward: [(0, '26.674')] [2023-02-22 15:49:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 6905856. Throughput: 0: 885.7. Samples: 1726212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:49:13,661][00422] Avg episode reward: [(0, '26.854')] [2023-02-22 15:49:18,297][11051] Updated weights for policy 0, policy_version 1690 (0.0024) [2023-02-22 15:49:18,654][00422] Fps is (10 sec: 3278.6, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 6922240. Throughput: 0: 902.3. Samples: 1731208. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:49:18,658][00422] Avg episode reward: [(0, '27.894')] [2023-02-22 15:49:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6942720. Throughput: 0: 917.3. Samples: 1734540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:49:23,661][00422] Avg episode reward: [(0, '28.075')] [2023-02-22 15:49:28,043][11051] Updated weights for policy 0, policy_version 1700 (0.0019) [2023-02-22 15:49:28,655][00422] Fps is (10 sec: 4095.3, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 6963200. Throughput: 0: 911.3. Samples: 1740804. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:49:28,658][00422] Avg episode reward: [(0, '27.965')] [2023-02-22 15:49:33,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 6975488. Throughput: 0: 886.6. Samples: 1745036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:49:33,659][00422] Avg episode reward: [(0, '27.821')] [2023-02-22 15:49:38,654][00422] Fps is (10 sec: 2867.7, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6991872. Throughput: 0: 885.2. Samples: 1747064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:49:38,660][00422] Avg episode reward: [(0, '29.172')] [2023-02-22 15:49:41,050][11051] Updated weights for policy 0, policy_version 1710 (0.0013) [2023-02-22 15:49:43,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7012352. Throughput: 0: 910.8. Samples: 1753110. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:49:43,660][00422] Avg episode reward: [(0, '30.473')] [2023-02-22 15:49:43,671][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001712_7012352.pth... [2023-02-22 15:49:43,823][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001501_6148096.pth [2023-02-22 15:49:43,837][11037] Saving new best policy, reward=30.473! [2023-02-22 15:49:48,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 7032832. Throughput: 0: 895.2. Samples: 1759228. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:49:48,658][00422] Avg episode reward: [(0, '31.249')] [2023-02-22 15:49:48,676][11037] Saving new best policy, reward=31.249! [2023-02-22 15:49:52,028][11051] Updated weights for policy 0, policy_version 1720 (0.0024) [2023-02-22 15:49:53,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 7049216. Throughput: 0: 871.6. Samples: 1761292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:49:53,658][00422] Avg episode reward: [(0, '30.481')] [2023-02-22 15:49:58,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7061504. Throughput: 0: 875.1. Samples: 1765592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:49:58,656][00422] Avg episode reward: [(0, '29.766')] [2023-02-22 15:50:03,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3596.2). Total num frames: 7081984. Throughput: 0: 900.4. Samples: 1771724. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-22 15:50:03,662][00422] Avg episode reward: [(0, '29.802')] [2023-02-22 15:50:03,685][11051] Updated weights for policy 0, policy_version 1730 (0.0021) [2023-02-22 15:50:08,655][00422] Fps is (10 sec: 4505.2, 60 sec: 3618.4, 300 sec: 3596.1). Total num frames: 7106560. Throughput: 0: 898.0. Samples: 1774950. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:50:08,658][00422] Avg episode reward: [(0, '29.736')] [2023-02-22 15:50:13,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7118848. Throughput: 0: 868.3. Samples: 1779874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:13,661][00422] Avg episode reward: [(0, '28.134')] [2023-02-22 15:50:15,885][11051] Updated weights for policy 0, policy_version 1740 (0.0013) [2023-02-22 15:50:18,654][00422] Fps is (10 sec: 2457.8, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 7131136. Throughput: 0: 867.7. Samples: 1784084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:18,657][00422] Avg episode reward: [(0, '27.888')] [2023-02-22 15:50:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7155712. Throughput: 0: 889.3. Samples: 1787082. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:50:23,661][00422] Avg episode reward: [(0, '26.464')] [2023-02-22 15:50:26,394][11051] Updated weights for policy 0, policy_version 1750 (0.0024) [2023-02-22 15:50:28,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 7176192. Throughput: 0: 899.6. Samples: 1793592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:50:28,666][00422] Avg episode reward: [(0, '26.857')] [2023-02-22 15:50:33,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7188480. Throughput: 0: 872.0. Samples: 1798470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:33,661][00422] Avg episode reward: [(0, '28.215')] [2023-02-22 15:50:38,655][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7204864. Throughput: 0: 871.2. Samples: 1800498. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:50:38,662][00422] Avg episode reward: [(0, '27.221')] [2023-02-22 15:50:39,618][11051] Updated weights for policy 0, policy_version 1760 (0.0051) [2023-02-22 15:50:43,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7225344. Throughput: 0: 890.4. Samples: 1805662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:43,664][00422] Avg episode reward: [(0, '29.101')] [2023-02-22 15:50:48,654][00422] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7245824. Throughput: 0: 898.3. Samples: 1812146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:48,656][00422] Avg episode reward: [(0, '28.919')] [2023-02-22 15:50:49,092][11051] Updated weights for policy 0, policy_version 1770 (0.0015) [2023-02-22 15:50:53,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7262208. Throughput: 0: 890.1. Samples: 1815004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:50:53,657][00422] Avg episode reward: [(0, '29.262')] [2023-02-22 15:50:58,654][00422] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 7274496. Throughput: 0: 875.4. Samples: 1819268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:50:58,659][00422] Avg episode reward: [(0, '29.442')] [2023-02-22 15:51:02,277][11051] Updated weights for policy 0, policy_version 1780 (0.0051) [2023-02-22 15:51:03,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7294976. Throughput: 0: 895.8. Samples: 1824394. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:51:03,657][00422] Avg episode reward: [(0, '29.149')] [2023-02-22 15:51:08,654][00422] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 7315456. Throughput: 0: 903.3. Samples: 1827730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:51:08,657][00422] Avg episode reward: [(0, '28.024')] [2023-02-22 15:51:12,036][11051] Updated weights for policy 0, policy_version 1790 (0.0014) [2023-02-22 15:51:13,654][00422] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 7335936. Throughput: 0: 890.1. Samples: 1833646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:51:13,659][00422] Avg episode reward: [(0, '27.199')] [2023-02-22 15:51:18,656][00422] Fps is (10 sec: 3276.1, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 7348224. Throughput: 0: 875.1. Samples: 1837850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:51:18,663][00422] Avg episode reward: [(0, '26.633')] [2023-02-22 15:51:23,654][00422] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 7364608. Throughput: 0: 877.8. Samples: 1839998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:51:23,661][00422] Avg episode reward: [(0, '25.490')] [2023-02-22 15:51:24,865][11051] Updated weights for policy 0, policy_version 1800 (0.0020) [2023-02-22 15:51:28,654][00422] Fps is (10 sec: 4097.0, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7389184. Throughput: 0: 907.9. Samples: 1846516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:51:28,657][00422] Avg episode reward: [(0, '26.303')] [2023-02-22 15:51:33,656][00422] Fps is (10 sec: 4095.3, 60 sec: 3618.0, 300 sec: 3582.2). Total num frames: 7405568. Throughput: 0: 895.6. Samples: 1852452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:51:33,664][00422] Avg episode reward: [(0, '25.228')] [2023-02-22 15:51:35,598][11051] Updated weights for policy 0, policy_version 1810 (0.0028) [2023-02-22 15:51:38,662][00422] Fps is (10 sec: 3274.2, 60 sec: 3617.7, 300 sec: 3582.2). Total num frames: 7421952. Throughput: 0: 877.5. Samples: 1854500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:51:38,664][00422] Avg episode reward: [(0, '26.771')] [2023-02-22 15:51:43,654][00422] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 7434240. Throughput: 0: 875.1. Samples: 1858646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:51:43,660][00422] Avg episode reward: [(0, '26.876')] [2023-02-22 15:51:43,672][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001815_7434240.pth... [2023-02-22 15:51:43,810][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001608_6586368.pth [2023-02-22 15:51:47,822][11051] Updated weights for policy 0, policy_version 1820 (0.0015) [2023-02-22 15:51:48,654][00422] Fps is (10 sec: 3279.4, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 7454720. Throughput: 0: 896.8. Samples: 1864750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:51:48,656][00422] Avg episode reward: [(0, '26.746')] [2023-02-22 15:51:53,661][00422] Fps is (10 sec: 4092.9, 60 sec: 3549.4, 300 sec: 3568.3). Total num frames: 7475200. Throughput: 0: 893.2. Samples: 1867932. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:51:53,664][00422] Avg episode reward: [(0, '27.016')] [2023-02-22 15:51:58,658][00422] Fps is (10 sec: 3684.7, 60 sec: 3617.9, 300 sec: 3568.3). Total num frames: 7491584. Throughput: 0: 864.6. Samples: 1872558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:51:58,661][00422] Avg episode reward: [(0, '26.595')] [2023-02-22 15:52:00,771][11051] Updated weights for policy 0, policy_version 1830 (0.0023) [2023-02-22 15:52:03,654][00422] Fps is (10 sec: 2459.5, 60 sec: 3413.3, 300 sec: 3540.7). Total num frames: 7499776. Throughput: 0: 840.5. Samples: 1875672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:03,662][00422] Avg episode reward: [(0, '26.911')] [2023-02-22 15:52:08,654][00422] Fps is (10 sec: 2048.9, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 7512064. Throughput: 0: 826.0. Samples: 1877166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:52:08,656][00422] Avg episode reward: [(0, '26.932')] [2023-02-22 15:52:13,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3499.0). Total num frames: 7528448. Throughput: 0: 773.5. Samples: 1881322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:13,657][00422] Avg episode reward: [(0, '27.445')] [2023-02-22 15:52:15,535][11051] Updated weights for policy 0, policy_version 1840 (0.0023) [2023-02-22 15:52:18,654][00422] Fps is (10 sec: 3276.7, 60 sec: 3276.9, 300 sec: 3499.0). Total num frames: 7544832. Throughput: 0: 766.6. Samples: 1886946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:18,660][00422] Avg episode reward: [(0, '28.586')] [2023-02-22 15:52:23,661][00422] Fps is (10 sec: 2865.1, 60 sec: 3208.2, 300 sec: 3485.0). Total num frames: 7557120. Throughput: 0: 765.6. Samples: 1888950. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:52:23,665][00422] Avg episode reward: [(0, '28.735')] [2023-02-22 15:52:28,654][00422] Fps is (10 sec: 2867.3, 60 sec: 3072.0, 300 sec: 3485.1). Total num frames: 7573504. Throughput: 0: 762.0. Samples: 1892938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:52:28,662][00422] Avg episode reward: [(0, '29.505')] [2023-02-22 15:52:29,211][11051] Updated weights for policy 0, policy_version 1850 (0.0026) [2023-02-22 15:52:33,654][00422] Fps is (10 sec: 3689.1, 60 sec: 3140.4, 300 sec: 3485.1). Total num frames: 7593984. Throughput: 0: 773.7. Samples: 1899568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:52:33,661][00422] Avg episode reward: [(0, '31.112')] [2023-02-22 15:52:38,661][00422] Fps is (10 sec: 4092.9, 60 sec: 3208.6, 300 sec: 3485.0). Total num frames: 7614464. Throughput: 0: 777.4. Samples: 1902916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:38,663][00422] Avg episode reward: [(0, '30.843')] [2023-02-22 15:52:38,993][11051] Updated weights for policy 0, policy_version 1860 (0.0039) [2023-02-22 15:52:43,659][00422] Fps is (10 sec: 3684.3, 60 sec: 3276.5, 300 sec: 3498.9). Total num frames: 7630848. Throughput: 0: 775.8. Samples: 1907468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:43,661][00422] Avg episode reward: [(0, '31.225')] [2023-02-22 15:52:48,654][00422] Fps is (10 sec: 2869.4, 60 sec: 3140.3, 300 sec: 3485.1). Total num frames: 7643136. Throughput: 0: 802.5. Samples: 1911786. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:52:48,656][00422] Avg episode reward: [(0, '30.626')] [2023-02-22 15:52:51,698][11051] Updated weights for policy 0, policy_version 1870 (0.0016) [2023-02-22 15:52:53,654][00422] Fps is (10 sec: 3688.5, 60 sec: 3208.9, 300 sec: 3499.0). Total num frames: 7667712. Throughput: 0: 844.9. Samples: 1915186. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-22 15:52:53,657][00422] Avg episode reward: [(0, '28.883')] [2023-02-22 15:52:58,654][00422] Fps is (10 sec: 4505.6, 60 sec: 3277.1, 300 sec: 3485.1). Total num frames: 7688192. Throughput: 0: 900.4. Samples: 1921838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:52:58,656][00422] Avg episode reward: [(0, '27.482')] [2023-02-22 15:53:02,404][11051] Updated weights for policy 0, policy_version 1880 (0.0013) [2023-02-22 15:53:03,658][00422] Fps is (10 sec: 3275.3, 60 sec: 3344.8, 300 sec: 3485.0). Total num frames: 7700480. Throughput: 0: 875.3. Samples: 1926338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:53:03,662][00422] Avg episode reward: [(0, '27.477')] [2023-02-22 15:53:08,655][00422] Fps is (10 sec: 2866.9, 60 sec: 3413.3, 300 sec: 3498.9). Total num frames: 7716864. Throughput: 0: 876.7. Samples: 1928398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:53:08,657][00422] Avg episode reward: [(0, '27.912')] [2023-02-22 15:53:13,654][00422] Fps is (10 sec: 3688.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 7737344. Throughput: 0: 912.7. Samples: 1934010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:53:13,656][00422] Avg episode reward: [(0, '27.878')] [2023-02-22 15:53:14,183][11051] Updated weights for policy 0, policy_version 1890 (0.0026) [2023-02-22 15:53:18,654][00422] Fps is (10 sec: 4096.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 7757824. Throughput: 0: 909.6. Samples: 1940500. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:53:18,658][00422] Avg episode reward: [(0, '28.179')] [2023-02-22 15:53:23,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3618.6, 300 sec: 3485.1). Total num frames: 7774208. Throughput: 0: 887.2. Samples: 1942832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:53:23,656][00422] Avg episode reward: [(0, '29.782')] [2023-02-22 15:53:26,107][11051] Updated weights for policy 0, policy_version 1900 (0.0023) [2023-02-22 15:53:28,655][00422] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 7786496. Throughput: 0: 877.9. Samples: 1946970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:53:28,658][00422] Avg episode reward: [(0, '30.973')] [2023-02-22 15:53:33,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 7806976. Throughput: 0: 902.0. Samples: 1952378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:53:33,663][00422] Avg episode reward: [(0, '30.279')] [2023-02-22 15:53:37,021][11051] Updated weights for policy 0, policy_version 1910 (0.0025) [2023-02-22 15:53:38,654][00422] Fps is (10 sec: 4096.6, 60 sec: 3550.3, 300 sec: 3485.1). Total num frames: 7827456. Throughput: 0: 897.5. Samples: 1955572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:53:38,662][00422] Avg episode reward: [(0, '31.529')] [2023-02-22 15:53:38,670][11037] Saving new best policy, reward=31.529! [2023-02-22 15:53:43,654][00422] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3485.1). Total num frames: 7843840. Throughput: 0: 876.8. Samples: 1961296. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:53:43,659][00422] Avg episode reward: [(0, '30.590')] [2023-02-22 15:53:43,678][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001915_7843840.pth... [2023-02-22 15:53:43,904][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001712_7012352.pth [2023-02-22 15:53:48,654][00422] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 7856128. Throughput: 0: 868.1. Samples: 1965400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:53:48,662][00422] Avg episode reward: [(0, '29.400')] [2023-02-22 15:53:50,227][11051] Updated weights for policy 0, policy_version 1920 (0.0032) [2023-02-22 15:53:53,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 7876608. Throughput: 0: 870.9. Samples: 1967586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:53:53,657][00422] Avg episode reward: [(0, '29.537')] [2023-02-22 15:53:58,654][00422] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 7897088. Throughput: 0: 892.5. Samples: 1974174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:53:58,657][00422] Avg episode reward: [(0, '29.224')] [2023-02-22 15:53:59,887][11051] Updated weights for policy 0, policy_version 1930 (0.0022) [2023-02-22 15:54:03,657][00422] Fps is (10 sec: 3686.3, 60 sec: 3550.1, 300 sec: 3471.2). Total num frames: 7913472. Throughput: 0: 872.4. Samples: 1979760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-22 15:54:03,664][00422] Avg episode reward: [(0, '29.541')] [2023-02-22 15:54:08,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 7929856. Throughput: 0: 867.5. Samples: 1981868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-22 15:54:08,657][00422] Avg episode reward: [(0, '28.769')] [2023-02-22 15:54:13,095][11051] Updated weights for policy 0, policy_version 1940 (0.0017) [2023-02-22 15:54:13,654][00422] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 7946240. Throughput: 0: 873.0. Samples: 1986252. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-22 15:54:13,665][00422] Avg episode reward: [(0, '28.519')] [2023-02-22 15:54:18,654][00422] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 7966720. Throughput: 0: 899.8. Samples: 1992870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:54:18,657][00422] Avg episode reward: [(0, '27.641')] [2023-02-22 15:54:22,591][11051] Updated weights for policy 0, policy_version 1950 (0.0021) [2023-02-22 15:54:23,655][00422] Fps is (10 sec: 4095.6, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 7987200. Throughput: 0: 902.7. Samples: 1996196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-22 15:54:23,657][00422] Avg episode reward: [(0, '26.242')] [2023-02-22 15:54:28,654][00422] Fps is (10 sec: 3686.5, 60 sec: 3618.2, 300 sec: 3485.1). Total num frames: 8003584. Throughput: 0: 876.0. Samples: 2000718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-22 15:54:28,661][00422] Avg episode reward: [(0, '25.961')] [2023-02-22 15:54:29,813][11037] Stopping Batcher_0... [2023-02-22 15:54:29,814][11037] Loop batcher_evt_loop terminating... [2023-02-22 15:54:29,814][00422] Component Batcher_0 stopped! [2023-02-22 15:54:29,820][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... 
[2023-02-22 15:54:29,880][00422] Component RolloutWorker_w7 stopped! [2023-02-22 15:54:29,885][11058] Stopping RolloutWorker_w7... [2023-02-22 15:54:29,886][11058] Loop rollout_proc7_evt_loop terminating... [2023-02-22 15:54:29,916][00422] Component RolloutWorker_w1 stopped! [2023-02-22 15:54:29,915][11053] Stopping RolloutWorker_w1... [2023-02-22 15:54:29,927][11053] Loop rollout_proc1_evt_loop terminating... [2023-02-22 15:54:29,928][00422] Component RolloutWorker_w5 stopped! [2023-02-22 15:54:29,931][11056] Stopping RolloutWorker_w5... [2023-02-22 15:54:29,931][11059] Stopping RolloutWorker_w6... [2023-02-22 15:54:29,932][11059] Loop rollout_proc6_evt_loop terminating... [2023-02-22 15:54:29,933][00422] Component RolloutWorker_w6 stopped! [2023-02-22 15:54:29,938][11056] Loop rollout_proc5_evt_loop terminating... [2023-02-22 15:54:29,949][11051] Weights refcount: 2 0 [2023-02-22 15:54:29,967][11051] Stopping InferenceWorker_p0-w0... [2023-02-22 15:54:29,968][11051] Loop inference_proc0-0_evt_loop terminating... [2023-02-22 15:54:29,968][00422] Component InferenceWorker_p0-w0 stopped! [2023-02-22 15:54:29,994][11054] Stopping RolloutWorker_w2... [2023-02-22 15:54:29,994][00422] Component RolloutWorker_w2 stopped! [2023-02-22 15:54:30,009][11057] Stopping RolloutWorker_w4... [2023-02-22 15:54:30,009][11057] Loop rollout_proc4_evt_loop terminating... [2023-02-22 15:54:30,009][00422] Component RolloutWorker_w4 stopped! [2023-02-22 15:54:30,013][11054] Loop rollout_proc2_evt_loop terminating... [2023-02-22 15:54:30,012][00422] Component RolloutWorker_w3 stopped! [2023-02-22 15:54:30,012][11055] Stopping RolloutWorker_w3... [2023-02-22 15:54:30,030][11055] Loop rollout_proc3_evt_loop terminating... [2023-02-22 15:54:30,037][11037] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001815_7434240.pth [2023-02-22 15:54:30,047][00422] Component RolloutWorker_w0 stopped! [2023-02-22 15:54:30,059][11037] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-22 15:54:30,047][11052] Stopping RolloutWorker_w0... [2023-02-22 15:54:30,062][11052] Loop rollout_proc0_evt_loop terminating... [2023-02-22 15:54:30,434][11037] Stopping LearnerWorker_p0... [2023-02-22 15:54:30,434][11037] Loop learner_proc0_evt_loop terminating... [2023-02-22 15:54:30,431][00422] Component LearnerWorker_p0 stopped! [2023-02-22 15:54:30,438][00422] Waiting for process learner_proc0 to stop... [2023-02-22 15:54:32,765][00422] Waiting for process inference_proc0-0 to join... [2023-02-22 15:54:33,234][00422] Waiting for process rollout_proc0 to join... [2023-02-22 15:54:33,709][00422] Waiting for process rollout_proc1 to join... [2023-02-22 15:54:33,711][00422] Waiting for process rollout_proc2 to join... [2023-02-22 15:54:33,714][00422] Waiting for process rollout_proc3 to join... [2023-02-22 15:54:33,720][00422] Waiting for process rollout_proc4 to join... [2023-02-22 15:54:33,722][00422] Waiting for process rollout_proc5 to join... [2023-02-22 15:54:33,726][00422] Waiting for process rollout_proc6 to join... [2023-02-22 15:54:33,729][00422] Waiting for process rollout_proc7 to join... 
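A note on the checkpoint entries in the training stream above: every couple of minutes the learner writes a new checkpoint_<policy_version>_<env_frames>.pth, deletes an older one, and keeps a separate snapshot whenever the average episode reward sets a new record (reward=29.793, then 30.038, 30.473, 31.249 and finally 31.529). The sketch below only illustrates that bookkeeping pattern; it is not Sample Factory's actual implementation, and the keep_last limit and the best-policy file name are assumptions.

import glob
import os
import torch  # assumption: the .pth files above are written with torch.save

def save_and_rotate(model, ckpt_dir, policy_version, env_frames,
                    avg_reward, best_reward, keep_last=2):
    """Illustrative checkpoint rotation + best-policy tracking (a sketch, not Sample Factory code)."""
    os.makedirs(ckpt_dir, exist_ok=True)
    name = f"checkpoint_{policy_version:09d}_{env_frames}.pth"  # e.g. checkpoint_000001955_8007680.pth
    torch.save(model.state_dict(), os.path.join(ckpt_dir, name))

    # Keep only the most recent checkpoints; zero-padded names sort chronologically.
    # This mirrors the paired "Saving ..." / "Removing ..." messages in the log.
    for old in sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))[:-keep_last]:
        os.remove(old)

    # Separate best-policy snapshot, as in "Saving new best policy, reward=31.529!".
    if avg_reward > best_reward:
        torch.save(model.state_dict(), os.path.join(ckpt_dir, "best_policy.pth"))  # file name assumed
        best_reward = avg_reward
    return best_reward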
[2023-02-22 15:54:33,731][00422] Batcher 0 profile tree view: batching: 54.2114, releasing_batches: 0.0495 [2023-02-22 15:54:33,733][00422] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0000 wait_policy_total: 1091.2866 update_model: 15.9941 weight_update: 0.0017 one_step: 0.0166 handle_policy_step: 1101.7193 deserialize: 31.7119, stack: 6.1866, obs_to_device_normalize: 238.9195, forward: 534.0574, send_messages: 54.9229 prepare_outputs: 181.2210 to_cpu: 112.3702 [2023-02-22 15:54:33,735][00422] Learner 0 profile tree view: misc: 0.0144, prepare_batch: 29.7904 train: 152.2556 epoch_init: 0.0433, minibatch_init: 0.0426, losses_postprocess: 1.1138, kl_divergence: 1.3242, after_optimizer: 65.7935 calculate_losses: 53.9803 losses_init: 0.0146, forward_head: 3.5965, bptt_initial: 35.2596, tail: 2.1157, advantages_returns: 0.6018, losses: 6.7898 bptt: 4.8576 bptt_forward_core: 4.6296 update: 28.6753 clip: 2.8311 [2023-02-22 15:54:33,737][00422] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.8096, enqueue_policy_requests: 302.5586, env_step: 1736.4817, overhead: 44.2662, complete_rollouts: 14.6509 save_policy_outputs: 43.5605 split_output_tensors: 20.8468 [2023-02-22 15:54:33,739][00422] RolloutWorker_w7 profile tree view: wait_for_trajectories: 0.7683, enqueue_policy_requests: 302.4081, env_step: 1733.7851, overhead: 45.0899, complete_rollouts: 15.7729 save_policy_outputs: 42.6494 split_output_tensors: 20.7561 [2023-02-22 15:54:33,741][00422] Loop Runner_EvtLoop terminating... [2023-02-22 15:54:33,743][00422] Runner profile tree view: main_loop: 2325.6910 [2023-02-22 15:54:33,744][00422] Collected {0: 8007680}, FPS: 3443.1 [2023-02-22 15:54:33,807][00422] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-22 15:54:33,808][00422] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-22 15:54:33,810][00422] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-22 15:54:33,811][00422] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-22 15:54:33,812][00422] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-22 15:54:33,814][00422] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-22 15:54:33,815][00422] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-22 15:54:33,816][00422] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-22 15:54:33,818][00422] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-22 15:54:33,819][00422] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-22 15:54:33,821][00422] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-22 15:54:33,822][00422] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-22 15:54:33,823][00422] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-22 15:54:33,825][00422] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
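Two things worth noting at this point in the log. First, the final throughput figure is simply total environment frames divided by main-loop wall time: 8007680 frames / 2325.69 s ≈ 3443 frames per second, which matches the "Collected {0: 8007680}, FPS: 3443.1" line; the periodic "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)" entries during training are the same ratio measured over sliding windows of those lengths. Second, the "Overriding arg" / "Adding new argument" messages show how the evaluation run builds its configuration: it starts from the saved config.json and layers the evaluation options passed on the command line on top. A minimal sketch of that merge, using only the path, keys and values reported in the log (how Sample Factory actually performs it is not shown here and is assumed):

import json

# Load the training configuration that was saved at experiment startup.
with open("/content/train_dir/default_experiment/config.json") as f:
    cfg = json.load(f)

# Apply the evaluation-time overrides reported above.
cfg.update({
    "num_workers": 1,          # "Overriding arg 'num_workers' with value 1 passed from command line"
    "no_render": True,         # the "Adding new argument ..." entries
    "save_video": True,
    "max_num_episodes": 10,
    "eval_deterministic": False,
    "push_to_hub": False,      # set to True (with hf_repository) for the second pass further below
    "hf_repository": None,
})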
[2023-02-22 15:54:33,828][00422] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 15:54:33,867][00422] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-22 15:54:33,871][00422] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 15:54:33,875][00422] RunningMeanStd input shape: (1,) [2023-02-22 15:54:33,892][00422] ConvEncoder: input_channels=3 [2023-02-22 15:54:34,584][00422] Conv encoder output size: 512 [2023-02-22 15:54:34,586][00422] Policy head output size: 512 [2023-02-22 15:54:36,944][00422] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-22 15:54:38,265][00422] Num frames 100... [2023-02-22 15:54:38,390][00422] Num frames 200... [2023-02-22 15:54:38,506][00422] Num frames 300... [2023-02-22 15:54:38,621][00422] Num frames 400... [2023-02-22 15:54:38,733][00422] Num frames 500... [2023-02-22 15:54:38,853][00422] Num frames 600... [2023-02-22 15:54:38,966][00422] Num frames 700... [2023-02-22 15:54:39,083][00422] Num frames 800... [2023-02-22 15:54:39,198][00422] Num frames 900... [2023-02-22 15:54:39,310][00422] Num frames 1000... [2023-02-22 15:54:39,432][00422] Num frames 1100... [2023-02-22 15:54:39,544][00422] Num frames 1200... [2023-02-22 15:54:39,657][00422] Num frames 1300... [2023-02-22 15:54:39,770][00422] Num frames 1400... [2023-02-22 15:54:39,836][00422] Avg episode rewards: #0: 29.080, true rewards: #0: 14.080 [2023-02-22 15:54:39,838][00422] Avg episode reward: 29.080, avg true_objective: 14.080 [2023-02-22 15:54:39,959][00422] Num frames 1500... [2023-02-22 15:54:40,073][00422] Num frames 1600... [2023-02-22 15:54:40,188][00422] Num frames 1700... [2023-02-22 15:54:40,309][00422] Num frames 1800... [2023-02-22 15:54:40,433][00422] Num frames 1900... [2023-02-22 15:54:40,555][00422] Num frames 2000... [2023-02-22 15:54:40,669][00422] Num frames 2100... [2023-02-22 15:54:40,781][00422] Num frames 2200... [2023-02-22 15:54:40,891][00422] Num frames 2300... [2023-02-22 15:54:41,003][00422] Num frames 2400... [2023-02-22 15:54:41,119][00422] Num frames 2500... [2023-02-22 15:54:41,231][00422] Num frames 2600... [2023-02-22 15:54:41,346][00422] Num frames 2700... [2023-02-22 15:54:41,468][00422] Num frames 2800... [2023-02-22 15:54:41,581][00422] Num frames 2900... [2023-02-22 15:54:41,699][00422] Num frames 3000... [2023-02-22 15:54:41,814][00422] Num frames 3100... [2023-02-22 15:54:41,928][00422] Num frames 3200... [2023-02-22 15:54:42,044][00422] Num frames 3300... [2023-02-22 15:54:42,165][00422] Num frames 3400... [2023-02-22 15:54:42,279][00422] Num frames 3500... [2023-02-22 15:54:42,345][00422] Avg episode rewards: #0: 45.039, true rewards: #0: 17.540 [2023-02-22 15:54:42,346][00422] Avg episode reward: 45.039, avg true_objective: 17.540 [2023-02-22 15:54:42,462][00422] Num frames 3600... [2023-02-22 15:54:42,571][00422] Num frames 3700... [2023-02-22 15:54:42,692][00422] Num frames 3800... [2023-02-22 15:54:42,852][00422] Num frames 3900... [2023-02-22 15:54:43,008][00422] Num frames 4000... [2023-02-22 15:54:43,180][00422] Avg episode rewards: #0: 33.576, true rewards: #0: 13.577 [2023-02-22 15:54:43,183][00422] Avg episode reward: 33.576, avg true_objective: 13.577 [2023-02-22 15:54:43,231][00422] Num frames 4100... [2023-02-22 15:54:43,390][00422] Num frames 4200... [2023-02-22 15:54:43,562][00422] Num frames 4300... [2023-02-22 15:54:43,720][00422] Num frames 4400... [2023-02-22 15:54:43,882][00422] Num frames 4500... 
[2023-02-22 15:54:44,046][00422] Num frames 4600... [2023-02-22 15:54:44,206][00422] Num frames 4700... [2023-02-22 15:54:44,364][00422] Num frames 4800... [2023-02-22 15:54:44,521][00422] Num frames 4900... [2023-02-22 15:54:44,681][00422] Num frames 5000... [2023-02-22 15:54:44,836][00422] Num frames 5100... [2023-02-22 15:54:45,003][00422] Num frames 5200... [2023-02-22 15:54:45,164][00422] Num frames 5300... [2023-02-22 15:54:45,328][00422] Num frames 5400... [2023-02-22 15:54:45,506][00422] Num frames 5500... [2023-02-22 15:54:45,667][00422] Num frames 5600... [2023-02-22 15:54:45,824][00422] Num frames 5700... [2023-02-22 15:54:45,992][00422] Num frames 5800... [2023-02-22 15:54:46,157][00422] Num frames 5900... [2023-02-22 15:54:46,311][00422] Num frames 6000... [2023-02-22 15:54:46,432][00422] Num frames 6100... [2023-02-22 15:54:46,582][00422] Avg episode rewards: #0: 39.182, true rewards: #0: 15.433 [2023-02-22 15:54:46,584][00422] Avg episode reward: 39.182, avg true_objective: 15.433 [2023-02-22 15:54:46,621][00422] Num frames 6200... [2023-02-22 15:54:46,742][00422] Num frames 6300... [2023-02-22 15:54:46,861][00422] Num frames 6400... [2023-02-22 15:54:46,983][00422] Num frames 6500... [2023-02-22 15:54:47,110][00422] Num frames 6600... [2023-02-22 15:54:47,225][00422] Num frames 6700... [2023-02-22 15:54:47,349][00422] Num frames 6800... [2023-02-22 15:54:47,475][00422] Num frames 6900... [2023-02-22 15:54:47,597][00422] Num frames 7000... [2023-02-22 15:54:47,711][00422] Num frames 7100... [2023-02-22 15:54:47,826][00422] Num frames 7200... [2023-02-22 15:54:47,940][00422] Num frames 7300... [2023-02-22 15:54:48,059][00422] Num frames 7400... [2023-02-22 15:54:48,172][00422] Num frames 7500... [2023-02-22 15:54:48,284][00422] Num frames 7600... [2023-02-22 15:54:48,401][00422] Num frames 7700... [2023-02-22 15:54:48,517][00422] Num frames 7800... [2023-02-22 15:54:48,637][00422] Num frames 7900... [2023-02-22 15:54:48,750][00422] Num frames 8000... [2023-02-22 15:54:48,893][00422] Avg episode rewards: #0: 42.755, true rewards: #0: 16.156 [2023-02-22 15:54:48,895][00422] Avg episode reward: 42.755, avg true_objective: 16.156 [2023-02-22 15:54:48,927][00422] Num frames 8100... [2023-02-22 15:54:49,045][00422] Num frames 8200... [2023-02-22 15:54:49,158][00422] Num frames 8300... [2023-02-22 15:54:49,271][00422] Num frames 8400... [2023-02-22 15:54:49,357][00422] Avg episode rewards: #0: 36.713, true rewards: #0: 14.047 [2023-02-22 15:54:49,359][00422] Avg episode reward: 36.713, avg true_objective: 14.047 [2023-02-22 15:54:49,453][00422] Num frames 8500... [2023-02-22 15:54:49,569][00422] Num frames 8600... [2023-02-22 15:54:49,696][00422] Num frames 8700... [2023-02-22 15:54:49,818][00422] Num frames 8800... [2023-02-22 15:54:49,939][00422] Num frames 8900... [2023-02-22 15:54:50,053][00422] Num frames 9000... [2023-02-22 15:54:50,172][00422] Num frames 9100... [2023-02-22 15:54:50,287][00422] Num frames 9200... [2023-02-22 15:54:50,409][00422] Num frames 9300... [2023-02-22 15:54:50,523][00422] Num frames 9400... [2023-02-22 15:54:50,642][00422] Num frames 9500... [2023-02-22 15:54:50,756][00422] Num frames 9600... [2023-02-22 15:54:50,868][00422] Num frames 9700... [2023-02-22 15:54:50,981][00422] Num frames 9800... [2023-02-22 15:54:51,103][00422] Num frames 9900... [2023-02-22 15:54:51,220][00422] Num frames 10000... 
[2023-02-22 15:54:51,345][00422] Avg episode rewards: #0: 37.799, true rewards: #0: 14.371 [2023-02-22 15:54:51,347][00422] Avg episode reward: 37.799, avg true_objective: 14.371 [2023-02-22 15:54:51,402][00422] Num frames 10100... [2023-02-22 15:54:51,516][00422] Num frames 10200... [2023-02-22 15:54:51,638][00422] Num frames 10300... [2023-02-22 15:54:51,754][00422] Num frames 10400... [2023-02-22 15:54:51,876][00422] Num frames 10500... [2023-02-22 15:54:51,992][00422] Num frames 10600... [2023-02-22 15:54:52,111][00422] Num frames 10700... [2023-02-22 15:54:52,228][00422] Num frames 10800... [2023-02-22 15:54:52,350][00422] Num frames 10900... [2023-02-22 15:54:52,468][00422] Num frames 11000... [2023-02-22 15:54:52,548][00422] Avg episode rewards: #0: 35.900, true rewards: #0: 13.775 [2023-02-22 15:54:52,551][00422] Avg episode reward: 35.900, avg true_objective: 13.775 [2023-02-22 15:54:52,658][00422] Num frames 11100... [2023-02-22 15:54:52,785][00422] Num frames 11200... [2023-02-22 15:54:52,909][00422] Num frames 11300... [2023-02-22 15:54:53,027][00422] Num frames 11400... [2023-02-22 15:54:53,146][00422] Num frames 11500... [2023-02-22 15:54:53,272][00422] Num frames 11600... [2023-02-22 15:54:53,392][00422] Num frames 11700... [2023-02-22 15:54:53,508][00422] Num frames 11800... [2023-02-22 15:54:53,588][00422] Avg episode rewards: #0: 33.911, true rewards: #0: 13.133 [2023-02-22 15:54:53,590][00422] Avg episode reward: 33.911, avg true_objective: 13.133 [2023-02-22 15:54:53,696][00422] Num frames 11900... [2023-02-22 15:54:53,814][00422] Num frames 12000... [2023-02-22 15:54:53,937][00422] Num frames 12100... [2023-02-22 15:54:54,057][00422] Num frames 12200... [2023-02-22 15:54:54,193][00422] Num frames 12300... [2023-02-22 15:54:54,309][00422] Num frames 12400... [2023-02-22 15:54:54,398][00422] Avg episode rewards: #0: 31.628, true rewards: #0: 12.428 [2023-02-22 15:54:54,400][00422] Avg episode reward: 31.628, avg true_objective: 12.428 [2023-02-22 15:56:16,698][00422] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-22 15:56:17,139][00422] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-22 15:56:17,141][00422] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-22 15:56:17,144][00422] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-22 15:56:17,146][00422] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-22 15:56:17,148][00422] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-22 15:56:17,150][00422] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-22 15:56:17,151][00422] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-22 15:56:17,153][00422] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-22 15:56:17,154][00422] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-22 15:56:17,155][00422] Adding new argument 'hf_repository'='keyblade95/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-22 15:56:17,156][00422] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-22 15:56:17,158][00422] Adding new argument 'eval_deterministic'=False that is not in the saved config file! 
[2023-02-22 15:56:17,159][00422] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-22 15:56:17,160][00422] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-22 15:56:17,161][00422] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-22 15:56:17,186][00422] RunningMeanStd input shape: (3, 72, 128) [2023-02-22 15:56:17,189][00422] RunningMeanStd input shape: (1,) [2023-02-22 15:56:17,207][00422] ConvEncoder: input_channels=3 [2023-02-22 15:56:17,263][00422] Conv encoder output size: 512 [2023-02-22 15:56:17,265][00422] Policy head output size: 512 [2023-02-22 15:56:17,292][00422] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-22 15:56:17,997][00422] Num frames 100... [2023-02-22 15:56:18,154][00422] Num frames 200... [2023-02-22 15:56:18,305][00422] Num frames 300... [2023-02-22 15:56:18,458][00422] Num frames 400... [2023-02-22 15:56:18,604][00422] Num frames 500... [2023-02-22 15:56:18,768][00422] Num frames 600... [2023-02-22 15:56:18,917][00422] Num frames 700... [2023-02-22 15:56:19,074][00422] Num frames 800... [2023-02-22 15:56:19,252][00422] Num frames 900... [2023-02-22 15:56:19,401][00422] Num frames 1000... [2023-02-22 15:56:19,550][00422] Num frames 1100... [2023-02-22 15:56:19,703][00422] Num frames 1200... [2023-02-22 15:56:19,860][00422] Num frames 1300... [2023-02-22 15:56:20,006][00422] Num frames 1400... [2023-02-22 15:56:20,158][00422] Num frames 1500... [2023-02-22 15:56:20,311][00422] Num frames 1600... [2023-02-22 15:56:20,459][00422] Num frames 1700... [2023-02-22 15:56:20,620][00422] Num frames 1800... [2023-02-22 15:56:20,772][00422] Num frames 1900... [2023-02-22 15:56:20,856][00422] Avg episode rewards: #0: 54.189, true rewards: #0: 19.190 [2023-02-22 15:56:20,860][00422] Avg episode reward: 54.189, avg true_objective: 19.190 [2023-02-22 15:56:21,005][00422] Num frames 2000... [2023-02-22 15:56:21,164][00422] Num frames 2100... [2023-02-22 15:56:21,330][00422] Num frames 2200... [2023-02-22 15:56:21,491][00422] Num frames 2300... [2023-02-22 15:56:21,651][00422] Num frames 2400... [2023-02-22 15:56:21,822][00422] Num frames 2500... [2023-02-22 15:56:21,984][00422] Num frames 2600... [2023-02-22 15:56:22,138][00422] Num frames 2700... [2023-02-22 15:56:22,300][00422] Num frames 2800... [2023-02-22 15:56:22,457][00422] Num frames 2900... [2023-02-22 15:56:22,625][00422] Num frames 3000... [2023-02-22 15:56:22,779][00422] Num frames 3100... [2023-02-22 15:56:22,931][00422] Num frames 3200... [2023-02-22 15:56:23,082][00422] Num frames 3300... [2023-02-22 15:56:23,238][00422] Num frames 3400... [2023-02-22 15:56:23,399][00422] Num frames 3500... [2023-02-22 15:56:23,585][00422] Avg episode rewards: #0: 47.414, true rewards: #0: 17.915 [2023-02-22 15:56:23,588][00422] Avg episode reward: 47.414, avg true_objective: 17.915 [2023-02-22 15:56:23,624][00422] Num frames 3600... [2023-02-22 15:56:23,799][00422] Num frames 3700... [2023-02-22 15:56:23,968][00422] Num frames 3800... [2023-02-22 15:56:24,159][00422] Num frames 3900... [2023-02-22 15:56:24,319][00422] Num frames 4000... [2023-02-22 15:56:24,494][00422] Num frames 4100... [2023-02-22 15:56:24,678][00422] Num frames 4200... 
[2023-02-22 15:56:24,842][00422] Avg episode rewards: #0: 36.220, true rewards: #0: 14.220 [2023-02-22 15:56:24,844][00422] Avg episode reward: 36.220, avg true_objective: 14.220 [2023-02-22 15:56:24,918][00422] Num frames 4300... [2023-02-22 15:56:25,076][00422] Num frames 4400... [2023-02-22 15:56:25,231][00422] Num frames 4500... [2023-02-22 15:56:25,391][00422] Num frames 4600... [2023-02-22 15:56:25,545][00422] Num frames 4700... [2023-02-22 15:56:25,695][00422] Num frames 4800... [2023-02-22 15:56:25,853][00422] Num frames 4900... [2023-02-22 15:56:26,005][00422] Num frames 5000... [2023-02-22 15:56:26,159][00422] Num frames 5100... [2023-02-22 15:56:26,320][00422] Num frames 5200... [2023-02-22 15:56:26,527][00422] Avg episode rewards: #0: 31.975, true rewards: #0: 13.225 [2023-02-22 15:56:26,529][00422] Avg episode reward: 31.975, avg true_objective: 13.225 [2023-02-22 15:56:26,548][00422] Num frames 5300... [2023-02-22 15:56:26,710][00422] Num frames 5400... [2023-02-22 15:56:26,870][00422] Num frames 5500... [2023-02-22 15:56:27,027][00422] Num frames 5600... [2023-02-22 15:56:27,188][00422] Num frames 5700... [2023-02-22 15:56:27,347][00422] Num frames 5800... [2023-02-22 15:56:27,563][00422] Avg episode rewards: #0: 27.996, true rewards: #0: 11.796 [2023-02-22 15:56:27,565][00422] Avg episode reward: 27.996, avg true_objective: 11.796 [2023-02-22 15:56:27,570][00422] Num frames 5900... [2023-02-22 15:56:27,690][00422] Num frames 6000... [2023-02-22 15:56:27,817][00422] Num frames 6100... [2023-02-22 15:56:27,939][00422] Num frames 6200... [2023-02-22 15:56:28,060][00422] Num frames 6300... [2023-02-22 15:56:28,179][00422] Num frames 6400... [2023-02-22 15:56:28,297][00422] Num frames 6500... [2023-02-22 15:56:28,412][00422] Num frames 6600... [2023-02-22 15:56:28,531][00422] Num frames 6700... [2023-02-22 15:56:28,647][00422] Num frames 6800... [2023-02-22 15:56:28,767][00422] Avg episode rewards: #0: 27.263, true rewards: #0: 11.430 [2023-02-22 15:56:28,769][00422] Avg episode reward: 27.263, avg true_objective: 11.430 [2023-02-22 15:56:28,821][00422] Num frames 6900... [2023-02-22 15:56:28,941][00422] Num frames 7000... [2023-02-22 15:56:29,051][00422] Num frames 7100... [2023-02-22 15:56:29,161][00422] Num frames 7200... [2023-02-22 15:56:29,272][00422] Num frames 7300... [2023-02-22 15:56:29,390][00422] Num frames 7400... [2023-02-22 15:56:29,510][00422] Num frames 7500... [2023-02-22 15:56:29,621][00422] Num frames 7600... [2023-02-22 15:56:29,731][00422] Num frames 7700... [2023-02-22 15:56:29,882][00422] Avg episode rewards: #0: 26.267, true rewards: #0: 11.124 [2023-02-22 15:56:29,884][00422] Avg episode reward: 26.267, avg true_objective: 11.124 [2023-02-22 15:56:29,903][00422] Num frames 7800... [2023-02-22 15:56:30,021][00422] Num frames 7900... [2023-02-22 15:56:30,140][00422] Num frames 8000... [2023-02-22 15:56:30,259][00422] Num frames 8100... [2023-02-22 15:56:30,378][00422] Num frames 8200... [2023-02-22 15:56:30,496][00422] Num frames 8300... [2023-02-22 15:56:30,625][00422] Num frames 8400... [2023-02-22 15:56:30,744][00422] Num frames 8500... [2023-02-22 15:56:30,904][00422] Avg episode rewards: #0: 24.859, true rewards: #0: 10.734 [2023-02-22 15:56:30,908][00422] Avg episode reward: 24.859, avg true_objective: 10.734 [2023-02-22 15:56:30,929][00422] Num frames 8600... [2023-02-22 15:56:31,054][00422] Num frames 8700... [2023-02-22 15:56:31,167][00422] Num frames 8800... [2023-02-22 15:56:31,278][00422] Num frames 8900... 
[2023-02-22 15:56:31,393][00422] Num frames 9000... [2023-02-22 15:56:31,505][00422] Num frames 9100... [2023-02-22 15:56:31,623][00422] Num frames 9200... [2023-02-22 15:56:31,733][00422] Num frames 9300... [2023-02-22 15:56:31,844][00422] Num frames 9400... [2023-02-22 15:56:31,957][00422] Num frames 9500... [2023-02-22 15:56:32,069][00422] Num frames 9600... [2023-02-22 15:56:32,184][00422] Num frames 9700... [2023-02-22 15:56:32,297][00422] Num frames 9800... [2023-02-22 15:56:32,415][00422] Num frames 9900... [2023-02-22 15:56:32,531][00422] Num frames 10000... [2023-02-22 15:56:32,647][00422] Num frames 10100... [2023-02-22 15:56:32,756][00422] Num frames 10200... [2023-02-22 15:56:32,915][00422] Avg episode rewards: #0: 27.095, true rewards: #0: 11.429 [2023-02-22 15:56:32,917][00422] Avg episode reward: 27.095, avg true_objective: 11.429 [2023-02-22 15:56:32,937][00422] Num frames 10300... [2023-02-22 15:56:33,054][00422] Num frames 10400... [2023-02-22 15:56:33,171][00422] Num frames 10500... [2023-02-22 15:56:33,288][00422] Num frames 10600... [2023-02-22 15:56:33,407][00422] Num frames 10700... [2023-02-22 15:56:33,520][00422] Num frames 10800... [2023-02-22 15:56:33,639][00422] Num frames 10900... [2023-02-22 15:56:33,753][00422] Num frames 11000... [2023-02-22 15:56:33,870][00422] Num frames 11100... [2023-02-22 15:56:33,989][00422] Num frames 11200... [2023-02-22 15:56:34,103][00422] Num frames 11300... [2023-02-22 15:56:34,217][00422] Num frames 11400... [2023-02-22 15:56:34,331][00422] Num frames 11500... [2023-02-22 15:56:34,444][00422] Num frames 11600... [2023-02-22 15:56:34,558][00422] Num frames 11700... [2023-02-22 15:56:34,675][00422] Num frames 11800... [2023-02-22 15:56:34,790][00422] Avg episode rewards: #0: 28.554, true rewards: #0: 11.854 [2023-02-22 15:56:34,793][00422] Avg episode reward: 28.554, avg true_objective: 11.854 [2023-02-22 15:57:54,244][00422] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
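Both evaluation passes above load the same checkpoint (checkpoint_000001955_8007680.pth, which by Sample Factory's naming appears to correspond to training iteration 1955 after roughly 8M environment steps) and differ only in the extra arguments of the second pass, which enable pushing the result to the Hugging Face Hub. For reference, a run that produces a log like this can be launched through Sample Factory's `enjoy` entry point. The sketch below is a minimal reconstruction from the arguments recorded in the log; `parse_vizdoom_cfg` is assumed here to be the Deep RL course notebook helper that registers the VizDoom environments and parses these flags (it is not a core library API), and the repository name is simply the one that appears in the log above.

```python
# Minimal sketch of the evaluation + Hub upload pass recorded above.
# Assumptions: sample-factory 2.x with the VizDoom example envs registered, and a
# notebook helper `parse_vizdoom_cfg` (as in the Deep RL course) that wraps Sample
# Factory's argument parsing for the Doom environments.
from sample_factory.enjoy import enjoy

env = "doom_health_gathering_supreme"
cfg = parse_vizdoom_cfg(
    argv=[
        f"--env={env}",
        "--num_workers=1",          # 'num_workers' override seen in the log
        "--save_video",             # writes replay.mp4 under the experiment dir
        "--no_render",
        "--max_num_episodes=10",    # ten 'Avg episode rewards' updates per pass
        "--max_num_frames=100000",
        "--push_to_hub",
        "--hf_repository=keyblade95/rl_course_vizdoom_health_gathering_supreme",
    ],
    evaluation=True,
)
# Rolls out episodes from the loaded checkpoint, saves the replay video, and,
# with --push_to_hub set, uploads the result to the Hub.
status = enjoy(cfg)
```

Each "Avg episode rewards" entry is a running average over the episodes completed so far, reported alongside "true rewards" / "avg true_objective", which in Sample Factory generally tracks the environment's unshaped objective rather than the shaped training reward. To pull these summaries out of a saved log, a small self-contained sketch (assuming the exact line format shown above):

```python
import re

# Running-average lines look like:
# "Avg episode rewards: #0: 28.554, true rewards: #0: 11.854"
PATTERN = re.compile(r"Avg episode rewards: #0: ([\d.]+), true rewards: #0: ([\d.]+)")

def summarize(log_text):
    """Return the number of completed episodes and the final running averages."""
    matches = [(float(r), float(t)) for r, t in PATTERN.findall(log_text)]
    if not matches:
        return None
    avg_reward, avg_true = matches[-1]  # the last entry covers all episodes so far
    return {
        "episodes": len(matches),
        "avg_episode_reward": avg_reward,
        "avg_true_objective": avg_true,
    }

# Usage: summarize(open("enjoy.log").read())
```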