[2023-02-25 18:43:57,962][00219] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-25 18:43:57,966][00219] Rollout worker 0 uses device cpu
[2023-02-25 18:43:57,969][00219] Rollout worker 1 uses device cpu
[2023-02-25 18:43:57,971][00219] Rollout worker 2 uses device cpu
[2023-02-25 18:43:57,972][00219] Rollout worker 3 uses device cpu
[2023-02-25 18:43:57,973][00219] Rollout worker 4 uses device cpu
[2023-02-25 18:43:57,975][00219] Rollout worker 5 uses device cpu
[2023-02-25 18:43:57,976][00219] Rollout worker 6 uses device cpu
[2023-02-25 18:43:57,977][00219] Rollout worker 7 uses device cpu
[2023-02-25 18:43:58,173][00219] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 18:43:58,177][00219] InferenceWorker_p0-w0: min num requests: 2
[2023-02-25 18:43:58,212][00219] Starting all processes...
[2023-02-25 18:43:58,214][00219] Starting process learner_proc0
[2023-02-25 18:43:58,264][00219] Starting all processes...
[2023-02-25 18:43:58,272][00219] Starting process inference_proc0-0
[2023-02-25 18:43:58,273][00219] Starting process rollout_proc0
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc1
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc2
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc3
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc4
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc5
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc6
[2023-02-25 18:43:58,275][00219] Starting process rollout_proc7
[2023-02-25 18:44:09,268][12589] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 18:44:09,270][12589] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-25 18:44:09,401][12607] Worker 4 uses CPU cores [0]
[2023-02-25 18:44:09,457][12610] Worker 6 uses CPU cores [0]
[2023-02-25 18:44:09,597][12605] Worker 1 uses CPU cores [1]
[2023-02-25 18:44:09,605][12608] Worker 3 uses CPU cores [1]
[2023-02-25 18:44:09,668][12603] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 18:44:09,672][12603] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-25 18:44:09,694][12609] Worker 5 uses CPU cores [1]
[2023-02-25 18:44:09,733][12604] Worker 0 uses CPU cores [0]
[2023-02-25 18:44:09,764][12606] Worker 2 uses CPU cores [0]
[2023-02-25 18:44:09,825][12611] Worker 7 uses CPU cores [1]
[2023-02-25 18:44:10,218][12589] Num visible devices: 1
[2023-02-25 18:44:10,219][12603] Num visible devices: 1
[2023-02-25 18:44:10,232][12589] Starting seed is not provided
[2023-02-25 18:44:10,232][12589] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 18:44:10,232][12589] Initializing actor-critic model on device cuda:0
[2023-02-25 18:44:10,233][12589] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 18:44:10,235][12589] RunningMeanStd input shape: (1,)
[2023-02-25 18:44:10,247][12589] ConvEncoder: input_channels=3
[2023-02-25 18:44:10,525][12589] Conv encoder output size: 512
[2023-02-25 18:44:10,526][12589] Policy head output size: 512
[2023-02-25 18:44:10,588][12589] Created Actor Critic model with architecture:
[2023-02-25 18:44:10,588][12589] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-25 18:44:18,166][00219] Heartbeat connected on Batcher_0
[2023-02-25 18:44:18,173][00219] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-25 18:44:18,185][00219] Heartbeat connected on RolloutWorker_w0
[2023-02-25 18:44:18,188][00219] Heartbeat connected on RolloutWorker_w1
[2023-02-25 18:44:18,192][00219] Heartbeat connected on RolloutWorker_w2
[2023-02-25 18:44:18,196][00219] Heartbeat connected on RolloutWorker_w3
[2023-02-25 18:44:18,201][00219] Heartbeat connected on RolloutWorker_w4
[2023-02-25 18:44:18,204][00219] Heartbeat connected on RolloutWorker_w5
[2023-02-25 18:44:18,207][00219] Heartbeat connected on RolloutWorker_w6
[2023-02-25 18:44:18,210][00219] Heartbeat connected on RolloutWorker_w7
[2023-02-25 18:44:18,380][12589] Using optimizer
[2023-02-25 18:44:18,382][12589] No checkpoints found
[2023-02-25 18:44:18,382][12589] Did not load from checkpoint, starting from scratch!
[2023-02-25 18:44:18,383][12589] Initialized policy 0 weights for model version 0
[2023-02-25 18:44:18,386][12589] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 18:44:18,393][12589] LearnerWorker_p0 finished initialization!
[2023-02-25 18:44:18,394][00219] Heartbeat connected on LearnerWorker_p0
[2023-02-25 18:44:18,475][00219] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0.
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 18:44:18,592][12603] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 18:44:18,593][12603] RunningMeanStd input shape: (1,) [2023-02-25 18:44:18,605][12603] ConvEncoder: input_channels=3 [2023-02-25 18:44:18,704][12603] Conv encoder output size: 512 [2023-02-25 18:44:18,705][12603] Policy head output size: 512 [2023-02-25 18:44:21,257][00219] Inference worker 0-0 is ready! [2023-02-25 18:44:21,260][00219] All inference workers are ready! Signal rollout workers to start! [2023-02-25 18:44:21,391][12609] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,402][12611] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,420][12608] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,422][12605] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,488][12604] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,518][12607] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,523][12610] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:21,534][12606] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 18:44:23,005][12611] Decorrelating experience for 0 frames... [2023-02-25 18:44:23,006][12608] Decorrelating experience for 0 frames... [2023-02-25 18:44:23,006][12606] Decorrelating experience for 0 frames... [2023-02-25 18:44:23,016][12605] Decorrelating experience for 0 frames... [2023-02-25 18:44:23,017][12609] Decorrelating experience for 0 frames... [2023-02-25 18:44:23,474][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 18:44:23,931][12605] Decorrelating experience for 32 frames... [2023-02-25 18:44:23,940][12608] Decorrelating experience for 32 frames... 
[2023-02-25 18:44:24,449][12604] Decorrelating experience for 0 frames... [2023-02-25 18:44:24,451][12607] Decorrelating experience for 0 frames... [2023-02-25 18:44:24,485][12606] Decorrelating experience for 32 frames... [2023-02-25 18:44:25,260][12611] Decorrelating experience for 32 frames... [2023-02-25 18:44:25,506][12608] Decorrelating experience for 64 frames... [2023-02-25 18:44:25,808][12604] Decorrelating experience for 32 frames... [2023-02-25 18:44:25,888][12607] Decorrelating experience for 32 frames... [2023-02-25 18:44:26,007][12605] Decorrelating experience for 64 frames... [2023-02-25 18:44:26,115][12606] Decorrelating experience for 64 frames... [2023-02-25 18:44:26,620][12610] Decorrelating experience for 0 frames... [2023-02-25 18:44:27,129][12611] Decorrelating experience for 64 frames... [2023-02-25 18:44:27,243][12608] Decorrelating experience for 96 frames... [2023-02-25 18:44:27,357][12604] Decorrelating experience for 64 frames... [2023-02-25 18:44:27,364][12610] Decorrelating experience for 32 frames... [2023-02-25 18:44:27,399][12609] Decorrelating experience for 32 frames... [2023-02-25 18:44:27,594][12605] Decorrelating experience for 96 frames... [2023-02-25 18:44:28,300][12611] Decorrelating experience for 96 frames... [2023-02-25 18:44:28,396][12607] Decorrelating experience for 64 frames... [2023-02-25 18:44:28,459][12609] Decorrelating experience for 64 frames... [2023-02-25 18:44:28,477][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 18:44:28,629][12604] Decorrelating experience for 96 frames... [2023-02-25 18:44:28,865][12609] Decorrelating experience for 96 frames... [2023-02-25 18:44:29,335][12606] Decorrelating experience for 96 frames... [2023-02-25 18:44:29,434][12610] Decorrelating experience for 64 frames... [2023-02-25 18:44:29,741][12607] Decorrelating experience for 96 frames... 
[2023-02-25 18:44:30,071][12610] Decorrelating experience for 96 frames... [2023-02-25 18:44:33,474][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 125.9. Samples: 1888. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 18:44:33,481][00219] Avg episode reward: [(0, '1.546')] [2023-02-25 18:44:33,723][12589] Signal inference workers to stop experience collection... [2023-02-25 18:44:33,755][12603] InferenceWorker_p0-w0: stopping experience collection [2023-02-25 18:44:36,237][12589] Signal inference workers to resume experience collection... [2023-02-25 18:44:36,238][12603] InferenceWorker_p0-w0: resuming experience collection [2023-02-25 18:44:38,475][00219] Fps is (10 sec: 409.7, 60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096. Throughput: 0: 112.2. Samples: 2244. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-25 18:44:38,481][00219] Avg episode reward: [(0, '2.260')] [2023-02-25 18:44:43,474][00219] Fps is (10 sec: 2048.0, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 20480. Throughput: 0: 229.9. Samples: 5748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:44:43,476][00219] Avg episode reward: [(0, '3.448')] [2023-02-25 18:44:47,940][12603] Updated weights for policy 0, policy_version 10 (0.0030) [2023-02-25 18:44:48,474][00219] Fps is (10 sec: 3686.7, 60 sec: 1365.4, 300 sec: 1365.4). Total num frames: 40960. Throughput: 0: 371.2. Samples: 11136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:44:48,482][00219] Avg episode reward: [(0, '4.223')] [2023-02-25 18:44:53,474][00219] Fps is (10 sec: 4505.6, 60 sec: 1872.5, 300 sec: 1872.5). Total num frames: 65536. Throughput: 0: 416.5. Samples: 14578. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:44:53,476][00219] Avg episode reward: [(0, '4.598')] [2023-02-25 18:44:57,383][12603] Updated weights for policy 0, policy_version 20 (0.0022) [2023-02-25 18:44:58,474][00219] Fps is (10 sec: 4096.0, 60 sec: 2048.1, 300 sec: 2048.1). Total num frames: 81920. Throughput: 0: 525.1. Samples: 21002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:44:58,476][00219] Avg episode reward: [(0, '4.443')] [2023-02-25 18:45:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 2184.6, 300 sec: 2184.6). Total num frames: 98304. Throughput: 0: 566.5. Samples: 25494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:45:03,481][00219] Avg episode reward: [(0, '4.268')] [2023-02-25 18:45:08,474][00219] Fps is (10 sec: 3686.3, 60 sec: 2375.7, 300 sec: 2375.7). Total num frames: 118784. Throughput: 0: 622.3. Samples: 28002. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-25 18:45:08,476][00219] Avg episode reward: [(0, '4.393')] [2023-02-25 18:45:08,492][12589] Saving new best policy, reward=4.393! [2023-02-25 18:45:09,211][12603] Updated weights for policy 0, policy_version 30 (0.0021) [2023-02-25 18:45:13,474][00219] Fps is (10 sec: 4096.0, 60 sec: 2532.1, 300 sec: 2532.1). Total num frames: 139264. Throughput: 0: 774.4. Samples: 34846. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:13,479][00219] Avg episode reward: [(0, '4.496')] [2023-02-25 18:45:13,487][12589] Saving new best policy, reward=4.496! [2023-02-25 18:45:18,482][00219] Fps is (10 sec: 4092.8, 60 sec: 2662.1, 300 sec: 2662.1). Total num frames: 159744. Throughput: 0: 863.9. Samples: 40772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:18,486][00219] Avg episode reward: [(0, '4.528')] [2023-02-25 18:45:18,491][12589] Saving new best policy, reward=4.528! 
[2023-02-25 18:45:19,448][12603] Updated weights for policy 0, policy_version 40 (0.0023) [2023-02-25 18:45:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 2867.2, 300 sec: 2646.7). Total num frames: 172032. Throughput: 0: 903.6. Samples: 42904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:45:23,481][00219] Avg episode reward: [(0, '4.374')] [2023-02-25 18:45:28,474][00219] Fps is (10 sec: 3279.4, 60 sec: 3208.7, 300 sec: 2750.2). Total num frames: 192512. Throughput: 0: 940.2. Samples: 48058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:28,477][00219] Avg episode reward: [(0, '4.430')] [2023-02-25 18:45:30,539][12603] Updated weights for policy 0, policy_version 50 (0.0021) [2023-02-25 18:45:33,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 2894.5). Total num frames: 217088. Throughput: 0: 974.6. Samples: 54994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:45:33,480][00219] Avg episode reward: [(0, '4.384')] [2023-02-25 18:45:38,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 2918.4). Total num frames: 233472. Throughput: 0: 970.2. Samples: 58238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:38,476][00219] Avg episode reward: [(0, '4.359')] [2023-02-25 18:45:41,434][12603] Updated weights for policy 0, policy_version 60 (0.0035) [2023-02-25 18:45:43,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 2939.5). Total num frames: 249856. Throughput: 0: 924.9. Samples: 62622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:45:43,479][00219] Avg episode reward: [(0, '4.487')] [2023-02-25 18:45:48,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3003.8). Total num frames: 270336. Throughput: 0: 952.5. Samples: 68358. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:48,478][00219] Avg episode reward: [(0, '4.503')] [2023-02-25 18:45:51,650][12603] Updated weights for policy 0, policy_version 70 (0.0027) [2023-02-25 18:45:53,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3061.3). Total num frames: 290816. Throughput: 0: 974.4. Samples: 71850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:45:53,476][00219] Avg episode reward: [(0, '4.496')] [2023-02-25 18:45:53,494][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth... [2023-02-25 18:45:58,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3113.0). Total num frames: 311296. Throughput: 0: 955.6. Samples: 77846. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:45:58,476][00219] Avg episode reward: [(0, '4.388')] [2023-02-25 18:46:03,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3081.8). Total num frames: 323584. Throughput: 0: 923.0. Samples: 82302. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:46:03,480][00219] Avg episode reward: [(0, '4.505')] [2023-02-25 18:46:03,593][12603] Updated weights for policy 0, policy_version 80 (0.0018) [2023-02-25 18:46:08,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3127.9). Total num frames: 344064. Throughput: 0: 937.4. Samples: 85088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:46:08,479][00219] Avg episode reward: [(0, '4.456')] [2023-02-25 18:46:12,984][12603] Updated weights for policy 0, policy_version 90 (0.0018) [2023-02-25 18:46:13,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3205.6). Total num frames: 368640. Throughput: 0: 976.4. Samples: 91998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:46:13,479][00219] Avg episode reward: [(0, '4.437')] [2023-02-25 18:46:18,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3755.2, 300 sec: 3208.6). Total num frames: 385024. Throughput: 0: 944.4. 
Samples: 97490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:46:18,481][00219] Avg episode reward: [(0, '4.592')] [2023-02-25 18:46:18,484][12589] Saving new best policy, reward=4.592! [2023-02-25 18:46:23,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3178.5). Total num frames: 397312. Throughput: 0: 907.9. Samples: 99092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:46:23,476][00219] Avg episode reward: [(0, '4.712')] [2023-02-25 18:46:23,490][12589] Saving new best policy, reward=4.712! [2023-02-25 18:46:28,054][12603] Updated weights for policy 0, policy_version 100 (0.0017) [2023-02-25 18:46:28,474][00219] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3150.8). Total num frames: 409600. Throughput: 0: 884.6. Samples: 102430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:46:28,477][00219] Avg episode reward: [(0, '4.731')] [2023-02-25 18:46:28,480][12589] Saving new best policy, reward=4.731! [2023-02-25 18:46:33,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3155.5). Total num frames: 425984. Throughput: 0: 875.6. Samples: 107760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:46:33,476][00219] Avg episode reward: [(0, '4.621')] [2023-02-25 18:46:38,120][12603] Updated weights for policy 0, policy_version 110 (0.0020) [2023-02-25 18:46:38,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3218.3). Total num frames: 450560. Throughput: 0: 876.0. Samples: 111272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:46:38,478][00219] Avg episode reward: [(0, '4.587')] [2023-02-25 18:46:43,474][00219] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3192.1). Total num frames: 462848. Throughput: 0: 852.7. Samples: 116218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:46:43,476][00219] Avg episode reward: [(0, '4.599')] [2023-02-25 18:46:48,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3194.9). Total num frames: 479232. 
Throughput: 0: 862.5. Samples: 121116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:46:48,476][00219] Avg episode reward: [(0, '4.579')] [2023-02-25 18:46:50,248][12603] Updated weights for policy 0, policy_version 120 (0.0017) [2023-02-25 18:46:53,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3250.4). Total num frames: 503808. Throughput: 0: 877.8. Samples: 124590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:46:53,476][00219] Avg episode reward: [(0, '4.656')] [2023-02-25 18:46:58,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 524288. Throughput: 0: 879.1. Samples: 131558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:46:58,483][00219] Avg episode reward: [(0, '4.637')] [2023-02-25 18:47:00,203][12603] Updated weights for policy 0, policy_version 130 (0.0031) [2023-02-25 18:47:03,474][00219] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 540672. Throughput: 0: 856.1. Samples: 136014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:47:03,481][00219] Avg episode reward: [(0, '4.814')] [2023-02-25 18:47:03,496][12589] Saving new best policy, reward=4.814! [2023-02-25 18:47:08,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 557056. Throughput: 0: 869.7. Samples: 138228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:47:08,476][00219] Avg episode reward: [(0, '4.832')] [2023-02-25 18:47:08,487][12589] Saving new best policy, reward=4.832! [2023-02-25 18:47:11,638][12603] Updated weights for policy 0, policy_version 140 (0.0015) [2023-02-25 18:47:13,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3323.6). Total num frames: 581632. Throughput: 0: 938.0. Samples: 144642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:47:13,476][00219] Avg episode reward: [(0, '4.985')] [2023-02-25 18:47:13,487][12589] Saving new best policy, reward=4.985! 
[2023-02-25 18:47:18,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3345.1). Total num frames: 602112. Throughput: 0: 964.9. Samples: 151180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:47:18,477][00219] Avg episode reward: [(0, '4.926')] [2023-02-25 18:47:22,365][12603] Updated weights for policy 0, policy_version 150 (0.0011) [2023-02-25 18:47:23,478][00219] Fps is (10 sec: 3275.5, 60 sec: 3617.9, 300 sec: 3321.0). Total num frames: 614400. Throughput: 0: 936.3. Samples: 153410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:47:23,481][00219] Avg episode reward: [(0, '4.939')] [2023-02-25 18:47:28,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3319.9). Total num frames: 630784. Throughput: 0: 925.0. Samples: 157842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:47:28,477][00219] Avg episode reward: [(0, '4.984')] [2023-02-25 18:47:32,986][12603] Updated weights for policy 0, policy_version 160 (0.0015) [2023-02-25 18:47:33,474][00219] Fps is (10 sec: 4097.6, 60 sec: 3822.9, 300 sec: 3360.8). Total num frames: 655360. Throughput: 0: 967.8. Samples: 164668. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:47:33,476][00219] Avg episode reward: [(0, '4.968')] [2023-02-25 18:47:38,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3379.2). Total num frames: 675840. Throughput: 0: 968.3. Samples: 168164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:47:38,482][00219] Avg episode reward: [(0, '4.849')] [2023-02-25 18:47:43,476][00219] Fps is (10 sec: 3685.5, 60 sec: 3822.8, 300 sec: 3376.7). Total num frames: 692224. Throughput: 0: 917.2. Samples: 172832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:47:43,481][00219] Avg episode reward: [(0, '4.887')] [2023-02-25 18:47:44,900][12603] Updated weights for policy 0, policy_version 170 (0.0017) [2023-02-25 18:47:48,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3374.3). 
Total num frames: 708608. Throughput: 0: 930.9. Samples: 177904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:47:48,482][00219] Avg episode reward: [(0, '4.971')] [2023-02-25 18:47:53,474][00219] Fps is (10 sec: 4096.9, 60 sec: 3822.9, 300 sec: 3410.2). Total num frames: 733184. Throughput: 0: 958.6. Samples: 181366. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:47:53,481][00219] Avg episode reward: [(0, '4.837')] [2023-02-25 18:47:53,494][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000179_733184.pth... [2023-02-25 18:47:54,224][12603] Updated weights for policy 0, policy_version 180 (0.0020) [2023-02-25 18:47:58,474][00219] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3425.8). Total num frames: 753664. Throughput: 0: 970.9. Samples: 188332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:47:58,480][00219] Avg episode reward: [(0, '5.002')] [2023-02-25 18:47:58,483][12589] Saving new best policy, reward=5.002! [2023-02-25 18:48:03,477][00219] Fps is (10 sec: 3275.8, 60 sec: 3754.5, 300 sec: 3404.2). Total num frames: 765952. Throughput: 0: 923.8. Samples: 192752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:48:03,482][00219] Avg episode reward: [(0, '5.009')] [2023-02-25 18:48:03,499][12589] Saving new best policy, reward=5.009! [2023-02-25 18:48:06,428][12603] Updated weights for policy 0, policy_version 190 (0.0020) [2023-02-25 18:48:08,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3419.3). Total num frames: 786432. Throughput: 0: 923.9. Samples: 194984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:48:08,481][00219] Avg episode reward: [(0, '5.056')] [2023-02-25 18:48:08,485][12589] Saving new best policy, reward=5.056! [2023-02-25 18:48:13,474][00219] Fps is (10 sec: 4506.9, 60 sec: 3822.9, 300 sec: 3451.1). Total num frames: 811008. Throughput: 0: 983.1. Samples: 202082. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:48:13,476][00219] Avg episode reward: [(0, '5.241')] [2023-02-25 18:48:13,489][12589] Saving new best policy, reward=5.241! [2023-02-25 18:48:15,030][12603] Updated weights for policy 0, policy_version 200 (0.0011) [2023-02-25 18:48:18,480][00219] Fps is (10 sec: 4502.9, 60 sec: 3822.6, 300 sec: 3464.5). Total num frames: 831488. Throughput: 0: 970.8. Samples: 208358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:48:18,482][00219] Avg episode reward: [(0, '5.708')] [2023-02-25 18:48:18,490][12589] Saving new best policy, reward=5.708! [2023-02-25 18:48:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3823.2, 300 sec: 3444.0). Total num frames: 843776. Throughput: 0: 941.9. Samples: 210548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:48:23,476][00219] Avg episode reward: [(0, '5.742')] [2023-02-25 18:48:23,498][12589] Saving new best policy, reward=5.742! [2023-02-25 18:48:27,641][12603] Updated weights for policy 0, policy_version 210 (0.0015) [2023-02-25 18:48:28,474][00219] Fps is (10 sec: 2868.9, 60 sec: 3822.9, 300 sec: 3440.7). Total num frames: 860160. Throughput: 0: 944.1. Samples: 215314. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:48:28,480][00219] Avg episode reward: [(0, '5.680')] [2023-02-25 18:48:33,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3469.6). Total num frames: 884736. Throughput: 0: 987.7. Samples: 222350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:48:33,476][00219] Avg episode reward: [(0, '5.768')] [2023-02-25 18:48:33,497][12589] Saving new best policy, reward=5.768! [2023-02-25 18:48:36,461][12603] Updated weights for policy 0, policy_version 220 (0.0018) [2023-02-25 18:48:38,475][00219] Fps is (10 sec: 4504.8, 60 sec: 3822.8, 300 sec: 3481.6). Total num frames: 905216. Throughput: 0: 984.6. Samples: 225674. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:48:38,478][00219] Avg episode reward: [(0, '6.026')] [2023-02-25 18:48:38,485][12589] Saving new best policy, reward=6.026! [2023-02-25 18:48:43,474][00219] Fps is (10 sec: 3276.6, 60 sec: 3754.8, 300 sec: 3462.3). Total num frames: 917504. Throughput: 0: 923.6. Samples: 229894. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:48:43,477][00219] Avg episode reward: [(0, '6.105')] [2023-02-25 18:48:43,490][12589] Saving new best policy, reward=6.105! [2023-02-25 18:48:48,474][00219] Fps is (10 sec: 3277.4, 60 sec: 3823.0, 300 sec: 3474.0). Total num frames: 937984. Throughput: 0: 950.3. Samples: 235512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:48:48,476][00219] Avg episode reward: [(0, '5.425')] [2023-02-25 18:48:48,701][12603] Updated weights for policy 0, policy_version 230 (0.0019) [2023-02-25 18:48:53,474][00219] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3485.3). Total num frames: 958464. Throughput: 0: 963.9. Samples: 238360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:48:53,476][00219] Avg episode reward: [(0, '5.559')] [2023-02-25 18:48:58,477][00219] Fps is (10 sec: 4094.8, 60 sec: 3754.5, 300 sec: 3496.2). Total num frames: 978944. Throughput: 0: 944.2. Samples: 244574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:48:58,480][00219] Avg episode reward: [(0, '6.195')] [2023-02-25 18:48:58,486][12589] Saving new best policy, reward=6.195! [2023-02-25 18:48:59,687][12603] Updated weights for policy 0, policy_version 240 (0.0013) [2023-02-25 18:49:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3478.0). Total num frames: 991232. Throughput: 0: 901.4. Samples: 248914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:49:03,482][00219] Avg episode reward: [(0, '5.986')] [2023-02-25 18:49:08,474][00219] Fps is (10 sec: 3277.8, 60 sec: 3754.7, 300 sec: 3488.7). Total num frames: 1011712. Throughput: 0: 914.1. 
Samples: 251684. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:49:08,483][00219] Avg episode reward: [(0, '6.334')] [2023-02-25 18:49:08,486][12589] Saving new best policy, reward=6.334! [2023-02-25 18:49:10,430][12603] Updated weights for policy 0, policy_version 250 (0.0011) [2023-02-25 18:49:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3512.9). Total num frames: 1036288. Throughput: 0: 964.3. Samples: 258706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:49:13,478][00219] Avg episode reward: [(0, '6.680')] [2023-02-25 18:49:13,492][12589] Saving new best policy, reward=6.680! [2023-02-25 18:49:18,475][00219] Fps is (10 sec: 4095.6, 60 sec: 3686.7, 300 sec: 3568.4). Total num frames: 1052672. Throughput: 0: 933.2. Samples: 264344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:49:18,479][00219] Avg episode reward: [(0, '6.622')] [2023-02-25 18:49:21,615][12603] Updated weights for policy 0, policy_version 260 (0.0019) [2023-02-25 18:49:23,476][00219] Fps is (10 sec: 3276.2, 60 sec: 3754.5, 300 sec: 3623.9). Total num frames: 1069056. Throughput: 0: 910.7. Samples: 266656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:49:23,483][00219] Avg episode reward: [(0, '6.558')] [2023-02-25 18:49:28,474][00219] Fps is (10 sec: 3686.6, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 1089536. Throughput: 0: 942.0. Samples: 272286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:49:28,480][00219] Avg episode reward: [(0, '6.452')] [2023-02-25 18:49:31,377][12603] Updated weights for policy 0, policy_version 270 (0.0011) [2023-02-25 18:49:33,474][00219] Fps is (10 sec: 4506.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 1114112. Throughput: 0: 973.7. Samples: 279328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:49:33,476][00219] Avg episode reward: [(0, '6.595')] [2023-02-25 18:49:38,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3754.8, 300 sec: 3762.8). 
Total num frames: 1130496. Throughput: 0: 976.1. Samples: 282284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:49:38,479][00219] Avg episode reward: [(0, '6.517')] [2023-02-25 18:49:42,951][12603] Updated weights for policy 0, policy_version 280 (0.0014) [2023-02-25 18:49:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 1146880. Throughput: 0: 940.1. Samples: 286874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:49:43,484][00219] Avg episode reward: [(0, '7.447')] [2023-02-25 18:49:43,502][12589] Saving new best policy, reward=7.447! [2023-02-25 18:49:48,474][00219] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 1167360. Throughput: 0: 978.9. Samples: 292964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:49:48,476][00219] Avg episode reward: [(0, '7.907')] [2023-02-25 18:49:48,479][12589] Saving new best policy, reward=7.907! [2023-02-25 18:49:52,337][12603] Updated weights for policy 0, policy_version 290 (0.0020) [2023-02-25 18:49:53,474][00219] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 1191936. Throughput: 0: 993.3. Samples: 296382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:49:53,481][00219] Avg episode reward: [(0, '7.722')] [2023-02-25 18:49:53,493][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000291_1191936.pth... [2023-02-25 18:49:53,600][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000072_294912.pth [2023-02-25 18:49:58,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3823.1, 300 sec: 3762.8). Total num frames: 1208320. Throughput: 0: 969.3. Samples: 302324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:49:58,481][00219] Avg episode reward: [(0, '7.716')] [2023-02-25 18:50:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 1224704. Throughput: 0: 941.1. Samples: 306692. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:50:03,479][00219] Avg episode reward: [(0, '7.865')] [2023-02-25 18:50:04,793][12603] Updated weights for policy 0, policy_version 300 (0.0017) [2023-02-25 18:50:08,474][00219] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 1245184. Throughput: 0: 955.0. Samples: 309628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:50:08,481][00219] Avg episode reward: [(0, '7.428')] [2023-02-25 18:50:13,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3749.0). Total num frames: 1265664. Throughput: 0: 982.8. Samples: 316514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:50:13,476][00219] Avg episode reward: [(0, '7.632')] [2023-02-25 18:50:13,646][12603] Updated weights for policy 0, policy_version 310 (0.0011) [2023-02-25 18:50:18,474][00219] Fps is (10 sec: 3686.5, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 1282048. Throughput: 0: 940.9. Samples: 321670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:50:18,483][00219] Avg episode reward: [(0, '8.313')] [2023-02-25 18:50:18,486][12589] Saving new best policy, reward=8.313! [2023-02-25 18:50:23,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 1298432. Throughput: 0: 922.3. Samples: 323786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:50:23,479][00219] Avg episode reward: [(0, '8.524')] [2023-02-25 18:50:23,490][12589] Saving new best policy, reward=8.524! [2023-02-25 18:50:26,223][12603] Updated weights for policy 0, policy_version 320 (0.0011) [2023-02-25 18:50:28,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3735.0). Total num frames: 1318912. Throughput: 0: 946.1. Samples: 329450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:50:28,481][00219] Avg episode reward: [(0, '8.948')] [2023-02-25 18:50:28,484][12589] Saving new best policy, reward=8.948! 
[2023-02-25 18:50:33,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 1343488. Throughput: 0: 964.9. Samples: 336384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:50:33,476][00219] Avg episode reward: [(0, '9.104')] [2023-02-25 18:50:33,484][12589] Saving new best policy, reward=9.104! [2023-02-25 18:50:35,843][12603] Updated weights for policy 0, policy_version 330 (0.0013) [2023-02-25 18:50:38,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1355776. Throughput: 0: 941.3. Samples: 338742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:50:38,480][00219] Avg episode reward: [(0, '9.298')] [2023-02-25 18:50:38,551][12589] Saving new best policy, reward=9.298! [2023-02-25 18:50:43,474][00219] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 1372160. Throughput: 0: 907.6. Samples: 343164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:50:43,482][00219] Avg episode reward: [(0, '10.035')] [2023-02-25 18:50:43,493][12589] Saving new best policy, reward=10.035! [2023-02-25 18:50:47,461][12603] Updated weights for policy 0, policy_version 340 (0.0015) [2023-02-25 18:50:48,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1396736. Throughput: 0: 954.1. Samples: 349628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:50:48,476][00219] Avg episode reward: [(0, '11.035')] [2023-02-25 18:50:48,481][12589] Saving new best policy, reward=11.035! [2023-02-25 18:50:53,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1417216. Throughput: 0: 965.3. Samples: 353068. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:50:53,476][00219] Avg episode reward: [(0, '11.494')] [2023-02-25 18:50:53,562][12589] Saving new best policy, reward=11.494! 
[2023-02-25 18:50:57,756][12603] Updated weights for policy 0, policy_version 350 (0.0019) [2023-02-25 18:50:58,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 1433600. Throughput: 0: 936.0. Samples: 358634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:50:58,478][00219] Avg episode reward: [(0, '11.011')] [2023-02-25 18:51:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 1449984. Throughput: 0: 922.8. Samples: 363194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:51:03,476][00219] Avg episode reward: [(0, '10.271')] [2023-02-25 18:51:08,312][12603] Updated weights for policy 0, policy_version 360 (0.0019) [2023-02-25 18:51:08,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 1474560. Throughput: 0: 953.1. Samples: 366676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:51:08,476][00219] Avg episode reward: [(0, '9.879')] [2023-02-25 18:51:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 1495040. Throughput: 0: 987.5. Samples: 373886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:51:13,478][00219] Avg episode reward: [(0, '9.359')] [2023-02-25 18:51:18,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 1511424. Throughput: 0: 943.3. Samples: 378832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:51:18,476][00219] Avg episode reward: [(0, '9.599')] [2023-02-25 18:51:19,180][12603] Updated weights for policy 0, policy_version 370 (0.0011) [2023-02-25 18:51:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1527808. Throughput: 0: 939.6. Samples: 381022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:51:23,475][00219] Avg episode reward: [(0, '9.597')] [2023-02-25 18:51:28,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). 
Total num frames: 1552384. Throughput: 0: 983.0. Samples: 387400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:51:28,476][00219] Avg episode reward: [(0, '10.448')] [2023-02-25 18:51:29,149][12603] Updated weights for policy 0, policy_version 380 (0.0026) [2023-02-25 18:51:33,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 1572864. Throughput: 0: 980.5. Samples: 393750. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:51:33,481][00219] Avg episode reward: [(0, '11.195')] [2023-02-25 18:51:38,476][00219] Fps is (10 sec: 2866.6, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 1581056. Throughput: 0: 940.9. Samples: 395412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:51:38,479][00219] Avg episode reward: [(0, '11.437')] [2023-02-25 18:51:43,477][00219] Fps is (10 sec: 2047.2, 60 sec: 3686.2, 300 sec: 3776.6). Total num frames: 1593344. Throughput: 0: 893.4. Samples: 398840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:51:43,480][00219] Avg episode reward: [(0, '11.771')] [2023-02-25 18:51:43,491][12589] Saving new best policy, reward=11.771! [2023-02-25 18:51:43,982][12603] Updated weights for policy 0, policy_version 390 (0.0020) [2023-02-25 18:51:48,474][00219] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3748.9). Total num frames: 1609728. Throughput: 0: 900.3. Samples: 403706. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:51:48,479][00219] Avg episode reward: [(0, '11.580')] [2023-02-25 18:51:53,476][00219] Fps is (10 sec: 4096.6, 60 sec: 3618.0, 300 sec: 3762.7). Total num frames: 1634304. Throughput: 0: 901.4. Samples: 407240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:51:53,478][00219] Avg episode reward: [(0, '12.265')] [2023-02-25 18:51:53,499][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000399_1634304.pth... 
[2023-02-25 18:51:53,595][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000179_733184.pth [2023-02-25 18:51:53,610][12589] Saving new best policy, reward=12.265! [2023-02-25 18:51:53,945][12603] Updated weights for policy 0, policy_version 400 (0.0012) [2023-02-25 18:51:58,474][00219] Fps is (10 sec: 4505.3, 60 sec: 3686.4, 300 sec: 3776.6). Total num frames: 1654784. Throughput: 0: 888.6. Samples: 413874. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 18:51:58,477][00219] Avg episode reward: [(0, '13.951')] [2023-02-25 18:51:58,480][12589] Saving new best policy, reward=13.951! [2023-02-25 18:52:03,474][00219] Fps is (10 sec: 3687.2, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 1671168. Throughput: 0: 877.0. Samples: 418298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:52:03,481][00219] Avg episode reward: [(0, '14.649')] [2023-02-25 18:52:03,490][12589] Saving new best policy, reward=14.649! [2023-02-25 18:52:06,234][12603] Updated weights for policy 0, policy_version 410 (0.0020) [2023-02-25 18:52:08,474][00219] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3748.9). Total num frames: 1687552. Throughput: 0: 875.7. Samples: 420428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:52:08,476][00219] Avg episode reward: [(0, '14.893')] [2023-02-25 18:52:08,480][12589] Saving new best policy, reward=14.893! [2023-02-25 18:52:13,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3748.9). Total num frames: 1708032. Throughput: 0: 883.9. Samples: 427174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:52:13,476][00219] Avg episode reward: [(0, '16.413')] [2023-02-25 18:52:13,485][12589] Saving new best policy, reward=16.413! [2023-02-25 18:52:15,359][12603] Updated weights for policy 0, policy_version 420 (0.0016) [2023-02-25 18:52:18,476][00219] Fps is (10 sec: 4095.2, 60 sec: 3618.0, 300 sec: 3776.7). Total num frames: 1728512. Throughput: 0: 880.4. 
Samples: 433368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:52:18,478][00219] Avg episode reward: [(0, '16.166')] [2023-02-25 18:52:23,478][00219] Fps is (10 sec: 3685.0, 60 sec: 3617.9, 300 sec: 3776.6). Total num frames: 1744896. Throughput: 0: 892.8. Samples: 435590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:52:23,485][00219] Avg episode reward: [(0, '15.296')] [2023-02-25 18:52:27,775][12603] Updated weights for policy 0, policy_version 430 (0.0017) [2023-02-25 18:52:28,474][00219] Fps is (10 sec: 3277.4, 60 sec: 3481.6, 300 sec: 3748.9). Total num frames: 1761280. Throughput: 0: 921.1. Samples: 440284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:52:28,479][00219] Avg episode reward: [(0, '16.563')] [2023-02-25 18:52:28,486][12589] Saving new best policy, reward=16.563! [2023-02-25 18:52:33,474][00219] Fps is (10 sec: 4097.6, 60 sec: 3549.9, 300 sec: 3762.8). Total num frames: 1785856. Throughput: 0: 961.4. Samples: 446968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:52:33,477][00219] Avg episode reward: [(0, '16.916')] [2023-02-25 18:52:33,493][12589] Saving new best policy, reward=16.916! [2023-02-25 18:52:37,191][12603] Updated weights for policy 0, policy_version 440 (0.0013) [2023-02-25 18:52:38,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3686.5, 300 sec: 3762.8). Total num frames: 1802240. Throughput: 0: 958.5. Samples: 450370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:52:38,475][00219] Avg episode reward: [(0, '17.059')] [2023-02-25 18:52:38,508][12589] Saving new best policy, reward=17.059! [2023-02-25 18:52:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3762.8). Total num frames: 1818624. Throughput: 0: 910.8. Samples: 454860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:52:43,476][00219] Avg episode reward: [(0, '16.595')] [2023-02-25 18:52:48,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3735.0). 
Total num frames: 1835008. Throughput: 0: 930.3. Samples: 460160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:52:48,476][00219] Avg episode reward: [(0, '17.192')] [2023-02-25 18:52:48,504][12589] Saving new best policy, reward=17.192! [2023-02-25 18:52:49,382][12603] Updated weights for policy 0, policy_version 450 (0.0012) [2023-02-25 18:52:53,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3748.9). Total num frames: 1859584. Throughput: 0: 955.6. Samples: 463430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:52:53,476][00219] Avg episode reward: [(0, '17.036')] [2023-02-25 18:52:58,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 1880064. Throughput: 0: 950.9. Samples: 469964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:52:58,479][00219] Avg episode reward: [(0, '17.476')] [2023-02-25 18:52:58,486][12589] Saving new best policy, reward=17.476! [2023-02-25 18:52:59,519][12603] Updated weights for policy 0, policy_version 460 (0.0014) [2023-02-25 18:53:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 1892352. Throughput: 0: 911.0. Samples: 474362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:03,479][00219] Avg episode reward: [(0, '17.885')] [2023-02-25 18:53:03,493][12589] Saving new best policy, reward=17.885! [2023-02-25 18:53:08,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 1912832. Throughput: 0: 913.7. Samples: 476702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:08,480][00219] Avg episode reward: [(0, '19.489')] [2023-02-25 18:53:08,484][12589] Saving new best policy, reward=19.489! [2023-02-25 18:53:10,799][12603] Updated weights for policy 0, policy_version 470 (0.0033) [2023-02-25 18:53:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3749.0). Total num frames: 1937408. Throughput: 0: 965.1. Samples: 483714. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:13,482][00219] Avg episode reward: [(0, '19.218')] [2023-02-25 18:53:18,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3762.8). Total num frames: 1953792. Throughput: 0: 953.9. Samples: 489892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:18,478][00219] Avg episode reward: [(0, '19.535')] [2023-02-25 18:53:18,593][12589] Saving new best policy, reward=19.535! [2023-02-25 18:53:21,307][12603] Updated weights for policy 0, policy_version 480 (0.0017) [2023-02-25 18:53:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3762.8). Total num frames: 1970176. Throughput: 0: 925.5. Samples: 492018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:53:23,478][00219] Avg episode reward: [(0, '19.995')] [2023-02-25 18:53:23,490][12589] Saving new best policy, reward=19.995! [2023-02-25 18:53:28,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 1990656. Throughput: 0: 939.6. Samples: 497142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:28,478][00219] Avg episode reward: [(0, '19.973')] [2023-02-25 18:53:31,670][12603] Updated weights for policy 0, policy_version 490 (0.0027) [2023-02-25 18:53:33,474][00219] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 2015232. Throughput: 0: 980.4. Samples: 504278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:53:33,479][00219] Avg episode reward: [(0, '19.326')] [2023-02-25 18:53:38,475][00219] Fps is (10 sec: 4095.6, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 2031616. Throughput: 0: 982.3. Samples: 507634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:53:38,476][00219] Avg episode reward: [(0, '18.096')] [2023-02-25 18:53:42,783][12603] Updated weights for policy 0, policy_version 500 (0.0015) [2023-02-25 18:53:43,478][00219] Fps is (10 sec: 3275.6, 60 sec: 3822.7, 300 sec: 3762.7). 
Total num frames: 2048000. Throughput: 0: 937.9. Samples: 512172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:53:43,485][00219] Avg episode reward: [(0, '19.064')] [2023-02-25 18:53:48,474][00219] Fps is (10 sec: 3686.7, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 2068480. Throughput: 0: 966.2. Samples: 517840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:53:48,477][00219] Avg episode reward: [(0, '18.710')] [2023-02-25 18:53:53,474][00219] Fps is (10 sec: 3687.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 2084864. Throughput: 0: 976.5. Samples: 520646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:53:53,477][00219] Avg episode reward: [(0, '18.964')] [2023-02-25 18:53:53,485][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000509_2084864.pth... [2023-02-25 18:53:53,629][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000291_1191936.pth [2023-02-25 18:53:54,305][12603] Updated weights for policy 0, policy_version 510 (0.0017) [2023-02-25 18:53:58,474][00219] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 2101248. Throughput: 0: 933.2. Samples: 525706. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:53:58,476][00219] Avg episode reward: [(0, '19.222')] [2023-02-25 18:54:03,475][00219] Fps is (10 sec: 3276.3, 60 sec: 3754.6, 300 sec: 3748.9). Total num frames: 2117632. Throughput: 0: 888.1. Samples: 529860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:54:03,479][00219] Avg episode reward: [(0, '19.820')] [2023-02-25 18:54:06,634][12603] Updated weights for policy 0, policy_version 520 (0.0034) [2023-02-25 18:54:08,481][00219] Fps is (10 sec: 3683.6, 60 sec: 3754.2, 300 sec: 3734.9). Total num frames: 2138112. Throughput: 0: 901.4. Samples: 532590. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:54:08,486][00219] Avg episode reward: [(0, '19.205')] [2023-02-25 18:54:13,474][00219] Fps is (10 sec: 4096.7, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 2158592. Throughput: 0: 937.4. Samples: 539324. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:54:13,477][00219] Avg episode reward: [(0, '19.135')] [2023-02-25 18:54:16,242][12603] Updated weights for policy 0, policy_version 530 (0.0037) [2023-02-25 18:54:18,474][00219] Fps is (10 sec: 3689.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 2174976. Throughput: 0: 897.5. Samples: 544666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:54:18,480][00219] Avg episode reward: [(0, '20.032')] [2023-02-25 18:54:18,484][12589] Saving new best policy, reward=20.032! [2023-02-25 18:54:23,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 2191360. Throughput: 0: 870.9. Samples: 546822. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:54:23,482][00219] Avg episode reward: [(0, '20.116')] [2023-02-25 18:54:23,496][12589] Saving new best policy, reward=20.116! [2023-02-25 18:54:28,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 2207744. Throughput: 0: 885.5. Samples: 552016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:54:28,479][00219] Avg episode reward: [(0, '19.392')] [2023-02-25 18:54:28,692][12603] Updated weights for policy 0, policy_version 540 (0.0023) [2023-02-25 18:54:33,474][00219] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3735.0). Total num frames: 2232320. Throughput: 0: 909.7. Samples: 558778. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-25 18:54:33,476][00219] Avg episode reward: [(0, '19.379')] [2023-02-25 18:54:38,478][00219] Fps is (10 sec: 4094.4, 60 sec: 3618.0, 300 sec: 3734.9). Total num frames: 2248704. Throughput: 0: 908.0. Samples: 561510. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:54:38,480][00219] Avg episode reward: [(0, '19.455')] [2023-02-25 18:54:39,344][12603] Updated weights for policy 0, policy_version 550 (0.0015) [2023-02-25 18:54:43,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3550.1, 300 sec: 3707.2). Total num frames: 2260992. Throughput: 0: 890.1. Samples: 565762. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:54:43,483][00219] Avg episode reward: [(0, '18.415')] [2023-02-25 18:54:48,474][00219] Fps is (10 sec: 3278.1, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 2281472. Throughput: 0: 924.7. Samples: 571468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:54:48,477][00219] Avg episode reward: [(0, '17.624')] [2023-02-25 18:54:50,577][12603] Updated weights for policy 0, policy_version 560 (0.0032) [2023-02-25 18:54:53,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 2306048. Throughput: 0: 937.6. Samples: 574776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:54:53,482][00219] Avg episode reward: [(0, '17.612')] [2023-02-25 18:54:58,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 2322432. Throughput: 0: 917.2. Samples: 580596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:54:58,478][00219] Avg episode reward: [(0, '17.166')] [2023-02-25 18:55:02,617][12603] Updated weights for policy 0, policy_version 570 (0.0011) [2023-02-25 18:55:03,474][00219] Fps is (10 sec: 2867.1, 60 sec: 3618.2, 300 sec: 3693.3). Total num frames: 2334720. Throughput: 0: 891.1. Samples: 584766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:03,480][00219] Avg episode reward: [(0, '17.523')] [2023-02-25 18:55:08,474][00219] Fps is (10 sec: 3276.7, 60 sec: 3618.6, 300 sec: 3693.3). Total num frames: 2355200. Throughput: 0: 908.0. Samples: 587680. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:55:08,481][00219] Avg episode reward: [(0, '16.939')] [2023-02-25 18:55:12,524][12603] Updated weights for policy 0, policy_version 580 (0.0015) [2023-02-25 18:55:13,474][00219] Fps is (10 sec: 4505.8, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 2379776. Throughput: 0: 939.8. Samples: 594308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:13,476][00219] Avg episode reward: [(0, '17.952')] [2023-02-25 18:55:18,479][00219] Fps is (10 sec: 4094.0, 60 sec: 3686.1, 300 sec: 3721.1). Total num frames: 2396160. Throughput: 0: 908.6. Samples: 599668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:18,481][00219] Avg episode reward: [(0, '18.257')] [2023-02-25 18:55:23,474][00219] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 2408448. Throughput: 0: 896.3. Samples: 601840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:23,485][00219] Avg episode reward: [(0, '18.382')] [2023-02-25 18:55:24,585][12603] Updated weights for policy 0, policy_version 590 (0.0023) [2023-02-25 18:55:28,474][00219] Fps is (10 sec: 3688.3, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 2433024. Throughput: 0: 933.5. Samples: 607770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:28,481][00219] Avg episode reward: [(0, '19.485')] [2023-02-25 18:55:33,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 2453504. Throughput: 0: 960.1. Samples: 614674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:55:33,480][00219] Avg episode reward: [(0, '19.699')] [2023-02-25 18:55:33,562][12603] Updated weights for policy 0, policy_version 600 (0.0025) [2023-02-25 18:55:38,477][00219] Fps is (10 sec: 3685.3, 60 sec: 3686.5, 300 sec: 3721.1). Total num frames: 2469888. Throughput: 0: 940.7. Samples: 617112. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:55:38,481][00219] Avg episode reward: [(0, '19.506')] [2023-02-25 18:55:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 2486272. Throughput: 0: 910.1. Samples: 621550. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:43,480][00219] Avg episode reward: [(0, '21.298')] [2023-02-25 18:55:43,494][12589] Saving new best policy, reward=21.298! [2023-02-25 18:55:46,081][12603] Updated weights for policy 0, policy_version 610 (0.0013) [2023-02-25 18:55:48,474][00219] Fps is (10 sec: 3687.5, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 2506752. Throughput: 0: 955.8. Samples: 627778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:48,480][00219] Avg episode reward: [(0, '21.309')] [2023-02-25 18:55:48,484][12589] Saving new best policy, reward=21.309! [2023-02-25 18:55:53,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 2531328. Throughput: 0: 964.0. Samples: 631058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:55:53,479][00219] Avg episode reward: [(0, '20.129')] [2023-02-25 18:55:53,489][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000618_2531328.pth... [2023-02-25 18:55:53,597][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000399_1634304.pth [2023-02-25 18:55:56,046][12603] Updated weights for policy 0, policy_version 620 (0.0014) [2023-02-25 18:55:58,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 2543616. Throughput: 0: 935.0. Samples: 636384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:55:58,476][00219] Avg episode reward: [(0, '19.630')] [2023-02-25 18:56:03,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2560000. Throughput: 0: 910.6. Samples: 640640. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:03,481][00219] Avg episode reward: [(0, '20.189')] [2023-02-25 18:56:07,858][12603] Updated weights for policy 0, policy_version 630 (0.0037) [2023-02-25 18:56:08,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2580480. Throughput: 0: 932.3. Samples: 643792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:56:08,476][00219] Avg episode reward: [(0, '20.244')] [2023-02-25 18:56:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 2605056. Throughput: 0: 953.5. Samples: 650676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:13,476][00219] Avg episode reward: [(0, '19.809')] [2023-02-25 18:56:18,476][00219] Fps is (10 sec: 3685.7, 60 sec: 3686.6, 300 sec: 3693.3). Total num frames: 2617344. Throughput: 0: 907.3. Samples: 655504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:18,480][00219] Avg episode reward: [(0, '19.027')] [2023-02-25 18:56:18,802][12603] Updated weights for policy 0, policy_version 640 (0.0013) [2023-02-25 18:56:23,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 2633728. Throughput: 0: 899.7. Samples: 657596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:56:23,483][00219] Avg episode reward: [(0, '18.358')] [2023-02-25 18:56:28,474][00219] Fps is (10 sec: 3687.1, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2654208. Throughput: 0: 934.3. Samples: 663594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:28,476][00219] Avg episode reward: [(0, '17.866')] [2023-02-25 18:56:29,599][12603] Updated weights for policy 0, policy_version 650 (0.0014) [2023-02-25 18:56:33,474][00219] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 2678784. Throughput: 0: 949.9. Samples: 670522. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:33,476][00219] Avg episode reward: [(0, '17.924')] [2023-02-25 18:56:38,475][00219] Fps is (10 sec: 3686.0, 60 sec: 3686.5, 300 sec: 3721.1). Total num frames: 2691072. Throughput: 0: 924.5. Samples: 672662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:56:38,480][00219] Avg episode reward: [(0, '17.515')] [2023-02-25 18:56:42,090][12603] Updated weights for policy 0, policy_version 660 (0.0016) [2023-02-25 18:56:43,474][00219] Fps is (10 sec: 2457.6, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 2703360. Throughput: 0: 890.2. Samples: 676444. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:56:43,477][00219] Avg episode reward: [(0, '18.709')] [2023-02-25 18:56:48,474][00219] Fps is (10 sec: 2867.5, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 2719744. Throughput: 0: 884.4. Samples: 680436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:56:48,478][00219] Avg episode reward: [(0, '19.151')] [2023-02-25 18:56:53,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2736128. Throughput: 0: 867.3. Samples: 682822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:56:53,479][00219] Avg episode reward: [(0, '20.883')] [2023-02-25 18:56:54,703][12603] Updated weights for policy 0, policy_version 670 (0.0022) [2023-02-25 18:56:58,474][00219] Fps is (10 sec: 3276.6, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 2752512. Throughput: 0: 837.0. Samples: 688342. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:56:58,477][00219] Avg episode reward: [(0, '21.594')] [2023-02-25 18:56:58,480][12589] Saving new best policy, reward=21.594! [2023-02-25 18:57:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 2768896. Throughput: 0: 824.8. Samples: 692618. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:57:03,479][00219] Avg episode reward: [(0, '21.518')] [2023-02-25 18:57:07,392][12603] Updated weights for policy 0, policy_version 680 (0.0023) [2023-02-25 18:57:08,474][00219] Fps is (10 sec: 3686.6, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 2789376. Throughput: 0: 843.8. Samples: 695566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:08,482][00219] Avg episode reward: [(0, '21.042')] [2023-02-25 18:57:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 2813952. Throughput: 0: 865.2. Samples: 702526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:57:13,481][00219] Avg episode reward: [(0, '19.749')] [2023-02-25 18:57:17,308][12603] Updated weights for policy 0, policy_version 690 (0.0012) [2023-02-25 18:57:18,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3665.6). Total num frames: 2826240. Throughput: 0: 828.1. Samples: 707788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:57:18,478][00219] Avg episode reward: [(0, '18.512')] [2023-02-25 18:57:23,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 2842624. Throughput: 0: 832.0. Samples: 710100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:57:23,480][00219] Avg episode reward: [(0, '17.902')] [2023-02-25 18:57:28,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 2863104. Throughput: 0: 876.4. Samples: 715884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:28,481][00219] Avg episode reward: [(0, '18.349')] [2023-02-25 18:57:28,503][12603] Updated weights for policy 0, policy_version 700 (0.0014) [2023-02-25 18:57:33,474][00219] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 2887680. Throughput: 0: 948.1. Samples: 723102. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:33,476][00219] Avg episode reward: [(0, '19.011')] [2023-02-25 18:57:38,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 2904064. Throughput: 0: 953.0. Samples: 725706. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:38,476][00219] Avg episode reward: [(0, '20.156')] [2023-02-25 18:57:39,078][12603] Updated weights for policy 0, policy_version 710 (0.0023) [2023-02-25 18:57:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 2920448. Throughput: 0: 927.0. Samples: 730058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:43,476][00219] Avg episode reward: [(0, '19.614')] [2023-02-25 18:57:48,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2940928. Throughput: 0: 975.7. Samples: 736524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:57:48,489][00219] Avg episode reward: [(0, '21.317')] [2023-02-25 18:57:49,405][12603] Updated weights for policy 0, policy_version 720 (0.0014) [2023-02-25 18:57:53,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 2965504. Throughput: 0: 988.9. Samples: 740066. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:57:53,476][00219] Avg episode reward: [(0, '21.147')] [2023-02-25 18:57:53,488][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000724_2965504.pth... [2023-02-25 18:57:53,622][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000509_2084864.pth [2023-02-25 18:57:58,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3693.3). Total num frames: 2981888. Throughput: 0: 955.1. Samples: 745506. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:57:58,477][00219] Avg episode reward: [(0, '20.580')] [2023-02-25 18:58:00,631][12603] Updated weights for policy 0, policy_version 730 (0.0017) [2023-02-25 18:58:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 2998272. Throughput: 0: 937.0. Samples: 749952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:58:03,481][00219] Avg episode reward: [(0, '19.930')] [2023-02-25 18:58:08,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 3018752. Throughput: 0: 961.7. Samples: 753378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:58:08,476][00219] Avg episode reward: [(0, '21.482')] [2023-02-25 18:58:10,482][12603] Updated weights for policy 0, policy_version 740 (0.0014) [2023-02-25 18:58:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 3043328. Throughput: 0: 989.2. Samples: 760400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:58:13,482][00219] Avg episode reward: [(0, '20.444')] [2023-02-25 18:58:18,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 3059712. Throughput: 0: 938.4. Samples: 765328. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:58:18,478][00219] Avg episode reward: [(0, '19.542')] [2023-02-25 18:58:22,368][12603] Updated weights for policy 0, policy_version 750 (0.0011) [2023-02-25 18:58:23,474][00219] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 3072000. Throughput: 0: 929.2. Samples: 767522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:58:23,482][00219] Avg episode reward: [(0, '18.740')] [2023-02-25 18:58:28,474][00219] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 3096576. Throughput: 0: 977.1. Samples: 774026. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:58:28,476][00219] Avg episode reward: [(0, '17.720')] [2023-02-25 18:58:31,253][12603] Updated weights for policy 0, policy_version 760 (0.0012) [2023-02-25 18:58:33,474][00219] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3693.4). Total num frames: 3121152. Throughput: 0: 988.2. Samples: 780992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 18:58:33,484][00219] Avg episode reward: [(0, '17.991')] [2023-02-25 18:58:38,474][00219] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 3133440. Throughput: 0: 959.4. Samples: 783240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 18:58:38,480][00219] Avg episode reward: [(0, '18.305')] [2023-02-25 18:58:43,334][12603] Updated weights for policy 0, policy_version 770 (0.0034) [2023-02-25 18:58:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3153920. Throughput: 0: 943.0. Samples: 787942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:58:43,476][00219] Avg episode reward: [(0, '19.331')] [2023-02-25 18:58:48,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 3174400. Throughput: 0: 999.2. Samples: 794914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:58:48,483][00219] Avg episode reward: [(0, '18.602')] [2023-02-25 18:58:52,072][12603] Updated weights for policy 0, policy_version 780 (0.0018) [2023-02-25 18:58:53,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 3198976. Throughput: 0: 1001.6. Samples: 798448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:58:53,476][00219] Avg episode reward: [(0, '18.692')] [2023-02-25 18:58:58,480][00219] Fps is (10 sec: 4093.4, 60 sec: 3890.8, 300 sec: 3721.1). Total num frames: 3215360. Throughput: 0: 958.9. Samples: 803556. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:58:58,483][00219] Avg episode reward: [(0, '19.089')] [2023-02-25 18:59:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3707.3). Total num frames: 3231744. Throughput: 0: 962.5. Samples: 808642. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:59:03,476][00219] Avg episode reward: [(0, '19.058')] [2023-02-25 18:59:04,032][12603] Updated weights for policy 0, policy_version 790 (0.0015) [2023-02-25 18:59:08,474][00219] Fps is (10 sec: 4098.5, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 3256320. Throughput: 0: 995.5. Samples: 812318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:08,476][00219] Avg episode reward: [(0, '20.075')] [2023-02-25 18:59:12,559][12603] Updated weights for policy 0, policy_version 800 (0.0014) [2023-02-25 18:59:13,474][00219] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 3276800. Throughput: 0: 1009.7. Samples: 819464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:13,482][00219] Avg episode reward: [(0, '21.817')] [2023-02-25 18:59:13,492][12589] Saving new best policy, reward=21.817! [2023-02-25 18:59:18,474][00219] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 3293184. Throughput: 0: 952.7. Samples: 823864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:18,481][00219] Avg episode reward: [(0, '22.430')] [2023-02-25 18:59:18,482][12589] Saving new best policy, reward=22.430! [2023-02-25 18:59:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 3309568. Throughput: 0: 952.7. Samples: 826110. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:59:23,482][00219] Avg episode reward: [(0, '22.986')] [2023-02-25 18:59:23,492][12589] Saving new best policy, reward=22.986! 
[2023-02-25 18:59:25,016][12603] Updated weights for policy 0, policy_version 810 (0.0025) [2023-02-25 18:59:28,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 3334144. Throughput: 0: 996.8. Samples: 832796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:28,482][00219] Avg episode reward: [(0, '22.808')] [2023-02-25 18:59:33,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3354624. Throughput: 0: 987.1. Samples: 839334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 18:59:33,477][00219] Avg episode reward: [(0, '23.563')] [2023-02-25 18:59:33,491][12589] Saving new best policy, reward=23.563! [2023-02-25 18:59:34,660][12603] Updated weights for policy 0, policy_version 820 (0.0010) [2023-02-25 18:59:38,476][00219] Fps is (10 sec: 3276.1, 60 sec: 3891.1, 300 sec: 3748.9). Total num frames: 3366912. Throughput: 0: 957.7. Samples: 841548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 18:59:38,480][00219] Avg episode reward: [(0, '23.947')] [2023-02-25 18:59:38,489][12589] Saving new best policy, reward=23.947! [2023-02-25 18:59:43,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 3387392. Throughput: 0: 949.8. Samples: 846290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:43,480][00219] Avg episode reward: [(0, '24.354')] [2023-02-25 18:59:43,489][12589] Saving new best policy, reward=24.354! [2023-02-25 18:59:45,938][12603] Updated weights for policy 0, policy_version 830 (0.0042) [2023-02-25 18:59:48,474][00219] Fps is (10 sec: 4096.8, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 3407872. Throughput: 0: 989.7. Samples: 853178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:48,476][00219] Avg episode reward: [(0, '24.667')] [2023-02-25 18:59:48,526][12589] Saving new best policy, reward=24.667! 
[2023-02-25 18:59:53,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3428352. Throughput: 0: 983.4. Samples: 856572. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 18:59:53,480][00219] Avg episode reward: [(0, '25.174')] [2023-02-25 18:59:53,574][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000838_3432448.pth... [2023-02-25 18:59:53,715][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000618_2531328.pth [2023-02-25 18:59:53,731][12589] Saving new best policy, reward=25.174! [2023-02-25 18:59:56,498][12603] Updated weights for policy 0, policy_version 840 (0.0011) [2023-02-25 18:59:58,477][00219] Fps is (10 sec: 3685.3, 60 sec: 3823.1, 300 sec: 3762.7). Total num frames: 3444736. Throughput: 0: 928.6. Samples: 861252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 18:59:58,479][00219] Avg episode reward: [(0, '24.451')] [2023-02-25 19:00:03,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3461120. Throughput: 0: 947.7. Samples: 866510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:00:03,476][00219] Avg episode reward: [(0, '24.150')] [2023-02-25 19:00:07,008][12603] Updated weights for policy 0, policy_version 850 (0.0011) [2023-02-25 19:00:08,474][00219] Fps is (10 sec: 4097.2, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3485696. Throughput: 0: 975.4. Samples: 870002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:00:08,478][00219] Avg episode reward: [(0, '24.074')] [2023-02-25 19:00:13,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3506176. Throughput: 0: 977.8. Samples: 876796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:00:13,479][00219] Avg episode reward: [(0, '23.062')] [2023-02-25 19:00:18,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3518464. 
Throughput: 0: 927.2. Samples: 881056. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 19:00:18,477][00219] Avg episode reward: [(0, '24.495')] [2023-02-25 19:00:18,703][12603] Updated weights for policy 0, policy_version 860 (0.0013) [2023-02-25 19:00:23,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3538944. Throughput: 0: 926.7. Samples: 883246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:00:23,476][00219] Avg episode reward: [(0, '23.753')] [2023-02-25 19:00:28,462][12603] Updated weights for policy 0, policy_version 870 (0.0016) [2023-02-25 19:00:28,474][00219] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3563520. Throughput: 0: 973.6. Samples: 890104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:00:28,480][00219] Avg episode reward: [(0, '23.539')] [2023-02-25 19:00:33,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3579904. Throughput: 0: 957.7. Samples: 896276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:00:33,478][00219] Avg episode reward: [(0, '22.438')] [2023-02-25 19:00:38,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3762.8). Total num frames: 3596288. Throughput: 0: 932.3. Samples: 898524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 19:00:38,482][00219] Avg episode reward: [(0, '22.935')] [2023-02-25 19:00:40,517][12603] Updated weights for policy 0, policy_version 880 (0.0036) [2023-02-25 19:00:43,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3616768. Throughput: 0: 938.1. Samples: 903462. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:00:43,480][00219] Avg episode reward: [(0, '21.341')] [2023-02-25 19:00:48,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3637248. Throughput: 0: 978.0. Samples: 910518. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:00:48,483][00219] Avg episode reward: [(0, '21.607')] [2023-02-25 19:00:49,524][12603] Updated weights for policy 0, policy_version 890 (0.0013) [2023-02-25 19:00:53,478][00219] Fps is (10 sec: 4094.2, 60 sec: 3822.7, 300 sec: 3776.6). Total num frames: 3657728. Throughput: 0: 978.7. Samples: 914050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 19:00:53,484][00219] Avg episode reward: [(0, '22.003')] [2023-02-25 19:00:58,476][00219] Fps is (10 sec: 3685.7, 60 sec: 3823.0, 300 sec: 3776.6). Total num frames: 3674112. Throughput: 0: 928.1. Samples: 918564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-25 19:00:58,479][00219] Avg episode reward: [(0, '22.499')] [2023-02-25 19:01:01,573][12603] Updated weights for policy 0, policy_version 900 (0.0027) [2023-02-25 19:01:03,474][00219] Fps is (10 sec: 3688.0, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 3694592. Throughput: 0: 955.1. Samples: 924034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:01:03,477][00219] Avg episode reward: [(0, '21.666')] [2023-02-25 19:01:08,474][00219] Fps is (10 sec: 4096.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3715072. Throughput: 0: 984.8. Samples: 927560. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:01:08,476][00219] Avg episode reward: [(0, '22.608')] [2023-02-25 19:01:10,307][12603] Updated weights for policy 0, policy_version 910 (0.0017) [2023-02-25 19:01:13,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 3735552. Throughput: 0: 980.3. Samples: 934218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 19:01:13,476][00219] Avg episode reward: [(0, '23.025')] [2023-02-25 19:01:18,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3751936. Throughput: 0: 939.5. Samples: 938552. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:01:18,477][00219] Avg episode reward: [(0, '23.578')] [2023-02-25 19:01:22,457][12603] Updated weights for policy 0, policy_version 920 (0.0026) [2023-02-25 19:01:23,474][00219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3772416. Throughput: 0: 947.8. Samples: 941176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 19:01:23,476][00219] Avg episode reward: [(0, '22.204')] [2023-02-25 19:01:28,474][00219] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 3792896. Throughput: 0: 995.3. Samples: 948250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:01:28,476][00219] Avg episode reward: [(0, '22.208')] [2023-02-25 19:01:31,609][12603] Updated weights for policy 0, policy_version 930 (0.0016) [2023-02-25 19:01:33,474][00219] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3813376. Throughput: 0: 968.7. Samples: 954110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:01:33,476][00219] Avg episode reward: [(0, '22.848')] [2023-02-25 19:01:38,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3825664. Throughput: 0: 938.4. Samples: 956272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-25 19:01:38,477][00219] Avg episode reward: [(0, '23.652')] [2023-02-25 19:01:43,468][12603] Updated weights for policy 0, policy_version 940 (0.0018) [2023-02-25 19:01:43,474][00219] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3850240. Throughput: 0: 959.9. Samples: 961758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:01:43,476][00219] Avg episode reward: [(0, '22.696')] [2023-02-25 19:01:48,477][00219] Fps is (10 sec: 4094.8, 60 sec: 3822.8, 300 sec: 3832.2). Total num frames: 3866624. Throughput: 0: 961.4. Samples: 967298. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-25 19:01:48,481][00219] Avg episode reward: [(0, '22.972')] [2023-02-25 19:01:53,479][00219] Fps is (10 sec: 2865.7, 60 sec: 3686.4, 300 sec: 3818.2). Total num frames: 3878912. Throughput: 0: 928.1. Samples: 969328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:01:53,481][00219] Avg episode reward: [(0, '23.154')] [2023-02-25 19:01:53,493][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000947_3878912.pth... [2023-02-25 19:01:53,706][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000724_2965504.pth [2023-02-25 19:01:57,939][12603] Updated weights for policy 0, policy_version 950 (0.0020) [2023-02-25 19:01:58,475][00219] Fps is (10 sec: 2458.0, 60 sec: 3618.2, 300 sec: 3804.4). Total num frames: 3891200. Throughput: 0: 859.0. Samples: 972874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:01:58,479][00219] Avg episode reward: [(0, '22.465')] [2023-02-25 19:02:03,474][00219] Fps is (10 sec: 2868.6, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 3907584. Throughput: 0: 872.0. Samples: 977794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:02:03,476][00219] Avg episode reward: [(0, '23.118')] [2023-02-25 19:02:08,374][12603] Updated weights for policy 0, policy_version 960 (0.0013) [2023-02-25 19:02:08,474][00219] Fps is (10 sec: 4096.5, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 3932160. Throughput: 0: 888.4. Samples: 981154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:02:08,483][00219] Avg episode reward: [(0, '22.579')] [2023-02-25 19:02:13,474][00219] Fps is (10 sec: 4505.4, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 3952640. Throughput: 0: 881.9. Samples: 987934. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:02:13,480][00219] Avg episode reward: [(0, '24.567')] [2023-02-25 19:02:18,474][00219] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3804.4). Total num frames: 3964928. Throughput: 0: 844.8. Samples: 992126. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-25 19:02:18,478][00219] Avg episode reward: [(0, '24.035')] [2023-02-25 19:02:20,584][12603] Updated weights for policy 0, policy_version 970 (0.0020) [2023-02-25 19:02:23,474][00219] Fps is (10 sec: 2867.4, 60 sec: 3481.6, 300 sec: 3790.5). Total num frames: 3981312. Throughput: 0: 845.7. Samples: 994330. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:02:23,476][00219] Avg episode reward: [(0, '24.706')] [2023-02-25 19:02:28,289][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:02:28,289][00219] Component Batcher_0 stopped! [2023-02-25 19:02:28,290][12589] Stopping Batcher_0... [2023-02-25 19:02:28,304][12589] Loop batcher_evt_loop terminating... [2023-02-25 19:02:28,336][12603] Weights refcount: 2 0 [2023-02-25 19:02:28,355][00219] Component InferenceWorker_p0-w0 stopped! [2023-02-25 19:02:28,357][12603] Stopping InferenceWorker_p0-w0... [2023-02-25 19:02:28,358][12603] Loop inference_proc0-0_evt_loop terminating... [2023-02-25 19:02:28,363][00219] Component RolloutWorker_w7 stopped! [2023-02-25 19:02:28,362][12611] Stopping RolloutWorker_w7... [2023-02-25 19:02:28,377][12611] Loop rollout_proc7_evt_loop terminating... [2023-02-25 19:02:28,390][00219] Component RolloutWorker_w0 stopped! [2023-02-25 19:02:28,393][12604] Stopping RolloutWorker_w0... [2023-02-25 19:02:28,393][12604] Loop rollout_proc0_evt_loop terminating... [2023-02-25 19:02:28,395][00219] Component RolloutWorker_w5 stopped! [2023-02-25 19:02:28,397][00219] Component RolloutWorker_w4 stopped! [2023-02-25 19:02:28,395][12609] Stopping RolloutWorker_w5... 
[2023-02-25 19:02:28,400][12609] Loop rollout_proc5_evt_loop terminating... [2023-02-25 19:02:28,403][12606] Stopping RolloutWorker_w2... [2023-02-25 19:02:28,403][12606] Loop rollout_proc2_evt_loop terminating... [2023-02-25 19:02:28,395][12607] Stopping RolloutWorker_w4... [2023-02-25 19:02:28,405][12607] Loop rollout_proc4_evt_loop terminating... [2023-02-25 19:02:28,397][12605] Stopping RolloutWorker_w1... [2023-02-25 19:02:28,398][00219] Component RolloutWorker_w3 stopped! [2023-02-25 19:02:28,407][00219] Component RolloutWorker_w1 stopped! [2023-02-25 19:02:28,396][12608] Stopping RolloutWorker_w3... [2023-02-25 19:02:28,408][00219] Component RolloutWorker_w2 stopped! [2023-02-25 19:02:28,412][12608] Loop rollout_proc3_evt_loop terminating... [2023-02-25 19:02:28,406][12605] Loop rollout_proc1_evt_loop terminating... [2023-02-25 19:02:28,431][00219] Component RolloutWorker_w6 stopped! [2023-02-25 19:02:28,433][12610] Stopping RolloutWorker_w6... [2023-02-25 19:02:28,437][12610] Loop rollout_proc6_evt_loop terminating... [2023-02-25 19:02:28,486][12589] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000838_3432448.pth [2023-02-25 19:02:28,504][12589] Saving new best policy, reward=26.052! [2023-02-25 19:02:28,676][12589] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:02:28,864][00219] Component LearnerWorker_p0 stopped! [2023-02-25 19:02:28,866][00219] Waiting for process learner_proc0 to stop... [2023-02-25 19:02:28,869][12589] Stopping LearnerWorker_p0... [2023-02-25 19:02:28,869][12589] Loop learner_proc0_evt_loop terminating... [2023-02-25 19:02:30,632][00219] Waiting for process inference_proc0-0 to join... [2023-02-25 19:02:30,938][00219] Waiting for process rollout_proc0 to join... [2023-02-25 19:02:31,333][00219] Waiting for process rollout_proc1 to join... [2023-02-25 19:02:31,335][00219] Waiting for process rollout_proc2 to join... 
[2023-02-25 19:02:31,338][00219] Waiting for process rollout_proc3 to join...
[2023-02-25 19:02:31,339][00219] Waiting for process rollout_proc4 to join...
[2023-02-25 19:02:31,340][00219] Waiting for process rollout_proc5 to join...
[2023-02-25 19:02:31,341][00219] Waiting for process rollout_proc6 to join...
[2023-02-25 19:02:31,342][00219] Waiting for process rollout_proc7 to join...
[2023-02-25 19:02:31,343][00219] Batcher 0 profile tree view:
batching: 25.2117, releasing_batches: 0.0278
[2023-02-25 19:02:31,345][00219] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 548.8303
update_model: 7.2818
  weight_update: 0.0017
one_step: 0.0024
  handle_policy_step: 490.4408
    deserialize: 14.9426, stack: 2.8507, obs_to_device_normalize: 110.6946, forward: 233.2077, send_messages: 25.8023
    prepare_outputs: 78.1920
      to_cpu: 48.4714
[2023-02-25 19:02:31,346][00219] Learner 0 profile tree view:
misc: 0.0104, prepare_batch: 16.6086
train: 74.9911
  epoch_init: 0.0180, minibatch_init: 0.0099, losses_postprocess: 0.6100, kl_divergence: 0.5355, after_optimizer: 33.3823
  calculate_losses: 26.2628
    losses_init: 0.0132, forward_head: 1.6609, bptt_initial: 17.5531, tail: 1.0730, advantages_returns: 0.2530, losses: 3.3578
    bptt: 2.0744
      bptt_forward_core: 1.9836
  update: 13.5598
    clip: 1.3489
[2023-02-25 19:02:31,347][00219] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3330, enqueue_policy_requests: 152.1954, env_step: 809.3310, overhead: 20.9125, complete_rollouts: 6.1680
save_policy_outputs: 19.7973
  split_output_tensors: 9.6528
[2023-02-25 19:02:31,348][00219] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.2955, enqueue_policy_requests: 153.3418, env_step: 808.5985, overhead: 20.5743, complete_rollouts: 7.3208
save_policy_outputs: 19.7544
  split_output_tensors: 9.7981
[2023-02-25 19:02:31,350][00219] Loop Runner_EvtLoop terminating...
[2023-02-25 19:02:31,351][00219] Runner profile tree view:
main_loop: 1113.1394
[2023-02-25 19:02:31,352][00219] Collected {0: 4005888}, FPS: 3598.7
[2023-02-25 19:19:55,934][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 19:19:55,935][00219] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 19:19:55,938][00219] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 19:19:55,941][00219] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 19:19:55,945][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 19:19:55,947][00219] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 19:19:55,949][00219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 19:19:55,954][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 19:19:55,956][00219] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-25 19:19:55,959][00219] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-25 19:19:55,960][00219] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 19:19:55,962][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 19:19:55,963][00219] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 19:19:55,965][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 19:19:55,967][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 19:19:55,989][00219] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 19:19:55,992][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:19:55,996][00219] RunningMeanStd input shape: (1,) [2023-02-25 19:19:56,013][00219] ConvEncoder: input_channels=3 [2023-02-25 19:19:56,665][00219] Conv encoder output size: 512 [2023-02-25 19:19:56,667][00219] Policy head output size: 512 [2023-02-25 19:19:59,027][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:20:00,254][00219] Num frames 100... [2023-02-25 19:20:00,369][00219] Num frames 200... [2023-02-25 19:20:00,478][00219] Num frames 300... [2023-02-25 19:20:00,592][00219] Num frames 400... [2023-02-25 19:20:00,714][00219] Num frames 500... [2023-02-25 19:20:00,835][00219] Num frames 600... [2023-02-25 19:20:00,951][00219] Num frames 700... [2023-02-25 19:20:01,058][00219] Num frames 800... [2023-02-25 19:20:01,172][00219] Num frames 900... [2023-02-25 19:20:01,317][00219] Avg episode rewards: #0: 23.770, true rewards: #0: 9.770 [2023-02-25 19:20:01,319][00219] Avg episode reward: 23.770, avg true_objective: 9.770 [2023-02-25 19:20:01,350][00219] Num frames 1000... [2023-02-25 19:20:01,460][00219] Num frames 1100... [2023-02-25 19:20:01,567][00219] Num frames 1200... [2023-02-25 19:20:01,682][00219] Num frames 1300... [2023-02-25 19:20:01,791][00219] Num frames 1400... [2023-02-25 19:20:01,906][00219] Num frames 1500... [2023-02-25 19:20:02,020][00219] Num frames 1600... [2023-02-25 19:20:02,129][00219] Num frames 1700... [2023-02-25 19:20:02,207][00219] Avg episode rewards: #0: 19.065, true rewards: #0: 8.565 [2023-02-25 19:20:02,209][00219] Avg episode reward: 19.065, avg true_objective: 8.565 [2023-02-25 19:20:02,306][00219] Num frames 1800... [2023-02-25 19:20:02,415][00219] Num frames 1900... 
[2023-02-25 19:20:02,531][00219] Num frames 2000... [2023-02-25 19:20:02,642][00219] Num frames 2100... [2023-02-25 19:20:02,756][00219] Num frames 2200... [2023-02-25 19:20:02,864][00219] Num frames 2300... [2023-02-25 19:20:02,979][00219] Num frames 2400... [2023-02-25 19:20:03,096][00219] Num frames 2500... [2023-02-25 19:20:03,205][00219] Num frames 2600... [2023-02-25 19:20:03,321][00219] Num frames 2700... [2023-02-25 19:20:03,432][00219] Num frames 2800... [2023-02-25 19:20:03,548][00219] Avg episode rewards: #0: 22.110, true rewards: #0: 9.443 [2023-02-25 19:20:03,551][00219] Avg episode reward: 22.110, avg true_objective: 9.443 [2023-02-25 19:20:03,655][00219] Num frames 2900... [2023-02-25 19:20:03,812][00219] Num frames 3000... [2023-02-25 19:20:03,967][00219] Num frames 3100... [2023-02-25 19:20:04,122][00219] Num frames 3200... [2023-02-25 19:20:04,277][00219] Num frames 3300... [2023-02-25 19:20:04,441][00219] Num frames 3400... [2023-02-25 19:20:04,608][00219] Num frames 3500... [2023-02-25 19:20:04,761][00219] Num frames 3600... [2023-02-25 19:20:04,915][00219] Num frames 3700... [2023-02-25 19:20:05,071][00219] Num frames 3800... [2023-02-25 19:20:05,234][00219] Num frames 3900... [2023-02-25 19:20:05,384][00219] Num frames 4000... [2023-02-25 19:20:05,542][00219] Num frames 4100... [2023-02-25 19:20:05,726][00219] Num frames 4200... [2023-02-25 19:20:05,902][00219] Num frames 4300... [2023-02-25 19:20:06,076][00219] Num frames 4400... [2023-02-25 19:20:06,245][00219] Num frames 4500... [2023-02-25 19:20:06,452][00219] Avg episode rewards: #0: 29.732, true rewards: #0: 11.482 [2023-02-25 19:20:06,454][00219] Avg episode reward: 29.732, avg true_objective: 11.482 [2023-02-25 19:20:06,466][00219] Num frames 4600... [2023-02-25 19:20:06,625][00219] Num frames 4700... [2023-02-25 19:20:06,789][00219] Num frames 4800... [2023-02-25 19:20:06,953][00219] Num frames 4900... [2023-02-25 19:20:07,079][00219] Num frames 5000... 
[2023-02-25 19:20:07,188][00219] Num frames 5100... [2023-02-25 19:20:07,306][00219] Num frames 5200... [2023-02-25 19:20:07,470][00219] Avg episode rewards: #0: 26.394, true rewards: #0: 10.594 [2023-02-25 19:20:07,471][00219] Avg episode reward: 26.394, avg true_objective: 10.594 [2023-02-25 19:20:07,479][00219] Num frames 5300... [2023-02-25 19:20:07,589][00219] Num frames 5400... [2023-02-25 19:20:07,700][00219] Num frames 5500... [2023-02-25 19:20:07,811][00219] Num frames 5600... [2023-02-25 19:20:07,929][00219] Num frames 5700... [2023-02-25 19:20:08,040][00219] Num frames 5800... [2023-02-25 19:20:08,155][00219] Num frames 5900... [2023-02-25 19:20:08,273][00219] Num frames 6000... [2023-02-25 19:20:08,384][00219] Num frames 6100... [2023-02-25 19:20:08,503][00219] Num frames 6200... [2023-02-25 19:20:08,613][00219] Num frames 6300... [2023-02-25 19:20:08,729][00219] Num frames 6400... [2023-02-25 19:20:08,841][00219] Num frames 6500... [2023-02-25 19:20:08,953][00219] Num frames 6600... [2023-02-25 19:20:09,073][00219] Num frames 6700... [2023-02-25 19:20:09,186][00219] Num frames 6800... [2023-02-25 19:20:09,300][00219] Num frames 6900... [2023-02-25 19:20:09,412][00219] Num frames 7000... [2023-02-25 19:20:09,581][00219] Avg episode rewards: #0: 30.165, true rewards: #0: 11.832 [2023-02-25 19:20:09,582][00219] Avg episode reward: 30.165, avg true_objective: 11.832 [2023-02-25 19:20:09,588][00219] Num frames 7100... [2023-02-25 19:20:09,697][00219] Num frames 7200... [2023-02-25 19:20:09,809][00219] Num frames 7300... [2023-02-25 19:20:09,920][00219] Num frames 7400... [2023-02-25 19:20:10,033][00219] Num frames 7500... [2023-02-25 19:20:10,151][00219] Num frames 7600... [2023-02-25 19:20:10,259][00219] Num frames 7700... [2023-02-25 19:20:10,368][00219] Num frames 7800... 
[2023-02-25 19:20:10,429][00219] Avg episode rewards: #0: 28.004, true rewards: #0: 11.147 [2023-02-25 19:20:10,431][00219] Avg episode reward: 28.004, avg true_objective: 11.147 [2023-02-25 19:20:10,538][00219] Num frames 7900... [2023-02-25 19:20:10,650][00219] Num frames 8000... [2023-02-25 19:20:10,758][00219] Num frames 8100... [2023-02-25 19:20:10,868][00219] Num frames 8200... [2023-02-25 19:20:11,018][00219] Avg episode rewards: #0: 25.354, true rewards: #0: 10.354 [2023-02-25 19:20:11,019][00219] Avg episode reward: 25.354, avg true_objective: 10.354 [2023-02-25 19:20:11,041][00219] Num frames 8300... [2023-02-25 19:20:11,169][00219] Num frames 8400... [2023-02-25 19:20:11,284][00219] Num frames 8500... [2023-02-25 19:20:11,393][00219] Num frames 8600... [2023-02-25 19:20:11,507][00219] Num frames 8700... [2023-02-25 19:20:11,616][00219] Num frames 8800... [2023-02-25 19:20:11,745][00219] Num frames 8900... [2023-02-25 19:20:11,855][00219] Num frames 9000... [2023-02-25 19:20:11,963][00219] Num frames 9100... [2023-02-25 19:20:12,081][00219] Num frames 9200... [2023-02-25 19:20:12,195][00219] Num frames 9300... [2023-02-25 19:20:12,306][00219] Num frames 9400... [2023-02-25 19:20:12,417][00219] Num frames 9500... [2023-02-25 19:20:12,491][00219] Avg episode rewards: #0: 26.239, true rewards: #0: 10.572 [2023-02-25 19:20:12,493][00219] Avg episode reward: 26.239, avg true_objective: 10.572 [2023-02-25 19:20:12,589][00219] Num frames 9600... [2023-02-25 19:20:12,699][00219] Num frames 9700... [2023-02-25 19:20:12,810][00219] Num frames 9800... [2023-02-25 19:20:12,921][00219] Num frames 9900... [2023-02-25 19:20:13,041][00219] Num frames 10000... [2023-02-25 19:20:13,156][00219] Num frames 10100... [2023-02-25 19:20:13,273][00219] Num frames 10200... [2023-02-25 19:20:13,389][00219] Num frames 10300... [2023-02-25 19:20:13,501][00219] Num frames 10400... [2023-02-25 19:20:13,611][00219] Num frames 10500... [2023-02-25 19:20:13,727][00219] Num frames 10600... 
[2023-02-25 19:20:13,835][00219] Num frames 10700... [2023-02-25 19:20:13,951][00219] Num frames 10800... [2023-02-25 19:20:14,060][00219] Num frames 10900... [2023-02-25 19:20:14,177][00219] Num frames 11000... [2023-02-25 19:20:14,296][00219] Num frames 11100... [2023-02-25 19:20:14,407][00219] Num frames 11200... [2023-02-25 19:20:14,524][00219] Num frames 11300... [2023-02-25 19:20:14,634][00219] Num frames 11400... [2023-02-25 19:20:14,750][00219] Num frames 11500... [2023-02-25 19:20:14,862][00219] Num frames 11600... [2023-02-25 19:20:14,937][00219] Avg episode rewards: #0: 29.315, true rewards: #0: 11.615 [2023-02-25 19:20:14,938][00219] Avg episode reward: 29.315, avg true_objective: 11.615 [2023-02-25 19:21:22,320][00219] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-25 19:23:11,077][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 19:23:11,079][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 19:23:11,081][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 19:23:11,082][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 19:23:11,084][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 19:23:11,085][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 19:23:11,087][00219] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-25 19:23:11,088][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-25 19:23:11,089][00219] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-25 19:23:11,091][00219] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! 
[2023-02-25 19:23:11,092][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 19:23:11,094][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 19:23:11,095][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 19:23:11,096][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 19:23:11,098][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 19:23:11,128][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:23:11,130][00219] RunningMeanStd input shape: (1,) [2023-02-25 19:23:11,144][00219] ConvEncoder: input_channels=3 [2023-02-25 19:23:11,179][00219] Conv encoder output size: 512 [2023-02-25 19:23:11,180][00219] Policy head output size: 512 [2023-02-25 19:23:11,201][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:23:11,643][00219] Num frames 100... [2023-02-25 19:23:11,759][00219] Num frames 200... [2023-02-25 19:23:11,872][00219] Num frames 300... [2023-02-25 19:23:12,002][00219] Num frames 400... [2023-02-25 19:23:12,118][00219] Num frames 500... [2023-02-25 19:23:12,233][00219] Num frames 600... [2023-02-25 19:23:12,348][00219] Num frames 700... [2023-02-25 19:23:12,455][00219] Num frames 800... [2023-02-25 19:23:12,574][00219] Num frames 900... [2023-02-25 19:23:12,680][00219] Num frames 1000... [2023-02-25 19:23:12,761][00219] Avg episode rewards: #0: 20.240, true rewards: #0: 10.240 [2023-02-25 19:23:12,762][00219] Avg episode reward: 20.240, avg true_objective: 10.240 [2023-02-25 19:23:12,849][00219] Num frames 1100... [2023-02-25 19:23:12,964][00219] Num frames 1200... [2023-02-25 19:23:13,080][00219] Num frames 1300... [2023-02-25 19:23:13,190][00219] Num frames 1400... [2023-02-25 19:23:13,318][00219] Num frames 1500... 
[2023-02-25 19:23:13,430][00219] Num frames 1600... [2023-02-25 19:23:13,482][00219] Avg episode rewards: #0: 15.000, true rewards: #0: 8.000 [2023-02-25 19:23:13,484][00219] Avg episode reward: 15.000, avg true_objective: 8.000 [2023-02-25 19:23:13,601][00219] Num frames 1700... [2023-02-25 19:23:13,712][00219] Num frames 1800... [2023-02-25 19:23:13,819][00219] Num frames 1900... [2023-02-25 19:23:13,929][00219] Num frames 2000... [2023-02-25 19:23:14,054][00219] Num frames 2100... [2023-02-25 19:23:14,165][00219] Num frames 2200... [2023-02-25 19:23:14,277][00219] Num frames 2300... [2023-02-25 19:23:14,396][00219] Num frames 2400... [2023-02-25 19:23:14,507][00219] Num frames 2500... [2023-02-25 19:23:14,639][00219] Num frames 2600... [2023-02-25 19:23:14,758][00219] Num frames 2700... [2023-02-25 19:23:14,875][00219] Num frames 2800... [2023-02-25 19:23:14,986][00219] Num frames 2900... [2023-02-25 19:23:15,096][00219] Num frames 3000... [2023-02-25 19:23:15,216][00219] Num frames 3100... [2023-02-25 19:23:15,330][00219] Num frames 3200... [2023-02-25 19:23:15,450][00219] Num frames 3300... [2023-02-25 19:23:15,565][00219] Num frames 3400... [2023-02-25 19:23:15,677][00219] Num frames 3500... [2023-02-25 19:23:15,829][00219] Avg episode rewards: #0: 26.280, true rewards: #0: 11.947 [2023-02-25 19:23:15,831][00219] Avg episode reward: 26.280, avg true_objective: 11.947 [2023-02-25 19:23:15,853][00219] Num frames 3600... [2023-02-25 19:23:15,963][00219] Num frames 3700... [2023-02-25 19:23:16,074][00219] Num frames 3800... [2023-02-25 19:23:16,189][00219] Num frames 3900... [2023-02-25 19:23:16,306][00219] Num frames 4000... [2023-02-25 19:23:16,416][00219] Num frames 4100... [2023-02-25 19:23:16,529][00219] Num frames 4200... [2023-02-25 19:23:16,650][00219] Num frames 4300... [2023-02-25 19:23:16,772][00219] Num frames 4400... [2023-02-25 19:23:16,889][00219] Num frames 4500... [2023-02-25 19:23:16,999][00219] Num frames 4600... 
[2023-02-25 19:23:17,116][00219] Num frames 4700... [2023-02-25 19:23:17,228][00219] Num frames 4800... [2023-02-25 19:23:17,351][00219] Num frames 4900... [2023-02-25 19:23:17,470][00219] Num frames 5000... [2023-02-25 19:23:17,626][00219] Num frames 5100... [2023-02-25 19:23:17,742][00219] Num frames 5200... [2023-02-25 19:23:17,957][00219] Num frames 5300... [2023-02-25 19:23:18,122][00219] Num frames 5400... [2023-02-25 19:23:18,243][00219] Num frames 5500... [2023-02-25 19:23:18,344][00219] Avg episode rewards: #0: 33.100, true rewards: #0: 13.850 [2023-02-25 19:23:18,345][00219] Avg episode reward: 33.100, avg true_objective: 13.850 [2023-02-25 19:23:18,414][00219] Num frames 5600... [2023-02-25 19:23:18,633][00219] Num frames 5700... [2023-02-25 19:23:18,789][00219] Num frames 5800... [2023-02-25 19:23:18,975][00219] Num frames 5900... [2023-02-25 19:23:19,172][00219] Num frames 6000... [2023-02-25 19:23:19,328][00219] Num frames 6100... [2023-02-25 19:23:19,700][00219] Num frames 6200... [2023-02-25 19:23:20,135][00219] Num frames 6300... [2023-02-25 19:23:24,140][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 19:23:24,142][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 19:23:24,144][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 19:23:24,146][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 19:23:24,149][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 19:23:24,150][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 19:23:24,155][00219] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-25 19:23:24,156][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! 
[2023-02-25 19:23:24,158][00219] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-25 19:23:24,159][00219] Adding new argument 'hf_repository'='msgerasyov/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-25 19:23:24,164][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 19:23:24,165][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 19:23:24,167][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 19:23:24,168][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 19:23:24,169][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 19:23:24,193][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:23:24,195][00219] RunningMeanStd input shape: (1,) [2023-02-25 19:23:24,208][00219] ConvEncoder: input_channels=3 [2023-02-25 19:23:24,244][00219] Conv encoder output size: 512 [2023-02-25 19:23:24,245][00219] Policy head output size: 512 [2023-02-25 19:23:24,267][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:23:24,701][00219] Num frames 100... [2023-02-25 19:23:24,821][00219] Num frames 200... [2023-02-25 19:23:24,938][00219] Num frames 300... [2023-02-25 19:23:25,054][00219] Num frames 400... [2023-02-25 19:23:25,170][00219] Num frames 500... [2023-02-25 19:23:25,280][00219] Avg episode rewards: #0: 8.470, true rewards: #0: 5.470 [2023-02-25 19:23:25,281][00219] Avg episode reward: 8.470, avg true_objective: 5.470 [2023-02-25 19:23:25,346][00219] Num frames 600... [2023-02-25 19:23:25,459][00219] Num frames 700... [2023-02-25 19:23:25,571][00219] Num frames 800... [2023-02-25 19:23:25,683][00219] Num frames 900... [2023-02-25 19:23:25,801][00219] Num frames 1000... 
[2023-02-25 19:23:25,917][00219] Num frames 1100... [2023-02-25 19:23:26,032][00219] Num frames 1200... [2023-02-25 19:23:26,150][00219] Num frames 1300... [2023-02-25 19:23:26,265][00219] Avg episode rewards: #0: 12.770, true rewards: #0: 6.770 [2023-02-25 19:23:26,266][00219] Avg episode reward: 12.770, avg true_objective: 6.770 [2023-02-25 19:23:26,320][00219] Num frames 1400... [2023-02-25 19:23:26,437][00219] Num frames 1500... [2023-02-25 19:23:26,547][00219] Num frames 1600... [2023-02-25 19:23:26,658][00219] Num frames 1700... [2023-02-25 19:23:26,780][00219] Num frames 1800... [2023-02-25 19:23:26,891][00219] Num frames 1900... [2023-02-25 19:23:27,004][00219] Num frames 2000... [2023-02-25 19:23:27,131][00219] Num frames 2100... [2023-02-25 19:23:27,245][00219] Num frames 2200... [2023-02-25 19:23:27,358][00219] Num frames 2300... [2023-02-25 19:23:27,478][00219] Num frames 2400... [2023-02-25 19:23:27,589][00219] Num frames 2500... [2023-02-25 19:23:27,698][00219] Num frames 2600... [2023-02-25 19:23:27,823][00219] Num frames 2700... [2023-02-25 19:23:27,939][00219] Num frames 2800... [2023-02-25 19:23:28,057][00219] Num frames 2900... [2023-02-25 19:23:28,174][00219] Num frames 3000... [2023-02-25 19:23:28,286][00219] Num frames 3100... [2023-02-25 19:23:28,398][00219] Num frames 3200... [2023-02-25 19:23:28,512][00219] Num frames 3300... [2023-02-25 19:23:28,622][00219] Num frames 3400... [2023-02-25 19:23:28,737][00219] Avg episode rewards: #0: 28.180, true rewards: #0: 11.513 [2023-02-25 19:23:28,740][00219] Avg episode reward: 28.180, avg true_objective: 11.513 [2023-02-25 19:23:28,804][00219] Num frames 3500... [2023-02-25 19:23:28,912][00219] Num frames 3600... [2023-02-25 19:23:29,026][00219] Num frames 3700... [2023-02-25 19:23:29,142][00219] Num frames 3800... [2023-02-25 19:23:29,253][00219] Num frames 3900... [2023-02-25 19:23:29,368][00219] Num frames 4000... [2023-02-25 19:23:29,483][00219] Num frames 4100... 
[2023-02-25 19:23:29,552][00219] Avg episode rewards: #0: 23.780, true rewards: #0: 10.280 [2023-02-25 19:23:29,554][00219] Avg episode reward: 23.780, avg true_objective: 10.280 [2023-02-25 19:23:29,656][00219] Num frames 4200... [2023-02-25 19:23:29,766][00219] Num frames 4300... [2023-02-25 19:23:29,887][00219] Num frames 4400... [2023-02-25 19:23:30,003][00219] Num frames 4500... [2023-02-25 19:23:30,119][00219] Num frames 4600... [2023-02-25 19:23:30,229][00219] Num frames 4700... [2023-02-25 19:23:30,338][00219] Num frames 4800... [2023-02-25 19:23:30,452][00219] Avg episode rewards: #0: 21.896, true rewards: #0: 9.696 [2023-02-25 19:23:30,454][00219] Avg episode reward: 21.896, avg true_objective: 9.696 [2023-02-25 19:23:30,516][00219] Num frames 4900... [2023-02-25 19:23:30,630][00219] Num frames 5000... [2023-02-25 19:23:30,740][00219] Num frames 5100... [2023-02-25 19:23:30,869][00219] Num frames 5200... [2023-02-25 19:23:30,986][00219] Num frames 5300... [2023-02-25 19:23:31,097][00219] Num frames 5400... [2023-02-25 19:23:31,254][00219] Avg episode rewards: #0: 20.147, true rewards: #0: 9.147 [2023-02-25 19:23:31,256][00219] Avg episode reward: 20.147, avg true_objective: 9.147 [2023-02-25 19:23:31,273][00219] Num frames 5500... [2023-02-25 19:23:31,384][00219] Num frames 5600... [2023-02-25 19:23:31,498][00219] Num frames 5700... [2023-02-25 19:23:31,608][00219] Num frames 5800... [2023-02-25 19:23:31,720][00219] Num frames 5900... [2023-02-25 19:23:31,841][00219] Num frames 6000... [2023-02-25 19:23:31,956][00219] Num frames 6100... [2023-02-25 19:23:32,076][00219] Num frames 6200... [2023-02-25 19:23:32,186][00219] Num frames 6300... [2023-02-25 19:23:32,329][00219] Num frames 6400... [2023-02-25 19:23:32,485][00219] Num frames 6500... [2023-02-25 19:23:32,641][00219] Num frames 6600... [2023-02-25 19:23:32,794][00219] Num frames 6700... [2023-02-25 19:23:32,957][00219] Num frames 6800... [2023-02-25 19:23:33,115][00219] Num frames 6900... 
[2023-02-25 19:23:33,274][00219] Num frames 7000... [2023-02-25 19:23:33,439][00219] Num frames 7100... [2023-02-25 19:23:33,604][00219] Num frames 7200... [2023-02-25 19:23:33,767][00219] Num frames 7300... [2023-02-25 19:23:33,928][00219] Num frames 7400... [2023-02-25 19:23:34,091][00219] Num frames 7500... [2023-02-25 19:23:34,293][00219] Avg episode rewards: #0: 25.554, true rewards: #0: 10.840 [2023-02-25 19:23:34,295][00219] Avg episode reward: 25.554, avg true_objective: 10.840 [2023-02-25 19:23:34,321][00219] Num frames 7600... [2023-02-25 19:23:34,488][00219] Num frames 7700... [2023-02-25 19:23:34,573][00219] Avg episode rewards: #0: 22.520, true rewards: #0: 9.645 [2023-02-25 19:23:34,575][00219] Avg episode reward: 22.520, avg true_objective: 9.645 [2023-02-25 19:23:34,711][00219] Num frames 7800... [2023-02-25 19:23:34,864][00219] Num frames 7900... [2023-02-25 19:23:35,029][00219] Num frames 8000... [2023-02-25 19:23:35,194][00219] Num frames 8100... [2023-02-25 19:23:35,362][00219] Num frames 8200... [2023-02-25 19:23:35,527][00219] Num frames 8300... [2023-02-25 19:23:35,690][00219] Num frames 8400... [2023-02-25 19:23:35,829][00219] Num frames 8500... [2023-02-25 19:23:35,947][00219] Num frames 8600... [2023-02-25 19:23:36,060][00219] Num frames 8700... [2023-02-25 19:23:36,125][00219] Avg episode rewards: #0: 22.786, true rewards: #0: 9.676 [2023-02-25 19:23:36,126][00219] Avg episode reward: 22.786, avg true_objective: 9.676 [2023-02-25 19:23:36,237][00219] Num frames 8800... [2023-02-25 19:23:36,345][00219] Num frames 8900... [2023-02-25 19:23:36,457][00219] Num frames 9000... [2023-02-25 19:23:36,569][00219] Num frames 9100... [2023-02-25 19:23:36,686][00219] Num frames 9200... [2023-02-25 19:23:36,794][00219] Num frames 9300... [2023-02-25 19:23:36,917][00219] Num frames 9400... 
[2023-02-25 19:23:37,049][00219] Avg episode rewards: #0: 22.456, true rewards: #0: 9.456 [2023-02-25 19:23:37,051][00219] Avg episode reward: 22.456, avg true_objective: 9.456 [2023-02-25 19:24:32,561][00219] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-25 19:24:35,764][00219] The model has been pushed to https://huggingface.co/msgerasyov/rl_course_vizdoom_health_gathering_supreme [2023-02-25 19:27:15,490][00219] Loading legacy config file train_dir/doom_health_gathering_supreme_2222/cfg.json instead of train_dir/doom_health_gathering_supreme_2222/config.json [2023-02-25 19:27:15,492][00219] Loading existing experiment configuration from train_dir/doom_health_gathering_supreme_2222/config.json [2023-02-25 19:27:15,497][00219] Overriding arg 'experiment' with value 'doom_health_gathering_supreme_2222' passed from command line [2023-02-25 19:27:15,498][00219] Overriding arg 'train_dir' with value 'train_dir' passed from command line [2023-02-25 19:27:15,500][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 19:27:15,502][00219] Adding new argument 'lr_adaptive_min'=1e-06 that is not in the saved config file! [2023-02-25 19:27:15,503][00219] Adding new argument 'lr_adaptive_max'=0.01 that is not in the saved config file! [2023-02-25 19:27:15,505][00219] Adding new argument 'env_gpu_observations'=True that is not in the saved config file! [2023-02-25 19:27:15,507][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 19:27:15,508][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 19:27:15,510][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 19:27:15,511][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 19:27:15,513][00219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! 
[2023-02-25 19:27:15,514][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-25 19:27:15,516][00219] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-25 19:27:15,517][00219] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-25 19:27:15,519][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 19:27:15,520][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 19:27:15,522][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 19:27:15,523][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 19:27:15,525][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 19:27:15,563][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:27:15,564][00219] RunningMeanStd input shape: (1,) [2023-02-25 19:27:15,579][00219] ConvEncoder: input_channels=3 [2023-02-25 19:27:15,623][00219] Conv encoder output size: 512 [2023-02-25 19:27:15,625][00219] Policy head output size: 512 [2023-02-25 19:27:15,650][00219] Loading state from checkpoint train_dir/doom_health_gathering_supreme_2222/checkpoint_p0/checkpoint_000539850_4422451200.pth... [2023-02-25 19:27:16,153][00219] Num frames 100... [2023-02-25 19:27:16,272][00219] Num frames 200... [2023-02-25 19:27:16,386][00219] Num frames 300... [2023-02-25 19:27:16,500][00219] Num frames 400... [2023-02-25 19:27:16,639][00219] Num frames 500... [2023-02-25 19:27:16,752][00219] Num frames 600... [2023-02-25 19:27:16,877][00219] Num frames 700... [2023-02-25 19:27:16,996][00219] Num frames 800... [2023-02-25 19:27:17,107][00219] Num frames 900... [2023-02-25 19:27:17,221][00219] Num frames 1000... [2023-02-25 19:27:17,332][00219] Num frames 1100... [2023-02-25 19:27:17,447][00219] Num frames 1200... 
[2023-02-25 19:27:17,555][00219] Num frames 1300... [2023-02-25 19:27:17,676][00219] Num frames 1400... [2023-02-25 19:27:17,867][00219] Num frames 1500... [2023-02-25 19:27:18,146][00219] Num frames 1600... [2023-02-25 19:27:18,298][00219] Num frames 1700... [2023-02-25 19:27:18,432][00219] Num frames 1800... [2023-02-25 19:27:18,545][00219] Num frames 1900... [2023-02-25 19:27:18,658][00219] Num frames 2000... [2023-02-25 19:27:18,773][00219] Num frames 2100... [2023-02-25 19:27:18,825][00219] Avg episode rewards: #0: 63.999, true rewards: #0: 21.000 [2023-02-25 19:27:18,827][00219] Avg episode reward: 63.999, avg true_objective: 21.000 [2023-02-25 19:27:18,946][00219] Num frames 2200... [2023-02-25 19:27:19,064][00219] Num frames 2300... [2023-02-25 19:27:19,178][00219] Num frames 2400... [2023-02-25 19:27:19,312][00219] Num frames 2500... [2023-02-25 19:27:19,433][00219] Num frames 2600... [2023-02-25 19:27:19,561][00219] Num frames 2700... [2023-02-25 19:27:19,684][00219] Num frames 2800... [2023-02-25 19:27:19,809][00219] Num frames 2900... [2023-02-25 19:27:19,935][00219] Num frames 3000... [2023-02-25 19:27:20,052][00219] Num frames 3100... [2023-02-25 19:27:20,170][00219] Num frames 3200... [2023-02-25 19:27:20,288][00219] Num frames 3300... [2023-02-25 19:27:20,411][00219] Num frames 3400... [2023-02-25 19:27:20,542][00219] Num frames 3500... [2023-02-25 19:27:20,668][00219] Num frames 3600... [2023-02-25 19:27:20,795][00219] Num frames 3700... [2023-02-25 19:27:20,927][00219] Num frames 3800... [2023-02-25 19:27:21,060][00219] Num frames 3900... [2023-02-25 19:27:21,193][00219] Num frames 4000... [2023-02-25 19:27:21,316][00219] Num frames 4100... [2023-02-25 19:27:21,458][00219] Num frames 4200... [2023-02-25 19:27:21,510][00219] Avg episode rewards: #0: 62.998, true rewards: #0: 21.000 [2023-02-25 19:27:21,512][00219] Avg episode reward: 62.998, avg true_objective: 21.000 [2023-02-25 19:27:21,636][00219] Num frames 4300... 
[2023-02-25 19:27:21,751][00219] Num frames 4400... [2023-02-25 19:27:21,865][00219] Num frames 4500... [2023-02-25 19:27:21,994][00219] Num frames 4600... [2023-02-25 19:27:22,110][00219] Num frames 4700... [2023-02-25 19:27:22,222][00219] Num frames 4800... [2023-02-25 19:27:22,331][00219] Num frames 4900... [2023-02-25 19:27:22,450][00219] Num frames 5000... [2023-02-25 19:27:22,560][00219] Num frames 5100... [2023-02-25 19:27:22,673][00219] Num frames 5200... [2023-02-25 19:27:22,793][00219] Num frames 5300... [2023-02-25 19:27:22,912][00219] Num frames 5400... [2023-02-25 19:27:23,046][00219] Num frames 5500... [2023-02-25 19:27:23,166][00219] Num frames 5600... [2023-02-25 19:27:23,290][00219] Num frames 5700... [2023-02-25 19:27:23,414][00219] Num frames 5800... [2023-02-25 19:27:23,540][00219] Num frames 5900... [2023-02-25 19:27:23,661][00219] Num frames 6000... [2023-02-25 19:27:23,780][00219] Num frames 6100... [2023-02-25 19:27:23,903][00219] Num frames 6200... [2023-02-25 19:27:24,051][00219] Num frames 6300... [2023-02-25 19:27:24,104][00219] Avg episode rewards: #0: 63.999, true rewards: #0: 21.000 [2023-02-25 19:27:24,105][00219] Avg episode reward: 63.999, avg true_objective: 21.000 [2023-02-25 19:27:24,234][00219] Num frames 6400... [2023-02-25 19:27:24,353][00219] Num frames 6500... [2023-02-25 19:27:24,475][00219] Num frames 6600... [2023-02-25 19:27:24,604][00219] Num frames 6700... [2023-02-25 19:27:24,723][00219] Num frames 6800... [2023-02-25 19:27:24,843][00219] Num frames 6900... [2023-02-25 19:27:24,963][00219] Num frames 7000... [2023-02-25 19:27:25,120][00219] Num frames 7100... [2023-02-25 19:27:25,294][00219] Num frames 7200... [2023-02-25 19:27:25,463][00219] Num frames 7300... [2023-02-25 19:27:25,632][00219] Num frames 7400... [2023-02-25 19:27:25,796][00219] Num frames 7500... [2023-02-25 19:27:25,972][00219] Num frames 7600... [2023-02-25 19:27:26,152][00219] Num frames 7700... [2023-02-25 19:27:26,336][00219] Num frames 7800... 
[2023-02-25 19:27:26,507][00219] Num frames 7900... [2023-02-25 19:27:26,666][00219] Num frames 8000... [2023-02-25 19:27:26,831][00219] Num frames 8100... [2023-02-25 19:27:27,009][00219] Num frames 8200... [2023-02-25 19:27:27,196][00219] Num frames 8300... [2023-02-25 19:27:27,372][00219] Num frames 8400... [2023-02-25 19:27:27,425][00219] Avg episode rewards: #0: 64.749, true rewards: #0: 21.000 [2023-02-25 19:27:27,428][00219] Avg episode reward: 64.749, avg true_objective: 21.000 [2023-02-25 19:27:27,590][00219] Num frames 8500... [2023-02-25 19:27:27,757][00219] Num frames 8600... [2023-02-25 19:27:27,921][00219] Num frames 8700... [2023-02-25 19:27:28,105][00219] Num frames 8800... [2023-02-25 19:27:28,282][00219] Num frames 8900... [2023-02-25 19:27:28,462][00219] Num frames 9000... [2023-02-25 19:27:28,637][00219] Num frames 9100... [2023-02-25 19:27:28,817][00219] Num frames 9200... [2023-02-25 19:27:28,943][00219] Num frames 9300... [2023-02-25 19:27:29,060][00219] Num frames 9400... [2023-02-25 19:27:29,176][00219] Num frames 9500... [2023-02-25 19:27:29,300][00219] Num frames 9600... [2023-02-25 19:27:29,417][00219] Num frames 9700... [2023-02-25 19:27:29,539][00219] Num frames 9800... [2023-02-25 19:27:29,654][00219] Num frames 9900... [2023-02-25 19:27:29,781][00219] Num frames 10000... [2023-02-25 19:27:29,898][00219] Num frames 10100... [2023-02-25 19:27:30,015][00219] Num frames 10200... [2023-02-25 19:27:30,146][00219] Num frames 10300... [2023-02-25 19:27:30,267][00219] Num frames 10400... [2023-02-25 19:27:30,393][00219] Num frames 10500... [2023-02-25 19:27:30,446][00219] Avg episode rewards: #0: 65.199, true rewards: #0: 21.000 [2023-02-25 19:27:30,447][00219] Avg episode reward: 65.199, avg true_objective: 21.000 [2023-02-25 19:27:30,577][00219] Num frames 10600... [2023-02-25 19:27:30,704][00219] Num frames 10700... [2023-02-25 19:27:30,821][00219] Num frames 10800... [2023-02-25 19:27:30,956][00219] Num frames 10900... 
[2023-02-25 19:27:31,084][00219] Num frames 11000... [2023-02-25 19:27:31,203][00219] Num frames 11100... [2023-02-25 19:27:31,338][00219] Num frames 11200... [2023-02-25 19:27:31,463][00219] Num frames 11300... [2023-02-25 19:27:31,587][00219] Num frames 11400... [2023-02-25 19:27:31,711][00219] Num frames 11500... [2023-02-25 19:27:31,831][00219] Num frames 11600... [2023-02-25 19:27:31,956][00219] Num frames 11700... [2023-02-25 19:27:32,088][00219] Num frames 11800... [2023-02-25 19:27:32,216][00219] Num frames 11900... [2023-02-25 19:27:32,352][00219] Num frames 12000... [2023-02-25 19:27:32,469][00219] Num frames 12100... [2023-02-25 19:27:32,591][00219] Num frames 12200... [2023-02-25 19:27:32,710][00219] Num frames 12300... [2023-02-25 19:27:32,838][00219] Num frames 12400... [2023-02-25 19:27:32,961][00219] Num frames 12500... [2023-02-25 19:27:33,088][00219] Num frames 12600... [2023-02-25 19:27:33,140][00219] Avg episode rewards: #0: 64.832, true rewards: #0: 21.000 [2023-02-25 19:27:33,142][00219] Avg episode reward: 64.832, avg true_objective: 21.000 [2023-02-25 19:27:33,268][00219] Num frames 12700... [2023-02-25 19:27:33,406][00219] Num frames 12800... [2023-02-25 19:27:33,532][00219] Num frames 12900... [2023-02-25 19:27:33,657][00219] Num frames 13000... [2023-02-25 19:27:33,780][00219] Num frames 13100... [2023-02-25 19:27:33,909][00219] Num frames 13200... [2023-02-25 19:27:34,036][00219] Num frames 13300... [2023-02-25 19:27:34,164][00219] Num frames 13400... [2023-02-25 19:27:34,303][00219] Num frames 13500... [2023-02-25 19:27:34,434][00219] Num frames 13600... [2023-02-25 19:27:34,555][00219] Num frames 13700... [2023-02-25 19:27:34,673][00219] Num frames 13800... [2023-02-25 19:27:34,797][00219] Num frames 13900... [2023-02-25 19:27:34,919][00219] Num frames 14000... [2023-02-25 19:27:35,040][00219] Num frames 14100... [2023-02-25 19:27:35,156][00219] Num frames 14200... [2023-02-25 19:27:35,290][00219] Num frames 14300... 
[2023-02-25 19:27:35,416][00219] Num frames 14400... [2023-02-25 19:27:35,544][00219] Num frames 14500... [2023-02-25 19:27:35,662][00219] Num frames 14600... [2023-02-25 19:27:35,785][00219] Num frames 14700... [2023-02-25 19:27:35,837][00219] Avg episode rewards: #0: 65.284, true rewards: #0: 21.000 [2023-02-25 19:27:35,839][00219] Avg episode reward: 65.284, avg true_objective: 21.000 [2023-02-25 19:27:35,961][00219] Num frames 14800... [2023-02-25 19:27:36,083][00219] Num frames 14900... [2023-02-25 19:27:36,216][00219] Num frames 15000... [2023-02-25 19:27:36,338][00219] Num frames 15100... [2023-02-25 19:27:36,471][00219] Num frames 15200... [2023-02-25 19:27:36,590][00219] Num frames 15300... [2023-02-25 19:27:36,711][00219] Num frames 15400... [2023-02-25 19:27:36,836][00219] Num frames 15500... [2023-02-25 19:27:36,972][00219] Num frames 15600... [2023-02-25 19:27:37,099][00219] Num frames 15700... [2023-02-25 19:27:37,224][00219] Num frames 15800... [2023-02-25 19:27:37,360][00219] Num frames 15900... [2023-02-25 19:27:37,496][00219] Num frames 16000... [2023-02-25 19:27:37,628][00219] Num frames 16100... [2023-02-25 19:27:37,755][00219] Num frames 16200... [2023-02-25 19:27:37,889][00219] Num frames 16300... [2023-02-25 19:27:38,020][00219] Num frames 16400... [2023-02-25 19:27:38,149][00219] Num frames 16500... [2023-02-25 19:27:38,266][00219] Num frames 16600... [2023-02-25 19:27:38,399][00219] Num frames 16700... [2023-02-25 19:27:38,519][00219] Num frames 16800... [2023-02-25 19:27:38,573][00219] Avg episode rewards: #0: 65.749, true rewards: #0: 21.000 [2023-02-25 19:27:38,576][00219] Avg episode reward: 65.749, avg true_objective: 21.000 [2023-02-25 19:27:38,697][00219] Num frames 16900... [2023-02-25 19:27:38,824][00219] Num frames 17000... [2023-02-25 19:27:38,993][00219] Num frames 17100... [2023-02-25 19:27:39,160][00219] Num frames 17200... [2023-02-25 19:27:39,321][00219] Num frames 17300... 
[2023-02-25 19:27:39,522][00219] Avg episode rewards: #0: 59.871, true rewards: #0: 19.317 [2023-02-25 19:27:39,524][00219] Avg episode reward: 59.871, avg true_objective: 19.317 [2023-02-25 19:27:39,550][00219] Num frames 17400... [2023-02-25 19:27:39,717][00219] Num frames 17500... [2023-02-25 19:27:39,877][00219] Num frames 17600... [2023-02-25 19:27:40,044][00219] Num frames 17700... [2023-02-25 19:27:40,201][00219] Num frames 17800... [2023-02-25 19:27:40,360][00219] Num frames 17900... [2023-02-25 19:27:40,522][00219] Num frames 18000... [2023-02-25 19:27:40,683][00219] Num frames 18100... [2023-02-25 19:27:40,864][00219] Num frames 18200... [2023-02-25 19:27:41,053][00219] Num frames 18300... [2023-02-25 19:27:41,232][00219] Num frames 18400... [2023-02-25 19:27:41,413][00219] Num frames 18500... [2023-02-25 19:27:41,607][00219] Num frames 18600... [2023-02-25 19:27:41,789][00219] Num frames 18700... [2023-02-25 19:27:41,975][00219] Num frames 18800... [2023-02-25 19:27:42,171][00219] Num frames 18900... [2023-02-25 19:27:42,357][00219] Num frames 19000... [2023-02-25 19:27:42,531][00219] Num frames 19100... [2023-02-25 19:27:42,708][00219] Num frames 19200... [2023-02-25 19:27:42,832][00219] Num frames 19300... [2023-02-25 19:27:42,969][00219] Num frames 19400... [2023-02-25 19:27:43,130][00219] Avg episode rewards: #0: 60.484, true rewards: #0: 19.485 [2023-02-25 19:27:43,132][00219] Avg episode reward: 60.484, avg true_objective: 19.485 [2023-02-25 19:29:46,788][00219] Replay video saved to train_dir/doom_health_gathering_supreme_2222/replay.mp4! [2023-02-25 19:36:48,856][00219] Environment doom_basic already registered, overwriting... [2023-02-25 19:36:48,858][00219] Environment doom_two_colors_easy already registered, overwriting... [2023-02-25 19:36:48,860][00219] Environment doom_two_colors_hard already registered, overwriting... [2023-02-25 19:36:48,861][00219] Environment doom_dm already registered, overwriting... 
[2023-02-25 19:36:48,862][00219] Environment doom_dwango5 already registered, overwriting... [2023-02-25 19:36:48,864][00219] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-25 19:36:48,865][00219] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-25 19:36:48,866][00219] Environment doom_my_way_home already registered, overwriting... [2023-02-25 19:36:48,869][00219] Environment doom_deadly_corridor already registered, overwriting... [2023-02-25 19:36:48,870][00219] Environment doom_defend_the_center already registered, overwriting... [2023-02-25 19:36:48,872][00219] Environment doom_defend_the_line already registered, overwriting... [2023-02-25 19:36:48,873][00219] Environment doom_health_gathering already registered, overwriting... [2023-02-25 19:36:48,875][00219] Environment doom_health_gathering_supreme already registered, overwriting... [2023-02-25 19:36:48,876][00219] Environment doom_battle already registered, overwriting... [2023-02-25 19:36:48,878][00219] Environment doom_battle2 already registered, overwriting... [2023-02-25 19:36:48,879][00219] Environment doom_duel_bots already registered, overwriting... [2023-02-25 19:36:48,881][00219] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-25 19:36:48,882][00219] Environment doom_duel already registered, overwriting... [2023-02-25 19:36:48,884][00219] Environment doom_deathmatch_full already registered, overwriting... [2023-02-25 19:36:48,885][00219] Environment doom_benchmark already registered, overwriting... [2023-02-25 19:36:48,887][00219] register_encoder_factory: [2023-02-25 19:37:34,993][00219] Environment doom_basic already registered, overwriting... [2023-02-25 19:37:34,998][00219] Environment doom_two_colors_easy already registered, overwriting... [2023-02-25 19:37:35,000][00219] Environment doom_two_colors_hard already registered, overwriting... 
[2023-02-25 19:37:35,003][00219] Environment doom_dm already registered, overwriting... [2023-02-25 19:37:35,008][00219] Environment doom_dwango5 already registered, overwriting... [2023-02-25 19:37:35,009][00219] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-25 19:37:35,010][00219] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-25 19:37:35,012][00219] Environment doom_my_way_home already registered, overwriting... [2023-02-25 19:37:35,017][00219] Environment doom_deadly_corridor already registered, overwriting... [2023-02-25 19:37:35,018][00219] Environment doom_defend_the_center already registered, overwriting... [2023-02-25 19:37:35,022][00219] Environment doom_defend_the_line already registered, overwriting... [2023-02-25 19:37:35,023][00219] Environment doom_health_gathering already registered, overwriting... [2023-02-25 19:37:35,029][00219] Environment doom_health_gathering_supreme already registered, overwriting... [2023-02-25 19:37:35,032][00219] Environment doom_battle already registered, overwriting... [2023-02-25 19:37:35,036][00219] Environment doom_battle2 already registered, overwriting... [2023-02-25 19:37:35,037][00219] Environment doom_duel_bots already registered, overwriting... [2023-02-25 19:37:35,039][00219] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-25 19:37:35,043][00219] Environment doom_duel already registered, overwriting... [2023-02-25 19:37:35,044][00219] Environment doom_deathmatch_full already registered, overwriting... [2023-02-25 19:37:35,045][00219] Environment doom_benchmark already registered, overwriting... 
[2023-02-25 19:37:35,047][00219] register_encoder_factory: [2023-02-25 19:37:35,065][00219] Loading legacy config file train_dir/doom_deathmatch_bots_2222/cfg.json instead of train_dir/doom_deathmatch_bots_2222/config.json [2023-02-25 19:37:35,066][00219] Loading existing experiment configuration from train_dir/doom_deathmatch_bots_2222/config.json [2023-02-25 19:37:35,069][00219] Overriding arg 'experiment' with value 'doom_deathmatch_bots_2222' passed from command line [2023-02-25 19:37:35,070][00219] Overriding arg 'train_dir' with value 'train_dir' passed from command line [2023-02-25 19:37:35,072][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 19:37:35,073][00219] Adding new argument 'lr_adaptive_min'=1e-06 that is not in the saved config file! [2023-02-25 19:37:35,074][00219] Adding new argument 'lr_adaptive_max'=0.01 that is not in the saved config file! [2023-02-25 19:37:35,075][00219] Adding new argument 'env_gpu_observations'=True that is not in the saved config file! [2023-02-25 19:37:35,077][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 19:37:35,078][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 19:37:35,079][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 19:37:35,080][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 19:37:35,082][00219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 19:37:35,083][00219] Adding new argument 'max_num_episodes'=1 that is not in the saved config file! [2023-02-25 19:37:35,084][00219] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-25 19:37:35,085][00219] Adding new argument 'hf_repository'=None that is not in the saved config file! 
[2023-02-25 19:37:35,086][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 19:37:35,088][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 19:37:35,089][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 19:37:35,090][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 19:37:35,091][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 19:37:35,133][00219] Port 40300 is available [2023-02-25 19:37:35,134][00219] Using port 40300 [2023-02-25 19:37:35,137][00219] RunningMeanStd input shape: (23,) [2023-02-25 19:37:35,141][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:37:35,142][00219] RunningMeanStd input shape: (1,) [2023-02-25 19:37:35,160][00219] ConvEncoder: input_channels=3 [2023-02-25 19:37:35,197][00219] Conv encoder output size: 512 [2023-02-25 19:37:35,198][00219] Policy head output size: 512 [2023-02-25 19:37:35,239][00219] Loading state from checkpoint train_dir/doom_deathmatch_bots_2222/checkpoint_p0/checkpoint_000282220_2311946240.pth... [2023-02-25 19:37:35,275][00219] Using port 40300 on host... [2023-02-25 19:37:35,617][00219] Initialized w:0 v:0 player:0 [2023-02-25 19:37:35,792][00219] Num frames 100... [2023-02-25 19:37:35,957][00219] Num frames 200... [2023-02-25 19:37:36,127][00219] Num frames 300... [2023-02-25 19:37:36,294][00219] Num frames 400... [2023-02-25 19:37:36,458][00219] Num frames 500... [2023-02-25 19:37:36,621][00219] Num frames 600... [2023-02-25 19:37:36,790][00219] Num frames 700... [2023-02-25 19:37:36,952][00219] Num frames 800... [2023-02-25 19:37:37,113][00219] Num frames 900... [2023-02-25 19:37:37,271][00219] Num frames 1000... [2023-02-25 19:37:37,439][00219] Num frames 1100... [2023-02-25 19:37:37,608][00219] Num frames 1200... [2023-02-25 19:37:37,775][00219] Num frames 1300... 
[2023-02-25 19:37:37,945][00219] Num frames 1400... [2023-02-25 19:37:38,116][00219] Num frames 1500... [2023-02-25 19:37:38,280][00219] Num frames 1600... [2023-02-25 19:37:38,443][00219] Num frames 1700... [2023-02-25 19:37:38,610][00219] Num frames 1800... [2023-02-25 19:37:38,783][00219] Num frames 1900... [2023-02-25 19:37:38,946][00219] Num frames 2000... [2023-02-25 19:37:39,110][00219] Num frames 2100... [2023-02-25 19:37:39,276][00219] Num frames 2200... [2023-02-25 19:37:39,448][00219] Num frames 2300... [2023-02-25 19:37:39,617][00219] Num frames 2400... [2023-02-25 19:37:39,800][00219] Num frames 2500... [2023-02-25 19:37:39,970][00219] Num frames 2600... [2023-02-25 19:37:40,136][00219] Num frames 2700... [2023-02-25 19:37:40,296][00219] Num frames 2800... [2023-02-25 19:37:40,470][00219] Num frames 2900... [2023-02-25 19:37:40,632][00219] Num frames 3000... [2023-02-25 19:37:40,799][00219] Num frames 3100... [2023-02-25 19:37:40,970][00219] Num frames 3200... [2023-02-25 19:37:41,138][00219] Num frames 3300... [2023-02-25 19:37:41,305][00219] Num frames 3400... [2023-02-25 19:37:41,474][00219] Num frames 3500... [2023-02-25 19:37:41,637][00219] Num frames 3600... [2023-02-25 19:37:41,811][00219] Num frames 3700... [2023-02-25 19:37:41,978][00219] Num frames 3800... [2023-02-25 19:37:42,185][00219] Num frames 3900... [2023-02-25 19:37:42,432][00219] Num frames 4000... [2023-02-25 19:37:42,672][00219] Num frames 4100... [2023-02-25 19:37:42,929][00219] Num frames 4200... [2023-02-25 19:37:43,172][00219] Num frames 4300... [2023-02-25 19:37:43,401][00219] Num frames 4400... [2023-02-25 19:37:43,632][00219] Num frames 4500... [2023-02-25 19:37:43,892][00219] Num frames 4600... [2023-02-25 19:37:44,131][00219] Num frames 4700... [2023-02-25 19:37:44,365][00219] Num frames 4800... [2023-02-25 19:37:44,608][00219] Num frames 4900... [2023-02-25 19:37:44,847][00219] Num frames 5000... [2023-02-25 19:37:45,082][00219] Num frames 5100... 
[2023-02-25 19:37:45,312][00219] Num frames 5200... [2023-02-25 19:37:45,545][00219] Num frames 5300... [2023-02-25 19:37:45,735][00219] Num frames 5400... [2023-02-25 19:37:45,911][00219] Num frames 5500... [2023-02-25 19:37:46,066][00219] Num frames 5600... [2023-02-25 19:37:46,225][00219] Num frames 5700... [2023-02-25 19:37:46,394][00219] Num frames 5800... [2023-02-25 19:37:46,551][00219] Num frames 5900... [2023-02-25 19:37:46,725][00219] Num frames 6000... [2023-02-25 19:37:46,889][00219] Num frames 6100... [2023-02-25 19:37:47,063][00219] Num frames 6200... [2023-02-25 19:37:47,224][00219] Num frames 6300... [2023-02-25 19:37:47,393][00219] Num frames 6400... [2023-02-25 19:37:47,551][00219] Num frames 6500... [2023-02-25 19:37:47,713][00219] Num frames 6600... [2023-02-25 19:37:47,885][00219] Num frames 6700... [2023-02-25 19:37:48,073][00219] Num frames 6800... [2023-02-25 19:37:48,256][00219] Num frames 6900... [2023-02-25 19:37:48,433][00219] Num frames 7000... [2023-02-25 19:37:48,614][00219] Num frames 7100... [2023-02-25 19:37:48,810][00219] Num frames 7200... [2023-02-25 19:37:48,984][00219] Num frames 7300... [2023-02-25 19:37:49,152][00219] Num frames 7400... [2023-02-25 19:37:49,337][00219] Num frames 7500... [2023-02-25 19:37:49,518][00219] Num frames 7600... [2023-02-25 19:37:49,673][00219] Num frames 7700... [2023-02-25 19:37:49,831][00219] Num frames 7800... [2023-02-25 19:37:50,003][00219] Num frames 7900... [2023-02-25 19:37:50,164][00219] Num frames 8000... [2023-02-25 19:37:50,331][00219] Num frames 8100... [2023-02-25 19:37:50,492][00219] Num frames 8200... [2023-02-25 19:37:50,656][00219] Num frames 8300... 
[2023-02-25 19:37:50,826][00219] DAMAGECOUNT value on done: 8109.0 [2023-02-25 19:37:50,828][00219] Sum rewards: 108.304, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-5.660', 'AMMO5': '0.003', 'AMMO2': '0.017', 'AMMO4': '0.083', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'weapon4': '0.130', 'weapon5': '0.198', 'AMMO3': '0.259', 'weapon2': '1.770', 'WEAPON3': '1.900', 'HITCOUNT': '4.340', 'weapon3': '13.988', 'DAMAGECOUNT': '24.327', 'FRAGCOUNT': '81.000'} [2023-02-25 19:37:50,892][00219] Avg episode rewards: #0: 108.299, true rewards: #0: 81.000 [2023-02-25 19:37:50,893][00219] Avg episode reward: 108.299, avg true_objective: 81.000 [2023-02-25 19:37:50,901][00219] Num frames 8400... [2023-02-25 19:38:39,485][00219] Replay video saved to train_dir/doom_deathmatch_bots_2222/replay.mp4! [2023-02-25 19:53:43,287][00219] Environment doom_basic already registered, overwriting... [2023-02-25 19:53:43,290][00219] Environment doom_two_colors_easy already registered, overwriting... [2023-02-25 19:53:43,291][00219] Environment doom_two_colors_hard already registered, overwriting... [2023-02-25 19:53:43,292][00219] Environment doom_dm already registered, overwriting... [2023-02-25 19:53:43,295][00219] Environment doom_dwango5 already registered, overwriting... [2023-02-25 19:53:43,299][00219] Environment doom_my_way_home_flat_actions already registered, overwriting... [2023-02-25 19:53:43,300][00219] Environment doom_defend_the_center_flat_actions already registered, overwriting... [2023-02-25 19:53:43,301][00219] Environment doom_my_way_home already registered, overwriting... [2023-02-25 19:53:43,304][00219] Environment doom_deadly_corridor already registered, overwriting... [2023-02-25 19:53:43,305][00219] Environment doom_defend_the_center already registered, overwriting... [2023-02-25 19:53:43,308][00219] Environment doom_defend_the_line already registered, overwriting... 
[2023-02-25 19:53:43,309][00219] Environment doom_health_gathering already registered, overwriting... [2023-02-25 19:53:43,311][00219] Environment doom_health_gathering_supreme already registered, overwriting... [2023-02-25 19:53:43,312][00219] Environment doom_battle already registered, overwriting... [2023-02-25 19:53:43,313][00219] Environment doom_battle2 already registered, overwriting... [2023-02-25 19:53:43,318][00219] Environment doom_duel_bots already registered, overwriting... [2023-02-25 19:53:43,319][00219] Environment doom_deathmatch_bots already registered, overwriting... [2023-02-25 19:53:43,320][00219] Environment doom_duel already registered, overwriting... [2023-02-25 19:53:43,322][00219] Environment doom_deathmatch_full already registered, overwriting... [2023-02-25 19:53:43,325][00219] Environment doom_benchmark already registered, overwriting... [2023-02-25 19:53:43,326][00219] register_encoder_factory: [2023-02-25 19:53:43,355][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 19:53:43,356][00219] Overriding arg 'num_workers' with value 2 passed from command line [2023-02-25 19:53:43,358][00219] Overriding arg 'num_envs_per_worker' with value 32 passed from command line [2023-02-25 19:53:43,360][00219] Overriding arg 'train_for_env_steps' with value 40000000 passed from command line [2023-02-25 19:53:43,368][00219] Experiment dir /content/train_dir/default_experiment already exists! [2023-02-25 19:53:43,371][00219] Resuming existing experiment from /content/train_dir/default_experiment... 
[2023-02-25 19:53:43,373][00219] Weights and Biases integration disabled [2023-02-25 19:53:43,376][00219] Environment var CUDA_VISIBLE_DEVICES is 0 [2023-02-25 19:53:44,741][00219] Starting experiment with the following configuration: help=False algo=APPO env=doom_health_gathering_supreme experiment=default_experiment train_dir=/content/train_dir restart_behavior=resume device=gpu seed=None num_policies=1 async_rl=True serial_mode=False batched_sampling=False num_batches_to_accumulate=2 worker_num_splits=2 policy_workers_per_policy=1 max_policy_lag=1000 num_workers=2 num_envs_per_worker=32 batch_size=1024 num_batches_per_epoch=1 num_epochs=1 rollout=32 recurrence=32 shuffle_minibatches=False gamma=0.99 reward_scale=1.0 reward_clip=1000.0 value_bootstrap=False normalize_returns=True exploration_loss_coeff=0.001 value_loss_coeff=0.5 kl_loss_coeff=0.0 exploration_loss=symmetric_kl gae_lambda=0.95 ppo_clip_ratio=0.1 ppo_clip_value=0.2 with_vtrace=False vtrace_rho=1.0 vtrace_c=1.0 optimizer=adam adam_eps=1e-06 adam_beta1=0.9 adam_beta2=0.999 max_grad_norm=4.0 learning_rate=0.0001 lr_schedule=constant lr_schedule_kl_threshold=0.008 lr_adaptive_min=1e-06 lr_adaptive_max=0.01 obs_subtract_mean=0.0 obs_scale=255.0 normalize_input=True normalize_input_keys=None decorrelate_experience_max_seconds=0 decorrelate_envs_on_one_worker=True actor_worker_gpus=[] set_workers_cpu_affinity=True force_envs_single_thread=False default_niceness=0 log_to_file=True experiment_summaries_interval=10 flush_summaries_interval=30 stats_avg=100 summaries_use_frameskip=True heartbeat_interval=20 heartbeat_reporting_interval=600 train_for_env_steps=40000000 train_for_seconds=10000000000 save_every_sec=120 keep_checkpoints=2 load_checkpoint_kind=latest save_milestones_sec=-1 save_best_every_sec=5 save_best_metric=reward save_best_after=100000 benchmark=False encoder_mlp_layers=[512, 512] encoder_conv_architecture=convnet_simple encoder_conv_mlp_layers=[512] use_rnn=True rnn_size=512 rnn_type=gru 
rnn_num_layers=1 decoder_mlp_layers=[] nonlinearity=elu policy_initialization=orthogonal policy_init_gain=1.0 actor_critic_share_weights=True adaptive_stddev=True continuous_tanh_scale=0.0 initial_stddev=1.0 use_env_info_cache=False env_gpu_actions=False env_gpu_observations=True env_frameskip=4 env_framestack=1 pixel_format=CHW use_record_episode_statistics=False with_wandb=False wandb_user=None wandb_project=sample_factory wandb_group=None wandb_job_type=SF wandb_tags=[] with_pbt=False pbt_mix_policies_in_one_env=True pbt_period_env_steps=5000000 pbt_start_mutation=20000000 pbt_replace_fraction=0.3 pbt_mutation_rate=0.15 pbt_replace_reward_gap=0.1 pbt_replace_reward_gap_absolute=1e-06 pbt_optimize_gamma=False pbt_target_objective=true_objective pbt_perturb_min=1.1 pbt_perturb_max=1.5 num_agents=-1 num_humans=0 num_bots=-1 start_bot_difficulty=None timelimit=None res_w=128 res_h=72 wide_aspect_ratio=False eval_env_frameskip=1 fps=35 command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000000 cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000000} git_hash=unknown git_repo_name=not a git repository [2023-02-25 19:53:44,743][00219] Saving configuration to /content/train_dir/default_experiment/config.json... [2023-02-25 19:53:44,751][00219] Rollout worker 0 uses device cpu [2023-02-25 19:53:44,752][00219] Rollout worker 1 uses device cpu [2023-02-25 19:53:44,959][00219] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-25 19:53:44,961][00219] InferenceWorker_p0-w0: min num requests: 1 [2023-02-25 19:53:44,972][00219] Starting all processes... [2023-02-25 19:53:44,973][00219] Starting process learner_proc0 [2023-02-25 19:53:45,128][00219] Starting all processes... 
[2023-02-25 19:53:45,136][00219] Starting process inference_proc0-0 [2023-02-25 19:53:45,136][00219] Starting process rollout_proc0 [2023-02-25 19:53:45,137][00219] Starting process rollout_proc1 [2023-02-25 19:53:48,271][32858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-25 19:53:48,272][32858] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-02-25 19:53:48,298][32858] Num visible devices: 1 [2023-02-25 19:53:48,339][32858] Starting seed is not provided [2023-02-25 19:53:48,340][32858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-25 19:53:48,341][32858] Initializing actor-critic model on device cuda:0 [2023-02-25 19:53:48,342][32858] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:53:48,343][32858] RunningMeanStd input shape: (1,) [2023-02-25 19:53:48,377][32858] ConvEncoder: input_channels=3 [2023-02-25 19:53:48,716][32858] Conv encoder output size: 512 [2023-02-25 19:53:48,718][32858] Policy head output size: 512 [2023-02-25 19:53:48,763][32858] Created Actor Critic model with architecture: [2023-02-25 19:53:48,763][32858] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( (obs): RunningMeanStdInPlace() ) ) ) (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (encoder): VizdoomEncoder( (basic_encoder): ConvEncoder( (enc): RecursiveScriptModule( original_name=ConvEncoderImpl (conv_head): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Conv2d) (1): RecursiveScriptModule(original_name=ELU) (2): RecursiveScriptModule(original_name=Conv2d) (3): RecursiveScriptModule(original_name=ELU) (4): RecursiveScriptModule(original_name=Conv2d) (5): RecursiveScriptModule(original_name=ELU) ) (mlp_layers): RecursiveScriptModule( original_name=Sequential (0): RecursiveScriptModule(original_name=Linear) (1): 
RecursiveScriptModule(original_name=ELU) ) ) ) ) (core): ModelCoreRNN( (core): GRU(512, 512) ) (decoder): MlpDecoder( (mlp): Identity() ) (critic_linear): Linear(in_features=512, out_features=1, bias=True) (action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) [2023-02-25 19:53:49,078][32871] Worker 0 uses CPU cores [0] [2023-02-25 19:53:49,079][32866] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-25 19:53:49,079][32866] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-02-25 19:53:49,118][32866] Num visible devices: 1 [2023-02-25 19:53:49,209][32872] Worker 1 uses CPU cores [1] [2023-02-25 19:53:51,380][32858] Using optimizer [2023-02-25 19:53:51,381][32858] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... [2023-02-25 19:53:51,413][32858] Loading model from checkpoint [2023-02-25 19:53:51,418][32858] Loaded experiment state at self.train_step=978, self.env_steps=4005888 [2023-02-25 19:53:51,418][32858] Initialized policy 0 weights for model version 978 [2023-02-25 19:53:51,421][32858] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-25 19:53:51,429][32858] LearnerWorker_p0 finished initialization! [2023-02-25 19:53:51,660][32866] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 19:53:51,661][32866] RunningMeanStd input shape: (1,) [2023-02-25 19:53:51,673][32866] ConvEncoder: input_channels=3 [2023-02-25 19:53:51,769][32866] Conv encoder output size: 512 [2023-02-25 19:53:51,769][32866] Policy head output size: 512 [2023-02-25 19:53:53,377][00219] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:53:54,627][00219] Inference worker 0-0 is ready! [2023-02-25 19:53:54,629][00219] All inference workers are ready! 
Signal rollout workers to start! [2023-02-25 19:53:54,678][32871] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 19:53:54,680][32872] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-25 19:53:55,509][32872] Decorrelating experience for 0 frames... [2023-02-25 19:53:55,640][32871] Decorrelating experience for 0 frames... [2023-02-25 19:53:56,074][32872] Decorrelating experience for 32 frames... [2023-02-25 19:53:56,081][32871] Decorrelating experience for 32 frames... [2023-02-25 19:53:56,504][32872] Decorrelating experience for 64 frames... [2023-02-25 19:53:56,637][32871] Decorrelating experience for 64 frames... [2023-02-25 19:53:57,120][32871] Decorrelating experience for 96 frames... [2023-02-25 19:53:57,225][32872] Decorrelating experience for 96 frames... [2023-02-25 19:53:57,698][32871] Decorrelating experience for 128 frames... [2023-02-25 19:53:57,738][32872] Decorrelating experience for 128 frames... [2023-02-25 19:53:58,134][32871] Decorrelating experience for 160 frames... [2023-02-25 19:53:58,168][32872] Decorrelating experience for 160 frames... [2023-02-25 19:53:58,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:53:58,608][32871] Decorrelating experience for 192 frames... [2023-02-25 19:53:58,636][32872] Decorrelating experience for 192 frames... [2023-02-25 19:53:59,081][32871] Decorrelating experience for 224 frames... [2023-02-25 19:53:59,118][32872] Decorrelating experience for 224 frames... [2023-02-25 19:53:59,593][32871] Decorrelating experience for 256 frames... [2023-02-25 19:53:59,669][32872] Decorrelating experience for 256 frames... [2023-02-25 19:54:00,151][32871] Decorrelating experience for 288 frames... [2023-02-25 19:54:00,213][32872] Decorrelating experience for 288 frames... [2023-02-25 19:54:00,715][32871] Decorrelating experience for 320 frames... 
[2023-02-25 19:54:00,783][32872] Decorrelating experience for 320 frames... [2023-02-25 19:54:01,313][32871] Decorrelating experience for 352 frames... [2023-02-25 19:54:01,366][32872] Decorrelating experience for 352 frames... [2023-02-25 19:54:01,934][32871] Decorrelating experience for 384 frames... [2023-02-25 19:54:01,979][32872] Decorrelating experience for 384 frames... [2023-02-25 19:54:02,570][32871] Decorrelating experience for 416 frames... [2023-02-25 19:54:02,620][32872] Decorrelating experience for 416 frames... [2023-02-25 19:54:03,244][32871] Decorrelating experience for 448 frames... [2023-02-25 19:54:03,279][32872] Decorrelating experience for 448 frames... [2023-02-25 19:54:03,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:03,964][32871] Decorrelating experience for 480 frames... [2023-02-25 19:54:03,974][32872] Decorrelating experience for 480 frames... [2023-02-25 19:54:04,814][32872] Decorrelating experience for 512 frames... [2023-02-25 19:54:04,952][00219] Heartbeat connected on Batcher_0 [2023-02-25 19:54:04,959][00219] Heartbeat connected on LearnerWorker_p0 [2023-02-25 19:54:05,001][00219] Heartbeat connected on InferenceWorker_p0-w0 [2023-02-25 19:54:05,217][32871] Decorrelating experience for 512 frames... [2023-02-25 19:54:05,519][32872] Decorrelating experience for 544 frames... [2023-02-25 19:54:05,984][32871] Decorrelating experience for 544 frames... [2023-02-25 19:54:06,267][32872] Decorrelating experience for 576 frames... [2023-02-25 19:54:06,742][32871] Decorrelating experience for 576 frames... [2023-02-25 19:54:07,038][32872] Decorrelating experience for 608 frames... [2023-02-25 19:54:07,542][32871] Decorrelating experience for 608 frames... [2023-02-25 19:54:08,000][32872] Decorrelating experience for 640 frames... 
[2023-02-25 19:54:08,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:08,878][32871] Decorrelating experience for 640 frames... [2023-02-25 19:54:09,199][32872] Decorrelating experience for 672 frames... [2023-02-25 19:54:09,985][32871] Decorrelating experience for 672 frames... [2023-02-25 19:54:10,773][32872] Decorrelating experience for 704 frames... [2023-02-25 19:54:11,425][32871] Decorrelating experience for 704 frames... [2023-02-25 19:54:12,097][32872] Decorrelating experience for 736 frames... [2023-02-25 19:54:12,894][32871] Decorrelating experience for 736 frames... [2023-02-25 19:54:13,378][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:13,466][32872] Decorrelating experience for 768 frames... [2023-02-25 19:54:14,487][32871] Decorrelating experience for 768 frames... [2023-02-25 19:54:14,635][32872] Decorrelating experience for 800 frames... [2023-02-25 19:54:15,935][32871] Decorrelating experience for 800 frames... [2023-02-25 19:54:16,180][32872] Decorrelating experience for 832 frames... [2023-02-25 19:54:17,371][32871] Decorrelating experience for 832 frames... [2023-02-25 19:54:17,898][32872] Decorrelating experience for 864 frames... [2023-02-25 19:54:18,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:18,506][32871] Decorrelating experience for 864 frames... [2023-02-25 19:54:18,927][32872] Decorrelating experience for 896 frames... [2023-02-25 19:54:19,527][32871] Decorrelating experience for 896 frames... [2023-02-25 19:54:19,970][32872] Decorrelating experience for 928 frames... 
[2023-02-25 19:54:20,590][32871] Decorrelating experience for 928 frames... [2023-02-25 19:54:21,086][32872] Decorrelating experience for 960 frames... [2023-02-25 19:54:21,700][32871] Decorrelating experience for 960 frames... [2023-02-25 19:54:22,191][32872] Decorrelating experience for 992 frames... [2023-02-25 19:54:22,801][32871] Decorrelating experience for 992 frames... [2023-02-25 19:54:23,011][00219] Heartbeat connected on RolloutWorker_w1 [2023-02-25 19:54:23,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:23,665][00219] Heartbeat connected on RolloutWorker_w0 [2023-02-25 19:54:25,714][32858] Signal inference workers to stop experience collection... [2023-02-25 19:54:25,729][32866] InferenceWorker_p0-w0: stopping experience collection [2023-02-25 19:54:28,377][00219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 64.5. Samples: 2256. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-25 19:54:28,385][00219] Avg episode reward: [(0, '0.790')] [2023-02-25 19:54:29,719][32858] Signal inference workers to resume experience collection... [2023-02-25 19:54:29,721][32866] InferenceWorker_p0-w0: resuming experience collection [2023-02-25 19:54:33,380][00219] Fps is (10 sec: 1637.9, 60 sec: 409.6, 300 sec: 409.6). Total num frames: 4022272. Throughput: 0: 68.4. Samples: 2736. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 19:54:33,384][00219] Avg episode reward: [(0, '2.210')] [2023-02-25 19:54:38,377][00219] Fps is (10 sec: 3276.8, 60 sec: 728.2, 300 sec: 728.2). Total num frames: 4038656. Throughput: 0: 177.8. Samples: 8000. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-02-25 19:54:38,382][00219] Avg episode reward: [(0, '4.997')] [2023-02-25 19:54:39,931][32866] Updated weights for policy 0, policy_version 988 (0.0012) [2023-02-25 19:54:43,377][00219] Fps is (10 sec: 4507.0, 60 sec: 1228.8, 300 sec: 1228.8). Total num frames: 4067328. Throughput: 0: 326.4. Samples: 14688. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:54:43,379][00219] Avg episode reward: [(0, '9.220')] [2023-02-25 19:54:47,004][32866] Updated weights for policy 0, policy_version 998 (0.0012) [2023-02-25 19:54:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 1563.9, 300 sec: 1563.9). Total num frames: 4091904. Throughput: 0: 425.6. Samples: 19152. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:54:48,379][00219] Avg episode reward: [(0, '13.848')] [2023-02-25 19:54:53,380][00219] Fps is (10 sec: 4913.7, 60 sec: 1843.1, 300 sec: 1843.1). Total num frames: 4116480. Throughput: 0: 613.3. Samples: 27600. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:54:53,382][00219] Avg episode reward: [(0, '16.386')] [2023-02-25 19:54:55,998][32866] Updated weights for policy 0, policy_version 1008 (0.0012) [2023-02-25 19:54:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 2184.5, 300 sec: 2016.5). Total num frames: 4136960. Throughput: 0: 742.4. Samples: 33408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:54:58,378][00219] Avg episode reward: [(0, '19.390')] [2023-02-25 19:55:03,377][00219] Fps is (10 sec: 4097.2, 60 sec: 2525.9, 300 sec: 2165.0). Total num frames: 4157440. Throughput: 0: 807.8. Samples: 36352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:55:03,379][00219] Avg episode reward: [(0, '21.729')] [2023-02-25 19:55:05,004][32866] Updated weights for policy 0, policy_version 1018 (0.0012) [2023-02-25 19:55:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 3003.7, 300 sec: 2403.0). Total num frames: 4186112. Throughput: 0: 974.9. Samples: 43872. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 19:55:08,384][00219] Avg episode reward: [(0, '23.591')] [2023-02-25 19:55:12,159][32866] Updated weights for policy 0, policy_version 1028 (0.0012) [2023-02-25 19:55:13,381][00219] Fps is (10 sec: 5732.1, 60 sec: 3481.5, 300 sec: 2611.1). Total num frames: 4214784. Throughput: 0: 1121.3. Samples: 52720. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:55:13,383][00219] Avg episode reward: [(0, '24.690')] [2023-02-25 19:55:18,381][00219] Fps is (10 sec: 4913.2, 60 sec: 3822.7, 300 sec: 2698.4). Total num frames: 4235264. Throughput: 0: 1180.1. Samples: 55840. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:55:18,383][00219] Avg episode reward: [(0, '24.414')] [2023-02-25 19:55:21,977][32866] Updated weights for policy 0, policy_version 1038 (0.0017) [2023-02-25 19:55:23,379][00219] Fps is (10 sec: 3687.1, 60 sec: 4095.9, 300 sec: 2730.6). Total num frames: 4251648. Throughput: 0: 1187.9. Samples: 61456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:55:23,387][00219] Avg episode reward: [(0, '24.301')] [2023-02-25 19:55:28,379][00219] Fps is (10 sec: 3277.5, 60 sec: 4368.9, 300 sec: 2759.4). Total num frames: 4268032. Throughput: 0: 1141.6. Samples: 66064. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 19:55:28,385][00219] Avg episode reward: [(0, '23.234')] [2023-02-25 19:55:33,377][00219] Fps is (10 sec: 3277.4, 60 sec: 4369.3, 300 sec: 2785.3). Total num frames: 4284416. Throughput: 0: 1104.7. Samples: 68864. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:55:33,379][00219] Avg episode reward: [(0, '22.837')] [2023-02-25 19:55:34,341][32866] Updated weights for policy 0, policy_version 1048 (0.0022) [2023-02-25 19:55:38,381][00219] Fps is (10 sec: 4504.7, 60 sec: 4573.6, 300 sec: 2925.6). Total num frames: 4313088. Throughput: 0: 1064.2. Samples: 75488. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 19:55:38,383][00219] Avg episode reward: [(0, '21.558')] [2023-02-25 19:55:42,763][32866] Updated weights for policy 0, policy_version 1058 (0.0011) [2023-02-25 19:55:43,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4437.3, 300 sec: 2978.9). Total num frames: 4333568. Throughput: 0: 1081.6. Samples: 82080. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:55:43,380][00219] Avg episode reward: [(0, '22.372')] [2023-02-25 19:55:43,390][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001058_4333568.pth... [2023-02-25 19:55:43,619][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000947_3878912.pth [2023-02-25 19:55:48,377][00219] Fps is (10 sec: 4097.4, 60 sec: 4369.0, 300 sec: 3027.5). Total num frames: 4354048. Throughput: 0: 1075.2. Samples: 84736. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:55:48,385][00219] Avg episode reward: [(0, '21.402')] [2023-02-25 19:55:52,555][32866] Updated weights for policy 0, policy_version 1068 (0.0019) [2023-02-25 19:55:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4301.0, 300 sec: 3072.0). Total num frames: 4374528. Throughput: 0: 1055.3. Samples: 91360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:55:53,379][00219] Avg episode reward: [(0, '21.361')] [2023-02-25 19:55:58,377][00219] Fps is (10 sec: 5325.0, 60 sec: 4505.6, 300 sec: 3211.3). Total num frames: 4407296. Throughput: 0: 1057.9. Samples: 100320. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 19:55:58,379][00219] Avg episode reward: [(0, '24.651')] [2023-02-25 19:55:59,434][32866] Updated weights for policy 0, policy_version 1078 (0.0012) [2023-02-25 19:56:03,380][00219] Fps is (10 sec: 5732.7, 60 sec: 4573.6, 300 sec: 3276.7). Total num frames: 4431872. Throughput: 0: 1081.6. Samples: 104512. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:56:03,386][00219] Avg episode reward: [(0, '23.828')] [2023-02-25 19:56:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4369.1, 300 sec: 3276.8). Total num frames: 4448256. Throughput: 0: 1082.4. Samples: 110160. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:56:08,380][00219] Avg episode reward: [(0, '24.414')] [2023-02-25 19:56:09,758][32866] Updated weights for policy 0, policy_version 1088 (0.0020) [2023-02-25 19:56:13,377][00219] Fps is (10 sec: 3687.5, 60 sec: 4232.8, 300 sec: 3306.1). Total num frames: 4468736. Throughput: 0: 1109.4. Samples: 115984. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:56:13,379][00219] Avg episode reward: [(0, '24.982')] [2023-02-25 19:56:18,124][32866] Updated weights for policy 0, policy_version 1098 (0.0013) [2023-02-25 19:56:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4437.6, 300 sec: 3418.0). Total num frames: 4501504. Throughput: 0: 1142.0. Samples: 120256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:56:18,379][00219] Avg episode reward: [(0, '24.816')] [2023-02-25 19:56:23,377][00219] Fps is (10 sec: 5734.2, 60 sec: 4574.0, 300 sec: 3467.9). Total num frames: 4526080. Throughput: 0: 1195.1. Samples: 129264. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 19:56:23,383][00219] Avg episode reward: [(0, '23.980')] [2023-02-25 19:56:26,239][32866] Updated weights for policy 0, policy_version 1108 (0.0012) [2023-02-25 19:56:28,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4642.2, 300 sec: 3488.2). Total num frames: 4546560. Throughput: 0: 1182.9. Samples: 135312. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:56:28,380][00219] Avg episode reward: [(0, '23.565')] [2023-02-25 19:56:33,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4642.1, 300 sec: 3481.6). Total num frames: 4562944. Throughput: 0: 1187.6. Samples: 138176. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:56:33,383][00219] Avg episode reward: [(0, '23.928')] [2023-02-25 19:56:35,961][32866] Updated weights for policy 0, policy_version 1118 (0.0012) [2023-02-25 19:56:38,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4642.4, 300 sec: 3549.9). Total num frames: 4591616. Throughput: 0: 1194.3. Samples: 145104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:56:38,380][00219] Avg episode reward: [(0, '23.117')] [2023-02-25 19:56:42,866][32866] Updated weights for policy 0, policy_version 1128 (0.0013) [2023-02-25 19:56:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4778.7, 300 sec: 3614.1). Total num frames: 4620288. Throughput: 0: 1195.4. Samples: 154112. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 19:56:43,384][00219] Avg episode reward: [(0, '23.287')] [2023-02-25 19:56:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 3627.9). Total num frames: 4640768. Throughput: 0: 1183.7. Samples: 157776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:56:48,383][00219] Avg episode reward: [(0, '23.391')] [2023-02-25 19:56:53,005][32866] Updated weights for policy 0, policy_version 1138 (0.0012) [2023-02-25 19:56:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 3640.9). Total num frames: 4661248. Throughput: 0: 1183.6. Samples: 163424. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:56:53,386][00219] Avg episode reward: [(0, '25.846')] [2023-02-25 19:56:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4573.9, 300 sec: 3653.2). Total num frames: 4681728. Throughput: 0: 1198.2. Samples: 169904. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 19:56:58,383][00219] Avg episode reward: [(0, '27.323')] [2023-02-25 19:56:58,433][32858] Saving new best policy, reward=27.323! 
[2023-02-25 19:57:01,238][32866] Updated weights for policy 0, policy_version 1148 (0.0012) [2023-02-25 19:57:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.6, 300 sec: 3729.5). Total num frames: 4714496. Throughput: 0: 1201.4. Samples: 174320. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 19:57:03,378][00219] Avg episode reward: [(0, '24.891')] [2023-02-25 19:57:08,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 3759.9). Total num frames: 4739072. Throughput: 0: 1192.5. Samples: 182928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:57:08,381][00219] Avg episode reward: [(0, '25.513')] [2023-02-25 19:57:08,651][32866] Updated weights for policy 0, policy_version 1158 (0.0012) [2023-02-25 19:57:13,379][00219] Fps is (10 sec: 4504.6, 60 sec: 4846.8, 300 sec: 3768.3). Total num frames: 4759552. Throughput: 0: 1185.4. Samples: 188656. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 19:57:13,381][00219] Avg episode reward: [(0, '24.398')] [2023-02-25 19:57:18,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4573.8, 300 sec: 3756.3). Total num frames: 4775936. Throughput: 0: 1183.6. Samples: 191440. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:57:18,381][00219] Avg episode reward: [(0, '24.430')] [2023-02-25 19:57:18,982][32866] Updated weights for policy 0, policy_version 1168 (0.0012) [2023-02-25 19:57:23,377][00219] Fps is (10 sec: 4916.3, 60 sec: 4710.4, 300 sec: 3822.9). Total num frames: 4808704. Throughput: 0: 1192.9. Samples: 198784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:57:23,384][00219] Avg episode reward: [(0, '24.819')] [2023-02-25 19:57:25,879][32866] Updated weights for policy 0, policy_version 1178 (0.0012) [2023-02-25 19:57:28,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4847.0, 300 sec: 3867.4). Total num frames: 4837376. Throughput: 0: 1189.0. Samples: 207616. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:57:28,383][00219] Avg episode reward: [(0, '27.110')] [2023-02-25 19:57:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 3872.6). Total num frames: 4857856. Throughput: 0: 1179.4. Samples: 210848. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:57:33,382][00219] Avg episode reward: [(0, '28.079')] [2023-02-25 19:57:33,393][32858] Saving new best policy, reward=28.079! [2023-02-25 19:57:35,654][32866] Updated weights for policy 0, policy_version 1188 (0.0012) [2023-02-25 19:57:38,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 3859.3). Total num frames: 4874240. Throughput: 0: 1178.7. Samples: 216464. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 19:57:38,385][00219] Avg episode reward: [(0, '28.788')] [2023-02-25 19:57:38,389][32858] Saving new best policy, reward=28.788! [2023-02-25 19:57:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 3882.3). Total num frames: 4898816. Throughput: 0: 1184.0. Samples: 223184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:57:43,380][00219] Avg episode reward: [(0, '27.801')] [2023-02-25 19:57:43,392][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001196_4898816.pth... [2023-02-25 19:57:43,519][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth [2023-02-25 19:57:44,703][32866] Updated weights for policy 0, policy_version 1198 (0.0012) [2023-02-25 19:57:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 3921.7). Total num frames: 4927488. Throughput: 0: 1180.8. Samples: 227456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:57:48,382][00219] Avg episode reward: [(0, '26.695')] [2023-02-25 19:57:51,854][32866] Updated weights for policy 0, policy_version 1208 (0.0012) [2023-02-25 19:57:53,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4778.6, 300 sec: 3925.3). Total num frames: 4947968. 
Throughput: 0: 1172.3. Samples: 235680. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:57:53,382][00219] Avg episode reward: [(0, '26.004')] [2023-02-25 19:57:58,380][00219] Fps is (10 sec: 4094.8, 60 sec: 4778.4, 300 sec: 3928.8). Total num frames: 4968448. Throughput: 0: 1171.5. Samples: 241376. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:57:58,382][00219] Avg episode reward: [(0, '25.052')] [2023-02-25 19:58:02,726][32866] Updated weights for policy 0, policy_version 1218 (0.0013) [2023-02-25 19:58:03,377][00219] Fps is (10 sec: 4505.8, 60 sec: 4642.1, 300 sec: 3948.5). Total num frames: 4993024. Throughput: 0: 1175.8. Samples: 244352. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 19:58:03,381][00219] Avg episode reward: [(0, '25.372')] [2023-02-25 19:58:08,377][00219] Fps is (10 sec: 5326.4, 60 sec: 4710.4, 300 sec: 3983.6). Total num frames: 5021696. Throughput: 0: 1194.7. Samples: 252544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:58:08,384][00219] Avg episode reward: [(0, '26.386')] [2023-02-25 19:58:09,538][32866] Updated weights for policy 0, policy_version 1228 (0.0013) [2023-02-25 19:58:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.8, 300 sec: 4001.5). Total num frames: 5046272. Throughput: 0: 1190.4. Samples: 261184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:58:13,382][00219] Avg episode reward: [(0, '26.462')] [2023-02-25 19:58:18,256][32866] Updated weights for policy 0, policy_version 1238 (0.0012) [2023-02-25 19:58:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4018.7). Total num frames: 5070848. Throughput: 0: 1184.0. Samples: 264128. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 19:58:18,381][00219] Avg episode reward: [(0, '26.802')] [2023-02-25 19:58:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4005.0). Total num frames: 5087232. Throughput: 0: 1183.6. Samples: 269728. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:58:23,385][00219] Avg episode reward: [(0, '28.072')] [2023-02-25 19:58:27,611][32866] Updated weights for policy 0, policy_version 1248 (0.0012) [2023-02-25 19:58:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4036.4). Total num frames: 5115904. Throughput: 0: 1206.8. Samples: 277488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:58:28,379][00219] Avg episode reward: [(0, '27.467')] [2023-02-25 19:58:33,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4778.7, 300 sec: 4066.7). Total num frames: 5144576. Throughput: 0: 1212.4. Samples: 282016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:58:33,380][00219] Avg episode reward: [(0, '28.115')] [2023-02-25 19:58:34,474][32866] Updated weights for policy 0, policy_version 1258 (0.0013) [2023-02-25 19:58:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4081.6). Total num frames: 5169152. Throughput: 0: 1196.5. Samples: 289520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:58:38,379][00219] Avg episode reward: [(0, '28.118')] [2023-02-25 19:58:43,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4778.6, 300 sec: 4067.7). Total num frames: 5185536. Throughput: 0: 1196.1. Samples: 295200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 19:58:43,385][00219] Avg episode reward: [(0, '27.605')] [2023-02-25 19:58:44,980][32866] Updated weights for policy 0, policy_version 1268 (0.0012) [2023-02-25 19:58:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4082.1). Total num frames: 5210112. Throughput: 0: 1191.1. Samples: 297952. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:58:48,379][00219] Avg episode reward: [(0, '26.464')] [2023-02-25 19:58:52,149][32866] Updated weights for policy 0, policy_version 1278 (0.0012) [2023-02-25 19:58:53,377][00219] Fps is (10 sec: 5325.4, 60 sec: 4847.0, 300 sec: 4179.3). Total num frames: 5238784. Throughput: 0: 1206.0. 
Samples: 306816. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:58:53,383][00219] Avg episode reward: [(0, '28.172')] [2023-02-25 19:58:58,377][00219] Fps is (10 sec: 5324.6, 60 sec: 4915.4, 300 sec: 4262.6). Total num frames: 5263360. Throughput: 0: 1190.4. Samples: 314752. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 19:58:58,381][00219] Avg episode reward: [(0, '27.770')] [2023-02-25 19:59:00,809][32866] Updated weights for policy 0, policy_version 1288 (0.0012) [2023-02-25 19:59:03,380][00219] Fps is (10 sec: 4504.1, 60 sec: 4846.7, 300 sec: 4332.0). Total num frames: 5283840. Throughput: 0: 1186.8. Samples: 317536. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:59:03,389][00219] Avg episode reward: [(0, '27.368')] [2023-02-25 19:59:08,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4642.1, 300 sec: 4387.6). Total num frames: 5300224. Throughput: 0: 1190.0. Samples: 323280. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:59:08,384][00219] Avg episode reward: [(0, '27.020')] [2023-02-25 19:59:10,617][32866] Updated weights for policy 0, policy_version 1298 (0.0012) [2023-02-25 19:59:13,377][00219] Fps is (10 sec: 4507.1, 60 sec: 4710.4, 300 sec: 4484.8). Total num frames: 5328896. Throughput: 0: 1201.4. Samples: 331552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:59:13,384][00219] Avg episode reward: [(0, '27.226')] [2023-02-25 19:59:17,533][32866] Updated weights for policy 0, policy_version 1308 (0.0012) [2023-02-25 19:59:18,381][00219] Fps is (10 sec: 6141.2, 60 sec: 4846.6, 300 sec: 4595.8). Total num frames: 5361664. Throughput: 0: 1202.7. Samples: 336144. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 19:59:18,384][00219] Avg episode reward: [(0, '26.805')] [2023-02-25 19:59:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4651.4). Total num frames: 5378048. Throughput: 0: 1187.2. Samples: 342944. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:59:23,380][00219] Avg episode reward: [(0, '26.662')] [2023-02-25 19:59:28,096][32866] Updated weights for policy 0, policy_version 1318 (0.0012) [2023-02-25 19:59:28,377][00219] Fps is (10 sec: 4097.9, 60 sec: 4778.7, 300 sec: 4679.2). Total num frames: 5402624. Throughput: 0: 1190.4. Samples: 348768. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:59:28,379][00219] Avg episode reward: [(0, '27.102')] [2023-02-25 19:59:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4706.9). Total num frames: 5427200. Throughput: 0: 1196.4. Samples: 351792. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 19:59:33,379][00219] Avg episode reward: [(0, '26.959')] [2023-02-25 19:59:35,554][32866] Updated weights for policy 0, policy_version 1328 (0.0012) [2023-02-25 19:59:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4706.9). Total num frames: 5455872. Throughput: 0: 1200.4. Samples: 360832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 19:59:38,379][00219] Avg episode reward: [(0, '27.918')] [2023-02-25 19:59:43,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.0, 300 sec: 4693.0). Total num frames: 5476352. Throughput: 0: 1191.8. Samples: 368384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 19:59:43,382][00219] Avg episode reward: [(0, '27.278')] [2023-02-25 19:59:43,397][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001337_5476352.pth... [2023-02-25 19:59:43,557][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001058_4333568.pth [2023-02-25 19:59:44,028][32866] Updated weights for policy 0, policy_version 1338 (0.0020) [2023-02-25 19:59:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4679.2). Total num frames: 5496832. Throughput: 0: 1190.8. Samples: 371120. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 19:59:48,384][00219] Avg episode reward: [(0, '26.705')] [2023-02-25 19:59:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4573.9, 300 sec: 4665.3). Total num frames: 5513216. Throughput: 0: 1185.1. Samples: 376608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 19:59:53,382][00219] Avg episode reward: [(0, '26.478')] [2023-02-25 19:59:54,089][32866] Updated weights for policy 0, policy_version 1348 (0.0012) [2023-02-25 19:59:58,379][00219] Fps is (10 sec: 3685.5, 60 sec: 4505.4, 300 sec: 4665.2). Total num frames: 5533696. Throughput: 0: 1134.2. Samples: 382592. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 19:59:58,383][00219] Avg episode reward: [(0, '26.638')] [2023-02-25 20:00:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.8, 300 sec: 4637.5). Total num frames: 5554176. Throughput: 0: 1094.9. Samples: 385408. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:00:03,384][00219] Avg episode reward: [(0, '26.991')] [2023-02-25 20:00:05,407][32866] Updated weights for policy 0, policy_version 1358 (0.0021) [2023-02-25 20:00:08,378][00219] Fps is (10 sec: 3686.7, 60 sec: 4505.5, 300 sec: 4595.9). Total num frames: 5570560. Throughput: 0: 1056.0. Samples: 390464. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:00:08,380][00219] Avg episode reward: [(0, '25.423')] [2023-02-25 20:00:13,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.1, 300 sec: 4595.9). Total num frames: 5591040. Throughput: 0: 1055.6. Samples: 396272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:00:13,383][00219] Avg episode reward: [(0, '26.302')] [2023-02-25 20:00:15,891][32866] Updated weights for policy 0, policy_version 1368 (0.0012) [2023-02-25 20:00:18,377][00219] Fps is (10 sec: 4506.4, 60 sec: 4232.9, 300 sec: 4623.7). Total num frames: 5615616. Throughput: 0: 1060.6. Samples: 399520. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:00:18,379][00219] Avg episode reward: [(0, '28.056')] [2023-02-25 20:00:23,008][32866] Updated weights for policy 0, policy_version 1378 (0.0012) [2023-02-25 20:00:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4437.3, 300 sec: 4665.3). Total num frames: 5644288. Throughput: 0: 1056.0. Samples: 408352. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:00:23,381][00219] Avg episode reward: [(0, '28.803')] [2023-02-25 20:00:23,390][32858] Saving new best policy, reward=28.803! [2023-02-25 20:00:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4369.1, 300 sec: 4679.2). Total num frames: 5664768. Throughput: 0: 1050.7. Samples: 415664. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:00:28,383][00219] Avg episode reward: [(0, '28.966')] [2023-02-25 20:00:28,391][32858] Saving new best policy, reward=28.966! [2023-02-25 20:00:32,900][32866] Updated weights for policy 0, policy_version 1388 (0.0012) [2023-02-25 20:00:33,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4300.4, 300 sec: 4651.4). Total num frames: 5685248. Throughput: 0: 1053.4. Samples: 418528. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:00:33,385][00219] Avg episode reward: [(0, '30.253')] [2023-02-25 20:00:33,397][32858] Saving new best policy, reward=30.253! [2023-02-25 20:00:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4164.3, 300 sec: 4651.4). Total num frames: 5705728. Throughput: 0: 1053.2. Samples: 424000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:00:38,382][00219] Avg episode reward: [(0, '28.240')] [2023-02-25 20:00:41,170][32866] Updated weights for policy 0, policy_version 1398 (0.0012) [2023-02-25 20:00:43,377][00219] Fps is (10 sec: 4918.2, 60 sec: 4300.8, 300 sec: 4679.2). Total num frames: 5734400. Throughput: 0: 1114.7. Samples: 432752. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:00:43,384][00219] Avg episode reward: [(0, '27.524')] [2023-02-25 20:00:48,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4437.3, 300 sec: 4706.9). Total num frames: 5763072. Throughput: 0: 1150.9. Samples: 437200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:00:48,379][00219] Avg episode reward: [(0, '26.509')] [2023-02-25 20:00:48,385][32866] Updated weights for policy 0, policy_version 1408 (0.0013) [2023-02-25 20:00:53,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4505.5, 300 sec: 4665.2). Total num frames: 5783552. Throughput: 0: 1179.0. Samples: 443520. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:00:53,383][00219] Avg episode reward: [(0, '25.619')] [2023-02-25 20:00:58,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4437.5, 300 sec: 4637.6). Total num frames: 5799936. Throughput: 0: 1173.0. Samples: 449056. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:00:58,381][00219] Avg episode reward: [(0, '25.054')] [2023-02-25 20:00:59,286][32866] Updated weights for policy 0, policy_version 1418 (0.0012) [2023-02-25 20:01:03,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4573.9, 300 sec: 4679.2). Total num frames: 5828608. Throughput: 0: 1188.6. Samples: 453008. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:01:03,379][00219] Avg episode reward: [(0, '25.011')] [2023-02-25 20:01:06,609][32866] Updated weights for policy 0, policy_version 1428 (0.0012) [2023-02-25 20:01:08,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4847.1, 300 sec: 4720.8). Total num frames: 5861376. Throughput: 0: 1189.0. Samples: 461856. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-25 20:01:08,384][00219] Avg episode reward: [(0, '25.323')] [2023-02-25 20:01:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4665.3). Total num frames: 5877760. Throughput: 0: 1179.4. Samples: 468736. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:01:13,379][00219] Avg episode reward: [(0, '25.150')] [2023-02-25 20:01:15,943][32866] Updated weights for policy 0, policy_version 1438 (0.0012) [2023-02-25 20:01:18,379][00219] Fps is (10 sec: 3685.6, 60 sec: 4710.2, 300 sec: 4651.4). Total num frames: 5898240. Throughput: 0: 1180.6. Samples: 471648. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:01:18,384][00219] Avg episode reward: [(0, '26.629')] [2023-02-25 20:01:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 5922816. Throughput: 0: 1195.0. Samples: 477776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:01:23,383][00219] Avg episode reward: [(0, '25.369')] [2023-02-25 20:01:24,842][32866] Updated weights for policy 0, policy_version 1448 (0.0013) [2023-02-25 20:01:28,377][00219] Fps is (10 sec: 5325.9, 60 sec: 4778.7, 300 sec: 4706.9). Total num frames: 5951488. Throughput: 0: 1197.2. Samples: 486624. Policy #0 lag: (min: 1.0, avg: 1.0, max: 2.0) [2023-02-25 20:01:28,382][00219] Avg episode reward: [(0, '25.410')] [2023-02-25 20:01:32,002][32866] Updated weights for policy 0, policy_version 1458 (0.0015) [2023-02-25 20:01:33,383][00219] Fps is (10 sec: 5321.6, 60 sec: 4846.9, 300 sec: 4692.9). Total num frames: 5976064. Throughput: 0: 1197.7. Samples: 491104. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:01:33,385][00219] Avg episode reward: [(0, '24.512')] [2023-02-25 20:01:38,381][00219] Fps is (10 sec: 4503.8, 60 sec: 4846.6, 300 sec: 4665.2). Total num frames: 5996544. Throughput: 0: 1189.3. Samples: 497040. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:01:38,387][00219] Avg episode reward: [(0, '23.715')] [2023-02-25 20:01:42,763][32866] Updated weights for policy 0, policy_version 1468 (0.0012) [2023-02-25 20:01:43,377][00219] Fps is (10 sec: 4098.5, 60 sec: 4710.4, 300 sec: 4665.3). Total num frames: 6017024. Throughput: 0: 1190.8. 
Samples: 502640. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:01:43,378][00219] Avg episode reward: [(0, '24.158')] [2023-02-25 20:01:43,390][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001469_6017024.pth... [2023-02-25 20:01:43,522][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001196_4898816.pth [2023-02-25 20:01:48,377][00219] Fps is (10 sec: 4917.2, 60 sec: 4710.4, 300 sec: 4693.0). Total num frames: 6045696. Throughput: 0: 1195.0. Samples: 506784. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:01:48,384][00219] Avg episode reward: [(0, '24.847')] [2023-02-25 20:01:49,649][32866] Updated weights for policy 0, policy_version 1478 (0.0015) [2023-02-25 20:01:53,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4847.1, 300 sec: 4720.8). Total num frames: 6074368. Throughput: 0: 1200.4. Samples: 515872. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:01:53,384][00219] Avg episode reward: [(0, '25.686')] [2023-02-25 20:01:57,894][32866] Updated weights for policy 0, policy_version 1488 (0.0045) [2023-02-25 20:01:58,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4679.2). Total num frames: 6094848. Throughput: 0: 1189.0. Samples: 522240. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:01:58,383][00219] Avg episode reward: [(0, '27.051')] [2023-02-25 20:02:03,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 6111232. Throughput: 0: 1186.5. Samples: 525040. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:02:03,384][00219] Avg episode reward: [(0, '27.219')] [2023-02-25 20:02:07,550][32866] Updated weights for policy 0, policy_version 1498 (0.0012) [2023-02-25 20:02:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 6139904. Throughput: 0: 1201.4. Samples: 531840. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:02:08,379][00219] Avg episode reward: [(0, '26.976')] [2023-02-25 20:02:13,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 6168576. Throughput: 0: 1205.0. Samples: 540848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:02:13,381][00219] Avg episode reward: [(0, '26.838')] [2023-02-25 20:02:14,491][32866] Updated weights for policy 0, policy_version 1508 (0.0012) [2023-02-25 20:02:18,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4847.1, 300 sec: 4679.2). Total num frames: 6189056. Throughput: 0: 1190.2. Samples: 544656. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:02:18,384][00219] Avg episode reward: [(0, '27.774')] [2023-02-25 20:02:23,379][00219] Fps is (10 sec: 4095.2, 60 sec: 4778.5, 300 sec: 4651.4). Total num frames: 6209536. Throughput: 0: 1184.4. Samples: 550336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:02:23,385][00219] Avg episode reward: [(0, '27.967')] [2023-02-25 20:02:24,632][32866] Updated weights for policy 0, policy_version 1518 (0.0012) [2023-02-25 20:02:28,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4710.4, 300 sec: 4665.3). Total num frames: 6234112. Throughput: 0: 1197.9. Samples: 556544. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:02:28,379][00219] Avg episode reward: [(0, '27.646')] [2023-02-25 20:02:32,672][32866] Updated weights for policy 0, policy_version 1528 (0.0012) [2023-02-25 20:02:33,377][00219] Fps is (10 sec: 5325.9, 60 sec: 4779.1, 300 sec: 4706.9). Total num frames: 6262784. Throughput: 0: 1204.6. Samples: 560992. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:02:33,379][00219] Avg episode reward: [(0, '27.646')] [2023-02-25 20:02:38,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4847.2, 300 sec: 4706.9). Total num frames: 6287360. Throughput: 0: 1204.3. Samples: 570064. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:02:38,381][00219] Avg episode reward: [(0, '26.490')] [2023-02-25 20:02:40,617][32866] Updated weights for policy 0, policy_version 1538 (0.0012) [2023-02-25 20:02:43,379][00219] Fps is (10 sec: 4504.4, 60 sec: 4846.7, 300 sec: 4679.1). Total num frames: 6307840. Throughput: 0: 1188.9. Samples: 575744. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:02:43,382][00219] Avg episode reward: [(0, '25.026')] [2023-02-25 20:02:48,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 6324224. Throughput: 0: 1191.1. Samples: 578640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:02:48,381][00219] Avg episode reward: [(0, '24.216')] [2023-02-25 20:02:50,400][32866] Updated weights for policy 0, policy_version 1548 (0.0012) [2023-02-25 20:02:53,377][00219] Fps is (10 sec: 4916.5, 60 sec: 4710.4, 300 sec: 4707.0). Total num frames: 6356992. Throughput: 0: 1198.2. Samples: 585760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:02:53,379][00219] Avg episode reward: [(0, '26.346')] [2023-02-25 20:02:57,640][32866] Updated weights for policy 0, policy_version 1558 (0.0012) [2023-02-25 20:02:58,377][00219] Fps is (10 sec: 6144.4, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 6385664. Throughput: 0: 1197.2. Samples: 594720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:02:58,379][00219] Avg episode reward: [(0, '26.055')] [2023-02-25 20:03:03,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4915.2, 300 sec: 4693.0). Total num frames: 6406144. Throughput: 0: 1190.0. Samples: 598208. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:03:03,379][00219] Avg episode reward: [(0, '27.229')] [2023-02-25 20:03:07,495][32866] Updated weights for policy 0, policy_version 1568 (0.0011) [2023-02-25 20:03:08,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4665.3). Total num frames: 6422528. Throughput: 0: 1189.7. 
Samples: 603872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:03:08,382][00219] Avg episode reward: [(0, '27.924')] [2023-02-25 20:03:13,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 6447104. Throughput: 0: 1201.8. Samples: 610624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:03:13,379][00219] Avg episode reward: [(0, '26.824')] [2023-02-25 20:03:15,400][32866] Updated weights for policy 0, policy_version 1578 (0.0015) [2023-02-25 20:03:18,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 6479872. Throughput: 0: 1201.4. Samples: 615056. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:03:18,379][00219] Avg episode reward: [(0, '27.207')] [2023-02-25 20:03:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.1, 300 sec: 4693.0). Total num frames: 6500352. Throughput: 0: 1184.7. Samples: 623376. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:03:23,379][00219] Avg episode reward: [(0, '27.762')] [2023-02-25 20:03:23,408][32866] Updated weights for policy 0, policy_version 1588 (0.0012) [2023-02-25 20:03:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4665.3). Total num frames: 6520832. Throughput: 0: 1186.9. Samples: 629152. Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 20:03:28,380][00219] Avg episode reward: [(0, '27.555')] [2023-02-25 20:03:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4651.4). Total num frames: 6541312. Throughput: 0: 1187.2. Samples: 632064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:03:33,380][00219] Avg episode reward: [(0, '28.635')] [2023-02-25 20:03:33,771][32866] Updated weights for policy 0, policy_version 1598 (0.0012) [2023-02-25 20:03:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4693.1). Total num frames: 6569984. Throughput: 0: 1206.0. Samples: 640032. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:03:38,384][00219] Avg episode reward: [(0, '28.232')] [2023-02-25 20:03:40,352][32866] Updated weights for policy 0, policy_version 1608 (0.0012) [2023-02-25 20:03:43,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4915.4, 300 sec: 4720.8). Total num frames: 6602752. Throughput: 0: 1203.9. Samples: 648896. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:03:43,381][00219] Avg episode reward: [(0, '27.653')] [2023-02-25 20:03:43,395][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001612_6602752.pth... [2023-02-25 20:03:43,549][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001337_5476352.pth [2023-02-25 20:03:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4679.2). Total num frames: 6619136. Throughput: 0: 1186.5. Samples: 651600. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:03:48,379][00219] Avg episode reward: [(0, '26.617')] [2023-02-25 20:03:49,888][32866] Updated weights for policy 0, policy_version 1618 (0.0012) [2023-02-25 20:03:53,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4642.1, 300 sec: 4651.4). Total num frames: 6635520. Throughput: 0: 1188.6. Samples: 657360. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:03:53,386][00219] Avg episode reward: [(0, '27.418')] [2023-02-25 20:03:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 6664192. Throughput: 0: 1204.6. Samples: 664832. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:03:58,379][00219] Avg episode reward: [(0, '26.689')] [2023-02-25 20:03:58,635][32866] Updated weights for policy 0, policy_version 1628 (0.0015) [2023-02-25 20:04:03,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4778.7, 300 sec: 4720.8). Total num frames: 6692864. Throughput: 0: 1206.8. Samples: 669360. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:04:03,382][00219] Avg episode reward: [(0, '27.127')] [2023-02-25 20:04:06,268][32866] Updated weights for policy 0, policy_version 1638 (0.0018) [2023-02-25 20:04:08,379][00219] Fps is (10 sec: 4913.9, 60 sec: 4846.7, 300 sec: 4693.0). Total num frames: 6713344. Throughput: 0: 1193.2. Samples: 677072. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:04:08,382][00219] Avg episode reward: [(0, '27.048')] [2023-02-25 20:04:13,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4846.9, 300 sec: 4665.3). Total num frames: 6737920. Throughput: 0: 1195.7. Samples: 682960. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:04:13,379][00219] Avg episode reward: [(0, '27.201')] [2023-02-25 20:04:16,832][32866] Updated weights for policy 0, policy_version 1648 (0.0027) [2023-02-25 20:04:18,377][00219] Fps is (10 sec: 4506.8, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 6758400. Throughput: 0: 1191.5. Samples: 685680. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:04:18,379][00219] Avg episode reward: [(0, '25.935')] [2023-02-25 20:04:23,332][32866] Updated weights for policy 0, policy_version 1658 (0.0012) [2023-02-25 20:04:23,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4846.9, 300 sec: 4706.9). Total num frames: 6791168. Throughput: 0: 1204.3. Samples: 694224. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:04:23,380][00219] Avg episode reward: [(0, '25.932')] [2023-02-25 20:04:28,382][00219] Fps is (10 sec: 4912.7, 60 sec: 4778.3, 300 sec: 4679.1). Total num frames: 6807552. Throughput: 0: 1150.1. Samples: 700656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:04:28,384][00219] Avg episode reward: [(0, '26.725')] [2023-02-25 20:04:33,379][00219] Fps is (10 sec: 3276.1, 60 sec: 4710.2, 300 sec: 4637.5). Total num frames: 6823936. Throughput: 0: 1140.2. Samples: 702912. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:04:33,382][00219] Avg episode reward: [(0, '27.492')] [2023-02-25 20:04:36,075][32866] Updated weights for policy 0, policy_version 1668 (0.0014) [2023-02-25 20:04:38,377][00219] Fps is (10 sec: 2868.7, 60 sec: 4437.3, 300 sec: 4609.7). Total num frames: 6836224. Throughput: 0: 1116.8. Samples: 707616. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:04:38,379][00219] Avg episode reward: [(0, '27.520')] [2023-02-25 20:04:43,377][00219] Fps is (10 sec: 2867.9, 60 sec: 4164.3, 300 sec: 4595.9). Total num frames: 6852608. Throughput: 0: 1062.0. Samples: 712624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:04:43,386][00219] Avg episode reward: [(0, '29.076')] [2023-02-25 20:04:46,294][32866] Updated weights for policy 0, policy_version 1678 (0.0012) [2023-02-25 20:04:48,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4437.3, 300 sec: 4651.4). Total num frames: 6885376. Throughput: 0: 1055.6. Samples: 716864. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:04:48,384][00219] Avg episode reward: [(0, '31.090')] [2023-02-25 20:04:48,390][32858] Saving new best policy, reward=31.090! [2023-02-25 20:04:53,164][32866] Updated weights for policy 0, policy_version 1688 (0.0012) [2023-02-25 20:04:53,377][00219] Fps is (10 sec: 6143.8, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 6914048. Throughput: 0: 1080.9. Samples: 725712. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:04:53,382][00219] Avg episode reward: [(0, '30.698')] [2023-02-25 20:04:58,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 6934528. Throughput: 0: 1095.5. Samples: 732256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:04:58,385][00219] Avg episode reward: [(0, '29.057')] [2023-02-25 20:05:03,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4300.8, 300 sec: 4679.2). Total num frames: 6950912. Throughput: 0: 1097.9. Samples: 735088. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:05:03,382][00219] Avg episode reward: [(0, '29.643')] [2023-02-25 20:05:03,794][32866] Updated weights for policy 0, policy_version 1698 (0.0011) [2023-02-25 20:05:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.5, 300 sec: 4706.9). Total num frames: 6979584. Throughput: 0: 1060.3. Samples: 741936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:05:08,383][00219] Avg episode reward: [(0, '27.971')] [2023-02-25 20:05:10,934][32866] Updated weights for policy 0, policy_version 1708 (0.0012) [2023-02-25 20:05:13,377][00219] Fps is (10 sec: 5734.8, 60 sec: 4505.6, 300 sec: 4720.8). Total num frames: 7008256. Throughput: 0: 1119.1. Samples: 751008. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:05:13,385][00219] Avg episode reward: [(0, '26.537')] [2023-02-25 20:05:18,382][00219] Fps is (10 sec: 5322.1, 60 sec: 4573.5, 300 sec: 4706.8). Total num frames: 7032832. Throughput: 0: 1156.2. Samples: 754944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:05:18,384][00219] Avg episode reward: [(0, '27.342')] [2023-02-25 20:05:19,034][32866] Updated weights for policy 0, policy_version 1718 (0.0011) [2023-02-25 20:05:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4369.1, 300 sec: 4706.9). Total num frames: 7053312. Throughput: 0: 1178.3. Samples: 760640. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:05:23,381][00219] Avg episode reward: [(0, '26.381')] [2023-02-25 20:05:28,377][00219] Fps is (10 sec: 4098.1, 60 sec: 4437.7, 300 sec: 4707.0). Total num frames: 7073792. Throughput: 0: 1206.0. Samples: 766896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:05:28,385][00219] Avg episode reward: [(0, '27.100')] [2023-02-25 20:05:28,750][32866] Updated weights for policy 0, policy_version 1728 (0.0012) [2023-02-25 20:05:33,378][00219] Fps is (10 sec: 4914.5, 60 sec: 4642.2, 300 sec: 4734.7). Total num frames: 7102464. Throughput: 0: 1213.8. 
Samples: 771488. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:05:33,386][00219] Avg episode reward: [(0, '26.588')] [2023-02-25 20:05:35,472][32866] Updated weights for policy 0, policy_version 1738 (0.0012) [2023-02-25 20:05:38,379][00219] Fps is (10 sec: 5733.2, 60 sec: 4915.0, 300 sec: 4734.7). Total num frames: 7131136. Throughput: 0: 1216.3. Samples: 780448. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:05:38,385][00219] Avg episode reward: [(0, '26.128')] [2023-02-25 20:05:43,380][00219] Fps is (10 sec: 4504.8, 60 sec: 4914.9, 300 sec: 4693.0). Total num frames: 7147520. Throughput: 0: 1198.5. Samples: 786192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:05:43,382][00219] Avg episode reward: [(0, '25.333')] [2023-02-25 20:05:43,397][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001745_7147520.pth... [2023-02-25 20:05:43,579][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001469_6017024.pth [2023-02-25 20:05:46,252][32866] Updated weights for policy 0, policy_version 1748 (0.0011) [2023-02-25 20:05:48,377][00219] Fps is (10 sec: 3277.3, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 7163904. Throughput: 0: 1198.2. Samples: 789008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:05:48,382][00219] Avg episode reward: [(0, '25.084')] [2023-02-25 20:05:53,377][00219] Fps is (10 sec: 4916.7, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 7196672. Throughput: 0: 1206.0. Samples: 796208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:05:53,381][00219] Avg episode reward: [(0, '24.308')] [2023-02-25 20:05:53,760][32866] Updated weights for policy 0, policy_version 1758 (0.0012) [2023-02-25 20:05:58,377][00219] Fps is (10 sec: 6144.2, 60 sec: 4846.9, 300 sec: 4734.7). Total num frames: 7225344. Throughput: 0: 1201.1. Samples: 805056. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:05:58,379][00219] Avg episode reward: [(0, '24.436')] [2023-02-25 20:06:01,492][32866] Updated weights for policy 0, policy_version 1768 (0.0012) [2023-02-25 20:06:03,384][00219] Fps is (10 sec: 4911.8, 60 sec: 4914.7, 300 sec: 4692.9). Total num frames: 7245824. Throughput: 0: 1188.6. Samples: 808432. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:06:03,386][00219] Avg episode reward: [(0, '25.439')] [2023-02-25 20:06:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4706.9). Total num frames: 7266304. Throughput: 0: 1190.8. Samples: 814224. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:06:08,382][00219] Avg episode reward: [(0, '26.231')] [2023-02-25 20:06:11,654][32866] Updated weights for policy 0, policy_version 1778 (0.0011) [2023-02-25 20:06:13,377][00219] Fps is (10 sec: 4508.7, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 7290880. Throughput: 0: 1203.9. Samples: 821072. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:06:13,380][00219] Avg episode reward: [(0, '25.150')] [2023-02-25 20:06:18,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4779.1, 300 sec: 4734.7). Total num frames: 7319552. Throughput: 0: 1200.4. Samples: 825504. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:06:18,379][00219] Avg episode reward: [(0, '25.180')] [2023-02-25 20:06:18,485][32866] Updated weights for policy 0, policy_version 1788 (0.0014) [2023-02-25 20:06:23,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4846.8, 300 sec: 4720.8). Total num frames: 7344128. Throughput: 0: 1187.9. Samples: 833904. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:06:23,382][00219] Avg episode reward: [(0, '25.273')] [2023-02-25 20:06:28,012][32866] Updated weights for policy 0, policy_version 1798 (0.0012) [2023-02-25 20:06:28,377][00219] Fps is (10 sec: 4505.4, 60 sec: 4846.9, 300 sec: 4707.0). Total num frames: 7364608. Throughput: 0: 1186.6. 
Samples: 839584. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:06:28,382][00219] Avg episode reward: [(0, '25.204')] [2023-02-25 20:06:33,377][00219] Fps is (10 sec: 3686.9, 60 sec: 4642.2, 300 sec: 4693.1). Total num frames: 7380992. Throughput: 0: 1186.1. Samples: 842384. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:06:33,380][00219] Avg episode reward: [(0, '24.729')] [2023-02-25 20:06:36,603][32866] Updated weights for policy 0, policy_version 1808 (0.0012) [2023-02-25 20:06:38,377][00219] Fps is (10 sec: 4915.4, 60 sec: 4710.6, 300 sec: 4734.7). Total num frames: 7413760. Throughput: 0: 1199.6. Samples: 850192. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:06:38,378][00219] Avg episode reward: [(0, '26.071')] [2023-02-25 20:06:43,378][00219] Fps is (10 sec: 6143.4, 60 sec: 4915.4, 300 sec: 4734.7). Total num frames: 7442432. Throughput: 0: 1201.0. Samples: 859104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:06:43,385][00219] Avg episode reward: [(0, '27.276')] [2023-02-25 20:06:44,488][32866] Updated weights for policy 0, policy_version 1818 (0.0012) [2023-02-25 20:06:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4693.0). Total num frames: 7458816. Throughput: 0: 1187.7. Samples: 861872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:06:48,383][00219] Avg episode reward: [(0, '27.868')] [2023-02-25 20:06:53,377][00219] Fps is (10 sec: 3686.8, 60 sec: 4710.4, 300 sec: 4693.0). Total num frames: 7479296. Throughput: 0: 1184.4. Samples: 867520. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:06:53,384][00219] Avg episode reward: [(0, '27.897')] [2023-02-25 20:06:54,922][32866] Updated weights for policy 0, policy_version 1828 (0.0012) [2023-02-25 20:06:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4720.8). Total num frames: 7503872. Throughput: 0: 1188.6. Samples: 874560. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:06:58,379][00219] Avg episode reward: [(0, '28.133')] [2023-02-25 20:07:02,344][32866] Updated weights for policy 0, policy_version 1838 (0.0012) [2023-02-25 20:07:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4779.2, 300 sec: 4720.8). Total num frames: 7532544. Throughput: 0: 1185.8. Samples: 878864. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:07:03,379][00219] Avg episode reward: [(0, '27.484')] [2023-02-25 20:07:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4693.0). Total num frames: 7553024. Throughput: 0: 1167.7. Samples: 886448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:07:08,380][00219] Avg episode reward: [(0, '27.527')] [2023-02-25 20:07:12,466][32866] Updated weights for policy 0, policy_version 1848 (0.0012) [2023-02-25 20:07:13,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4693.0). Total num frames: 7573504. Throughput: 0: 1167.3. Samples: 892112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:07:13,384][00219] Avg episode reward: [(0, '27.611')] [2023-02-25 20:07:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4707.0). Total num frames: 7598080. Throughput: 0: 1165.5. Samples: 894832. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:07:18,379][00219] Avg episode reward: [(0, '27.471')] [2023-02-25 20:07:20,586][32866] Updated weights for policy 0, policy_version 1858 (0.0013) [2023-02-25 20:07:23,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4710.5, 300 sec: 4720.8). Total num frames: 7626752. Throughput: 0: 1184.0. Samples: 903472. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:07:23,379][00219] Avg episode reward: [(0, '28.432')] [2023-02-25 20:07:28,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4710.4, 300 sec: 4693.0). Total num frames: 7647232. Throughput: 0: 1162.7. Samples: 911424. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:07:28,380][00219] Avg episode reward: [(0, '27.202')] [2023-02-25 20:07:28,553][32866] Updated weights for policy 0, policy_version 1868 (0.0012) [2023-02-25 20:07:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4846.9, 300 sec: 4693.0). Total num frames: 7671808. Throughput: 0: 1164.4. Samples: 914272. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:07:33,387][00219] Avg episode reward: [(0, '27.818')] [2023-02-25 20:07:38,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4573.9, 300 sec: 4679.2). Total num frames: 7688192. Throughput: 0: 1163.0. Samples: 919856. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:07:38,378][00219] Avg episode reward: [(0, '27.588')] [2023-02-25 20:07:38,563][32866] Updated weights for policy 0, policy_version 1878 (0.0012) [2023-02-25 20:07:43,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4573.9, 300 sec: 4720.8). Total num frames: 7716864. Throughput: 0: 1189.0. Samples: 928064. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:07:43,379][00219] Avg episode reward: [(0, '26.628')] [2023-02-25 20:07:43,395][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001884_7716864.pth... [2023-02-25 20:07:43,531][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001612_6602752.pth [2023-02-25 20:07:45,505][32866] Updated weights for policy 0, policy_version 1888 (0.0012) [2023-02-25 20:07:48,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 7749632. Throughput: 0: 1187.6. Samples: 932304. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:07:48,382][00219] Avg episode reward: [(0, '26.975')] [2023-02-25 20:07:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4679.2). Total num frames: 7766016. Throughput: 0: 1178.0. Samples: 939456. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 20:07:53,384][00219] Avg episode reward: [(0, '26.528')] [2023-02-25 20:07:54,957][32866] Updated weights for policy 0, policy_version 1898 (0.0013) [2023-02-25 20:07:58,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 7782400. Throughput: 0: 1175.1. Samples: 944992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:07:58,384][00219] Avg episode reward: [(0, '24.544')] [2023-02-25 20:08:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4706.9). Total num frames: 7811072. Throughput: 0: 1190.0. Samples: 948384. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:08:03,385][00219] Avg episode reward: [(0, '26.974')] [2023-02-25 20:08:03,545][32866] Updated weights for policy 0, policy_version 1908 (0.0011) [2023-02-25 20:08:08,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4778.6, 300 sec: 4720.8). Total num frames: 7839744. Throughput: 0: 1199.6. Samples: 957456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:08:08,379][00219] Avg episode reward: [(0, '27.731')] [2023-02-25 20:08:10,606][32866] Updated weights for policy 0, policy_version 1918 (0.0012) [2023-02-25 20:08:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4693.0). Total num frames: 7864320. Throughput: 0: 1189.7. Samples: 964960. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:08:13,382][00219] Avg episode reward: [(0, '26.159')] [2023-02-25 20:08:18,382][00219] Fps is (10 sec: 4094.0, 60 sec: 4710.0, 300 sec: 4679.1). Total num frames: 7880704. Throughput: 0: 1191.7. Samples: 967904. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:08:18,384][00219] Avg episode reward: [(0, '25.796')] [2023-02-25 20:08:21,373][32866] Updated weights for policy 0, policy_version 1928 (0.0017) [2023-02-25 20:08:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4693.0). Total num frames: 7905280. Throughput: 0: 1186.1. 
Samples: 973232. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:08:23,379][00219] Avg episode reward: [(0, '26.762')] [2023-02-25 20:08:28,377][00219] Fps is (10 sec: 5327.5, 60 sec: 4778.7, 300 sec: 4720.8). Total num frames: 7933952. Throughput: 0: 1195.4. Samples: 981856. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:08:28,384][00219] Avg episode reward: [(0, '25.506')] [2023-02-25 20:08:29,262][32866] Updated weights for policy 0, policy_version 1938 (0.0037) [2023-02-25 20:08:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4706.9). Total num frames: 7958528. Throughput: 0: 1197.2. Samples: 986176. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:08:33,383][00219] Avg episode reward: [(0, '25.118')] [2023-02-25 20:08:38,199][32866] Updated weights for policy 0, policy_version 1948 (0.0022) [2023-02-25 20:08:38,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4846.9, 300 sec: 4665.3). Total num frames: 7979008. Throughput: 0: 1175.5. Samples: 992352. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:08:38,379][00219] Avg episode reward: [(0, '26.027')] [2023-02-25 20:08:43,379][00219] Fps is (10 sec: 3685.6, 60 sec: 4642.0, 300 sec: 4665.2). Total num frames: 7995392. Throughput: 0: 1173.3. Samples: 997792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:08:43,387][00219] Avg episode reward: [(0, '26.550')] [2023-02-25 20:08:47,352][32866] Updated weights for policy 0, policy_version 1958 (0.0011) [2023-02-25 20:08:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4573.9, 300 sec: 4706.9). Total num frames: 8024064. Throughput: 0: 1185.8. Samples: 1001744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:08:48,379][00219] Avg episode reward: [(0, '25.151')] [2023-02-25 20:08:53,377][00219] Fps is (10 sec: 4916.3, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 8044544. Throughput: 0: 1150.6. Samples: 1009232. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:08:53,380][00219] Avg episode reward: [(0, '25.677')] [2023-02-25 20:08:58,194][32866] Updated weights for policy 0, policy_version 1968 (0.0022) [2023-02-25 20:08:58,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4642.1, 300 sec: 4637.5). Total num frames: 8060928. Throughput: 0: 1091.6. Samples: 1014080. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:08:58,381][00219] Avg episode reward: [(0, '25.913')] [2023-02-25 20:09:03,378][00219] Fps is (10 sec: 2866.7, 60 sec: 4368.9, 300 sec: 4609.8). Total num frames: 8073216. Throughput: 0: 1077.4. Samples: 1016384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:09:03,384][00219] Avg episode reward: [(0, '25.562')] [2023-02-25 20:09:08,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4232.5, 300 sec: 4595.9). Total num frames: 8093696. Throughput: 0: 1071.6. Samples: 1021456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:09:08,383][00219] Avg episode reward: [(0, '25.764')] [2023-02-25 20:09:09,909][32866] Updated weights for policy 0, policy_version 1978 (0.0018) [2023-02-25 20:09:13,377][00219] Fps is (10 sec: 4916.0, 60 sec: 4300.8, 300 sec: 4623.6). Total num frames: 8122368. Throughput: 0: 1051.7. Samples: 1029184. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:09:13,379][00219] Avg episode reward: [(0, '26.176')] [2023-02-25 20:09:16,401][32866] Updated weights for policy 0, policy_version 1988 (0.0012) [2023-02-25 20:09:18,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4574.2, 300 sec: 4623.6). Total num frames: 8155136. Throughput: 0: 1057.1. Samples: 1033744. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:09:18,379][00219] Avg episode reward: [(0, '26.826')] [2023-02-25 20:09:23,381][00219] Fps is (10 sec: 4913.2, 60 sec: 4437.0, 300 sec: 4623.6). Total num frames: 8171520. Throughput: 0: 1092.5. Samples: 1041520. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:09:23,383][00219] Avg episode reward: [(0, '26.071')] [2023-02-25 20:09:25,789][32866] Updated weights for policy 0, policy_version 1998 (0.0011) [2023-02-25 20:09:28,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4368.6, 300 sec: 4651.3). Total num frames: 8196096. Throughput: 0: 1102.1. Samples: 1047392. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:09:28,390][00219] Avg episode reward: [(0, '27.336')] [2023-02-25 20:09:33,377][00219] Fps is (10 sec: 4507.4, 60 sec: 4300.8, 300 sec: 4679.2). Total num frames: 8216576. Throughput: 0: 1073.4. Samples: 1050048. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:09:33,384][00219] Avg episode reward: [(0, '27.951')] [2023-02-25 20:09:34,374][32866] Updated weights for policy 0, policy_version 2008 (0.0020) [2023-02-25 20:09:38,377][00219] Fps is (10 sec: 5328.1, 60 sec: 4505.6, 300 sec: 4734.7). Total num frames: 8249344. Throughput: 0: 1109.0. Samples: 1059136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:09:38,384][00219] Avg episode reward: [(0, '28.741')] [2023-02-25 20:09:40,693][32866] Updated weights for policy 0, policy_version 2018 (0.0012) [2023-02-25 20:09:43,383][00219] Fps is (10 sec: 5730.9, 60 sec: 4641.8, 300 sec: 4706.8). Total num frames: 8273920. Throughput: 0: 1183.5. Samples: 1067344. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:09:43,390][00219] Avg episode reward: [(0, '30.646')] [2023-02-25 20:09:43,402][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002020_8273920.pth... [2023-02-25 20:09:43,557][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001745_7147520.pth [2023-02-25 20:09:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4437.3, 300 sec: 4665.3). Total num frames: 8290304. Throughput: 0: 1194.4. Samples: 1070128. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:09:48,382][00219] Avg episode reward: [(0, '31.914')] [2023-02-25 20:09:48,444][32858] Saving new best policy, reward=31.914! [2023-02-25 20:09:51,258][32866] Updated weights for policy 0, policy_version 2028 (0.0011) [2023-02-25 20:09:53,377][00219] Fps is (10 sec: 4098.5, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 8314880. Throughput: 0: 1208.2. Samples: 1075824. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:09:53,381][00219] Avg episode reward: [(0, '31.362')] [2023-02-25 20:09:58,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4642.1, 300 sec: 4706.9). Total num frames: 8339456. Throughput: 0: 1199.3. Samples: 1083152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:09:58,382][00219] Avg episode reward: [(0, '32.303')] [2023-02-25 20:09:58,388][32858] Saving new best policy, reward=32.303! [2023-02-25 20:09:59,816][32866] Updated weights for policy 0, policy_version 2038 (0.0015) [2023-02-25 20:10:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.3, 300 sec: 4706.9). Total num frames: 8368128. Throughput: 0: 1194.7. Samples: 1087504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:10:03,386][00219] Avg episode reward: [(0, '31.743')] [2023-02-25 20:10:08,215][32866] Updated weights for policy 0, policy_version 2048 (0.0011) [2023-02-25 20:10:08,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4915.2, 300 sec: 4679.2). Total num frames: 8388608. Throughput: 0: 1184.5. Samples: 1094816. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:10:08,385][00219] Avg episode reward: [(0, '30.239')] [2023-02-25 20:10:13,377][00219] Fps is (10 sec: 3686.2, 60 sec: 4710.4, 300 sec: 4651.5). Total num frames: 8404992. Throughput: 0: 1183.1. Samples: 1100624. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:10:13,383][00219] Avg episode reward: [(0, '30.286')] [2023-02-25 20:10:17,101][32866] Updated weights for policy 0, policy_version 2058 (0.0014) [2023-02-25 20:10:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 8433664. Throughput: 0: 1197.9. Samples: 1103952. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:10:18,383][00219] Avg episode reward: [(0, '31.482')] [2023-02-25 20:10:23,378][00219] Fps is (10 sec: 5733.9, 60 sec: 4847.1, 300 sec: 4706.9). Total num frames: 8462336. Throughput: 0: 1199.2. Samples: 1113104. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:10:23,380][00219] Avg episode reward: [(0, '30.273')] [2023-02-25 20:10:23,940][32866] Updated weights for policy 0, policy_version 2068 (0.0012) [2023-02-25 20:10:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.4, 300 sec: 4693.1). Total num frames: 8486912. Throughput: 0: 1188.8. Samples: 1120832. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:10:28,379][00219] Avg episode reward: [(0, '29.964')] [2023-02-25 20:10:33,377][00219] Fps is (10 sec: 4506.2, 60 sec: 4846.9, 300 sec: 4665.3). Total num frames: 8507392. Throughput: 0: 1187.9. Samples: 1123584. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:10:33,380][00219] Avg episode reward: [(0, '28.886')] [2023-02-25 20:10:34,509][32866] Updated weights for policy 0, policy_version 2078 (0.0033) [2023-02-25 20:10:38,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4710.4, 300 sec: 4693.1). Total num frames: 8531968. Throughput: 0: 1192.5. Samples: 1129488. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:10:38,379][00219] Avg episode reward: [(0, '27.236')] [2023-02-25 20:10:41,992][32866] Updated weights for policy 0, policy_version 2088 (0.0012) [2023-02-25 20:10:43,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4779.2, 300 sec: 4734.7). Total num frames: 8560640. Throughput: 0: 1232.4. 
Samples: 1138608. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:10:43,379][00219] Avg episode reward: [(0, '26.226')] [2023-02-25 20:10:48,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4915.2, 300 sec: 4706.9). Total num frames: 8585216. Throughput: 0: 1238.8. Samples: 1143248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:10:48,379][00219] Avg episode reward: [(0, '26.829')] [2023-02-25 20:10:49,073][32866] Updated weights for policy 0, policy_version 2098 (0.0011) [2023-02-25 20:10:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4846.9, 300 sec: 4679.2). Total num frames: 8605696. Throughput: 0: 1214.9. Samples: 1149488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:10:53,381][00219] Avg episode reward: [(0, '27.069')] [2023-02-25 20:10:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4679.3). Total num frames: 8626176. Throughput: 0: 1215.0. Samples: 1155296. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:10:58,382][00219] Avg episode reward: [(0, '26.394')] [2023-02-25 20:10:59,441][32866] Updated weights for policy 0, policy_version 2108 (0.0012) [2023-02-25 20:11:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4706.9). Total num frames: 8654848. Throughput: 0: 1235.6. Samples: 1159552. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:11:03,379][00219] Avg episode reward: [(0, '26.689')] [2023-02-25 20:11:06,311][32866] Updated weights for policy 0, policy_version 2118 (0.0011) [2023-02-25 20:11:08,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4720.8). Total num frames: 8683520. Throughput: 0: 1232.0. Samples: 1168544. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:11:08,379][00219] Avg episode reward: [(0, '25.221')] [2023-02-25 20:11:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4693.0). Total num frames: 8704000. Throughput: 0: 1211.7. Samples: 1175360. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:11:13,380][00219] Avg episode reward: [(0, '25.478')] [2023-02-25 20:11:15,928][32866] Updated weights for policy 0, policy_version 2128 (0.0011) [2023-02-25 20:11:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4679.2). Total num frames: 8724480. Throughput: 0: 1217.4. Samples: 1178368. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:11:18,382][00219] Avg episode reward: [(0, '25.480')] [2023-02-25 20:11:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.1, 300 sec: 4706.9). Total num frames: 8753152. Throughput: 0: 1230.2. Samples: 1184848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:11:23,383][00219] Avg episode reward: [(0, '26.474')] [2023-02-25 20:11:23,959][32866] Updated weights for policy 0, policy_version 2138 (0.0012) [2023-02-25 20:11:28,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 8781824. Throughput: 0: 1232.0. Samples: 1194048. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:11:28,384][00219] Avg episode reward: [(0, '28.239')] [2023-02-25 20:11:30,892][32866] Updated weights for policy 0, policy_version 2148 (0.0011) [2023-02-25 20:11:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4720.8). Total num frames: 8806400. Throughput: 0: 1226.3. Samples: 1198432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:11:33,379][00219] Avg episode reward: [(0, '29.066')] [2023-02-25 20:11:38,379][00219] Fps is (10 sec: 4504.6, 60 sec: 4915.0, 300 sec: 4693.0). Total num frames: 8826880. Throughput: 0: 1220.6. Samples: 1204416. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:11:38,383][00219] Avg episode reward: [(0, '28.410')] [2023-02-25 20:11:41,457][32866] Updated weights for policy 0, policy_version 2158 (0.0011) [2023-02-25 20:11:43,379][00219] Fps is (10 sec: 4095.2, 60 sec: 4778.5, 300 sec: 4706.9). Total num frames: 8847360. Throughput: 0: 1224.5. 
Samples: 1210400. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:11:43,380][00219] Avg episode reward: [(0, '27.787')] [2023-02-25 20:11:43,395][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002160_8847360.pth... [2023-02-25 20:11:43,537][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001884_7716864.pth [2023-02-25 20:11:48,386][32866] Updated weights for policy 0, policy_version 2168 (0.0015) [2023-02-25 20:11:48,377][00219] Fps is (10 sec: 5326.0, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 8880128. Throughput: 0: 1228.1. Samples: 1214816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:11:48,388][00219] Avg episode reward: [(0, '27.670')] [2023-02-25 20:11:53,377][00219] Fps is (10 sec: 5735.6, 60 sec: 4983.5, 300 sec: 4748.6). Total num frames: 8904704. Throughput: 0: 1233.1. Samples: 1224032. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:11:53,382][00219] Avg episode reward: [(0, '26.792')] [2023-02-25 20:11:56,413][32866] Updated weights for policy 0, policy_version 2178 (0.0011) [2023-02-25 20:11:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4720.8). Total num frames: 8925184. Throughput: 0: 1221.7. Samples: 1230336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:11:58,380][00219] Avg episode reward: [(0, '27.087')] [2023-02-25 20:12:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 8945664. Throughput: 0: 1216.7. Samples: 1233120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:12:03,379][00219] Avg episode reward: [(0, '26.520')] [2023-02-25 20:12:06,027][32866] Updated weights for policy 0, policy_version 2188 (0.0012) [2023-02-25 20:12:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 8970240. Throughput: 0: 1229.2. Samples: 1240160. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:12:08,379][00219] Avg episode reward: [(0, '26.571')] [2023-02-25 20:12:12,963][32866] Updated weights for policy 0, policy_version 2198 (0.0011) [2023-02-25 20:12:13,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 9003008. Throughput: 0: 1229.2. Samples: 1249360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:12:13,379][00219] Avg episode reward: [(0, '26.230')] [2023-02-25 20:12:18,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4983.4, 300 sec: 4734.7). Total num frames: 9023488. Throughput: 0: 1215.6. Samples: 1253136. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:12:18,379][00219] Avg episode reward: [(0, '27.139')] [2023-02-25 20:12:22,768][32866] Updated weights for policy 0, policy_version 2208 (0.0011) [2023-02-25 20:12:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4734.7). Total num frames: 9043968. Throughput: 0: 1213.6. Samples: 1259024. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:12:23,379][00219] Avg episode reward: [(0, '26.389')] [2023-02-25 20:12:28,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 9068544. Throughput: 0: 1227.4. Samples: 1265632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:12:28,379][00219] Avg episode reward: [(0, '27.296')] [2023-02-25 20:12:31,105][32866] Updated weights for policy 0, policy_version 2218 (0.0012) [2023-02-25 20:12:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 9097216. Throughput: 0: 1230.6. Samples: 1270192. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:12:33,379][00219] Avg episode reward: [(0, '27.881')] [2023-02-25 20:12:37,904][32866] Updated weights for policy 0, policy_version 2228 (0.0012) [2023-02-25 20:12:38,379][00219] Fps is (10 sec: 5733.2, 60 sec: 4983.5, 300 sec: 4776.3). Total num frames: 9125888. Throughput: 0: 1219.9. 
Samples: 1278928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:12:38,381][00219] Avg episode reward: [(0, '28.613')] [2023-02-25 20:12:43,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4915.3, 300 sec: 4720.8). Total num frames: 9142272. Throughput: 0: 1208.5. Samples: 1284720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 20:12:43,382][00219] Avg episode reward: [(0, '29.311')] [2023-02-25 20:12:48,377][00219] Fps is (10 sec: 3687.2, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 9162752. Throughput: 0: 1212.4. Samples: 1287680. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:12:48,384][00219] Avg episode reward: [(0, '29.270')] [2023-02-25 20:12:48,466][32866] Updated weights for policy 0, policy_version 2238 (0.0013) [2023-02-25 20:12:53,377][00219] Fps is (10 sec: 4915.6, 60 sec: 4778.6, 300 sec: 4776.3). Total num frames: 9191424. Throughput: 0: 1227.7. Samples: 1295408. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:12:53,382][00219] Avg episode reward: [(0, '29.105')] [2023-02-25 20:12:55,188][32866] Updated weights for policy 0, policy_version 2248 (0.0013) [2023-02-25 20:12:58,377][00219] Fps is (10 sec: 6143.9, 60 sec: 4983.5, 300 sec: 4790.2). Total num frames: 9224192. Throughput: 0: 1224.5. Samples: 1304464. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:12:58,381][00219] Avg episode reward: [(0, '29.851')] [2023-02-25 20:13:03,379][00219] Fps is (10 sec: 5323.9, 60 sec: 4983.3, 300 sec: 4762.4). Total num frames: 9244672. Throughput: 0: 1207.4. Samples: 1307472. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:13:03,381][00219] Avg episode reward: [(0, '30.947')] [2023-02-25 20:13:04,721][32866] Updated weights for policy 0, policy_version 2258 (0.0012) [2023-02-25 20:13:08,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4846.9, 300 sec: 4734.7). Total num frames: 9261056. Throughput: 0: 1206.4. Samples: 1313312. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:13:08,379][00219] Avg episode reward: [(0, '29.558')] [2023-02-25 20:13:13,377][00219] Fps is (10 sec: 3277.4, 60 sec: 4573.8, 300 sec: 4734.8). Total num frames: 9277440. Throughput: 0: 1174.0. Samples: 1318464. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:13:13,383][00219] Avg episode reward: [(0, '28.517')] [2023-02-25 20:13:15,591][32866] Updated weights for policy 0, policy_version 2268 (0.0015) [2023-02-25 20:13:18,381][00219] Fps is (10 sec: 3684.9, 60 sec: 4573.6, 300 sec: 4720.7). Total num frames: 9297920. Throughput: 0: 1134.8. Samples: 1321264. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:13:18,383][00219] Avg episode reward: [(0, '28.755')] [2023-02-25 20:13:23,379][00219] Fps is (10 sec: 4095.3, 60 sec: 4573.7, 300 sec: 4693.0). Total num frames: 9318400. Throughput: 0: 1085.2. Samples: 1327760. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:13:23,381][00219] Avg episode reward: [(0, '27.492')] [2023-02-25 20:13:26,030][32866] Updated weights for policy 0, policy_version 2278 (0.0012) [2023-02-25 20:13:28,377][00219] Fps is (10 sec: 4097.5, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 9338880. Throughput: 0: 1088.7. Samples: 1333712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:13:28,386][00219] Avg episode reward: [(0, '26.599')] [2023-02-25 20:13:33,378][00219] Fps is (10 sec: 4096.4, 60 sec: 4369.0, 300 sec: 4679.1). Total num frames: 9359360. Throughput: 0: 1087.3. Samples: 1336608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:13:33,381][00219] Avg episode reward: [(0, '25.447')] [2023-02-25 20:13:35,716][32866] Updated weights for policy 0, policy_version 2288 (0.0012) [2023-02-25 20:13:38,377][00219] Fps is (10 sec: 4505.8, 60 sec: 4301.0, 300 sec: 4707.0). Total num frames: 9383936. Throughput: 0: 1074.9. Samples: 1343776. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:13:38,379][00219] Avg episode reward: [(0, '25.158')] [2023-02-25 20:13:42,331][32866] Updated weights for policy 0, policy_version 2298 (0.0013) [2023-02-25 20:13:43,377][00219] Fps is (10 sec: 5735.0, 60 sec: 4573.9, 300 sec: 4720.8). Total num frames: 9416704. Throughput: 0: 1076.3. Samples: 1352896. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:13:43,379][00219] Avg episode reward: [(0, '26.026')] [2023-02-25 20:13:43,390][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002299_9416704.pth... [2023-02-25 20:13:43,500][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002020_8273920.pth [2023-02-25 20:13:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4573.9, 300 sec: 4720.8). Total num frames: 9437184. Throughput: 0: 1083.8. Samples: 1356240. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:13:48,379][00219] Avg episode reward: [(0, '25.451')] [2023-02-25 20:13:52,000][32866] Updated weights for policy 0, policy_version 2308 (0.0014) [2023-02-25 20:13:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.1, 300 sec: 4720.8). Total num frames: 9453568. Throughput: 0: 1078.4. Samples: 1361840. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:13:53,385][00219] Avg episode reward: [(0, '25.904')] [2023-02-25 20:13:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4300.8, 300 sec: 4776.4). Total num frames: 9482240. Throughput: 0: 1114.3. Samples: 1368608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:13:58,379][00219] Avg episode reward: [(0, '27.348')] [2023-02-25 20:14:00,494][32866] Updated weights for policy 0, policy_version 2318 (0.0012) [2023-02-25 20:14:03,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4437.5, 300 sec: 4804.1). Total num frames: 9510912. Throughput: 0: 1152.5. Samples: 1373120. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:14:03,379][00219] Avg episode reward: [(0, '26.042')] [2023-02-25 20:14:07,650][32866] Updated weights for policy 0, policy_version 2328 (0.0012) [2023-02-25 20:14:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4573.9, 300 sec: 4790.2). Total num frames: 9535488. Throughput: 0: 1201.1. Samples: 1381808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:14:08,380][00219] Avg episode reward: [(0, '26.929')] [2023-02-25 20:14:13,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 9555968. Throughput: 0: 1198.9. Samples: 1387664. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2023-02-25 20:14:13,380][00219] Avg episode reward: [(0, '27.583')] [2023-02-25 20:14:18,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4574.2, 300 sec: 4748.6). Total num frames: 9572352. Throughput: 0: 1197.9. Samples: 1390512. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:14:18,379][00219] Avg episode reward: [(0, '28.756')] [2023-02-25 20:14:18,407][32866] Updated weights for policy 0, policy_version 2338 (0.0011) [2023-02-25 20:14:23,377][00219] Fps is (10 sec: 4915.7, 60 sec: 4778.8, 300 sec: 4776.5). Total num frames: 9605120. Throughput: 0: 1214.9. Samples: 1398448. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:14:23,381][00219] Avg episode reward: [(0, '28.258')] [2023-02-25 20:14:25,183][32866] Updated weights for policy 0, policy_version 2348 (0.0011) [2023-02-25 20:14:28,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 9633792. Throughput: 0: 1216.7. Samples: 1407648. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:14:28,378][00219] Avg episode reward: [(0, '29.264')] [2023-02-25 20:14:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.3, 300 sec: 4762.5). Total num frames: 9654272. Throughput: 0: 1208.5. Samples: 1410624. 
Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-02-25 20:14:33,383][00219] Avg episode reward: [(0, '29.363')] [2023-02-25 20:14:34,227][32866] Updated weights for policy 0, policy_version 2358 (0.0012) [2023-02-25 20:14:38,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.6, 300 sec: 4734.8). Total num frames: 9670656. Throughput: 0: 1210.3. Samples: 1416304. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:14:38,383][00219] Avg episode reward: [(0, '29.596')] [2023-02-25 20:14:42,919][32866] Updated weights for policy 0, policy_version 2368 (0.0011) [2023-02-25 20:14:43,377][00219] Fps is (10 sec: 4915.0, 60 sec: 4778.6, 300 sec: 4790.2). Total num frames: 9703424. Throughput: 0: 1228.1. Samples: 1423872. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:14:43,380][00219] Avg episode reward: [(0, '29.243')] [2023-02-25 20:14:48,377][00219] Fps is (10 sec: 6144.2, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 9732096. Throughput: 0: 1228.4. Samples: 1428400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:14:48,380][00219] Avg episode reward: [(0, '28.301')] [2023-02-25 20:14:49,362][32866] Updated weights for policy 0, policy_version 2378 (0.0012) [2023-02-25 20:14:53,379][00219] Fps is (10 sec: 5323.9, 60 sec: 5051.6, 300 sec: 4804.1). Total num frames: 9756672. Throughput: 0: 1218.8. Samples: 1436656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:14:53,381][00219] Avg episode reward: [(0, '27.067')] [2023-02-25 20:14:58,381][00219] Fps is (10 sec: 4094.3, 60 sec: 4846.6, 300 sec: 4762.4). Total num frames: 9773056. Throughput: 0: 1217.3. Samples: 1442448. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:14:58,385][00219] Avg episode reward: [(0, '26.858')] [2023-02-25 20:14:59,477][32866] Updated weights for policy 0, policy_version 2388 (0.0026) [2023-02-25 20:15:03,377][00219] Fps is (10 sec: 4096.8, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 9797632. Throughput: 0: 1220.3. 
Samples: 1445424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:15:03,379][00219] Avg episode reward: [(0, '26.566')] [2023-02-25 20:15:07,494][32866] Updated weights for policy 0, policy_version 2398 (0.0013) [2023-02-25 20:15:08,377][00219] Fps is (10 sec: 5327.0, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 9826304. Throughput: 0: 1233.1. Samples: 1453936. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:15:08,379][00219] Avg episode reward: [(0, '26.274')] [2023-02-25 20:15:13,379][00219] Fps is (10 sec: 5733.3, 60 sec: 4983.4, 300 sec: 4818.0). Total num frames: 9854976. Throughput: 0: 1218.4. Samples: 1462480. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:15:13,381][00219] Avg episode reward: [(0, '26.511')] [2023-02-25 20:15:14,863][32866] Updated weights for policy 0, policy_version 2408 (0.0011) [2023-02-25 20:15:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 9871360. Throughput: 0: 1217.4. Samples: 1465408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:15:18,383][00219] Avg episode reward: [(0, '27.349')] [2023-02-25 20:15:23,377][00219] Fps is (10 sec: 3686.9, 60 sec: 4778.6, 300 sec: 4762.5). Total num frames: 9891840. Throughput: 0: 1217.8. Samples: 1471104. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:15:23,385][00219] Avg episode reward: [(0, '27.359')] [2023-02-25 20:15:25,066][32866] Updated weights for policy 0, policy_version 2418 (0.0012) [2023-02-25 20:15:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4790.2). Total num frames: 9920512. Throughput: 0: 1228.1. Samples: 1479136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:15:28,378][00219] Avg episode reward: [(0, '25.954')] [2023-02-25 20:15:31,694][32866] Updated weights for policy 0, policy_version 2428 (0.0012) [2023-02-25 20:15:33,377][00219] Fps is (10 sec: 6144.4, 60 sec: 4983.5, 300 sec: 4818.0). Total num frames: 9953280. 
Throughput: 0: 1228.4. Samples: 1483680. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:15:33,380][00219] Avg episode reward: [(0, '25.330')] [2023-02-25 20:15:38,378][00219] Fps is (10 sec: 5324.2, 60 sec: 5051.7, 300 sec: 4790.2). Total num frames: 9973760. Throughput: 0: 1208.6. Samples: 1491040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:15:38,380][00219] Avg episode reward: [(0, '26.380')] [2023-02-25 20:15:42,063][32866] Updated weights for policy 0, policy_version 2438 (0.0012) [2023-02-25 20:15:43,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 9990144. Throughput: 0: 1201.5. Samples: 1496512. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:15:43,385][00219] Avg episode reward: [(0, '24.504')] [2023-02-25 20:15:43,402][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002439_9990144.pth... [2023-02-25 20:15:43,618][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002160_8847360.pth [2023-02-25 20:15:48,377][00219] Fps is (10 sec: 4096.4, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 10014720. Throughput: 0: 1196.4. Samples: 1499264. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:15:48,384][00219] Avg episode reward: [(0, '26.506')] [2023-02-25 20:15:50,289][32866] Updated weights for policy 0, policy_version 2448 (0.0012) [2023-02-25 20:15:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.8, 300 sec: 4804.1). Total num frames: 10043392. Throughput: 0: 1198.9. Samples: 1507888. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:15:53,387][00219] Avg episode reward: [(0, '27.375')] [2023-02-25 20:15:57,684][32866] Updated weights for policy 0, policy_version 2458 (0.0012) [2023-02-25 20:15:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.5, 300 sec: 4790.2). Total num frames: 10067968. Throughput: 0: 1186.5. Samples: 1515872. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:15:58,380][00219] Avg episode reward: [(0, '26.996')] [2023-02-25 20:16:03,378][00219] Fps is (10 sec: 4095.3, 60 sec: 4778.5, 300 sec: 4748.6). Total num frames: 10084352. Throughput: 0: 1180.4. Samples: 1518528. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:16:03,382][00219] Avg episode reward: [(0, '28.111')] [2023-02-25 20:16:08,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 10104832. Throughput: 0: 1175.8. Samples: 1524016. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:16:08,385][00219] Avg episode reward: [(0, '29.476')] [2023-02-25 20:16:08,984][32866] Updated weights for policy 0, policy_version 2468 (0.0012) [2023-02-25 20:16:13,377][00219] Fps is (10 sec: 4916.1, 60 sec: 4642.3, 300 sec: 4776.4). Total num frames: 10133504. Throughput: 0: 1176.9. Samples: 1532096. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:16:13,382][00219] Avg episode reward: [(0, '30.325')] [2023-02-25 20:16:15,886][32866] Updated weights for policy 0, policy_version 2478 (0.0011) [2023-02-25 20:16:18,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 10162176. Throughput: 0: 1175.1. Samples: 1536560. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:16:18,381][00219] Avg episode reward: [(0, '29.857')] [2023-02-25 20:16:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.0, 300 sec: 4748.6). Total num frames: 10182656. Throughput: 0: 1164.8. Samples: 1543456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:16:23,382][00219] Avg episode reward: [(0, '29.763')] [2023-02-25 20:16:24,886][32866] Updated weights for policy 0, policy_version 2488 (0.0022) [2023-02-25 20:16:28,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4642.1, 300 sec: 4720.8). Total num frames: 10199040. Throughput: 0: 1166.6. Samples: 1549008. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:16:28,385][00219] Avg episode reward: [(0, '28.693')] [2023-02-25 20:16:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4734.7). Total num frames: 10223616. Throughput: 0: 1178.7. Samples: 1552304. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:16:33,378][00219] Avg episode reward: [(0, '26.797')] [2023-02-25 20:16:33,768][32866] Updated weights for policy 0, policy_version 2498 (0.0012) [2023-02-25 20:16:38,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4710.5, 300 sec: 4776.4). Total num frames: 10256384. Throughput: 0: 1182.9. Samples: 1561120. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:16:38,379][00219] Avg episode reward: [(0, '27.552')] [2023-02-25 20:16:41,848][32866] Updated weights for policy 0, policy_version 2508 (0.0011) [2023-02-25 20:16:43,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 10276864. Throughput: 0: 1170.8. Samples: 1568560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:16:43,381][00219] Avg episode reward: [(0, '26.915')] [2023-02-25 20:16:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 10297344. Throughput: 0: 1175.9. Samples: 1571440. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:16:48,385][00219] Avg episode reward: [(0, '29.330')] [2023-02-25 20:16:51,910][32866] Updated weights for policy 0, policy_version 2518 (0.0012) [2023-02-25 20:16:53,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4642.1, 300 sec: 4734.7). Total num frames: 10321920. Throughput: 0: 1185.1. Samples: 1577344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:16:53,385][00219] Avg episode reward: [(0, '29.877')] [2023-02-25 20:16:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 10350592. Throughput: 0: 1204.6. Samples: 1586304. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:16:58,390][00219] Avg episode reward: [(0, '29.381')] [2023-02-25 20:16:58,886][32866] Updated weights for policy 0, policy_version 2528 (0.0012) [2023-02-25 20:17:03,378][00219] Fps is (10 sec: 5324.3, 60 sec: 4847.0, 300 sec: 4762.5). Total num frames: 10375168. Throughput: 0: 1204.9. Samples: 1590784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:17:03,380][00219] Avg episode reward: [(0, '30.615')] [2023-02-25 20:17:07,878][32866] Updated weights for policy 0, policy_version 2538 (0.0012) [2023-02-25 20:17:08,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 10395648. Throughput: 0: 1190.0. Samples: 1597008. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:17:08,385][00219] Avg episode reward: [(0, '30.496')] [2023-02-25 20:17:13,377][00219] Fps is (10 sec: 4096.4, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 10416128. Throughput: 0: 1192.5. Samples: 1602672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:17:13,383][00219] Avg episode reward: [(0, '29.643')] [2023-02-25 20:17:17,173][32866] Updated weights for policy 0, policy_version 2548 (0.0012) [2023-02-25 20:17:18,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4642.1, 300 sec: 4734.7). Total num frames: 10440704. Throughput: 0: 1204.6. Samples: 1606512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:17:18,382][00219] Avg episode reward: [(0, '29.838')] [2023-02-25 20:17:23,378][00219] Fps is (10 sec: 5733.8, 60 sec: 4846.9, 300 sec: 4762.5). Total num frames: 10473472. Throughput: 0: 1208.9. Samples: 1615520. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:17:23,384][00219] Avg episode reward: [(0, '30.389')] [2023-02-25 20:17:24,352][32866] Updated weights for policy 0, policy_version 2558 (0.0012) [2023-02-25 20:17:28,380][00219] Fps is (10 sec: 4504.2, 60 sec: 4778.4, 300 sec: 4706.9). Total num frames: 10485760. Throughput: 0: 1168.6. 
Samples: 1621152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:17:28,382][00219] Avg episode reward: [(0, '30.235')] [2023-02-25 20:17:33,377][00219] Fps is (10 sec: 2867.5, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 10502144. Throughput: 0: 1155.9. Samples: 1623456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:17:33,379][00219] Avg episode reward: [(0, '28.997')] [2023-02-25 20:17:37,441][32866] Updated weights for policy 0, policy_version 2568 (0.0021) [2023-02-25 20:17:38,377][00219] Fps is (10 sec: 3277.8, 60 sec: 4369.1, 300 sec: 4665.3). Total num frames: 10518528. Throughput: 0: 1124.6. Samples: 1627952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:17:38,381][00219] Avg episode reward: [(0, '29.873')] [2023-02-25 20:17:43,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4300.8, 300 sec: 4651.4). Total num frames: 10534912. Throughput: 0: 1037.9. Samples: 1633008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:17:43,379][00219] Avg episode reward: [(0, '30.458')] [2023-02-25 20:17:43,456][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002573_10539008.pth... [2023-02-25 20:17:43,581][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002299_9416704.pth [2023-02-25 20:17:46,826][32866] Updated weights for policy 0, policy_version 2578 (0.0019) [2023-02-25 20:17:48,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4505.6, 300 sec: 4665.3). Total num frames: 10567680. Throughput: 0: 1033.6. Samples: 1637296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:17:48,380][00219] Avg episode reward: [(0, '29.977')] [2023-02-25 20:17:53,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4573.9, 300 sec: 4651.4). Total num frames: 10596352. Throughput: 0: 1094.8. Samples: 1646272. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:17:53,384][00219] Avg episode reward: [(0, '29.877')] [2023-02-25 20:17:54,460][32866] Updated weights for policy 0, policy_version 2588 (0.0012) [2023-02-25 20:17:58,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4369.1, 300 sec: 4637.5). Total num frames: 10612736. Throughput: 0: 1099.7. Samples: 1652160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:17:58,379][00219] Avg episode reward: [(0, '30.227')] [2023-02-25 20:18:03,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4232.6, 300 sec: 4637.5). Total num frames: 10629120. Throughput: 0: 1075.6. Samples: 1654912. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:18:03,379][00219] Avg episode reward: [(0, '28.965')] [2023-02-25 20:18:05,106][32866] Updated weights for policy 0, policy_version 2598 (0.0013) [2023-02-25 20:18:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4437.4, 300 sec: 4693.0). Total num frames: 10661888. Throughput: 0: 1035.8. Samples: 1662128. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:18:08,384][00219] Avg episode reward: [(0, '28.509')] [2023-02-25 20:18:11,897][32866] Updated weights for policy 0, policy_version 2608 (0.0012) [2023-02-25 20:18:13,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4573.9, 300 sec: 4720.9). Total num frames: 10690560. Throughput: 0: 1114.0. Samples: 1671280. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:18:13,379][00219] Avg episode reward: [(0, '27.635')] [2023-02-25 20:18:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4505.6, 300 sec: 4720.8). Total num frames: 10711040. Throughput: 0: 1142.0. Samples: 1674848. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:18:18,380][00219] Avg episode reward: [(0, '28.541')] [2023-02-25 20:18:21,381][32866] Updated weights for policy 0, policy_version 2618 (0.0011) [2023-02-25 20:18:23,377][00219] Fps is (10 sec: 4095.8, 60 sec: 4300.8, 300 sec: 4720.8). Total num frames: 10731520. Throughput: 0: 1171.2. 
Samples: 1680656. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:18:23,384][00219] Avg episode reward: [(0, '27.311')] [2023-02-25 20:18:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4505.8, 300 sec: 4734.7). Total num frames: 10756096. Throughput: 0: 1211.7. Samples: 1687536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:18:28,385][00219] Avg episode reward: [(0, '27.674')] [2023-02-25 20:18:29,322][32866] Updated weights for policy 0, policy_version 2628 (0.0011) [2023-02-25 20:18:33,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 10788864. Throughput: 0: 1218.1. Samples: 1692112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:18:33,385][00219] Avg episode reward: [(0, '27.323')] [2023-02-25 20:18:36,289][32866] Updated weights for policy 0, policy_version 2638 (0.0011) [2023-02-25 20:18:38,384][00219] Fps is (10 sec: 5321.0, 60 sec: 4846.4, 300 sec: 4720.7). Total num frames: 10809344. Throughput: 0: 1211.2. Samples: 1700784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:18:38,390][00219] Avg episode reward: [(0, '28.846')] [2023-02-25 20:18:43,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4914.7, 300 sec: 4720.7). Total num frames: 10829824. Throughput: 0: 1212.3. Samples: 1706720. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:18:43,385][00219] Avg episode reward: [(0, '30.125')] [2023-02-25 20:18:46,857][32866] Updated weights for policy 0, policy_version 2648 (0.0011) [2023-02-25 20:18:48,377][00219] Fps is (10 sec: 4098.9, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 10850304. Throughput: 0: 1214.9. Samples: 1709584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:18:48,385][00219] Avg episode reward: [(0, '30.081')] [2023-02-25 20:18:53,377][00219] Fps is (10 sec: 5328.0, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 10883072. Throughput: 0: 1229.9. Samples: 1717472. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:18:53,384][00219] Avg episode reward: [(0, '30.072')] [2023-02-25 20:18:53,886][32866] Updated weights for policy 0, policy_version 2658 (0.0012) [2023-02-25 20:18:58,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4748.6). Total num frames: 10911744. Throughput: 0: 1228.8. Samples: 1726576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:18:58,381][00219] Avg episode reward: [(0, '30.527')] [2023-02-25 20:19:01,908][32866] Updated weights for policy 0, policy_version 2668 (0.0012) [2023-02-25 20:19:03,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4983.5, 300 sec: 4720.8). Total num frames: 10928128. Throughput: 0: 1214.6. Samples: 1729504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:19:03,386][00219] Avg episode reward: [(0, '28.093')] [2023-02-25 20:19:08,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4778.7, 300 sec: 4720.8). Total num frames: 10948608. Throughput: 0: 1211.0. Samples: 1735152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:19:08,382][00219] Avg episode reward: [(0, '26.799')] [2023-02-25 20:19:11,646][32866] Updated weights for policy 0, policy_version 2678 (0.0012) [2023-02-25 20:19:13,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 10977280. Throughput: 0: 1223.5. Samples: 1742592. Policy #0 lag: (min: 1.0, avg: 1.1, max: 2.0) [2023-02-25 20:19:13,384][00219] Avg episode reward: [(0, '27.608')] [2023-02-25 20:19:18,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 11005952. Throughput: 0: 1221.7. Samples: 1747088. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:19:18,379][00219] Avg episode reward: [(0, '27.430')] [2023-02-25 20:19:18,663][32866] Updated weights for policy 0, policy_version 2688 (0.0012) [2023-02-25 20:19:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4720.8). Total num frames: 11026432. Throughput: 0: 1207.7. 
Samples: 1755120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:19:23,381][00219] Avg episode reward: [(0, '27.994')] [2023-02-25 20:19:28,384][00219] Fps is (10 sec: 4093.1, 60 sec: 4846.4, 300 sec: 4720.7). Total num frames: 11046912. Throughput: 0: 1201.8. Samples: 1760800. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:19:28,391][00219] Avg episode reward: [(0, '27.861')] [2023-02-25 20:19:28,656][32866] Updated weights for policy 0, policy_version 2698 (0.0011) [2023-02-25 20:19:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4748.6). Total num frames: 11071488. Throughput: 0: 1199.6. Samples: 1763568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:19:33,384][00219] Avg episode reward: [(0, '28.609')] [2023-02-25 20:19:36,586][32866] Updated weights for policy 0, policy_version 2708 (0.0012) [2023-02-25 20:19:38,377][00219] Fps is (10 sec: 5328.5, 60 sec: 4847.5, 300 sec: 4734.7). Total num frames: 11100160. Throughput: 0: 1219.2. Samples: 1772336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:19:38,385][00219] Avg episode reward: [(0, '30.732')] [2023-02-25 20:19:43,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4915.7, 300 sec: 4720.8). Total num frames: 11124736. Throughput: 0: 1200.3. Samples: 1780592. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:19:43,390][00219] Avg episode reward: [(0, '29.630')] [2023-02-25 20:19:43,482][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002717_11128832.pth... [2023-02-25 20:19:43,624][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002439_9990144.pth [2023-02-25 20:19:44,534][32866] Updated weights for policy 0, policy_version 2718 (0.0013) [2023-02-25 20:19:48,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4915.1, 300 sec: 4706.9). Total num frames: 11145216. Throughput: 0: 1197.5. Samples: 1783392. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:19:48,388][00219] Avg episode reward: [(0, '29.174')] [2023-02-25 20:19:53,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4710.4, 300 sec: 4720.9). Total num frames: 11165696. Throughput: 0: 1200.4. Samples: 1789168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:19:53,378][00219] Avg episode reward: [(0, '29.796')] [2023-02-25 20:19:54,768][32866] Updated weights for policy 0, policy_version 2728 (0.0011) [2023-02-25 20:19:58,377][00219] Fps is (10 sec: 4915.8, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 11194368. Throughput: 0: 1217.4. Samples: 1797376. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:19:58,380][00219] Avg episode reward: [(0, '29.961')] [2023-02-25 20:19:58,859][32858] Signal inference workers to stop experience collection... (50 times) [2023-02-25 20:19:58,863][32858] Signal inference workers to resume experience collection... (50 times) [2023-02-25 20:19:58,914][32866] InferenceWorker_p0-w0: stopping experience collection (50 times) [2023-02-25 20:19:58,915][32866] InferenceWorker_p0-w0: resuming experience collection (50 times) [2023-02-25 20:20:01,444][32866] Updated weights for policy 0, policy_version 2738 (0.0012) [2023-02-25 20:20:03,377][00219] Fps is (10 sec: 6143.7, 60 sec: 4983.4, 300 sec: 4748.6). Total num frames: 11227136. Throughput: 0: 1221.0. Samples: 1802032. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:20:03,379][00219] Avg episode reward: [(0, '28.389')] [2023-02-25 20:20:08,382][00219] Fps is (10 sec: 4912.7, 60 sec: 4914.8, 300 sec: 4706.9). Total num frames: 11243520. Throughput: 0: 1206.3. Samples: 1809408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 20:20:08,393][00219] Avg episode reward: [(0, '27.752')] [2023-02-25 20:20:10,982][32866] Updated weights for policy 0, policy_version 2748 (0.0011) [2023-02-25 20:20:13,377][00219] Fps is (10 sec: 3686.6, 60 sec: 4778.7, 300 sec: 4720.8). 
Total num frames: 11264000. Throughput: 0: 1209.1. Samples: 1815200. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:20:13,382][00219] Avg episode reward: [(0, '28.215')] [2023-02-25 20:20:18,377][00219] Fps is (10 sec: 4917.7, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 11292672. Throughput: 0: 1212.8. Samples: 1818144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:20:18,386][00219] Avg episode reward: [(0, '27.496')] [2023-02-25 20:20:19,362][32866] Updated weights for policy 0, policy_version 2758 (0.0012) [2023-02-25 20:20:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4734.7). Total num frames: 11317248. Throughput: 0: 1218.5. Samples: 1827168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:20:23,388][00219] Avg episode reward: [(0, '28.243')] [2023-02-25 20:20:26,544][32866] Updated weights for policy 0, policy_version 2768 (0.0011) [2023-02-25 20:20:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.8, 300 sec: 4706.9). Total num frames: 11341824. Throughput: 0: 1206.8. Samples: 1834896. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:20:28,381][00219] Avg episode reward: [(0, '29.087')] [2023-02-25 20:20:33,378][00219] Fps is (10 sec: 4914.7, 60 sec: 4915.1, 300 sec: 4720.8). Total num frames: 11366400. Throughput: 0: 1208.5. Samples: 1837776. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:20:33,380][00219] Avg episode reward: [(0, '28.924')] [2023-02-25 20:20:37,454][32866] Updated weights for policy 0, policy_version 2778 (0.0012) [2023-02-25 20:20:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 11382784. Throughput: 0: 1209.2. Samples: 1843584. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:20:38,379][00219] Avg episode reward: [(0, '28.731')] [2023-02-25 20:20:43,377][00219] Fps is (10 sec: 4915.7, 60 sec: 4847.0, 300 sec: 4748.6). Total num frames: 11415552. Throughput: 0: 1216.4. Samples: 1852112. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:20:43,385][00219] Avg episode reward: [(0, '28.540')] [2023-02-25 20:20:44,191][32866] Updated weights for policy 0, policy_version 2788 (0.0022) [2023-02-25 20:20:48,381][00219] Fps is (10 sec: 5732.1, 60 sec: 4915.0, 300 sec: 4734.6). Total num frames: 11440128. Throughput: 0: 1214.1. Samples: 1856672. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:20:48,386][00219] Avg episode reward: [(0, '27.500')] [2023-02-25 20:20:52,731][32866] Updated weights for policy 0, policy_version 2798 (0.0012) [2023-02-25 20:20:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4720.8). Total num frames: 11460608. Throughput: 0: 1201.2. Samples: 1863456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:20:53,379][00219] Avg episode reward: [(0, '27.696')] [2023-02-25 20:20:58,377][00219] Fps is (10 sec: 4097.5, 60 sec: 4778.6, 300 sec: 4734.7). Total num frames: 11481088. Throughput: 0: 1202.8. Samples: 1869328. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:20:58,381][00219] Avg episode reward: [(0, '28.649')] [2023-02-25 20:21:01,618][32866] Updated weights for policy 0, policy_version 2808 (0.0012) [2023-02-25 20:21:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 11509760. Throughput: 0: 1213.9. Samples: 1872768. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:21:03,379][00219] Avg episode reward: [(0, '31.020')] [2023-02-25 20:21:08,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4915.6, 300 sec: 4762.5). Total num frames: 11538432. Throughput: 0: 1217.4. Samples: 1881952. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:21:08,384][00219] Avg episode reward: [(0, '32.189')] [2023-02-25 20:21:08,453][32866] Updated weights for policy 0, policy_version 2818 (0.0012) [2023-02-25 20:21:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4748.6). Total num frames: 11563008. Throughput: 0: 1207.1. 
Samples: 1889216. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:21:13,379][00219] Avg episode reward: [(0, '30.094')] [2023-02-25 20:21:18,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 11579392. Throughput: 0: 1206.8. Samples: 1892080. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:21:18,379][00219] Avg episode reward: [(0, '29.417')] [2023-02-25 20:21:19,108][32866] Updated weights for policy 0, policy_version 2828 (0.0013) [2023-02-25 20:21:23,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 11603968. Throughput: 0: 1210.3. Samples: 1898048. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:21:23,379][00219] Avg episode reward: [(0, '29.190')] [2023-02-25 20:21:26,299][32866] Updated weights for policy 0, policy_version 2838 (0.0012) [2023-02-25 20:21:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 11632640. Throughput: 0: 1213.9. Samples: 1906736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:21:28,378][00219] Avg episode reward: [(0, '27.305')] [2023-02-25 20:21:33,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 11649024. Throughput: 0: 1175.5. Samples: 1909568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:21:33,380][00219] Avg episode reward: [(0, '27.603')] [2023-02-25 20:21:37,647][32866] Updated weights for policy 0, policy_version 2848 (0.0012) [2023-02-25 20:21:38,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4710.4, 300 sec: 4706.9). Total num frames: 11665408. Throughput: 0: 1132.1. Samples: 1914400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:21:38,379][00219] Avg episode reward: [(0, '28.914')] [2023-02-25 20:21:43,378][00219] Fps is (10 sec: 3276.7, 60 sec: 4437.2, 300 sec: 4693.0). Total num frames: 11681792. Throughput: 0: 1104.7. Samples: 1919040. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:21:43,382][00219] Avg episode reward: [(0, '30.202')] [2023-02-25 20:21:43,392][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002852_11681792.pth... [2023-02-25 20:21:43,512][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002573_10539008.pth [2023-02-25 20:21:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.4, 300 sec: 4679.2). Total num frames: 11702272. Throughput: 0: 1088.4. Samples: 1921744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:21:48,379][00219] Avg episode reward: [(0, '30.748')] [2023-02-25 20:21:48,888][32866] Updated weights for policy 0, policy_version 2858 (0.0012) [2023-02-25 20:21:53,377][00219] Fps is (10 sec: 4915.9, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 11730944. Throughput: 0: 1064.5. Samples: 1929856. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:21:53,378][00219] Avg episode reward: [(0, '29.964')] [2023-02-25 20:21:55,561][32866] Updated weights for policy 0, policy_version 2868 (0.0011) [2023-02-25 20:21:58,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4642.2, 300 sec: 4693.1). Total num frames: 11759616. Throughput: 0: 1100.8. Samples: 1938752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:21:58,383][00219] Avg episode reward: [(0, '30.687')] [2023-02-25 20:22:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.3, 300 sec: 4679.2). Total num frames: 11776000. Throughput: 0: 1100.8. Samples: 1941616. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:22:03,379][00219] Avg episode reward: [(0, '29.381')] [2023-02-25 20:22:05,392][32866] Updated weights for policy 0, policy_version 2878 (0.0012) [2023-02-25 20:22:08,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4300.8, 300 sec: 4679.2). Total num frames: 11796480. Throughput: 0: 1095.8. Samples: 1947360. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:22:08,385][00219] Avg episode reward: [(0, '30.785')] [2023-02-25 20:22:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4369.1, 300 sec: 4693.0). Total num frames: 11825152. Throughput: 0: 1070.2. Samples: 1954896. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:22:13,382][00219] Avg episode reward: [(0, '31.177')] [2023-02-25 20:22:13,893][32866] Updated weights for policy 0, policy_version 2888 (0.0011) [2023-02-25 20:22:18,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4642.1, 300 sec: 4693.1). Total num frames: 11857920. Throughput: 0: 1106.9. Samples: 1959376. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:22:18,384][00219] Avg episode reward: [(0, '31.394')] [2023-02-25 20:22:21,210][32866] Updated weights for policy 0, policy_version 2898 (0.0012) [2023-02-25 20:22:23,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4573.8, 300 sec: 4720.8). Total num frames: 11878400. Throughput: 0: 1177.6. Samples: 1967392. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:22:23,383][00219] Avg episode reward: [(0, '30.369')] [2023-02-25 20:22:28,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.1, 300 sec: 4720.8). Total num frames: 11894784. Throughput: 0: 1202.5. Samples: 1973152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:22:28,384][00219] Avg episode reward: [(0, '31.641')] [2023-02-25 20:22:31,280][32866] Updated weights for policy 0, policy_version 2908 (0.0014) [2023-02-25 20:22:33,377][00219] Fps is (10 sec: 4096.5, 60 sec: 4505.7, 300 sec: 4748.6). Total num frames: 11919360. Throughput: 0: 1207.5. Samples: 1976080. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:22:33,379][00219] Avg episode reward: [(0, '28.651')] [2023-02-25 20:22:38,341][32866] Updated weights for policy 0, policy_version 2918 (0.0012) [2023-02-25 20:22:38,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 11952128. Throughput: 0: 1216.7. 
Samples: 1984608. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:22:38,379][00219] Avg episode reward: [(0, '27.725')] [2023-02-25 20:22:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.3, 300 sec: 4776.4). Total num frames: 11976704. Throughput: 0: 1205.7. Samples: 1993008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:22:43,379][00219] Avg episode reward: [(0, '27.673')] [2023-02-25 20:22:47,151][32866] Updated weights for policy 0, policy_version 2928 (0.0013) [2023-02-25 20:22:48,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4915.1, 300 sec: 4748.6). Total num frames: 11997184. Throughput: 0: 1206.7. Samples: 1995920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:22:48,380][00219] Avg episode reward: [(0, '28.166')] [2023-02-25 20:22:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4748.6). Total num frames: 12013568. Throughput: 0: 1209.2. Samples: 2001776. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:22:53,379][00219] Avg episode reward: [(0, '28.495')] [2023-02-25 20:22:56,448][32866] Updated weights for policy 0, policy_version 2938 (0.0012) [2023-02-25 20:22:58,377][00219] Fps is (10 sec: 4915.7, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 12046336. Throughput: 0: 1222.8. Samples: 2009920. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:22:58,383][00219] Avg episode reward: [(0, '30.132')] [2023-02-25 20:23:03,197][32866] Updated weights for policy 0, policy_version 2948 (0.0012) [2023-02-25 20:23:03,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4790.2). Total num frames: 12075008. Throughput: 0: 1221.3. Samples: 2014336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:23:03,379][00219] Avg episode reward: [(0, '31.665')] [2023-02-25 20:23:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 12095488. Throughput: 0: 1202.5. Samples: 2021504. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:23:08,381][00219] Avg episode reward: [(0, '32.116')] [2023-02-25 20:23:13,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 12111872. Throughput: 0: 1203.6. Samples: 2027312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:23:13,383][00219] Avg episode reward: [(0, '32.253')] [2023-02-25 20:23:14,020][32866] Updated weights for policy 0, policy_version 2958 (0.0013) [2023-02-25 20:23:18,378][00219] Fps is (10 sec: 4095.5, 60 sec: 4642.0, 300 sec: 4762.5). Total num frames: 12136448. Throughput: 0: 1204.2. Samples: 2030272. Policy #0 lag: (min: 1.0, avg: 1.2, max: 3.0) [2023-02-25 20:23:18,386][00219] Avg episode reward: [(0, '31.932')] [2023-02-25 20:23:21,147][32866] Updated weights for policy 0, policy_version 2968 (0.0011) [2023-02-25 20:23:23,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4847.0, 300 sec: 4790.2). Total num frames: 12169216. Throughput: 0: 1217.8. Samples: 2039408. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:23:23,379][00219] Avg episode reward: [(0, '30.733')] [2023-02-25 20:23:28,377][00219] Fps is (10 sec: 5735.1, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 12193792. Throughput: 0: 1210.7. Samples: 2047488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:23:28,379][00219] Avg episode reward: [(0, '29.647')] [2023-02-25 20:23:29,116][32866] Updated weights for policy 0, policy_version 2978 (0.0011) [2023-02-25 20:23:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4748.7). Total num frames: 12210176. Throughput: 0: 1212.1. Samples: 2050464. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:23:33,387][00219] Avg episode reward: [(0, '28.842')] [2023-02-25 20:23:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4762.6). Total num frames: 12234752. Throughput: 0: 1210.0. Samples: 2056224. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:23:38,381][00219] Avg episode reward: [(0, '29.732')] [2023-02-25 20:23:38,935][32866] Updated weights for policy 0, policy_version 2988 (0.0012) [2023-02-25 20:23:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 4804.1). Total num frames: 12267520. Throughput: 0: 1222.0. Samples: 2064912. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:23:43,379][00219] Avg episode reward: [(0, '29.903')] [2023-02-25 20:23:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002995_12267520.pth... [2023-02-25 20:23:43,490][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002717_11128832.pth [2023-02-25 20:23:45,846][32866] Updated weights for policy 0, policy_version 2998 (0.0012) [2023-02-25 20:23:48,383][00219] Fps is (10 sec: 5730.9, 60 sec: 4914.8, 300 sec: 4776.3). Total num frames: 12292096. Throughput: 0: 1222.6. Samples: 2069360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:23:48,387][00219] Avg episode reward: [(0, '29.225')] [2023-02-25 20:23:53,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4983.5, 300 sec: 4748.6). Total num frames: 12312576. Throughput: 0: 1213.9. Samples: 2076128. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:23:53,379][00219] Avg episode reward: [(0, '30.001')] [2023-02-25 20:23:55,272][32866] Updated weights for policy 0, policy_version 3008 (0.0011) [2023-02-25 20:23:58,377][00219] Fps is (10 sec: 4098.3, 60 sec: 4778.6, 300 sec: 4762.5). Total num frames: 12333056. Throughput: 0: 1214.9. Samples: 2081984. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:23:58,384][00219] Avg episode reward: [(0, '29.961')] [2023-02-25 20:24:03,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 12357632. Throughput: 0: 1231.0. Samples: 2085664. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:24:03,383][00219] Avg episode reward: [(0, '29.007')] [2023-02-25 20:24:03,407][32866] Updated weights for policy 0, policy_version 3018 (0.0015) [2023-02-25 20:24:08,378][00219] Fps is (10 sec: 6143.3, 60 sec: 4983.3, 300 sec: 4804.1). Total num frames: 12394496. Throughput: 0: 1232.3. Samples: 2094864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:24:08,388][00219] Avg episode reward: [(0, '28.354')] [2023-02-25 20:24:09,831][32866] Updated weights for policy 0, policy_version 3028 (0.0011) [2023-02-25 20:24:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 12410880. Throughput: 0: 1214.2. Samples: 2102128. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:24:13,384][00219] Avg episode reward: [(0, '28.866')] [2023-02-25 20:24:18,377][00219] Fps is (10 sec: 4096.7, 60 sec: 4983.6, 300 sec: 4776.4). Total num frames: 12435456. Throughput: 0: 1211.4. Samples: 2104976. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:24:18,380][00219] Avg episode reward: [(0, '28.891')] [2023-02-25 20:24:20,548][32866] Updated weights for policy 0, policy_version 3038 (0.0013) [2023-02-25 20:24:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4776.5). Total num frames: 12455936. Throughput: 0: 1219.9. Samples: 2111120. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:24:23,382][00219] Avg episode reward: [(0, '28.635')] [2023-02-25 20:24:27,473][32866] Updated weights for policy 0, policy_version 3048 (0.0012) [2023-02-25 20:24:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4790.2). Total num frames: 12484608. Throughput: 0: 1228.4. Samples: 2120192. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:24:28,382][00219] Avg episode reward: [(0, '28.741')] [2023-02-25 20:24:33,381][00219] Fps is (10 sec: 5732.2, 60 sec: 5051.4, 300 sec: 4790.2). Total num frames: 12513280. Throughput: 0: 1232.4. 
Samples: 2124816. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:24:33,383][00219] Avg episode reward: [(0, '27.643')] [2023-02-25 20:24:35,790][32866] Updated weights for policy 0, policy_version 3058 (0.0035) [2023-02-25 20:24:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 12533760. Throughput: 0: 1216.7. Samples: 2130880. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:24:38,384][00219] Avg episode reward: [(0, '27.361')] [2023-02-25 20:24:43,377][00219] Fps is (10 sec: 3687.8, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 12550144. Throughput: 0: 1212.1. Samples: 2136528. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:24:43,379][00219] Avg episode reward: [(0, '27.506')] [2023-02-25 20:24:45,739][32866] Updated weights for policy 0, policy_version 3068 (0.0011) [2023-02-25 20:24:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.4, 300 sec: 4804.1). Total num frames: 12582912. Throughput: 0: 1226.0. Samples: 2140832. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:24:48,378][00219] Avg episode reward: [(0, '27.930')] [2023-02-25 20:24:52,306][32866] Updated weights for policy 0, policy_version 3078 (0.0011) [2023-02-25 20:24:53,382][00219] Fps is (10 sec: 6140.9, 60 sec: 4983.1, 300 sec: 4804.0). Total num frames: 12611584. Throughput: 0: 1226.9. Samples: 2150080. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:24:53,392][00219] Avg episode reward: [(0, '28.727')] [2023-02-25 20:24:58,377][00219] Fps is (10 sec: 4505.3, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 12627968. Throughput: 0: 1210.7. Samples: 2156608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:24:58,384][00219] Avg episode reward: [(0, '29.023')] [2023-02-25 20:25:02,619][32866] Updated weights for policy 0, policy_version 3088 (0.0011) [2023-02-25 20:25:03,377][00219] Fps is (10 sec: 4098.1, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 12652544. 
Throughput: 0: 1212.8. Samples: 2159552. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:25:03,385][00219] Avg episode reward: [(0, '28.998')] [2023-02-25 20:25:08,378][00219] Fps is (10 sec: 4914.7, 60 sec: 4710.4, 300 sec: 4790.2). Total num frames: 12677120. Throughput: 0: 1222.4. Samples: 2166128. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:25:08,388][00219] Avg episode reward: [(0, '28.433')] [2023-02-25 20:25:10,353][32866] Updated weights for policy 0, policy_version 3098 (0.0012) [2023-02-25 20:25:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 12705792. Throughput: 0: 1223.8. Samples: 2175264. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:25:13,384][00219] Avg episode reward: [(0, '28.138')] [2023-02-25 20:25:18,377][00219] Fps is (10 sec: 4916.0, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 12726272. Throughput: 0: 1214.0. Samples: 2179440. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:25:18,379][00219] Avg episode reward: [(0, '28.066')] [2023-02-25 20:25:18,538][32866] Updated weights for policy 0, policy_version 3108 (0.0012) [2023-02-25 20:25:23,381][00219] Fps is (10 sec: 4503.8, 60 sec: 4914.9, 300 sec: 4776.3). Total num frames: 12750848. Throughput: 0: 1206.6. Samples: 2185184. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:25:23,383][00219] Avg episode reward: [(0, '28.023')] [2023-02-25 20:25:28,342][32866] Updated weights for policy 0, policy_version 3118 (0.0014) [2023-02-25 20:25:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 12771328. Throughput: 0: 1214.9. Samples: 2191200. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:25:28,379][00219] Avg episode reward: [(0, '27.875')] [2023-02-25 20:25:33,380][00219] Fps is (10 sec: 4915.7, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 12800000. Throughput: 0: 1219.5. Samples: 2195712. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:25:33,382][00219] Avg episode reward: [(0, '27.872')] [2023-02-25 20:25:37,291][32866] Updated weights for policy 0, policy_version 3128 (0.0016) [2023-02-25 20:25:38,380][00219] Fps is (10 sec: 4504.2, 60 sec: 4710.2, 300 sec: 4748.5). Total num frames: 12816384. Throughput: 0: 1153.8. Samples: 2202000. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:25:38,387][00219] Avg episode reward: [(0, '28.655')] [2023-02-25 20:25:43,379][00219] Fps is (10 sec: 3277.1, 60 sec: 4710.2, 300 sec: 4720.8). Total num frames: 12832768. Throughput: 0: 1114.6. Samples: 2206768. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:25:43,381][00219] Avg episode reward: [(0, '28.547')] [2023-02-25 20:25:43,392][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003133_12832768.pth... [2023-02-25 20:25:43,526][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002852_11681792.pth [2023-02-25 20:25:48,377][00219] Fps is (10 sec: 3277.8, 60 sec: 4437.3, 300 sec: 4706.9). Total num frames: 12849152. Throughput: 0: 1097.6. Samples: 2208944. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:25:48,382][00219] Avg episode reward: [(0, '28.188')] [2023-02-25 20:25:49,781][32866] Updated weights for policy 0, policy_version 3138 (0.0016) [2023-02-25 20:25:53,377][00219] Fps is (10 sec: 3277.5, 60 sec: 4232.9, 300 sec: 4693.1). Total num frames: 12865536. Throughput: 0: 1075.6. Samples: 2214528. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:25:53,379][00219] Avg episode reward: [(0, '28.105')] [2023-02-25 20:25:57,348][32866] Updated weights for policy 0, policy_version 3148 (0.0015) [2023-02-25 20:25:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.4, 300 sec: 4693.0). Total num frames: 12894208. Throughput: 0: 1056.4. Samples: 2222800. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:25:58,379][00219] Avg episode reward: [(0, '27.484')] [2023-02-25 20:26:03,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4573.9, 300 sec: 4706.9). Total num frames: 12926976. Throughput: 0: 1060.6. Samples: 2227168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:26:03,379][00219] Avg episode reward: [(0, '27.515')] [2023-02-25 20:26:04,984][32866] Updated weights for policy 0, policy_version 3158 (0.0012) [2023-02-25 20:26:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4437.5, 300 sec: 4679.2). Total num frames: 12943360. Throughput: 0: 1091.7. Samples: 2234304. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:26:08,388][00219] Avg episode reward: [(0, '27.857')] [2023-02-25 20:26:13,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4300.8, 300 sec: 4693.0). Total num frames: 12963840. Throughput: 0: 1086.2. Samples: 2240080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:26:13,382][00219] Avg episode reward: [(0, '28.294')] [2023-02-25 20:26:15,613][32866] Updated weights for policy 0, policy_version 3168 (0.0014) [2023-02-25 20:26:18,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4369.0, 300 sec: 4693.0). Total num frames: 12988416. Throughput: 0: 1057.8. Samples: 2243312. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:26:18,384][00219] Avg episode reward: [(0, '27.626')] [2023-02-25 20:26:22,416][32866] Updated weights for policy 0, policy_version 3178 (0.0012) [2023-02-25 20:26:23,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4505.9, 300 sec: 4706.9). Total num frames: 13021184. Throughput: 0: 1120.8. Samples: 2252432. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:26:23,378][00219] Avg episode reward: [(0, '27.803')] [2023-02-25 20:26:28,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4505.6, 300 sec: 4720.8). Total num frames: 13041664. Throughput: 0: 1181.2. Samples: 2259920. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:26:28,379][00219] Avg episode reward: [(0, '26.571')] [2023-02-25 20:26:31,248][32866] Updated weights for policy 0, policy_version 3188 (0.0012) [2023-02-25 20:26:33,381][00219] Fps is (10 sec: 4503.8, 60 sec: 4437.3, 300 sec: 4748.5). Total num frames: 13066240. Throughput: 0: 1197.8. Samples: 2262848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:26:33,383][00219] Avg episode reward: [(0, '25.787')] [2023-02-25 20:26:38,378][00219] Fps is (10 sec: 4505.0, 60 sec: 4505.7, 300 sec: 4762.5). Total num frames: 13086720. Throughput: 0: 1202.5. Samples: 2268640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:26:38,386][00219] Avg episode reward: [(0, '26.117')] [2023-02-25 20:26:40,067][32866] Updated weights for policy 0, policy_version 3198 (0.0012) [2023-02-25 20:26:43,377][00219] Fps is (10 sec: 4917.2, 60 sec: 4710.6, 300 sec: 4790.2). Total num frames: 13115392. Throughput: 0: 1219.2. Samples: 2277664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:26:43,389][00219] Avg episode reward: [(0, '26.239')] [2023-02-25 20:26:46,857][32866] Updated weights for policy 0, policy_version 3208 (0.0011) [2023-02-25 20:26:48,377][00219] Fps is (10 sec: 5325.5, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 13139968. Throughput: 0: 1221.0. Samples: 2282112. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:26:48,379][00219] Avg episode reward: [(0, '27.710')] [2023-02-25 20:26:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 13160448. Throughput: 0: 1205.7. Samples: 2288560. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:26:53,380][00219] Avg episode reward: [(0, '28.100')] [2023-02-25 20:26:57,315][32866] Updated weights for policy 0, policy_version 3218 (0.0012) [2023-02-25 20:26:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 13180928. Throughput: 0: 1203.6. 
Samples: 2294240. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:26:58,387][00219] Avg episode reward: [(0, '29.818')] [2023-02-25 20:27:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4790.2). Total num frames: 13209600. Throughput: 0: 1219.9. Samples: 2298208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:27:03,379][00219] Avg episode reward: [(0, '28.734')] [2023-02-25 20:27:04,946][32866] Updated weights for policy 0, policy_version 3228 (0.0013) [2023-02-25 20:27:08,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 13238272. Throughput: 0: 1218.8. Samples: 2307280. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:27:08,378][00219] Avg episode reward: [(0, '27.444')] [2023-02-25 20:27:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 13258752. Throughput: 0: 1203.9. Samples: 2314096. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:27:13,381][00219] Avg episode reward: [(0, '27.313')] [2023-02-25 20:27:13,603][32866] Updated weights for policy 0, policy_version 3238 (0.0012) [2023-02-25 20:27:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4847.0, 300 sec: 4748.6). Total num frames: 13279232. Throughput: 0: 1200.8. Samples: 2316880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:27:18,382][00219] Avg episode reward: [(0, '24.766')] [2023-02-25 20:27:23,112][32866] Updated weights for policy 0, policy_version 3248 (0.0012) [2023-02-25 20:27:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 13303808. Throughput: 0: 1211.8. Samples: 2323168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:27:23,381][00219] Avg episode reward: [(0, '26.141')] [2023-02-25 20:27:28,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4846.9, 300 sec: 4790.2). Total num frames: 13332480. Throughput: 0: 1211.7. Samples: 2332192. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:27:28,380][00219] Avg episode reward: [(0, '25.296')] [2023-02-25 20:27:29,889][32866] Updated weights for policy 0, policy_version 3258 (0.0011) [2023-02-25 20:27:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.3, 300 sec: 4762.5). Total num frames: 13357056. Throughput: 0: 1209.2. Samples: 2336528. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:27:33,381][00219] Avg episode reward: [(0, '26.122')] [2023-02-25 20:27:38,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4847.0, 300 sec: 4748.6). Total num frames: 13377536. Throughput: 0: 1196.1. Samples: 2342384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:27:38,383][00219] Avg episode reward: [(0, '25.988')] [2023-02-25 20:27:40,122][32866] Updated weights for policy 0, policy_version 3268 (0.0011) [2023-02-25 20:27:43,379][00219] Fps is (10 sec: 4504.7, 60 sec: 4778.5, 300 sec: 4762.5). Total num frames: 13402112. Throughput: 0: 1203.2. Samples: 2348384. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:27:43,381][00219] Avg episode reward: [(0, '24.931')] [2023-02-25 20:27:43,389][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003272_13402112.pth... [2023-02-25 20:27:43,505][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002995_12267520.pth [2023-02-25 20:27:47,422][32866] Updated weights for policy 0, policy_version 3278 (0.0011) [2023-02-25 20:27:48,377][00219] Fps is (10 sec: 4915.4, 60 sec: 4778.7, 300 sec: 4790.2). Total num frames: 13426688. Throughput: 0: 1213.5. Samples: 2352816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:27:48,382][00219] Avg episode reward: [(0, '25.552')] [2023-02-25 20:27:53,377][00219] Fps is (10 sec: 5325.8, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 13455360. Throughput: 0: 1214.6. Samples: 2361936. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:27:53,378][00219] Avg episode reward: [(0, '27.466')] [2023-02-25 20:27:55,210][32866] Updated weights for policy 0, policy_version 3288 (0.0012) [2023-02-25 20:27:58,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 13475840. Throughput: 0: 1197.2. Samples: 2367968. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:27:58,380][00219] Avg episode reward: [(0, '27.318')] [2023-02-25 20:28:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 13496320. Throughput: 0: 1200.7. Samples: 2370912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:28:03,378][00219] Avg episode reward: [(0, '26.813')] [2023-02-25 20:28:05,415][32866] Updated weights for policy 0, policy_version 3298 (0.0012) [2023-02-25 20:28:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4790.2). Total num frames: 13524992. Throughput: 0: 1220.3. Samples: 2378080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:28:08,385][00219] Avg episode reward: [(0, '27.416')] [2023-02-25 20:28:12,289][32866] Updated weights for policy 0, policy_version 3308 (0.0016) [2023-02-25 20:28:13,377][00219] Fps is (10 sec: 5734.2, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 13553664. Throughput: 0: 1222.4. Samples: 2387200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:28:13,379][00219] Avg episode reward: [(0, '27.726')] [2023-02-25 20:28:18,378][00219] Fps is (10 sec: 4914.6, 60 sec: 4915.1, 300 sec: 4762.4). Total num frames: 13574144. Throughput: 0: 1207.1. Samples: 2390848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:28:18,380][00219] Avg episode reward: [(0, '27.629')] [2023-02-25 20:28:22,246][32866] Updated weights for policy 0, policy_version 3318 (0.0012) [2023-02-25 20:28:23,381][00219] Fps is (10 sec: 4094.5, 60 sec: 4846.6, 300 sec: 4748.5). Total num frames: 13594624. Throughput: 0: 1201.3. 
Samples: 2396448. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-25 20:28:23,383][00219] Avg episode reward: [(0, '27.759')] [2023-02-25 20:28:28,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 13619200. Throughput: 0: 1210.0. Samples: 2402832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:28:28,379][00219] Avg episode reward: [(0, '27.907')] [2023-02-25 20:28:30,863][32866] Updated weights for policy 0, policy_version 3328 (0.0012) [2023-02-25 20:28:33,377][00219] Fps is (10 sec: 4917.2, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 13643776. Throughput: 0: 1208.5. Samples: 2407200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:28:33,379][00219] Avg episode reward: [(0, '29.053')] [2023-02-25 20:28:38,127][32866] Updated weights for policy 0, policy_version 3338 (0.0012) [2023-02-25 20:28:38,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4915.1, 300 sec: 4762.5). Total num frames: 13672448. Throughput: 0: 1196.4. Samples: 2415776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:28:38,386][00219] Avg episode reward: [(0, '28.048')] [2023-02-25 20:28:43,378][00219] Fps is (10 sec: 4914.6, 60 sec: 4847.0, 300 sec: 4748.7). Total num frames: 13692928. Throughput: 0: 1192.1. Samples: 2421616. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:28:43,386][00219] Avg episode reward: [(0, '28.034')] [2023-02-25 20:28:48,377][00219] Fps is (10 sec: 3686.8, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 13709312. Throughput: 0: 1188.3. Samples: 2424384. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:28:48,379][00219] Avg episode reward: [(0, '27.590')] [2023-02-25 20:28:48,962][32866] Updated weights for policy 0, policy_version 3348 (0.0012) [2023-02-25 20:28:53,377][00219] Fps is (10 sec: 4506.2, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 13737984. Throughput: 0: 1196.1. Samples: 2431904. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:28:53,379][00219] Avg episode reward: [(0, '27.337')] [2023-02-25 20:28:55,425][32866] Updated weights for policy 0, policy_version 3358 (0.0011) [2023-02-25 20:28:58,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 13766656. Throughput: 0: 1193.6. Samples: 2440912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:28:58,379][00219] Avg episode reward: [(0, '27.036')] [2023-02-25 20:29:03,378][00219] Fps is (10 sec: 5324.1, 60 sec: 4915.1, 300 sec: 4734.7). Total num frames: 13791232. Throughput: 0: 1185.1. Samples: 2444176. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:29:03,381][00219] Avg episode reward: [(0, '29.250')] [2023-02-25 20:29:04,848][32866] Updated weights for policy 0, policy_version 3368 (0.0012) [2023-02-25 20:29:08,377][00219] Fps is (10 sec: 4095.7, 60 sec: 4710.3, 300 sec: 4734.7). Total num frames: 13807616. Throughput: 0: 1190.5. Samples: 2450016. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:29:08,382][00219] Avg episode reward: [(0, '29.167')] [2023-02-25 20:29:13,312][32866] Updated weights for policy 0, policy_version 3378 (0.0012) [2023-02-25 20:29:13,377][00219] Fps is (10 sec: 4506.2, 60 sec: 4710.4, 300 sec: 4748.6). Total num frames: 13836288. Throughput: 0: 1205.3. Samples: 2457072. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:29:13,379][00219] Avg episode reward: [(0, '31.161')] [2023-02-25 20:29:18,379][00219] Fps is (10 sec: 5324.1, 60 sec: 4778.6, 300 sec: 4762.4). Total num frames: 13860864. Throughput: 0: 1208.5. Samples: 2461584. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:29:18,381][00219] Avg episode reward: [(0, '31.568')] [2023-02-25 20:29:20,000][32866] Updated weights for policy 0, policy_version 3388 (0.0012) [2023-02-25 20:29:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.3, 300 sec: 4748.6). Total num frames: 13885440. Throughput: 0: 1201.8. 
Samples: 2469856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:29:23,380][00219] Avg episode reward: [(0, '30.704')] [2023-02-25 20:29:28,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4778.7, 300 sec: 4720.9). Total num frames: 13905920. Throughput: 0: 1200.0. Samples: 2475616. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:29:28,383][00219] Avg episode reward: [(0, '30.284')] [2023-02-25 20:29:31,254][32866] Updated weights for policy 0, policy_version 3398 (0.0013) [2023-02-25 20:29:33,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4778.6, 300 sec: 4734.7). Total num frames: 13930496. Throughput: 0: 1202.5. Samples: 2478496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:29:33,382][00219] Avg episode reward: [(0, '29.277')] [2023-02-25 20:29:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.5, 300 sec: 4762.5). Total num frames: 13955072. Throughput: 0: 1218.5. Samples: 2486736. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:29:38,383][00219] Avg episode reward: [(0, '29.192')] [2023-02-25 20:29:38,405][32866] Updated weights for policy 0, policy_version 3408 (0.0012) [2023-02-25 20:29:43,380][00219] Fps is (10 sec: 4504.3, 60 sec: 4710.2, 300 sec: 4720.8). Total num frames: 13975552. Throughput: 0: 1155.8. Samples: 2492928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:29:43,383][00219] Avg episode reward: [(0, '28.380')] [2023-02-25 20:29:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003412_13975552.pth... [2023-02-25 20:29:43,580][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003133_12832768.pth [2023-02-25 20:29:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4679.2). Total num frames: 13991936. Throughput: 0: 1133.2. Samples: 2495168. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:29:48,378][00219] Avg episode reward: [(0, '29.650')] [2023-02-25 20:29:50,363][32866] Updated weights for policy 0, policy_version 3418 (0.0012) [2023-02-25 20:29:53,377][00219] Fps is (10 sec: 3277.7, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 14008320. Throughput: 0: 1107.2. Samples: 2499840. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:29:53,382][00219] Avg episode reward: [(0, '29.471')] [2023-02-25 20:29:58,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4300.8, 300 sec: 4651.4). Total num frames: 14024704. Throughput: 0: 1066.0. Samples: 2505040. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:29:58,379][00219] Avg episode reward: [(0, '29.758')] [2023-02-25 20:30:00,535][32866] Updated weights for policy 0, policy_version 3428 (0.0012) [2023-02-25 20:30:03,377][00219] Fps is (10 sec: 4915.5, 60 sec: 4437.4, 300 sec: 4679.2). Total num frames: 14057472. Throughput: 0: 1061.7. Samples: 2509360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:30:03,379][00219] Avg episode reward: [(0, '28.724')] [2023-02-25 20:30:07,209][32866] Updated weights for policy 0, policy_version 3438 (0.0011) [2023-02-25 20:30:08,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4573.9, 300 sec: 4665.3). Total num frames: 14082048. Throughput: 0: 1081.2. Samples: 2518512. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:30:08,384][00219] Avg episode reward: [(0, '29.138')] [2023-02-25 20:30:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.3, 300 sec: 4665.3). Total num frames: 14102528. Throughput: 0: 1095.8. Samples: 2524928. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:30:13,380][00219] Avg episode reward: [(0, '29.019')] [2023-02-25 20:30:17,594][32866] Updated weights for policy 0, policy_version 3448 (0.0012) [2023-02-25 20:30:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4369.2, 300 sec: 4651.5). Total num frames: 14123008. Throughput: 0: 1096.2. 
Samples: 2527824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:30:18,380][00219] Avg episode reward: [(0, '28.433')] [2023-02-25 20:30:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4437.3, 300 sec: 4679.2). Total num frames: 14151680. Throughput: 0: 1064.5. Samples: 2534640. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:30:23,379][00219] Avg episode reward: [(0, '29.342')] [2023-02-25 20:30:25,223][32866] Updated weights for policy 0, policy_version 3458 (0.0012) [2023-02-25 20:30:28,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4573.9, 300 sec: 4679.2). Total num frames: 14180352. Throughput: 0: 1130.4. Samples: 2543792. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:30:28,379][00219] Avg episode reward: [(0, '31.966')] [2023-02-25 20:30:33,024][32866] Updated weights for policy 0, policy_version 3468 (0.0012) [2023-02-25 20:30:33,379][00219] Fps is (10 sec: 5323.7, 60 sec: 4573.7, 300 sec: 4706.9). Total num frames: 14204928. Throughput: 0: 1169.0. Samples: 2547776. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:30:33,383][00219] Avg episode reward: [(0, '32.146')] [2023-02-25 20:30:38,379][00219] Fps is (10 sec: 4095.1, 60 sec: 4437.2, 300 sec: 4706.9). Total num frames: 14221312. Throughput: 0: 1195.3. Samples: 2553632. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:30:38,382][00219] Avg episode reward: [(0, '32.888')] [2023-02-25 20:30:38,388][32858] Saving new best policy, reward=32.888! [2023-02-25 20:30:42,865][32866] Updated weights for policy 0, policy_version 3478 (0.0013) [2023-02-25 20:30:43,377][00219] Fps is (10 sec: 4096.8, 60 sec: 4505.8, 300 sec: 4734.7). Total num frames: 14245888. Throughput: 0: 1219.9. Samples: 2559936. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:30:43,381][00219] Avg episode reward: [(0, '32.968')] [2023-02-25 20:30:43,398][32858] Saving new best policy, reward=32.968! 
[2023-02-25 20:30:48,377][00219] Fps is (10 sec: 5326.0, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 14274560. Throughput: 0: 1221.7. Samples: 2564336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:30:48,379][00219] Avg episode reward: [(0, '30.024')] [2023-02-25 20:30:50,369][32866] Updated weights for policy 0, policy_version 3488 (0.0012) [2023-02-25 20:30:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.0, 300 sec: 4762.5). Total num frames: 14299136. Throughput: 0: 1216.0. Samples: 2573232. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:30:53,382][00219] Avg episode reward: [(0, '28.763')] [2023-02-25 20:30:58,380][00219] Fps is (10 sec: 4504.2, 60 sec: 4914.9, 300 sec: 4720.8). Total num frames: 14319616. Throughput: 0: 1202.8. Samples: 2579056. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:30:58,382][00219] Avg episode reward: [(0, '28.277')] [2023-02-25 20:30:59,993][32866] Updated weights for policy 0, policy_version 3498 (0.0012) [2023-02-25 20:31:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 14340096. Throughput: 0: 1202.1. Samples: 2581920. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:31:03,385][00219] Avg episode reward: [(0, '27.515')] [2023-02-25 20:31:08,051][32866] Updated weights for policy 0, policy_version 3508 (0.0013) [2023-02-25 20:31:08,377][00219] Fps is (10 sec: 4916.8, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 14368768. Throughput: 0: 1214.6. Samples: 2589296. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:31:08,382][00219] Avg episode reward: [(0, '26.752')] [2023-02-25 20:31:13,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4790.2). Total num frames: 14401536. Throughput: 0: 1212.4. Samples: 2598352. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:31:13,379][00219] Avg episode reward: [(0, '27.111')] [2023-02-25 20:31:15,137][32866] Updated weights for policy 0, policy_version 3518 (0.0011) [2023-02-25 20:31:18,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4915.2, 300 sec: 4734.7). Total num frames: 14417920. Throughput: 0: 1196.5. Samples: 2601616. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:31:18,379][00219] Avg episode reward: [(0, '28.001')] [2023-02-25 20:31:23,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 14438400. Throughput: 0: 1191.9. Samples: 2607264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:31:23,382][00219] Avg episode reward: [(0, '28.466')] [2023-02-25 20:31:26,318][32866] Updated weights for policy 0, policy_version 3528 (0.0012) [2023-02-25 20:31:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4734.8). Total num frames: 14462976. Throughput: 0: 1211.4. Samples: 2614448. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:31:28,384][00219] Avg episode reward: [(0, '29.838')] [2023-02-25 20:31:32,882][32866] Updated weights for policy 0, policy_version 3538 (0.0011) [2023-02-25 20:31:33,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4847.1, 300 sec: 4776.4). Total num frames: 14495744. Throughput: 0: 1216.0. Samples: 2619056. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:31:33,382][00219] Avg episode reward: [(0, '30.416')] [2023-02-25 20:31:38,378][00219] Fps is (10 sec: 5733.9, 60 sec: 4983.6, 300 sec: 4762.5). Total num frames: 14520320. Throughput: 0: 1207.1. Samples: 2627552. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:31:38,380][00219] Avg episode reward: [(0, '31.128')] [2023-02-25 20:31:41,990][32866] Updated weights for policy 0, policy_version 3548 (0.0025) [2023-02-25 20:31:43,380][00219] Fps is (10 sec: 4094.7, 60 sec: 4846.7, 300 sec: 4734.6). Total num frames: 14536704. Throughput: 0: 1207.1. 
Samples: 2633376. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:31:43,382][00219] Avg episode reward: [(0, '31.487')] [2023-02-25 20:31:43,401][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003549_14536704.pth... [2023-02-25 20:31:43,545][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003272_13402112.pth [2023-02-25 20:31:48,377][00219] Fps is (10 sec: 4096.4, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 14561280. Throughput: 0: 1209.2. Samples: 2636336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:31:48,387][00219] Avg episode reward: [(0, '31.776')] [2023-02-25 20:31:50,811][32866] Updated weights for policy 0, policy_version 3558 (0.0011) [2023-02-25 20:31:53,379][00219] Fps is (10 sec: 5324.9, 60 sec: 4846.7, 300 sec: 4776.3). Total num frames: 14589952. Throughput: 0: 1222.0. Samples: 2644288. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:31:53,390][00219] Avg episode reward: [(0, '29.707')] [2023-02-25 20:31:57,557][32866] Updated weights for policy 0, policy_version 3568 (0.0012) [2023-02-25 20:31:58,383][00219] Fps is (10 sec: 5730.9, 60 sec: 4983.2, 300 sec: 4776.3). Total num frames: 14618624. Throughput: 0: 1222.6. Samples: 2653376. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:31:58,390][00219] Avg episode reward: [(0, '28.053')] [2023-02-25 20:32:03,377][00219] Fps is (10 sec: 4506.9, 60 sec: 4915.2, 300 sec: 4734.7). Total num frames: 14635008. Throughput: 0: 1218.1. Samples: 2656432. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:32:03,381][00219] Avg episode reward: [(0, '27.870')] [2023-02-25 20:32:07,611][32866] Updated weights for policy 0, policy_version 3578 (0.0011) [2023-02-25 20:32:08,377][00219] Fps is (10 sec: 4098.6, 60 sec: 4846.9, 300 sec: 4748.6). Total num frames: 14659584. Throughput: 0: 1222.0. Samples: 2662256. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:32:08,384][00219] Avg episode reward: [(0, '28.107')] [2023-02-25 20:32:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 14684160. Throughput: 0: 1227.7. Samples: 2669696. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:32:13,379][00219] Avg episode reward: [(0, '27.785')] [2023-02-25 20:32:15,138][32866] Updated weights for policy 0, policy_version 3588 (0.0019) [2023-02-25 20:32:18,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4790.2). Total num frames: 14716928. Throughput: 0: 1224.5. Samples: 2674160. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:32:18,381][00219] Avg episode reward: [(0, '28.984')] [2023-02-25 20:32:23,049][32866] Updated weights for policy 0, policy_version 3598 (0.0014) [2023-02-25 20:32:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 14737408. Throughput: 0: 1215.0. Samples: 2682224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:32:23,382][00219] Avg episode reward: [(0, '29.985')] [2023-02-25 20:32:28,384][00219] Fps is (10 sec: 3683.8, 60 sec: 4846.4, 300 sec: 4734.6). Total num frames: 14753792. Throughput: 0: 1209.5. Samples: 2687808. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:32:28,392][00219] Avg episode reward: [(0, '31.603')] [2023-02-25 20:32:33,291][32866] Updated weights for policy 0, policy_version 3608 (0.0015) [2023-02-25 20:32:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4748.6). Total num frames: 14778368. Throughput: 0: 1203.9. Samples: 2690512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:32:33,379][00219] Avg episode reward: [(0, '31.506')] [2023-02-25 20:32:38,377][00219] Fps is (10 sec: 5328.6, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 14807040. Throughput: 0: 1204.7. Samples: 2698496. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:32:38,379][00219] Avg episode reward: [(0, '30.146')] [2023-02-25 20:32:40,398][32866] Updated weights for policy 0, policy_version 3618 (0.0012) [2023-02-25 20:32:43,377][00219] Fps is (10 sec: 5324.4, 60 sec: 4915.4, 300 sec: 4762.5). Total num frames: 14831616. Throughput: 0: 1185.2. Samples: 2706704. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:32:43,382][00219] Avg episode reward: [(0, '30.795')] [2023-02-25 20:32:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4720.8). Total num frames: 14848000. Throughput: 0: 1175.5. Samples: 2709328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:32:48,382][00219] Avg episode reward: [(0, '30.929')] [2023-02-25 20:32:51,032][32866] Updated weights for policy 0, policy_version 3628 (0.0012) [2023-02-25 20:32:53,377][00219] Fps is (10 sec: 3276.9, 60 sec: 4574.0, 300 sec: 4706.9). Total num frames: 14864384. Throughput: 0: 1171.5. Samples: 2714976. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:32:53,383][00219] Avg episode reward: [(0, '30.039')] [2023-02-25 20:32:58,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.6, 300 sec: 4748.6). Total num frames: 14897152. Throughput: 0: 1178.3. Samples: 2722720. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-02-25 20:32:58,378][00219] Avg episode reward: [(0, '29.478')] [2023-02-25 20:32:59,139][32866] Updated weights for policy 0, policy_version 3638 (0.0012) [2023-02-25 20:33:03,377][00219] Fps is (10 sec: 6144.3, 60 sec: 4846.9, 300 sec: 4748.6). Total num frames: 14925824. Throughput: 0: 1180.8. Samples: 2727296. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:33:03,388][00219] Avg episode reward: [(0, '27.850')] [2023-02-25 20:33:06,271][32866] Updated weights for policy 0, policy_version 3648 (0.0040) [2023-02-25 20:33:08,378][00219] Fps is (10 sec: 4914.7, 60 sec: 4778.6, 300 sec: 4720.8). Total num frames: 14946304. Throughput: 0: 1162.6. 
Samples: 2734544. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:33:08,382][00219] Avg episode reward: [(0, '28.884')] [2023-02-25 20:33:13,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4710.4, 300 sec: 4720.8). Total num frames: 14966784. Throughput: 0: 1167.5. Samples: 2740336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:33:13,382][00219] Avg episode reward: [(0, '28.675')] [2023-02-25 20:33:16,465][32866] Updated weights for policy 0, policy_version 3658 (0.0013) [2023-02-25 20:33:18,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4573.9, 300 sec: 4734.8). Total num frames: 14991360. Throughput: 0: 1175.8. Samples: 2743424. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:33:18,379][00219] Avg episode reward: [(0, '29.969')] [2023-02-25 20:33:23,335][32866] Updated weights for policy 0, policy_version 3668 (0.0012) [2023-02-25 20:33:23,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 15024128. Throughput: 0: 1197.5. Samples: 2752384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:33:23,382][00219] Avg episode reward: [(0, '30.732')] [2023-02-25 20:33:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.5, 300 sec: 4748.6). Total num frames: 15044608. Throughput: 0: 1186.5. Samples: 2760096. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:33:28,379][00219] Avg episode reward: [(0, '31.776')] [2023-02-25 20:33:33,050][32866] Updated weights for policy 0, policy_version 3678 (0.0011) [2023-02-25 20:33:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4720.8). Total num frames: 15065088. Throughput: 0: 1190.4. Samples: 2762896. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-25 20:33:33,385][00219] Avg episode reward: [(0, '32.776')] [2023-02-25 20:33:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4720.8). Total num frames: 15085568. Throughput: 0: 1189.3. Samples: 2768496. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:33:38,379][00219] Avg episode reward: [(0, '31.398')] [2023-02-25 20:33:42,948][32866] Updated weights for policy 0, policy_version 3688 (0.0012) [2023-02-25 20:33:43,378][00219] Fps is (10 sec: 4505.2, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 15110144. Throughput: 0: 1180.4. Samples: 2775840. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:33:43,386][00219] Avg episode reward: [(0, '31.699')] [2023-02-25 20:33:43,400][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003689_15110144.pth... [2023-02-25 20:33:43,569][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003412_13975552.pth [2023-02-25 20:33:48,379][00219] Fps is (10 sec: 4095.1, 60 sec: 4642.0, 300 sec: 4706.9). Total num frames: 15126528. Throughput: 0: 1138.4. Samples: 2778528. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:33:48,384][00219] Avg episode reward: [(0, '32.193')] [2023-02-25 20:33:53,377][00219] Fps is (10 sec: 2867.4, 60 sec: 4573.9, 300 sec: 4651.4). Total num frames: 15138816. Throughput: 0: 1088.7. Samples: 2783536. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:33:53,381][00219] Avg episode reward: [(0, '32.873')] [2023-02-25 20:33:54,531][32866] Updated weights for policy 0, policy_version 3698 (0.0015) [2023-02-25 20:33:58,377][00219] Fps is (10 sec: 3277.5, 60 sec: 4369.1, 300 sec: 4637.5). Total num frames: 15159296. Throughput: 0: 1079.1. Samples: 2788896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:33:58,381][00219] Avg episode reward: [(0, '32.662')] [2023-02-25 20:34:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4300.8, 300 sec: 4665.3). Total num frames: 15183872. Throughput: 0: 1072.7. Samples: 2791696. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:34:03,389][00219] Avg episode reward: [(0, '32.826')] [2023-02-25 20:34:04,000][32866] Updated weights for policy 0, policy_version 3708 (0.0012) [2023-02-25 20:34:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4437.4, 300 sec: 4665.3). Total num frames: 15212544. Throughput: 0: 1068.1. Samples: 2800448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:34:08,384][00219] Avg episode reward: [(0, '33.358')] [2023-02-25 20:34:08,390][32858] Saving new best policy, reward=33.358! [2023-02-25 20:34:10,632][32866] Updated weights for policy 0, policy_version 3718 (0.0012) [2023-02-25 20:34:13,379][00219] Fps is (10 sec: 5732.8, 60 sec: 4573.7, 300 sec: 4679.1). Total num frames: 15241216. Throughput: 0: 1081.2. Samples: 2808752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:34:13,386][00219] Avg episode reward: [(0, '31.091')] [2023-02-25 20:34:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.3, 300 sec: 4651.4). Total num frames: 15257600. Throughput: 0: 1082.3. Samples: 2811600. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:34:18,380][00219] Avg episode reward: [(0, '29.150')] [2023-02-25 20:34:20,899][32866] Updated weights for policy 0, policy_version 3728 (0.0011) [2023-02-25 20:34:23,377][00219] Fps is (10 sec: 3687.4, 60 sec: 4232.5, 300 sec: 4651.4). Total num frames: 15278080. Throughput: 0: 1087.3. Samples: 2817424. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:34:23,384][00219] Avg episode reward: [(0, '29.246')] [2023-02-25 20:34:28,198][32866] Updated weights for policy 0, policy_version 3738 (0.0012) [2023-02-25 20:34:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4437.3, 300 sec: 4679.2). Total num frames: 15310848. Throughput: 0: 1107.6. Samples: 2825680. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:34:28,384][00219] Avg episode reward: [(0, '28.983')] [2023-02-25 20:34:33,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4573.9, 300 sec: 4693.0). Total num frames: 15339520. Throughput: 0: 1149.6. Samples: 2830256. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:34:33,379][00219] Avg episode reward: [(0, '27.130')] [2023-02-25 20:34:36,637][32866] Updated weights for policy 0, policy_version 3748 (0.0011) [2023-02-25 20:34:38,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 15355904. Throughput: 0: 1197.9. Samples: 2837440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:34:38,379][00219] Avg episode reward: [(0, '28.916')] [2023-02-25 20:34:43,380][00219] Fps is (10 sec: 3685.3, 60 sec: 4437.2, 300 sec: 4693.0). Total num frames: 15376384. Throughput: 0: 1208.1. Samples: 2843264. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:34:43,383][00219] Avg episode reward: [(0, '28.883')] [2023-02-25 20:34:46,237][32866] Updated weights for policy 0, policy_version 3758 (0.0012) [2023-02-25 20:34:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.3, 300 sec: 4734.7). Total num frames: 15405056. Throughput: 0: 1217.8. Samples: 2846496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:34:48,378][00219] Avg episode reward: [(0, '27.897')] [2023-02-25 20:34:52,954][32866] Updated weights for policy 0, policy_version 3768 (0.0012) [2023-02-25 20:34:53,377][00219] Fps is (10 sec: 5736.1, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 15433728. Throughput: 0: 1229.5. Samples: 2855776. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:34:53,382][00219] Avg episode reward: [(0, '27.343')] [2023-02-25 20:34:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4748.6). Total num frames: 15458304. Throughput: 0: 1212.2. Samples: 2863296. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:34:58,384][00219] Avg episode reward: [(0, '27.268')] [2023-02-25 20:35:02,002][32866] Updated weights for policy 0, policy_version 3778 (0.0011) [2023-02-25 20:35:03,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4846.9, 300 sec: 4720.8). Total num frames: 15474688. Throughput: 0: 1213.1. Samples: 2866192. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:35:03,383][00219] Avg episode reward: [(0, '28.543')] [2023-02-25 20:35:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 15499264. Throughput: 0: 1210.3. Samples: 2871888. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:35:08,380][00219] Avg episode reward: [(0, '29.210')] [2023-02-25 20:35:10,607][32866] Updated weights for policy 0, policy_version 3788 (0.0013) [2023-02-25 20:35:13,377][00219] Fps is (10 sec: 5735.0, 60 sec: 4847.1, 300 sec: 4776.4). Total num frames: 15532032. Throughput: 0: 1229.9. Samples: 2881024. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:35:13,379][00219] Avg episode reward: [(0, '31.366')] [2023-02-25 20:35:17,706][32866] Updated weights for policy 0, policy_version 3798 (0.0012) [2023-02-25 20:35:18,378][00219] Fps is (10 sec: 5733.9, 60 sec: 4983.4, 300 sec: 4762.5). Total num frames: 15556608. Throughput: 0: 1229.5. Samples: 2885584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:35:18,386][00219] Avg episode reward: [(0, '31.398')] [2023-02-25 20:35:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4734.7). Total num frames: 15577088. Throughput: 0: 1213.9. Samples: 2892064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:35:23,378][00219] Avg episode reward: [(0, '31.649')] [2023-02-25 20:35:28,377][00219] Fps is (10 sec: 3686.6, 60 sec: 4710.4, 300 sec: 4707.0). Total num frames: 15593472. Throughput: 0: 1211.1. Samples: 2897760. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:35:28,380][00219] Avg episode reward: [(0, '31.195')] [2023-02-25 20:35:28,502][32866] Updated weights for policy 0, policy_version 3808 (0.0019) [2023-02-25 20:35:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4748.6). Total num frames: 15622144. Throughput: 0: 1226.3. Samples: 2901680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:35:33,387][00219] Avg episode reward: [(0, '31.500')] [2023-02-25 20:35:35,149][32866] Updated weights for policy 0, policy_version 3818 (0.0012) [2023-02-25 20:35:38,377][00219] Fps is (10 sec: 6144.2, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 15654912. Throughput: 0: 1221.3. Samples: 2910736. Policy #0 lag: (min: 1.0, avg: 1.3, max: 3.0) [2023-02-25 20:35:38,386][00219] Avg episode reward: [(0, '30.762')] [2023-02-25 20:35:43,385][00219] Fps is (10 sec: 5320.5, 60 sec: 4983.1, 300 sec: 4748.5). Total num frames: 15675392. Throughput: 0: 1215.8. Samples: 2918016. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:35:43,389][00219] Avg episode reward: [(0, '31.072')] [2023-02-25 20:35:43,401][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003827_15675392.pth... [2023-02-25 20:35:43,561][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003549_14536704.pth [2023-02-25 20:35:43,641][32866] Updated weights for policy 0, policy_version 3828 (0.0011) [2023-02-25 20:35:48,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4846.9, 300 sec: 4734.7). Total num frames: 15695872. Throughput: 0: 1214.6. Samples: 2920848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:35:48,380][00219] Avg episode reward: [(0, '30.299')] [2023-02-25 20:35:53,116][32866] Updated weights for policy 0, policy_version 3838 (0.0012) [2023-02-25 20:35:53,377][00219] Fps is (10 sec: 4509.2, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 15720448. Throughput: 0: 1224.9. Samples: 2927008. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:35:53,379][00219] Avg episode reward: [(0, '30.214')] [2023-02-25 20:35:58,377][00219] Fps is (10 sec: 5325.3, 60 sec: 4846.9, 300 sec: 4776.4). Total num frames: 15749120. Throughput: 0: 1224.9. Samples: 2936144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:35:58,379][00219] Avg episode reward: [(0, '30.921')] [2023-02-25 20:36:00,190][32866] Updated weights for policy 0, policy_version 3848 (0.0012) [2023-02-25 20:36:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 15773696. Throughput: 0: 1224.9. Samples: 2940704. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:36:03,381][00219] Avg episode reward: [(0, '31.723')] [2023-02-25 20:36:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4720.8). Total num frames: 15794176. Throughput: 0: 1217.4. Samples: 2946848. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:36:08,378][00219] Avg episode reward: [(0, '31.910')] [2023-02-25 20:36:09,593][32866] Updated weights for policy 0, policy_version 3858 (0.0011) [2023-02-25 20:36:13,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4734.7). Total num frames: 15814656. Throughput: 0: 1218.1. Samples: 2952576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:36:13,379][00219] Avg episode reward: [(0, '31.893')] [2023-02-25 20:36:17,760][32866] Updated weights for policy 0, policy_version 3868 (0.0013) [2023-02-25 20:36:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.0, 300 sec: 4776.4). Total num frames: 15847424. Throughput: 0: 1231.6. Samples: 2957104. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:36:18,381][00219] Avg episode reward: [(0, '32.652')] [2023-02-25 20:36:23,379][00219] Fps is (10 sec: 6142.7, 60 sec: 4983.3, 300 sec: 4790.2). Total num frames: 15876096. Throughput: 0: 1234.4. Samples: 2966288. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:36:23,381][00219] Avg episode reward: [(0, '30.906')] [2023-02-25 20:36:24,851][32866] Updated weights for policy 0, policy_version 3878 (0.0011) [2023-02-25 20:36:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4734.7). Total num frames: 15892480. Throughput: 0: 1213.4. Samples: 2972608. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:36:28,381][00219] Avg episode reward: [(0, '29.540')] [2023-02-25 20:36:33,377][00219] Fps is (10 sec: 4096.7, 60 sec: 4915.2, 300 sec: 4734.7). Total num frames: 15917056. Throughput: 0: 1216.0. Samples: 2975568. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:36:33,380][00219] Avg episode reward: [(0, '29.160')] [2023-02-25 20:36:35,027][32866] Updated weights for policy 0, policy_version 3888 (0.0011) [2023-02-25 20:36:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 15941632. Throughput: 0: 1235.2. Samples: 2982592. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:36:38,386][00219] Avg episode reward: [(0, '30.262')] [2023-02-25 20:36:41,821][32866] Updated weights for policy 0, policy_version 3898 (0.0012) [2023-02-25 20:36:43,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4984.1, 300 sec: 4790.2). Total num frames: 15974400. Throughput: 0: 1232.7. Samples: 2991616. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:36:43,379][00219] Avg episode reward: [(0, '31.635')] [2023-02-25 20:36:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4762.5). Total num frames: 15994880. Throughput: 0: 1217.1. Samples: 2995472. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:36:48,379][00219] Avg episode reward: [(0, '30.463')] [2023-02-25 20:36:50,933][32866] Updated weights for policy 0, policy_version 3908 (0.0011) [2023-02-25 20:36:53,380][00219] Fps is (10 sec: 4094.7, 60 sec: 4914.9, 300 sec: 4734.7). Total num frames: 16015360. Throughput: 0: 1209.2. 
Samples: 3001264. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:36:53,382][00219] Avg episode reward: [(0, '31.112')] [2023-02-25 20:36:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 16035840. Throughput: 0: 1227.0. Samples: 3007792. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:36:58,387][00219] Avg episode reward: [(0, '31.483')] [2023-02-25 20:36:59,682][32866] Updated weights for policy 0, policy_version 3918 (0.0012) [2023-02-25 20:37:03,377][00219] Fps is (10 sec: 4916.7, 60 sec: 4846.9, 300 sec: 4762.5). Total num frames: 16064512. Throughput: 0: 1229.2. Samples: 3012416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:37:03,379][00219] Avg episode reward: [(0, '28.134')] [2023-02-25 20:37:06,632][32866] Updated weights for policy 0, policy_version 3928 (0.0012) [2023-02-25 20:37:08,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 16093184. Throughput: 0: 1224.2. Samples: 3021376. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:37:08,379][00219] Avg episode reward: [(0, '27.826')] [2023-02-25 20:37:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4734.7). Total num frames: 16113664. Throughput: 0: 1217.1. Samples: 3027376. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:37:13,380][00219] Avg episode reward: [(0, '28.204')] [2023-02-25 20:37:16,665][32866] Updated weights for policy 0, policy_version 3938 (0.0012) [2023-02-25 20:37:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4734.7). Total num frames: 16134144. Throughput: 0: 1217.8. Samples: 3030368. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:37:18,382][00219] Avg episode reward: [(0, '28.374')] [2023-02-25 20:37:23,377][00219] Fps is (10 sec: 4915.0, 60 sec: 4778.8, 300 sec: 4776.5). Total num frames: 16162816. Throughput: 0: 1229.5. Samples: 3037920. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:37:23,379][00219] Avg episode reward: [(0, '29.089')] [2023-02-25 20:37:24,014][32866] Updated weights for policy 0, policy_version 3948 (0.0013) [2023-02-25 20:37:28,377][00219] Fps is (10 sec: 6143.8, 60 sec: 5051.7, 300 sec: 4804.1). Total num frames: 16195584. Throughput: 0: 1233.1. Samples: 3047104. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:37:28,379][00219] Avg episode reward: [(0, '31.338')] [2023-02-25 20:37:32,306][32866] Updated weights for policy 0, policy_version 3958 (0.0012) [2023-02-25 20:37:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4762.5). Total num frames: 16211968. Throughput: 0: 1221.7. Samples: 3050448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:37:33,380][00219] Avg episode reward: [(0, '31.011')] [2023-02-25 20:37:38,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4846.9, 300 sec: 4748.6). Total num frames: 16232448. Throughput: 0: 1221.8. Samples: 3056240. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:37:38,379][00219] Avg episode reward: [(0, '31.945')] [2023-02-25 20:37:43,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4573.9, 300 sec: 4748.6). Total num frames: 16248832. Throughput: 0: 1191.8. Samples: 3061424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:37:43,385][00219] Avg episode reward: [(0, '32.306')] [2023-02-25 20:37:43,400][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003967_16248832.pth... [2023-02-25 20:37:43,576][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003689_15110144.pth [2023-02-25 20:37:43,829][32866] Updated weights for policy 0, policy_version 3968 (0.0012) [2023-02-25 20:37:48,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4505.6, 300 sec: 4748.6). Total num frames: 16265216. Throughput: 0: 1153.4. Samples: 3064320. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:37:48,388][00219] Avg episode reward: [(0, '32.873')] [2023-02-25 20:37:53,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4574.1, 300 sec: 4720.8). Total num frames: 16289792. Throughput: 0: 1084.8. Samples: 3070192. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:37:53,381][00219] Avg episode reward: [(0, '33.672')] [2023-02-25 20:37:53,398][32858] Saving new best policy, reward=33.672! [2023-02-25 20:37:54,340][32866] Updated weights for policy 0, policy_version 3978 (0.0012) [2023-02-25 20:37:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4573.9, 300 sec: 4693.0). Total num frames: 16310272. Throughput: 0: 1083.0. Samples: 3076112. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:37:58,379][00219] Avg episode reward: [(0, '34.061')] [2023-02-25 20:37:58,381][32858] Saving new best policy, reward=34.061! [2023-02-25 20:38:03,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.1, 300 sec: 4679.2). Total num frames: 16326656. Throughput: 0: 1077.7. Samples: 3078864. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:38:03,379][00219] Avg episode reward: [(0, '34.104')] [2023-02-25 20:38:03,395][32858] Saving new best policy, reward=34.104! [2023-02-25 20:38:04,441][32866] Updated weights for policy 0, policy_version 3988 (0.0011) [2023-02-25 20:38:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4369.1, 300 sec: 4706.9). Total num frames: 16355328. Throughput: 0: 1066.0. Samples: 3085888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:38:08,379][00219] Avg episode reward: [(0, '33.855')] [2023-02-25 20:38:11,599][32866] Updated weights for policy 0, policy_version 3998 (0.0012) [2023-02-25 20:38:13,378][00219] Fps is (10 sec: 5733.7, 60 sec: 4505.5, 300 sec: 4720.8). Total num frames: 16384000. Throughput: 0: 1064.2. Samples: 3094992. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:38:13,380][00219] Avg episode reward: [(0, '34.527')] [2023-02-25 20:38:13,387][32858] Saving new best policy, reward=34.527! [2023-02-25 20:38:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4505.6, 300 sec: 4679.2). Total num frames: 16404480. Throughput: 0: 1073.4. Samples: 3098752. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:38:18,379][00219] Avg episode reward: [(0, '33.178')] [2023-02-25 20:38:20,810][32866] Updated weights for policy 0, policy_version 4008 (0.0012) [2023-02-25 20:38:23,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4437.3, 300 sec: 4693.0). Total num frames: 16429056. Throughput: 0: 1074.5. Samples: 3104592. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:38:23,379][00219] Avg episode reward: [(0, '32.416')] [2023-02-25 20:38:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4300.8, 300 sec: 4706.9). Total num frames: 16453632. Throughput: 0: 1105.1. Samples: 3111152. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:38:28,380][00219] Avg episode reward: [(0, '31.881')] [2023-02-25 20:38:29,436][32866] Updated weights for policy 0, policy_version 4018 (0.0012) [2023-02-25 20:38:33,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4505.6, 300 sec: 4734.7). Total num frames: 16482304. Throughput: 0: 1140.6. Samples: 3115648. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:38:33,379][00219] Avg episode reward: [(0, '33.940')] [2023-02-25 20:38:35,851][32866] Updated weights for policy 0, policy_version 4028 (0.0012) [2023-02-25 20:38:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4573.9, 300 sec: 4734.7). Total num frames: 16506880. Throughput: 0: 1207.5. Samples: 3124528. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:38:38,382][00219] Avg episode reward: [(0, '32.411')] [2023-02-25 20:38:43,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 16527360. Throughput: 0: 1205.7. Samples: 3130368. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:38:43,383][00219] Avg episode reward: [(0, '33.283')] [2023-02-25 20:38:46,534][32866] Updated weights for policy 0, policy_version 4038 (0.0011) [2023-02-25 20:38:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4642.1, 300 sec: 4762.5). Total num frames: 16543744. Throughput: 0: 1208.2. Samples: 3133232. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:38:48,381][00219] Avg episode reward: [(0, '33.756')] [2023-02-25 20:38:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4790.2). Total num frames: 16572416. Throughput: 0: 1222.4. Samples: 3140896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:38:53,379][00219] Avg episode reward: [(0, '35.005')] [2023-02-25 20:38:53,391][32858] Saving new best policy, reward=35.005! [2023-02-25 20:38:53,831][32866] Updated weights for policy 0, policy_version 4048 (0.0012) [2023-02-25 20:38:58,377][00219] Fps is (10 sec: 6143.9, 60 sec: 4915.2, 300 sec: 4818.0). Total num frames: 16605184. Throughput: 0: 1220.7. Samples: 3149920. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:38:58,379][00219] Avg episode reward: [(0, '33.505')] [2023-02-25 20:39:02,085][32866] Updated weights for policy 0, policy_version 4058 (0.0012) [2023-02-25 20:39:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4790.2). Total num frames: 16625664. Throughput: 0: 1207.8. Samples: 3153104. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:39:03,378][00219] Avg episode reward: [(0, '33.476')] [2023-02-25 20:39:08,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4778.7, 300 sec: 4748.6). Total num frames: 16642048. Throughput: 0: 1208.5. Samples: 3158976. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:39:08,385][00219] Avg episode reward: [(0, '32.518')] [2023-02-25 20:39:11,860][32866] Updated weights for policy 0, policy_version 4068 (0.0011) [2023-02-25 20:39:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.8, 300 sec: 4790.2). Total num frames: 16670720. Throughput: 0: 1221.0. Samples: 3166096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:39:13,379][00219] Avg episode reward: [(0, '30.820')] [2023-02-25 20:39:18,212][32866] Updated weights for policy 0, policy_version 4078 (0.0012) [2023-02-25 20:39:18,377][00219] Fps is (10 sec: 6143.6, 60 sec: 4983.4, 300 sec: 4831.9). Total num frames: 16703488. Throughput: 0: 1221.3. Samples: 3170608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:39:18,380][00219] Avg episode reward: [(0, '30.577')] [2023-02-25 20:39:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 16723968. Throughput: 0: 1208.5. Samples: 3178912. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:39:23,379][00219] Avg episode reward: [(0, '30.643')] [2023-02-25 20:39:28,368][32866] Updated weights for policy 0, policy_version 4088 (0.0011) [2023-02-25 20:39:28,378][00219] Fps is (10 sec: 4095.8, 60 sec: 4846.9, 300 sec: 4762.5). Total num frames: 16744448. Throughput: 0: 1209.9. Samples: 3184816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:39:28,385][00219] Avg episode reward: [(0, '29.397')] [2023-02-25 20:39:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 16764928. Throughput: 0: 1206.8. Samples: 3187536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:39:33,379][00219] Avg episode reward: [(0, '29.146')] [2023-02-25 20:39:36,367][32866] Updated weights for policy 0, policy_version 4098 (0.0012) [2023-02-25 20:39:38,377][00219] Fps is (10 sec: 5325.4, 60 sec: 4846.9, 300 sec: 4818.1). Total num frames: 16797696. Throughput: 0: 1225.6. 
Samples: 3196048. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:39:38,379][00219] Avg episode reward: [(0, '30.146')] [2023-02-25 20:39:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 16822272. Throughput: 0: 1216.7. Samples: 3204672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:39:43,379][00219] Avg episode reward: [(0, '28.517')] [2023-02-25 20:39:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004107_16822272.pth... [2023-02-25 20:39:43,514][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003827_15675392.pth [2023-02-25 20:39:43,804][32866] Updated weights for policy 0, policy_version 4108 (0.0012) [2023-02-25 20:39:48,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4914.7, 300 sec: 4762.4). Total num frames: 16838656. Throughput: 0: 1209.1. Samples: 3207520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:39:48,393][00219] Avg episode reward: [(0, '28.653')] [2023-02-25 20:39:53,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.6, 300 sec: 4748.6). Total num frames: 16859136. Throughput: 0: 1206.7. Samples: 3213280. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:39:53,386][00219] Avg episode reward: [(0, '30.408')] [2023-02-25 20:39:54,132][32866] Updated weights for policy 0, policy_version 4118 (0.0012) [2023-02-25 20:39:58,377][00219] Fps is (10 sec: 5328.0, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 16891904. Throughput: 0: 1226.7. Samples: 3221296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:39:58,379][00219] Avg episode reward: [(0, '29.760')] [2023-02-25 20:40:01,032][32866] Updated weights for policy 0, policy_version 4128 (0.0012) [2023-02-25 20:40:03,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4915.2, 300 sec: 4818.0). Total num frames: 16920576. Throughput: 0: 1226.7. Samples: 3225808. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:40:03,379][00219] Avg episode reward: [(0, '29.874')] [2023-02-25 20:40:08,377][00219] Fps is (10 sec: 4915.0, 60 sec: 4983.4, 300 sec: 4776.3). Total num frames: 16941056. Throughput: 0: 1205.3. Samples: 3233152. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0) [2023-02-25 20:40:08,383][00219] Avg episode reward: [(0, '29.570')] [2023-02-25 20:40:09,675][32866] Updated weights for policy 0, policy_version 4138 (0.0012) [2023-02-25 20:40:13,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4846.9, 300 sec: 4762.5). Total num frames: 16961536. Throughput: 0: 1205.0. Samples: 3239040. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:40:13,379][00219] Avg episode reward: [(0, '29.860')] [2023-02-25 20:40:18,377][00219] Fps is (10 sec: 4505.8, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 16986112. Throughput: 0: 1211.7. Samples: 3242064. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:40:18,379][00219] Avg episode reward: [(0, '28.407')] [2023-02-25 20:40:19,038][32866] Updated weights for policy 0, policy_version 4148 (0.0012) [2023-02-25 20:40:23,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 17018880. Throughput: 0: 1223.5. Samples: 3251104. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:40:23,378][00219] Avg episode reward: [(0, '29.418')] [2023-02-25 20:40:25,884][32866] Updated weights for policy 0, policy_version 4158 (0.0012) [2023-02-25 20:40:28,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 17039360. Throughput: 0: 1205.3. Samples: 3258912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:40:28,380][00219] Avg episode reward: [(0, '29.574')] [2023-02-25 20:40:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4762.5). Total num frames: 17059840. Throughput: 0: 1204.1. Samples: 3261696. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:40:33,381][00219] Avg episode reward: [(0, '28.446')] [2023-02-25 20:40:36,309][32866] Updated weights for policy 0, policy_version 4168 (0.0012) [2023-02-25 20:40:38,377][00219] Fps is (10 sec: 4096.5, 60 sec: 4710.4, 300 sec: 4762.6). Total num frames: 17080320. Throughput: 0: 1204.6. Samples: 3267488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:40:38,378][00219] Avg episode reward: [(0, '29.630')] [2023-02-25 20:40:43,284][32866] Updated weights for policy 0, policy_version 4178 (0.0012) [2023-02-25 20:40:43,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4804.1). Total num frames: 17113088. Throughput: 0: 1219.6. Samples: 3276176. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:40:43,385][00219] Avg episode reward: [(0, '29.663')] [2023-02-25 20:40:48,379][00219] Fps is (10 sec: 5733.3, 60 sec: 4983.8, 300 sec: 4804.1). Total num frames: 17137664. Throughput: 0: 1220.9. Samples: 3280752. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:40:48,381][00219] Avg episode reward: [(0, '29.065')] [2023-02-25 20:40:51,656][32866] Updated weights for policy 0, policy_version 4188 (0.0012) [2023-02-25 20:40:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 17158144. Throughput: 0: 1206.4. Samples: 3287440. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:40:53,381][00219] Avg episode reward: [(0, '28.415')] [2023-02-25 20:40:58,377][00219] Fps is (10 sec: 4096.8, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 17178624. Throughput: 0: 1203.6. Samples: 3293200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:40:58,379][00219] Avg episode reward: [(0, '29.587')] [2023-02-25 20:41:01,185][32866] Updated weights for policy 0, policy_version 4198 (0.0012) [2023-02-25 20:41:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 17203200. Throughput: 0: 1215.3. 
Samples: 3296752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:41:03,379][00219] Avg episode reward: [(0, '30.268')] [2023-02-25 20:41:07,990][32866] Updated weights for policy 0, policy_version 4208 (0.0012) [2023-02-25 20:41:08,377][00219] Fps is (10 sec: 5734.1, 60 sec: 4915.2, 300 sec: 4818.0). Total num frames: 17235968. Throughput: 0: 1216.0. Samples: 3305824. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:41:08,379][00219] Avg episode reward: [(0, '30.386')] [2023-02-25 20:41:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 17256448. Throughput: 0: 1206.4. Samples: 3313200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:41:13,381][00219] Avg episode reward: [(0, '30.176')] [2023-02-25 20:41:18,296][32866] Updated weights for policy 0, policy_version 4218 (0.0016) [2023-02-25 20:41:18,377][00219] Fps is (10 sec: 4096.2, 60 sec: 4846.9, 300 sec: 4748.6). Total num frames: 17276928. Throughput: 0: 1210.0. Samples: 3316144. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:41:18,381][00219] Avg episode reward: [(0, '30.025')] [2023-02-25 20:41:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4762.5). Total num frames: 17297408. Throughput: 0: 1216.7. Samples: 3322240. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:41:23,379][00219] Avg episode reward: [(0, '31.824')] [2023-02-25 20:41:25,948][32866] Updated weights for policy 0, policy_version 4228 (0.0012) [2023-02-25 20:41:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.0, 300 sec: 4790.2). Total num frames: 17330176. Throughput: 0: 1228.4. Samples: 3331456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:41:28,379][00219] Avg episode reward: [(0, '31.627')] [2023-02-25 20:41:33,263][32866] Updated weights for policy 0, policy_version 4238 (0.0012) [2023-02-25 20:41:33,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4804.1). Total num frames: 17358848. 
Throughput: 0: 1228.1. Samples: 3336016. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:41:33,381][00219] Avg episode reward: [(0, '32.600')] [2023-02-25 20:41:38,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4915.2, 300 sec: 4748.6). Total num frames: 17375232. Throughput: 0: 1214.2. Samples: 3342080. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:41:38,381][00219] Avg episode reward: [(0, '33.483')] [2023-02-25 20:41:43,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4642.1, 300 sec: 4734.7). Total num frames: 17391616. Throughput: 0: 1200.7. Samples: 3347232. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-02-25 20:41:43,379][00219] Avg episode reward: [(0, '33.261')] [2023-02-25 20:41:43,392][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004246_17391616.pth... [2023-02-25 20:41:43,554][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003967_16248832.pth [2023-02-25 20:41:44,929][32866] Updated weights for policy 0, policy_version 4248 (0.0045) [2023-02-25 20:41:48,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4505.7, 300 sec: 4720.9). Total num frames: 17408000. Throughput: 0: 1172.6. Samples: 3349520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:41:48,385][00219] Avg episode reward: [(0, '33.872')] [2023-02-25 20:41:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4573.9, 300 sec: 4734.7). Total num frames: 17432576. Throughput: 0: 1101.2. Samples: 3355376. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 20:41:53,388][00219] Avg episode reward: [(0, '34.272')] [2023-02-25 20:41:54,627][32866] Updated weights for policy 0, policy_version 4258 (0.0015) [2023-02-25 20:41:58,378][00219] Fps is (10 sec: 4914.7, 60 sec: 4642.0, 300 sec: 4720.8). Total num frames: 17457152. Throughput: 0: 1101.1. Samples: 3362752. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:41:58,382][00219] Avg episode reward: [(0, '34.481')] [2023-02-25 20:42:03,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4505.5, 300 sec: 4679.1). Total num frames: 17473536. Throughput: 0: 1097.2. Samples: 3365520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:42:03,380][00219] Avg episode reward: [(0, '35.195')] [2023-02-25 20:42:03,396][32858] Saving new best policy, reward=35.195! [2023-02-25 20:42:05,209][32866] Updated weights for policy 0, policy_version 4268 (0.0012) [2023-02-25 20:42:08,377][00219] Fps is (10 sec: 3686.8, 60 sec: 4300.8, 300 sec: 4679.2). Total num frames: 17494016. Throughput: 0: 1087.6. Samples: 3371184. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:42:08,379][00219] Avg episode reward: [(0, '34.273')] [2023-02-25 20:42:13,050][32866] Updated weights for policy 0, policy_version 4278 (0.0012) [2023-02-25 20:42:13,377][00219] Fps is (10 sec: 5325.3, 60 sec: 4505.6, 300 sec: 4720.8). Total num frames: 17526784. Throughput: 0: 1069.2. Samples: 3379568. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:42:13,379][00219] Avg episode reward: [(0, '33.689')] [2023-02-25 20:42:18,380][00219] Fps is (10 sec: 5732.6, 60 sec: 4573.6, 300 sec: 4706.9). Total num frames: 17551360. Throughput: 0: 1069.8. Samples: 3384160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:42:18,388][00219] Avg episode reward: [(0, '32.497')] [2023-02-25 20:42:20,581][32866] Updated weights for policy 0, policy_version 4288 (0.0011) [2023-02-25 20:42:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 4679.2). Total num frames: 17575936. Throughput: 0: 1094.1. Samples: 3391312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:42:23,384][00219] Avg episode reward: [(0, '31.714')] [2023-02-25 20:42:28,377][00219] Fps is (10 sec: 4097.2, 60 sec: 4369.1, 300 sec: 4679.2). Total num frames: 17592320. Throughput: 0: 1111.1. Samples: 3397232. 
Policy #0 lag: (min: 1.0, avg: 1.1, max: 2.0) [2023-02-25 20:42:28,384][00219] Avg episode reward: [(0, '30.738')] [2023-02-25 20:42:30,756][32866] Updated weights for policy 0, policy_version 4298 (0.0011) [2023-02-25 20:42:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4300.8, 300 sec: 4693.0). Total num frames: 17616896. Throughput: 0: 1136.7. Samples: 3400672. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:42:33,387][00219] Avg episode reward: [(0, '31.319')] [2023-02-25 20:42:37,460][32866] Updated weights for policy 0, policy_version 4308 (0.0012) [2023-02-25 20:42:38,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4573.9, 300 sec: 4748.6). Total num frames: 17649664. Throughput: 0: 1204.6. Samples: 3409584. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:42:38,379][00219] Avg episode reward: [(0, '32.098')] [2023-02-25 20:42:43,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4762.5). Total num frames: 17670144. Throughput: 0: 1212.1. Samples: 3417296. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:42:43,379][00219] Avg episode reward: [(0, '32.931')] [2023-02-25 20:42:46,171][32866] Updated weights for policy 0, policy_version 4318 (0.0011) [2023-02-25 20:42:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 17694720. Throughput: 0: 1213.9. Samples: 3420144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:42:48,387][00219] Avg episode reward: [(0, '32.073')] [2023-02-25 20:42:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 17715200. Throughput: 0: 1217.8. Samples: 3425984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:42:53,385][00219] Avg episode reward: [(0, '32.204')] [2023-02-25 20:42:54,732][32866] Updated weights for policy 0, policy_version 4328 (0.0011) [2023-02-25 20:42:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.0, 300 sec: 4818.0). Total num frames: 17747968. Throughput: 0: 1236.3. 
Samples: 3435200. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:42:58,384][00219] Avg episode reward: [(0, '29.732')] [2023-02-25 20:43:01,832][32866] Updated weights for policy 0, policy_version 4338 (0.0012) [2023-02-25 20:43:03,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4983.5, 300 sec: 4804.1). Total num frames: 17772544. Throughput: 0: 1237.8. Samples: 3439856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:43:03,383][00219] Avg episode reward: [(0, '28.537')] [2023-02-25 20:43:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 17793024. Throughput: 0: 1217.8. Samples: 3446112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:43:08,380][00219] Avg episode reward: [(0, '27.822')] [2023-02-25 20:43:12,229][32866] Updated weights for policy 0, policy_version 4348 (0.0012) [2023-02-25 20:43:13,378][00219] Fps is (10 sec: 3686.1, 60 sec: 4710.3, 300 sec: 4762.5). Total num frames: 17809408. Throughput: 0: 1213.1. Samples: 3451824. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:43:13,380][00219] Avg episode reward: [(0, '28.563')] [2023-02-25 20:43:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.2, 300 sec: 4790.2). Total num frames: 17842176. Throughput: 0: 1228.8. Samples: 3455968. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:43:18,384][00219] Avg episode reward: [(0, '29.252')] [2023-02-25 20:43:19,113][32866] Updated weights for policy 0, policy_version 4358 (0.0012) [2023-02-25 20:43:23,377][00219] Fps is (10 sec: 6144.6, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 17870848. Throughput: 0: 1237.0. Samples: 3465248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:43:23,383][00219] Avg episode reward: [(0, '29.246')] [2023-02-25 20:43:27,322][32866] Updated weights for policy 0, policy_version 4368 (0.0011) [2023-02-25 20:43:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 17891328. 
Throughput: 0: 1216.7. Samples: 3472048. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:43:28,379][00219] Avg episode reward: [(0, '28.560')] [2023-02-25 20:43:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4762.5). Total num frames: 17911808. Throughput: 0: 1217.8. Samples: 3474944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:43:33,379][00219] Avg episode reward: [(0, '29.505')] [2023-02-25 20:43:37,405][32866] Updated weights for policy 0, policy_version 4378 (0.0012) [2023-02-25 20:43:38,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 17936384. Throughput: 0: 1230.6. Samples: 3481360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:43:38,381][00219] Avg episode reward: [(0, '30.074')] [2023-02-25 20:43:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4831.9). Total num frames: 17969152. Throughput: 0: 1229.2. Samples: 3490512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:43:43,385][00219] Avg episode reward: [(0, '31.812')] [2023-02-25 20:43:43,401][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004387_17969152.pth... [2023-02-25 20:43:43,521][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004107_16822272.pth [2023-02-25 20:43:43,994][32866] Updated weights for policy 0, policy_version 4388 (0.0012) [2023-02-25 20:43:48,383][00219] Fps is (10 sec: 5321.6, 60 sec: 4914.7, 300 sec: 4804.0). Total num frames: 17989632. Throughput: 0: 1222.2. Samples: 3494864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:43:48,385][00219] Avg episode reward: [(0, '31.340')] [2023-02-25 20:43:53,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4915.1, 300 sec: 4762.5). Total num frames: 18010112. Throughput: 0: 1212.1. Samples: 3500656. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:43:53,380][00219] Avg episode reward: [(0, '32.596')] [2023-02-25 20:43:53,545][32866] Updated weights for policy 0, policy_version 4398 (0.0012) [2023-02-25 20:43:58,377][00219] Fps is (10 sec: 4098.5, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 18030592. Throughput: 0: 1217.5. Samples: 3506608. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:43:58,386][00219] Avg episode reward: [(0, '33.367')] [2023-02-25 20:44:01,752][32866] Updated weights for policy 0, policy_version 4408 (0.0011) [2023-02-25 20:44:03,377][00219] Fps is (10 sec: 5325.4, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 18063360. Throughput: 0: 1228.1. Samples: 3511232. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:44:03,378][00219] Avg episode reward: [(0, '34.749')] [2023-02-25 20:44:08,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4818.0). Total num frames: 18092032. Throughput: 0: 1225.2. Samples: 3520384. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:44:08,383][00219] Avg episode reward: [(0, '34.439')] [2023-02-25 20:44:09,533][32866] Updated weights for policy 0, policy_version 4418 (0.0011) [2023-02-25 20:44:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.6, 300 sec: 4762.5). Total num frames: 18108416. Throughput: 0: 1213.2. Samples: 3526640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:44:13,378][00219] Avg episode reward: [(0, '34.052')] [2023-02-25 20:44:18,382][00219] Fps is (10 sec: 3684.6, 60 sec: 4778.3, 300 sec: 4762.4). Total num frames: 18128896. Throughput: 0: 1212.0. Samples: 3529488. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:44:18,384][00219] Avg episode reward: [(0, '35.223')] [2023-02-25 20:44:18,452][32858] Saving new best policy, reward=35.223! 
[2023-02-25 20:44:19,627][32866] Updated weights for policy 0, policy_version 4428 (0.0012) [2023-02-25 20:44:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4790.3). Total num frames: 18157568. Throughput: 0: 1230.6. Samples: 3536736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:44:23,381][00219] Avg episode reward: [(0, '35.334')] [2023-02-25 20:44:23,400][32858] Saving new best policy, reward=35.334! [2023-02-25 20:44:26,375][32866] Updated weights for policy 0, policy_version 4438 (0.0012) [2023-02-25 20:44:28,377][00219] Fps is (10 sec: 6147.1, 60 sec: 4983.5, 300 sec: 4831.9). Total num frames: 18190336. Throughput: 0: 1232.0. Samples: 3545952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:44:28,379][00219] Avg episode reward: [(0, '37.287')] [2023-02-25 20:44:28,392][32858] Saving new best policy, reward=37.287! [2023-02-25 20:44:33,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4915.0, 300 sec: 4776.3). Total num frames: 18206720. Throughput: 0: 1214.7. Samples: 3549520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:44:33,381][00219] Avg episode reward: [(0, '34.595')] [2023-02-25 20:44:35,254][32866] Updated weights for policy 0, policy_version 4448 (0.0012) [2023-02-25 20:44:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 18231296. Throughput: 0: 1216.7. Samples: 3555408. Policy #0 lag: (min: 0.0, avg: 0.7, max: 3.0) [2023-02-25 20:44:38,383][00219] Avg episode reward: [(0, '34.051')] [2023-02-25 20:44:43,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4778.7, 300 sec: 4804.2). Total num frames: 18255872. Throughput: 0: 1233.1. Samples: 3562096. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:44:43,385][00219] Avg episode reward: [(0, '35.718')] [2023-02-25 20:44:44,039][32866] Updated weights for policy 0, policy_version 4458 (0.0012) [2023-02-25 20:44:48,378][00219] Fps is (10 sec: 5324.1, 60 sec: 4915.6, 300 sec: 4831.9). 
Total num frames: 18284544. Throughput: 0: 1231.6. Samples: 3566656. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:44:48,383][00219] Avg episode reward: [(0, '34.507')] [2023-02-25 20:44:50,665][32866] Updated weights for policy 0, policy_version 4468 (0.0012) [2023-02-25 20:44:53,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4983.5, 300 sec: 4804.1). Total num frames: 18309120. Throughput: 0: 1222.7. Samples: 3575408. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:44:53,379][00219] Avg episode reward: [(0, '32.869')] [2023-02-25 20:44:58,377][00219] Fps is (10 sec: 4506.2, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 18329600. Throughput: 0: 1213.5. Samples: 3581248. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:44:58,380][00219] Avg episode reward: [(0, '33.698')] [2023-02-25 20:45:00,635][32866] Updated weights for policy 0, policy_version 4478 (0.0012) [2023-02-25 20:45:03,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 18350080. Throughput: 0: 1214.7. Samples: 3584144. Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 20:45:03,383][00219] Avg episode reward: [(0, '34.211')] [2023-02-25 20:45:08,353][32866] Updated weights for policy 0, policy_version 4488 (0.0012) [2023-02-25 20:45:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 18382848. Throughput: 0: 1227.7. Samples: 3591984. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:45:08,378][00219] Avg episode reward: [(0, '33.088')] [2023-02-25 20:45:13,377][00219] Fps is (10 sec: 6144.0, 60 sec: 5051.7, 300 sec: 4831.9). Total num frames: 18411520. Throughput: 0: 1229.2. Samples: 3601264. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:45:13,380][00219] Avg episode reward: [(0, '32.092')] [2023-02-25 20:45:16,872][32866] Updated weights for policy 0, policy_version 4498 (0.0012) [2023-02-25 20:45:18,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4983.9, 300 sec: 4776.3). Total num frames: 18427904. Throughput: 0: 1217.5. Samples: 3604304. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:45:18,379][00219] Avg episode reward: [(0, '31.883')] [2023-02-25 20:45:23,379][00219] Fps is (10 sec: 3685.4, 60 sec: 4846.7, 300 sec: 4776.3). Total num frames: 18448384. Throughput: 0: 1214.5. Samples: 3610064. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:45:23,382][00219] Avg episode reward: [(0, '31.032')] [2023-02-25 20:45:26,526][32866] Updated weights for policy 0, policy_version 4508 (0.0012) [2023-02-25 20:45:28,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 18477056. Throughput: 0: 1232.0. Samples: 3617536. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:45:28,379][00219] Avg episode reward: [(0, '32.297')] [2023-02-25 20:45:33,009][32866] Updated weights for policy 0, policy_version 4518 (0.0011) [2023-02-25 20:45:33,377][00219] Fps is (10 sec: 5735.9, 60 sec: 4983.6, 300 sec: 4831.9). Total num frames: 18505728. Throughput: 0: 1231.0. Samples: 3622048. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:45:33,379][00219] Avg episode reward: [(0, '30.988')] [2023-02-25 20:45:38,380][00219] Fps is (10 sec: 4913.6, 60 sec: 4914.9, 300 sec: 4790.2). Total num frames: 18526208. Throughput: 0: 1215.2. Samples: 3630096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:45:38,383][00219] Avg episode reward: [(0, '31.539')] [2023-02-25 20:45:43,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4778.7, 300 sec: 4762.5). Total num frames: 18542592. Throughput: 0: 1190.8. Samples: 3634832. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:45:43,380][00219] Avg episode reward: [(0, '32.013')] [2023-02-25 20:45:43,394][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004527_18542592.pth... [2023-02-25 20:45:43,548][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004246_17391616.pth [2023-02-25 20:45:44,250][32866] Updated weights for policy 0, policy_version 4528 (0.0011) [2023-02-25 20:45:48,378][00219] Fps is (10 sec: 3277.3, 60 sec: 4573.8, 300 sec: 4748.6). Total num frames: 18558976. Throughput: 0: 1177.6. Samples: 3637136. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:45:48,384][00219] Avg episode reward: [(0, '32.158')] [2023-02-25 20:45:53,377][00219] Fps is (10 sec: 2867.1, 60 sec: 4369.1, 300 sec: 4720.8). Total num frames: 18571264. Throughput: 0: 1081.6. Samples: 3640656. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:45:53,381][00219] Avg episode reward: [(0, '31.678')] [2023-02-25 20:45:56,275][32866] Updated weights for policy 0, policy_version 4538 (0.0013) [2023-02-25 20:45:58,377][00219] Fps is (10 sec: 3687.0, 60 sec: 4437.3, 300 sec: 4720.8). Total num frames: 18595840. Throughput: 0: 1040.0. Samples: 3648064. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:45:58,379][00219] Avg episode reward: [(0, '30.521')] [2023-02-25 20:46:02,984][32866] Updated weights for policy 0, policy_version 4548 (0.0012) [2023-02-25 20:46:03,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4642.1, 300 sec: 4720.8). Total num frames: 18628608. Throughput: 0: 1078.0. Samples: 3652816. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:46:03,378][00219] Avg episode reward: [(0, '31.343')] [2023-02-25 20:46:08,380][00219] Fps is (10 sec: 5323.1, 60 sec: 4437.1, 300 sec: 4720.8). Total num frames: 18649088. Throughput: 0: 1110.0. Samples: 3660016. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:46:08,388][00219] Avg episode reward: [(0, '30.477')] [2023-02-25 20:46:13,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4232.5, 300 sec: 4706.9). Total num frames: 18665472. Throughput: 0: 1072.0. Samples: 3665776. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:46:13,383][00219] Avg episode reward: [(0, '29.578')] [2023-02-25 20:46:13,898][32866] Updated weights for policy 0, policy_version 4558 (0.0012) [2023-02-25 20:46:18,377][00219] Fps is (10 sec: 4506.9, 60 sec: 4437.3, 300 sec: 4734.7). Total num frames: 18694144. Throughput: 0: 1041.1. Samples: 3668896. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:46:18,379][00219] Avg episode reward: [(0, '31.082')] [2023-02-25 20:46:21,117][32866] Updated weights for policy 0, policy_version 4568 (0.0012) [2023-02-25 20:46:23,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4574.1, 300 sec: 4720.8). Total num frames: 18722816. Throughput: 0: 1062.1. Samples: 3677888. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:46:23,379][00219] Avg episode reward: [(0, '33.646')] [2023-02-25 20:46:28,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4505.6, 300 sec: 4706.9). Total num frames: 18747392. Throughput: 0: 1128.9. Samples: 3685632. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:46:28,380][00219] Avg episode reward: [(0, '33.909')] [2023-02-25 20:46:29,397][32866] Updated weights for policy 0, policy_version 4578 (0.0011) [2023-02-25 20:46:33,383][00219] Fps is (10 sec: 4093.6, 60 sec: 4300.4, 300 sec: 4706.8). Total num frames: 18763776. Throughput: 0: 1143.0. Samples: 3688576. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:46:33,385][00219] Avg episode reward: [(0, '34.265')] [2023-02-25 20:46:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4369.3, 300 sec: 4734.7). Total num frames: 18788352. Throughput: 0: 1194.0. Samples: 3694384. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:46:38,381][00219] Avg episode reward: [(0, '36.273')] [2023-02-25 20:46:39,052][32866] Updated weights for policy 0, policy_version 4588 (0.0012) [2023-02-25 20:46:43,377][00219] Fps is (10 sec: 5328.0, 60 sec: 4573.9, 300 sec: 4776.4). Total num frames: 18817024. Throughput: 0: 1218.5. Samples: 3702896. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:46:43,383][00219] Avg episode reward: [(0, '35.964')] [2023-02-25 20:46:45,875][32866] Updated weights for policy 0, policy_version 4598 (0.0012) [2023-02-25 20:46:48,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4778.8, 300 sec: 4790.2). Total num frames: 18845696. Throughput: 0: 1214.9. Samples: 3707488. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:46:48,384][00219] Avg episode reward: [(0, '34.442')] [2023-02-25 20:46:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4847.0, 300 sec: 4762.5). Total num frames: 18862080. Throughput: 0: 1200.8. Samples: 3714048. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:46:53,381][00219] Avg episode reward: [(0, '33.215')] [2023-02-25 20:46:55,728][32866] Updated weights for policy 0, policy_version 4608 (0.0011) [2023-02-25 20:46:58,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4778.7, 300 sec: 4776.4). Total num frames: 18882560. Throughput: 0: 1203.2. Samples: 3719920. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:46:58,389][00219] Avg episode reward: [(0, '32.210')] [2023-02-25 20:47:03,378][00219] Fps is (10 sec: 4914.5, 60 sec: 4710.3, 300 sec: 4804.1). Total num frames: 18911232. Throughput: 0: 1220.2. Samples: 3723808. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:47:03,383][00219] Avg episode reward: [(0, '31.323')] [2023-02-25 20:47:03,514][32866] Updated weights for policy 0, policy_version 4618 (0.0019) [2023-02-25 20:47:08,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4915.5, 300 sec: 4804.1). Total num frames: 18944000. Throughput: 0: 1222.4. 
Samples: 3732896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:47:08,386][00219] Avg episode reward: [(0, '29.535')] [2023-02-25 20:47:11,231][32866] Updated weights for policy 0, policy_version 4628 (0.0013) [2023-02-25 20:47:13,378][00219] Fps is (10 sec: 5325.0, 60 sec: 4983.4, 300 sec: 4790.3). Total num frames: 18964480. Throughput: 0: 1210.3. Samples: 3740096. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:47:13,385][00219] Avg episode reward: [(0, '30.380')] [2023-02-25 20:47:18,379][00219] Fps is (10 sec: 4095.2, 60 sec: 4846.8, 300 sec: 4776.3). Total num frames: 18984960. Throughput: 0: 1207.9. Samples: 3742928. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:47:18,383][00219] Avg episode reward: [(0, '30.339')] [2023-02-25 20:47:21,359][32866] Updated weights for policy 0, policy_version 4638 (0.0012) [2023-02-25 20:47:23,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 19009536. Throughput: 0: 1218.1. Samples: 3749200. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:47:23,385][00219] Avg episode reward: [(0, '29.570')] [2023-02-25 20:47:27,645][32866] Updated weights for policy 0, policy_version 4648 (0.0011) [2023-02-25 20:47:28,379][00219] Fps is (10 sec: 5324.6, 60 sec: 4846.7, 300 sec: 4818.0). Total num frames: 19038208. Throughput: 0: 1231.9. Samples: 3758336. Policy #0 lag: (min: 1.0, avg: 1.0, max: 2.0) [2023-02-25 20:47:28,389][00219] Avg episode reward: [(0, '30.896')] [2023-02-25 20:47:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4984.0, 300 sec: 4790.2). Total num frames: 19062784. Throughput: 0: 1231.3. Samples: 3762896. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:47:33,379][00219] Avg episode reward: [(0, '32.397')] [2023-02-25 20:47:36,842][32866] Updated weights for policy 0, policy_version 4658 (0.0013) [2023-02-25 20:47:38,377][00219] Fps is (10 sec: 4506.7, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 19083264. 
Throughput: 0: 1212.4. Samples: 3768608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:47:38,381][00219] Avg episode reward: [(0, '30.775')] [2023-02-25 20:47:43,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4778.7, 300 sec: 4776.3). Total num frames: 19103744. Throughput: 0: 1224.5. Samples: 3775024. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:47:43,382][00219] Avg episode reward: [(0, '29.767')] [2023-02-25 20:47:43,401][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004664_19103744.pth... [2023-02-25 20:47:43,510][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004387_17969152.pth [2023-02-25 20:47:45,250][32866] Updated weights for policy 0, policy_version 4668 (0.0014) [2023-02-25 20:47:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 19136512. Throughput: 0: 1239.5. Samples: 3779584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:47:48,387][00219] Avg episode reward: [(0, '31.272')] [2023-02-25 20:47:52,772][32866] Updated weights for policy 0, policy_version 4678 (0.0012) [2023-02-25 20:47:53,377][00219] Fps is (10 sec: 6144.1, 60 sec: 5051.7, 300 sec: 4804.1). Total num frames: 19165184. Throughput: 0: 1238.8. Samples: 3788640. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:47:53,379][00219] Avg episode reward: [(0, '31.338')] [2023-02-25 20:47:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4776.4). Total num frames: 19181568. Throughput: 0: 1208.9. Samples: 3794496. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:47:58,381][00219] Avg episode reward: [(0, '30.712')] [2023-02-25 20:48:02,983][32866] Updated weights for policy 0, policy_version 4688 (0.0012) [2023-02-25 20:48:03,378][00219] Fps is (10 sec: 4095.3, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 19206144. Throughput: 0: 1213.2. Samples: 3797520. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:48:03,389][00219] Avg episode reward: [(0, '30.040')] [2023-02-25 20:48:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4818.0). Total num frames: 19230720. Throughput: 0: 1253.3. Samples: 3805600. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:48:08,379][00219] Avg episode reward: [(0, '32.542')] [2023-02-25 20:48:09,696][32866] Updated weights for policy 0, policy_version 4698 (0.0011) [2023-02-25 20:48:13,377][00219] Fps is (10 sec: 5735.3, 60 sec: 4983.6, 300 sec: 4818.0). Total num frames: 19263488. Throughput: 0: 1250.9. Samples: 3814624. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:48:13,386][00219] Avg episode reward: [(0, '32.306')] [2023-02-25 20:48:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.4, 300 sec: 4776.4). Total num frames: 19279872. Throughput: 0: 1214.9. Samples: 3817568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:48:18,384][00219] Avg episode reward: [(0, '30.849')] [2023-02-25 20:48:18,583][32866] Updated weights for policy 0, policy_version 4708 (0.0015) [2023-02-25 20:48:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4790.2). Total num frames: 19304448. Throughput: 0: 1219.2. Samples: 3823472. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:48:23,379][00219] Avg episode reward: [(0, '32.188')] [2023-02-25 20:48:26,963][32866] Updated weights for policy 0, policy_version 4718 (0.0012) [2023-02-25 20:48:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.4, 300 sec: 4818.0). Total num frames: 19333120. Throughput: 0: 1258.3. Samples: 3831648. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:48:28,379][00219] Avg episode reward: [(0, '32.010')] [2023-02-25 20:48:33,279][32866] Updated weights for policy 0, policy_version 4728 (0.0011) [2023-02-25 20:48:33,377][00219] Fps is (10 sec: 6144.0, 60 sec: 5051.7, 300 sec: 4845.8). Total num frames: 19365888. Throughput: 0: 1260.8. 
Samples: 3836320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:48:33,379][00219] Avg episode reward: [(0, '30.981')] [2023-02-25 20:48:38,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4983.3, 300 sec: 4790.2). Total num frames: 19382272. Throughput: 0: 1220.2. Samples: 3843552. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:48:38,384][00219] Avg episode reward: [(0, '30.721')] [2023-02-25 20:48:43,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4983.5, 300 sec: 4790.3). Total num frames: 19402752. Throughput: 0: 1221.3. Samples: 3849456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:48:43,385][00219] Avg episode reward: [(0, '31.662')] [2023-02-25 20:48:43,940][32866] Updated weights for policy 0, policy_version 4738 (0.0012) [2023-02-25 20:48:46,661][32858] Signal inference workers to stop experience collection... (100 times) [2023-02-25 20:48:46,667][32858] Signal inference workers to resume experience collection... (100 times) [2023-02-25 20:48:46,708][32866] InferenceWorker_p0-w0: stopping experience collection (100 times) [2023-02-25 20:48:46,711][32866] InferenceWorker_p0-w0: resuming experience collection (100 times) [2023-02-25 20:48:48,377][00219] Fps is (10 sec: 4916.1, 60 sec: 4915.2, 300 sec: 4818.0). Total num frames: 19431424. Throughput: 0: 1230.3. Samples: 3852880. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:48:48,379][00219] Avg episode reward: [(0, '32.344')] [2023-02-25 20:48:51,171][32866] Updated weights for policy 0, policy_version 4748 (0.0012) [2023-02-25 20:48:53,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 19460096. Throughput: 0: 1254.4. Samples: 3862048. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:48:53,381][00219] Avg episode reward: [(0, '31.948')] [2023-02-25 20:48:58,381][00219] Fps is (10 sec: 4913.2, 60 sec: 4983.1, 300 sec: 4804.1). Total num frames: 19480576. Throughput: 0: 1213.4. Samples: 3869232. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:48:58,383][00219] Avg episode reward: [(0, '32.507')] [2023-02-25 20:48:59,484][32866] Updated weights for policy 0, policy_version 4758 (0.0012) [2023-02-25 20:49:03,378][00219] Fps is (10 sec: 4095.3, 60 sec: 4915.2, 300 sec: 4776.3). Total num frames: 19501056. Throughput: 0: 1213.5. Samples: 3872176. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:49:03,384][00219] Avg episode reward: [(0, '31.764')] [2023-02-25 20:49:08,377][00219] Fps is (10 sec: 4507.3, 60 sec: 4915.2, 300 sec: 4804.1). Total num frames: 19525632. Throughput: 0: 1222.7. Samples: 3878496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:49:08,379][00219] Avg episode reward: [(0, '30.298')] [2023-02-25 20:49:08,898][32866] Updated weights for policy 0, policy_version 4768 (0.0012) [2023-02-25 20:49:13,377][00219] Fps is (10 sec: 5735.4, 60 sec: 4915.2, 300 sec: 4845.9). Total num frames: 19558400. Throughput: 0: 1247.3. Samples: 3887776. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:49:13,379][00219] Avg episode reward: [(0, '29.493')] [2023-02-25 20:49:15,580][32866] Updated weights for policy 0, policy_version 4778 (0.0012) [2023-02-25 20:49:18,377][00219] Fps is (10 sec: 5325.0, 60 sec: 4983.5, 300 sec: 4818.0). Total num frames: 19578880. Throughput: 0: 1237.7. Samples: 3892016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:49:18,382][00219] Avg episode reward: [(0, '29.436')] [2023-02-25 20:49:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4776.4). Total num frames: 19599360. Throughput: 0: 1206.8. Samples: 3897856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:49:23,379][00219] Avg episode reward: [(0, '28.816')] [2023-02-25 20:49:25,880][32866] Updated weights for policy 0, policy_version 4788 (0.0012) [2023-02-25 20:49:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4846.9, 300 sec: 4804.2). Total num frames: 19623936. Throughput: 0: 1221.3. 
Samples: 3904416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:49:28,379][00219] Avg episode reward: [(0, '29.990')] [2023-02-25 20:49:33,169][32866] Updated weights for policy 0, policy_version 4798 (0.0012) [2023-02-25 20:49:33,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 19656704. Throughput: 0: 1247.6. Samples: 3909024. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:49:33,379][00219] Avg episode reward: [(0, '30.905')] [2023-02-25 20:49:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.1, 300 sec: 4804.1). Total num frames: 19673088. Throughput: 0: 1207.1. Samples: 3916368. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:49:38,379][00219] Avg episode reward: [(0, '31.174')] [2023-02-25 20:49:43,377][00219] Fps is (10 sec: 3276.6, 60 sec: 4778.6, 300 sec: 4762.5). Total num frames: 19689472. Throughput: 0: 1152.8. Samples: 3921104. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:49:43,380][00219] Avg episode reward: [(0, '31.170')] [2023-02-25 20:49:43,394][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004807_19689472.pth... [2023-02-25 20:49:43,632][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004527_18542592.pth [2023-02-25 20:49:45,145][32866] Updated weights for policy 0, policy_version 4808 (0.0013) [2023-02-25 20:49:48,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4573.9, 300 sec: 4734.7). Total num frames: 19705856. Throughput: 0: 1137.8. Samples: 3923376. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:49:48,381][00219] Avg episode reward: [(0, '30.820')] [2023-02-25 20:49:53,377][00219] Fps is (10 sec: 3277.0, 60 sec: 4369.1, 300 sec: 4720.8). Total num frames: 19722240. Throughput: 0: 1111.8. Samples: 3928528. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:49:53,387][00219] Avg episode reward: [(0, '31.566')] [2023-02-25 20:49:54,765][32866] Updated weights for policy 0, policy_version 4818 (0.0020) [2023-02-25 20:49:58,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4574.2, 300 sec: 4762.5). Total num frames: 19755008. Throughput: 0: 1107.9. Samples: 3937632. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:49:58,382][00219] Avg episode reward: [(0, '30.077')] [2023-02-25 20:50:01,267][32866] Updated weights for policy 0, policy_version 4828 (0.0016) [2023-02-25 20:50:03,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4710.5, 300 sec: 4748.6). Total num frames: 19783680. Throughput: 0: 1117.5. Samples: 3942304. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:50:03,386][00219] Avg episode reward: [(0, '29.728')] [2023-02-25 20:50:08,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4642.0, 300 sec: 4720.8). Total num frames: 19804160. Throughput: 0: 1129.5. Samples: 3948688. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:50:08,383][00219] Avg episode reward: [(0, '29.372')] [2023-02-25 20:50:11,465][32866] Updated weights for policy 0, policy_version 4838 (0.0011) [2023-02-25 20:50:13,381][00219] Fps is (10 sec: 4094.1, 60 sec: 4437.0, 300 sec: 4734.6). Total num frames: 19824640. Throughput: 0: 1112.4. Samples: 3954480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:50:13,385][00219] Avg episode reward: [(0, '29.456')] [2023-02-25 20:50:18,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4573.9, 300 sec: 4762.5). Total num frames: 19853312. Throughput: 0: 1107.2. Samples: 3958848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:50:18,379][00219] Avg episode reward: [(0, '31.728')] [2023-02-25 20:50:18,628][32866] Updated weights for policy 0, policy_version 4848 (0.0012) [2023-02-25 20:50:23,377][00219] Fps is (10 sec: 5737.0, 60 sec: 4710.4, 300 sec: 4762.5). Total num frames: 19881984. Throughput: 0: 1146.3. 
Samples: 3967952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:50:23,379][00219] Avg episode reward: [(0, '31.778')] [2023-02-25 20:50:26,736][32866] Updated weights for policy 0, policy_version 4858 (0.0012) [2023-02-25 20:50:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 4734.7). Total num frames: 19902464. Throughput: 0: 1184.0. Samples: 3974384. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:50:28,386][00219] Avg episode reward: [(0, '33.017')] [2023-02-25 20:50:33,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.1, 300 sec: 4720.9). Total num frames: 19918848. Throughput: 0: 1198.6. Samples: 3977312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:50:33,385][00219] Avg episode reward: [(0, '35.664')] [2023-02-25 20:50:36,134][32866] Updated weights for policy 0, policy_version 4868 (0.0011) [2023-02-25 20:50:38,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4573.9, 300 sec: 4762.5). Total num frames: 19947520. Throughput: 0: 1249.4. Samples: 3984752. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:50:38,386][00219] Avg episode reward: [(0, '35.448')] [2023-02-25 20:50:43,034][32866] Updated weights for policy 0, policy_version 4878 (0.0011) [2023-02-25 20:50:43,380][00219] Fps is (10 sec: 6141.8, 60 sec: 4846.7, 300 sec: 4818.0). Total num frames: 19980288. Throughput: 0: 1249.7. Samples: 3993872. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:50:43,383][00219] Avg episode reward: [(0, '34.853')] [2023-02-25 20:50:48,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4915.1, 300 sec: 4845.8). Total num frames: 20000768. Throughput: 0: 1218.5. Samples: 3997136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:50:48,383][00219] Avg episode reward: [(0, '35.786')] [2023-02-25 20:50:52,844][32866] Updated weights for policy 0, policy_version 4888 (0.0021) [2023-02-25 20:50:53,377][00219] Fps is (10 sec: 4097.5, 60 sec: 4983.5, 300 sec: 4831.9). Total num frames: 20021248. 
Throughput: 0: 1206.8. Samples: 4002992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:50:53,383][00219] Avg episode reward: [(0, '35.488')] [2023-02-25 20:50:58,377][00219] Fps is (10 sec: 4506.1, 60 sec: 4846.9, 300 sec: 4804.1). Total num frames: 20045824. Throughput: 0: 1246.7. Samples: 4010576. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:50:58,379][00219] Avg episode reward: [(0, '33.346')] [2023-02-25 20:51:00,022][32866] Updated weights for policy 0, policy_version 4898 (0.0011) [2023-02-25 20:51:03,380][00219] Fps is (10 sec: 5732.7, 60 sec: 4914.9, 300 sec: 4845.8). Total num frames: 20078592. Throughput: 0: 1255.0. Samples: 4015328. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:51:03,382][00219] Avg episode reward: [(0, '32.744')] [2023-02-25 20:51:08,251][32866] Updated weights for policy 0, policy_version 4908 (0.0011) [2023-02-25 20:51:08,381][00219] Fps is (10 sec: 5732.0, 60 sec: 4983.3, 300 sec: 4873.5). Total num frames: 20103168. Throughput: 0: 1224.1. Samples: 4023040. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 20:51:08,388][00219] Avg episode reward: [(0, '32.903')] [2023-02-25 20:51:13,377][00219] Fps is (10 sec: 4097.2, 60 sec: 4915.6, 300 sec: 4831.9). Total num frames: 20119552. Throughput: 0: 1210.7. Samples: 4028864. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:51:13,379][00219] Avg episode reward: [(0, '32.447')] [2023-02-25 20:51:18,377][00219] Fps is (10 sec: 4097.7, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 20144128. Throughput: 0: 1213.9. Samples: 4031936. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:51:18,379][00219] Avg episode reward: [(0, '32.930')] [2023-02-25 20:51:18,477][32866] Updated weights for policy 0, policy_version 4919 (0.0012) [2023-02-25 20:51:23,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 20176896. Throughput: 0: 1252.6. Samples: 4041120. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:51:23,378][00219] Avg episode reward: [(0, '34.580')] [2023-02-25 20:51:25,204][32866] Updated weights for policy 0, policy_version 4929 (0.0012) [2023-02-25 20:51:28,378][00219] Fps is (10 sec: 5324.3, 60 sec: 4915.1, 300 sec: 4859.7). Total num frames: 20197376. Throughput: 0: 1221.4. Samples: 4048832. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:51:28,387][00219] Avg episode reward: [(0, '34.456')] [2023-02-25 20:51:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4983.5, 300 sec: 4845.8). Total num frames: 20217856. Throughput: 0: 1212.8. Samples: 4051712. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:51:33,379][00219] Avg episode reward: [(0, '34.982')] [2023-02-25 20:51:36,289][32866] Updated weights for policy 0, policy_version 4939 (0.0012) [2023-02-25 20:51:38,377][00219] Fps is (10 sec: 4506.0, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 20242432. Throughput: 0: 1215.3. Samples: 4057680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:51:38,380][00219] Avg episode reward: [(0, '33.804')] [2023-02-25 20:51:42,905][32866] Updated weights for policy 0, policy_version 4949 (0.0012) [2023-02-25 20:51:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.5, 300 sec: 4845.8). Total num frames: 20275200. Throughput: 0: 1250.1. Samples: 4066832. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:51:43,378][00219] Avg episode reward: [(0, '29.926')] [2023-02-25 20:51:43,390][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004950_20275200.pth... [2023-02-25 20:51:43,472][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004664_19103744.pth [2023-02-25 20:51:48,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.3, 300 sec: 4859.7). Total num frames: 20295680. Throughput: 0: 1242.4. Samples: 4071232. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:51:48,379][00219] Avg episode reward: [(0, '28.855')] [2023-02-25 20:51:51,249][32866] Updated weights for policy 0, policy_version 4959 (0.0021) [2023-02-25 20:51:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 20316160. Throughput: 0: 1204.0. Samples: 4077216. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:51:53,385][00219] Avg episode reward: [(0, '28.197')] [2023-02-25 20:51:58,379][00219] Fps is (10 sec: 4504.7, 60 sec: 4915.0, 300 sec: 4845.8). Total num frames: 20340736. Throughput: 0: 1213.1. Samples: 4083456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:51:58,381][00219] Avg episode reward: [(0, '29.969')] [2023-02-25 20:52:00,279][32866] Updated weights for policy 0, policy_version 4969 (0.0011) [2023-02-25 20:52:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4847.2, 300 sec: 4831.9). Total num frames: 20369408. Throughput: 0: 1245.9. Samples: 4088000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:03,379][00219] Avg episode reward: [(0, '30.593')] [2023-02-25 20:52:06,805][32866] Updated weights for policy 0, policy_version 4979 (0.0012) [2023-02-25 20:52:08,380][00219] Fps is (10 sec: 5324.3, 60 sec: 4847.0, 300 sec: 4845.7). Total num frames: 20393984. Throughput: 0: 1246.1. Samples: 4097200. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:52:08,386][00219] Avg episode reward: [(0, '33.536')] [2023-02-25 20:52:13,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4983.3, 300 sec: 4859.7). Total num frames: 20418560. Throughput: 0: 1207.4. Samples: 4103168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:13,381][00219] Avg episode reward: [(0, '34.735')] [2023-02-25 20:52:17,741][32866] Updated weights for policy 0, policy_version 4989 (0.0015) [2023-02-25 20:52:18,377][00219] Fps is (10 sec: 4507.0, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 20439040. Throughput: 0: 1206.4. 
Samples: 4106000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:18,379][00219] Avg episode reward: [(0, '34.170')] [2023-02-25 20:52:23,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 20467712. Throughput: 0: 1250.8. Samples: 4113968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:23,384][00219] Avg episode reward: [(0, '32.262')] [2023-02-25 20:52:24,490][32866] Updated weights for policy 0, policy_version 4999 (0.0011) [2023-02-25 20:52:28,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 20496384. Throughput: 0: 1251.5. Samples: 4123152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:28,382][00219] Avg episode reward: [(0, '30.758')] [2023-02-25 20:52:33,015][32866] Updated weights for policy 0, policy_version 5009 (0.0012) [2023-02-25 20:52:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 20516864. Throughput: 0: 1219.2. Samples: 4126096. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:52:33,379][00219] Avg episode reward: [(0, '29.449')] [2023-02-25 20:52:38,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 20533248. Throughput: 0: 1211.4. Samples: 4131728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:52:38,383][00219] Avg episode reward: [(0, '27.722')] [2023-02-25 20:52:41,899][32866] Updated weights for policy 0, policy_version 5019 (0.0013) [2023-02-25 20:52:43,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4831.9). Total num frames: 20561920. Throughput: 0: 1242.7. Samples: 4139376. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:52:43,382][00219] Avg episode reward: [(0, '28.567')] [2023-02-25 20:52:48,377][00219] Fps is (10 sec: 6144.4, 60 sec: 4983.5, 300 sec: 4845.8). Total num frames: 20594688. Throughput: 0: 1244.1. Samples: 4143984. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:52:48,382][00219] Avg episode reward: [(0, '29.266')] [2023-02-25 20:52:49,168][32866] Updated weights for policy 0, policy_version 5029 (0.0011) [2023-02-25 20:52:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 20611072. Throughput: 0: 1205.1. Samples: 4151424. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:52:53,379][00219] Avg episode reward: [(0, '28.888')] [2023-02-25 20:52:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.4, 300 sec: 4845.8). Total num frames: 20635648. Throughput: 0: 1200.8. Samples: 4157200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:52:58,382][00219] Avg episode reward: [(0, '30.992')] [2023-02-25 20:52:59,891][32866] Updated weights for policy 0, policy_version 5039 (0.0012) [2023-02-25 20:53:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 20660224. Throughput: 0: 1209.2. Samples: 4160416. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:53:03,379][00219] Avg episode reward: [(0, '32.174')] [2023-02-25 20:53:06,311][32866] Updated weights for policy 0, policy_version 5049 (0.0012) [2023-02-25 20:53:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.5, 300 sec: 4831.9). Total num frames: 20688896. Throughput: 0: 1234.8. Samples: 4169536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:53:08,379][00219] Avg episode reward: [(0, '32.510')] [2023-02-25 20:53:13,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4915.4, 300 sec: 4859.7). Total num frames: 20713472. Throughput: 0: 1197.9. Samples: 4177056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:53:13,382][00219] Avg episode reward: [(0, '32.278')] [2023-02-25 20:53:15,110][32866] Updated weights for policy 0, policy_version 5059 (0.0012) [2023-02-25 20:53:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 20729856. Throughput: 0: 1195.4. 
Samples: 4179888. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:53:18,381][00219] Avg episode reward: [(0, '32.852')] [2023-02-25 20:53:23,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 20758528. Throughput: 0: 1210.7. Samples: 4186208. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:53:23,386][00219] Avg episode reward: [(0, '32.832')] [2023-02-25 20:53:24,226][32866] Updated weights for policy 0, policy_version 5069 (0.0013) [2023-02-25 20:53:28,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4847.0, 300 sec: 4818.0). Total num frames: 20787200. Throughput: 0: 1245.2. Samples: 4195408. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:53:28,379][00219] Avg episode reward: [(0, '32.590')] [2023-02-25 20:53:30,804][32866] Updated weights for policy 0, policy_version 5079 (0.0012) [2023-02-25 20:53:33,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 20811776. Throughput: 0: 1243.0. Samples: 4199920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:53:33,385][00219] Avg episode reward: [(0, '31.051')] [2023-02-25 20:53:38,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4847.0, 300 sec: 4818.0). Total num frames: 20824064. Throughput: 0: 1180.8. Samples: 4204560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:53:38,380][00219] Avg episode reward: [(0, '30.708')] [2023-02-25 20:53:43,377][00219] Fps is (10 sec: 2867.3, 60 sec: 4642.1, 300 sec: 4776.4). Total num frames: 20840448. Throughput: 0: 1157.7. Samples: 4209296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:53:43,383][00219] Avg episode reward: [(0, '30.926')] [2023-02-25 20:53:43,398][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005088_20840448.pth... 
[2023-02-25 20:53:43,578][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004807_19689472.pth [2023-02-25 20:53:44,117][32866] Updated weights for policy 0, policy_version 5089 (0.0016) [2023-02-25 20:53:48,377][00219] Fps is (10 sec: 3686.5, 60 sec: 4437.3, 300 sec: 4748.6). Total num frames: 20860928. Throughput: 0: 1137.8. Samples: 4211616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:53:48,379][00219] Avg episode reward: [(0, '29.549')] [2023-02-25 20:53:52,323][32866] Updated weights for policy 0, policy_version 5099 (0.0012) [2023-02-25 20:53:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 4776.4). Total num frames: 20889600. Throughput: 0: 1109.7. Samples: 4219472. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:53:53,386][00219] Avg episode reward: [(0, '29.443')] [2023-02-25 20:53:58,377][00219] Fps is (10 sec: 5734.1, 60 sec: 4710.4, 300 sec: 4804.1). Total num frames: 20918272. Throughput: 0: 1144.9. Samples: 4228576. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:53:58,379][00219] Avg episode reward: [(0, '31.653')] [2023-02-25 20:53:59,920][32866] Updated weights for policy 0, policy_version 5109 (0.0012) [2023-02-25 20:54:03,382][00219] Fps is (10 sec: 4503.3, 60 sec: 4573.5, 300 sec: 4776.3). Total num frames: 20934656. Throughput: 0: 1145.8. Samples: 4231456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:54:03,384][00219] Avg episode reward: [(0, '32.433')] [2023-02-25 20:54:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4748.6). Total num frames: 20959232. Throughput: 0: 1138.8. Samples: 4237456. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:54:08,379][00219] Avg episode reward: [(0, '30.602')] [2023-02-25 20:54:09,769][32866] Updated weights for policy 0, policy_version 5119 (0.0012) [2023-02-25 20:54:13,377][00219] Fps is (10 sec: 4917.7, 60 sec: 4505.6, 300 sec: 4762.5). Total num frames: 20983808. 
Throughput: 0: 1109.7. Samples: 4245344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:54:13,378][00219] Avg episode reward: [(0, '31.846')] [2023-02-25 20:54:16,422][32866] Updated weights for policy 0, policy_version 5129 (0.0013) [2023-02-25 20:54:18,377][00219] Fps is (10 sec: 5734.7, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 21016576. Throughput: 0: 1109.0. Samples: 4249824. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:54:18,379][00219] Avg episode reward: [(0, '30.943')] [2023-02-25 20:54:23,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4710.4, 300 sec: 4804.1). Total num frames: 21041152. Throughput: 0: 1170.9. Samples: 4257248. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:54:23,381][00219] Avg episode reward: [(0, '31.031')] [2023-02-25 20:54:25,739][32866] Updated weights for policy 0, policy_version 5139 (0.0042) [2023-02-25 20:54:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4748.6). Total num frames: 21057536. Throughput: 0: 1198.6. Samples: 4263232. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:54:28,380][00219] Avg episode reward: [(0, '30.276')] [2023-02-25 20:54:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4776.4). Total num frames: 21082112. Throughput: 0: 1222.8. Samples: 4266640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:54:33,385][00219] Avg episode reward: [(0, '31.752')] [2023-02-25 20:54:33,698][32866] Updated weights for policy 0, policy_version 5149 (0.0013) [2023-02-25 20:54:38,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4847.0, 300 sec: 4831.9). Total num frames: 21114880. Throughput: 0: 1253.3. Samples: 4275872. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:54:38,378][00219] Avg episode reward: [(0, '32.896')] [2023-02-25 20:54:41,646][32866] Updated weights for policy 0, policy_version 5159 (0.0012) [2023-02-25 20:54:43,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4915.2, 300 sec: 4845.8). 
Total num frames: 21135360. Throughput: 0: 1217.1. Samples: 4283344. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:54:43,379][00219] Avg episode reward: [(0, '33.192')] [2023-02-25 20:54:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4873.5). Total num frames: 21159936. Throughput: 0: 1218.6. Samples: 4286288. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:54:48,386][00219] Avg episode reward: [(0, '32.891')] [2023-02-25 20:54:51,268][32866] Updated weights for policy 0, policy_version 5169 (0.0014) [2023-02-25 20:54:53,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21184512. Throughput: 0: 1232.7. Samples: 4292928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:54:53,380][00219] Avg episode reward: [(0, '31.561')] [2023-02-25 20:54:57,826][32866] Updated weights for policy 0, policy_version 5179 (0.0012) [2023-02-25 20:54:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21213184. Throughput: 0: 1263.6. Samples: 4302208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:54:58,379][00219] Avg episode reward: [(0, '33.095')] [2023-02-25 20:55:03,377][00219] Fps is (10 sec: 5325.0, 60 sec: 5052.2, 300 sec: 4859.7). Total num frames: 21237760. Throughput: 0: 1259.0. Samples: 4306480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:55:03,379][00219] Avg episode reward: [(0, '31.251')] [2023-02-25 20:55:06,819][32866] Updated weights for policy 0, policy_version 5189 (0.0011) [2023-02-25 20:55:08,380][00219] Fps is (10 sec: 4504.2, 60 sec: 4983.3, 300 sec: 4859.7). Total num frames: 21258240. Throughput: 0: 1224.1. Samples: 4312336. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:55:08,382][00219] Avg episode reward: [(0, '31.899')] [2023-02-25 20:55:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4845.8). Total num frames: 21282816. Throughput: 0: 1241.2. Samples: 4319088. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:55:13,379][00219] Avg episode reward: [(0, '33.030')] [2023-02-25 20:55:15,264][32866] Updated weights for policy 0, policy_version 5199 (0.0012) [2023-02-25 20:55:18,377][00219] Fps is (10 sec: 5326.3, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21311488. Throughput: 0: 1266.1. Samples: 4323616. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:55:18,380][00219] Avg episode reward: [(0, '32.441')] [2023-02-25 20:55:22,172][32866] Updated weights for policy 0, policy_version 5209 (0.0012) [2023-02-25 20:55:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 21336064. Throughput: 0: 1255.5. Samples: 4332368. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:55:23,381][00219] Avg episode reward: [(0, '32.038')] [2023-02-25 20:55:28,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4983.5, 300 sec: 4873.5). Total num frames: 21356544. Throughput: 0: 1217.8. Samples: 4338144. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:55:28,385][00219] Avg episode reward: [(0, '31.345')] [2023-02-25 20:55:32,259][32866] Updated weights for policy 0, policy_version 5219 (0.0011) [2023-02-25 20:55:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 21381120. Throughput: 0: 1217.4. Samples: 4341072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:55:33,380][00219] Avg episode reward: [(0, '33.720')] [2023-02-25 20:55:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21409792. Throughput: 0: 1258.3. Samples: 4349552. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:55:38,378][00219] Avg episode reward: [(0, '33.271')] [2023-02-25 20:55:39,452][32866] Updated weights for policy 0, policy_version 5229 (0.0012) [2023-02-25 20:55:43,378][00219] Fps is (10 sec: 5733.7, 60 sec: 5051.6, 300 sec: 4873.5). Total num frames: 21438464. Throughput: 0: 1244.1. 
Samples: 4358192. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:55:43,382][00219] Avg episode reward: [(0, '34.709')] [2023-02-25 20:55:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005234_21438464.pth... [2023-02-25 20:55:43,529][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004950_20275200.pth [2023-02-25 20:55:48,378][00219] Fps is (10 sec: 4504.9, 60 sec: 4915.1, 300 sec: 4859.6). Total num frames: 21454848. Throughput: 0: 1212.4. Samples: 4361040. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:55:48,384][00219] Avg episode reward: [(0, '34.620')] [2023-02-25 20:55:48,875][32866] Updated weights for policy 0, policy_version 5239 (0.0011) [2023-02-25 20:55:53,377][00219] Fps is (10 sec: 4096.5, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 21479424. Throughput: 0: 1211.5. Samples: 4366848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:55:53,387][00219] Avg episode reward: [(0, '33.755')] [2023-02-25 20:55:57,014][32866] Updated weights for policy 0, policy_version 5249 (0.0012) [2023-02-25 20:55:58,377][00219] Fps is (10 sec: 5325.7, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21508096. Throughput: 0: 1257.2. Samples: 4375664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:55:58,388][00219] Avg episode reward: [(0, '32.388')] [2023-02-25 20:56:03,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 21536768. Throughput: 0: 1259.0. Samples: 4380272. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:56:03,382][00219] Avg episode reward: [(0, '32.302')] [2023-02-25 20:56:03,686][32866] Updated weights for policy 0, policy_version 5259 (0.0012) [2023-02-25 20:56:08,382][00219] Fps is (10 sec: 4912.7, 60 sec: 4983.3, 300 sec: 4873.5). Total num frames: 21557248. Throughput: 0: 1219.1. Samples: 4387232. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:56:08,386][00219] Avg episode reward: [(0, '32.043')] [2023-02-25 20:56:13,378][00219] Fps is (10 sec: 4095.4, 60 sec: 4915.1, 300 sec: 4859.6). Total num frames: 21577728. Throughput: 0: 1218.8. Samples: 4392992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:56:13,386][00219] Avg episode reward: [(0, '32.401')] [2023-02-25 20:56:14,256][32866] Updated weights for policy 0, policy_version 5269 (0.0012) [2023-02-25 20:56:18,377][00219] Fps is (10 sec: 4917.7, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21606400. Throughput: 0: 1247.6. Samples: 4397216. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:56:18,379][00219] Avg episode reward: [(0, '32.809')] [2023-02-25 20:56:20,450][32866] Updated weights for policy 0, policy_version 5279 (0.0013) [2023-02-25 20:56:23,377][00219] Fps is (10 sec: 6144.9, 60 sec: 5051.7, 300 sec: 4887.4). Total num frames: 21639168. Throughput: 0: 1263.6. Samples: 4406416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:56:23,379][00219] Avg episode reward: [(0, '33.391')] [2023-02-25 20:56:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4873.5). Total num frames: 21655552. Throughput: 0: 1218.5. Samples: 4413024. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:56:28,381][00219] Avg episode reward: [(0, '33.467')] [2023-02-25 20:56:29,459][32866] Updated weights for policy 0, policy_version 5289 (0.0014) [2023-02-25 20:56:33,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 21676032. Throughput: 0: 1221.7. Samples: 4416016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:56:33,383][00219] Avg episode reward: [(0, '31.253')] [2023-02-25 20:56:37,646][32866] Updated weights for policy 0, policy_version 5299 (0.0013) [2023-02-25 20:56:38,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 21704704. Throughput: 0: 1257.6. 
Samples: 4423440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:56:38,379][00219] Avg episode reward: [(0, '31.964')] [2023-02-25 20:56:43,377][00219] Fps is (10 sec: 5734.2, 60 sec: 4915.3, 300 sec: 4873.5). Total num frames: 21733376. Throughput: 0: 1266.5. Samples: 4432656. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:56:43,379][00219] Avg episode reward: [(0, '32.538')] [2023-02-25 20:56:45,096][32866] Updated weights for policy 0, policy_version 5309 (0.0012) [2023-02-25 20:56:48,377][00219] Fps is (10 sec: 5324.9, 60 sec: 5051.9, 300 sec: 4887.4). Total num frames: 21757952. Throughput: 0: 1245.2. Samples: 4436304. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:56:48,385][00219] Avg episode reward: [(0, '32.172')] [2023-02-25 20:56:53,377][00219] Fps is (10 sec: 4505.8, 60 sec: 4983.5, 300 sec: 4873.6). Total num frames: 21778432. Throughput: 0: 1222.2. Samples: 4442224. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:56:53,379][00219] Avg episode reward: [(0, '32.532')] [2023-02-25 20:56:55,284][32866] Updated weights for policy 0, policy_version 5319 (0.0012) [2023-02-25 20:56:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 21803008. Throughput: 0: 1254.4. Samples: 4449440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:56:58,378][00219] Avg episode reward: [(0, '32.771')] [2023-02-25 20:57:01,885][32866] Updated weights for policy 0, policy_version 5329 (0.0012) [2023-02-25 20:57:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4873.6). Total num frames: 21831680. Throughput: 0: 1259.0. Samples: 4453872. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:57:03,379][00219] Avg episode reward: [(0, '32.985')] [2023-02-25 20:57:08,377][00219] Fps is (10 sec: 5324.5, 60 sec: 4983.8, 300 sec: 4873.6). Total num frames: 21856256. Throughput: 0: 1240.5. Samples: 4462240. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:57:08,385][00219] Avg episode reward: [(0, '33.276')] [2023-02-25 20:57:10,866][32866] Updated weights for policy 0, policy_version 5339 (0.0011) [2023-02-25 20:57:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.6, 300 sec: 4873.5). Total num frames: 21876736. Throughput: 0: 1223.1. Samples: 4468064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:57:13,384][00219] Avg episode reward: [(0, '32.531')] [2023-02-25 20:57:18,377][00219] Fps is (10 sec: 4505.9, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 21901312. Throughput: 0: 1217.8. Samples: 4470816. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:57:18,382][00219] Avg episode reward: [(0, '32.695')] [2023-02-25 20:57:19,548][32866] Updated weights for policy 0, policy_version 5349 (0.0011) [2023-02-25 20:57:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 21929984. Throughput: 0: 1256.9. Samples: 4480000. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 20:57:23,379][00219] Avg episode reward: [(0, '31.805')] [2023-02-25 20:57:26,182][32866] Updated weights for policy 0, policy_version 5359 (0.0012) [2023-02-25 20:57:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4873.5). Total num frames: 21954560. Throughput: 0: 1229.5. Samples: 4487984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:57:28,381][00219] Avg episode reward: [(0, '32.149')] [2023-02-25 20:57:33,379][00219] Fps is (10 sec: 4095.2, 60 sec: 4915.0, 300 sec: 4873.5). Total num frames: 21970944. Throughput: 0: 1200.7. Samples: 4490336. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:57:33,383][00219] Avg episode reward: [(0, '32.753')] [2023-02-25 20:57:38,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4710.4, 300 sec: 4831.9). Total num frames: 21987328. Throughput: 0: 1173.7. Samples: 4495040. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:57:38,380][00219] Avg episode reward: [(0, '33.276')] [2023-02-25 20:57:39,139][32866] Updated weights for policy 0, policy_version 5369 (0.0012) [2023-02-25 20:57:43,377][00219] Fps is (10 sec: 3277.4, 60 sec: 4505.6, 300 sec: 4776.4). Total num frames: 22003712. Throughput: 0: 1123.2. Samples: 4499984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:57:43,383][00219] Avg episode reward: [(0, '33.205')] [2023-02-25 20:57:43,448][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005373_22007808.pth... [2023-02-25 20:57:43,544][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005088_20840448.pth [2023-02-25 20:57:47,411][32866] Updated weights for policy 0, policy_version 5379 (0.0011) [2023-02-25 20:57:48,378][00219] Fps is (10 sec: 4504.8, 60 sec: 4573.7, 300 sec: 4818.0). Total num frames: 22032384. Throughput: 0: 1122.4. Samples: 4504384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:57:48,384][00219] Avg episode reward: [(0, '34.126')] [2023-02-25 20:57:53,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4778.7, 300 sec: 4845.8). Total num frames: 22065152. Throughput: 0: 1138.1. Samples: 4513456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:57:53,381][00219] Avg episode reward: [(0, '34.434')] [2023-02-25 20:57:54,796][32866] Updated weights for policy 0, policy_version 5389 (0.0015) [2023-02-25 20:57:58,377][00219] Fps is (10 sec: 4916.1, 60 sec: 4642.1, 300 sec: 4818.0). Total num frames: 22081536. Throughput: 0: 1149.9. Samples: 4519808. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:57:58,383][00219] Avg episode reward: [(0, '35.722')] [2023-02-25 20:58:03,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4505.6, 300 sec: 4790.2). Total num frames: 22102016. Throughput: 0: 1154.1. Samples: 4522752. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:58:03,385][00219] Avg episode reward: [(0, '35.373')] [2023-02-25 20:58:05,308][32866] Updated weights for policy 0, policy_version 5399 (0.0011) [2023-02-25 20:58:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.2, 300 sec: 4818.0). Total num frames: 22134784. Throughput: 0: 1117.9. Samples: 4530304. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:58:08,379][00219] Avg episode reward: [(0, '34.483')] [2023-02-25 20:58:11,632][32866] Updated weights for policy 0, policy_version 5409 (0.0019) [2023-02-25 20:58:13,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4778.7, 300 sec: 4859.7). Total num frames: 22163456. Throughput: 0: 1148.8. Samples: 4539680. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:58:13,380][00219] Avg episode reward: [(0, '34.819')] [2023-02-25 20:58:18,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4710.2, 300 sec: 4831.9). Total num frames: 22183936. Throughput: 0: 1167.3. Samples: 4542864. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:58:18,386][00219] Avg episode reward: [(0, '33.286')] [2023-02-25 20:58:21,557][32866] Updated weights for policy 0, policy_version 5419 (0.0011) [2023-02-25 20:58:23,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4505.6, 300 sec: 4790.2). Total num frames: 22200320. Throughput: 0: 1192.2. Samples: 4548688. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 20:58:23,380][00219] Avg episode reward: [(0, '31.981')] [2023-02-25 20:58:28,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4642.1, 300 sec: 4818.0). Total num frames: 22233088. Throughput: 0: 1254.8. Samples: 4556448. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:58:28,379][00219] Avg episode reward: [(0, '31.238')] [2023-02-25 20:58:29,093][32866] Updated weights for policy 0, policy_version 5429 (0.0011) [2023-02-25 20:58:33,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4847.1, 300 sec: 4873.6). Total num frames: 22261760. Throughput: 0: 1258.7. 
Samples: 4561024. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 20:58:33,378][00219] Avg episode reward: [(0, '30.627')] [2023-02-25 20:58:36,341][32866] Updated weights for policy 0, policy_version 5439 (0.0011) [2023-02-25 20:58:38,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 22286336. Throughput: 0: 1229.9. Samples: 4568800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 20:58:38,382][00219] Avg episode reward: [(0, '29.288')] [2023-02-25 20:58:43,377][00219] Fps is (10 sec: 4095.7, 60 sec: 4983.4, 300 sec: 4887.4). Total num frames: 22302720. Throughput: 0: 1219.2. Samples: 4574672. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:58:43,379][00219] Avg episode reward: [(0, '30.704')] [2023-02-25 20:58:45,995][32866] Updated weights for policy 0, policy_version 5449 (0.0012) [2023-02-25 20:58:48,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4983.6, 300 sec: 4887.4). Total num frames: 22331392. Throughput: 0: 1230.6. Samples: 4578128. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 20:58:48,380][00219] Avg episode reward: [(0, '30.630')] [2023-02-25 20:58:53,158][32866] Updated weights for policy 0, policy_version 5459 (0.0012) [2023-02-25 20:58:53,377][00219] Fps is (10 sec: 6144.5, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 22364160. Throughput: 0: 1266.5. Samples: 4587296. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:58:53,379][00219] Avg episode reward: [(0, '31.528')] [2023-02-25 20:58:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 5051.7, 300 sec: 4915.3). Total num frames: 22384640. Throughput: 0: 1226.3. Samples: 4594864. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 20:58:58,380][00219] Avg episode reward: [(0, '32.053')] [2023-02-25 20:59:02,483][32866] Updated weights for policy 0, policy_version 5469 (0.0011) [2023-02-25 20:59:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 5051.7, 300 sec: 4901.3). Total num frames: 22405120. 
Throughput: 0: 1221.0. Samples: 4597808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:59:03,385][00219] Avg episode reward: [(0, '31.994')] [2023-02-25 20:59:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4901.3). Total num frames: 22429696. Throughput: 0: 1232.7. Samples: 4604160. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 20:59:08,378][00219] Avg episode reward: [(0, '31.557')] [2023-02-25 20:59:10,177][32866] Updated weights for policy 0, policy_version 5479 (0.0012) [2023-02-25 20:59:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 22458368. Throughput: 0: 1265.4. Samples: 4613392. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:59:13,386][00219] Avg episode reward: [(0, '32.134')] [2023-02-25 20:59:17,638][32866] Updated weights for policy 0, policy_version 5489 (0.0012) [2023-02-25 20:59:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.6, 300 sec: 4887.4). Total num frames: 22482944. Throughput: 0: 1263.6. Samples: 4617888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 20:59:18,379][00219] Avg episode reward: [(0, '32.242')] [2023-02-25 20:59:23,381][00219] Fps is (10 sec: 4503.7, 60 sec: 5051.4, 300 sec: 4901.2). Total num frames: 22503424. Throughput: 0: 1221.6. Samples: 4623776. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:59:23,391][00219] Avg episode reward: [(0, '31.954')] [2023-02-25 20:59:27,545][32866] Updated weights for policy 0, policy_version 5499 (0.0012) [2023-02-25 20:59:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4887.4). Total num frames: 22523904. Throughput: 0: 1226.0. Samples: 4629840. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 20:59:28,379][00219] Avg episode reward: [(0, '32.814')] [2023-02-25 20:59:33,377][00219] Fps is (10 sec: 5327.1, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 22556672. Throughput: 0: 1251.9. Samples: 4634464. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:59:33,379][00219] Avg episode reward: [(0, '32.126')] [2023-02-25 20:59:34,512][32866] Updated weights for policy 0, policy_version 5509 (0.0011) [2023-02-25 20:59:38,382][00219] Fps is (10 sec: 6140.9, 60 sec: 4983.1, 300 sec: 4915.1). Total num frames: 22585344. Throughput: 0: 1251.8. Samples: 4643632. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 20:59:38,387][00219] Avg episode reward: [(0, '32.026')] [2023-02-25 20:59:43,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 22601728. Throughput: 0: 1214.9. Samples: 4649536. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:59:43,385][00219] Avg episode reward: [(0, '32.667')] [2023-02-25 20:59:43,401][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005518_22601728.pth... [2023-02-25 20:59:43,590][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005234_21438464.pth [2023-02-25 20:59:43,975][32866] Updated weights for policy 0, policy_version 5519 (0.0012) [2023-02-25 20:59:48,377][00219] Fps is (10 sec: 3688.3, 60 sec: 4846.9, 300 sec: 4873.6). Total num frames: 22622208. Throughput: 0: 1215.3. Samples: 4652496. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 20:59:48,394][00219] Avg episode reward: [(0, '31.932')] [2023-02-25 20:59:51,920][32866] Updated weights for policy 0, policy_version 5529 (0.0014) [2023-02-25 20:59:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4887.4). Total num frames: 22654976. Throughput: 0: 1247.6. Samples: 4660304. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:59:53,379][00219] Avg episode reward: [(0, '30.181')] [2023-02-25 20:59:58,378][00219] Fps is (10 sec: 6143.2, 60 sec: 4983.4, 300 sec: 4901.3). Total num frames: 22683648. Throughput: 0: 1249.0. Samples: 4669600. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 20:59:58,385][00219] Avg episode reward: [(0, '32.259')] [2023-02-25 20:59:59,116][32866] Updated weights for policy 0, policy_version 5539 (0.0012) [2023-02-25 21:00:03,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4983.5, 300 sec: 4901.4). Total num frames: 22704128. Throughput: 0: 1219.2. Samples: 4672752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:00:03,379][00219] Avg episode reward: [(0, '31.455')] [2023-02-25 21:00:08,377][00219] Fps is (10 sec: 3686.9, 60 sec: 4846.9, 300 sec: 4873.5). Total num frames: 22720512. Throughput: 0: 1219.7. Samples: 4678656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:00:08,380][00219] Avg episode reward: [(0, '30.671')] [2023-02-25 21:00:09,283][32866] Updated weights for policy 0, policy_version 5549 (0.0017) [2023-02-25 21:00:13,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 22753280. Throughput: 0: 1259.4. Samples: 4686512. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:00:13,378][00219] Avg episode reward: [(0, '30.240')] [2023-02-25 21:00:15,871][32866] Updated weights for policy 0, policy_version 5559 (0.0011) [2023-02-25 21:00:18,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 22781952. Throughput: 0: 1259.0. Samples: 4691120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:00:18,382][00219] Avg episode reward: [(0, '32.220')] [2023-02-25 21:00:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.8, 300 sec: 4901.3). Total num frames: 22802432. Throughput: 0: 1225.0. Samples: 4698752. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:00:23,382][00219] Avg episode reward: [(0, '30.253')] [2023-02-25 21:00:24,520][32866] Updated weights for policy 0, policy_version 5569 (0.0011) [2023-02-25 21:00:28,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 22822912. Throughput: 0: 1224.5. 
Samples: 4704640. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:00:28,382][00219] Avg episode reward: [(0, '31.043')] [2023-02-25 21:00:33,064][32866] Updated weights for policy 0, policy_version 5579 (0.0037) [2023-02-25 21:00:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 22851584. Throughput: 0: 1232.7. Samples: 4707968. Policy #0 lag: (min: 0.0, avg: 0.8, max: 3.0) [2023-02-25 21:00:33,378][00219] Avg episode reward: [(0, '32.078')] [2023-02-25 21:00:38,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.6, 300 sec: 4887.5). Total num frames: 22880256. Throughput: 0: 1264.0. Samples: 4717184. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:00:38,383][00219] Avg episode reward: [(0, '33.858')] [2023-02-25 21:00:40,112][32866] Updated weights for policy 0, policy_version 5589 (0.0012) [2023-02-25 21:00:43,377][00219] Fps is (10 sec: 5324.8, 60 sec: 5051.7, 300 sec: 4915.2). Total num frames: 22904832. Throughput: 0: 1228.5. Samples: 4724880. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:00:43,382][00219] Avg episode reward: [(0, '32.553')] [2023-02-25 21:00:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 22921216. Throughput: 0: 1224.5. Samples: 4727856. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:00:48,381][00219] Avg episode reward: [(0, '33.134')] [2023-02-25 21:00:50,968][32866] Updated weights for policy 0, policy_version 5599 (0.0011) [2023-02-25 21:00:53,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 22949888. Throughput: 0: 1230.6. Samples: 4734032. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:00:53,387][00219] Avg episode reward: [(0, '33.792')] [2023-02-25 21:00:57,168][32866] Updated weights for policy 0, policy_version 5609 (0.0012) [2023-02-25 21:00:58,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4915.3, 300 sec: 4887.4). Total num frames: 22978560. 
Throughput: 0: 1259.7. Samples: 4743200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:00:58,378][00219] Avg episode reward: [(0, '31.424')] [2023-02-25 21:01:03,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4983.5, 300 sec: 4901.4). Total num frames: 23003136. Throughput: 0: 1260.8. Samples: 4747856. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:01:03,383][00219] Avg episode reward: [(0, '32.479')] [2023-02-25 21:01:05,872][32866] Updated weights for policy 0, policy_version 5619 (0.0013) [2023-02-25 21:01:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 5051.7, 300 sec: 4901.3). Total num frames: 23023616. Throughput: 0: 1221.0. Samples: 4753696. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:01:08,384][00219] Avg episode reward: [(0, '31.984')] [2023-02-25 21:01:13,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4846.9, 300 sec: 4873.5). Total num frames: 23044096. Throughput: 0: 1229.2. Samples: 4759952. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:01:13,379][00219] Avg episode reward: [(0, '32.477')] [2023-02-25 21:01:15,025][32866] Updated weights for policy 0, policy_version 5629 (0.0012) [2023-02-25 21:01:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23076864. Throughput: 0: 1254.4. Samples: 4764416. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:01:18,385][00219] Avg episode reward: [(0, '31.190')] [2023-02-25 21:01:21,456][32866] Updated weights for policy 0, policy_version 5639 (0.0012) [2023-02-25 21:01:23,378][00219] Fps is (10 sec: 5733.5, 60 sec: 4983.3, 300 sec: 4901.3). Total num frames: 23101440. Throughput: 0: 1255.4. Samples: 4773680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:01:23,381][00219] Avg episode reward: [(0, '32.399')] [2023-02-25 21:01:28,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4914.7, 300 sec: 4887.3). Total num frames: 23117824. Throughput: 0: 1194.5. Samples: 4778640. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:01:28,385][00219] Avg episode reward: [(0, '31.832')] [2023-02-25 21:01:33,377][00219] Fps is (10 sec: 3277.2, 60 sec: 4710.4, 300 sec: 4845.8). Total num frames: 23134208. Throughput: 0: 1180.8. Samples: 4780992. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:01:33,379][00219] Avg episode reward: [(0, '31.308')] [2023-02-25 21:01:33,858][32866] Updated weights for policy 0, policy_version 5649 (0.0011) [2023-02-25 21:01:38,377][00219] Fps is (10 sec: 3278.8, 60 sec: 4505.6, 300 sec: 4804.1). Total num frames: 23150592. Throughput: 0: 1149.2. Samples: 4785744. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:01:38,381][00219] Avg episode reward: [(0, '31.355')] [2023-02-25 21:01:43,043][32866] Updated weights for policy 0, policy_version 5659 (0.0011) [2023-02-25 21:01:43,377][00219] Fps is (10 sec: 4505.8, 60 sec: 4573.9, 300 sec: 4818.0). Total num frames: 23179264. Throughput: 0: 1117.2. Samples: 4793472. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:01:43,378][00219] Avg episode reward: [(0, '31.450')] [2023-02-25 21:01:43,393][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005659_23179264.pth... [2023-02-25 21:01:43,487][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005373_22007808.pth [2023-02-25 21:01:48,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 23212032. Throughput: 0: 1112.9. Samples: 4797936. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:01:48,383][00219] Avg episode reward: [(0, '31.419')] [2023-02-25 21:01:50,070][32866] Updated weights for policy 0, policy_version 5669 (0.0011) [2023-02-25 21:01:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4642.1, 300 sec: 4831.9). Total num frames: 23228416. Throughput: 0: 1148.4. Samples: 4805376. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:01:53,382][00219] Avg episode reward: [(0, '30.423')] [2023-02-25 21:01:58,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4505.6, 300 sec: 4804.1). Total num frames: 23248896. Throughput: 0: 1136.0. Samples: 4811072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:01:58,382][00219] Avg episode reward: [(0, '30.524')] [2023-02-25 21:02:00,591][32866] Updated weights for policy 0, policy_version 5679 (0.0012) [2023-02-25 21:02:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4573.9, 300 sec: 4818.0). Total num frames: 23277568. Throughput: 0: 1120.0. Samples: 4814816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:02:03,381][00219] Avg episode reward: [(0, '31.987')] [2023-02-25 21:02:06,987][32866] Updated weights for policy 0, policy_version 5689 (0.0012) [2023-02-25 21:02:08,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4778.7, 300 sec: 4859.7). Total num frames: 23310336. Throughput: 0: 1117.2. Samples: 4823952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:02:08,379][00219] Avg episode reward: [(0, '33.876')] [2023-02-25 21:02:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4845.8). Total num frames: 23330816. Throughput: 0: 1166.4. Samples: 4831120. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:02:13,384][00219] Avg episode reward: [(0, '35.219')] [2023-02-25 21:02:16,872][32866] Updated weights for policy 0, policy_version 5699 (0.0015) [2023-02-25 21:02:18,377][00219] Fps is (10 sec: 3686.2, 60 sec: 4505.6, 300 sec: 4804.1). Total num frames: 23347200. Throughput: 0: 1178.3. Samples: 4834016. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:02:18,382][00219] Avg episode reward: [(0, '35.341')] [2023-02-25 21:02:23,377][00219] Fps is (10 sec: 4505.4, 60 sec: 4574.0, 300 sec: 4818.0). Total num frames: 23375872. Throughput: 0: 1222.7. Samples: 4840768. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:02:23,379][00219] Avg episode reward: [(0, '35.676')] [2023-02-25 21:02:24,474][32866] Updated weights for policy 0, policy_version 5709 (0.0018) [2023-02-25 21:02:28,377][00219] Fps is (10 sec: 5734.7, 60 sec: 4779.1, 300 sec: 4859.7). Total num frames: 23404544. Throughput: 0: 1250.8. Samples: 4849760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:02:28,379][00219] Avg episode reward: [(0, '35.976')] [2023-02-25 21:02:32,107][32866] Updated weights for policy 0, policy_version 5719 (0.0014) [2023-02-25 21:02:33,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 23429120. Throughput: 0: 1249.1. Samples: 4854144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:02:33,381][00219] Avg episode reward: [(0, '33.158')] [2023-02-25 21:02:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 23445504. Throughput: 0: 1214.6. Samples: 4860032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:02:38,379][00219] Avg episode reward: [(0, '32.515')] [2023-02-25 21:02:41,923][32866] Updated weights for policy 0, policy_version 5729 (0.0013) [2023-02-25 21:02:43,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4915.2, 300 sec: 4887.5). Total num frames: 23474176. Throughput: 0: 1237.3. Samples: 4866752. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:02:43,382][00219] Avg episode reward: [(0, '32.349')] [2023-02-25 21:02:48,357][32866] Updated weights for policy 0, policy_version 5739 (0.0012) [2023-02-25 21:02:48,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 23506944. Throughput: 0: 1255.8. Samples: 4871328. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:02:48,382][00219] Avg episode reward: [(0, '31.095')] [2023-02-25 21:02:53,380][00219] Fps is (10 sec: 5323.2, 60 sec: 4983.2, 300 sec: 4901.3). Total num frames: 23527424. Throughput: 0: 1251.1. 
Samples: 4880256. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 21:02:53,389][00219] Avg episode reward: [(0, '30.187')] [2023-02-25 21:02:57,219][32866] Updated weights for policy 0, policy_version 5749 (0.0013) [2023-02-25 21:02:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 23547904. Throughput: 0: 1221.0. Samples: 4886064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:02:58,378][00219] Avg episode reward: [(0, '31.281')] [2023-02-25 21:03:03,377][00219] Fps is (10 sec: 4506.9, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23572480. Throughput: 0: 1222.8. Samples: 4889040. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:03:03,379][00219] Avg episode reward: [(0, '31.145')] [2023-02-25 21:03:06,025][32866] Updated weights for policy 0, policy_version 5759 (0.0012) [2023-02-25 21:03:08,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4873.5). Total num frames: 23601152. Throughput: 0: 1252.3. Samples: 4897120. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:03:08,382][00219] Avg episode reward: [(0, '33.003')] [2023-02-25 21:03:13,289][32866] Updated weights for policy 0, policy_version 5769 (0.0012) [2023-02-25 21:03:13,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 23629824. Throughput: 0: 1253.3. Samples: 4906160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:03:13,379][00219] Avg episode reward: [(0, '33.622')] [2023-02-25 21:03:18,379][00219] Fps is (10 sec: 4504.7, 60 sec: 4983.3, 300 sec: 4901.3). Total num frames: 23646208. Throughput: 0: 1221.3. Samples: 4909104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:03:18,381][00219] Avg episode reward: [(0, '33.902')] [2023-02-25 21:03:23,171][32866] Updated weights for policy 0, policy_version 5779 (0.0012) [2023-02-25 21:03:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23670784. 
Throughput: 0: 1221.7. Samples: 4915008. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:03:23,381][00219] Avg episode reward: [(0, '33.914')] [2023-02-25 21:03:28,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 23695360. Throughput: 0: 1251.6. Samples: 4923072. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:03:28,382][00219] Avg episode reward: [(0, '33.535')] [2023-02-25 21:03:30,114][32866] Updated weights for policy 0, policy_version 5789 (0.0012) [2023-02-25 21:03:33,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 23728128. Throughput: 0: 1253.0. Samples: 4927712. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:03:33,385][00219] Avg episode reward: [(0, '30.373')] [2023-02-25 21:03:38,380][00219] Fps is (10 sec: 5323.2, 60 sec: 5051.5, 300 sec: 4901.3). Total num frames: 23748608. Throughput: 0: 1219.2. Samples: 4935120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:03:38,382][00219] Avg episode reward: [(0, '31.772')] [2023-02-25 21:03:38,795][32866] Updated weights for policy 0, policy_version 5799 (0.0013) [2023-02-25 21:03:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23769088. Throughput: 0: 1218.1. Samples: 4940880. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:03:43,382][00219] Avg episode reward: [(0, '29.829')] [2023-02-25 21:03:43,394][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005803_23769088.pth... [2023-02-25 21:03:43,528][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005518_22601728.pth [2023-02-25 21:03:47,662][32866] Updated weights for policy 0, policy_version 5809 (0.0012) [2023-02-25 21:03:48,377][00219] Fps is (10 sec: 4507.0, 60 sec: 4778.7, 300 sec: 4845.8). Total num frames: 23793664. Throughput: 0: 1229.2. Samples: 4944352. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:03:48,379][00219] Avg episode reward: [(0, '29.307')] [2023-02-25 21:03:53,377][00219] Fps is (10 sec: 5734.2, 60 sec: 4983.7, 300 sec: 4887.4). Total num frames: 23826432. Throughput: 0: 1254.0. Samples: 4953552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:03:53,379][00219] Avg episode reward: [(0, '29.908')] [2023-02-25 21:03:54,756][32866] Updated weights for policy 0, policy_version 5819 (0.0011) [2023-02-25 21:03:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 23846912. Throughput: 0: 1214.2. Samples: 4960800. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:03:58,379][00219] Avg episode reward: [(0, '30.530')] [2023-02-25 21:04:03,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23867392. Throughput: 0: 1214.3. Samples: 4963744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:04:03,380][00219] Avg episode reward: [(0, '31.064')] [2023-02-25 21:04:05,392][32866] Updated weights for policy 0, policy_version 5829 (0.0011) [2023-02-25 21:04:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 23896064. Throughput: 0: 1229.5. Samples: 4970336. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:04:08,379][00219] Avg episode reward: [(0, '31.944')] [2023-02-25 21:04:12,047][32866] Updated weights for policy 0, policy_version 5839 (0.0012) [2023-02-25 21:04:13,377][00219] Fps is (10 sec: 5734.8, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 23924736. Throughput: 0: 1251.9. Samples: 4979408. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:04:13,379][00219] Avg episode reward: [(0, '32.700')] [2023-02-25 21:04:18,377][00219] Fps is (10 sec: 5324.7, 60 sec: 5051.9, 300 sec: 4901.4). Total num frames: 23949312. Throughput: 0: 1247.6. Samples: 4983856. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:04:18,386][00219] Avg episode reward: [(0, '34.507')] [2023-02-25 21:04:19,957][32866] Updated weights for policy 0, policy_version 5849 (0.0011) [2023-02-25 21:04:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 23965696. Throughput: 0: 1216.4. Samples: 4989856. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:04:23,389][00219] Avg episode reward: [(0, '35.387')] [2023-02-25 21:04:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 23990272. Throughput: 0: 1228.4. Samples: 4996160. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:04:28,385][00219] Avg episode reward: [(0, '33.438')] [2023-02-25 21:04:29,464][32866] Updated weights for policy 0, policy_version 5859 (0.0012) [2023-02-25 21:04:33,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 24018944. Throughput: 0: 1251.6. Samples: 5000672. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:04:33,381][00219] Avg episode reward: [(0, '32.492')] [2023-02-25 21:04:36,057][32866] Updated weights for policy 0, policy_version 5869 (0.0011) [2023-02-25 21:04:38,380][00219] Fps is (10 sec: 5732.7, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 24047616. Throughput: 0: 1251.8. Samples: 5009888. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:04:38,382][00219] Avg episode reward: [(0, '31.125')] [2023-02-25 21:04:43,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 24068096. Throughput: 0: 1223.5. Samples: 5015856. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:04:43,381][00219] Avg episode reward: [(0, '29.696')] [2023-02-25 21:04:45,925][32866] Updated weights for policy 0, policy_version 5879 (0.0011) [2023-02-25 21:04:48,377][00219] Fps is (10 sec: 4097.3, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 24088576. Throughput: 0: 1222.8. 
Samples: 5018768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:04:48,378][00219] Avg episode reward: [(0, '29.139')] [2023-02-25 21:04:53,190][32866] Updated weights for policy 0, policy_version 5889 (0.0011) [2023-02-25 21:04:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.2, 300 sec: 4873.6). Total num frames: 24121344. Throughput: 0: 1255.5. Samples: 5026832. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:04:53,385][00219] Avg episode reward: [(0, '29.700')] [2023-02-25 21:04:58,377][00219] Fps is (10 sec: 6144.0, 60 sec: 5051.7, 300 sec: 4901.3). Total num frames: 24150016. Throughput: 0: 1255.1. Samples: 5035888. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:04:58,379][00219] Avg episode reward: [(0, '31.119')] [2023-02-25 21:05:02,046][32866] Updated weights for policy 0, policy_version 5899 (0.0011) [2023-02-25 21:05:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 24166400. Throughput: 0: 1219.6. Samples: 5038736. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:05:03,387][00219] Avg episode reward: [(0, '31.145')] [2023-02-25 21:05:08,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4778.7, 300 sec: 4845.8). Total num frames: 24182784. Throughput: 0: 1213.5. Samples: 5044464. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:05:08,386][00219] Avg episode reward: [(0, '32.750')] [2023-02-25 21:05:10,875][32866] Updated weights for policy 0, policy_version 5909 (0.0012) [2023-02-25 21:05:13,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 24215552. Throughput: 0: 1254.4. Samples: 5052608. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:05:13,379][00219] Avg episode reward: [(0, '34.978')] [2023-02-25 21:05:17,649][32866] Updated weights for policy 0, policy_version 5919 (0.0012) [2023-02-25 21:05:18,381][00219] Fps is (10 sec: 6141.6, 60 sec: 4914.9, 300 sec: 4887.4). Total num frames: 24244224. 
Throughput: 0: 1257.8. Samples: 5057280. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:05:18,388][00219] Avg episode reward: [(0, '35.794')] [2023-02-25 21:05:23,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4983.4, 300 sec: 4887.4). Total num frames: 24264704. Throughput: 0: 1195.4. Samples: 5063680. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:05:23,380][00219] Avg episode reward: [(0, '36.696')] [2023-02-25 21:05:28,377][00219] Fps is (10 sec: 3687.9, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 24281088. Throughput: 0: 1168.0. Samples: 5068416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:05:28,379][00219] Avg episode reward: [(0, '37.334')] [2023-02-25 21:05:28,384][32858] Saving new best policy, reward=37.334! [2023-02-25 21:05:30,842][32866] Updated weights for policy 0, policy_version 5929 (0.0011) [2023-02-25 21:05:33,377][00219] Fps is (10 sec: 2867.3, 60 sec: 4573.9, 300 sec: 4790.2). Total num frames: 24293376. Throughput: 0: 1156.6. Samples: 5070816. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:05:33,384][00219] Avg episode reward: [(0, '36.689')] [2023-02-25 21:05:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4574.1, 300 sec: 4804.1). Total num frames: 24322048. Throughput: 0: 1110.0. Samples: 5076784. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:05:38,385][00219] Avg episode reward: [(0, '36.471')] [2023-02-25 21:05:39,431][32866] Updated weights for policy 0, policy_version 5939 (0.0012) [2023-02-25 21:05:43,377][00219] Fps is (10 sec: 5324.6, 60 sec: 4642.1, 300 sec: 4831.9). Total num frames: 24346624. Throughput: 0: 1113.2. Samples: 5085984. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:05:43,380][00219] Avg episode reward: [(0, '35.326')] [2023-02-25 21:05:43,391][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005944_24346624.pth... 
[2023-02-25 21:05:43,518][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005659_23179264.pth [2023-02-25 21:05:46,615][32866] Updated weights for policy 0, policy_version 5949 (0.0011) [2023-02-25 21:05:48,379][00219] Fps is (10 sec: 4914.1, 60 sec: 4710.2, 300 sec: 4818.0). Total num frames: 24371200. Throughput: 0: 1144.8. Samples: 5090256. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:05:48,381][00219] Avg episode reward: [(0, '35.528')] [2023-02-25 21:05:53,377][00219] Fps is (10 sec: 4505.7, 60 sec: 4505.6, 300 sec: 4790.2). Total num frames: 24391680. Throughput: 0: 1148.8. Samples: 5096160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:05:53,381][00219] Avg episode reward: [(0, '35.724')] [2023-02-25 21:05:56,794][32866] Updated weights for policy 0, policy_version 5959 (0.0012) [2023-02-25 21:05:58,377][00219] Fps is (10 sec: 4506.6, 60 sec: 4437.3, 300 sec: 4790.2). Total num frames: 24416256. Throughput: 0: 1112.2. Samples: 5102656. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:05:58,382][00219] Avg episode reward: [(0, '35.456')] [2023-02-25 21:06:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4818.0). Total num frames: 24444928. Throughput: 0: 1111.9. Samples: 5107312. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:06:03,379][00219] Avg episode reward: [(0, '34.529')] [2023-02-25 21:06:03,663][32866] Updated weights for policy 0, policy_version 5969 (0.0012) [2023-02-25 21:06:08,379][00219] Fps is (10 sec: 5733.2, 60 sec: 4846.8, 300 sec: 4845.7). Total num frames: 24473600. Throughput: 0: 1171.5. Samples: 5116400. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:06:08,383][00219] Avg episode reward: [(0, '34.465')] [2023-02-25 21:06:12,036][32866] Updated weights for policy 0, policy_version 5979 (0.0012) [2023-02-25 21:06:13,382][00219] Fps is (10 sec: 4912.7, 60 sec: 4641.8, 300 sec: 4804.0). Total num frames: 24494080. 
Throughput: 0: 1198.4. Samples: 5122352. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:06:13,384][00219] Avg episode reward: [(0, '34.009')] [2023-02-25 21:06:18,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4574.2, 300 sec: 4804.1). Total num frames: 24518656. Throughput: 0: 1210.0. Samples: 5125264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:06:18,382][00219] Avg episode reward: [(0, '34.079')] [2023-02-25 21:06:20,775][32866] Updated weights for policy 0, policy_version 5989 (0.0016) [2023-02-25 21:06:23,377][00219] Fps is (10 sec: 5327.5, 60 sec: 4710.4, 300 sec: 4845.9). Total num frames: 24547328. Throughput: 0: 1259.0. Samples: 5133440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:06:23,383][00219] Avg episode reward: [(0, '33.147')] [2023-02-25 21:06:27,141][32866] Updated weights for policy 0, policy_version 5999 (0.0012) [2023-02-25 21:06:28,378][00219] Fps is (10 sec: 5324.2, 60 sec: 4846.8, 300 sec: 4873.5). Total num frames: 24571904. Throughput: 0: 1256.5. Samples: 5142528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:06:28,382][00219] Avg episode reward: [(0, '32.828')] [2023-02-25 21:06:33,378][00219] Fps is (10 sec: 4504.9, 60 sec: 4983.3, 300 sec: 4887.4). Total num frames: 24592384. Throughput: 0: 1226.3. Samples: 5145440. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:06:33,380][00219] Avg episode reward: [(0, '32.308')] [2023-02-25 21:06:38,058][32866] Updated weights for policy 0, policy_version 6009 (0.0011) [2023-02-25 21:06:38,377][00219] Fps is (10 sec: 4506.0, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 24616960. Throughput: 0: 1229.2. Samples: 5151472. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:06:38,380][00219] Avg episode reward: [(0, '31.323')] [2023-02-25 21:06:43,377][00219] Fps is (10 sec: 4916.0, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 24641536. Throughput: 0: 1266.5. Samples: 5159648. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:06:43,382][00219] Avg episode reward: [(0, '31.404')] [2023-02-25 21:06:44,661][32866] Updated weights for policy 0, policy_version 6019 (0.0012) [2023-02-25 21:06:48,377][00219] Fps is (10 sec: 5734.5, 60 sec: 5051.9, 300 sec: 4901.3). Total num frames: 24674304. Throughput: 0: 1268.6. Samples: 5164400. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:06:48,382][00219] Avg episode reward: [(0, '31.841')] [2023-02-25 21:06:52,360][32866] Updated weights for policy 0, policy_version 6029 (0.0011) [2023-02-25 21:06:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 5051.7, 300 sec: 4901.3). Total num frames: 24694784. Throughput: 0: 1234.9. Samples: 5171968. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:06:53,381][00219] Avg episode reward: [(0, '32.487')] [2023-02-25 21:06:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4983.5, 300 sec: 4873.5). Total num frames: 24715264. Throughput: 0: 1235.7. Samples: 5177952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:06:58,382][00219] Avg episode reward: [(0, '33.075')] [2023-02-25 21:07:01,942][32866] Updated weights for policy 0, policy_version 6039 (0.0014) [2023-02-25 21:07:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 24739840. Throughput: 0: 1246.9. Samples: 5181376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:07:03,378][00219] Avg episode reward: [(0, '34.078')] [2023-02-25 21:07:08,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.6, 300 sec: 4887.4). Total num frames: 24772608. Throughput: 0: 1267.9. Samples: 5190496. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:07:08,383][00219] Avg episode reward: [(0, '34.632')] [2023-02-25 21:07:08,408][32866] Updated weights for policy 0, policy_version 6049 (0.0011) [2023-02-25 21:07:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4983.9, 300 sec: 4901.3). Total num frames: 24793088. Throughput: 0: 1230.3. 
Samples: 5197888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:07:13,386][00219] Avg episode reward: [(0, '35.778')] [2023-02-25 21:07:18,087][32866] Updated weights for policy 0, policy_version 6059 (0.0013) [2023-02-25 21:07:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4887.4). Total num frames: 24817664. Throughput: 0: 1230.6. Samples: 5200816. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:07:18,386][00219] Avg episode reward: [(0, '34.679')] [2023-02-25 21:07:23,378][00219] Fps is (10 sec: 4914.4, 60 sec: 4915.1, 300 sec: 4873.5). Total num frames: 24842240. Throughput: 0: 1245.5. Samples: 5207520. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:07:23,385][00219] Avg episode reward: [(0, '34.144')] [2023-02-25 21:07:26,124][32866] Updated weights for policy 0, policy_version 6069 (0.0012) [2023-02-25 21:07:28,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4983.6, 300 sec: 4887.4). Total num frames: 24870912. Throughput: 0: 1263.3. Samples: 5216496. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:07:28,379][00219] Avg episode reward: [(0, '32.151')] [2023-02-25 21:07:33,377][00219] Fps is (10 sec: 5325.7, 60 sec: 5051.9, 300 sec: 4915.2). Total num frames: 24895488. Throughput: 0: 1254.0. Samples: 5220832. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:07:33,383][00219] Avg episode reward: [(0, '31.341')] [2023-02-25 21:07:34,468][32866] Updated weights for policy 0, policy_version 6079 (0.0012) [2023-02-25 21:07:38,379][00219] Fps is (10 sec: 4095.1, 60 sec: 4915.0, 300 sec: 4873.5). Total num frames: 24911872. Throughput: 0: 1214.2. Samples: 5226608. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:07:38,389][00219] Avg episode reward: [(0, '28.466')] [2023-02-25 21:07:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 24936448. Throughput: 0: 1228.4. Samples: 5233232. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:07:43,388][00219] Avg episode reward: [(0, '28.281')] [2023-02-25 21:07:43,416][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006088_24936448.pth... [2023-02-25 21:07:43,544][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005803_23769088.pth [2023-02-25 21:07:43,640][32866] Updated weights for policy 0, policy_version 6089 (0.0012) [2023-02-25 21:07:48,377][00219] Fps is (10 sec: 5735.6, 60 sec: 4915.2, 300 sec: 4887.5). Total num frames: 24969216. Throughput: 0: 1249.1. Samples: 5237584. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:07:48,378][00219] Avg episode reward: [(0, '30.018')] [2023-02-25 21:07:50,226][32866] Updated weights for policy 0, policy_version 6099 (0.0011) [2023-02-25 21:07:53,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 24989696. Throughput: 0: 1234.8. Samples: 5246064. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:07:53,385][00219] Avg episode reward: [(0, '30.293')] [2023-02-25 21:07:58,377][00219] Fps is (10 sec: 4095.9, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 25010176. Throughput: 0: 1197.1. Samples: 5251760. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:07:58,383][00219] Avg episode reward: [(0, '29.884')] [2023-02-25 21:08:01,243][32866] Updated weights for policy 0, policy_version 6109 (0.0012) [2023-02-25 21:08:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 25034752. Throughput: 0: 1196.1. Samples: 5254640. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:08:03,387][00219] Avg episode reward: [(0, '32.202')] [2023-02-25 21:08:08,021][32866] Updated weights for policy 0, policy_version 6119 (0.0012) [2023-02-25 21:08:08,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4915.2, 300 sec: 4873.5). Total num frames: 25067520. Throughput: 0: 1238.8. Samples: 5263264. 
Policy #0 lag: (min: 1.0, avg: 1.3, max: 3.0) [2023-02-25 21:08:08,380][00219] Avg episode reward: [(0, '35.412')] [2023-02-25 21:08:13,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4901.3). Total num frames: 25092096. Throughput: 0: 1233.1. Samples: 5271984. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:08:13,379][00219] Avg episode reward: [(0, '35.558')] [2023-02-25 21:08:15,643][32866] Updated weights for policy 0, policy_version 6129 (0.0013) [2023-02-25 21:08:18,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4915.2, 300 sec: 4887.4). Total num frames: 25112576. Throughput: 0: 1200.0. Samples: 5274832. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:08:18,384][00219] Avg episode reward: [(0, '35.977')] [2023-02-25 21:08:23,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4778.8, 300 sec: 4859.7). Total num frames: 25128960. Throughput: 0: 1201.8. Samples: 5280688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:08:23,384][00219] Avg episode reward: [(0, '35.717')] [2023-02-25 21:08:25,179][32866] Updated weights for policy 0, policy_version 6139 (0.0011) [2023-02-25 21:08:28,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25161728. Throughput: 0: 1245.5. Samples: 5289280. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:08:28,379][00219] Avg episode reward: [(0, '34.262')] [2023-02-25 21:08:32,186][32866] Updated weights for policy 0, policy_version 6149 (0.0011) [2023-02-25 21:08:33,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4915.2, 300 sec: 4887.5). Total num frames: 25190400. Throughput: 0: 1248.7. Samples: 5293776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:08:33,379][00219] Avg episode reward: [(0, '34.283')] [2023-02-25 21:08:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.6, 300 sec: 4887.4). Total num frames: 25210880. Throughput: 0: 1214.2. Samples: 5300704. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:08:38,385][00219] Avg episode reward: [(0, '33.224')] [2023-02-25 21:08:41,935][32866] Updated weights for policy 0, policy_version 6159 (0.0011) [2023-02-25 21:08:43,379][00219] Fps is (10 sec: 4095.2, 60 sec: 4915.0, 300 sec: 4873.5). Total num frames: 25231360. Throughput: 0: 1219.1. Samples: 5306624. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:08:43,383][00219] Avg episode reward: [(0, '33.839')] [2023-02-25 21:08:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25260032. Throughput: 0: 1242.3. Samples: 5310544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:08:48,378][00219] Avg episode reward: [(0, '34.537')] [2023-02-25 21:08:49,133][32866] Updated weights for policy 0, policy_version 6169 (0.0012) [2023-02-25 21:08:53,377][00219] Fps is (10 sec: 6145.3, 60 sec: 5051.7, 300 sec: 4901.3). Total num frames: 25292800. Throughput: 0: 1254.4. Samples: 5319712. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:08:53,379][00219] Avg episode reward: [(0, '35.353')] [2023-02-25 21:08:57,706][32866] Updated weights for policy 0, policy_version 6179 (0.0012) [2023-02-25 21:08:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 5051.8, 300 sec: 4901.3). Total num frames: 25313280. Throughput: 0: 1217.8. Samples: 5326784. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:08:58,381][00219] Avg episode reward: [(0, '35.357')] [2023-02-25 21:09:03,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 25329664. Throughput: 0: 1217.4. Samples: 5329616. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:09:03,385][00219] Avg episode reward: [(0, '35.411')] [2023-02-25 21:09:06,951][32866] Updated weights for policy 0, policy_version 6189 (0.0012) [2023-02-25 21:09:08,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25358336. Throughput: 0: 1238.0. 
Samples: 5336400. Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 21:09:08,379][00219] Avg episode reward: [(0, '34.487')] [2023-02-25 21:09:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25382912. Throughput: 0: 1231.3. Samples: 5344688. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:09:13,379][00219] Avg episode reward: [(0, '34.290')] [2023-02-25 21:09:15,642][32866] Updated weights for policy 0, policy_version 6199 (0.0011) [2023-02-25 21:09:18,382][00219] Fps is (10 sec: 4094.0, 60 sec: 4778.3, 300 sec: 4859.6). Total num frames: 25399296. Throughput: 0: 1191.7. Samples: 5347408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:09:18,392][00219] Avg episode reward: [(0, '34.187')] [2023-02-25 21:09:23,379][00219] Fps is (10 sec: 3276.2, 60 sec: 4778.5, 300 sec: 4831.9). Total num frames: 25415680. Throughput: 0: 1142.0. Samples: 5352096. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:09:23,385][00219] Avg episode reward: [(0, '33.937')] [2023-02-25 21:09:28,132][32866] Updated weights for policy 0, policy_version 6209 (0.0011) [2023-02-25 21:09:28,377][00219] Fps is (10 sec: 3688.3, 60 sec: 4573.9, 300 sec: 4804.1). Total num frames: 25436160. Throughput: 0: 1124.0. Samples: 5357200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:09:28,386][00219] Avg episode reward: [(0, '33.714')] [2023-02-25 21:09:33,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4505.6, 300 sec: 4790.3). Total num frames: 25460736. Throughput: 0: 1112.5. Samples: 5360608. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:09:33,379][00219] Avg episode reward: [(0, '32.192')] [2023-02-25 21:09:35,044][32866] Updated weights for policy 0, policy_version 6219 (0.0012) [2023-02-25 21:09:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4818.0). Total num frames: 25489408. Throughput: 0: 1110.4. Samples: 5369680. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:09:38,379][00219] Avg episode reward: [(0, '30.211')] [2023-02-25 21:09:43,286][32866] Updated weights for policy 0, policy_version 6229 (0.0013) [2023-02-25 21:09:43,379][00219] Fps is (10 sec: 5323.6, 60 sec: 4710.4, 300 sec: 4831.9). Total num frames: 25513984. Throughput: 0: 1113.5. Samples: 5376896. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:09:43,381][00219] Avg episode reward: [(0, '28.612')] [2023-02-25 21:09:43,390][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006229_25513984.pth... [2023-02-25 21:09:43,583][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000005944_24346624.pth [2023-02-25 21:09:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4776.4). Total num frames: 25530368. Throughput: 0: 1112.5. Samples: 5379680. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:09:48,385][00219] Avg episode reward: [(0, '27.232')] [2023-02-25 21:09:52,827][32866] Updated weights for policy 0, policy_version 6239 (0.0022) [2023-02-25 21:09:53,377][00219] Fps is (10 sec: 4096.9, 60 sec: 4369.1, 300 sec: 4762.5). Total num frames: 25554944. Throughput: 0: 1105.4. Samples: 5386144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:09:53,378][00219] Avg episode reward: [(0, '25.982')] [2023-02-25 21:09:58,383][00219] Fps is (10 sec: 5321.6, 60 sec: 4505.1, 300 sec: 4804.0). Total num frames: 25583616. Throughput: 0: 1123.4. Samples: 5395248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:09:58,385][00219] Avg episode reward: [(0, '28.551')] [2023-02-25 21:09:59,798][32866] Updated weights for policy 0, policy_version 6249 (0.0012) [2023-02-25 21:10:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4831.9). Total num frames: 25608192. Throughput: 0: 1160.3. Samples: 5399616. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:10:03,385][00219] Avg episode reward: [(0, '27.509')] [2023-02-25 21:10:08,377][00219] Fps is (10 sec: 4508.3, 60 sec: 4505.6, 300 sec: 4790.2). Total num frames: 25628672. Throughput: 0: 1183.3. Samples: 5405344. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:10:08,380][00219] Avg episode reward: [(0, '28.389')] [2023-02-25 21:10:09,493][32866] Updated weights for policy 0, policy_version 6259 (0.0012) [2023-02-25 21:10:13,377][00219] Fps is (10 sec: 4505.4, 60 sec: 4505.6, 300 sec: 4776.4). Total num frames: 25653248. Throughput: 0: 1214.2. Samples: 5411840. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:10:13,387][00219] Avg episode reward: [(0, '29.428')] [2023-02-25 21:10:16,803][32866] Updated weights for policy 0, policy_version 6269 (0.0011) [2023-02-25 21:10:18,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4779.1, 300 sec: 4818.0). Total num frames: 25686016. Throughput: 0: 1239.1. Samples: 5416368. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:10:18,379][00219] Avg episode reward: [(0, '30.596')] [2023-02-25 21:10:23,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4915.4, 300 sec: 4845.8). Total num frames: 25710592. Throughput: 0: 1237.0. Samples: 5425344. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:10:23,385][00219] Avg episode reward: [(0, '29.374')] [2023-02-25 21:10:24,865][32866] Updated weights for policy 0, policy_version 6279 (0.0011) [2023-02-25 21:10:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25726976. Throughput: 0: 1204.3. Samples: 5431088. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:10:28,382][00219] Avg episode reward: [(0, '29.255')] [2023-02-25 21:10:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 25751552. Throughput: 0: 1208.5. Samples: 5434064. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:10:33,387][00219] Avg episode reward: [(0, '32.569')] [2023-02-25 21:10:34,590][32866] Updated weights for policy 0, policy_version 6289 (0.0012) [2023-02-25 21:10:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4846.9, 300 sec: 4859.7). Total num frames: 25780224. Throughput: 0: 1245.2. Samples: 5442176. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:10:38,379][00219] Avg episode reward: [(0, '33.030')] [2023-02-25 21:10:41,419][32866] Updated weights for policy 0, policy_version 6299 (0.0012) [2023-02-25 21:10:43,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4915.4, 300 sec: 4873.6). Total num frames: 25808896. Throughput: 0: 1235.7. Samples: 5450848. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:10:43,379][00219] Avg episode reward: [(0, '34.185')] [2023-02-25 21:10:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 25825280. Throughput: 0: 1202.1. Samples: 5453712. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:10:48,379][00219] Avg episode reward: [(0, '34.959')] [2023-02-25 21:10:51,169][32866] Updated weights for policy 0, policy_version 6309 (0.0012) [2023-02-25 21:10:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 25845760. Throughput: 0: 1204.3. Samples: 5459536. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:10:53,379][00219] Avg episode reward: [(0, '35.657')] [2023-02-25 21:10:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4915.7, 300 sec: 4859.7). Total num frames: 25878528. Throughput: 0: 1245.5. Samples: 5467888. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:10:58,379][00219] Avg episode reward: [(0, '35.103')] [2023-02-25 21:10:59,172][32866] Updated weights for policy 0, policy_version 6319 (0.0012) [2023-02-25 21:11:03,377][00219] Fps is (10 sec: 6143.9, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 25907200. Throughput: 0: 1243.4. 
Samples: 5472320. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:11:03,379][00219] Avg episode reward: [(0, '34.394')] [2023-02-25 21:11:07,251][32866] Updated weights for policy 0, policy_version 6329 (0.0011) [2023-02-25 21:11:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4983.5, 300 sec: 4859.7). Total num frames: 25927680. Throughput: 0: 1205.3. Samples: 5479584. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:11:08,378][00219] Avg episode reward: [(0, '34.346')] [2023-02-25 21:11:13,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 25948160. Throughput: 0: 1206.8. Samples: 5485392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:11:13,380][00219] Avg episode reward: [(0, '33.391')] [2023-02-25 21:11:16,464][32866] Updated weights for policy 0, policy_version 6339 (0.0012) [2023-02-25 21:11:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 25976832. Throughput: 0: 1223.1. Samples: 5489104. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:11:18,385][00219] Avg episode reward: [(0, '32.437')] [2023-02-25 21:11:23,248][32866] Updated weights for policy 0, policy_version 6349 (0.0012) [2023-02-25 21:11:23,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4915.2, 300 sec: 4859.7). Total num frames: 26005504. Throughput: 0: 1247.3. Samples: 5498304. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 21:11:23,379][00219] Avg episode reward: [(0, '30.791')] [2023-02-25 21:11:28,377][00219] Fps is (10 sec: 4914.8, 60 sec: 4983.4, 300 sec: 4859.7). Total num frames: 26025984. Throughput: 0: 1210.6. Samples: 5505328. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:11:28,380][00219] Avg episode reward: [(0, '31.578')] [2023-02-25 21:11:33,109][32866] Updated weights for policy 0, policy_version 6359 (0.0011) [2023-02-25 21:11:33,378][00219] Fps is (10 sec: 4095.6, 60 sec: 4915.1, 300 sec: 4845.8). Total num frames: 26046464. 
Throughput: 0: 1210.6. Samples: 5508192. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:11:33,382][00219] Avg episode reward: [(0, '30.681')] [2023-02-25 21:11:38,377][00219] Fps is (10 sec: 4505.9, 60 sec: 4846.9, 300 sec: 4845.8). Total num frames: 26071040. Throughput: 0: 1226.0. Samples: 5514704. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:11:38,379][00219] Avg episode reward: [(0, '32.445')] [2023-02-25 21:11:41,029][32866] Updated weights for policy 0, policy_version 6369 (0.0011) [2023-02-25 21:11:43,377][00219] Fps is (10 sec: 5325.3, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 26099712. Throughput: 0: 1242.3. Samples: 5523792. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:11:43,379][00219] Avg episode reward: [(0, '34.274')] [2023-02-25 21:11:43,475][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006373_26103808.pth... [2023-02-25 21:11:43,593][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006088_24936448.pth [2023-02-25 21:11:48,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 26120192. Throughput: 0: 1234.8. Samples: 5527888. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:11:48,381][00219] Avg episode reward: [(0, '34.848')] [2023-02-25 21:11:48,735][32866] Updated weights for policy 0, policy_version 6379 (0.0013) [2023-02-25 21:11:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 26140672. Throughput: 0: 1199.6. Samples: 5533568. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:11:53,384][00219] Avg episode reward: [(0, '35.429')] [2023-02-25 21:11:58,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4778.7, 300 sec: 4831.9). Total num frames: 26165248. Throughput: 0: 1221.7. Samples: 5540368. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:11:58,381][00219] Avg episode reward: [(0, '35.220')] [2023-02-25 21:11:58,435][32866] Updated weights for policy 0, policy_version 6389 (0.0012) [2023-02-25 21:12:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4818.0). Total num frames: 26193920. Throughput: 0: 1238.4. Samples: 5544832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:12:03,379][00219] Avg episode reward: [(0, '35.747')] [2023-02-25 21:12:05,324][32866] Updated weights for policy 0, policy_version 6399 (0.0012) [2023-02-25 21:12:08,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4915.2, 300 sec: 4845.8). Total num frames: 26222592. Throughput: 0: 1225.6. Samples: 5553456. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:12:08,379][00219] Avg episode reward: [(0, '36.694')] [2023-02-25 21:12:13,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 26238976. Throughput: 0: 1198.6. Samples: 5559264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:12:13,378][00219] Avg episode reward: [(0, '36.158')] [2023-02-25 21:12:15,842][32866] Updated weights for policy 0, policy_version 6409 (0.0013) [2023-02-25 21:12:18,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4778.7, 300 sec: 4818.0). Total num frames: 26263552. Throughput: 0: 1200.7. Samples: 5562224. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:12:18,381][00219] Avg episode reward: [(0, '36.379')] [2023-02-25 21:12:22,926][32866] Updated weights for policy 0, policy_version 6419 (0.0012) [2023-02-25 21:12:23,377][00219] Fps is (10 sec: 5734.5, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 26296320. Throughput: 0: 1242.3. Samples: 5570608. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:12:23,379][00219] Avg episode reward: [(0, '37.913')] [2023-02-25 21:12:23,389][32858] Saving new best policy, reward=37.913! 
[2023-02-25 21:12:28,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.3, 300 sec: 4831.9). Total num frames: 26320896. Throughput: 0: 1229.5. Samples: 5579120. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:12:28,379][00219] Avg episode reward: [(0, '36.820')] [2023-02-25 21:12:31,440][32866] Updated weights for policy 0, policy_version 6429 (0.0012) [2023-02-25 21:12:33,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4847.0, 300 sec: 4831.9). Total num frames: 26337280. Throughput: 0: 1201.8. Samples: 5581968. Policy #0 lag: (min: 0.0, avg: 1.0, max: 3.0) [2023-02-25 21:12:33,382][00219] Avg episode reward: [(0, '36.277')] [2023-02-25 21:12:38,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 26361856. Throughput: 0: 1201.4. Samples: 5587632. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:12:38,379][00219] Avg episode reward: [(0, '33.031')] [2023-02-25 21:12:40,406][32866] Updated weights for policy 0, policy_version 6439 (0.0012) [2023-02-25 21:12:43,376][00219] Fps is (10 sec: 5324.9, 60 sec: 4847.0, 300 sec: 4818.0). Total num frames: 26390528. Throughput: 0: 1239.8. Samples: 5596160. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:12:43,384][00219] Avg episode reward: [(0, '32.304')] [2023-02-25 21:12:47,380][32866] Updated weights for policy 0, policy_version 6449 (0.0012) [2023-02-25 21:12:48,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4983.5, 300 sec: 4845.8). Total num frames: 26419200. Throughput: 0: 1242.3. Samples: 5600736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:12:48,384][00219] Avg episode reward: [(0, '30.951')] [2023-02-25 21:12:53,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4983.5, 300 sec: 4845.8). Total num frames: 26439680. Throughput: 0: 1208.2. Samples: 5607824. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:12:53,379][00219] Avg episode reward: [(0, '30.140')] [2023-02-25 21:12:56,905][32866] Updated weights for policy 0, policy_version 6459 (0.0011) [2023-02-25 21:12:58,377][00219] Fps is (10 sec: 4095.8, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 26460160. Throughput: 0: 1210.0. Samples: 5613712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:12:58,383][00219] Avg episode reward: [(0, '29.891')] [2023-02-25 21:13:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4915.2, 300 sec: 4818.0). Total num frames: 26488832. Throughput: 0: 1230.9. Samples: 5617616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:13:03,378][00219] Avg episode reward: [(0, '31.023')] [2023-02-25 21:13:04,823][32866] Updated weights for policy 0, policy_version 6469 (0.0012) [2023-02-25 21:13:08,377][00219] Fps is (10 sec: 5734.6, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 26517504. Throughput: 0: 1245.5. Samples: 5626656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:13:08,384][00219] Avg episode reward: [(0, '32.211')] [2023-02-25 21:13:13,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4846.9, 300 sec: 4804.1). Total num frames: 26529792. Throughput: 0: 1172.6. Samples: 5631888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:13:13,381][00219] Avg episode reward: [(0, '32.143')] [2023-02-25 21:13:14,362][32866] Updated weights for policy 0, policy_version 6479 (0.0017) [2023-02-25 21:13:18,380][00219] Fps is (10 sec: 2866.3, 60 sec: 4710.2, 300 sec: 4804.1). Total num frames: 26546176. Throughput: 0: 1161.5. Samples: 5634240. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:13:18,388][00219] Avg episode reward: [(0, '32.104')] [2023-02-25 21:13:23,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4437.3, 300 sec: 4748.6). Total num frames: 26562560. Throughput: 0: 1140.3. Samples: 5638944. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:13:23,384][00219] Avg episode reward: [(0, '31.935')] [2023-02-25 21:13:26,142][32866] Updated weights for policy 0, policy_version 6489 (0.0017) [2023-02-25 21:13:28,377][00219] Fps is (10 sec: 4507.0, 60 sec: 4505.6, 300 sec: 4748.6). Total num frames: 26591232. Throughput: 0: 1104.4. Samples: 5645856. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:13:28,378][00219] Avg episode reward: [(0, '32.114')] [2023-02-25 21:13:32,863][32866] Updated weights for policy 0, policy_version 6499 (0.0012) [2023-02-25 21:13:33,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 26619904. Throughput: 0: 1104.7. Samples: 5650448. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:13:33,380][00219] Avg episode reward: [(0, '30.281')] [2023-02-25 21:13:38,379][00219] Fps is (10 sec: 5323.7, 60 sec: 4710.2, 300 sec: 4790.2). Total num frames: 26644480. Throughput: 0: 1128.5. Samples: 5658608. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:13:38,381][00219] Avg episode reward: [(0, '29.441')] [2023-02-25 21:13:42,214][32866] Updated weights for policy 0, policy_version 6509 (0.0013) [2023-02-25 21:13:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4505.6, 300 sec: 4748.6). Total num frames: 26660864. Throughput: 0: 1125.7. Samples: 5664368. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:13:43,379][00219] Avg episode reward: [(0, '28.782')] [2023-02-25 21:13:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006509_26660864.pth... [2023-02-25 21:13:43,564][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006229_25513984.pth [2023-02-25 21:13:48,377][00219] Fps is (10 sec: 4096.9, 60 sec: 4437.3, 300 sec: 4720.8). Total num frames: 26685440. Throughput: 0: 1103.6. Samples: 5667280. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:13:48,387][00219] Avg episode reward: [(0, '30.388')] [2023-02-25 21:13:50,537][32866] Updated weights for policy 0, policy_version 6519 (0.0012) [2023-02-25 21:13:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4573.9, 300 sec: 4748.6). Total num frames: 26714112. Throughput: 0: 1101.2. Samples: 5676208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:13:53,387][00219] Avg episode reward: [(0, '29.415')] [2023-02-25 21:13:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.2, 300 sec: 4776.4). Total num frames: 26738688. Throughput: 0: 1168.4. Samples: 5684464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:13:58,379][00219] Avg episode reward: [(0, '30.122')] [2023-02-25 21:13:58,410][32866] Updated weights for policy 0, policy_version 6529 (0.0011) [2023-02-25 21:14:03,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4573.7, 300 sec: 4762.4). Total num frames: 26763264. Throughput: 0: 1178.7. Samples: 5687280. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:03,385][00219] Avg episode reward: [(0, '31.701')] [2023-02-25 21:14:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4369.1, 300 sec: 4734.7). Total num frames: 26779648. Throughput: 0: 1199.3. Samples: 5692912. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:08,379][00219] Avg episode reward: [(0, '33.296')] [2023-02-25 21:14:08,508][32866] Updated weights for policy 0, policy_version 6539 (0.0012) [2023-02-25 21:14:13,377][00219] Fps is (10 sec: 5325.9, 60 sec: 4778.7, 300 sec: 4804.2). Total num frames: 26816512. Throughput: 0: 1243.7. Samples: 5701824. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:14:13,379][00219] Avg episode reward: [(0, '32.244')] [2023-02-25 21:14:14,694][32866] Updated weights for policy 0, policy_version 6549 (0.0012) [2023-02-25 21:14:18,377][00219] Fps is (10 sec: 6144.0, 60 sec: 4915.4, 300 sec: 4831.9). Total num frames: 26841088. Throughput: 0: 1243.7. 
Samples: 5706416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:14:18,379][00219] Avg episode reward: [(0, '33.517')] [2023-02-25 21:14:23,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4983.5, 300 sec: 4831.9). Total num frames: 26861568. Throughput: 0: 1208.6. Samples: 5712992. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:23,379][00219] Avg episode reward: [(0, '34.837')] [2023-02-25 21:14:24,850][32866] Updated weights for policy 0, policy_version 6559 (0.0011) [2023-02-25 21:14:28,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.6, 300 sec: 4804.1). Total num frames: 26877952. Throughput: 0: 1201.4. Samples: 5718432. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:28,379][00219] Avg episode reward: [(0, '35.884')] [2023-02-25 21:14:32,284][32866] Updated weights for policy 0, policy_version 6569 (0.0013) [2023-02-25 21:14:33,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4778.6, 300 sec: 4804.1). Total num frames: 26906624. Throughput: 0: 1237.7. Samples: 5722976. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:14:33,380][00219] Avg episode reward: [(0, '36.285')] [2023-02-25 21:14:38,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4915.4, 300 sec: 4831.9). Total num frames: 26939392. Throughput: 0: 1239.1. Samples: 5731968. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:14:38,379][00219] Avg episode reward: [(0, '35.562')] [2023-02-25 21:14:40,614][32866] Updated weights for policy 0, policy_version 6579 (0.0011) [2023-02-25 21:14:43,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 26955776. Throughput: 0: 1195.7. Samples: 5738272. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:14:43,386][00219] Avg episode reward: [(0, '36.042')] [2023-02-25 21:14:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 26976256. Throughput: 0: 1195.4. Samples: 5741072. 
Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:48,380][00219] Avg episode reward: [(0, '34.008')] [2023-02-25 21:14:50,160][32866] Updated weights for policy 0, policy_version 6589 (0.0011) [2023-02-25 21:14:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4846.9, 300 sec: 4818.1). Total num frames: 27004928. Throughput: 0: 1236.3. Samples: 5748544. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:14:53,379][00219] Avg episode reward: [(0, '33.644')] [2023-02-25 21:14:57,265][32866] Updated weights for policy 0, policy_version 6599 (0.0012) [2023-02-25 21:14:58,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 27033600. Throughput: 0: 1238.8. Samples: 5757568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:14:58,388][00219] Avg episode reward: [(0, '33.353')] [2023-02-25 21:15:03,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.1, 300 sec: 4831.9). Total num frames: 27054080. Throughput: 0: 1212.4. Samples: 5760976. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:15:03,388][00219] Avg episode reward: [(0, '32.674')] [2023-02-25 21:15:07,231][32866] Updated weights for policy 0, policy_version 6609 (0.0023) [2023-02-25 21:15:08,377][00219] Fps is (10 sec: 4095.7, 60 sec: 4915.1, 300 sec: 4818.0). Total num frames: 27074560. Throughput: 0: 1192.2. Samples: 5766640. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:15:08,380][00219] Avg episode reward: [(0, '34.656')] [2023-02-25 21:15:13,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.4, 300 sec: 4790.2). Total num frames: 27099136. Throughput: 0: 1231.3. Samples: 5773840. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:15:13,384][00219] Avg episode reward: [(0, '34.510')] [2023-02-25 21:15:14,747][32866] Updated weights for policy 0, policy_version 6619 (0.0012) [2023-02-25 21:15:18,377][00219] Fps is (10 sec: 5734.9, 60 sec: 4846.9, 300 sec: 4818.0). Total num frames: 27131904. Throughput: 0: 1229.5. 
Samples: 5778304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:15:18,388][00219] Avg episode reward: [(0, '32.650')] [2023-02-25 21:15:22,576][32866] Updated weights for policy 0, policy_version 6629 (0.0012) [2023-02-25 21:15:23,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4846.9, 300 sec: 4831.9). Total num frames: 27152384. Throughput: 0: 1203.5. Samples: 5786128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:15:23,384][00219] Avg episode reward: [(0, '31.607')] [2023-02-25 21:15:28,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4846.9, 300 sec: 4804.1). Total num frames: 27168768. Throughput: 0: 1187.6. Samples: 5791712. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:15:28,381][00219] Avg episode reward: [(0, '30.980')] [2023-02-25 21:15:32,394][32866] Updated weights for policy 0, policy_version 6639 (0.0012) [2023-02-25 21:15:33,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4778.7, 300 sec: 4790.2). Total num frames: 27193344. Throughput: 0: 1189.3. Samples: 5794592. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:15:33,380][00219] Avg episode reward: [(0, '30.745')] [2023-02-25 21:15:38,377][00219] Fps is (10 sec: 5734.3, 60 sec: 4778.7, 300 sec: 4804.1). Total num frames: 27226112. Throughput: 0: 1215.3. Samples: 5803232. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:15:38,380][00219] Avg episode reward: [(0, '29.039')] [2023-02-25 21:15:39,542][32866] Updated weights for policy 0, policy_version 6649 (0.0012) [2023-02-25 21:15:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4915.2, 300 sec: 4831.9). Total num frames: 27250688. Throughput: 0: 1187.2. Samples: 5810992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:15:43,379][00219] Avg episode reward: [(0, '29.454')] [2023-02-25 21:15:43,394][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006653_27250688.pth... 
[2023-02-25 21:15:43,603][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006373_26103808.pth [2023-02-25 21:15:48,378][00219] Fps is (10 sec: 4095.3, 60 sec: 4846.8, 300 sec: 4818.0). Total num frames: 27267072. Throughput: 0: 1168.0. Samples: 5813536. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:15:48,381][00219] Avg episode reward: [(0, '28.484')] [2023-02-25 21:15:50,000][32866] Updated weights for policy 0, policy_version 6659 (0.0013) [2023-02-25 21:15:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 27287552. Throughput: 0: 1165.5. Samples: 5819088. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:15:53,378][00219] Avg episode reward: [(0, '28.891')] [2023-02-25 21:15:58,318][32866] Updated weights for policy 0, policy_version 6669 (0.0012) [2023-02-25 21:15:58,377][00219] Fps is (10 sec: 4916.1, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 27316224. Throughput: 0: 1197.9. Samples: 5827744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:15:58,387][00219] Avg episode reward: [(0, '29.435')] [2023-02-25 21:16:03,381][00219] Fps is (10 sec: 5322.6, 60 sec: 4778.3, 300 sec: 4790.2). Total num frames: 27340800. Throughput: 0: 1194.2. Samples: 5832048. Policy #0 lag: (min: 0.0, avg: 0.9, max: 3.0) [2023-02-25 21:16:03,383][00219] Avg episode reward: [(0, '31.582')] [2023-02-25 21:16:07,491][32866] Updated weights for policy 0, policy_version 6679 (0.0012) [2023-02-25 21:16:08,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4790.2). Total num frames: 27361280. Throughput: 0: 1153.8. Samples: 5838048. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:16:08,382][00219] Avg episode reward: [(0, '33.212')] [2023-02-25 21:16:13,377][00219] Fps is (10 sec: 3687.9, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 27377664. Throughput: 0: 1150.9. Samples: 5843504. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:16:13,379][00219] Avg episode reward: [(0, '33.436')] [2023-02-25 21:16:16,090][32866] Updated weights for policy 0, policy_version 6689 (0.0013) [2023-02-25 21:16:18,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4573.8, 300 sec: 4748.6). Total num frames: 27406336. Throughput: 0: 1182.6. Samples: 5847808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:16:18,379][00219] Avg episode reward: [(0, '34.712')] [2023-02-25 21:16:23,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4710.4, 300 sec: 4776.4). Total num frames: 27435008. Throughput: 0: 1183.6. Samples: 5856496. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:16:23,381][00219] Avg episode reward: [(0, '32.361')] [2023-02-25 21:16:23,813][32866] Updated weights for policy 0, policy_version 6699 (0.0012) [2023-02-25 21:16:28,379][00219] Fps is (10 sec: 4914.3, 60 sec: 4778.5, 300 sec: 4776.3). Total num frames: 27455488. Throughput: 0: 1147.0. Samples: 5862608. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:16:28,381][00219] Avg episode reward: [(0, '32.905')] [2023-02-25 21:16:33,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 27471872. Throughput: 0: 1153.5. Samples: 5865440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:16:33,382][00219] Avg episode reward: [(0, '30.845')] [2023-02-25 21:16:34,350][32866] Updated weights for policy 0, policy_version 6709 (0.0012) [2023-02-25 21:16:38,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4573.9, 300 sec: 4748.6). Total num frames: 27500544. Throughput: 0: 1183.6. Samples: 5872352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:16:38,378][00219] Avg episode reward: [(0, '30.608')] [2023-02-25 21:16:41,572][32866] Updated weights for policy 0, policy_version 6719 (0.0013) [2023-02-25 21:16:43,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4642.1, 300 sec: 4776.4). Total num frames: 27529216. Throughput: 0: 1181.2. 
Samples: 5880896. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:16:43,379][00219] Avg episode reward: [(0, '29.149')] [2023-02-25 21:16:48,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.5, 300 sec: 4776.4). Total num frames: 27549696. Throughput: 0: 1162.4. Samples: 5884352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:16:48,386][00219] Avg episode reward: [(0, '30.388')] [2023-02-25 21:16:51,192][32866] Updated weights for policy 0, policy_version 6729 (0.0012) [2023-02-25 21:16:53,383][00219] Fps is (10 sec: 4093.4, 60 sec: 4709.9, 300 sec: 4762.4). Total num frames: 27570176. Throughput: 0: 1151.8. Samples: 5889888. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:16:53,394][00219] Avg episode reward: [(0, '31.358')] [2023-02-25 21:16:58,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4748.6). Total num frames: 27594752. Throughput: 0: 1184.0. Samples: 5896784. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:16:58,379][00219] Avg episode reward: [(0, '31.207')] [2023-02-25 21:16:59,589][32866] Updated weights for policy 0, policy_version 6739 (0.0021) [2023-02-25 21:17:03,377][00219] Fps is (10 sec: 4508.4, 60 sec: 4574.2, 300 sec: 4720.8). Total num frames: 27615232. Throughput: 0: 1173.7. Samples: 5900624. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:17:03,385][00219] Avg episode reward: [(0, '31.867')] [2023-02-25 21:17:08,378][00219] Fps is (10 sec: 3276.4, 60 sec: 4437.2, 300 sec: 4706.9). Total num frames: 27627520. Throughput: 0: 1096.1. Samples: 5905824. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:17:08,386][00219] Avg episode reward: [(0, '32.720')] [2023-02-25 21:17:12,024][32866] Updated weights for policy 0, policy_version 6749 (0.0018) [2023-02-25 21:17:13,377][00219] Fps is (10 sec: 2867.2, 60 sec: 4437.3, 300 sec: 4679.2). Total num frames: 27643904. Throughput: 0: 1058.9. Samples: 5910256. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:17:13,388][00219] Avg episode reward: [(0, '33.625')] [2023-02-25 21:17:18,378][00219] Fps is (10 sec: 3276.9, 60 sec: 4232.5, 300 sec: 4623.6). Total num frames: 27660288. Throughput: 0: 1047.4. Samples: 5912576. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:17:18,384][00219] Avg episode reward: [(0, '35.704')] [2023-02-25 21:17:22,159][32866] Updated weights for policy 0, policy_version 6759 (0.0012) [2023-02-25 21:17:23,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4300.8, 300 sec: 4651.4). Total num frames: 27693056. Throughput: 0: 1048.2. Samples: 5919520. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:17:23,381][00219] Avg episode reward: [(0, '35.692')] [2023-02-25 21:17:28,377][00219] Fps is (10 sec: 5735.0, 60 sec: 4369.2, 300 sec: 4679.2). Total num frames: 27717632. Throughput: 0: 1048.2. Samples: 5928064. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:17:28,378][00219] Avg episode reward: [(0, '34.730')] [2023-02-25 21:17:29,648][32866] Updated weights for policy 0, policy_version 6769 (0.0013) [2023-02-25 21:17:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.3, 300 sec: 4665.3). Total num frames: 27738112. Throughput: 0: 1050.0. Samples: 5931600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:17:33,379][00219] Avg episode reward: [(0, '34.139')] [2023-02-25 21:17:38,378][00219] Fps is (10 sec: 3686.1, 60 sec: 4232.5, 300 sec: 4623.6). Total num frames: 27754496. Throughput: 0: 1051.9. Samples: 5937216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:17:38,380][00219] Avg episode reward: [(0, '34.749')] [2023-02-25 21:17:40,829][32866] Updated weights for policy 0, policy_version 6779 (0.0018) [2023-02-25 21:17:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4164.3, 300 sec: 4609.7). Total num frames: 27779072. Throughput: 0: 1047.1. Samples: 5943904. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 2.0) [2023-02-25 21:17:43,380][00219] Avg episode reward: [(0, '33.445')] [2023-02-25 21:17:43,449][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006783_27783168.pth... [2023-02-25 21:17:43,544][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006509_26660864.pth [2023-02-25 21:17:47,737][32866] Updated weights for policy 0, policy_version 6789 (0.0012) [2023-02-25 21:17:48,377][00219] Fps is (10 sec: 5735.0, 60 sec: 4369.1, 300 sec: 4651.4). Total num frames: 27811840. Throughput: 0: 1055.3. Samples: 5948112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:17:48,386][00219] Avg episode reward: [(0, '33.458')] [2023-02-25 21:17:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4369.5, 300 sec: 4651.4). Total num frames: 27832320. Throughput: 0: 1119.0. Samples: 5956176. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 21:17:53,385][00219] Avg episode reward: [(0, '35.510')] [2023-02-25 21:17:57,572][32866] Updated weights for policy 0, policy_version 6799 (0.0013) [2023-02-25 21:17:58,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4300.8, 300 sec: 4623.6). Total num frames: 27852800. Throughput: 0: 1145.6. Samples: 5961808. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:17:58,384][00219] Avg episode reward: [(0, '35.086')] [2023-02-25 21:18:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4300.8, 300 sec: 4595.9). Total num frames: 27873280. Throughput: 0: 1153.8. Samples: 5964496. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:18:03,380][00219] Avg episode reward: [(0, '36.467')] [2023-02-25 21:18:05,894][32866] Updated weights for policy 0, policy_version 6809 (0.0013) [2023-02-25 21:18:08,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4574.0, 300 sec: 4651.4). Total num frames: 27901952. Throughput: 0: 1185.8. Samples: 5972880. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:18:08,381][00219] Avg episode reward: [(0, '36.327')] [2023-02-25 21:18:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4679.2). Total num frames: 27926528. Throughput: 0: 1173.3. Samples: 5980864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:18:13,385][00219] Avg episode reward: [(0, '37.736')] [2023-02-25 21:18:13,649][32866] Updated weights for policy 0, policy_version 6819 (0.0016) [2023-02-25 21:18:18,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4778.7, 300 sec: 4693.0). Total num frames: 27947008. Throughput: 0: 1154.8. Samples: 5983568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:18:18,380][00219] Avg episode reward: [(0, '37.127')] [2023-02-25 21:18:23,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4505.6, 300 sec: 4651.4). Total num frames: 27963392. Throughput: 0: 1154.2. Samples: 5989152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:18:23,379][00219] Avg episode reward: [(0, '35.642')] [2023-02-25 21:18:23,767][32866] Updated weights for policy 0, policy_version 6829 (0.0012) [2023-02-25 21:18:28,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4573.9, 300 sec: 4651.4). Total num frames: 27992064. Throughput: 0: 1188.6. Samples: 5997392. Policy #0 lag: (min: 0.0, avg: 1.1, max: 3.0) [2023-02-25 21:18:28,382][00219] Avg episode reward: [(0, '36.193')] [2023-02-25 21:18:31,383][32866] Updated weights for policy 0, policy_version 6839 (0.0013) [2023-02-25 21:18:33,377][00219] Fps is (10 sec: 6143.9, 60 sec: 4778.7, 300 sec: 4679.2). Total num frames: 28024832. Throughput: 0: 1191.8. Samples: 6001744. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:18:33,379][00219] Avg episode reward: [(0, '35.166')] [2023-02-25 21:18:38,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4679.2). Total num frames: 28041216. Throughput: 0: 1153.4. Samples: 6008080. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:18:38,385][00219] Avg episode reward: [(0, '34.766')] [2023-02-25 21:18:41,741][32866] Updated weights for policy 0, policy_version 6849 (0.0013) [2023-02-25 21:18:43,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4642.1, 300 sec: 4651.4). Total num frames: 28057600. Throughput: 0: 1148.4. Samples: 6013488. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:18:43,378][00219] Avg episode reward: [(0, '34.711')] [2023-02-25 21:18:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4573.9, 300 sec: 4651.4). Total num frames: 28086272. Throughput: 0: 1178.7. Samples: 6017536. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:18:48,382][00219] Avg episode reward: [(0, '34.810')] [2023-02-25 21:18:49,032][32866] Updated weights for policy 0, policy_version 6859 (0.0012) [2023-02-25 21:18:53,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4710.4, 300 sec: 4665.3). Total num frames: 28114944. Throughput: 0: 1185.4. Samples: 6026224. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:18:53,381][00219] Avg episode reward: [(0, '34.773')] [2023-02-25 21:18:58,384][00219] Fps is (10 sec: 4502.5, 60 sec: 4641.6, 300 sec: 4637.4). Total num frames: 28131328. Throughput: 0: 1151.1. Samples: 6032672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:18:58,386][00219] Avg episode reward: [(0, '34.565')] [2023-02-25 21:18:58,542][32866] Updated weights for policy 0, policy_version 6869 (0.0012) [2023-02-25 21:19:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4665.3). Total num frames: 28155904. Throughput: 0: 1154.1. Samples: 6035504. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:19:03,384][00219] Avg episode reward: [(0, '35.487')] [2023-02-25 21:19:08,194][32866] Updated weights for policy 0, policy_version 6880 (0.0012) [2023-02-25 21:19:08,377][00219] Fps is (10 sec: 4918.7, 60 sec: 4642.1, 300 sec: 4623.6). Total num frames: 28180480. Throughput: 0: 1178.0. 
Samples: 6042160. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:19:08,384][00219] Avg episode reward: [(0, '35.225')] [2023-02-25 21:19:13,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4637.5). Total num frames: 28209152. Throughput: 0: 1187.6. Samples: 6050832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:19:13,378][00219] Avg episode reward: [(0, '34.696')] [2023-02-25 21:19:16,200][32866] Updated weights for policy 0, policy_version 6890 (0.0012) [2023-02-25 21:19:18,379][00219] Fps is (10 sec: 4914.1, 60 sec: 4710.2, 300 sec: 4637.5). Total num frames: 28229632. Throughput: 0: 1169.0. Samples: 6054352. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:19:18,382][00219] Avg episode reward: [(0, '36.015')] [2023-02-25 21:19:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4651.4). Total num frames: 28250112. Throughput: 0: 1149.9. Samples: 6059824. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:19:23,379][00219] Avg episode reward: [(0, '36.754')] [2023-02-25 21:19:26,621][32866] Updated weights for policy 0, policy_version 6900 (0.0013) [2023-02-25 21:19:28,380][00219] Fps is (10 sec: 4505.2, 60 sec: 4710.2, 300 sec: 4637.5). Total num frames: 28274688. Throughput: 0: 1182.1. Samples: 6066688. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:19:28,388][00219] Avg episode reward: [(0, '36.351')] [2023-02-25 21:19:33,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4573.9, 300 sec: 4609.7). Total num frames: 28299264. Throughput: 0: 1187.9. Samples: 6070992. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:19:33,382][00219] Avg episode reward: [(0, '35.018')] [2023-02-25 21:19:33,491][32866] Updated weights for policy 0, policy_version 6910 (0.0012) [2023-02-25 21:19:38,377][00219] Fps is (10 sec: 4916.7, 60 sec: 4710.4, 300 sec: 4637.5). Total num frames: 28323840. Throughput: 0: 1168.7. Samples: 6078816. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:19:38,383][00219] Avg episode reward: [(0, '36.331')] [2023-02-25 21:19:43,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4623.6). Total num frames: 28340224. Throughput: 0: 1146.5. Samples: 6084256. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:19:43,381][00219] Avg episode reward: [(0, '36.413')] [2023-02-25 21:19:43,396][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006919_28340224.pth... [2023-02-25 21:19:43,577][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006653_27250688.pth [2023-02-25 21:19:44,061][32866] Updated weights for policy 0, policy_version 6920 (0.0012) [2023-02-25 21:19:48,379][00219] Fps is (10 sec: 4095.0, 60 sec: 4642.0, 300 sec: 4609.7). Total num frames: 28364800. Throughput: 0: 1143.4. Samples: 6086960. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:19:48,382][00219] Avg episode reward: [(0, '35.399')] [2023-02-25 21:19:51,696][32866] Updated weights for policy 0, policy_version 6930 (0.0013) [2023-02-25 21:19:53,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4609.7). Total num frames: 28393472. Throughput: 0: 1179.0. Samples: 6095216. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:19:53,379][00219] Avg episode reward: [(0, '35.130')] [2023-02-25 21:19:58,378][00219] Fps is (10 sec: 5325.4, 60 sec: 4779.1, 300 sec: 4623.6). Total num frames: 28418048. Throughput: 0: 1163.0. Samples: 6103168. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:19:58,383][00219] Avg episode reward: [(0, '35.223')] [2023-02-25 21:20:00,054][32866] Updated weights for policy 0, policy_version 6940 (0.0013) [2023-02-25 21:20:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4609.7). Total num frames: 28434432. Throughput: 0: 1143.2. Samples: 6105792. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:20:03,381][00219] Avg episode reward: [(0, '35.826')] [2023-02-25 21:20:08,377][00219] Fps is (10 sec: 3686.8, 60 sec: 4573.9, 300 sec: 4595.9). Total num frames: 28454912. Throughput: 0: 1143.1. Samples: 6111264. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:20:08,389][00219] Avg episode reward: [(0, '35.435')] [2023-02-25 21:20:09,878][32866] Updated weights for policy 0, policy_version 6950 (0.0012) [2023-02-25 21:20:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4573.9, 300 sec: 4582.0). Total num frames: 28483584. Throughput: 0: 1175.9. Samples: 6119600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:20:13,379][00219] Avg episode reward: [(0, '34.760')] [2023-02-25 21:20:16,984][32866] Updated weights for policy 0, policy_version 6960 (0.0012) [2023-02-25 21:20:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.3, 300 sec: 4595.9). Total num frames: 28508160. Throughput: 0: 1176.9. Samples: 6123952. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:20:18,378][00219] Avg episode reward: [(0, '35.854')] [2023-02-25 21:20:23,383][00219] Fps is (10 sec: 4912.2, 60 sec: 4709.9, 300 sec: 4623.5). Total num frames: 28532736. Throughput: 0: 1143.3. Samples: 6130272. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:20:23,385][00219] Avg episode reward: [(0, '36.706')] [2023-02-25 21:20:27,971][32866] Updated weights for policy 0, policy_version 6970 (0.0022) [2023-02-25 21:20:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4574.1, 300 sec: 4595.8). Total num frames: 28549120. Throughput: 0: 1146.0. Samples: 6135824. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:20:28,380][00219] Avg episode reward: [(0, '36.161')] [2023-02-25 21:20:33,377][00219] Fps is (10 sec: 4098.5, 60 sec: 4573.9, 300 sec: 4568.1). Total num frames: 28573696. Throughput: 0: 1170.2. Samples: 6139616. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:20:33,379][00219] Avg episode reward: [(0, '36.811')] [2023-02-25 21:20:35,782][32866] Updated weights for policy 0, policy_version 6980 (0.0013) [2023-02-25 21:20:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4582.0). Total num frames: 28602368. Throughput: 0: 1178.0. Samples: 6148224. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:20:38,379][00219] Avg episode reward: [(0, '37.937')] [2023-02-25 21:20:38,383][32858] Saving new best policy, reward=37.937! [2023-02-25 21:20:43,378][00219] Fps is (10 sec: 4505.1, 60 sec: 4642.1, 300 sec: 4582.0). Total num frames: 28618752. Throughput: 0: 1142.0. Samples: 6154560. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:20:43,386][00219] Avg episode reward: [(0, '38.732')] [2023-02-25 21:20:43,407][32858] Saving new best policy, reward=38.732! [2023-02-25 21:20:45,746][32866] Updated weights for policy 0, policy_version 6990 (0.0012) [2023-02-25 21:20:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4574.0, 300 sec: 4582.0). Total num frames: 28639232. Throughput: 0: 1142.8. Samples: 6157216. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2023-02-25 21:20:48,385][00219] Avg episode reward: [(0, '38.674')] [2023-02-25 21:20:53,378][00219] Fps is (10 sec: 4095.8, 60 sec: 4437.2, 300 sec: 4554.2). Total num frames: 28659712. Throughput: 0: 1150.5. Samples: 6163040. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:20:53,384][00219] Avg episode reward: [(0, '37.654')] [2023-02-25 21:20:56,041][32866] Updated weights for policy 0, policy_version 7000 (0.0020) [2023-02-25 21:20:58,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4369.2, 300 sec: 4540.4). Total num frames: 28680192. Throughput: 0: 1089.1. Samples: 6168608. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:20:58,383][00219] Avg episode reward: [(0, '35.908')] [2023-02-25 21:21:03,381][00219] Fps is (10 sec: 3685.4, 60 sec: 4368.8, 300 sec: 4526.4). Total num frames: 28696576. Throughput: 0: 1050.2. Samples: 6171216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:03,383][00219] Avg episode reward: [(0, '35.153')] [2023-02-25 21:21:07,436][32866] Updated weights for policy 0, policy_version 7010 (0.0012) [2023-02-25 21:21:08,377][00219] Fps is (10 sec: 3276.8, 60 sec: 4300.8, 300 sec: 4526.4). Total num frames: 28712960. Throughput: 0: 1025.6. Samples: 6176416. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:21:08,379][00219] Avg episode reward: [(0, '33.591')] [2023-02-25 21:21:13,378][00219] Fps is (10 sec: 3687.6, 60 sec: 4164.2, 300 sec: 4498.6). Total num frames: 28733440. Throughput: 0: 1039.3. Samples: 6182592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:21:13,380][00219] Avg episode reward: [(0, '31.996')] [2023-02-25 21:21:16,339][32866] Updated weights for policy 0, policy_version 7020 (0.0012) [2023-02-25 21:21:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4300.8, 300 sec: 4512.5). Total num frames: 28766208. Throughput: 0: 1052.1. Samples: 6186960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:18,378][00219] Avg episode reward: [(0, '31.450')] [2023-02-25 21:21:21,313][32858] Signal inference workers to stop experience collection... (150 times) [2023-02-25 21:21:21,353][32866] InferenceWorker_p0-w0: stopping experience collection (150 times) [2023-02-25 21:21:21,367][32858] Signal inference workers to resume experience collection... (150 times) [2023-02-25 21:21:21,400][32866] InferenceWorker_p0-w0: resuming experience collection (150 times) [2023-02-25 21:21:23,377][00219] Fps is (10 sec: 5734.9, 60 sec: 4301.2, 300 sec: 4526.5). Total num frames: 28790784. Throughput: 0: 1050.7. Samples: 6195504. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:23,383][00219] Avg episode reward: [(0, '31.319')] [2023-02-25 21:21:24,446][32866] Updated weights for policy 0, policy_version 7030 (0.0013) [2023-02-25 21:21:28,383][00219] Fps is (10 sec: 4093.5, 60 sec: 4300.4, 300 sec: 4526.3). Total num frames: 28807168. Throughput: 0: 1032.1. Samples: 6201008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:21:28,385][00219] Avg episode reward: [(0, '31.876')] [2023-02-25 21:21:33,377][00219] Fps is (10 sec: 4096.1, 60 sec: 4300.8, 300 sec: 4512.5). Total num frames: 28831744. Throughput: 0: 1038.2. Samples: 6203936. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:21:33,383][00219] Avg episode reward: [(0, '32.386')] [2023-02-25 21:21:34,395][32866] Updated weights for policy 0, policy_version 7040 (0.0012) [2023-02-25 21:21:38,377][00219] Fps is (10 sec: 5328.0, 60 sec: 4300.8, 300 sec: 4512.5). Total num frames: 28860416. Throughput: 0: 1085.5. Samples: 6211888. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:21:38,379][00219] Avg episode reward: [(0, '33.450')] [2023-02-25 21:21:41,253][32866] Updated weights for policy 0, policy_version 7050 (0.0012) [2023-02-25 21:21:43,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4437.4, 300 sec: 4526.4). Total num frames: 28884992. Throughput: 0: 1149.5. Samples: 6220336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:43,383][00219] Avg episode reward: [(0, '35.063')] [2023-02-25 21:21:43,397][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007052_28884992.pth... [2023-02-25 21:21:43,574][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006783_27783168.pth [2023-02-25 21:21:48,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4437.3, 300 sec: 4526.5). Total num frames: 28905472. Throughput: 0: 1150.3. Samples: 6222976. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:21:48,388][00219] Avg episode reward: [(0, '33.726')] [2023-02-25 21:21:51,472][32866] Updated weights for policy 0, policy_version 7060 (0.0013) [2023-02-25 21:21:53,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.2, 300 sec: 4498.7). Total num frames: 28921856. Throughput: 0: 1160.2. Samples: 6228624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:53,378][00219] Avg episode reward: [(0, '34.226')] [2023-02-25 21:21:58,377][00219] Fps is (10 sec: 4915.1, 60 sec: 4573.9, 300 sec: 4540.3). Total num frames: 28954624. Throughput: 0: 1201.8. Samples: 6236672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:21:58,381][00219] Avg episode reward: [(0, '33.159')] [2023-02-25 21:21:59,479][32866] Updated weights for policy 0, policy_version 7070 (0.0012) [2023-02-25 21:22:03,376][00219] Fps is (10 sec: 6144.1, 60 sec: 4779.0, 300 sec: 4595.9). Total num frames: 28983296. Throughput: 0: 1201.8. Samples: 6241040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:22:03,381][00219] Avg episode reward: [(0, '34.398')] [2023-02-25 21:22:08,267][32866] Updated weights for policy 0, policy_version 7080 (0.0012) [2023-02-25 21:22:08,379][00219] Fps is (10 sec: 4504.7, 60 sec: 4778.5, 300 sec: 4595.8). Total num frames: 28999680. Throughput: 0: 1164.0. Samples: 6247888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:22:08,381][00219] Avg episode reward: [(0, '33.451')] [2023-02-25 21:22:13,377][00219] Fps is (10 sec: 3686.3, 60 sec: 4778.7, 300 sec: 4609.7). Total num frames: 29020160. Throughput: 0: 1166.0. Samples: 6253472. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:22:13,381][00219] Avg episode reward: [(0, '33.201')] [2023-02-25 21:22:17,055][32866] Updated weights for policy 0, policy_version 7090 (0.0020) [2023-02-25 21:22:18,377][00219] Fps is (10 sec: 4916.2, 60 sec: 4710.4, 300 sec: 4595.9). Total num frames: 29048832. Throughput: 0: 1180.4. 
Samples: 6257056. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:22:18,381][00219] Avg episode reward: [(0, '33.709')] [2023-02-25 21:22:23,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4710.4, 300 sec: 4595.9). Total num frames: 29073408. Throughput: 0: 1202.8. Samples: 6266016. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:22:23,378][00219] Avg episode reward: [(0, '34.229')] [2023-02-25 21:22:23,986][32866] Updated weights for policy 0, policy_version 7100 (0.0012) [2023-02-25 21:22:28,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4847.4, 300 sec: 4609.7). Total num frames: 29097984. Throughput: 0: 1164.4. Samples: 6272736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:22:28,381][00219] Avg episode reward: [(0, '34.034')] [2023-02-25 21:22:33,379][00219] Fps is (10 sec: 4094.9, 60 sec: 4710.2, 300 sec: 4609.7). Total num frames: 29114368. Throughput: 0: 1166.9. Samples: 6275488. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:22:33,388][00219] Avg episode reward: [(0, '33.887')] [2023-02-25 21:22:35,312][32866] Updated weights for policy 0, policy_version 7110 (0.0012) [2023-02-25 21:22:38,378][00219] Fps is (10 sec: 4095.2, 60 sec: 4642.0, 300 sec: 4609.7). Total num frames: 29138944. Throughput: 0: 1183.2. Samples: 6281872. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:22:38,381][00219] Avg episode reward: [(0, '35.889')] [2023-02-25 21:22:42,488][32866] Updated weights for policy 0, policy_version 7120 (0.0012) [2023-02-25 21:22:43,377][00219] Fps is (10 sec: 5326.2, 60 sec: 4710.4, 300 sec: 4595.9). Total num frames: 29167616. Throughput: 0: 1200.0. Samples: 6290672. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:22:43,384][00219] Avg episode reward: [(0, '35.652')] [2023-02-25 21:22:48,377][00219] Fps is (10 sec: 5325.5, 60 sec: 4778.6, 300 sec: 4609.7). Total num frames: 29192192. Throughput: 0: 1190.7. Samples: 6294624. 
Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:22:48,380][00219] Avg episode reward: [(0, '35.086')] [2023-02-25 21:22:51,615][32866] Updated weights for policy 0, policy_version 7130 (0.0012) [2023-02-25 21:22:53,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4778.7, 300 sec: 4595.8). Total num frames: 29208576. Throughput: 0: 1163.4. Samples: 6300240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:22:53,380][00219] Avg episode reward: [(0, '34.219')] [2023-02-25 21:22:58,381][00219] Fps is (10 sec: 4094.5, 60 sec: 4641.8, 300 sec: 4609.7). Total num frames: 29233152. Throughput: 0: 1186.7. Samples: 6306880. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:22:58,388][00219] Avg episode reward: [(0, '34.489')] [2023-02-25 21:22:59,870][32866] Updated weights for policy 0, policy_version 7140 (0.0014) [2023-02-25 21:23:03,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4642.1, 300 sec: 4609.7). Total num frames: 29261824. Throughput: 0: 1204.6. Samples: 6311264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:23:03,388][00219] Avg episode reward: [(0, '33.033')] [2023-02-25 21:23:07,587][32866] Updated weights for policy 0, policy_version 7150 (0.0012) [2023-02-25 21:23:08,377][00219] Fps is (10 sec: 5327.0, 60 sec: 4778.8, 300 sec: 4609.7). Total num frames: 29286400. Throughput: 0: 1194.0. Samples: 6319744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:23:08,378][00219] Avg episode reward: [(0, '32.688')] [2023-02-25 21:23:13,379][00219] Fps is (10 sec: 4504.6, 60 sec: 4778.5, 300 sec: 4609.7). Total num frames: 29306880. Throughput: 0: 1170.4. Samples: 6325408. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:23:13,389][00219] Avg episode reward: [(0, '33.129')] [2023-02-25 21:23:17,906][32866] Updated weights for policy 0, policy_version 7160 (0.0012) [2023-02-25 21:23:18,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4623.6). Total num frames: 29327360. Throughput: 0: 1171.3. 
Samples: 6328192. Policy #0 lag: (min: 0.0, avg: 0.9, max: 2.0) [2023-02-25 21:23:18,385][00219] Avg episode reward: [(0, '32.700')] [2023-02-25 21:23:23,378][00219] Fps is (10 sec: 4915.4, 60 sec: 4710.3, 300 sec: 4623.6). Total num frames: 29356032. Throughput: 0: 1209.6. Samples: 6336304. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:23:23,381][00219] Avg episode reward: [(0, '34.626')] [2023-02-25 21:23:25,237][32866] Updated weights for policy 0, policy_version 7170 (0.0012) [2023-02-25 21:23:28,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4595.9). Total num frames: 29380608. Throughput: 0: 1196.4. Samples: 6344512. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:23:28,378][00219] Avg episode reward: [(0, '34.988')] [2023-02-25 21:23:33,377][00219] Fps is (10 sec: 4506.4, 60 sec: 4778.9, 300 sec: 4609.7). Total num frames: 29401088. Throughput: 0: 1173.3. Samples: 6347424. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:23:33,380][00219] Avg episode reward: [(0, '35.872')] [2023-02-25 21:23:34,712][32866] Updated weights for policy 0, policy_version 7180 (0.0012) [2023-02-25 21:23:38,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.5, 300 sec: 4623.6). Total num frames: 29421568. Throughput: 0: 1170.8. Samples: 6352928. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:23:38,379][00219] Avg episode reward: [(0, '37.234')] [2023-02-25 21:23:42,933][32866] Updated weights for policy 0, policy_version 7190 (0.0013) [2023-02-25 21:23:43,379][00219] Fps is (10 sec: 4914.3, 60 sec: 4710.2, 300 sec: 4623.6). Total num frames: 29450240. Throughput: 0: 1206.1. Samples: 6361152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:23:43,381][00219] Avg episode reward: [(0, '36.070')] [2023-02-25 21:23:43,404][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007190_29450240.pth... 
[2023-02-25 21:23:43,502][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000006919_28340224.pth [2023-02-25 21:23:48,386][00219] Fps is (10 sec: 5320.0, 60 sec: 4709.7, 300 sec: 4609.6). Total num frames: 29474816. Throughput: 0: 1201.5. Samples: 6365344. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:23:48,393][00219] Avg episode reward: [(0, '35.981')] [2023-02-25 21:23:51,233][32866] Updated weights for policy 0, policy_version 7200 (0.0013) [2023-02-25 21:23:53,377][00219] Fps is (10 sec: 4506.5, 60 sec: 4778.7, 300 sec: 4623.7). Total num frames: 29495296. Throughput: 0: 1160.2. Samples: 6371952. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:23:53,388][00219] Avg episode reward: [(0, '35.377')] [2023-02-25 21:23:58,377][00219] Fps is (10 sec: 4099.7, 60 sec: 4710.7, 300 sec: 4609.7). Total num frames: 29515776. Throughput: 0: 1158.1. Samples: 6377520. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:23:58,379][00219] Avg episode reward: [(0, '35.145')] [2023-02-25 21:24:01,310][32866] Updated weights for policy 0, policy_version 7210 (0.0013) [2023-02-25 21:24:03,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4642.1, 300 sec: 4609.7). Total num frames: 29540352. Throughput: 0: 1181.5. Samples: 6381360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:24:03,379][00219] Avg episode reward: [(0, '33.203')] [2023-02-25 21:24:08,287][32866] Updated weights for policy 0, policy_version 7220 (0.0012) [2023-02-25 21:24:08,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4778.7, 300 sec: 4623.6). Total num frames: 29573120. Throughput: 0: 1196.8. Samples: 6390160. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:24:08,382][00219] Avg episode reward: [(0, '32.547')] [2023-02-25 21:24:13,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.6, 300 sec: 4609.8). Total num frames: 29589504. Throughput: 0: 1160.5. Samples: 6396736. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:24:13,383][00219] Avg episode reward: [(0, '34.366')] [2023-02-25 21:24:18,377][00219] Fps is (10 sec: 3276.7, 60 sec: 4642.1, 300 sec: 4595.8). Total num frames: 29605888. Throughput: 0: 1157.7. Samples: 6399520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:24:18,387][00219] Avg episode reward: [(0, '34.918')] [2023-02-25 21:24:18,522][32866] Updated weights for policy 0, policy_version 7230 (0.0017) [2023-02-25 21:24:23,377][00219] Fps is (10 sec: 4505.5, 60 sec: 4642.3, 300 sec: 4609.8). Total num frames: 29634560. Throughput: 0: 1185.4. Samples: 6406272. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:24:23,379][00219] Avg episode reward: [(0, '35.015')] [2023-02-25 21:24:26,342][32866] Updated weights for policy 0, policy_version 7240 (0.0013) [2023-02-25 21:24:28,377][00219] Fps is (10 sec: 6144.1, 60 sec: 4778.7, 300 sec: 4637.5). Total num frames: 29667328. Throughput: 0: 1198.3. Samples: 6415072. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:24:28,379][00219] Avg episode reward: [(0, '35.801')] [2023-02-25 21:24:33,377][00219] Fps is (10 sec: 4915.3, 60 sec: 4710.4, 300 sec: 4609.7). Total num frames: 29683712. Throughput: 0: 1190.3. Samples: 6418896. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:24:33,379][00219] Avg episode reward: [(0, '36.256')] [2023-02-25 21:24:35,645][32866] Updated weights for policy 0, policy_version 7250 (0.0012) [2023-02-25 21:24:38,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4710.4, 300 sec: 4623.6). Total num frames: 29704192. Throughput: 0: 1163.0. Samples: 6424288. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:24:38,385][00219] Avg episode reward: [(0, '35.759')] [2023-02-25 21:24:43,379][00219] Fps is (10 sec: 3276.1, 60 sec: 4437.3, 300 sec: 4582.0). Total num frames: 29716480. Throughput: 0: 1139.9. Samples: 6428816. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:24:43,388][00219] Avg episode reward: [(0, '34.246')] [2023-02-25 21:24:47,986][32866] Updated weights for policy 0, policy_version 7260 (0.0014) [2023-02-25 21:24:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4438.0, 300 sec: 4568.1). Total num frames: 29741056. Throughput: 0: 1115.0. Samples: 6431536. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:24:48,380][00219] Avg episode reward: [(0, '33.482')] [2023-02-25 21:24:53,377][00219] Fps is (10 sec: 4916.1, 60 sec: 4505.6, 300 sec: 4568.1). Total num frames: 29765632. Throughput: 0: 1074.5. Samples: 6438512. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-25 21:24:53,381][00219] Avg episode reward: [(0, '34.020')] [2023-02-25 21:24:56,714][32866] Updated weights for policy 0, policy_version 7270 (0.0014) [2023-02-25 21:24:58,378][00219] Fps is (10 sec: 4095.5, 60 sec: 4437.2, 300 sec: 4568.1). Total num frames: 29782016. Throughput: 0: 1068.4. Samples: 6444816. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:24:58,384][00219] Avg episode reward: [(0, '33.652')] [2023-02-25 21:25:03,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4369.0, 300 sec: 4568.1). Total num frames: 29802496. Throughput: 0: 1066.7. Samples: 6447520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:25:03,379][00219] Avg episode reward: [(0, '33.689')] [2023-02-25 21:25:06,484][32866] Updated weights for policy 0, policy_version 7280 (0.0012) [2023-02-25 21:25:08,377][00219] Fps is (10 sec: 4915.8, 60 sec: 4300.8, 300 sec: 4568.1). Total num frames: 29831168. Throughput: 0: 1072.4. Samples: 6454528. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:25:08,379][00219] Avg episode reward: [(0, '33.725')] [2023-02-25 21:25:13,377][00219] Fps is (10 sec: 5325.0, 60 sec: 4437.3, 300 sec: 4568.1). Total num frames: 29855744. Throughput: 0: 1069.2. Samples: 6463184. 
Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:25:13,380][00219] Avg episode reward: [(0, '34.137')] [2023-02-25 21:25:13,458][32866] Updated weights for policy 0, policy_version 7290 (0.0012) [2023-02-25 21:25:18,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4573.9, 300 sec: 4568.2). Total num frames: 29880320. Throughput: 0: 1061.0. Samples: 6466640. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:25:18,379][00219] Avg episode reward: [(0, '33.688')] [2023-02-25 21:25:23,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4369.1, 300 sec: 4568.1). Total num frames: 29896704. Throughput: 0: 1064.2. Samples: 6472176. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:25:23,384][00219] Avg episode reward: [(0, '33.044')] [2023-02-25 21:25:23,864][32866] Updated weights for policy 0, policy_version 7300 (0.0028) [2023-02-25 21:25:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4232.5, 300 sec: 4568.1). Total num frames: 29921280. Throughput: 0: 1118.6. Samples: 6479152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:25:28,385][00219] Avg episode reward: [(0, '34.508')] [2023-02-25 21:25:31,538][32866] Updated weights for policy 0, policy_version 7310 (0.0012) [2023-02-25 21:25:33,377][00219] Fps is (10 sec: 5734.4, 60 sec: 4505.6, 300 sec: 4582.0). Total num frames: 29954048. Throughput: 0: 1155.6. Samples: 6483536. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:25:33,379][00219] Avg episode reward: [(0, '35.151')] [2023-02-25 21:25:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4505.6, 300 sec: 4595.9). Total num frames: 29974528. Throughput: 0: 1176.9. Samples: 6491472. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:25:38,381][00219] Avg episode reward: [(0, '34.607')] [2023-02-25 21:25:40,015][32866] Updated weights for policy 0, policy_version 7320 (0.0012) [2023-02-25 21:25:43,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4574.0, 300 sec: 4582.0). Total num frames: 29990912. Throughput: 0: 1159.5. 
Samples: 6496992. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:25:43,379][00219] Avg episode reward: [(0, '36.442')] [2023-02-25 21:25:43,397][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007322_29990912.pth... [2023-02-25 21:25:43,575][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007052_28884992.pth [2023-02-25 21:25:48,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4573.9, 300 sec: 4595.9). Total num frames: 30015488. Throughput: 0: 1158.4. Samples: 6499648. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:25:48,379][00219] Avg episode reward: [(0, '37.060')] [2023-02-25 21:25:49,331][32866] Updated weights for policy 0, policy_version 7330 (0.0041) [2023-02-25 21:25:53,377][00219] Fps is (10 sec: 5324.7, 60 sec: 4642.1, 300 sec: 4623.6). Total num frames: 30044160. Throughput: 0: 1193.6. Samples: 6508240. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:25:53,379][00219] Avg episode reward: [(0, '35.752')] [2023-02-25 21:25:56,606][32866] Updated weights for policy 0, policy_version 7340 (0.0012) [2023-02-25 21:25:58,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.8, 300 sec: 4651.5). Total num frames: 30068736. Throughput: 0: 1178.7. Samples: 6516224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:25:58,384][00219] Avg episode reward: [(0, '34.279')] [2023-02-25 21:26:03,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30085120. Throughput: 0: 1163.4. Samples: 6518992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:26:03,379][00219] Avg episode reward: [(0, '34.478')] [2023-02-25 21:26:07,437][32866] Updated weights for policy 0, policy_version 7350 (0.0013) [2023-02-25 21:26:08,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4642.1, 300 sec: 4665.3). Total num frames: 30109696. Throughput: 0: 1160.9. Samples: 6524416. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:26:08,384][00219] Avg episode reward: [(0, '35.335')] [2023-02-25 21:26:13,377][00219] Fps is (10 sec: 5324.9, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30138368. Throughput: 0: 1202.5. Samples: 6533264. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:26:13,379][00219] Avg episode reward: [(0, '35.958')] [2023-02-25 21:26:14,422][32866] Updated weights for policy 0, policy_version 7360 (0.0018) [2023-02-25 21:26:18,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30162944. Throughput: 0: 1203.6. Samples: 6537696. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-25 21:26:18,381][00219] Avg episode reward: [(0, '36.554')] [2023-02-25 21:26:23,377][00219] Fps is (10 sec: 4505.3, 60 sec: 4778.6, 300 sec: 4665.4). Total num frames: 30183424. Throughput: 0: 1160.9. Samples: 6543712. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:26:23,382][00219] Avg episode reward: [(0, '36.777')] [2023-02-25 21:26:24,355][32866] Updated weights for policy 0, policy_version 7370 (0.0014) [2023-02-25 21:26:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30203904. Throughput: 0: 1161.6. Samples: 6549264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:26:28,381][00219] Avg episode reward: [(0, '36.357')] [2023-02-25 21:26:32,830][32866] Updated weights for policy 0, policy_version 7380 (0.0014) [2023-02-25 21:26:33,379][00219] Fps is (10 sec: 4914.2, 60 sec: 4641.9, 300 sec: 4651.3). Total num frames: 30232576. Throughput: 0: 1200.6. Samples: 6553680. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:26:33,382][00219] Avg episode reward: [(0, '38.076')] [2023-02-25 21:26:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30257152. Throughput: 0: 1203.9. Samples: 6562416. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:26:38,383][00219] Avg episode reward: [(0, '37.276')] [2023-02-25 21:26:41,057][32866] Updated weights for policy 0, policy_version 7390 (0.0012) [2023-02-25 21:26:43,377][00219] Fps is (10 sec: 4506.8, 60 sec: 4778.7, 300 sec: 4651.4). Total num frames: 30277632. Throughput: 0: 1158.8. Samples: 6568368. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:26:43,380][00219] Avg episode reward: [(0, '36.685')] [2023-02-25 21:26:48,377][00219] Fps is (10 sec: 3686.4, 60 sec: 4642.1, 300 sec: 4651.4). Total num frames: 30294016. Throughput: 0: 1158.8. Samples: 6571136. Policy #0 lag: (min: 0.0, avg: 0.1, max: 2.0) [2023-02-25 21:26:48,379][00219] Avg episode reward: [(0, '36.116')] [2023-02-25 21:26:50,243][32866] Updated weights for policy 0, policy_version 7400 (0.0013) [2023-02-25 21:26:53,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30326784. Throughput: 0: 1200.7. Samples: 6578448. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2023-02-25 21:26:53,379][00219] Avg episode reward: [(0, '36.522')] [2023-02-25 21:26:57,352][32866] Updated weights for policy 0, policy_version 7410 (0.0012) [2023-02-25 21:26:58,378][00219] Fps is (10 sec: 5733.6, 60 sec: 4710.3, 300 sec: 4637.5). Total num frames: 30351360. Throughput: 0: 1199.6. Samples: 6587248. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-25 21:26:58,382][00219] Avg episode reward: [(0, '36.512')] [2023-02-25 21:27:03,384][00219] Fps is (10 sec: 4502.4, 60 sec: 4778.1, 300 sec: 4651.3). Total num frames: 30371840. Throughput: 0: 1172.1. Samples: 6590448. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:27:03,386][00219] Avg episode reward: [(0, '33.683')] [2023-02-25 21:27:07,579][32866] Updated weights for policy 0, policy_version 7420 (0.0012) [2023-02-25 21:27:08,379][00219] Fps is (10 sec: 4095.7, 60 sec: 4710.2, 300 sec: 4651.4). Total num frames: 30392320. Throughput: 0: 1162.6. 
Samples: 6596032. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:27:08,381][00219] Avg episode reward: [(0, '33.894')] [2023-02-25 21:27:13,377][00219] Fps is (10 sec: 4508.8, 60 sec: 4642.1, 300 sec: 4637.5). Total num frames: 30416896. Throughput: 0: 1200.7. Samples: 6603296. Policy #0 lag: (min: 0.0, avg: 0.2, max: 2.0) [2023-02-25 21:27:13,378][00219] Avg episode reward: [(0, '34.852')] [2023-02-25 21:27:15,766][32866] Updated weights for policy 0, policy_version 7430 (0.0012) [2023-02-25 21:27:18,377][00219] Fps is (10 sec: 5325.9, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30445568. Throughput: 0: 1201.8. Samples: 6607760. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0) [2023-02-25 21:27:18,385][00219] Avg episode reward: [(0, '32.261')] [2023-02-25 21:27:23,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4778.7, 300 sec: 4651.4). Total num frames: 30470144. Throughput: 0: 1170.8. Samples: 6615104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:27:23,379][00219] Avg episode reward: [(0, '33.320')] [2023-02-25 21:27:24,641][32866] Updated weights for policy 0, policy_version 7440 (0.0020) [2023-02-25 21:27:28,377][00219] Fps is (10 sec: 4096.0, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30486528. Throughput: 0: 1159.1. Samples: 6620528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-25 21:27:28,382][00219] Avg episode reward: [(0, '31.169')] [2023-02-25 21:27:33,377][00219] Fps is (10 sec: 4505.6, 60 sec: 4710.6, 300 sec: 4665.3). Total num frames: 30515200. Throughput: 0: 1165.5. Samples: 6623584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-25 21:27:33,379][32866] Updated weights for policy 0, policy_version 7450 (0.0012) [2023-02-25 21:27:33,382][00219] Avg episode reward: [(0, '32.691')] [2023-02-25 21:27:38,377][00219] Fps is (10 sec: 5324.8, 60 sec: 4710.4, 300 sec: 4651.4). Total num frames: 30539776. Throughput: 0: 1198.2. Samples: 6632368. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:27:38,379][00219] Avg episode reward: [(0, '31.736')] [2023-02-25 21:27:40,930][32866] Updated weights for policy 0, policy_version 7460 (0.0015) [2023-02-25 21:27:43,377][00219] Fps is (10 sec: 4915.2, 60 sec: 4778.7, 300 sec: 4651.4). Total num frames: 30564352. Throughput: 0: 1163.1. Samples: 6639584. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-25 21:27:43,383][00219] Avg episode reward: [(0, '32.739')] [2023-02-25 21:27:43,394][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007462_30564352.pth... [2023-02-25 21:27:43,646][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007190_29450240.pth [2023-02-25 21:27:48,167][00219] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 219], exiting... [2023-02-25 21:27:48,174][32858] Stopping Batcher_0... [2023-02-25 21:27:48,175][32858] Loop batcher_evt_loop terminating... [2023-02-25 21:27:48,176][32858] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007466_30580736.pth... [2023-02-25 21:27:48,173][00219] Runner profile tree view: main_loop: 5643.2013 [2023-02-25 21:27:48,183][00219] Collected {0: 30580736}, FPS: 4709.2 [2023-02-25 21:27:48,347][32866] Weights refcount: 2 0 [2023-02-25 21:27:48,388][32866] Stopping InferenceWorker_p0-w0... [2023-02-25 21:27:48,388][32866] Loop inference_proc0-0_evt_loop terminating... 
[2023-02-25 21:27:48,627][32872] EvtLoop [rollout_proc1_evt_loop, process=rollout_proc1] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance1'), args=(1, 0) Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step return self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step obs, rew, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step obs, rew, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step observation, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step observation, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step obs, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step return self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step obs, 
reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step reward = self.game.make_action(actions_flattened, self.skip_frames) vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed. [2023-02-25 21:27:48,627][32871] EvtLoop [rollout_proc0_evt_loop, process=rollout_proc0] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance0'), args=(0, 0) Traceback (most recent call last): File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step return self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step obs, rew, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step obs, rew, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step observation, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step observation, reward, terminated, truncated, info = self.env.step(action) File 
"/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step obs, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step return self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step obs, reward, terminated, truncated, info = self.env.step(action) File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step reward = self.game.make_action(actions_flattened, self.skip_frames) vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed. [2023-02-25 21:27:48,631][32871] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc0_evt_loop [2023-02-25 21:27:48,751][32858] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007322_29990912.pth [2023-02-25 21:27:48,831][32858] Stopping LearnerWorker_p0... [2023-02-25 21:27:48,831][32858] Loop learner_proc0_evt_loop terminating... [2023-02-25 21:27:48,629][32872] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc1_evt_loop [2023-02-25 21:27:56,151][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 21:27:56,153][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 21:27:56,156][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 21:27:56,159][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 21:27:56,161][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 21:27:56,164][00219] Adding new argument 'video_name'=None that is not in the saved config file! 
[2023-02-25 21:27:56,165][00219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 21:27:56,167][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-25 21:27:56,168][00219] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-25 21:27:56,170][00219] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-25 21:27:56,171][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 21:27:56,174][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 21:27:56,176][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 21:27:56,178][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 21:27:56,180][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 21:27:56,224][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 21:27:56,229][00219] RunningMeanStd input shape: (1,) [2023-02-25 21:27:56,250][00219] ConvEncoder: input_channels=3 [2023-02-25 21:27:56,404][00219] Conv encoder output size: 512 [2023-02-25 21:27:56,408][00219] Policy head output size: 512 [2023-02-25 21:27:56,526][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007466_30580736.pth... [2023-02-25 21:27:57,290][00219] Num frames 100... [2023-02-25 21:27:57,401][00219] Num frames 200... [2023-02-25 21:27:57,518][00219] Num frames 300... [2023-02-25 21:27:57,633][00219] Num frames 400... [2023-02-25 21:27:57,742][00219] Num frames 500... [2023-02-25 21:27:57,857][00219] Num frames 600... [2023-02-25 21:27:57,970][00219] Num frames 700... [2023-02-25 21:27:58,084][00219] Num frames 800... [2023-02-25 21:27:58,199][00219] Num frames 900... [2023-02-25 21:27:58,310][00219] Num frames 1000... 
[2023-02-25 21:27:58,430][00219] Num frames 1100... [2023-02-25 21:27:58,553][00219] Num frames 1200... [2023-02-25 21:27:58,672][00219] Num frames 1300... [2023-02-25 21:27:58,782][00219] Num frames 1400... [2023-02-25 21:27:58,896][00219] Num frames 1500... [2023-02-25 21:27:59,007][00219] Num frames 1600... [2023-02-25 21:27:59,125][00219] Num frames 1700... [2023-02-25 21:27:59,211][00219] Avg episode rewards: #0: 42.259, true rewards: #0: 17.260 [2023-02-25 21:27:59,213][00219] Avg episode reward: 42.259, avg true_objective: 17.260 [2023-02-25 21:27:59,299][00219] Num frames 1800... [2023-02-25 21:27:59,419][00219] Num frames 1900... [2023-02-25 21:27:59,536][00219] Num frames 2000... [2023-02-25 21:27:59,658][00219] Num frames 2100... [2023-02-25 21:27:59,768][00219] Num frames 2200... [2023-02-25 21:27:59,887][00219] Num frames 2300... [2023-02-25 21:27:59,999][00219] Num frames 2400... [2023-02-25 21:28:00,113][00219] Num frames 2500... [2023-02-25 21:28:00,231][00219] Num frames 2600... [2023-02-25 21:28:00,344][00219] Num frames 2700... [2023-02-25 21:28:00,461][00219] Num frames 2800... [2023-02-25 21:28:00,579][00219] Num frames 2900... [2023-02-25 21:28:00,694][00219] Num frames 3000... [2023-02-25 21:28:00,805][00219] Num frames 3100... [2023-02-25 21:28:00,919][00219] Num frames 3200... [2023-02-25 21:28:01,039][00219] Num frames 3300... [2023-02-25 21:28:01,168][00219] Num frames 3400... [2023-02-25 21:28:01,325][00219] Num frames 3500... [2023-02-25 21:28:01,492][00219] Num frames 3600... [2023-02-25 21:28:01,652][00219] Num frames 3700... [2023-02-25 21:28:01,813][00219] Num frames 3800... [2023-02-25 21:28:01,914][00219] Avg episode rewards: #0: 54.629, true rewards: #0: 19.130 [2023-02-25 21:28:01,918][00219] Avg episode reward: 54.629, avg true_objective: 19.130 [2023-02-25 21:28:02,037][00219] Num frames 3900... [2023-02-25 21:28:02,192][00219] Num frames 4000... [2023-02-25 21:28:02,361][00219] Num frames 4100... 
[2023-02-25 21:28:02,519][00219] Num frames 4200... [2023-02-25 21:28:02,684][00219] Num frames 4300... [2023-02-25 21:28:02,848][00219] Num frames 4400... [2023-02-25 21:28:03,011][00219] Num frames 4500... [2023-02-25 21:28:03,177][00219] Num frames 4600... [2023-02-25 21:28:03,344][00219] Num frames 4700... [2023-02-25 21:28:03,509][00219] Num frames 4800... [2023-02-25 21:28:03,676][00219] Num frames 4900... [2023-02-25 21:28:03,839][00219] Num frames 5000... [2023-02-25 21:28:04,003][00219] Num frames 5100... [2023-02-25 21:28:04,126][00219] Num frames 5200... [2023-02-25 21:28:04,244][00219] Num frames 5300... [2023-02-25 21:28:04,369][00219] Num frames 5400... [2023-02-25 21:28:04,520][00219] Avg episode rewards: #0: 51.606, true rewards: #0: 18.273 [2023-02-25 21:28:04,523][00219] Avg episode reward: 51.606, avg true_objective: 18.273 [2023-02-25 21:28:04,548][00219] Num frames 5500... [2023-02-25 21:28:04,671][00219] Num frames 5600... [2023-02-25 21:28:04,796][00219] Num frames 5700... [2023-02-25 21:28:04,910][00219] Num frames 5800... [2023-02-25 21:28:05,022][00219] Num frames 5900... [2023-02-25 21:28:05,136][00219] Num frames 6000... [2023-02-25 21:28:05,246][00219] Num frames 6100... [2023-02-25 21:28:05,360][00219] Num frames 6200... [2023-02-25 21:28:05,437][00219] Avg episode rewards: #0: 41.795, true rewards: #0: 15.545 [2023-02-25 21:28:05,439][00219] Avg episode reward: 41.795, avg true_objective: 15.545 [2023-02-25 21:28:05,531][00219] Num frames 6300... [2023-02-25 21:28:05,646][00219] Num frames 6400... [2023-02-25 21:28:05,760][00219] Num frames 6500... [2023-02-25 21:28:05,879][00219] Num frames 6600... [2023-02-25 21:28:05,990][00219] Num frames 6700... [2023-02-25 21:28:06,109][00219] Num frames 6800... [2023-02-25 21:28:06,221][00219] Num frames 6900... [2023-02-25 21:28:06,336][00219] Num frames 7000... [2023-02-25 21:28:06,453][00219] Num frames 7100... [2023-02-25 21:28:06,566][00219] Num frames 7200... 
[2023-02-25 21:28:06,684][00219] Num frames 7300... [2023-02-25 21:28:06,797][00219] Num frames 7400... [2023-02-25 21:28:06,913][00219] Num frames 7500... [2023-02-25 21:28:07,024][00219] Num frames 7600... [2023-02-25 21:28:07,136][00219] Num frames 7700... [2023-02-25 21:28:07,255][00219] Num frames 7800... [2023-02-25 21:28:07,366][00219] Num frames 7900... [2023-02-25 21:28:07,482][00219] Num frames 8000... [2023-02-25 21:28:07,596][00219] Num frames 8100... [2023-02-25 21:28:07,719][00219] Num frames 8200... [2023-02-25 21:28:07,835][00219] Num frames 8300... [2023-02-25 21:28:07,912][00219] Avg episode rewards: #0: 44.835, true rewards: #0: 16.636 [2023-02-25 21:28:07,914][00219] Avg episode reward: 44.835, avg true_objective: 16.636 [2023-02-25 21:28:08,008][00219] Num frames 8400... [2023-02-25 21:28:08,116][00219] Num frames 8500... [2023-02-25 21:28:08,229][00219] Num frames 8600... [2023-02-25 21:28:08,340][00219] Num frames 8700... [2023-02-25 21:28:08,450][00219] Num frames 8800... [2023-02-25 21:28:08,567][00219] Avg episode rewards: #0: 39.083, true rewards: #0: 14.750 [2023-02-25 21:28:08,569][00219] Avg episode reward: 39.083, avg true_objective: 14.750 [2023-02-25 21:28:08,626][00219] Num frames 8900... [2023-02-25 21:28:08,749][00219] Num frames 9000... [2023-02-25 21:28:08,861][00219] Num frames 9100... [2023-02-25 21:28:08,971][00219] Num frames 9200... [2023-02-25 21:28:09,086][00219] Num frames 9300... [2023-02-25 21:28:09,196][00219] Num frames 9400... [2023-02-25 21:28:09,313][00219] Num frames 9500... [2023-02-25 21:28:09,427][00219] Num frames 9600... [2023-02-25 21:28:09,516][00219] Avg episode rewards: #0: 36.041, true rewards: #0: 13.756 [2023-02-25 21:28:09,519][00219] Avg episode reward: 36.041, avg true_objective: 13.756 [2023-02-25 21:28:09,601][00219] Num frames 9700... [2023-02-25 21:28:09,711][00219] Num frames 9800... [2023-02-25 21:28:09,832][00219] Num frames 9900... [2023-02-25 21:28:09,944][00219] Num frames 10000... 
[2023-02-25 21:28:10,059][00219] Num frames 10100... [2023-02-25 21:28:10,172][00219] Num frames 10200... [2023-02-25 21:28:10,292][00219] Num frames 10300... [2023-02-25 21:28:10,405][00219] Num frames 10400... [2023-02-25 21:28:10,516][00219] Num frames 10500... [2023-02-25 21:28:10,629][00219] Num frames 10600... [2023-02-25 21:28:10,742][00219] Num frames 10700... [2023-02-25 21:28:10,881][00219] Num frames 10800... [2023-02-25 21:28:10,995][00219] Num frames 10900... [2023-02-25 21:28:11,109][00219] Num frames 11000... [2023-02-25 21:28:11,224][00219] Num frames 11100... [2023-02-25 21:28:11,343][00219] Num frames 11200... [2023-02-25 21:28:11,456][00219] Num frames 11300... [2023-02-25 21:28:11,567][00219] Num frames 11400... [2023-02-25 21:28:11,680][00219] Num frames 11500... [2023-02-25 21:28:11,797][00219] Num frames 11600... [2023-02-25 21:28:11,919][00219] Num frames 11700... [2023-02-25 21:28:12,009][00219] Avg episode rewards: #0: 38.661, true rewards: #0: 14.661 [2023-02-25 21:28:12,010][00219] Avg episode reward: 38.661, avg true_objective: 14.661 [2023-02-25 21:28:12,094][00219] Num frames 11800... [2023-02-25 21:28:12,206][00219] Num frames 11900... [2023-02-25 21:28:12,317][00219] Num frames 12000... [2023-02-25 21:28:12,437][00219] Num frames 12100... [2023-02-25 21:28:12,548][00219] Num frames 12200... [2023-02-25 21:28:12,665][00219] Num frames 12300... [2023-02-25 21:28:12,779][00219] Num frames 12400... [2023-02-25 21:28:12,902][00219] Num frames 12500... [2023-02-25 21:28:13,012][00219] Num frames 12600... [2023-02-25 21:28:13,122][00219] Num frames 12700... [2023-02-25 21:28:13,235][00219] Num frames 12800... [2023-02-25 21:28:13,348][00219] Num frames 12900... [2023-02-25 21:28:13,464][00219] Num frames 13000... [2023-02-25 21:28:13,574][00219] Num frames 13100... [2023-02-25 21:28:13,686][00219] Num frames 13200... [2023-02-25 21:28:13,800][00219] Num frames 13300... [2023-02-25 21:28:13,918][00219] Num frames 13400... 
[2023-02-25 21:28:14,036][00219] Num frames 13500... [2023-02-25 21:28:14,200][00219] Num frames 13600... [2023-02-25 21:28:14,358][00219] Num frames 13700... [2023-02-25 21:28:14,438][00219] Avg episode rewards: #0: 40.903, true rewards: #0: 15.237 [2023-02-25 21:28:14,441][00219] Avg episode reward: 40.903, avg true_objective: 15.237 [2023-02-25 21:28:14,580][00219] Num frames 13800... [2023-02-25 21:28:14,734][00219] Num frames 13900... [2023-02-25 21:28:14,892][00219] Num frames 14000... [2023-02-25 21:28:15,051][00219] Num frames 14100... [2023-02-25 21:28:15,224][00219] Num frames 14200... [2023-02-25 21:28:15,385][00219] Num frames 14300... [2023-02-25 21:28:15,545][00219] Num frames 14400... [2023-02-25 21:28:15,633][00219] Avg episode rewards: #0: 38.716, true rewards: #0: 14.417 [2023-02-25 21:28:15,636][00219] Avg episode reward: 38.716, avg true_objective: 14.417 [2023-02-25 21:29:38,377][00219] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-25 21:29:38,856][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 21:29:38,858][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 21:29:38,861][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 21:29:38,863][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 21:29:38,865][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 21:29:38,867][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 21:29:38,868][00219] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-25 21:29:38,869][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! 
[2023-02-25 21:29:38,870][00219] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-25 21:29:38,872][00219] Adding new argument 'hf_repository'='msgerasyov/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-25 21:29:38,873][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 21:29:38,874][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 21:29:38,875][00219] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-25 21:29:38,876][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 21:29:38,878][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 21:29:38,899][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 21:29:38,902][00219] RunningMeanStd input shape: (1,) [2023-02-25 21:29:38,924][00219] ConvEncoder: input_channels=3 [2023-02-25 21:29:38,986][00219] Conv encoder output size: 512 [2023-02-25 21:29:38,990][00219] Policy head output size: 512 [2023-02-25 21:29:39,019][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007466_30580736.pth... [2023-02-25 21:29:39,680][00219] Num frames 100... [2023-02-25 21:29:39,800][00219] Num frames 200... [2023-02-25 21:29:39,908][00219] Num frames 300... [2023-02-25 21:29:40,020][00219] Num frames 400... [2023-02-25 21:29:40,132][00219] Num frames 500... [2023-02-25 21:29:40,261][00219] Num frames 600... [2023-02-25 21:29:40,373][00219] Num frames 700... [2023-02-25 21:29:40,486][00219] Num frames 800... [2023-02-25 21:29:40,595][00219] Num frames 900... [2023-02-25 21:29:40,709][00219] Num frames 1000... [2023-02-25 21:29:40,831][00219] Num frames 1100... [2023-02-25 21:29:40,940][00219] Num frames 1200... [2023-02-25 21:29:41,071][00219] Num frames 1300... 
[2023-02-25 21:29:41,184][00219] Num frames 1400... [2023-02-25 21:29:41,305][00219] Num frames 1500... [2023-02-25 21:29:41,429][00219] Num frames 1600... [2023-02-25 21:29:41,548][00219] Num frames 1700... [2023-02-25 21:29:41,664][00219] Num frames 1800... [2023-02-25 21:29:41,781][00219] Num frames 1900... [2023-02-25 21:29:41,903][00219] Num frames 2000... [2023-02-25 21:29:42,022][00219] Num frames 2100... [2023-02-25 21:29:42,077][00219] Avg episode rewards: #0: 55.999, true rewards: #0: 21.000 [2023-02-25 21:29:42,079][00219] Avg episode reward: 55.999, avg true_objective: 21.000 [2023-02-25 21:29:42,200][00219] Num frames 2200... [2023-02-25 21:29:42,317][00219] Num frames 2300... [2023-02-25 21:29:42,433][00219] Num frames 2400... [2023-02-25 21:29:42,545][00219] Num frames 2500... [2023-02-25 21:29:42,661][00219] Num frames 2600... [2023-02-25 21:29:42,773][00219] Num frames 2700... [2023-02-25 21:29:42,891][00219] Num frames 2800... [2023-02-25 21:29:43,003][00219] Num frames 2900... [2023-02-25 21:29:43,112][00219] Num frames 3000... [2023-02-25 21:29:43,237][00219] Num frames 3100... [2023-02-25 21:29:43,348][00219] Num frames 3200... [2023-02-25 21:29:43,461][00219] Num frames 3300... [2023-02-25 21:29:43,570][00219] Num frames 3400... [2023-02-25 21:29:43,691][00219] Num frames 3500... [2023-02-25 21:29:43,806][00219] Num frames 3600... [2023-02-25 21:29:43,926][00219] Num frames 3700... [2023-02-25 21:29:44,038][00219] Num frames 3800... [2023-02-25 21:29:44,155][00219] Num frames 3900... [2023-02-25 21:29:44,272][00219] Num frames 4000... [2023-02-25 21:29:44,391][00219] Num frames 4100... [2023-02-25 21:29:44,506][00219] Num frames 4200... [2023-02-25 21:29:44,560][00219] Avg episode rewards: #0: 61.999, true rewards: #0: 21.000 [2023-02-25 21:29:44,561][00219] Avg episode reward: 61.999, avg true_objective: 21.000 [2023-02-25 21:29:44,679][00219] Num frames 4300... [2023-02-25 21:29:44,786][00219] Num frames 4400... 
[2023-02-25 21:29:44,903][00219] Num frames 4500... [2023-02-25 21:29:45,015][00219] Num frames 4600... [2023-02-25 21:29:45,131][00219] Num frames 4700... [2023-02-25 21:29:45,240][00219] Num frames 4800... [2023-02-25 21:29:45,360][00219] Num frames 4900... [2023-02-25 21:29:45,470][00219] Num frames 5000... [2023-02-25 21:29:45,587][00219] Num frames 5100... [2023-02-25 21:29:45,698][00219] Num frames 5200... [2023-02-25 21:29:45,814][00219] Num frames 5300... [2023-02-25 21:29:45,927][00219] Num frames 5400... [2023-02-25 21:29:46,044][00219] Num frames 5500... [2023-02-25 21:29:46,154][00219] Num frames 5600... [2023-02-25 21:29:46,269][00219] Num frames 5700... [2023-02-25 21:29:46,389][00219] Num frames 5800... [2023-02-25 21:29:46,500][00219] Num frames 5900... [2023-02-25 21:29:46,620][00219] Num frames 6000... [2023-02-25 21:29:46,732][00219] Num frames 6100... [2023-02-25 21:29:46,851][00219] Num frames 6200... [2023-02-25 21:29:46,946][00219] Avg episode rewards: #0: 58.779, true rewards: #0: 20.780 [2023-02-25 21:29:46,948][00219] Avg episode reward: 58.779, avg true_objective: 20.780 [2023-02-25 21:29:47,031][00219] Num frames 6300... [2023-02-25 21:29:47,140][00219] Num frames 6400... [2023-02-25 21:29:47,255][00219] Num frames 6500... [2023-02-25 21:29:47,402][00219] Num frames 6600... [2023-02-25 21:29:47,564][00219] Num frames 6700... [2023-02-25 21:29:47,723][00219] Num frames 6800... [2023-02-25 21:29:47,886][00219] Num frames 6900... [2023-02-25 21:29:48,040][00219] Num frames 7000... [2023-02-25 21:29:48,192][00219] Num frames 7100... [2023-02-25 21:29:48,352][00219] Num frames 7200... [2023-02-25 21:29:48,514][00219] Num frames 7300... [2023-02-25 21:29:48,671][00219] Num frames 7400... [2023-02-25 21:29:48,838][00219] Num frames 7500... 
[2023-02-25 21:29:49,005][00219] Avg episode rewards: #0: 52.674, true rewards: #0: 18.925 [2023-02-25 21:29:49,007][00219] Avg episode reward: 52.674, avg true_objective: 18.925 [2023-02-25 21:29:49,063][00219] Num frames 7600... [2023-02-25 21:29:49,225][00219] Num frames 7700... [2023-02-25 21:29:49,395][00219] Num frames 7800... [2023-02-25 21:29:49,559][00219] Num frames 7900... [2023-02-25 21:29:49,716][00219] Num frames 8000... [2023-02-25 21:29:49,873][00219] Num frames 8100... [2023-02-25 21:29:50,033][00219] Num frames 8200... [2023-02-25 21:29:50,186][00219] Num frames 8300... [2023-02-25 21:29:50,305][00219] Num frames 8400... [2023-02-25 21:29:50,422][00219] Num frames 8500... [2023-02-25 21:29:50,555][00219] Avg episode rewards: #0: 46.725, true rewards: #0: 17.126 [2023-02-25 21:29:50,557][00219] Avg episode reward: 46.725, avg true_objective: 17.126 [2023-02-25 21:29:50,603][00219] Num frames 8600... [2023-02-25 21:29:50,715][00219] Num frames 8700... [2023-02-25 21:29:50,825][00219] Num frames 8800... [2023-02-25 21:29:50,940][00219] Num frames 8900... [2023-02-25 21:29:51,050][00219] Num frames 9000... [2023-02-25 21:29:51,159][00219] Num frames 9100... [2023-02-25 21:29:51,275][00219] Num frames 9200... [2023-02-25 21:29:51,444][00219] Avg episode rewards: #0: 41.665, true rewards: #0: 15.498 [2023-02-25 21:29:51,447][00219] Avg episode reward: 41.665, avg true_objective: 15.498 [2023-02-25 21:29:51,453][00219] Num frames 9300... [2023-02-25 21:29:51,585][00219] Num frames 9400... [2023-02-25 21:29:51,707][00219] Num frames 9500... [2023-02-25 21:29:51,834][00219] Num frames 9600... [2023-02-25 21:29:51,953][00219] Num frames 9700... [2023-02-25 21:29:52,078][00219] Num frames 9800... [2023-02-25 21:29:52,192][00219] Num frames 9900... [2023-02-25 21:29:52,311][00219] Num frames 10000... [2023-02-25 21:29:52,431][00219] Num frames 10100... [2023-02-25 21:29:52,554][00219] Num frames 10200... [2023-02-25 21:29:52,666][00219] Num frames 10300... 
[2023-02-25 21:29:52,783][00219] Num frames 10400... [2023-02-25 21:29:52,894][00219] Num frames 10500... [2023-02-25 21:29:53,013][00219] Num frames 10600... [2023-02-25 21:29:53,125][00219] Num frames 10700... [2023-02-25 21:29:53,242][00219] Num frames 10800... [2023-02-25 21:29:53,354][00219] Num frames 10900... [2023-02-25 21:29:53,477][00219] Num frames 11000... [2023-02-25 21:29:53,589][00219] Num frames 11100... [2023-02-25 21:29:53,712][00219] Avg episode rewards: #0: 43.364, true rewards: #0: 15.936 [2023-02-25 21:29:53,713][00219] Avg episode reward: 43.364, avg true_objective: 15.936 [2023-02-25 21:29:53,769][00219] Num frames 11200... [2023-02-25 21:29:53,884][00219] Num frames 11300... [2023-02-25 21:29:53,994][00219] Num frames 11400... [2023-02-25 21:29:54,106][00219] Num frames 11500... [2023-02-25 21:29:54,228][00219] Num frames 11600... [2023-02-25 21:29:54,341][00219] Num frames 11700... [2023-02-25 21:29:54,462][00219] Num frames 11800... [2023-02-25 21:29:54,581][00219] Num frames 11900... [2023-02-25 21:29:54,704][00219] Num frames 12000... [2023-02-25 21:29:54,817][00219] Num frames 12100... [2023-02-25 21:29:54,937][00219] Num frames 12200... [2023-02-25 21:29:55,050][00219] Num frames 12300... [2023-02-25 21:29:55,169][00219] Num frames 12400... [2023-02-25 21:29:55,285][00219] Num frames 12500... [2023-02-25 21:29:55,407][00219] Num frames 12600... [2023-02-25 21:29:55,530][00219] Num frames 12700... [2023-02-25 21:29:55,656][00219] Num frames 12800... [2023-02-25 21:29:55,776][00219] Num frames 12900... [2023-02-25 21:29:55,888][00219] Num frames 13000... [2023-02-25 21:29:56,003][00219] Num frames 13100... [2023-02-25 21:29:56,125][00219] Num frames 13200... [2023-02-25 21:29:56,241][00219] Avg episode rewards: #0: 45.318, true rewards: #0: 16.569 [2023-02-25 21:29:56,243][00219] Avg episode reward: 45.318, avg true_objective: 16.569 [2023-02-25 21:29:56,300][00219] Num frames 13300... 
[2023-02-25 21:29:56,418][00219] Num frames 13400... [2023-02-25 21:29:56,536][00219] Num frames 13500... [2023-02-25 21:29:56,654][00219] Num frames 13600... [2023-02-25 21:29:56,768][00219] Num frames 13700... [2023-02-25 21:29:56,886][00219] Num frames 13800... [2023-02-25 21:29:56,997][00219] Num frames 13900... [2023-02-25 21:29:57,125][00219] Num frames 14000... [2023-02-25 21:29:57,238][00219] Num frames 14100... [2023-02-25 21:29:57,353][00219] Num frames 14200... [2023-02-25 21:29:57,466][00219] Num frames 14300... [2023-02-25 21:29:57,595][00219] Num frames 14400... [2023-02-25 21:29:57,708][00219] Num frames 14500... [2023-02-25 21:29:57,827][00219] Num frames 14600... [2023-02-25 21:29:57,943][00219] Num frames 14700... [2023-02-25 21:29:58,062][00219] Num frames 14800... [2023-02-25 21:29:58,175][00219] Num frames 14900... [2023-02-25 21:29:58,294][00219] Num frames 15000... [2023-02-25 21:29:58,368][00219] Avg episode rewards: #0: 45.460, true rewards: #0: 16.683 [2023-02-25 21:29:58,371][00219] Avg episode reward: 45.460, avg true_objective: 16.683 [2023-02-25 21:29:58,468][00219] Num frames 15100... [2023-02-25 21:29:58,593][00219] Num frames 15200... [2023-02-25 21:29:58,705][00219] Num frames 15300... [2023-02-25 21:29:58,822][00219] Num frames 15400... [2023-02-25 21:29:58,934][00219] Num frames 15500... [2023-02-25 21:29:59,051][00219] Num frames 15600... [2023-02-25 21:29:59,162][00219] Num frames 15700... [2023-02-25 21:29:59,279][00219] Num frames 15800... [2023-02-25 21:29:59,390][00219] Num frames 15900... [2023-02-25 21:29:59,510][00219] Num frames 16000... [2023-02-25 21:29:59,631][00219] Num frames 16100... [2023-02-25 21:29:59,751][00219] Num frames 16200... [2023-02-25 21:29:59,863][00219] Num frames 16300... [2023-02-25 21:29:59,980][00219] Num frames 16400... [2023-02-25 21:30:00,093][00219] Num frames 16500... [2023-02-25 21:30:00,221][00219] Num frames 16600... [2023-02-25 21:30:00,377][00219] Num frames 16700... 
[2023-02-25 21:30:00,554][00219] Avg episode rewards: #0: 44.874, true rewards: #0: 16.775 [2023-02-25 21:30:00,557][00219] Avg episode reward: 44.874, avg true_objective: 16.775 [2023-02-25 21:31:35,010][00219] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-25 21:31:38,070][00219] The model has been pushed to https://huggingface.co/msgerasyov/rl_course_vizdoom_health_gathering_supreme [2023-02-25 21:31:40,245][00219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-25 21:31:40,247][00219] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-25 21:31:40,250][00219] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-25 21:31:40,253][00219] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-25 21:31:40,254][00219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-25 21:31:40,260][00219] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-25 21:31:40,261][00219] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-25 21:31:40,263][00219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-25 21:31:40,264][00219] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-25 21:31:40,268][00219] Adding new argument 'hf_repository'='msgerasyov/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-25 21:31:40,269][00219] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-25 21:31:40,271][00219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-25 21:31:40,272][00219] Adding new argument 'train_script'=None that is not in the saved config file! 
[2023-02-25 21:31:40,274][00219] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-25 21:31:40,275][00219] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-25 21:31:40,303][00219] RunningMeanStd input shape: (3, 72, 128) [2023-02-25 21:31:40,306][00219] RunningMeanStd input shape: (1,) [2023-02-25 21:31:40,337][00219] ConvEncoder: input_channels=3 [2023-02-25 21:31:40,431][00219] Conv encoder output size: 512 [2023-02-25 21:31:40,437][00219] Policy head output size: 512 [2023-02-25 21:31:40,468][00219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000007466_30580736.pth... [2023-02-25 21:31:40,958][00219] Num frames 100... [2023-02-25 21:31:41,067][00219] Num frames 200... [2023-02-25 21:31:41,184][00219] Num frames 300... [2023-02-25 21:31:41,296][00219] Num frames 400... [2023-02-25 21:31:41,424][00219] Num frames 500... [2023-02-25 21:31:41,540][00219] Num frames 600... [2023-02-25 21:31:41,674][00219] Num frames 700... [2023-02-25 21:31:41,841][00219] Avg episode rewards: #0: 17.680, true rewards: #0: 7.680 [2023-02-25 21:31:41,844][00219] Avg episode reward: 17.680, avg true_objective: 7.680 [2023-02-25 21:31:41,903][00219] Num frames 800... [2023-02-25 21:31:42,013][00219] Num frames 900... [2023-02-25 21:31:42,126][00219] Num frames 1000... [2023-02-25 21:31:42,245][00219] Num frames 1100... [2023-02-25 21:31:42,363][00219] Num frames 1200... [2023-02-25 21:31:42,483][00219] Num frames 1300... [2023-02-25 21:31:42,597][00219] Num frames 1400... [2023-02-25 21:31:42,716][00219] Num frames 1500... [2023-02-25 21:31:42,838][00219] Num frames 1600... [2023-02-25 21:31:42,958][00219] Num frames 1700... [2023-02-25 21:31:43,070][00219] Num frames 1800... [2023-02-25 21:31:43,184][00219] Num frames 1900... 
[2023-02-25 21:31:43,337][00219] Avg episode rewards: #0: 21.920, true rewards: #0: 9.920 [2023-02-25 21:31:43,339][00219] Avg episode reward: 21.920, avg true_objective: 9.920 [2023-02-25 21:31:43,361][00219] Num frames 2000... [2023-02-25 21:31:43,483][00219] Num frames 2100... [2023-02-25 21:31:43,597][00219] Num frames 2200... [2023-02-25 21:31:43,760][00219] Num frames 2300... [2023-02-25 21:31:43,976][00219] Num frames 2400... [2023-02-25 21:31:44,140][00219] Num frames 2500... [2023-02-25 21:31:44,397][00219] Num frames 2600... [2023-02-25 21:31:44,588][00219] Num frames 2700... [2023-02-25 21:31:44,865][00219] Num frames 2800... [2023-02-25 21:31:45,077][00219] Num frames 2900... [2023-02-25 21:31:45,278][00219] Num frames 3000... [2023-02-25 21:31:45,462][00219] Num frames 3100... [2023-02-25 21:31:45,688][00219] Num frames 3200... [2023-02-25 21:31:45,936][00219] Num frames 3300... [2023-02-25 21:31:46,118][00219] Num frames 3400... [2023-02-25 21:31:46,292][00219] Num frames 3500... [2023-02-25 21:31:46,470][00219] Num frames 3600... [2023-02-25 21:31:46,639][00219] Num frames 3700... [2023-02-25 21:31:46,967][00219] Num frames 3800... [2023-02-25 21:31:47,107][00219] Avg episode rewards: #0: 32.713, true rewards: #0: 12.713 [2023-02-25 21:31:47,112][00219] Avg episode reward: 32.713, avg true_objective: 12.713 [2023-02-25 21:31:47,420][00219] Num frames 3900... [2023-02-25 21:31:47,881][00219] Num frames 4000... [2023-02-25 21:31:48,301][00219] Num frames 4100... [2023-02-25 21:31:48,730][00219] Num frames 4200... [2023-02-25 21:31:49,382][00219] Num frames 4300... [2023-02-25 21:31:49,744][00219] Num frames 4400... [2023-02-25 21:31:50,103][00219] Num frames 4500... [2023-02-25 21:31:50,485][00219] Num frames 4600... [2023-02-25 21:31:50,894][00219] Num frames 4700... [2023-02-25 21:31:51,299][00219] Num frames 4800... [2023-02-25 21:31:51,492][00219] Num frames 4900... 
[2023-02-25 21:31:51,638][00219] Avg episode rewards: #0: 31.097, true rewards: #0: 12.347 [2023-02-25 21:31:51,644][00219] Avg episode reward: 31.097, avg true_objective: 12.347 [2023-02-25 21:31:51,772][00219] Num frames 5000... [2023-02-25 21:31:51,964][00219] Num frames 5100... [2023-02-25 21:31:52,190][00219] Num frames 5200... [2023-02-25 21:31:52,325][00219] Num frames 5300... [2023-02-25 21:31:52,436][00219] Num frames 5400... [2023-02-25 21:31:52,558][00219] Num frames 5500... [2023-02-25 21:31:52,674][00219] Num frames 5600... [2023-02-25 21:31:52,795][00219] Num frames 5700... [2023-02-25 21:31:52,908][00219] Num frames 5800... [2023-02-25 21:31:53,023][00219] Num frames 5900... [2023-02-25 21:31:53,161][00219] Avg episode rewards: #0: 30.140, true rewards: #0: 11.940 [2023-02-25 21:31:53,163][00219] Avg episode reward: 30.140, avg true_objective: 11.940 [2023-02-25 21:31:53,206][00219] Num frames 6000... [2023-02-25 21:31:53,334][00219] Num frames 6100... [2023-02-25 21:31:53,453][00219] Num frames 6200... [2023-02-25 21:31:53,573][00219] Num frames 6300... [2023-02-25 21:31:53,686][00219] Num frames 6400... [2023-02-25 21:31:53,819][00219] Num frames 6500... [2023-02-25 21:31:53,930][00219] Num frames 6600... [2023-02-25 21:31:54,047][00219] Num frames 6700... [2023-02-25 21:31:54,159][00219] Num frames 6800... [2023-02-25 21:31:54,276][00219] Num frames 6900... [2023-02-25 21:31:54,389][00219] Num frames 7000... [2023-02-25 21:31:54,505][00219] Num frames 7100... [2023-02-25 21:31:54,623][00219] Num frames 7200... [2023-02-25 21:31:54,744][00219] Num frames 7300... [2023-02-25 21:31:54,864][00219] Num frames 7400... [2023-02-25 21:31:54,984][00219] Num frames 7500... [2023-02-25 21:31:55,105][00219] Num frames 7600... [2023-02-25 21:31:55,226][00219] Num frames 7700... [2023-02-25 21:31:55,345][00219] Num frames 7800... [2023-02-25 21:31:55,461][00219] Num frames 7900... [2023-02-25 21:31:55,583][00219] Num frames 8000... 
[2023-02-25 21:31:55,719][00219] Avg episode rewards: #0: 34.283, true rewards: #0: 13.450 [2023-02-25 21:31:55,721][00219] Avg episode reward: 34.283, avg true_objective: 13.450 [2023-02-25 21:31:55,769][00219] Num frames 8100... [2023-02-25 21:31:55,886][00219] Num frames 8200... [2023-02-25 21:31:55,998][00219] Num frames 8300... [2023-02-25 21:31:56,114][00219] Num frames 8400... [2023-02-25 21:31:56,229][00219] Num frames 8500... [2023-02-25 21:31:56,357][00219] Num frames 8600... [2023-02-25 21:31:56,469][00219] Num frames 8700... [2023-02-25 21:31:56,578][00219] Num frames 8800... [2023-02-25 21:31:56,693][00219] Num frames 8900... [2023-02-25 21:31:56,812][00219] Num frames 9000... [2023-02-25 21:31:56,930][00219] Num frames 9100... [2023-02-25 21:31:57,042][00219] Num frames 9200... [2023-02-25 21:31:57,159][00219] Num frames 9300... [2023-02-25 21:31:57,270][00219] Num frames 9400... [2023-02-25 21:31:57,388][00219] Num frames 9500... [2023-02-25 21:31:57,501][00219] Num frames 9600... [2023-02-25 21:31:57,618][00219] Num frames 9700... [2023-02-25 21:31:57,730][00219] Num frames 9800... [2023-02-25 21:31:57,808][00219] Avg episode rewards: #0: 36.454, true rewards: #0: 14.026 [2023-02-25 21:31:57,810][00219] Avg episode reward: 36.454, avg true_objective: 14.026 [2023-02-25 21:31:57,909][00219] Num frames 9900... [2023-02-25 21:31:58,021][00219] Num frames 10000... [2023-02-25 21:31:58,136][00219] Num frames 10100... [2023-02-25 21:31:58,248][00219] Num frames 10200... [2023-02-25 21:31:58,365][00219] Num frames 10300... [2023-02-25 21:31:58,478][00219] Num frames 10400... [2023-02-25 21:31:58,589][00219] Num frames 10500... [2023-02-25 21:31:58,707][00219] Num frames 10600... [2023-02-25 21:31:58,829][00219] Num frames 10700... [2023-02-25 21:31:58,943][00219] Num frames 10800... [2023-02-25 21:31:59,057][00219] Num frames 10900... [2023-02-25 21:31:59,175][00219] Num frames 11000... [2023-02-25 21:31:59,290][00219] Num frames 11100... 
[2023-02-25 21:31:59,408][00219] Num frames 11200... [2023-02-25 21:31:59,520][00219] Num frames 11300... [2023-02-25 21:31:59,631][00219] Num frames 11400... [2023-02-25 21:31:59,742][00219] Num frames 11500... [2023-02-25 21:31:59,854][00219] Avg episode rewards: #0: 38.057, true rewards: #0: 14.432 [2023-02-25 21:31:59,856][00219] Avg episode reward: 38.057, avg true_objective: 14.432 [2023-02-25 21:31:59,920][00219] Num frames 11600... [2023-02-25 21:32:00,036][00219] Num frames 11700... [2023-02-25 21:32:00,147][00219] Num frames 11800... [2023-02-25 21:32:00,267][00219] Num frames 11900... [2023-02-25 21:32:00,382][00219] Num frames 12000... [2023-02-25 21:32:00,494][00219] Num frames 12100... [2023-02-25 21:32:00,607][00219] Num frames 12200... [2023-02-25 21:32:00,736][00219] Num frames 12300... [2023-02-25 21:32:00,861][00219] Num frames 12400... [2023-02-25 21:32:00,975][00219] Num frames 12500... [2023-02-25 21:32:01,094][00219] Num frames 12600... [2023-02-25 21:32:01,207][00219] Num frames 12700... [2023-02-25 21:32:01,340][00219] Num frames 12800... [2023-02-25 21:32:01,503][00219] Num frames 12900... [2023-02-25 21:32:01,670][00219] Num frames 13000... [2023-02-25 21:32:01,840][00219] Num frames 13100... [2023-02-25 21:32:02,001][00219] Num frames 13200... [2023-02-25 21:32:02,171][00219] Num frames 13300... [2023-02-25 21:32:02,341][00219] Num frames 13400... [2023-02-25 21:32:02,519][00219] Num frames 13500... [2023-02-25 21:32:02,695][00219] Num frames 13600... [2023-02-25 21:32:02,826][00219] Avg episode rewards: #0: 39.828, true rewards: #0: 15.162 [2023-02-25 21:32:02,828][00219] Avg episode reward: 39.828, avg true_objective: 15.162 [2023-02-25 21:32:02,941][00219] Num frames 13700... [2023-02-25 21:32:03,101][00219] Num frames 13800... [2023-02-25 21:32:03,269][00219] Num frames 13900... [2023-02-25 21:32:03,443][00219] Num frames 14000... [2023-02-25 21:32:03,620][00219] Num frames 14100... 
[2023-02-25 21:32:03,789][00219] Num frames 14200... [2023-02-25 21:32:03,988][00219] Num frames 14300... [2023-02-25 21:32:04,170][00219] Num frames 14400... [2023-02-25 21:32:04,353][00219] Num frames 14500... [2023-02-25 21:32:04,530][00219] Num frames 14600... [2023-02-25 21:32:04,716][00219] Num frames 14700... [2023-02-25 21:32:04,839][00219] Avg episode rewards: #0: 38.834, true rewards: #0: 14.734 [2023-02-25 21:32:04,842][00219] Avg episode reward: 38.834, avg true_objective: 14.734 [2023-02-25 21:33:34,041][00219] Replay video saved to /content/train_dir/default_experiment/replay.mp4!