diff --git "a/sf_log.txt" "b/sf_log.txt" --- "a/sf_log.txt" +++ "b/sf_log.txt" @@ -1,50 +1,50 @@ -[2023-03-12 07:21:53,884][00184] Saving configuration to /content/train_dir/default_experiment/config.json... -[2023-03-12 07:21:53,889][00184] Rollout worker 0 uses device cpu -[2023-03-12 07:21:53,892][00184] Rollout worker 1 uses device cpu -[2023-03-12 07:21:53,893][00184] Rollout worker 2 uses device cpu -[2023-03-12 07:21:53,894][00184] Rollout worker 3 uses device cpu -[2023-03-12 07:21:53,895][00184] Rollout worker 4 uses device cpu -[2023-03-12 07:21:53,896][00184] Rollout worker 5 uses device cpu -[2023-03-12 07:21:53,897][00184] Rollout worker 6 uses device cpu -[2023-03-12 07:21:53,899][00184] Rollout worker 7 uses device cpu -[2023-03-12 07:21:54,091][00184] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-03-12 07:21:54,093][00184] InferenceWorker_p0-w0: min num requests: 2 -[2023-03-12 07:21:54,123][00184] Starting all processes... -[2023-03-12 07:21:54,124][00184] Starting process learner_proc0 -[2023-03-12 07:21:54,189][00184] Starting all processes... 
-[2023-03-12 07:21:54,198][00184] Starting process inference_proc0-0 -[2023-03-12 07:21:54,198][00184] Starting process rollout_proc0 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc1 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc2 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc3 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc4 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc5 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc6 -[2023-03-12 07:21:54,202][00184] Starting process rollout_proc7 -[2023-03-12 07:22:03,192][11773] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-03-12 07:22:03,196][11773] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 -[2023-03-12 07:22:03,427][11799] Worker 3 uses CPU cores [1] -[2023-03-12 07:22:03,610][11793] Worker 4 uses CPU cores [0] -[2023-03-12 07:22:03,756][11787] Worker 1 uses CPU cores [1] -[2023-03-12 07:22:03,824][11789] Worker 2 uses CPU cores [0] -[2023-03-12 07:22:03,847][11786] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-03-12 07:22:03,854][11786] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 -[2023-03-12 07:22:03,913][11800] Worker 7 uses CPU cores [1] -[2023-03-12 07:22:04,049][11791] Worker 0 uses CPU cores [0] -[2023-03-12 07:22:04,291][11798] Worker 5 uses CPU cores [1] -[2023-03-12 07:22:04,324][11801] Worker 6 uses CPU cores [0] -[2023-03-12 07:22:04,378][11786] Num visible devices: 1 -[2023-03-12 07:22:04,379][11773] Num visible devices: 1 -[2023-03-12 07:22:04,392][11773] Starting seed is not provided -[2023-03-12 07:22:04,392][11773] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-03-12 07:22:04,392][11773] Initializing actor-critic model on device cuda:0 -[2023-03-12 07:22:04,393][11773] RunningMeanStd input shape: (3, 72, 128) -[2023-03-12 07:22:04,395][11773] RunningMeanStd input 
shape: (1,) -[2023-03-12 07:22:04,415][11773] ConvEncoder: input_channels=3 -[2023-03-12 07:22:04,753][11773] Conv encoder output size: 512 -[2023-03-12 07:22:04,754][11773] Policy head output size: 512 -[2023-03-12 07:22:04,814][11773] Created Actor Critic model with architecture: -[2023-03-12 07:22:04,815][11773] ActorCriticSharedWeights( +[2023-03-14 14:00:34,317][00372] Saving configuration to /content/train_dir/default_experiment/config.json... +[2023-03-14 14:00:34,320][00372] Rollout worker 0 uses device cpu +[2023-03-14 14:00:34,324][00372] Rollout worker 1 uses device cpu +[2023-03-14 14:00:34,325][00372] Rollout worker 2 uses device cpu +[2023-03-14 14:00:34,327][00372] Rollout worker 3 uses device cpu +[2023-03-14 14:00:34,328][00372] Rollout worker 4 uses device cpu +[2023-03-14 14:00:34,329][00372] Rollout worker 5 uses device cpu +[2023-03-14 14:00:34,330][00372] Rollout worker 6 uses device cpu +[2023-03-14 14:00:34,331][00372] Rollout worker 7 uses device cpu +[2023-03-14 14:00:34,528][00372] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-03-14 14:00:34,530][00372] InferenceWorker_p0-w0: min num requests: 2 +[2023-03-14 14:00:34,772][00372] Starting all processes... +[2023-03-14 14:00:34,774][00372] Starting process learner_proc0 +[2023-03-14 14:00:34,843][00372] Starting all processes... 
+[2023-03-14 14:00:34,850][00372] Starting process inference_proc0-0 +[2023-03-14 14:00:34,852][00372] Starting process rollout_proc0 +[2023-03-14 14:00:34,875][00372] Starting process rollout_proc1 +[2023-03-14 14:00:34,877][00372] Starting process rollout_proc2 +[2023-03-14 14:00:34,877][00372] Starting process rollout_proc3 +[2023-03-14 14:00:34,877][00372] Starting process rollout_proc4 +[2023-03-14 14:00:34,877][00372] Starting process rollout_proc5 +[2023-03-14 14:00:34,882][00372] Starting process rollout_proc6 +[2023-03-14 14:00:34,882][00372] Starting process rollout_proc7 +[2023-03-14 14:00:47,488][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-03-14 14:00:47,490][13187] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 +[2023-03-14 14:00:47,910][13205] Worker 0 uses CPU cores [0] +[2023-03-14 14:00:48,043][13202] Worker 1 uses CPU cores [1] +[2023-03-14 14:00:48,189][13200] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-03-14 14:00:48,197][13200] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 +[2023-03-14 14:00:48,230][13207] Worker 4 uses CPU cores [0] +[2023-03-14 14:00:48,319][13209] Worker 7 uses CPU cores [1] +[2023-03-14 14:00:48,402][13208] Worker 3 uses CPU cores [1] +[2023-03-14 14:00:48,465][13211] Worker 6 uses CPU cores [0] +[2023-03-14 14:00:48,483][13210] Worker 5 uses CPU cores [1] +[2023-03-14 14:00:48,513][13204] Worker 2 uses CPU cores [0] +[2023-03-14 14:00:48,576][13187] Num visible devices: 1 +[2023-03-14 14:00:48,577][13200] Num visible devices: 1 +[2023-03-14 14:00:48,590][13187] Starting seed is not provided +[2023-03-14 14:00:48,591][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-03-14 14:00:48,591][13187] Initializing actor-critic model on device cuda:0 +[2023-03-14 14:00:48,592][13187] RunningMeanStd input shape: (3, 72, 128) +[2023-03-14 14:00:48,593][13187] RunningMeanStd input 
shape: (1,) +[2023-03-14 14:00:48,606][13187] ConvEncoder: input_channels=3 +[2023-03-14 14:00:48,868][13187] Conv encoder output size: 512 +[2023-03-14 14:00:48,869][13187] Policy head output size: 512 +[2023-03-14 14:00:48,915][13187] Created Actor Critic model with architecture: +[2023-03-14 14:00:48,916][13187] ActorCriticSharedWeights( (obs_normalizer): ObservationNormalizer( (running_mean_std): RunningMeanStdDictInPlace( (running_mean_std): ModuleDict( @@ -85,1411 +85,1292 @@ (distribution_linear): Linear(in_features=512, out_features=5, bias=True) ) ) -[2023-03-12 07:22:12,203][11773] Using optimizer -[2023-03-12 07:22:12,204][11773] No checkpoints found -[2023-03-12 07:22:12,205][11773] Did not load from checkpoint, starting from scratch! -[2023-03-12 07:22:12,205][11773] Initialized policy 0 weights for model version 0 -[2023-03-12 07:22:12,209][11773] Using GPUs [0] for process 0 (actually maps to GPUs [0]) -[2023-03-12 07:22:12,216][11773] LearnerWorker_p0 finished initialization! 
-[2023-03-12 07:22:12,403][11786] RunningMeanStd input shape: (3, 72, 128) -[2023-03-12 07:22:12,404][11786] RunningMeanStd input shape: (1,) -[2023-03-12 07:22:12,416][11786] ConvEncoder: input_channels=3 -[2023-03-12 07:22:12,514][11786] Conv encoder output size: 512 -[2023-03-12 07:22:12,514][11786] Policy head output size: 512 -[2023-03-12 07:22:14,084][00184] Heartbeat connected on Batcher_0 -[2023-03-12 07:22:14,092][00184] Heartbeat connected on LearnerWorker_p0 -[2023-03-12 07:22:14,101][00184] Heartbeat connected on RolloutWorker_w0 -[2023-03-12 07:22:14,108][00184] Heartbeat connected on RolloutWorker_w2 -[2023-03-12 07:22:14,111][00184] Heartbeat connected on RolloutWorker_w1 -[2023-03-12 07:22:14,114][00184] Heartbeat connected on RolloutWorker_w3 -[2023-03-12 07:22:14,117][00184] Heartbeat connected on RolloutWorker_w4 -[2023-03-12 07:22:14,121][00184] Heartbeat connected on RolloutWorker_w5 -[2023-03-12 07:22:14,124][00184] Heartbeat connected on RolloutWorker_w6 -[2023-03-12 07:22:14,126][00184] Heartbeat connected on RolloutWorker_w7 -[2023-03-12 07:22:14,134][00184] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-03-12 07:22:14,841][00184] Inference worker 0-0 is ready! -[2023-03-12 07:22:14,843][00184] All inference workers are ready! Signal rollout workers to start! 
-[2023-03-12 07:22:14,846][00184] Heartbeat connected on InferenceWorker_p0-w0 -[2023-03-12 07:22:14,954][11789] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:14,944][11793] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:14,964][11791] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,000][11801] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,021][11787] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,019][11800] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,024][11798] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,032][11799] Doom resolution: 160x120, resize resolution: (128, 72) -[2023-03-12 07:22:15,509][11787] Decorrelating experience for 0 frames... -[2023-03-12 07:22:15,858][11787] Decorrelating experience for 32 frames... -[2023-03-12 07:22:16,272][11787] Decorrelating experience for 64 frames... -[2023-03-12 07:22:16,384][11801] Decorrelating experience for 0 frames... -[2023-03-12 07:22:16,387][11789] Decorrelating experience for 0 frames... -[2023-03-12 07:22:16,389][11791] Decorrelating experience for 0 frames... -[2023-03-12 07:22:16,395][11793] Decorrelating experience for 0 frames... -[2023-03-12 07:22:17,233][11787] Decorrelating experience for 96 frames... -[2023-03-12 07:22:17,696][11800] Decorrelating experience for 0 frames... -[2023-03-12 07:22:17,761][11798] Decorrelating experience for 0 frames... -[2023-03-12 07:22:18,101][11793] Decorrelating experience for 32 frames... -[2023-03-12 07:22:18,112][11801] Decorrelating experience for 32 frames... -[2023-03-12 07:22:18,121][11789] Decorrelating experience for 32 frames... -[2023-03-12 07:22:18,123][11791] Decorrelating experience for 32 frames... -[2023-03-12 07:22:18,577][11800] Decorrelating experience for 32 frames... -[2023-03-12 07:22:18,684][11798] Decorrelating experience for 32 frames... 
-[2023-03-12 07:22:19,129][00184] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-03-12 07:22:19,345][11799] Decorrelating experience for 0 frames... -[2023-03-12 07:22:19,996][11800] Decorrelating experience for 64 frames... -[2023-03-12 07:22:20,224][11798] Decorrelating experience for 64 frames... -[2023-03-12 07:22:20,517][11799] Decorrelating experience for 32 frames... -[2023-03-12 07:22:21,074][11800] Decorrelating experience for 96 frames... -[2023-03-12 07:22:21,252][11798] Decorrelating experience for 96 frames... -[2023-03-12 07:22:21,735][11799] Decorrelating experience for 64 frames... -[2023-03-12 07:22:22,208][11799] Decorrelating experience for 96 frames... -[2023-03-12 07:22:22,996][11793] Decorrelating experience for 64 frames... -[2023-03-12 07:22:23,007][11791] Decorrelating experience for 64 frames... -[2023-03-12 07:22:23,091][11801] Decorrelating experience for 64 frames... -[2023-03-12 07:22:24,129][00184] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2.0. Samples: 20. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-03-12 07:22:25,473][11789] Decorrelating experience for 64 frames... -[2023-03-12 07:22:25,608][11793] Decorrelating experience for 96 frames... -[2023-03-12 07:22:25,626][11791] Decorrelating experience for 96 frames... -[2023-03-12 07:22:29,135][00184] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 65.4. Samples: 982. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) -[2023-03-12 07:22:29,142][00184] Avg episode reward: [(0, '2.206')] -[2023-03-12 07:22:30,209][11801] Decorrelating experience for 96 frames... -[2023-03-12 07:22:30,240][11789] Decorrelating experience for 96 frames... -[2023-03-12 07:22:31,074][11773] Signal inference workers to stop experience collection... 
-[2023-03-12 07:22:31,084][11786] InferenceWorker_p0-w0: stopping experience collection -[2023-03-12 07:22:33,462][11773] Signal inference workers to resume experience collection... -[2023-03-12 07:22:33,465][11786] InferenceWorker_p0-w0: resuming experience collection -[2023-03-12 07:22:34,129][00184] Fps is (10 sec: 409.6, 60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096. Throughput: 0: 115.5. Samples: 2310. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) -[2023-03-12 07:22:34,132][00184] Avg episode reward: [(0, '2.842')] -[2023-03-12 07:22:39,129][00184] Fps is (10 sec: 2869.0, 60 sec: 1146.9, 300 sec: 1146.9). Total num frames: 28672. Throughput: 0: 302.3. Samples: 7558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:22:39,131][00184] Avg episode reward: [(0, '3.764')] -[2023-03-12 07:22:43,326][11786] Updated weights for policy 0, policy_version 10 (0.0366) -[2023-03-12 07:22:44,129][00184] Fps is (10 sec: 3686.4, 60 sec: 1365.3, 300 sec: 1365.3). Total num frames: 40960. Throughput: 0: 328.6. Samples: 9858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:22:44,139][00184] Avg episode reward: [(0, '4.227')] -[2023-03-12 07:22:49,129][00184] Fps is (10 sec: 2867.2, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 57344. Throughput: 0: 408.7. Samples: 14306. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:22:49,132][00184] Avg episode reward: [(0, '4.532')] -[2023-03-12 07:22:54,129][00184] Fps is (10 sec: 3276.8, 60 sec: 1843.2, 300 sec: 1843.2). Total num frames: 73728. Throughput: 0: 493.1. Samples: 19722. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:22:54,131][00184] Avg episode reward: [(0, '4.448')] -[2023-03-12 07:22:55,180][11786] Updated weights for policy 0, policy_version 20 (0.0017) -[2023-03-12 07:22:59,133][00184] Fps is (10 sec: 3685.1, 60 sec: 2093.3, 300 sec: 2093.3). Total num frames: 94208. Throughput: 0: 491.1. Samples: 22100. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) -[2023-03-12 07:22:59,140][00184] Avg episode reward: [(0, '4.366')] -[2023-03-12 07:23:04,134][00184] Fps is (10 sec: 3275.2, 60 sec: 2129.7, 300 sec: 2129.7). Total num frames: 106496. Throughput: 0: 603.1. Samples: 27144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:23:04,137][00184] Avg episode reward: [(0, '4.464')] -[2023-03-12 07:23:04,142][11773] Saving new best policy, reward=4.464! -[2023-03-12 07:23:09,132][00184] Fps is (10 sec: 2457.9, 60 sec: 2159.6, 300 sec: 2159.6). Total num frames: 118784. Throughput: 0: 675.0. Samples: 30398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:23:09,134][00184] Avg episode reward: [(0, '4.620')] -[2023-03-12 07:23:09,141][11773] Saving new best policy, reward=4.620! -[2023-03-12 07:23:12,177][11786] Updated weights for policy 0, policy_version 30 (0.0020) -[2023-03-12 07:23:14,129][00184] Fps is (10 sec: 2049.0, 60 sec: 2116.3, 300 sec: 2116.3). Total num frames: 126976. Throughput: 0: 672.2. Samples: 31226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:23:14,137][00184] Avg episode reward: [(0, '4.708')] -[2023-03-12 07:23:14,142][11773] Saving new best policy, reward=4.708! -[2023-03-12 07:23:19,129][00184] Fps is (10 sec: 2868.0, 60 sec: 2457.6, 300 sec: 2268.6). Total num frames: 147456. Throughput: 0: 760.8. Samples: 36546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:23:19,132][00184] Avg episode reward: [(0, '4.552')] -[2023-03-12 07:23:22,348][11786] Updated weights for policy 0, policy_version 40 (0.0014) -[2023-03-12 07:23:24,131][00184] Fps is (10 sec: 4095.2, 60 sec: 2798.8, 300 sec: 2399.0). Total num frames: 167936. Throughput: 0: 791.3. Samples: 43168. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-03-12 07:23:24,136][00184] Avg episode reward: [(0, '4.437')] -[2023-03-12 07:23:29,129][00184] Fps is (10 sec: 3686.3, 60 sec: 3072.3, 300 sec: 2457.6). Total num frames: 184320. 
Throughput: 0: 788.6. Samples: 45346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:23:29,133][00184] Avg episode reward: [(0, '4.595')] -[2023-03-12 07:23:34,129][00184] Fps is (10 sec: 3277.5, 60 sec: 3276.8, 300 sec: 2508.8). Total num frames: 200704. Throughput: 0: 785.3. Samples: 49646. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:23:34,132][00184] Avg episode reward: [(0, '4.654')] -[2023-03-12 07:23:34,784][11786] Updated weights for policy 0, policy_version 50 (0.0038) -[2023-03-12 07:23:39,129][00184] Fps is (10 sec: 3686.5, 60 sec: 3208.5, 300 sec: 2602.2). Total num frames: 221184. Throughput: 0: 819.6. Samples: 56606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:23:39,132][00184] Avg episode reward: [(0, '4.417')] -[2023-03-12 07:23:44,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 2685.2). Total num frames: 241664. Throughput: 0: 841.7. Samples: 59974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:23:44,131][00184] Avg episode reward: [(0, '4.253')] -[2023-03-12 07:23:45,473][11786] Updated weights for policy 0, policy_version 60 (0.0012) -[2023-03-12 07:23:49,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 2673.2). Total num frames: 253952. Throughput: 0: 811.6. Samples: 63660. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) -[2023-03-12 07:23:49,140][00184] Avg episode reward: [(0, '4.394')] -[2023-03-12 07:23:49,150][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000062_253952.pth... -[2023-03-12 07:23:54,130][00184] Fps is (10 sec: 2047.8, 60 sec: 3140.2, 300 sec: 2621.4). Total num frames: 262144. Throughput: 0: 808.1. Samples: 66762. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) -[2023-03-12 07:23:54,136][00184] Avg episode reward: [(0, '4.579')] -[2023-03-12 07:23:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 2691.7). Total num frames: 282624. Throughput: 0: 836.6. Samples: 68872. 
Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) -[2023-03-12 07:23:59,132][00184] Avg episode reward: [(0, '4.774')] -[2023-03-12 07:23:59,146][11773] Saving new best policy, reward=4.774! -[2023-03-12 07:24:00,067][11786] Updated weights for policy 0, policy_version 70 (0.0011) -[2023-03-12 07:24:04,129][00184] Fps is (10 sec: 3686.7, 60 sec: 3208.8, 300 sec: 2718.3). Total num frames: 299008. Throughput: 0: 852.5. Samples: 74910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:24:04,136][00184] Avg episode reward: [(0, '4.752')] -[2023-03-12 07:24:09,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.7, 300 sec: 2706.9). Total num frames: 311296. Throughput: 0: 793.8. Samples: 78886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:24:09,145][00184] Avg episode reward: [(0, '4.697')] -[2023-03-12 07:24:14,131][00184] Fps is (10 sec: 2457.2, 60 sec: 3276.7, 300 sec: 2696.5). Total num frames: 323584. Throughput: 0: 775.9. Samples: 80262. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) -[2023-03-12 07:24:14,134][00184] Avg episode reward: [(0, '4.566')] -[2023-03-12 07:24:15,131][11786] Updated weights for policy 0, policy_version 80 (0.0016) -[2023-03-12 07:24:19,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2719.8). Total num frames: 339968. Throughput: 0: 783.9. Samples: 84922. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:24:19,132][00184] Avg episode reward: [(0, '4.252')] -[2023-03-12 07:24:24,129][00184] Fps is (10 sec: 3687.0, 60 sec: 3208.6, 300 sec: 2772.7). Total num frames: 360448. Throughput: 0: 767.3. Samples: 91136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:24:24,133][00184] Avg episode reward: [(0, '4.486')] -[2023-03-12 07:24:26,005][11786] Updated weights for policy 0, policy_version 90 (0.0022) -[2023-03-12 07:24:29,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 2761.0). Total num frames: 372736. Throughput: 0: 741.7. Samples: 93350. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:24:29,132][00184] Avg episode reward: [(0, '4.528')] -[2023-03-12 07:24:34,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2779.4). Total num frames: 389120. Throughput: 0: 740.4. Samples: 96980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:24:34,136][00184] Avg episode reward: [(0, '4.812')] -[2023-03-12 07:24:34,142][11773] Saving new best policy, reward=4.812! -[2023-03-12 07:24:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3003.7, 300 sec: 2768.3). Total num frames: 401408. Throughput: 0: 772.5. Samples: 101524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:24:39,136][00184] Avg episode reward: [(0, '4.824')] -[2023-03-12 07:24:39,147][11773] Saving new best policy, reward=4.824! -[2023-03-12 07:24:40,251][11786] Updated weights for policy 0, policy_version 100 (0.0012) -[2023-03-12 07:24:44,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3072.0, 300 sec: 2839.9). Total num frames: 425984. Throughput: 0: 798.5. Samples: 104806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:24:44,137][00184] Avg episode reward: [(0, '4.965')] -[2023-03-12 07:24:44,141][11773] Saving new best policy, reward=4.965! -[2023-03-12 07:24:48,853][11786] Updated weights for policy 0, policy_version 110 (0.0015) -[2023-03-12 07:24:49,129][00184] Fps is (10 sec: 4915.2, 60 sec: 3276.8, 300 sec: 2906.8). Total num frames: 450560. Throughput: 0: 823.6. Samples: 111974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:24:49,136][00184] Avg episode reward: [(0, '4.890')] -[2023-03-12 07:24:54,132][00184] Fps is (10 sec: 3685.4, 60 sec: 3345.0, 300 sec: 2892.8). Total num frames: 462848. Throughput: 0: 847.6. Samples: 117030. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:24:54,134][00184] Avg episode reward: [(0, '4.608')] -[2023-03-12 07:24:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 2904.4). Total num frames: 479232. 
Throughput: 0: 865.1. Samples: 119190. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:24:59,132][00184] Avg episode reward: [(0, '4.532')] -[2023-03-12 07:25:01,207][11786] Updated weights for policy 0, policy_version 120 (0.0023) -[2023-03-12 07:25:04,133][00184] Fps is (10 sec: 3686.0, 60 sec: 3344.9, 300 sec: 2939.4). Total num frames: 499712. Throughput: 0: 882.0. Samples: 124614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:04,137][00184] Avg episode reward: [(0, '4.568')] -[2023-03-12 07:25:09,131][00184] Fps is (10 sec: 3276.3, 60 sec: 3345.0, 300 sec: 2925.7). Total num frames: 512000. Throughput: 0: 843.9. Samples: 129112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:09,136][00184] Avg episode reward: [(0, '4.681')] -[2023-03-12 07:25:14,129][00184] Fps is (10 sec: 2458.5, 60 sec: 3345.2, 300 sec: 2912.7). Total num frames: 524288. Throughput: 0: 832.7. Samples: 130820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:14,132][00184] Avg episode reward: [(0, '4.785')] -[2023-03-12 07:25:15,799][11786] Updated weights for policy 0, policy_version 130 (0.0032) -[2023-03-12 07:25:19,129][00184] Fps is (10 sec: 2867.7, 60 sec: 3345.1, 300 sec: 2922.6). Total num frames: 540672. Throughput: 0: 845.3. Samples: 135018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:19,135][00184] Avg episode reward: [(0, '4.817')] -[2023-03-12 07:25:24,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2953.4). Total num frames: 561152. Throughput: 0: 881.0. Samples: 141168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:25:24,137][00184] Avg episode reward: [(0, '4.662')] -[2023-03-12 07:25:26,117][11786] Updated weights for policy 0, policy_version 140 (0.0021) -[2023-03-12 07:25:29,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3003.7). Total num frames: 585728. Throughput: 0: 889.2. Samples: 144820. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:25:29,136][00184] Avg episode reward: [(0, '4.871')] -[2023-03-12 07:25:34,136][00184] Fps is (10 sec: 4093.2, 60 sec: 3549.5, 300 sec: 3010.5). Total num frames: 602112. Throughput: 0: 857.3. Samples: 150558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:34,138][00184] Avg episode reward: [(0, '4.955')] -[2023-03-12 07:25:37,810][11786] Updated weights for policy 0, policy_version 150 (0.0024) -[2023-03-12 07:25:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 2997.1). Total num frames: 614400. Throughput: 0: 844.8. Samples: 155044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:25:39,132][00184] Avg episode reward: [(0, '4.759')] -[2023-03-12 07:25:44,129][00184] Fps is (10 sec: 3279.0, 60 sec: 3481.6, 300 sec: 3023.2). Total num frames: 634880. Throughput: 0: 864.8. Samples: 158106. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:25:44,136][00184] Avg episode reward: [(0, '4.571')] -[2023-03-12 07:25:49,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3010.1). Total num frames: 647168. Throughput: 0: 840.0. Samples: 162410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:25:49,135][00184] Avg episode reward: [(0, '4.621')] -[2023-03-12 07:25:49,148][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000158_647168.pth... -[2023-03-12 07:25:51,188][11786] Updated weights for policy 0, policy_version 160 (0.0033) -[2023-03-12 07:25:54,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3276.9, 300 sec: 2997.5). Total num frames: 659456. Throughput: 0: 827.2. Samples: 166334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:25:54,136][00184] Avg episode reward: [(0, '4.655')] -[2023-03-12 07:25:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3003.7). Total num frames: 675840. Throughput: 0: 832.6. Samples: 168286. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:25:59,135][00184] Avg episode reward: [(0, '4.819')] -[2023-03-12 07:26:03,294][11786] Updated weights for policy 0, policy_version 170 (0.0030) -[2023-03-12 07:26:04,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3345.3, 300 sec: 3045.3). Total num frames: 700416. Throughput: 0: 871.2. Samples: 174224. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:26:04,131][00184] Avg episode reward: [(0, '5.182')] -[2023-03-12 07:26:04,137][11773] Saving new best policy, reward=5.182! -[2023-03-12 07:26:09,129][00184] Fps is (10 sec: 4505.7, 60 sec: 3481.7, 300 sec: 3067.6). Total num frames: 720896. Throughput: 0: 890.2. Samples: 181226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:26:09,139][00184] Avg episode reward: [(0, '5.057')] -[2023-03-12 07:26:13,448][11786] Updated weights for policy 0, policy_version 180 (0.0012) -[2023-03-12 07:26:14,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3072.0). Total num frames: 737280. Throughput: 0: 866.9. Samples: 183830. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:26:14,140][00184] Avg episode reward: [(0, '5.000')] -[2023-03-12 07:26:19,129][00184] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3076.2). Total num frames: 753664. Throughput: 0: 841.0. Samples: 188398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:26:19,131][00184] Avg episode reward: [(0, '5.204')] -[2023-03-12 07:26:19,154][11773] Saving new best policy, reward=5.204! -[2023-03-12 07:26:24,131][00184] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3080.2). Total num frames: 770048. Throughput: 0: 854.5. Samples: 193500. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-03-12 07:26:24,138][00184] Avg episode reward: [(0, '5.001')] -[2023-03-12 07:26:25,816][11786] Updated weights for policy 0, policy_version 190 (0.0022) -[2023-03-12 07:26:29,129][00184] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3084.1). 
Total num frames: 786432. Throughput: 0: 834.6. Samples: 195664. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:26:29,135][00184] Avg episode reward: [(0, '4.805')] -[2023-03-12 07:26:34,129][00184] Fps is (10 sec: 2867.9, 60 sec: 3277.2, 300 sec: 3072.0). Total num frames: 798720. Throughput: 0: 826.4. Samples: 199600. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:26:34,134][00184] Avg episode reward: [(0, '4.713')] -[2023-03-12 07:26:39,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3060.4). Total num frames: 811008. Throughput: 0: 829.2. Samples: 203650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:26:39,132][00184] Avg episode reward: [(0, '4.719')] -[2023-03-12 07:26:40,678][11786] Updated weights for policy 0, policy_version 200 (0.0033) -[2023-03-12 07:26:44,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3079.6). Total num frames: 831488. Throughput: 0: 856.2. Samples: 206814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:26:44,131][00184] Avg episode reward: [(0, '4.645')] -[2023-03-12 07:26:49,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3113.0). Total num frames: 856064. Throughput: 0: 885.5. Samples: 214070. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:26:49,132][00184] Avg episode reward: [(0, '4.995')] -[2023-03-12 07:26:49,197][11786] Updated weights for policy 0, policy_version 210 (0.0020) -[2023-03-12 07:26:54,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3115.9). Total num frames: 872448. Throughput: 0: 846.7. Samples: 219326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:26:54,131][00184] Avg episode reward: [(0, '5.190')] -[2023-03-12 07:26:59,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3118.7). Total num frames: 888832. Throughput: 0: 837.9. Samples: 221536. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:26:59,132][00184] Avg episode reward: [(0, '5.280')] -[2023-03-12 07:26:59,150][11773] Saving new best policy, reward=5.280! -[2023-03-12 07:27:02,015][11786] Updated weights for policy 0, policy_version 220 (0.0029) -[2023-03-12 07:27:04,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3121.4). Total num frames: 905216. Throughput: 0: 844.2. Samples: 226386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:27:04,136][00184] Avg episode reward: [(0, '5.244')] -[2023-03-12 07:27:09,133][00184] Fps is (10 sec: 3275.6, 60 sec: 3344.9, 300 sec: 3124.0). Total num frames: 921600. Throughput: 0: 833.0. Samples: 230986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:27:09,135][00184] Avg episode reward: [(0, '5.177')] -[2023-03-12 07:27:14,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3165.7). Total num frames: 933888. Throughput: 0: 833.6. Samples: 233174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:27:14,136][00184] Avg episode reward: [(0, '5.163')] -[2023-03-12 07:27:15,736][11786] Updated weights for policy 0, policy_version 230 (0.0011) -[2023-03-12 07:27:19,132][00184] Fps is (10 sec: 2867.5, 60 sec: 3276.7, 300 sec: 3221.2). Total num frames: 950272. Throughput: 0: 848.9. Samples: 237802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:27:19,134][00184] Avg episode reward: [(0, '5.393')] -[2023-03-12 07:27:19,152][11773] Saving new best policy, reward=5.393! -[2023-03-12 07:27:24,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 902.1. Samples: 244246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:27:24,132][00184] Avg episode reward: [(0, '5.412')] -[2023-03-12 07:27:24,134][11773] Saving new best policy, reward=5.412! 
-[2023-03-12 07:27:25,713][11786] Updated weights for policy 0, policy_version 240 (0.0018)
-[2023-03-12 07:27:29,129][00184] Fps is (10 sec: 4506.8, 60 sec: 3481.6, 300 sec: 3360.1). Total num frames: 995328. Throughput: 0: 907.8. Samples: 247666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:27:29,137][00184] Avg episode reward: [(0, '4.983')]
-[2023-03-12 07:27:34,129][00184] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3332.3). Total num frames: 1011712. Throughput: 0: 868.6. Samples: 253158. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:27:34,135][00184] Avg episode reward: [(0, '5.046')]
-[2023-03-12 07:27:37,681][11786] Updated weights for policy 0, policy_version 250 (0.0017)
-[2023-03-12 07:27:39,130][00184] Fps is (10 sec: 3276.6, 60 sec: 3618.1, 300 sec: 3346.2). Total num frames: 1028096. Throughput: 0: 849.2. Samples: 257542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:27:39,135][00184] Avg episode reward: [(0, '5.121')]
-[2023-03-12 07:27:44,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3346.2). Total num frames: 1044480. Throughput: 0: 857.9. Samples: 260140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:27:44,132][00184] Avg episode reward: [(0, '5.108')]
-[2023-03-12 07:27:49,131][00184] Fps is (10 sec: 2866.9, 60 sec: 3345.0, 300 sec: 3332.3). Total num frames: 1056768. Throughput: 0: 846.0. Samples: 264458. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:27:49,133][00184] Avg episode reward: [(0, '4.954')]
-[2023-03-12 07:27:49,150][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000258_1056768.pth...
-[2023-03-12 07:27:49,351][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000062_253952.pth
-[2023-03-12 07:27:51,255][11786] Updated weights for policy 0, policy_version 260 (0.0018)
-[2023-03-12 07:27:54,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 1073152. Throughput: 0: 836.2. Samples: 268614. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:27:54,132][00184] Avg episode reward: [(0, '5.497')]
-[2023-03-12 07:27:54,135][11773] Saving new best policy, reward=5.497!
-[2023-03-12 07:27:59,129][00184] Fps is (10 sec: 2867.7, 60 sec: 3276.8, 300 sec: 3318.5). Total num frames: 1085440. Throughput: 0: 834.7. Samples: 270734. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
-[2023-03-12 07:27:59,135][00184] Avg episode reward: [(0, '5.798')]
-[2023-03-12 07:27:59,147][11773] Saving new best policy, reward=5.798!
-[2023-03-12 07:28:03,227][11786] Updated weights for policy 0, policy_version 270 (0.0020)
-[2023-03-12 07:28:04,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3346.3). Total num frames: 1105920. Throughput: 0: 862.6. Samples: 276616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:28:04,132][00184] Avg episode reward: [(0, '5.736')]
-[2023-03-12 07:28:09,129][00184] Fps is (10 sec: 4505.7, 60 sec: 3481.8, 300 sec: 3401.8). Total num frames: 1130496. Throughput: 0: 872.6. Samples: 283512. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:28:09,137][00184] Avg episode reward: [(0, '5.826')]
-[2023-03-12 07:28:09,147][11773] Saving new best policy, reward=5.826!
-[2023-03-12 07:28:13,967][11786] Updated weights for policy 0, policy_version 280 (0.0013)
-[2023-03-12 07:28:14,133][00184] Fps is (10 sec: 4094.3, 60 sec: 3549.6, 300 sec: 3387.8). Total num frames: 1146880. Throughput: 0: 847.2. Samples: 285792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:28:14,136][00184] Avg episode reward: [(0, '5.855')]
-[2023-03-12 07:28:14,142][11773] Saving new best policy, reward=5.855!
-[2023-03-12 07:28:19,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3481.8, 300 sec: 3360.1). Total num frames: 1159168. Throughput: 0: 821.1. Samples: 290106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:28:19,132][00184] Avg episode reward: [(0, '6.097')]
-[2023-03-12 07:28:19,142][11773] Saving new best policy, reward=6.097!
-[2023-03-12 07:28:24,129][00184] Fps is (10 sec: 2868.4, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 1175552. Throughput: 0: 824.0. Samples: 294622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:28:24,132][00184] Avg episode reward: [(0, '6.190')]
-[2023-03-12 07:28:24,134][11773] Saving new best policy, reward=6.190!
-[2023-03-12 07:28:27,873][11786] Updated weights for policy 0, policy_version 290 (0.0015)
-[2023-03-12 07:28:29,140][00184] Fps is (10 sec: 3273.2, 60 sec: 3276.2, 300 sec: 3360.0). Total num frames: 1191936. Throughput: 0: 813.7. Samples: 296764. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:28:29,147][00184] Avg episode reward: [(0, '6.745')]
-[2023-03-12 07:28:29,161][11773] Saving new best policy, reward=6.745!
-[2023-03-12 07:28:34,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3332.3). Total num frames: 1204224. Throughput: 0: 815.0. Samples: 301132. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:28:34,131][00184] Avg episode reward: [(0, '6.729')]
-[2023-03-12 07:28:39,129][00184] Fps is (10 sec: 2870.3, 60 sec: 3208.6, 300 sec: 3318.5). Total num frames: 1220608. Throughput: 0: 821.5. Samples: 305580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:28:39,132][00184] Avg episode reward: [(0, '6.948')]
-[2023-03-12 07:28:39,140][11773] Saving new best policy, reward=6.948!
-[2023-03-12 07:28:40,890][11786] Updated weights for policy 0, policy_version 300 (0.0029)
-[2023-03-12 07:28:44,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3346.2). Total num frames: 1241088. Throughput: 0: 848.2. Samples: 308902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:28:44,131][00184] Avg episode reward: [(0, '6.861')]
-[2023-03-12 07:28:49,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3481.7, 300 sec: 3401.8). Total num frames: 1265664. Throughput: 0: 874.0. Samples: 315948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:28:49,135][00184] Avg episode reward: [(0, '6.914')]
-[2023-03-12 07:28:49,930][11786] Updated weights for policy 0, policy_version 310 (0.0016)
-[2023-03-12 07:28:54,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1282048. Throughput: 0: 829.7. Samples: 320850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:28:54,136][00184] Avg episode reward: [(0, '7.400')]
-[2023-03-12 07:28:54,139][11773] Saving new best policy, reward=7.400!
-[2023-03-12 07:28:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1294336. Throughput: 0: 824.7. Samples: 322902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:28:59,146][00184] Avg episode reward: [(0, '7.345')]
-[2023-03-12 07:29:03,292][11786] Updated weights for policy 0, policy_version 320 (0.0025)
-[2023-03-12 07:29:04,130][00184] Fps is (10 sec: 2866.9, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 1310720. Throughput: 0: 840.1. Samples: 327912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:29:04,132][00184] Avg episode reward: [(0, '7.626')]
-[2023-03-12 07:29:04,140][11773] Saving new best policy, reward=7.626!
-[2023-03-12 07:29:09,130][00184] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 1327104. Throughput: 0: 835.5. Samples: 332222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:29:09,132][00184] Avg episode reward: [(0, '8.232')]
-[2023-03-12 07:29:09,141][11773] Saving new best policy, reward=8.232!
-[2023-03-12 07:29:14,134][00184] Fps is (10 sec: 2866.1, 60 sec: 3208.5, 300 sec: 3387.8). Total num frames: 1339392. Throughput: 0: 832.6. Samples: 334228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:29:14,140][00184] Avg episode reward: [(0, '8.264')]
-[2023-03-12 07:29:14,143][11773] Saving new best policy, reward=8.264!
-[2023-03-12 07:29:17,619][11786] Updated weights for policy 0, policy_version 330 (0.0020)
-[2023-03-12 07:29:19,129][00184] Fps is (10 sec: 2457.7, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 1351680. Throughput: 0: 831.2. Samples: 338534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:29:19,134][00184] Avg episode reward: [(0, '9.342')]
-[2023-03-12 07:29:19,158][11773] Saving new best policy, reward=9.342!
-[2023-03-12 07:29:24,129][00184] Fps is (10 sec: 3688.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 1376256. Throughput: 0: 874.4. Samples: 344926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:29:24,131][00184] Avg episode reward: [(0, '9.281')]
-[2023-03-12 07:29:27,109][11786] Updated weights for policy 0, policy_version 340 (0.0012)
-[2023-03-12 07:29:29,129][00184] Fps is (10 sec: 4915.2, 60 sec: 3482.2, 300 sec: 3429.5). Total num frames: 1400832. Throughput: 0: 877.2. Samples: 348374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:29:29,131][00184] Avg episode reward: [(0, '9.275')]
-[2023-03-12 07:29:34,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1413120. Throughput: 0: 837.2. Samples: 353624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:29:34,132][00184] Avg episode reward: [(0, '9.153')]
-[2023-03-12 07:29:39,129][00184] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1429504. Throughput: 0: 832.1. Samples: 358296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:29:39,132][00184] Avg episode reward: [(0, '9.526')]
-[2023-03-12 07:29:39,140][11773] Saving new best policy, reward=9.526!
-[2023-03-12 07:29:39,603][11786] Updated weights for policy 0, policy_version 350 (0.0011)
-[2023-03-12 07:29:44,131][00184] Fps is (10 sec: 3276.2, 60 sec: 3413.2, 300 sec: 3374.0). Total num frames: 1445888. Throughput: 0: 852.2. Samples: 361252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:29:44,133][00184] Avg episode reward: [(0, '9.214')]
-[2023-03-12 07:29:49,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 1462272. Throughput: 0: 841.7. Samples: 365790. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:29:49,132][00184] Avg episode reward: [(0, '8.821')]
-[2023-03-12 07:29:49,150][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000357_1462272.pth...
-[2023-03-12 07:29:49,357][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000158_647168.pth
-[2023-03-12 07:29:53,131][11786] Updated weights for policy 0, policy_version 360 (0.0025)
-[2023-03-12 07:29:54,129][00184] Fps is (10 sec: 2867.7, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 1474560. Throughput: 0: 833.8. Samples: 369744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:29:54,131][00184] Avg episode reward: [(0, '9.449')]
-[2023-03-12 07:29:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 1490944. Throughput: 0: 837.7. Samples: 371922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:29:59,137][00184] Avg episode reward: [(0, '9.388')]
-[2023-03-12 07:30:04,010][11786] Updated weights for policy 0, policy_version 370 (0.0019)
-[2023-03-12 07:30:04,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 1515520. Throughput: 0: 879.7. Samples: 378120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:30:04,132][00184] Avg episode reward: [(0, '9.923')]
-[2023-03-12 07:30:04,134][11773] Saving new best policy, reward=9.923!
-[2023-03-12 07:30:09,129][00184] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1536000. Throughput: 0: 893.4. Samples: 385128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:30:09,131][00184] Avg episode reward: [(0, '10.747')]
-[2023-03-12 07:30:09,142][11773] Saving new best policy, reward=10.747!
-[2023-03-12 07:30:14,131][00184] Fps is (10 sec: 3685.5, 60 sec: 3550.0, 300 sec: 3429.5). Total num frames: 1552384. Throughput: 0: 864.8. Samples: 387292. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:30:14,137][00184] Avg episode reward: [(0, '10.921')]
-[2023-03-12 07:30:14,139][11773] Saving new best policy, reward=10.921!
-[2023-03-12 07:30:15,134][11786] Updated weights for policy 0, policy_version 380 (0.0027)
-[2023-03-12 07:30:19,137][00184] Fps is (10 sec: 3274.3, 60 sec: 3617.7, 300 sec: 3415.6). Total num frames: 1568768. Throughput: 0: 846.3. Samples: 391712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:30:19,138][00184] Avg episode reward: [(0, '10.220')]
-[2023-03-12 07:30:24,129][00184] Fps is (10 sec: 3277.6, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1585152. Throughput: 0: 855.6. Samples: 396796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:30:24,134][00184] Avg episode reward: [(0, '9.370')]
-[2023-03-12 07:30:28,179][11786] Updated weights for policy 0, policy_version 390 (0.0038)
-[2023-03-12 07:30:29,129][00184] Fps is (10 sec: 2869.4, 60 sec: 3276.8, 300 sec: 3374.1). Total num frames: 1597440. Throughput: 0: 835.9. Samples: 398864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:30:29,134][00184] Avg episode reward: [(0, '9.035')]
-[2023-03-12 07:30:34,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 1609728. Throughput: 0: 826.5. Samples: 402984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:30:34,134][00184] Avg episode reward: [(0, '8.988')]
-[2023-03-12 07:30:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 1626112. Throughput: 0: 838.5. Samples: 407476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:30:39,137][00184] Avg episode reward: [(0, '9.841')]
-[2023-03-12 07:30:41,264][11786] Updated weights for policy 0, policy_version 400 (0.0024)
-[2023-03-12 07:30:44,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 1650688. Throughput: 0: 865.9. Samples: 410888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:30:44,138][00184] Avg episode reward: [(0, '9.928')]
-[2023-03-12 07:30:49,129][00184] Fps is (10 sec: 4915.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1675264. Throughput: 0: 884.4. Samples: 417918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:30:49,135][00184] Avg episode reward: [(0, '9.719')]
-[2023-03-12 07:30:50,434][11786] Updated weights for policy 0, policy_version 410 (0.0012)
-[2023-03-12 07:30:54,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1687552. Throughput: 0: 837.6. Samples: 422820. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
-[2023-03-12 07:30:54,137][00184] Avg episode reward: [(0, '9.725')]
-[2023-03-12 07:30:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3401.8). Total num frames: 1703936. Throughput: 0: 837.2. Samples: 424964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:30:59,132][00184] Avg episode reward: [(0, '9.936')]
-[2023-03-12 07:31:03,563][11786] Updated weights for policy 0, policy_version 420 (0.0022)
-[2023-03-12 07:31:04,131][00184] Fps is (10 sec: 3276.2, 60 sec: 3413.2, 300 sec: 3387.9). Total num frames: 1720320. Throughput: 0: 853.4. Samples: 430110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:31:04,135][00184] Avg episode reward: [(0, '9.960')]
-[2023-03-12 07:31:09,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 1732608. Throughput: 0: 840.4. Samples: 434614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:31:09,134][00184] Avg episode reward: [(0, '10.320')]
-[2023-03-12 07:31:14,129][00184] Fps is (10 sec: 2867.6, 60 sec: 3276.9, 300 sec: 3374.0). Total num frames: 1748992. Throughput: 0: 837.8. Samples: 436566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:31:14,134][00184] Avg episode reward: [(0, '10.596')]
-[2023-03-12 07:31:17,711][11786] Updated weights for policy 0, policy_version 430 (0.0023)
-[2023-03-12 07:31:19,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3277.2, 300 sec: 3374.0). Total num frames: 1765376. Throughput: 0: 843.5. Samples: 440942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:31:19,137][00184] Avg episode reward: [(0, '11.375')]
-[2023-03-12 07:31:19,156][11773] Saving new best policy, reward=11.375!
-[2023-03-12 07:31:24,129][00184] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 1785856. Throughput: 0: 887.7. Samples: 447422. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:31:24,134][00184] Avg episode reward: [(0, '10.852')]
-[2023-03-12 07:31:26,901][11786] Updated weights for policy 0, policy_version 440 (0.0019)
-[2023-03-12 07:31:29,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1810432. Throughput: 0: 891.2. Samples: 450994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:31:29,136][00184] Avg episode reward: [(0, '10.554')]
-[2023-03-12 07:31:34,136][00184] Fps is (10 sec: 4093.1, 60 sec: 3617.7, 300 sec: 3443.3). Total num frames: 1826816. Throughput: 0: 854.9. Samples: 456394. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:31:34,139][00184] Avg episode reward: [(0, '12.219')]
-[2023-03-12 07:31:34,143][11773] Saving new best policy, reward=12.219!
-[2023-03-12 07:31:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 1839104. Throughput: 0: 844.6. Samples: 460828. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:31:39,136][00184] Avg episode reward: [(0, '12.454')]
-[2023-03-12 07:31:39,145][11773] Saving new best policy, reward=12.454!
-[2023-03-12 07:31:39,386][11786] Updated weights for policy 0, policy_version 450 (0.0025)
-[2023-03-12 07:31:44,129][00184] Fps is (10 sec: 3279.1, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1859584. Throughput: 0: 870.6. Samples: 464142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:31:44,134][00184] Avg episode reward: [(0, '13.693')]
-[2023-03-12 07:31:44,136][11773] Saving new best policy, reward=13.693!
-[2023-03-12 07:31:49,130][00184] Fps is (10 sec: 3276.6, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 1871872. Throughput: 0: 851.8. Samples: 468438. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:31:49,134][00184] Avg episode reward: [(0, '14.618')]
-[2023-03-12 07:31:49,156][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000457_1871872.pth...
-[2023-03-12 07:31:49,335][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000258_1056768.pth
-[2023-03-12 07:31:49,358][11773] Saving new best policy, reward=14.618!
-[2023-03-12 07:31:52,815][11786] Updated weights for policy 0, policy_version 460 (0.0016)
-[2023-03-12 07:31:54,131][00184] Fps is (10 sec: 2457.2, 60 sec: 3276.7, 300 sec: 3374.0). Total num frames: 1884160. Throughput: 0: 835.1. Samples: 472196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:31:54,134][00184] Avg episode reward: [(0, '14.936')]
-[2023-03-12 07:31:54,136][11773] Saving new best policy, reward=14.936!
-[2023-03-12 07:31:59,129][00184] Fps is (10 sec: 2867.3, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 1900544. Throughput: 0: 837.4. Samples: 474250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:31:59,137][00184] Avg episode reward: [(0, '15.378')]
-[2023-03-12 07:31:59,149][11773] Saving new best policy, reward=15.378!
-[2023-03-12 07:32:04,129][00184] Fps is (10 sec: 3687.0, 60 sec: 3345.2, 300 sec: 3387.9). Total num frames: 1921024. Throughput: 0: 866.0. Samples: 479914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:32:04,136][00184] Avg episode reward: [(0, '16.265')]
-[2023-03-12 07:32:04,140][11773] Saving new best policy, reward=16.265!
-[2023-03-12 07:32:04,534][11786] Updated weights for policy 0, policy_version 470 (0.0033)
-[2023-03-12 07:32:09,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1945600. Throughput: 0: 874.5. Samples: 486776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:32:09,137][00184] Avg episode reward: [(0, '17.530')]
-[2023-03-12 07:32:09,149][11773] Saving new best policy, reward=17.530!
-[2023-03-12 07:32:14,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 1957888. Throughput: 0: 848.4. Samples: 489172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:32:14,134][00184] Avg episode reward: [(0, '17.364')]
-[2023-03-12 07:32:15,723][11786] Updated weights for policy 0, policy_version 480 (0.0012)
-[2023-03-12 07:32:19,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1974272. Throughput: 0: 829.2. Samples: 493704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:32:19,133][00184] Avg episode reward: [(0, '17.609')]
-[2023-03-12 07:32:19,144][11773] Saving new best policy, reward=17.609!
-[2023-03-12 07:32:24,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1994752. Throughput: 0: 854.4. Samples: 499278. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
-[2023-03-12 07:32:24,131][00184] Avg episode reward: [(0, '16.722')]
-[2023-03-12 07:32:29,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 2002944. Throughput: 0: 817.9. Samples: 500948. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:32:29,132][00184] Avg episode reward: [(0, '16.538')]
-[2023-03-12 07:32:29,592][11786] Updated weights for policy 0, policy_version 490 (0.0048)
-[2023-03-12 07:32:34,129][00184] Fps is (10 sec: 2048.0, 60 sec: 3140.6, 300 sec: 3346.2). Total num frames: 2015232. Throughput: 0: 799.6. Samples: 504418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:32:34,135][00184] Avg episode reward: [(0, '16.606')]
-[2023-03-12 07:32:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3346.2). Total num frames: 2031616. Throughput: 0: 811.2. Samples: 508700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:32:39,132][00184] Avg episode reward: [(0, '16.077')]
-[2023-03-12 07:32:42,892][11786] Updated weights for policy 0, policy_version 500 (0.0022)
-[2023-03-12 07:32:44,129][00184] Fps is (10 sec: 3686.5, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 2052096. Throughput: 0: 827.5. Samples: 511486. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:32:44,131][00184] Avg episode reward: [(0, '15.833')]
-[2023-03-12 07:32:49,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 2076672. Throughput: 0: 862.2. Samples: 518714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:32:49,137][00184] Avg episode reward: [(0, '16.930')]
-[2023-03-12 07:32:51,546][11786] Updated weights for policy 0, policy_version 510 (0.0025)
-[2023-03-12 07:32:54,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3415.6). Total num frames: 2093056. Throughput: 0: 837.3. Samples: 524456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:32:54,132][00184] Avg episode reward: [(0, '16.759')]
-[2023-03-12 07:32:59,134][00184] Fps is (10 sec: 3275.2, 60 sec: 3481.3, 300 sec: 3401.7). Total num frames: 2109440. Throughput: 0: 834.5. Samples: 526730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:32:59,136][00184] Avg episode reward: [(0, '16.912')]
-[2023-03-12 07:33:03,739][11786] Updated weights for policy 0, policy_version 520 (0.0021)
-[2023-03-12 07:33:04,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 2129920. Throughput: 0: 859.9. Samples: 532400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:33:04,136][00184] Avg episode reward: [(0, '16.981')]
-[2023-03-12 07:33:09,129][00184] Fps is (10 sec: 3688.2, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2146304. Throughput: 0: 838.9. Samples: 537028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:33:09,136][00184] Avg episode reward: [(0, '16.467')]
-[2023-03-12 07:33:14,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2158592. Throughput: 0: 846.4. Samples: 539036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:33:14,135][00184] Avg episode reward: [(0, '17.757')]
-[2023-03-12 07:33:14,138][11773] Saving new best policy, reward=17.757!
-[2023-03-12 07:33:18,137][11786] Updated weights for policy 0, policy_version 530 (0.0023)
-[2023-03-12 07:33:19,130][00184] Fps is (10 sec: 2457.4, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 2170880. Throughput: 0: 860.3. Samples: 543134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:33:19,136][00184] Avg episode reward: [(0, '16.357')]
-[2023-03-12 07:33:24,129][00184] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3388.0). Total num frames: 2191360. Throughput: 0: 891.6. Samples: 548822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:33:24,131][00184] Avg episode reward: [(0, '15.951')]
-[2023-03-12 07:33:28,049][11786] Updated weights for policy 0, policy_version 540 (0.0022)
-[2023-03-12 07:33:29,129][00184] Fps is (10 sec: 4505.9, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2215936. Throughput: 0: 907.4. Samples: 552318. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:33:29,132][00184] Avg episode reward: [(0, '18.113')]
-[2023-03-12 07:33:29,138][11773] Saving new best policy, reward=18.113!
-[2023-03-12 07:33:34,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2232320. Throughput: 0: 882.2. Samples: 558412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:33:34,136][00184] Avg episode reward: [(0, '18.627')]
-[2023-03-12 07:33:34,138][11773] Saving new best policy, reward=18.627!
-[2023-03-12 07:33:39,131][00184] Fps is (10 sec: 3276.1, 60 sec: 3618.0, 300 sec: 3415.6). Total num frames: 2248704. Throughput: 0: 852.8. Samples: 562832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:33:39,134][00184] Avg episode reward: [(0, '19.629')]
-[2023-03-12 07:33:39,156][11773] Saving new best policy, reward=19.629!
-[2023-03-12 07:33:40,217][11786] Updated weights for policy 0, policy_version 550 (0.0023)
-[2023-03-12 07:33:44,129][00184] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3401.8). Total num frames: 2269184. Throughput: 0: 861.0. Samples: 565472. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:33:44,132][00184] Avg episode reward: [(0, '19.836')]
-[2023-03-12 07:33:44,136][11773] Saving new best policy, reward=19.836!
-[2023-03-12 07:33:49,129][00184] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2281472. Throughput: 0: 850.2. Samples: 570658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:33:49,135][00184] Avg episode reward: [(0, '21.555')]
-[2023-03-12 07:33:49,148][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000557_2281472.pth...
-[2023-03-12 07:33:49,282][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000357_1462272.pth
-[2023-03-12 07:33:49,295][11773] Saving new best policy, reward=21.555!
-[2023-03-12 07:33:53,081][11786] Updated weights for policy 0, policy_version 560 (0.0021)
-[2023-03-12 07:33:54,129][00184] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2293760. Throughput: 0: 831.4. Samples: 574442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:33:54,136][00184] Avg episode reward: [(0, '21.808')]
-[2023-03-12 07:33:54,141][11773] Saving new best policy, reward=21.808!
-[2023-03-12 07:33:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.3, 300 sec: 3387.9). Total num frames: 2310144. Throughput: 0: 831.4. Samples: 576450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:33:59,134][00184] Avg episode reward: [(0, '21.424')]
-[2023-03-12 07:34:04,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2326528. Throughput: 0: 852.5. Samples: 581498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:34:04,132][00184] Avg episode reward: [(0, '20.067')]
-[2023-03-12 07:34:05,036][11786] Updated weights for policy 0, policy_version 570 (0.0019)
-[2023-03-12 07:34:09,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 2351104. Throughput: 0: 885.6. Samples: 588672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:34:09,138][00184] Avg episode reward: [(0, '19.736')]
-[2023-03-12 07:34:14,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2371584. Throughput: 0: 889.0. Samples: 592324. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:34:14,134][00184] Avg episode reward: [(0, '20.259')]
-[2023-03-12 07:34:14,653][11786] Updated weights for policy 0, policy_version 580 (0.0014)
-[2023-03-12 07:34:19,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3429.5). Total num frames: 2387968. Throughput: 0: 856.5. Samples: 596956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:34:19,133][00184] Avg episode reward: [(0, '19.439')]
-[2023-03-12 07:34:24,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3415.6). Total num frames: 2408448. Throughput: 0: 882.8. Samples: 602558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:34:24,135][00184] Avg episode reward: [(0, '19.964')]
-[2023-03-12 07:34:26,322][11786] Updated weights for policy 0, policy_version 590 (0.0051)
-[2023-03-12 07:34:29,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2420736. Throughput: 0: 874.7. Samples: 604834. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
-[2023-03-12 07:34:29,133][00184] Avg episode reward: [(0, '19.674')]
-[2023-03-12 07:34:34,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 2433024. Throughput: 0: 851.4. Samples: 608970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:34:34,132][00184] Avg episode reward: [(0, '18.329')]
-[2023-03-12 07:34:39,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.2, 300 sec: 3401.8). Total num frames: 2449408. Throughput: 0: 856.4. Samples: 612980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:34:39,134][00184] Avg episode reward: [(0, '19.054')]
-[2023-03-12 07:34:41,362][11786] Updated weights for policy 0, policy_version 600 (0.0012)
-[2023-03-12 07:34:44,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 2465792. Throughput: 0: 860.1. Samples: 615154. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:34:44,136][00184] Avg episode reward: [(0, '18.374')]
-[2023-03-12 07:34:49,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2490368. Throughput: 0: 897.7. Samples: 621896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:34:49,135][00184] Avg episode reward: [(0, '18.065')]
-[2023-03-12 07:34:50,735][11786] Updated weights for policy 0, policy_version 610 (0.0027)
-[2023-03-12 07:34:54,134][00184] Fps is (10 sec: 4503.4, 60 sec: 3617.8, 300 sec: 3457.2). Total num frames: 2510848. Throughput: 0: 876.7. Samples: 628126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:34:54,141][00184] Avg episode reward: [(0, '19.714')]
-[2023-03-12 07:34:59,132][00184] Fps is (10 sec: 3275.9, 60 sec: 3549.7, 300 sec: 3415.6). Total num frames: 2523136. Throughput: 0: 841.8. Samples: 630206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:34:59,136][00184] Avg episode reward: [(0, '19.004')]
-[2023-03-12 07:35:03,725][11786] Updated weights for policy 0, policy_version 620 (0.0020)
-[2023-03-12 07:35:04,131][00184] Fps is (10 sec: 2868.0, 60 sec: 3549.8, 300 sec: 3401.7). Total num frames: 2539520. Throughput: 0: 833.1. Samples: 634448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
-[2023-03-12 07:35:04,138][00184] Avg episode reward: [(0, '19.607')]
-[2023-03-12 07:35:09,129][00184] Fps is (10 sec: 2868.0, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2551808. Throughput: 0: 812.4. Samples: 639114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:35:09,147][00184] Avg episode reward: [(0, '20.010')]
-[2023-03-12 07:35:14,132][00184] Fps is (10 sec: 2867.0, 60 sec: 3276.7, 300 sec: 3387.9). Total num frames: 2568192. Throughput: 0: 810.3. Samples: 641300. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
-[2023-03-12 07:35:14,134][00184] Avg episode reward: [(0, '20.521')]
-[2023-03-12 07:35:17,978][11786] Updated weights for policy 0, policy_version 630 (0.0016)
-[2023-03-12 07:35:19,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 2580480. Throughput: 0: 814.2. Samples: 645610. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
-[2023-03-12 07:35:19,136][00184] Avg episode reward: [(0, '20.616')]
-[2023-03-12 07:35:24,129][00184] Fps is (10 sec: 2867.9, 60 sec: 3140.3, 300 sec: 3387.9). Total num frames: 2596864. Throughput: 0: 830.4. Samples: 650348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:35:24,135][00184] Avg episode reward: [(0, '19.670')]
-[2023-03-12 07:35:28,608][11786] Updated weights for policy 0, policy_version 640 (0.0031)
-[2023-03-12 07:35:29,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 2621440. Throughput: 0: 860.0. Samples: 653852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:35:29,131][00184] Avg episode reward: [(0, '19.320')]
-[2023-03-12 07:35:34,130][00184] Fps is (10 sec: 4505.3, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2641920. Throughput: 0: 863.5. Samples: 660752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:35:34,134][00184] Avg episode reward: [(0, '20.121')]
-[2023-03-12 07:35:39,132][00184] Fps is (10 sec: 3685.4, 60 sec: 3481.4, 300 sec: 3415.6). Total num frames: 2658304. Throughput: 0: 818.8. Samples: 664970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
-[2023-03-12 07:35:39,134][00184] Avg episode reward: [(0, '19.549')]
-[2023-03-12 07:35:40,309][11786] Updated weights for policy 0, policy_version 650 (0.0011)
-[2023-03-12 07:35:44,130][00184] Fps is (10 sec: 3276.7, 60 sec: 3481.5, 300 sec: 3387.9). Total num frames: 2674688. Throughput: 0: 821.8. Samples: 667184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:35:44,132][00184] Avg episode reward: [(0, '20.453')]
-[2023-03-12 07:35:49,129][00184] Fps is (10 sec: 2868.0, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2686976. Throughput: 0: 834.5. Samples: 671998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:35:49,131][00184] Avg episode reward: [(0, '21.933')]
-[2023-03-12 07:35:49,145][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth...
-[2023-03-12 07:35:49,282][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000457_1871872.pth
-[2023-03-12 07:35:49,296][11773] Saving new best policy, reward=21.933!
-[2023-03-12 07:35:53,766][11786] Updated weights for policy 0, policy_version 660 (0.0012)
-[2023-03-12 07:35:54,140][00184] Fps is (10 sec: 2864.4, 60 sec: 3208.2, 300 sec: 3387.8). Total num frames: 2703360. Throughput: 0: 827.0. Samples: 676338. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
-[2023-03-12 07:35:54,149][00184] Avg episode reward: [(0, '23.142')]
-[2023-03-12 07:35:54,151][11773] Saving new best policy, reward=23.142!
-[2023-03-12 07:35:59,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.7, 300 sec: 3374.0). Total num frames: 2715648. Throughput: 0: 827.0. Samples: 678514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
-[2023-03-12 07:35:59,136][00184] Avg episode reward: [(0, '23.006')]
-[2023-03-12 07:36:04,129][00184] Fps is (10 sec: 2870.2, 60 sec: 3208.6, 300 sec: 3387.9). Total num frames: 2732032. Throughput: 0: 830.1. Samples: 682966.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:36:04,131][00184] Avg episode reward: [(0, '24.033')] -[2023-03-12 07:36:04,135][11773] Saving new best policy, reward=24.033! -[2023-03-12 07:36:06,146][11786] Updated weights for policy 0, policy_version 670 (0.0024) -[2023-03-12 07:36:09,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2756608. Throughput: 0: 879.4. Samples: 689922. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:36:09,132][00184] Avg episode reward: [(0, '24.152')] -[2023-03-12 07:36:09,141][11773] Saving new best policy, reward=24.152! -[2023-03-12 07:36:14,131][00184] Fps is (10 sec: 4504.8, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2777088. Throughput: 0: 875.4. Samples: 693246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:36:14,133][00184] Avg episode reward: [(0, '21.817')] -[2023-03-12 07:36:16,131][11786] Updated weights for policy 0, policy_version 680 (0.0015) -[2023-03-12 07:36:19,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2793472. Throughput: 0: 827.7. Samples: 697996. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:36:19,131][00184] Avg episode reward: [(0, '20.377')] -[2023-03-12 07:36:24,129][00184] Fps is (10 sec: 2867.7, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 2805760. Throughput: 0: 838.1. Samples: 702680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:36:24,131][00184] Avg episode reward: [(0, '20.257')] -[2023-03-12 07:36:29,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3374.1). Total num frames: 2822144. Throughput: 0: 838.4. Samples: 704912. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:36:29,134][00184] Avg episode reward: [(0, '20.770')] -[2023-03-12 07:36:29,588][11786] Updated weights for policy 0, policy_version 690 (0.0047) -[2023-03-12 07:36:34,132][00184] Fps is (10 sec: 3275.9, 60 sec: 3276.7, 300 sec: 3387.8). 
Total num frames: 2838528. Throughput: 0: 835.3. Samples: 709588. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) -[2023-03-12 07:36:34,134][00184] Avg episode reward: [(0, '20.450')] -[2023-03-12 07:36:39,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3277.0, 300 sec: 3374.0). Total num frames: 2854912. Throughput: 0: 836.1. Samples: 713954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:36:39,136][00184] Avg episode reward: [(0, '20.328')] -[2023-03-12 07:36:42,830][11786] Updated weights for policy 0, policy_version 700 (0.0021) -[2023-03-12 07:36:44,129][00184] Fps is (10 sec: 3277.7, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2871296. Throughput: 0: 837.1. Samples: 716184. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:36:44,131][00184] Avg episode reward: [(0, '20.639')] -[2023-03-12 07:36:49,129][00184] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2895872. Throughput: 0: 893.4. Samples: 723168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:36:49,131][00184] Avg episode reward: [(0, '23.156')] -[2023-03-12 07:36:51,653][11786] Updated weights for policy 0, policy_version 710 (0.0014) -[2023-03-12 07:36:54,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3550.5, 300 sec: 3443.4). Total num frames: 2916352. Throughput: 0: 877.5. Samples: 729410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:36:54,139][00184] Avg episode reward: [(0, '24.057')] -[2023-03-12 07:36:59,131][00184] Fps is (10 sec: 3276.4, 60 sec: 3549.8, 300 sec: 3415.6). Total num frames: 2928640. Throughput: 0: 851.1. Samples: 731544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:36:59,135][00184] Avg episode reward: [(0, '23.215')] -[2023-03-12 07:37:04,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3387.9). Total num frames: 2945024. Throughput: 0: 850.2. Samples: 736256. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:37:04,145][00184] Avg episode reward: [(0, '23.755')] -[2023-03-12 07:37:05,224][11786] Updated weights for policy 0, policy_version 720 (0.0012) -[2023-03-12 07:37:09,129][00184] Fps is (10 sec: 2867.6, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2957312. Throughput: 0: 841.2. Samples: 740536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:37:09,136][00184] Avg episode reward: [(0, '23.391')] -[2023-03-12 07:37:14,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3208.6, 300 sec: 3374.0). Total num frames: 2969600. Throughput: 0: 839.2. Samples: 742676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:37:14,138][00184] Avg episode reward: [(0, '21.818')] -[2023-03-12 07:37:19,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 2985984. Throughput: 0: 825.2. Samples: 746720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:37:19,132][00184] Avg episode reward: [(0, '19.715')] -[2023-03-12 07:37:19,847][11786] Updated weights for policy 0, policy_version 730 (0.0021) -[2023-03-12 07:37:24,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 3002368. Throughput: 0: 840.0. Samples: 751756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:37:24,138][00184] Avg episode reward: [(0, '20.358')] -[2023-03-12 07:37:29,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 3026944. Throughput: 0: 867.0. Samples: 755200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:37:29,131][00184] Avg episode reward: [(0, '20.943')] -[2023-03-12 07:37:29,517][11786] Updated weights for policy 0, policy_version 740 (0.0031) -[2023-03-12 07:37:34,130][00184] Fps is (10 sec: 4505.4, 60 sec: 3481.7, 300 sec: 3443.4). Total num frames: 3047424. Throughput: 0: 863.4. Samples: 762022. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:37:34,132][00184] Avg episode reward: [(0, '20.254')] -[2023-03-12 07:37:39,129][00184] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3063808. Throughput: 0: 822.2. Samples: 766410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:37:39,135][00184] Avg episode reward: [(0, '19.974')] -[2023-03-12 07:37:41,719][11786] Updated weights for policy 0, policy_version 750 (0.0016) -[2023-03-12 07:37:44,129][00184] Fps is (10 sec: 3277.0, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 3080192. Throughput: 0: 825.0. Samples: 768666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:37:44,132][00184] Avg episode reward: [(0, '19.997')] -[2023-03-12 07:37:49,129][00184] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3096576. Throughput: 0: 836.4. Samples: 773892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:37:49,131][00184] Avg episode reward: [(0, '20.906')] -[2023-03-12 07:37:49,150][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000756_3096576.pth... -[2023-03-12 07:37:49,279][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000557_2281472.pth -[2023-03-12 07:37:54,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3387.9). Total num frames: 3108864. Throughput: 0: 828.5. Samples: 777820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:37:54,133][00184] Avg episode reward: [(0, '20.972')] -[2023-03-12 07:37:55,370][11786] Updated weights for policy 0, policy_version 760 (0.0021) -[2023-03-12 07:37:59,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3208.6, 300 sec: 3360.1). Total num frames: 3121152. Throughput: 0: 828.6. Samples: 779964. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:37:59,134][00184] Avg episode reward: [(0, '21.303')] -[2023-03-12 07:38:04,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 3141632. Throughput: 0: 843.4. Samples: 784674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:38:04,132][00184] Avg episode reward: [(0, '21.269')] -[2023-03-12 07:38:06,705][11786] Updated weights for policy 0, policy_version 770 (0.0014) -[2023-03-12 07:38:09,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 3162112. Throughput: 0: 885.9. Samples: 791620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:38:09,131][00184] Avg episode reward: [(0, '21.520')] -[2023-03-12 07:38:14,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 3182592. Throughput: 0: 886.3. Samples: 795084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:38:14,133][00184] Avg episode reward: [(0, '22.134')] -[2023-03-12 07:38:17,236][11786] Updated weights for policy 0, policy_version 780 (0.0024) -[2023-03-12 07:38:19,130][00184] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3415.6). Total num frames: 3198976. Throughput: 0: 838.5. Samples: 799754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:38:19,136][00184] Avg episode reward: [(0, '21.585')] -[2023-03-12 07:38:24,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3387.9). Total num frames: 3215360. Throughput: 0: 858.1. Samples: 805026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:38:24,131][00184] Avg episode reward: [(0, '20.438')] -[2023-03-12 07:38:29,129][00184] Fps is (10 sec: 3277.0, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 3231744. Throughput: 0: 860.0. Samples: 807364. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:38:29,134][00184] Avg episode reward: [(0, '20.775')] -[2023-03-12 07:38:29,759][11786] Updated weights for policy 0, policy_version 790 (0.0026) -[2023-03-12 07:38:34,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 3244032. Throughput: 0: 842.1. Samples: 811788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:38:34,134][00184] Avg episode reward: [(0, '21.040')] -[2023-03-12 07:38:39,129][00184] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3346.2). Total num frames: 3256320. Throughput: 0: 819.1. Samples: 814680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:38:39,136][00184] Avg episode reward: [(0, '20.846')] -[2023-03-12 07:38:44,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 3272704. Throughput: 0: 822.4. Samples: 816972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:38:44,136][00184] Avg episode reward: [(0, '20.005')] -[2023-03-12 07:38:44,527][11786] Updated weights for policy 0, policy_version 800 (0.0013) -[2023-03-12 07:38:49,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3297280. Throughput: 0: 870.9. Samples: 823864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:38:49,134][00184] Avg episode reward: [(0, '20.812')] -[2023-03-12 07:38:53,229][11786] Updated weights for policy 0, policy_version 810 (0.0011) -[2023-03-12 07:38:54,129][00184] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 3317760. Throughput: 0: 864.0. Samples: 830498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:38:54,136][00184] Avg episode reward: [(0, '20.151')] -[2023-03-12 07:38:59,129][00184] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 3334144. Throughput: 0: 838.4. Samples: 832812. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:38:59,136][00184] Avg episode reward: [(0, '21.750')] -[2023-03-12 07:39:04,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3401.8). Total num frames: 3354624. Throughput: 0: 843.1. Samples: 837692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:04,131][00184] Avg episode reward: [(0, '22.516')] -[2023-03-12 07:39:05,674][11786] Updated weights for policy 0, policy_version 820 (0.0024) -[2023-03-12 07:39:09,134][00184] Fps is (10 sec: 3275.3, 60 sec: 3413.1, 300 sec: 3373.9). Total num frames: 3366912. Throughput: 0: 837.3. Samples: 842708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:09,136][00184] Avg episode reward: [(0, '23.395')] -[2023-03-12 07:39:14,133][00184] Fps is (10 sec: 2456.7, 60 sec: 3276.6, 300 sec: 3360.1). Total num frames: 3379200. Throughput: 0: 835.7. Samples: 844974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:39:14,135][00184] Avg episode reward: [(0, '24.927')] -[2023-03-12 07:39:14,140][11773] Saving new best policy, reward=24.927! -[2023-03-12 07:39:19,129][00184] Fps is (10 sec: 2868.6, 60 sec: 3276.8, 300 sec: 3346.2). Total num frames: 3395584. Throughput: 0: 826.5. Samples: 848982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:19,131][00184] Avg episode reward: [(0, '23.356')] -[2023-03-12 07:39:20,025][11786] Updated weights for policy 0, policy_version 830 (0.0011) -[2023-03-12 07:39:24,129][00184] Fps is (10 sec: 3277.9, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 3411968. Throughput: 0: 873.3. Samples: 853978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:24,135][00184] Avg episode reward: [(0, '23.316')] -[2023-03-12 07:39:29,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 3436544. Throughput: 0: 898.9. Samples: 857422. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:39:29,132][00184] Avg episode reward: [(0, '22.350')] -[2023-03-12 07:39:29,649][11786] Updated weights for policy 0, policy_version 840 (0.0019) -[2023-03-12 07:39:34,129][00184] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 3457024. Throughput: 0: 905.7. Samples: 864622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:34,133][00184] Avg episode reward: [(0, '22.959')] -[2023-03-12 07:39:39,129][00184] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3415.6). Total num frames: 3473408. Throughput: 0: 860.3. Samples: 869210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:39,133][00184] Avg episode reward: [(0, '22.146')] -[2023-03-12 07:39:41,194][11786] Updated weights for policy 0, policy_version 850 (0.0030) -[2023-03-12 07:39:44,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3387.9). Total num frames: 3489792. Throughput: 0: 860.9. Samples: 871554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:44,131][00184] Avg episode reward: [(0, '23.627')] -[2023-03-12 07:39:49,129][00184] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 3506176. Throughput: 0: 877.8. Samples: 877194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:39:49,131][00184] Avg episode reward: [(0, '23.511')] -[2023-03-12 07:39:49,150][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000857_3510272.pth... -[2023-03-12 07:39:49,280][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth -[2023-03-12 07:39:53,368][11786] Updated weights for policy 0, policy_version 860 (0.0014) -[2023-03-12 07:39:54,132][00184] Fps is (10 sec: 3275.9, 60 sec: 3413.2, 300 sec: 3387.9). Total num frames: 3522560. Throughput: 0: 864.1. Samples: 881592. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:39:54,134][00184] Avg episode reward: [(0, '23.440')] -[2023-03-12 07:39:59,129][00184] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 3538944. Throughput: 0: 858.3. Samples: 883596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) -[2023-03-12 07:39:59,137][00184] Avg episode reward: [(0, '22.703')] -[2023-03-12 07:40:04,129][00184] Fps is (10 sec: 3277.7, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3555328. Throughput: 0: 872.0. Samples: 888220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:40:04,131][00184] Avg episode reward: [(0, '22.386')] -[2023-03-12 07:40:05,738][11786] Updated weights for policy 0, policy_version 870 (0.0024) -[2023-03-12 07:40:09,129][00184] Fps is (10 sec: 4096.1, 60 sec: 3550.2, 300 sec: 3429.6). Total num frames: 3579904. Throughput: 0: 918.2. Samples: 895298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:40:09,132][00184] Avg episode reward: [(0, '22.867')] -[2023-03-12 07:40:14,137][00184] Fps is (10 sec: 4502.1, 60 sec: 3686.2, 300 sec: 3457.2). Total num frames: 3600384. Throughput: 0: 921.0. Samples: 898876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:40:14,145][00184] Avg episode reward: [(0, '22.267')] -[2023-03-12 07:40:14,733][11786] Updated weights for policy 0, policy_version 880 (0.0013) -[2023-03-12 07:40:19,131][00184] Fps is (10 sec: 3685.8, 60 sec: 3686.3, 300 sec: 3457.3). Total num frames: 3616768. Throughput: 0: 876.6. Samples: 904072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:40:19,139][00184] Avg episode reward: [(0, '22.727')] -[2023-03-12 07:40:24,129][00184] Fps is (10 sec: 3279.3, 60 sec: 3686.4, 300 sec: 3429.5). Total num frames: 3633152. Throughput: 0: 887.6. Samples: 909152. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:40:24,137][00184] Avg episode reward: [(0, '22.663')] -[2023-03-12 07:40:26,503][11786] Updated weights for policy 0, policy_version 890 (0.0013) -[2023-03-12 07:40:29,137][00184] Fps is (10 sec: 3274.8, 60 sec: 3549.4, 300 sec: 3415.6). Total num frames: 3649536. Throughput: 0: 900.6. Samples: 912086. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:40:29,143][00184] Avg episode reward: [(0, '22.228')] -[2023-03-12 07:40:34,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 3665920. Throughput: 0: 879.5. Samples: 916770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:40:34,131][00184] Avg episode reward: [(0, '20.273')] -[2023-03-12 07:40:39,129][00184] Fps is (10 sec: 2869.4, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 3678208. Throughput: 0: 870.9. Samples: 920782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:40:39,132][00184] Avg episode reward: [(0, '20.284')] -[2023-03-12 07:40:40,575][11786] Updated weights for policy 0, policy_version 900 (0.0021) -[2023-03-12 07:40:44,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 3694592. Throughput: 0: 878.0. Samples: 923108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:40:44,134][00184] Avg episode reward: [(0, '20.682')] -[2023-03-12 07:40:49,129][00184] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.5). Total num frames: 3719168. Throughput: 0: 918.8. Samples: 929568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:40:49,134][00184] Avg episode reward: [(0, '21.291')] -[2023-03-12 07:40:50,435][11786] Updated weights for policy 0, policy_version 910 (0.0021) -[2023-03-12 07:40:54,130][00184] Fps is (10 sec: 4914.6, 60 sec: 3686.5, 300 sec: 3485.1). Total num frames: 3743744. Throughput: 0: 921.9. Samples: 936786. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) -[2023-03-12 07:40:54,139][00184] Avg episode reward: [(0, '22.320')] -[2023-03-12 07:40:59,129][00184] Fps is (10 sec: 4095.9, 60 sec: 3686.4, 300 sec: 3485.1). Total num frames: 3760128. Throughput: 0: 894.8. Samples: 939134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:40:59,140][00184] Avg episode reward: [(0, '21.796')] -[2023-03-12 07:41:01,591][11786] Updated weights for policy 0, policy_version 920 (0.0022) -[2023-03-12 07:41:04,129][00184] Fps is (10 sec: 3277.2, 60 sec: 3686.4, 300 sec: 3457.3). Total num frames: 3776512. Throughput: 0: 882.3. Samples: 943772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:41:04,132][00184] Avg episode reward: [(0, '22.174')] -[2023-03-12 07:41:09,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3792896. Throughput: 0: 885.5. Samples: 949000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:41:09,135][00184] Avg episode reward: [(0, '22.127')] -[2023-03-12 07:41:14,112][11786] Updated weights for policy 0, policy_version 930 (0.0021) -[2023-03-12 07:41:14,129][00184] Fps is (10 sec: 3276.7, 60 sec: 3482.0, 300 sec: 3443.4). Total num frames: 3809280. Throughput: 0: 869.0. Samples: 951186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:41:14,133][00184] Avg episode reward: [(0, '23.783')] -[2023-03-12 07:41:19,130][00184] Fps is (10 sec: 2867.1, 60 sec: 3413.4, 300 sec: 3443.4). Total num frames: 3821568. Throughput: 0: 863.7. Samples: 955636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:41:19,132][00184] Avg episode reward: [(0, '22.968')] -[2023-03-12 07:41:24,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 3837952. Throughput: 0: 877.3. Samples: 960262. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:41:24,132][00184] Avg episode reward: [(0, '22.934')] -[2023-03-12 07:41:26,359][11786] Updated weights for policy 0, policy_version 940 (0.0023) -[2023-03-12 07:41:29,129][00184] Fps is (10 sec: 4096.2, 60 sec: 3550.3, 300 sec: 3471.2). Total num frames: 3862528. Throughput: 0: 905.2. Samples: 963842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:41:29,138][00184] Avg episode reward: [(0, '24.924')] -[2023-03-12 07:41:34,130][00184] Fps is (10 sec: 4915.0, 60 sec: 3686.3, 300 sec: 3498.9). Total num frames: 3887104. Throughput: 0: 923.4. Samples: 971122. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:41:34,132][00184] Avg episode reward: [(0, '24.257')] -[2023-03-12 07:41:35,302][11786] Updated weights for policy 0, policy_version 950 (0.0019) -[2023-03-12 07:41:39,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3485.1). Total num frames: 3899392. Throughput: 0: 872.4. Samples: 976044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:41:39,134][00184] Avg episode reward: [(0, '25.356')] -[2023-03-12 07:41:39,155][11773] Saving new best policy, reward=25.356! -[2023-03-12 07:41:44,129][00184] Fps is (10 sec: 2867.4, 60 sec: 3686.4, 300 sec: 3457.3). Total num frames: 3915776. Throughput: 0: 870.0. Samples: 978282. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) -[2023-03-12 07:41:44,141][00184] Avg episode reward: [(0, '24.329')] -[2023-03-12 07:41:47,188][11786] Updated weights for policy 0, policy_version 960 (0.0015) -[2023-03-12 07:41:49,129][00184] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 3936256. Throughput: 0: 893.4. Samples: 983976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) -[2023-03-12 07:41:49,134][00184] Avg episode reward: [(0, '23.660')] -[2023-03-12 07:41:49,144][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000961_3936256.pth... 
-[2023-03-12 07:41:49,312][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000756_3096576.pth -[2023-03-12 07:41:54,129][00184] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3457.3). Total num frames: 3948544. Throughput: 0: 875.0. Samples: 988374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:41:54,136][00184] Avg episode reward: [(0, '24.143')] -[2023-03-12 07:41:59,135][00184] Fps is (10 sec: 2865.4, 60 sec: 3413.0, 300 sec: 3457.2). Total num frames: 3964928. Throughput: 0: 864.5. Samples: 990092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) -[2023-03-12 07:41:59,144][00184] Avg episode reward: [(0, '23.284')] -[2023-03-12 07:42:01,797][11786] Updated weights for policy 0, policy_version 970 (0.0014) -[2023-03-12 07:42:04,129][00184] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 3977216. Throughput: 0: 866.0. Samples: 994606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) -[2023-03-12 07:42:04,132][00184] Avg episode reward: [(0, '23.682')] -[2023-03-12 07:42:09,129][00184] Fps is (10 sec: 3688.7, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 4001792. Throughput: 0: 912.4. Samples: 1001322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) -[2023-03-12 07:42:09,133][00184] Avg episode reward: [(0, '23.842')] -[2023-03-12 07:42:09,474][00184] Component Batcher_0 stopped! -[2023-03-12 07:42:09,474][11773] Stopping Batcher_0... -[2023-03-12 07:42:09,477][11773] Loop batcher_evt_loop terminating... -[2023-03-12 07:42:09,478][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... -[2023-03-12 07:42:09,521][11786] Weights refcount: 2 0 -[2023-03-12 07:42:09,527][11786] Stopping InferenceWorker_p0-w0... -[2023-03-12 07:42:09,527][11786] Loop inference_proc0-0_evt_loop terminating... -[2023-03-12 07:42:09,528][00184] Component InferenceWorker_p0-w0 stopped! -[2023-03-12 07:42:09,547][11787] Stopping RolloutWorker_w1... 
-[2023-03-12 07:42:09,547][11787] Loop rollout_proc1_evt_loop terminating... -[2023-03-12 07:42:09,547][00184] Component RolloutWorker_w1 stopped! -[2023-03-12 07:42:09,561][11800] Stopping RolloutWorker_w7... -[2023-03-12 07:42:09,561][00184] Component RolloutWorker_w7 stopped! -[2023-03-12 07:42:09,567][11800] Loop rollout_proc7_evt_loop terminating... -[2023-03-12 07:42:09,580][11799] Stopping RolloutWorker_w3... -[2023-03-12 07:42:09,580][00184] Component RolloutWorker_w3 stopped! -[2023-03-12 07:42:09,581][11799] Loop rollout_proc3_evt_loop terminating... -[2023-03-12 07:42:09,590][11798] Stopping RolloutWorker_w5... -[2023-03-12 07:42:09,590][11798] Loop rollout_proc5_evt_loop terminating... -[2023-03-12 07:42:09,590][00184] Component RolloutWorker_w5 stopped! -[2023-03-12 07:42:09,598][00184] Component RolloutWorker_w6 stopped! -[2023-03-12 07:42:09,601][11801] Stopping RolloutWorker_w6... -[2023-03-12 07:42:09,613][11801] Loop rollout_proc6_evt_loop terminating... -[2023-03-12 07:42:09,617][00184] Component RolloutWorker_w0 stopped! -[2023-03-12 07:42:09,621][11789] Stopping RolloutWorker_w2... -[2023-03-12 07:42:09,622][00184] Component RolloutWorker_w2 stopped! -[2023-03-12 07:42:09,626][11789] Loop rollout_proc2_evt_loop terminating... -[2023-03-12 07:42:09,625][11791] Stopping RolloutWorker_w0... -[2023-03-12 07:42:09,628][00184] Component RolloutWorker_w4 stopped! -[2023-03-12 07:42:09,631][11793] Stopping RolloutWorker_w4... -[2023-03-12 07:42:09,643][11793] Loop rollout_proc4_evt_loop terminating... -[2023-03-12 07:42:09,644][11791] Loop rollout_proc0_evt_loop terminating... -[2023-03-12 07:42:09,689][11773] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000857_3510272.pth -[2023-03-12 07:42:09,705][11773] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... -[2023-03-12 07:42:09,900][00184] Component LearnerWorker_p0 stopped! 
-[2023-03-12 07:42:09,908][00184] Waiting for process learner_proc0 to stop...
-[2023-03-12 07:42:09,913][11773] Stopping LearnerWorker_p0...
-[2023-03-12 07:42:09,914][11773] Loop learner_proc0_evt_loop terminating...
-[2023-03-12 07:42:11,677][00184] Waiting for process inference_proc0-0 to join...
-[2023-03-12 07:42:11,920][00184] Waiting for process rollout_proc0 to join...
-[2023-03-12 07:42:12,420][00184] Waiting for process rollout_proc1 to join...
-[2023-03-12 07:42:12,423][00184] Waiting for process rollout_proc2 to join...
-[2023-03-12 07:42:12,425][00184] Waiting for process rollout_proc3 to join...
-[2023-03-12 07:42:12,428][00184] Waiting for process rollout_proc4 to join...
-[2023-03-12 07:42:12,432][00184] Waiting for process rollout_proc5 to join...
-[2023-03-12 07:42:12,439][00184] Waiting for process rollout_proc6 to join...
-[2023-03-12 07:42:12,441][00184] Waiting for process rollout_proc7 to join...
-[2023-03-12 07:42:12,442][00184] Batcher 0 profile tree view:
-batching: 27.9395, releasing_batches: 0.0240
-[2023-03-12 07:42:12,443][00184] InferenceWorker_p0-w0 profile tree view:
+[2023-03-14 14:00:54,519][00372] Heartbeat connected on Batcher_0
+[2023-03-14 14:00:54,529][00372] Heartbeat connected on InferenceWorker_p0-w0
+[2023-03-14 14:00:54,539][00372] Heartbeat connected on RolloutWorker_w0
+[2023-03-14 14:00:54,544][00372] Heartbeat connected on RolloutWorker_w1
+[2023-03-14 14:00:54,548][00372] Heartbeat connected on RolloutWorker_w2
+[2023-03-14 14:00:54,759][00372] Heartbeat connected on RolloutWorker_w3
+[2023-03-14 14:00:54,764][00372] Heartbeat connected on RolloutWorker_w4
+[2023-03-14 14:00:54,766][00372] Heartbeat connected on RolloutWorker_w5
+[2023-03-14 14:00:54,769][00372] Heartbeat connected on RolloutWorker_w6
+[2023-03-14 14:00:54,773][00372] Heartbeat connected on RolloutWorker_w7
+[2023-03-14 14:00:55,602][13187] Using optimizer
+[2023-03-14 14:00:55,603][13187] No checkpoints found
+[2023-03-14 14:00:55,603][13187] Did not load from checkpoint, starting from scratch!
+[2023-03-14 14:00:55,603][13187] Initialized policy 0 weights for model version 0
+[2023-03-14 14:00:55,613][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-03-14 14:00:55,623][13187] LearnerWorker_p0 finished initialization!
+[2023-03-14 14:00:55,623][00372] Heartbeat connected on LearnerWorker_p0
+[2023-03-14 14:00:55,807][13200] RunningMeanStd input shape: (3, 72, 128)
+[2023-03-14 14:00:55,809][13200] RunningMeanStd input shape: (1,)
+[2023-03-14 14:00:55,831][13200] ConvEncoder: input_channels=3
+[2023-03-14 14:00:55,997][13200] Conv encoder output size: 512
+[2023-03-14 14:00:55,999][13200] Policy head output size: 512
+[2023-03-14 14:00:59,436][00372] Inference worker 0-0 is ready!
+[2023-03-14 14:00:59,438][00372] All inference workers are ready! Signal rollout workers to start!
+[2023-03-14 14:00:59,554][13205] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,559][13211] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,560][13204] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,578][13207] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,713][13209] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,720][13202] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,725][13208] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:00:59,723][13210] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:01:00,144][00372] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-14 14:01:00,912][13209] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:00,914][13202] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:01,406][13211] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:01,408][13204] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:01,412][13207] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:01,544][13209] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:02,126][13208] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:02,339][13204] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:02,431][13205] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:02,523][13209] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:02,655][13211] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:03,241][13210] Decorrelating experience for 0 frames...
+[2023-03-14 14:01:03,249][13208] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:03,339][13205] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:03,486][13204] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:03,847][13209] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:04,539][13210] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:04,571][13207] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:04,738][13208] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:04,850][13211] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:04,953][13202] Decorrelating experience for 32 frames...
+[2023-03-14 14:01:05,143][00372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-14 14:01:05,226][13204] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:05,563][13202] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:05,843][13208] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:06,116][13205] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:06,243][13211] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:06,720][13210] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:06,743][13205] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:07,121][13202] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:07,507][13210] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:07,957][13207] Decorrelating experience for 64 frames...
+[2023-03-14 14:01:10,143][00372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 53.0. Samples: 530. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
+[2023-03-14 14:01:10,146][00372] Avg episode reward: [(0, '1.660')]
+[2023-03-14 14:01:10,932][13187] Signal inference workers to stop experience collection...
+[2023-03-14 14:01:10,941][13200] InferenceWorker_p0-w0: stopping experience collection
+[2023-03-14 14:01:10,999][13207] Decorrelating experience for 96 frames...
+[2023-03-14 14:01:14,007][13187] Signal inference workers to resume experience collection...
+[2023-03-14 14:01:14,008][13200] InferenceWorker_p0-w0: resuming experience collection
+[2023-03-14 14:01:15,143][00372] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 169.9. Samples: 2548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
+[2023-03-14 14:01:15,150][00372] Avg episode reward: [(0, '2.686')]
+[2023-03-14 14:01:20,143][00372] Fps is (10 sec: 2048.0, 60 sec: 1024.1, 300 sec: 1024.1). Total num frames: 20480. Throughput: 0: 180.6. Samples: 3612. Policy #0 lag: (min: 0.0, avg: 0.3, max: 3.0)
+[2023-03-14 14:01:20,150][00372] Avg episode reward: [(0, '3.515')]
+[2023-03-14 14:01:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 1310.8, 300 sec: 1310.8). Total num frames: 32768. Throughput: 0: 313.6. Samples: 7840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:01:25,146][00372] Avg episode reward: [(0, '4.007')]
+[2023-03-14 14:01:26,631][13200] Updated weights for policy 0, policy_version 10 (0.0013)
+[2023-03-14 14:01:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 1775.0, 300 sec: 1775.0). Total num frames: 53248. Throughput: 0: 463.6. Samples: 13908. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:01:30,149][00372] Avg episode reward: [(0, '4.655')]
+[2023-03-14 14:01:35,147][00372] Fps is (10 sec: 4094.1, 60 sec: 2106.3, 300 sec: 2106.3). Total num frames: 73728. Throughput: 0: 486.1. Samples: 17014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:01:35,152][00372] Avg episode reward: [(0, '4.581')]
+[2023-03-14 14:01:37,689][13200] Updated weights for policy 0, policy_version 20 (0.0033)
+[2023-03-14 14:01:40,144][00372] Fps is (10 sec: 3276.3, 60 sec: 2150.4, 300 sec: 2150.4). Total num frames: 86016. Throughput: 0: 533.8. Samples: 21354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:01:40,148][00372] Avg episode reward: [(0, '4.389')]
+[2023-03-14 14:01:45,144][00372] Fps is (10 sec: 2458.3, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 98304. Throughput: 0: 564.9. Samples: 25420. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:01:45,152][00372] Avg episode reward: [(0, '4.206')]
+[2023-03-14 14:01:50,143][00372] Fps is (10 sec: 2867.6, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 114688. Throughput: 0: 608.6. Samples: 27388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:01:50,145][00372] Avg episode reward: [(0, '4.330')]
+[2023-03-14 14:01:50,152][13187] Saving new best policy, reward=4.330!
+[2023-03-14 14:01:51,670][13200] Updated weights for policy 0, policy_version 30 (0.0014)
+[2023-03-14 14:01:55,143][00372] Fps is (10 sec: 3686.9, 60 sec: 2457.7, 300 sec: 2457.7). Total num frames: 135168. Throughput: 0: 727.6. Samples: 33272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:01:55,146][00372] Avg episode reward: [(0, '4.421')]
+[2023-03-14 14:01:55,151][13187] Saving new best policy, reward=4.421!
+[2023-03-14 14:02:00,144][00372] Fps is (10 sec: 3686.2, 60 sec: 2525.9, 300 sec: 2525.9). Total num frames: 151552. Throughput: 0: 811.1. Samples: 39048. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:00,146][00372] Avg episode reward: [(0, '4.232')]
+[2023-03-14 14:02:03,230][13200] Updated weights for policy 0, policy_version 40 (0.0023)
+[2023-03-14 14:02:05,143][00372] Fps is (10 sec: 3276.9, 60 sec: 2798.9, 300 sec: 2583.7). Total num frames: 167936. Throughput: 0: 832.0. Samples: 41054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:05,154][00372] Avg episode reward: [(0, '4.304')]
+[2023-03-14 14:02:10,143][00372] Fps is (10 sec: 2867.4, 60 sec: 3003.7, 300 sec: 2574.7). Total num frames: 180224. Throughput: 0: 826.5. Samples: 45034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:10,151][00372] Avg episode reward: [(0, '4.388')]
+[2023-03-14 14:02:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 2621.5). Total num frames: 196608. Throughput: 0: 794.9. Samples: 49678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:15,148][00372] Avg episode reward: [(0, '4.358')]
+[2023-03-14 14:02:16,544][13200] Updated weights for policy 0, policy_version 50 (0.0040)
+[2023-03-14 14:02:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 2713.7). Total num frames: 217088. Throughput: 0: 795.0. Samples: 52784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:20,148][00372] Avg episode reward: [(0, '4.380')]
+[2023-03-14 14:02:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2746.8). Total num frames: 233472. Throughput: 0: 834.4. Samples: 58900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:25,149][00372] Avg episode reward: [(0, '4.276')]
+[2023-03-14 14:02:28,154][13200] Updated weights for policy 0, policy_version 60 (0.0016)
+[2023-03-14 14:02:30,148][00372] Fps is (10 sec: 3275.0, 60 sec: 3276.5, 300 sec: 2776.1). Total num frames: 249856. Throughput: 0: 833.0. Samples: 62910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:02:30,156][00372] Avg episode reward: [(0, '4.309')]
+[2023-03-14 14:02:30,170][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth...
+[2023-03-14 14:02:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 2759.5). Total num frames: 262144. Throughput: 0: 832.0. Samples: 64830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:02:35,147][00372] Avg episode reward: [(0, '4.342')]
+[2023-03-14 14:02:40,143][00372] Fps is (10 sec: 2868.8, 60 sec: 3208.6, 300 sec: 2785.3). Total num frames: 278528. Throughput: 0: 798.6. Samples: 69208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:02:40,153][00372] Avg episode reward: [(0, '4.421')]
+[2023-03-14 14:02:41,492][13200] Updated weights for policy 0, policy_version 70 (0.0017)
+[2023-03-14 14:02:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 2847.7). Total num frames: 299008. Throughput: 0: 808.2. Samples: 75418. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:02:45,145][00372] Avg episode reward: [(0, '4.279')]
+[2023-03-14 14:02:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2867.2). Total num frames: 315392. Throughput: 0: 832.9. Samples: 78536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:02:50,148][00372] Avg episode reward: [(0, '4.310')]
+[2023-03-14 14:02:53,566][13200] Updated weights for policy 0, policy_version 80 (0.0034)
+[2023-03-14 14:02:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 2885.0). Total num frames: 331776. Throughput: 0: 832.0. Samples: 82476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:02:55,145][00372] Avg episode reward: [(0, '4.472')]
+[2023-03-14 14:02:55,151][13187] Saving new best policy, reward=4.472!
+[2023-03-14 14:03:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 2867.2). Total num frames: 344064. Throughput: 0: 816.7. Samples: 86430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:03:00,145][00372] Avg episode reward: [(0, '4.478')]
+[2023-03-14 14:03:00,153][13187] Saving new best policy, reward=4.478!
+[2023-03-14 14:03:05,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 2883.6). Total num frames: 360448. Throughput: 0: 791.9. Samples: 88420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:03:05,145][00372] Avg episode reward: [(0, '4.335')]
+[2023-03-14 14:03:06,695][13200] Updated weights for policy 0, policy_version 90 (0.0027)
+[2023-03-14 14:03:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2930.2). Total num frames: 380928. Throughput: 0: 794.7. Samples: 94662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:03:10,151][00372] Avg episode reward: [(0, '4.322')]
+[2023-03-14 14:03:15,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3345.0, 300 sec: 2943.0). Total num frames: 397312. Throughput: 0: 824.3. Samples: 100002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:03:15,149][00372] Avg episode reward: [(0, '4.364')]
+[2023-03-14 14:03:18,947][13200] Updated weights for policy 0, policy_version 100 (0.0015)
+[2023-03-14 14:03:20,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.3, 300 sec: 2925.7). Total num frames: 409600. Throughput: 0: 823.8. Samples: 101906. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:03:20,149][00372] Avg episode reward: [(0, '4.373')]
+[2023-03-14 14:03:25,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3208.5, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 817.2. Samples: 105984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:03:25,149][00372] Avg episode reward: [(0, '4.317')]
+[2023-03-14 14:03:30,143][00372] Fps is (10 sec: 3277.9, 60 sec: 3208.8, 300 sec: 2949.1). Total num frames: 442368. Throughput: 0: 787.7. Samples: 110864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:03:30,150][00372] Avg episode reward: [(0, '4.578')]
+[2023-03-14 14:03:30,160][13187] Saving new best policy, reward=4.578!
+[2023-03-14 14:03:31,728][13200] Updated weights for policy 0, policy_version 110 (0.0035)
+[2023-03-14 14:03:35,146][00372] Fps is (10 sec: 3685.3, 60 sec: 3344.9, 300 sec: 2986.1). Total num frames: 462848. Throughput: 0: 787.7. Samples: 113986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:03:35,149][00372] Avg episode reward: [(0, '4.485')]
+[2023-03-14 14:03:40,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 2995.2). Total num frames: 479232. Throughput: 0: 827.2. Samples: 119702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:03:40,152][00372] Avg episode reward: [(0, '4.286')]
+[2023-03-14 14:03:44,115][13200] Updated weights for policy 0, policy_version 120 (0.0019)
+[2023-03-14 14:03:45,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3208.5, 300 sec: 2978.9). Total num frames: 491520. Throughput: 0: 827.6. Samples: 123670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:03:45,148][00372] Avg episode reward: [(0, '4.310')]
+[2023-03-14 14:03:50,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 2987.7). Total num frames: 507904. Throughput: 0: 826.8. Samples: 125628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:03:50,145][00372] Avg episode reward: [(0, '4.462')]
+[2023-03-14 14:03:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2996.0). Total num frames: 524288. Throughput: 0: 793.6. Samples: 130372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:03:55,146][00372] Avg episode reward: [(0, '4.788')]
+[2023-03-14 14:03:55,153][13187] Saving new best policy, reward=4.788!
+[2023-03-14 14:03:56,712][13200] Updated weights for policy 0, policy_version 130 (0.0016)
+[2023-03-14 14:04:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3026.5). Total num frames: 544768. Throughput: 0: 810.2. Samples: 136460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:04:00,146][00372] Avg episode reward: [(0, '4.721')]
+[2023-03-14 14:04:05,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3033.3). Total num frames: 561152. Throughput: 0: 830.4. Samples: 139270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:04:05,145][00372] Avg episode reward: [(0, '4.687')]
+[2023-03-14 14:04:09,762][13200] Updated weights for policy 0, policy_version 140 (0.0021)
+[2023-03-14 14:04:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3018.1). Total num frames: 573440. Throughput: 0: 823.7. Samples: 143048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:10,149][00372] Avg episode reward: [(0, '4.655')]
+[2023-03-14 14:04:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3003.8). Total num frames: 585728. Throughput: 0: 799.6. Samples: 146846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:15,151][00372] Avg episode reward: [(0, '4.629')]
+[2023-03-14 14:04:20,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.5, 300 sec: 3010.5). Total num frames: 602112. Throughput: 0: 780.0. Samples: 149088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:20,149][00372] Avg episode reward: [(0, '4.413')]
+[2023-03-14 14:04:22,207][13200] Updated weights for policy 0, policy_version 150 (0.0018)
+[2023-03-14 14:04:25,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3037.1). Total num frames: 622592. Throughput: 0: 792.9. Samples: 155382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:25,145][00372] Avg episode reward: [(0, '4.350')]
+[2023-03-14 14:04:30,144][00372] Fps is (10 sec: 3687.2, 60 sec: 3276.7, 300 sec: 3042.7). Total num frames: 638976. Throughput: 0: 819.4. Samples: 160544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:30,156][00372] Avg episode reward: [(0, '4.533')]
+[2023-03-14 14:04:30,176][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000156_638976.pth...
+[2023-03-14 14:04:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 3029.2). Total num frames: 651264. Throughput: 0: 817.4. Samples: 162410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:04:35,145][00372] Avg episode reward: [(0, '4.534')]
+[2023-03-14 14:04:35,593][13200] Updated weights for policy 0, policy_version 160 (0.0019)
+[2023-03-14 14:04:40,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3140.2, 300 sec: 3034.8). Total num frames: 667648. Throughput: 0: 800.5. Samples: 166396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:04:40,149][00372] Avg episode reward: [(0, '4.574')]
+[2023-03-14 14:04:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3040.2). Total num frames: 684032. Throughput: 0: 784.0. Samples: 171742. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-03-14 14:04:45,149][00372] Avg episode reward: [(0, '4.492')]
+[2023-03-14 14:04:47,244][13200] Updated weights for policy 0, policy_version 170 (0.0032)
+[2023-03-14 14:04:50,143][00372] Fps is (10 sec: 3686.6, 60 sec: 3276.8, 300 sec: 3063.1). Total num frames: 704512. Throughput: 0: 791.4. Samples: 174882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:04:50,149][00372] Avg episode reward: [(0, '4.543')]
+[2023-03-14 14:04:55,143][00372] Fps is (10 sec: 3686.2, 60 sec: 3276.8, 300 sec: 3067.7). Total num frames: 720896. Throughput: 0: 823.9. Samples: 180124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:04:55,151][00372] Avg episode reward: [(0, '4.618')]
+[2023-03-14 14:05:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3055.0). Total num frames: 733184. Throughput: 0: 826.7. Samples: 184046. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-03-14 14:05:00,146][00372] Avg episode reward: [(0, '4.672')]
+[2023-03-14 14:05:00,735][13200] Updated weights for policy 0, policy_version 180 (0.0034)
+[2023-03-14 14:05:05,143][00372] Fps is (10 sec: 2457.7, 60 sec: 3072.0, 300 sec: 3042.8). Total num frames: 745472. Throughput: 0: 822.3. Samples: 186088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:05:05,146][00372] Avg episode reward: [(0, '4.585')]
+[2023-03-14 14:05:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3063.8). Total num frames: 765952. Throughput: 0: 796.8. Samples: 191238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:05:10,145][00372] Avg episode reward: [(0, '4.510')]
+[2023-03-14 14:05:12,262][13200] Updated weights for policy 0, policy_version 190 (0.0015)
+[2023-03-14 14:05:15,143][00372] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3100.1). Total num frames: 790528. Throughput: 0: 820.9. Samples: 197482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:15,151][00372] Avg episode reward: [(0, '4.737')]
+[2023-03-14 14:05:20,145][00372] Fps is (10 sec: 3685.5, 60 sec: 3345.1, 300 sec: 3087.7). Total num frames: 802816. Throughput: 0: 834.1. Samples: 199946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:05:20,148][00372] Avg episode reward: [(0, '4.657')]
+[2023-03-14 14:05:25,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 3075.9). Total num frames: 815104. Throughput: 0: 832.4. Samples: 203852. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:25,152][00372] Avg episode reward: [(0, '4.589')]
+[2023-03-14 14:05:25,510][13200] Updated weights for policy 0, policy_version 200 (0.0030)
+[2023-03-14 14:05:30,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.6, 300 sec: 3079.6). Total num frames: 831488. Throughput: 0: 803.6. Samples: 207904. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:30,146][00372] Avg episode reward: [(0, '4.492')]
+[2023-03-14 14:05:35,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3083.2). Total num frames: 847872. Throughput: 0: 793.5. Samples: 210590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:35,148][00372] Avg episode reward: [(0, '4.463')]
+[2023-03-14 14:05:37,128][13200] Updated weights for policy 0, policy_version 210 (0.0034)
+[2023-03-14 14:05:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3115.9). Total num frames: 872448. Throughput: 0: 816.9. Samples: 216884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:40,145][00372] Avg episode reward: [(0, '4.534')]
+[2023-03-14 14:05:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3104.4). Total num frames: 884736. Throughput: 0: 835.5. Samples: 221642. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:05:45,148][00372] Avg episode reward: [(0, '4.562')]
+[2023-03-14 14:05:50,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 3093.2). Total num frames: 897024. Throughput: 0: 835.1. Samples: 223670. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:05:50,145][00372] Avg episode reward: [(0, '4.589')]
+[2023-03-14 14:05:50,581][13200] Updated weights for policy 0, policy_version 220 (0.0015)
+[2023-03-14 14:05:55,144][00372] Fps is (10 sec: 2866.9, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 913408. Throughput: 0: 810.1. Samples: 227692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:05:55,148][00372] Avg episode reward: [(0, '4.615')]
+[2023-03-14 14:06:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3151.8). Total num frames: 929792. Throughput: 0: 793.6. Samples: 233194. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:06:00,150][00372] Avg episode reward: [(0, '4.408')]
+[2023-03-14 14:06:02,180][13200] Updated weights for policy 0, policy_version 230 (0.0023)
+[2023-03-14 14:06:05,143][00372] Fps is (10 sec: 4096.5, 60 sec: 3481.6, 300 sec: 3235.1). Total num frames: 954368. Throughput: 0: 808.7. Samples: 236334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:06:05,150][00372] Avg episode reward: [(0, '4.418')]
+[2023-03-14 14:06:10,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 966656. Throughput: 0: 834.8. Samples: 241418. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:06:10,150][00372] Avg episode reward: [(0, '4.573')]
+[2023-03-14 14:06:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 978944. Throughput: 0: 832.3. Samples: 245358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:06:15,145][00372] Avg episode reward: [(0, '4.674')]
+[2023-03-14 14:06:15,917][13200] Updated weights for policy 0, policy_version 240 (0.0029)
+[2023-03-14 14:06:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3249.0). Total num frames: 991232. Throughput: 0: 818.0. Samples: 247398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:06:20,146][00372] Avg episode reward: [(0, '4.651')]
+[2023-03-14 14:06:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1011712. Throughput: 0: 794.9. Samples: 252654. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:06:25,146][00372] Avg episode reward: [(0, '4.397')]
+[2023-03-14 14:06:27,287][13200] Updated weights for policy 0, policy_version 250 (0.0020)
+[2023-03-14 14:06:30,145][00372] Fps is (10 sec: 4095.0, 60 sec: 3344.9, 300 sec: 3249.1). Total num frames: 1032192. Throughput: 0: 827.1. Samples: 258864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:06:30,153][00372] Avg episode reward: [(0, '4.469')]
+[2023-03-14 14:06:30,191][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000253_1036288.pth...
+[2023-03-14 14:06:30,384][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth
+[2023-03-14 14:06:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 1048576. Throughput: 0: 829.6. Samples: 261000. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:06:35,145][00372] Avg episode reward: [(0, '4.512')]
+[2023-03-14 14:06:40,146][00372] Fps is (10 sec: 2867.9, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 1060864. Throughput: 0: 827.6. Samples: 264932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:06:40,149][00372] Avg episode reward: [(0, '4.483')]
+[2023-03-14 14:06:41,269][13200] Updated weights for policy 0, policy_version 260 (0.0023)
+[2023-03-14 14:06:45,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1073152. Throughput: 0: 792.3. Samples: 268848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:06:45,150][00372] Avg episode reward: [(0, '4.525')]
+[2023-03-14 14:06:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1093632. Throughput: 0: 789.1. Samples: 271844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:06:50,145][00372] Avg episode reward: [(0, '4.628')]
+[2023-03-14 14:06:52,412][13200] Updated weights for policy 0, policy_version 270 (0.0018)
+[2023-03-14 14:06:55,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 1114112. Throughput: 0: 812.0. Samples: 277958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:06:55,150][00372] Avg episode reward: [(0, '4.876')]
+[2023-03-14 14:06:55,155][13187] Saving new best policy, reward=4.876!
+[2023-03-14 14:07:00,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1126400. Throughput: 0: 819.8. Samples: 282248. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:07:00,152][00372] Avg episode reward: [(0, '5.077')]
+[2023-03-14 14:07:00,168][13187] Saving new best policy, reward=5.077!
+[2023-03-14 14:07:05,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3249.0). Total num frames: 1138688. Throughput: 0: 817.0. Samples: 284164. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:07:05,149][00372] Avg episode reward: [(0, '5.161')]
+[2023-03-14 14:07:05,154][13187] Saving new best policy, reward=5.161!
+[2023-03-14 14:07:07,096][13200] Updated weights for policy 0, policy_version 280 (0.0013)
+[2023-03-14 14:07:10,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1155072. Throughput: 0: 791.3. Samples: 288264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:07:10,145][00372] Avg episode reward: [(0, '4.802')]
+[2023-03-14 14:07:15,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1175552. Throughput: 0: 784.3. Samples: 294156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:07:15,151][00372] Avg episode reward: [(0, '4.633')]
+[2023-03-14 14:07:17,752][13200] Updated weights for policy 0, policy_version 290 (0.0013)
+[2023-03-14 14:07:20,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1196032. Throughput: 0: 805.0. Samples: 297226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:07:20,147][00372] Avg episode reward: [(0, '4.748')]
+[2023-03-14 14:07:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.1). Total num frames: 1208320. Throughput: 0: 822.8. Samples: 301958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:07:25,151][00372] Avg episode reward: [(0, '4.705')]
+[2023-03-14 14:07:30,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3249.0). Total num frames: 1220608. Throughput: 0: 822.4. Samples: 305854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:07:30,151][00372] Avg episode reward: [(0, '4.705')]
+[2023-03-14 14:07:32,215][13200] Updated weights for policy 0, policy_version 300 (0.0023)
+[2023-03-14 14:07:35,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1236992. Throughput: 0: 800.0. Samples: 307844. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:07:35,153][00372] Avg episode reward: [(0, '4.809')]
+[2023-03-14 14:07:40,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1257472. Throughput: 0: 791.3. Samples: 313568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:07:40,145][00372] Avg episode reward: [(0, '4.788')]
+[2023-03-14 14:07:42,619][13200] Updated weights for policy 0, policy_version 310 (0.0017)
+[2023-03-14 14:07:45,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1277952. Throughput: 0: 830.7. Samples: 319628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:07:45,145][00372] Avg episode reward: [(0, '4.939')]
+[2023-03-14 14:07:50,146][00372] Fps is (10 sec: 3275.7, 60 sec: 3276.6, 300 sec: 3249.0). Total num frames: 1290240. Throughput: 0: 831.8. Samples: 321596. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
+[2023-03-14 14:07:50,151][00372] Avg episode reward: [(0, '4.881')]
+[2023-03-14 14:07:55,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1302528. Throughput: 0: 826.0. Samples: 325436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:07:55,145][00372] Avg episode reward: [(0, '4.948')]
+[2023-03-14 14:07:57,511][13200] Updated weights for policy 0, policy_version 320 (0.0050)
+[2023-03-14 14:08:00,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 1318912. Throughput: 0: 790.2. Samples: 329714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:08:00,147][00372] Avg episode reward: [(0, '4.761')]
+[2023-03-14 14:08:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1339392. Throughput: 0: 790.8. Samples: 332812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:08:05,149][00372] Avg episode reward: [(0, '4.732')]
+[2023-03-14 14:08:07,835][13200] Updated weights for policy 0, policy_version 330 (0.0012)
+[2023-03-14 14:08:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1355776. Throughput: 0: 823.8. Samples: 339030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:08:10,150][00372] Avg episode reward: [(0, '4.674')]
+[2023-03-14 14:08:15,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 1368064. Throughput: 0: 824.3. Samples: 342946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:08:15,148][00372] Avg episode reward: [(0, '4.679')]
+[2023-03-14 14:08:20,160][00372] Fps is (10 sec: 2453.3, 60 sec: 3071.1, 300 sec: 3235.0). Total num frames: 1380352. Throughput: 0: 821.8. Samples: 344838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:08:20,168][00372] Avg episode reward: [(0, '4.929')]
+[2023-03-14 14:08:23,057][13200] Updated weights for policy 0, policy_version 340 (0.0033)
+[2023-03-14 14:08:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 1396736. Throughput: 0: 786.0. Samples: 348938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:08:25,145][00372] Avg episode reward: [(0, '5.001')]
+[2023-03-14 14:08:30,143][00372] Fps is (10 sec: 3692.7, 60 sec: 3276.8, 300 sec: 3235.2). Total num frames: 1417216. Throughput: 0: 786.2. Samples: 355006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:08:30,148][00372] Avg episode reward: [(0, '4.809')]
+[2023-03-14 14:08:30,196][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000347_1421312.pth...
+[2023-03-14 14:08:30,333][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000156_638976.pth
+[2023-03-14 14:08:33,054][13200] Updated weights for policy 0, policy_version 350 (0.0022)
+[2023-03-14 14:08:35,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1437696. Throughput: 0: 810.3. Samples: 358058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:08:35,147][00372] Avg episode reward: [(0, '4.563')]
+[2023-03-14 14:08:40,148][00372] Fps is (10 sec: 3275.1, 60 sec: 3208.2, 300 sec: 3249.0). Total num frames: 1449984. Throughput: 0: 822.7. Samples: 362460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:08:40,155][00372] Avg episode reward: [(0, '4.525')]
+[2023-03-14 14:08:45,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 1462272. Throughput: 0: 818.3. Samples: 366538. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:08:45,157][00372] Avg episode reward: [(0, '4.564')]
+[2023-03-14 14:08:48,177][13200] Updated weights for policy 0, policy_version 360 (0.0017)
+[2023-03-14 14:08:50,143][00372] Fps is (10 sec: 2868.8, 60 sec: 3140.5, 300 sec: 3235.1). Total num frames: 1478656. Throughput: 0: 794.6. Samples: 368568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:08:50,145][00372] Avg episode reward: [(0, '4.648')]
+[2023-03-14 14:08:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1499136. Throughput: 0: 789.3. Samples: 374550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:08:55,145][00372] Avg episode reward: [(0, '4.734')]
+[2023-03-14 14:08:58,167][13200] Updated weights for policy 0, policy_version 370 (0.0012)
+[2023-03-14 14:09:00,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1519616. Throughput: 0: 829.0. Samples: 380250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:09:00,147][00372] Avg episode reward: [(0, '4.794')]
+[2023-03-14 14:09:05,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1531904. Throughput: 0: 828.8. Samples: 382118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:09:05,151][00372] Avg episode reward: [(0, '4.553')]
+[2023-03-14 14:09:10,145][00372] Fps is (10 sec: 2457.0, 60 sec: 3140.1, 300 sec: 3249.0). Total num frames: 1544192. Throughput: 0: 820.9. Samples: 385880.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:09:10,149][00372] Avg episode reward: [(0, '4.453')] +[2023-03-14 14:09:13,941][13200] Updated weights for policy 0, policy_version 380 (0.0026) +[2023-03-14 14:09:15,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 1560576. Throughput: 0: 782.5. Samples: 390218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:15,146][00372] Avg episode reward: [(0, '4.338')] +[2023-03-14 14:09:20,143][00372] Fps is (10 sec: 3687.3, 60 sec: 3346.0, 300 sec: 3249.0). Total num frames: 1581056. Throughput: 0: 782.7. Samples: 393278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:20,150][00372] Avg episode reward: [(0, '4.581')] +[2023-03-14 14:09:24,412][13200] Updated weights for policy 0, policy_version 390 (0.0013) +[2023-03-14 14:09:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1597440. Throughput: 0: 814.5. Samples: 399108. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:09:25,148][00372] Avg episode reward: [(0, '4.716')] +[2023-03-14 14:09:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1609728. Throughput: 0: 810.7. Samples: 403020. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:09:30,150][00372] Avg episode reward: [(0, '4.786')] +[2023-03-14 14:09:35,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 1622016. Throughput: 0: 809.0. Samples: 404972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:35,150][00372] Avg episode reward: [(0, '4.711')] +[2023-03-14 14:09:39,256][13200] Updated weights for policy 0, policy_version 400 (0.0016) +[2023-03-14 14:09:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.6, 300 sec: 3235.1). Total num frames: 1638400. Throughput: 0: 774.8. Samples: 409414. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:40,152][00372] Avg episode reward: [(0, '4.627')] +[2023-03-14 14:09:45,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1658880. Throughput: 0: 784.7. Samples: 415560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:45,151][00372] Avg episode reward: [(0, '4.710')] +[2023-03-14 14:09:50,144][00372] Fps is (10 sec: 3685.8, 60 sec: 3276.7, 300 sec: 3235.1). Total num frames: 1675264. Throughput: 0: 808.9. Samples: 418518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:09:50,146][00372] Avg episode reward: [(0, '4.747')] +[2023-03-14 14:09:50,488][13200] Updated weights for policy 0, policy_version 410 (0.0020) +[2023-03-14 14:09:55,145][00372] Fps is (10 sec: 2866.4, 60 sec: 3140.1, 300 sec: 3235.1). Total num frames: 1687552. Throughput: 0: 809.8. Samples: 422320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:09:55,154][00372] Avg episode reward: [(0, '4.852')] +[2023-03-14 14:10:00,144][00372] Fps is (10 sec: 2867.3, 60 sec: 3071.9, 300 sec: 3249.0). Total num frames: 1703936. Throughput: 0: 800.4. Samples: 426238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:10:00,153][00372] Avg episode reward: [(0, '4.884')] +[2023-03-14 14:10:04,502][13200] Updated weights for policy 0, policy_version 420 (0.0039) +[2023-03-14 14:10:05,143][00372] Fps is (10 sec: 3277.6, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 1720320. Throughput: 0: 783.8. Samples: 428548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:10:05,145][00372] Avg episode reward: [(0, '5.060')] +[2023-03-14 14:10:10,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.9, 300 sec: 3221.3). Total num frames: 1740800. Throughput: 0: 793.3. Samples: 434808. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:10:10,150][00372] Avg episode reward: [(0, '5.365')] +[2023-03-14 14:10:10,164][13187] Saving new best policy, reward=5.365! +[2023-03-14 14:10:15,149][00372] Fps is (10 sec: 3684.0, 60 sec: 3276.4, 300 sec: 3235.1). Total num frames: 1757184. Throughput: 0: 824.9. Samples: 440144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:10:15,154][00372] Avg episode reward: [(0, '5.472')] +[2023-03-14 14:10:15,159][13187] Saving new best policy, reward=5.472! +[2023-03-14 14:10:15,539][13200] Updated weights for policy 0, policy_version 430 (0.0019) +[2023-03-14 14:10:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1773568. Throughput: 0: 825.9. Samples: 442138. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:10:20,151][00372] Avg episode reward: [(0, '5.414')] +[2023-03-14 14:10:25,144][00372] Fps is (10 sec: 2868.8, 60 sec: 3140.2, 300 sec: 3235.1). Total num frames: 1785856. Throughput: 0: 818.8. Samples: 446260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:10:25,152][00372] Avg episode reward: [(0, '5.449')] +[2023-03-14 14:10:29,073][13200] Updated weights for policy 0, policy_version 440 (0.0048) +[2023-03-14 14:10:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1806336. Throughput: 0: 799.4. Samples: 451534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:10:30,145][00372] Avg episode reward: [(0, '5.195')] +[2023-03-14 14:10:30,160][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000441_1806336.pth... +[2023-03-14 14:10:30,282][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000253_1036288.pth +[2023-03-14 14:10:35,143][00372] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 1826816. Throughput: 0: 802.2. Samples: 454616. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:10:35,145][00372] Avg episode reward: [(0, '5.015')] +[2023-03-14 14:10:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1839104. Throughput: 0: 840.2. Samples: 460128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:10:40,146][00372] Avg episode reward: [(0, '5.231')] +[2023-03-14 14:10:40,331][13200] Updated weights for policy 0, policy_version 450 (0.0020) +[2023-03-14 14:10:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1855488. Throughput: 0: 840.3. Samples: 464052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:10:45,147][00372] Avg episode reward: [(0, '5.159')] +[2023-03-14 14:10:50,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.6, 300 sec: 3235.2). Total num frames: 1867776. Throughput: 0: 832.3. Samples: 466004. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:10:50,146][00372] Avg episode reward: [(0, '5.252')] +[2023-03-14 14:10:54,323][13200] Updated weights for policy 0, policy_version 460 (0.0059) +[2023-03-14 14:10:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.9, 300 sec: 3235.1). Total num frames: 1884160. Throughput: 0: 797.2. Samples: 470682. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:10:55,145][00372] Avg episode reward: [(0, '5.608')] +[2023-03-14 14:10:55,154][13187] Saving new best policy, reward=5.608! +[2023-03-14 14:11:00,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 1904640. Throughput: 0: 811.7. Samples: 476666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:11:00,148][00372] Avg episode reward: [(0, '5.539')] +[2023-03-14 14:11:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1921024. Throughput: 0: 825.4. Samples: 479280. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:11:05,152][00372] Avg episode reward: [(0, '5.431')] +[2023-03-14 14:11:06,456][13200] Updated weights for policy 0, policy_version 470 (0.0019) +[2023-03-14 14:11:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 1933312. Throughput: 0: 818.1. Samples: 483074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:11:10,149][00372] Avg episode reward: [(0, '5.626')] +[2023-03-14 14:11:10,166][13187] Saving new best policy, reward=5.626! +[2023-03-14 14:11:15,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.6, 300 sec: 3235.1). Total num frames: 1945600. Throughput: 0: 787.8. Samples: 486984. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:11:15,151][00372] Avg episode reward: [(0, '5.505')] +[2023-03-14 14:11:19,713][13200] Updated weights for policy 0, policy_version 480 (0.0061) +[2023-03-14 14:11:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 1966080. Throughput: 0: 775.3. Samples: 489506. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:11:20,146][00372] Avg episode reward: [(0, '5.250')] +[2023-03-14 14:11:25,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3235.2). Total num frames: 1986560. Throughput: 0: 792.1. Samples: 495774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:11:25,150][00372] Avg episode reward: [(0, '5.645')] +[2023-03-14 14:11:25,157][13187] Saving new best policy, reward=5.645! +[2023-03-14 14:11:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 1998848. Throughput: 0: 811.6. Samples: 500574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:11:30,145][00372] Avg episode reward: [(0, '6.080')] +[2023-03-14 14:11:30,161][13187] Saving new best policy, reward=6.080! 
+[2023-03-14 14:11:32,139][13200] Updated weights for policy 0, policy_version 490 (0.0021) +[2023-03-14 14:11:35,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2011136. Throughput: 0: 811.4. Samples: 502516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:11:35,149][00372] Avg episode reward: [(0, '6.293')] +[2023-03-14 14:11:35,195][13187] Saving new best policy, reward=6.293! +[2023-03-14 14:11:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2027520. Throughput: 0: 797.3. Samples: 506560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:11:40,146][00372] Avg episode reward: [(0, '6.281')] +[2023-03-14 14:11:44,968][13200] Updated weights for policy 0, policy_version 500 (0.0021) +[2023-03-14 14:11:45,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 2048000. Throughput: 0: 784.7. Samples: 511978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:11:45,146][00372] Avg episode reward: [(0, '6.643')] +[2023-03-14 14:11:45,148][13187] Saving new best policy, reward=6.643! +[2023-03-14 14:11:50,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 2068480. Throughput: 0: 794.0. Samples: 515012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:11:50,146][00372] Avg episode reward: [(0, '6.715')] +[2023-03-14 14:11:50,160][13187] Saving new best policy, reward=6.715! +[2023-03-14 14:11:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3235.2). Total num frames: 2080768. Throughput: 0: 823.5. Samples: 520130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:11:55,147][00372] Avg episode reward: [(0, '6.622')] +[2023-03-14 14:11:57,214][13200] Updated weights for policy 0, policy_version 510 (0.0023) +[2023-03-14 14:12:00,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2093056. Throughput: 0: 821.5. 
Samples: 523950. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:12:00,150][00372] Avg episode reward: [(0, '6.371')] +[2023-03-14 14:12:05,147][00372] Fps is (10 sec: 2456.6, 60 sec: 3071.8, 300 sec: 3221.2). Total num frames: 2105344. Throughput: 0: 809.3. Samples: 525928. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:12:05,150][00372] Avg episode reward: [(0, '6.637')] +[2023-03-14 14:12:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2125824. Throughput: 0: 784.6. Samples: 531080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:12:10,146][00372] Avg episode reward: [(0, '6.483')] +[2023-03-14 14:12:10,391][13200] Updated weights for policy 0, policy_version 520 (0.0041) +[2023-03-14 14:12:15,143][00372] Fps is (10 sec: 4097.7, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2146304. Throughput: 0: 813.4. Samples: 537178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:12:15,145][00372] Avg episode reward: [(0, '6.937')] +[2023-03-14 14:12:15,168][13187] Saving new best policy, reward=6.937! +[2023-03-14 14:12:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2162688. Throughput: 0: 820.3. Samples: 539430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:12:20,156][00372] Avg episode reward: [(0, '7.011')] +[2023-03-14 14:12:20,174][13187] Saving new best policy, reward=7.011! +[2023-03-14 14:12:23,128][13200] Updated weights for policy 0, policy_version 530 (0.0017) +[2023-03-14 14:12:25,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2174976. Throughput: 0: 815.8. Samples: 543270. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-03-14 14:12:25,146][00372] Avg episode reward: [(0, '7.531')] +[2023-03-14 14:12:25,154][13187] Saving new best policy, reward=7.531! +[2023-03-14 14:12:30,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3221.3). 
Total num frames: 2187264. Throughput: 0: 784.0. Samples: 547260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:12:30,150][00372] Avg episode reward: [(0, '7.091')] +[2023-03-14 14:12:30,172][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000534_2187264.pth... +[2023-03-14 14:12:30,321][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000347_1421312.pth +[2023-03-14 14:12:35,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2207744. Throughput: 0: 779.1. Samples: 550070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:12:35,151][00372] Avg episode reward: [(0, '6.819')] +[2023-03-14 14:12:35,842][13200] Updated weights for policy 0, policy_version 540 (0.0036) +[2023-03-14 14:12:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2228224. Throughput: 0: 801.2. Samples: 556184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:12:40,145][00372] Avg episode reward: [(0, '6.809')] +[2023-03-14 14:12:45,143][00372] Fps is (10 sec: 3276.6, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2240512. Throughput: 0: 817.4. Samples: 560734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:12:45,145][00372] Avg episode reward: [(0, '6.777')] +[2023-03-14 14:12:49,019][13200] Updated weights for policy 0, policy_version 550 (0.0034) +[2023-03-14 14:12:50,145][00372] Fps is (10 sec: 2457.2, 60 sec: 3071.9, 300 sec: 3221.2). Total num frames: 2252800. Throughput: 0: 818.3. Samples: 562750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:12:50,152][00372] Avg episode reward: [(0, '7.140')] +[2023-03-14 14:12:55,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2269184. Throughput: 0: 795.0. Samples: 566854. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:12:55,145][00372] Avg episode reward: [(0, '7.639')] +[2023-03-14 14:12:55,151][13187] Saving new best policy, reward=7.639! +[2023-03-14 14:13:00,143][00372] Fps is (10 sec: 3687.1, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2289664. Throughput: 0: 782.2. Samples: 572378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:13:00,151][00372] Avg episode reward: [(0, '7.805')] +[2023-03-14 14:13:00,166][13187] Saving new best policy, reward=7.805! +[2023-03-14 14:13:01,173][13200] Updated weights for policy 0, policy_version 560 (0.0022) +[2023-03-14 14:13:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.6, 300 sec: 3235.1). Total num frames: 2310144. Throughput: 0: 800.8. Samples: 575468. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:13:05,145][00372] Avg episode reward: [(0, '8.309')] +[2023-03-14 14:13:05,148][13187] Saving new best policy, reward=8.309! +[2023-03-14 14:13:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2322432. Throughput: 0: 821.2. Samples: 580226. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:13:10,154][00372] Avg episode reward: [(0, '8.103')] +[2023-03-14 14:13:14,629][13200] Updated weights for policy 0, policy_version 570 (0.0037) +[2023-03-14 14:13:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3235.3). Total num frames: 2334720. Throughput: 0: 819.7. Samples: 584148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:13:15,148][00372] Avg episode reward: [(0, '7.802')] +[2023-03-14 14:13:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2347008. Throughput: 0: 801.4. Samples: 586132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:13:20,152][00372] Avg episode reward: [(0, '7.974')] +[2023-03-14 14:13:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2367488. 
Throughput: 0: 786.7. Samples: 591586. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:13:25,145][00372] Avg episode reward: [(0, '7.424')] +[2023-03-14 14:13:26,236][13200] Updated weights for policy 0, policy_version 580 (0.0013) +[2023-03-14 14:13:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2387968. Throughput: 0: 818.4. Samples: 597560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:13:30,147][00372] Avg episode reward: [(0, '7.896')] +[2023-03-14 14:13:35,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2400256. Throughput: 0: 817.4. Samples: 599530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:13:35,145][00372] Avg episode reward: [(0, '7.798')] +[2023-03-14 14:13:40,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2412544. Throughput: 0: 811.7. Samples: 603382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:13:40,152][00372] Avg episode reward: [(0, '7.567')] +[2023-03-14 14:13:40,783][13200] Updated weights for policy 0, policy_version 590 (0.0026) +[2023-03-14 14:13:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2428928. Throughput: 0: 782.0. Samples: 607570. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:13:45,152][00372] Avg episode reward: [(0, '7.653')] +[2023-03-14 14:13:50,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.9, 300 sec: 3221.3). Total num frames: 2449408. Throughput: 0: 780.6. Samples: 610596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:13:50,145][00372] Avg episode reward: [(0, '7.783')] +[2023-03-14 14:13:51,853][13200] Updated weights for policy 0, policy_version 600 (0.0023) +[2023-03-14 14:13:55,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3207.4). Total num frames: 2465792. Throughput: 0: 809.9. Samples: 616670. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:13:55,147][00372] Avg episode reward: [(0, '8.113')] +[2023-03-14 14:14:00,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2482176. Throughput: 0: 813.3. Samples: 620748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:14:00,148][00372] Avg episode reward: [(0, '8.021')] +[2023-03-14 14:14:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2494464. Throughput: 0: 811.6. Samples: 622656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:14:05,147][00372] Avg episode reward: [(0, '7.975')] +[2023-03-14 14:14:06,251][13200] Updated weights for policy 0, policy_version 610 (0.0014) +[2023-03-14 14:14:10,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3207.4). Total num frames: 2506752. Throughput: 0: 775.3. Samples: 626474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:10,145][00372] Avg episode reward: [(0, '7.801')] +[2023-03-14 14:14:15,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 2527232. Throughput: 0: 770.5. Samples: 632232. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:15,150][00372] Avg episode reward: [(0, '7.768')] +[2023-03-14 14:14:17,647][13200] Updated weights for policy 0, policy_version 620 (0.0012) +[2023-03-14 14:14:20,150][00372] Fps is (10 sec: 4093.0, 60 sec: 3344.7, 300 sec: 3221.2). Total num frames: 2547712. Throughput: 0: 793.7. Samples: 635254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:20,153][00372] Avg episode reward: [(0, '7.218')] +[2023-03-14 14:14:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2560000. Throughput: 0: 807.1. Samples: 639700. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:25,146][00372] Avg episode reward: [(0, '7.239')] +[2023-03-14 14:14:30,143][00372] Fps is (10 sec: 2459.4, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2572288. Throughput: 0: 804.0. Samples: 643748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:14:30,148][00372] Avg episode reward: [(0, '7.537')] +[2023-03-14 14:14:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth... +[2023-03-14 14:14:30,367][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000441_1806336.pth +[2023-03-14 14:14:32,324][13200] Updated weights for policy 0, policy_version 630 (0.0024) +[2023-03-14 14:14:35,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2588672. Throughput: 0: 779.8. Samples: 645686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:35,146][00372] Avg episode reward: [(0, '8.336')] +[2023-03-14 14:14:35,154][13187] Saving new best policy, reward=8.336! +[2023-03-14 14:14:40,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2609152. Throughput: 0: 778.6. Samples: 651706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:40,146][00372] Avg episode reward: [(0, '8.934')] +[2023-03-14 14:14:40,156][13187] Saving new best policy, reward=8.934! +[2023-03-14 14:14:42,454][13200] Updated weights for policy 0, policy_version 640 (0.0023) +[2023-03-14 14:14:45,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3235.2). Total num frames: 2629632. Throughput: 0: 818.2. Samples: 657566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:14:45,145][00372] Avg episode reward: [(0, '9.972')] +[2023-03-14 14:14:45,147][13187] Saving new best policy, reward=9.972! +[2023-03-14 14:14:50,144][00372] Fps is (10 sec: 3276.5, 60 sec: 3208.5, 300 sec: 3235.2). Total num frames: 2641920. 
Throughput: 0: 818.4. Samples: 659484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:14:50,146][00372] Avg episode reward: [(0, '9.165')] +[2023-03-14 14:14:55,145][00372] Fps is (10 sec: 2457.0, 60 sec: 3140.1, 300 sec: 3221.2). Total num frames: 2654208. Throughput: 0: 820.0. Samples: 663376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:14:55,150][00372] Avg episode reward: [(0, '10.121')] +[2023-03-14 14:14:55,152][13187] Saving new best policy, reward=10.121! +[2023-03-14 14:14:57,564][13200] Updated weights for policy 0, policy_version 650 (0.0021) +[2023-03-14 14:15:00,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2670592. Throughput: 0: 793.1. Samples: 667922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:15:00,146][00372] Avg episode reward: [(0, '9.747')] +[2023-03-14 14:15:05,143][00372] Fps is (10 sec: 3687.3, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2691072. Throughput: 0: 795.0. Samples: 671022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:05,148][00372] Avg episode reward: [(0, '9.345')] +[2023-03-14 14:15:07,570][13200] Updated weights for policy 0, policy_version 660 (0.0014) +[2023-03-14 14:15:10,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2707456. Throughput: 0: 833.3. Samples: 677198. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:15:10,154][00372] Avg episode reward: [(0, '9.946')] +[2023-03-14 14:15:15,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2723840. Throughput: 0: 832.4. Samples: 681204. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:15:15,146][00372] Avg episode reward: [(0, '9.821')] +[2023-03-14 14:15:20,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.7, 300 sec: 3221.3). Total num frames: 2736128. Throughput: 0: 834.7. Samples: 683248. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:20,151][00372] Avg episode reward: [(0, '10.564')] +[2023-03-14 14:15:20,163][13187] Saving new best policy, reward=10.564! +[2023-03-14 14:15:22,388][13200] Updated weights for policy 0, policy_version 670 (0.0018) +[2023-03-14 14:15:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 2752512. Throughput: 0: 798.4. Samples: 687636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:15:25,148][00372] Avg episode reward: [(0, '11.390')] +[2023-03-14 14:15:25,154][13187] Saving new best policy, reward=11.390! +[2023-03-14 14:15:30,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3207.4). Total num frames: 2772992. Throughput: 0: 809.2. Samples: 693982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:15:30,145][00372] Avg episode reward: [(0, '11.470')] +[2023-03-14 14:15:30,168][13187] Saving new best policy, reward=11.470! +[2023-03-14 14:15:32,367][13200] Updated weights for policy 0, policy_version 680 (0.0028) +[2023-03-14 14:15:35,146][00372] Fps is (10 sec: 3685.2, 60 sec: 3344.9, 300 sec: 3221.2). Total num frames: 2789376. Throughput: 0: 834.0. Samples: 697018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:35,149][00372] Avg episode reward: [(0, '12.260')] +[2023-03-14 14:15:35,154][13187] Saving new best policy, reward=12.260! +[2023-03-14 14:15:40,150][00372] Fps is (10 sec: 3274.4, 60 sec: 3276.4, 300 sec: 3221.2). Total num frames: 2805760. Throughput: 0: 835.1. Samples: 700960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:40,158][00372] Avg episode reward: [(0, '13.049')] +[2023-03-14 14:15:40,172][13187] Saving new best policy, reward=13.049! +[2023-03-14 14:15:45,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2818048. Throughput: 0: 824.0. Samples: 705000. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:45,147][00372] Avg episode reward: [(0, '13.164')] +[2023-03-14 14:15:45,151][13187] Saving new best policy, reward=13.164! +[2023-03-14 14:15:47,485][13200] Updated weights for policy 0, policy_version 690 (0.0026) +[2023-03-14 14:15:50,143][00372] Fps is (10 sec: 2869.3, 60 sec: 3208.6, 300 sec: 3221.3). Total num frames: 2834432. Throughput: 0: 805.2. Samples: 707258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:15:50,146][00372] Avg episode reward: [(0, '13.988')] +[2023-03-14 14:15:50,154][13187] Saving new best policy, reward=13.988! +[2023-03-14 14:15:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3221.3). Total num frames: 2854912. Throughput: 0: 806.6. Samples: 713496. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:15:55,145][00372] Avg episode reward: [(0, '14.562')] +[2023-03-14 14:15:55,151][13187] Saving new best policy, reward=14.562! +[2023-03-14 14:15:57,624][13200] Updated weights for policy 0, policy_version 700 (0.0047) +[2023-03-14 14:16:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2871296. Throughput: 0: 832.6. Samples: 718670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-03-14 14:16:00,148][00372] Avg episode reward: [(0, '13.817')] +[2023-03-14 14:16:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2883584. Throughput: 0: 831.1. Samples: 720648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:16:05,148][00372] Avg episode reward: [(0, '14.462')] +[2023-03-14 14:16:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 2899968. Throughput: 0: 824.2. Samples: 724724. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:16:10,145][00372] Avg episode reward: [(0, '14.085')] +[2023-03-14 14:16:12,285][13200] Updated weights for policy 0, policy_version 710 (0.0031) +[2023-03-14 14:16:15,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2920448. Throughput: 0: 799.3. Samples: 729950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:15,149][00372] Avg episode reward: [(0, '13.695')] +[2023-03-14 14:16:20,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 2940928. Throughput: 0: 800.9. Samples: 733058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:20,146][00372] Avg episode reward: [(0, '14.497')] +[2023-03-14 14:16:22,310][13200] Updated weights for policy 0, policy_version 720 (0.0013) +[2023-03-14 14:16:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 2953216. Throughput: 0: 835.8. Samples: 738566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:25,148][00372] Avg episode reward: [(0, '14.419')] +[2023-03-14 14:16:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 2969600. Throughput: 0: 835.9. Samples: 742614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:16:30,150][00372] Avg episode reward: [(0, '13.953')] +[2023-03-14 14:16:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000725_2969600.pth... +[2023-03-14 14:16:30,331][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000534_2187264.pth +[2023-03-14 14:16:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.7, 300 sec: 3235.1). Total num frames: 2981888. Throughput: 0: 829.1. Samples: 744568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:35,147][00372] Avg episode reward: [(0, '14.644')] +[2023-03-14 14:16:35,149][13187] Saving new best policy, reward=14.644! 
+[2023-03-14 14:16:37,150][13200] Updated weights for policy 0, policy_version 730 (0.0015) +[2023-03-14 14:16:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3277.2, 300 sec: 3235.1). Total num frames: 3002368. Throughput: 0: 801.2. Samples: 749550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:40,145][00372] Avg episode reward: [(0, '15.058')] +[2023-03-14 14:16:40,156][13187] Saving new best policy, reward=15.058! +[2023-03-14 14:16:45,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 3022848. Throughput: 0: 827.0. Samples: 755886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:45,146][00372] Avg episode reward: [(0, '15.576')] +[2023-03-14 14:16:45,152][13187] Saving new best policy, reward=15.576! +[2023-03-14 14:16:47,228][13200] Updated weights for policy 0, policy_version 740 (0.0025) +[2023-03-14 14:16:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 3035136. Throughput: 0: 838.7. Samples: 758390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:50,147][00372] Avg episode reward: [(0, '15.567')] +[2023-03-14 14:16:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3051520. Throughput: 0: 838.9. Samples: 762476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:16:55,147][00372] Avg episode reward: [(0, '15.279')] +[2023-03-14 14:17:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 3063808. Throughput: 0: 815.1. Samples: 766630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:17:00,146][00372] Avg episode reward: [(0, '15.984')] +[2023-03-14 14:17:00,168][13187] Saving new best policy, reward=15.984! +[2023-03-14 14:17:01,939][13200] Updated weights for policy 0, policy_version 750 (0.0013) +[2023-03-14 14:17:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 3084288. 
Throughput: 0: 801.5. Samples: 769124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:17:05,146][00372] Avg episode reward: [(0, '15.766')] +[2023-03-14 14:17:10,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3104768. Throughput: 0: 819.0. Samples: 775420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:17:10,149][00372] Avg episode reward: [(0, '15.274')] +[2023-03-14 14:17:11,876][13200] Updated weights for policy 0, policy_version 760 (0.0013) +[2023-03-14 14:17:15,144][00372] Fps is (10 sec: 3685.8, 60 sec: 3345.0, 300 sec: 3249.0). Total num frames: 3121152. Throughput: 0: 842.0. Samples: 780506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:17:15,151][00372] Avg episode reward: [(0, '15.076')] +[2023-03-14 14:17:20,144][00372] Fps is (10 sec: 2866.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 3133440. Throughput: 0: 841.6. Samples: 782440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:17:20,147][00372] Avg episode reward: [(0, '15.162')] +[2023-03-14 14:17:25,147][00372] Fps is (10 sec: 2457.0, 60 sec: 3208.3, 300 sec: 3249.0). Total num frames: 3145728. Throughput: 0: 819.4. Samples: 786424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:17:25,152][00372] Avg episode reward: [(0, '14.169')] +[2023-03-14 14:17:26,724][13200] Updated weights for policy 0, policy_version 770 (0.0013) +[2023-03-14 14:17:30,143][00372] Fps is (10 sec: 3277.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3166208. Throughput: 0: 803.1. Samples: 792024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:17:30,150][00372] Avg episode reward: [(0, '15.131')] +[2023-03-14 14:17:35,143][00372] Fps is (10 sec: 4097.7, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3186688. Throughput: 0: 814.7. Samples: 795052. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:17:35,148][00372] Avg episode reward: [(0, '16.771')] +[2023-03-14 14:17:35,153][13187] Saving new best policy, reward=16.771! +[2023-03-14 14:17:37,258][13200] Updated weights for policy 0, policy_version 780 (0.0018) +[2023-03-14 14:17:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3198976. Throughput: 0: 834.9. Samples: 800046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:17:40,145][00372] Avg episode reward: [(0, '17.101')] +[2023-03-14 14:17:40,261][13187] Saving new best policy, reward=17.101! +[2023-03-14 14:17:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3215360. Throughput: 0: 830.4. Samples: 803998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:17:45,145][00372] Avg episode reward: [(0, '17.761')] +[2023-03-14 14:17:45,147][13187] Saving new best policy, reward=17.761! +[2023-03-14 14:17:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 3227648. Throughput: 0: 820.0. Samples: 806026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:17:50,146][00372] Avg episode reward: [(0, '18.537')] +[2023-03-14 14:17:50,164][13187] Saving new best policy, reward=18.537! +[2023-03-14 14:17:51,979][13200] Updated weights for policy 0, policy_version 790 (0.0043) +[2023-03-14 14:17:55,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3248128. Throughput: 0: 793.9. Samples: 811144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:17:55,145][00372] Avg episode reward: [(0, '19.508')] +[2023-03-14 14:17:55,149][13187] Saving new best policy, reward=19.508! +[2023-03-14 14:18:00,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3268608. Throughput: 0: 823.3. Samples: 817554. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:18:00,145][00372] Avg episode reward: [(0, '18.626')] +[2023-03-14 14:18:01,970][13200] Updated weights for policy 0, policy_version 800 (0.0028) +[2023-03-14 14:18:05,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3280896. Throughput: 0: 834.7. Samples: 820002. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:18:05,150][00372] Avg episode reward: [(0, '18.513')] +[2023-03-14 14:18:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3297280. Throughput: 0: 833.5. Samples: 823930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:18:10,149][00372] Avg episode reward: [(0, '19.400')] +[2023-03-14 14:18:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 3309568. Throughput: 0: 802.0. Samples: 828114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:18:15,148][00372] Avg episode reward: [(0, '20.029')] +[2023-03-14 14:18:15,153][13187] Saving new best policy, reward=20.029! +[2023-03-14 14:18:16,501][13200] Updated weights for policy 0, policy_version 810 (0.0030) +[2023-03-14 14:18:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3262.9). Total num frames: 3330048. Throughput: 0: 795.9. Samples: 830868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:18:20,148][00372] Avg episode reward: [(0, '18.905')] +[2023-03-14 14:18:25,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.6, 300 sec: 3262.9). Total num frames: 3350528. Throughput: 0: 825.3. Samples: 837184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:18:25,146][00372] Avg episode reward: [(0, '18.714')] +[2023-03-14 14:18:27,160][13200] Updated weights for policy 0, policy_version 820 (0.0030) +[2023-03-14 14:18:30,146][00372] Fps is (10 sec: 3685.1, 60 sec: 3344.9, 300 sec: 3276.8). Total num frames: 3366912. Throughput: 0: 843.9. Samples: 841978. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:18:30,149][00372] Avg episode reward: [(0, '18.459')] +[2023-03-14 14:18:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000822_3366912.pth... +[2023-03-14 14:18:30,347][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth +[2023-03-14 14:18:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3379200. Throughput: 0: 840.4. Samples: 843842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:18:35,147][00372] Avg episode reward: [(0, '17.178')] +[2023-03-14 14:18:40,143][00372] Fps is (10 sec: 2458.5, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3391488. Throughput: 0: 817.7. Samples: 847942. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:18:40,145][00372] Avg episode reward: [(0, '15.610')] +[2023-03-14 14:18:41,517][13200] Updated weights for policy 0, policy_version 830 (0.0034) +[2023-03-14 14:18:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 3411968. Throughput: 0: 803.0. Samples: 853690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:18:45,146][00372] Avg episode reward: [(0, '16.384')] +[2023-03-14 14:18:50,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 3432448. Throughput: 0: 817.2. Samples: 856778. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:18:50,145][00372] Avg episode reward: [(0, '17.387')] +[2023-03-14 14:18:51,883][13200] Updated weights for policy 0, policy_version 840 (0.0021) +[2023-03-14 14:18:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3448832. Throughput: 0: 839.2. Samples: 861696. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:18:55,150][00372] Avg episode reward: [(0, '17.480')] +[2023-03-14 14:19:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3461120. Throughput: 0: 836.4. Samples: 865752. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:19:00,145][00372] Avg episode reward: [(0, '17.248')] +[2023-03-14 14:19:05,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3473408. Throughput: 0: 820.6. Samples: 867794. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:19:05,145][00372] Avg episode reward: [(0, '17.249')] +[2023-03-14 14:19:06,500][13200] Updated weights for policy 0, policy_version 850 (0.0031) +[2023-03-14 14:19:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3493888. Throughput: 0: 794.5. Samples: 872938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:19:10,145][00372] Avg episode reward: [(0, '18.821')] +[2023-03-14 14:19:15,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3276.9). Total num frames: 3514368. Throughput: 0: 823.5. Samples: 879034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:15,145][00372] Avg episode reward: [(0, '19.260')] +[2023-03-14 14:19:17,585][13200] Updated weights for policy 0, policy_version 860 (0.0013) +[2023-03-14 14:19:20,145][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3526656. Throughput: 0: 828.3. Samples: 881116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:19:20,147][00372] Avg episode reward: [(0, '19.241')] +[2023-03-14 14:19:25,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3538944. Throughput: 0: 824.5. Samples: 885044. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:19:25,146][00372] Avg episode reward: [(0, '19.037')] +[2023-03-14 14:19:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 3276.8). Total num frames: 3555328. Throughput: 0: 790.5. Samples: 889262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:19:30,148][00372] Avg episode reward: [(0, '19.234')] +[2023-03-14 14:19:31,699][13200] Updated weights for policy 0, policy_version 870 (0.0022) +[2023-03-14 14:19:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3575808. Throughput: 0: 790.8. Samples: 892364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:35,145][00372] Avg episode reward: [(0, '18.772')] +[2023-03-14 14:19:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 3596288. Throughput: 0: 819.4. Samples: 898568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:40,149][00372] Avg episode reward: [(0, '18.643')] +[2023-03-14 14:19:42,605][13200] Updated weights for policy 0, policy_version 880 (0.0023) +[2023-03-14 14:19:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3608576. Throughput: 0: 825.9. Samples: 902916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:45,153][00372] Avg episode reward: [(0, '18.709')] +[2023-03-14 14:19:50,144][00372] Fps is (10 sec: 2457.4, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3620864. Throughput: 0: 824.7. Samples: 904908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:50,148][00372] Avg episode reward: [(0, '19.604')] +[2023-03-14 14:19:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3637248. Throughput: 0: 802.7. Samples: 909058. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:19:55,145][00372] Avg episode reward: [(0, '19.484')] +[2023-03-14 14:19:56,598][13200] Updated weights for policy 0, policy_version 890 (0.0029) +[2023-03-14 14:20:00,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3657728. Throughput: 0: 800.4. Samples: 915054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:20:00,152][00372] Avg episode reward: [(0, '20.453')] +[2023-03-14 14:20:00,165][13187] Saving new best policy, reward=20.453! +[2023-03-14 14:20:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 3678208. Throughput: 0: 822.6. Samples: 918134. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:20:05,147][00372] Avg episode reward: [(0, '20.899')] +[2023-03-14 14:20:05,149][13187] Saving new best policy, reward=20.899! +[2023-03-14 14:20:08,041][13200] Updated weights for policy 0, policy_version 900 (0.0016) +[2023-03-14 14:20:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3690496. Throughput: 0: 833.9. Samples: 922568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:20:10,149][00372] Avg episode reward: [(0, '21.631')] +[2023-03-14 14:20:10,162][13187] Saving new best policy, reward=21.631! +[2023-03-14 14:20:15,147][00372] Fps is (10 sec: 2456.8, 60 sec: 3140.1, 300 sec: 3276.8). Total num frames: 3702784. Throughput: 0: 825.9. Samples: 926430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:20:15,151][00372] Avg episode reward: [(0, '20.911')] +[2023-03-14 14:20:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3719168. Throughput: 0: 800.8. Samples: 928402. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:20:20,145][00372] Avg episode reward: [(0, '21.338')] +[2023-03-14 14:20:21,885][13200] Updated weights for policy 0, policy_version 910 (0.0041) +[2023-03-14 14:20:25,143][00372] Fps is (10 sec: 3687.5, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3739648. Throughput: 0: 790.2. Samples: 934128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:20:25,145][00372] Avg episode reward: [(0, '20.509')] +[2023-03-14 14:20:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 3760128. Throughput: 0: 828.3. Samples: 940190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:20:30,156][00372] Avg episode reward: [(0, '20.659')] +[2023-03-14 14:20:30,168][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000918_3760128.pth... +[2023-03-14 14:20:30,304][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000725_2969600.pth +[2023-03-14 14:20:33,133][13200] Updated weights for policy 0, policy_version 920 (0.0013) +[2023-03-14 14:20:35,144][00372] Fps is (10 sec: 3276.3, 60 sec: 3276.7, 300 sec: 3276.9). Total num frames: 3772416. Throughput: 0: 826.9. Samples: 942120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:20:35,147][00372] Avg episode reward: [(0, '20.531')] +[2023-03-14 14:20:40,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3784704. Throughput: 0: 824.0. Samples: 946138. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:20:40,151][00372] Avg episode reward: [(0, '19.511')] +[2023-03-14 14:20:45,144][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3801088. Throughput: 0: 791.8. Samples: 950684. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:20:45,148][00372] Avg episode reward: [(0, '19.759')] +[2023-03-14 14:20:46,700][13200] Updated weights for policy 0, policy_version 930 (0.0034) +[2023-03-14 14:20:50,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3821568. Throughput: 0: 792.1. Samples: 953780. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:20:50,146][00372] Avg episode reward: [(0, '19.582')] +[2023-03-14 14:20:55,143][00372] Fps is (10 sec: 3687.0, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3837952. Throughput: 0: 830.6. Samples: 959946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:20:55,149][00372] Avg episode reward: [(0, '19.445')] +[2023-03-14 14:20:58,397][13200] Updated weights for policy 0, policy_version 940 (0.0018) +[2023-03-14 14:21:00,145][00372] Fps is (10 sec: 3276.1, 60 sec: 3276.7, 300 sec: 3290.7). Total num frames: 3854336. Throughput: 0: 834.1. Samples: 963964. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:21:00,151][00372] Avg episode reward: [(0, '19.360')] +[2023-03-14 14:21:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3866624. Throughput: 0: 834.4. Samples: 965948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:21:05,147][00372] Avg episode reward: [(0, '19.113')] +[2023-03-14 14:21:10,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3883008. Throughput: 0: 802.1. Samples: 970222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:10,145][00372] Avg episode reward: [(0, '18.789')] +[2023-03-14 14:21:11,477][13200] Updated weights for policy 0, policy_version 950 (0.0036) +[2023-03-14 14:21:15,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 3903488. Throughput: 0: 807.1. Samples: 976508. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:15,146][00372] Avg episode reward: [(0, '18.265')] +[2023-03-14 14:21:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3919872. Throughput: 0: 834.8. Samples: 979684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:20,145][00372] Avg episode reward: [(0, '18.707')] +[2023-03-14 14:21:23,182][13200] Updated weights for policy 0, policy_version 960 (0.0018) +[2023-03-14 14:21:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3936256. Throughput: 0: 834.7. Samples: 983700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:25,147][00372] Avg episode reward: [(0, '19.918')] +[2023-03-14 14:21:30,145][00372] Fps is (10 sec: 2866.7, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3948544. Throughput: 0: 823.2. Samples: 987730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:21:30,147][00372] Avg episode reward: [(0, '20.429')] +[2023-03-14 14:21:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3262.9). Total num frames: 3964928. Throughput: 0: 802.4. Samples: 989888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:35,146][00372] Avg episode reward: [(0, '20.354')] +[2023-03-14 14:21:36,223][13200] Updated weights for policy 0, policy_version 970 (0.0019) +[2023-03-14 14:21:40,143][00372] Fps is (10 sec: 3687.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 3985408. Throughput: 0: 804.1. Samples: 996132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:21:40,145][00372] Avg episode reward: [(0, '21.814')] +[2023-03-14 14:21:40,215][13187] Saving new best policy, reward=21.814! +[2023-03-14 14:21:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4001792. Throughput: 0: 830.7. Samples: 1001342. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:45,150][00372] Avg episode reward: [(0, '22.687')] +[2023-03-14 14:21:45,152][13187] Saving new best policy, reward=22.687! +[2023-03-14 14:21:48,804][13200] Updated weights for policy 0, policy_version 980 (0.0017) +[2023-03-14 14:21:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4014080. Throughput: 0: 829.2. Samples: 1003260. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:21:50,148][00372] Avg episode reward: [(0, '23.651')] +[2023-03-14 14:21:50,165][13187] Saving new best policy, reward=23.651! +[2023-03-14 14:21:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 4030464. Throughput: 0: 819.3. Samples: 1007092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:21:55,146][00372] Avg episode reward: [(0, '23.598')] +[2023-03-14 14:22:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.7, 300 sec: 3262.9). Total num frames: 4046848. Throughput: 0: 793.1. Samples: 1012196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:22:00,146][00372] Avg episode reward: [(0, '21.497')] +[2023-03-14 14:22:01,428][13200] Updated weights for policy 0, policy_version 990 (0.0023) +[2023-03-14 14:22:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4067328. Throughput: 0: 791.7. Samples: 1015312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:22:05,146][00372] Avg episode reward: [(0, '20.983')] +[2023-03-14 14:22:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4083712. Throughput: 0: 823.6. Samples: 1020760. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:22:10,147][00372] Avg episode reward: [(0, '19.705')] +[2023-03-14 14:22:14,565][13200] Updated weights for policy 0, policy_version 1000 (0.0021) +[2023-03-14 14:22:15,145][00372] Fps is (10 sec: 2866.6, 60 sec: 3208.4, 300 sec: 3262.9). Total num frames: 4096000. Throughput: 0: 822.2. Samples: 1024730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-03-14 14:22:15,153][00372] Avg episode reward: [(0, '20.227')] +[2023-03-14 14:22:20,148][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3263.0). Total num frames: 4108288. Throughput: 0: 818.7. Samples: 1026728. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:20,150][00372] Avg episode reward: [(0, '19.321')] +[2023-03-14 14:22:25,143][00372] Fps is (10 sec: 3277.5, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4128768. Throughput: 0: 790.5. Samples: 1031704. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:25,145][00372] Avg episode reward: [(0, '19.285')] +[2023-03-14 14:22:26,635][13200] Updated weights for policy 0, policy_version 1010 (0.0028) +[2023-03-14 14:22:30,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 4149248. Throughput: 0: 814.2. Samples: 1037982. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:30,150][00372] Avg episode reward: [(0, '19.720')] +[2023-03-14 14:22:30,160][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth... +[2023-03-14 14:22:30,290][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000822_3366912.pth +[2023-03-14 14:22:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4165632. Throughput: 0: 826.9. Samples: 1040470. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:35,145][00372] Avg episode reward: [(0, '19.750')] +[2023-03-14 14:22:39,723][13200] Updated weights for policy 0, policy_version 1020 (0.0014) +[2023-03-14 14:22:40,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4177920. Throughput: 0: 828.6. Samples: 1044378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-03-14 14:22:40,151][00372] Avg episode reward: [(0, '20.221')] +[2023-03-14 14:22:45,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4190208. Throughput: 0: 804.0. Samples: 1048374. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:45,150][00372] Avg episode reward: [(0, '20.149')] +[2023-03-14 14:22:50,143][00372] Fps is (10 sec: 3277.0, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4210688. Throughput: 0: 798.0. Samples: 1051224. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-03-14 14:22:50,145][00372] Avg episode reward: [(0, '19.594')] +[2023-03-14 14:22:51,425][13200] Updated weights for policy 0, policy_version 1030 (0.0026) +[2023-03-14 14:22:55,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4231168. Throughput: 0: 813.5. Samples: 1057366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:22:55,145][00372] Avg episode reward: [(0, '19.841')] +[2023-03-14 14:23:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4247552. Throughput: 0: 829.5. Samples: 1062054. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-03-14 14:23:00,150][00372] Avg episode reward: [(0, '20.467')] +[2023-03-14 14:23:04,800][13200] Updated weights for policy 0, policy_version 1040 (0.0020) +[2023-03-14 14:23:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4259840. Throughput: 0: 827.6. Samples: 1063970. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:23:05,147][00372] Avg episode reward: [(0, '20.312')] +[2023-03-14 14:23:10,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4272128. Throughput: 0: 804.6. Samples: 1067910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:23:10,146][00372] Avg episode reward: [(0, '19.500')] +[2023-03-14 14:23:15,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.9, 300 sec: 3262.9). Total num frames: 4292608. Throughput: 0: 790.3. Samples: 1073544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:23:15,145][00372] Avg episode reward: [(0, '19.971')] +[2023-03-14 14:23:16,552][13200] Updated weights for policy 0, policy_version 1050 (0.0044) +[2023-03-14 14:23:20,145][00372] Fps is (10 sec: 4095.0, 60 sec: 3413.2, 300 sec: 3262.9). Total num frames: 4313088. Throughput: 0: 805.1. Samples: 1076702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:23:20,148][00372] Avg episode reward: [(0, '21.662')] +[2023-03-14 14:23:25,155][00372] Fps is (10 sec: 3682.0, 60 sec: 3344.4, 300 sec: 3262.8). Total num frames: 4329472. Throughput: 0: 829.4. Samples: 1081712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:23:25,160][00372] Avg episode reward: [(0, '21.603')] +[2023-03-14 14:23:29,925][13200] Updated weights for policy 0, policy_version 1060 (0.0033) +[2023-03-14 14:23:30,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4341760. Throughput: 0: 828.7. Samples: 1085664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:23:30,148][00372] Avg episode reward: [(0, '21.446')] +[2023-03-14 14:23:35,143][00372] Fps is (10 sec: 2460.5, 60 sec: 3140.2, 300 sec: 3262.9). Total num frames: 4354048. Throughput: 0: 811.1. Samples: 1087722. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:23:35,153][00372] Avg episode reward: [(0, '22.320')] +[2023-03-14 14:23:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4374528. Throughput: 0: 796.8. Samples: 1093220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:23:40,151][00372] Avg episode reward: [(0, '22.747')] +[2023-03-14 14:23:41,361][13200] Updated weights for policy 0, policy_version 1070 (0.0014) +[2023-03-14 14:23:45,143][00372] Fps is (10 sec: 4096.2, 60 sec: 3413.4, 300 sec: 3262.9). Total num frames: 4395008. Throughput: 0: 832.8. Samples: 1099530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:23:45,147][00372] Avg episode reward: [(0, '21.084')] +[2023-03-14 14:23:50,146][00372] Fps is (10 sec: 3275.7, 60 sec: 3276.6, 300 sec: 3249.0). Total num frames: 4407296. Throughput: 0: 834.8. Samples: 1101538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:23:50,149][00372] Avg episode reward: [(0, '21.179')] +[2023-03-14 14:23:54,976][13200] Updated weights for policy 0, policy_version 1080 (0.0025) +[2023-03-14 14:23:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4423680. Throughput: 0: 835.7. Samples: 1105516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:23:55,151][00372] Avg episode reward: [(0, '22.660')] +[2023-03-14 14:24:00,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4435968. Throughput: 0: 800.2. Samples: 1109552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-03-14 14:24:00,146][00372] Avg episode reward: [(0, '21.523')] +[2023-03-14 14:24:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4456448. Throughput: 0: 798.9. Samples: 1112650. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-03-14 14:24:05,145][00372] Avg episode reward: [(0, '20.522')] +[2023-03-14 14:24:06,391][13200] Updated weights for policy 0, policy_version 1090 (0.0020) +[2023-03-14 14:24:10,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 4476928. Throughput: 0: 823.7. Samples: 1118770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:24:10,151][00372] Avg episode reward: [(0, '21.747')] +[2023-03-14 14:24:15,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4489216. Throughput: 0: 823.5. Samples: 1122720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:24:15,152][00372] Avg episode reward: [(0, '22.149')] +[2023-03-14 14:24:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3262.9). Total num frames: 4501504. Throughput: 0: 822.3. Samples: 1124726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-03-14 14:24:20,148][00372] Avg episode reward: [(0, '22.304')] +[2023-03-14 14:24:20,803][13200] Updated weights for policy 0, policy_version 1100 (0.0033) +[2023-03-14 14:24:25,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.9, 300 sec: 3262.9). Total num frames: 4517888. Throughput: 0: 788.4. Samples: 1128700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-03-14 14:24:25,151][00372] Avg episode reward: [(0, '20.465')] +[2023-03-14 14:24:30,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4538368. Throughput: 0: 784.7. Samples: 1134840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-03-14 14:24:30,151][00372] Avg episode reward: [(0, '21.047')] +[2023-03-14 14:24:30,167][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001108_4538368.pth... 
+[2023-03-14 14:24:30,304][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000918_3760128.pth
+[2023-03-14 14:24:31,920][13200] Updated weights for policy 0, policy_version 1110 (0.0014)
+[2023-03-14 14:24:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 4554752. Throughput: 0: 807.1. Samples: 1137854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:24:35,146][00372] Avg episode reward: [(0, '21.064')]
+[2023-03-14 14:24:40,144][00372] Fps is (10 sec: 3276.4, 60 sec: 3276.7, 300 sec: 3262.9). Total num frames: 4571136. Throughput: 0: 817.7. Samples: 1142312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:24:40,151][00372] Avg episode reward: [(0, '22.806')]
+[2023-03-14 14:24:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4583424. Throughput: 0: 816.4. Samples: 1146290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:24:45,151][00372] Avg episode reward: [(0, '22.617')]
+[2023-03-14 14:24:46,216][13200] Updated weights for policy 0, policy_version 1120 (0.0018)
+[2023-03-14 14:24:50,143][00372] Fps is (10 sec: 2867.6, 60 sec: 3208.7, 300 sec: 3262.9). Total num frames: 4599808. Throughput: 0: 793.9. Samples: 1148374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:24:50,152][00372] Avg episode reward: [(0, '21.802')]
+[2023-03-14 14:24:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4620288. Throughput: 0: 791.0. Samples: 1154366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:24:55,151][00372] Avg episode reward: [(0, '22.051')]
+[2023-03-14 14:24:56,685][13200] Updated weights for policy 0, policy_version 1130 (0.0029)
+[2023-03-14 14:25:00,148][00372] Fps is (10 sec: 3684.4, 60 sec: 3344.8, 300 sec: 3249.0). Total num frames: 4636672. Throughput: 0: 831.6. Samples: 1160148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:25:00,150][00372] Avg episode reward: [(0, '22.586')]
+[2023-03-14 14:25:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4653056. Throughput: 0: 831.0. Samples: 1162120. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:25:05,147][00372] Avg episode reward: [(0, '21.852')]
+[2023-03-14 14:25:10,143][00372] Fps is (10 sec: 2868.6, 60 sec: 3140.2, 300 sec: 3262.9). Total num frames: 4665344. Throughput: 0: 829.6. Samples: 1166034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:25:10,147][00372] Avg episode reward: [(0, '20.633')]
+[2023-03-14 14:25:11,188][13200] Updated weights for policy 0, policy_version 1140 (0.0030)
+[2023-03-14 14:25:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4681728. Throughput: 0: 795.8. Samples: 1170652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:25:15,147][00372] Avg episode reward: [(0, '21.372')]
+[2023-03-14 14:25:20,143][00372] Fps is (10 sec: 3686.6, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4702208. Throughput: 0: 798.3. Samples: 1173776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:25:20,146][00372] Avg episode reward: [(0, '22.690')]
+[2023-03-14 14:25:21,751][13200] Updated weights for policy 0, policy_version 1150 (0.0015)
+[2023-03-14 14:25:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 4718592. Throughput: 0: 834.3. Samples: 1179856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:25:25,150][00372] Avg episode reward: [(0, '21.794')]
+[2023-03-14 14:25:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 4730880. Throughput: 0: 833.2. Samples: 1183786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:25:30,147][00372] Avg episode reward: [(0, '21.458')]
+[2023-03-14 14:25:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4747264. Throughput: 0: 831.5. Samples: 1185792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:25:35,147][00372] Avg episode reward: [(0, '22.316')]
+[2023-03-14 14:25:36,509][13200] Updated weights for policy 0, policy_version 1160 (0.0017)
+[2023-03-14 14:25:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3262.9). Total num frames: 4763648. Throughput: 0: 795.4. Samples: 1190158. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:25:40,146][00372] Avg episode reward: [(0, '22.188')]
+[2023-03-14 14:25:45,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4784128. Throughput: 0: 805.6. Samples: 1196394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:25:45,145][00372] Avg episode reward: [(0, '21.684')]
+[2023-03-14 14:25:46,755][13200] Updated weights for policy 0, policy_version 1170 (0.0029)
+[2023-03-14 14:25:50,149][00372] Fps is (10 sec: 3684.1, 60 sec: 3344.7, 300 sec: 3262.8). Total num frames: 4800512. Throughput: 0: 827.7. Samples: 1199372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:25:50,155][00372] Avg episode reward: [(0, '22.007')]
+[2023-03-14 14:25:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 4812800. Throughput: 0: 828.4. Samples: 1203312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:25:55,145][00372] Avg episode reward: [(0, '22.908')]
+[2023-03-14 14:26:00,143][00372] Fps is (10 sec: 2459.1, 60 sec: 3140.5, 300 sec: 3249.0). Total num frames: 4825088. Throughput: 0: 813.1. Samples: 1207242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:26:00,151][00372] Avg episode reward: [(0, '23.199')]
+[2023-03-14 14:26:01,865][13200] Updated weights for policy 0, policy_version 1180 (0.0030)
+[2023-03-14 14:26:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4845568. Throughput: 0: 795.3. Samples: 1209566. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:26:05,151][00372] Avg episode reward: [(0, '23.918')]
+[2023-03-14 14:26:05,153][13187] Saving new best policy, reward=23.918!
+[2023-03-14 14:26:10,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4866048. Throughput: 0: 797.7. Samples: 1215754. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:26:10,149][00372] Avg episode reward: [(0, '24.095')]
+[2023-03-14 14:26:10,171][13187] Saving new best policy, reward=24.095!
+[2023-03-14 14:26:12,303][13200] Updated weights for policy 0, policy_version 1190 (0.0018)
+[2023-03-14 14:26:15,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 4878336. Throughput: 0: 821.7. Samples: 1220762. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:26:15,146][00372] Avg episode reward: [(0, '23.553')]
+[2023-03-14 14:26:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 4894720. Throughput: 0: 821.7. Samples: 1222768. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
+[2023-03-14 14:26:20,149][00372] Avg episode reward: [(0, '22.819')]
+[2023-03-14 14:26:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.1). Total num frames: 4907008. Throughput: 0: 813.1. Samples: 1226746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-03-14 14:26:25,151][00372] Avg episode reward: [(0, '21.228')]
+[2023-03-14 14:26:26,897][13200] Updated weights for policy 0, policy_version 1200 (0.0018)
+[2023-03-14 14:26:30,144][00372] Fps is (10 sec: 3276.3, 60 sec: 3276.7, 300 sec: 3262.9). Total num frames: 4927488. Throughput: 0: 793.9. Samples: 1232120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:26:30,147][00372] Avg episode reward: [(0, '21.901')]
+[2023-03-14 14:26:30,158][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001203_4927488.pth...
+[2023-03-14 14:26:30,285][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth
+[2023-03-14 14:26:35,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4947968. Throughput: 0: 794.2. Samples: 1235108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:26:35,152][00372] Avg episode reward: [(0, '20.714')]
+[2023-03-14 14:26:37,042][13200] Updated weights for policy 0, policy_version 1210 (0.0036)
+[2023-03-14 14:26:40,152][00372] Fps is (10 sec: 3683.6, 60 sec: 3344.6, 300 sec: 3262.8). Total num frames: 4964352. Throughput: 0: 825.7. Samples: 1240474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:26:40,154][00372] Avg episode reward: [(0, '21.705')]
+[2023-03-14 14:26:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4976640. Throughput: 0: 824.9. Samples: 1244362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:26:45,150][00372] Avg episode reward: [(0, '22.097')]
+[2023-03-14 14:26:50,143][00372] Fps is (10 sec: 2459.7, 60 sec: 3140.6, 300 sec: 3249.0). Total num frames: 4988928. Throughput: 0: 815.8. Samples: 1246278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:26:50,147][00372] Avg episode reward: [(0, '21.970')]
+[2023-03-14 14:26:52,116][13200] Updated weights for policy 0, policy_version 1220 (0.0022)
+[2023-03-14 14:26:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5009408. Throughput: 0: 793.5. Samples: 1251462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:26:55,149][00372] Avg episode reward: [(0, '22.552')]
+[2023-03-14 14:27:00,143][00372] Fps is (10 sec: 4096.2, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 5029888. Throughput: 0: 820.3. Samples: 1257674. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:27:00,146][00372] Avg episode reward: [(0, '24.695')]
+[2023-03-14 14:27:00,161][13187] Saving new best policy, reward=24.695!
+[2023-03-14 14:27:02,688][13200] Updated weights for policy 0, policy_version 1230 (0.0013)
+[2023-03-14 14:27:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5042176. Throughput: 0: 824.8. Samples: 1259884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:27:05,146][00372] Avg episode reward: [(0, '23.962')]
+[2023-03-14 14:27:10,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5054464. Throughput: 0: 822.5. Samples: 1263758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:27:10,146][00372] Avg episode reward: [(0, '25.158')]
+[2023-03-14 14:27:10,168][13187] Saving new best policy, reward=25.158!
+[2023-03-14 14:27:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5066752. Throughput: 0: 791.8. Samples: 1267750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:27:15,149][00372] Avg episode reward: [(0, '24.751')]
+[2023-03-14 14:27:17,135][13200] Updated weights for policy 0, policy_version 1240 (0.0017)
+[2023-03-14 14:27:20,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5091328. Throughput: 0: 790.3. Samples: 1270672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:27:20,150][00372] Avg episode reward: [(0, '23.312')]
+[2023-03-14 14:27:25,143][00372] Fps is (10 sec: 4505.4, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 5111808. Throughput: 0: 812.2. Samples: 1277018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:27:25,148][00372] Avg episode reward: [(0, '24.556')]
+[2023-03-14 14:27:28,076][13200] Updated weights for policy 0, policy_version 1250 (0.0028)
+[2023-03-14 14:27:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3249.0). Total num frames: 5124096. Throughput: 0: 827.0. Samples: 1281576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:27:30,147][00372] Avg episode reward: [(0, '22.490')]
+[2023-03-14 14:27:35,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3249.0). Total num frames: 5136384. Throughput: 0: 828.0. Samples: 1283536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:27:35,149][00372] Avg episode reward: [(0, '22.414')]
+[2023-03-14 14:27:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.7, 300 sec: 3262.9). Total num frames: 5152768. Throughput: 0: 803.1. Samples: 1287600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:27:40,150][00372] Avg episode reward: [(0, '20.182')]
+[2023-03-14 14:27:42,028][13200] Updated weights for policy 0, policy_version 1260 (0.0025)
+[2023-03-14 14:27:45,143][00372] Fps is (10 sec: 3686.7, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5173248. Throughput: 0: 795.3. Samples: 1293462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:27:45,149][00372] Avg episode reward: [(0, '20.726')]
+[2023-03-14 14:27:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5189632. Throughput: 0: 813.6. Samples: 1296494. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:27:50,145][00372] Avg episode reward: [(0, '20.565')]
+[2023-03-14 14:27:53,318][13200] Updated weights for policy 0, policy_version 1270 (0.0015)
+[2023-03-14 14:27:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5206016. Throughput: 0: 832.1. Samples: 1301202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:27:55,145][00372] Avg episode reward: [(0, '20.306')]
+[2023-03-14 14:28:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5218304. Throughput: 0: 832.9. Samples: 1305230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:28:00,150][00372] Avg episode reward: [(0, '21.171')]
+[2023-03-14 14:28:05,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5234688. Throughput: 0: 812.8. Samples: 1307250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:28:05,145][00372] Avg episode reward: [(0, '21.338')]
+[2023-03-14 14:28:06,752][13200] Updated weights for policy 0, policy_version 1280 (0.0036)
+[2023-03-14 14:28:10,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5255168. Throughput: 0: 798.7. Samples: 1312958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:28:10,148][00372] Avg episode reward: [(0, '21.994')]
+[2023-03-14 14:28:15,150][00372] Fps is (10 sec: 4093.4, 60 sec: 3481.2, 300 sec: 3262.9). Total num frames: 5275648. Throughput: 0: 832.2. Samples: 1319030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:28:15,153][00372] Avg episode reward: [(0, '22.574')]
+[2023-03-14 14:28:18,137][13200] Updated weights for policy 0, policy_version 1290 (0.0016)
+[2023-03-14 14:28:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.2). Total num frames: 5287936. Throughput: 0: 831.7. Samples: 1320960. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:28:20,148][00372] Avg episode reward: [(0, '22.764')]
+[2023-03-14 14:28:25,143][00372] Fps is (10 sec: 2459.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5300224. Throughput: 0: 828.4. Samples: 1324876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:28:25,152][00372] Avg episode reward: [(0, '22.168')]
+[2023-03-14 14:28:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5316608. Throughput: 0: 793.6. Samples: 1329172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:28:30,151][00372] Avg episode reward: [(0, '20.893')]
+[2023-03-14 14:28:30,162][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001298_5316608.pth...
+[2023-03-14 14:28:30,302][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001108_4538368.pth
+[2023-03-14 14:28:32,143][13200] Updated weights for policy 0, policy_version 1300 (0.0014)
+[2023-03-14 14:28:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5337088. Throughput: 0: 794.9. Samples: 1332264. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
+[2023-03-14 14:28:35,146][00372] Avg episode reward: [(0, '20.645')]
+[2023-03-14 14:28:40,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5353472. Throughput: 0: 830.4. Samples: 1338570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:28:40,145][00372] Avg episode reward: [(0, '20.926')]
+[2023-03-14 14:28:43,934][13200] Updated weights for policy 0, policy_version 1310 (0.0016)
+[2023-03-14 14:28:45,149][00372] Fps is (10 sec: 2865.5, 60 sec: 3208.2, 300 sec: 3249.0). Total num frames: 5365760. Throughput: 0: 828.9. Samples: 1342536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:28:45,154][00372] Avg episode reward: [(0, '20.029')]
+[2023-03-14 14:28:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5382144. Throughput: 0: 827.7. Samples: 1344498. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:28:50,151][00372] Avg episode reward: [(0, '21.027')]
+[2023-03-14 14:28:55,143][00372] Fps is (10 sec: 3278.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5398528. Throughput: 0: 794.2. Samples: 1348696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:28:55,148][00372] Avg episode reward: [(0, '22.142')]
+[2023-03-14 14:28:56,978][13200] Updated weights for policy 0, policy_version 1320 (0.0019)
+[2023-03-14 14:29:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5419008. Throughput: 0: 797.7. Samples: 1354920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:29:00,146][00372] Avg episode reward: [(0, '22.430')]
+[2023-03-14 14:29:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5435392. Throughput: 0: 824.1. Samples: 1358046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:29:05,145][00372] Avg episode reward: [(0, '24.843')]
+[2023-03-14 14:29:08,847][13200] Updated weights for policy 0, policy_version 1330 (0.0026)
+[2023-03-14 14:29:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5447680. Throughput: 0: 827.7. Samples: 1362124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:29:10,150][00372] Avg episode reward: [(0, '25.161')]
+[2023-03-14 14:29:10,167][13187] Saving new best policy, reward=25.161!
+[2023-03-14 14:29:15,144][00372] Fps is (10 sec: 2457.4, 60 sec: 3072.3, 300 sec: 3249.0). Total num frames: 5459968. Throughput: 0: 812.7. Samples: 1365744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:29:15,148][00372] Avg episode reward: [(0, '25.140')]
+[2023-03-14 14:29:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5476352. Throughput: 0: 788.8. Samples: 1367758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:29:20,153][00372] Avg episode reward: [(0, '25.306')]
+[2023-03-14 14:29:20,167][13187] Saving new best policy, reward=25.306!
+[2023-03-14 14:29:22,345][13200] Updated weights for policy 0, policy_version 1340 (0.0023)
+[2023-03-14 14:29:25,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5496832. Throughput: 0: 784.1. Samples: 1373856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:29:25,145][00372] Avg episode reward: [(0, '25.035')]
+[2023-03-14 14:29:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5517312. Throughput: 0: 820.2. Samples: 1379438. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
+[2023-03-14 14:29:30,151][00372] Avg episode reward: [(0, '24.574')]
+[2023-03-14 14:29:34,868][13200] Updated weights for policy 0, policy_version 1350 (0.0019)
+[2023-03-14 14:29:35,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 5529600. Throughput: 0: 818.9. Samples: 1381348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:29:35,148][00372] Avg episode reward: [(0, '24.586')]
+[2023-03-14 14:29:40,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5541888. Throughput: 0: 815.0. Samples: 1385372. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:29:40,145][00372] Avg episode reward: [(0, '23.317')]
+[2023-03-14 14:29:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.9, 300 sec: 3249.0). Total num frames: 5558272. Throughput: 0: 783.9. Samples: 1390196. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
+[2023-03-14 14:29:45,149][00372] Avg episode reward: [(0, '24.084')]
+[2023-03-14 14:29:47,447][13200] Updated weights for policy 0, policy_version 1360 (0.0018)
+[2023-03-14 14:29:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5578752. Throughput: 0: 782.6. Samples: 1393264. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:29:50,149][00372] Avg episode reward: [(0, '23.874')]
+[2023-03-14 14:29:55,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3276.7, 300 sec: 3249.1). Total num frames: 5595136. Throughput: 0: 820.8. Samples: 1399062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:29:55,147][00372] Avg episode reward: [(0, '23.975')]
+[2023-03-14 14:30:00,052][13200] Updated weights for policy 0, policy_version 1370 (0.0015)
+[2023-03-14 14:30:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5611520. Throughput: 0: 827.9. Samples: 1403000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:30:00,149][00372] Avg episode reward: [(0, '23.838')]
+[2023-03-14 14:30:05,143][00372] Fps is (10 sec: 2867.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5623808. Throughput: 0: 826.7. Samples: 1404960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:30:05,152][00372] Avg episode reward: [(0, '23.508')]
+[2023-03-14 14:30:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5640192. Throughput: 0: 795.4. Samples: 1409648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:30:10,151][00372] Avg episode reward: [(0, '22.751')]
+[2023-03-14 14:30:12,340][13200] Updated weights for policy 0, policy_version 1380 (0.0024)
+[2023-03-14 14:30:15,143][00372] Fps is (10 sec: 3686.2, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5660672. Throughput: 0: 808.6. Samples: 1415826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:30:15,153][00372] Avg episode reward: [(0, '21.424')]
+[2023-03-14 14:30:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5677056. Throughput: 0: 828.0. Samples: 1418610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:30:20,148][00372] Avg episode reward: [(0, '20.753')]
+[2023-03-14 14:30:25,145][00372] Fps is (10 sec: 2866.6, 60 sec: 3208.4, 300 sec: 3249.0). Total num frames: 5689344. Throughput: 0: 825.7. Samples: 1422532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:30:25,150][00372] Avg episode reward: [(0, '20.196')]
+[2023-03-14 14:30:25,600][13200] Updated weights for policy 0, policy_version 1390 (0.0022)
+[2023-03-14 14:30:30,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 5701632. Throughput: 0: 805.3. Samples: 1426436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:30:30,151][00372] Avg episode reward: [(0, '19.434')]
+[2023-03-14 14:30:30,165][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001392_5701632.pth...
+[2023-03-14 14:30:30,361][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001203_4927488.pth
+[2023-03-14 14:30:35,143][00372] Fps is (10 sec: 3277.6, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5722112. Throughput: 0: 792.4. Samples: 1428924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:30:35,145][00372] Avg episode reward: [(0, '20.116')]
+[2023-03-14 14:30:37,500][13200] Updated weights for policy 0, policy_version 1400 (0.0044)
+[2023-03-14 14:30:40,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5742592. Throughput: 0: 802.9. Samples: 1435190. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:30:40,145][00372] Avg episode reward: [(0, '20.498')]
+[2023-03-14 14:30:45,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3345.0, 300 sec: 3249.1). Total num frames: 5758976. Throughput: 0: 825.0. Samples: 1440128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:30:45,148][00372] Avg episode reward: [(0, '19.854')]
+[2023-03-14 14:30:50,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.4, 300 sec: 3249.0). Total num frames: 5771264. Throughput: 0: 825.2. Samples: 1442096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:30:50,153][00372] Avg episode reward: [(0, '19.707')]
+[2023-03-14 14:30:51,024][13200] Updated weights for policy 0, policy_version 1410 (0.0025)
+[2023-03-14 14:30:55,143][00372] Fps is (10 sec: 2457.9, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5783552. Throughput: 0: 808.0. Samples: 1446006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
+[2023-03-14 14:30:55,155][00372] Avg episode reward: [(0, '21.100')]
+[2023-03-14 14:31:00,143][00372] Fps is (10 sec: 3277.9, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5804032. Throughput: 0: 793.4. Samples: 1451528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:31:00,145][00372] Avg episode reward: [(0, '20.707')]
+[2023-03-14 14:31:02,453][13200] Updated weights for policy 0, policy_version 1420 (0.0026)
+[2023-03-14 14:31:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5824512. Throughput: 0: 801.0. Samples: 1454654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:31:05,145][00372] Avg episode reward: [(0, '20.869')]
+[2023-03-14 14:31:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5840896. Throughput: 0: 827.2. Samples: 1459752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:31:10,147][00372] Avg episode reward: [(0, '20.677')]
+[2023-03-14 14:31:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 5853184. Throughput: 0: 825.6. Samples: 1463586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:31:15,146][00372] Avg episode reward: [(0, '21.012')]
+[2023-03-14 14:31:16,330][13200] Updated weights for policy 0, policy_version 1430 (0.0044)
+[2023-03-14 14:31:20,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3249.0). Total num frames: 5865472. Throughput: 0: 814.1. Samples: 1465560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:31:20,145][00372] Avg episode reward: [(0, '21.364')]
+[2023-03-14 14:31:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.9, 300 sec: 3249.0). Total num frames: 5885952. Throughput: 0: 793.4. Samples: 1470892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:31:25,152][00372] Avg episode reward: [(0, '19.767')]
+[2023-03-14 14:31:27,492][13200] Updated weights for policy 0, policy_version 1440 (0.0014)
+[2023-03-14 14:31:30,145][00372] Fps is (10 sec: 4095.2, 60 sec: 3413.2, 300 sec: 3249.0). Total num frames: 5906432. Throughput: 0: 823.7. Samples: 1477194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:31:30,152][00372] Avg episode reward: [(0, '20.242')]
+[2023-03-14 14:31:35,149][00372] Fps is (10 sec: 3274.7, 60 sec: 3276.4, 300 sec: 3235.2). Total num frames: 5918720. Throughput: 0: 826.4. Samples: 1479286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:31:35,155][00372] Avg episode reward: [(0, '20.704')]
+[2023-03-14 14:31:40,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5935104. Throughput: 0: 829.2. Samples: 1483318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
+[2023-03-14 14:31:40,146][00372] Avg episode reward: [(0, '20.893')]
+[2023-03-14 14:31:41,642][13200] Updated weights for policy 0, policy_version 1450 (0.0020)
+[2023-03-14 14:31:45,143][00372] Fps is (10 sec: 2869.0, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5947392. Throughput: 0: 800.4. Samples: 1487548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
+[2023-03-14 14:31:45,146][00372] Avg episode reward: [(0, '22.108')]
+[2023-03-14 14:31:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3277.0, 300 sec: 3249.0). Total num frames: 5967872. Throughput: 0: 801.0. Samples: 1490698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:31:50,146][00372] Avg episode reward: [(0, '22.192')]
+[2023-03-14 14:31:52,073][13200] Updated weights for policy 0, policy_version 1460 (0.0028)
+[2023-03-14 14:31:55,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 5988352. Throughput: 0: 828.5. Samples: 1497036. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
+[2023-03-14 14:31:55,146][00372] Avg episode reward: [(0, '22.800')]
+[2023-03-14 14:32:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 6000640. Throughput: 0: 833.1. Samples: 1501074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
+[2023-03-14 14:32:00,147][00372] Avg episode reward: [(0, '23.851')]
+[2023-03-14 14:32:00,600][13187] Stopping Batcher_0...
+[2023-03-14 14:32:00,601][13187] Loop batcher_evt_loop terminating...
+[2023-03-14 14:32:00,602][00372] Component Batcher_0 stopped!
+[2023-03-14 14:32:00,605][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+[2023-03-14 14:32:00,694][13200] Weights refcount: 2 0
+[2023-03-14 14:32:00,701][00372] Component InferenceWorker_p0-w0 stopped!
+[2023-03-14 14:32:00,704][13200] Stopping InferenceWorker_p0-w0...
+[2023-03-14 14:32:00,704][13200] Loop inference_proc0-0_evt_loop terminating...
+[2023-03-14 14:32:00,750][00372] Component RolloutWorker_w1 stopped!
+[2023-03-14 14:32:00,752][13202] Stopping RolloutWorker_w1...
+[2023-03-14 14:32:00,758][00372] Component RolloutWorker_w3 stopped!
+[2023-03-14 14:32:00,760][13208] Stopping RolloutWorker_w3...
+[2023-03-14 14:32:00,764][13208] Loop rollout_proc3_evt_loop terminating...
+[2023-03-14 14:32:00,766][13202] Loop rollout_proc1_evt_loop terminating...
+[2023-03-14 14:32:00,768][00372] Component RolloutWorker_w5 stopped!
+[2023-03-14 14:32:00,770][13210] Stopping RolloutWorker_w5...
+[2023-03-14 14:32:00,771][13210] Loop rollout_proc5_evt_loop terminating...
+[2023-03-14 14:32:00,801][00372] Component RolloutWorker_w7 stopped!
+[2023-03-14 14:32:00,803][13209] Stopping RolloutWorker_w7...
+[2023-03-14 14:32:00,803][13209] Loop rollout_proc7_evt_loop terminating...
+[2023-03-14 14:32:00,826][00372] Component RolloutWorker_w6 stopped!
+[2023-03-14 14:32:00,839][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001298_5316608.pth
+[2023-03-14 14:32:00,826][13211] Stopping RolloutWorker_w6...
+[2023-03-14 14:32:00,856][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+[2023-03-14 14:32:00,867][13204] Stopping RolloutWorker_w2...
+[2023-03-14 14:32:00,867][00372] Component RolloutWorker_w2 stopped!
+[2023-03-14 14:32:00,876][00372] Component RolloutWorker_w0 stopped!
+[2023-03-14 14:32:00,854][13211] Loop rollout_proc6_evt_loop terminating...
+[2023-03-14 14:32:00,876][13205] Stopping RolloutWorker_w0...
+[2023-03-14 14:32:00,899][13205] Loop rollout_proc0_evt_loop terminating...
+[2023-03-14 14:32:00,868][13204] Loop rollout_proc2_evt_loop terminating...
+[2023-03-14 14:32:00,940][13207] Stopping RolloutWorker_w4...
+[2023-03-14 14:32:00,941][13207] Loop rollout_proc4_evt_loop terminating...
+[2023-03-14 14:32:00,934][00372] Component RolloutWorker_w4 stopped!
+[2023-03-14 14:32:01,221][13187] Stopping LearnerWorker_p0...
+[2023-03-14 14:32:01,221][13187] Loop learner_proc0_evt_loop terminating...
+[2023-03-14 14:32:01,220][00372] Component LearnerWorker_p0 stopped!
+[2023-03-14 14:32:01,223][00372] Waiting for process learner_proc0 to stop...
+[2023-03-14 14:32:03,670][00372] Waiting for process inference_proc0-0 to join...
+[2023-03-14 14:32:04,191][00372] Waiting for process rollout_proc0 to join...
+[2023-03-14 14:32:05,111][00372] Waiting for process rollout_proc1 to join...
+[2023-03-14 14:32:05,115][00372] Waiting for process rollout_proc2 to join...
+[2023-03-14 14:32:05,124][00372] Waiting for process rollout_proc3 to join...
+[2023-03-14 14:32:05,128][00372] Waiting for process rollout_proc4 to join...
+[2023-03-14 14:32:05,130][00372] Waiting for process rollout_proc5 to join...
+[2023-03-14 14:32:05,133][00372] Waiting for process rollout_proc6 to join...
+[2023-03-14 14:32:05,135][00372] Waiting for process rollout_proc7 to join...
+[2023-03-14 14:32:05,136][00372] Batcher 0 profile tree view:
+batching: 43.1679, releasing_batches: 0.0384
+[2023-03-14 14:32:05,140][00372] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
- wait_policy_total: 617.1554
-update_model: 8.1702
- weight_update: 0.0014
-one_step: 0.0022
- handle_policy_step: 525.3543
- deserialize: 15.4791, stack: 2.9672, obs_to_device_normalize: 115.8456, forward: 251.8045, send_messages: 28.2498
- prepare_outputs: 83.5445
- to_cpu: 51.7867
-[2023-03-12 07:42:12,445][00184] Learner 0 profile tree view:
-misc: 0.0066, prepare_batch: 17.7406
-train: 76.4282
- epoch_init: 0.0227, minibatch_init: 0.0151, losses_postprocess: 0.5834, kl_divergence: 0.5142, after_optimizer: 32.9554
- calculate_losses: 27.1808
- losses_init: 0.0035, forward_head: 1.7143, bptt_initial: 17.8158, tail: 1.0341, advantages_returns: 0.3561, losses: 3.6289
- bptt: 2.3178
- bptt_forward_core: 2.2617
- update: 14.4683
- clip: 1.4112
-[2023-03-12 07:42:12,446][00184] RolloutWorker_w0 profile tree view:
-wait_for_trajectories: 0.3379, enqueue_policy_requests: 180.9792, env_step: 871.3540, overhead: 22.3214, complete_rollouts: 7.4711
-save_policy_outputs: 21.8828
- split_output_tensors: 10.2411
-[2023-03-12 07:42:12,447][00184] RolloutWorker_w7 profile tree view:
-wait_for_trajectories: 0.3746, enqueue_policy_requests: 185.7645, env_step: 869.0301, overhead: 21.4990, complete_rollouts: 7.0329
-save_policy_outputs: 21.4696
- split_output_tensors: 10.7412
-[2023-03-12 07:42:12,449][00184] Loop Runner_EvtLoop terminating...
-[2023-03-12 07:42:12,450][00184] Runner profile tree view:
-main_loop: 1218.3274
-[2023-03-12 07:42:12,452][00184] Collected {0: 4005888}, FPS: 3288.0
-[2023-03-12 07:42:57,364][00184] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-03-12 07:42:57,366][00184] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-03-12 07:42:57,368][00184] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-03-12 07:42:57,371][00184] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-03-12 07:42:57,373][00184] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-03-12 07:42:57,374][00184] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-03-12 07:42:57,378][00184] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
-[2023-03-12 07:42:57,381][00184] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-03-12 07:42:57,383][00184] Adding new argument 'push_to_hub'=False that is not in the saved config file!
-[2023-03-12 07:42:57,385][00184] Adding new argument 'hf_repository'=None that is not in the saved config file!
-[2023-03-12 07:42:57,386][00184] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-03-12 07:42:57,389][00184] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-03-12 07:42:57,390][00184] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-03-12 07:42:57,391][00184] Adding new argument 'enjoy_script'=None that is not in the saved config file!
-[2023-03-12 07:42:57,393][00184] Using frameskip 1 and render_action_repeat=4 for evaluation
-[2023-03-12 07:42:57,415][00184] Doom resolution: 160x120, resize resolution: (128, 72)
-[2023-03-12 07:42:57,418][00184] RunningMeanStd input shape: (3, 72, 128)
-[2023-03-12 07:42:57,421][00184] RunningMeanStd input shape: (1,)
-[2023-03-12 07:42:57,437][00184] ConvEncoder: input_channels=3
-[2023-03-12 07:42:58,098][00184] Conv encoder output size: 512
-[2023-03-12 07:42:58,100][00184] Policy head output size: 512
-[2023-03-12 07:43:02,487][00184] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
-[2023-03-12 07:43:03,731][00184] Num frames 100...
-[2023-03-12 07:43:03,852][00184] Num frames 200...
-[2023-03-12 07:43:03,964][00184] Num frames 300...
-[2023-03-12 07:43:04,088][00184] Num frames 400...
-[2023-03-12 07:43:04,202][00184] Num frames 500...
-[2023-03-12 07:43:04,315][00184] Num frames 600...
-[2023-03-12 07:43:04,425][00184] Num frames 700...
-[2023-03-12 07:43:04,572][00184] Num frames 800...
-[2023-03-12 07:43:04,725][00184] Num frames 900...
-[2023-03-12 07:43:04,880][00184] Num frames 1000...
-[2023-03-12 07:43:05,044][00184] Num frames 1100...
-[2023-03-12 07:43:05,203][00184] Num frames 1200...
-[2023-03-12 07:43:05,355][00184] Num frames 1300...
-[2023-03-12 07:43:05,507][00184] Num frames 1400...
-[2023-03-12 07:43:05,670][00184] Num frames 1500...
-[2023-03-12 07:43:05,829][00184] Num frames 1600...
-[2023-03-12 07:43:05,997][00184] Num frames 1700...
-[2023-03-12 07:43:06,155][00184] Num frames 1800...
-[2023-03-12 07:43:06,321][00184] Num frames 1900...
-[2023-03-12 07:43:06,479][00184] Num frames 2000...
-[2023-03-12 07:43:06,644][00184] Num frames 2100...
-[2023-03-12 07:43:06,700][00184] Avg episode rewards: #0: 53.999, true rewards: #0: 21.000
-[2023-03-12 07:43:06,702][00184] Avg episode reward: 53.999, avg true_objective: 21.000
-[2023-03-12 07:43:06,861][00184] Num frames 2200...
-[2023-03-12 07:43:07,026][00184] Num frames 2300...
-[2023-03-12 07:43:07,191][00184] Num frames 2400...
-[2023-03-12 07:43:07,348][00184] Num frames 2500...
-[2023-03-12 07:43:07,504][00184] Num frames 2600...
-[2023-03-12 07:43:07,670][00184] Num frames 2700...
-[2023-03-12 07:43:07,846][00184] Num frames 2800...
-[2023-03-12 07:43:08,065][00184] Avg episode rewards: #0: 34.445, true rewards: #0: 14.445
-[2023-03-12 07:43:08,067][00184] Avg episode reward: 34.445, avg true_objective: 14.445
-[2023-03-12 07:43:08,086][00184] Num frames 2900...
-[2023-03-12 07:43:08,257][00184] Num frames 3000...
-[2023-03-12 07:43:08,409][00184] Num frames 3100...
-[2023-03-12 07:43:08,558][00184] Num frames 3200...
-[2023-03-12 07:43:08,714][00184] Num frames 3300...
-[2023-03-12 07:43:08,870][00184] Num frames 3400...
-[2023-03-12 07:43:09,029][00184] Num frames 3500...
-[2023-03-12 07:43:09,183][00184] Num frames 3600...
-[2023-03-12 07:43:09,344][00184] Num frames 3700...
-[2023-03-12 07:43:09,497][00184] Num frames 3800...
-[2023-03-12 07:43:09,652][00184] Num frames 3900...
-[2023-03-12 07:43:09,779][00184] Avg episode rewards: #0: 32.150, true rewards: #0: 13.150
-[2023-03-12 07:43:09,782][00184] Avg episode reward: 32.150, avg true_objective: 13.150
-[2023-03-12 07:43:09,871][00184] Num frames 4000...
-[2023-03-12 07:43:10,026][00184] Num frames 4100...
-[2023-03-12 07:43:10,184][00184] Num frames 4200...
-[2023-03-12 07:43:10,348][00184] Num frames 4300...
-[2023-03-12 07:43:10,509][00184] Num frames 4400...
-[2023-03-12 07:43:10,589][00184] Avg episode rewards: #0: 26.282, true rewards: #0: 11.033
-[2023-03-12 07:43:10,591][00184] Avg episode reward: 26.282, avg true_objective: 11.033
-[2023-03-12 07:43:10,739][00184] Num frames 4500...
-[2023-03-12 07:43:10,908][00184] Num frames 4600...
-[2023-03-12 07:43:11,073][00184] Num frames 4700...
-[2023-03-12 07:43:11,243][00184] Num frames 4800...
-[2023-03-12 07:43:11,369][00184] Num frames 4900...
-[2023-03-12 07:43:11,462][00184] Avg episode rewards: #0: 22.864, true rewards: #0: 9.864
-[2023-03-12 07:43:11,463][00184] Avg episode reward: 22.864, avg true_objective: 9.864
-[2023-03-12 07:43:11,544][00184] Num frames 5000...
-[2023-03-12 07:43:11,653][00184] Num frames 5100...
-[2023-03-12 07:43:11,772][00184] Num frames 5200...
-[2023-03-12 07:43:11,889][00184] Num frames 5300...
-[2023-03-12 07:43:12,002][00184] Num frames 5400...
-[2023-03-12 07:43:12,119][00184] Num frames 5500...
-[2023-03-12 07:43:12,228][00184] Num frames 5600...
-[2023-03-12 07:43:12,343][00184] Num frames 5700...
-[2023-03-12 07:43:12,452][00184] Num frames 5800...
-[2023-03-12 07:43:12,559][00184] Num frames 5900...
-[2023-03-12 07:43:12,668][00184] Num frames 6000...
-[2023-03-12 07:43:12,777][00184] Num frames 6100...
-[2023-03-12 07:43:12,886][00184] Num frames 6200...
-[2023-03-12 07:43:12,998][00184] Num frames 6300...
-[2023-03-12 07:43:13,110][00184] Num frames 6400...
-[2023-03-12 07:43:13,258][00184] Avg episode rewards: #0: 24.807, true rewards: #0: 10.807
-[2023-03-12 07:43:13,260][00184] Avg episode reward: 24.807, avg true_objective: 10.807
-[2023-03-12 07:43:13,283][00184] Num frames 6500...
-[2023-03-12 07:43:13,400][00184] Num frames 6600...
-[2023-03-12 07:43:13,510][00184] Num frames 6700...
-[2023-03-12 07:43:13,621][00184] Num frames 6800...
-[2023-03-12 07:43:13,738][00184] Num frames 6900...
-[2023-03-12 07:43:13,845][00184] Num frames 7000...
-[2023-03-12 07:43:13,960][00184] Num frames 7100...
-[2023-03-12 07:43:14,072][00184] Num frames 7200...
-[2023-03-12 07:43:14,184][00184] Num frames 7300...
-[2023-03-12 07:43:14,295][00184] Num frames 7400...
-[2023-03-12 07:43:14,411][00184] Num frames 7500...
-[2023-03-12 07:43:14,523][00184] Num frames 7600...
-[2023-03-12 07:43:14,633][00184] Num frames 7700...
-[2023-03-12 07:43:14,745][00184] Num frames 7800...
-[2023-03-12 07:43:14,859][00184] Num frames 7900...
-[2023-03-12 07:43:14,970][00184] Num frames 8000...
-[2023-03-12 07:43:15,088][00184] Num frames 8100...
-[2023-03-12 07:43:15,229][00184] Avg episode rewards: #0: 27.400, true rewards: #0: 11.686
-[2023-03-12 07:43:15,230][00184] Avg episode reward: 27.400, avg true_objective: 11.686
-[2023-03-12 07:43:15,257][00184] Num frames 8200...
-[2023-03-12 07:43:15,368][00184] Num frames 8300...
-[2023-03-12 07:43:15,478][00184] Num frames 8400...
-[2023-03-12 07:43:15,586][00184] Num frames 8500...
-[2023-03-12 07:43:15,699][00184] Num frames 8600...
-[2023-03-12 07:43:15,815][00184] Num frames 8700...
-[2023-03-12 07:43:15,924][00184] Num frames 8800...
-[2023-03-12 07:43:16,033][00184] Num frames 8900...
-[2023-03-12 07:43:16,142][00184] Num frames 9000...
-[2023-03-12 07:43:16,254][00184] Num frames 9100...
-[2023-03-12 07:43:16,361][00184] Num frames 9200...
-[2023-03-12 07:43:16,484][00184] Num frames 9300...
-[2023-03-12 07:43:16,604][00184] Num frames 9400...
-[2023-03-12 07:43:16,728][00184] Num frames 9500...
-[2023-03-12 07:43:16,850][00184] Avg episode rewards: #0: 28.580, true rewards: #0: 11.955
-[2023-03-12 07:43:16,852][00184] Avg episode reward: 28.580, avg true_objective: 11.955
-[2023-03-12 07:43:16,897][00184] Num frames 9600...
-[2023-03-12 07:43:17,007][00184] Num frames 9700...
-[2023-03-12 07:43:17,117][00184] Num frames 9800...
-[2023-03-12 07:43:17,237][00184] Num frames 9900...
-[2023-03-12 07:43:17,346][00184] Num frames 10000...
-[2023-03-12 07:43:17,463][00184] Num frames 10100...
-[2023-03-12 07:43:17,576][00184] Num frames 10200...
-[2023-03-12 07:43:17,685][00184] Num frames 10300...
-[2023-03-12 07:43:17,802][00184] Num frames 10400...
-[2023-03-12 07:43:17,911][00184] Num frames 10500...
-[2023-03-12 07:43:17,993][00184] Avg episode rewards: #0: 27.693, true rewards: #0: 11.693
-[2023-03-12 07:43:17,995][00184] Avg episode reward: 27.693, avg true_objective: 11.693
-[2023-03-12 07:43:18,081][00184] Num frames 10600...
-[2023-03-12 07:43:18,189][00184] Num frames 10700...
-[2023-03-12 07:43:18,344][00184] Num frames 10800...
-[2023-03-12 07:43:18,502][00184] Num frames 10900...
-[2023-03-12 07:43:18,649][00184] Num frames 11000...
-[2023-03-12 07:43:18,804][00184] Num frames 11100...
-[2023-03-12 07:43:18,957][00184] Num frames 11200...
-[2023-03-12 07:43:19,108][00184] Num frames 11300...
-[2023-03-12 07:43:19,262][00184] Num frames 11400...
-[2023-03-12 07:43:19,420][00184] Num frames 11500...
-[2023-03-12 07:43:19,582][00184] Num frames 11600...
-[2023-03-12 07:43:19,737][00184] Num frames 11700...
-[2023-03-12 07:43:19,888][00184] Num frames 11800...
-[2023-03-12 07:43:20,068][00184] Num frames 11900...
-[2023-03-12 07:43:20,224][00184] Num frames 12000...
-[2023-03-12 07:43:20,381][00184] Num frames 12100...
-[2023-03-12 07:43:20,552][00184] Num frames 12200...
-[2023-03-12 07:43:20,721][00184] Num frames 12300...
-[2023-03-12 07:43:20,831][00184] Avg episode rewards: #0: 29.631, true rewards: #0: 12.331
-[2023-03-12 07:43:20,833][00184] Avg episode reward: 29.631, avg true_objective: 12.331
-[2023-03-12 07:44:38,063][00184] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
-[2023-03-12 07:44:38,393][00184] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-03-12 07:44:38,395][00184] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-03-12 07:44:38,396][00184] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-03-12 07:44:38,398][00184] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-03-12 07:44:38,399][00184] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-03-12 07:44:38,400][00184] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-03-12 07:44:38,401][00184] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
-[2023-03-12 07:44:38,402][00184] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-03-12 07:44:38,403][00184] Adding new argument 'push_to_hub'=True that is not in the saved config file!
-[2023-03-12 07:44:38,404][00184] Adding new argument 'hf_repository'='Kittitouch /rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
-[2023-03-12 07:44:38,405][00184] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-03-12 07:44:38,407][00184] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-03-12 07:44:38,408][00184] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-03-12 07:44:38,409][00184] Adding new argument 'enjoy_script'=None that is not in the saved config file!
-[2023-03-12 07:44:38,410][00184] Using frameskip 1 and render_action_repeat=4 for evaluation
-[2023-03-12 07:44:38,438][00184] RunningMeanStd input shape: (3, 72, 128)
-[2023-03-12 07:44:38,441][00184] RunningMeanStd input shape: (1,)
-[2023-03-12 07:44:38,453][00184] ConvEncoder: input_channels=3
-[2023-03-12 07:44:38,489][00184] Conv encoder output size: 512
-[2023-03-12 07:44:38,491][00184] Policy head output size: 512
-[2023-03-12 07:44:38,511][00184] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
-[2023-03-12 07:44:38,977][00184] Num frames 100...
-[2023-03-12 07:44:39,092][00184] Num frames 200...
-[2023-03-12 07:44:39,210][00184] Num frames 300...
-[2023-03-12 07:44:39,323][00184] Num frames 400...
-[2023-03-12 07:44:39,434][00184] Num frames 500...
-[2023-03-12 07:44:39,547][00184] Num frames 600...
-[2023-03-12 07:44:39,663][00184] Num frames 700...
-[2023-03-12 07:44:39,774][00184] Num frames 800...
-[2023-03-12 07:44:39,888][00184] Num frames 900...
-[2023-03-12 07:44:40,016][00184] Num frames 1000...
-[2023-03-12 07:44:40,133][00184] Num frames 1100...
-[2023-03-12 07:44:40,253][00184] Num frames 1200...
-[2023-03-12 07:44:40,363][00184] Num frames 1300...
-[2023-03-12 07:44:40,434][00184] Avg episode rewards: #0: 32.130, true rewards: #0: 13.130
-[2023-03-12 07:44:40,435][00184] Avg episode reward: 32.130, avg true_objective: 13.130
-[2023-03-12 07:44:40,536][00184] Num frames 1400...
-[2023-03-12 07:44:40,648][00184] Num frames 1500...
-[2023-03-12 07:44:40,766][00184] Num frames 1600...
-[2023-03-12 07:44:40,881][00184] Num frames 1700...
-[2023-03-12 07:44:41,003][00184] Num frames 1800...
-[2023-03-12 07:44:41,121][00184] Num frames 1900...
-[2023-03-12 07:44:41,237][00184] Num frames 2000...
-[2023-03-12 07:44:41,362][00184] Num frames 2100...
-[2023-03-12 07:44:41,472][00184] Num frames 2200...
-[2023-03-12 07:44:41,588][00184] Num frames 2300...
-[2023-03-12 07:44:41,700][00184] Num frames 2400...
-[2023-03-12 07:44:41,814][00184] Num frames 2500...
-[2023-03-12 07:44:41,938][00184] Num frames 2600...
-[2023-03-12 07:44:42,059][00184] Num frames 2700...
-[2023-03-12 07:44:42,170][00184] Num frames 2800...
-[2023-03-12 07:44:42,281][00184] Num frames 2900...
-[2023-03-12 07:44:42,396][00184] Num frames 3000...
-[2023-03-12 07:44:42,516][00184] Num frames 3100...
-[2023-03-12 07:44:42,627][00184] Num frames 3200...
-[2023-03-12 07:44:42,737][00184] Num frames 3300...
-[2023-03-12 07:44:42,845][00184] Num frames 3400...
-[2023-03-12 07:44:42,916][00184] Avg episode rewards: #0: 44.564, true rewards: #0: 17.065
-[2023-03-12 07:44:42,917][00184] Avg episode reward: 44.564, avg true_objective: 17.065
-[2023-03-12 07:44:43,038][00184] Num frames 3500...
-[2023-03-12 07:44:43,157][00184] Num frames 3600...
-[2023-03-12 07:44:43,269][00184] Num frames 3700...
-[2023-03-12 07:44:43,378][00184] Num frames 3800...
-[2023-03-12 07:44:43,488][00184] Num frames 3900...
-[2023-03-12 07:44:43,602][00184] Num frames 4000...
-[2023-03-12 07:44:43,713][00184] Num frames 4100...
-[2023-03-12 07:44:43,824][00184] Num frames 4200...
-[2023-03-12 07:44:43,941][00184] Num frames 4300...
-[2023-03-12 07:44:44,057][00184] Num frames 4400...
-[2023-03-12 07:44:44,170][00184] Num frames 4500...
-[2023-03-12 07:44:44,284][00184] Num frames 4600...
-[2023-03-12 07:44:44,442][00184] Avg episode rewards: #0: 40.643, true rewards: #0: 15.643
-[2023-03-12 07:44:44,444][00184] Avg episode reward: 40.643, avg true_objective: 15.643
-[2023-03-12 07:44:44,454][00184] Num frames 4700...
-[2023-03-12 07:44:44,564][00184] Num frames 4800...
-[2023-03-12 07:44:44,675][00184] Num frames 4900...
-[2023-03-12 07:44:44,787][00184] Num frames 5000...
-[2023-03-12 07:44:44,897][00184] Num frames 5100...
-[2023-03-12 07:44:45,013][00184] Num frames 5200...
-[2023-03-12 07:44:45,127][00184] Num frames 5300...
-[2023-03-12 07:44:45,239][00184] Num frames 5400...
-[2023-03-12 07:44:45,357][00184] Num frames 5500...
-[2023-03-12 07:44:45,466][00184] Num frames 5600...
-[2023-03-12 07:44:45,580][00184] Num frames 5700...
-[2023-03-12 07:44:45,668][00184] Avg episode rewards: #0: 36.060, true rewards: #0: 14.310
-[2023-03-12 07:44:45,670][00184] Avg episode reward: 36.060, avg true_objective: 14.310
-[2023-03-12 07:44:45,753][00184] Num frames 5800...
-[2023-03-12 07:44:45,872][00184] Num frames 5900...
-[2023-03-12 07:44:45,981][00184] Num frames 6000...
-[2023-03-12 07:44:46,095][00184] Num frames 6100...
-[2023-03-12 07:44:46,215][00184] Avg episode rewards: #0: 30.080, true rewards: #0: 12.280
-[2023-03-12 07:44:46,218][00184] Avg episode reward: 30.080, avg true_objective: 12.280
-[2023-03-12 07:44:46,311][00184] Num frames 6200...
-[2023-03-12 07:44:46,462][00184] Num frames 6300...
-[2023-03-12 07:44:46,619][00184] Num frames 6400...
-[2023-03-12 07:44:46,766][00184] Num frames 6500...
-[2023-03-12 07:44:46,914][00184] Num frames 6600...
-[2023-03-12 07:44:47,069][00184] Num frames 6700...
-[2023-03-12 07:44:47,149][00184] Avg episode rewards: #0: 26.860, true rewards: #0: 11.193
-[2023-03-12 07:44:47,151][00184] Avg episode reward: 26.860, avg true_objective: 11.193
-[2023-03-12 07:44:47,292][00184] Num frames 6800...
-[2023-03-12 07:44:47,441][00184] Num frames 6900...
-[2023-03-12 07:44:47,590][00184] Num frames 7000...
-[2023-03-12 07:44:47,752][00184] Num frames 7100...
-[2023-03-12 07:44:47,859][00184] Avg episode rewards: #0: 23.903, true rewards: #0: 10.189
-[2023-03-12 07:44:47,861][00184] Avg episode reward: 23.903, avg true_objective: 10.189
-[2023-03-12 07:44:47,971][00184] Num frames 7200...
-[2023-03-12 07:44:48,132][00184] Num frames 7300...
-[2023-03-12 07:44:48,294][00184] Num frames 7400...
-[2023-03-12 07:44:48,457][00184] Num frames 7500...
-[2023-03-12 07:44:48,624][00184] Num frames 7600...
-[2023-03-12 07:44:48,790][00184] Num frames 7700...
-[2023-03-12 07:44:48,953][00184] Num frames 7800...
-[2023-03-12 07:44:49,120][00184] Num frames 7900...
-[2023-03-12 07:44:49,289][00184] Num frames 8000...
-[2023-03-12 07:44:49,449][00184] Num frames 8100...
-[2023-03-12 07:44:49,572][00184] Num frames 8200...
-[2023-03-12 07:44:49,686][00184] Num frames 8300...
-[2023-03-12 07:44:49,833][00184] Avg episode rewards: #0: 24.600, true rewards: #0: 10.475
-[2023-03-12 07:44:49,834][00184] Avg episode reward: 24.600, avg true_objective: 10.475
-[2023-03-12 07:44:49,862][00184] Num frames 8400...
-[2023-03-12 07:44:49,971][00184] Num frames 8500...
-[2023-03-12 07:44:50,079][00184] Num frames 8600...
-[2023-03-12 07:44:50,192][00184] Num frames 8700...
-[2023-03-12 07:44:50,310][00184] Num frames 8800...
-[2023-03-12 07:44:50,419][00184] Num frames 8900...
-[2023-03-12 07:44:50,530][00184] Num frames 9000...
-[2023-03-12 07:44:50,641][00184] Num frames 9100...
-[2023-03-12 07:44:50,751][00184] Num frames 9200...
-[2023-03-12 07:44:50,867][00184] Num frames 9300...
-[2023-03-12 07:44:50,982][00184] Num frames 9400...
-[2023-03-12 07:44:51,094][00184] Num frames 9500...
-[2023-03-12 07:44:51,223][00184] Avg episode rewards: #0: 24.738, true rewards: #0: 10.627
-[2023-03-12 07:44:51,225][00184] Avg episode reward: 24.738, avg true_objective: 10.627
-[2023-03-12 07:44:51,267][00184] Num frames 9600...
-[2023-03-12 07:44:51,385][00184] Num frames 9700...
-[2023-03-12 07:44:51,493][00184] Num frames 9800...
-[2023-03-12 07:44:51,601][00184] Num frames 9900...
-[2023-03-12 07:44:51,714][00184] Num frames 10000...
-[2023-03-12 07:44:51,821][00184] Num frames 10100...
-[2023-03-12 07:44:51,933][00184] Num frames 10200...
-[2023-03-12 07:44:52,066][00184] Avg episode rewards: #0: 23.568, true rewards: #0: 10.268
-[2023-03-12 07:44:52,067][00184] Avg episode reward: 23.568, avg true_objective: 10.268
-[2023-03-12 07:45:55,244][00184] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
-[2023-03-12 07:47:01,347][00184] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-03-12 07:47:01,349][00184] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-03-12 07:47:01,351][00184] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-03-12 07:47:01,353][00184] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-03-12 07:47:01,355][00184] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-03-12 07:47:01,357][00184] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-03-12 07:47:01,358][00184] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
-[2023-03-12 07:47:01,359][00184] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-03-12 07:47:01,361][00184] Adding new argument 'push_to_hub'=True that is not in the saved config file!
-[2023-03-12 07:47:01,362][00184] Adding new argument 'hf_repository'='Kittitouch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
-[2023-03-12 07:47:01,363][00184] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-03-12 07:47:01,364][00184] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-03-12 07:47:01,365][00184] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-03-12 07:47:01,366][00184] Adding new argument 'enjoy_script'=None that is not in the saved config file!
-[2023-03-12 07:47:01,368][00184] Using frameskip 1 and render_action_repeat=4 for evaluation
-[2023-03-12 07:47:01,393][00184] RunningMeanStd input shape: (3, 72, 128)
-[2023-03-12 07:47:01,395][00184] RunningMeanStd input shape: (1,)
-[2023-03-12 07:47:01,408][00184] ConvEncoder: input_channels=3
-[2023-03-12 07:47:01,452][00184] Conv encoder output size: 512
-[2023-03-12 07:47:01,454][00184] Policy head output size: 512
-[2023-03-12 07:47:01,473][00184] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
-[2023-03-12 07:47:01,928][00184] Num frames 100...
-[2023-03-12 07:47:02,044][00184] Num frames 200...
-[2023-03-12 07:47:02,154][00184] Num frames 300...
-[2023-03-12 07:47:02,263][00184] Num frames 400...
-[2023-03-12 07:47:02,372][00184] Num frames 500...
-[2023-03-12 07:47:02,494][00184] Num frames 600...
-[2023-03-12 07:47:02,603][00184] Num frames 700...
-[2023-03-12 07:47:02,725][00184] Num frames 800...
-[2023-03-12 07:47:02,836][00184] Num frames 900...
-[2023-03-12 07:47:02,957][00184] Num frames 1000...
-[2023-03-12 07:47:03,079][00184] Num frames 1100...
-[2023-03-12 07:47:03,189][00184] Num frames 1200...
-[2023-03-12 07:47:03,300][00184] Num frames 1300...
-[2023-03-12 07:47:03,409][00184] Num frames 1400...
-[2023-03-12 07:47:03,474][00184] Avg episode rewards: #0: 36.080, true rewards: #0: 14.080
-[2023-03-12 07:47:03,476][00184] Avg episode reward: 36.080, avg true_objective: 14.080
-[2023-03-12 07:47:03,577][00184] Num frames 1500...
-[2023-03-12 07:47:03,690][00184] Num frames 1600...
-[2023-03-12 07:47:03,801][00184] Num frames 1700...
-[2023-03-12 07:47:03,916][00184] Num frames 1800...
-[2023-03-12 07:47:04,031][00184] Num frames 1900...
-[2023-03-12 07:47:04,147][00184] Num frames 2000...
-[2023-03-12 07:47:04,260][00184] Num frames 2100...
-[2023-03-12 07:47:04,370][00184] Num frames 2200...
-[2023-03-12 07:47:04,478][00184] Num frames 2300...
-[2023-03-12 07:47:04,593][00184] Num frames 2400...
-[2023-03-12 07:47:04,696][00184] Avg episode rewards: #0: 30.710, true rewards: #0: 12.210
-[2023-03-12 07:47:04,698][00184] Avg episode reward: 30.710, avg true_objective: 12.210
-[2023-03-12 07:47:04,768][00184] Num frames 2500...
-[2023-03-12 07:47:04,878][00184] Num frames 2600...
-[2023-03-12 07:47:04,996][00184] Num frames 2700...
-[2023-03-12 07:47:05,109][00184] Num frames 2800...
-[2023-03-12 07:47:05,223][00184] Num frames 2900...
-[2023-03-12 07:47:05,333][00184] Num frames 3000...
-[2023-03-12 07:47:05,441][00184] Num frames 3100...
-[2023-03-12 07:47:05,551][00184] Num frames 3200...
-[2023-03-12 07:47:05,689][00184] Avg episode rewards: #0: 26.580, true rewards: #0: 10.913
-[2023-03-12 07:47:05,690][00184] Avg episode reward: 26.580, avg true_objective: 10.913
-[2023-03-12 07:47:05,740][00184] Num frames 3300...
-[2023-03-12 07:47:05,851][00184] Num frames 3400...
-[2023-03-12 07:47:05,970][00184] Num frames 3500...
-[2023-03-12 07:47:06,079][00184] Num frames 3600...
-[2023-03-12 07:47:06,233][00184] Avg episode rewards: #0: 21.475, true rewards: #0: 9.225
-[2023-03-12 07:47:06,235][00184] Avg episode reward: 21.475, avg true_objective: 9.225
-[2023-03-12 07:47:06,250][00184] Num frames 3700...
-[2023-03-12 07:47:06,360][00184] Num frames 3800...
-[2023-03-12 07:47:06,471][00184] Num frames 3900...
-[2023-03-12 07:47:06,578][00184] Num frames 4000...
-[2023-03-12 07:47:06,689][00184] Num frames 4100...
-[2023-03-12 07:47:06,796][00184] Num frames 4200...
-[2023-03-12 07:47:06,905][00184] Num frames 4300...
-[2023-03-12 07:47:07,021][00184] Num frames 4400...
-[2023-03-12 07:47:07,131][00184] Num frames 4500...
-[2023-03-12 07:47:07,243][00184] Num frames 4600...
-[2023-03-12 07:47:07,357][00184] Num frames 4700...
-[2023-03-12 07:47:07,468][00184] Num frames 4800...
-[2023-03-12 07:47:07,579][00184] Num frames 4900...
-[2023-03-12 07:47:07,692][00184] Num frames 5000...
-[2023-03-12 07:47:07,801][00184] Num frames 5100...
-[2023-03-12 07:47:07,922][00184] Num frames 5200...
-[2023-03-12 07:47:08,050][00184] Num frames 5300...
-[2023-03-12 07:47:08,163][00184] Num frames 5400...
-[2023-03-12 07:47:08,278][00184] Num frames 5500...
-[2023-03-12 07:47:08,391][00184] Num frames 5600...
-[2023-03-12 07:47:08,506][00184] Num frames 5700...
-[2023-03-12 07:47:08,662][00184] Avg episode rewards: #0: 28.580, true rewards: #0: 11.580
-[2023-03-12 07:47:08,664][00184] Avg episode reward: 28.580, avg true_objective: 11.580
-[2023-03-12 07:47:08,679][00184] Num frames 5800...
-[2023-03-12 07:47:08,792][00184] Num frames 5900...
-[2023-03-12 07:47:08,902][00184] Num frames 6000...
-[2023-03-12 07:47:09,021][00184] Num frames 6100...
-[2023-03-12 07:47:09,137][00184] Num frames 6200...
-[2023-03-12 07:47:09,247][00184] Num frames 6300...
-[2023-03-12 07:47:09,358][00184] Num frames 6400...
-[2023-03-12 07:47:09,472][00184] Num frames 6500...
-[2023-03-12 07:47:09,596][00184] Num frames 6600...
-[2023-03-12 07:47:09,762][00184] Num frames 6700...
-[2023-03-12 07:47:09,921][00184] Num frames 6800...
-[2023-03-12 07:47:10,108][00184] Num frames 6900...
-[2023-03-12 07:47:10,268][00184] Num frames 7000...
-[2023-03-12 07:47:10,443][00184] Num frames 7100...
-[2023-03-12 07:47:10,628][00184] Num frames 7200...
-[2023-03-12 07:47:10,877][00184] Avg episode rewards: #0: 29.490, true rewards: #0: 12.157
-[2023-03-12 07:47:10,879][00184] Avg episode reward: 29.490, avg true_objective: 12.157
-[2023-03-12 07:47:10,894][00184] Num frames 7300...
-[2023-03-12 07:47:11,091][00184] Num frames 7400...
-[2023-03-12 07:47:11,266][00184] Num frames 7500...
-[2023-03-12 07:47:11,460][00184] Num frames 7600...
-[2023-03-12 07:47:11,627][00184] Num frames 7700...
-[2023-03-12 07:47:11,845][00184] Num frames 7800...
-[2023-03-12 07:47:12,053][00184] Num frames 7900...
-[2023-03-12 07:47:12,239][00184] Num frames 8000...
-[2023-03-12 07:47:12,441][00184] Num frames 8100...
-[2023-03-12 07:47:12,622][00184] Num frames 8200...
-[2023-03-12 07:47:12,821][00184] Num frames 8300...
-[2023-03-12 07:47:13,030][00184] Num frames 8400...
-[2023-03-12 07:47:13,217][00184] Num frames 8500...
-[2023-03-12 07:47:13,398][00184] Num frames 8600...
-[2023-03-12 07:47:13,540][00184] Avg episode rewards: #0: 30.491, true rewards: #0: 12.349
-[2023-03-12 07:47:13,543][00184] Avg episode reward: 30.491, avg true_objective: 12.349
-[2023-03-12 07:47:13,661][00184] Num frames 8700...
-[2023-03-12 07:47:13,886][00184] Num frames 8800...
-[2023-03-12 07:47:14,096][00184] Num frames 8900...
-[2023-03-12 07:47:14,267][00184] Num frames 9000...
-[2023-03-12 07:47:14,433][00184] Num frames 9100...
-[2023-03-12 07:47:14,596][00184] Num frames 9200...
-[2023-03-12 07:47:14,763][00184] Num frames 9300...
-[2023-03-12 07:47:14,930][00184] Num frames 9400...
-[2023-03-12 07:47:15,095][00184] Num frames 9500...
-[2023-03-12 07:47:15,200][00184] Avg episode rewards: #0: 29.410, true rewards: #0: 11.910
-[2023-03-12 07:47:15,202][00184] Avg episode reward: 29.410, avg true_objective: 11.910
-[2023-03-12 07:47:15,328][00184] Num frames 9600...
-[2023-03-12 07:47:15,491][00184] Num frames 9700...
-[2023-03-12 07:47:15,649][00184] Num frames 9800...
-[2023-03-12 07:47:15,811][00184] Num frames 9900...
-[2023-03-12 07:47:15,945][00184] Num frames 10000...
-[2023-03-12 07:47:16,059][00184] Num frames 10100...
-[2023-03-12 07:47:16,170][00184] Num frames 10200...
-[2023-03-12 07:47:16,292][00184] Num frames 10300...
-[2023-03-12 07:47:16,409][00184] Num frames 10400...
-[2023-03-12 07:47:16,523][00184] Num frames 10500...
-[2023-03-12 07:47:16,638][00184] Num frames 10600...
-[2023-03-12 07:47:16,750][00184] Num frames 10700...
-[2023-03-12 07:47:16,866][00184] Num frames 10800...
-[2023-03-12 07:47:16,980][00184] Num frames 10900...
-[2023-03-12 07:47:17,092][00184] Num frames 11000...
-[2023-03-12 07:47:17,208][00184] Num frames 11100...
-[2023-03-12 07:47:17,323][00184] Num frames 11200...
-[2023-03-12 07:47:17,440][00184] Num frames 11300...
-[2023-03-12 07:47:17,595][00184] Avg episode rewards: #0: 31.100, true rewards: #0: 12.656
-[2023-03-12 07:47:17,596][00184] Avg episode reward: 31.100, avg true_objective: 12.656
-[2023-03-12 07:47:17,612][00184] Num frames 11400...
-[2023-03-12 07:47:17,726][00184] Num frames 11500...
-[2023-03-12 07:47:17,839][00184] Num frames 11600...
-[2023-03-12 07:47:17,951][00184] Num frames 11700...
-[2023-03-12 07:47:18,062][00184] Num frames 11800...
-[2023-03-12 07:47:18,219][00184] Avg episode rewards: #0: 28.894, true rewards: #0: 11.894
-[2023-03-12 07:47:18,221][00184] Avg episode reward: 28.894, avg true_objective: 11.894
-[2023-03-12 07:48:30,463][00184] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
-[2023-03-12 07:51:02,042][00184] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
-[2023-03-12 07:51:02,044][00184] Overriding arg 'num_workers' with value 1 passed from command line
-[2023-03-12 07:51:02,045][00184] Adding new argument 'no_render'=True that is not in the saved config file!
-[2023-03-12 07:51:02,046][00184] Adding new argument 'save_video'=True that is not in the saved config file!
-[2023-03-12 07:51:02,048][00184] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
-[2023-03-12 07:51:02,052][00184] Adding new argument 'video_name'=None that is not in the saved config file!
-[2023-03-12 07:51:02,053][00184] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
-[2023-03-12 07:51:02,054][00184] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
-[2023-03-12 07:51:02,056][00184] Adding new argument 'push_to_hub'=True that is not in the saved config file!
-[2023-03-12 07:51:02,058][00184] Adding new argument 'hf_repository'='Kittitouch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
-[2023-03-12 07:51:02,060][00184] Adding new argument 'policy_index'=0 that is not in the saved config file!
-[2023-03-12 07:51:02,062][00184] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
-[2023-03-12 07:51:02,064][00184] Adding new argument 'train_script'=None that is not in the saved config file!
-[2023-03-12 07:51:02,065][00184] Adding new argument 'enjoy_script'=None that is not in the saved config file!
-[2023-03-12 07:51:02,067][00184] Using frameskip 1 and render_action_repeat=4 for evaluation
-[2023-03-12 07:51:02,092][00184] RunningMeanStd input shape: (3, 72, 128)
-[2023-03-12 07:51:02,094][00184] RunningMeanStd input shape: (1,)
-[2023-03-12 07:51:02,107][00184] ConvEncoder: input_channels=3
-[2023-03-12 07:51:02,143][00184] Conv encoder output size: 512
-[2023-03-12 07:51:02,144][00184] Policy head output size: 512
-[2023-03-12 07:51:02,164][00184] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
-[2023-03-12 07:51:02,657][00184] Num frames 100...
-[2023-03-12 07:51:02,767][00184] Num frames 200...
-[2023-03-12 07:51:02,879][00184] Num frames 300...
-[2023-03-12 07:51:02,991][00184] Num frames 400...
-[2023-03-12 07:51:03,114][00184] Num frames 500...
-[2023-03-12 07:51:03,223][00184] Num frames 600...
-[2023-03-12 07:51:03,358][00184] Avg episode rewards: #0: 10.720, true rewards: #0: 6.720
-[2023-03-12 07:51:03,359][00184] Avg episode reward: 10.720, avg true_objective: 6.720
-[2023-03-12 07:51:03,402][00184] Num frames 700...
-[2023-03-12 07:51:03,521][00184] Num frames 800...
-[2023-03-12 07:51:03,638][00184] Num frames 900...
-[2023-03-12 07:51:03,747][00184] Num frames 1000...
-[2023-03-12 07:51:03,900][00184] Avg episode rewards: #0: 8.440, true rewards: #0: 5.440
-[2023-03-12 07:51:03,902][00184] Avg episode reward: 8.440, avg true_objective: 5.440
-[2023-03-12 07:51:03,924][00184] Num frames 1100...
-[2023-03-12 07:51:04,038][00184] Num frames 1200...
-[2023-03-12 07:51:04,150][00184] Num frames 1300...
-[2023-03-12 07:51:04,266][00184] Num frames 1400...
-[2023-03-12 07:51:04,376][00184] Num frames 1500...
-[2023-03-12 07:51:04,500][00184] Num frames 1600...
-[2023-03-12 07:51:04,613][00184] Num frames 1700...
-[2023-03-12 07:51:04,728][00184] Num frames 1800...
-[2023-03-12 07:51:04,840][00184] Num frames 1900...
-[2023-03-12 07:51:04,961][00184] Num frames 2000...
-[2023-03-12 07:51:05,117][00184] Num frames 2100...
-[2023-03-12 07:51:05,280][00184] Num frames 2200...
-[2023-03-12 07:51:05,432][00184] Num frames 2300...
-[2023-03-12 07:51:05,589][00184] Num frames 2400...
-[2023-03-12 07:51:05,745][00184] Num frames 2500...
-[2023-03-12 07:51:05,908][00184] Num frames 2600...
-[2023-03-12 07:51:06,028][00184] Avg episode rewards: #0: 18.793, true rewards: #0: 8.793
-[2023-03-12 07:51:06,030][00184] Avg episode reward: 18.793, avg true_objective: 8.793
-[2023-03-12 07:51:06,125][00184] Num frames 2700...
-[2023-03-12 07:51:06,281][00184] Num frames 2800...
-[2023-03-12 07:51:06,433][00184] Num frames 2900...
-[2023-03-12 07:51:06,589][00184] Num frames 3000...
-[2023-03-12 07:51:06,743][00184] Num frames 3100...
-[2023-03-12 07:51:06,903][00184] Num frames 3200...
-[2023-03-12 07:51:07,066][00184] Num frames 3300...
-[2023-03-12 07:51:07,236][00184] Num frames 3400...
-[2023-03-12 07:51:07,392][00184] Num frames 3500...
-[2023-03-12 07:51:07,553][00184] Num frames 3600...
-[2023-03-12 07:51:07,715][00184] Num frames 3700...
-[2023-03-12 07:51:07,876][00184] Num frames 3800...
-[2023-03-12 07:51:08,039][00184] Num frames 3900...
-[2023-03-12 07:51:08,203][00184] Num frames 4000...
-[2023-03-12 07:51:08,359][00184] Num frames 4100...
-[2023-03-12 07:51:08,490][00184] Num frames 4200...
-[2023-03-12 07:51:08,608][00184] Num frames 4300...
-[2023-03-12 07:51:08,720][00184] Num frames 4400...
-[2023-03-12 07:51:08,831][00184] Num frames 4500...
-[2023-03-12 07:51:08,945][00184] Num frames 4600...
-[2023-03-12 07:51:09,068][00184] Num frames 4700...
-[2023-03-12 07:51:09,167][00184] Avg episode rewards: #0: 28.845, true rewards: #0: 11.845
-[2023-03-12 07:51:09,168][00184] Avg episode reward: 28.845, avg true_objective: 11.845
-[2023-03-12 07:51:09,245][00184] Num frames 4800...
-[2023-03-12 07:51:09,359][00184] Num frames 4900...
-[2023-03-12 07:51:09,469][00184] Num frames 5000...
-[2023-03-12 07:51:09,593][00184] Num frames 5100...
-[2023-03-12 07:51:09,705][00184] Num frames 5200...
-[2023-03-12 07:51:09,819][00184] Num frames 5300...
-[2023-03-12 07:51:09,930][00184] Num frames 5400...
-[2023-03-12 07:51:10,063][00184] Avg episode rewards: #0: 26.736, true rewards: #0: 10.936
-[2023-03-12 07:51:10,065][00184] Avg episode reward: 26.736, avg true_objective: 10.936
-[2023-03-12 07:51:10,111][00184] Num frames 5500...
-[2023-03-12 07:51:10,221][00184] Num frames 5600...
-[2023-03-12 07:51:10,333][00184] Num frames 5700...
-[2023-03-12 07:51:10,443][00184] Num frames 5800...
-[2023-03-12 07:51:10,555][00184] Num frames 5900...
-[2023-03-12 07:51:10,677][00184] Num frames 6000...
-[2023-03-12 07:51:10,788][00184] Num frames 6100...
-[2023-03-12 07:51:10,901][00184] Num frames 6200...
-[2023-03-12 07:51:11,035][00184] Avg episode rewards: #0: 25.113, true rewards: #0: 10.447
-[2023-03-12 07:51:11,037][00184] Avg episode reward: 25.113, avg true_objective: 10.447
-[2023-03-12 07:51:11,076][00184] Num frames 6300...
-[2023-03-12 07:51:11,188][00184] Num frames 6400...
-[2023-03-12 07:51:11,309][00184] Num frames 6500...
-[2023-03-12 07:51:11,423][00184] Num frames 6600...
-[2023-03-12 07:51:11,541][00184] Num frames 6700...
-[2023-03-12 07:51:11,664][00184] Num frames 6800...
-[2023-03-12 07:51:11,784][00184] Num frames 6900...
-[2023-03-12 07:51:11,896][00184] Num frames 7000...
-[2023-03-12 07:51:11,956][00184] Avg episode rewards: #0: 23.720, true rewards: #0: 10.006
-[2023-03-12 07:51:11,957][00184] Avg episode reward: 23.720, avg true_objective: 10.006
-[2023-03-12 07:51:12,072][00184] Num frames 7100...
-[2023-03-12 07:51:12,186][00184] Num frames 7200...
-[2023-03-12 07:51:12,300][00184] Num frames 7300...
-[2023-03-12 07:51:12,416][00184] Num frames 7400...
-[2023-03-12 07:51:12,528][00184] Num frames 7500...
-[2023-03-12 07:51:12,646][00184] Num frames 7600...
-[2023-03-12 07:51:12,760][00184] Num frames 7700...
-[2023-03-12 07:51:12,872][00184] Num frames 7800...
-[2023-03-12 07:51:12,966][00184] Avg episode rewards: #0: 22.542, true rewards: #0: 9.792
-[2023-03-12 07:51:12,967][00184] Avg episode reward: 22.542, avg true_objective: 9.792
-[2023-03-12 07:51:13,045][00184] Num frames 7900...
-[2023-03-12 07:51:13,166][00184] Num frames 8000...
-[2023-03-12 07:51:13,282][00184] Num frames 8100...
-[2023-03-12 07:51:13,393][00184] Num frames 8200...
-[2023-03-12 07:51:13,559][00184] Num frames 8300...
-[2023-03-12 07:51:13,659][00184] Avg episode rewards: #0: 21.028, true rewards: #0: 9.250
-[2023-03-12 07:51:13,661][00184] Avg episode reward: 21.028, avg true_objective: 9.250
-[2023-03-12 07:51:13,790][00184] Num frames 8400...
-[2023-03-12 07:51:13,943][00184] Num frames 8500...
-[2023-03-12 07:51:14,108][00184] Num frames 8600...
-[2023-03-12 07:51:14,275][00184] Num frames 8700...
-[2023-03-12 07:51:14,435][00184] Num frames 8800...
-[2023-03-12 07:51:14,595][00184] Num frames 8900...
-[2023-03-12 07:51:14,757][00184] Num frames 9000...
-[2023-03-12 07:51:14,927][00184] Num frames 9100...
-[2023-03-12 07:51:15,087][00184] Num frames 9200...
-[2023-03-12 07:51:15,245][00184] Num frames 9300...
-[2023-03-12 07:51:15,417][00184] Num frames 9400...
-[2023-03-12 07:51:15,581][00184] Num frames 9500...
-[2023-03-12 07:51:15,748][00184] Num frames 9600...
-[2023-03-12 07:51:15,914][00184] Num frames 9700...
-[2023-03-12 07:51:16,076][00184] Num frames 9800...
-[2023-03-12 07:51:16,240][00184] Num frames 9900...
-[2023-03-12 07:51:16,407][00184] Num frames 10000...
-[2023-03-12 07:51:16,576][00184] Num frames 10100...
-[2023-03-12 07:51:16,761][00184] Num frames 10200...
-[2023-03-12 07:51:16,907][00184] Num frames 10300...
-[2023-03-12 07:51:17,050][00184] Avg episode rewards: #0: 24.373, true rewards: #0: 10.373
-[2023-03-12 07:51:17,052][00184] Avg episode reward: 24.373, avg true_objective: 10.373
-[2023-03-12 07:52:20,090][00184] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
+ wait_policy_total: 891.3881
+update_model: 12.5659
+ weight_update: 0.0029
+one_step: 0.0024
+ handle_policy_step: 885.8978
+ deserialize: 25.2186, stack: 4.8805, obs_to_device_normalize: 189.4452, forward: 436.1786, send_messages: 44.4774
+ prepare_outputs: 139.8153
+ to_cpu: 87.7006
+[2023-03-14 14:32:05,141][00372] Learner 0 profile tree view:
+misc: 0.0119, prepare_batch: 24.1735
+train: 116.4224
+ epoch_init: 0.0256, minibatch_init: 0.0136, losses_postprocess: 0.8965, kl_divergence: 0.9717, after_optimizer: 49.3341
+ calculate_losses: 41.5410
+ losses_init: 0.0059, forward_head: 2.7492, bptt_initial: 27.0971, tail: 1.8301, advantages_returns: 0.5186, losses: 5.0086
+ bptt: 3.7426
+ bptt_forward_core: 3.5580
+ update: 22.6093
+ clip: 2.2821
+[2023-03-14 14:32:05,142][00372] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 0.7599, enqueue_policy_requests: 260.8927, env_step: 1378.4958, overhead: 39.9957, complete_rollouts: 11.3098
+save_policy_outputs: 39.3351
+ split_output_tensors: 18.8414
+[2023-03-14 14:32:05,143][00372] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 0.6716, enqueue_policy_requests: 269.8551, env_step: 1370.5238, overhead: 39.7292, complete_rollouts: 11.8346
+save_policy_outputs: 37.5300
+ split_output_tensors: 18.0413
+[2023-03-14 14:32:05,145][00372] Loop Runner_EvtLoop terminating...
+[2023-03-14 14:32:05,150][00372] Runner profile tree view:
+main_loop: 1890.3781
+[2023-03-14 14:32:05,151][00372] Collected {0: 6004736}, FPS: 3176.5
+[2023-03-14 14:33:00,806][00372] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
+[2023-03-14 14:33:00,808][00372] Overriding arg 'num_workers' with value 1 passed from command line
+[2023-03-14 14:33:00,810][00372] Adding new argument 'no_render'=True that is not in the saved config file!
+[2023-03-14 14:33:00,812][00372] Adding new argument 'save_video'=True that is not in the saved config file!
+[2023-03-14 14:33:00,817][00372] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
+[2023-03-14 14:33:00,818][00372] Adding new argument 'video_name'=None that is not in the saved config file!
+[2023-03-14 14:33:00,820][00372] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
+[2023-03-14 14:33:00,822][00372] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
+[2023-03-14 14:33:00,824][00372] Adding new argument 'push_to_hub'=True that is not in the saved config file!
+[2023-03-14 14:33:00,826][00372] Adding new argument 'hf_repository'='Kittitouch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
+[2023-03-14 14:33:00,827][00372] Adding new argument 'policy_index'=0 that is not in the saved config file!
+[2023-03-14 14:33:00,828][00372] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
+[2023-03-14 14:33:00,829][00372] Adding new argument 'train_script'=None that is not in the saved config file!
+[2023-03-14 14:33:00,830][00372] Adding new argument 'enjoy_script'=None that is not in the saved config file!
+[2023-03-14 14:33:00,831][00372] Using frameskip 1 and render_action_repeat=4 for evaluation
+[2023-03-14 14:33:00,869][00372] Doom resolution: 160x120, resize resolution: (128, 72)
+[2023-03-14 14:33:00,874][00372] RunningMeanStd input shape: (3, 72, 128)
+[2023-03-14 14:33:00,877][00372] RunningMeanStd input shape: (1,)
+[2023-03-14 14:33:00,913][00372] ConvEncoder: input_channels=3
+[2023-03-14 14:33:01,694][00372] Conv encoder output size: 512
+[2023-03-14 14:33:01,699][00372] Policy head output size: 512
+[2023-03-14 14:33:04,178][00372] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
+[2023-03-14 14:33:05,495][00372] Num frames 100...
+[2023-03-14 14:33:05,610][00372] Num frames 200...
+[2023-03-14 14:33:05,729][00372] Num frames 300...
+[2023-03-14 14:33:05,850][00372] Num frames 400...
+[2023-03-14 14:33:05,964][00372] Num frames 500...
+[2023-03-14 14:33:06,077][00372] Num frames 600...
+[2023-03-14 14:33:06,193][00372] Num frames 700...
+[2023-03-14 14:33:06,310][00372] Num frames 800...
+[2023-03-14 14:33:06,425][00372] Num frames 900...
+[2023-03-14 14:33:06,542][00372] Num frames 1000...
+[2023-03-14 14:33:06,647][00372] Avg episode rewards: #0: 25.420, true rewards: #0: 10.420
+[2023-03-14 14:33:06,648][00372] Avg episode reward: 25.420, avg true_objective: 10.420
+[2023-03-14 14:33:06,725][00372] Num frames 1100...
+[2023-03-14 14:33:06,859][00372] Num frames 1200...
+[2023-03-14 14:33:06,979][00372] Num frames 1300...
+[2023-03-14 14:33:07,093][00372] Num frames 1400...
+[2023-03-14 14:33:07,206][00372] Num frames 1500...
+[2023-03-14 14:33:07,324][00372] Num frames 1600...
+[2023-03-14 14:33:07,439][00372] Num frames 1700...
+[2023-03-14 14:33:07,554][00372] Num frames 1800...
+[2023-03-14 14:33:07,672][00372] Num frames 1900...
+[2023-03-14 14:33:07,791][00372] Num frames 2000...
+[2023-03-14 14:33:07,915][00372] Num frames 2100...
+[2023-03-14 14:33:08,029][00372] Num frames 2200...
+[2023-03-14 14:33:08,144][00372] Num frames 2300...
+[2023-03-14 14:33:08,260][00372] Num frames 2400...
+[2023-03-14 14:33:08,376][00372] Num frames 2500...
+[2023-03-14 14:33:08,496][00372] Num frames 2600...
+[2023-03-14 14:33:08,613][00372] Num frames 2700...
+[2023-03-14 14:33:08,734][00372] Num frames 2800...
+[2023-03-14 14:33:08,857][00372] Num frames 2900...
+[2023-03-14 14:33:08,979][00372] Num frames 3000...
+[2023-03-14 14:33:09,097][00372] Num frames 3100...
+[2023-03-14 14:33:09,201][00372] Avg episode rewards: #0: 40.709, true rewards: #0: 15.710
+[2023-03-14 14:33:09,202][00372] Avg episode reward: 40.709, avg true_objective: 15.710
+[2023-03-14 14:33:09,272][00372] Num frames 3200...
+[2023-03-14 14:33:09,389][00372] Num frames 3300...
+[2023-03-14 14:33:09,503][00372] Num frames 3400...
+[2023-03-14 14:33:09,622][00372] Avg episode rewards: #0: 28.846, true rewards: #0: 11.513
+[2023-03-14 14:33:09,624][00372] Avg episode reward: 28.846, avg true_objective: 11.513
+[2023-03-14 14:33:09,683][00372] Num frames 3500...
+[2023-03-14 14:33:09,798][00372] Num frames 3600...
+[2023-03-14 14:33:09,919][00372] Num frames 3700...
+[2023-03-14 14:33:10,034][00372] Num frames 3800...
+[2023-03-14 14:33:10,149][00372] Num frames 3900...
+[2023-03-14 14:33:10,266][00372] Num frames 4000...
+[2023-03-14 14:33:10,385][00372] Num frames 4100...
+[2023-03-14 14:33:10,503][00372] Num frames 4200...
+[2023-03-14 14:33:10,618][00372] Num frames 4300...
+[2023-03-14 14:33:10,747][00372] Num frames 4400...
+[2023-03-14 14:33:10,874][00372] Num frames 4500...
+[2023-03-14 14:33:10,997][00372] Num frames 4600...
+[2023-03-14 14:33:11,062][00372] Avg episode rewards: #0: 28.515, true rewards: #0: 11.515
+[2023-03-14 14:33:11,064][00372] Avg episode reward: 28.515, avg true_objective: 11.515
+[2023-03-14 14:33:11,183][00372] Num frames 4700...
+[2023-03-14 14:33:11,311][00372] Num frames 4800...
+[2023-03-14 14:33:11,443][00372] Num frames 4900...
+[2023-03-14 14:33:11,559][00372] Num frames 5000...
+[2023-03-14 14:33:11,675][00372] Num frames 5100...
+[2023-03-14 14:33:11,790][00372] Num frames 5200...
+[2023-03-14 14:33:11,911][00372] Num frames 5300...
+[2023-03-14 14:33:12,035][00372] Num frames 5400...
+[2023-03-14 14:33:12,159][00372] Num frames 5500...
+[2023-03-14 14:33:12,322][00372] Num frames 5600...
+[2023-03-14 14:33:12,486][00372] Num frames 5700...
+[2023-03-14 14:33:12,643][00372] Num frames 5800...
+[2023-03-14 14:33:12,803][00372] Num frames 5900...
+[2023-03-14 14:33:12,976][00372] Num frames 6000...
+[2023-03-14 14:33:13,136][00372] Num frames 6100...
+[2023-03-14 14:33:13,266][00372] Avg episode rewards: #0: 29.284, true rewards: #0: 12.284
+[2023-03-14 14:33:13,267][00372] Avg episode reward: 29.284, avg true_objective: 12.284
+[2023-03-14 14:33:13,368][00372] Num frames 6200...
+[2023-03-14 14:33:13,535][00372] Num frames 6300...
+[2023-03-14 14:33:13,696][00372] Num frames 6400...
+[2023-03-14 14:33:13,856][00372] Num frames 6500...
+[2023-03-14 14:33:14,030][00372] Num frames 6600...
+[2023-03-14 14:33:14,186][00372] Num frames 6700...
+[2023-03-14 14:33:14,345][00372] Num frames 6800...
+[2023-03-14 14:33:14,508][00372] Num frames 6900...
+[2023-03-14 14:33:14,675][00372] Num frames 7000...
+[2023-03-14 14:33:14,839][00372] Num frames 7100...
+[2023-03-14 14:33:15,006][00372] Num frames 7200...
+[2023-03-14 14:33:15,172][00372] Num frames 7300...
+[2023-03-14 14:33:15,340][00372] Num frames 7400...
+[2023-03-14 14:33:15,505][00372] Num frames 7500...
+[2023-03-14 14:33:15,677][00372] Num frames 7600...
+[2023-03-14 14:33:15,846][00372] Num frames 7700...
+[2023-03-14 14:33:16,025][00372] Num frames 7800...
+[2023-03-14 14:33:16,203][00372] Num frames 7900...
+[2023-03-14 14:33:16,373][00372] Num frames 8000...
+[2023-03-14 14:33:16,550][00372] Num frames 8100...
+[2023-03-14 14:33:16,725][00372] Num frames 8200...
+[2023-03-14 14:33:16,859][00372] Avg episode rewards: #0: 33.070, true rewards: #0: 13.737
+[2023-03-14 14:33:16,861][00372] Avg episode reward: 33.070, avg true_objective: 13.737
+[2023-03-14 14:33:16,968][00372] Num frames 8300...
+[2023-03-14 14:33:17,149][00372] Num frames 8400...
+[2023-03-14 14:33:17,316][00372] Num frames 8500...
+[2023-03-14 14:33:17,483][00372] Num frames 8600...
+[2023-03-14 14:33:17,626][00372] Num frames 8700...
+[2023-03-14 14:33:17,751][00372] Num frames 8800...
+[2023-03-14 14:33:17,870][00372] Avg episode rewards: #0: 30.077, true rewards: #0: 12.649
+[2023-03-14 14:33:17,872][00372] Avg episode reward: 30.077, avg true_objective: 12.649
+[2023-03-14 14:33:17,929][00372] Num frames 8900...
+[2023-03-14 14:33:18,043][00372] Num frames 9000...
+[2023-03-14 14:33:18,163][00372] Num frames 9100...
+[2023-03-14 14:33:18,277][00372] Num frames 9200...
+[2023-03-14 14:33:18,390][00372] Num frames 9300...
+[2023-03-14 14:33:18,501][00372] Num frames 9400...
+[2023-03-14 14:33:18,616][00372] Num frames 9500...
+[2023-03-14 14:33:18,728][00372] Num frames 9600...
+[2023-03-14 14:33:18,825][00372] Avg episode rewards: #0: 28.289, true rewards: #0: 12.039
+[2023-03-14 14:33:18,826][00372] Avg episode reward: 28.289, avg true_objective: 12.039
+[2023-03-14 14:33:18,911][00372] Num frames 9700...
+[2023-03-14 14:33:19,022][00372] Num frames 9800...
+[2023-03-14 14:33:19,134][00372] Num frames 9900...
+[2023-03-14 14:33:19,252][00372] Num frames 10000...
+[2023-03-14 14:33:19,370][00372] Num frames 10100...
+[2023-03-14 14:33:19,484][00372] Num frames 10200...
+[2023-03-14 14:33:19,594][00372] Num frames 10300...
+[2023-03-14 14:33:19,706][00372] Num frames 10400...
+[2023-03-14 14:33:19,828][00372] Num frames 10500...
+[2023-03-14 14:33:19,941][00372] Num frames 10600...
+[2023-03-14 14:33:20,056][00372] Num frames 10700...
+[2023-03-14 14:33:20,170][00372] Num frames 10800...
+[2023-03-14 14:33:20,299][00372] Num frames 10900...
+[2023-03-14 14:33:20,419][00372] Num frames 11000...
+[2023-03-14 14:33:20,543][00372] Num frames 11100...
+[2023-03-14 14:33:20,662][00372] Num frames 11200...
+[2023-03-14 14:33:20,782][00372] Num frames 11300...
+[2023-03-14 14:33:20,901][00372] Num frames 11400...
+[2023-03-14 14:33:21,020][00372] Num frames 11500...
+[2023-03-14 14:33:21,137][00372] Num frames 11600...
+[2023-03-14 14:33:21,262][00372] Avg episode rewards: #0: 30.723, true rewards: #0: 12.946
+[2023-03-14 14:33:21,264][00372] Avg episode reward: 30.723, avg true_objective: 12.946
+[2023-03-14 14:33:21,341][00372] Num frames 11700...
+[2023-03-14 14:33:21,464][00372] Num frames 11800...
+[2023-03-14 14:33:21,592][00372] Num frames 11900...
+[2023-03-14 14:33:21,720][00372] Num frames 12000...
+[2023-03-14 14:33:21,897][00372] Avg episode rewards: #0: 28.199, true rewards: #0: 12.099
+[2023-03-14 14:33:21,899][00372] Avg episode reward: 28.199, avg true_objective: 12.099
+[2023-03-14 14:33:21,905][00372] Num frames 12100...
+[2023-03-14 14:34:47,474][00372] Replay video saved to /content/train_dir/default_experiment/replay.mp4!