diff --git "a/sf_log.txt" "b/sf_log.txt"
new file mode 100644
--- /dev/null
+++ "b/sf_log.txt"
@@ -0,0 +1,3541 @@
+[2023-02-24 12:14:52,972][00205] Saving configuration to /content/train_dir/default_experiment/config.json...
+[2023-02-24 12:14:52,974][00205] Rollout worker 0 uses device cpu
+[2023-02-24 12:14:52,976][00205] Rollout worker 1 uses device cpu
+[2023-02-24 12:14:52,979][00205] Rollout worker 2 uses device cpu
+[2023-02-24 12:14:52,980][00205] Rollout worker 3 uses device cpu
+[2023-02-24 12:14:52,981][00205] Rollout worker 4 uses device cpu
+[2023-02-24 12:14:52,983][00205] Rollout worker 5 uses device cpu
+[2023-02-24 12:14:52,986][00205] Rollout worker 6 uses device cpu
+[2023-02-24 12:14:52,987][00205] Rollout worker 7 uses device cpu
+[2023-02-24 12:14:53,198][00205] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:14:53,201][00205] InferenceWorker_p0-w0: min num requests: 2
+[2023-02-24 12:14:53,241][00205] Starting all processes...
+[2023-02-24 12:14:53,243][00205] Starting process learner_proc0
+[2023-02-24 12:14:53,333][00205] Starting all processes...
+[2023-02-24 12:14:53,345][00205] Starting process inference_proc0-0
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc0
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc1
+[2023-02-24 12:14:53,345][00205] Starting process rollout_proc2
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc3
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc4
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc5
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc6
+[2023-02-24 12:14:53,346][00205] Starting process rollout_proc7
+[2023-02-24 12:15:02,266][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:02,266][11201] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
+[2023-02-24 12:15:02,435][11223] Worker 3 uses CPU cores [1]
+[2023-02-24 12:15:02,459][11227] Worker 4 uses CPU cores [0]
+[2023-02-24 12:15:02,579][11222] Worker 2 uses CPU cores [0]
+[2023-02-24 12:15:02,669][11224] Worker 5 uses CPU cores [1]
+[2023-02-24 12:15:02,680][11215] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:02,680][11215] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
+[2023-02-24 12:15:02,710][11216] Worker 0 uses CPU cores [0]
+[2023-02-24 12:15:02,785][11221] Worker 1 uses CPU cores [1]
+[2023-02-24 12:15:02,791][11226] Worker 7 uses CPU cores [1]
+[2023-02-24 12:15:02,922][11225] Worker 6 uses CPU cores [0]
+[2023-02-24 12:15:03,232][11215] Num visible devices: 1
+[2023-02-24 12:15:03,232][11201] Num visible devices: 1
+[2023-02-24 12:15:03,238][11201] Starting seed is not provided
+[2023-02-24 12:15:03,238][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
+[2023-02-24 12:15:03,238][11201] Initializing actor-critic model on device cuda:0
+[2023-02-24 12:15:03,239][11201] RunningMeanStd input shape: (3, 72, 128)
+[2023-02-24 12:15:03,240][11201] RunningMeanStd input shape: (1,)
+[2023-02-24 12:15:03,253][11201] ConvEncoder: input_channels=3
+[2023-02-24 12:15:03,567][11201] Conv encoder output size: 512
+[2023-02-24 12:15:03,568][11201] Policy head output size: 512
+[2023-02-24 12:15:03,624][11201] Created Actor Critic model with architecture:
+[2023-02-24 12:15:03,625][11201] ActorCriticSharedWeights(
+  (obs_normalizer): ObservationNormalizer(
+    (running_mean_std): RunningMeanStdDictInPlace(
+      (running_mean_std): ModuleDict(
+        (obs): RunningMeanStdInPlace()
+      )
+    )
+  )
+  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
+  (encoder): VizdoomEncoder(
+    (basic_encoder): ConvEncoder(
+      (enc): RecursiveScriptModule(
+        original_name=ConvEncoderImpl
+        (conv_head): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Conv2d)
+          (1): RecursiveScriptModule(original_name=ELU)
+          (2): RecursiveScriptModule(original_name=Conv2d)
+          (3): RecursiveScriptModule(original_name=ELU)
+          (4): RecursiveScriptModule(original_name=Conv2d)
+          (5): RecursiveScriptModule(original_name=ELU)
+        )
+        (mlp_layers): RecursiveScriptModule(
+          original_name=Sequential
+          (0): RecursiveScriptModule(original_name=Linear)
+          (1): RecursiveScriptModule(original_name=ELU)
+        )
+      )
+    )
+  )
+  (core): ModelCoreRNN(
+    (core): GRU(512, 512)
+  )
+  (decoder): MlpDecoder(
+    (mlp): Identity()
+  )
+  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
+  (action_parameterization): ActionParameterizationDefault(
+    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
+  )
+)
+[2023-02-24 12:15:10,546][11201] Using optimizer
+[2023-02-24 12:15:10,547][11201] No checkpoints found
+[2023-02-24 12:15:10,547][11201] Did not load from checkpoint, starting from scratch!
+[2023-02-24 12:15:10,547][11201] Initialized policy 0 weights for model version 0
+[2023-02-24 12:15:10,551][11201] LearnerWorker_p0 finished initialization!
+[2023-02-24 12:15:10,551][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0]) +[2023-02-24 12:15:10,778][11215] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 12:15:10,779][11215] RunningMeanStd input shape: (1,) +[2023-02-24 12:15:10,792][11215] ConvEncoder: input_channels=3 +[2023-02-24 12:15:10,892][11215] Conv encoder output size: 512 +[2023-02-24 12:15:10,892][11215] Policy head output size: 512 +[2023-02-24 12:15:12,870][00205] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:13,135][00205] Inference worker 0-0 is ready! +[2023-02-24 12:15:13,137][00205] All inference workers are ready! Signal rollout workers to start! +[2023-02-24 12:15:13,189][00205] Heartbeat connected on Batcher_0 +[2023-02-24 12:15:13,193][00205] Heartbeat connected on LearnerWorker_p0 +[2023-02-24 12:15:13,242][00205] Heartbeat connected on InferenceWorker_p0-w0 +[2023-02-24 12:15:13,280][11226] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,277][11227] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,288][11222] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,307][11224] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,311][11216] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,327][11223] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,328][11221] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:13,325][11225] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 12:15:14,505][11223] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,502][11216] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,504][11224] Decorrelating experience for 0 frames... 
+[2023-02-24 12:15:14,503][11227] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,506][11226] Decorrelating experience for 0 frames... +[2023-02-24 12:15:14,505][11225] Decorrelating experience for 0 frames... +[2023-02-24 12:15:15,534][11221] Decorrelating experience for 0 frames... +[2023-02-24 12:15:15,544][11225] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,551][11226] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,550][11216] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,555][11224] Decorrelating experience for 32 frames... +[2023-02-24 12:15:15,553][11227] Decorrelating experience for 32 frames... +[2023-02-24 12:15:16,392][11223] Decorrelating experience for 32 frames... +[2023-02-24 12:15:16,405][11222] Decorrelating experience for 0 frames... +[2023-02-24 12:15:16,494][11216] Decorrelating experience for 64 frames... +[2023-02-24 12:15:16,504][11224] Decorrelating experience for 64 frames... +[2023-02-24 12:15:17,196][11222] Decorrelating experience for 32 frames... +[2023-02-24 12:15:17,361][11223] Decorrelating experience for 64 frames... +[2023-02-24 12:15:17,441][11224] Decorrelating experience for 96 frames... +[2023-02-24 12:15:17,581][11216] Decorrelating experience for 96 frames... +[2023-02-24 12:15:17,619][00205] Heartbeat connected on RolloutWorker_w5 +[2023-02-24 12:15:17,800][00205] Heartbeat connected on RolloutWorker_w0 +[2023-02-24 12:15:17,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:18,286][11227] Decorrelating experience for 64 frames... +[2023-02-24 12:15:18,426][11222] Decorrelating experience for 64 frames... +[2023-02-24 12:15:18,910][11227] Decorrelating experience for 96 frames... +[2023-02-24 12:15:19,117][00205] Heartbeat connected on RolloutWorker_w4 +[2023-02-24 12:15:19,278][11223] Decorrelating experience for 96 frames... 
+[2023-02-24 12:15:19,558][11222] Decorrelating experience for 96 frames... +[2023-02-24 12:15:19,660][00205] Heartbeat connected on RolloutWorker_w3 +[2023-02-24 12:15:19,678][00205] Heartbeat connected on RolloutWorker_w2 +[2023-02-24 12:15:19,943][11226] Decorrelating experience for 64 frames... +[2023-02-24 12:15:21,076][11221] Decorrelating experience for 32 frames... +[2023-02-24 12:15:21,171][11226] Decorrelating experience for 96 frames... +[2023-02-24 12:15:21,501][00205] Heartbeat connected on RolloutWorker_w7 +[2023-02-24 12:15:21,637][11225] Decorrelating experience for 64 frames... +[2023-02-24 12:15:22,201][11225] Decorrelating experience for 96 frames... +[2023-02-24 12:15:22,366][00205] Heartbeat connected on RolloutWorker_w6 +[2023-02-24 12:15:22,492][11221] Decorrelating experience for 64 frames... +[2023-02-24 12:15:22,876][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 3.6. Samples: 36. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:24,394][11221] Decorrelating experience for 96 frames... +[2023-02-24 12:15:25,075][00205] Heartbeat connected on RolloutWorker_w1 +[2023-02-24 12:15:27,251][11201] Signal inference workers to stop experience collection... +[2023-02-24 12:15:27,262][11215] InferenceWorker_p0-w0: stopping experience collection +[2023-02-24 12:15:27,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 108.0. Samples: 1620. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) +[2023-02-24 12:15:27,872][00205] Avg episode reward: [(0, '2.148')] +[2023-02-24 12:15:29,651][11201] Signal inference workers to resume experience collection... +[2023-02-24 12:15:29,653][11215] InferenceWorker_p0-w0: resuming experience collection +[2023-02-24 12:15:32,870][00205] Fps is (10 sec: 1639.3, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 191.1. Samples: 3822. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) +[2023-02-24 12:15:32,876][00205] Avg episode reward: [(0, '3.277')] +[2023-02-24 12:15:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 36864. Throughput: 0: 408.2. Samples: 10204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:15:37,877][00205] Avg episode reward: [(0, '3.970')] +[2023-02-24 12:15:37,944][11215] Updated weights for policy 0, policy_version 10 (0.0017) +[2023-02-24 12:15:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 1774.9, 300 sec: 1774.9). Total num frames: 53248. Throughput: 0: 414.3. Samples: 12428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:15:42,879][00205] Avg episode reward: [(0, '4.276')] +[2023-02-24 12:15:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 467.9. Samples: 16378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:15:47,872][00205] Avg episode reward: [(0, '4.387')] +[2023-02-24 12:15:50,151][11215] Updated weights for policy 0, policy_version 20 (0.0018) +[2023-02-24 12:15:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 90112. Throughput: 0: 582.8. Samples: 23310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:15:52,877][00205] Avg episode reward: [(0, '4.311')] +[2023-02-24 12:15:57,870][00205] Fps is (10 sec: 4505.4, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 597.3. Samples: 26878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:15:57,876][00205] Avg episode reward: [(0, '4.498')] +[2023-02-24 12:15:57,885][11201] Saving new best policy, reward=4.498! +[2023-02-24 12:16:00,589][11215] Updated weights for policy 0, policy_version 30 (0.0021) +[2023-02-24 12:16:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2539.5, 300 sec: 2539.5). Total num frames: 126976. Throughput: 0: 702.6. Samples: 31616. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:02,876][00205] Avg episode reward: [(0, '4.507')] +[2023-02-24 12:16:02,879][11201] Saving new best policy, reward=4.507! +[2023-02-24 12:16:07,870][00205] Fps is (10 sec: 3277.0, 60 sec: 2681.0, 300 sec: 2681.0). Total num frames: 147456. Throughput: 0: 814.0. Samples: 36662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:07,878][00205] Avg episode reward: [(0, '4.360')] +[2023-02-24 12:16:11,386][11215] Updated weights for policy 0, policy_version 40 (0.0012) +[2023-02-24 12:16:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 2798.9, 300 sec: 2798.9). Total num frames: 167936. Throughput: 0: 856.4. Samples: 40160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:16:12,873][00205] Avg episode reward: [(0, '4.326')] +[2023-02-24 12:16:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 2898.7). Total num frames: 188416. Throughput: 0: 957.8. Samples: 46924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:17,872][00205] Avg episode reward: [(0, '4.480')] +[2023-02-24 12:16:22,602][11215] Updated weights for policy 0, policy_version 50 (0.0019) +[2023-02-24 12:16:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.6, 300 sec: 2925.7). Total num frames: 204800. Throughput: 0: 914.2. Samples: 51344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:22,875][00205] Avg episode reward: [(0, '4.418')] +[2023-02-24 12:16:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3003.7). Total num frames: 225280. Throughput: 0: 916.9. Samples: 53690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:16:27,873][00205] Avg episode reward: [(0, '4.363')] +[2023-02-24 12:16:32,346][11215] Updated weights for policy 0, policy_version 60 (0.0012) +[2023-02-24 12:16:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3072.0). Total num frames: 245760. Throughput: 0: 982.3. Samples: 60580. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:32,873][00205] Avg episode reward: [(0, '4.446')] +[2023-02-24 12:16:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3132.1). Total num frames: 266240. Throughput: 0: 968.6. Samples: 66900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:37,875][00205] Avg episode reward: [(0, '4.487')] +[2023-02-24 12:16:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3094.8). Total num frames: 278528. Throughput: 0: 939.7. Samples: 69166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:16:42,879][00205] Avg episode reward: [(0, '4.349')] +[2023-02-24 12:16:44,177][11215] Updated weights for policy 0, policy_version 70 (0.0048) +[2023-02-24 12:16:47,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3822.9, 300 sec: 3147.4). Total num frames: 299008. Throughput: 0: 945.0. Samples: 74142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:16:47,875][00205] Avg episode reward: [(0, '4.269')] +[2023-02-24 12:16:47,972][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth... +[2023-02-24 12:16:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3235.8). Total num frames: 323584. Throughput: 0: 985.9. Samples: 81028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:16:52,878][00205] Avg episode reward: [(0, '4.319')] +[2023-02-24 12:16:53,454][11215] Updated weights for policy 0, policy_version 80 (0.0018) +[2023-02-24 12:16:57,874][00205] Fps is (10 sec: 4094.3, 60 sec: 3754.4, 300 sec: 3237.7). Total num frames: 339968. Throughput: 0: 988.0. Samples: 84624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:16:57,883][00205] Avg episode reward: [(0, '4.369')] +[2023-02-24 12:17:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3239.6). Total num frames: 356352. Throughput: 0: 939.6. Samples: 89206. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:02,874][00205] Avg episode reward: [(0, '4.496')] +[2023-02-24 12:17:05,500][11215] Updated weights for policy 0, policy_version 90 (0.0025) +[2023-02-24 12:17:07,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 376832. Throughput: 0: 962.2. Samples: 94642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:17:07,875][00205] Avg episode reward: [(0, '4.494')] +[2023-02-24 12:17:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3345.1). Total num frames: 401408. Throughput: 0: 989.4. Samples: 98212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:12,873][00205] Avg episode reward: [(0, '4.592')] +[2023-02-24 12:17:12,880][11201] Saving new best policy, reward=4.592! +[2023-02-24 12:17:14,319][11215] Updated weights for policy 0, policy_version 100 (0.0029) +[2023-02-24 12:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3342.3). Total num frames: 417792. Throughput: 0: 981.1. Samples: 104730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:17:17,872][00205] Avg episode reward: [(0, '4.498')] +[2023-02-24 12:17:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3339.8). Total num frames: 434176. Throughput: 0: 934.1. Samples: 108934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:17:22,873][00205] Avg episode reward: [(0, '4.362')] +[2023-02-24 12:17:27,018][11215] Updated weights for policy 0, policy_version 110 (0.0029) +[2023-02-24 12:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3367.8). Total num frames: 454656. Throughput: 0: 934.0. Samples: 111194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:27,872][00205] Avg episode reward: [(0, '4.453')] +[2023-02-24 12:17:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3393.8). Total num frames: 475136. Throughput: 0: 978.1. Samples: 118156. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:17:32,878][00205] Avg episode reward: [(0, '4.647')] +[2023-02-24 12:17:32,884][11201] Saving new best policy, reward=4.647! +[2023-02-24 12:17:36,281][11215] Updated weights for policy 0, policy_version 120 (0.0015) +[2023-02-24 12:17:37,875][00205] Fps is (10 sec: 4094.1, 60 sec: 3822.8, 300 sec: 3417.9). Total num frames: 495616. Throughput: 0: 958.5. Samples: 124166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:17:37,878][00205] Avg episode reward: [(0, '4.540')] +[2023-02-24 12:17:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 924.4. Samples: 126220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:42,873][00205] Avg episode reward: [(0, '4.494')] +[2023-02-24 12:17:47,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3382.5). Total num frames: 524288. Throughput: 0: 924.0. Samples: 130784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:47,873][00205] Avg episode reward: [(0, '4.480')] +[2023-02-24 12:17:48,958][11215] Updated weights for policy 0, policy_version 130 (0.0029) +[2023-02-24 12:17:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3430.4). Total num frames: 548864. Throughput: 0: 948.7. Samples: 137334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:17:52,877][00205] Avg episode reward: [(0, '4.535')] +[2023-02-24 12:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3425.7). Total num frames: 565248. Throughput: 0: 944.3. Samples: 140704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:17:57,875][00205] Avg episode reward: [(0, '4.569')] +[2023-02-24 12:17:59,506][11215] Updated weights for policy 0, policy_version 140 (0.0025) +[2023-02-24 12:18:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 893.5. Samples: 144938. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:02,872][00205] Avg episode reward: [(0, '4.381')] +[2023-02-24 12:18:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3417.2). Total num frames: 598016. Throughput: 0: 900.6. Samples: 149460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:18:07,872][00205] Avg episode reward: [(0, '4.500')] +[2023-02-24 12:18:11,604][11215] Updated weights for policy 0, policy_version 150 (0.0025) +[2023-02-24 12:18:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3436.1). Total num frames: 618496. Throughput: 0: 919.8. Samples: 152584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:18:12,873][00205] Avg episode reward: [(0, '4.487')] +[2023-02-24 12:18:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3431.8). Total num frames: 634880. Throughput: 0: 904.4. Samples: 158856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:17,873][00205] Avg episode reward: [(0, '4.541')] +[2023-02-24 12:18:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3406.1). Total num frames: 647168. Throughput: 0: 862.2. Samples: 162960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:18:22,874][00205] Avg episode reward: [(0, '4.610')] +[2023-02-24 12:18:24,292][11215] Updated weights for policy 0, policy_version 160 (0.0015) +[2023-02-24 12:18:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3423.8). Total num frames: 667648. Throughput: 0: 863.5. Samples: 165078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:18:27,873][00205] Avg episode reward: [(0, '4.621')] +[2023-02-24 12:18:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3440.6). Total num frames: 688128. Throughput: 0: 904.7. Samples: 171496. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:32,873][00205] Avg episode reward: [(0, '4.513')] +[2023-02-24 12:18:34,056][11215] Updated weights for policy 0, policy_version 170 (0.0012) +[2023-02-24 12:18:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.1, 300 sec: 3456.6). Total num frames: 708608. Throughput: 0: 891.1. Samples: 177432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:37,873][00205] Avg episode reward: [(0, '4.413')] +[2023-02-24 12:18:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3432.8). Total num frames: 720896. Throughput: 0: 862.0. Samples: 179496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:18:42,874][00205] Avg episode reward: [(0, '4.574')] +[2023-02-24 12:18:47,477][11215] Updated weights for policy 0, policy_version 180 (0.0025) +[2023-02-24 12:18:47,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3429.2). Total num frames: 737280. Throughput: 0: 854.2. Samples: 183378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:47,876][00205] Avg episode reward: [(0, '4.684')] +[2023-02-24 12:18:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth... +[2023-02-24 12:18:48,003][11201] Saving new best policy, reward=4.684! +[2023-02-24 12:18:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3444.4). Total num frames: 757760. Throughput: 0: 887.9. Samples: 189418. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:18:52,876][00205] Avg episode reward: [(0, '4.764')] +[2023-02-24 12:18:52,881][11201] Saving new best policy, reward=4.764! +[2023-02-24 12:18:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3440.6). Total num frames: 774144. Throughput: 0: 890.4. Samples: 192650. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:18:57,872][00205] Avg episode reward: [(0, '4.736')] +[2023-02-24 12:18:58,010][11215] Updated weights for policy 0, policy_version 190 (0.0013) +[2023-02-24 12:19:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3437.1). Total num frames: 790528. Throughput: 0: 854.0. Samples: 197286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:02,875][00205] Avg episode reward: [(0, '4.664')] +[2023-02-24 12:19:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3433.7). Total num frames: 806912. Throughput: 0: 866.8. Samples: 201966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:07,873][00205] Avg episode reward: [(0, '4.628')] +[2023-02-24 12:19:10,101][11215] Updated weights for policy 0, policy_version 200 (0.0025) +[2023-02-24 12:19:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3447.5). Total num frames: 827392. Throughput: 0: 894.9. Samples: 205346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:12,878][00205] Avg episode reward: [(0, '4.605')] +[2023-02-24 12:19:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3460.7). Total num frames: 847872. Throughput: 0: 899.6. Samples: 211980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:19:17,874][00205] Avg episode reward: [(0, '4.689')] +[2023-02-24 12:19:21,222][11215] Updated weights for policy 0, policy_version 210 (0.0014) +[2023-02-24 12:19:22,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3457.0). Total num frames: 864256. Throughput: 0: 862.4. Samples: 216242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:22,877][00205] Avg episode reward: [(0, '4.659')] +[2023-02-24 12:19:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3453.5). Total num frames: 880640. Throughput: 0: 865.1. Samples: 218426. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:19:27,872][00205] Avg episode reward: [(0, '4.987')] +[2023-02-24 12:19:27,885][11201] Saving new best policy, reward=4.987! +[2023-02-24 12:19:32,149][11215] Updated weights for policy 0, policy_version 220 (0.0026) +[2023-02-24 12:19:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3465.8). Total num frames: 901120. Throughput: 0: 920.1. Samples: 224780. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:19:32,873][00205] Avg episode reward: [(0, '4.751')] +[2023-02-24 12:19:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3477.7). Total num frames: 921600. Throughput: 0: 923.4. Samples: 230972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:19:37,872][00205] Avg episode reward: [(0, '4.536')] +[2023-02-24 12:19:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3474.0). Total num frames: 937984. Throughput: 0: 898.8. Samples: 233098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:19:42,874][00205] Avg episode reward: [(0, '4.607')] +[2023-02-24 12:19:44,170][11215] Updated weights for policy 0, policy_version 230 (0.0030) +[2023-02-24 12:19:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3470.4). Total num frames: 954368. Throughput: 0: 895.2. Samples: 237572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:47,875][00205] Avg episode reward: [(0, '4.886')] +[2023-02-24 12:19:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3481.6). Total num frames: 974848. Throughput: 0: 937.3. Samples: 244142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:19:52,873][00205] Avg episode reward: [(0, '4.803')] +[2023-02-24 12:19:54,244][11215] Updated weights for policy 0, policy_version 240 (0.0026) +[2023-02-24 12:19:57,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3686.2, 300 sec: 3492.3). Total num frames: 995328. Throughput: 0: 932.5. Samples: 247310. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:19:57,875][00205] Avg episode reward: [(0, '4.768')] +[2023-02-24 12:20:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3474.5). Total num frames: 1007616. Throughput: 0: 881.9. Samples: 251664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:02,877][00205] Avg episode reward: [(0, '4.945')] +[2023-02-24 12:20:07,425][11215] Updated weights for policy 0, policy_version 250 (0.0012) +[2023-02-24 12:20:07,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.2, 300 sec: 3471.2). Total num frames: 1024000. Throughput: 0: 887.0. Samples: 256156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:07,877][00205] Avg episode reward: [(0, '5.081')] +[2023-02-24 12:20:07,888][11201] Saving new best policy, reward=5.081! +[2023-02-24 12:20:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1044480. Throughput: 0: 908.4. Samples: 259306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:12,874][00205] Avg episode reward: [(0, '5.107')] +[2023-02-24 12:20:12,880][11201] Saving new best policy, reward=5.107! +[2023-02-24 12:20:17,756][11215] Updated weights for policy 0, policy_version 260 (0.0012) +[2023-02-24 12:20:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.1). Total num frames: 1064960. Throughput: 0: 907.4. Samples: 265614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:17,875][00205] Avg episode reward: [(0, '5.047')] +[2023-02-24 12:20:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1077248. Throughput: 0: 858.2. Samples: 269590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:20:22,873][00205] Avg episode reward: [(0, '5.014')] +[2023-02-24 12:20:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1093632. Throughput: 0: 853.6. Samples: 271510. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:20:27,877][00205] Avg episode reward: [(0, '5.123')] +[2023-02-24 12:20:27,887][11201] Saving new best policy, reward=5.123! +[2023-02-24 12:20:30,782][11215] Updated weights for policy 0, policy_version 270 (0.0030) +[2023-02-24 12:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1114112. Throughput: 0: 887.3. Samples: 277500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:32,872][00205] Avg episode reward: [(0, '5.457')] +[2023-02-24 12:20:32,877][11201] Saving new best policy, reward=5.457! +[2023-02-24 12:20:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 1130496. Throughput: 0: 872.3. Samples: 283396. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:20:37,876][00205] Avg episode reward: [(0, '5.337')] +[2023-02-24 12:20:42,877][00205] Fps is (10 sec: 2865.1, 60 sec: 3412.9, 300 sec: 3637.7). Total num frames: 1142784. Throughput: 0: 846.3. Samples: 285396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:20:42,881][00205] Avg episode reward: [(0, '5.401')] +[2023-02-24 12:20:42,915][11215] Updated weights for policy 0, policy_version 280 (0.0013) +[2023-02-24 12:20:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 1159168. Throughput: 0: 840.1. Samples: 289468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:20:47,873][00205] Avg episode reward: [(0, '5.472')] +[2023-02-24 12:20:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth... +[2023-02-24 12:20:47,995][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth +[2023-02-24 12:20:48,011][11201] Saving new best policy, reward=5.472! +[2023-02-24 12:20:52,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 1183744. 
Throughput: 0: 878.3. Samples: 295680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:52,876][00205] Avg episode reward: [(0, '5.501')] +[2023-02-24 12:20:52,880][11201] Saving new best policy, reward=5.501! +[2023-02-24 12:20:53,934][11215] Updated weights for policy 0, policy_version 290 (0.0015) +[2023-02-24 12:20:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3637.8). Total num frames: 1200128. Throughput: 0: 875.2. Samples: 298692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:20:57,879][00205] Avg episode reward: [(0, '5.351')] +[2023-02-24 12:21:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1212416. Throughput: 0: 833.4. Samples: 303116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:02,874][00205] Avg episode reward: [(0, '5.385')] +[2023-02-24 12:21:07,198][11215] Updated weights for policy 0, policy_version 300 (0.0012) +[2023-02-24 12:21:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 1228800. Throughput: 0: 845.3. Samples: 307628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:07,872][00205] Avg episode reward: [(0, '5.232')] +[2023-02-24 12:21:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3596.1). Total num frames: 1249280. Throughput: 0: 872.5. Samples: 310772. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:21:12,875][00205] Avg episode reward: [(0, '5.181')] +[2023-02-24 12:21:17,493][11215] Updated weights for policy 0, policy_version 310 (0.0016) +[2023-02-24 12:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1269760. Throughput: 0: 880.0. Samples: 317100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:17,873][00205] Avg episode reward: [(0, '5.654')] +[2023-02-24 12:21:17,890][11201] Saving new best policy, reward=5.654! 
+[2023-02-24 12:21:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3582.3). Total num frames: 1282048. Throughput: 0: 836.2. Samples: 321026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:21:22,872][00205] Avg episode reward: [(0, '5.684')] +[2023-02-24 12:21:22,875][11201] Saving new best policy, reward=5.684! +[2023-02-24 12:21:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1298432. Throughput: 0: 836.8. Samples: 323048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:27,876][00205] Avg episode reward: [(0, '5.906')] +[2023-02-24 12:21:27,887][11201] Saving new best policy, reward=5.906! +[2023-02-24 12:21:30,480][11215] Updated weights for policy 0, policy_version 320 (0.0018) +[2023-02-24 12:21:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1318912. Throughput: 0: 879.6. Samples: 329050. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:21:32,873][00205] Avg episode reward: [(0, '5.781')] +[2023-02-24 12:21:37,874][00205] Fps is (10 sec: 3685.1, 60 sec: 3413.1, 300 sec: 3582.2). Total num frames: 1335296. Throughput: 0: 872.0. Samples: 334922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:37,884][00205] Avg episode reward: [(0, '5.859')] +[2023-02-24 12:21:42,523][11215] Updated weights for policy 0, policy_version 330 (0.0022) +[2023-02-24 12:21:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.9, 300 sec: 3568.4). Total num frames: 1351680. Throughput: 0: 849.4. Samples: 336916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:42,875][00205] Avg episode reward: [(0, '6.205')] +[2023-02-24 12:21:42,881][11201] Saving new best policy, reward=6.205! +[2023-02-24 12:21:47,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1368064. Throughput: 0: 842.8. Samples: 341040. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:47,877][00205] Avg episode reward: [(0, '6.115')] +[2023-02-24 12:21:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 1388544. Throughput: 0: 886.0. Samples: 347496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:21:52,876][00205] Avg episode reward: [(0, '5.819')] +[2023-02-24 12:21:53,336][11215] Updated weights for policy 0, policy_version 340 (0.0013) +[2023-02-24 12:21:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 1409024. Throughput: 0: 888.5. Samples: 350756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:21:57,875][00205] Avg episode reward: [(0, '5.845')] +[2023-02-24 12:22:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1421312. Throughput: 0: 839.8. Samples: 354890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:22:02,878][00205] Avg episode reward: [(0, '5.821')] +[2023-02-24 12:22:06,387][11215] Updated weights for policy 0, policy_version 350 (0.0019) +[2023-02-24 12:22:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1437696. Throughput: 0: 860.0. Samples: 359724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:22:07,878][00205] Avg episode reward: [(0, '6.200')] +[2023-02-24 12:22:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1458176. Throughput: 0: 887.9. Samples: 363002. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:22:12,878][00205] Avg episode reward: [(0, '6.405')] +[2023-02-24 12:22:12,882][11201] Saving new best policy, reward=6.405! +[2023-02-24 12:22:16,660][11215] Updated weights for policy 0, policy_version 360 (0.0020) +[2023-02-24 12:22:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 1474560. Throughput: 0: 887.8. Samples: 369000. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:22:17,875][00205] Avg episode reward: [(0, '5.882')] +[2023-02-24 12:22:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1490944. Throughput: 0: 849.6. Samples: 373150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:22,877][00205] Avg episode reward: [(0, '6.185')] +[2023-02-24 12:22:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1507328. Throughput: 0: 850.6. Samples: 375192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:27,873][00205] Avg episode reward: [(0, '6.282')] +[2023-02-24 12:22:29,299][11215] Updated weights for policy 0, policy_version 370 (0.0015) +[2023-02-24 12:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1527808. Throughput: 0: 898.9. Samples: 381492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:32,872][00205] Avg episode reward: [(0, '6.499')] +[2023-02-24 12:22:32,875][11201] Saving new best policy, reward=6.499! +[2023-02-24 12:22:37,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1544192. Throughput: 0: 877.9. Samples: 387004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:22:37,887][00205] Avg episode reward: [(0, '6.315')] +[2023-02-24 12:22:41,166][11215] Updated weights for policy 0, policy_version 380 (0.0012) +[2023-02-24 12:22:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3512.8). Total num frames: 1560576. Throughput: 0: 850.4. Samples: 389024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:22:42,872][00205] Avg episode reward: [(0, '6.222')] +[2023-02-24 12:22:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1576960. Throughput: 0: 857.3. Samples: 393470. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:22:47,878][00205] Avg episode reward: [(0, '6.490')] +[2023-02-24 12:22:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth... +[2023-02-24 12:22:48,008][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth +[2023-02-24 12:22:52,505][11215] Updated weights for policy 0, policy_version 390 (0.0016) +[2023-02-24 12:22:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1597440. Throughput: 0: 888.3. Samples: 399698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:22:52,873][00205] Avg episode reward: [(0, '6.882')] +[2023-02-24 12:22:52,876][11201] Saving new best policy, reward=6.882! +[2023-02-24 12:22:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 1613824. Throughput: 0: 884.8. Samples: 402816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:22:57,874][00205] Avg episode reward: [(0, '6.792')] +[2023-02-24 12:23:02,874][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1626112. Throughput: 0: 838.5. Samples: 406734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:02,876][00205] Avg episode reward: [(0, '7.109')] +[2023-02-24 12:23:02,878][11201] Saving new best policy, reward=7.109! +[2023-02-24 12:23:05,821][11215] Updated weights for policy 0, policy_version 400 (0.0013) +[2023-02-24 12:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1646592. Throughput: 0: 854.3. Samples: 411594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:23:07,873][00205] Avg episode reward: [(0, '7.453')] +[2023-02-24 12:23:07,881][11201] Saving new best policy, reward=7.453! +[2023-02-24 12:23:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1667072. 
Throughput: 0: 878.7. Samples: 414732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:12,873][00205] Avg episode reward: [(0, '7.514')] +[2023-02-24 12:23:12,877][11201] Saving new best policy, reward=7.514! +[2023-02-24 12:23:15,700][11215] Updated weights for policy 0, policy_version 410 (0.0012) +[2023-02-24 12:23:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1683456. Throughput: 0: 869.8. Samples: 420632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:17,872][00205] Avg episode reward: [(0, '7.496')] +[2023-02-24 12:23:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1695744. Throughput: 0: 836.9. Samples: 424662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:23:22,877][00205] Avg episode reward: [(0, '7.483')] +[2023-02-24 12:23:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1716224. Throughput: 0: 837.9. Samples: 426728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:27,878][00205] Avg episode reward: [(0, '7.827')] +[2023-02-24 12:23:27,886][11201] Saving new best policy, reward=7.827! +[2023-02-24 12:23:28,928][11215] Updated weights for policy 0, policy_version 420 (0.0020) +[2023-02-24 12:23:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1736704. Throughput: 0: 878.6. Samples: 433008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:32,879][00205] Avg episode reward: [(0, '8.086')] +[2023-02-24 12:23:32,883][11201] Saving new best policy, reward=8.086! +[2023-02-24 12:23:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3485.1). Total num frames: 1748992. Throughput: 0: 859.6. Samples: 438378. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:23:37,874][00205] Avg episode reward: [(0, '7.910')] +[2023-02-24 12:23:40,867][11215] Updated weights for policy 0, policy_version 430 (0.0011) +[2023-02-24 12:23:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1765376. Throughput: 0: 833.9. Samples: 440342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:23:42,878][00205] Avg episode reward: [(0, '7.784')] +[2023-02-24 12:23:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1781760. Throughput: 0: 848.8. Samples: 444928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:23:47,872][00205] Avg episode reward: [(0, '7.886')] +[2023-02-24 12:23:52,061][11215] Updated weights for policy 0, policy_version 440 (0.0018) +[2023-02-24 12:23:52,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1802240. Throughput: 0: 882.7. Samples: 451314. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:23:52,883][00205] Avg episode reward: [(0, '8.041')] +[2023-02-24 12:23:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1818624. Throughput: 0: 878.9. Samples: 454284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:23:57,872][00205] Avg episode reward: [(0, '8.421')] +[2023-02-24 12:23:57,892][11201] Saving new best policy, reward=8.421! +[2023-02-24 12:24:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1830912. Throughput: 0: 835.1. Samples: 458214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:02,875][00205] Avg episode reward: [(0, '8.603')] +[2023-02-24 12:24:02,878][11201] Saving new best policy, reward=8.603! +[2023-02-24 12:24:05,500][11215] Updated weights for policy 0, policy_version 450 (0.0032) +[2023-02-24 12:24:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). 
Total num frames: 1851392. Throughput: 0: 853.7. Samples: 463078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:24:07,880][00205] Avg episode reward: [(0, '9.296')] +[2023-02-24 12:24:07,890][11201] Saving new best policy, reward=9.296! +[2023-02-24 12:24:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1871872. Throughput: 0: 876.9. Samples: 466190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:24:12,872][00205] Avg episode reward: [(0, '9.761')] +[2023-02-24 12:24:12,880][11201] Saving new best policy, reward=9.761! +[2023-02-24 12:24:16,003][11215] Updated weights for policy 0, policy_version 460 (0.0027) +[2023-02-24 12:24:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1888256. Throughput: 0: 861.4. Samples: 471770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:17,874][00205] Avg episode reward: [(0, '9.532')] +[2023-02-24 12:24:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1900544. Throughput: 0: 829.6. Samples: 475712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:22,876][00205] Avg episode reward: [(0, '10.124')] +[2023-02-24 12:24:22,881][11201] Saving new best policy, reward=10.124! +[2023-02-24 12:24:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1916928. Throughput: 0: 836.3. Samples: 477974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:27,872][00205] Avg episode reward: [(0, '9.217')] +[2023-02-24 12:24:28,884][11215] Updated weights for policy 0, policy_version 470 (0.0033) +[2023-02-24 12:24:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1937408. Throughput: 0: 870.9. Samples: 484120. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:32,873][00205] Avg episode reward: [(0, '9.530')] +[2023-02-24 12:24:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1953792. Throughput: 0: 843.8. Samples: 489286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:37,876][00205] Avg episode reward: [(0, '9.708')] +[2023-02-24 12:24:41,426][11215] Updated weights for policy 0, policy_version 480 (0.0041) +[2023-02-24 12:24:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1966080. Throughput: 0: 821.3. Samples: 491242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:42,873][00205] Avg episode reward: [(0, '9.560')] +[2023-02-24 12:24:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1986560. Throughput: 0: 839.7. Samples: 496002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:24:47,877][00205] Avg episode reward: [(0, '10.393')] +[2023-02-24 12:24:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth... +[2023-02-24 12:24:48,015][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth +[2023-02-24 12:24:48,026][11201] Saving new best policy, reward=10.393! +[2023-02-24 12:24:52,339][11215] Updated weights for policy 0, policy_version 490 (0.0012) +[2023-02-24 12:24:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 2007040. Throughput: 0: 867.1. Samples: 502096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:24:52,873][00205] Avg episode reward: [(0, '10.734')] +[2023-02-24 12:24:52,876][11201] Saving new best policy, reward=10.734! +[2023-02-24 12:24:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2023424. Throughput: 0: 857.2. Samples: 504764. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:24:57,880][00205] Avg episode reward: [(0, '11.146')] +[2023-02-24 12:24:57,896][11201] Saving new best policy, reward=11.146! +[2023-02-24 12:25:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3429.5). Total num frames: 2035712. Throughput: 0: 815.2. Samples: 508456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:02,877][00205] Avg episode reward: [(0, '10.893')] +[2023-02-24 12:25:06,006][11215] Updated weights for policy 0, policy_version 500 (0.0025) +[2023-02-24 12:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2052096. Throughput: 0: 844.3. Samples: 513704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:07,877][00205] Avg episode reward: [(0, '10.468')] +[2023-02-24 12:25:12,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3344.9, 300 sec: 3415.6). Total num frames: 2072576. Throughput: 0: 864.1. Samples: 516860. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:12,878][00205] Avg episode reward: [(0, '10.186')] +[2023-02-24 12:25:16,819][11215] Updated weights for policy 0, policy_version 510 (0.0012) +[2023-02-24 12:25:17,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3344.8, 300 sec: 3429.5). Total num frames: 2088960. Throughput: 0: 853.7. Samples: 522542. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:17,877][00205] Avg episode reward: [(0, '10.895')] +[2023-02-24 12:25:22,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2105344. Throughput: 0: 830.5. Samples: 526658. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:22,876][00205] Avg episode reward: [(0, '11.838')] +[2023-02-24 12:25:22,883][11201] Saving new best policy, reward=11.838! +[2023-02-24 12:25:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2125824. Throughput: 0: 842.8. Samples: 529168. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:25:27,876][00205] Avg episode reward: [(0, '11.191')] +[2023-02-24 12:25:28,778][11215] Updated weights for policy 0, policy_version 520 (0.0015) +[2023-02-24 12:25:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2146304. Throughput: 0: 879.1. Samples: 535562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:25:32,872][00205] Avg episode reward: [(0, '12.906')] +[2023-02-24 12:25:32,881][11201] Saving new best policy, reward=12.906! +[2023-02-24 12:25:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3413.1, 300 sec: 3443.5). Total num frames: 2158592. Throughput: 0: 854.1. Samples: 540534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:25:37,876][00205] Avg episode reward: [(0, '12.234')] +[2023-02-24 12:25:41,184][11215] Updated weights for policy 0, policy_version 530 (0.0022) +[2023-02-24 12:25:42,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3481.5, 300 sec: 3443.4). Total num frames: 2174976. Throughput: 0: 839.1. Samples: 542524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:42,874][00205] Avg episode reward: [(0, '11.783')] +[2023-02-24 12:25:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2191360. Throughput: 0: 867.4. Samples: 547488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:47,875][00205] Avg episode reward: [(0, '11.929')] +[2023-02-24 12:25:51,704][11215] Updated weights for policy 0, policy_version 540 (0.0012) +[2023-02-24 12:25:52,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2215936. Throughput: 0: 894.9. Samples: 553976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:25:52,877][00205] Avg episode reward: [(0, '11.139')] +[2023-02-24 12:25:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2228224. Throughput: 0: 882.2. Samples: 556558. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:25:57,873][00205] Avg episode reward: [(0, '12.047')] +[2023-02-24 12:26:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2244608. Throughput: 0: 846.2. Samples: 560618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:02,873][00205] Avg episode reward: [(0, '12.846')] +[2023-02-24 12:26:05,046][11215] Updated weights for policy 0, policy_version 550 (0.0019) +[2023-02-24 12:26:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2265088. Throughput: 0: 875.7. Samples: 566064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:07,872][00205] Avg episode reward: [(0, '13.604')] +[2023-02-24 12:26:07,888][11201] Saving new best policy, reward=13.604! +[2023-02-24 12:26:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 2285568. Throughput: 0: 888.0. Samples: 569126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:12,878][00205] Avg episode reward: [(0, '13.737')] +[2023-02-24 12:26:12,885][11201] Saving new best policy, reward=13.737! +[2023-02-24 12:26:15,515][11215] Updated weights for policy 0, policy_version 560 (0.0015) +[2023-02-24 12:26:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.9, 300 sec: 3443.4). Total num frames: 2297856. Throughput: 0: 863.5. Samples: 574418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:17,873][00205] Avg episode reward: [(0, '14.525')] +[2023-02-24 12:26:17,890][11201] Saving new best policy, reward=14.525! +[2023-02-24 12:26:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2310144. Throughput: 0: 839.6. Samples: 578312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:22,874][00205] Avg episode reward: [(0, '14.451')] +[2023-02-24 12:26:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). 
Total num frames: 2330624. Throughput: 0: 856.8. Samples: 581076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:27,873][00205] Avg episode reward: [(0, '14.278')] +[2023-02-24 12:26:28,206][11215] Updated weights for policy 0, policy_version 570 (0.0023) +[2023-02-24 12:26:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3443.5). Total num frames: 2351104. Throughput: 0: 885.1. Samples: 587318. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:26:32,872][00205] Avg episode reward: [(0, '14.832')] +[2023-02-24 12:26:32,876][11201] Saving new best policy, reward=14.832! +[2023-02-24 12:26:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 2367488. Throughput: 0: 847.5. Samples: 592114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:37,878][00205] Avg episode reward: [(0, '14.690')] +[2023-02-24 12:26:40,690][11215] Updated weights for policy 0, policy_version 580 (0.0018) +[2023-02-24 12:26:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.5, 300 sec: 3429.5). Total num frames: 2379776. Throughput: 0: 835.2. Samples: 594142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:42,875][00205] Avg episode reward: [(0, '13.701')] +[2023-02-24 12:26:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2400256. Throughput: 0: 862.8. Samples: 599444. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:26:47,872][00205] Avg episode reward: [(0, '15.186')] +[2023-02-24 12:26:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth... +[2023-02-24 12:26:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth +[2023-02-24 12:26:48,046][11201] Saving new best policy, reward=15.186! 
+[2023-02-24 12:26:51,208][11215] Updated weights for policy 0, policy_version 590 (0.0014) +[2023-02-24 12:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2420736. Throughput: 0: 881.0. Samples: 605710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:26:52,878][00205] Avg episode reward: [(0, '14.601')] +[2023-02-24 12:26:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2437120. Throughput: 0: 866.8. Samples: 608134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:26:57,873][00205] Avg episode reward: [(0, '14.396')] +[2023-02-24 12:27:02,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2449408. Throughput: 0: 839.4. Samples: 612190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:27:02,880][00205] Avg episode reward: [(0, '14.811')] +[2023-02-24 12:27:04,374][11215] Updated weights for policy 0, policy_version 600 (0.0020) +[2023-02-24 12:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2469888. Throughput: 0: 884.1. Samples: 618096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:27:07,878][00205] Avg episode reward: [(0, '15.226')] +[2023-02-24 12:27:07,887][11201] Saving new best policy, reward=15.226! +[2023-02-24 12:27:12,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2494464. Throughput: 0: 892.8. Samples: 621252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:27:12,878][00205] Avg episode reward: [(0, '15.343')] +[2023-02-24 12:27:12,883][11201] Saving new best policy, reward=15.343! +[2023-02-24 12:27:14,283][11215] Updated weights for policy 0, policy_version 610 (0.0012) +[2023-02-24 12:27:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2506752. Throughput: 0: 870.8. Samples: 626504. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:27:17,880][00205] Avg episode reward: [(0, '15.456')] +[2023-02-24 12:27:17,895][11201] Saving new best policy, reward=15.456! +[2023-02-24 12:27:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2519040. Throughput: 0: 856.6. Samples: 630662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:22,875][00205] Avg episode reward: [(0, '16.697')] +[2023-02-24 12:27:22,879][11201] Saving new best policy, reward=16.697! +[2023-02-24 12:27:27,025][11215] Updated weights for policy 0, policy_version 620 (0.0025) +[2023-02-24 12:27:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2539520. Throughput: 0: 875.3. Samples: 633530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:27,875][00205] Avg episode reward: [(0, '16.487')] +[2023-02-24 12:27:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2564096. Throughput: 0: 902.6. Samples: 640060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:32,873][00205] Avg episode reward: [(0, '18.070')] +[2023-02-24 12:27:32,880][11201] Saving new best policy, reward=18.070! +[2023-02-24 12:27:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2576384. Throughput: 0: 865.5. Samples: 644656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:37,876][00205] Avg episode reward: [(0, '19.315')] +[2023-02-24 12:27:37,889][11201] Saving new best policy, reward=19.315! +[2023-02-24 12:27:38,589][11215] Updated weights for policy 0, policy_version 630 (0.0016) +[2023-02-24 12:27:42,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2588672. Throughput: 0: 854.9. Samples: 646606. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:27:42,872][00205] Avg episode reward: [(0, '18.752')] +[2023-02-24 12:27:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2613248. Throughput: 0: 886.7. Samples: 652090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:27:47,873][00205] Avg episode reward: [(0, '19.421')] +[2023-02-24 12:27:47,885][11201] Saving new best policy, reward=19.421! +[2023-02-24 12:27:49,817][11215] Updated weights for policy 0, policy_version 640 (0.0027) +[2023-02-24 12:27:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2633728. Throughput: 0: 899.0. Samples: 658552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:52,874][00205] Avg episode reward: [(0, '18.641')] +[2023-02-24 12:27:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2646016. Throughput: 0: 879.2. Samples: 660816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:27:57,874][00205] Avg episode reward: [(0, '18.167')] +[2023-02-24 12:28:02,866][11215] Updated weights for policy 0, policy_version 650 (0.0028) +[2023-02-24 12:28:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2662400. Throughput: 0: 853.8. Samples: 664926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:02,879][00205] Avg episode reward: [(0, '18.468')] +[2023-02-24 12:28:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2682880. Throughput: 0: 893.1. Samples: 670852. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:07,879][00205] Avg episode reward: [(0, '19.198')] +[2023-02-24 12:28:12,188][11215] Updated weights for policy 0, policy_version 660 (0.0021) +[2023-02-24 12:28:12,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 2703360. Throughput: 0: 901.2. Samples: 674086. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:28:12,883][00205] Avg episode reward: [(0, '19.262')] +[2023-02-24 12:28:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2715648. Throughput: 0: 869.5. Samples: 679188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:28:17,873][00205] Avg episode reward: [(0, '19.826')] +[2023-02-24 12:28:17,896][11201] Saving new best policy, reward=19.826! +[2023-02-24 12:28:22,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2732032. Throughput: 0: 857.9. Samples: 683262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:28:22,873][00205] Avg episode reward: [(0, '20.419')] +[2023-02-24 12:28:22,875][11201] Saving new best policy, reward=20.419! +[2023-02-24 12:28:25,364][11215] Updated weights for policy 0, policy_version 670 (0.0015) +[2023-02-24 12:28:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2752512. Throughput: 0: 883.5. Samples: 686364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:28:27,872][00205] Avg episode reward: [(0, '19.222')] +[2023-02-24 12:28:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 2772992. Throughput: 0: 906.4. Samples: 692878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:28:32,879][00205] Avg episode reward: [(0, '20.182')] +[2023-02-24 12:28:36,282][11215] Updated weights for policy 0, policy_version 680 (0.0027) +[2023-02-24 12:28:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2789376. Throughput: 0: 864.6. Samples: 697458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:28:37,873][00205] Avg episode reward: [(0, '19.719')] +[2023-02-24 12:28:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2801664. Throughput: 0: 858.8. Samples: 699460. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:28:42,878][00205] Avg episode reward: [(0, '18.783')] +[2023-02-24 12:28:47,846][11215] Updated weights for policy 0, policy_version 690 (0.0014) +[2023-02-24 12:28:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2826240. Throughput: 0: 895.8. Samples: 705238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:28:47,872][00205] Avg episode reward: [(0, '17.900')] +[2023-02-24 12:28:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth... +[2023-02-24 12:28:47,997][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth +[2023-02-24 12:28:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2842624. Throughput: 0: 907.5. Samples: 711688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:52,874][00205] Avg episode reward: [(0, '16.623')] +[2023-02-24 12:28:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2859008. Throughput: 0: 878.2. Samples: 713602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:28:57,873][00205] Avg episode reward: [(0, '16.645')] +[2023-02-24 12:29:00,608][11215] Updated weights for policy 0, policy_version 700 (0.0028) +[2023-02-24 12:29:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2871296. Throughput: 0: 855.0. Samples: 717664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:02,876][00205] Avg episode reward: [(0, '17.588')] +[2023-02-24 12:29:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2891776. Throughput: 0: 897.9. Samples: 723666. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:07,873][00205] Avg episode reward: [(0, '17.711')] +[2023-02-24 12:29:10,964][11215] Updated weights for policy 0, policy_version 710 (0.0037) +[2023-02-24 12:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3471.2). Total num frames: 2912256. Throughput: 0: 898.6. Samples: 726800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:29:12,876][00205] Avg episode reward: [(0, '18.661')] +[2023-02-24 12:29:17,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3481.3, 300 sec: 3471.1). Total num frames: 2924544. Throughput: 0: 857.9. Samples: 731486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:17,878][00205] Avg episode reward: [(0, '19.159')] +[2023-02-24 12:29:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2940928. Throughput: 0: 854.0. Samples: 735890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:29:22,876][00205] Avg episode reward: [(0, '19.729')] +[2023-02-24 12:29:23,915][11215] Updated weights for policy 0, policy_version 720 (0.0026) +[2023-02-24 12:29:27,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2965504. Throughput: 0: 880.8. Samples: 739098. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:29:27,873][00205] Avg episode reward: [(0, '19.640')] +[2023-02-24 12:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 2981888. Throughput: 0: 898.1. Samples: 745654. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:29:32,876][00205] Avg episode reward: [(0, '20.169')] +[2023-02-24 12:29:34,526][11215] Updated weights for policy 0, policy_version 730 (0.0015) +[2023-02-24 12:29:37,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3498.9). Total num frames: 2998272. Throughput: 0: 850.3. Samples: 749954. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:29:37,875][00205] Avg episode reward: [(0, '20.339')] +[2023-02-24 12:29:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3014656. Throughput: 0: 852.5. Samples: 751966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:29:42,873][00205] Avg episode reward: [(0, '21.573')] +[2023-02-24 12:29:42,881][11201] Saving new best policy, reward=21.573! +[2023-02-24 12:29:46,581][11215] Updated weights for policy 0, policy_version 740 (0.0023) +[2023-02-24 12:29:47,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3035136. Throughput: 0: 892.8. Samples: 757842. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:29:47,872][00205] Avg episode reward: [(0, '22.027')] +[2023-02-24 12:29:47,886][11201] Saving new best policy, reward=22.027! +[2023-02-24 12:29:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3051520. Throughput: 0: 895.1. Samples: 763946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:29:52,875][00205] Avg episode reward: [(0, '21.699')] +[2023-02-24 12:29:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3067904. Throughput: 0: 870.3. Samples: 765962. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:29:57,874][00205] Avg episode reward: [(0, '22.312')] +[2023-02-24 12:29:57,886][11201] Saving new best policy, reward=22.312! +[2023-02-24 12:29:59,122][11215] Updated weights for policy 0, policy_version 750 (0.0020) +[2023-02-24 12:30:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3084288. Throughput: 0: 853.4. Samples: 769884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:30:02,873][00205] Avg episode reward: [(0, '21.738')] +[2023-02-24 12:30:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). 
Total num frames: 3104768. Throughput: 0: 899.1. Samples: 776350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:30:07,877][00205] Avg episode reward: [(0, '22.720')] +[2023-02-24 12:30:07,891][11201] Saving new best policy, reward=22.720! +[2023-02-24 12:30:09,415][11215] Updated weights for policy 0, policy_version 760 (0.0014) +[2023-02-24 12:30:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3121152. Throughput: 0: 898.2. Samples: 779516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:12,873][00205] Avg episode reward: [(0, '22.461')] +[2023-02-24 12:30:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.2, 300 sec: 3499.0). Total num frames: 3137536. Throughput: 0: 852.9. Samples: 784036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:17,876][00205] Avg episode reward: [(0, '22.451')] +[2023-02-24 12:30:22,406][11215] Updated weights for policy 0, policy_version 770 (0.0017) +[2023-02-24 12:30:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3153920. Throughput: 0: 862.7. Samples: 788772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:30:22,872][00205] Avg episode reward: [(0, '21.952')] +[2023-02-24 12:30:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3178496. Throughput: 0: 891.9. Samples: 792100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:30:27,879][00205] Avg episode reward: [(0, '21.292')] +[2023-02-24 12:30:31,449][11215] Updated weights for policy 0, policy_version 780 (0.0014) +[2023-02-24 12:30:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 3194880. Throughput: 0: 917.6. Samples: 799132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:30:32,876][00205] Avg episode reward: [(0, '20.844')] +[2023-02-24 12:30:37,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.9, 300 sec: 3512.9). 
Total num frames: 3211264. Throughput: 0: 869.6. Samples: 803078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:37,878][00205] Avg episode reward: [(0, '20.241')] +[2023-02-24 12:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3227648. Throughput: 0: 871.1. Samples: 805160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:30:42,872][00205] Avg episode reward: [(0, '21.477')] +[2023-02-24 12:30:44,598][11215] Updated weights for policy 0, policy_version 790 (0.0023) +[2023-02-24 12:30:47,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3248128. Throughput: 0: 921.2. Samples: 811336. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:30:47,873][00205] Avg episode reward: [(0, '22.575')] +[2023-02-24 12:30:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth... +[2023-02-24 12:30:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth +[2023-02-24 12:30:52,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3617.8, 300 sec: 3526.7). Total num frames: 3268608. Throughput: 0: 910.7. Samples: 817338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:30:52,885][00205] Avg episode reward: [(0, '22.106')] +[2023-02-24 12:30:55,704][11215] Updated weights for policy 0, policy_version 800 (0.0029) +[2023-02-24 12:30:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3280896. Throughput: 0: 886.8. Samples: 819424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:30:57,874][00205] Avg episode reward: [(0, '22.843')] +[2023-02-24 12:30:57,885][11201] Saving new best policy, reward=22.843! +[2023-02-24 12:31:02,870][00205] Fps is (10 sec: 2868.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3297280. Throughput: 0: 879.2. Samples: 823602. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:02,872][00205] Avg episode reward: [(0, '24.269')] +[2023-02-24 12:31:02,882][11201] Saving new best policy, reward=24.269! +[2023-02-24 12:31:07,107][11215] Updated weights for policy 0, policy_version 810 (0.0023) +[2023-02-24 12:31:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.8, 300 sec: 3499.0). Total num frames: 3317760. Throughput: 0: 918.7. Samples: 830114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:07,873][00205] Avg episode reward: [(0, '25.845')] +[2023-02-24 12:31:07,886][11201] Saving new best policy, reward=25.845! +[2023-02-24 12:31:12,874][00205] Fps is (10 sec: 4094.2, 60 sec: 3617.9, 300 sec: 3526.7). Total num frames: 3338240. Throughput: 0: 915.9. Samples: 833320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:12,877][00205] Avg episode reward: [(0, '24.641')] +[2023-02-24 12:31:17,872][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3350528. Throughput: 0: 857.0. Samples: 837698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:31:17,877][00205] Avg episode reward: [(0, '24.794')] +[2023-02-24 12:31:19,799][11215] Updated weights for policy 0, policy_version 820 (0.0014) +[2023-02-24 12:31:22,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 3371008. Throughput: 0: 879.1. Samples: 842638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:31:22,881][00205] Avg episode reward: [(0, '25.251')] +[2023-02-24 12:31:27,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3391488. Throughput: 0: 905.2. Samples: 845896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:31:27,873][00205] Avg episode reward: [(0, '25.312')] +[2023-02-24 12:31:29,403][11215] Updated weights for policy 0, policy_version 830 (0.0035) +[2023-02-24 12:31:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3526.7). 
Total num frames: 3407872. Throughput: 0: 905.1. Samples: 852064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:32,872][00205] Avg episode reward: [(0, '25.504')] +[2023-02-24 12:31:37,872][00205] Fps is (10 sec: 3276.2, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3424256. Throughput: 0: 864.5. Samples: 856236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:37,875][00205] Avg episode reward: [(0, '24.761')] +[2023-02-24 12:31:42,707][11215] Updated weights for policy 0, policy_version 840 (0.0026) +[2023-02-24 12:31:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3440640. Throughput: 0: 862.1. Samples: 858220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:31:42,875][00205] Avg episode reward: [(0, '25.354')] +[2023-02-24 12:31:47,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3461120. Throughput: 0: 913.2. Samples: 864694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:47,876][00205] Avg episode reward: [(0, '24.708')] +[2023-02-24 12:31:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.9, 300 sec: 3526.7). Total num frames: 3477504. Throughput: 0: 897.5. Samples: 870502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:31:52,873][00205] Avg episode reward: [(0, '24.505')] +[2023-02-24 12:31:52,901][11215] Updated weights for policy 0, policy_version 850 (0.0019) +[2023-02-24 12:31:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3493888. Throughput: 0: 871.6. Samples: 872538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:31:57,877][00205] Avg episode reward: [(0, '24.874')] +[2023-02-24 12:32:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3510272. Throughput: 0: 877.1. Samples: 877168. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:32:02,872][00205] Avg episode reward: [(0, '23.638')] +[2023-02-24 12:32:04,697][11215] Updated weights for policy 0, policy_version 860 (0.0038) +[2023-02-24 12:32:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3526.7). Total num frames: 3534848. Throughput: 0: 915.9. Samples: 883856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:07,875][00205] Avg episode reward: [(0, '23.967')] +[2023-02-24 12:32:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 3551232. Throughput: 0: 912.1. Samples: 886942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:12,874][00205] Avg episode reward: [(0, '21.795')] +[2023-02-24 12:32:16,744][11215] Updated weights for policy 0, policy_version 870 (0.0022) +[2023-02-24 12:32:17,872][00205] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3563520. Throughput: 0: 866.7. Samples: 891068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:32:17,876][00205] Avg episode reward: [(0, '21.779')] +[2023-02-24 12:32:22,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3584000. Throughput: 0: 887.8. Samples: 896186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:22,875][00205] Avg episode reward: [(0, '22.045')] +[2023-02-24 12:32:27,234][11215] Updated weights for policy 0, policy_version 880 (0.0022) +[2023-02-24 12:32:27,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3604480. Throughput: 0: 915.7. Samples: 899428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:27,875][00205] Avg episode reward: [(0, '21.926')] +[2023-02-24 12:32:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3620864. Throughput: 0: 907.9. Samples: 905550. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:32:32,875][00205] Avg episode reward: [(0, '21.742')] +[2023-02-24 12:32:37,872][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 3637248. Throughput: 0: 870.2. Samples: 909660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:37,874][00205] Avg episode reward: [(0, '21.922')] +[2023-02-24 12:32:40,249][11215] Updated weights for policy 0, policy_version 890 (0.0026) +[2023-02-24 12:32:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3653632. Throughput: 0: 873.9. Samples: 911862. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:32:42,878][00205] Avg episode reward: [(0, '22.072')] +[2023-02-24 12:32:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3678208. Throughput: 0: 918.6. Samples: 918504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:47,873][00205] Avg episode reward: [(0, '21.527')] +[2023-02-24 12:32:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth... +[2023-02-24 12:32:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth +[2023-02-24 12:32:49,504][11215] Updated weights for policy 0, policy_version 900 (0.0014) +[2023-02-24 12:32:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3694592. Throughput: 0: 894.0. Samples: 924084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:52,876][00205] Avg episode reward: [(0, '20.238')] +[2023-02-24 12:32:57,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3706880. Throughput: 0: 871.4. Samples: 926154. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:32:57,874][00205] Avg episode reward: [(0, '20.008')] +[2023-02-24 12:33:02,222][11215] Updated weights for policy 0, policy_version 910 (0.0020) +[2023-02-24 12:33:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3727360. Throughput: 0: 889.2. Samples: 931082. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:02,878][00205] Avg episode reward: [(0, '20.590')] +[2023-02-24 12:33:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 3751936. Throughput: 0: 922.6. Samples: 937702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:07,872][00205] Avg episode reward: [(0, '20.570')] +[2023-02-24 12:33:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3764224. Throughput: 0: 914.8. Samples: 940592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:33:12,880][00205] Avg episode reward: [(0, '21.241')] +[2023-02-24 12:33:13,123][11215] Updated weights for policy 0, policy_version 920 (0.0015) +[2023-02-24 12:33:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 3780608. Throughput: 0: 871.6. Samples: 944770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:33:17,875][00205] Avg episode reward: [(0, '20.790')] +[2023-02-24 12:33:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3801088. Throughput: 0: 901.6. Samples: 950230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:33:22,877][00205] Avg episode reward: [(0, '23.611')] +[2023-02-24 12:33:24,717][11215] Updated weights for policy 0, policy_version 930 (0.0038) +[2023-02-24 12:33:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3821568. Throughput: 0: 924.8. Samples: 953480. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:33:27,872][00205] Avg episode reward: [(0, '23.023')] +[2023-02-24 12:33:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3837952. Throughput: 0: 907.0. Samples: 959318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:32,873][00205] Avg episode reward: [(0, '22.371')] +[2023-02-24 12:33:36,764][11215] Updated weights for policy 0, policy_version 940 (0.0026) +[2023-02-24 12:33:37,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 3850240. Throughput: 0: 874.8. Samples: 963452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:33:37,873][00205] Avg episode reward: [(0, '23.057')] +[2023-02-24 12:33:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3870720. Throughput: 0: 883.0. Samples: 965890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:42,872][00205] Avg episode reward: [(0, '23.216')] +[2023-02-24 12:33:46,999][11215] Updated weights for policy 0, policy_version 950 (0.0012) +[2023-02-24 12:33:47,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3891200. Throughput: 0: 919.3. Samples: 972450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:47,872][00205] Avg episode reward: [(0, '21.615')] +[2023-02-24 12:33:52,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3549.6, 300 sec: 3554.4). Total num frames: 3907584. Throughput: 0: 889.4. Samples: 977728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:52,876][00205] Avg episode reward: [(0, '22.134')] +[2023-02-24 12:33:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 3923968. Throughput: 0: 870.5. Samples: 979764. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:33:57,875][00205] Avg episode reward: [(0, '22.780')] +[2023-02-24 12:34:00,073][11215] Updated weights for policy 0, policy_version 960 (0.0019) +[2023-02-24 12:34:02,870][00205] Fps is (10 sec: 3688.1, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 3944448. Throughput: 0: 892.1. Samples: 984916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:02,876][00205] Avg episode reward: [(0, '22.696')] +[2023-02-24 12:34:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3964928. Throughput: 0: 920.1. Samples: 991634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:34:07,872][00205] Avg episode reward: [(0, '22.213')] +[2023-02-24 12:34:09,599][11215] Updated weights for policy 0, policy_version 970 (0.0014) +[2023-02-24 12:34:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 3981312. Throughput: 0: 907.9. Samples: 994334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:12,875][00205] Avg episode reward: [(0, '22.002')] +[2023-02-24 12:34:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3993600. Throughput: 0: 869.8. Samples: 998460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:34:17,876][00205] Avg episode reward: [(0, '21.600')] +[2023-02-24 12:34:22,322][11215] Updated weights for policy 0, policy_version 980 (0.0019) +[2023-02-24 12:34:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4014080. Throughput: 0: 903.1. Samples: 1004092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:22,876][00205] Avg episode reward: [(0, '24.195')] +[2023-02-24 12:34:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4038656. Throughput: 0: 920.5. Samples: 1007312. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:34:27,878][00205] Avg episode reward: [(0, '23.973')] +[2023-02-24 12:34:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4050944. Throughput: 0: 899.2. Samples: 1012912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:32,876][00205] Avg episode reward: [(0, '23.066')] +[2023-02-24 12:34:33,331][11215] Updated weights for policy 0, policy_version 990 (0.0013) +[2023-02-24 12:34:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 4067328. Throughput: 0: 874.4. Samples: 1017072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:37,876][00205] Avg episode reward: [(0, '25.073')] +[2023-02-24 12:34:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4087808. Throughput: 0: 890.0. Samples: 1019816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:42,873][00205] Avg episode reward: [(0, '26.109')] +[2023-02-24 12:34:42,875][11201] Saving new best policy, reward=26.109! +[2023-02-24 12:34:44,417][11215] Updated weights for policy 0, policy_version 1000 (0.0017) +[2023-02-24 12:34:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4108288. Throughput: 0: 918.4. Samples: 1026242. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:34:47,873][00205] Avg episode reward: [(0, '26.954')] +[2023-02-24 12:34:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth... +[2023-02-24 12:34:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth +[2023-02-24 12:34:48,061][11201] Saving new best policy, reward=26.954! +[2023-02-24 12:34:52,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3550.1, 300 sec: 3568.4). Total num frames: 4120576. Throughput: 0: 873.1. Samples: 1030926. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:34:52,873][00205] Avg episode reward: [(0, '26.611')] +[2023-02-24 12:34:57,874][00205] Fps is (10 sec: 2456.6, 60 sec: 3481.3, 300 sec: 3554.4). Total num frames: 4132864. Throughput: 0: 857.2. Samples: 1032910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:34:57,877][00205] Avg episode reward: [(0, '26.001')] +[2023-02-24 12:34:57,910][11215] Updated weights for policy 0, policy_version 1010 (0.0021) +[2023-02-24 12:35:02,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4153344. Throughput: 0: 877.5. Samples: 1037946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:02,873][00205] Avg episode reward: [(0, '26.890')] +[2023-02-24 12:35:07,732][11215] Updated weights for policy 0, policy_version 1020 (0.0026) +[2023-02-24 12:35:07,870][00205] Fps is (10 sec: 4507.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4177920. Throughput: 0: 898.0. Samples: 1044500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:07,872][00205] Avg episode reward: [(0, '25.811')] +[2023-02-24 12:35:12,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3481.4, 300 sec: 3568.3). Total num frames: 4190208. Throughput: 0: 880.8. Samples: 1046952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:12,883][00205] Avg episode reward: [(0, '26.692')] +[2023-02-24 12:35:17,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4206592. Throughput: 0: 847.7. Samples: 1051058. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:17,874][00205] Avg episode reward: [(0, '25.030')] +[2023-02-24 12:35:20,544][11215] Updated weights for policy 0, policy_version 1030 (0.0021) +[2023-02-24 12:35:22,870][00205] Fps is (10 sec: 3687.7, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4227072. Throughput: 0: 888.2. Samples: 1057042. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:22,872][00205] Avg episode reward: [(0, '24.995')] +[2023-02-24 12:35:27,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 4247552. Throughput: 0: 898.8. Samples: 1060262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:27,876][00205] Avg episode reward: [(0, '23.880')] +[2023-02-24 12:35:30,855][11215] Updated weights for policy 0, policy_version 1040 (0.0025) +[2023-02-24 12:35:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4263936. Throughput: 0: 874.1. Samples: 1065578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:32,876][00205] Avg episode reward: [(0, '24.515')] +[2023-02-24 12:35:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4276224. Throughput: 0: 863.2. Samples: 1069770. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:37,873][00205] Avg episode reward: [(0, '24.452')] +[2023-02-24 12:35:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4296704. Throughput: 0: 883.5. Samples: 1072664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:35:42,873][00205] Avg episode reward: [(0, '23.207')] +[2023-02-24 12:35:43,053][11215] Updated weights for policy 0, policy_version 1050 (0.0015) +[2023-02-24 12:35:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4321280. Throughput: 0: 917.2. Samples: 1079218. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:35:47,873][00205] Avg episode reward: [(0, '23.072')] +[2023-02-24 12:35:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4333568. Throughput: 0: 878.6. Samples: 1084036. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:35:52,873][00205] Avg episode reward: [(0, '23.314')] +[2023-02-24 12:35:54,833][11215] Updated weights for policy 0, policy_version 1060 (0.0015) +[2023-02-24 12:35:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.4, 300 sec: 3568.4). Total num frames: 4349952. Throughput: 0: 870.1. Samples: 1086102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:35:57,879][00205] Avg episode reward: [(0, '22.996')] +[2023-02-24 12:36:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4370432. Throughput: 0: 904.5. Samples: 1091762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:36:02,877][00205] Avg episode reward: [(0, '23.478')] +[2023-02-24 12:36:05,185][11215] Updated weights for policy 0, policy_version 1070 (0.0016) +[2023-02-24 12:36:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3568.4). Total num frames: 4390912. Throughput: 0: 917.2. Samples: 1098320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:36:07,879][00205] Avg episode reward: [(0, '23.825')] +[2023-02-24 12:36:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3582.3). Total num frames: 4407296. Throughput: 0: 896.0. Samples: 1100582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:12,877][00205] Avg episode reward: [(0, '24.060')] +[2023-02-24 12:36:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4419584. Throughput: 0: 871.6. Samples: 1104800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:17,873][00205] Avg episode reward: [(0, '25.036')] +[2023-02-24 12:36:18,075][11215] Updated weights for policy 0, policy_version 1080 (0.0020) +[2023-02-24 12:36:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4440064. Throughput: 0: 911.4. Samples: 1110784. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:36:22,875][00205] Avg episode reward: [(0, '24.692')] +[2023-02-24 12:36:27,528][11215] Updated weights for policy 0, policy_version 1090 (0.0016) +[2023-02-24 12:36:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4464640. Throughput: 0: 919.2. Samples: 1114026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:27,873][00205] Avg episode reward: [(0, '25.739')] +[2023-02-24 12:36:32,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4476928. Throughput: 0: 888.7. Samples: 1119210. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:36:32,875][00205] Avg episode reward: [(0, '25.352')] +[2023-02-24 12:36:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4493312. Throughput: 0: 872.7. Samples: 1123308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:36:37,879][00205] Avg episode reward: [(0, '25.245')] +[2023-02-24 12:36:40,517][11215] Updated weights for policy 0, policy_version 1100 (0.0023) +[2023-02-24 12:36:42,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4513792. Throughput: 0: 895.7. Samples: 1126410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:36:42,872][00205] Avg episode reward: [(0, '24.665')] +[2023-02-24 12:36:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4534272. Throughput: 0: 913.9. Samples: 1132888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:36:47,879][00205] Avg episode reward: [(0, '23.801')] +[2023-02-24 12:36:47,902][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth... 
+[2023-02-24 12:36:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth +[2023-02-24 12:36:51,888][11215] Updated weights for policy 0, policy_version 1110 (0.0016) +[2023-02-24 12:36:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4546560. Throughput: 0: 868.5. Samples: 1137400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:36:52,874][00205] Avg episode reward: [(0, '23.689')] +[2023-02-24 12:36:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4562944. Throughput: 0: 862.9. Samples: 1139412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:36:57,874][00205] Avg episode reward: [(0, '24.247')] +[2023-02-24 12:37:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4583424. Throughput: 0: 898.9. Samples: 1145252. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:37:02,873][00205] Avg episode reward: [(0, '22.963')] +[2023-02-24 12:37:03,079][11215] Updated weights for policy 0, policy_version 1120 (0.0016) +[2023-02-24 12:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4603904. Throughput: 0: 912.9. Samples: 1151864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:07,876][00205] Avg episode reward: [(0, '22.026')] +[2023-02-24 12:37:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4620288. Throughput: 0: 886.8. Samples: 1153932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:37:12,876][00205] Avg episode reward: [(0, '22.896')] +[2023-02-24 12:37:15,580][11215] Updated weights for policy 0, policy_version 1130 (0.0020) +[2023-02-24 12:37:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4632576. Throughput: 0: 862.9. Samples: 1158038. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:17,873][00205] Avg episode reward: [(0, '24.056')] +[2023-02-24 12:37:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4657152. Throughput: 0: 908.4. Samples: 1164188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:22,873][00205] Avg episode reward: [(0, '25.627')] +[2023-02-24 12:37:25,359][11215] Updated weights for policy 0, policy_version 1140 (0.0020) +[2023-02-24 12:37:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4677632. Throughput: 0: 912.5. Samples: 1167472. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:37:27,872][00205] Avg episode reward: [(0, '24.836')] +[2023-02-24 12:37:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4689920. Throughput: 0: 877.6. Samples: 1172380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:32,873][00205] Avg episode reward: [(0, '25.505')] +[2023-02-24 12:37:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4706304. Throughput: 0: 871.8. Samples: 1176632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:37:37,872][00205] Avg episode reward: [(0, '27.100')] +[2023-02-24 12:37:37,897][11201] Saving new best policy, reward=27.100! +[2023-02-24 12:37:38,645][11215] Updated weights for policy 0, policy_version 1150 (0.0031) +[2023-02-24 12:37:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4726784. Throughput: 0: 897.6. Samples: 1179804. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:37:42,873][00205] Avg episode reward: [(0, '26.473')] +[2023-02-24 12:37:47,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 4747264. Throughput: 0: 911.5. Samples: 1186272. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:37:47,876][00205] Avg episode reward: [(0, '27.358')] +[2023-02-24 12:37:47,884][11201] Saving new best policy, reward=27.358! +[2023-02-24 12:37:48,971][11215] Updated weights for policy 0, policy_version 1160 (0.0013) +[2023-02-24 12:37:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4759552. Throughput: 0: 859.2. Samples: 1190528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:37:52,872][00205] Avg episode reward: [(0, '26.911')] +[2023-02-24 12:37:57,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4775936. Throughput: 0: 858.6. Samples: 1192568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:37:57,872][00205] Avg episode reward: [(0, '26.721')] +[2023-02-24 12:38:01,317][11215] Updated weights for policy 0, policy_version 1170 (0.0011) +[2023-02-24 12:38:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4796416. Throughput: 0: 901.9. Samples: 1198624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:38:02,872][00205] Avg episode reward: [(0, '25.721')] +[2023-02-24 12:38:07,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4816896. Throughput: 0: 904.7. Samples: 1204900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:38:07,880][00205] Avg episode reward: [(0, '25.433')] +[2023-02-24 12:38:12,859][11215] Updated weights for policy 0, policy_version 1180 (0.0023) +[2023-02-24 12:38:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4833280. Throughput: 0: 878.1. Samples: 1206988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:38:12,879][00205] Avg episode reward: [(0, '25.316')] +[2023-02-24 12:38:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4845568. Throughput: 0: 862.8. 
Samples: 1211208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:38:17,872][00205] Avg episode reward: [(0, '26.604')] +[2023-02-24 12:38:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4870144. Throughput: 0: 908.6. Samples: 1217520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:38:22,873][00205] Avg episode reward: [(0, '24.923')] +[2023-02-24 12:38:23,700][11215] Updated weights for policy 0, policy_version 1190 (0.0028) +[2023-02-24 12:38:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4890624. Throughput: 0: 911.3. Samples: 1220812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:38:27,879][00205] Avg episode reward: [(0, '26.016')] +[2023-02-24 12:38:32,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3549.4, 300 sec: 3568.3). Total num frames: 4902912. Throughput: 0: 873.1. Samples: 1225564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:32,884][00205] Avg episode reward: [(0, '25.041')] +[2023-02-24 12:38:36,551][11215] Updated weights for policy 0, policy_version 1200 (0.0014) +[2023-02-24 12:38:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4919296. Throughput: 0: 879.7. Samples: 1230116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:38:37,872][00205] Avg episode reward: [(0, '25.457')] +[2023-02-24 12:38:42,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4939776. Throughput: 0: 908.0. Samples: 1233430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:42,878][00205] Avg episode reward: [(0, '26.832')] +[2023-02-24 12:38:45,830][11215] Updated weights for policy 0, policy_version 1210 (0.0022) +[2023-02-24 12:38:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4960256. Throughput: 0: 919.5. Samples: 1240002. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:47,874][00205] Avg episode reward: [(0, '26.514')] +[2023-02-24 12:38:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth... +[2023-02-24 12:38:48,096][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth +[2023-02-24 12:38:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4972544. Throughput: 0: 868.1. Samples: 1243964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:52,882][00205] Avg episode reward: [(0, '27.554')] +[2023-02-24 12:38:52,891][11201] Saving new best policy, reward=27.554! +[2023-02-24 12:38:57,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4988928. Throughput: 0: 866.5. Samples: 1245980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:38:57,872][00205] Avg episode reward: [(0, '27.820')] +[2023-02-24 12:38:57,889][11201] Saving new best policy, reward=27.820! +[2023-02-24 12:38:59,147][11215] Updated weights for policy 0, policy_version 1220 (0.0026) +[2023-02-24 12:39:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 5013504. Throughput: 0: 906.9. Samples: 1252020. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:02,875][00205] Avg episode reward: [(0, '27.834')] +[2023-02-24 12:39:02,880][11201] Saving new best policy, reward=27.834! +[2023-02-24 12:39:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5029888. Throughput: 0: 901.1. Samples: 1258070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:07,880][00205] Avg episode reward: [(0, '27.955')] +[2023-02-24 12:39:07,897][11201] Saving new best policy, reward=27.955! 
+[2023-02-24 12:39:10,087][11215] Updated weights for policy 0, policy_version 1230 (0.0019) +[2023-02-24 12:39:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5042176. Throughput: 0: 871.9. Samples: 1260046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:12,875][00205] Avg episode reward: [(0, '26.738')] +[2023-02-24 12:39:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5058560. Throughput: 0: 860.2. Samples: 1264266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:17,873][00205] Avg episode reward: [(0, '26.010')] +[2023-02-24 12:39:21,693][11215] Updated weights for policy 0, policy_version 1240 (0.0025) +[2023-02-24 12:39:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5083136. Throughput: 0: 900.9. Samples: 1270656. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:39:22,873][00205] Avg episode reward: [(0, '25.109')] +[2023-02-24 12:39:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5099520. Throughput: 0: 899.9. Samples: 1273924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:27,873][00205] Avg episode reward: [(0, '23.278')] +[2023-02-24 12:39:32,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3550.3, 300 sec: 3554.5). Total num frames: 5115904. Throughput: 0: 853.5. Samples: 1278410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:32,875][00205] Avg episode reward: [(0, '23.698')] +[2023-02-24 12:39:34,002][11215] Updated weights for policy 0, policy_version 1250 (0.0014) +[2023-02-24 12:39:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5132288. Throughput: 0: 864.5. Samples: 1282866. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:39:37,872][00205] Avg episode reward: [(0, '23.343')] +[2023-02-24 12:39:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5152768. Throughput: 0: 891.0. Samples: 1286074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:39:42,872][00205] Avg episode reward: [(0, '23.898')] +[2023-02-24 12:39:44,431][11215] Updated weights for policy 0, policy_version 1260 (0.0019) +[2023-02-24 12:39:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5169152. Throughput: 0: 905.9. Samples: 1292786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:39:47,877][00205] Avg episode reward: [(0, '26.005')] +[2023-02-24 12:39:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5185536. Throughput: 0: 860.8. Samples: 1296806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:39:52,873][00205] Avg episode reward: [(0, '25.523')] +[2023-02-24 12:39:57,597][11215] Updated weights for policy 0, policy_version 1270 (0.0011) +[2023-02-24 12:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5201920. Throughput: 0: 864.5. Samples: 1298948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:39:57,875][00205] Avg episode reward: [(0, '26.334')] +[2023-02-24 12:40:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5222400. Throughput: 0: 900.7. Samples: 1304796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:02,873][00205] Avg episode reward: [(0, '27.262')] +[2023-02-24 12:40:07,703][11215] Updated weights for policy 0, policy_version 1280 (0.0016) +[2023-02-24 12:40:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 5242880. Throughput: 0: 895.7. Samples: 1310962. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:40:07,879][00205] Avg episode reward: [(0, '27.918')] +[2023-02-24 12:40:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5255168. Throughput: 0: 867.7. Samples: 1312972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:12,873][00205] Avg episode reward: [(0, '27.645')] +[2023-02-24 12:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 5271552. Throughput: 0: 861.7. Samples: 1317186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:17,875][00205] Avg episode reward: [(0, '28.400')] +[2023-02-24 12:40:17,889][11201] Saving new best policy, reward=28.400! +[2023-02-24 12:40:20,215][11215] Updated weights for policy 0, policy_version 1290 (0.0030) +[2023-02-24 12:40:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5292032. Throughput: 0: 904.2. Samples: 1323556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:22,883][00205] Avg episode reward: [(0, '27.920')] +[2023-02-24 12:40:27,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5312512. Throughput: 0: 906.3. Samples: 1326860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:27,874][00205] Avg episode reward: [(0, '27.805')] +[2023-02-24 12:40:31,668][11215] Updated weights for policy 0, policy_version 1300 (0.0027) +[2023-02-24 12:40:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5324800. Throughput: 0: 858.2. Samples: 1331404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:32,876][00205] Avg episode reward: [(0, '27.152')] +[2023-02-24 12:40:37,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5345280. Throughput: 0: 874.9. Samples: 1336176. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:37,876][00205] Avg episode reward: [(0, '25.343')] +[2023-02-24 12:40:42,540][11215] Updated weights for policy 0, policy_version 1310 (0.0020) +[2023-02-24 12:40:42,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 5365760. Throughput: 0: 899.0. Samples: 1339404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:40:42,874][00205] Avg episode reward: [(0, '27.124')] +[2023-02-24 12:40:47,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5382144. Throughput: 0: 911.1. Samples: 1345798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:40:47,875][00205] Avg episode reward: [(0, '25.789')] +[2023-02-24 12:40:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth... +[2023-02-24 12:40:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth +[2023-02-24 12:40:52,872][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5398528. Throughput: 0: 862.0. Samples: 1349754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:52,879][00205] Avg episode reward: [(0, '24.927')] +[2023-02-24 12:40:55,663][11215] Updated weights for policy 0, policy_version 1320 (0.0021) +[2023-02-24 12:40:57,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5414912. Throughput: 0: 863.6. Samples: 1351836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:40:57,872][00205] Avg episode reward: [(0, '26.058')] +[2023-02-24 12:41:02,870][00205] Fps is (10 sec: 3687.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5435392. Throughput: 0: 916.1. Samples: 1358412. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:02,872][00205] Avg episode reward: [(0, '27.403')] +[2023-02-24 12:41:05,026][11215] Updated weights for policy 0, policy_version 1330 (0.0015) +[2023-02-24 12:41:07,871][00205] Fps is (10 sec: 4095.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5455872. Throughput: 0: 904.6. Samples: 1364266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:07,879][00205] Avg episode reward: [(0, '27.649')] +[2023-02-24 12:41:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5468160. Throughput: 0: 876.3. Samples: 1366294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:12,873][00205] Avg episode reward: [(0, '26.789')] +[2023-02-24 12:41:17,870][00205] Fps is (10 sec: 2867.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5484544. Throughput: 0: 877.1. Samples: 1370872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:41:17,873][00205] Avg episode reward: [(0, '27.554')] +[2023-02-24 12:41:17,904][11215] Updated weights for policy 0, policy_version 1340 (0.0013) +[2023-02-24 12:41:22,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 5509120. Throughput: 0: 915.6. Samples: 1377378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:22,875][00205] Avg episode reward: [(0, '28.234')] +[2023-02-24 12:41:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5525504. Throughput: 0: 917.0. Samples: 1380666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:27,875][00205] Avg episode reward: [(0, '28.472')] +[2023-02-24 12:41:27,890][11201] Saving new best policy, reward=28.472! +[2023-02-24 12:41:28,502][11215] Updated weights for policy 0, policy_version 1350 (0.0018) +[2023-02-24 12:41:32,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5537792. Throughput: 0: 863.7. 
Samples: 1384662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:32,874][00205] Avg episode reward: [(0, '27.413')] +[2023-02-24 12:41:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5558272. Throughput: 0: 887.5. Samples: 1389688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:41:37,873][00205] Avg episode reward: [(0, '27.655')] +[2023-02-24 12:41:40,251][11215] Updated weights for policy 0, policy_version 1360 (0.0032) +[2023-02-24 12:41:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 5578752. Throughput: 0: 913.2. Samples: 1392928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:41:42,876][00205] Avg episode reward: [(0, '28.925')] +[2023-02-24 12:41:42,879][11201] Saving new best policy, reward=28.925! +[2023-02-24 12:41:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5595136. Throughput: 0: 897.9. Samples: 1398818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:47,876][00205] Avg episode reward: [(0, '29.357')] +[2023-02-24 12:41:47,888][11201] Saving new best policy, reward=29.357! +[2023-02-24 12:41:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5607424. Throughput: 0: 856.0. Samples: 1402786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:41:52,874][00205] Avg episode reward: [(0, '29.483')] +[2023-02-24 12:41:52,906][11201] Saving new best policy, reward=29.483! +[2023-02-24 12:41:52,911][11215] Updated weights for policy 0, policy_version 1370 (0.0017) +[2023-02-24 12:41:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5627904. Throughput: 0: 857.4. Samples: 1404878. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:41:57,872][00205] Avg episode reward: [(0, '30.130')] +[2023-02-24 12:41:57,887][11201] Saving new best policy, reward=30.130! 
+[2023-02-24 12:42:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5648384. Throughput: 0: 898.9. Samples: 1411324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:02,880][00205] Avg episode reward: [(0, '29.209')] +[2023-02-24 12:42:03,348][11215] Updated weights for policy 0, policy_version 1380 (0.0022) +[2023-02-24 12:42:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5664768. Throughput: 0: 881.8. Samples: 1417056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:42:07,873][00205] Avg episode reward: [(0, '28.847')] +[2023-02-24 12:42:12,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5681152. Throughput: 0: 854.9. Samples: 1419140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:42:12,874][00205] Avg episode reward: [(0, '28.500')] +[2023-02-24 12:42:16,061][11215] Updated weights for policy 0, policy_version 1390 (0.0028) +[2023-02-24 12:42:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5701632. Throughput: 0: 878.1. Samples: 1424178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:42:17,878][00205] Avg episode reward: [(0, '28.080')] +[2023-02-24 12:42:22,870][00205] Fps is (10 sec: 4506.6, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 5726208. Throughput: 0: 925.9. Samples: 1431352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:22,873][00205] Avg episode reward: [(0, '27.170')] +[2023-02-24 12:42:24,609][11215] Updated weights for policy 0, policy_version 1400 (0.0023) +[2023-02-24 12:42:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5742592. Throughput: 0: 927.2. Samples: 1434654. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:27,876][00205] Avg episode reward: [(0, '26.678')] +[2023-02-24 12:42:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5758976. Throughput: 0: 893.1. Samples: 1439008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:42:32,872][00205] Avg episode reward: [(0, '27.204')] +[2023-02-24 12:42:36,876][11215] Updated weights for policy 0, policy_version 1410 (0.0020) +[2023-02-24 12:42:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5779456. Throughput: 0: 932.0. Samples: 1444728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:37,872][00205] Avg episode reward: [(0, '26.727')] +[2023-02-24 12:42:42,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5804032. Throughput: 0: 965.6. Samples: 1448332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:42,872][00205] Avg episode reward: [(0, '25.694')] +[2023-02-24 12:42:45,928][11215] Updated weights for policy 0, policy_version 1420 (0.0024) +[2023-02-24 12:42:47,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3596.1). Total num frames: 5820416. Throughput: 0: 965.8. Samples: 1454792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:42:47,878][00205] Avg episode reward: [(0, '25.460')] +[2023-02-24 12:42:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth... +[2023-02-24 12:42:48,041][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth +[2023-02-24 12:42:52,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5832704. Throughput: 0: 926.7. Samples: 1458756. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:42:52,875][00205] Avg episode reward: [(0, '25.142')] +[2023-02-24 12:42:57,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5849088. Throughput: 0: 930.5. Samples: 1461010. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:42:57,875][00205] Avg episode reward: [(0, '24.673')] +[2023-02-24 12:42:58,934][11215] Updated weights for policy 0, policy_version 1430 (0.0017) +[2023-02-24 12:43:02,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3582.3). Total num frames: 5873664. Throughput: 0: 962.7. Samples: 1467498. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:02,879][00205] Avg episode reward: [(0, '25.489')] +[2023-02-24 12:43:07,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3754.6, 300 sec: 3582.2). Total num frames: 5890048. Throughput: 0: 927.3. Samples: 1473084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:07,878][00205] Avg episode reward: [(0, '24.572')] +[2023-02-24 12:43:10,110][11215] Updated weights for policy 0, policy_version 1440 (0.0022) +[2023-02-24 12:43:12,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3686.5, 300 sec: 3582.3). Total num frames: 5902336. Throughput: 0: 899.6. Samples: 1475136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:12,875][00205] Avg episode reward: [(0, '24.618')] +[2023-02-24 12:43:17,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5922816. Throughput: 0: 910.5. Samples: 1479980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:43:17,877][00205] Avg episode reward: [(0, '23.787')] +[2023-02-24 12:43:21,151][11215] Updated weights for policy 0, policy_version 1450 (0.0021) +[2023-02-24 12:43:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5943296. Throughput: 0: 928.1. Samples: 1486492. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:43:22,873][00205] Avg episode reward: [(0, '25.619')] +[2023-02-24 12:43:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.4). Total num frames: 5959680. Throughput: 0: 914.4. Samples: 1489480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:27,876][00205] Avg episode reward: [(0, '25.045')] +[2023-02-24 12:43:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5976064. Throughput: 0: 861.5. Samples: 1493554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:43:32,874][00205] Avg episode reward: [(0, '25.332')] +[2023-02-24 12:43:34,228][11215] Updated weights for policy 0, policy_version 1460 (0.0027) +[2023-02-24 12:43:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5996544. Throughput: 0: 893.2. Samples: 1498950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:37,872][00205] Avg episode reward: [(0, '25.772')] +[2023-02-24 12:43:42,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6017024. Throughput: 0: 914.1. Samples: 1502144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:43:42,878][00205] Avg episode reward: [(0, '25.907')] +[2023-02-24 12:43:43,586][11215] Updated weights for policy 0, policy_version 1470 (0.0016) +[2023-02-24 12:43:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3596.1). Total num frames: 6033408. Throughput: 0: 896.5. Samples: 1507838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:47,877][00205] Avg episode reward: [(0, '26.580')] +[2023-02-24 12:43:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6045696. Throughput: 0: 862.3. Samples: 1511886. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:52,879][00205] Avg episode reward: [(0, '25.424')] +[2023-02-24 12:43:56,726][11215] Updated weights for policy 0, policy_version 1480 (0.0012) +[2023-02-24 12:43:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 6066176. Throughput: 0: 871.0. Samples: 1514332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:43:57,878][00205] Avg episode reward: [(0, '24.721')] +[2023-02-24 12:44:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6086656. Throughput: 0: 907.5. Samples: 1520818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:44:02,877][00205] Avg episode reward: [(0, '24.597')] +[2023-02-24 12:44:07,384][11215] Updated weights for policy 0, policy_version 1490 (0.0016) +[2023-02-24 12:44:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6103040. Throughput: 0: 881.4. Samples: 1526156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:07,873][00205] Avg episode reward: [(0, '25.599')] +[2023-02-24 12:44:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6115328. Throughput: 0: 860.8. Samples: 1528214. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:44:12,875][00205] Avg episode reward: [(0, '26.135')] +[2023-02-24 12:44:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6135808. Throughput: 0: 886.1. Samples: 1533430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:17,875][00205] Avg episode reward: [(0, '26.670')] +[2023-02-24 12:44:18,915][11215] Updated weights for policy 0, policy_version 1500 (0.0016) +[2023-02-24 12:44:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 6160384. Throughput: 0: 911.6. Samples: 1539972. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:22,873][00205] Avg episode reward: [(0, '27.034')] +[2023-02-24 12:44:27,872][00205] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 6172672. Throughput: 0: 896.8. Samples: 1542502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:27,875][00205] Avg episode reward: [(0, '28.232')] +[2023-02-24 12:44:31,281][11215] Updated weights for policy 0, policy_version 1510 (0.0015) +[2023-02-24 12:44:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6189056. Throughput: 0: 862.0. Samples: 1546626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:32,877][00205] Avg episode reward: [(0, '28.717')] +[2023-02-24 12:44:37,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6209536. Throughput: 0: 899.2. Samples: 1552352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:37,875][00205] Avg episode reward: [(0, '29.136')] +[2023-02-24 12:44:41,437][11215] Updated weights for policy 0, policy_version 1520 (0.0015) +[2023-02-24 12:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6230016. Throughput: 0: 917.2. Samples: 1555608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:42,876][00205] Avg episode reward: [(0, '27.976')] +[2023-02-24 12:44:47,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3596.1). Total num frames: 6246400. Throughput: 0: 893.7. Samples: 1561038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:44:47,877][00205] Avg episode reward: [(0, '27.555')] +[2023-02-24 12:44:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth... 
+[2023-02-24 12:44:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth +[2023-02-24 12:44:52,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6258688. Throughput: 0: 862.0. Samples: 1564946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:44:52,876][00205] Avg episode reward: [(0, '26.975')] +[2023-02-24 12:44:54,636][11215] Updated weights for policy 0, policy_version 1530 (0.0017) +[2023-02-24 12:44:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6279168. Throughput: 0: 877.6. Samples: 1567708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:44:57,873][00205] Avg episode reward: [(0, '26.468')] +[2023-02-24 12:45:02,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6299648. Throughput: 0: 902.7. Samples: 1574052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:45:02,878][00205] Avg episode reward: [(0, '25.282')] +[2023-02-24 12:45:05,088][11215] Updated weights for policy 0, policy_version 1540 (0.0013) +[2023-02-24 12:45:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6311936. Throughput: 0: 866.2. Samples: 1578952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:45:07,874][00205] Avg episode reward: [(0, '25.806')] +[2023-02-24 12:45:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6328320. Throughput: 0: 855.0. Samples: 1580976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:45:12,875][00205] Avg episode reward: [(0, '26.113')] +[2023-02-24 12:45:17,190][11215] Updated weights for policy 0, policy_version 1550 (0.0016) +[2023-02-24 12:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6348800. Throughput: 0: 888.3. Samples: 1586598. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:45:17,872][00205] Avg episode reward: [(0, '27.188')] +[2023-02-24 12:45:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6369280. Throughput: 0: 903.1. Samples: 1592990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:45:22,875][00205] Avg episode reward: [(0, '27.562')] +[2023-02-24 12:45:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 6385664. Throughput: 0: 881.2. Samples: 1595262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:27,875][00205] Avg episode reward: [(0, '27.304')] +[2023-02-24 12:45:29,203][11215] Updated weights for policy 0, policy_version 1560 (0.0017) +[2023-02-24 12:45:32,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6397952. Throughput: 0: 851.9. Samples: 1599370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:45:32,879][00205] Avg episode reward: [(0, '27.522')] +[2023-02-24 12:45:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6422528. Throughput: 0: 898.2. Samples: 1605366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:37,872][00205] Avg episode reward: [(0, '27.148')] +[2023-02-24 12:45:39,717][11215] Updated weights for policy 0, policy_version 1570 (0.0020) +[2023-02-24 12:45:42,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3549.8, 300 sec: 3596.2). Total num frames: 6443008. Throughput: 0: 909.9. Samples: 1608652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:42,873][00205] Avg episode reward: [(0, '26.756')] +[2023-02-24 12:45:47,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.7, 300 sec: 3582.3). Total num frames: 6455296. Throughput: 0: 879.4. Samples: 1613624. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:45:47,874][00205] Avg episode reward: [(0, '24.953')] +[2023-02-24 12:45:52,870][00205] Fps is (10 sec: 2457.7, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6467584. Throughput: 0: 860.0. Samples: 1617654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:45:52,872][00205] Avg episode reward: [(0, '24.925')] +[2023-02-24 12:45:53,081][11215] Updated weights for policy 0, policy_version 1580 (0.0029) +[2023-02-24 12:45:57,870][00205] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6492160. Throughput: 0: 887.6. Samples: 1620918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:45:57,875][00205] Avg episode reward: [(0, '25.196')] +[2023-02-24 12:46:02,276][11215] Updated weights for policy 0, policy_version 1590 (0.0024) +[2023-02-24 12:46:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6512640. Throughput: 0: 907.6. Samples: 1627438. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:02,873][00205] Avg episode reward: [(0, '24.389')] +[2023-02-24 12:46:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6524928. Throughput: 0: 865.2. Samples: 1631924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:07,873][00205] Avg episode reward: [(0, '25.683')] +[2023-02-24 12:46:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6541312. Throughput: 0: 859.7. Samples: 1633948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:46:12,874][00205] Avg episode reward: [(0, '24.758')] +[2023-02-24 12:46:15,361][11215] Updated weights for policy 0, policy_version 1600 (0.0012) +[2023-02-24 12:46:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6561792. Throughput: 0: 900.7. Samples: 1639900. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:17,873][00205] Avg episode reward: [(0, '26.283')] +[2023-02-24 12:46:22,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 6582272. Throughput: 0: 908.8. Samples: 1646262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:22,874][00205] Avg episode reward: [(0, '25.931')] +[2023-02-24 12:46:26,337][11215] Updated weights for policy 0, policy_version 1610 (0.0015) +[2023-02-24 12:46:27,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6594560. Throughput: 0: 880.8. Samples: 1648286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:46:27,876][00205] Avg episode reward: [(0, '25.591')] +[2023-02-24 12:46:32,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6610944. Throughput: 0: 860.2. Samples: 1652332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:32,880][00205] Avg episode reward: [(0, '25.339')] +[2023-02-24 12:46:37,665][11215] Updated weights for policy 0, policy_version 1620 (0.0016) +[2023-02-24 12:46:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6635520. Throughput: 0: 914.7. Samples: 1658814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:46:37,873][00205] Avg episode reward: [(0, '25.575')] +[2023-02-24 12:46:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6651904. Throughput: 0: 915.1. Samples: 1662098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:46:42,873][00205] Avg episode reward: [(0, '25.991')] +[2023-02-24 12:46:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6668288. Throughput: 0: 871.0. Samples: 1666634. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:47,877][00205] Avg episode reward: [(0, '26.007')] +[2023-02-24 12:46:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth... +[2023-02-24 12:46:48,098][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth +[2023-02-24 12:46:50,894][11215] Updated weights for policy 0, policy_version 1630 (0.0025) +[2023-02-24 12:46:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6680576. Throughput: 0: 870.9. Samples: 1671114. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:46:52,872][00205] Avg episode reward: [(0, '25.376')] +[2023-02-24 12:46:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6705152. Throughput: 0: 898.4. Samples: 1674378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:46:57,875][00205] Avg episode reward: [(0, '24.305')] +[2023-02-24 12:47:00,430][11215] Updated weights for policy 0, policy_version 1640 (0.0031) +[2023-02-24 12:47:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6721536. Throughput: 0: 909.5. Samples: 1680826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:02,873][00205] Avg episode reward: [(0, '24.712')] +[2023-02-24 12:47:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 6737920. Throughput: 0: 861.4. Samples: 1685024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:07,881][00205] Avg episode reward: [(0, '23.768')] +[2023-02-24 12:47:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 6754304. Throughput: 0: 861.9. Samples: 1687070. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:12,873][00205] Avg episode reward: [(0, '25.100')] +[2023-02-24 12:47:13,240][11215] Updated weights for policy 0, policy_version 1650 (0.0024) +[2023-02-24 12:47:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6774784. Throughput: 0: 910.3. Samples: 1693294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:47:17,881][00205] Avg episode reward: [(0, '24.110')] +[2023-02-24 12:47:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 6795264. Throughput: 0: 903.2. Samples: 1699456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:47:22,872][00205] Avg episode reward: [(0, '24.172')] +[2023-02-24 12:47:23,684][11215] Updated weights for policy 0, policy_version 1660 (0.0018) +[2023-02-24 12:47:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6807552. Throughput: 0: 876.6. Samples: 1701544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:47:27,874][00205] Avg episode reward: [(0, '24.170')] +[2023-02-24 12:47:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6828032. Throughput: 0: 869.7. Samples: 1705772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:47:32,872][00205] Avg episode reward: [(0, '24.176')] +[2023-02-24 12:47:35,573][11215] Updated weights for policy 0, policy_version 1670 (0.0019) +[2023-02-24 12:47:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6848512. Throughput: 0: 919.2. Samples: 1712476. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:47:37,873][00205] Avg episode reward: [(0, '23.699')] +[2023-02-24 12:47:42,883][00205] Fps is (10 sec: 4090.5, 60 sec: 3617.3, 300 sec: 3554.4). Total num frames: 6868992. Throughput: 0: 918.7. Samples: 1715732. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:42,897][00205] Avg episode reward: [(0, '24.180')] +[2023-02-24 12:47:47,330][11215] Updated weights for policy 0, policy_version 1680 (0.0012) +[2023-02-24 12:47:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6881280. Throughput: 0: 870.4. Samples: 1719994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:47,874][00205] Avg episode reward: [(0, '25.763')] +[2023-02-24 12:47:52,870][00205] Fps is (10 sec: 2871.1, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6897664. Throughput: 0: 886.6. Samples: 1724922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:47:52,878][00205] Avg episode reward: [(0, '25.965')] +[2023-02-24 12:47:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6918144. Throughput: 0: 913.5. Samples: 1728176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:47:57,873][00205] Avg episode reward: [(0, '25.937')] +[2023-02-24 12:47:57,956][11215] Updated weights for policy 0, policy_version 1690 (0.0014) +[2023-02-24 12:48:02,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 6938624. Throughput: 0: 909.1. Samples: 1734204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:02,880][00205] Avg episode reward: [(0, '24.937')] +[2023-02-24 12:48:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6950912. Throughput: 0: 865.3. Samples: 1738396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:07,875][00205] Avg episode reward: [(0, '25.501')] +[2023-02-24 12:48:10,916][11215] Updated weights for policy 0, policy_version 1700 (0.0011) +[2023-02-24 12:48:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6971392. Throughput: 0: 868.0. Samples: 1740602. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:12,876][00205] Avg episode reward: [(0, '25.679')] +[2023-02-24 12:48:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6991872. Throughput: 0: 921.3. Samples: 1747232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:48:17,873][00205] Avg episode reward: [(0, '24.206')] +[2023-02-24 12:48:20,241][11215] Updated weights for policy 0, policy_version 1710 (0.0011) +[2023-02-24 12:48:22,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7008256. Throughput: 0: 896.1. Samples: 1752804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:48:22,881][00205] Avg episode reward: [(0, '23.939')] +[2023-02-24 12:48:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7024640. Throughput: 0: 870.0. Samples: 1754872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:48:27,878][00205] Avg episode reward: [(0, '25.550')] +[2023-02-24 12:48:32,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7041024. Throughput: 0: 881.2. Samples: 1759646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:48:32,873][00205] Avg episode reward: [(0, '26.768')] +[2023-02-24 12:48:33,291][11215] Updated weights for policy 0, policy_version 1720 (0.0019) +[2023-02-24 12:48:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7061504. Throughput: 0: 917.9. Samples: 1766228. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:48:37,876][00205] Avg episode reward: [(0, '26.012')] +[2023-02-24 12:48:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3482.4, 300 sec: 3540.6). Total num frames: 7077888. Throughput: 0: 911.4. Samples: 1769188. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:48:42,873][00205] Avg episode reward: [(0, '25.593')] +[2023-02-24 12:48:44,671][11215] Updated weights for policy 0, policy_version 1730 (0.0015) +[2023-02-24 12:48:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7094272. Throughput: 0: 868.4. Samples: 1773282. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:48:47,880][00205] Avg episode reward: [(0, '25.264')] +[2023-02-24 12:48:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth... +[2023-02-24 12:48:48,073][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth +[2023-02-24 12:48:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7110656. Throughput: 0: 892.0. Samples: 1778538. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:48:52,878][00205] Avg episode reward: [(0, '24.936')] +[2023-02-24 12:48:55,723][11215] Updated weights for policy 0, policy_version 1740 (0.0015) +[2023-02-24 12:48:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7135232. Throughput: 0: 915.3. Samples: 1781790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:48:57,873][00205] Avg episode reward: [(0, '24.912')] +[2023-02-24 12:49:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7151616. Throughput: 0: 892.8. Samples: 1787408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:02,877][00205] Avg episode reward: [(0, '25.786')] +[2023-02-24 12:49:07,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7163904. Throughput: 0: 859.7. Samples: 1791492. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:49:07,879][00205] Avg episode reward: [(0, '25.678')] +[2023-02-24 12:49:08,364][11215] Updated weights for policy 0, policy_version 1750 (0.0021) +[2023-02-24 12:49:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7184384. Throughput: 0: 873.7. Samples: 1794190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 12:49:12,873][00205] Avg episode reward: [(0, '26.490')] +[2023-02-24 12:49:17,872][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 7204864. Throughput: 0: 912.6. Samples: 1800716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:17,875][00205] Avg episode reward: [(0, '26.776')] +[2023-02-24 12:49:18,189][11215] Updated weights for policy 0, policy_version 1760 (0.0019) +[2023-02-24 12:49:22,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7221248. Throughput: 0: 881.0. Samples: 1805876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:22,874][00205] Avg episode reward: [(0, '25.964')] +[2023-02-24 12:49:27,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7233536. Throughput: 0: 860.6. Samples: 1807916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:27,874][00205] Avg episode reward: [(0, '26.699')] +[2023-02-24 12:49:31,253][11215] Updated weights for policy 0, policy_version 1770 (0.0017) +[2023-02-24 12:49:32,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7254016. Throughput: 0: 882.6. Samples: 1812998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:32,872][00205] Avg episode reward: [(0, '24.613')] +[2023-02-24 12:49:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7278592. Throughput: 0: 910.6. Samples: 1819514. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:49:37,872][00205] Avg episode reward: [(0, '23.298')] +[2023-02-24 12:49:41,760][11215] Updated weights for policy 0, policy_version 1780 (0.0011) +[2023-02-24 12:49:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7290880. Throughput: 0: 897.6. Samples: 1822180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:42,877][00205] Avg episode reward: [(0, '24.331')] +[2023-02-24 12:49:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7307264. Throughput: 0: 863.6. Samples: 1826270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:49:47,880][00205] Avg episode reward: [(0, '23.984')] +[2023-02-24 12:49:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7327744. Throughput: 0: 899.7. Samples: 1831978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:49:52,878][00205] Avg episode reward: [(0, '24.064')] +[2023-02-24 12:49:53,624][11215] Updated weights for policy 0, policy_version 1790 (0.0031) +[2023-02-24 12:49:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7348224. Throughput: 0: 913.0. Samples: 1835276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:49:57,877][00205] Avg episode reward: [(0, '24.798')] +[2023-02-24 12:50:02,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 7364608. Throughput: 0: 883.9. Samples: 1840488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:50:02,874][00205] Avg episode reward: [(0, '25.309')] +[2023-02-24 12:50:05,692][11215] Updated weights for policy 0, policy_version 1800 (0.0016) +[2023-02-24 12:50:07,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7376896. Throughput: 0: 862.8. Samples: 1844704. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:50:07,879][00205] Avg episode reward: [(0, '26.188')] +[2023-02-24 12:50:12,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7397376. Throughput: 0: 881.2. Samples: 1847570. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:50:12,874][00205] Avg episode reward: [(0, '25.033')] +[2023-02-24 12:50:15,993][11215] Updated weights for policy 0, policy_version 1810 (0.0020) +[2023-02-24 12:50:17,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7417856. Throughput: 0: 914.2. Samples: 1854138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:17,872][00205] Avg episode reward: [(0, '25.616')] +[2023-02-24 12:50:22,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7434240. Throughput: 0: 882.8. Samples: 1859244. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 12:50:22,875][00205] Avg episode reward: [(0, '26.669')] +[2023-02-24 12:50:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7446528. Throughput: 0: 869.6. Samples: 1861312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:27,873][00205] Avg episode reward: [(0, '25.640')] +[2023-02-24 12:50:29,015][11215] Updated weights for policy 0, policy_version 1820 (0.0029) +[2023-02-24 12:50:32,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7471104. Throughput: 0: 896.4. Samples: 1866610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:32,873][00205] Avg episode reward: [(0, '26.284')] +[2023-02-24 12:50:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7491584. Throughput: 0: 918.1. Samples: 1873294. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:50:37,872][00205] Avg episode reward: [(0, '25.689')] +[2023-02-24 12:50:38,382][11215] Updated weights for policy 0, policy_version 1830 (0.0020) +[2023-02-24 12:50:42,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7507968. Throughput: 0: 899.0. Samples: 1875732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:42,877][00205] Avg episode reward: [(0, '26.624')] +[2023-02-24 12:50:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7520256. Throughput: 0: 875.2. Samples: 1879870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:50:47,879][00205] Avg episode reward: [(0, '26.732')] +[2023-02-24 12:50:47,895][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth... +[2023-02-24 12:50:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth +[2023-02-24 12:50:51,327][11215] Updated weights for policy 0, policy_version 1840 (0.0023) +[2023-02-24 12:50:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7540736. Throughput: 0: 908.0. Samples: 1885564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:50:52,873][00205] Avg episode reward: [(0, '26.612')] +[2023-02-24 12:50:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7561216. Throughput: 0: 915.1. Samples: 1888750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:50:57,877][00205] Avg episode reward: [(0, '27.062')] +[2023-02-24 12:51:02,874][11215] Updated weights for policy 0, policy_version 1850 (0.0012) +[2023-02-24 12:51:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7577600. Throughput: 0: 880.1. Samples: 1893744. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:02,880][00205] Avg episode reward: [(0, '26.215')] +[2023-02-24 12:51:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7589888. Throughput: 0: 856.6. Samples: 1897788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:51:07,872][00205] Avg episode reward: [(0, '26.323')] +[2023-02-24 12:51:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7610368. Throughput: 0: 876.8. Samples: 1900768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:12,873][00205] Avg episode reward: [(0, '27.406')] +[2023-02-24 12:51:14,417][11215] Updated weights for policy 0, policy_version 1860 (0.0014) +[2023-02-24 12:51:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7630848. Throughput: 0: 899.7. Samples: 1907096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:17,872][00205] Avg episode reward: [(0, '28.808')] +[2023-02-24 12:51:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3554.5). Total num frames: 7643136. Throughput: 0: 849.4. Samples: 1911516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:22,877][00205] Avg episode reward: [(0, '29.578')] +[2023-02-24 12:51:27,793][11215] Updated weights for policy 0, policy_version 1870 (0.0016) +[2023-02-24 12:51:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7659520. Throughput: 0: 838.7. Samples: 1913472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:51:27,880][00205] Avg episode reward: [(0, '28.223')] +[2023-02-24 12:51:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7680000. Throughput: 0: 871.3. Samples: 1919078. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:51:32,873][00205] Avg episode reward: [(0, '27.765')] +[2023-02-24 12:51:37,190][11215] Updated weights for policy 0, policy_version 1880 (0.0018) +[2023-02-24 12:51:37,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3481.4, 300 sec: 3554.5). Total num frames: 7700480. Throughput: 0: 891.2. Samples: 1925672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:51:37,876][00205] Avg episode reward: [(0, '27.676')] +[2023-02-24 12:51:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 7712768. Throughput: 0: 866.8. Samples: 1927756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:51:42,877][00205] Avg episode reward: [(0, '26.898')] +[2023-02-24 12:51:47,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 7729152. Throughput: 0: 847.1. Samples: 1931862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:47,881][00205] Avg episode reward: [(0, '26.736')] +[2023-02-24 12:51:50,187][11215] Updated weights for policy 0, policy_version 1890 (0.0014) +[2023-02-24 12:51:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7749632. Throughput: 0: 897.0. Samples: 1938152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:51:52,875][00205] Avg episode reward: [(0, '25.938')] +[2023-02-24 12:51:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7770112. Throughput: 0: 902.8. Samples: 1941396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:51:57,874][00205] Avg episode reward: [(0, '26.483')] +[2023-02-24 12:52:00,893][11215] Updated weights for policy 0, policy_version 1900 (0.0020) +[2023-02-24 12:52:02,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7786496. Throughput: 0: 869.7. Samples: 1946236. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:52:02,880][00205] Avg episode reward: [(0, '26.473')] +[2023-02-24 12:52:07,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7802880. Throughput: 0: 868.6. Samples: 1950604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:07,879][00205] Avg episode reward: [(0, '26.178')] +[2023-02-24 12:52:12,327][11215] Updated weights for policy 0, policy_version 1910 (0.0013) +[2023-02-24 12:52:12,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7823360. Throughput: 0: 898.8. Samples: 1953916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:12,880][00205] Avg episode reward: [(0, '25.571')] +[2023-02-24 12:52:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7843840. Throughput: 0: 920.7. Samples: 1960510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:17,879][00205] Avg episode reward: [(0, '25.162')] +[2023-02-24 12:52:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7856128. Throughput: 0: 871.2. Samples: 1964872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:22,872][00205] Avg episode reward: [(0, '24.254')] +[2023-02-24 12:52:24,711][11215] Updated weights for policy 0, policy_version 1920 (0.0018) +[2023-02-24 12:52:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7872512. Throughput: 0: 869.5. Samples: 1966884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:27,875][00205] Avg episode reward: [(0, '26.902')] +[2023-02-24 12:52:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 7897088. Throughput: 0: 913.4. Samples: 1972966. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:32,878][00205] Avg episode reward: [(0, '27.304')] +[2023-02-24 12:52:34,676][11215] Updated weights for policy 0, policy_version 1930 (0.0012) +[2023-02-24 12:52:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.1, 300 sec: 3540.8). Total num frames: 7913472. Throughput: 0: 914.4. Samples: 1979300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:37,876][00205] Avg episode reward: [(0, '27.444')] +[2023-02-24 12:52:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7929856. Throughput: 0: 888.5. Samples: 1981376. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:52:42,872][00205] Avg episode reward: [(0, '27.197')] +[2023-02-24 12:52:47,644][11215] Updated weights for policy 0, policy_version 1940 (0.0018) +[2023-02-24 12:52:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7946240. Throughput: 0: 874.3. Samples: 1985578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:52:47,876][00205] Avg episode reward: [(0, '28.879')] +[2023-02-24 12:52:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth... +[2023-02-24 12:52:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth +[2023-02-24 12:52:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7966720. Throughput: 0: 919.1. Samples: 1991962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:52:52,873][00205] Avg episode reward: [(0, '29.815')] +[2023-02-24 12:52:57,558][11215] Updated weights for policy 0, policy_version 1950 (0.0023) +[2023-02-24 12:52:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 7987200. Throughput: 0: 918.8. Samples: 1995260. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:52:57,872][00205] Avg episode reward: [(0, '31.595')] +[2023-02-24 12:52:57,882][11201] Saving new best policy, reward=31.595! +[2023-02-24 12:53:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7999488. Throughput: 0: 871.2. Samples: 1999716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:02,878][00205] Avg episode reward: [(0, '30.868')] +[2023-02-24 12:53:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8015872. Throughput: 0: 877.0. Samples: 2004338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:07,876][00205] Avg episode reward: [(0, '31.491')] +[2023-02-24 12:53:10,102][11215] Updated weights for policy 0, policy_version 1960 (0.0012) +[2023-02-24 12:53:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8036352. Throughput: 0: 904.5. Samples: 2007588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:12,873][00205] Avg episode reward: [(0, '31.804')] +[2023-02-24 12:53:12,898][11201] Saving new best policy, reward=31.804! +[2023-02-24 12:53:17,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 8056832. Throughput: 0: 915.1. Samples: 2014144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:17,875][00205] Avg episode reward: [(0, '31.345')] +[2023-02-24 12:53:21,407][11215] Updated weights for policy 0, policy_version 1970 (0.0035) +[2023-02-24 12:53:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8073216. Throughput: 0: 867.2. Samples: 2018322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:22,874][00205] Avg episode reward: [(0, '31.241')] +[2023-02-24 12:53:27,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8089600. Throughput: 0: 868.6. Samples: 2020464. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:53:27,874][00205] Avg episode reward: [(0, '29.691')] +[2023-02-24 12:53:32,595][11215] Updated weights for policy 0, policy_version 1980 (0.0012) +[2023-02-24 12:53:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8110080. Throughput: 0: 912.1. Samples: 2026624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:53:32,872][00205] Avg episode reward: [(0, '27.546')] +[2023-02-24 12:53:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8130560. Throughput: 0: 903.2. Samples: 2032604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:37,876][00205] Avg episode reward: [(0, '26.472')] +[2023-02-24 12:53:42,871][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8142848. Throughput: 0: 875.6. Samples: 2034662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:42,874][00205] Avg episode reward: [(0, '25.526')] +[2023-02-24 12:53:45,377][11215] Updated weights for policy 0, policy_version 1990 (0.0025) +[2023-02-24 12:53:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8159232. Throughput: 0: 871.5. Samples: 2038934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:47,877][00205] Avg episode reward: [(0, '25.844')] +[2023-02-24 12:53:52,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 8179712. Throughput: 0: 911.0. Samples: 2045334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:53:52,875][00205] Avg episode reward: [(0, '25.736')] +[2023-02-24 12:53:55,101][11215] Updated weights for policy 0, policy_version 2000 (0.0011) +[2023-02-24 12:53:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 8200192. Throughput: 0: 910.7. Samples: 2048570. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:53:57,876][00205] Avg episode reward: [(0, '25.999')] +[2023-02-24 12:54:02,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8212480. Throughput: 0: 863.1. Samples: 2052980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:54:02,875][00205] Avg episode reward: [(0, '26.480')] +[2023-02-24 12:54:07,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8228864. Throughput: 0: 879.3. Samples: 2057890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:54:07,876][00205] Avg episode reward: [(0, '27.098')] +[2023-02-24 12:54:08,056][11215] Updated weights for policy 0, policy_version 2010 (0.0030) +[2023-02-24 12:54:12,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8253440. Throughput: 0: 903.7. Samples: 2061132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:12,873][00205] Avg episode reward: [(0, '28.201')] +[2023-02-24 12:54:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8269824. Throughput: 0: 908.6. Samples: 2067510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:17,872][00205] Avg episode reward: [(0, '26.869')] +[2023-02-24 12:54:18,176][11215] Updated weights for policy 0, policy_version 2020 (0.0015) +[2023-02-24 12:54:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8286208. Throughput: 0: 868.8. Samples: 2071702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:22,873][00205] Avg episode reward: [(0, '26.320')] +[2023-02-24 12:54:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8302592. Throughput: 0: 870.4. Samples: 2073830. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:27,872][00205] Avg episode reward: [(0, '26.621')] +[2023-02-24 12:54:30,189][11215] Updated weights for policy 0, policy_version 2030 (0.0017) +[2023-02-24 12:54:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8323072. Throughput: 0: 920.5. Samples: 2080358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:32,875][00205] Avg episode reward: [(0, '26.115')] +[2023-02-24 12:54:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8343552. Throughput: 0: 907.8. Samples: 2086182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:37,872][00205] Avg episode reward: [(0, '27.023')] +[2023-02-24 12:54:41,947][11215] Updated weights for policy 0, policy_version 2040 (0.0019) +[2023-02-24 12:54:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8355840. Throughput: 0: 883.0. Samples: 2088302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:54:42,879][00205] Avg episode reward: [(0, '27.933')] +[2023-02-24 12:54:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8376320. Throughput: 0: 888.0. Samples: 2092938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:47,873][00205] Avg episode reward: [(0, '28.366')] +[2023-02-24 12:54:47,883][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth... +[2023-02-24 12:54:48,006][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth +[2023-02-24 12:54:52,537][11215] Updated weights for policy 0, policy_version 2050 (0.0013) +[2023-02-24 12:54:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 8396800. Throughput: 0: 921.9. Samples: 2099376. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:54:52,879][00205] Avg episode reward: [(0, '28.745')] +[2023-02-24 12:54:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8413184. Throughput: 0: 919.3. Samples: 2102500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:54:57,875][00205] Avg episode reward: [(0, '29.440')] +[2023-02-24 12:55:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8425472. Throughput: 0: 865.4. Samples: 2106454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:02,878][00205] Avg episode reward: [(0, '28.903')] +[2023-02-24 12:55:05,727][11215] Updated weights for policy 0, policy_version 2060 (0.0016) +[2023-02-24 12:55:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8445952. Throughput: 0: 887.2. Samples: 2111624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:55:07,873][00205] Avg episode reward: [(0, '28.550')] +[2023-02-24 12:55:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8466432. Throughput: 0: 911.7. Samples: 2114858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:12,879][00205] Avg episode reward: [(0, '28.947')] +[2023-02-24 12:55:15,273][11215] Updated weights for policy 0, policy_version 2070 (0.0011) +[2023-02-24 12:55:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8482816. Throughput: 0: 897.2. Samples: 2120732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:55:17,878][00205] Avg episode reward: [(0, '27.978')] +[2023-02-24 12:55:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8499200. Throughput: 0: 859.9. Samples: 2124878. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 12:55:22,873][00205] Avg episode reward: [(0, '27.826')] +[2023-02-24 12:55:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8515584. Throughput: 0: 868.5. Samples: 2127384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:27,875][00205] Avg episode reward: [(0, '27.059')] +[2023-02-24 12:55:28,038][11215] Updated weights for policy 0, policy_version 2080 (0.0035) +[2023-02-24 12:55:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8540160. Throughput: 0: 912.2. Samples: 2133986. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:32,872][00205] Avg episode reward: [(0, '29.171')] +[2023-02-24 12:55:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8556544. Throughput: 0: 889.4. Samples: 2139398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:55:37,877][00205] Avg episode reward: [(0, '29.604')] +[2023-02-24 12:55:38,962][11215] Updated weights for policy 0, policy_version 2090 (0.0012) +[2023-02-24 12:55:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8568832. Throughput: 0: 865.1. Samples: 2141430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:42,875][00205] Avg episode reward: [(0, '30.581')] +[2023-02-24 12:55:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8589312. Throughput: 0: 890.4. Samples: 2146524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:55:47,875][00205] Avg episode reward: [(0, '29.880')] +[2023-02-24 12:55:50,256][11215] Updated weights for policy 0, policy_version 2100 (0.0031) +[2023-02-24 12:55:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8609792. Throughput: 0: 920.1. Samples: 2153030. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:55:52,872][00205] Avg episode reward: [(0, '29.177')] +[2023-02-24 12:55:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8626176. Throughput: 0: 911.2. Samples: 2155860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:55:57,877][00205] Avg episode reward: [(0, '30.053')] +[2023-02-24 12:56:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8638464. Throughput: 0: 871.9. Samples: 2159966. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:02,878][00205] Avg episode reward: [(0, '28.558')] +[2023-02-24 12:56:03,106][11215] Updated weights for policy 0, policy_version 2110 (0.0042) +[2023-02-24 12:56:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8658944. Throughput: 0: 901.4. Samples: 2165442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:56:07,872][00205] Avg episode reward: [(0, '27.659')] +[2023-02-24 12:56:12,814][11215] Updated weights for policy 0, policy_version 2120 (0.0012) +[2023-02-24 12:56:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8683520. Throughput: 0: 915.2. Samples: 2168566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:12,872][00205] Avg episode reward: [(0, '28.102')] +[2023-02-24 12:56:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8695808. Throughput: 0: 895.8. Samples: 2174298. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:56:17,875][00205] Avg episode reward: [(0, '27.283')] +[2023-02-24 12:56:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8712192. Throughput: 0: 867.5. Samples: 2178436. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:56:22,877][00205] Avg episode reward: [(0, '27.355')] +[2023-02-24 12:56:25,659][11215] Updated weights for policy 0, policy_version 2130 (0.0022) +[2023-02-24 12:56:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8732672. Throughput: 0: 882.9. Samples: 2181162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:27,873][00205] Avg episode reward: [(0, '29.071')] +[2023-02-24 12:56:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8753152. Throughput: 0: 916.0. Samples: 2187742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:32,878][00205] Avg episode reward: [(0, '28.206')] +[2023-02-24 12:56:35,983][11215] Updated weights for policy 0, policy_version 2140 (0.0020) +[2023-02-24 12:56:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 8769536. Throughput: 0: 885.6. Samples: 2192880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:37,879][00205] Avg episode reward: [(0, '29.021')] +[2023-02-24 12:56:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8781824. Throughput: 0: 867.6. Samples: 2194900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:42,872][00205] Avg episode reward: [(0, '29.582')] +[2023-02-24 12:56:47,810][11215] Updated weights for policy 0, policy_version 2150 (0.0023) +[2023-02-24 12:56:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8806400. Throughput: 0: 896.2. Samples: 2200294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:47,877][00205] Avg episode reward: [(0, '28.292')] +[2023-02-24 12:56:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth... 
+[2023-02-24 12:56:48,011][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth +[2023-02-24 12:56:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8826880. Throughput: 0: 920.0. Samples: 2206842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 12:56:52,874][00205] Avg episode reward: [(0, '27.672')] +[2023-02-24 12:56:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8839168. Throughput: 0: 906.6. Samples: 2209364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:56:57,876][00205] Avg episode reward: [(0, '27.579')] +[2023-02-24 12:56:59,635][11215] Updated weights for policy 0, policy_version 2160 (0.0015) +[2023-02-24 12:57:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8855552. Throughput: 0: 869.4. Samples: 2213422. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:02,873][00205] Avg episode reward: [(0, '26.788')] +[2023-02-24 12:57:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8876032. Throughput: 0: 909.3. Samples: 2219356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:07,877][00205] Avg episode reward: [(0, '28.423')] +[2023-02-24 12:57:10,273][11215] Updated weights for policy 0, policy_version 2170 (0.0012) +[2023-02-24 12:57:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8896512. Throughput: 0: 919.4. Samples: 2222536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:12,879][00205] Avg episode reward: [(0, '28.220')] +[2023-02-24 12:57:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8912896. Throughput: 0: 890.7. Samples: 2227826. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 12:57:17,875][00205] Avg episode reward: [(0, '28.280')] +[2023-02-24 12:57:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8925184. Throughput: 0: 871.3. Samples: 2232088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:22,877][00205] Avg episode reward: [(0, '27.497')] +[2023-02-24 12:57:23,192][11215] Updated weights for policy 0, policy_version 2180 (0.0028) +[2023-02-24 12:57:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8949760. Throughput: 0: 891.6. Samples: 2235024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:27,873][00205] Avg episode reward: [(0, '29.520')] +[2023-02-24 12:57:32,529][11215] Updated weights for policy 0, policy_version 2190 (0.0018) +[2023-02-24 12:57:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8970240. Throughput: 0: 919.3. Samples: 2241662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:32,872][00205] Avg episode reward: [(0, '29.056')] +[2023-02-24 12:57:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8982528. Throughput: 0: 882.6. Samples: 2246558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:37,872][00205] Avg episode reward: [(0, '28.206')] +[2023-02-24 12:57:42,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8998912. Throughput: 0: 872.4. Samples: 2248622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:42,884][00205] Avg episode reward: [(0, '27.766')] +[2023-02-24 12:57:45,525][11215] Updated weights for policy 0, policy_version 2200 (0.0026) +[2023-02-24 12:57:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9019392. Throughput: 0: 904.6. Samples: 2254130. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:57:47,872][00205] Avg episode reward: [(0, '27.727')] +[2023-02-24 12:57:52,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9039872. Throughput: 0: 918.0. Samples: 2260664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:57:52,873][00205] Avg episode reward: [(0, '28.475')] +[2023-02-24 12:57:56,016][11215] Updated weights for policy 0, policy_version 2210 (0.0018) +[2023-02-24 12:57:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9056256. Throughput: 0: 899.6. Samples: 2263016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:57:57,877][00205] Avg episode reward: [(0, '26.638')] +[2023-02-24 12:58:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9068544. Throughput: 0: 873.3. Samples: 2267124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:02,879][00205] Avg episode reward: [(0, '26.333')] +[2023-02-24 12:58:07,770][11215] Updated weights for policy 0, policy_version 2220 (0.0034) +[2023-02-24 12:58:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9093120. Throughput: 0: 913.2. Samples: 2273180. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:58:07,873][00205] Avg episode reward: [(0, '25.888')] +[2023-02-24 12:58:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9113600. Throughput: 0: 920.0. Samples: 2276424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:58:12,875][00205] Avg episode reward: [(0, '24.725')] +[2023-02-24 12:58:17,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3549.6, 300 sec: 3568.3). Total num frames: 9125888. Throughput: 0: 887.7. Samples: 2281612. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:17,879][00205] Avg episode reward: [(0, '24.592')] +[2023-02-24 12:58:19,754][11215] Updated weights for policy 0, policy_version 2230 (0.0016) +[2023-02-24 12:58:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9142272. Throughput: 0: 872.0. Samples: 2285796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:22,878][00205] Avg episode reward: [(0, '24.101')] +[2023-02-24 12:58:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9162752. Throughput: 0: 895.9. Samples: 2288936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:27,872][00205] Avg episode reward: [(0, '25.623')] +[2023-02-24 12:58:30,071][11215] Updated weights for policy 0, policy_version 2240 (0.0014) +[2023-02-24 12:58:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 9183232. Throughput: 0: 919.9. Samples: 2295528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:32,874][00205] Avg episode reward: [(0, '26.641')] +[2023-02-24 12:58:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9199616. Throughput: 0: 881.3. Samples: 2300322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:37,878][00205] Avg episode reward: [(0, '26.937')] +[2023-02-24 12:58:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9211904. Throughput: 0: 872.7. Samples: 2302288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:58:42,878][00205] Avg episode reward: [(0, '27.227')] +[2023-02-24 12:58:42,996][11215] Updated weights for policy 0, policy_version 2250 (0.0017) +[2023-02-24 12:58:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9236480. Throughput: 0: 912.1. Samples: 2308170. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:58:47,872][00205] Avg episode reward: [(0, '26.217')] +[2023-02-24 12:58:47,884][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth... +[2023-02-24 12:58:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth +[2023-02-24 12:58:52,474][11215] Updated weights for policy 0, policy_version 2260 (0.0020) +[2023-02-24 12:58:52,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3618.0, 300 sec: 3582.3). Total num frames: 9256960. Throughput: 0: 922.3. Samples: 2314686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:58:52,879][00205] Avg episode reward: [(0, '27.372')] +[2023-02-24 12:58:57,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 9269248. Throughput: 0: 896.1. Samples: 2316748. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:58:57,876][00205] Avg episode reward: [(0, '26.596')] +[2023-02-24 12:59:02,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9285632. Throughput: 0: 872.1. Samples: 2320854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:02,873][00205] Avg episode reward: [(0, '26.704')] +[2023-02-24 12:59:05,405][11215] Updated weights for policy 0, policy_version 2270 (0.0020) +[2023-02-24 12:59:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9306112. Throughput: 0: 917.7. Samples: 2327092. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:07,872][00205] Avg episode reward: [(0, '26.088')] +[2023-02-24 12:59:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9326592. Throughput: 0: 919.9. Samples: 2330330. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:12,874][00205] Avg episode reward: [(0, '26.037')] +[2023-02-24 12:59:16,197][11215] Updated weights for policy 0, policy_version 2280 (0.0037) +[2023-02-24 12:59:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.4, 300 sec: 3582.3). Total num frames: 9342976. Throughput: 0: 882.9. Samples: 2335258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:17,876][00205] Avg episode reward: [(0, '26.873')] +[2023-02-24 12:59:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9359360. Throughput: 0: 871.3. Samples: 2339530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 12:59:22,872][00205] Avg episode reward: [(0, '27.022')] +[2023-02-24 12:59:27,569][11215] Updated weights for policy 0, policy_version 2290 (0.0015) +[2023-02-24 12:59:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9379840. Throughput: 0: 900.6. Samples: 2342814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:27,872][00205] Avg episode reward: [(0, '27.863')] +[2023-02-24 12:59:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 9400320. Throughput: 0: 917.4. Samples: 2349454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:32,875][00205] Avg episode reward: [(0, '27.149')] +[2023-02-24 12:59:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9412608. Throughput: 0: 869.7. Samples: 2353822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 12:59:37,873][00205] Avg episode reward: [(0, '27.391')] +[2023-02-24 12:59:40,176][11215] Updated weights for policy 0, policy_version 2300 (0.0019) +[2023-02-24 12:59:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9428992. Throughput: 0: 869.1. Samples: 2355858. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:42,872][00205] Avg episode reward: [(0, '28.339')] +[2023-02-24 12:59:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9449472. Throughput: 0: 911.9. Samples: 2361888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 12:59:47,873][00205] Avg episode reward: [(0, '28.968')] +[2023-02-24 12:59:49,807][11215] Updated weights for policy 0, policy_version 2310 (0.0011) +[2023-02-24 12:59:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3582.3). Total num frames: 9469952. Throughput: 0: 913.8. Samples: 2368214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:52,880][00205] Avg episode reward: [(0, '28.679')] +[2023-02-24 12:59:57,871][00205] Fps is (10 sec: 3685.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9486336. Throughput: 0: 888.1. Samples: 2370298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 12:59:57,875][00205] Avg episode reward: [(0, '28.517')] +[2023-02-24 13:00:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9498624. Throughput: 0: 867.0. Samples: 2374274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:02,872][00205] Avg episode reward: [(0, '29.111')] +[2023-02-24 13:00:03,025][11215] Updated weights for policy 0, policy_version 2320 (0.0020) +[2023-02-24 13:00:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9523200. Throughput: 0: 918.4. Samples: 2380856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:07,873][00205] Avg episode reward: [(0, '29.438')] +[2023-02-24 13:00:12,873][00205] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 9539584. Throughput: 0: 919.3. Samples: 2384184. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:12,876][00205] Avg episode reward: [(0, '29.535')] +[2023-02-24 13:00:12,976][11215] Updated weights for policy 0, policy_version 2330 (0.0015) +[2023-02-24 13:00:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 9555968. Throughput: 0: 871.9. Samples: 2388692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:17,876][00205] Avg episode reward: [(0, '29.166')] +[2023-02-24 13:00:22,870][00205] Fps is (10 sec: 3277.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9572352. Throughput: 0: 878.2. Samples: 2393340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:22,879][00205] Avg episode reward: [(0, '28.664')] +[2023-02-24 13:00:25,293][11215] Updated weights for policy 0, policy_version 2340 (0.0011) +[2023-02-24 13:00:27,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9592832. Throughput: 0: 907.0. Samples: 2396672. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:00:27,877][00205] Avg episode reward: [(0, '29.703')] +[2023-02-24 13:00:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9613312. Throughput: 0: 916.7. Samples: 2403140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:32,873][00205] Avg episode reward: [(0, '28.917')] +[2023-02-24 13:00:36,509][11215] Updated weights for policy 0, policy_version 2350 (0.0013) +[2023-02-24 13:00:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9625600. Throughput: 0: 869.5. Samples: 2407340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:00:37,872][00205] Avg episode reward: [(0, '30.191')] +[2023-02-24 13:00:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9641984. Throughput: 0: 868.9. Samples: 2409398. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:00:42,881][00205] Avg episode reward: [(0, '28.555')] +[2023-02-24 13:00:47,770][11215] Updated weights for policy 0, policy_version 2360 (0.0012) +[2023-02-24 13:00:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9666560. Throughput: 0: 915.4. Samples: 2415466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:00:47,872][00205] Avg episode reward: [(0, '28.287')] +[2023-02-24 13:00:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth... +[2023-02-24 13:00:48,005][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth +[2023-02-24 13:00:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9682944. Throughput: 0: 906.2. Samples: 2421636. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:00:52,875][00205] Avg episode reward: [(0, '28.414')] +[2023-02-24 13:00:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 9699328. Throughput: 0: 876.8. Samples: 2423636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:00:57,872][00205] Avg episode reward: [(0, '28.747')] +[2023-02-24 13:01:00,664][11215] Updated weights for policy 0, policy_version 2370 (0.0020) +[2023-02-24 13:01:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9715712. Throughput: 0: 870.8. Samples: 2427878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:01:02,879][00205] Avg episode reward: [(0, '28.769')] +[2023-02-24 13:01:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9736192. Throughput: 0: 916.4. Samples: 2434578. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:01:07,873][00205] Avg episode reward: [(0, '29.246')] +[2023-02-24 13:01:10,109][11215] Updated weights for policy 0, policy_version 2380 (0.0012) +[2023-02-24 13:01:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 9756672. Throughput: 0: 914.9. Samples: 2437842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:12,877][00205] Avg episode reward: [(0, '28.743')] +[2023-02-24 13:01:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9768960. Throughput: 0: 871.5. Samples: 2442356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:17,873][00205] Avg episode reward: [(0, '27.426')] +[2023-02-24 13:01:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9785344. Throughput: 0: 881.3. Samples: 2447000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:01:22,872][00205] Avg episode reward: [(0, '29.245')] +[2023-02-24 13:01:22,886][11215] Updated weights for policy 0, policy_version 2390 (0.0014) +[2023-02-24 13:01:27,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9809920. Throughput: 0: 909.5. Samples: 2450324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:27,872][00205] Avg episode reward: [(0, '28.138')] +[2023-02-24 13:01:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9826304. Throughput: 0: 919.0. Samples: 2456820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:32,880][00205] Avg episode reward: [(0, '29.533')] +[2023-02-24 13:01:33,120][11215] Updated weights for policy 0, policy_version 2400 (0.0017) +[2023-02-24 13:01:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9842688. Throughput: 0: 873.8. Samples: 2460956. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:37,875][00205] Avg episode reward: [(0, '26.979')] +[2023-02-24 13:01:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9859072. Throughput: 0: 876.7. Samples: 2463086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:42,873][00205] Avg episode reward: [(0, '26.891')] +[2023-02-24 13:01:45,223][11215] Updated weights for policy 0, policy_version 2410 (0.0014) +[2023-02-24 13:01:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9879552. Throughput: 0: 921.8. Samples: 2469360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:01:47,873][00205] Avg episode reward: [(0, '29.293')] +[2023-02-24 13:01:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9900032. Throughput: 0: 906.2. Samples: 2475356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:52,877][00205] Avg episode reward: [(0, '31.339')] +[2023-02-24 13:01:56,888][11215] Updated weights for policy 0, policy_version 2420 (0.0014) +[2023-02-24 13:01:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9912320. Throughput: 0: 879.5. Samples: 2477420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:01:57,880][00205] Avg episode reward: [(0, '30.273')] +[2023-02-24 13:02:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9928704. Throughput: 0: 876.2. Samples: 2481786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:02,879][00205] Avg episode reward: [(0, '30.971')] +[2023-02-24 13:02:07,534][11215] Updated weights for policy 0, policy_version 2430 (0.0027) +[2023-02-24 13:02:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9953280. Throughput: 0: 920.0. Samples: 2488400. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:07,872][00205] Avg episode reward: [(0, '30.673')] +[2023-02-24 13:02:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 9973760. Throughput: 0: 926.2. Samples: 2492002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:12,879][00205] Avg episode reward: [(0, '32.861')] +[2023-02-24 13:02:12,881][11201] Saving new best policy, reward=32.861! +[2023-02-24 13:02:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9986048. Throughput: 0: 885.0. Samples: 2496644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:02:17,873][00205] Avg episode reward: [(0, '32.329')] +[2023-02-24 13:02:19,426][11215] Updated weights for policy 0, policy_version 2440 (0.0027) +[2023-02-24 13:02:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 10006528. Throughput: 0: 913.5. Samples: 2502064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:22,878][00205] Avg episode reward: [(0, '33.120')] +[2023-02-24 13:02:22,881][11201] Saving new best policy, reward=33.120! +[2023-02-24 13:02:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 10031104. Throughput: 0: 944.7. Samples: 2505598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:27,872][00205] Avg episode reward: [(0, '32.114')] +[2023-02-24 13:02:28,321][11215] Updated weights for policy 0, policy_version 2450 (0.0011) +[2023-02-24 13:02:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10051584. Throughput: 0: 958.1. Samples: 2512476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:32,872][00205] Avg episode reward: [(0, '32.733')] +[2023-02-24 13:02:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10067968. Throughput: 0: 928.1. Samples: 2517120. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:37,876][00205] Avg episode reward: [(0, '31.443')] +[2023-02-24 13:02:40,375][11215] Updated weights for policy 0, policy_version 2460 (0.0024) +[2023-02-24 13:02:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10088448. Throughput: 0: 934.0. Samples: 2519452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:42,873][00205] Avg episode reward: [(0, '31.606')] +[2023-02-24 13:02:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10108928. Throughput: 0: 995.9. Samples: 2526602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:02:47,876][00205] Avg episode reward: [(0, '31.046')] +[2023-02-24 13:02:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth... +[2023-02-24 13:02:48,003][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth +[2023-02-24 13:02:48,873][11215] Updated weights for policy 0, policy_version 2470 (0.0014) +[2023-02-24 13:02:52,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3637.8). Total num frames: 10129408. Throughput: 0: 986.9. Samples: 2532812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:52,876][00205] Avg episode reward: [(0, '29.429')] +[2023-02-24 13:02:57,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3891.0, 300 sec: 3651.7). Total num frames: 10145792. Throughput: 0: 956.8. Samples: 2535062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:02:57,880][00205] Avg episode reward: [(0, '28.950')] +[2023-02-24 13:03:01,104][11215] Updated weights for policy 0, policy_version 2480 (0.0018) +[2023-02-24 13:03:02,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3959.5, 300 sec: 3637.8). Total num frames: 10166272. Throughput: 0: 965.6. Samples: 2540094. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:02,877][00205] Avg episode reward: [(0, '28.169')] +[2023-02-24 13:03:07,870][00205] Fps is (10 sec: 4097.1, 60 sec: 3891.2, 300 sec: 3637.8). Total num frames: 10186752. Throughput: 0: 1006.3. Samples: 2547348. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:03:07,874][00205] Avg episode reward: [(0, '28.697')] +[2023-02-24 13:03:09,547][11215] Updated weights for policy 0, policy_version 2490 (0.0022) +[2023-02-24 13:03:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 10207232. Throughput: 0: 1006.0. Samples: 2550870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:03:12,874][00205] Avg episode reward: [(0, '28.847')] +[2023-02-24 13:03:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10223616. Throughput: 0: 952.1. Samples: 2555322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:17,876][00205] Avg episode reward: [(0, '29.297')] +[2023-02-24 13:03:21,717][11215] Updated weights for policy 0, policy_version 2500 (0.0025) +[2023-02-24 13:03:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10244096. Throughput: 0: 975.1. Samples: 2560998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:03:22,877][00205] Avg episode reward: [(0, '30.799')] +[2023-02-24 13:03:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 10268672. Throughput: 0: 1003.7. Samples: 2564618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:03:27,873][00205] Avg episode reward: [(0, '30.425')] +[2023-02-24 13:03:30,474][11215] Updated weights for policy 0, policy_version 2510 (0.0011) +[2023-02-24 13:03:32,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 10285056. Throughput: 0: 992.0. Samples: 2571242. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:32,874][00205] Avg episode reward: [(0, '30.344')] +[2023-02-24 13:03:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 10301440. Throughput: 0: 956.1. Samples: 2575834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:37,878][00205] Avg episode reward: [(0, '30.769')] +[2023-02-24 13:03:42,226][11215] Updated weights for policy 0, policy_version 2520 (0.0019) +[2023-02-24 13:03:42,871][00205] Fps is (10 sec: 3686.3, 60 sec: 3891.1, 300 sec: 3679.4). Total num frames: 10321920. Throughput: 0: 967.2. Samples: 2578586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:03:42,878][00205] Avg episode reward: [(0, '31.018')] +[2023-02-24 13:03:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3693.4). Total num frames: 10346496. Throughput: 0: 1017.7. Samples: 2585892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:03:47,873][00205] Avg episode reward: [(0, '28.245')] +[2023-02-24 13:03:51,216][11215] Updated weights for policy 0, policy_version 2530 (0.0011) +[2023-02-24 13:03:52,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10366976. Throughput: 0: 990.8. Samples: 2591934. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 13:03:52,878][00205] Avg episode reward: [(0, '28.230')] +[2023-02-24 13:03:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.3, 300 sec: 3707.2). Total num frames: 10379264. Throughput: 0: 960.7. Samples: 2594100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:03:57,878][00205] Avg episode reward: [(0, '29.088')] +[2023-02-24 13:04:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 10399744. Throughput: 0: 981.2. Samples: 2599476. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:04:02,875][00205] Avg episode reward: [(0, '27.915')] +[2023-02-24 13:04:02,958][11215] Updated weights for policy 0, policy_version 2540 (0.0022) +[2023-02-24 13:04:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10424320. Throughput: 0: 1016.7. Samples: 2606750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:07,875][00205] Avg episode reward: [(0, '27.682')] +[2023-02-24 13:04:12,555][11215] Updated weights for policy 0, policy_version 2550 (0.0021) +[2023-02-24 13:04:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10444800. Throughput: 0: 1008.3. Samples: 2609992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:12,878][00205] Avg episode reward: [(0, '26.928')] +[2023-02-24 13:04:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 10457088. Throughput: 0: 962.1. Samples: 2614536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:17,880][00205] Avg episode reward: [(0, '27.941')] +[2023-02-24 13:04:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10481664. Throughput: 0: 993.4. Samples: 2620536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:04:22,872][00205] Avg episode reward: [(0, '26.358')] +[2023-02-24 13:04:23,456][11215] Updated weights for policy 0, policy_version 2560 (0.0016) +[2023-02-24 13:04:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 10506240. Throughput: 0: 1012.4. Samples: 2624142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:04:27,872][00205] Avg episode reward: [(0, '25.797')] +[2023-02-24 13:04:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3762.7). Total num frames: 10522624. Throughput: 0: 987.9. Samples: 2630348. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:04:32,875][00205] Avg episode reward: [(0, '27.648')] +[2023-02-24 13:04:33,598][11215] Updated weights for policy 0, policy_version 2570 (0.0011) +[2023-02-24 13:04:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10539008. Throughput: 0: 955.4. Samples: 2634928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:37,877][00205] Avg episode reward: [(0, '28.105')] +[2023-02-24 13:04:42,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10559488. Throughput: 0: 972.3. Samples: 2637854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:04:42,873][00205] Avg episode reward: [(0, '29.040')] +[2023-02-24 13:04:43,957][11215] Updated weights for policy 0, policy_version 2580 (0.0014) +[2023-02-24 13:04:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3776.6). Total num frames: 10584064. Throughput: 0: 1014.0. Samples: 2645106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:04:47,873][00205] Avg episode reward: [(0, '28.718')] +[2023-02-24 13:04:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth... +[2023-02-24 13:04:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth +[2023-02-24 13:04:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 10600448. Throughput: 0: 974.1. Samples: 2650584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:04:52,873][00205] Avg episode reward: [(0, '29.225')] +[2023-02-24 13:04:54,891][11215] Updated weights for policy 0, policy_version 2590 (0.0017) +[2023-02-24 13:04:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 10616832. Throughput: 0: 951.6. Samples: 2652814. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:04:57,879][00205] Avg episode reward: [(0, '30.032')] +[2023-02-24 13:05:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 10637312. Throughput: 0: 972.8. Samples: 2658312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:02,873][00205] Avg episode reward: [(0, '30.642')] +[2023-02-24 13:05:05,055][11215] Updated weights for policy 0, policy_version 2600 (0.0022) +[2023-02-24 13:05:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.5). Total num frames: 10661888. Throughput: 0: 1001.2. Samples: 2665590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:07,880][00205] Avg episode reward: [(0, '29.211')] +[2023-02-24 13:05:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10678272. Throughput: 0: 983.6. Samples: 2668406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:05:12,877][00205] Avg episode reward: [(0, '30.018')] +[2023-02-24 13:05:16,702][11215] Updated weights for policy 0, policy_version 2610 (0.0038) +[2023-02-24 13:05:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 10690560. Throughput: 0: 946.1. Samples: 2672920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:05:17,876][00205] Avg episode reward: [(0, '30.640')] +[2023-02-24 13:05:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10715136. Throughput: 0: 981.7. Samples: 2679104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:22,873][00205] Avg episode reward: [(0, '30.615')] +[2023-02-24 13:05:25,826][11215] Updated weights for policy 0, policy_version 2620 (0.0031) +[2023-02-24 13:05:27,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10739712. Throughput: 0: 996.9. Samples: 2682712. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:05:27,873][00205] Avg episode reward: [(0, '31.885')] +[2023-02-24 13:05:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 10756096. Throughput: 0: 969.5. Samples: 2688734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:32,877][00205] Avg episode reward: [(0, '31.684')] +[2023-02-24 13:05:37,665][11215] Updated weights for policy 0, policy_version 2630 (0.0021) +[2023-02-24 13:05:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3891.0, 300 sec: 3832.1). Total num frames: 10772480. Throughput: 0: 949.5. Samples: 2693316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:37,882][00205] Avg episode reward: [(0, '32.848')] +[2023-02-24 13:05:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10792960. Throughput: 0: 972.3. Samples: 2696566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:42,872][00205] Avg episode reward: [(0, '32.834')] +[2023-02-24 13:05:46,239][11215] Updated weights for policy 0, policy_version 2640 (0.0014) +[2023-02-24 13:05:47,870][00205] Fps is (10 sec: 4916.9, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 10821632. Throughput: 0: 1012.2. Samples: 2703862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:05:47,878][00205] Avg episode reward: [(0, '33.923')] +[2023-02-24 13:05:47,893][11201] Saving new best policy, reward=33.923! +[2023-02-24 13:05:52,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3891.1, 300 sec: 3846.0). Total num frames: 10833920. Throughput: 0: 966.0. Samples: 2709064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:05:52,879][00205] Avg episode reward: [(0, '32.833')] +[2023-02-24 13:05:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10850304. Throughput: 0: 952.7. Samples: 2711278. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:05:57,873][00205] Avg episode reward: [(0, '30.821')] +[2023-02-24 13:05:58,763][11215] Updated weights for policy 0, policy_version 2650 (0.0031) +[2023-02-24 13:06:02,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10870784. Throughput: 0: 982.6. Samples: 2717138. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:06:02,872][00205] Avg episode reward: [(0, '28.643')] +[2023-02-24 13:06:07,363][11215] Updated weights for policy 0, policy_version 2660 (0.0016) +[2023-02-24 13:06:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 10895360. Throughput: 0: 1004.5. Samples: 2724308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:07,877][00205] Avg episode reward: [(0, '28.408')] +[2023-02-24 13:06:12,874][00205] Fps is (10 sec: 4094.4, 60 sec: 3890.9, 300 sec: 3873.8). Total num frames: 10911744. Throughput: 0: 982.5. Samples: 2726928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:12,879][00205] Avg episode reward: [(0, '28.654')] +[2023-02-24 13:06:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 10928128. Throughput: 0: 951.6. Samples: 2731556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:06:17,875][00205] Avg episode reward: [(0, '27.336')] +[2023-02-24 13:06:19,284][11215] Updated weights for policy 0, policy_version 2670 (0.0011) +[2023-02-24 13:06:22,877][00205] Fps is (10 sec: 4094.8, 60 sec: 3959.0, 300 sec: 3873.8). Total num frames: 10952704. Throughput: 0: 995.8. Samples: 2738132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:22,883][00205] Avg episode reward: [(0, '27.078')] +[2023-02-24 13:06:27,724][11215] Updated weights for policy 0, policy_version 2680 (0.0025) +[2023-02-24 13:06:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 10977280. 
Throughput: 0: 1004.9. Samples: 2741788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:27,877][00205] Avg episode reward: [(0, '29.119')] +[2023-02-24 13:06:32,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 10989568. Throughput: 0: 968.8. Samples: 2747460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:06:32,874][00205] Avg episode reward: [(0, '29.675')] +[2023-02-24 13:06:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.4, 300 sec: 3887.7). Total num frames: 11005952. Throughput: 0: 952.8. Samples: 2751940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:37,872][00205] Avg episode reward: [(0, '29.712')] +[2023-02-24 13:06:39,952][11215] Updated weights for policy 0, policy_version 2690 (0.0012) +[2023-02-24 13:06:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11030528. Throughput: 0: 981.0. Samples: 2755424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:06:42,877][00205] Avg episode reward: [(0, '30.627')] +[2023-02-24 13:06:47,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11055104. Throughput: 0: 1011.0. Samples: 2762634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:47,877][00205] Avg episode reward: [(0, '31.811')] +[2023-02-24 13:06:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth... +[2023-02-24 13:06:48,038][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth +[2023-02-24 13:06:49,145][11215] Updated weights for policy 0, policy_version 2700 (0.0011) +[2023-02-24 13:06:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11067392. Throughput: 0: 960.7. Samples: 2767540. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:06:52,875][00205] Avg episode reward: [(0, '32.203')] +[2023-02-24 13:06:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11083776. Throughput: 0: 952.8. Samples: 2769800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:06:57,872][00205] Avg episode reward: [(0, '31.032')] +[2023-02-24 13:07:00,880][11215] Updated weights for policy 0, policy_version 2710 (0.0011) +[2023-02-24 13:07:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 11108352. Throughput: 0: 986.4. Samples: 2775946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:07:02,873][00205] Avg episode reward: [(0, '29.995')] +[2023-02-24 13:07:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11128832. Throughput: 0: 999.9. Samples: 2783122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:07,873][00205] Avg episode reward: [(0, '31.850')] +[2023-02-24 13:07:10,795][11215] Updated weights for policy 0, policy_version 2720 (0.0023) +[2023-02-24 13:07:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.5, 300 sec: 3929.4). Total num frames: 11145216. Throughput: 0: 969.5. Samples: 2785414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:12,872][00205] Avg episode reward: [(0, '30.840')] +[2023-02-24 13:07:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11161600. Throughput: 0: 943.5. Samples: 2789918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:07:17,873][00205] Avg episode reward: [(0, '29.945')] +[2023-02-24 13:07:21,861][11215] Updated weights for policy 0, policy_version 2730 (0.0015) +[2023-02-24 13:07:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.6, 300 sec: 3915.5). Total num frames: 11186176. Throughput: 0: 988.6. Samples: 2796426. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:07:22,872][00205] Avg episode reward: [(0, '28.391')] +[2023-02-24 13:07:27,872][00205] Fps is (10 sec: 4504.7, 60 sec: 3822.8, 300 sec: 3915.5). Total num frames: 11206656. Throughput: 0: 988.4. Samples: 2799904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:07:27,880][00205] Avg episode reward: [(0, '28.858')] +[2023-02-24 13:07:32,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3901.6). Total num frames: 11218944. Throughput: 0: 941.9. Samples: 2805020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:07:32,872][00205] Avg episode reward: [(0, '31.238')] +[2023-02-24 13:07:33,132][11215] Updated weights for policy 0, policy_version 2740 (0.0011) +[2023-02-24 13:07:37,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 11235328. Throughput: 0: 931.2. Samples: 2809444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:37,872][00205] Avg episode reward: [(0, '30.567')] +[2023-02-24 13:07:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11259904. Throughput: 0: 961.5. Samples: 2813068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:42,878][00205] Avg episode reward: [(0, '30.264')] +[2023-02-24 13:07:42,998][11215] Updated weights for policy 0, policy_version 2750 (0.0012) +[2023-02-24 13:07:47,875][00205] Fps is (10 sec: 4912.7, 60 sec: 3822.6, 300 sec: 3915.4). Total num frames: 11284480. Throughput: 0: 986.8. Samples: 2820358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:47,880][00205] Avg episode reward: [(0, '31.168')] +[2023-02-24 13:07:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11296768. Throughput: 0: 935.2. Samples: 2825206. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:07:52,874][00205] Avg episode reward: [(0, '32.204')] +[2023-02-24 13:07:54,363][11215] Updated weights for policy 0, policy_version 2760 (0.0034) +[2023-02-24 13:07:57,870][00205] Fps is (10 sec: 3278.5, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11317248. Throughput: 0: 934.0. Samples: 2827446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:07:57,872][00205] Avg episode reward: [(0, '33.362')] +[2023-02-24 13:08:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11337728. Throughput: 0: 978.8. Samples: 2833964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:02,878][00205] Avg episode reward: [(0, '33.448')] +[2023-02-24 13:08:03,838][11215] Updated weights for policy 0, policy_version 2770 (0.0017) +[2023-02-24 13:08:07,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11362304. Throughput: 0: 987.8. Samples: 2840876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:08:07,880][00205] Avg episode reward: [(0, '32.975')] +[2023-02-24 13:08:12,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3901.6). Total num frames: 11374592. Throughput: 0: 960.1. Samples: 2843108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:12,880][00205] Avg episode reward: [(0, '32.114')] +[2023-02-24 13:08:15,661][11215] Updated weights for policy 0, policy_version 2780 (0.0012) +[2023-02-24 13:08:17,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11395072. Throughput: 0: 945.3. Samples: 2847558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:08:17,873][00205] Avg episode reward: [(0, '30.239')] +[2023-02-24 13:08:22,870][00205] Fps is (10 sec: 4506.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11419648. Throughput: 0: 1004.7. Samples: 2854656. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:22,873][00205] Avg episode reward: [(0, '29.244')] +[2023-02-24 13:08:24,526][11215] Updated weights for policy 0, policy_version 2790 (0.0011) +[2023-02-24 13:08:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11440128. Throughput: 0: 1002.7. Samples: 2858190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:08:27,874][00205] Avg episode reward: [(0, '28.940')] +[2023-02-24 13:08:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11452416. Throughput: 0: 953.8. Samples: 2863276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:08:32,872][00205] Avg episode reward: [(0, '27.881')] +[2023-02-24 13:08:36,720][11215] Updated weights for policy 0, policy_version 2800 (0.0016) +[2023-02-24 13:08:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11472896. Throughput: 0: 957.2. Samples: 2868282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:08:37,880][00205] Avg episode reward: [(0, '27.963')] +[2023-02-24 13:08:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11497472. Throughput: 0: 986.5. Samples: 2871838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:42,872][00205] Avg episode reward: [(0, '29.039')] +[2023-02-24 13:08:45,237][11215] Updated weights for policy 0, policy_version 2810 (0.0018) +[2023-02-24 13:08:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.5, 300 sec: 3901.6). Total num frames: 11517952. Throughput: 0: 1002.6. Samples: 2879082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:47,876][00205] Avg episode reward: [(0, '29.905')] +[2023-02-24 13:08:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth... 
+[2023-02-24 13:08:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth +[2023-02-24 13:08:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11530240. Throughput: 0: 947.2. Samples: 2883500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:08:52,874][00205] Avg episode reward: [(0, '30.130')] +[2023-02-24 13:08:57,602][11215] Updated weights for policy 0, policy_version 2820 (0.0034) +[2023-02-24 13:08:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11550720. Throughput: 0: 947.8. Samples: 2885758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:08:57,881][00205] Avg episode reward: [(0, '31.498')] +[2023-02-24 13:09:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11575296. Throughput: 0: 999.3. Samples: 2892526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:02,873][00205] Avg episode reward: [(0, '30.983')] +[2023-02-24 13:09:06,160][11215] Updated weights for policy 0, policy_version 2830 (0.0021) +[2023-02-24 13:09:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11595776. Throughput: 0: 991.9. Samples: 2899292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:07,873][00205] Avg episode reward: [(0, '30.991')] +[2023-02-24 13:09:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3915.5). Total num frames: 11612160. Throughput: 0: 964.2. Samples: 2901580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:12,877][00205] Avg episode reward: [(0, '30.112')] +[2023-02-24 13:09:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11628544. Throughput: 0: 954.1. Samples: 2906212. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:17,872][00205] Avg episode reward: [(0, '28.870')] +[2023-02-24 13:09:18,305][11215] Updated weights for policy 0, policy_version 2840 (0.0025) +[2023-02-24 13:09:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11653120. Throughput: 0: 1004.3. Samples: 2913476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:22,872][00205] Avg episode reward: [(0, '27.218')] +[2023-02-24 13:09:27,339][11215] Updated weights for policy 0, policy_version 2850 (0.0025) +[2023-02-24 13:09:27,875][00205] Fps is (10 sec: 4503.2, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11673600. Throughput: 0: 1003.3. Samples: 2916992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:09:27,881][00205] Avg episode reward: [(0, '26.111')] +[2023-02-24 13:09:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11689984. Throughput: 0: 950.6. Samples: 2921860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:32,876][00205] Avg episode reward: [(0, '26.217')] +[2023-02-24 13:09:37,870][00205] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11706368. Throughput: 0: 971.2. Samples: 2927204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:37,881][00205] Avg episode reward: [(0, '27.113')] +[2023-02-24 13:09:38,824][11215] Updated weights for policy 0, policy_version 2860 (0.0024) +[2023-02-24 13:09:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11730944. Throughput: 0: 1000.1. Samples: 2930764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:09:42,878][00205] Avg episode reward: [(0, '27.195')] +[2023-02-24 13:09:47,874][00205] Fps is (10 sec: 4503.5, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11751424. Throughput: 0: 999.8. Samples: 2937522. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:09:47,883][00205] Avg episode reward: [(0, '26.724')] +[2023-02-24 13:09:48,667][11215] Updated weights for policy 0, policy_version 2870 (0.0023) +[2023-02-24 13:09:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11763712. Throughput: 0: 948.6. Samples: 2941980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:09:52,878][00205] Avg episode reward: [(0, '28.349')] +[2023-02-24 13:09:57,870][00205] Fps is (10 sec: 3278.3, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11784192. Throughput: 0: 948.8. Samples: 2944278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:09:57,876][00205] Avg episode reward: [(0, '28.480')] +[2023-02-24 13:10:00,095][11215] Updated weights for policy 0, policy_version 2880 (0.0020) +[2023-02-24 13:10:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11808768. Throughput: 0: 995.2. Samples: 2950994. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:10:02,879][00205] Avg episode reward: [(0, '29.259')] +[2023-02-24 13:10:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11829248. Throughput: 0: 972.9. Samples: 2957256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:10:07,882][00205] Avg episode reward: [(0, '28.575')] +[2023-02-24 13:10:10,845][11215] Updated weights for policy 0, policy_version 2890 (0.0022) +[2023-02-24 13:10:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11841536. Throughput: 0: 944.2. Samples: 2959476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:10:12,874][00205] Avg episode reward: [(0, '28.388')] +[2023-02-24 13:10:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11862016. Throughput: 0: 943.7. Samples: 2964328. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:17,873][00205] Avg episode reward: [(0, '27.157')] +[2023-02-24 13:10:21,046][11215] Updated weights for policy 0, policy_version 2900 (0.0014) +[2023-02-24 13:10:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11886592. Throughput: 0: 984.7. Samples: 2971514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:22,874][00205] Avg episode reward: [(0, '27.272')] +[2023-02-24 13:10:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3887.7). Total num frames: 11902976. Throughput: 0: 980.3. Samples: 2974876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:10:27,875][00205] Avg episode reward: [(0, '27.192')] +[2023-02-24 13:10:32,746][11215] Updated weights for policy 0, policy_version 2910 (0.0014) +[2023-02-24 13:10:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3887.8). Total num frames: 11919360. Throughput: 0: 928.2. Samples: 2979288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:32,874][00205] Avg episode reward: [(0, '28.060')] +[2023-02-24 13:10:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 11935744. Throughput: 0: 948.3. Samples: 2984652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:37,875][00205] Avg episode reward: [(0, '28.655')] +[2023-02-24 13:10:42,349][11215] Updated weights for policy 0, policy_version 2920 (0.0031) +[2023-02-24 13:10:42,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 11960320. Throughput: 0: 974.4. Samples: 2988128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:42,872][00205] Avg episode reward: [(0, '29.528')] +[2023-02-24 13:10:47,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3754.6, 300 sec: 3873.8). Total num frames: 11976704. Throughput: 0: 964.8. Samples: 2994414. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:10:47,883][00205] Avg episode reward: [(0, '30.080')] +[2023-02-24 13:10:47,923][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth... +[2023-02-24 13:10:48,074][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth +[2023-02-24 13:10:52,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3822.7, 300 sec: 3873.8). Total num frames: 11993088. Throughput: 0: 923.6. Samples: 2998820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:52,885][00205] Avg episode reward: [(0, '29.245')] +[2023-02-24 13:10:54,979][11215] Updated weights for policy 0, policy_version 2930 (0.0023) +[2023-02-24 13:10:57,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12013568. Throughput: 0: 926.2. Samples: 3001154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:10:57,875][00205] Avg episode reward: [(0, '29.862')] +[2023-02-24 13:11:02,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12038144. Throughput: 0: 979.2. Samples: 3008392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:11:02,877][00205] Avg episode reward: [(0, '29.976')] +[2023-02-24 13:11:03,367][11215] Updated weights for policy 0, policy_version 2940 (0.0012) +[2023-02-24 13:11:07,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3873.8). Total num frames: 12054528. Throughput: 0: 956.8. Samples: 3014576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:07,881][00205] Avg episode reward: [(0, '29.996')] +[2023-02-24 13:11:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12070912. Throughput: 0: 931.0. Samples: 3016772. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:12,874][00205] Avg episode reward: [(0, '29.561')] +[2023-02-24 13:11:15,626][11215] Updated weights for policy 0, policy_version 2950 (0.0040) +[2023-02-24 13:11:17,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12091392. Throughput: 0: 949.2. Samples: 3022000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:17,872][00205] Avg episode reward: [(0, '29.569')] +[2023-02-24 13:11:22,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12115968. Throughput: 0: 988.6. Samples: 3029138. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:11:22,880][00205] Avg episode reward: [(0, '29.785')] +[2023-02-24 13:11:24,419][11215] Updated weights for policy 0, policy_version 2960 (0.0035) +[2023-02-24 13:11:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12132352. Throughput: 0: 983.2. Samples: 3032370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:11:27,876][00205] Avg episode reward: [(0, '30.146')] +[2023-02-24 13:11:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3873.8). Total num frames: 12148736. Throughput: 0: 944.2. Samples: 3036898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:32,873][00205] Avg episode reward: [(0, '29.465')] +[2023-02-24 13:11:36,684][11215] Updated weights for policy 0, policy_version 2970 (0.0027) +[2023-02-24 13:11:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 12169216. Throughput: 0: 965.7. Samples: 3042274. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:11:37,880][00205] Avg episode reward: [(0, '29.023')] +[2023-02-24 13:11:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12189696. Throughput: 0: 992.4. Samples: 3045810. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:11:42,872][00205] Avg episode reward: [(0, '28.890')] +[2023-02-24 13:11:46,019][11215] Updated weights for policy 0, policy_version 2980 (0.0028) +[2023-02-24 13:11:47,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3891.5, 300 sec: 3873.8). Total num frames: 12210176. Throughput: 0: 972.4. Samples: 3052150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:11:47,876][00205] Avg episode reward: [(0, '29.086')] +[2023-02-24 13:11:52,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3823.2, 300 sec: 3859.9). Total num frames: 12222464. Throughput: 0: 931.8. Samples: 3056502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:52,873][00205] Avg episode reward: [(0, '28.779')] +[2023-02-24 13:11:57,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12242944. Throughput: 0: 936.8. Samples: 3058930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:11:57,877][00205] Avg episode reward: [(0, '28.479')] +[2023-02-24 13:11:58,062][11215] Updated weights for policy 0, policy_version 2990 (0.0019) +[2023-02-24 13:12:02,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12267520. Throughput: 0: 975.8. Samples: 3065910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:02,879][00205] Avg episode reward: [(0, '28.994')] +[2023-02-24 13:12:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3823.1, 300 sec: 3859.9). Total num frames: 12283904. Throughput: 0: 948.9. Samples: 3071840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:07,877][00205] Avg episode reward: [(0, '27.318')] +[2023-02-24 13:12:08,107][11215] Updated weights for policy 0, policy_version 3000 (0.0012) +[2023-02-24 13:12:12,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3859.9). Total num frames: 12300288. Throughput: 0: 926.0. Samples: 3074042. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:12,880][00205] Avg episode reward: [(0, '27.280')] +[2023-02-24 13:12:17,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12320768. Throughput: 0: 939.7. Samples: 3079182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:12:17,874][00205] Avg episode reward: [(0, '27.394')] +[2023-02-24 13:12:19,207][11215] Updated weights for policy 0, policy_version 3010 (0.0028) +[2023-02-24 13:12:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3823.0, 300 sec: 3860.0). Total num frames: 12345344. Throughput: 0: 976.9. Samples: 3086234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:22,873][00205] Avg episode reward: [(0, '27.006')] +[2023-02-24 13:12:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12361728. Throughput: 0: 968.4. Samples: 3089388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:12:27,874][00205] Avg episode reward: [(0, '27.804')] +[2023-02-24 13:12:30,189][11215] Updated weights for policy 0, policy_version 3020 (0.0017) +[2023-02-24 13:12:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12374016. Throughput: 0: 923.7. Samples: 3093716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:12:32,872][00205] Avg episode reward: [(0, '26.419')] +[2023-02-24 13:12:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 12394496. Throughput: 0: 950.6. Samples: 3099278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:37,877][00205] Avg episode reward: [(0, '26.483')] +[2023-02-24 13:12:40,594][11215] Updated weights for policy 0, policy_version 3030 (0.0017) +[2023-02-24 13:12:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12419072. Throughput: 0: 974.0. Samples: 3102758. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:12:42,875][00205] Avg episode reward: [(0, '28.327')] +[2023-02-24 13:12:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12435456. Throughput: 0: 959.6. Samples: 3109090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:47,874][00205] Avg episode reward: [(0, '28.554')] +[2023-02-24 13:12:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth... +[2023-02-24 13:12:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth +[2023-02-24 13:12:52,202][11215] Updated weights for policy 0, policy_version 3040 (0.0017) +[2023-02-24 13:12:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 12451840. Throughput: 0: 923.6. Samples: 3113400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:12:52,875][00205] Avg episode reward: [(0, '28.876')] +[2023-02-24 13:12:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12472320. Throughput: 0: 930.4. Samples: 3115908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:12:57,880][00205] Avg episode reward: [(0, '29.707')] +[2023-02-24 13:13:02,140][11215] Updated weights for policy 0, policy_version 3050 (0.0024) +[2023-02-24 13:13:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12492800. Throughput: 0: 966.6. Samples: 3122680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:02,872][00205] Avg episode reward: [(0, '29.861')] +[2023-02-24 13:13:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 12509184. Throughput: 0: 929.6. Samples: 3128068. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:07,878][00205] Avg episode reward: [(0, '30.553')] +[2023-02-24 13:13:12,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12525568. Throughput: 0: 906.7. Samples: 3130188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:13:12,878][00205] Avg episode reward: [(0, '30.168')] +[2023-02-24 13:13:14,790][11215] Updated weights for policy 0, policy_version 3060 (0.0019) +[2023-02-24 13:13:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12546048. Throughput: 0: 931.7. Samples: 3135642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:17,879][00205] Avg episode reward: [(0, '29.916')] +[2023-02-24 13:13:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12570624. Throughput: 0: 963.1. Samples: 3142616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:22,873][00205] Avg episode reward: [(0, '28.476')] +[2023-02-24 13:13:23,608][11215] Updated weights for policy 0, policy_version 3070 (0.0025) +[2023-02-24 13:13:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3846.1). Total num frames: 12587008. Throughput: 0: 950.5. Samples: 3145532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:13:27,877][00205] Avg episode reward: [(0, '27.511')] +[2023-02-24 13:13:32,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 12599296. Throughput: 0: 904.5. Samples: 3149794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:32,876][00205] Avg episode reward: [(0, '27.058')] +[2023-02-24 13:13:36,092][11215] Updated weights for policy 0, policy_version 3080 (0.0022) +[2023-02-24 13:13:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12623872. Throughput: 0: 941.1. Samples: 3155750. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:37,872][00205] Avg episode reward: [(0, '27.133')] +[2023-02-24 13:13:42,870][00205] Fps is (10 sec: 4506.2, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12644352. Throughput: 0: 964.2. Samples: 3159296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:42,881][00205] Avg episode reward: [(0, '27.275')] +[2023-02-24 13:13:45,463][11215] Updated weights for policy 0, policy_version 3090 (0.0025) +[2023-02-24 13:13:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12660736. Throughput: 0: 945.8. Samples: 3165240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:13:47,874][00205] Avg episode reward: [(0, '26.354')] +[2023-02-24 13:13:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12677120. Throughput: 0: 925.4. Samples: 3169712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:13:52,877][00205] Avg episode reward: [(0, '25.739')] +[2023-02-24 13:13:57,098][11215] Updated weights for policy 0, policy_version 3100 (0.0019) +[2023-02-24 13:13:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12697600. Throughput: 0: 943.2. Samples: 3172630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:13:57,879][00205] Avg episode reward: [(0, '25.633')] +[2023-02-24 13:14:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12722176. Throughput: 0: 977.0. Samples: 3179608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:14:02,877][00205] Avg episode reward: [(0, '27.165')] +[2023-02-24 13:14:07,380][11215] Updated weights for policy 0, policy_version 3110 (0.0033) +[2023-02-24 13:14:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12738560. Throughput: 0: 941.1. Samples: 3184964. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:14:07,875][00205] Avg episode reward: [(0, '25.456')] +[2023-02-24 13:14:12,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12750848. Throughput: 0: 925.6. Samples: 3187186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:14:12,881][00205] Avg episode reward: [(0, '25.933')] +[2023-02-24 13:14:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12775424. Throughput: 0: 960.3. Samples: 3193004. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:14:17,872][00205] Avg episode reward: [(0, '26.471')] +[2023-02-24 13:14:18,152][11215] Updated weights for policy 0, policy_version 3120 (0.0019) +[2023-02-24 13:14:22,870][00205] Fps is (10 sec: 4915.5, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 12800000. Throughput: 0: 986.0. Samples: 3200122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:22,874][00205] Avg episode reward: [(0, '27.418')] +[2023-02-24 13:14:27,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3822.6, 300 sec: 3818.2). Total num frames: 12816384. Throughput: 0: 965.5. Samples: 3202750. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:27,883][00205] Avg episode reward: [(0, '27.568')] +[2023-02-24 13:14:28,893][11215] Updated weights for policy 0, policy_version 3130 (0.0023) +[2023-02-24 13:14:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 12828672. Throughput: 0: 932.2. Samples: 3207188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:14:32,875][00205] Avg episode reward: [(0, '26.551')] +[2023-02-24 13:14:37,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12853248. Throughput: 0: 975.5. Samples: 3213610. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:14:37,872][00205] Avg episode reward: [(0, '26.103')] +[2023-02-24 13:14:39,206][11215] Updated weights for policy 0, policy_version 3140 (0.0014) +[2023-02-24 13:14:42,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 12877824. Throughput: 0: 990.2. Samples: 3217188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:14:42,872][00205] Avg episode reward: [(0, '25.734')] +[2023-02-24 13:14:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 12894208. Throughput: 0: 965.4. Samples: 3223052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:14:47,872][00205] Avg episode reward: [(0, '25.618')] +[2023-02-24 13:14:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth... +[2023-02-24 13:14:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth +[2023-02-24 13:14:50,425][11215] Updated weights for policy 0, policy_version 3150 (0.0013) +[2023-02-24 13:14:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12906496. Throughput: 0: 947.2. Samples: 3227590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:14:52,873][00205] Avg episode reward: [(0, '25.574')] +[2023-02-24 13:14:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 12931072. Throughput: 0: 968.8. Samples: 3230782. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:14:57,872][00205] Avg episode reward: [(0, '25.325')] +[2023-02-24 13:15:00,192][11215] Updated weights for policy 0, policy_version 3160 (0.0023) +[2023-02-24 13:15:02,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 12955648. Throughput: 0: 988.6. Samples: 3237490. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:15:02,872][00205] Avg episode reward: [(0, '27.306')] +[2023-02-24 13:15:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12967936. Throughput: 0: 951.6. Samples: 3242942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:07,876][00205] Avg episode reward: [(0, '28.133')] +[2023-02-24 13:15:12,154][11215] Updated weights for policy 0, policy_version 3170 (0.0011) +[2023-02-24 13:15:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.3, 300 sec: 3804.4). Total num frames: 12984320. Throughput: 0: 943.1. Samples: 3245184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:12,879][00205] Avg episode reward: [(0, '28.380')] +[2023-02-24 13:15:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 13008896. Throughput: 0: 981.4. Samples: 3251352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:15:17,879][00205] Avg episode reward: [(0, '29.088')] +[2023-02-24 13:15:20,814][11215] Updated weights for policy 0, policy_version 3180 (0.0021) +[2023-02-24 13:15:22,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13033472. Throughput: 0: 998.4. Samples: 3258536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:15:22,873][00205] Avg episode reward: [(0, '31.138')] +[2023-02-24 13:15:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.6, 300 sec: 3832.2). Total num frames: 13049856. Throughput: 0: 974.3. Samples: 3261032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:15:27,873][00205] Avg episode reward: [(0, '30.174')] +[2023-02-24 13:15:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13062144. Throughput: 0: 944.3. Samples: 3265546. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:15:32,873][00205] Avg episode reward: [(0, '28.223')] +[2023-02-24 13:15:33,036][11215] Updated weights for policy 0, policy_version 3190 (0.0014) +[2023-02-24 13:15:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13086720. Throughput: 0: 992.4. Samples: 3272246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:15:37,873][00205] Avg episode reward: [(0, '29.218')] +[2023-02-24 13:15:41,379][11215] Updated weights for policy 0, policy_version 3200 (0.0024) +[2023-02-24 13:15:42,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13111296. Throughput: 0: 1001.8. Samples: 3275864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:42,878][00205] Avg episode reward: [(0, '28.585')] +[2023-02-24 13:15:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13127680. Throughput: 0: 978.6. Samples: 3281528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:47,876][00205] Avg episode reward: [(0, '28.434')] +[2023-02-24 13:15:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13144064. Throughput: 0: 958.6. Samples: 3286080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:52,875][00205] Avg episode reward: [(0, '27.921')] +[2023-02-24 13:15:53,572][11215] Updated weights for policy 0, policy_version 3210 (0.0019) +[2023-02-24 13:15:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13164544. Throughput: 0: 983.6. Samples: 3289448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:15:57,880][00205] Avg episode reward: [(0, '28.219')] +[2023-02-24 13:16:02,336][11215] Updated weights for policy 0, policy_version 3220 (0.0019) +[2023-02-24 13:16:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13189120. 
Throughput: 0: 1001.8. Samples: 3296434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:02,876][00205] Avg episode reward: [(0, '26.576')] +[2023-02-24 13:16:07,876][00205] Fps is (10 sec: 4093.3, 60 sec: 3959.1, 300 sec: 3846.0). Total num frames: 13205504. Throughput: 0: 955.4. Samples: 3301536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:16:07,879][00205] Avg episode reward: [(0, '26.522')] +[2023-02-24 13:16:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13221888. Throughput: 0: 949.9. Samples: 3303776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:12,873][00205] Avg episode reward: [(0, '27.093')] +[2023-02-24 13:16:14,616][11215] Updated weights for policy 0, policy_version 3230 (0.0030) +[2023-02-24 13:16:17,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13242368. Throughput: 0: 988.4. Samples: 3310022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:16:17,872][00205] Avg episode reward: [(0, '26.161')] +[2023-02-24 13:16:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13266944. Throughput: 0: 999.3. Samples: 3317214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:22,877][00205] Avg episode reward: [(0, '27.452')] +[2023-02-24 13:16:23,608][11215] Updated weights for policy 0, policy_version 3240 (0.0011) +[2023-02-24 13:16:27,878][00205] Fps is (10 sec: 4092.8, 60 sec: 3890.7, 300 sec: 3846.0). Total num frames: 13283328. Throughput: 0: 968.9. Samples: 3319472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:27,881][00205] Avg episode reward: [(0, '27.243')] +[2023-02-24 13:16:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13295616. Throughput: 0: 943.2. Samples: 3323972. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:32,873][00205] Avg episode reward: [(0, '27.434')] +[2023-02-24 13:16:35,436][11215] Updated weights for policy 0, policy_version 3250 (0.0025) +[2023-02-24 13:16:37,870][00205] Fps is (10 sec: 3689.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13320192. Throughput: 0: 990.8. Samples: 3330666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:16:37,872][00205] Avg episode reward: [(0, '28.732')] +[2023-02-24 13:16:42,873][00205] Fps is (10 sec: 4913.5, 60 sec: 3891.0, 300 sec: 3846.0). Total num frames: 13344768. Throughput: 0: 995.5. Samples: 3334250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:42,877][00205] Avg episode reward: [(0, '29.226')] +[2023-02-24 13:16:44,956][11215] Updated weights for policy 0, policy_version 3260 (0.0011) +[2023-02-24 13:16:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13361152. Throughput: 0: 960.8. Samples: 3339670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:16:47,872][00205] Avg episode reward: [(0, '30.104')] +[2023-02-24 13:16:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth... +[2023-02-24 13:16:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth +[2023-02-24 13:16:52,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13377536. Throughput: 0: 948.0. Samples: 3344192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:16:52,879][00205] Avg episode reward: [(0, '29.504')] +[2023-02-24 13:16:56,264][11215] Updated weights for policy 0, policy_version 3270 (0.0012) +[2023-02-24 13:16:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13398016. Throughput: 0: 975.4. Samples: 3347668. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:16:57,872][00205] Avg episode reward: [(0, '28.290')] +[2023-02-24 13:17:02,871][00205] Fps is (10 sec: 4505.0, 60 sec: 3891.1, 300 sec: 3860.0). Total num frames: 13422592. Throughput: 0: 992.6. Samples: 3354692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:02,875][00205] Avg episode reward: [(0, '29.657')] +[2023-02-24 13:17:06,777][11215] Updated weights for policy 0, policy_version 3280 (0.0015) +[2023-02-24 13:17:07,873][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.3, 300 sec: 3846.1). Total num frames: 13434880. Throughput: 0: 941.9. Samples: 3359600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:17:07,882][00205] Avg episode reward: [(0, '29.137')] +[2023-02-24 13:17:12,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13451264. Throughput: 0: 943.9. Samples: 3361942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:17:12,872][00205] Avg episode reward: [(0, '28.440')] +[2023-02-24 13:17:17,438][11215] Updated weights for policy 0, policy_version 3290 (0.0018) +[2023-02-24 13:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13475840. Throughput: 0: 982.1. Samples: 3368166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:17:17,872][00205] Avg episode reward: [(0, '27.276')] +[2023-02-24 13:17:22,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13500416. Throughput: 0: 993.2. Samples: 3375360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:22,872][00205] Avg episode reward: [(0, '26.586')] +[2023-02-24 13:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.4, 300 sec: 3860.0). Total num frames: 13512704. Throughput: 0: 962.5. Samples: 3377558. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:17:27,873][00205] Avg episode reward: [(0, '26.672')] +[2023-02-24 13:17:28,242][11215] Updated weights for policy 0, policy_version 3300 (0.0015) +[2023-02-24 13:17:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13529088. Throughput: 0: 934.9. Samples: 3381740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:32,874][00205] Avg episode reward: [(0, '25.957')] +[2023-02-24 13:17:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13549568. Throughput: 0: 976.0. Samples: 3388112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:17:37,875][00205] Avg episode reward: [(0, '26.734')] +[2023-02-24 13:17:38,777][11215] Updated weights for policy 0, policy_version 3310 (0.0016) +[2023-02-24 13:17:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.1, 300 sec: 3860.0). Total num frames: 13574144. Throughput: 0: 975.6. Samples: 3391572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:42,875][00205] Avg episode reward: [(0, '27.235')] +[2023-02-24 13:17:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13586432. Throughput: 0: 935.1. Samples: 3396768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:47,872][00205] Avg episode reward: [(0, '26.936')] +[2023-02-24 13:17:50,851][11215] Updated weights for policy 0, policy_version 3320 (0.0019) +[2023-02-24 13:17:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13606912. Throughput: 0: 928.7. Samples: 3401392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:17:52,871][00205] Avg episode reward: [(0, '27.285')] +[2023-02-24 13:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13627392. Throughput: 0: 955.6. Samples: 3404946. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:17:57,879][00205] Avg episode reward: [(0, '28.580')] +[2023-02-24 13:17:59,878][11215] Updated weights for policy 0, policy_version 3330 (0.0019) +[2023-02-24 13:18:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3860.0). Total num frames: 13647872. Throughput: 0: 976.7. Samples: 3412118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:18:02,873][00205] Avg episode reward: [(0, '28.180')] +[2023-02-24 13:18:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13664256. Throughput: 0: 918.7. Samples: 3416702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:07,877][00205] Avg episode reward: [(0, '27.651')] +[2023-02-24 13:18:12,307][11215] Updated weights for policy 0, policy_version 3340 (0.0041) +[2023-02-24 13:18:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13680640. Throughput: 0: 918.0. Samples: 3418870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:12,873][00205] Avg episode reward: [(0, '28.435')] +[2023-02-24 13:18:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13705216. Throughput: 0: 967.1. Samples: 3425260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:17,873][00205] Avg episode reward: [(0, '29.291')] +[2023-02-24 13:18:20,960][11215] Updated weights for policy 0, policy_version 3350 (0.0017) +[2023-02-24 13:18:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13725696. Throughput: 0: 975.4. Samples: 3432004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:22,876][00205] Avg episode reward: [(0, '29.159')] +[2023-02-24 13:18:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3873.9). Total num frames: 13742080. Throughput: 0: 946.8. Samples: 3434176. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:27,878][00205] Avg episode reward: [(0, '28.704')] +[2023-02-24 13:18:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13758464. Throughput: 0: 929.7. Samples: 3438606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:18:32,879][00205] Avg episode reward: [(0, '29.137')] +[2023-02-24 13:18:33,539][11215] Updated weights for policy 0, policy_version 3360 (0.0020) +[2023-02-24 13:18:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13778944. Throughput: 0: 975.9. Samples: 3445306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:18:37,873][00205] Avg episode reward: [(0, '28.880')] +[2023-02-24 13:18:42,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13799424. Throughput: 0: 975.5. Samples: 3448844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:42,873][00205] Avg episode reward: [(0, '29.770')] +[2023-02-24 13:18:43,020][11215] Updated weights for policy 0, policy_version 3370 (0.0013) +[2023-02-24 13:18:47,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 13815808. Throughput: 0: 923.1. Samples: 3453658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:47,877][00205] Avg episode reward: [(0, '30.464')] +[2023-02-24 13:18:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth... +[2023-02-24 13:18:48,036][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth +[2023-02-24 13:18:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13832192. Throughput: 0: 928.3. Samples: 3458474. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:52,873][00205] Avg episode reward: [(0, '30.365')] +[2023-02-24 13:18:54,924][11215] Updated weights for policy 0, policy_version 3380 (0.0023) +[2023-02-24 13:18:57,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13856768. Throughput: 0: 957.3. Samples: 3461948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:18:57,879][00205] Avg episode reward: [(0, '31.903')] +[2023-02-24 13:19:02,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13877248. Throughput: 0: 968.5. Samples: 3468844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:19:02,880][00205] Avg episode reward: [(0, '30.509')] +[2023-02-24 13:19:05,667][11215] Updated weights for policy 0, policy_version 3390 (0.0014) +[2023-02-24 13:19:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13889536. Throughput: 0: 913.1. Samples: 3473092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:07,876][00205] Avg episode reward: [(0, '30.280')] +[2023-02-24 13:19:12,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13910016. Throughput: 0: 915.0. Samples: 3475350. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:19:12,878][00205] Avg episode reward: [(0, '28.337')] +[2023-02-24 13:19:16,419][11215] Updated weights for policy 0, policy_version 3400 (0.0013) +[2023-02-24 13:19:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13930496. Throughput: 0: 961.9. Samples: 3481890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:17,879][00205] Avg episode reward: [(0, '27.576')] +[2023-02-24 13:19:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13950976. Throughput: 0: 952.9. Samples: 3488186. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:22,873][00205] Avg episode reward: [(0, '26.552')] +[2023-02-24 13:19:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 13963264. Throughput: 0: 922.4. Samples: 3490352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:27,876][00205] Avg episode reward: [(0, '25.387')] +[2023-02-24 13:19:28,281][11215] Updated weights for policy 0, policy_version 3410 (0.0023) +[2023-02-24 13:19:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13983744. Throughput: 0: 915.1. Samples: 3494838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:19:32,875][00205] Avg episode reward: [(0, '25.694')] +[2023-02-24 13:19:37,787][11215] Updated weights for policy 0, policy_version 3420 (0.0014) +[2023-02-24 13:19:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14008320. Throughput: 0: 959.7. Samples: 3501662. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) +[2023-02-24 13:19:37,879][00205] Avg episode reward: [(0, '27.661')] +[2023-02-24 13:19:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 14024704. Throughput: 0: 960.4. Samples: 3505164. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:19:42,874][00205] Avg episode reward: [(0, '27.639')] +[2023-02-24 13:19:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 14041088. Throughput: 0: 909.3. Samples: 3509764. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:19:47,875][00205] Avg episode reward: [(0, '29.394')] +[2023-02-24 13:19:50,471][11215] Updated weights for policy 0, policy_version 3430 (0.0027) +[2023-02-24 13:19:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14057472. Throughput: 0: 929.2. Samples: 3514906. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:52,872][00205] Avg episode reward: [(0, '29.038')] +[2023-02-24 13:19:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14082048. Throughput: 0: 958.0. Samples: 3518462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:19:57,873][00205] Avg episode reward: [(0, '28.238')] +[2023-02-24 13:19:59,275][11215] Updated weights for policy 0, policy_version 3440 (0.0022) +[2023-02-24 13:20:02,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3754.5, 300 sec: 3846.0). Total num frames: 14102528. Throughput: 0: 955.6. Samples: 3524892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:02,877][00205] Avg episode reward: [(0, '30.013')] +[2023-02-24 13:20:07,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3832.2). Total num frames: 14114816. Throughput: 0: 910.3. Samples: 3529152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:20:07,874][00205] Avg episode reward: [(0, '26.897')] +[2023-02-24 13:20:11,975][11215] Updated weights for policy 0, policy_version 3450 (0.0021) +[2023-02-24 13:20:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14135296. Throughput: 0: 911.4. Samples: 3531366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:12,872][00205] Avg episode reward: [(0, '26.146')] +[2023-02-24 13:20:17,870][00205] Fps is (10 sec: 4096.5, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14155776. Throughput: 0: 966.9. Samples: 3538348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:17,872][00205] Avg episode reward: [(0, '25.074')] +[2023-02-24 13:20:20,670][11215] Updated weights for policy 0, policy_version 3460 (0.0011) +[2023-02-24 13:20:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14176256. Throughput: 0: 956.0. Samples: 3544684. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:22,874][00205] Avg episode reward: [(0, '25.799')] +[2023-02-24 13:20:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14192640. Throughput: 0: 926.8. Samples: 3546870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:27,875][00205] Avg episode reward: [(0, '27.608')] +[2023-02-24 13:20:32,831][11215] Updated weights for policy 0, policy_version 3470 (0.0030) +[2023-02-24 13:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14213120. Throughput: 0: 933.2. Samples: 3551758. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:32,872][00205] Avg episode reward: [(0, '28.704')] +[2023-02-24 13:20:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14233600. Throughput: 0: 975.5. Samples: 3558802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:20:37,878][00205] Avg episode reward: [(0, '30.275')] +[2023-02-24 13:20:42,548][11215] Updated weights for policy 0, policy_version 3480 (0.0024) +[2023-02-24 13:20:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14254080. Throughput: 0: 973.6. Samples: 3562272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:42,872][00205] Avg episode reward: [(0, '29.947')] +[2023-02-24 13:20:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14266368. Throughput: 0: 927.9. Samples: 3566644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:20:47,873][00205] Avg episode reward: [(0, '31.964')] +[2023-02-24 13:20:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth... 
+[2023-02-24 13:20:48,049][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth +[2023-02-24 13:20:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14286848. Throughput: 0: 952.0. Samples: 3571990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:20:52,872][00205] Avg episode reward: [(0, '32.424')] +[2023-02-24 13:20:54,231][11215] Updated weights for policy 0, policy_version 3490 (0.0026) +[2023-02-24 13:20:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14311424. Throughput: 0: 979.7. Samples: 3575454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:20:57,875][00205] Avg episode reward: [(0, '31.058')] +[2023-02-24 13:21:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3804.5). Total num frames: 14327808. Throughput: 0: 969.7. Samples: 3581984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:21:02,878][00205] Avg episode reward: [(0, '31.370')] +[2023-02-24 13:21:04,717][11215] Updated weights for policy 0, policy_version 3500 (0.0030) +[2023-02-24 13:21:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 14344192. Throughput: 0: 923.7. Samples: 3586250. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:07,875][00205] Avg episode reward: [(0, '30.257')] +[2023-02-24 13:21:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14360576. Throughput: 0: 925.1. Samples: 3588500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:12,879][00205] Avg episode reward: [(0, '30.011')] +[2023-02-24 13:21:15,474][11215] Updated weights for policy 0, policy_version 3510 (0.0020) +[2023-02-24 13:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14385152. Throughput: 0: 973.6. Samples: 3595572. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:17,876][00205] Avg episode reward: [(0, '29.541')] +[2023-02-24 13:21:22,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 14405632. Throughput: 0: 955.8. Samples: 3601814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:21:22,877][00205] Avg episode reward: [(0, '28.882')] +[2023-02-24 13:21:26,956][11215] Updated weights for policy 0, policy_version 3520 (0.0026) +[2023-02-24 13:21:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14417920. Throughput: 0: 926.4. Samples: 3603962. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:21:27,872][00205] Avg episode reward: [(0, '29.000')] +[2023-02-24 13:21:32,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14438400. Throughput: 0: 935.4. Samples: 3608738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:32,873][00205] Avg episode reward: [(0, '28.622')] +[2023-02-24 13:21:36,772][11215] Updated weights for policy 0, policy_version 3530 (0.0026) +[2023-02-24 13:21:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 14462976. Throughput: 0: 972.3. Samples: 3615742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:37,873][00205] Avg episode reward: [(0, '28.415')] +[2023-02-24 13:21:42,877][00205] Fps is (10 sec: 4093.0, 60 sec: 3754.2, 300 sec: 3790.4). Total num frames: 14479360. Throughput: 0: 972.9. Samples: 3619240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:42,884][00205] Avg episode reward: [(0, '26.604')] +[2023-02-24 13:21:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14495744. Throughput: 0: 922.6. Samples: 3623500. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:47,873][00205] Avg episode reward: [(0, '28.410')] +[2023-02-24 13:21:48,978][11215] Updated weights for policy 0, policy_version 3540 (0.0028) +[2023-02-24 13:21:52,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14516224. Throughput: 0: 948.1. Samples: 3628914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:21:52,872][00205] Avg episode reward: [(0, '28.067')] +[2023-02-24 13:21:57,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 14536704. Throughput: 0: 976.2. Samples: 3632428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:21:57,873][00205] Avg episode reward: [(0, '28.859')] +[2023-02-24 13:21:58,067][11215] Updated weights for policy 0, policy_version 3550 (0.0013) +[2023-02-24 13:22:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14557184. Throughput: 0: 959.7. Samples: 3638758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:02,871][00205] Avg episode reward: [(0, '29.466')] +[2023-02-24 13:22:07,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14569472. Throughput: 0: 917.2. Samples: 3643084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:07,873][00205] Avg episode reward: [(0, '29.846')] +[2023-02-24 13:22:10,606][11215] Updated weights for policy 0, policy_version 3560 (0.0020) +[2023-02-24 13:22:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14589952. Throughput: 0: 922.3. Samples: 3645464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:12,872][00205] Avg episode reward: [(0, '30.779')] +[2023-02-24 13:22:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14614528. Throughput: 0: 971.7. Samples: 3652466. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:22:17,872][00205] Avg episode reward: [(0, '30.570')] +[2023-02-24 13:22:19,483][11215] Updated weights for policy 0, policy_version 3570 (0.0020) +[2023-02-24 13:22:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 14630912. Throughput: 0: 947.1. Samples: 3658362. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:22:22,875][00205] Avg episode reward: [(0, '30.270')] +[2023-02-24 13:22:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14647296. Throughput: 0: 917.7. Samples: 3660532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:22:27,879][00205] Avg episode reward: [(0, '30.798')] +[2023-02-24 13:22:31,857][11215] Updated weights for policy 0, policy_version 3580 (0.0017) +[2023-02-24 13:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14667776. Throughput: 0: 940.0. Samples: 3665800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:32,879][00205] Avg episode reward: [(0, '28.957')] +[2023-02-24 13:22:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14692352. Throughput: 0: 974.2. Samples: 3672752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:22:37,877][00205] Avg episode reward: [(0, '28.236')] +[2023-02-24 13:22:41,102][11215] Updated weights for policy 0, policy_version 3590 (0.0026) +[2023-02-24 13:22:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.4, 300 sec: 3804.4). Total num frames: 14708736. Throughput: 0: 967.5. Samples: 3675964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:22:42,877][00205] Avg episode reward: [(0, '27.926')] +[2023-02-24 13:22:47,874][00205] Fps is (10 sec: 2866.0, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 14721024. Throughput: 0: 925.2. Samples: 3680394. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:47,879][00205] Avg episode reward: [(0, '27.837')] +[2023-02-24 13:22:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth... +[2023-02-24 13:22:48,060][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth +[2023-02-24 13:22:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14741504. Throughput: 0: 953.4. Samples: 3685988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:52,872][00205] Avg episode reward: [(0, '27.800')] +[2023-02-24 13:22:53,026][11215] Updated weights for policy 0, policy_version 3600 (0.0021) +[2023-02-24 13:22:57,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 14766080. Throughput: 0: 978.2. Samples: 3689482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:22:57,879][00205] Avg episode reward: [(0, '28.941')] +[2023-02-24 13:23:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14782464. Throughput: 0: 961.3. Samples: 3695726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:02,875][00205] Avg episode reward: [(0, '28.939')] +[2023-02-24 13:23:03,086][11215] Updated weights for policy 0, policy_version 3610 (0.0011) +[2023-02-24 13:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14798848. Throughput: 0: 925.8. Samples: 3700024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:07,873][00205] Avg episode reward: [(0, '29.466')] +[2023-02-24 13:23:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14819328. Throughput: 0: 934.0. Samples: 3702560. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:23:12,872][00205] Avg episode reward: [(0, '30.402')] +[2023-02-24 13:23:14,294][11215] Updated weights for policy 0, policy_version 3620 (0.0024) +[2023-02-24 13:23:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14843904. Throughput: 0: 972.8. Samples: 3709576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:23:17,872][00205] Avg episode reward: [(0, '31.584')] +[2023-02-24 13:23:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14860288. Throughput: 0: 947.2. Samples: 3715374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:22,873][00205] Avg episode reward: [(0, '31.972')] +[2023-02-24 13:23:25,541][11215] Updated weights for policy 0, policy_version 3630 (0.0030) +[2023-02-24 13:23:27,870][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 14872576. Throughput: 0: 922.3. Samples: 3717468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:27,878][00205] Avg episode reward: [(0, '31.147')] +[2023-02-24 13:23:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14893056. Throughput: 0: 940.1. Samples: 3722696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:32,879][00205] Avg episode reward: [(0, '30.206')] +[2023-02-24 13:23:35,677][11215] Updated weights for policy 0, policy_version 3640 (0.0013) +[2023-02-24 13:23:37,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14917632. Throughput: 0: 972.9. Samples: 3729768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:37,872][00205] Avg episode reward: [(0, '29.361')] +[2023-02-24 13:23:42,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 14934016. Throughput: 0: 962.9. Samples: 3732814. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:42,874][00205] Avg episode reward: [(0, '29.718')] +[2023-02-24 13:23:47,828][11215] Updated weights for policy 0, policy_version 3650 (0.0019) +[2023-02-24 13:23:47,875][00205] Fps is (10 sec: 3275.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14950400. Throughput: 0: 916.6. Samples: 3736978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:23:47,882][00205] Avg episode reward: [(0, '29.596')] +[2023-02-24 13:23:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14970880. Throughput: 0: 950.8. Samples: 3742810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:23:52,878][00205] Avg episode reward: [(0, '27.900')] +[2023-02-24 13:23:56,875][11215] Updated weights for policy 0, policy_version 3660 (0.0011) +[2023-02-24 13:23:57,870][00205] Fps is (10 sec: 4507.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14995456. Throughput: 0: 972.5. Samples: 3746322. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:23:57,872][00205] Avg episode reward: [(0, '28.600')] +[2023-02-24 13:24:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15011840. Throughput: 0: 952.3. Samples: 3752430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:02,877][00205] Avg episode reward: [(0, '29.330')] +[2023-02-24 13:24:07,873][00205] Fps is (10 sec: 2866.3, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 15024128. Throughput: 0: 917.5. Samples: 3756664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:07,878][00205] Avg episode reward: [(0, '29.552')] +[2023-02-24 13:24:09,391][11215] Updated weights for policy 0, policy_version 3670 (0.0020) +[2023-02-24 13:24:12,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 15044608. Throughput: 0: 936.1. Samples: 3759594. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:12,877][00205] Avg episode reward: [(0, '30.132')] +[2023-02-24 13:24:17,870][00205] Fps is (10 sec: 4506.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15069184. Throughput: 0: 975.4. Samples: 3766590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:17,874][00205] Avg episode reward: [(0, '30.715')] +[2023-02-24 13:24:18,083][11215] Updated weights for policy 0, policy_version 3680 (0.0022) +[2023-02-24 13:24:22,873][00205] Fps is (10 sec: 4096.4, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15085568. Throughput: 0: 940.3. Samples: 3772084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:24:22,876][00205] Avg episode reward: [(0, '31.799')] +[2023-02-24 13:24:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 15101952. Throughput: 0: 921.7. Samples: 3774290. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:24:27,873][00205] Avg episode reward: [(0, '31.516')] +[2023-02-24 13:24:30,543][11215] Updated weights for policy 0, policy_version 3690 (0.0024) +[2023-02-24 13:24:32,870][00205] Fps is (10 sec: 3687.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15122432. Throughput: 0: 954.7. Samples: 3779936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:32,873][00205] Avg episode reward: [(0, '31.794')] +[2023-02-24 13:24:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15147008. Throughput: 0: 982.4. Samples: 3787018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:37,876][00205] Avg episode reward: [(0, '30.642')] +[2023-02-24 13:24:39,576][11215] Updated weights for policy 0, policy_version 3700 (0.0017) +[2023-02-24 13:24:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 15163392. Throughput: 0: 965.6. Samples: 3789772. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:42,877][00205] Avg episode reward: [(0, '28.464')] +[2023-02-24 13:24:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3755.0, 300 sec: 3790.5). Total num frames: 15175680. Throughput: 0: 923.3. Samples: 3793980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:24:47,878][00205] Avg episode reward: [(0, '28.684')] +[2023-02-24 13:24:47,897][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth... +[2023-02-24 13:24:48,079][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth +[2023-02-24 13:24:51,750][11215] Updated weights for policy 0, policy_version 3710 (0.0020) +[2023-02-24 13:24:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15200256. Throughput: 0: 963.0. Samples: 3799994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:24:52,873][00205] Avg episode reward: [(0, '27.226')] +[2023-02-24 13:24:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 15220736. Throughput: 0: 975.4. Samples: 3803482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:24:57,875][00205] Avg episode reward: [(0, '27.058')] +[2023-02-24 13:25:02,409][11215] Updated weights for policy 0, policy_version 3720 (0.0028) +[2023-02-24 13:25:02,873][00205] Fps is (10 sec: 3685.3, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15237120. Throughput: 0: 941.8. Samples: 3808976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:25:02,879][00205] Avg episode reward: [(0, '27.016')] +[2023-02-24 13:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 15249408. Throughput: 0: 916.3. Samples: 3813316. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:07,879][00205] Avg episode reward: [(0, '26.833')] +[2023-02-24 13:25:12,870][00205] Fps is (10 sec: 3687.5, 60 sec: 3823.2, 300 sec: 3790.5). Total num frames: 15273984. Throughput: 0: 934.8. Samples: 3816354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:12,873][00205] Avg episode reward: [(0, '28.488')] +[2023-02-24 13:25:13,510][11215] Updated weights for policy 0, policy_version 3730 (0.0017) +[2023-02-24 13:25:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15298560. Throughput: 0: 961.5. Samples: 3823204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:17,873][00205] Avg episode reward: [(0, '29.344')] +[2023-02-24 13:25:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 15310848. Throughput: 0: 923.4. Samples: 3828570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:22,874][00205] Avg episode reward: [(0, '30.927')] +[2023-02-24 13:25:24,623][11215] Updated weights for policy 0, policy_version 3740 (0.0016) +[2023-02-24 13:25:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15327232. Throughput: 0: 910.9. Samples: 3830762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:27,873][00205] Avg episode reward: [(0, '30.591')] +[2023-02-24 13:25:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15351808. Throughput: 0: 946.9. Samples: 3836590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:25:32,872][00205] Avg episode reward: [(0, '30.033')] +[2023-02-24 13:25:34,539][11215] Updated weights for policy 0, policy_version 3750 (0.0017) +[2023-02-24 13:25:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15372288. Throughput: 0: 968.7. Samples: 3843584. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:25:37,879][00205] Avg episode reward: [(0, '29.382')] +[2023-02-24 13:25:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15388672. Throughput: 0: 949.6. Samples: 3846214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:42,872][00205] Avg episode reward: [(0, '29.231')] +[2023-02-24 13:25:46,555][11215] Updated weights for policy 0, policy_version 3760 (0.0030) +[2023-02-24 13:25:47,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15405056. Throughput: 0: 924.2. Samples: 3850564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:25:47,879][00205] Avg episode reward: [(0, '28.233')] +[2023-02-24 13:25:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15425536. Throughput: 0: 968.5. Samples: 3856900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:52,879][00205] Avg episode reward: [(0, '28.024')] +[2023-02-24 13:25:55,699][11215] Updated weights for policy 0, policy_version 3770 (0.0011) +[2023-02-24 13:25:57,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15450112. Throughput: 0: 979.4. Samples: 3860428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:25:57,878][00205] Avg episode reward: [(0, '28.417')] +[2023-02-24 13:26:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 15466496. Throughput: 0: 952.5. Samples: 3866066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:26:02,873][00205] Avg episode reward: [(0, '29.659')] +[2023-02-24 13:26:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15478784. Throughput: 0: 930.6. Samples: 3870448. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:26:07,872][00205] Avg episode reward: [(0, '30.114')] +[2023-02-24 13:26:08,097][11215] Updated weights for policy 0, policy_version 3780 (0.0037) +[2023-02-24 13:26:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15503360. Throughput: 0: 952.3. Samples: 3873616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:12,880][00205] Avg episode reward: [(0, '30.489')] +[2023-02-24 13:26:16,975][11215] Updated weights for policy 0, policy_version 3790 (0.0017) +[2023-02-24 13:26:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 15527936. Throughput: 0: 975.8. Samples: 3880500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:17,879][00205] Avg episode reward: [(0, '31.722')] +[2023-02-24 13:26:22,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15540224. Throughput: 0: 936.0. Samples: 3885702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:22,874][00205] Avg episode reward: [(0, '32.746')] +[2023-02-24 13:26:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15556608. Throughput: 0: 924.5. Samples: 3887818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:27,873][00205] Avg episode reward: [(0, '31.724')] +[2023-02-24 13:26:29,367][11215] Updated weights for policy 0, policy_version 3800 (0.0011) +[2023-02-24 13:26:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15577088. Throughput: 0: 960.3. Samples: 3893776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:26:32,876][00205] Avg episode reward: [(0, '29.443')] +[2023-02-24 13:26:37,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 15601664. Throughput: 0: 974.1. Samples: 3900736. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:37,878][00205] Avg episode reward: [(0, '28.960')] +[2023-02-24 13:26:38,499][11215] Updated weights for policy 0, policy_version 3810 (0.0017) +[2023-02-24 13:26:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15613952. Throughput: 0: 948.3. Samples: 3903102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:42,874][00205] Avg episode reward: [(0, '28.649')] +[2023-02-24 13:26:47,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15630336. Throughput: 0: 919.1. Samples: 3907426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:26:47,872][00205] Avg episode reward: [(0, '27.857')] +[2023-02-24 13:26:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth... +[2023-02-24 13:26:48,007][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth +[2023-02-24 13:26:50,814][11215] Updated weights for policy 0, policy_version 3820 (0.0020) +[2023-02-24 13:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15654912. Throughput: 0: 961.1. Samples: 3913696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:26:52,872][00205] Avg episode reward: [(0, '27.325')] +[2023-02-24 13:26:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15675392. Throughput: 0: 968.2. Samples: 3917186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:26:57,879][00205] Avg episode reward: [(0, '27.446')] +[2023-02-24 13:27:00,812][11215] Updated weights for policy 0, policy_version 3830 (0.0016) +[2023-02-24 13:27:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15691776. Throughput: 0: 937.4. Samples: 3922684. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:02,877][00205] Avg episode reward: [(0, '29.302')] +[2023-02-24 13:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15708160. Throughput: 0: 919.9. Samples: 3927096. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:27:07,873][00205] Avg episode reward: [(0, '28.767')] +[2023-02-24 13:27:12,018][11215] Updated weights for policy 0, policy_version 3840 (0.0025) +[2023-02-24 13:27:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15728640. Throughput: 0: 947.1. Samples: 3930436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:27:12,872][00205] Avg episode reward: [(0, '30.464')] +[2023-02-24 13:27:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15753216. Throughput: 0: 967.7. Samples: 3937324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:17,872][00205] Avg episode reward: [(0, '31.113')] +[2023-02-24 13:27:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15765504. Throughput: 0: 925.3. Samples: 3942374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:27:22,881][00205] Avg episode reward: [(0, '32.603')] +[2023-02-24 13:27:23,047][11215] Updated weights for policy 0, policy_version 3850 (0.0019) +[2023-02-24 13:27:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 15781888. Throughput: 0: 922.0. Samples: 3944590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:27,875][00205] Avg episode reward: [(0, '33.059')] +[2023-02-24 13:27:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15806464. Throughput: 0: 959.2. Samples: 3950590. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:27:32,872][00205] Avg episode reward: [(0, '33.152')] +[2023-02-24 13:27:33,423][11215] Updated weights for policy 0, policy_version 3860 (0.0012) +[2023-02-24 13:27:37,878][00205] Fps is (10 sec: 4501.8, 60 sec: 3754.3, 300 sec: 3790.4). Total num frames: 15826944. Throughput: 0: 974.5. Samples: 3957556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:37,881][00205] Avg episode reward: [(0, '34.449')] +[2023-02-24 13:27:37,899][11201] Saving new best policy, reward=34.449! +[2023-02-24 13:27:42,874][00205] Fps is (10 sec: 3684.8, 60 sec: 3822.7, 300 sec: 3804.4). Total num frames: 15843328. Throughput: 0: 945.7. Samples: 3959746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:27:42,876][00205] Avg episode reward: [(0, '34.549')] +[2023-02-24 13:27:42,886][11201] Saving new best policy, reward=34.549! +[2023-02-24 13:27:45,283][11215] Updated weights for policy 0, policy_version 3870 (0.0011) +[2023-02-24 13:27:47,870][00205] Fps is (10 sec: 3279.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15859712. Throughput: 0: 921.7. Samples: 3964162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:27:47,872][00205] Avg episode reward: [(0, '31.631')] +[2023-02-24 13:27:52,870][00205] Fps is (10 sec: 4097.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15884288. Throughput: 0: 970.6. Samples: 3970774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:52,876][00205] Avg episode reward: [(0, '30.482')] +[2023-02-24 13:27:54,630][11215] Updated weights for policy 0, policy_version 3880 (0.0019) +[2023-02-24 13:27:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15904768. Throughput: 0: 975.8. Samples: 3974346. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:27:57,872][00205] Avg episode reward: [(0, '30.739')] +[2023-02-24 13:28:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15921152. Throughput: 0: 940.8. Samples: 3979662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:28:02,873][00205] Avg episode reward: [(0, '31.045')] +[2023-02-24 13:28:06,900][11215] Updated weights for policy 0, policy_version 3890 (0.0026) +[2023-02-24 13:28:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15937536. Throughput: 0: 928.7. Samples: 3984164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:07,873][00205] Avg episode reward: [(0, '31.096')] +[2023-02-24 13:28:12,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 15958016. Throughput: 0: 958.3. Samples: 3987712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:28:12,878][00205] Avg episode reward: [(0, '28.555')] +[2023-02-24 13:28:15,825][11215] Updated weights for policy 0, policy_version 3900 (0.0014) +[2023-02-24 13:28:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15978496. Throughput: 0: 976.8. Samples: 3994548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:17,877][00205] Avg episode reward: [(0, '29.449')] +[2023-02-24 13:28:22,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15994880. Throughput: 0: 925.8. Samples: 3999208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:22,872][00205] Avg episode reward: [(0, '31.371')] +[2023-02-24 13:28:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16011264. Throughput: 0: 924.0. Samples: 4001324. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:27,879][00205] Avg episode reward: [(0, '30.108')] +[2023-02-24 13:28:28,307][11215] Updated weights for policy 0, policy_version 3910 (0.0016) +[2023-02-24 13:28:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16035840. Throughput: 0: 968.8. Samples: 4007756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:28:32,872][00205] Avg episode reward: [(0, '30.784')] +[2023-02-24 13:28:37,024][11215] Updated weights for policy 0, policy_version 3920 (0.0011) +[2023-02-24 13:28:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.5, 300 sec: 3804.4). Total num frames: 16056320. Throughput: 0: 973.9. Samples: 4014598. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:37,878][00205] Avg episode reward: [(0, '30.628')] +[2023-02-24 13:28:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3804.5). Total num frames: 16072704. Throughput: 0: 942.5. Samples: 4016758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:42,879][00205] Avg episode reward: [(0, '30.219')] +[2023-02-24 13:28:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16089088. Throughput: 0: 924.1. Samples: 4021248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:28:47,872][00205] Avg episode reward: [(0, '30.982')] +[2023-02-24 13:28:47,881][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth... +[2023-02-24 13:28:47,999][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth +[2023-02-24 13:28:49,555][11215] Updated weights for policy 0, policy_version 3930 (0.0020) +[2023-02-24 13:28:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16109568. Throughput: 0: 971.3. Samples: 4027874. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:28:52,872][00205] Avg episode reward: [(0, '31.041')] +[2023-02-24 13:28:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16130048. Throughput: 0: 967.6. Samples: 4031252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:28:57,877][00205] Avg episode reward: [(0, '32.061')] +[2023-02-24 13:28:59,693][11215] Updated weights for policy 0, policy_version 3940 (0.0017) +[2023-02-24 13:29:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.5). Total num frames: 16146432. Throughput: 0: 929.4. Samples: 4036370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:29:02,879][00205] Avg episode reward: [(0, '30.960')] +[2023-02-24 13:29:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3790.6). Total num frames: 16162816. Throughput: 0: 931.7. Samples: 4041134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:29:07,875][00205] Avg episode reward: [(0, '31.018')] +[2023-02-24 13:29:10,828][11215] Updated weights for policy 0, policy_version 3950 (0.0020) +[2023-02-24 13:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 16187392. Throughput: 0: 961.6. Samples: 4044598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:12,880][00205] Avg episode reward: [(0, '31.144')] +[2023-02-24 13:29:17,876][00205] Fps is (10 sec: 4502.9, 60 sec: 3822.5, 300 sec: 3804.4). Total num frames: 16207872. Throughput: 0: 972.3. Samples: 4051514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:17,879][00205] Avg episode reward: [(0, '32.337')] +[2023-02-24 13:29:21,731][11215] Updated weights for policy 0, policy_version 3960 (0.0014) +[2023-02-24 13:29:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16220160. Throughput: 0: 919.1. Samples: 4055956. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:29:22,876][00205] Avg episode reward: [(0, '32.456')] +[2023-02-24 13:29:27,870][00205] Fps is (10 sec: 3278.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16240640. Throughput: 0: 918.0. Samples: 4058066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:29:27,872][00205] Avg episode reward: [(0, '32.083')] +[2023-02-24 13:29:32,158][11215] Updated weights for policy 0, policy_version 3970 (0.0017) +[2023-02-24 13:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16261120. Throughput: 0: 965.4. Samples: 4064690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:32,873][00205] Avg episode reward: [(0, '30.460')] +[2023-02-24 13:29:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 16281600. Throughput: 0: 961.6. Samples: 4071148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:29:37,883][00205] Avg episode reward: [(0, '30.129')] +[2023-02-24 13:29:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 16297984. Throughput: 0: 936.3. Samples: 4073386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:29:42,875][00205] Avg episode reward: [(0, '29.237')] +[2023-02-24 13:29:43,843][11215] Updated weights for policy 0, policy_version 3980 (0.0012) +[2023-02-24 13:29:47,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16314368. Throughput: 0: 922.6. Samples: 4077888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:29:47,875][00205] Avg episode reward: [(0, '29.045')] +[2023-02-24 13:29:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16338944. Throughput: 0: 966.9. Samples: 4084644. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:29:52,878][00205] Avg episode reward: [(0, '31.121')] +[2023-02-24 13:29:53,682][11215] Updated weights for policy 0, policy_version 3990 (0.0019) +[2023-02-24 13:29:57,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 16359424. Throughput: 0: 965.5. Samples: 4088046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:29:57,873][00205] Avg episode reward: [(0, '30.992')] +[2023-02-24 13:30:02,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 16371712. Throughput: 0: 912.0. Samples: 4092550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:02,875][00205] Avg episode reward: [(0, '30.782')] +[2023-02-24 13:30:06,278][11215] Updated weights for policy 0, policy_version 4000 (0.0021) +[2023-02-24 13:30:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16388096. Throughput: 0: 923.2. Samples: 4097502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:07,873][00205] Avg episode reward: [(0, '30.459')] +[2023-02-24 13:30:12,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16412672. Throughput: 0: 953.2. Samples: 4100958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:12,879][00205] Avg episode reward: [(0, '32.182')] +[2023-02-24 13:30:15,315][11215] Updated weights for policy 0, policy_version 4010 (0.0013) +[2023-02-24 13:30:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.1, 300 sec: 3804.4). Total num frames: 16433152. Throughput: 0: 956.1. Samples: 4107714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:30:17,876][00205] Avg episode reward: [(0, '32.199')] +[2023-02-24 13:30:22,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 16445440. Throughput: 0: 908.6. Samples: 4112032. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:30:22,878][00205] Avg episode reward: [(0, '31.692')] +[2023-02-24 13:30:27,659][11215] Updated weights for policy 0, policy_version 4020 (0.0027) +[2023-02-24 13:30:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16465920. Throughput: 0: 907.7. Samples: 4114234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:27,881][00205] Avg episode reward: [(0, '30.593')] +[2023-02-24 13:30:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16486400. Throughput: 0: 958.9. Samples: 4121038. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:30:32,872][00205] Avg episode reward: [(0, '29.981')] +[2023-02-24 13:30:37,153][11215] Updated weights for policy 0, policy_version 4030 (0.0027) +[2023-02-24 13:30:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 16506880. Throughput: 0: 947.2. Samples: 4127268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:37,872][00205] Avg episode reward: [(0, '30.057')] +[2023-02-24 13:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16519168. Throughput: 0: 917.9. Samples: 4129350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:42,873][00205] Avg episode reward: [(0, '29.359')] +[2023-02-24 13:30:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16539648. Throughput: 0: 928.5. Samples: 4134332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:30:47,873][00205] Avg episode reward: [(0, '29.214')] +[2023-02-24 13:30:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth... 
+[2023-02-24 13:30:48,013][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth +[2023-02-24 13:30:49,043][11215] Updated weights for policy 0, policy_version 4040 (0.0023) +[2023-02-24 13:30:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16564224. Throughput: 0: 971.4. Samples: 4141216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:30:52,878][00205] Avg episode reward: [(0, '29.870')] +[2023-02-24 13:30:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16580608. Throughput: 0: 968.6. Samples: 4144546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:30:57,878][00205] Avg episode reward: [(0, '29.219')] +[2023-02-24 13:30:59,542][11215] Updated weights for policy 0, policy_version 4050 (0.0013) +[2023-02-24 13:31:02,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16596992. Throughput: 0: 916.0. Samples: 4148936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:31:02,877][00205] Avg episode reward: [(0, '29.501')] +[2023-02-24 13:31:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16617472. Throughput: 0: 941.9. Samples: 4154416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:07,873][00205] Avg episode reward: [(0, '28.976')] +[2023-02-24 13:31:10,200][11215] Updated weights for policy 0, policy_version 4060 (0.0016) +[2023-02-24 13:31:12,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16637952. Throughput: 0: 970.3. Samples: 4157898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:12,872][00205] Avg episode reward: [(0, '30.104')] +[2023-02-24 13:31:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16658432. Throughput: 0: 957.9. Samples: 4164142. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:17,877][00205] Avg episode reward: [(0, '30.695')] +[2023-02-24 13:31:21,723][11215] Updated weights for policy 0, policy_version 4070 (0.0028) +[2023-02-24 13:31:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16670720. Throughput: 0: 916.6. Samples: 4168516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:22,877][00205] Avg episode reward: [(0, '29.653')] +[2023-02-24 13:31:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16691200. Throughput: 0: 928.3. Samples: 4171124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:31:27,872][00205] Avg episode reward: [(0, '29.709')] +[2023-02-24 13:31:31,542][11215] Updated weights for policy 0, policy_version 4080 (0.0019) +[2023-02-24 13:31:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16715776. Throughput: 0: 968.9. Samples: 4177932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:32,873][00205] Avg episode reward: [(0, '29.639')] +[2023-02-24 13:31:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16732160. Throughput: 0: 942.6. Samples: 4183632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:37,875][00205] Avg episode reward: [(0, '29.612')] +[2023-02-24 13:31:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16748544. Throughput: 0: 917.0. Samples: 4185810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:31:42,873][00205] Avg episode reward: [(0, '27.973')] +[2023-02-24 13:31:43,966][11215] Updated weights for policy 0, policy_version 4090 (0.0015) +[2023-02-24 13:31:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16769024. Throughput: 0: 935.5. Samples: 4191032. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:31:47,877][00205] Avg episode reward: [(0, '27.151')] +[2023-02-24 13:31:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16789504. Throughput: 0: 963.7. Samples: 4197782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:31:52,876][00205] Avg episode reward: [(0, '27.364')] +[2023-02-24 13:31:53,266][11215] Updated weights for policy 0, policy_version 4100 (0.0016) +[2023-02-24 13:31:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16805888. Throughput: 0: 955.4. Samples: 4200890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:31:57,878][00205] Avg episode reward: [(0, '27.743')] +[2023-02-24 13:32:02,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16822272. Throughput: 0: 911.9. Samples: 4205180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:02,873][00205] Avg episode reward: [(0, '26.935')] +[2023-02-24 13:32:05,764][11215] Updated weights for policy 0, policy_version 4110 (0.0015) +[2023-02-24 13:32:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16842752. Throughput: 0: 941.7. Samples: 4210894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:07,884][00205] Avg episode reward: [(0, '27.696')] +[2023-02-24 13:32:12,873][00205] Fps is (10 sec: 4504.2, 60 sec: 3822.7, 300 sec: 3776.6). Total num frames: 16867328. Throughput: 0: 959.8. Samples: 4214318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:12,877][00205] Avg episode reward: [(0, '30.044')] +[2023-02-24 13:32:15,059][11215] Updated weights for policy 0, policy_version 4120 (0.0017) +[2023-02-24 13:32:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16883712. Throughput: 0: 943.8. Samples: 4220404. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:32:17,877][00205] Avg episode reward: [(0, '32.229')] +[2023-02-24 13:32:22,870][00205] Fps is (10 sec: 2868.1, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16896000. Throughput: 0: 913.1. Samples: 4224720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:22,879][00205] Avg episode reward: [(0, '33.955')] +[2023-02-24 13:32:27,093][11215] Updated weights for policy 0, policy_version 4130 (0.0011) +[2023-02-24 13:32:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16916480. Throughput: 0: 925.8. Samples: 4227470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:32:27,872][00205] Avg episode reward: [(0, '34.055')] +[2023-02-24 13:32:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3776.8). Total num frames: 16941056. Throughput: 0: 966.7. Samples: 4234534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:32:32,873][00205] Avg episode reward: [(0, '34.715')] +[2023-02-24 13:32:32,875][11201] Saving new best policy, reward=34.715! +[2023-02-24 13:32:36,765][11215] Updated weights for policy 0, policy_version 4140 (0.0025) +[2023-02-24 13:32:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16957440. Throughput: 0: 940.8. Samples: 4240118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:32:37,873][00205] Avg episode reward: [(0, '36.120')] +[2023-02-24 13:32:37,884][11201] Saving new best policy, reward=36.120! +[2023-02-24 13:32:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 16973824. Throughput: 0: 919.4. Samples: 4242266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:42,880][00205] Avg episode reward: [(0, '35.290')] +[2023-02-24 13:32:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16994304. Throughput: 0: 944.2. Samples: 4247670. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:32:47,877][00205] Avg episode reward: [(0, '34.136')] +[2023-02-24 13:32:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth... +[2023-02-24 13:32:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth +[2023-02-24 13:32:48,301][11215] Updated weights for policy 0, policy_version 4150 (0.0027) +[2023-02-24 13:32:52,870][00205] Fps is (10 sec: 4506.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17018880. Throughput: 0: 971.9. Samples: 4254628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:32:52,879][00205] Avg episode reward: [(0, '32.439')] +[2023-02-24 13:32:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17031168. Throughput: 0: 957.7. Samples: 4257412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:32:57,875][00205] Avg episode reward: [(0, '32.089')] +[2023-02-24 13:32:59,334][11215] Updated weights for policy 0, policy_version 4160 (0.0020) +[2023-02-24 13:33:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17047552. Throughput: 0: 919.1. Samples: 4261762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:02,877][00205] Avg episode reward: [(0, '32.550')] +[2023-02-24 13:33:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17068032. Throughput: 0: 954.0. Samples: 4267650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:07,876][00205] Avg episode reward: [(0, '32.966')] +[2023-02-24 13:33:09,698][11215] Updated weights for policy 0, policy_version 4170 (0.0016) +[2023-02-24 13:33:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 17092608. Throughput: 0: 968.8. Samples: 4271064. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:12,872][00205] Avg episode reward: [(0, '32.938')] +[2023-02-24 13:33:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17108992. Throughput: 0: 939.4. Samples: 4276808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:33:17,872][00205] Avg episode reward: [(0, '32.660')] +[2023-02-24 13:33:21,800][11215] Updated weights for policy 0, policy_version 4180 (0.0027) +[2023-02-24 13:33:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17121280. Throughput: 0: 912.0. Samples: 4281156. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:33:22,878][00205] Avg episode reward: [(0, '31.478')] +[2023-02-24 13:33:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17145856. Throughput: 0: 928.6. Samples: 4284050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:27,879][00205] Avg episode reward: [(0, '31.509')] +[2023-02-24 13:33:31,446][11215] Updated weights for policy 0, policy_version 4190 (0.0018) +[2023-02-24 13:33:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17166336. Throughput: 0: 958.2. Samples: 4290790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:33:32,873][00205] Avg episode reward: [(0, '32.313')] +[2023-02-24 13:33:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17182720. Throughput: 0: 916.0. Samples: 4295850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:33:37,874][00205] Avg episode reward: [(0, '33.047')] +[2023-02-24 13:33:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3748.9). Total num frames: 17195008. Throughput: 0: 901.0. Samples: 4297958. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:33:42,874][00205] Avg episode reward: [(0, '32.472')] +[2023-02-24 13:33:44,122][11215] Updated weights for policy 0, policy_version 4200 (0.0020) +[2023-02-24 13:33:47,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17219584. Throughput: 0: 935.4. Samples: 4303854. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:33:47,879][00205] Avg episode reward: [(0, '32.023')] +[2023-02-24 13:33:52,823][11215] Updated weights for policy 0, policy_version 4210 (0.0021) +[2023-02-24 13:33:52,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17244160. Throughput: 0: 960.3. Samples: 4310864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:33:52,877][00205] Avg episode reward: [(0, '32.352')] +[2023-02-24 13:33:57,870][00205] Fps is (10 sec: 3686.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17256448. Throughput: 0: 939.6. Samples: 4313348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:33:57,873][00205] Avg episode reward: [(0, '33.993')] +[2023-02-24 13:34:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17272832. Throughput: 0: 907.8. Samples: 4317660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:02,872][00205] Avg episode reward: [(0, '34.625')] +[2023-02-24 13:34:05,487][11215] Updated weights for policy 0, policy_version 4220 (0.0018) +[2023-02-24 13:34:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17293312. Throughput: 0: 950.6. Samples: 4323932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:34:07,874][00205] Avg episode reward: [(0, '33.173')] +[2023-02-24 13:34:12,879][00205] Fps is (10 sec: 4503.3, 60 sec: 3754.4, 300 sec: 3762.8). Total num frames: 17317888. Throughput: 0: 961.7. Samples: 4327330. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:12,882][00205] Avg episode reward: [(0, '31.212')] +[2023-02-24 13:34:15,093][11215] Updated weights for policy 0, policy_version 4230 (0.0012) +[2023-02-24 13:34:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 17330176. Throughput: 0: 935.1. Samples: 4332870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:17,878][00205] Avg episode reward: [(0, '30.270')] +[2023-02-24 13:34:22,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17346560. Throughput: 0: 920.5. Samples: 4337274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:22,873][00205] Avg episode reward: [(0, '30.476')] +[2023-02-24 13:34:26,653][11215] Updated weights for policy 0, policy_version 4240 (0.0013) +[2023-02-24 13:34:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17371136. Throughput: 0: 945.7. Samples: 4340514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:27,876][00205] Avg episode reward: [(0, '29.960')] +[2023-02-24 13:34:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17391616. Throughput: 0: 969.8. Samples: 4347496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) +[2023-02-24 13:34:32,874][00205] Avg episode reward: [(0, '29.325')] +[2023-02-24 13:34:37,210][11215] Updated weights for policy 0, policy_version 4250 (0.0015) +[2023-02-24 13:34:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17408000. Throughput: 0: 926.6. Samples: 4352562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:37,873][00205] Avg episode reward: [(0, '29.577')] +[2023-02-24 13:34:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17424384. Throughput: 0: 918.4. Samples: 4354676. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:42,872][00205] Avg episode reward: [(0, '31.387')] +[2023-02-24 13:34:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17444864. Throughput: 0: 957.4. Samples: 4360744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:47,872][00205] Avg episode reward: [(0, '31.086')] +[2023-02-24 13:34:47,922][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth... +[2023-02-24 13:34:47,925][11215] Updated weights for policy 0, policy_version 4260 (0.0018) +[2023-02-24 13:34:48,064][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth +[2023-02-24 13:34:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17469440. Throughput: 0: 972.8. Samples: 4367708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:34:52,875][00205] Avg episode reward: [(0, '30.335')] +[2023-02-24 13:34:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17481728. Throughput: 0: 945.0. Samples: 4369850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:34:57,877][00205] Avg episode reward: [(0, '29.849')] +[2023-02-24 13:34:59,967][11215] Updated weights for policy 0, policy_version 4270 (0.0018) +[2023-02-24 13:35:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17498112. Throughput: 0: 914.1. Samples: 4374004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:35:02,872][00205] Avg episode reward: [(0, '31.191')] +[2023-02-24 13:35:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17522688. Throughput: 0: 963.6. Samples: 4380634. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:07,873][00205] Avg episode reward: [(0, '32.240')] +[2023-02-24 13:35:09,432][11215] Updated weights for policy 0, policy_version 4280 (0.0018) +[2023-02-24 13:35:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.0, 300 sec: 3762.8). Total num frames: 17543168. Throughput: 0: 969.5. Samples: 4384140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:12,873][00205] Avg episode reward: [(0, '31.934')] +[2023-02-24 13:35:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 17559552. Throughput: 0: 930.0. Samples: 4389344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:17,875][00205] Avg episode reward: [(0, '31.448')] +[2023-02-24 13:35:21,830][11215] Updated weights for policy 0, policy_version 4290 (0.0011) +[2023-02-24 13:35:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17575936. Throughput: 0: 919.2. Samples: 4393924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:35:22,873][00205] Avg episode reward: [(0, '31.255')] +[2023-02-24 13:35:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17596416. Throughput: 0: 946.6. Samples: 4397274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:27,872][00205] Avg episode reward: [(0, '32.735')] +[2023-02-24 13:35:30,785][11215] Updated weights for policy 0, policy_version 4300 (0.0012) +[2023-02-24 13:35:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 17616896. Throughput: 0: 965.7. Samples: 4404202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:35:32,875][00205] Avg episode reward: [(0, '32.606')] +[2023-02-24 13:35:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17633280. Throughput: 0: 912.5. Samples: 4408772. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:35:37,872][00205] Avg episode reward: [(0, '30.765')] +[2023-02-24 13:35:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17649664. Throughput: 0: 912.8. Samples: 4410926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:35:42,877][00205] Avg episode reward: [(0, '31.647')] +[2023-02-24 13:35:43,238][11215] Updated weights for policy 0, policy_version 4310 (0.0018) +[2023-02-24 13:35:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17674240. Throughput: 0: 963.7. Samples: 4417370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:47,872][00205] Avg episode reward: [(0, '32.506')] +[2023-02-24 13:35:52,482][11215] Updated weights for policy 0, policy_version 4320 (0.0016) +[2023-02-24 13:35:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17694720. Throughput: 0: 965.4. Samples: 4424076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:52,876][00205] Avg episode reward: [(0, '32.686')] +[2023-02-24 13:35:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17707008. Throughput: 0: 933.8. Samples: 4426162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:35:57,878][00205] Avg episode reward: [(0, '31.490')] +[2023-02-24 13:36:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17727488. Throughput: 0: 916.5. Samples: 4430588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:02,875][00205] Avg episode reward: [(0, '30.287')] +[2023-02-24 13:36:04,490][11215] Updated weights for policy 0, policy_version 4330 (0.0012) +[2023-02-24 13:36:07,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17747968. Throughput: 0: 969.9. Samples: 4437570. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:07,876][00205] Avg episode reward: [(0, '29.798')] +[2023-02-24 13:36:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 17768448. Throughput: 0: 971.0. Samples: 4440970. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:36:12,874][00205] Avg episode reward: [(0, '30.527')] +[2023-02-24 13:36:14,723][11215] Updated weights for policy 0, policy_version 4340 (0.0026) +[2023-02-24 13:36:17,871][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 17784832. Throughput: 0: 925.7. Samples: 4445858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:36:17,878][00205] Avg episode reward: [(0, '30.486')] +[2023-02-24 13:36:22,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17801216. Throughput: 0: 933.2. Samples: 4450768. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:36:22,872][00205] Avg episode reward: [(0, '29.989')] +[2023-02-24 13:36:26,034][11215] Updated weights for policy 0, policy_version 4350 (0.0013) +[2023-02-24 13:36:27,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17825792. Throughput: 0: 958.6. Samples: 4454064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:27,879][00205] Avg episode reward: [(0, '32.190')] +[2023-02-24 13:36:32,873][00205] Fps is (10 sec: 4504.0, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17846272. Throughput: 0: 967.0. Samples: 4460888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:32,875][00205] Avg episode reward: [(0, '31.899')] +[2023-02-24 13:36:37,228][11215] Updated weights for policy 0, policy_version 4360 (0.0014) +[2023-02-24 13:36:37,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3754.2, 300 sec: 3762.7). Total num frames: 17858560. Throughput: 0: 914.3. Samples: 4465228. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:37,885][00205] Avg episode reward: [(0, '31.766')] +[2023-02-24 13:36:42,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17874944. Throughput: 0: 915.7. Samples: 4467370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:36:42,873][00205] Avg episode reward: [(0, '31.646')] +[2023-02-24 13:36:47,420][11215] Updated weights for policy 0, policy_version 4370 (0.0017) +[2023-02-24 13:36:47,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17899520. Throughput: 0: 964.9. Samples: 4474008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:36:47,881][00205] Avg episode reward: [(0, '30.047')] +[2023-02-24 13:36:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth... +[2023-02-24 13:36:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth +[2023-02-24 13:36:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17920000. Throughput: 0: 948.7. Samples: 4480262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:52,876][00205] Avg episode reward: [(0, '30.306')] +[2023-02-24 13:36:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17932288. Throughput: 0: 919.7. Samples: 4482356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:36:57,875][00205] Avg episode reward: [(0, '28.813')] +[2023-02-24 13:36:59,646][11215] Updated weights for policy 0, policy_version 4380 (0.0033) +[2023-02-24 13:37:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 17948672. Throughput: 0: 910.0. Samples: 4486808. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:37:02,873][00205] Avg episode reward: [(0, '27.859')] +[2023-02-24 13:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17973248. Throughput: 0: 955.5. Samples: 4493766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:07,873][00205] Avg episode reward: [(0, '29.218')] +[2023-02-24 13:37:08,913][11215] Updated weights for policy 0, policy_version 4390 (0.0012) +[2023-02-24 13:37:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.8, 300 sec: 3762.8). Total num frames: 17993728. Throughput: 0: 959.2. Samples: 4497226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:12,880][00205] Avg episode reward: [(0, '31.056')] +[2023-02-24 13:37:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18010112. Throughput: 0: 911.0. Samples: 4501882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:37:17,878][00205] Avg episode reward: [(0, '30.981')] +[2023-02-24 13:37:21,376][11215] Updated weights for policy 0, policy_version 4400 (0.0015) +[2023-02-24 13:37:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18026496. Throughput: 0: 930.7. Samples: 4507104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:37:22,872][00205] Avg episode reward: [(0, '30.979')] +[2023-02-24 13:37:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18051072. Throughput: 0: 960.0. Samples: 4510572. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:37:27,879][00205] Avg episode reward: [(0, '31.649')] +[2023-02-24 13:37:30,306][11215] Updated weights for policy 0, policy_version 4410 (0.0023) +[2023-02-24 13:37:32,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3686.6, 300 sec: 3762.8). Total num frames: 18067456. Throughput: 0: 960.0. Samples: 4517206. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:32,873][00205] Avg episode reward: [(0, '31.672')] +[2023-02-24 13:37:37,873][00205] Fps is (10 sec: 3275.8, 60 sec: 3754.9, 300 sec: 3762.7). Total num frames: 18083840. Throughput: 0: 918.5. Samples: 4521598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:37:37,875][00205] Avg episode reward: [(0, '31.408')] +[2023-02-24 13:37:42,551][11215] Updated weights for policy 0, policy_version 4420 (0.0025) +[2023-02-24 13:37:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18104320. Throughput: 0: 920.1. Samples: 4523760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:37:42,872][00205] Avg episode reward: [(0, '31.068')] +[2023-02-24 13:37:47,870][00205] Fps is (10 sec: 4507.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18128896. Throughput: 0: 978.8. Samples: 4530854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:47,872][00205] Avg episode reward: [(0, '29.337')] +[2023-02-24 13:37:51,721][11215] Updated weights for policy 0, policy_version 4430 (0.0013) +[2023-02-24 13:37:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18145280. Throughput: 0: 962.1. Samples: 4537058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:52,877][00205] Avg episode reward: [(0, '28.751')] +[2023-02-24 13:37:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18161664. Throughput: 0: 932.4. Samples: 4539182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:37:57,878][00205] Avg episode reward: [(0, '28.305')] +[2023-02-24 13:38:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18178048. Throughput: 0: 936.1. Samples: 4544008. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:38:02,872][00205] Avg episode reward: [(0, '29.431')] +[2023-02-24 13:38:03,901][11215] Updated weights for policy 0, policy_version 4440 (0.0023) +[2023-02-24 13:38:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 18202624. Throughput: 0: 975.9. Samples: 4551018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:38:07,872][00205] Avg episode reward: [(0, '29.386')] +[2023-02-24 13:38:12,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18223104. Throughput: 0: 972.9. Samples: 4554352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:38:12,876][00205] Avg episode reward: [(0, '31.248')] +[2023-02-24 13:38:14,241][11215] Updated weights for policy 0, policy_version 4450 (0.0022) +[2023-02-24 13:38:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18235392. Throughput: 0: 921.5. Samples: 4558672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:38:17,879][00205] Avg episode reward: [(0, '31.288')] +[2023-02-24 13:38:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18255872. Throughput: 0: 943.7. Samples: 4564060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:22,872][00205] Avg episode reward: [(0, '31.926')] +[2023-02-24 13:38:25,306][11215] Updated weights for policy 0, policy_version 4460 (0.0025) +[2023-02-24 13:38:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18276352. Throughput: 0: 971.2. Samples: 4567464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:27,873][00205] Avg episode reward: [(0, '33.241')] +[2023-02-24 13:38:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18296832. Throughput: 0: 953.2. Samples: 4573748. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:38:32,880][00205] Avg episode reward: [(0, '32.765')] +[2023-02-24 13:38:36,710][11215] Updated weights for policy 0, policy_version 4470 (0.0011) +[2023-02-24 13:38:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3776.6). Total num frames: 18309120. Throughput: 0: 912.5. Samples: 4578120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:37,877][00205] Avg episode reward: [(0, '32.218')] +[2023-02-24 13:38:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18329600. Throughput: 0: 920.8. Samples: 4580616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:38:42,873][00205] Avg episode reward: [(0, '32.247')] +[2023-02-24 13:38:46,660][11215] Updated weights for policy 0, policy_version 4480 (0.0034) +[2023-02-24 13:38:47,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18354176. Throughput: 0: 966.6. Samples: 4587504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:47,872][00205] Avg episode reward: [(0, '30.376')] +[2023-02-24 13:38:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth... +[2023-02-24 13:38:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth +[2023-02-24 13:38:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18370560. Throughput: 0: 938.8. Samples: 4593266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:38:52,875][00205] Avg episode reward: [(0, '29.966')] +[2023-02-24 13:38:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18386944. Throughput: 0: 912.1. Samples: 4595396. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:38:57,872][00205] Avg episode reward: [(0, '30.101')] +[2023-02-24 13:38:59,192][11215] Updated weights for policy 0, policy_version 4490 (0.0016) +[2023-02-24 13:39:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18407424. Throughput: 0: 929.7. Samples: 4600508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:39:02,877][00205] Avg episode reward: [(0, '29.211')] +[2023-02-24 13:39:07,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18427904. Throughput: 0: 965.3. Samples: 4607500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:07,874][00205] Avg episode reward: [(0, '31.558')] +[2023-02-24 13:39:08,138][11215] Updated weights for policy 0, policy_version 4500 (0.0017) +[2023-02-24 13:39:12,875][00205] Fps is (10 sec: 3684.4, 60 sec: 3686.1, 300 sec: 3776.6). Total num frames: 18444288. Throughput: 0: 957.4. Samples: 4610554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:12,880][00205] Avg episode reward: [(0, '30.965')] +[2023-02-24 13:39:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18460672. Throughput: 0: 914.6. Samples: 4614906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:39:17,877][00205] Avg episode reward: [(0, '31.455')] +[2023-02-24 13:39:20,479][11215] Updated weights for policy 0, policy_version 4510 (0.0017) +[2023-02-24 13:39:22,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18481152. Throughput: 0: 946.6. Samples: 4620718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:22,882][00205] Avg episode reward: [(0, '32.129')] +[2023-02-24 13:39:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18505728. Throughput: 0: 968.6. Samples: 4624202. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:39:27,881][00205] Avg episode reward: [(0, '32.640')] +[2023-02-24 13:39:29,696][11215] Updated weights for policy 0, policy_version 4520 (0.0014) +[2023-02-24 13:39:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18522112. Throughput: 0: 947.4. Samples: 4630138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:32,874][00205] Avg episode reward: [(0, '33.105')] +[2023-02-24 13:39:37,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18534400. Throughput: 0: 916.4. Samples: 4634506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:37,879][00205] Avg episode reward: [(0, '33.413')] +[2023-02-24 13:39:41,942][11215] Updated weights for policy 0, policy_version 4530 (0.0016) +[2023-02-24 13:39:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18558976. Throughput: 0: 930.5. Samples: 4637266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:42,875][00205] Avg episode reward: [(0, '31.860')] +[2023-02-24 13:39:47,870][00205] Fps is (10 sec: 4505.9, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18579456. Throughput: 0: 973.5. Samples: 4644316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:39:47,871][00205] Avg episode reward: [(0, '33.947')] +[2023-02-24 13:39:51,730][11215] Updated weights for policy 0, policy_version 4540 (0.0011) +[2023-02-24 13:39:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18595840. Throughput: 0: 941.1. Samples: 4649848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:39:52,877][00205] Avg episode reward: [(0, '33.288')] +[2023-02-24 13:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18612224. Throughput: 0: 921.8. Samples: 4652028. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:39:57,876][00205] Avg episode reward: [(0, '32.833')] +[2023-02-24 13:40:02,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18632704. Throughput: 0: 941.5. Samples: 4657272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:02,872][00205] Avg episode reward: [(0, '32.925')] +[2023-02-24 13:40:03,409][11215] Updated weights for policy 0, policy_version 4550 (0.0014) +[2023-02-24 13:40:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18657280. Throughput: 0: 968.4. Samples: 4664296. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:40:07,875][00205] Avg episode reward: [(0, '32.421')] +[2023-02-24 13:40:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3823.2, 300 sec: 3776.6). Total num frames: 18673664. Throughput: 0: 952.5. Samples: 4667064. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:40:12,874][00205] Avg episode reward: [(0, '32.367')] +[2023-02-24 13:40:14,224][11215] Updated weights for policy 0, policy_version 4560 (0.0014) +[2023-02-24 13:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18685952. Throughput: 0: 916.9. Samples: 4671398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:40:17,874][00205] Avg episode reward: [(0, '33.540')] +[2023-02-24 13:40:22,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18710528. Throughput: 0: 956.8. Samples: 4677562. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:40:22,872][00205] Avg episode reward: [(0, '31.933')] +[2023-02-24 13:40:24,288][11215] Updated weights for policy 0, policy_version 4570 (0.0020) +[2023-02-24 13:40:27,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 18731008. Throughput: 0: 974.3. Samples: 4681112. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:27,872][00205] Avg episode reward: [(0, '30.706')] +[2023-02-24 13:40:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18747392. Throughput: 0: 941.7. Samples: 4686694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:40:32,877][00205] Avg episode reward: [(0, '31.070')] +[2023-02-24 13:40:36,269][11215] Updated weights for policy 0, policy_version 4580 (0.0018) +[2023-02-24 13:40:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3776.6). Total num frames: 18763776. Throughput: 0: 917.0. Samples: 4691112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:40:37,872][00205] Avg episode reward: [(0, '30.738')] +[2023-02-24 13:40:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18784256. Throughput: 0: 934.1. Samples: 4694062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:40:42,873][00205] Avg episode reward: [(0, '31.084')] +[2023-02-24 13:40:45,892][11215] Updated weights for policy 0, policy_version 4590 (0.0022) +[2023-02-24 13:40:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18808832. Throughput: 0: 974.7. Samples: 4701134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:47,872][00205] Avg episode reward: [(0, '30.620')] +[2023-02-24 13:40:47,880][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth... +[2023-02-24 13:40:48,029][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth +[2023-02-24 13:40:52,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18821120. Throughput: 0: 937.5. Samples: 4706484. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:40:52,875][00205] Avg episode reward: [(0, '30.381')] +[2023-02-24 13:40:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18837504. Throughput: 0: 923.6. Samples: 4708624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:40:57,874][00205] Avg episode reward: [(0, '31.576')] +[2023-02-24 13:40:58,500][11215] Updated weights for policy 0, policy_version 4600 (0.0011) +[2023-02-24 13:41:02,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18862080. Throughput: 0: 950.1. Samples: 4714150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:02,880][00205] Avg episode reward: [(0, '31.787')] +[2023-02-24 13:41:07,349][11215] Updated weights for policy 0, policy_version 4610 (0.0026) +[2023-02-24 13:41:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18882560. Throughput: 0: 969.6. Samples: 4721196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:07,873][00205] Avg episode reward: [(0, '31.181')] +[2023-02-24 13:41:12,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.8, 300 sec: 3776.7). Total num frames: 18898944. Throughput: 0: 949.5. Samples: 4723840. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:41:12,873][00205] Avg episode reward: [(0, '32.176')] +[2023-02-24 13:41:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18911232. Throughput: 0: 924.0. Samples: 4728272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:41:17,874][00205] Avg episode reward: [(0, '31.709')] +[2023-02-24 13:41:19,673][11215] Updated weights for policy 0, policy_version 4620 (0.0025) +[2023-02-24 13:41:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18935808. Throughput: 0: 963.2. Samples: 4734458. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:41:22,873][00205] Avg episode reward: [(0, '31.460')] +[2023-02-24 13:41:27,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18960384. Throughput: 0: 975.9. Samples: 4737980. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:41:27,881][00205] Avg episode reward: [(0, '32.309')] +[2023-02-24 13:41:28,487][11215] Updated weights for policy 0, policy_version 4630 (0.0022) +[2023-02-24 13:41:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 18976768. Throughput: 0: 945.1. Samples: 4743662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:41:32,872][00205] Avg episode reward: [(0, '31.874')] +[2023-02-24 13:41:37,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18989056. Throughput: 0: 922.5. Samples: 4747996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:41:37,875][00205] Avg episode reward: [(0, '33.078')] +[2023-02-24 13:41:40,808][11215] Updated weights for policy 0, policy_version 4640 (0.0026) +[2023-02-24 13:41:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19013632. Throughput: 0: 944.5. Samples: 4751128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:42,876][00205] Avg episode reward: [(0, '32.005')] +[2023-02-24 13:41:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19034112. Throughput: 0: 977.3. Samples: 4758130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:41:47,875][00205] Avg episode reward: [(0, '30.901')] +[2023-02-24 13:41:50,662][11215] Updated weights for policy 0, policy_version 4650 (0.0016) +[2023-02-24 13:41:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 19050496. Throughput: 0: 937.7. Samples: 4763392. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:41:52,881][00205] Avg episode reward: [(0, '31.801')] +[2023-02-24 13:41:57,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 19066880. Throughput: 0: 926.2. Samples: 4765520. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:41:57,879][00205] Avg episode reward: [(0, '32.167')] +[2023-02-24 13:42:02,148][11215] Updated weights for policy 0, policy_version 4660 (0.0017) +[2023-02-24 13:42:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19087360. Throughput: 0: 955.6. Samples: 4771276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:02,873][00205] Avg episode reward: [(0, '29.330')] +[2023-02-24 13:42:07,870][00205] Fps is (10 sec: 4506.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19111936. Throughput: 0: 972.3. Samples: 4778212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:07,873][00205] Avg episode reward: [(0, '29.882')] +[2023-02-24 13:42:12,785][11215] Updated weights for policy 0, policy_version 4670 (0.0015) +[2023-02-24 13:42:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19128320. Throughput: 0: 949.2. Samples: 4780692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:12,872][00205] Avg episode reward: [(0, '30.252')] +[2023-02-24 13:42:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19140608. Throughput: 0: 917.6. Samples: 4784952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:17,872][00205] Avg episode reward: [(0, '30.203')] +[2023-02-24 13:42:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3776.7). Total num frames: 19165184. Throughput: 0: 960.3. Samples: 4791208. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:22,873][00205] Avg episode reward: [(0, '29.424')] +[2023-02-24 13:42:23,628][11215] Updated weights for policy 0, policy_version 4680 (0.0020) +[2023-02-24 13:42:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19185664. Throughput: 0: 969.2. Samples: 4794740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:27,872][00205] Avg episode reward: [(0, '28.006')] +[2023-02-24 13:42:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 19202048. Throughput: 0: 935.9. Samples: 4800244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:32,872][00205] Avg episode reward: [(0, '28.539')] +[2023-02-24 13:42:35,105][11215] Updated weights for policy 0, policy_version 4690 (0.0016) +[2023-02-24 13:42:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19218432. Throughput: 0: 918.4. Samples: 4804718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:37,881][00205] Avg episode reward: [(0, '29.047')] +[2023-02-24 13:42:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 19238912. Throughput: 0: 945.2. Samples: 4808052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:42,872][00205] Avg episode reward: [(0, '27.868')] +[2023-02-24 13:42:44,645][11215] Updated weights for policy 0, policy_version 4700 (0.0011) +[2023-02-24 13:42:47,874][00205] Fps is (10 sec: 4503.7, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19263488. Throughput: 0: 973.2. Samples: 4815076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:42:47,877][00205] Avg episode reward: [(0, '28.748')] +[2023-02-24 13:42:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth... 
+[2023-02-24 13:42:48,040][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth +[2023-02-24 13:42:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19275776. Throughput: 0: 929.0. Samples: 4820018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:42:52,878][00205] Avg episode reward: [(0, '27.869')] +[2023-02-24 13:42:57,042][11215] Updated weights for policy 0, policy_version 4710 (0.0023) +[2023-02-24 13:42:57,870][00205] Fps is (10 sec: 2868.4, 60 sec: 3754.8, 300 sec: 3776.6). Total num frames: 19292160. Throughput: 0: 922.1. Samples: 4822186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:42:57,873][00205] Avg episode reward: [(0, '28.164')] +[2023-02-24 13:43:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19316736. Throughput: 0: 957.1. Samples: 4828020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:43:02,873][00205] Avg episode reward: [(0, '28.988')] +[2023-02-24 13:43:06,185][11215] Updated weights for policy 0, policy_version 4720 (0.0012) +[2023-02-24 13:43:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19337216. Throughput: 0: 974.3. Samples: 4835052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:43:07,873][00205] Avg episode reward: [(0, '27.640')] +[2023-02-24 13:43:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19353600. Throughput: 0: 947.6. Samples: 4837380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:43:12,873][00205] Avg episode reward: [(0, '28.411')] +[2023-02-24 13:43:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19369984. Throughput: 0: 924.0. Samples: 4841824. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:17,873][00205] Avg episode reward: [(0, '28.179')] +[2023-02-24 13:43:18,659][11215] Updated weights for policy 0, policy_version 4730 (0.0034) +[2023-02-24 13:43:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19390464. Throughput: 0: 969.9. Samples: 4848362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:22,873][00205] Avg episode reward: [(0, '28.688')] +[2023-02-24 13:43:27,183][11215] Updated weights for policy 0, policy_version 4740 (0.0017) +[2023-02-24 13:43:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19415040. Throughput: 0: 973.9. Samples: 4851878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:43:27,872][00205] Avg episode reward: [(0, '28.992')] +[2023-02-24 13:43:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 19427328. Throughput: 0: 935.0. Samples: 4857146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:32,875][00205] Avg episode reward: [(0, '29.533')] +[2023-02-24 13:43:37,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 19443712. Throughput: 0: 921.0. Samples: 4861466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:37,875][00205] Avg episode reward: [(0, '29.846')] +[2023-02-24 13:43:39,807][11215] Updated weights for policy 0, policy_version 4750 (0.0029) +[2023-02-24 13:43:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19468288. Throughput: 0: 949.6. Samples: 4864920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:42,872][00205] Avg episode reward: [(0, '29.126')] +[2023-02-24 13:43:47,873][00205] Fps is (10 sec: 4505.2, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19488768. Throughput: 0: 975.4. Samples: 4871916. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:43:47,876][00205] Avg episode reward: [(0, '30.106')] +[2023-02-24 13:43:49,625][11215] Updated weights for policy 0, policy_version 4760 (0.0016) +[2023-02-24 13:43:52,874][00205] Fps is (10 sec: 3684.9, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19505152. Throughput: 0: 924.6. Samples: 4876662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:43:52,876][00205] Avg episode reward: [(0, '30.569')] +[2023-02-24 13:43:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 19521536. Throughput: 0: 920.2. Samples: 4878788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:43:57,872][00205] Avg episode reward: [(0, '30.493')] +[2023-02-24 13:44:00,981][11215] Updated weights for policy 0, policy_version 4770 (0.0026) +[2023-02-24 13:44:02,870][00205] Fps is (10 sec: 4097.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19546112. Throughput: 0: 961.9. Samples: 4885108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:44:02,872][00205] Avg episode reward: [(0, '29.534')] +[2023-02-24 13:44:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 19566592. Throughput: 0: 967.4. Samples: 4891896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:07,872][00205] Avg episode reward: [(0, '30.184')] +[2023-02-24 13:44:11,976][11215] Updated weights for policy 0, policy_version 4780 (0.0011) +[2023-02-24 13:44:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19578880. Throughput: 0: 936.6. Samples: 4894026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:12,874][00205] Avg episode reward: [(0, '31.095')] +[2023-02-24 13:44:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19595264. Throughput: 0: 916.0. Samples: 4898364. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:44:17,872][00205] Avg episode reward: [(0, '30.332')] +[2023-02-24 13:44:22,239][11215] Updated weights for policy 0, policy_version 4790 (0.0017) +[2023-02-24 13:44:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19619840. Throughput: 0: 977.0. Samples: 4905430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:22,873][00205] Avg episode reward: [(0, '30.039')] +[2023-02-24 13:44:27,874][00205] Fps is (10 sec: 4503.6, 60 sec: 3754.4, 300 sec: 3790.5). Total num frames: 19640320. Throughput: 0: 978.0. Samples: 4908936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:44:27,877][00205] Avg episode reward: [(0, '29.960')] +[2023-02-24 13:44:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 19656704. Throughput: 0: 931.4. Samples: 4913828. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:44:32,878][00205] Avg episode reward: [(0, '29.704')] +[2023-02-24 13:44:33,987][11215] Updated weights for policy 0, policy_version 4800 (0.0026) +[2023-02-24 13:44:37,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3823.1, 300 sec: 3776.7). Total num frames: 19673088. Throughput: 0: 934.4. Samples: 4918708. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) +[2023-02-24 13:44:37,873][00205] Avg episode reward: [(0, '30.977')] +[2023-02-24 13:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19697664. Throughput: 0: 962.5. Samples: 4922100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:44:42,872][00205] Avg episode reward: [(0, '32.188')] +[2023-02-24 13:44:43,543][11215] Updated weights for policy 0, policy_version 4810 (0.0024) +[2023-02-24 13:44:47,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 19718144. Throughput: 0: 978.1. Samples: 4929124. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:44:47,874][00205] Avg episode reward: [(0, '30.658')] +[2023-02-24 13:44:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004814_19718144.pth... +[2023-02-24 13:44:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth +[2023-02-24 13:44:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19730432. Throughput: 0: 923.7. Samples: 4933462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:44:52,880][00205] Avg episode reward: [(0, '32.048')] +[2023-02-24 13:44:56,130][11215] Updated weights for policy 0, policy_version 4820 (0.0019) +[2023-02-24 13:44:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19746816. Throughput: 0: 923.3. Samples: 4935574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) +[2023-02-24 13:44:57,872][00205] Avg episode reward: [(0, '31.180')] +[2023-02-24 13:45:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19771392. Throughput: 0: 963.9. Samples: 4941740. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:02,872][00205] Avg episode reward: [(0, '30.934')] +[2023-02-24 13:45:05,453][11215] Updated weights for policy 0, policy_version 4830 (0.0019) +[2023-02-24 13:45:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 19787776. Throughput: 0: 946.6. Samples: 4948026. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:07,874][00205] Avg episode reward: [(0, '30.586')] +[2023-02-24 13:45:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19804160. Throughput: 0: 916.7. Samples: 4950184. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) +[2023-02-24 13:45:12,874][00205] Avg episode reward: [(0, '31.153')] +[2023-02-24 13:45:17,852][11215] Updated weights for policy 0, policy_version 4840 (0.0012) +[2023-02-24 13:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19824640. Throughput: 0: 908.3. Samples: 4954702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:45:17,878][00205] Avg episode reward: [(0, '29.579')] +[2023-02-24 13:45:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19845120. Throughput: 0: 957.3. Samples: 4961788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:45:22,879][00205] Avg episode reward: [(0, '30.000')] +[2023-02-24 13:45:27,014][11215] Updated weights for policy 0, policy_version 4850 (0.0021) +[2023-02-24 13:45:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19865600. Throughput: 0: 959.2. Samples: 4965266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) +[2023-02-24 13:45:27,878][00205] Avg episode reward: [(0, '29.956')] +[2023-02-24 13:45:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19881984. Throughput: 0: 906.3. Samples: 4969906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:45:32,874][00205] Avg episode reward: [(0, '31.605')] +[2023-02-24 13:45:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19898368. Throughput: 0: 920.4. Samples: 4974880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) +[2023-02-24 13:45:37,878][00205] Avg episode reward: [(0, '30.971')] +[2023-02-24 13:45:39,131][11215] Updated weights for policy 0, policy_version 4860 (0.0014) +[2023-02-24 13:45:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19922944. Throughput: 0: 951.0. Samples: 4978368. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:45:42,879][00205] Avg episode reward: [(0, '30.475')] +[2023-02-24 13:45:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.6). Total num frames: 19939328. Throughput: 0: 963.4. Samples: 4985094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) +[2023-02-24 13:45:47,880][00205] Avg episode reward: [(0, '30.778')] +[2023-02-24 13:45:49,491][11215] Updated weights for policy 0, policy_version 4870 (0.0015) +[2023-02-24 13:45:52,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3754.3, 300 sec: 3790.5). Total num frames: 19955712. Throughput: 0: 921.1. Samples: 4989480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) +[2023-02-24 13:45:52,879][00205] Avg episode reward: [(0, '31.618')] +[2023-02-24 13:45:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19976192. Throughput: 0: 922.4. Samples: 4991690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:45:57,880][00205] Avg episode reward: [(0, '31.637')] +[2023-02-24 13:46:00,362][11215] Updated weights for policy 0, policy_version 4880 (0.0016) +[2023-02-24 13:46:02,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19996672. Throughput: 0: 975.7. Samples: 4998610. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) +[2023-02-24 13:46:02,882][00205] Avg episode reward: [(0, '31.607')] +[2023-02-24 13:46:03,954][11201] Stopping Batcher_0... +[2023-02-24 13:46:03,954][11201] Loop batcher_evt_loop terminating... +[2023-02-24 13:46:03,954][00205] Component Batcher_0 stopped! +[2023-02-24 13:46:03,960][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 13:46:04,002][11215] Weights refcount: 2 0 +[2023-02-24 13:46:04,006][00205] Component InferenceWorker_p0-w0 stopped! +[2023-02-24 13:46:04,017][11215] Stopping InferenceWorker_p0-w0... 
+[2023-02-24 13:46:04,017][11215] Loop inference_proc0-0_evt_loop terminating... +[2023-02-24 13:46:04,036][11221] Stopping RolloutWorker_w1... +[2023-02-24 13:46:04,031][11226] Stopping RolloutWorker_w7... +[2023-02-24 13:46:04,032][00205] Component RolloutWorker_w7 stopped! +[2023-02-24 13:46:04,038][00205] Component RolloutWorker_w1 stopped! +[2023-02-24 13:46:04,041][00205] Component RolloutWorker_w5 stopped! +[2023-02-24 13:46:04,041][11224] Stopping RolloutWorker_w5... +[2023-02-24 13:46:04,044][11224] Loop rollout_proc5_evt_loop terminating... +[2023-02-24 13:46:04,038][11221] Loop rollout_proc1_evt_loop terminating... +[2023-02-24 13:46:04,045][11226] Loop rollout_proc7_evt_loop terminating... +[2023-02-24 13:46:04,047][11223] Stopping RolloutWorker_w3... +[2023-02-24 13:46:04,048][11223] Loop rollout_proc3_evt_loop terminating... +[2023-02-24 13:46:04,049][00205] Component RolloutWorker_w3 stopped! +[2023-02-24 13:46:04,064][11225] Stopping RolloutWorker_w6... +[2023-02-24 13:46:04,064][00205] Component RolloutWorker_w6 stopped! +[2023-02-24 13:46:04,072][11225] Loop rollout_proc6_evt_loop terminating... +[2023-02-24 13:46:04,078][11222] Stopping RolloutWorker_w2... +[2023-02-24 13:46:04,079][11222] Loop rollout_proc2_evt_loop terminating... +[2023-02-24 13:46:04,078][00205] Component RolloutWorker_w2 stopped! +[2023-02-24 13:46:04,088][11216] Stopping RolloutWorker_w0... +[2023-02-24 13:46:04,088][00205] Component RolloutWorker_w0 stopped! +[2023-02-24 13:46:04,093][11227] Stopping RolloutWorker_w4... +[2023-02-24 13:46:04,093][00205] Component RolloutWorker_w4 stopped! +[2023-02-24 13:46:04,095][11227] Loop rollout_proc4_evt_loop terminating... +[2023-02-24 13:46:04,099][11216] Loop rollout_proc0_evt_loop terminating... 
+[2023-02-24 13:46:04,139][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth +[2023-02-24 13:46:04,151][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 13:46:04,323][00205] Component LearnerWorker_p0 stopped! +[2023-02-24 13:46:04,328][00205] Waiting for process learner_proc0 to stop... +[2023-02-24 13:46:04,343][11201] Stopping LearnerWorker_p0... +[2023-02-24 13:46:04,344][11201] Loop learner_proc0_evt_loop terminating... +[2023-02-24 13:46:06,567][00205] Waiting for process inference_proc0-0 to join... +[2023-02-24 13:46:07,271][00205] Waiting for process rollout_proc0 to join... +[2023-02-24 13:46:08,011][00205] Waiting for process rollout_proc1 to join... +[2023-02-24 13:46:08,013][00205] Waiting for process rollout_proc2 to join... +[2023-02-24 13:46:08,022][00205] Waiting for process rollout_proc3 to join... +[2023-02-24 13:46:08,024][00205] Waiting for process rollout_proc4 to join... +[2023-02-24 13:46:08,025][00205] Waiting for process rollout_proc5 to join... +[2023-02-24 13:46:08,026][00205] Waiting for process rollout_proc6 to join... +[2023-02-24 13:46:08,028][00205] Waiting for process rollout_proc7 to join... 
+[2023-02-24 13:46:08,030][00205] Batcher 0 profile tree view:
+batching: 125.6692, releasing_batches: 0.1240
+[2023-02-24 13:46:08,032][00205] InferenceWorker_p0-w0 profile tree view:
+wait_policy: 0.0001
+  wait_policy_total: 2737.0422
+update_model: 37.9221
+  weight_update: 0.0016
+one_step: 0.0118
+  handle_policy_step: 2483.2568
+    deserialize: 75.3816, stack: 14.8616, obs_to_device_normalize: 559.5526, forward: 1184.2792, send_messages: 130.3140
+    prepare_outputs: 395.5048
+      to_cpu: 242.4212
+[2023-02-24 13:46:08,034][00205] Learner 0 profile tree view:
+misc: 0.0290, prepare_batch: 60.7874
+train: 366.6477
+  epoch_init: 0.0621, minibatch_init: 0.0458, losses_postprocess: 2.9527, kl_divergence: 2.8728, after_optimizer: 162.1073
+  calculate_losses: 130.0243
+    losses_init: 0.0470, forward_head: 7.9970, bptt_initial: 86.1198, tail: 5.1093, advantages_returns: 1.4157, losses: 16.7013
+    bptt: 11.0390
+      bptt_forward_core: 10.5633
+  update: 65.4562
+    clip: 7.0785
+[2023-02-24 13:46:08,036][00205] RolloutWorker_w0 profile tree view:
+wait_for_trajectories: 1.6596, enqueue_policy_requests: 779.0498, env_step: 4091.0265, overhead: 107.0679, complete_rollouts: 33.4033
+save_policy_outputs: 98.6476
+  split_output_tensors: 47.6182
+[2023-02-24 13:46:08,038][00205] RolloutWorker_w7 profile tree view:
+wait_for_trajectories: 1.8929, enqueue_policy_requests: 766.4891, env_step: 4101.5740, overhead: 109.4752, complete_rollouts: 35.1840
+save_policy_outputs: 99.1619
+  split_output_tensors: 48.3378
+[2023-02-24 13:46:08,040][00205] Loop Runner_EvtLoop terminating...
+[2023-02-24 13:46:08,043][00205] Runner profile tree view: +main_loop: 5474.8034 +[2023-02-24 13:46:08,055][00205] Collected {0: 20004864}, FPS: 3654.0 +[2023-02-24 14:12:39,442][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 14:12:39,445][00205] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 14:12:39,447][00205] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 14:12:39,449][00205] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 14:12:39,451][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:12:39,452][00205] Adding new argument 'video_name'=None that is not in the saved config file! +[2023-02-24 14:12:39,456][00205] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:12:39,457][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 14:12:39,458][00205] Adding new argument 'push_to_hub'=False that is not in the saved config file! +[2023-02-24 14:12:39,459][00205] Adding new argument 'hf_repository'=None that is not in the saved config file! +[2023-02-24 14:12:39,462][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 14:12:39,463][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 14:12:39,465][00205] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 14:12:39,466][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
+[2023-02-24 14:12:39,467][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:12:39,510][00205] Doom resolution: 160x120, resize resolution: (128, 72) +[2023-02-24 14:12:39,514][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:12:39,519][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:12:39,546][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:12:40,349][00205] Conv encoder output size: 512 +[2023-02-24 14:12:40,352][00205] Policy head output size: 512 +[2023-02-24 14:12:43,178][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:12:44,426][00205] Num frames 100... +[2023-02-24 14:12:44,538][00205] Num frames 200... +[2023-02-24 14:12:44,654][00205] Num frames 300... +[2023-02-24 14:12:44,765][00205] Num frames 400... +[2023-02-24 14:12:44,880][00205] Num frames 500... +[2023-02-24 14:12:44,989][00205] Num frames 600... +[2023-02-24 14:12:45,102][00205] Num frames 700... +[2023-02-24 14:12:45,213][00205] Num frames 800... +[2023-02-24 14:12:45,331][00205] Num frames 900... +[2023-02-24 14:12:45,443][00205] Num frames 1000... +[2023-02-24 14:12:45,561][00205] Num frames 1100... +[2023-02-24 14:12:45,677][00205] Num frames 1200... +[2023-02-24 14:12:45,797][00205] Num frames 1300... +[2023-02-24 14:12:45,909][00205] Num frames 1400... +[2023-02-24 14:12:46,022][00205] Num frames 1500... +[2023-02-24 14:12:46,083][00205] Avg episode rewards: #0: 36.040, true rewards: #0: 15.040 +[2023-02-24 14:12:46,085][00205] Avg episode reward: 36.040, avg true_objective: 15.040 +[2023-02-24 14:12:46,195][00205] Num frames 1600... +[2023-02-24 14:12:46,307][00205] Num frames 1700... +[2023-02-24 14:12:46,416][00205] Num frames 1800... +[2023-02-24 14:12:46,529][00205] Num frames 1900... +[2023-02-24 14:12:46,651][00205] Num frames 2000... +[2023-02-24 14:12:46,762][00205] Num frames 2100... 
+[2023-02-24 14:12:46,873][00205] Num frames 2200... +[2023-02-24 14:12:46,990][00205] Num frames 2300... +[2023-02-24 14:12:47,109][00205] Num frames 2400... +[2023-02-24 14:12:47,229][00205] Num frames 2500... +[2023-02-24 14:12:47,338][00205] Num frames 2600... +[2023-02-24 14:12:47,450][00205] Num frames 2700... +[2023-02-24 14:12:47,531][00205] Avg episode rewards: #0: 32.600, true rewards: #0: 13.600 +[2023-02-24 14:12:47,533][00205] Avg episode reward: 32.600, avg true_objective: 13.600 +[2023-02-24 14:12:47,623][00205] Num frames 2800... +[2023-02-24 14:12:47,739][00205] Num frames 2900... +[2023-02-24 14:12:47,850][00205] Num frames 3000... +[2023-02-24 14:12:47,973][00205] Num frames 3100... +[2023-02-24 14:12:48,083][00205] Num frames 3200... +[2023-02-24 14:12:48,194][00205] Num frames 3300... +[2023-02-24 14:12:48,304][00205] Num frames 3400... +[2023-02-24 14:12:48,419][00205] Num frames 3500... +[2023-02-24 14:12:48,564][00205] Avg episode rewards: #0: 27.613, true rewards: #0: 11.947 +[2023-02-24 14:12:48,565][00205] Avg episode reward: 27.613, avg true_objective: 11.947 +[2023-02-24 14:12:48,589][00205] Num frames 3600... +[2023-02-24 14:12:48,704][00205] Num frames 3700... +[2023-02-24 14:12:48,821][00205] Num frames 3800... +[2023-02-24 14:12:48,929][00205] Num frames 3900... +[2023-02-24 14:12:49,048][00205] Num frames 4000... +[2023-02-24 14:12:49,166][00205] Num frames 4100... +[2023-02-24 14:12:49,278][00205] Num frames 4200... +[2023-02-24 14:12:49,390][00205] Num frames 4300... +[2023-02-24 14:12:49,502][00205] Num frames 4400... +[2023-02-24 14:12:49,616][00205] Num frames 4500... +[2023-02-24 14:12:49,733][00205] Num frames 4600... +[2023-02-24 14:12:49,842][00205] Num frames 4700... +[2023-02-24 14:12:49,955][00205] Num frames 4800... +[2023-02-24 14:12:50,066][00205] Num frames 4900... +[2023-02-24 14:12:50,178][00205] Num frames 5000... +[2023-02-24 14:12:50,290][00205] Num frames 5100... 
+[2023-02-24 14:12:50,405][00205] Num frames 5200... +[2023-02-24 14:12:50,523][00205] Num frames 5300... +[2023-02-24 14:12:50,633][00205] Num frames 5400... +[2023-02-24 14:12:50,778][00205] Avg episode rewards: #0: 32.680, true rewards: #0: 13.680 +[2023-02-24 14:12:50,779][00205] Avg episode reward: 32.680, avg true_objective: 13.680 +[2023-02-24 14:12:50,815][00205] Num frames 5500... +[2023-02-24 14:12:50,926][00205] Num frames 5600... +[2023-02-24 14:12:51,041][00205] Num frames 5700... +[2023-02-24 14:12:51,148][00205] Num frames 5800... +[2023-02-24 14:12:51,258][00205] Num frames 5900... +[2023-02-24 14:12:51,336][00205] Avg episode rewards: #0: 27.240, true rewards: #0: 11.840 +[2023-02-24 14:12:51,339][00205] Avg episode reward: 27.240, avg true_objective: 11.840 +[2023-02-24 14:12:51,437][00205] Num frames 6000... +[2023-02-24 14:12:51,547][00205] Num frames 6100... +[2023-02-24 14:12:51,658][00205] Num frames 6200... +[2023-02-24 14:12:51,774][00205] Num frames 6300... +[2023-02-24 14:12:51,889][00205] Num frames 6400... +[2023-02-24 14:12:52,001][00205] Num frames 6500... +[2023-02-24 14:12:52,112][00205] Num frames 6600... +[2023-02-24 14:12:52,223][00205] Num frames 6700... +[2023-02-24 14:12:52,391][00205] Num frames 6800... +[2023-02-24 14:12:52,551][00205] Num frames 6900... +[2023-02-24 14:12:52,704][00205] Num frames 7000... +[2023-02-24 14:12:52,875][00205] Num frames 7100... +[2023-02-24 14:12:53,035][00205] Num frames 7200... +[2023-02-24 14:12:53,190][00205] Num frames 7300... +[2023-02-24 14:12:53,354][00205] Num frames 7400... +[2023-02-24 14:12:53,432][00205] Avg episode rewards: #0: 29.685, true rewards: #0: 12.352 +[2023-02-24 14:12:53,437][00205] Avg episode reward: 29.685, avg true_objective: 12.352 +[2023-02-24 14:12:53,581][00205] Num frames 7500... +[2023-02-24 14:12:53,741][00205] Num frames 7600... +[2023-02-24 14:12:53,896][00205] Num frames 7700... +[2023-02-24 14:12:54,052][00205] Num frames 7800... 
+[2023-02-24 14:12:54,215][00205] Num frames 7900... +[2023-02-24 14:12:54,378][00205] Num frames 8000... +[2023-02-24 14:12:54,538][00205] Num frames 8100... +[2023-02-24 14:12:54,696][00205] Num frames 8200... +[2023-02-24 14:12:54,861][00205] Num frames 8300... +[2023-02-24 14:12:55,032][00205] Avg episode rewards: #0: 28.530, true rewards: #0: 11.959 +[2023-02-24 14:12:55,034][00205] Avg episode reward: 28.530, avg true_objective: 11.959 +[2023-02-24 14:12:55,091][00205] Num frames 8400... +[2023-02-24 14:12:55,262][00205] Num frames 8500... +[2023-02-24 14:12:55,425][00205] Num frames 8600... +[2023-02-24 14:12:55,584][00205] Num frames 8700... +[2023-02-24 14:12:55,744][00205] Num frames 8800... +[2023-02-24 14:12:55,863][00205] Avg episode rewards: #0: 26.321, true rewards: #0: 11.071 +[2023-02-24 14:12:55,866][00205] Avg episode reward: 26.321, avg true_objective: 11.071 +[2023-02-24 14:12:55,918][00205] Num frames 8900... +[2023-02-24 14:12:56,027][00205] Num frames 9000... +[2023-02-24 14:12:56,143][00205] Num frames 9100... +[2023-02-24 14:12:56,255][00205] Num frames 9200... +[2023-02-24 14:12:56,369][00205] Num frames 9300... +[2023-02-24 14:12:56,482][00205] Num frames 9400... +[2023-02-24 14:12:56,600][00205] Num frames 9500... +[2023-02-24 14:12:56,711][00205] Num frames 9600... +[2023-02-24 14:12:56,835][00205] Num frames 9700... +[2023-02-24 14:12:56,966][00205] Num frames 9800... +[2023-02-24 14:12:57,076][00205] Num frames 9900... +[2023-02-24 14:12:57,148][00205] Avg episode rewards: #0: 25.903, true rewards: #0: 11.014 +[2023-02-24 14:12:57,150][00205] Avg episode reward: 25.903, avg true_objective: 11.014 +[2023-02-24 14:12:57,248][00205] Num frames 10000... +[2023-02-24 14:12:57,368][00205] Num frames 10100... +[2023-02-24 14:12:57,479][00205] Num frames 10200... +[2023-02-24 14:12:57,591][00205] Num frames 10300... +[2023-02-24 14:12:57,701][00205] Num frames 10400... +[2023-02-24 14:12:57,814][00205] Num frames 10500... 
+[2023-02-24 14:12:57,930][00205] Num frames 10600... +[2023-02-24 14:12:58,040][00205] Num frames 10700... +[2023-02-24 14:12:58,151][00205] Num frames 10800... +[2023-02-24 14:12:58,227][00205] Avg episode rewards: #0: 25.217, true rewards: #0: 10.817 +[2023-02-24 14:12:58,229][00205] Avg episode reward: 25.217, avg true_objective: 10.817 +[2023-02-24 14:14:02,653][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 14:22:15,412][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 14:22:15,415][00205] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 14:22:15,417][00205] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 14:22:15,419][00205] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 14:22:15,421][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:22:15,423][00205] Adding new argument 'video_name'=None that is not in the saved config file! +[2023-02-24 14:22:15,425][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 14:22:15,427][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 14:22:15,433][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 14:22:15,434][00205] Adding new argument 'hf_repository'='parsasam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 14:22:15,436][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 14:22:15,439][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 14:22:15,441][00205] Adding new argument 'train_script'=None that is not in the saved config file! 
+[2023-02-24 14:22:15,443][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 14:22:15,445][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:22:15,463][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:22:15,466][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:22:15,480][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:22:15,518][00205] Conv encoder output size: 512 +[2023-02-24 14:22:15,520][00205] Policy head output size: 512 +[2023-02-24 14:22:15,540][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:22:15,996][00205] Num frames 100... +[2023-02-24 14:22:16,126][00205] Num frames 200... +[2023-02-24 14:22:16,252][00205] Num frames 300... +[2023-02-24 14:22:16,374][00205] Num frames 400... +[2023-02-24 14:22:16,493][00205] Num frames 500... +[2023-02-24 14:22:16,613][00205] Num frames 600... +[2023-02-24 14:22:16,738][00205] Num frames 700... +[2023-02-24 14:22:16,854][00205] Num frames 800... +[2023-02-24 14:22:16,974][00205] Num frames 900... +[2023-02-24 14:22:17,099][00205] Num frames 1000... +[2023-02-24 14:22:17,235][00205] Num frames 1100... +[2023-02-24 14:22:17,354][00205] Num frames 1200... +[2023-02-24 14:22:17,484][00205] Num frames 1300... +[2023-02-24 14:22:17,606][00205] Num frames 1400... +[2023-02-24 14:22:17,733][00205] Num frames 1500... +[2023-02-24 14:22:17,857][00205] Num frames 1600... +[2023-02-24 14:22:18,032][00205] Avg episode rewards: #0: 41.980, true rewards: #0: 16.980 +[2023-02-24 14:22:18,034][00205] Avg episode reward: 41.980, avg true_objective: 16.980 +[2023-02-24 14:22:18,040][00205] Num frames 1700... +[2023-02-24 14:22:18,166][00205] Num frames 1800... +[2023-02-24 14:22:18,286][00205] Num frames 1900... +[2023-02-24 14:22:18,408][00205] Num frames 2000... +[2023-02-24 14:22:18,519][00205] Num frames 2100... 
+[2023-02-24 14:22:18,628][00205] Num frames 2200... +[2023-02-24 14:22:18,738][00205] Num frames 2300... +[2023-02-24 14:22:18,856][00205] Num frames 2400... +[2023-02-24 14:22:18,968][00205] Num frames 2500... +[2023-02-24 14:22:19,082][00205] Num frames 2600... +[2023-02-24 14:22:19,202][00205] Num frames 2700... +[2023-02-24 14:22:19,322][00205] Num frames 2800... +[2023-02-24 14:22:19,435][00205] Num frames 2900... +[2023-02-24 14:22:19,549][00205] Num frames 3000... +[2023-02-24 14:22:19,622][00205] Avg episode rewards: #0: 36.555, true rewards: #0: 15.055 +[2023-02-24 14:22:19,624][00205] Avg episode reward: 36.555, avg true_objective: 15.055 +[2023-02-24 14:22:19,732][00205] Num frames 3100... +[2023-02-24 14:22:19,843][00205] Num frames 3200... +[2023-02-24 14:22:19,967][00205] Num frames 3300... +[2023-02-24 14:22:20,082][00205] Num frames 3400... +[2023-02-24 14:22:20,201][00205] Num frames 3500... +[2023-02-24 14:22:20,315][00205] Num frames 3600... +[2023-02-24 14:22:20,430][00205] Num frames 3700... +[2023-02-24 14:22:20,544][00205] Num frames 3800... +[2023-02-24 14:22:20,660][00205] Num frames 3900... +[2023-02-24 14:22:20,775][00205] Num frames 4000... +[2023-02-24 14:22:20,890][00205] Num frames 4100... +[2023-02-24 14:22:21,004][00205] Num frames 4200... +[2023-02-24 14:22:21,134][00205] Avg episode rewards: #0: 34.890, true rewards: #0: 14.223 +[2023-02-24 14:22:21,135][00205] Avg episode reward: 34.890, avg true_objective: 14.223 +[2023-02-24 14:22:21,192][00205] Num frames 4300... +[2023-02-24 14:22:21,310][00205] Num frames 4400... +[2023-02-24 14:22:21,426][00205] Num frames 4500... +[2023-02-24 14:22:21,542][00205] Num frames 4600... +[2023-02-24 14:22:21,666][00205] Num frames 4700... +[2023-02-24 14:22:21,782][00205] Num frames 4800... +[2023-02-24 14:22:21,908][00205] Num frames 4900... +[2023-02-24 14:22:22,022][00205] Num frames 5000... +[2023-02-24 14:22:22,135][00205] Num frames 5100... 
+[2023-02-24 14:22:22,260][00205] Num frames 5200... +[2023-02-24 14:22:22,373][00205] Num frames 5300... +[2023-02-24 14:22:22,489][00205] Num frames 5400... +[2023-02-24 14:22:22,603][00205] Num frames 5500... +[2023-02-24 14:22:22,720][00205] Num frames 5600... +[2023-02-24 14:22:22,835][00205] Num frames 5700... +[2023-02-24 14:22:22,942][00205] Avg episode rewards: #0: 36.850, true rewards: #0: 14.350 +[2023-02-24 14:22:22,944][00205] Avg episode reward: 36.850, avg true_objective: 14.350 +[2023-02-24 14:22:23,016][00205] Num frames 5800... +[2023-02-24 14:22:23,132][00205] Num frames 5900... +[2023-02-24 14:22:23,254][00205] Num frames 6000... +[2023-02-24 14:22:23,366][00205] Num frames 6100... +[2023-02-24 14:22:23,511][00205] Avg episode rewards: #0: 30.960, true rewards: #0: 12.360 +[2023-02-24 14:22:23,513][00205] Avg episode reward: 30.960, avg true_objective: 12.360 +[2023-02-24 14:22:23,542][00205] Num frames 6200... +[2023-02-24 14:22:23,655][00205] Num frames 6300... +[2023-02-24 14:22:23,776][00205] Num frames 6400... +[2023-02-24 14:22:23,901][00205] Num frames 6500... +[2023-02-24 14:22:24,018][00205] Num frames 6600... +[2023-02-24 14:22:24,132][00205] Num frames 6700... +[2023-02-24 14:22:24,255][00205] Num frames 6800... +[2023-02-24 14:22:24,376][00205] Num frames 6900... +[2023-02-24 14:22:24,490][00205] Num frames 7000... +[2023-02-24 14:22:24,610][00205] Num frames 7100... +[2023-02-24 14:22:24,737][00205] Num frames 7200... +[2023-02-24 14:22:24,852][00205] Num frames 7300... +[2023-02-24 14:22:24,970][00205] Num frames 7400... +[2023-02-24 14:22:25,086][00205] Num frames 7500... +[2023-02-24 14:22:25,244][00205] Num frames 7600... +[2023-02-24 14:22:25,425][00205] Num frames 7700... +[2023-02-24 14:22:25,592][00205] Num frames 7800... +[2023-02-24 14:22:25,756][00205] Num frames 7900... +[2023-02-24 14:22:25,919][00205] Num frames 8000... 
+[2023-02-24 14:22:26,008][00205] Avg episode rewards: #0: 33.361, true rewards: #0: 13.362 +[2023-02-24 14:22:26,015][00205] Avg episode reward: 33.361, avg true_objective: 13.362 +[2023-02-24 14:22:26,151][00205] Num frames 8100... +[2023-02-24 14:22:26,324][00205] Num frames 8200... +[2023-02-24 14:22:26,489][00205] Num frames 8300... +[2023-02-24 14:22:26,641][00205] Num frames 8400... +[2023-02-24 14:22:26,803][00205] Num frames 8500... +[2023-02-24 14:22:26,968][00205] Num frames 8600... +[2023-02-24 14:22:27,120][00205] Avg episode rewards: #0: 30.653, true rewards: #0: 12.367 +[2023-02-24 14:22:27,123][00205] Avg episode reward: 30.653, avg true_objective: 12.367 +[2023-02-24 14:22:27,202][00205] Num frames 8700... +[2023-02-24 14:22:27,379][00205] Num frames 8800... +[2023-02-24 14:22:27,549][00205] Num frames 8900... +[2023-02-24 14:22:27,718][00205] Num frames 9000... +[2023-02-24 14:22:27,893][00205] Num frames 9100... +[2023-02-24 14:22:28,060][00205] Num frames 9200... +[2023-02-24 14:22:28,227][00205] Num frames 9300... +[2023-02-24 14:22:28,398][00205] Num frames 9400... +[2023-02-24 14:22:28,576][00205] Num frames 9500... +[2023-02-24 14:22:28,741][00205] Num frames 9600... +[2023-02-24 14:22:28,860][00205] Num frames 9700... +[2023-02-24 14:22:28,987][00205] Num frames 9800... +[2023-02-24 14:22:29,095][00205] Avg episode rewards: #0: 30.429, true rewards: #0: 12.304 +[2023-02-24 14:22:29,096][00205] Avg episode reward: 30.429, avg true_objective: 12.304 +[2023-02-24 14:22:29,167][00205] Num frames 9900... +[2023-02-24 14:22:29,289][00205] Num frames 10000... +[2023-02-24 14:22:29,402][00205] Num frames 10100... +[2023-02-24 14:22:29,522][00205] Num frames 10200... +[2023-02-24 14:22:29,639][00205] Num frames 10300... +[2023-02-24 14:22:29,752][00205] Num frames 10400... +[2023-02-24 14:22:29,869][00205] Num frames 10500... +[2023-02-24 14:22:29,982][00205] Num frames 10600... 
+[2023-02-24 14:22:30,093][00205] Avg episode rewards: #0: 28.937, true rewards: #0: 11.826 +[2023-02-24 14:22:30,096][00205] Avg episode reward: 28.937, avg true_objective: 11.826 +[2023-02-24 14:22:30,163][00205] Num frames 10700... +[2023-02-24 14:22:30,280][00205] Num frames 10800... +[2023-02-24 14:22:30,399][00205] Num frames 10900... +[2023-02-24 14:22:30,527][00205] Num frames 11000... +[2023-02-24 14:22:30,640][00205] Num frames 11100... +[2023-02-24 14:22:30,756][00205] Num frames 11200... +[2023-02-24 14:22:30,874][00205] Num frames 11300... +[2023-02-24 14:22:30,988][00205] Num frames 11400... +[2023-02-24 14:22:31,105][00205] Num frames 11500... +[2023-02-24 14:22:31,218][00205] Num frames 11600... +[2023-02-24 14:22:31,341][00205] Num frames 11700... +[2023-02-24 14:22:31,461][00205] Num frames 11800... +[2023-02-24 14:22:31,580][00205] Num frames 11900... +[2023-02-24 14:22:31,701][00205] Num frames 12000... +[2023-02-24 14:22:31,781][00205] Avg episode rewards: #0: 29.819, true rewards: #0: 12.019 +[2023-02-24 14:22:31,782][00205] Avg episode reward: 29.819, avg true_objective: 12.019 +[2023-02-24 14:23:45,181][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! +[2023-02-24 14:28:54,610][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json +[2023-02-24 14:28:54,613][00205] Overriding arg 'num_workers' with value 1 passed from command line +[2023-02-24 14:28:54,616][00205] Adding new argument 'no_render'=True that is not in the saved config file! +[2023-02-24 14:28:54,619][00205] Adding new argument 'save_video'=True that is not in the saved config file! +[2023-02-24 14:28:54,621][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! +[2023-02-24 14:28:54,624][00205] Adding new argument 'video_name'=None that is not in the saved config file! 
+[2023-02-24 14:28:54,625][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! +[2023-02-24 14:28:54,626][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! +[2023-02-24 14:28:54,628][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! +[2023-02-24 14:28:54,629][00205] Adding new argument 'hf_repository'='parsasam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! +[2023-02-24 14:28:54,630][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! +[2023-02-24 14:28:54,632][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! +[2023-02-24 14:28:54,633][00205] Adding new argument 'train_script'=None that is not in the saved config file! +[2023-02-24 14:28:54,635][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! +[2023-02-24 14:28:54,636][00205] Using frameskip 1 and render_action_repeat=4 for evaluation +[2023-02-24 14:28:54,667][00205] RunningMeanStd input shape: (3, 72, 128) +[2023-02-24 14:28:54,669][00205] RunningMeanStd input shape: (1,) +[2023-02-24 14:28:54,686][00205] ConvEncoder: input_channels=3 +[2023-02-24 14:28:54,746][00205] Conv encoder output size: 512 +[2023-02-24 14:28:54,747][00205] Policy head output size: 512 +[2023-02-24 14:28:54,769][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... +[2023-02-24 14:28:55,225][00205] Num frames 100... +[2023-02-24 14:28:55,343][00205] Num frames 200... +[2023-02-24 14:28:55,454][00205] Num frames 300... +[2023-02-24 14:28:55,583][00205] Num frames 400... +[2023-02-24 14:28:55,694][00205] Num frames 500... +[2023-02-24 14:28:55,818][00205] Num frames 600... +[2023-02-24 14:28:55,945][00205] Num frames 700... +[2023-02-24 14:28:56,066][00205] Num frames 800... 
+[2023-02-24 14:28:56,179][00205] Num frames 900... +[2023-02-24 14:28:56,290][00205] Num frames 1000... +[2023-02-24 14:28:56,401][00205] Num frames 1100... +[2023-02-24 14:28:56,516][00205] Num frames 1200... +[2023-02-24 14:28:56,638][00205] Num frames 1300... +[2023-02-24 14:28:56,749][00205] Num frames 1400... +[2023-02-24 14:28:56,861][00205] Num frames 1500... +[2023-02-24 14:28:56,973][00205] Num frames 1600... +[2023-02-24 14:28:57,087][00205] Num frames 1700... +[2023-02-24 14:28:57,207][00205] Num frames 1800... +[2023-02-24 14:28:57,332][00205] Num frames 1900... +[2023-02-24 14:28:57,450][00205] Num frames 2000... +[2023-02-24 14:28:57,575][00205] Num frames 2100... +[2023-02-24 14:28:57,628][00205] Avg episode rewards: #0: 64.999, true rewards: #0: 21.000 +[2023-02-24 14:28:57,629][00205] Avg episode reward: 64.999, avg true_objective: 21.000 +[2023-02-24 14:28:57,748][00205] Num frames 2200... +[2023-02-24 14:28:57,873][00205] Num frames 2300... +[2023-02-24 14:28:57,988][00205] Num frames 2400... +[2023-02-24 14:28:58,102][00205] Num frames 2500... +[2023-02-24 14:28:58,215][00205] Num frames 2600... +[2023-02-24 14:28:58,339][00205] Num frames 2700... +[2023-02-24 14:28:58,493][00205] Num frames 2800... +[2023-02-24 14:28:58,617][00205] Num frames 2900... +[2023-02-24 14:28:58,732][00205] Num frames 3000... +[2023-02-24 14:28:58,851][00205] Num frames 3100... +[2023-02-24 14:28:58,964][00205] Num frames 3200... +[2023-02-24 14:28:59,134][00205] Num frames 3300... +[2023-02-24 14:28:59,299][00205] Num frames 3400... +[2023-02-24 14:28:59,421][00205] Avg episode rewards: #0: 49.200, true rewards: #0: 17.200 +[2023-02-24 14:28:59,427][00205] Avg episode reward: 49.200, avg true_objective: 17.200 +[2023-02-24 14:28:59,527][00205] Num frames 3500... +[2023-02-24 14:28:59,691][00205] Num frames 3600... +[2023-02-24 14:28:59,846][00205] Num frames 3700... +[2023-02-24 14:29:00,000][00205] Num frames 3800... 
+[2023-02-24 14:29:00,169][00205] Num frames 3900... +[2023-02-24 14:29:00,327][00205] Num frames 4000... +[2023-02-24 14:29:00,485][00205] Num frames 4100... +[2023-02-24 14:29:00,654][00205] Num frames 4200... +[2023-02-24 14:29:00,749][00205] Avg episode rewards: #0: 39.740, true rewards: #0: 14.073 +[2023-02-24 14:29:00,751][00205] Avg episode reward: 39.740, avg true_objective: 14.073 +[2023-02-24 14:29:00,877][00205] Num frames 4300... +[2023-02-24 14:29:01,032][00205] Num frames 4400... +[2023-02-24 14:29:01,190][00205] Num frames 4500... +[2023-02-24 14:29:01,356][00205] Num frames 4600... +[2023-02-24 14:29:01,519][00205] Num frames 4700... +[2023-02-24 14:29:01,687][00205] Num frames 4800... +[2023-02-24 14:29:01,850][00205] Num frames 4900... +[2023-02-24 14:29:02,013][00205] Num frames 5000... +[2023-02-24 14:29:02,209][00205] Avg episode rewards: #0: 33.965, true rewards: #0: 12.715 +[2023-02-24 14:29:02,212][00205] Avg episode reward: 33.965, avg true_objective: 12.715 +[2023-02-24 14:29:02,243][00205] Num frames 5100... +[2023-02-24 14:29:02,413][00205] Num frames 5200... +[2023-02-24 14:29:02,533][00205] Num frames 5300... +[2023-02-24 14:29:02,647][00205] Num frames 5400... +[2023-02-24 14:29:02,766][00205] Num frames 5500... +[2023-02-24 14:29:02,880][00205] Num frames 5600... +[2023-02-24 14:29:02,997][00205] Num frames 5700... +[2023-02-24 14:29:03,113][00205] Num frames 5800... +[2023-02-24 14:29:03,230][00205] Num frames 5900... +[2023-02-24 14:29:03,340][00205] Num frames 6000... +[2023-02-24 14:29:03,451][00205] Num frames 6100... +[2023-02-24 14:29:03,565][00205] Num frames 6200... +[2023-02-24 14:29:03,684][00205] Num frames 6300... +[2023-02-24 14:29:03,805][00205] Num frames 6400... +[2023-02-24 14:29:03,920][00205] Num frames 6500... 
+[2023-02-24 14:29:04,053][00205] Avg episode rewards: #0: 35.338, true rewards: #0: 13.138 +[2023-02-24 14:29:04,055][00205] Avg episode reward: 35.338, avg true_objective: 13.138 +[2023-02-24 14:29:04,093][00205] Num frames 6600... +[2023-02-24 14:29:04,214][00205] Num frames 6700... +[2023-02-24 14:29:04,326][00205] Num frames 6800... +[2023-02-24 14:29:04,449][00205] Num frames 6900... +[2023-02-24 14:29:04,562][00205] Num frames 7000... +[2023-02-24 14:29:04,673][00205] Num frames 7100... +[2023-02-24 14:29:04,798][00205] Num frames 7200... +[2023-02-24 14:29:04,918][00205] Num frames 7300... +[2023-02-24 14:29:04,982][00205] Avg episode rewards: #0: 31.675, true rewards: #0: 12.175 +[2023-02-24 14:29:04,986][00205] Avg episode reward: 31.675, avg true_objective: 12.175 +[2023-02-24 14:29:05,094][00205] Num frames 7400... +[2023-02-24 14:29:05,206][00205] Num frames 7500... +[2023-02-24 14:29:05,326][00205] Num frames 7600... +[2023-02-24 14:29:05,438][00205] Num frames 7700... +[2023-02-24 14:29:05,553][00205] Num frames 7800... +[2023-02-24 14:29:05,669][00205] Num frames 7900... +[2023-02-24 14:29:05,795][00205] Num frames 8000... +[2023-02-24 14:29:05,909][00205] Num frames 8100... +[2023-02-24 14:29:06,023][00205] Num frames 8200... +[2023-02-24 14:29:06,134][00205] Num frames 8300... +[2023-02-24 14:29:06,250][00205] Num frames 8400... +[2023-02-24 14:29:06,361][00205] Num frames 8500... +[2023-02-24 14:29:06,480][00205] Num frames 8600... +[2023-02-24 14:29:06,559][00205] Avg episode rewards: #0: 31.314, true rewards: #0: 12.314 +[2023-02-24 14:29:06,560][00205] Avg episode reward: 31.314, avg true_objective: 12.314 +[2023-02-24 14:29:06,657][00205] Num frames 8700... +[2023-02-24 14:29:06,785][00205] Num frames 8800... +[2023-02-24 14:29:06,903][00205] Num frames 8900... +[2023-02-24 14:29:07,018][00205] Num frames 9000... +[2023-02-24 14:29:07,138][00205] Num frames 9100... +[2023-02-24 14:29:07,248][00205] Num frames 9200... 
+[2023-02-24 14:29:07,363][00205] Num frames 9300... +[2023-02-24 14:29:07,479][00205] Num frames 9400... +[2023-02-24 14:29:07,595][00205] Num frames 9500... +[2023-02-24 14:29:07,707][00205] Num frames 9600... +[2023-02-24 14:29:07,833][00205] Num frames 9700... +[2023-02-24 14:29:07,946][00205] Num frames 9800... +[2023-02-24 14:29:08,060][00205] Num frames 9900... +[2023-02-24 14:29:08,175][00205] Avg episode rewards: #0: 31.062, true rewards: #0: 12.437 +[2023-02-24 14:29:08,178][00205] Avg episode reward: 31.062, avg true_objective: 12.437 +[2023-02-24 14:29:08,238][00205] Num frames 10000... +[2023-02-24 14:29:08,357][00205] Num frames 10100... +[2023-02-24 14:29:08,469][00205] Num frames 10200... +[2023-02-24 14:29:08,584][00205] Num frames 10300... +[2023-02-24 14:29:08,715][00205] Avg episode rewards: #0: 28.296, true rewards: #0: 11.518 +[2023-02-24 14:29:08,717][00205] Avg episode reward: 28.296, avg true_objective: 11.518 +[2023-02-24 14:29:08,760][00205] Num frames 10400... +[2023-02-24 14:29:08,882][00205] Num frames 10500... +[2023-02-24 14:29:09,002][00205] Num frames 10600... +[2023-02-24 14:29:09,118][00205] Num frames 10700... +[2023-02-24 14:29:09,230][00205] Num frames 10800... +[2023-02-24 14:29:09,342][00205] Num frames 10900... +[2023-02-24 14:29:09,458][00205] Num frames 11000... +[2023-02-24 14:29:09,579][00205] Num frames 11100... +[2023-02-24 14:29:09,691][00205] Num frames 11200... +[2023-02-24 14:29:09,802][00205] Num frames 11300... +[2023-02-24 14:29:09,923][00205] Num frames 11400... +[2023-02-24 14:29:10,035][00205] Num frames 11500... +[2023-02-24 14:29:10,152][00205] Num frames 11600... +[2023-02-24 14:29:10,244][00205] Avg episode rewards: #0: 28.832, true rewards: #0: 11.632 +[2023-02-24 14:29:10,246][00205] Avg episode reward: 28.832, avg true_objective: 11.632 +[2023-02-24 14:30:20,364][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4!