[2023-02-24 12:14:52,972][00205] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-24 12:14:52,974][00205] Rollout worker 0 uses device cpu
[2023-02-24 12:14:52,976][00205] Rollout worker 1 uses device cpu
[2023-02-24 12:14:52,979][00205] Rollout worker 2 uses device cpu
[2023-02-24 12:14:52,980][00205] Rollout worker 3 uses device cpu
[2023-02-24 12:14:52,981][00205] Rollout worker 4 uses device cpu
[2023-02-24 12:14:52,983][00205] Rollout worker 5 uses device cpu
[2023-02-24 12:14:52,986][00205] Rollout worker 6 uses device cpu
[2023-02-24 12:14:52,987][00205] Rollout worker 7 uses device cpu
[2023-02-24 12:14:53,198][00205] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 12:14:53,201][00205] InferenceWorker_p0-w0: min num requests: 2
[2023-02-24 12:14:53,241][00205] Starting all processes...
[2023-02-24 12:14:53,243][00205] Starting process learner_proc0
[2023-02-24 12:14:53,333][00205] Starting all processes...
[2023-02-24 12:14:53,345][00205] Starting process inference_proc0-0
[2023-02-24 12:14:53,345][00205] Starting process rollout_proc0
[2023-02-24 12:14:53,345][00205] Starting process rollout_proc1
[2023-02-24 12:14:53,345][00205] Starting process rollout_proc2
[2023-02-24 12:14:53,346][00205] Starting process rollout_proc3
[2023-02-24 12:14:53,346][00205] Starting process rollout_proc4
[2023-02-24 12:14:53,346][00205] Starting process rollout_proc5
[2023-02-24 12:14:53,346][00205] Starting process rollout_proc6
[2023-02-24 12:14:53,346][00205] Starting process rollout_proc7
[2023-02-24 12:15:02,266][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 12:15:02,266][11201] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-24 12:15:02,435][11223] Worker 3 uses CPU cores [1]
[2023-02-24 12:15:02,459][11227] Worker 4 uses CPU cores [0]
[2023-02-24 12:15:02,579][11222] Worker 2 uses CPU cores [0]
[2023-02-24 12:15:02,669][11224] Worker 5 uses CPU cores [1]
[2023-02-24 12:15:02,680][11215] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 12:15:02,680][11215] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-24 12:15:02,710][11216] Worker 0 uses CPU cores [0]
[2023-02-24 12:15:02,785][11221] Worker 1 uses CPU cores [1]
[2023-02-24 12:15:02,791][11226] Worker 7 uses CPU cores [1]
[2023-02-24 12:15:02,922][11225] Worker 6 uses CPU cores [0]
[2023-02-24 12:15:03,232][11215] Num visible devices: 1
[2023-02-24 12:15:03,232][11201] Num visible devices: 1
[2023-02-24 12:15:03,238][11201] Starting seed is not provided
[2023-02-24 12:15:03,238][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 12:15:03,238][11201] Initializing actor-critic model on device cuda:0
[2023-02-24 12:15:03,239][11201] RunningMeanStd input shape: (3, 72, 128)
[2023-02-24 12:15:03,240][11201] RunningMeanStd input shape: (1,)
[2023-02-24 12:15:03,253][11201] ConvEncoder: input_channels=3
[2023-02-24 12:15:03,567][11201] Conv encoder output size: 512
[2023-02-24 12:15:03,568][11201] Policy head output size: 512
[2023-02-24 12:15:03,624][11201] Created Actor Critic model with architecture:
[2023-02-24 12:15:03,625][11201] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-24 12:15:10,546][11201] Using optimizer
[2023-02-24 12:15:10,547][11201] No checkpoints found
[2023-02-24 12:15:10,547][11201] Did not load from checkpoint, starting from scratch!
[2023-02-24 12:15:10,547][11201] Initialized policy 0 weights for model version 0
[2023-02-24 12:15:10,551][11201] LearnerWorker_p0 finished initialization!
[2023-02-24 12:15:10,551][11201] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 12:15:10,778][11215] RunningMeanStd input shape: (3, 72, 128)
[2023-02-24 12:15:10,779][11215] RunningMeanStd input shape: (1,)
[2023-02-24 12:15:10,792][11215] ConvEncoder: input_channels=3
[2023-02-24 12:15:10,892][11215] Conv encoder output size: 512
[2023-02-24 12:15:10,892][11215] Policy head output size: 512
[2023-02-24 12:15:12,870][00205] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 12:15:13,135][00205] Inference worker 0-0 is ready!
[2023-02-24 12:15:13,137][00205] All inference workers are ready! Signal rollout workers to start!
[2023-02-24 12:15:13,189][00205] Heartbeat connected on Batcher_0
[2023-02-24 12:15:13,193][00205] Heartbeat connected on LearnerWorker_p0
[2023-02-24 12:15:13,242][00205] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-24 12:15:13,280][11226] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,277][11227] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,288][11222] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,307][11224] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,311][11216] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,327][11223] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,328][11221] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:13,325][11225] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 12:15:14,505][11223] Decorrelating experience for 0 frames...
[2023-02-24 12:15:14,502][11216] Decorrelating experience for 0 frames...
[2023-02-24 12:15:14,504][11224] Decorrelating experience for 0 frames...
[2023-02-24 12:15:14,503][11227] Decorrelating experience for 0 frames...
[2023-02-24 12:15:14,506][11226] Decorrelating experience for 0 frames...
[2023-02-24 12:15:14,505][11225] Decorrelating experience for 0 frames...
[2023-02-24 12:15:15,534][11221] Decorrelating experience for 0 frames...
[2023-02-24 12:15:15,544][11225] Decorrelating experience for 32 frames...
[2023-02-24 12:15:15,551][11226] Decorrelating experience for 32 frames...
[2023-02-24 12:15:15,550][11216] Decorrelating experience for 32 frames...
[2023-02-24 12:15:15,555][11224] Decorrelating experience for 32 frames...
[2023-02-24 12:15:15,553][11227] Decorrelating experience for 32 frames...
[2023-02-24 12:15:16,392][11223] Decorrelating experience for 32 frames...
[2023-02-24 12:15:16,405][11222] Decorrelating experience for 0 frames...
[2023-02-24 12:15:16,494][11216] Decorrelating experience for 64 frames...
[2023-02-24 12:15:16,504][11224] Decorrelating experience for 64 frames...
[2023-02-24 12:15:17,196][11222] Decorrelating experience for 32 frames...
[2023-02-24 12:15:17,361][11223] Decorrelating experience for 64 frames...
[2023-02-24 12:15:17,441][11224] Decorrelating experience for 96 frames...
[2023-02-24 12:15:17,581][11216] Decorrelating experience for 96 frames...
[2023-02-24 12:15:17,619][00205] Heartbeat connected on RolloutWorker_w5
[2023-02-24 12:15:17,800][00205] Heartbeat connected on RolloutWorker_w0
[2023-02-24 12:15:17,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 12:15:18,286][11227] Decorrelating experience for 64 frames...
[2023-02-24 12:15:18,426][11222] Decorrelating experience for 64 frames...
[2023-02-24 12:15:18,910][11227] Decorrelating experience for 96 frames...
[2023-02-24 12:15:19,117][00205] Heartbeat connected on RolloutWorker_w4
[2023-02-24 12:15:19,278][11223] Decorrelating experience for 96 frames...
[2023-02-24 12:15:19,558][11222] Decorrelating experience for 96 frames...
[2023-02-24 12:15:19,660][00205] Heartbeat connected on RolloutWorker_w3
[2023-02-24 12:15:19,678][00205] Heartbeat connected on RolloutWorker_w2
[2023-02-24 12:15:19,943][11226] Decorrelating experience for 64 frames...
[2023-02-24 12:15:21,076][11221] Decorrelating experience for 32 frames...
[2023-02-24 12:15:21,171][11226] Decorrelating experience for 96 frames...
[2023-02-24 12:15:21,501][00205] Heartbeat connected on RolloutWorker_w7
[2023-02-24 12:15:21,637][11225] Decorrelating experience for 64 frames...
[2023-02-24 12:15:22,201][11225] Decorrelating experience for 96 frames...
[2023-02-24 12:15:22,366][00205] Heartbeat connected on RolloutWorker_w6
[2023-02-24 12:15:22,492][11221] Decorrelating experience for 64 frames...
[2023-02-24 12:15:22,876][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 3.6. Samples: 36. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-24 12:15:24,394][11221] Decorrelating experience for 96 frames... [2023-02-24 12:15:25,075][00205] Heartbeat connected on RolloutWorker_w1 [2023-02-24 12:15:27,251][11201] Signal inference workers to stop experience collection... [2023-02-24 12:15:27,262][11215] InferenceWorker_p0-w0: stopping experience collection [2023-02-24 12:15:27,870][00205] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 108.0. Samples: 1620. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-24 12:15:27,872][00205] Avg episode reward: [(0, '2.148')] [2023-02-24 12:15:29,651][11201] Signal inference workers to resume experience collection... [2023-02-24 12:15:29,653][11215] InferenceWorker_p0-w0: resuming experience collection [2023-02-24 12:15:32,870][00205] Fps is (10 sec: 1639.3, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 191.1. Samples: 3822. Policy #0 lag: (min: 0.0, avg: 1.5, max: 3.0) [2023-02-24 12:15:32,876][00205] Avg episode reward: [(0, '3.277')] [2023-02-24 12:15:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 1474.6, 300 sec: 1474.6). Total num frames: 36864. Throughput: 0: 408.2. Samples: 10204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:15:37,877][00205] Avg episode reward: [(0, '3.970')] [2023-02-24 12:15:37,944][11215] Updated weights for policy 0, policy_version 10 (0.0017) [2023-02-24 12:15:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 1774.9, 300 sec: 1774.9). Total num frames: 53248. Throughput: 0: 414.3. Samples: 12428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:15:42,879][00205] Avg episode reward: [(0, '4.276')] [2023-02-24 12:15:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 467.9. Samples: 16378. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:15:47,872][00205] Avg episode reward: [(0, '4.387')] [2023-02-24 12:15:50,151][11215] Updated weights for policy 0, policy_version 20 (0.0018) [2023-02-24 12:15:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 90112. Throughput: 0: 582.8. Samples: 23310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:15:52,877][00205] Avg episode reward: [(0, '4.311')] [2023-02-24 12:15:57,870][00205] Fps is (10 sec: 4505.4, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 597.3. Samples: 26878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:15:57,876][00205] Avg episode reward: [(0, '4.498')] [2023-02-24 12:15:57,885][11201] Saving new best policy, reward=4.498! [2023-02-24 12:16:00,589][11215] Updated weights for policy 0, policy_version 30 (0.0021) [2023-02-24 12:16:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 2539.5, 300 sec: 2539.5). Total num frames: 126976. Throughput: 0: 702.6. Samples: 31616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:16:02,876][00205] Avg episode reward: [(0, '4.507')] [2023-02-24 12:16:02,879][11201] Saving new best policy, reward=4.507! [2023-02-24 12:16:07,870][00205] Fps is (10 sec: 3277.0, 60 sec: 2681.0, 300 sec: 2681.0). Total num frames: 147456. Throughput: 0: 814.0. Samples: 36662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:16:07,878][00205] Avg episode reward: [(0, '4.360')] [2023-02-24 12:16:11,386][11215] Updated weights for policy 0, policy_version 40 (0.0012) [2023-02-24 12:16:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 2798.9, 300 sec: 2798.9). Total num frames: 167936. Throughput: 0: 856.4. Samples: 40160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:16:12,873][00205] Avg episode reward: [(0, '4.326')] [2023-02-24 12:16:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 2898.7). Total num frames: 188416. 
Throughput: 0: 957.8. Samples: 46924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:16:17,872][00205] Avg episode reward: [(0, '4.480')] [2023-02-24 12:16:22,602][11215] Updated weights for policy 0, policy_version 50 (0.0019) [2023-02-24 12:16:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.6, 300 sec: 2925.7). Total num frames: 204800. Throughput: 0: 914.2. Samples: 51344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:16:22,875][00205] Avg episode reward: [(0, '4.418')] [2023-02-24 12:16:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3003.7). Total num frames: 225280. Throughput: 0: 916.9. Samples: 53690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:16:27,873][00205] Avg episode reward: [(0, '4.363')] [2023-02-24 12:16:32,346][11215] Updated weights for policy 0, policy_version 60 (0.0012) [2023-02-24 12:16:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3072.0). Total num frames: 245760. Throughput: 0: 982.3. Samples: 60580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:16:32,873][00205] Avg episode reward: [(0, '4.446')] [2023-02-24 12:16:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3132.1). Total num frames: 266240. Throughput: 0: 968.6. Samples: 66900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:16:37,875][00205] Avg episode reward: [(0, '4.487')] [2023-02-24 12:16:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3094.8). Total num frames: 278528. Throughput: 0: 939.7. Samples: 69166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:16:42,879][00205] Avg episode reward: [(0, '4.349')] [2023-02-24 12:16:44,177][11215] Updated weights for policy 0, policy_version 70 (0.0048) [2023-02-24 12:16:47,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3822.9, 300 sec: 3147.4). Total num frames: 299008. Throughput: 0: 945.0. Samples: 74142. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:16:47,875][00205] Avg episode reward: [(0, '4.269')] [2023-02-24 12:16:47,972][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth... [2023-02-24 12:16:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3235.8). Total num frames: 323584. Throughput: 0: 985.9. Samples: 81028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:16:52,878][00205] Avg episode reward: [(0, '4.319')] [2023-02-24 12:16:53,454][11215] Updated weights for policy 0, policy_version 80 (0.0018) [2023-02-24 12:16:57,874][00205] Fps is (10 sec: 4094.3, 60 sec: 3754.4, 300 sec: 3237.7). Total num frames: 339968. Throughput: 0: 988.0. Samples: 84624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:16:57,883][00205] Avg episode reward: [(0, '4.369')] [2023-02-24 12:17:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3239.6). Total num frames: 356352. Throughput: 0: 939.6. Samples: 89206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:17:02,874][00205] Avg episode reward: [(0, '4.496')] [2023-02-24 12:17:05,500][11215] Updated weights for policy 0, policy_version 90 (0.0025) [2023-02-24 12:17:07,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 376832. Throughput: 0: 962.2. Samples: 94642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:17:07,875][00205] Avg episode reward: [(0, '4.494')] [2023-02-24 12:17:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3345.1). Total num frames: 401408. Throughput: 0: 989.4. Samples: 98212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:17:12,873][00205] Avg episode reward: [(0, '4.592')] [2023-02-24 12:17:12,880][11201] Saving new best policy, reward=4.592! 
[2023-02-24 12:17:14,319][11215] Updated weights for policy 0, policy_version 100 (0.0029) [2023-02-24 12:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3342.3). Total num frames: 417792. Throughput: 0: 981.1. Samples: 104730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:17:17,872][00205] Avg episode reward: [(0, '4.498')] [2023-02-24 12:17:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3339.8). Total num frames: 434176. Throughput: 0: 934.1. Samples: 108934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:17:22,873][00205] Avg episode reward: [(0, '4.362')] [2023-02-24 12:17:27,018][11215] Updated weights for policy 0, policy_version 110 (0.0029) [2023-02-24 12:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3367.8). Total num frames: 454656. Throughput: 0: 934.0. Samples: 111194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:17:27,872][00205] Avg episode reward: [(0, '4.453')] [2023-02-24 12:17:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3393.8). Total num frames: 475136. Throughput: 0: 978.1. Samples: 118156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:17:32,878][00205] Avg episode reward: [(0, '4.647')] [2023-02-24 12:17:32,884][11201] Saving new best policy, reward=4.647! [2023-02-24 12:17:36,281][11215] Updated weights for policy 0, policy_version 120 (0.0015) [2023-02-24 12:17:37,875][00205] Fps is (10 sec: 4094.1, 60 sec: 3822.8, 300 sec: 3417.9). Total num frames: 495616. Throughput: 0: 958.5. Samples: 124166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:17:37,878][00205] Avg episode reward: [(0, '4.540')] [2023-02-24 12:17:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 924.4. Samples: 126220. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:17:42,873][00205] Avg episode reward: [(0, '4.494')] [2023-02-24 12:17:47,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3382.5). Total num frames: 524288. Throughput: 0: 924.0. Samples: 130784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:17:47,873][00205] Avg episode reward: [(0, '4.480')] [2023-02-24 12:17:48,958][11215] Updated weights for policy 0, policy_version 130 (0.0029) [2023-02-24 12:17:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3430.4). Total num frames: 548864. Throughput: 0: 948.7. Samples: 137334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:17:52,877][00205] Avg episode reward: [(0, '4.535')] [2023-02-24 12:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3425.7). Total num frames: 565248. Throughput: 0: 944.3. Samples: 140704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:17:57,875][00205] Avg episode reward: [(0, '4.569')] [2023-02-24 12:17:59,506][11215] Updated weights for policy 0, policy_version 140 (0.0025) [2023-02-24 12:18:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 893.5. Samples: 144938. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:02,872][00205] Avg episode reward: [(0, '4.381')] [2023-02-24 12:18:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3417.2). Total num frames: 598016. Throughput: 0: 900.6. Samples: 149460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:18:07,872][00205] Avg episode reward: [(0, '4.500')] [2023-02-24 12:18:11,604][11215] Updated weights for policy 0, policy_version 150 (0.0025) [2023-02-24 12:18:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3436.1). Total num frames: 618496. Throughput: 0: 919.8. Samples: 152584. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:18:12,873][00205] Avg episode reward: [(0, '4.487')] [2023-02-24 12:18:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3431.8). Total num frames: 634880. Throughput: 0: 904.4. Samples: 158856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:17,873][00205] Avg episode reward: [(0, '4.541')] [2023-02-24 12:18:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3406.1). Total num frames: 647168. Throughput: 0: 862.2. Samples: 162960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:18:22,874][00205] Avg episode reward: [(0, '4.610')] [2023-02-24 12:18:24,292][11215] Updated weights for policy 0, policy_version 160 (0.0015) [2023-02-24 12:18:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3423.8). Total num frames: 667648. Throughput: 0: 863.5. Samples: 165078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:18:27,873][00205] Avg episode reward: [(0, '4.621')] [2023-02-24 12:18:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3440.6). Total num frames: 688128. Throughput: 0: 904.7. Samples: 171496. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:32,873][00205] Avg episode reward: [(0, '4.513')] [2023-02-24 12:18:34,056][11215] Updated weights for policy 0, policy_version 170 (0.0012) [2023-02-24 12:18:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.1, 300 sec: 3456.6). Total num frames: 708608. Throughput: 0: 891.1. Samples: 177432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:37,873][00205] Avg episode reward: [(0, '4.413')] [2023-02-24 12:18:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3432.8). Total num frames: 720896. Throughput: 0: 862.0. Samples: 179496. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 12:18:42,874][00205] Avg episode reward: [(0, '4.574')] [2023-02-24 12:18:47,477][11215] Updated weights for policy 0, policy_version 180 (0.0025) [2023-02-24 12:18:47,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3429.2). Total num frames: 737280. Throughput: 0: 854.2. Samples: 183378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:47,876][00205] Avg episode reward: [(0, '4.684')] [2023-02-24 12:18:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth... [2023-02-24 12:18:48,003][11201] Saving new best policy, reward=4.684! [2023-02-24 12:18:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3444.4). Total num frames: 757760. Throughput: 0: 887.9. Samples: 189418. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:18:52,876][00205] Avg episode reward: [(0, '4.764')] [2023-02-24 12:18:52,881][11201] Saving new best policy, reward=4.764! [2023-02-24 12:18:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3440.6). Total num frames: 774144. Throughput: 0: 890.4. Samples: 192650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:18:57,872][00205] Avg episode reward: [(0, '4.736')] [2023-02-24 12:18:58,010][11215] Updated weights for policy 0, policy_version 190 (0.0013) [2023-02-24 12:19:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3437.1). Total num frames: 790528. Throughput: 0: 854.0. Samples: 197286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:19:02,875][00205] Avg episode reward: [(0, '4.664')] [2023-02-24 12:19:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3433.7). Total num frames: 806912. Throughput: 0: 866.8. Samples: 201966. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:19:07,873][00205] Avg episode reward: [(0, '4.628')] [2023-02-24 12:19:10,101][11215] Updated weights for policy 0, policy_version 200 (0.0025) [2023-02-24 12:19:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3447.5). Total num frames: 827392. Throughput: 0: 894.9. Samples: 205346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:19:12,878][00205] Avg episode reward: [(0, '4.605')] [2023-02-24 12:19:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3460.7). Total num frames: 847872. Throughput: 0: 899.6. Samples: 211980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:19:17,874][00205] Avg episode reward: [(0, '4.689')] [2023-02-24 12:19:21,222][11215] Updated weights for policy 0, policy_version 210 (0.0014) [2023-02-24 12:19:22,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3457.0). Total num frames: 864256. Throughput: 0: 862.4. Samples: 216242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:19:22,877][00205] Avg episode reward: [(0, '4.659')] [2023-02-24 12:19:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3453.5). Total num frames: 880640. Throughput: 0: 865.1. Samples: 218426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:19:27,872][00205] Avg episode reward: [(0, '4.987')] [2023-02-24 12:19:27,885][11201] Saving new best policy, reward=4.987! [2023-02-24 12:19:32,149][11215] Updated weights for policy 0, policy_version 220 (0.0026) [2023-02-24 12:19:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3465.8). Total num frames: 901120. Throughput: 0: 920.1. Samples: 224780. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:19:32,873][00205] Avg episode reward: [(0, '4.751')] [2023-02-24 12:19:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3477.7). Total num frames: 921600. Throughput: 0: 923.4. Samples: 230972. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:19:37,872][00205] Avg episode reward: [(0, '4.536')] [2023-02-24 12:19:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3474.0). Total num frames: 937984. Throughput: 0: 898.8. Samples: 233098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:19:42,874][00205] Avg episode reward: [(0, '4.607')] [2023-02-24 12:19:44,170][11215] Updated weights for policy 0, policy_version 230 (0.0030) [2023-02-24 12:19:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.2, 300 sec: 3470.4). Total num frames: 954368. Throughput: 0: 895.2. Samples: 237572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:19:47,875][00205] Avg episode reward: [(0, '4.886')] [2023-02-24 12:19:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3481.6). Total num frames: 974848. Throughput: 0: 937.3. Samples: 244142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:19:52,873][00205] Avg episode reward: [(0, '4.803')] [2023-02-24 12:19:54,244][11215] Updated weights for policy 0, policy_version 240 (0.0026) [2023-02-24 12:19:57,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3686.2, 300 sec: 3492.3). Total num frames: 995328. Throughput: 0: 932.5. Samples: 247310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:19:57,875][00205] Avg episode reward: [(0, '4.768')] [2023-02-24 12:20:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3474.5). Total num frames: 1007616. Throughput: 0: 881.9. Samples: 251664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:20:02,877][00205] Avg episode reward: [(0, '4.945')] [2023-02-24 12:20:07,425][11215] Updated weights for policy 0, policy_version 250 (0.0012) [2023-02-24 12:20:07,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.2, 300 sec: 3471.2). Total num frames: 1024000. Throughput: 0: 887.0. Samples: 256156. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:20:07,877][00205] Avg episode reward: [(0, '5.081')] [2023-02-24 12:20:07,888][11201] Saving new best policy, reward=5.081! [2023-02-24 12:20:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1044480. Throughput: 0: 908.4. Samples: 259306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:20:12,874][00205] Avg episode reward: [(0, '5.107')] [2023-02-24 12:20:12,880][11201] Saving new best policy, reward=5.107! [2023-02-24 12:20:17,756][11215] Updated weights for policy 0, policy_version 260 (0.0012) [2023-02-24 12:20:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.1). Total num frames: 1064960. Throughput: 0: 907.4. Samples: 265614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:20:17,875][00205] Avg episode reward: [(0, '5.047')] [2023-02-24 12:20:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1077248. Throughput: 0: 858.2. Samples: 269590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:20:22,873][00205] Avg episode reward: [(0, '5.014')] [2023-02-24 12:20:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1093632. Throughput: 0: 853.6. Samples: 271510. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:20:27,877][00205] Avg episode reward: [(0, '5.123')] [2023-02-24 12:20:27,887][11201] Saving new best policy, reward=5.123! [2023-02-24 12:20:30,782][11215] Updated weights for policy 0, policy_version 270 (0.0030) [2023-02-24 12:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1114112. Throughput: 0: 887.3. Samples: 277500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:20:32,872][00205] Avg episode reward: [(0, '5.457')] [2023-02-24 12:20:32,877][11201] Saving new best policy, reward=5.457! 
[2023-02-24 12:20:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 1130496. Throughput: 0: 872.3. Samples: 283396. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:20:37,876][00205] Avg episode reward: [(0, '5.337')] [2023-02-24 12:20:42,877][00205] Fps is (10 sec: 2865.1, 60 sec: 3412.9, 300 sec: 3637.7). Total num frames: 1142784. Throughput: 0: 846.3. Samples: 285396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:20:42,881][00205] Avg episode reward: [(0, '5.401')] [2023-02-24 12:20:42,915][11215] Updated weights for policy 0, policy_version 280 (0.0013) [2023-02-24 12:20:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 1159168. Throughput: 0: 840.1. Samples: 289468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:20:47,873][00205] Avg episode reward: [(0, '5.472')] [2023-02-24 12:20:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth... [2023-02-24 12:20:47,995][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth [2023-02-24 12:20:48,011][11201] Saving new best policy, reward=5.472! [2023-02-24 12:20:52,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3481.6, 300 sec: 3623.9). Total num frames: 1183744. Throughput: 0: 878.3. Samples: 295680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:20:52,876][00205] Avg episode reward: [(0, '5.501')] [2023-02-24 12:20:52,880][11201] Saving new best policy, reward=5.501! [2023-02-24 12:20:53,934][11215] Updated weights for policy 0, policy_version 290 (0.0015) [2023-02-24 12:20:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.5, 300 sec: 3637.8). Total num frames: 1200128. Throughput: 0: 875.2. Samples: 298692. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:20:57,879][00205] Avg episode reward: [(0, '5.351')] [2023-02-24 12:21:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1212416. Throughput: 0: 833.4. Samples: 303116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:02,874][00205] Avg episode reward: [(0, '5.385')] [2023-02-24 12:21:07,198][11215] Updated weights for policy 0, policy_version 300 (0.0012) [2023-02-24 12:21:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3596.2). Total num frames: 1228800. Throughput: 0: 845.3. Samples: 307628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:21:07,872][00205] Avg episode reward: [(0, '5.232')] [2023-02-24 12:21:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3596.1). Total num frames: 1249280. Throughput: 0: 872.5. Samples: 310772. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:21:12,875][00205] Avg episode reward: [(0, '5.181')] [2023-02-24 12:21:17,493][11215] Updated weights for policy 0, policy_version 310 (0.0016) [2023-02-24 12:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 1269760. Throughput: 0: 880.0. Samples: 317100. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:17,873][00205] Avg episode reward: [(0, '5.654')] [2023-02-24 12:21:17,890][11201] Saving new best policy, reward=5.654! [2023-02-24 12:21:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3582.3). Total num frames: 1282048. Throughput: 0: 836.2. Samples: 321026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:21:22,872][00205] Avg episode reward: [(0, '5.684')] [2023-02-24 12:21:22,875][11201] Saving new best policy, reward=5.684! [2023-02-24 12:21:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1298432. Throughput: 0: 836.8. Samples: 323048. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:27,876][00205] Avg episode reward: [(0, '5.906')] [2023-02-24 12:21:27,887][11201] Saving new best policy, reward=5.906! [2023-02-24 12:21:30,480][11215] Updated weights for policy 0, policy_version 320 (0.0018) [2023-02-24 12:21:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1318912. Throughput: 0: 879.6. Samples: 329050. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:21:32,873][00205] Avg episode reward: [(0, '5.781')] [2023-02-24 12:21:37,874][00205] Fps is (10 sec: 3685.1, 60 sec: 3413.1, 300 sec: 3582.2). Total num frames: 1335296. Throughput: 0: 872.0. Samples: 334922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:37,884][00205] Avg episode reward: [(0, '5.859')] [2023-02-24 12:21:42,523][11215] Updated weights for policy 0, policy_version 330 (0.0022) [2023-02-24 12:21:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.9, 300 sec: 3568.4). Total num frames: 1351680. Throughput: 0: 849.4. Samples: 336916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:21:42,875][00205] Avg episode reward: [(0, '6.205')] [2023-02-24 12:21:42,881][11201] Saving new best policy, reward=6.205! [2023-02-24 12:21:47,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1368064. Throughput: 0: 842.8. Samples: 341040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:47,877][00205] Avg episode reward: [(0, '6.115')] [2023-02-24 12:21:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 1388544. Throughput: 0: 886.0. Samples: 347496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:21:52,876][00205] Avg episode reward: [(0, '5.819')] [2023-02-24 12:21:53,336][11215] Updated weights for policy 0, policy_version 340 (0.0013) [2023-02-24 12:21:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3568.4). 
Total num frames: 1409024. Throughput: 0: 888.5. Samples: 350756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:21:57,875][00205] Avg episode reward: [(0, '5.845')] [2023-02-24 12:22:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1421312. Throughput: 0: 839.8. Samples: 354890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:22:02,878][00205] Avg episode reward: [(0, '5.821')] [2023-02-24 12:22:06,387][11215] Updated weights for policy 0, policy_version 350 (0.0019) [2023-02-24 12:22:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1437696. Throughput: 0: 860.0. Samples: 359724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:22:07,878][00205] Avg episode reward: [(0, '6.200')] [2023-02-24 12:22:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1458176. Throughput: 0: 887.9. Samples: 363002. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:22:12,878][00205] Avg episode reward: [(0, '6.405')] [2023-02-24 12:22:12,882][11201] Saving new best policy, reward=6.405! [2023-02-24 12:22:16,660][11215] Updated weights for policy 0, policy_version 360 (0.0020) [2023-02-24 12:22:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 1474560. Throughput: 0: 887.8. Samples: 369000. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:22:17,875][00205] Avg episode reward: [(0, '5.882')] [2023-02-24 12:22:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1490944. Throughput: 0: 849.6. Samples: 373150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:22:22,877][00205] Avg episode reward: [(0, '6.185')] [2023-02-24 12:22:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1507328. Throughput: 0: 850.6. Samples: 375192. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:22:27,873][00205] Avg episode reward: [(0, '6.282')] [2023-02-24 12:22:29,299][11215] Updated weights for policy 0, policy_version 370 (0.0015) [2023-02-24 12:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1527808. Throughput: 0: 898.9. Samples: 381492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:22:32,872][00205] Avg episode reward: [(0, '6.499')] [2023-02-24 12:22:32,875][11201] Saving new best policy, reward=6.499! [2023-02-24 12:22:37,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1544192. Throughput: 0: 877.9. Samples: 387004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:22:37,887][00205] Avg episode reward: [(0, '6.315')] [2023-02-24 12:22:41,166][11215] Updated weights for policy 0, policy_version 380 (0.0012) [2023-02-24 12:22:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3512.8). Total num frames: 1560576. Throughput: 0: 850.4. Samples: 389024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:22:42,872][00205] Avg episode reward: [(0, '6.222')] [2023-02-24 12:22:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1576960. Throughput: 0: 857.3. Samples: 393470. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:22:47,878][00205] Avg episode reward: [(0, '6.490')] [2023-02-24 12:22:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth... [2023-02-24 12:22:48,008][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth [2023-02-24 12:22:52,505][11215] Updated weights for policy 0, policy_version 390 (0.0016) [2023-02-24 12:22:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1597440. Throughput: 0: 888.3. Samples: 399698. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:22:52,873][00205] Avg episode reward: [(0, '6.882')] [2023-02-24 12:22:52,876][11201] Saving new best policy, reward=6.882! [2023-02-24 12:22:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 1613824. Throughput: 0: 884.8. Samples: 402816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:22:57,874][00205] Avg episode reward: [(0, '6.792')] [2023-02-24 12:23:02,874][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1626112. Throughput: 0: 838.5. Samples: 406734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:02,876][00205] Avg episode reward: [(0, '7.109')] [2023-02-24 12:23:02,878][11201] Saving new best policy, reward=7.109! [2023-02-24 12:23:05,821][11215] Updated weights for policy 0, policy_version 400 (0.0013) [2023-02-24 12:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1646592. Throughput: 0: 854.3. Samples: 411594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:23:07,873][00205] Avg episode reward: [(0, '7.453')] [2023-02-24 12:23:07,881][11201] Saving new best policy, reward=7.453! [2023-02-24 12:23:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1667072. Throughput: 0: 878.7. Samples: 414732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:12,873][00205] Avg episode reward: [(0, '7.514')] [2023-02-24 12:23:12,877][11201] Saving new best policy, reward=7.514! [2023-02-24 12:23:15,700][11215] Updated weights for policy 0, policy_version 410 (0.0012) [2023-02-24 12:23:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1683456. Throughput: 0: 869.8. Samples: 420632. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:17,872][00205] Avg episode reward: [(0, '7.496')] [2023-02-24 12:23:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1695744. Throughput: 0: 836.9. Samples: 424662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:23:22,877][00205] Avg episode reward: [(0, '7.483')] [2023-02-24 12:23:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1716224. Throughput: 0: 837.9. Samples: 426728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:27,878][00205] Avg episode reward: [(0, '7.827')] [2023-02-24 12:23:27,886][11201] Saving new best policy, reward=7.827! [2023-02-24 12:23:28,928][11215] Updated weights for policy 0, policy_version 420 (0.0020) [2023-02-24 12:23:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 1736704. Throughput: 0: 878.6. Samples: 433008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:32,879][00205] Avg episode reward: [(0, '8.086')] [2023-02-24 12:23:32,883][11201] Saving new best policy, reward=8.086! [2023-02-24 12:23:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3485.1). Total num frames: 1748992. Throughput: 0: 859.6. Samples: 438378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:23:37,874][00205] Avg episode reward: [(0, '7.910')] [2023-02-24 12:23:40,867][11215] Updated weights for policy 0, policy_version 430 (0.0011) [2023-02-24 12:23:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1765376. Throughput: 0: 833.9. Samples: 440342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:23:42,878][00205] Avg episode reward: [(0, '7.784')] [2023-02-24 12:23:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1781760. Throughput: 0: 848.8. Samples: 444928. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:23:47,872][00205] Avg episode reward: [(0, '7.886')] [2023-02-24 12:23:52,061][11215] Updated weights for policy 0, policy_version 440 (0.0018) [2023-02-24 12:23:52,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1802240. Throughput: 0: 882.7. Samples: 451314. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:23:52,883][00205] Avg episode reward: [(0, '8.041')] [2023-02-24 12:23:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1818624. Throughput: 0: 878.9. Samples: 454284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:23:57,872][00205] Avg episode reward: [(0, '8.421')] [2023-02-24 12:23:57,892][11201] Saving new best policy, reward=8.421! [2023-02-24 12:24:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1830912. Throughput: 0: 835.1. Samples: 458214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:24:02,875][00205] Avg episode reward: [(0, '8.603')] [2023-02-24 12:24:02,878][11201] Saving new best policy, reward=8.603! [2023-02-24 12:24:05,500][11215] Updated weights for policy 0, policy_version 450 (0.0032) [2023-02-24 12:24:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1851392. Throughput: 0: 853.7. Samples: 463078. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:24:07,880][00205] Avg episode reward: [(0, '9.296')] [2023-02-24 12:24:07,890][11201] Saving new best policy, reward=9.296! [2023-02-24 12:24:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1871872. Throughput: 0: 876.9. Samples: 466190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:24:12,872][00205] Avg episode reward: [(0, '9.761')] [2023-02-24 12:24:12,880][11201] Saving new best policy, reward=9.761! 
[2023-02-24 12:24:16,003][11215] Updated weights for policy 0, policy_version 460 (0.0027) [2023-02-24 12:24:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1888256. Throughput: 0: 861.4. Samples: 471770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:24:17,874][00205] Avg episode reward: [(0, '9.532')] [2023-02-24 12:24:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1900544. Throughput: 0: 829.6. Samples: 475712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:24:22,876][00205] Avg episode reward: [(0, '10.124')] [2023-02-24 12:24:22,881][11201] Saving new best policy, reward=10.124! [2023-02-24 12:24:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1916928. Throughput: 0: 836.3. Samples: 477974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:24:27,872][00205] Avg episode reward: [(0, '9.217')] [2023-02-24 12:24:28,884][11215] Updated weights for policy 0, policy_version 470 (0.0033) [2023-02-24 12:24:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3443.4). Total num frames: 1937408. Throughput: 0: 870.9. Samples: 484120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:24:32,873][00205] Avg episode reward: [(0, '9.530')] [2023-02-24 12:24:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1953792. Throughput: 0: 843.8. Samples: 489286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:24:37,876][00205] Avg episode reward: [(0, '9.708')] [2023-02-24 12:24:41,426][11215] Updated weights for policy 0, policy_version 480 (0.0041) [2023-02-24 12:24:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1966080. Throughput: 0: 821.3. Samples: 491242. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:24:42,873][00205] Avg episode reward: [(0, '9.560')] [2023-02-24 12:24:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1986560. Throughput: 0: 839.7. Samples: 496002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:24:47,877][00205] Avg episode reward: [(0, '10.393')] [2023-02-24 12:24:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth... [2023-02-24 12:24:48,015][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000283_1159168.pth [2023-02-24 12:24:48,026][11201] Saving new best policy, reward=10.393! [2023-02-24 12:24:52,339][11215] Updated weights for policy 0, policy_version 490 (0.0012) [2023-02-24 12:24:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 2007040. Throughput: 0: 867.1. Samples: 502096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:24:52,873][00205] Avg episode reward: [(0, '10.734')] [2023-02-24 12:24:52,876][11201] Saving new best policy, reward=10.734! [2023-02-24 12:24:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2023424. Throughput: 0: 857.2. Samples: 504764. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 12:24:57,880][00205] Avg episode reward: [(0, '11.146')] [2023-02-24 12:24:57,896][11201] Saving new best policy, reward=11.146! [2023-02-24 12:25:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3429.5). Total num frames: 2035712. Throughput: 0: 815.2. Samples: 508456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:25:02,877][00205] Avg episode reward: [(0, '10.893')] [2023-02-24 12:25:06,006][11215] Updated weights for policy 0, policy_version 500 (0.0025) [2023-02-24 12:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2052096. Throughput: 0: 844.3. 
Samples: 513704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:25:07,877][00205] Avg episode reward: [(0, '10.468')] [2023-02-24 12:25:12,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3344.9, 300 sec: 3415.6). Total num frames: 2072576. Throughput: 0: 864.1. Samples: 516860. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:25:12,878][00205] Avg episode reward: [(0, '10.186')] [2023-02-24 12:25:16,819][11215] Updated weights for policy 0, policy_version 510 (0.0012) [2023-02-24 12:25:17,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3344.8, 300 sec: 3429.5). Total num frames: 2088960. Throughput: 0: 853.7. Samples: 522542. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:25:17,877][00205] Avg episode reward: [(0, '10.895')] [2023-02-24 12:25:22,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2105344. Throughput: 0: 830.5. Samples: 526658. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:25:22,876][00205] Avg episode reward: [(0, '11.838')] [2023-02-24 12:25:22,883][11201] Saving new best policy, reward=11.838! [2023-02-24 12:25:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2125824. Throughput: 0: 842.8. Samples: 529168. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:25:27,876][00205] Avg episode reward: [(0, '11.191')] [2023-02-24 12:25:28,778][11215] Updated weights for policy 0, policy_version 520 (0.0015) [2023-02-24 12:25:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2146304. Throughput: 0: 879.1. Samples: 535562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:25:32,872][00205] Avg episode reward: [(0, '12.906')] [2023-02-24 12:25:32,881][11201] Saving new best policy, reward=12.906! [2023-02-24 12:25:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3413.1, 300 sec: 3443.5). Total num frames: 2158592. Throughput: 0: 854.1. Samples: 540534. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:25:37,876][00205] Avg episode reward: [(0, '12.234')] [2023-02-24 12:25:41,184][11215] Updated weights for policy 0, policy_version 530 (0.0022) [2023-02-24 12:25:42,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3481.5, 300 sec: 3443.4). Total num frames: 2174976. Throughput: 0: 839.1. Samples: 542524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:25:42,874][00205] Avg episode reward: [(0, '11.783')] [2023-02-24 12:25:47,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2191360. Throughput: 0: 867.4. Samples: 547488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:25:47,875][00205] Avg episode reward: [(0, '11.929')] [2023-02-24 12:25:51,704][11215] Updated weights for policy 0, policy_version 540 (0.0012) [2023-02-24 12:25:52,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2215936. Throughput: 0: 894.9. Samples: 553976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:25:52,877][00205] Avg episode reward: [(0, '11.139')] [2023-02-24 12:25:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2228224. Throughput: 0: 882.2. Samples: 556558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:25:57,873][00205] Avg episode reward: [(0, '12.047')] [2023-02-24 12:26:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2244608. Throughput: 0: 846.2. Samples: 560618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:26:02,873][00205] Avg episode reward: [(0, '12.846')] [2023-02-24 12:26:05,046][11215] Updated weights for policy 0, policy_version 550 (0.0019) [2023-02-24 12:26:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2265088. Throughput: 0: 875.7. Samples: 566064. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:07,872][00205] Avg episode reward: [(0, '13.604')] [2023-02-24 12:26:07,888][11201] Saving new best policy, reward=13.604! [2023-02-24 12:26:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 2285568. Throughput: 0: 888.0. Samples: 569126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:12,878][00205] Avg episode reward: [(0, '13.737')] [2023-02-24 12:26:12,885][11201] Saving new best policy, reward=13.737! [2023-02-24 12:26:15,515][11215] Updated weights for policy 0, policy_version 560 (0.0015) [2023-02-24 12:26:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.9, 300 sec: 3443.4). Total num frames: 2297856. Throughput: 0: 863.5. Samples: 574418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:17,873][00205] Avg episode reward: [(0, '14.525')] [2023-02-24 12:26:17,890][11201] Saving new best policy, reward=14.525! [2023-02-24 12:26:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2310144. Throughput: 0: 839.6. Samples: 578312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:26:22,874][00205] Avg episode reward: [(0, '14.451')] [2023-02-24 12:26:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2330624. Throughput: 0: 856.8. Samples: 581076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:27,873][00205] Avg episode reward: [(0, '14.278')] [2023-02-24 12:26:28,206][11215] Updated weights for policy 0, policy_version 570 (0.0023) [2023-02-24 12:26:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3443.5). Total num frames: 2351104. Throughput: 0: 885.1. Samples: 587318. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:26:32,872][00205] Avg episode reward: [(0, '14.832')] [2023-02-24 12:26:32,876][11201] Saving new best policy, reward=14.832! 
[2023-02-24 12:26:37,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 2367488. Throughput: 0: 847.5. Samples: 592114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:37,878][00205] Avg episode reward: [(0, '14.690')] [2023-02-24 12:26:40,690][11215] Updated weights for policy 0, policy_version 580 (0.0018) [2023-02-24 12:26:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3413.5, 300 sec: 3429.5). Total num frames: 2379776. Throughput: 0: 835.2. Samples: 594142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:26:42,875][00205] Avg episode reward: [(0, '13.701')] [2023-02-24 12:26:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2400256. Throughput: 0: 862.8. Samples: 599444. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:26:47,872][00205] Avg episode reward: [(0, '15.186')] [2023-02-24 12:26:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth... [2023-02-24 12:26:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_1576960.pth [2023-02-24 12:26:48,046][11201] Saving new best policy, reward=15.186! [2023-02-24 12:26:51,208][11215] Updated weights for policy 0, policy_version 590 (0.0014) [2023-02-24 12:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2420736. Throughput: 0: 881.0. Samples: 605710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:26:52,878][00205] Avg episode reward: [(0, '14.601')] [2023-02-24 12:26:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2437120. Throughput: 0: 866.8. Samples: 608134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:26:57,873][00205] Avg episode reward: [(0, '14.396')] [2023-02-24 12:27:02,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2449408. 
Throughput: 0: 839.4. Samples: 612190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:27:02,880][00205] Avg episode reward: [(0, '14.811')] [2023-02-24 12:27:04,374][11215] Updated weights for policy 0, policy_version 600 (0.0020) [2023-02-24 12:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2469888. Throughput: 0: 884.1. Samples: 618096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:27:07,878][00205] Avg episode reward: [(0, '15.226')] [2023-02-24 12:27:07,887][11201] Saving new best policy, reward=15.226! [2023-02-24 12:27:12,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2494464. Throughput: 0: 892.8. Samples: 621252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:27:12,878][00205] Avg episode reward: [(0, '15.343')] [2023-02-24 12:27:12,883][11201] Saving new best policy, reward=15.343! [2023-02-24 12:27:14,283][11215] Updated weights for policy 0, policy_version 610 (0.0012) [2023-02-24 12:27:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2506752. Throughput: 0: 870.8. Samples: 626504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:27:17,880][00205] Avg episode reward: [(0, '15.456')] [2023-02-24 12:27:17,895][11201] Saving new best policy, reward=15.456! [2023-02-24 12:27:22,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2519040. Throughput: 0: 856.6. Samples: 630662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:22,875][00205] Avg episode reward: [(0, '16.697')] [2023-02-24 12:27:22,879][11201] Saving new best policy, reward=16.697! [2023-02-24 12:27:27,025][11215] Updated weights for policy 0, policy_version 620 (0.0025) [2023-02-24 12:27:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2539520. Throughput: 0: 875.3. Samples: 633530. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:27,875][00205] Avg episode reward: [(0, '16.487')] [2023-02-24 12:27:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2564096. Throughput: 0: 902.6. Samples: 640060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:32,873][00205] Avg episode reward: [(0, '18.070')] [2023-02-24 12:27:32,880][11201] Saving new best policy, reward=18.070! [2023-02-24 12:27:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2576384. Throughput: 0: 865.5. Samples: 644656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:37,876][00205] Avg episode reward: [(0, '19.315')] [2023-02-24 12:27:37,889][11201] Saving new best policy, reward=19.315! [2023-02-24 12:27:38,589][11215] Updated weights for policy 0, policy_version 630 (0.0016) [2023-02-24 12:27:42,870][00205] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2588672. Throughput: 0: 854.9. Samples: 646606. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:27:42,872][00205] Avg episode reward: [(0, '18.752')] [2023-02-24 12:27:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2613248. Throughput: 0: 886.7. Samples: 652090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:27:47,873][00205] Avg episode reward: [(0, '19.421')] [2023-02-24 12:27:47,885][11201] Saving new best policy, reward=19.421! [2023-02-24 12:27:49,817][11215] Updated weights for policy 0, policy_version 640 (0.0027) [2023-02-24 12:27:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2633728. Throughput: 0: 899.0. Samples: 658552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:52,874][00205] Avg episode reward: [(0, '18.641')] [2023-02-24 12:27:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2646016. 
Throughput: 0: 879.2. Samples: 660816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:27:57,874][00205] Avg episode reward: [(0, '18.167')] [2023-02-24 12:28:02,866][11215] Updated weights for policy 0, policy_version 650 (0.0028) [2023-02-24 12:28:02,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2662400. Throughput: 0: 853.8. Samples: 664926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:28:02,879][00205] Avg episode reward: [(0, '18.468')] [2023-02-24 12:28:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2682880. Throughput: 0: 893.1. Samples: 670852. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:28:07,879][00205] Avg episode reward: [(0, '19.198')] [2023-02-24 12:28:12,188][11215] Updated weights for policy 0, policy_version 660 (0.0021) [2023-02-24 12:28:12,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 2703360. Throughput: 0: 901.2. Samples: 674086. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:28:12,883][00205] Avg episode reward: [(0, '19.262')] [2023-02-24 12:28:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2715648. Throughput: 0: 869.5. Samples: 679188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:28:17,873][00205] Avg episode reward: [(0, '19.826')] [2023-02-24 12:28:17,896][11201] Saving new best policy, reward=19.826! [2023-02-24 12:28:22,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2732032. Throughput: 0: 857.9. Samples: 683262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:28:22,873][00205] Avg episode reward: [(0, '20.419')] [2023-02-24 12:28:22,875][11201] Saving new best policy, reward=20.419! 
[2023-02-24 12:28:25,364][11215] Updated weights for policy 0, policy_version 670 (0.0015) [2023-02-24 12:28:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2752512. Throughput: 0: 883.5. Samples: 686364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:28:27,872][00205] Avg episode reward: [(0, '19.222')] [2023-02-24 12:28:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 2772992. Throughput: 0: 906.4. Samples: 692878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:28:32,879][00205] Avg episode reward: [(0, '20.182')] [2023-02-24 12:28:36,282][11215] Updated weights for policy 0, policy_version 680 (0.0027) [2023-02-24 12:28:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2789376. Throughput: 0: 864.6. Samples: 697458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:28:37,873][00205] Avg episode reward: [(0, '19.719')] [2023-02-24 12:28:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2801664. Throughput: 0: 858.8. Samples: 699460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:28:42,878][00205] Avg episode reward: [(0, '18.783')] [2023-02-24 12:28:47,846][11215] Updated weights for policy 0, policy_version 690 (0.0014) [2023-02-24 12:28:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2826240. Throughput: 0: 895.8. Samples: 705238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:28:47,872][00205] Avg episode reward: [(0, '17.900')] [2023-02-24 12:28:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth... [2023-02-24 12:28:47,997][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000485_1986560.pth [2023-02-24 12:28:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). 
Total num frames: 2842624. Throughput: 0: 907.5. Samples: 711688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:28:52,874][00205] Avg episode reward: [(0, '16.623')] [2023-02-24 12:28:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2859008. Throughput: 0: 878.2. Samples: 713602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:28:57,873][00205] Avg episode reward: [(0, '16.645')] [2023-02-24 12:29:00,608][11215] Updated weights for policy 0, policy_version 700 (0.0028) [2023-02-24 12:29:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2871296. Throughput: 0: 855.0. Samples: 717664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:29:02,876][00205] Avg episode reward: [(0, '17.588')] [2023-02-24 12:29:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2891776. Throughput: 0: 897.9. Samples: 723666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:29:07,873][00205] Avg episode reward: [(0, '17.711')] [2023-02-24 12:29:10,964][11215] Updated weights for policy 0, policy_version 710 (0.0037) [2023-02-24 12:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3471.2). Total num frames: 2912256. Throughput: 0: 898.6. Samples: 726800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:29:12,876][00205] Avg episode reward: [(0, '18.661')] [2023-02-24 12:29:17,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3481.3, 300 sec: 3471.1). Total num frames: 2924544. Throughput: 0: 857.9. Samples: 731486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:29:17,878][00205] Avg episode reward: [(0, '19.159')] [2023-02-24 12:29:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2940928. Throughput: 0: 854.0. Samples: 735890. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:29:22,876][00205] Avg episode reward: [(0, '19.729')] [2023-02-24 12:29:23,915][11215] Updated weights for policy 0, policy_version 720 (0.0026) [2023-02-24 12:29:27,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2965504. Throughput: 0: 880.8. Samples: 739098. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:29:27,873][00205] Avg episode reward: [(0, '19.640')] [2023-02-24 12:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 2981888. Throughput: 0: 898.1. Samples: 745654. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 12:29:32,876][00205] Avg episode reward: [(0, '20.169')] [2023-02-24 12:29:34,526][11215] Updated weights for policy 0, policy_version 730 (0.0015) [2023-02-24 12:29:37,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3498.9). Total num frames: 2998272. Throughput: 0: 850.3. Samples: 749954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:29:37,875][00205] Avg episode reward: [(0, '20.339')] [2023-02-24 12:29:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3014656. Throughput: 0: 852.5. Samples: 751966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:29:42,873][00205] Avg episode reward: [(0, '21.573')] [2023-02-24 12:29:42,881][11201] Saving new best policy, reward=21.573! [2023-02-24 12:29:46,581][11215] Updated weights for policy 0, policy_version 740 (0.0023) [2023-02-24 12:29:47,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3035136. Throughput: 0: 892.8. Samples: 757842. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:29:47,872][00205] Avg episode reward: [(0, '22.027')] [2023-02-24 12:29:47,886][11201] Saving new best policy, reward=22.027! [2023-02-24 12:29:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). 
Total num frames: 3051520. Throughput: 0: 895.1. Samples: 763946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:29:52,875][00205] Avg episode reward: [(0, '21.699')] [2023-02-24 12:29:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3067904. Throughput: 0: 870.3. Samples: 765962. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:29:57,874][00205] Avg episode reward: [(0, '22.312')] [2023-02-24 12:29:57,886][11201] Saving new best policy, reward=22.312! [2023-02-24 12:29:59,122][11215] Updated weights for policy 0, policy_version 750 (0.0020) [2023-02-24 12:30:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3084288. Throughput: 0: 853.4. Samples: 769884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:30:02,873][00205] Avg episode reward: [(0, '21.738')] [2023-02-24 12:30:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3104768. Throughput: 0: 899.1. Samples: 776350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:30:07,877][00205] Avg episode reward: [(0, '22.720')] [2023-02-24 12:30:07,891][11201] Saving new best policy, reward=22.720! [2023-02-24 12:30:09,415][11215] Updated weights for policy 0, policy_version 760 (0.0014) [2023-02-24 12:30:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3121152. Throughput: 0: 898.2. Samples: 779516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:30:12,873][00205] Avg episode reward: [(0, '22.461')] [2023-02-24 12:30:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.2, 300 sec: 3499.0). Total num frames: 3137536. Throughput: 0: 852.9. Samples: 784036. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:30:17,876][00205] Avg episode reward: [(0, '22.451')] [2023-02-24 12:30:22,406][11215] Updated weights for policy 0, policy_version 770 (0.0017) [2023-02-24 12:30:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3153920. Throughput: 0: 862.7. Samples: 788772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:30:22,872][00205] Avg episode reward: [(0, '21.952')] [2023-02-24 12:30:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3178496. Throughput: 0: 891.9. Samples: 792100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:30:27,879][00205] Avg episode reward: [(0, '21.292')] [2023-02-24 12:30:31,449][11215] Updated weights for policy 0, policy_version 780 (0.0014) [2023-02-24 12:30:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 3194880. Throughput: 0: 917.6. Samples: 799132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:30:32,876][00205] Avg episode reward: [(0, '20.844')] [2023-02-24 12:30:37,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 3211264. Throughput: 0: 869.6. Samples: 803078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:30:37,878][00205] Avg episode reward: [(0, '20.241')] [2023-02-24 12:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3227648. Throughput: 0: 871.1. Samples: 805160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:30:42,872][00205] Avg episode reward: [(0, '21.477')] [2023-02-24 12:30:44,598][11215] Updated weights for policy 0, policy_version 790 (0.0023) [2023-02-24 12:30:47,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3248128. Throughput: 0: 921.2. Samples: 811336. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:30:47,873][00205] Avg episode reward: [(0, '22.575')] [2023-02-24 12:30:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth... [2023-02-24 12:30:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000586_2400256.pth [2023-02-24 12:30:52,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3617.8, 300 sec: 3526.7). Total num frames: 3268608. Throughput: 0: 910.7. Samples: 817338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:30:52,885][00205] Avg episode reward: [(0, '22.106')] [2023-02-24 12:30:55,704][11215] Updated weights for policy 0, policy_version 800 (0.0029) [2023-02-24 12:30:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3280896. Throughput: 0: 886.8. Samples: 819424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:30:57,874][00205] Avg episode reward: [(0, '22.843')] [2023-02-24 12:30:57,885][11201] Saving new best policy, reward=22.843! [2023-02-24 12:31:02,870][00205] Fps is (10 sec: 2868.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3297280. Throughput: 0: 879.2. Samples: 823602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:31:02,872][00205] Avg episode reward: [(0, '24.269')] [2023-02-24 12:31:02,882][11201] Saving new best policy, reward=24.269! [2023-02-24 12:31:07,107][11215] Updated weights for policy 0, policy_version 810 (0.0023) [2023-02-24 12:31:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.8, 300 sec: 3499.0). Total num frames: 3317760. Throughput: 0: 918.7. Samples: 830114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:31:07,873][00205] Avg episode reward: [(0, '25.845')] [2023-02-24 12:31:07,886][11201] Saving new best policy, reward=25.845! [2023-02-24 12:31:12,874][00205] Fps is (10 sec: 4094.2, 60 sec: 3617.9, 300 sec: 3526.7). Total num frames: 3338240. Throughput: 0: 915.9. 
Samples: 833320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:31:12,877][00205] Avg episode reward: [(0, '24.641')] [2023-02-24 12:31:17,872][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3350528. Throughput: 0: 857.0. Samples: 837698. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:31:17,877][00205] Avg episode reward: [(0, '24.794')] [2023-02-24 12:31:19,799][11215] Updated weights for policy 0, policy_version 820 (0.0014) [2023-02-24 12:31:22,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 3371008. Throughput: 0: 879.1. Samples: 842638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:31:22,881][00205] Avg episode reward: [(0, '25.251')] [2023-02-24 12:31:27,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3391488. Throughput: 0: 905.2. Samples: 845896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:31:27,873][00205] Avg episode reward: [(0, '25.312')] [2023-02-24 12:31:29,403][11215] Updated weights for policy 0, policy_version 830 (0.0035) [2023-02-24 12:31:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 3407872. Throughput: 0: 905.1. Samples: 852064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:31:32,872][00205] Avg episode reward: [(0, '25.504')] [2023-02-24 12:31:37,872][00205] Fps is (10 sec: 3276.2, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3424256. Throughput: 0: 864.5. Samples: 856236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:31:37,875][00205] Avg episode reward: [(0, '24.761')] [2023-02-24 12:31:42,707][11215] Updated weights for policy 0, policy_version 840 (0.0026) [2023-02-24 12:31:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3440640. Throughput: 0: 862.1. Samples: 858220. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:31:42,875][00205] Avg episode reward: [(0, '25.354')] [2023-02-24 12:31:47,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3461120. Throughput: 0: 913.2. Samples: 864694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:31:47,876][00205] Avg episode reward: [(0, '24.708')] [2023-02-24 12:31:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.9, 300 sec: 3526.7). Total num frames: 3477504. Throughput: 0: 897.5. Samples: 870502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:31:52,873][00205] Avg episode reward: [(0, '24.505')] [2023-02-24 12:31:52,901][11215] Updated weights for policy 0, policy_version 850 (0.0019) [2023-02-24 12:31:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 3493888. Throughput: 0: 871.6. Samples: 872538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:31:57,877][00205] Avg episode reward: [(0, '24.874')] [2023-02-24 12:32:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3510272. Throughput: 0: 877.1. Samples: 877168. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:32:02,872][00205] Avg episode reward: [(0, '23.638')] [2023-02-24 12:32:04,697][11215] Updated weights for policy 0, policy_version 860 (0.0038) [2023-02-24 12:32:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3526.7). Total num frames: 3534848. Throughput: 0: 915.9. Samples: 883856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:32:07,875][00205] Avg episode reward: [(0, '23.967')] [2023-02-24 12:32:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 3551232. Throughput: 0: 912.1. Samples: 886942. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:32:12,874][00205] Avg episode reward: [(0, '21.795')] [2023-02-24 12:32:16,744][11215] Updated weights for policy 0, policy_version 870 (0.0022) [2023-02-24 12:32:17,872][00205] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3563520. Throughput: 0: 866.7. Samples: 891068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:32:17,876][00205] Avg episode reward: [(0, '21.779')] [2023-02-24 12:32:22,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3584000. Throughput: 0: 887.8. Samples: 896186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:22,875][00205] Avg episode reward: [(0, '22.045')] [2023-02-24 12:32:27,234][11215] Updated weights for policy 0, policy_version 880 (0.0022) [2023-02-24 12:32:27,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3604480. Throughput: 0: 915.7. Samples: 899428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:27,875][00205] Avg episode reward: [(0, '21.926')] [2023-02-24 12:32:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3620864. Throughput: 0: 907.9. Samples: 905550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:32:32,875][00205] Avg episode reward: [(0, '21.742')] [2023-02-24 12:32:37,872][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 3637248. Throughput: 0: 870.2. Samples: 909660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:37,874][00205] Avg episode reward: [(0, '21.922')] [2023-02-24 12:32:40,249][11215] Updated weights for policy 0, policy_version 890 (0.0026) [2023-02-24 12:32:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3653632. Throughput: 0: 873.9. Samples: 911862. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:32:42,878][00205] Avg episode reward: [(0, '22.072')] [2023-02-24 12:32:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3678208. Throughput: 0: 918.6. Samples: 918504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:47,873][00205] Avg episode reward: [(0, '21.527')] [2023-02-24 12:32:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth... [2023-02-24 12:32:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000690_2826240.pth [2023-02-24 12:32:49,504][11215] Updated weights for policy 0, policy_version 900 (0.0014) [2023-02-24 12:32:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3694592. Throughput: 0: 894.0. Samples: 924084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:52,876][00205] Avg episode reward: [(0, '20.238')] [2023-02-24 12:32:57,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 3706880. Throughput: 0: 871.4. Samples: 926154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:32:57,874][00205] Avg episode reward: [(0, '20.008')] [2023-02-24 12:33:02,222][11215] Updated weights for policy 0, policy_version 910 (0.0020) [2023-02-24 12:33:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3727360. Throughput: 0: 889.2. Samples: 931082. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:33:02,878][00205] Avg episode reward: [(0, '20.590')] [2023-02-24 12:33:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 3751936. Throughput: 0: 922.6. Samples: 937702. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:33:07,872][00205] Avg episode reward: [(0, '20.570')] [2023-02-24 12:33:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3764224. Throughput: 0: 914.8. Samples: 940592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:33:12,880][00205] Avg episode reward: [(0, '21.241')] [2023-02-24 12:33:13,123][11215] Updated weights for policy 0, policy_version 920 (0.0015) [2023-02-24 12:33:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 3780608. Throughput: 0: 871.6. Samples: 944770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:33:17,875][00205] Avg episode reward: [(0, '20.790')] [2023-02-24 12:33:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3801088. Throughput: 0: 901.6. Samples: 950230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:33:22,877][00205] Avg episode reward: [(0, '23.611')] [2023-02-24 12:33:24,717][11215] Updated weights for policy 0, policy_version 930 (0.0038) [2023-02-24 12:33:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3821568. Throughput: 0: 924.8. Samples: 953480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:33:27,872][00205] Avg episode reward: [(0, '23.023')] [2023-02-24 12:33:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 3837952. Throughput: 0: 907.0. Samples: 959318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:33:32,873][00205] Avg episode reward: [(0, '22.371')] [2023-02-24 12:33:36,764][11215] Updated weights for policy 0, policy_version 940 (0.0026) [2023-02-24 12:33:37,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 3850240. Throughput: 0: 874.8. Samples: 963452. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:33:37,873][00205] Avg episode reward: [(0, '23.057')] [2023-02-24 12:33:42,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 3870720. Throughput: 0: 883.0. Samples: 965890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:33:42,872][00205] Avg episode reward: [(0, '23.216')] [2023-02-24 12:33:46,999][11215] Updated weights for policy 0, policy_version 950 (0.0012) [2023-02-24 12:33:47,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3891200. Throughput: 0: 919.3. Samples: 972450. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:33:47,872][00205] Avg episode reward: [(0, '21.615')] [2023-02-24 12:33:52,874][00205] Fps is (10 sec: 3684.7, 60 sec: 3549.6, 300 sec: 3554.4). Total num frames: 3907584. Throughput: 0: 889.4. Samples: 977728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:33:52,876][00205] Avg episode reward: [(0, '22.134')] [2023-02-24 12:33:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 3923968. Throughput: 0: 870.5. Samples: 979764. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:33:57,875][00205] Avg episode reward: [(0, '22.780')] [2023-02-24 12:34:00,073][11215] Updated weights for policy 0, policy_version 960 (0.0019) [2023-02-24 12:34:02,870][00205] Fps is (10 sec: 3688.1, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 3944448. Throughput: 0: 892.1. Samples: 984916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:34:02,876][00205] Avg episode reward: [(0, '22.696')] [2023-02-24 12:34:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3964928. Throughput: 0: 920.1. Samples: 991634. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:34:07,872][00205] Avg episode reward: [(0, '22.213')] [2023-02-24 12:34:09,599][11215] Updated weights for policy 0, policy_version 970 (0.0014) [2023-02-24 12:34:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 3981312. Throughput: 0: 907.9. Samples: 994334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:34:12,875][00205] Avg episode reward: [(0, '22.002')] [2023-02-24 12:34:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 3993600. Throughput: 0: 869.8. Samples: 998460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:34:17,876][00205] Avg episode reward: [(0, '21.600')] [2023-02-24 12:34:22,322][11215] Updated weights for policy 0, policy_version 980 (0.0019) [2023-02-24 12:34:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4014080. Throughput: 0: 903.1. Samples: 1004092. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:34:22,876][00205] Avg episode reward: [(0, '24.195')] [2023-02-24 12:34:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4038656. Throughput: 0: 920.5. Samples: 1007312. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:34:27,878][00205] Avg episode reward: [(0, '23.973')] [2023-02-24 12:34:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4050944. Throughput: 0: 899.2. Samples: 1012912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:34:32,876][00205] Avg episode reward: [(0, '23.066')] [2023-02-24 12:34:33,331][11215] Updated weights for policy 0, policy_version 990 (0.0013) [2023-02-24 12:34:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 4067328. Throughput: 0: 874.4. Samples: 1017072. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:34:37,876][00205] Avg episode reward: [(0, '25.073')] [2023-02-24 12:34:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4087808. Throughput: 0: 890.0. Samples: 1019816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:34:42,873][00205] Avg episode reward: [(0, '26.109')] [2023-02-24 12:34:42,875][11201] Saving new best policy, reward=26.109! [2023-02-24 12:34:44,417][11215] Updated weights for policy 0, policy_version 1000 (0.0017) [2023-02-24 12:34:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4108288. Throughput: 0: 918.4. Samples: 1026242. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:34:47,873][00205] Avg episode reward: [(0, '26.954')] [2023-02-24 12:34:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth... [2023-02-24 12:34:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000793_3248128.pth [2023-02-24 12:34:48,061][11201] Saving new best policy, reward=26.954! [2023-02-24 12:34:52,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3550.1, 300 sec: 3568.4). Total num frames: 4120576. Throughput: 0: 873.1. Samples: 1030926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:34:52,873][00205] Avg episode reward: [(0, '26.611')] [2023-02-24 12:34:57,874][00205] Fps is (10 sec: 2456.6, 60 sec: 3481.3, 300 sec: 3554.4). Total num frames: 4132864. Throughput: 0: 857.2. Samples: 1032910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:34:57,877][00205] Avg episode reward: [(0, '26.001')] [2023-02-24 12:34:57,910][11215] Updated weights for policy 0, policy_version 1010 (0.0021) [2023-02-24 12:35:02,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4153344. Throughput: 0: 877.5. Samples: 1037946. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:35:02,873][00205] Avg episode reward: [(0, '26.890')] [2023-02-24 12:35:07,732][11215] Updated weights for policy 0, policy_version 1020 (0.0026) [2023-02-24 12:35:07,870][00205] Fps is (10 sec: 4507.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4177920. Throughput: 0: 898.0. Samples: 1044500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:35:07,872][00205] Avg episode reward: [(0, '25.811')] [2023-02-24 12:35:12,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3481.4, 300 sec: 3568.3). Total num frames: 4190208. Throughput: 0: 880.8. Samples: 1046952. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:35:12,883][00205] Avg episode reward: [(0, '26.692')] [2023-02-24 12:35:17,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4206592. Throughput: 0: 847.7. Samples: 1051058. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:35:17,874][00205] Avg episode reward: [(0, '25.030')] [2023-02-24 12:35:20,544][11215] Updated weights for policy 0, policy_version 1030 (0.0021) [2023-02-24 12:35:22,870][00205] Fps is (10 sec: 3687.7, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4227072. Throughput: 0: 888.2. Samples: 1057042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:35:22,872][00205] Avg episode reward: [(0, '24.995')] [2023-02-24 12:35:27,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 4247552. Throughput: 0: 898.8. Samples: 1060262. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:35:27,876][00205] Avg episode reward: [(0, '23.880')] [2023-02-24 12:35:30,855][11215] Updated weights for policy 0, policy_version 1040 (0.0025) [2023-02-24 12:35:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4263936. Throughput: 0: 874.1. Samples: 1065578. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:35:32,876][00205] Avg episode reward: [(0, '24.515')] [2023-02-24 12:35:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4276224. Throughput: 0: 863.2. Samples: 1069770. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:35:37,873][00205] Avg episode reward: [(0, '24.452')] [2023-02-24 12:35:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 4296704. Throughput: 0: 883.5. Samples: 1072664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:35:42,873][00205] Avg episode reward: [(0, '23.207')] [2023-02-24 12:35:43,053][11215] Updated weights for policy 0, policy_version 1050 (0.0015) [2023-02-24 12:35:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4321280. Throughput: 0: 917.2. Samples: 1079218. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:35:47,873][00205] Avg episode reward: [(0, '23.072')] [2023-02-24 12:35:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4333568. Throughput: 0: 878.6. Samples: 1084036. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:35:52,873][00205] Avg episode reward: [(0, '23.314')] [2023-02-24 12:35:54,833][11215] Updated weights for policy 0, policy_version 1060 (0.0015) [2023-02-24 12:35:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.4, 300 sec: 3568.4). Total num frames: 4349952. Throughput: 0: 870.1. Samples: 1086102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:35:57,879][00205] Avg episode reward: [(0, '22.996')] [2023-02-24 12:36:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4370432. Throughput: 0: 904.5. Samples: 1091762. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:36:02,877][00205] Avg episode reward: [(0, '23.478')] [2023-02-24 12:36:05,185][11215] Updated weights for policy 0, policy_version 1070 (0.0016) [2023-02-24 12:36:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3568.4). Total num frames: 4390912. Throughput: 0: 917.2. Samples: 1098320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:36:07,879][00205] Avg episode reward: [(0, '23.825')] [2023-02-24 12:36:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3582.3). Total num frames: 4407296. Throughput: 0: 896.0. Samples: 1100582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:36:12,877][00205] Avg episode reward: [(0, '24.060')] [2023-02-24 12:36:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4419584. Throughput: 0: 871.6. Samples: 1104800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:36:17,873][00205] Avg episode reward: [(0, '25.036')] [2023-02-24 12:36:18,075][11215] Updated weights for policy 0, policy_version 1080 (0.0020) [2023-02-24 12:36:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4440064. Throughput: 0: 911.4. Samples: 1110784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:36:22,875][00205] Avg episode reward: [(0, '24.692')] [2023-02-24 12:36:27,528][11215] Updated weights for policy 0, policy_version 1090 (0.0016) [2023-02-24 12:36:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 4464640. Throughput: 0: 919.2. Samples: 1114026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:36:27,873][00205] Avg episode reward: [(0, '25.739')] [2023-02-24 12:36:32,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 4476928. Throughput: 0: 888.7. Samples: 1119210. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:36:32,875][00205] Avg episode reward: [(0, '25.352')] [2023-02-24 12:36:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4493312. Throughput: 0: 872.7. Samples: 1123308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:36:37,879][00205] Avg episode reward: [(0, '25.245')] [2023-02-24 12:36:40,517][11215] Updated weights for policy 0, policy_version 1100 (0.0023) [2023-02-24 12:36:42,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4513792. Throughput: 0: 895.7. Samples: 1126410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:36:42,872][00205] Avg episode reward: [(0, '24.665')] [2023-02-24 12:36:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4534272. Throughput: 0: 913.9. Samples: 1132888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:36:47,879][00205] Avg episode reward: [(0, '23.801')] [2023-02-24 12:36:47,902][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth... [2023-02-24 12:36:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000898_3678208.pth [2023-02-24 12:36:51,888][11215] Updated weights for policy 0, policy_version 1110 (0.0016) [2023-02-24 12:36:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4546560. Throughput: 0: 868.5. Samples: 1137400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:36:52,874][00205] Avg episode reward: [(0, '23.689')] [2023-02-24 12:36:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4562944. Throughput: 0: 862.9. Samples: 1139412. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:36:57,874][00205] Avg episode reward: [(0, '24.247')] [2023-02-24 12:37:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4583424. Throughput: 0: 898.9. Samples: 1145252. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:37:02,873][00205] Avg episode reward: [(0, '22.963')] [2023-02-24 12:37:03,079][11215] Updated weights for policy 0, policy_version 1120 (0.0016) [2023-02-24 12:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4603904. Throughput: 0: 912.9. Samples: 1151864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:37:07,876][00205] Avg episode reward: [(0, '22.026')] [2023-02-24 12:37:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4620288. Throughput: 0: 886.8. Samples: 1153932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:37:12,876][00205] Avg episode reward: [(0, '22.896')] [2023-02-24 12:37:15,580][11215] Updated weights for policy 0, policy_version 1130 (0.0020) [2023-02-24 12:37:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4632576. Throughput: 0: 862.9. Samples: 1158038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:37:17,873][00205] Avg episode reward: [(0, '24.056')] [2023-02-24 12:37:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 4657152. Throughput: 0: 908.4. Samples: 1164188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:37:22,873][00205] Avg episode reward: [(0, '25.627')] [2023-02-24 12:37:25,359][11215] Updated weights for policy 0, policy_version 1140 (0.0020) [2023-02-24 12:37:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 4677632. Throughput: 0: 912.5. Samples: 1167472. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 12:37:27,872][00205] Avg episode reward: [(0, '24.836')] [2023-02-24 12:37:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4689920. Throughput: 0: 877.6. Samples: 1172380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:37:32,873][00205] Avg episode reward: [(0, '25.505')] [2023-02-24 12:37:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4706304. Throughput: 0: 871.8. Samples: 1176632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:37:37,872][00205] Avg episode reward: [(0, '27.100')] [2023-02-24 12:37:37,897][11201] Saving new best policy, reward=27.100! [2023-02-24 12:37:38,645][11215] Updated weights for policy 0, policy_version 1150 (0.0031) [2023-02-24 12:37:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4726784. Throughput: 0: 897.6. Samples: 1179804. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:37:42,873][00205] Avg episode reward: [(0, '26.473')] [2023-02-24 12:37:47,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 4747264. Throughput: 0: 911.5. Samples: 1186272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:37:47,876][00205] Avg episode reward: [(0, '27.358')] [2023-02-24 12:37:47,884][11201] Saving new best policy, reward=27.358! [2023-02-24 12:37:48,971][11215] Updated weights for policy 0, policy_version 1160 (0.0013) [2023-02-24 12:37:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4759552. Throughput: 0: 859.2. Samples: 1190528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:37:52,872][00205] Avg episode reward: [(0, '26.911')] [2023-02-24 12:37:57,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4775936. Throughput: 0: 858.6. Samples: 1192568. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:37:57,872][00205] Avg episode reward: [(0, '26.721')] [2023-02-24 12:38:01,317][11215] Updated weights for policy 0, policy_version 1170 (0.0011) [2023-02-24 12:38:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4796416. Throughput: 0: 901.9. Samples: 1198624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:38:02,872][00205] Avg episode reward: [(0, '25.721')] [2023-02-24 12:38:07,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4816896. Throughput: 0: 904.7. Samples: 1204900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:38:07,880][00205] Avg episode reward: [(0, '25.433')] [2023-02-24 12:38:12,859][11215] Updated weights for policy 0, policy_version 1180 (0.0023) [2023-02-24 12:38:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4833280. Throughput: 0: 878.1. Samples: 1206988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:38:12,879][00205] Avg episode reward: [(0, '25.316')] [2023-02-24 12:38:17,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4845568. Throughput: 0: 862.8. Samples: 1211208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:38:17,872][00205] Avg episode reward: [(0, '26.604')] [2023-02-24 12:38:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4870144. Throughput: 0: 908.6. Samples: 1217520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:38:22,873][00205] Avg episode reward: [(0, '24.923')] [2023-02-24 12:38:23,700][11215] Updated weights for policy 0, policy_version 1190 (0.0028) [2023-02-24 12:38:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 4890624. Throughput: 0: 911.3. Samples: 1220812. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:38:27,879][00205] Avg episode reward: [(0, '26.016')] [2023-02-24 12:38:32,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3549.4, 300 sec: 3568.3). Total num frames: 4902912. Throughput: 0: 873.1. Samples: 1225564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:38:32,884][00205] Avg episode reward: [(0, '25.041')] [2023-02-24 12:38:36,551][11215] Updated weights for policy 0, policy_version 1200 (0.0014) [2023-02-24 12:38:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4919296. Throughput: 0: 879.7. Samples: 1230116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:38:37,872][00205] Avg episode reward: [(0, '25.457')] [2023-02-24 12:38:42,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4939776. Throughput: 0: 908.0. Samples: 1233430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:38:42,878][00205] Avg episode reward: [(0, '26.832')] [2023-02-24 12:38:45,830][11215] Updated weights for policy 0, policy_version 1210 (0.0022) [2023-02-24 12:38:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 4960256. Throughput: 0: 919.5. Samples: 1240002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:38:47,874][00205] Avg episode reward: [(0, '26.514')] [2023-02-24 12:38:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth... [2023-02-24 12:38:48,096][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_4108288.pth [2023-02-24 12:38:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 4972544. Throughput: 0: 868.1. Samples: 1243964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:38:52,882][00205] Avg episode reward: [(0, '27.554')] [2023-02-24 12:38:52,891][11201] Saving new best policy, reward=27.554! 
[2023-02-24 12:38:57,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4988928. Throughput: 0: 866.5. Samples: 1245980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:38:57,872][00205] Avg episode reward: [(0, '27.820')] [2023-02-24 12:38:57,889][11201] Saving new best policy, reward=27.820! [2023-02-24 12:38:59,147][11215] Updated weights for policy 0, policy_version 1220 (0.0026) [2023-02-24 12:39:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 5013504. Throughput: 0: 906.9. Samples: 1252020. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:39:02,875][00205] Avg episode reward: [(0, '27.834')] [2023-02-24 12:39:02,880][11201] Saving new best policy, reward=27.834! [2023-02-24 12:39:07,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5029888. Throughput: 0: 901.1. Samples: 1258070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:39:07,880][00205] Avg episode reward: [(0, '27.955')] [2023-02-24 12:39:07,897][11201] Saving new best policy, reward=27.955! [2023-02-24 12:39:10,087][11215] Updated weights for policy 0, policy_version 1230 (0.0019) [2023-02-24 12:39:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5042176. Throughput: 0: 871.9. Samples: 1260046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:39:12,875][00205] Avg episode reward: [(0, '26.738')] [2023-02-24 12:39:17,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5058560. Throughput: 0: 860.2. Samples: 1264266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:39:17,873][00205] Avg episode reward: [(0, '26.010')] [2023-02-24 12:39:21,693][11215] Updated weights for policy 0, policy_version 1240 (0.0025) [2023-02-24 12:39:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5083136. Throughput: 0: 900.9. 
Samples: 1270656. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:39:22,873][00205] Avg episode reward: [(0, '25.109')] [2023-02-24 12:39:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5099520. Throughput: 0: 899.9. Samples: 1273924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:39:27,873][00205] Avg episode reward: [(0, '23.278')] [2023-02-24 12:39:32,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3550.3, 300 sec: 3554.5). Total num frames: 5115904. Throughput: 0: 853.5. Samples: 1278410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:39:32,875][00205] Avg episode reward: [(0, '23.698')] [2023-02-24 12:39:34,002][11215] Updated weights for policy 0, policy_version 1250 (0.0014) [2023-02-24 12:39:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5132288. Throughput: 0: 864.5. Samples: 1282866. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:39:37,872][00205] Avg episode reward: [(0, '23.343')] [2023-02-24 12:39:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5152768. Throughput: 0: 891.0. Samples: 1286074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:39:42,872][00205] Avg episode reward: [(0, '23.898')] [2023-02-24 12:39:44,431][11215] Updated weights for policy 0, policy_version 1260 (0.0019) [2023-02-24 12:39:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5169152. Throughput: 0: 905.9. Samples: 1292786. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:39:47,877][00205] Avg episode reward: [(0, '26.005')] [2023-02-24 12:39:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5185536. Throughput: 0: 860.8. Samples: 1296806. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:39:52,873][00205] Avg episode reward: [(0, '25.523')] [2023-02-24 12:39:57,597][11215] Updated weights for policy 0, policy_version 1270 (0.0011) [2023-02-24 12:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5201920. Throughput: 0: 864.5. Samples: 1298948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:39:57,875][00205] Avg episode reward: [(0, '26.334')] [2023-02-24 12:40:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5222400. Throughput: 0: 900.7. Samples: 1304796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:02,873][00205] Avg episode reward: [(0, '27.262')] [2023-02-24 12:40:07,703][11215] Updated weights for policy 0, policy_version 1280 (0.0016) [2023-02-24 12:40:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 5242880. Throughput: 0: 895.7. Samples: 1310962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:40:07,879][00205] Avg episode reward: [(0, '27.918')] [2023-02-24 12:40:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5255168. Throughput: 0: 867.7. Samples: 1312972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:40:12,873][00205] Avg episode reward: [(0, '27.645')] [2023-02-24 12:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3549.8, 300 sec: 3540.6). Total num frames: 5271552. Throughput: 0: 861.7. Samples: 1317186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:40:17,875][00205] Avg episode reward: [(0, '28.400')] [2023-02-24 12:40:17,889][11201] Saving new best policy, reward=28.400! [2023-02-24 12:40:20,215][11215] Updated weights for policy 0, policy_version 1290 (0.0030) [2023-02-24 12:40:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 5292032. Throughput: 0: 904.2. Samples: 1323556. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:22,883][00205] Avg episode reward: [(0, '27.920')] [2023-02-24 12:40:27,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5312512. Throughput: 0: 906.3. Samples: 1326860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:27,874][00205] Avg episode reward: [(0, '27.805')] [2023-02-24 12:40:31,668][11215] Updated weights for policy 0, policy_version 1300 (0.0027) [2023-02-24 12:40:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5324800. Throughput: 0: 858.2. Samples: 1331404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:32,876][00205] Avg episode reward: [(0, '27.152')] [2023-02-24 12:40:37,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5345280. Throughput: 0: 874.9. Samples: 1336176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:40:37,876][00205] Avg episode reward: [(0, '25.343')] [2023-02-24 12:40:42,540][11215] Updated weights for policy 0, policy_version 1310 (0.0020) [2023-02-24 12:40:42,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 5365760. Throughput: 0: 899.0. Samples: 1339404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:40:42,874][00205] Avg episode reward: [(0, '27.124')] [2023-02-24 12:40:47,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5382144. Throughput: 0: 911.1. Samples: 1345798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:40:47,875][00205] Avg episode reward: [(0, '25.789')] [2023-02-24 12:40:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth... 
[2023-02-24 12:40:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001107_4534272.pth [2023-02-24 12:40:52,872][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 5398528. Throughput: 0: 862.0. Samples: 1349754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:52,879][00205] Avg episode reward: [(0, '24.927')] [2023-02-24 12:40:55,663][11215] Updated weights for policy 0, policy_version 1320 (0.0021) [2023-02-24 12:40:57,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5414912. Throughput: 0: 863.6. Samples: 1351836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:40:57,872][00205] Avg episode reward: [(0, '26.058')] [2023-02-24 12:41:02,870][00205] Fps is (10 sec: 3687.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5435392. Throughput: 0: 916.1. Samples: 1358412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:02,872][00205] Avg episode reward: [(0, '27.403')] [2023-02-24 12:41:05,026][11215] Updated weights for policy 0, policy_version 1330 (0.0015) [2023-02-24 12:41:07,871][00205] Fps is (10 sec: 4095.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5455872. Throughput: 0: 904.6. Samples: 1364266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:07,879][00205] Avg episode reward: [(0, '27.649')] [2023-02-24 12:41:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5468160. Throughput: 0: 876.3. Samples: 1366294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:12,873][00205] Avg episode reward: [(0, '26.789')] [2023-02-24 12:41:17,870][00205] Fps is (10 sec: 2867.5, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5484544. Throughput: 0: 877.1. Samples: 1370872. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:41:17,873][00205] Avg episode reward: [(0, '27.554')] [2023-02-24 12:41:17,904][11215] Updated weights for policy 0, policy_version 1340 (0.0013) [2023-02-24 12:41:22,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3540.6). Total num frames: 5509120. Throughput: 0: 915.6. Samples: 1377378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:22,875][00205] Avg episode reward: [(0, '28.234')] [2023-02-24 12:41:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 5525504. Throughput: 0: 917.0. Samples: 1380666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:27,875][00205] Avg episode reward: [(0, '28.472')] [2023-02-24 12:41:27,890][11201] Saving new best policy, reward=28.472! [2023-02-24 12:41:28,502][11215] Updated weights for policy 0, policy_version 1350 (0.0018) [2023-02-24 12:41:32,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5537792. Throughput: 0: 863.7. Samples: 1384662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:41:32,874][00205] Avg episode reward: [(0, '27.413')] [2023-02-24 12:41:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5558272. Throughput: 0: 887.5. Samples: 1389688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:41:37,873][00205] Avg episode reward: [(0, '27.655')] [2023-02-24 12:41:40,251][11215] Updated weights for policy 0, policy_version 1360 (0.0032) [2023-02-24 12:41:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 5578752. Throughput: 0: 913.2. Samples: 1392928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:41:42,876][00205] Avg episode reward: [(0, '28.925')] [2023-02-24 12:41:42,879][11201] Saving new best policy, reward=28.925! [2023-02-24 12:41:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). 
Total num frames: 5595136. Throughput: 0: 897.9. Samples: 1398818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:41:47,876][00205] Avg episode reward: [(0, '29.357')] [2023-02-24 12:41:47,888][11201] Saving new best policy, reward=29.357! [2023-02-24 12:41:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5607424. Throughput: 0: 856.0. Samples: 1402786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:41:52,874][00205] Avg episode reward: [(0, '29.483')] [2023-02-24 12:41:52,906][11201] Saving new best policy, reward=29.483! [2023-02-24 12:41:52,911][11215] Updated weights for policy 0, policy_version 1370 (0.0017) [2023-02-24 12:41:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5627904. Throughput: 0: 857.4. Samples: 1404878. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:41:57,872][00205] Avg episode reward: [(0, '30.130')] [2023-02-24 12:41:57,887][11201] Saving new best policy, reward=30.130! [2023-02-24 12:42:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5648384. Throughput: 0: 898.9. Samples: 1411324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:42:02,880][00205] Avg episode reward: [(0, '29.209')] [2023-02-24 12:42:03,348][11215] Updated weights for policy 0, policy_version 1380 (0.0022) [2023-02-24 12:42:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 5664768. Throughput: 0: 881.8. Samples: 1417056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:42:07,873][00205] Avg episode reward: [(0, '28.847')] [2023-02-24 12:42:12,872][00205] Fps is (10 sec: 3276.1, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 5681152. Throughput: 0: 854.9. Samples: 1419140. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:42:12,874][00205] Avg episode reward: [(0, '28.500')] [2023-02-24 12:42:16,061][11215] Updated weights for policy 0, policy_version 1390 (0.0028) [2023-02-24 12:42:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 5701632. Throughput: 0: 878.1. Samples: 1424178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:42:17,878][00205] Avg episode reward: [(0, '28.080')] [2023-02-24 12:42:22,870][00205] Fps is (10 sec: 4506.6, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 5726208. Throughput: 0: 925.9. Samples: 1431352. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:42:22,873][00205] Avg episode reward: [(0, '27.170')] [2023-02-24 12:42:24,609][11215] Updated weights for policy 0, policy_version 1400 (0.0023) [2023-02-24 12:42:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5742592. Throughput: 0: 927.2. Samples: 1434654. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:42:27,876][00205] Avg episode reward: [(0, '26.678')] [2023-02-24 12:42:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5758976. Throughput: 0: 893.1. Samples: 1439008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:42:32,872][00205] Avg episode reward: [(0, '27.204')] [2023-02-24 12:42:36,876][11215] Updated weights for policy 0, policy_version 1410 (0.0020) [2023-02-24 12:42:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5779456. Throughput: 0: 932.0. Samples: 1444728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:42:37,872][00205] Avg episode reward: [(0, '26.727')] [2023-02-24 12:42:42,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5804032. Throughput: 0: 965.6. Samples: 1448332. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:42:42,872][00205] Avg episode reward: [(0, '25.694')] [2023-02-24 12:42:45,928][11215] Updated weights for policy 0, policy_version 1420 (0.0024) [2023-02-24 12:42:47,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3596.1). Total num frames: 5820416. Throughput: 0: 965.8. Samples: 1454792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:42:47,878][00205] Avg episode reward: [(0, '25.460')] [2023-02-24 12:42:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth... [2023-02-24 12:42:48,041][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001211_4960256.pth [2023-02-24 12:42:52,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 5832704. Throughput: 0: 926.7. Samples: 1458756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:42:52,875][00205] Avg episode reward: [(0, '25.142')] [2023-02-24 12:42:57,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5849088. Throughput: 0: 930.5. Samples: 1461010. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:42:57,875][00205] Avg episode reward: [(0, '24.673')] [2023-02-24 12:42:58,934][11215] Updated weights for policy 0, policy_version 1430 (0.0017) [2023-02-24 12:43:02,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3582.3). Total num frames: 5873664. Throughput: 0: 962.7. Samples: 1467498. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:43:02,879][00205] Avg episode reward: [(0, '25.489')] [2023-02-24 12:43:07,871][00205] Fps is (10 sec: 4095.6, 60 sec: 3754.6, 300 sec: 3582.2). Total num frames: 5890048. Throughput: 0: 927.3. Samples: 1473084. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:07,878][00205] Avg episode reward: [(0, '24.572')] [2023-02-24 12:43:10,110][11215] Updated weights for policy 0, policy_version 1440 (0.0022) [2023-02-24 12:43:12,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3686.5, 300 sec: 3582.3). Total num frames: 5902336. Throughput: 0: 899.6. Samples: 1475136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:12,875][00205] Avg episode reward: [(0, '24.618')] [2023-02-24 12:43:17,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 5922816. Throughput: 0: 910.5. Samples: 1479980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:43:17,877][00205] Avg episode reward: [(0, '23.787')] [2023-02-24 12:43:21,151][11215] Updated weights for policy 0, policy_version 1450 (0.0021) [2023-02-24 12:43:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5943296. Throughput: 0: 928.1. Samples: 1486492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:43:22,873][00205] Avg episode reward: [(0, '25.619')] [2023-02-24 12:43:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.4). Total num frames: 5959680. Throughput: 0: 914.4. Samples: 1489480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:43:27,876][00205] Avg episode reward: [(0, '25.045')] [2023-02-24 12:43:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5976064. Throughput: 0: 861.5. Samples: 1493554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:43:32,874][00205] Avg episode reward: [(0, '25.332')] [2023-02-24 12:43:34,228][11215] Updated weights for policy 0, policy_version 1460 (0.0027) [2023-02-24 12:43:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 5996544. Throughput: 0: 893.2. Samples: 1498950. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:37,872][00205] Avg episode reward: [(0, '25.772')] [2023-02-24 12:43:42,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6017024. Throughput: 0: 914.1. Samples: 1502144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:43:42,878][00205] Avg episode reward: [(0, '25.907')] [2023-02-24 12:43:43,586][11215] Updated weights for policy 0, policy_version 1470 (0.0016) [2023-02-24 12:43:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3596.1). Total num frames: 6033408. Throughput: 0: 896.5. Samples: 1507838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:47,877][00205] Avg episode reward: [(0, '26.580')] [2023-02-24 12:43:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6045696. Throughput: 0: 862.3. Samples: 1511886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:52,879][00205] Avg episode reward: [(0, '25.424')] [2023-02-24 12:43:56,726][11215] Updated weights for policy 0, policy_version 1480 (0.0012) [2023-02-24 12:43:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 6066176. Throughput: 0: 871.0. Samples: 1514332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:43:57,878][00205] Avg episode reward: [(0, '24.721')] [2023-02-24 12:44:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6086656. Throughput: 0: 907.5. Samples: 1520818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:44:02,877][00205] Avg episode reward: [(0, '24.597')] [2023-02-24 12:44:07,384][11215] Updated weights for policy 0, policy_version 1490 (0.0016) [2023-02-24 12:44:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6103040. Throughput: 0: 881.4. Samples: 1526156. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:44:07,873][00205] Avg episode reward: [(0, '25.599')] [2023-02-24 12:44:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6115328. Throughput: 0: 860.8. Samples: 1528214. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:44:12,875][00205] Avg episode reward: [(0, '26.135')] [2023-02-24 12:44:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6135808. Throughput: 0: 886.1. Samples: 1533430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:17,875][00205] Avg episode reward: [(0, '26.670')] [2023-02-24 12:44:18,915][11215] Updated weights for policy 0, policy_version 1500 (0.0016) [2023-02-24 12:44:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 6160384. Throughput: 0: 911.6. Samples: 1539972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:22,873][00205] Avg episode reward: [(0, '27.034')] [2023-02-24 12:44:27,872][00205] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 6172672. Throughput: 0: 896.8. Samples: 1542502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:27,875][00205] Avg episode reward: [(0, '28.232')] [2023-02-24 12:44:31,281][11215] Updated weights for policy 0, policy_version 1510 (0.0015) [2023-02-24 12:44:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6189056. Throughput: 0: 862.0. Samples: 1546626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:44:32,877][00205] Avg episode reward: [(0, '28.717')] [2023-02-24 12:44:37,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6209536. Throughput: 0: 899.2. Samples: 1552352. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:37,875][00205] Avg episode reward: [(0, '29.136')] [2023-02-24 12:44:41,437][11215] Updated weights for policy 0, policy_version 1520 (0.0015) [2023-02-24 12:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6230016. Throughput: 0: 917.2. Samples: 1555608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:42,876][00205] Avg episode reward: [(0, '27.976')] [2023-02-24 12:44:47,873][00205] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3596.1). Total num frames: 6246400. Throughput: 0: 893.7. Samples: 1561038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:44:47,877][00205] Avg episode reward: [(0, '27.555')] [2023-02-24 12:44:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth... [2023-02-24 12:44:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001314_5382144.pth [2023-02-24 12:44:52,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6258688. Throughput: 0: 862.0. Samples: 1564946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:44:52,876][00205] Avg episode reward: [(0, '26.975')] [2023-02-24 12:44:54,636][11215] Updated weights for policy 0, policy_version 1530 (0.0017) [2023-02-24 12:44:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6279168. Throughput: 0: 877.6. Samples: 1567708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:44:57,873][00205] Avg episode reward: [(0, '26.468')] [2023-02-24 12:45:02,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6299648. Throughput: 0: 902.7. Samples: 1574052. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:45:02,878][00205] Avg episode reward: [(0, '25.282')] [2023-02-24 12:45:05,088][11215] Updated weights for policy 0, policy_version 1540 (0.0013) [2023-02-24 12:45:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6311936. Throughput: 0: 866.2. Samples: 1578952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:45:07,874][00205] Avg episode reward: [(0, '25.806')] [2023-02-24 12:45:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6328320. Throughput: 0: 855.0. Samples: 1580976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:45:12,875][00205] Avg episode reward: [(0, '26.113')] [2023-02-24 12:45:17,190][11215] Updated weights for policy 0, policy_version 1550 (0.0016) [2023-02-24 12:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6348800. Throughput: 0: 888.3. Samples: 1586598. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:45:17,872][00205] Avg episode reward: [(0, '27.188')] [2023-02-24 12:45:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6369280. Throughput: 0: 903.1. Samples: 1592990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:45:22,875][00205] Avg episode reward: [(0, '27.562')] [2023-02-24 12:45:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 6385664. Throughput: 0: 881.2. Samples: 1595262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:45:27,875][00205] Avg episode reward: [(0, '27.304')] [2023-02-24 12:45:29,203][11215] Updated weights for policy 0, policy_version 1560 (0.0017) [2023-02-24 12:45:32,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6397952. Throughput: 0: 851.9. Samples: 1599370. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:45:32,879][00205] Avg episode reward: [(0, '27.522')] [2023-02-24 12:45:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6422528. Throughput: 0: 898.2. Samples: 1605366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:45:37,872][00205] Avg episode reward: [(0, '27.148')] [2023-02-24 12:45:39,717][11215] Updated weights for policy 0, policy_version 1570 (0.0020) [2023-02-24 12:45:42,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3549.8, 300 sec: 3596.2). Total num frames: 6443008. Throughput: 0: 909.9. Samples: 1608652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:45:42,873][00205] Avg episode reward: [(0, '26.756')] [2023-02-24 12:45:47,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3481.7, 300 sec: 3582.3). Total num frames: 6455296. Throughput: 0: 879.4. Samples: 1613624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:45:47,874][00205] Avg episode reward: [(0, '24.953')] [2023-02-24 12:45:52,870][00205] Fps is (10 sec: 2457.7, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 6467584. Throughput: 0: 860.0. Samples: 1617654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:45:52,872][00205] Avg episode reward: [(0, '24.925')] [2023-02-24 12:45:53,081][11215] Updated weights for policy 0, policy_version 1580 (0.0029) [2023-02-24 12:45:57,870][00205] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6492160. Throughput: 0: 887.6. Samples: 1620918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:45:57,875][00205] Avg episode reward: [(0, '25.196')] [2023-02-24 12:46:02,276][11215] Updated weights for policy 0, policy_version 1590 (0.0024) [2023-02-24 12:46:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6512640. Throughput: 0: 907.6. Samples: 1627438. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:46:02,873][00205] Avg episode reward: [(0, '24.389')] [2023-02-24 12:46:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6524928. Throughput: 0: 865.2. Samples: 1631924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:46:07,873][00205] Avg episode reward: [(0, '25.683')] [2023-02-24 12:46:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6541312. Throughput: 0: 859.7. Samples: 1633948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:46:12,874][00205] Avg episode reward: [(0, '24.758')] [2023-02-24 12:46:15,361][11215] Updated weights for policy 0, policy_version 1600 (0.0012) [2023-02-24 12:46:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6561792. Throughput: 0: 900.7. Samples: 1639900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:46:17,873][00205] Avg episode reward: [(0, '26.283')] [2023-02-24 12:46:22,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 6582272. Throughput: 0: 908.8. Samples: 1646262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:46:22,874][00205] Avg episode reward: [(0, '25.931')] [2023-02-24 12:46:26,337][11215] Updated weights for policy 0, policy_version 1610 (0.0015) [2023-02-24 12:46:27,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6594560. Throughput: 0: 880.8. Samples: 1648286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:46:27,876][00205] Avg episode reward: [(0, '25.591')] [2023-02-24 12:46:32,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6610944. Throughput: 0: 860.2. Samples: 1652332. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:46:32,880][00205] Avg episode reward: [(0, '25.339')] [2023-02-24 12:46:37,665][11215] Updated weights for policy 0, policy_version 1620 (0.0016) [2023-02-24 12:46:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6635520. Throughput: 0: 914.7. Samples: 1658814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:46:37,873][00205] Avg episode reward: [(0, '25.575')] [2023-02-24 12:46:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6651904. Throughput: 0: 915.1. Samples: 1662098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:46:42,873][00205] Avg episode reward: [(0, '25.991')] [2023-02-24 12:46:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 6668288. Throughput: 0: 871.0. Samples: 1666634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:46:47,877][00205] Avg episode reward: [(0, '26.007')] [2023-02-24 12:46:47,889][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth... [2023-02-24 12:46:48,098][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001421_5820416.pth [2023-02-24 12:46:50,894][11215] Updated weights for policy 0, policy_version 1630 (0.0025) [2023-02-24 12:46:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6680576. Throughput: 0: 870.9. Samples: 1671114. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 12:46:52,872][00205] Avg episode reward: [(0, '25.376')] [2023-02-24 12:46:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6705152. Throughput: 0: 898.4. Samples: 1674378. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:46:57,875][00205] Avg episode reward: [(0, '24.305')] [2023-02-24 12:47:00,430][11215] Updated weights for policy 0, policy_version 1640 (0.0031) [2023-02-24 12:47:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6721536. Throughput: 0: 909.5. Samples: 1680826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:47:02,873][00205] Avg episode reward: [(0, '24.712')] [2023-02-24 12:47:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 6737920. Throughput: 0: 861.4. Samples: 1685024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:47:07,881][00205] Avg episode reward: [(0, '23.768')] [2023-02-24 12:47:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 6754304. Throughput: 0: 861.9. Samples: 1687070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:47:12,873][00205] Avg episode reward: [(0, '25.100')] [2023-02-24 12:47:13,240][11215] Updated weights for policy 0, policy_version 1650 (0.0024) [2023-02-24 12:47:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6774784. Throughput: 0: 910.3. Samples: 1693294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:47:17,881][00205] Avg episode reward: [(0, '24.110')] [2023-02-24 12:47:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 6795264. Throughput: 0: 903.2. Samples: 1699456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:47:22,872][00205] Avg episode reward: [(0, '24.172')] [2023-02-24 12:47:23,684][11215] Updated weights for policy 0, policy_version 1660 (0.0018) [2023-02-24 12:47:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6807552. Throughput: 0: 876.6. Samples: 1701544. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:47:27,874][00205] Avg episode reward: [(0, '24.170')] [2023-02-24 12:47:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6828032. Throughput: 0: 869.7. Samples: 1705772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:47:32,872][00205] Avg episode reward: [(0, '24.176')] [2023-02-24 12:47:35,573][11215] Updated weights for policy 0, policy_version 1670 (0.0019) [2023-02-24 12:47:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6848512. Throughput: 0: 919.2. Samples: 1712476. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:47:37,873][00205] Avg episode reward: [(0, '23.699')] [2023-02-24 12:47:42,883][00205] Fps is (10 sec: 4090.5, 60 sec: 3617.3, 300 sec: 3554.4). Total num frames: 6868992. Throughput: 0: 918.7. Samples: 1715732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:47:42,897][00205] Avg episode reward: [(0, '24.180')] [2023-02-24 12:47:47,330][11215] Updated weights for policy 0, policy_version 1680 (0.0012) [2023-02-24 12:47:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6881280. Throughput: 0: 870.4. Samples: 1719994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:47:47,874][00205] Avg episode reward: [(0, '25.763')] [2023-02-24 12:47:52,870][00205] Fps is (10 sec: 2871.1, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6897664. Throughput: 0: 886.6. Samples: 1724922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:47:52,878][00205] Avg episode reward: [(0, '25.965')] [2023-02-24 12:47:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6918144. Throughput: 0: 913.5. Samples: 1728176. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:47:57,873][00205] Avg episode reward: [(0, '25.937')] [2023-02-24 12:47:57,956][11215] Updated weights for policy 0, policy_version 1690 (0.0014) [2023-02-24 12:48:02,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 6938624. Throughput: 0: 909.1. Samples: 1734204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:48:02,880][00205] Avg episode reward: [(0, '24.937')] [2023-02-24 12:48:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6950912. Throughput: 0: 865.3. Samples: 1738396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:48:07,875][00205] Avg episode reward: [(0, '25.501')] [2023-02-24 12:48:10,916][11215] Updated weights for policy 0, policy_version 1700 (0.0011) [2023-02-24 12:48:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6971392. Throughput: 0: 868.0. Samples: 1740602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:48:12,876][00205] Avg episode reward: [(0, '25.679')] [2023-02-24 12:48:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6991872. Throughput: 0: 921.3. Samples: 1747232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:48:17,873][00205] Avg episode reward: [(0, '24.206')] [2023-02-24 12:48:20,241][11215] Updated weights for policy 0, policy_version 1710 (0.0011) [2023-02-24 12:48:22,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7008256. Throughput: 0: 896.1. Samples: 1752804. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:48:22,881][00205] Avg episode reward: [(0, '23.939')] [2023-02-24 12:48:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7024640. Throughput: 0: 870.0. Samples: 1754872. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:48:27,878][00205] Avg episode reward: [(0, '25.550')] [2023-02-24 12:48:32,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7041024. Throughput: 0: 881.2. Samples: 1759646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:48:32,873][00205] Avg episode reward: [(0, '26.768')] [2023-02-24 12:48:33,291][11215] Updated weights for policy 0, policy_version 1720 (0.0019) [2023-02-24 12:48:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7061504. Throughput: 0: 917.9. Samples: 1766228. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:48:37,876][00205] Avg episode reward: [(0, '26.012')] [2023-02-24 12:48:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3482.4, 300 sec: 3540.6). Total num frames: 7077888. Throughput: 0: 911.4. Samples: 1769188. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 12:48:42,873][00205] Avg episode reward: [(0, '25.593')] [2023-02-24 12:48:44,671][11215] Updated weights for policy 0, policy_version 1730 (0.0015) [2023-02-24 12:48:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7094272. Throughput: 0: 868.4. Samples: 1773282. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 12:48:47,880][00205] Avg episode reward: [(0, '25.264')] [2023-02-24 12:48:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth... [2023-02-24 12:48:48,073][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001525_6246400.pth [2023-02-24 12:48:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7110656. Throughput: 0: 892.0. Samples: 1778538. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:48:52,878][00205] Avg episode reward: [(0, '24.936')] [2023-02-24 12:48:55,723][11215] Updated weights for policy 0, policy_version 1740 (0.0015) [2023-02-24 12:48:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7135232. Throughput: 0: 915.3. Samples: 1781790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:48:57,873][00205] Avg episode reward: [(0, '24.912')] [2023-02-24 12:49:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7151616. Throughput: 0: 892.8. Samples: 1787408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:02,877][00205] Avg episode reward: [(0, '25.786')] [2023-02-24 12:49:07,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7163904. Throughput: 0: 859.7. Samples: 1791492. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 12:49:07,879][00205] Avg episode reward: [(0, '25.678')] [2023-02-24 12:49:08,364][11215] Updated weights for policy 0, policy_version 1750 (0.0021) [2023-02-24 12:49:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7184384. Throughput: 0: 873.7. Samples: 1794190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 12:49:12,873][00205] Avg episode reward: [(0, '26.490')] [2023-02-24 12:49:17,872][00205] Fps is (10 sec: 4095.6, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 7204864. Throughput: 0: 912.6. Samples: 1800716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:17,875][00205] Avg episode reward: [(0, '26.776')] [2023-02-24 12:49:18,189][11215] Updated weights for policy 0, policy_version 1760 (0.0019) [2023-02-24 12:49:22,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7221248. Throughput: 0: 881.0. Samples: 1805876. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:22,874][00205] Avg episode reward: [(0, '25.964')] [2023-02-24 12:49:27,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7233536. Throughput: 0: 860.6. Samples: 1807916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:27,874][00205] Avg episode reward: [(0, '26.699')] [2023-02-24 12:49:31,253][11215] Updated weights for policy 0, policy_version 1770 (0.0017) [2023-02-24 12:49:32,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7254016. Throughput: 0: 882.6. Samples: 1812998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:32,872][00205] Avg episode reward: [(0, '24.613')] [2023-02-24 12:49:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7278592. Throughput: 0: 910.6. Samples: 1819514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:49:37,872][00205] Avg episode reward: [(0, '23.298')] [2023-02-24 12:49:41,760][11215] Updated weights for policy 0, policy_version 1780 (0.0011) [2023-02-24 12:49:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7290880. Throughput: 0: 897.6. Samples: 1822180. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:42,877][00205] Avg episode reward: [(0, '24.331')] [2023-02-24 12:49:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7307264. Throughput: 0: 863.6. Samples: 1826270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:49:47,880][00205] Avg episode reward: [(0, '23.984')] [2023-02-24 12:49:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7327744. Throughput: 0: 899.7. Samples: 1831978. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:49:52,878][00205] Avg episode reward: [(0, '24.064')] [2023-02-24 12:49:53,624][11215] Updated weights for policy 0, policy_version 1790 (0.0031) [2023-02-24 12:49:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7348224. Throughput: 0: 913.0. Samples: 1835276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:49:57,877][00205] Avg episode reward: [(0, '24.798')] [2023-02-24 12:50:02,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 7364608. Throughput: 0: 883.9. Samples: 1840488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:50:02,874][00205] Avg episode reward: [(0, '25.309')] [2023-02-24 12:50:05,692][11215] Updated weights for policy 0, policy_version 1800 (0.0016) [2023-02-24 12:50:07,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7376896. Throughput: 0: 862.8. Samples: 1844704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:50:07,879][00205] Avg episode reward: [(0, '26.188')] [2023-02-24 12:50:12,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7397376. Throughput: 0: 881.2. Samples: 1847570. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:50:12,874][00205] Avg episode reward: [(0, '25.033')] [2023-02-24 12:50:15,993][11215] Updated weights for policy 0, policy_version 1810 (0.0020) [2023-02-24 12:50:17,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7417856. Throughput: 0: 914.2. Samples: 1854138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:50:17,872][00205] Avg episode reward: [(0, '25.616')] [2023-02-24 12:50:22,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7434240. Throughput: 0: 882.8. Samples: 1859244. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 12:50:22,875][00205] Avg episode reward: [(0, '26.669')] [2023-02-24 12:50:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7446528. Throughput: 0: 869.6. Samples: 1861312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:50:27,873][00205] Avg episode reward: [(0, '25.640')] [2023-02-24 12:50:29,015][11215] Updated weights for policy 0, policy_version 1820 (0.0029) [2023-02-24 12:50:32,870][00205] Fps is (10 sec: 3687.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7471104. Throughput: 0: 896.4. Samples: 1866610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:50:32,873][00205] Avg episode reward: [(0, '26.284')] [2023-02-24 12:50:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7491584. Throughput: 0: 918.1. Samples: 1873294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:50:37,872][00205] Avg episode reward: [(0, '25.689')] [2023-02-24 12:50:38,382][11215] Updated weights for policy 0, policy_version 1830 (0.0020) [2023-02-24 12:50:42,872][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 7507968. Throughput: 0: 899.0. Samples: 1875732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:50:42,877][00205] Avg episode reward: [(0, '26.624')] [2023-02-24 12:50:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7520256. Throughput: 0: 875.2. Samples: 1879870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:50:47,879][00205] Avg episode reward: [(0, '26.732')] [2023-02-24 12:50:47,895][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth... 
[2023-02-24 12:50:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001628_6668288.pth [2023-02-24 12:50:51,327][11215] Updated weights for policy 0, policy_version 1840 (0.0023) [2023-02-24 12:50:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7540736. Throughput: 0: 908.0. Samples: 1885564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:50:52,873][00205] Avg episode reward: [(0, '26.612')] [2023-02-24 12:50:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7561216. Throughput: 0: 915.1. Samples: 1888750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:50:57,877][00205] Avg episode reward: [(0, '27.062')] [2023-02-24 12:51:02,874][11215] Updated weights for policy 0, policy_version 1850 (0.0012) [2023-02-24 12:51:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7577600. Throughput: 0: 880.1. Samples: 1893744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:51:02,880][00205] Avg episode reward: [(0, '26.215')] [2023-02-24 12:51:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7589888. Throughput: 0: 856.6. Samples: 1897788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:51:07,872][00205] Avg episode reward: [(0, '26.323')] [2023-02-24 12:51:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7610368. Throughput: 0: 876.8. Samples: 1900768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:51:12,873][00205] Avg episode reward: [(0, '27.406')] [2023-02-24 12:51:14,417][11215] Updated weights for policy 0, policy_version 1860 (0.0014) [2023-02-24 12:51:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7630848. Throughput: 0: 899.7. Samples: 1907096. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:51:17,872][00205] Avg episode reward: [(0, '28.808')] [2023-02-24 12:51:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3554.5). Total num frames: 7643136. Throughput: 0: 849.4. Samples: 1911516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:51:22,877][00205] Avg episode reward: [(0, '29.578')] [2023-02-24 12:51:27,793][11215] Updated weights for policy 0, policy_version 1870 (0.0016) [2023-02-24 12:51:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7659520. Throughput: 0: 838.7. Samples: 1913472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:51:27,880][00205] Avg episode reward: [(0, '28.223')] [2023-02-24 12:51:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7680000. Throughput: 0: 871.3. Samples: 1919078. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:51:32,873][00205] Avg episode reward: [(0, '27.765')] [2023-02-24 12:51:37,190][11215] Updated weights for policy 0, policy_version 1880 (0.0018) [2023-02-24 12:51:37,873][00205] Fps is (10 sec: 4094.6, 60 sec: 3481.4, 300 sec: 3554.5). Total num frames: 7700480. Throughput: 0: 891.2. Samples: 1925672. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:51:37,876][00205] Avg episode reward: [(0, '27.676')] [2023-02-24 12:51:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 7712768. Throughput: 0: 866.8. Samples: 1927756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:51:42,877][00205] Avg episode reward: [(0, '26.898')] [2023-02-24 12:51:47,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 7729152. Throughput: 0: 847.1. Samples: 1931862. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:51:47,881][00205] Avg episode reward: [(0, '26.736')] [2023-02-24 12:51:50,187][11215] Updated weights for policy 0, policy_version 1890 (0.0014) [2023-02-24 12:51:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7749632. Throughput: 0: 897.0. Samples: 1938152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:51:52,875][00205] Avg episode reward: [(0, '25.938')] [2023-02-24 12:51:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7770112. Throughput: 0: 902.8. Samples: 1941396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:51:57,874][00205] Avg episode reward: [(0, '26.483')] [2023-02-24 12:52:00,893][11215] Updated weights for policy 0, policy_version 1900 (0.0020) [2023-02-24 12:52:02,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3481.5, 300 sec: 3554.5). Total num frames: 7786496. Throughput: 0: 869.7. Samples: 1946236. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:52:02,880][00205] Avg episode reward: [(0, '26.473')] [2023-02-24 12:52:07,870][00205] Fps is (10 sec: 3277.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7802880. Throughput: 0: 868.6. Samples: 1950604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:52:07,879][00205] Avg episode reward: [(0, '26.178')] [2023-02-24 12:52:12,327][11215] Updated weights for policy 0, policy_version 1910 (0.0013) [2023-02-24 12:52:12,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7823360. Throughput: 0: 898.8. Samples: 1953916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:12,880][00205] Avg episode reward: [(0, '25.571')] [2023-02-24 12:52:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7843840. Throughput: 0: 920.7. Samples: 1960510. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:17,879][00205] Avg episode reward: [(0, '25.162')] [2023-02-24 12:52:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7856128. Throughput: 0: 871.2. Samples: 1964872. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:52:22,872][00205] Avg episode reward: [(0, '24.254')] [2023-02-24 12:52:24,711][11215] Updated weights for policy 0, policy_version 1920 (0.0018) [2023-02-24 12:52:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7872512. Throughput: 0: 869.5. Samples: 1966884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:27,875][00205] Avg episode reward: [(0, '26.902')] [2023-02-24 12:52:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 7897088. Throughput: 0: 913.4. Samples: 1972966. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:32,878][00205] Avg episode reward: [(0, '27.304')] [2023-02-24 12:52:34,676][11215] Updated weights for policy 0, policy_version 1930 (0.0012) [2023-02-24 12:52:37,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3550.1, 300 sec: 3540.8). Total num frames: 7913472. Throughput: 0: 914.4. Samples: 1979300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:37,876][00205] Avg episode reward: [(0, '27.444')] [2023-02-24 12:52:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7929856. Throughput: 0: 888.5. Samples: 1981376. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:52:42,872][00205] Avg episode reward: [(0, '27.197')] [2023-02-24 12:52:47,644][11215] Updated weights for policy 0, policy_version 1940 (0.0018) [2023-02-24 12:52:47,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7946240. Throughput: 0: 874.3. Samples: 1985578. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:52:47,876][00205] Avg episode reward: [(0, '28.879')] [2023-02-24 12:52:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth... [2023-02-24 12:52:48,009][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001732_7094272.pth [2023-02-24 12:52:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7966720. Throughput: 0: 919.1. Samples: 1991962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:52:52,873][00205] Avg episode reward: [(0, '29.815')] [2023-02-24 12:52:57,558][11215] Updated weights for policy 0, policy_version 1950 (0.0023) [2023-02-24 12:52:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 7987200. Throughput: 0: 918.8. Samples: 1995260. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:52:57,872][00205] Avg episode reward: [(0, '31.595')] [2023-02-24 12:52:57,882][11201] Saving new best policy, reward=31.595! [2023-02-24 12:53:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 7999488. Throughput: 0: 871.2. Samples: 1999716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:53:02,878][00205] Avg episode reward: [(0, '30.868')] [2023-02-24 12:53:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8015872. Throughput: 0: 877.0. Samples: 2004338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:53:07,876][00205] Avg episode reward: [(0, '31.491')] [2023-02-24 12:53:10,102][11215] Updated weights for policy 0, policy_version 1960 (0.0012) [2023-02-24 12:53:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8036352. Throughput: 0: 904.5. Samples: 2007588. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:53:12,873][00205] Avg episode reward: [(0, '31.804')] [2023-02-24 12:53:12,898][11201] Saving new best policy, reward=31.804! [2023-02-24 12:53:17,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 8056832. Throughput: 0: 915.1. Samples: 2014144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:53:17,875][00205] Avg episode reward: [(0, '31.345')] [2023-02-24 12:53:21,407][11215] Updated weights for policy 0, policy_version 1970 (0.0035) [2023-02-24 12:53:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8073216. Throughput: 0: 867.2. Samples: 2018322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:53:22,874][00205] Avg episode reward: [(0, '31.241')] [2023-02-24 12:53:27,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8089600. Throughput: 0: 868.6. Samples: 2020464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:53:27,874][00205] Avg episode reward: [(0, '29.691')] [2023-02-24 12:53:32,595][11215] Updated weights for policy 0, policy_version 1980 (0.0012) [2023-02-24 12:53:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8110080. Throughput: 0: 912.1. Samples: 2026624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:53:32,872][00205] Avg episode reward: [(0, '27.546')] [2023-02-24 12:53:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8130560. Throughput: 0: 903.2. Samples: 2032604. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:53:37,876][00205] Avg episode reward: [(0, '26.472')] [2023-02-24 12:53:42,871][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8142848. Throughput: 0: 875.6. Samples: 2034662. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:53:42,874][00205] Avg episode reward: [(0, '25.526')] [2023-02-24 12:53:45,377][11215] Updated weights for policy 0, policy_version 1990 (0.0025) [2023-02-24 12:53:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8159232. Throughput: 0: 871.5. Samples: 2038934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:53:47,877][00205] Avg episode reward: [(0, '25.844')] [2023-02-24 12:53:52,873][00205] Fps is (10 sec: 3685.1, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 8179712. Throughput: 0: 911.0. Samples: 2045334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:53:52,875][00205] Avg episode reward: [(0, '25.736')] [2023-02-24 12:53:55,101][11215] Updated weights for policy 0, policy_version 2000 (0.0011) [2023-02-24 12:53:57,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 8200192. Throughput: 0: 910.7. Samples: 2048570. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:53:57,876][00205] Avg episode reward: [(0, '25.999')] [2023-02-24 12:54:02,870][00205] Fps is (10 sec: 3278.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8212480. Throughput: 0: 863.1. Samples: 2052980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:54:02,875][00205] Avg episode reward: [(0, '26.480')] [2023-02-24 12:54:07,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8228864. Throughput: 0: 879.3. Samples: 2057890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:54:07,876][00205] Avg episode reward: [(0, '27.098')] [2023-02-24 12:54:08,056][11215] Updated weights for policy 0, policy_version 2010 (0.0030) [2023-02-24 12:54:12,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8253440. Throughput: 0: 903.7. Samples: 2061132. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:54:12,873][00205] Avg episode reward: [(0, '28.201')] [2023-02-24 12:54:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8269824. Throughput: 0: 908.6. Samples: 2067510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:54:17,872][00205] Avg episode reward: [(0, '26.869')] [2023-02-24 12:54:18,176][11215] Updated weights for policy 0, policy_version 2020 (0.0015) [2023-02-24 12:54:22,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8286208. Throughput: 0: 868.8. Samples: 2071702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:54:22,873][00205] Avg episode reward: [(0, '26.320')] [2023-02-24 12:54:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8302592. Throughput: 0: 870.4. Samples: 2073830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:54:27,872][00205] Avg episode reward: [(0, '26.621')] [2023-02-24 12:54:30,189][11215] Updated weights for policy 0, policy_version 2030 (0.0017) [2023-02-24 12:54:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8323072. Throughput: 0: 920.5. Samples: 2080358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:54:32,875][00205] Avg episode reward: [(0, '26.115')] [2023-02-24 12:54:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8343552. Throughput: 0: 907.8. Samples: 2086182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:54:37,872][00205] Avg episode reward: [(0, '27.023')] [2023-02-24 12:54:41,947][11215] Updated weights for policy 0, policy_version 2040 (0.0019) [2023-02-24 12:54:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8355840. Throughput: 0: 883.0. Samples: 2088302. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:54:42,879][00205] Avg episode reward: [(0, '27.933')] [2023-02-24 12:54:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8376320. Throughput: 0: 888.0. Samples: 2092938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:54:47,873][00205] Avg episode reward: [(0, '28.366')] [2023-02-24 12:54:47,883][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth... [2023-02-24 12:54:48,006][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001836_7520256.pth [2023-02-24 12:54:52,537][11215] Updated weights for policy 0, policy_version 2050 (0.0013) [2023-02-24 12:54:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 8396800. Throughput: 0: 921.9. Samples: 2099376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:54:52,879][00205] Avg episode reward: [(0, '28.745')] [2023-02-24 12:54:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 8413184. Throughput: 0: 919.3. Samples: 2102500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:54:57,875][00205] Avg episode reward: [(0, '29.440')] [2023-02-24 12:55:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8425472. Throughput: 0: 865.4. Samples: 2106454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:55:02,878][00205] Avg episode reward: [(0, '28.903')] [2023-02-24 12:55:05,727][11215] Updated weights for policy 0, policy_version 2060 (0.0016) [2023-02-24 12:55:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8445952. Throughput: 0: 887.2. Samples: 2111624. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:55:07,873][00205] Avg episode reward: [(0, '28.550')] [2023-02-24 12:55:12,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8466432. Throughput: 0: 911.7. Samples: 2114858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:55:12,879][00205] Avg episode reward: [(0, '28.947')] [2023-02-24 12:55:15,273][11215] Updated weights for policy 0, policy_version 2070 (0.0011) [2023-02-24 12:55:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8482816. Throughput: 0: 897.2. Samples: 2120732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:55:17,878][00205] Avg episode reward: [(0, '27.978')] [2023-02-24 12:55:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8499200. Throughput: 0: 859.9. Samples: 2124878. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 12:55:22,873][00205] Avg episode reward: [(0, '27.826')] [2023-02-24 12:55:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 8515584. Throughput: 0: 868.5. Samples: 2127384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:55:27,875][00205] Avg episode reward: [(0, '27.059')] [2023-02-24 12:55:28,038][11215] Updated weights for policy 0, policy_version 2080 (0.0035) [2023-02-24 12:55:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 8540160. Throughput: 0: 912.2. Samples: 2133986. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:55:32,872][00205] Avg episode reward: [(0, '29.171')] [2023-02-24 12:55:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8556544. Throughput: 0: 889.4. Samples: 2139398. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:55:37,877][00205] Avg episode reward: [(0, '29.604')] [2023-02-24 12:55:38,962][11215] Updated weights for policy 0, policy_version 2090 (0.0012) [2023-02-24 12:55:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8568832. Throughput: 0: 865.1. Samples: 2141430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:55:42,875][00205] Avg episode reward: [(0, '30.581')] [2023-02-24 12:55:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8589312. Throughput: 0: 890.4. Samples: 2146524. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:55:47,875][00205] Avg episode reward: [(0, '29.880')] [2023-02-24 12:55:50,256][11215] Updated weights for policy 0, policy_version 2100 (0.0031) [2023-02-24 12:55:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8609792. Throughput: 0: 920.1. Samples: 2153030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:55:52,872][00205] Avg episode reward: [(0, '29.177')] [2023-02-24 12:55:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8626176. Throughput: 0: 911.2. Samples: 2155860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:55:57,877][00205] Avg episode reward: [(0, '30.053')] [2023-02-24 12:56:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8638464. Throughput: 0: 871.9. Samples: 2159966. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:56:02,878][00205] Avg episode reward: [(0, '28.558')] [2023-02-24 12:56:03,106][11215] Updated weights for policy 0, policy_version 2110 (0.0042) [2023-02-24 12:56:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 8658944. Throughput: 0: 901.4. Samples: 2165442. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:56:07,872][00205] Avg episode reward: [(0, '27.659')] [2023-02-24 12:56:12,814][11215] Updated weights for policy 0, policy_version 2120 (0.0012) [2023-02-24 12:56:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8683520. Throughput: 0: 915.2. Samples: 2168566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:56:12,872][00205] Avg episode reward: [(0, '28.102')] [2023-02-24 12:56:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8695808. Throughput: 0: 895.8. Samples: 2174298. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:56:17,875][00205] Avg episode reward: [(0, '27.283')] [2023-02-24 12:56:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8712192. Throughput: 0: 867.5. Samples: 2178436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:56:22,877][00205] Avg episode reward: [(0, '27.355')] [2023-02-24 12:56:25,659][11215] Updated weights for policy 0, policy_version 2130 (0.0022) [2023-02-24 12:56:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8732672. Throughput: 0: 882.9. Samples: 2181162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:56:27,873][00205] Avg episode reward: [(0, '29.071')] [2023-02-24 12:56:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8753152. Throughput: 0: 916.0. Samples: 2187742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:56:32,878][00205] Avg episode reward: [(0, '28.206')] [2023-02-24 12:56:35,983][11215] Updated weights for policy 0, policy_version 2140 (0.0020) [2023-02-24 12:56:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 8769536. Throughput: 0: 885.6. Samples: 2192880. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:56:37,879][00205] Avg episode reward: [(0, '29.021')] [2023-02-24 12:56:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8781824. Throughput: 0: 867.6. Samples: 2194900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:56:42,872][00205] Avg episode reward: [(0, '29.582')] [2023-02-24 12:56:47,810][11215] Updated weights for policy 0, policy_version 2150 (0.0023) [2023-02-24 12:56:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8806400. Throughput: 0: 896.2. Samples: 2200294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:56:47,877][00205] Avg episode reward: [(0, '28.292')] [2023-02-24 12:56:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth... [2023-02-24 12:56:48,011][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001940_7946240.pth [2023-02-24 12:56:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8826880. Throughput: 0: 920.0. Samples: 2206842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 12:56:52,874][00205] Avg episode reward: [(0, '27.672')] [2023-02-24 12:56:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8839168. Throughput: 0: 906.6. Samples: 2209364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:56:57,876][00205] Avg episode reward: [(0, '27.579')] [2023-02-24 12:56:59,635][11215] Updated weights for policy 0, policy_version 2160 (0.0015) [2023-02-24 12:57:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8855552. Throughput: 0: 869.4. Samples: 2213422. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:57:02,873][00205] Avg episode reward: [(0, '26.788')] [2023-02-24 12:57:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8876032. Throughput: 0: 909.3. Samples: 2219356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:57:07,877][00205] Avg episode reward: [(0, '28.423')] [2023-02-24 12:57:10,273][11215] Updated weights for policy 0, policy_version 2170 (0.0012) [2023-02-24 12:57:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8896512. Throughput: 0: 919.4. Samples: 2222536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:57:12,879][00205] Avg episode reward: [(0, '28.220')] [2023-02-24 12:57:17,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8912896. Throughput: 0: 890.7. Samples: 2227826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 12:57:17,875][00205] Avg episode reward: [(0, '28.280')] [2023-02-24 12:57:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8925184. Throughput: 0: 871.3. Samples: 2232088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:57:22,877][00205] Avg episode reward: [(0, '27.497')] [2023-02-24 12:57:23,192][11215] Updated weights for policy 0, policy_version 2180 (0.0028) [2023-02-24 12:57:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8949760. Throughput: 0: 891.6. Samples: 2235024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:57:27,873][00205] Avg episode reward: [(0, '29.520')] [2023-02-24 12:57:32,529][11215] Updated weights for policy 0, policy_version 2190 (0.0018) [2023-02-24 12:57:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 8970240. Throughput: 0: 919.3. Samples: 2241662. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:57:32,872][00205] Avg episode reward: [(0, '29.056')] [2023-02-24 12:57:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 8982528. Throughput: 0: 882.6. Samples: 2246558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:57:37,872][00205] Avg episode reward: [(0, '28.206')] [2023-02-24 12:57:42,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 8998912. Throughput: 0: 872.4. Samples: 2248622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:57:42,884][00205] Avg episode reward: [(0, '27.766')] [2023-02-24 12:57:45,525][11215] Updated weights for policy 0, policy_version 2200 (0.0026) [2023-02-24 12:57:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9019392. Throughput: 0: 904.6. Samples: 2254130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:57:47,872][00205] Avg episode reward: [(0, '27.727')] [2023-02-24 12:57:52,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9039872. Throughput: 0: 918.0. Samples: 2260664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:57:52,873][00205] Avg episode reward: [(0, '28.475')] [2023-02-24 12:57:56,016][11215] Updated weights for policy 0, policy_version 2210 (0.0018) [2023-02-24 12:57:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9056256. Throughput: 0: 899.6. Samples: 2263016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:57:57,877][00205] Avg episode reward: [(0, '26.638')] [2023-02-24 12:58:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9068544. Throughput: 0: 873.3. Samples: 2267124. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:02,879][00205] Avg episode reward: [(0, '26.333')] [2023-02-24 12:58:07,770][11215] Updated weights for policy 0, policy_version 2220 (0.0034) [2023-02-24 12:58:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9093120. Throughput: 0: 913.2. Samples: 2273180. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:58:07,873][00205] Avg episode reward: [(0, '25.888')] [2023-02-24 12:58:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9113600. Throughput: 0: 920.0. Samples: 2276424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:58:12,875][00205] Avg episode reward: [(0, '24.725')] [2023-02-24 12:58:17,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3549.6, 300 sec: 3568.3). Total num frames: 9125888. Throughput: 0: 887.7. Samples: 2281612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:17,879][00205] Avg episode reward: [(0, '24.592')] [2023-02-24 12:58:19,754][11215] Updated weights for policy 0, policy_version 2230 (0.0016) [2023-02-24 12:58:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9142272. Throughput: 0: 872.0. Samples: 2285796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:22,878][00205] Avg episode reward: [(0, '24.101')] [2023-02-24 12:58:27,870][00205] Fps is (10 sec: 3688.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9162752. Throughput: 0: 895.9. Samples: 2288936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:27,872][00205] Avg episode reward: [(0, '25.623')] [2023-02-24 12:58:30,071][11215] Updated weights for policy 0, policy_version 2240 (0.0014) [2023-02-24 12:58:32,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 9183232. Throughput: 0: 919.9. Samples: 2295528. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:32,874][00205] Avg episode reward: [(0, '26.641')] [2023-02-24 12:58:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9199616. Throughput: 0: 881.3. Samples: 2300322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:37,878][00205] Avg episode reward: [(0, '26.937')] [2023-02-24 12:58:42,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9211904. Throughput: 0: 872.7. Samples: 2302288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:58:42,878][00205] Avg episode reward: [(0, '27.227')] [2023-02-24 12:58:42,996][11215] Updated weights for policy 0, policy_version 2250 (0.0017) [2023-02-24 12:58:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9236480. Throughput: 0: 912.1. Samples: 2308170. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:58:47,872][00205] Avg episode reward: [(0, '26.217')] [2023-02-24 12:58:47,884][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth... [2023-02-24 12:58:48,000][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002045_8376320.pth [2023-02-24 12:58:52,474][11215] Updated weights for policy 0, policy_version 2260 (0.0020) [2023-02-24 12:58:52,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3618.0, 300 sec: 3582.3). Total num frames: 9256960. Throughput: 0: 922.3. Samples: 2314686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:58:52,879][00205] Avg episode reward: [(0, '27.372')] [2023-02-24 12:58:57,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 9269248. Throughput: 0: 896.1. Samples: 2316748. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:58:57,876][00205] Avg episode reward: [(0, '26.596')] [2023-02-24 12:59:02,870][00205] Fps is (10 sec: 2867.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9285632. Throughput: 0: 872.1. Samples: 2320854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:59:02,873][00205] Avg episode reward: [(0, '26.704')] [2023-02-24 12:59:05,405][11215] Updated weights for policy 0, policy_version 2270 (0.0020) [2023-02-24 12:59:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9306112. Throughput: 0: 917.7. Samples: 2327092. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:59:07,872][00205] Avg episode reward: [(0, '26.088')] [2023-02-24 12:59:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9326592. Throughput: 0: 919.9. Samples: 2330330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:59:12,874][00205] Avg episode reward: [(0, '26.037')] [2023-02-24 12:59:16,197][11215] Updated weights for policy 0, policy_version 2280 (0.0037) [2023-02-24 12:59:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.4, 300 sec: 3582.3). Total num frames: 9342976. Throughput: 0: 882.9. Samples: 2335258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:59:17,876][00205] Avg episode reward: [(0, '26.873')] [2023-02-24 12:59:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9359360. Throughput: 0: 871.3. Samples: 2339530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 12:59:22,872][00205] Avg episode reward: [(0, '27.022')] [2023-02-24 12:59:27,569][11215] Updated weights for policy 0, policy_version 2290 (0.0015) [2023-02-24 12:59:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9379840. Throughput: 0: 900.6. Samples: 2342814. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:59:27,872][00205] Avg episode reward: [(0, '27.863')] [2023-02-24 12:59:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 9400320. Throughput: 0: 917.4. Samples: 2349454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:59:32,875][00205] Avg episode reward: [(0, '27.149')] [2023-02-24 12:59:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9412608. Throughput: 0: 869.7. Samples: 2353822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 12:59:37,873][00205] Avg episode reward: [(0, '27.391')] [2023-02-24 12:59:40,176][11215] Updated weights for policy 0, policy_version 2300 (0.0019) [2023-02-24 12:59:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9428992. Throughput: 0: 869.1. Samples: 2355858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:59:42,872][00205] Avg episode reward: [(0, '28.339')] [2023-02-24 12:59:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9449472. Throughput: 0: 911.9. Samples: 2361888. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 12:59:47,873][00205] Avg episode reward: [(0, '28.968')] [2023-02-24 12:59:49,807][11215] Updated weights for policy 0, policy_version 2310 (0.0011) [2023-02-24 12:59:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3582.3). Total num frames: 9469952. Throughput: 0: 913.8. Samples: 2368214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:59:52,880][00205] Avg episode reward: [(0, '28.679')] [2023-02-24 12:59:57,871][00205] Fps is (10 sec: 3685.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9486336. Throughput: 0: 888.1. Samples: 2370298. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 12:59:57,875][00205] Avg episode reward: [(0, '28.517')] [2023-02-24 13:00:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9498624. Throughput: 0: 867.0. Samples: 2374274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:02,872][00205] Avg episode reward: [(0, '29.111')] [2023-02-24 13:00:03,025][11215] Updated weights for policy 0, policy_version 2320 (0.0020) [2023-02-24 13:00:07,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9523200. Throughput: 0: 918.4. Samples: 2380856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:07,873][00205] Avg episode reward: [(0, '29.438')] [2023-02-24 13:00:12,873][00205] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 9539584. Throughput: 0: 919.3. Samples: 2384184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:12,876][00205] Avg episode reward: [(0, '29.535')] [2023-02-24 13:00:12,976][11215] Updated weights for policy 0, policy_version 2330 (0.0015) [2023-02-24 13:00:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 9555968. Throughput: 0: 871.9. Samples: 2388692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:17,876][00205] Avg episode reward: [(0, '29.166')] [2023-02-24 13:00:22,870][00205] Fps is (10 sec: 3277.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9572352. Throughput: 0: 878.2. Samples: 2393340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:22,879][00205] Avg episode reward: [(0, '28.664')] [2023-02-24 13:00:25,293][11215] Updated weights for policy 0, policy_version 2340 (0.0011) [2023-02-24 13:00:27,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9592832. Throughput: 0: 907.0. Samples: 2396672. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:00:27,877][00205] Avg episode reward: [(0, '29.703')] [2023-02-24 13:00:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9613312. Throughput: 0: 916.7. Samples: 2403140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:32,873][00205] Avg episode reward: [(0, '28.917')] [2023-02-24 13:00:36,509][11215] Updated weights for policy 0, policy_version 2350 (0.0013) [2023-02-24 13:00:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9625600. Throughput: 0: 869.5. Samples: 2407340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:00:37,872][00205] Avg episode reward: [(0, '30.191')] [2023-02-24 13:00:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9641984. Throughput: 0: 868.9. Samples: 2409398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:00:42,881][00205] Avg episode reward: [(0, '28.555')] [2023-02-24 13:00:47,770][11215] Updated weights for policy 0, policy_version 2360 (0.0012) [2023-02-24 13:00:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9666560. Throughput: 0: 915.4. Samples: 2415466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:00:47,872][00205] Avg episode reward: [(0, '28.287')] [2023-02-24 13:00:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth... [2023-02-24 13:00:48,005][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002150_8806400.pth [2023-02-24 13:00:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9682944. Throughput: 0: 906.2. Samples: 2421636. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:00:52,875][00205] Avg episode reward: [(0, '28.414')] [2023-02-24 13:00:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 9699328. Throughput: 0: 876.8. Samples: 2423636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:00:57,872][00205] Avg episode reward: [(0, '28.747')] [2023-02-24 13:01:00,664][11215] Updated weights for policy 0, policy_version 2370 (0.0020) [2023-02-24 13:01:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9715712. Throughput: 0: 870.8. Samples: 2427878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:01:02,879][00205] Avg episode reward: [(0, '28.769')] [2023-02-24 13:01:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9736192. Throughput: 0: 916.4. Samples: 2434578. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:01:07,873][00205] Avg episode reward: [(0, '29.246')] [2023-02-24 13:01:10,109][11215] Updated weights for policy 0, policy_version 2380 (0.0012) [2023-02-24 13:01:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 9756672. Throughput: 0: 914.9. Samples: 2437842. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:01:12,877][00205] Avg episode reward: [(0, '28.743')] [2023-02-24 13:01:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9768960. Throughput: 0: 871.5. Samples: 2442356. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:01:17,873][00205] Avg episode reward: [(0, '27.426')] [2023-02-24 13:01:22,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9785344. Throughput: 0: 881.3. Samples: 2447000. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:01:22,872][00205] Avg episode reward: [(0, '29.245')] [2023-02-24 13:01:22,886][11215] Updated weights for policy 0, policy_version 2390 (0.0014) [2023-02-24 13:01:27,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9809920. Throughput: 0: 909.5. Samples: 2450324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:27,872][00205] Avg episode reward: [(0, '28.138')] [2023-02-24 13:01:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9826304. Throughput: 0: 919.0. Samples: 2456820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:32,880][00205] Avg episode reward: [(0, '29.533')] [2023-02-24 13:01:33,120][11215] Updated weights for policy 0, policy_version 2400 (0.0017) [2023-02-24 13:01:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9842688. Throughput: 0: 873.8. Samples: 2460956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:37,875][00205] Avg episode reward: [(0, '26.979')] [2023-02-24 13:01:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 9859072. Throughput: 0: 876.7. Samples: 2463086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:42,873][00205] Avg episode reward: [(0, '26.891')] [2023-02-24 13:01:45,223][11215] Updated weights for policy 0, policy_version 2410 (0.0014) [2023-02-24 13:01:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9879552. Throughput: 0: 921.8. Samples: 2469360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:01:47,873][00205] Avg episode reward: [(0, '29.293')] [2023-02-24 13:01:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9900032. Throughput: 0: 906.2. Samples: 2475356. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:52,877][00205] Avg episode reward: [(0, '31.339')] [2023-02-24 13:01:56,888][11215] Updated weights for policy 0, policy_version 2420 (0.0014) [2023-02-24 13:01:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 9912320. Throughput: 0: 879.5. Samples: 2477420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:01:57,880][00205] Avg episode reward: [(0, '30.273')] [2023-02-24 13:02:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 9928704. Throughput: 0: 876.2. Samples: 2481786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:02,879][00205] Avg episode reward: [(0, '30.971')] [2023-02-24 13:02:07,534][11215] Updated weights for policy 0, policy_version 2430 (0.0027) [2023-02-24 13:02:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 9953280. Throughput: 0: 920.0. Samples: 2488400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:07,872][00205] Avg episode reward: [(0, '30.673')] [2023-02-24 13:02:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 9973760. Throughput: 0: 926.2. Samples: 2492002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:12,879][00205] Avg episode reward: [(0, '32.861')] [2023-02-24 13:02:12,881][11201] Saving new best policy, reward=32.861! [2023-02-24 13:02:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 9986048. Throughput: 0: 885.0. Samples: 2496644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:02:17,873][00205] Avg episode reward: [(0, '32.329')] [2023-02-24 13:02:19,426][11215] Updated weights for policy 0, policy_version 2440 (0.0027) [2023-02-24 13:02:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 10006528. Throughput: 0: 913.5. Samples: 2502064. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:02:22,878][00205] Avg episode reward: [(0, '33.120')] [2023-02-24 13:02:22,881][11201] Saving new best policy, reward=33.120! [2023-02-24 13:02:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 10031104. Throughput: 0: 944.7. Samples: 2505598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:02:27,872][00205] Avg episode reward: [(0, '32.114')] [2023-02-24 13:02:28,321][11215] Updated weights for policy 0, policy_version 2450 (0.0011) [2023-02-24 13:02:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10051584. Throughput: 0: 958.1. Samples: 2512476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:32,872][00205] Avg episode reward: [(0, '32.733')] [2023-02-24 13:02:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 10067968. Throughput: 0: 928.1. Samples: 2517120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:02:37,876][00205] Avg episode reward: [(0, '31.443')] [2023-02-24 13:02:40,375][11215] Updated weights for policy 0, policy_version 2460 (0.0024) [2023-02-24 13:02:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10088448. Throughput: 0: 934.0. Samples: 2519452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:02:42,873][00205] Avg episode reward: [(0, '31.606')] [2023-02-24 13:02:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3623.9). Total num frames: 10108928. Throughput: 0: 995.9. Samples: 2526602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:02:47,876][00205] Avg episode reward: [(0, '31.046')] [2023-02-24 13:02:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth... 
[2023-02-24 13:02:48,003][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002255_9236480.pth [2023-02-24 13:02:48,873][11215] Updated weights for policy 0, policy_version 2470 (0.0014) [2023-02-24 13:02:52,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3637.8). Total num frames: 10129408. Throughput: 0: 986.9. Samples: 2532812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:52,876][00205] Avg episode reward: [(0, '29.429')] [2023-02-24 13:02:57,872][00205] Fps is (10 sec: 3685.5, 60 sec: 3891.0, 300 sec: 3651.7). Total num frames: 10145792. Throughput: 0: 956.8. Samples: 2535062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:02:57,880][00205] Avg episode reward: [(0, '28.950')] [2023-02-24 13:03:01,104][11215] Updated weights for policy 0, policy_version 2480 (0.0018) [2023-02-24 13:03:02,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3959.5, 300 sec: 3637.8). Total num frames: 10166272. Throughput: 0: 965.6. Samples: 2540094. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:03:02,877][00205] Avg episode reward: [(0, '28.169')] [2023-02-24 13:03:07,870][00205] Fps is (10 sec: 4097.1, 60 sec: 3891.2, 300 sec: 3637.8). Total num frames: 10186752. Throughput: 0: 1006.3. Samples: 2547348. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:03:07,874][00205] Avg episode reward: [(0, '28.697')] [2023-02-24 13:03:09,547][11215] Updated weights for policy 0, policy_version 2490 (0.0022) [2023-02-24 13:03:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 10207232. Throughput: 0: 1006.0. Samples: 2550870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:03:12,874][00205] Avg episode reward: [(0, '28.847')] [2023-02-24 13:03:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10223616. Throughput: 0: 952.1. Samples: 2555322. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:03:17,876][00205] Avg episode reward: [(0, '29.297')] [2023-02-24 13:03:21,717][11215] Updated weights for policy 0, policy_version 2500 (0.0025) [2023-02-24 13:03:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3665.6). Total num frames: 10244096. Throughput: 0: 975.1. Samples: 2560998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:03:22,877][00205] Avg episode reward: [(0, '30.799')] [2023-02-24 13:03:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 10268672. Throughput: 0: 1003.7. Samples: 2564618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:03:27,873][00205] Avg episode reward: [(0, '30.425')] [2023-02-24 13:03:30,474][11215] Updated weights for policy 0, policy_version 2510 (0.0011) [2023-02-24 13:03:32,870][00205] Fps is (10 sec: 4095.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 10285056. Throughput: 0: 992.0. Samples: 2571242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:03:32,874][00205] Avg episode reward: [(0, '30.344')] [2023-02-24 13:03:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 10301440. Throughput: 0: 956.1. Samples: 2575834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:03:37,878][00205] Avg episode reward: [(0, '30.769')] [2023-02-24 13:03:42,226][11215] Updated weights for policy 0, policy_version 2520 (0.0019) [2023-02-24 13:03:42,871][00205] Fps is (10 sec: 3686.3, 60 sec: 3891.1, 300 sec: 3679.4). Total num frames: 10321920. Throughput: 0: 967.2. Samples: 2578586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:03:42,878][00205] Avg episode reward: [(0, '31.018')] [2023-02-24 13:03:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3693.4). Total num frames: 10346496. Throughput: 0: 1017.7. Samples: 2585892. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:03:47,873][00205] Avg episode reward: [(0, '28.245')] [2023-02-24 13:03:51,216][11215] Updated weights for policy 0, policy_version 2530 (0.0011) [2023-02-24 13:03:52,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10366976. Throughput: 0: 990.8. Samples: 2591934. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-24 13:03:52,878][00205] Avg episode reward: [(0, '28.230')] [2023-02-24 13:03:57,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.3, 300 sec: 3707.2). Total num frames: 10379264. Throughput: 0: 960.7. Samples: 2594100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 13:03:57,878][00205] Avg episode reward: [(0, '29.088')] [2023-02-24 13:04:02,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 10399744. Throughput: 0: 981.2. Samples: 2599476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:04:02,875][00205] Avg episode reward: [(0, '27.915')] [2023-02-24 13:04:02,958][11215] Updated weights for policy 0, policy_version 2540 (0.0022) [2023-02-24 13:04:07,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 10424320. Throughput: 0: 1016.7. Samples: 2606750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:04:07,875][00205] Avg episode reward: [(0, '27.682')] [2023-02-24 13:04:12,555][11215] Updated weights for policy 0, policy_version 2550 (0.0021) [2023-02-24 13:04:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10444800. Throughput: 0: 1008.3. Samples: 2609992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:04:12,878][00205] Avg episode reward: [(0, '26.928')] [2023-02-24 13:04:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 10457088. Throughput: 0: 962.1. Samples: 2614536. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:04:17,880][00205] Avg episode reward: [(0, '27.941')] [2023-02-24 13:04:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 10481664. Throughput: 0: 993.4. Samples: 2620536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:04:22,872][00205] Avg episode reward: [(0, '26.358')] [2023-02-24 13:04:23,456][11215] Updated weights for policy 0, policy_version 2560 (0.0016) [2023-02-24 13:04:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 10506240. Throughput: 0: 1012.4. Samples: 2624142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:04:27,872][00205] Avg episode reward: [(0, '25.797')] [2023-02-24 13:04:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3762.7). Total num frames: 10522624. Throughput: 0: 987.9. Samples: 2630348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:04:32,875][00205] Avg episode reward: [(0, '27.648')] [2023-02-24 13:04:33,598][11215] Updated weights for policy 0, policy_version 2570 (0.0011) [2023-02-24 13:04:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10539008. Throughput: 0: 955.4. Samples: 2634928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:04:37,877][00205] Avg episode reward: [(0, '28.105')] [2023-02-24 13:04:42,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 10559488. Throughput: 0: 972.3. Samples: 2637854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:04:42,873][00205] Avg episode reward: [(0, '29.040')] [2023-02-24 13:04:43,957][11215] Updated weights for policy 0, policy_version 2580 (0.0014) [2023-02-24 13:04:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3776.6). Total num frames: 10584064. Throughput: 0: 1014.0. Samples: 2645106. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:04:47,873][00205] Avg episode reward: [(0, '28.718')] [2023-02-24 13:04:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth... [2023-02-24 13:04:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002360_9666560.pth [2023-02-24 13:04:52,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 10600448. Throughput: 0: 974.1. Samples: 2650584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:04:52,873][00205] Avg episode reward: [(0, '29.225')] [2023-02-24 13:04:54,891][11215] Updated weights for policy 0, policy_version 2590 (0.0017) [2023-02-24 13:04:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 10616832. Throughput: 0: 951.6. Samples: 2652814. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:04:57,879][00205] Avg episode reward: [(0, '30.032')] [2023-02-24 13:05:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 10637312. Throughput: 0: 972.8. Samples: 2658312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:05:02,873][00205] Avg episode reward: [(0, '30.642')] [2023-02-24 13:05:05,055][11215] Updated weights for policy 0, policy_version 2600 (0.0022) [2023-02-24 13:05:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.5). Total num frames: 10661888. Throughput: 0: 1001.2. Samples: 2665590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:05:07,880][00205] Avg episode reward: [(0, '29.211')] [2023-02-24 13:05:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10678272. Throughput: 0: 983.6. Samples: 2668406. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:05:12,877][00205] Avg episode reward: [(0, '30.018')] [2023-02-24 13:05:16,702][11215] Updated weights for policy 0, policy_version 2610 (0.0038) [2023-02-24 13:05:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 10690560. Throughput: 0: 946.1. Samples: 2672920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:05:17,876][00205] Avg episode reward: [(0, '30.640')] [2023-02-24 13:05:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 10715136. Throughput: 0: 981.7. Samples: 2679104. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:05:22,873][00205] Avg episode reward: [(0, '30.615')] [2023-02-24 13:05:25,826][11215] Updated weights for policy 0, policy_version 2620 (0.0031) [2023-02-24 13:05:27,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10739712. Throughput: 0: 996.9. Samples: 2682712. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:05:27,873][00205] Avg episode reward: [(0, '31.885')] [2023-02-24 13:05:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 10756096. Throughput: 0: 969.5. Samples: 2688734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:05:32,877][00205] Avg episode reward: [(0, '31.684')] [2023-02-24 13:05:37,665][11215] Updated weights for policy 0, policy_version 2630 (0.0021) [2023-02-24 13:05:37,873][00205] Fps is (10 sec: 3275.7, 60 sec: 3891.0, 300 sec: 3832.1). Total num frames: 10772480. Throughput: 0: 949.5. Samples: 2693316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:05:37,882][00205] Avg episode reward: [(0, '32.848')] [2023-02-24 13:05:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 10792960. Throughput: 0: 972.3. Samples: 2696566. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:05:42,872][00205] Avg episode reward: [(0, '32.834')] [2023-02-24 13:05:46,239][11215] Updated weights for policy 0, policy_version 2640 (0.0014) [2023-02-24 13:05:47,870][00205] Fps is (10 sec: 4916.9, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 10821632. Throughput: 0: 1012.2. Samples: 2703862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:05:47,878][00205] Avg episode reward: [(0, '33.923')] [2023-02-24 13:05:47,893][11201] Saving new best policy, reward=33.923! [2023-02-24 13:05:52,872][00205] Fps is (10 sec: 4095.2, 60 sec: 3891.1, 300 sec: 3846.0). Total num frames: 10833920. Throughput: 0: 966.0. Samples: 2709064. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:05:52,879][00205] Avg episode reward: [(0, '32.833')] [2023-02-24 13:05:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10850304. Throughput: 0: 952.7. Samples: 2711278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:05:57,873][00205] Avg episode reward: [(0, '30.821')] [2023-02-24 13:05:58,763][11215] Updated weights for policy 0, policy_version 2650 (0.0031) [2023-02-24 13:06:02,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 10870784. Throughput: 0: 982.6. Samples: 2717138. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:06:02,872][00205] Avg episode reward: [(0, '28.643')] [2023-02-24 13:06:07,363][11215] Updated weights for policy 0, policy_version 2660 (0.0016) [2023-02-24 13:06:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 10895360. Throughput: 0: 1004.5. Samples: 2724308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:06:07,877][00205] Avg episode reward: [(0, '28.408')] [2023-02-24 13:06:12,874][00205] Fps is (10 sec: 4094.4, 60 sec: 3890.9, 300 sec: 3873.8). Total num frames: 10911744. Throughput: 0: 982.5. Samples: 2726928. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:06:12,879][00205] Avg episode reward: [(0, '28.654')] [2023-02-24 13:06:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 10928128. Throughput: 0: 951.6. Samples: 2731556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:06:17,875][00205] Avg episode reward: [(0, '27.336')] [2023-02-24 13:06:19,284][11215] Updated weights for policy 0, policy_version 2670 (0.0011) [2023-02-24 13:06:22,877][00205] Fps is (10 sec: 4094.8, 60 sec: 3959.0, 300 sec: 3873.8). Total num frames: 10952704. Throughput: 0: 995.8. Samples: 2738132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:06:22,883][00205] Avg episode reward: [(0, '27.078')] [2023-02-24 13:06:27,724][11215] Updated weights for policy 0, policy_version 2680 (0.0025) [2023-02-24 13:06:27,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 10977280. Throughput: 0: 1004.9. Samples: 2741788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:06:27,877][00205] Avg episode reward: [(0, '29.119')] [2023-02-24 13:06:32,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 10989568. Throughput: 0: 968.8. Samples: 2747460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:06:32,874][00205] Avg episode reward: [(0, '29.675')] [2023-02-24 13:06:37,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.4, 300 sec: 3887.7). Total num frames: 11005952. Throughput: 0: 952.8. Samples: 2751940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:06:37,872][00205] Avg episode reward: [(0, '29.712')] [2023-02-24 13:06:39,952][11215] Updated weights for policy 0, policy_version 2690 (0.0012) [2023-02-24 13:06:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11030528. Throughput: 0: 981.0. Samples: 2755424. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:06:42,877][00205] Avg episode reward: [(0, '30.627')] [2023-02-24 13:06:47,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11055104. Throughput: 0: 1011.0. Samples: 2762634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:06:47,877][00205] Avg episode reward: [(0, '31.811')] [2023-02-24 13:06:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth... [2023-02-24 13:06:48,038][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002468_10108928.pth [2023-02-24 13:06:49,145][11215] Updated weights for policy 0, policy_version 2700 (0.0011) [2023-02-24 13:06:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11067392. Throughput: 0: 960.7. Samples: 2767540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:06:52,875][00205] Avg episode reward: [(0, '32.203')] [2023-02-24 13:06:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11083776. Throughput: 0: 952.8. Samples: 2769800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:06:57,872][00205] Avg episode reward: [(0, '31.032')] [2023-02-24 13:07:00,880][11215] Updated weights for policy 0, policy_version 2710 (0.0011) [2023-02-24 13:07:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 11108352. Throughput: 0: 986.4. Samples: 2775946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:07:02,873][00205] Avg episode reward: [(0, '29.995')] [2023-02-24 13:07:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11128832. Throughput: 0: 999.9. Samples: 2783122. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:07,873][00205] Avg episode reward: [(0, '31.850')] [2023-02-24 13:07:10,795][11215] Updated weights for policy 0, policy_version 2720 (0.0023) [2023-02-24 13:07:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.5, 300 sec: 3929.4). Total num frames: 11145216. Throughput: 0: 969.5. Samples: 2785414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:12,872][00205] Avg episode reward: [(0, '30.840')] [2023-02-24 13:07:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11161600. Throughput: 0: 943.5. Samples: 2789918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:07:17,873][00205] Avg episode reward: [(0, '29.945')] [2023-02-24 13:07:21,861][11215] Updated weights for policy 0, policy_version 2730 (0.0015) [2023-02-24 13:07:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.6, 300 sec: 3915.5). Total num frames: 11186176. Throughput: 0: 988.6. Samples: 2796426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:07:22,872][00205] Avg episode reward: [(0, '28.391')] [2023-02-24 13:07:27,872][00205] Fps is (10 sec: 4504.7, 60 sec: 3822.8, 300 sec: 3915.5). Total num frames: 11206656. Throughput: 0: 988.4. Samples: 2799904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:07:27,880][00205] Avg episode reward: [(0, '28.858')] [2023-02-24 13:07:32,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3901.6). Total num frames: 11218944. Throughput: 0: 941.9. Samples: 2805020. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:07:32,872][00205] Avg episode reward: [(0, '31.238')] [2023-02-24 13:07:33,132][11215] Updated weights for policy 0, policy_version 2740 (0.0011) [2023-02-24 13:07:37,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 11235328. Throughput: 0: 931.2. Samples: 2809444. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:37,872][00205] Avg episode reward: [(0, '30.567')] [2023-02-24 13:07:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11259904. Throughput: 0: 961.5. Samples: 2813068. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:42,878][00205] Avg episode reward: [(0, '30.264')] [2023-02-24 13:07:42,998][11215] Updated weights for policy 0, policy_version 2750 (0.0012) [2023-02-24 13:07:47,875][00205] Fps is (10 sec: 4912.7, 60 sec: 3822.6, 300 sec: 3915.4). Total num frames: 11284480. Throughput: 0: 986.8. Samples: 2820358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:47,880][00205] Avg episode reward: [(0, '31.168')] [2023-02-24 13:07:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11296768. Throughput: 0: 935.2. Samples: 2825206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:07:52,874][00205] Avg episode reward: [(0, '32.204')] [2023-02-24 13:07:54,363][11215] Updated weights for policy 0, policy_version 2760 (0.0034) [2023-02-24 13:07:57,870][00205] Fps is (10 sec: 3278.5, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11317248. Throughput: 0: 934.0. Samples: 2827446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:07:57,872][00205] Avg episode reward: [(0, '33.362')] [2023-02-24 13:08:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11337728. Throughput: 0: 978.8. Samples: 2833964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:02,878][00205] Avg episode reward: [(0, '33.448')] [2023-02-24 13:08:03,838][11215] Updated weights for policy 0, policy_version 2770 (0.0017) [2023-02-24 13:08:07,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 11362304. Throughput: 0: 987.8. Samples: 2840876. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:08:07,880][00205] Avg episode reward: [(0, '32.975')] [2023-02-24 13:08:12,872][00205] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3901.6). Total num frames: 11374592. Throughput: 0: 960.1. Samples: 2843108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:12,880][00205] Avg episode reward: [(0, '32.114')] [2023-02-24 13:08:15,661][11215] Updated weights for policy 0, policy_version 2780 (0.0012) [2023-02-24 13:08:17,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11395072. Throughput: 0: 945.3. Samples: 2847558. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:08:17,873][00205] Avg episode reward: [(0, '30.239')] [2023-02-24 13:08:22,870][00205] Fps is (10 sec: 4506.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11419648. Throughput: 0: 1004.7. Samples: 2854656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:22,873][00205] Avg episode reward: [(0, '29.244')] [2023-02-24 13:08:24,526][11215] Updated weights for policy 0, policy_version 2790 (0.0011) [2023-02-24 13:08:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3915.5). Total num frames: 11440128. Throughput: 0: 1002.7. Samples: 2858190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:08:27,874][00205] Avg episode reward: [(0, '28.940')] [2023-02-24 13:08:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11452416. Throughput: 0: 953.8. Samples: 2863276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:08:32,872][00205] Avg episode reward: [(0, '27.881')] [2023-02-24 13:08:36,720][11215] Updated weights for policy 0, policy_version 2800 (0.0016) [2023-02-24 13:08:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11472896. Throughput: 0: 957.2. Samples: 2868282. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:08:37,880][00205] Avg episode reward: [(0, '27.963')] [2023-02-24 13:08:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11497472. Throughput: 0: 986.5. Samples: 2871838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:42,872][00205] Avg episode reward: [(0, '29.039')] [2023-02-24 13:08:45,237][11215] Updated weights for policy 0, policy_version 2810 (0.0018) [2023-02-24 13:08:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.5, 300 sec: 3901.6). Total num frames: 11517952. Throughput: 0: 1002.6. Samples: 2879082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:47,876][00205] Avg episode reward: [(0, '29.905')] [2023-02-24 13:08:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth... [2023-02-24 13:08:48,023][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002584_10584064.pth [2023-02-24 13:08:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11530240. Throughput: 0: 947.2. Samples: 2883500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:08:52,874][00205] Avg episode reward: [(0, '30.130')] [2023-02-24 13:08:57,602][11215] Updated weights for policy 0, policy_version 2820 (0.0034) [2023-02-24 13:08:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11550720. Throughput: 0: 947.8. Samples: 2885758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:08:57,881][00205] Avg episode reward: [(0, '31.498')] [2023-02-24 13:09:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11575296. Throughput: 0: 999.3. Samples: 2892526. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:09:02,873][00205] Avg episode reward: [(0, '30.983')] [2023-02-24 13:09:06,160][11215] Updated weights for policy 0, policy_version 2830 (0.0021) [2023-02-24 13:09:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11595776. Throughput: 0: 991.9. Samples: 2899292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:07,873][00205] Avg episode reward: [(0, '30.991')] [2023-02-24 13:09:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3915.5). Total num frames: 11612160. Throughput: 0: 964.2. Samples: 2901580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:12,877][00205] Avg episode reward: [(0, '30.112')] [2023-02-24 13:09:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11628544. Throughput: 0: 954.1. Samples: 2906212. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:17,872][00205] Avg episode reward: [(0, '28.870')] [2023-02-24 13:09:18,305][11215] Updated weights for policy 0, policy_version 2840 (0.0025) [2023-02-24 13:09:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11653120. Throughput: 0: 1004.3. Samples: 2913476. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:09:22,872][00205] Avg episode reward: [(0, '27.218')] [2023-02-24 13:09:27,339][11215] Updated weights for policy 0, policy_version 2850 (0.0025) [2023-02-24 13:09:27,875][00205] Fps is (10 sec: 4503.2, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11673600. Throughput: 0: 1003.3. Samples: 2916992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:09:27,881][00205] Avg episode reward: [(0, '26.111')] [2023-02-24 13:09:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 11689984. Throughput: 0: 950.6. Samples: 2921860. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:32,876][00205] Avg episode reward: [(0, '26.217')] [2023-02-24 13:09:37,870][00205] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11706368. Throughput: 0: 971.2. Samples: 2927204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:37,881][00205] Avg episode reward: [(0, '27.113')] [2023-02-24 13:09:38,824][11215] Updated weights for policy 0, policy_version 2860 (0.0024) [2023-02-24 13:09:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11730944. Throughput: 0: 1000.1. Samples: 2930764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:09:42,878][00205] Avg episode reward: [(0, '27.195')] [2023-02-24 13:09:47,874][00205] Fps is (10 sec: 4503.5, 60 sec: 3890.9, 300 sec: 3901.6). Total num frames: 11751424. Throughput: 0: 999.8. Samples: 2937522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:09:47,883][00205] Avg episode reward: [(0, '26.724')] [2023-02-24 13:09:48,667][11215] Updated weights for policy 0, policy_version 2870 (0.0023) [2023-02-24 13:09:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11763712. Throughput: 0: 948.6. Samples: 2941980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:09:52,878][00205] Avg episode reward: [(0, '28.349')] [2023-02-24 13:09:57,870][00205] Fps is (10 sec: 3278.3, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11784192. Throughput: 0: 948.8. Samples: 2944278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:09:57,876][00205] Avg episode reward: [(0, '28.480')] [2023-02-24 13:10:00,095][11215] Updated weights for policy 0, policy_version 2880 (0.0020) [2023-02-24 13:10:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11808768. Throughput: 0: 995.2. Samples: 2950994. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:10:02,879][00205] Avg episode reward: [(0, '29.259')] [2023-02-24 13:10:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 11829248. Throughput: 0: 972.9. Samples: 2957256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:10:07,882][00205] Avg episode reward: [(0, '28.575')] [2023-02-24 13:10:10,845][11215] Updated weights for policy 0, policy_version 2890 (0.0022) [2023-02-24 13:10:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 11841536. Throughput: 0: 944.2. Samples: 2959476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:10:12,874][00205] Avg episode reward: [(0, '28.388')] [2023-02-24 13:10:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11862016. Throughput: 0: 943.7. Samples: 2964328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:17,873][00205] Avg episode reward: [(0, '27.157')] [2023-02-24 13:10:21,046][11215] Updated weights for policy 0, policy_version 2900 (0.0014) [2023-02-24 13:10:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 11886592. Throughput: 0: 984.7. Samples: 2971514. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:22,874][00205] Avg episode reward: [(0, '27.272')] [2023-02-24 13:10:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.3, 300 sec: 3887.7). Total num frames: 11902976. Throughput: 0: 980.3. Samples: 2974876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:10:27,875][00205] Avg episode reward: [(0, '27.192')] [2023-02-24 13:10:32,746][11215] Updated weights for policy 0, policy_version 2910 (0.0014) [2023-02-24 13:10:32,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3887.8). Total num frames: 11919360. Throughput: 0: 928.2. Samples: 2979288. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:32,874][00205] Avg episode reward: [(0, '28.060')] [2023-02-24 13:10:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 11935744. Throughput: 0: 948.3. Samples: 2984652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:37,875][00205] Avg episode reward: [(0, '28.655')] [2023-02-24 13:10:42,349][11215] Updated weights for policy 0, policy_version 2920 (0.0031) [2023-02-24 13:10:42,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 11960320. Throughput: 0: 974.4. Samples: 2988128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:42,872][00205] Avg episode reward: [(0, '29.528')] [2023-02-24 13:10:47,875][00205] Fps is (10 sec: 4093.7, 60 sec: 3754.6, 300 sec: 3873.8). Total num frames: 11976704. Throughput: 0: 964.8. Samples: 2994414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:10:47,883][00205] Avg episode reward: [(0, '30.080')] [2023-02-24 13:10:47,923][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth... [2023-02-24 13:10:48,074][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002699_11055104.pth [2023-02-24 13:10:52,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3822.7, 300 sec: 3873.8). Total num frames: 11993088. Throughput: 0: 923.6. Samples: 2998820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:52,885][00205] Avg episode reward: [(0, '29.245')] [2023-02-24 13:10:54,979][11215] Updated weights for policy 0, policy_version 2930 (0.0023) [2023-02-24 13:10:57,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12013568. Throughput: 0: 926.2. Samples: 3001154. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:10:57,875][00205] Avg episode reward: [(0, '29.862')] [2023-02-24 13:11:02,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12038144. Throughput: 0: 979.2. Samples: 3008392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:11:02,877][00205] Avg episode reward: [(0, '29.976')] [2023-02-24 13:11:03,367][11215] Updated weights for policy 0, policy_version 2940 (0.0012) [2023-02-24 13:11:07,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3754.3, 300 sec: 3873.8). Total num frames: 12054528. Throughput: 0: 956.8. Samples: 3014576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:11:07,881][00205] Avg episode reward: [(0, '29.996')] [2023-02-24 13:11:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12070912. Throughput: 0: 931.0. Samples: 3016772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:11:12,874][00205] Avg episode reward: [(0, '29.561')] [2023-02-24 13:11:15,626][11215] Updated weights for policy 0, policy_version 2950 (0.0040) [2023-02-24 13:11:17,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12091392. Throughput: 0: 949.2. Samples: 3022000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:11:17,872][00205] Avg episode reward: [(0, '29.569')] [2023-02-24 13:11:22,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12115968. Throughput: 0: 988.6. Samples: 3029138. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:11:22,880][00205] Avg episode reward: [(0, '29.785')] [2023-02-24 13:11:24,419][11215] Updated weights for policy 0, policy_version 2960 (0.0035) [2023-02-24 13:11:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12132352. Throughput: 0: 983.2. Samples: 3032370. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:11:27,876][00205] Avg episode reward: [(0, '30.146')] [2023-02-24 13:11:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3873.8). Total num frames: 12148736. Throughput: 0: 944.2. Samples: 3036898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:11:32,873][00205] Avg episode reward: [(0, '29.465')] [2023-02-24 13:11:36,684][11215] Updated weights for policy 0, policy_version 2970 (0.0027) [2023-02-24 13:11:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 12169216. Throughput: 0: 965.7. Samples: 3042274. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:11:37,880][00205] Avg episode reward: [(0, '29.023')] [2023-02-24 13:11:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12189696. Throughput: 0: 992.4. Samples: 3045810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:11:42,872][00205] Avg episode reward: [(0, '28.890')] [2023-02-24 13:11:46,019][11215] Updated weights for policy 0, policy_version 2980 (0.0028) [2023-02-24 13:11:47,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3891.5, 300 sec: 3873.8). Total num frames: 12210176. Throughput: 0: 972.4. Samples: 3052150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:11:47,876][00205] Avg episode reward: [(0, '29.086')] [2023-02-24 13:11:52,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3823.2, 300 sec: 3859.9). Total num frames: 12222464. Throughput: 0: 931.8. Samples: 3056502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:11:52,873][00205] Avg episode reward: [(0, '28.779')] [2023-02-24 13:11:57,870][00205] Fps is (10 sec: 3277.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12242944. Throughput: 0: 936.8. Samples: 3058930. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:11:57,877][00205] Avg episode reward: [(0, '28.479')] [2023-02-24 13:11:58,062][11215] Updated weights for policy 0, policy_version 2990 (0.0019) [2023-02-24 13:12:02,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 12267520. Throughput: 0: 975.8. Samples: 3065910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:12:02,879][00205] Avg episode reward: [(0, '28.994')] [2023-02-24 13:12:07,872][00205] Fps is (10 sec: 4095.1, 60 sec: 3823.1, 300 sec: 3859.9). Total num frames: 12283904. Throughput: 0: 948.9. Samples: 3071840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:12:07,877][00205] Avg episode reward: [(0, '27.318')] [2023-02-24 13:12:08,107][11215] Updated weights for policy 0, policy_version 3000 (0.0012) [2023-02-24 13:12:12,871][00205] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3859.9). Total num frames: 12300288. Throughput: 0: 926.0. Samples: 3074042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:12:12,880][00205] Avg episode reward: [(0, '27.280')] [2023-02-24 13:12:17,870][00205] Fps is (10 sec: 3687.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12320768. Throughput: 0: 939.7. Samples: 3079182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:12:17,874][00205] Avg episode reward: [(0, '27.394')] [2023-02-24 13:12:19,207][11215] Updated weights for policy 0, policy_version 3010 (0.0028) [2023-02-24 13:12:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3823.0, 300 sec: 3860.0). Total num frames: 12345344. Throughput: 0: 976.9. Samples: 3086234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:12:22,873][00205] Avg episode reward: [(0, '27.006')] [2023-02-24 13:12:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 12361728. Throughput: 0: 968.4. Samples: 3089388. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:12:27,874][00205] Avg episode reward: [(0, '27.804')] [2023-02-24 13:12:30,189][11215] Updated weights for policy 0, policy_version 3020 (0.0017) [2023-02-24 13:12:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12374016. Throughput: 0: 923.7. Samples: 3093716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:12:32,872][00205] Avg episode reward: [(0, '26.419')] [2023-02-24 13:12:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 12394496. Throughput: 0: 950.6. Samples: 3099278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:12:37,877][00205] Avg episode reward: [(0, '26.483')] [2023-02-24 13:12:40,594][11215] Updated weights for policy 0, policy_version 3030 (0.0017) [2023-02-24 13:12:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12419072. Throughput: 0: 974.0. Samples: 3102758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:12:42,875][00205] Avg episode reward: [(0, '28.327')] [2023-02-24 13:12:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 12435456. Throughput: 0: 959.6. Samples: 3109090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:12:47,874][00205] Avg episode reward: [(0, '28.554')] [2023-02-24 13:12:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth... [2023-02-24 13:12:48,048][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002812_11517952.pth [2023-02-24 13:12:52,202][11215] Updated weights for policy 0, policy_version 3040 (0.0017) [2023-02-24 13:12:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 12451840. Throughput: 0: 923.6. Samples: 3113400. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:12:52,875][00205] Avg episode reward: [(0, '28.876')] [2023-02-24 13:12:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 12472320. Throughput: 0: 930.4. Samples: 3115908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:12:57,880][00205] Avg episode reward: [(0, '29.707')] [2023-02-24 13:13:02,140][11215] Updated weights for policy 0, policy_version 3050 (0.0024) [2023-02-24 13:13:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12492800. Throughput: 0: 966.6. Samples: 3122680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:02,872][00205] Avg episode reward: [(0, '29.861')] [2023-02-24 13:13:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 12509184. Throughput: 0: 929.6. Samples: 3128068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:07,878][00205] Avg episode reward: [(0, '30.553')] [2023-02-24 13:13:12,871][00205] Fps is (10 sec: 3276.5, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12525568. Throughput: 0: 906.7. Samples: 3130188. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:13:12,878][00205] Avg episode reward: [(0, '30.168')] [2023-02-24 13:13:14,790][11215] Updated weights for policy 0, policy_version 3060 (0.0019) [2023-02-24 13:13:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12546048. Throughput: 0: 931.7. Samples: 3135642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:17,879][00205] Avg episode reward: [(0, '29.916')] [2023-02-24 13:13:22,870][00205] Fps is (10 sec: 4506.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12570624. Throughput: 0: 963.1. Samples: 3142616. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:13:22,873][00205] Avg episode reward: [(0, '28.476')] [2023-02-24 13:13:23,608][11215] Updated weights for policy 0, policy_version 3070 (0.0025) [2023-02-24 13:13:27,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3846.1). Total num frames: 12587008. Throughput: 0: 950.5. Samples: 3145532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:13:27,877][00205] Avg episode reward: [(0, '27.511')] [2023-02-24 13:13:32,871][00205] Fps is (10 sec: 2866.8, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 12599296. Throughput: 0: 904.5. Samples: 3149794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:32,876][00205] Avg episode reward: [(0, '27.058')] [2023-02-24 13:13:36,092][11215] Updated weights for policy 0, policy_version 3080 (0.0022) [2023-02-24 13:13:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12623872. Throughput: 0: 941.1. Samples: 3155750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:13:37,872][00205] Avg episode reward: [(0, '27.133')] [2023-02-24 13:13:42,870][00205] Fps is (10 sec: 4506.2, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12644352. Throughput: 0: 964.2. Samples: 3159296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:42,881][00205] Avg episode reward: [(0, '27.275')] [2023-02-24 13:13:45,463][11215] Updated weights for policy 0, policy_version 3090 (0.0025) [2023-02-24 13:13:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 12660736. Throughput: 0: 945.8. Samples: 3165240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:13:47,874][00205] Avg episode reward: [(0, '26.354')] [2023-02-24 13:13:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 12677120. Throughput: 0: 925.4. Samples: 3169712. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:13:52,877][00205] Avg episode reward: [(0, '25.739')] [2023-02-24 13:13:57,098][11215] Updated weights for policy 0, policy_version 3100 (0.0019) [2023-02-24 13:13:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12697600. Throughput: 0: 943.2. Samples: 3172630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:13:57,879][00205] Avg episode reward: [(0, '25.633')] [2023-02-24 13:14:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12722176. Throughput: 0: 977.0. Samples: 3179608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:14:02,877][00205] Avg episode reward: [(0, '27.165')] [2023-02-24 13:14:07,380][11215] Updated weights for policy 0, policy_version 3110 (0.0033) [2023-02-24 13:14:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12738560. Throughput: 0: 941.1. Samples: 3184964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:14:07,875][00205] Avg episode reward: [(0, '25.456')] [2023-02-24 13:14:12,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 12750848. Throughput: 0: 925.6. Samples: 3187186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:14:12,881][00205] Avg episode reward: [(0, '25.933')] [2023-02-24 13:14:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12775424. Throughput: 0: 960.3. Samples: 3193004. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:14:17,872][00205] Avg episode reward: [(0, '26.471')] [2023-02-24 13:14:18,152][11215] Updated weights for policy 0, policy_version 3120 (0.0019) [2023-02-24 13:14:22,870][00205] Fps is (10 sec: 4915.5, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 12800000. Throughput: 0: 986.0. Samples: 3200122. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:14:22,874][00205] Avg episode reward: [(0, '27.418')] [2023-02-24 13:14:27,875][00205] Fps is (10 sec: 4093.8, 60 sec: 3822.6, 300 sec: 3818.2). Total num frames: 12816384. Throughput: 0: 965.5. Samples: 3202750. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:14:27,883][00205] Avg episode reward: [(0, '27.568')] [2023-02-24 13:14:28,893][11215] Updated weights for policy 0, policy_version 3130 (0.0023) [2023-02-24 13:14:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 12828672. Throughput: 0: 932.2. Samples: 3207188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:14:32,875][00205] Avg episode reward: [(0, '26.551')] [2023-02-24 13:14:37,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12853248. Throughput: 0: 975.5. Samples: 3213610. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:14:37,872][00205] Avg episode reward: [(0, '26.103')] [2023-02-24 13:14:39,206][11215] Updated weights for policy 0, policy_version 3140 (0.0014) [2023-02-24 13:14:42,870][00205] Fps is (10 sec: 4915.3, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 12877824. Throughput: 0: 990.2. Samples: 3217188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:14:42,872][00205] Avg episode reward: [(0, '25.734')] [2023-02-24 13:14:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 12894208. Throughput: 0: 965.4. Samples: 3223052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:14:47,872][00205] Avg episode reward: [(0, '25.618')] [2023-02-24 13:14:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth... 
[2023-02-24 13:14:48,031][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000002925_11980800.pth [2023-02-24 13:14:50,425][11215] Updated weights for policy 0, policy_version 3150 (0.0013) [2023-02-24 13:14:52,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 12906496. Throughput: 0: 947.2. Samples: 3227590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:14:52,873][00205] Avg episode reward: [(0, '25.574')] [2023-02-24 13:14:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 12931072. Throughput: 0: 968.8. Samples: 3230782. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:14:57,872][00205] Avg episode reward: [(0, '25.325')] [2023-02-24 13:15:00,192][11215] Updated weights for policy 0, policy_version 3160 (0.0023) [2023-02-24 13:15:02,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 12955648. Throughput: 0: 988.6. Samples: 3237490. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:15:02,872][00205] Avg episode reward: [(0, '27.306')] [2023-02-24 13:15:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 12967936. Throughput: 0: 951.6. Samples: 3242942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:07,876][00205] Avg episode reward: [(0, '28.133')] [2023-02-24 13:15:12,154][11215] Updated weights for policy 0, policy_version 3170 (0.0011) [2023-02-24 13:15:12,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.3, 300 sec: 3804.4). Total num frames: 12984320. Throughput: 0: 943.1. Samples: 3245184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:12,879][00205] Avg episode reward: [(0, '28.380')] [2023-02-24 13:15:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 13008896. Throughput: 0: 981.4. Samples: 3251352. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:15:17,879][00205] Avg episode reward: [(0, '29.088')] [2023-02-24 13:15:20,814][11215] Updated weights for policy 0, policy_version 3180 (0.0021) [2023-02-24 13:15:22,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13033472. Throughput: 0: 998.4. Samples: 3258536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:15:22,873][00205] Avg episode reward: [(0, '31.138')] [2023-02-24 13:15:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.6, 300 sec: 3832.2). Total num frames: 13049856. Throughput: 0: 974.3. Samples: 3261032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:15:27,873][00205] Avg episode reward: [(0, '30.174')] [2023-02-24 13:15:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13062144. Throughput: 0: 944.3. Samples: 3265546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:15:32,873][00205] Avg episode reward: [(0, '28.223')] [2023-02-24 13:15:33,036][11215] Updated weights for policy 0, policy_version 3190 (0.0014) [2023-02-24 13:15:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13086720. Throughput: 0: 992.4. Samples: 3272246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:15:37,873][00205] Avg episode reward: [(0, '29.218')] [2023-02-24 13:15:41,379][11215] Updated weights for policy 0, policy_version 3200 (0.0024) [2023-02-24 13:15:42,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13111296. Throughput: 0: 1001.8. Samples: 3275864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:42,878][00205] Avg episode reward: [(0, '28.585')] [2023-02-24 13:15:47,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13127680. Throughput: 0: 978.6. Samples: 3281528. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:47,876][00205] Avg episode reward: [(0, '28.434')] [2023-02-24 13:15:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13144064. Throughput: 0: 958.6. Samples: 3286080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:52,875][00205] Avg episode reward: [(0, '27.921')] [2023-02-24 13:15:53,572][11215] Updated weights for policy 0, policy_version 3210 (0.0019) [2023-02-24 13:15:57,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13164544. Throughput: 0: 983.6. Samples: 3289448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:15:57,880][00205] Avg episode reward: [(0, '28.219')] [2023-02-24 13:16:02,336][11215] Updated weights for policy 0, policy_version 3220 (0.0019) [2023-02-24 13:16:02,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13189120. Throughput: 0: 1001.8. Samples: 3296434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:02,876][00205] Avg episode reward: [(0, '26.576')] [2023-02-24 13:16:07,876][00205] Fps is (10 sec: 4093.3, 60 sec: 3959.1, 300 sec: 3846.0). Total num frames: 13205504. Throughput: 0: 955.4. Samples: 3301536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:16:07,879][00205] Avg episode reward: [(0, '26.522')] [2023-02-24 13:16:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 13221888. Throughput: 0: 949.9. Samples: 3303776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:12,873][00205] Avg episode reward: [(0, '27.093')] [2023-02-24 13:16:14,616][11215] Updated weights for policy 0, policy_version 3230 (0.0030) [2023-02-24 13:16:17,870][00205] Fps is (10 sec: 3688.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13242368. Throughput: 0: 988.4. Samples: 3310022. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:16:17,872][00205] Avg episode reward: [(0, '26.161')] [2023-02-24 13:16:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13266944. Throughput: 0: 999.3. Samples: 3317214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:22,877][00205] Avg episode reward: [(0, '27.452')] [2023-02-24 13:16:23,608][11215] Updated weights for policy 0, policy_version 3240 (0.0011) [2023-02-24 13:16:27,878][00205] Fps is (10 sec: 4092.8, 60 sec: 3890.7, 300 sec: 3846.0). Total num frames: 13283328. Throughput: 0: 968.9. Samples: 3319472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:27,881][00205] Avg episode reward: [(0, '27.243')] [2023-02-24 13:16:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 13295616. Throughput: 0: 943.2. Samples: 3323972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:32,873][00205] Avg episode reward: [(0, '27.434')] [2023-02-24 13:16:35,436][11215] Updated weights for policy 0, policy_version 3250 (0.0025) [2023-02-24 13:16:37,870][00205] Fps is (10 sec: 3689.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13320192. Throughput: 0: 990.8. Samples: 3330666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:16:37,872][00205] Avg episode reward: [(0, '28.732')] [2023-02-24 13:16:42,873][00205] Fps is (10 sec: 4913.5, 60 sec: 3891.0, 300 sec: 3846.0). Total num frames: 13344768. Throughput: 0: 995.5. Samples: 3334250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:42,877][00205] Avg episode reward: [(0, '29.226')] [2023-02-24 13:16:44,956][11215] Updated weights for policy 0, policy_version 3260 (0.0011) [2023-02-24 13:16:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13361152. Throughput: 0: 960.8. Samples: 3339670. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:16:47,872][00205] Avg episode reward: [(0, '30.104')] [2023-02-24 13:16:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth... [2023-02-24 13:16:48,050][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003036_12435456.pth [2023-02-24 13:16:52,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13377536. Throughput: 0: 948.0. Samples: 3344192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:16:52,879][00205] Avg episode reward: [(0, '29.504')] [2023-02-24 13:16:56,264][11215] Updated weights for policy 0, policy_version 3270 (0.0012) [2023-02-24 13:16:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13398016. Throughput: 0: 975.4. Samples: 3347668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:16:57,872][00205] Avg episode reward: [(0, '28.290')] [2023-02-24 13:17:02,871][00205] Fps is (10 sec: 4505.0, 60 sec: 3891.1, 300 sec: 3860.0). Total num frames: 13422592. Throughput: 0: 992.6. Samples: 3354692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:02,875][00205] Avg episode reward: [(0, '29.657')] [2023-02-24 13:17:06,777][11215] Updated weights for policy 0, policy_version 3280 (0.0015) [2023-02-24 13:17:07,873][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.3, 300 sec: 3846.1). Total num frames: 13434880. Throughput: 0: 941.9. Samples: 3359600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:17:07,882][00205] Avg episode reward: [(0, '29.137')] [2023-02-24 13:17:12,870][00205] Fps is (10 sec: 2867.6, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13451264. Throughput: 0: 943.9. Samples: 3361942. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:17:12,872][00205] Avg episode reward: [(0, '28.440')] [2023-02-24 13:17:17,438][11215] Updated weights for policy 0, policy_version 3290 (0.0018) [2023-02-24 13:17:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 13475840. Throughput: 0: 982.1. Samples: 3368166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:17:17,872][00205] Avg episode reward: [(0, '27.276')] [2023-02-24 13:17:22,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 13500416. Throughput: 0: 993.2. Samples: 3375360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:22,872][00205] Avg episode reward: [(0, '26.586')] [2023-02-24 13:17:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.4, 300 sec: 3860.0). Total num frames: 13512704. Throughput: 0: 962.5. Samples: 3377558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:17:27,873][00205] Avg episode reward: [(0, '26.672')] [2023-02-24 13:17:28,242][11215] Updated weights for policy 0, policy_version 3300 (0.0015) [2023-02-24 13:17:32,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 13529088. Throughput: 0: 934.9. Samples: 3381740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:32,874][00205] Avg episode reward: [(0, '25.957')] [2023-02-24 13:17:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 13549568. Throughput: 0: 976.0. Samples: 3388112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:17:37,875][00205] Avg episode reward: [(0, '26.734')] [2023-02-24 13:17:38,777][11215] Updated weights for policy 0, policy_version 3310 (0.0016) [2023-02-24 13:17:42,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.1, 300 sec: 3860.0). Total num frames: 13574144. Throughput: 0: 975.6. Samples: 3391572. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:42,875][00205] Avg episode reward: [(0, '27.235')] [2023-02-24 13:17:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13586432. Throughput: 0: 935.1. Samples: 3396768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:47,872][00205] Avg episode reward: [(0, '26.936')] [2023-02-24 13:17:50,851][11215] Updated weights for policy 0, policy_version 3320 (0.0019) [2023-02-24 13:17:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13606912. Throughput: 0: 928.7. Samples: 3401392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:17:52,871][00205] Avg episode reward: [(0, '27.285')] [2023-02-24 13:17:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13627392. Throughput: 0: 955.6. Samples: 3404946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:17:57,879][00205] Avg episode reward: [(0, '28.580')] [2023-02-24 13:17:59,878][11215] Updated weights for policy 0, policy_version 3330 (0.0019) [2023-02-24 13:18:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3860.0). Total num frames: 13647872. Throughput: 0: 976.7. Samples: 3412118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:18:02,873][00205] Avg episode reward: [(0, '28.180')] [2023-02-24 13:18:07,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13664256. Throughput: 0: 918.7. Samples: 3416702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:18:07,877][00205] Avg episode reward: [(0, '27.651')] [2023-02-24 13:18:12,307][11215] Updated weights for policy 0, policy_version 3340 (0.0041) [2023-02-24 13:18:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13680640. Throughput: 0: 918.0. Samples: 3418870. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:18:12,873][00205] Avg episode reward: [(0, '28.435')] [2023-02-24 13:18:17,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13705216. Throughput: 0: 967.1. Samples: 3425260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:18:17,873][00205] Avg episode reward: [(0, '29.291')] [2023-02-24 13:18:20,960][11215] Updated weights for policy 0, policy_version 3350 (0.0017) [2023-02-24 13:18:22,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13725696. Throughput: 0: 975.4. Samples: 3432004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:18:22,876][00205] Avg episode reward: [(0, '29.159')] [2023-02-24 13:18:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3873.9). Total num frames: 13742080. Throughput: 0: 946.8. Samples: 3434176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:18:27,878][00205] Avg episode reward: [(0, '28.704')] [2023-02-24 13:18:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13758464. Throughput: 0: 929.7. Samples: 3438606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:18:32,879][00205] Avg episode reward: [(0, '29.137')] [2023-02-24 13:18:33,539][11215] Updated weights for policy 0, policy_version 3360 (0.0020) [2023-02-24 13:18:37,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13778944. Throughput: 0: 975.9. Samples: 3445306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:18:37,873][00205] Avg episode reward: [(0, '28.880')] [2023-02-24 13:18:42,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13799424. Throughput: 0: 975.5. Samples: 3448844. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:18:42,873][00205] Avg episode reward: [(0, '29.770')] [2023-02-24 13:18:43,020][11215] Updated weights for policy 0, policy_version 3370 (0.0013) [2023-02-24 13:18:47,872][00205] Fps is (10 sec: 3685.8, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 13815808. Throughput: 0: 923.1. Samples: 3453658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:18:47,877][00205] Avg episode reward: [(0, '30.464')] [2023-02-24 13:18:47,893][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth... [2023-02-24 13:18:48,036][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003148_12894208.pth [2023-02-24 13:18:52,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13832192. Throughput: 0: 928.3. Samples: 3458474. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:18:52,873][00205] Avg episode reward: [(0, '30.365')] [2023-02-24 13:18:54,924][11215] Updated weights for policy 0, policy_version 3380 (0.0023) [2023-02-24 13:18:57,870][00205] Fps is (10 sec: 4096.7, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13856768. Throughput: 0: 957.3. Samples: 3461948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:18:57,879][00205] Avg episode reward: [(0, '31.903')] [2023-02-24 13:19:02,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 13877248. Throughput: 0: 968.5. Samples: 3468844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:19:02,880][00205] Avg episode reward: [(0, '30.509')] [2023-02-24 13:19:05,667][11215] Updated weights for policy 0, policy_version 3390 (0.0014) [2023-02-24 13:19:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 13889536. Throughput: 0: 913.1. Samples: 3473092. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:19:07,876][00205] Avg episode reward: [(0, '30.280')] [2023-02-24 13:19:12,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 13910016. Throughput: 0: 915.0. Samples: 3475350. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:19:12,878][00205] Avg episode reward: [(0, '28.337')] [2023-02-24 13:19:16,419][11215] Updated weights for policy 0, policy_version 3400 (0.0013) [2023-02-24 13:19:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13930496. Throughput: 0: 961.9. Samples: 3481890. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:19:17,879][00205] Avg episode reward: [(0, '27.576')] [2023-02-24 13:19:22,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 13950976. Throughput: 0: 952.9. Samples: 3488186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:19:22,873][00205] Avg episode reward: [(0, '26.552')] [2023-02-24 13:19:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3846.1). Total num frames: 13963264. Throughput: 0: 922.4. Samples: 3490352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:19:27,876][00205] Avg episode reward: [(0, '25.387')] [2023-02-24 13:19:28,281][11215] Updated weights for policy 0, policy_version 3410 (0.0023) [2023-02-24 13:19:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 13983744. Throughput: 0: 915.1. Samples: 3494838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:19:32,875][00205] Avg episode reward: [(0, '25.694')] [2023-02-24 13:19:37,787][11215] Updated weights for policy 0, policy_version 3420 (0.0014) [2023-02-24 13:19:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14008320. Throughput: 0: 959.7. Samples: 3501662. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0) [2023-02-24 13:19:37,879][00205] Avg episode reward: [(0, '27.661')] [2023-02-24 13:19:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 14024704. Throughput: 0: 960.4. Samples: 3505164. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:19:42,874][00205] Avg episode reward: [(0, '27.639')] [2023-02-24 13:19:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 14041088. Throughput: 0: 909.3. Samples: 3509764. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 13:19:47,875][00205] Avg episode reward: [(0, '29.394')] [2023-02-24 13:19:50,471][11215] Updated weights for policy 0, policy_version 3430 (0.0027) [2023-02-24 13:19:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14057472. Throughput: 0: 929.2. Samples: 3514906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:19:52,872][00205] Avg episode reward: [(0, '29.038')] [2023-02-24 13:19:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14082048. Throughput: 0: 958.0. Samples: 3518462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:19:57,873][00205] Avg episode reward: [(0, '28.238')] [2023-02-24 13:19:59,275][11215] Updated weights for policy 0, policy_version 3440 (0.0022) [2023-02-24 13:20:02,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3754.5, 300 sec: 3846.0). Total num frames: 14102528. Throughput: 0: 955.6. Samples: 3524892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:20:02,877][00205] Avg episode reward: [(0, '30.013')] [2023-02-24 13:20:07,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3832.2). Total num frames: 14114816. Throughput: 0: 910.3. Samples: 3529152. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:20:07,874][00205] Avg episode reward: [(0, '26.897')] [2023-02-24 13:20:11,975][11215] Updated weights for policy 0, policy_version 3450 (0.0021) [2023-02-24 13:20:12,870][00205] Fps is (10 sec: 3277.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14135296. Throughput: 0: 911.4. Samples: 3531366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:20:12,872][00205] Avg episode reward: [(0, '26.146')] [2023-02-24 13:20:17,870][00205] Fps is (10 sec: 4096.5, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14155776. Throughput: 0: 966.9. Samples: 3538348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:20:17,872][00205] Avg episode reward: [(0, '25.074')] [2023-02-24 13:20:20,670][11215] Updated weights for policy 0, policy_version 3460 (0.0011) [2023-02-24 13:20:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 14176256. Throughput: 0: 956.0. Samples: 3544684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:20:22,874][00205] Avg episode reward: [(0, '25.799')] [2023-02-24 13:20:27,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 14192640. Throughput: 0: 926.8. Samples: 3546870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:20:27,875][00205] Avg episode reward: [(0, '27.608')] [2023-02-24 13:20:32,831][11215] Updated weights for policy 0, policy_version 3470 (0.0030) [2023-02-24 13:20:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14213120. Throughput: 0: 933.2. Samples: 3551758. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:20:32,872][00205] Avg episode reward: [(0, '28.704')] [2023-02-24 13:20:37,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14233600. Throughput: 0: 975.5. Samples: 3558802. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:20:37,878][00205] Avg episode reward: [(0, '30.275')] [2023-02-24 13:20:42,548][11215] Updated weights for policy 0, policy_version 3480 (0.0024) [2023-02-24 13:20:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 14254080. Throughput: 0: 973.6. Samples: 3562272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:20:42,872][00205] Avg episode reward: [(0, '29.947')] [2023-02-24 13:20:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14266368. Throughput: 0: 927.9. Samples: 3566644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:20:47,873][00205] Avg episode reward: [(0, '31.964')] [2023-02-24 13:20:47,890][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth... [2023-02-24 13:20:48,049][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003262_13361152.pth [2023-02-24 13:20:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14286848. Throughput: 0: 952.0. Samples: 3571990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:20:52,872][00205] Avg episode reward: [(0, '32.424')] [2023-02-24 13:20:54,231][11215] Updated weights for policy 0, policy_version 3490 (0.0026) [2023-02-24 13:20:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14311424. Throughput: 0: 979.7. Samples: 3575454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:20:57,875][00205] Avg episode reward: [(0, '31.058')] [2023-02-24 13:21:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3804.5). Total num frames: 14327808. Throughput: 0: 969.7. Samples: 3581984. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:21:02,878][00205] Avg episode reward: [(0, '31.370')] [2023-02-24 13:21:04,717][11215] Updated weights for policy 0, policy_version 3500 (0.0030) [2023-02-24 13:21:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 14344192. Throughput: 0: 923.7. Samples: 3586250. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:07,875][00205] Avg episode reward: [(0, '30.257')] [2023-02-24 13:21:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14360576. Throughput: 0: 925.1. Samples: 3588500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:12,879][00205] Avg episode reward: [(0, '30.011')] [2023-02-24 13:21:15,474][11215] Updated weights for policy 0, policy_version 3510 (0.0020) [2023-02-24 13:21:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14385152. Throughput: 0: 973.6. Samples: 3595572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:17,876][00205] Avg episode reward: [(0, '29.541')] [2023-02-24 13:21:22,872][00205] Fps is (10 sec: 4504.5, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 14405632. Throughput: 0: 955.8. Samples: 3601814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:21:22,877][00205] Avg episode reward: [(0, '28.882')] [2023-02-24 13:21:26,956][11215] Updated weights for policy 0, policy_version 3520 (0.0026) [2023-02-24 13:21:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 14417920. Throughput: 0: 926.4. Samples: 3603962. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:21:27,872][00205] Avg episode reward: [(0, '29.000')] [2023-02-24 13:21:32,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14438400. Throughput: 0: 935.4. Samples: 3608738. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:32,873][00205] Avg episode reward: [(0, '28.622')] [2023-02-24 13:21:36,772][11215] Updated weights for policy 0, policy_version 3530 (0.0026) [2023-02-24 13:21:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 14462976. Throughput: 0: 972.3. Samples: 3615742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:37,873][00205] Avg episode reward: [(0, '28.415')] [2023-02-24 13:21:42,877][00205] Fps is (10 sec: 4093.0, 60 sec: 3754.2, 300 sec: 3790.4). Total num frames: 14479360. Throughput: 0: 972.9. Samples: 3619240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:42,884][00205] Avg episode reward: [(0, '26.604')] [2023-02-24 13:21:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14495744. Throughput: 0: 922.6. Samples: 3623500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:47,873][00205] Avg episode reward: [(0, '28.410')] [2023-02-24 13:21:48,978][11215] Updated weights for policy 0, policy_version 3540 (0.0028) [2023-02-24 13:21:52,870][00205] Fps is (10 sec: 3689.1, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14516224. Throughput: 0: 948.1. Samples: 3628914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:21:52,872][00205] Avg episode reward: [(0, '28.067')] [2023-02-24 13:21:57,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 14536704. Throughput: 0: 976.2. Samples: 3632428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:21:57,873][00205] Avg episode reward: [(0, '28.859')] [2023-02-24 13:21:58,067][11215] Updated weights for policy 0, policy_version 3550 (0.0013) [2023-02-24 13:22:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 14557184. Throughput: 0: 959.7. Samples: 3638758. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:02,871][00205] Avg episode reward: [(0, '29.466')] [2023-02-24 13:22:07,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14569472. Throughput: 0: 917.2. Samples: 3643084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:07,873][00205] Avg episode reward: [(0, '29.846')] [2023-02-24 13:22:10,606][11215] Updated weights for policy 0, policy_version 3560 (0.0020) [2023-02-24 13:22:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14589952. Throughput: 0: 922.3. Samples: 3645464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:12,872][00205] Avg episode reward: [(0, '30.779')] [2023-02-24 13:22:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14614528. Throughput: 0: 971.7. Samples: 3652466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:22:17,872][00205] Avg episode reward: [(0, '30.570')] [2023-02-24 13:22:19,483][11215] Updated weights for policy 0, policy_version 3570 (0.0020) [2023-02-24 13:22:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 14630912. Throughput: 0: 947.1. Samples: 3658362. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:22:22,875][00205] Avg episode reward: [(0, '30.270')] [2023-02-24 13:22:27,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14647296. Throughput: 0: 917.7. Samples: 3660532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:22:27,879][00205] Avg episode reward: [(0, '30.798')] [2023-02-24 13:22:31,857][11215] Updated weights for policy 0, policy_version 3580 (0.0017) [2023-02-24 13:22:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14667776. Throughput: 0: 940.0. Samples: 3665800. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:32,879][00205] Avg episode reward: [(0, '28.957')] [2023-02-24 13:22:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14692352. Throughput: 0: 974.2. Samples: 3672752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:22:37,877][00205] Avg episode reward: [(0, '28.236')] [2023-02-24 13:22:41,102][11215] Updated weights for policy 0, policy_version 3590 (0.0026) [2023-02-24 13:22:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.4, 300 sec: 3804.4). Total num frames: 14708736. Throughput: 0: 967.5. Samples: 3675964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:22:42,877][00205] Avg episode reward: [(0, '27.926')] [2023-02-24 13:22:47,874][00205] Fps is (10 sec: 2866.0, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 14721024. Throughput: 0: 925.2. Samples: 3680394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:47,879][00205] Avg episode reward: [(0, '27.837')] [2023-02-24 13:22:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth... [2023-02-24 13:22:48,060][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003373_13815808.pth [2023-02-24 13:22:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14741504. Throughput: 0: 953.4. Samples: 3685988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:52,872][00205] Avg episode reward: [(0, '27.800')] [2023-02-24 13:22:53,026][11215] Updated weights for policy 0, policy_version 3600 (0.0021) [2023-02-24 13:22:57,870][00205] Fps is (10 sec: 4507.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 14766080. Throughput: 0: 978.2. Samples: 3689482. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:22:57,879][00205] Avg episode reward: [(0, '28.941')] [2023-02-24 13:23:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14782464. Throughput: 0: 961.3. Samples: 3695726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:23:02,875][00205] Avg episode reward: [(0, '28.939')] [2023-02-24 13:23:03,086][11215] Updated weights for policy 0, policy_version 3610 (0.0011) [2023-02-24 13:23:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14798848. Throughput: 0: 925.8. Samples: 3700024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:07,873][00205] Avg episode reward: [(0, '29.466')] [2023-02-24 13:23:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14819328. Throughput: 0: 934.0. Samples: 3702560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:23:12,872][00205] Avg episode reward: [(0, '30.402')] [2023-02-24 13:23:14,294][11215] Updated weights for policy 0, policy_version 3620 (0.0024) [2023-02-24 13:23:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14843904. Throughput: 0: 972.8. Samples: 3709576. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:23:17,872][00205] Avg episode reward: [(0, '31.584')] [2023-02-24 13:23:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14860288. Throughput: 0: 947.2. Samples: 3715374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:22,873][00205] Avg episode reward: [(0, '31.972')] [2023-02-24 13:23:25,541][11215] Updated weights for policy 0, policy_version 3630 (0.0030) [2023-02-24 13:23:27,870][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 14872576. Throughput: 0: 922.3. Samples: 3717468. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:23:27,878][00205] Avg episode reward: [(0, '31.147')] [2023-02-24 13:23:32,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 14893056. Throughput: 0: 940.1. Samples: 3722696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:32,879][00205] Avg episode reward: [(0, '30.206')] [2023-02-24 13:23:35,677][11215] Updated weights for policy 0, policy_version 3640 (0.0013) [2023-02-24 13:23:37,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 14917632. Throughput: 0: 972.9. Samples: 3729768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:37,872][00205] Avg episode reward: [(0, '29.361')] [2023-02-24 13:23:42,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 14934016. Throughput: 0: 962.9. Samples: 3732814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:42,874][00205] Avg episode reward: [(0, '29.718')] [2023-02-24 13:23:47,828][11215] Updated weights for policy 0, policy_version 3650 (0.0019) [2023-02-24 13:23:47,875][00205] Fps is (10 sec: 3275.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14950400. Throughput: 0: 916.6. Samples: 3736978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:23:47,882][00205] Avg episode reward: [(0, '29.596')] [2023-02-24 13:23:52,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 14970880. Throughput: 0: 950.8. Samples: 3742810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:23:52,878][00205] Avg episode reward: [(0, '27.900')] [2023-02-24 13:23:56,875][11215] Updated weights for policy 0, policy_version 3660 (0.0011) [2023-02-24 13:23:57,870][00205] Fps is (10 sec: 4507.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 14995456. Throughput: 0: 972.5. Samples: 3746322. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:23:57,872][00205] Avg episode reward: [(0, '28.600')] [2023-02-24 13:24:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15011840. Throughput: 0: 952.3. Samples: 3752430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:24:02,877][00205] Avg episode reward: [(0, '29.330')] [2023-02-24 13:24:07,873][00205] Fps is (10 sec: 2866.3, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 15024128. Throughput: 0: 917.5. Samples: 3756664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:24:07,878][00205] Avg episode reward: [(0, '29.552')] [2023-02-24 13:24:09,391][11215] Updated weights for policy 0, policy_version 3670 (0.0020) [2023-02-24 13:24:12,874][00205] Fps is (10 sec: 3275.4, 60 sec: 3754.4, 300 sec: 3776.6). Total num frames: 15044608. Throughput: 0: 936.1. Samples: 3759594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:24:12,877][00205] Avg episode reward: [(0, '30.132')] [2023-02-24 13:24:17,870][00205] Fps is (10 sec: 4506.9, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15069184. Throughput: 0: 975.4. Samples: 3766590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:24:17,874][00205] Avg episode reward: [(0, '30.715')] [2023-02-24 13:24:18,083][11215] Updated weights for policy 0, policy_version 3680 (0.0022) [2023-02-24 13:24:22,873][00205] Fps is (10 sec: 4096.4, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15085568. Throughput: 0: 940.3. Samples: 3772084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:24:22,876][00205] Avg episode reward: [(0, '31.799')] [2023-02-24 13:24:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 15101952. Throughput: 0: 921.7. Samples: 3774290. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:24:27,873][00205] Avg episode reward: [(0, '31.516')] [2023-02-24 13:24:30,543][11215] Updated weights for policy 0, policy_version 3690 (0.0024) [2023-02-24 13:24:32,870][00205] Fps is (10 sec: 3687.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15122432. Throughput: 0: 954.7. Samples: 3779936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:24:32,873][00205] Avg episode reward: [(0, '31.794')] [2023-02-24 13:24:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15147008. Throughput: 0: 982.4. Samples: 3787018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:24:37,876][00205] Avg episode reward: [(0, '30.642')] [2023-02-24 13:24:39,576][11215] Updated weights for policy 0, policy_version 3700 (0.0017) [2023-02-24 13:24:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 15163392. Throughput: 0: 965.6. Samples: 3789772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:24:42,877][00205] Avg episode reward: [(0, '28.464')] [2023-02-24 13:24:47,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3755.0, 300 sec: 3790.5). Total num frames: 15175680. Throughput: 0: 923.3. Samples: 3793980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:24:47,878][00205] Avg episode reward: [(0, '28.684')] [2023-02-24 13:24:47,897][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth... [2023-02-24 13:24:48,079][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003483_14266368.pth [2023-02-24 13:24:51,750][11215] Updated weights for policy 0, policy_version 3710 (0.0020) [2023-02-24 13:24:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15200256. Throughput: 0: 963.0. Samples: 3799994. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:24:52,873][00205] Avg episode reward: [(0, '27.226')] [2023-02-24 13:24:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 15220736. Throughput: 0: 975.4. Samples: 3803482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:24:57,875][00205] Avg episode reward: [(0, '27.058')] [2023-02-24 13:25:02,409][11215] Updated weights for policy 0, policy_version 3720 (0.0028) [2023-02-24 13:25:02,873][00205] Fps is (10 sec: 3685.3, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 15237120. Throughput: 0: 941.8. Samples: 3808976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:25:02,879][00205] Avg episode reward: [(0, '27.016')] [2023-02-24 13:25:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 15249408. Throughput: 0: 916.3. Samples: 3813316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:07,879][00205] Avg episode reward: [(0, '26.833')] [2023-02-24 13:25:12,870][00205] Fps is (10 sec: 3687.5, 60 sec: 3823.2, 300 sec: 3790.5). Total num frames: 15273984. Throughput: 0: 934.8. Samples: 3816354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:12,873][00205] Avg episode reward: [(0, '28.488')] [2023-02-24 13:25:13,510][11215] Updated weights for policy 0, policy_version 3730 (0.0017) [2023-02-24 13:25:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15298560. Throughput: 0: 961.5. Samples: 3823204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:25:17,873][00205] Avg episode reward: [(0, '29.344')] [2023-02-24 13:25:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 15310848. Throughput: 0: 923.4. Samples: 3828570. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:25:22,874][00205] Avg episode reward: [(0, '30.927')] [2023-02-24 13:25:24,623][11215] Updated weights for policy 0, policy_version 3740 (0.0016) [2023-02-24 13:25:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15327232. Throughput: 0: 910.9. Samples: 3830762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:27,873][00205] Avg episode reward: [(0, '30.591')] [2023-02-24 13:25:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15351808. Throughput: 0: 946.9. Samples: 3836590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:25:32,872][00205] Avg episode reward: [(0, '30.033')] [2023-02-24 13:25:34,539][11215] Updated weights for policy 0, policy_version 3750 (0.0017) [2023-02-24 13:25:37,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15372288. Throughput: 0: 968.7. Samples: 3843584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:25:37,879][00205] Avg episode reward: [(0, '29.382')] [2023-02-24 13:25:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15388672. Throughput: 0: 949.6. Samples: 3846214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:42,872][00205] Avg episode reward: [(0, '29.231')] [2023-02-24 13:25:46,555][11215] Updated weights for policy 0, policy_version 3760 (0.0030) [2023-02-24 13:25:47,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15405056. Throughput: 0: 924.2. Samples: 3850564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:25:47,879][00205] Avg episode reward: [(0, '28.233')] [2023-02-24 13:25:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15425536. Throughput: 0: 968.5. Samples: 3856900. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:52,879][00205] Avg episode reward: [(0, '28.024')] [2023-02-24 13:25:55,699][11215] Updated weights for policy 0, policy_version 3770 (0.0011) [2023-02-24 13:25:57,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15450112. Throughput: 0: 979.4. Samples: 3860428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:25:57,878][00205] Avg episode reward: [(0, '28.417')] [2023-02-24 13:26:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 15466496. Throughput: 0: 952.5. Samples: 3866066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:26:02,873][00205] Avg episode reward: [(0, '29.659')] [2023-02-24 13:26:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15478784. Throughput: 0: 930.6. Samples: 3870448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:26:07,872][00205] Avg episode reward: [(0, '30.114')] [2023-02-24 13:26:08,097][11215] Updated weights for policy 0, policy_version 3780 (0.0037) [2023-02-24 13:26:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15503360. Throughput: 0: 952.3. Samples: 3873616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:26:12,880][00205] Avg episode reward: [(0, '30.489')] [2023-02-24 13:26:16,975][11215] Updated weights for policy 0, policy_version 3790 (0.0017) [2023-02-24 13:26:17,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 15527936. Throughput: 0: 975.8. Samples: 3880500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:26:17,879][00205] Avg episode reward: [(0, '31.722')] [2023-02-24 13:26:22,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15540224. Throughput: 0: 936.0. Samples: 3885702. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:26:22,874][00205] Avg episode reward: [(0, '32.746')] [2023-02-24 13:26:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15556608. Throughput: 0: 924.5. Samples: 3887818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:26:27,873][00205] Avg episode reward: [(0, '31.724')] [2023-02-24 13:26:29,367][11215] Updated weights for policy 0, policy_version 3800 (0.0011) [2023-02-24 13:26:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15577088. Throughput: 0: 960.3. Samples: 3893776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:26:32,876][00205] Avg episode reward: [(0, '29.443')] [2023-02-24 13:26:37,872][00205] Fps is (10 sec: 4504.6, 60 sec: 3822.8, 300 sec: 3804.5). Total num frames: 15601664. Throughput: 0: 974.1. Samples: 3900736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:26:37,878][00205] Avg episode reward: [(0, '28.960')] [2023-02-24 13:26:38,499][11215] Updated weights for policy 0, policy_version 3810 (0.0017) [2023-02-24 13:26:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15613952. Throughput: 0: 948.3. Samples: 3903102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:26:42,874][00205] Avg episode reward: [(0, '28.649')] [2023-02-24 13:26:47,870][00205] Fps is (10 sec: 2867.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15630336. Throughput: 0: 919.1. Samples: 3907426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:26:47,872][00205] Avg episode reward: [(0, '27.857')] [2023-02-24 13:26:47,887][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth... 
[2023-02-24 13:26:48,007][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003594_14721024.pth [2023-02-24 13:26:50,814][11215] Updated weights for policy 0, policy_version 3820 (0.0020) [2023-02-24 13:26:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15654912. Throughput: 0: 961.1. Samples: 3913696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:26:52,872][00205] Avg episode reward: [(0, '27.325')] [2023-02-24 13:26:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15675392. Throughput: 0: 968.2. Samples: 3917186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:26:57,879][00205] Avg episode reward: [(0, '27.446')] [2023-02-24 13:27:00,812][11215] Updated weights for policy 0, policy_version 3830 (0.0016) [2023-02-24 13:27:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15691776. Throughput: 0: 937.4. Samples: 3922684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:27:02,877][00205] Avg episode reward: [(0, '29.302')] [2023-02-24 13:27:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15708160. Throughput: 0: 919.9. Samples: 3927096. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:27:07,873][00205] Avg episode reward: [(0, '28.767')] [2023-02-24 13:27:12,018][11215] Updated weights for policy 0, policy_version 3840 (0.0025) [2023-02-24 13:27:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 15728640. Throughput: 0: 947.1. Samples: 3930436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:27:12,872][00205] Avg episode reward: [(0, '30.464')] [2023-02-24 13:27:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 15753216. Throughput: 0: 967.7. Samples: 3937324. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:27:17,872][00205] Avg episode reward: [(0, '31.113')] [2023-02-24 13:27:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15765504. Throughput: 0: 925.3. Samples: 3942374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:27:22,881][00205] Avg episode reward: [(0, '32.603')] [2023-02-24 13:27:23,047][11215] Updated weights for policy 0, policy_version 3850 (0.0019) [2023-02-24 13:27:27,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 15781888. Throughput: 0: 922.0. Samples: 3944590. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:27:27,875][00205] Avg episode reward: [(0, '33.059')] [2023-02-24 13:27:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 15806464. Throughput: 0: 959.2. Samples: 3950590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:27:32,872][00205] Avg episode reward: [(0, '33.152')] [2023-02-24 13:27:33,423][11215] Updated weights for policy 0, policy_version 3860 (0.0012) [2023-02-24 13:27:37,878][00205] Fps is (10 sec: 4501.8, 60 sec: 3754.3, 300 sec: 3790.4). Total num frames: 15826944. Throughput: 0: 974.5. Samples: 3957556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:27:37,881][00205] Avg episode reward: [(0, '34.449')] [2023-02-24 13:27:37,899][11201] Saving new best policy, reward=34.449! [2023-02-24 13:27:42,874][00205] Fps is (10 sec: 3684.8, 60 sec: 3822.7, 300 sec: 3804.4). Total num frames: 15843328. Throughput: 0: 945.7. Samples: 3959746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:27:42,876][00205] Avg episode reward: [(0, '34.549')] [2023-02-24 13:27:42,886][11201] Saving new best policy, reward=34.549! [2023-02-24 13:27:45,283][11215] Updated weights for policy 0, policy_version 3870 (0.0011) [2023-02-24 13:27:47,870][00205] Fps is (10 sec: 3279.6, 60 sec: 3822.9, 300 sec: 3790.5). 
Total num frames: 15859712. Throughput: 0: 921.7. Samples: 3964162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:27:47,872][00205] Avg episode reward: [(0, '31.631')] [2023-02-24 13:27:52,870][00205] Fps is (10 sec: 4097.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15884288. Throughput: 0: 970.6. Samples: 3970774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:27:52,876][00205] Avg episode reward: [(0, '30.482')] [2023-02-24 13:27:54,630][11215] Updated weights for policy 0, policy_version 3880 (0.0019) [2023-02-24 13:27:57,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15904768. Throughput: 0: 975.8. Samples: 3974346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:27:57,872][00205] Avg episode reward: [(0, '30.739')] [2023-02-24 13:28:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15921152. Throughput: 0: 940.8. Samples: 3979662. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:28:02,873][00205] Avg episode reward: [(0, '31.045')] [2023-02-24 13:28:06,900][11215] Updated weights for policy 0, policy_version 3890 (0.0026) [2023-02-24 13:28:07,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 15937536. Throughput: 0: 928.7. Samples: 3984164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:07,873][00205] Avg episode reward: [(0, '31.096')] [2023-02-24 13:28:12,871][00205] Fps is (10 sec: 3686.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 15958016. Throughput: 0: 958.3. Samples: 3987712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:28:12,878][00205] Avg episode reward: [(0, '28.555')] [2023-02-24 13:28:15,825][11215] Updated weights for policy 0, policy_version 3900 (0.0014) [2023-02-24 13:28:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 15978496. Throughput: 0: 976.8. Samples: 3994548. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:17,877][00205] Avg episode reward: [(0, '29.449')] [2023-02-24 13:28:22,870][00205] Fps is (10 sec: 3686.7, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 15994880. Throughput: 0: 925.8. Samples: 3999208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:22,872][00205] Avg episode reward: [(0, '31.371')] [2023-02-24 13:28:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16011264. Throughput: 0: 924.0. Samples: 4001324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:27,879][00205] Avg episode reward: [(0, '30.108')] [2023-02-24 13:28:28,307][11215] Updated weights for policy 0, policy_version 3910 (0.0016) [2023-02-24 13:28:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16035840. Throughput: 0: 968.8. Samples: 4007756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:28:32,872][00205] Avg episode reward: [(0, '30.784')] [2023-02-24 13:28:37,024][11215] Updated weights for policy 0, policy_version 3920 (0.0011) [2023-02-24 13:28:37,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3823.5, 300 sec: 3804.4). Total num frames: 16056320. Throughput: 0: 973.9. Samples: 4014598. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:37,878][00205] Avg episode reward: [(0, '30.628')] [2023-02-24 13:28:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3804.5). Total num frames: 16072704. Throughput: 0: 942.5. Samples: 4016758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:42,879][00205] Avg episode reward: [(0, '30.219')] [2023-02-24 13:28:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16089088. Throughput: 0: 924.1. Samples: 4021248. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:28:47,872][00205] Avg episode reward: [(0, '30.982')] [2023-02-24 13:28:47,881][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth... [2023-02-24 13:28:47,999][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003705_15175680.pth [2023-02-24 13:28:49,555][11215] Updated weights for policy 0, policy_version 3930 (0.0020) [2023-02-24 13:28:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16109568. Throughput: 0: 971.3. Samples: 4027874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:28:52,872][00205] Avg episode reward: [(0, '31.041')] [2023-02-24 13:28:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16130048. Throughput: 0: 967.6. Samples: 4031252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:28:57,877][00205] Avg episode reward: [(0, '32.061')] [2023-02-24 13:28:59,693][11215] Updated weights for policy 0, policy_version 3940 (0.0017) [2023-02-24 13:29:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.5). Total num frames: 16146432. Throughput: 0: 929.4. Samples: 4036370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:29:02,879][00205] Avg episode reward: [(0, '30.960')] [2023-02-24 13:29:07,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3790.6). Total num frames: 16162816. Throughput: 0: 931.7. Samples: 4041134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:29:07,875][00205] Avg episode reward: [(0, '31.018')] [2023-02-24 13:29:10,828][11215] Updated weights for policy 0, policy_version 3950 (0.0020) [2023-02-24 13:29:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 16187392. Throughput: 0: 961.6. Samples: 4044598. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:29:12,880][00205] Avg episode reward: [(0, '31.144')] [2023-02-24 13:29:17,876][00205] Fps is (10 sec: 4502.9, 60 sec: 3822.5, 300 sec: 3804.4). Total num frames: 16207872. Throughput: 0: 972.3. Samples: 4051514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:29:17,879][00205] Avg episode reward: [(0, '32.337')] [2023-02-24 13:29:21,731][11215] Updated weights for policy 0, policy_version 3960 (0.0014) [2023-02-24 13:29:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16220160. Throughput: 0: 919.1. Samples: 4055956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:29:22,876][00205] Avg episode reward: [(0, '32.456')] [2023-02-24 13:29:27,870][00205] Fps is (10 sec: 3278.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16240640. Throughput: 0: 918.0. Samples: 4058066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:29:27,872][00205] Avg episode reward: [(0, '32.083')] [2023-02-24 13:29:32,158][11215] Updated weights for policy 0, policy_version 3970 (0.0017) [2023-02-24 13:29:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16261120. Throughput: 0: 965.4. Samples: 4064690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:29:32,873][00205] Avg episode reward: [(0, '30.460')] [2023-02-24 13:29:37,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3790.5). Total num frames: 16281600. Throughput: 0: 961.6. Samples: 4071148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:29:37,883][00205] Avg episode reward: [(0, '30.129')] [2023-02-24 13:29:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 16297984. Throughput: 0: 936.3. Samples: 4073386. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:29:42,875][00205] Avg episode reward: [(0, '29.237')] [2023-02-24 13:29:43,843][11215] Updated weights for policy 0, policy_version 3980 (0.0012) [2023-02-24 13:29:47,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16314368. Throughput: 0: 922.6. Samples: 4077888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:29:47,875][00205] Avg episode reward: [(0, '29.045')] [2023-02-24 13:29:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16338944. Throughput: 0: 966.9. Samples: 4084644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:29:52,878][00205] Avg episode reward: [(0, '31.121')] [2023-02-24 13:29:53,682][11215] Updated weights for policy 0, policy_version 3990 (0.0019) [2023-02-24 13:29:57,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 16359424. Throughput: 0: 965.5. Samples: 4088046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:29:57,873][00205] Avg episode reward: [(0, '30.992')] [2023-02-24 13:30:02,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3754.5, 300 sec: 3804.4). Total num frames: 16371712. Throughput: 0: 912.0. Samples: 4092550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:30:02,875][00205] Avg episode reward: [(0, '30.782')] [2023-02-24 13:30:06,278][11215] Updated weights for policy 0, policy_version 4000 (0.0021) [2023-02-24 13:30:07,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16388096. Throughput: 0: 923.2. Samples: 4097502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:30:07,873][00205] Avg episode reward: [(0, '30.459')] [2023-02-24 13:30:12,870][00205] Fps is (10 sec: 4097.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16412672. Throughput: 0: 953.2. Samples: 4100958. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:30:12,879][00205] Avg episode reward: [(0, '32.182')] [2023-02-24 13:30:15,315][11215] Updated weights for policy 0, policy_version 4010 (0.0013) [2023-02-24 13:30:17,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.1, 300 sec: 3804.4). Total num frames: 16433152. Throughput: 0: 956.1. Samples: 4107714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:30:17,876][00205] Avg episode reward: [(0, '32.199')] [2023-02-24 13:30:22,870][00205] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 16445440. Throughput: 0: 908.6. Samples: 4112032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:30:22,878][00205] Avg episode reward: [(0, '31.692')] [2023-02-24 13:30:27,659][11215] Updated weights for policy 0, policy_version 4020 (0.0027) [2023-02-24 13:30:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16465920. Throughput: 0: 907.7. Samples: 4114234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:30:27,881][00205] Avg episode reward: [(0, '30.593')] [2023-02-24 13:30:32,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16486400. Throughput: 0: 958.9. Samples: 4121038. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:30:32,872][00205] Avg episode reward: [(0, '29.981')] [2023-02-24 13:30:37,153][11215] Updated weights for policy 0, policy_version 4030 (0.0027) [2023-02-24 13:30:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3790.5). Total num frames: 16506880. Throughput: 0: 947.2. Samples: 4127268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:30:37,872][00205] Avg episode reward: [(0, '30.057')] [2023-02-24 13:30:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16519168. Throughput: 0: 917.9. Samples: 4129350. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:30:42,873][00205] Avg episode reward: [(0, '29.359')] [2023-02-24 13:30:47,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16539648. Throughput: 0: 928.5. Samples: 4134332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:30:47,873][00205] Avg episode reward: [(0, '29.214')] [2023-02-24 13:30:47,891][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth... [2023-02-24 13:30:48,013][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003816_15630336.pth [2023-02-24 13:30:49,043][11215] Updated weights for policy 0, policy_version 4040 (0.0023) [2023-02-24 13:30:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16564224. Throughput: 0: 971.4. Samples: 4141216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:30:52,878][00205] Avg episode reward: [(0, '29.870')] [2023-02-24 13:30:57,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 16580608. Throughput: 0: 968.6. Samples: 4144546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:30:57,878][00205] Avg episode reward: [(0, '29.219')] [2023-02-24 13:30:59,542][11215] Updated weights for policy 0, policy_version 4050 (0.0013) [2023-02-24 13:31:02,871][00205] Fps is (10 sec: 3276.3, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16596992. Throughput: 0: 916.0. Samples: 4148936. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:31:02,877][00205] Avg episode reward: [(0, '29.501')] [2023-02-24 13:31:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16617472. Throughput: 0: 941.9. Samples: 4154416. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:07,873][00205] Avg episode reward: [(0, '28.976')] [2023-02-24 13:31:10,200][11215] Updated weights for policy 0, policy_version 4060 (0.0016) [2023-02-24 13:31:12,870][00205] Fps is (10 sec: 4096.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16637952. Throughput: 0: 970.3. Samples: 4157898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:12,872][00205] Avg episode reward: [(0, '30.104')] [2023-02-24 13:31:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16658432. Throughput: 0: 957.9. Samples: 4164142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:17,877][00205] Avg episode reward: [(0, '30.695')] [2023-02-24 13:31:21,723][11215] Updated weights for policy 0, policy_version 4070 (0.0028) [2023-02-24 13:31:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16670720. Throughput: 0: 916.6. Samples: 4168516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:22,877][00205] Avg episode reward: [(0, '29.653')] [2023-02-24 13:31:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16691200. Throughput: 0: 928.3. Samples: 4171124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:31:27,872][00205] Avg episode reward: [(0, '29.709')] [2023-02-24 13:31:31,542][11215] Updated weights for policy 0, policy_version 4080 (0.0019) [2023-02-24 13:31:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16715776. Throughput: 0: 968.9. Samples: 4177932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:32,873][00205] Avg episode reward: [(0, '29.639')] [2023-02-24 13:31:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16732160. Throughput: 0: 942.6. Samples: 4183632. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:37,875][00205] Avg episode reward: [(0, '29.612')] [2023-02-24 13:31:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 16748544. Throughput: 0: 917.0. Samples: 4185810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:31:42,873][00205] Avg episode reward: [(0, '27.973')] [2023-02-24 13:31:43,966][11215] Updated weights for policy 0, policy_version 4090 (0.0015) [2023-02-24 13:31:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 16769024. Throughput: 0: 935.5. Samples: 4191032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:31:47,877][00205] Avg episode reward: [(0, '27.151')] [2023-02-24 13:31:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16789504. Throughput: 0: 963.7. Samples: 4197782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:31:52,876][00205] Avg episode reward: [(0, '27.364')] [2023-02-24 13:31:53,266][11215] Updated weights for policy 0, policy_version 4100 (0.0016) [2023-02-24 13:31:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16805888. Throughput: 0: 955.4. Samples: 4200890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:31:57,878][00205] Avg episode reward: [(0, '27.743')] [2023-02-24 13:32:02,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16822272. Throughput: 0: 911.9. Samples: 4205180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:32:02,873][00205] Avg episode reward: [(0, '26.935')] [2023-02-24 13:32:05,764][11215] Updated weights for policy 0, policy_version 4110 (0.0015) [2023-02-24 13:32:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16842752. Throughput: 0: 941.7. Samples: 4210894. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:32:07,884][00205] Avg episode reward: [(0, '27.696')] [2023-02-24 13:32:12,873][00205] Fps is (10 sec: 4504.2, 60 sec: 3822.7, 300 sec: 3776.6). Total num frames: 16867328. Throughput: 0: 959.8. Samples: 4214318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:32:12,877][00205] Avg episode reward: [(0, '30.044')] [2023-02-24 13:32:15,059][11215] Updated weights for policy 0, policy_version 4120 (0.0017) [2023-02-24 13:32:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 16883712. Throughput: 0: 943.8. Samples: 4220404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:32:17,877][00205] Avg episode reward: [(0, '32.229')] [2023-02-24 13:32:22,870][00205] Fps is (10 sec: 2868.1, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 16896000. Throughput: 0: 913.1. Samples: 4224720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:32:22,879][00205] Avg episode reward: [(0, '33.955')] [2023-02-24 13:32:27,093][11215] Updated weights for policy 0, policy_version 4130 (0.0011) [2023-02-24 13:32:27,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16916480. Throughput: 0: 925.8. Samples: 4227470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:32:27,872][00205] Avg episode reward: [(0, '34.055')] [2023-02-24 13:32:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3776.8). Total num frames: 16941056. Throughput: 0: 966.7. Samples: 4234534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:32:32,873][00205] Avg episode reward: [(0, '34.715')] [2023-02-24 13:32:32,875][11201] Saving new best policy, reward=34.715! [2023-02-24 13:32:36,765][11215] Updated weights for policy 0, policy_version 4140 (0.0025) [2023-02-24 13:32:37,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 16957440. Throughput: 0: 940.8. Samples: 4240118. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:32:37,873][00205] Avg episode reward: [(0, '36.120')] [2023-02-24 13:32:37,884][11201] Saving new best policy, reward=36.120! [2023-02-24 13:32:42,871][00205] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 16973824. Throughput: 0: 919.4. Samples: 4242266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:32:42,880][00205] Avg episode reward: [(0, '35.290')] [2023-02-24 13:32:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 16994304. Throughput: 0: 944.2. Samples: 4247670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:32:47,877][00205] Avg episode reward: [(0, '34.136')] [2023-02-24 13:32:47,894][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth... [2023-02-24 13:32:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000003928_16089088.pth [2023-02-24 13:32:48,301][11215] Updated weights for policy 0, policy_version 4150 (0.0027) [2023-02-24 13:32:52,870][00205] Fps is (10 sec: 4506.1, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17018880. Throughput: 0: 971.9. Samples: 4254628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:32:52,879][00205] Avg episode reward: [(0, '32.439')] [2023-02-24 13:32:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17031168. Throughput: 0: 957.7. Samples: 4257412. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:32:57,875][00205] Avg episode reward: [(0, '32.089')] [2023-02-24 13:32:59,334][11215] Updated weights for policy 0, policy_version 4160 (0.0020) [2023-02-24 13:33:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17047552. Throughput: 0: 919.1. Samples: 4261762. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:33:02,877][00205] Avg episode reward: [(0, '32.550')] [2023-02-24 13:33:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17068032. Throughput: 0: 954.0. Samples: 4267650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:33:07,876][00205] Avg episode reward: [(0, '32.966')] [2023-02-24 13:33:09,698][11215] Updated weights for policy 0, policy_version 4170 (0.0016) [2023-02-24 13:33:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3776.7). Total num frames: 17092608. Throughput: 0: 968.8. Samples: 4271064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:33:12,872][00205] Avg episode reward: [(0, '32.938')] [2023-02-24 13:33:17,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17108992. Throughput: 0: 939.4. Samples: 4276808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:33:17,872][00205] Avg episode reward: [(0, '32.660')] [2023-02-24 13:33:21,800][11215] Updated weights for policy 0, policy_version 4180 (0.0027) [2023-02-24 13:33:22,870][00205] Fps is (10 sec: 2867.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17121280. Throughput: 0: 912.0. Samples: 4281156. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:33:22,878][00205] Avg episode reward: [(0, '31.478')] [2023-02-24 13:33:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17145856. Throughput: 0: 928.6. Samples: 4284050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:33:27,879][00205] Avg episode reward: [(0, '31.509')] [2023-02-24 13:33:31,446][11215] Updated weights for policy 0, policy_version 4190 (0.0018) [2023-02-24 13:33:32,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17166336. Throughput: 0: 958.2. Samples: 4290790. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:33:32,873][00205] Avg episode reward: [(0, '32.313')] [2023-02-24 13:33:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17182720. Throughput: 0: 916.0. Samples: 4295850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:33:37,874][00205] Avg episode reward: [(0, '33.047')] [2023-02-24 13:33:42,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.5, 300 sec: 3748.9). Total num frames: 17195008. Throughput: 0: 901.0. Samples: 4297958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:33:42,874][00205] Avg episode reward: [(0, '32.472')] [2023-02-24 13:33:44,122][11215] Updated weights for policy 0, policy_version 4200 (0.0020) [2023-02-24 13:33:47,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17219584. Throughput: 0: 935.4. Samples: 4303854. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:33:47,879][00205] Avg episode reward: [(0, '32.023')] [2023-02-24 13:33:52,823][11215] Updated weights for policy 0, policy_version 4210 (0.0021) [2023-02-24 13:33:52,870][00205] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17244160. Throughput: 0: 960.3. Samples: 4310864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:33:52,877][00205] Avg episode reward: [(0, '32.352')] [2023-02-24 13:33:57,870][00205] Fps is (10 sec: 3686.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17256448. Throughput: 0: 939.6. Samples: 4313348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:33:57,873][00205] Avg episode reward: [(0, '33.993')] [2023-02-24 13:34:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17272832. Throughput: 0: 907.8. Samples: 4317660. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:02,872][00205] Avg episode reward: [(0, '34.625')] [2023-02-24 13:34:05,487][11215] Updated weights for policy 0, policy_version 4220 (0.0018) [2023-02-24 13:34:07,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17293312. Throughput: 0: 950.6. Samples: 4323932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:34:07,874][00205] Avg episode reward: [(0, '33.173')] [2023-02-24 13:34:12,879][00205] Fps is (10 sec: 4503.3, 60 sec: 3754.4, 300 sec: 3762.8). Total num frames: 17317888. Throughput: 0: 961.7. Samples: 4327330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:12,882][00205] Avg episode reward: [(0, '31.212')] [2023-02-24 13:34:15,093][11215] Updated weights for policy 0, policy_version 4230 (0.0012) [2023-02-24 13:34:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 17330176. Throughput: 0: 935.1. Samples: 4332870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:17,878][00205] Avg episode reward: [(0, '30.270')] [2023-02-24 13:34:22,870][00205] Fps is (10 sec: 2868.6, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17346560. Throughput: 0: 920.5. Samples: 4337274. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:22,873][00205] Avg episode reward: [(0, '30.476')] [2023-02-24 13:34:26,653][11215] Updated weights for policy 0, policy_version 4240 (0.0013) [2023-02-24 13:34:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17371136. Throughput: 0: 945.7. Samples: 4340514. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:27,876][00205] Avg episode reward: [(0, '29.960')] [2023-02-24 13:34:32,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17391616. Throughput: 0: 969.8. Samples: 4347496. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-24 13:34:32,874][00205] Avg episode reward: [(0, '29.325')] [2023-02-24 13:34:37,210][11215] Updated weights for policy 0, policy_version 4250 (0.0015) [2023-02-24 13:34:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17408000. Throughput: 0: 926.6. Samples: 4352562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:37,873][00205] Avg episode reward: [(0, '29.577')] [2023-02-24 13:34:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17424384. Throughput: 0: 918.4. Samples: 4354676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:42,872][00205] Avg episode reward: [(0, '31.387')] [2023-02-24 13:34:47,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17444864. Throughput: 0: 957.4. Samples: 4360744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:47,872][00205] Avg episode reward: [(0, '31.086')] [2023-02-24 13:34:47,922][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth... [2023-02-24 13:34:47,925][11215] Updated weights for policy 0, policy_version 4260 (0.0018) [2023-02-24 13:34:48,064][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004038_16539648.pth [2023-02-24 13:34:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17469440. Throughput: 0: 972.8. Samples: 4367708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:34:52,875][00205] Avg episode reward: [(0, '30.335')] [2023-02-24 13:34:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17481728. Throughput: 0: 945.0. Samples: 4369850. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:34:57,877][00205] Avg episode reward: [(0, '29.849')] [2023-02-24 13:34:59,967][11215] Updated weights for policy 0, policy_version 4270 (0.0018) [2023-02-24 13:35:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17498112. Throughput: 0: 914.1. Samples: 4374004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:35:02,872][00205] Avg episode reward: [(0, '31.191')] [2023-02-24 13:35:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17522688. Throughput: 0: 963.6. Samples: 4380634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:07,873][00205] Avg episode reward: [(0, '32.240')] [2023-02-24 13:35:09,432][11215] Updated weights for policy 0, policy_version 4280 (0.0018) [2023-02-24 13:35:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3755.0, 300 sec: 3762.8). Total num frames: 17543168. Throughput: 0: 969.5. Samples: 4384140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:12,873][00205] Avg episode reward: [(0, '31.934')] [2023-02-24 13:35:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 17559552. Throughput: 0: 930.0. Samples: 4389344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:17,875][00205] Avg episode reward: [(0, '31.448')] [2023-02-24 13:35:21,830][11215] Updated weights for policy 0, policy_version 4290 (0.0011) [2023-02-24 13:35:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17575936. Throughput: 0: 919.2. Samples: 4393924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:35:22,873][00205] Avg episode reward: [(0, '31.255')] [2023-02-24 13:35:27,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17596416. Throughput: 0: 946.6. Samples: 4397274. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:27,872][00205] Avg episode reward: [(0, '32.735')] [2023-02-24 13:35:30,785][11215] Updated weights for policy 0, policy_version 4300 (0.0012) [2023-02-24 13:35:32,872][00205] Fps is (10 sec: 4095.0, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 17616896. Throughput: 0: 965.7. Samples: 4404202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:35:32,875][00205] Avg episode reward: [(0, '32.606')] [2023-02-24 13:35:37,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17633280. Throughput: 0: 912.5. Samples: 4408772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:35:37,872][00205] Avg episode reward: [(0, '30.765')] [2023-02-24 13:35:42,870][00205] Fps is (10 sec: 3277.6, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17649664. Throughput: 0: 912.8. Samples: 4410926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:35:42,877][00205] Avg episode reward: [(0, '31.647')] [2023-02-24 13:35:43,238][11215] Updated weights for policy 0, policy_version 4310 (0.0018) [2023-02-24 13:35:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17674240. Throughput: 0: 963.7. Samples: 4417370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:47,872][00205] Avg episode reward: [(0, '32.506')] [2023-02-24 13:35:52,482][11215] Updated weights for policy 0, policy_version 4320 (0.0016) [2023-02-24 13:35:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17694720. Throughput: 0: 965.4. Samples: 4424076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:52,876][00205] Avg episode reward: [(0, '32.686')] [2023-02-24 13:35:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17707008. Throughput: 0: 933.8. Samples: 4426162. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:35:57,878][00205] Avg episode reward: [(0, '31.490')] [2023-02-24 13:36:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17727488. Throughput: 0: 916.5. Samples: 4430588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:36:02,875][00205] Avg episode reward: [(0, '30.287')] [2023-02-24 13:36:04,490][11215] Updated weights for policy 0, policy_version 4330 (0.0012) [2023-02-24 13:36:07,871][00205] Fps is (10 sec: 4095.7, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 17747968. Throughput: 0: 969.9. Samples: 4437570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:36:07,876][00205] Avg episode reward: [(0, '29.798')] [2023-02-24 13:36:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 17768448. Throughput: 0: 971.0. Samples: 4440970. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:36:12,874][00205] Avg episode reward: [(0, '30.527')] [2023-02-24 13:36:14,723][11215] Updated weights for policy 0, policy_version 4340 (0.0026) [2023-02-24 13:36:17,871][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 17784832. Throughput: 0: 925.7. Samples: 4445858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:36:17,878][00205] Avg episode reward: [(0, '30.486')] [2023-02-24 13:36:22,870][00205] Fps is (10 sec: 3277.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17801216. Throughput: 0: 933.2. Samples: 4450768. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:36:22,872][00205] Avg episode reward: [(0, '29.989')] [2023-02-24 13:36:26,034][11215] Updated weights for policy 0, policy_version 4350 (0.0013) [2023-02-24 13:36:27,870][00205] Fps is (10 sec: 4096.3, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 17825792. Throughput: 0: 958.6. Samples: 4454064. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:36:27,879][00205] Avg episode reward: [(0, '32.190')] [2023-02-24 13:36:32,873][00205] Fps is (10 sec: 4504.0, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 17846272. Throughput: 0: 967.0. Samples: 4460888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:36:32,875][00205] Avg episode reward: [(0, '31.899')] [2023-02-24 13:36:37,228][11215] Updated weights for policy 0, policy_version 4360 (0.0014) [2023-02-24 13:36:37,877][00205] Fps is (10 sec: 3274.4, 60 sec: 3754.2, 300 sec: 3762.7). Total num frames: 17858560. Throughput: 0: 914.3. Samples: 4465228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:36:37,885][00205] Avg episode reward: [(0, '31.766')] [2023-02-24 13:36:42,870][00205] Fps is (10 sec: 2868.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17874944. Throughput: 0: 915.7. Samples: 4467370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:36:42,873][00205] Avg episode reward: [(0, '31.646')] [2023-02-24 13:36:47,420][11215] Updated weights for policy 0, policy_version 4370 (0.0017) [2023-02-24 13:36:47,870][00205] Fps is (10 sec: 4099.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17899520. Throughput: 0: 964.9. Samples: 4474008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:36:47,881][00205] Avg episode reward: [(0, '30.047')] [2023-02-24 13:36:47,888][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth... [2023-02-24 13:36:48,033][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004149_16994304.pth [2023-02-24 13:36:52,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 17920000. Throughput: 0: 948.7. Samples: 4480262. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:36:52,876][00205] Avg episode reward: [(0, '30.306')] [2023-02-24 13:36:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 17932288. Throughput: 0: 919.7. Samples: 4482356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:36:57,875][00205] Avg episode reward: [(0, '28.813')] [2023-02-24 13:36:59,646][11215] Updated weights for policy 0, policy_version 4380 (0.0033) [2023-02-24 13:37:02,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 17948672. Throughput: 0: 910.0. Samples: 4486808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:37:02,873][00205] Avg episode reward: [(0, '27.859')] [2023-02-24 13:37:07,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 17973248. Throughput: 0: 955.5. Samples: 4493766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:37:07,873][00205] Avg episode reward: [(0, '29.218')] [2023-02-24 13:37:08,913][11215] Updated weights for policy 0, policy_version 4390 (0.0012) [2023-02-24 13:37:12,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.8, 300 sec: 3762.8). Total num frames: 17993728. Throughput: 0: 959.2. Samples: 4497226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:37:12,880][00205] Avg episode reward: [(0, '31.056')] [2023-02-24 13:37:17,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18010112. Throughput: 0: 911.0. Samples: 4501882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:37:17,878][00205] Avg episode reward: [(0, '30.981')] [2023-02-24 13:37:21,376][11215] Updated weights for policy 0, policy_version 4400 (0.0015) [2023-02-24 13:37:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18026496. Throughput: 0: 930.7. Samples: 4507104. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:37:22,872][00205] Avg episode reward: [(0, '30.979')] [2023-02-24 13:37:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18051072. Throughput: 0: 960.0. Samples: 4510572. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:37:27,879][00205] Avg episode reward: [(0, '31.649')] [2023-02-24 13:37:30,306][11215] Updated weights for policy 0, policy_version 4410 (0.0023) [2023-02-24 13:37:32,870][00205] Fps is (10 sec: 4095.9, 60 sec: 3686.6, 300 sec: 3762.8). Total num frames: 18067456. Throughput: 0: 960.0. Samples: 4517206. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:37:32,873][00205] Avg episode reward: [(0, '31.672')] [2023-02-24 13:37:37,873][00205] Fps is (10 sec: 3275.8, 60 sec: 3754.9, 300 sec: 3762.7). Total num frames: 18083840. Throughput: 0: 918.5. Samples: 4521598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:37:37,875][00205] Avg episode reward: [(0, '31.408')] [2023-02-24 13:37:42,551][11215] Updated weights for policy 0, policy_version 4420 (0.0025) [2023-02-24 13:37:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18104320. Throughput: 0: 920.1. Samples: 4523760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:37:42,872][00205] Avg episode reward: [(0, '31.068')] [2023-02-24 13:37:47,870][00205] Fps is (10 sec: 4507.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18128896. Throughput: 0: 978.8. Samples: 4530854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:37:47,872][00205] Avg episode reward: [(0, '29.337')] [2023-02-24 13:37:51,721][11215] Updated weights for policy 0, policy_version 4430 (0.0013) [2023-02-24 13:37:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18145280. Throughput: 0: 962.1. Samples: 4537058. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:37:52,877][00205] Avg episode reward: [(0, '28.751')] [2023-02-24 13:37:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18161664. Throughput: 0: 932.4. Samples: 4539182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:37:57,878][00205] Avg episode reward: [(0, '28.305')] [2023-02-24 13:38:02,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18178048. Throughput: 0: 936.1. Samples: 4544008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:38:02,872][00205] Avg episode reward: [(0, '29.431')] [2023-02-24 13:38:03,901][11215] Updated weights for policy 0, policy_version 4440 (0.0023) [2023-02-24 13:38:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 18202624. Throughput: 0: 975.9. Samples: 4551018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:38:07,872][00205] Avg episode reward: [(0, '29.386')] [2023-02-24 13:38:12,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18223104. Throughput: 0: 972.9. Samples: 4554352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:38:12,876][00205] Avg episode reward: [(0, '31.248')] [2023-02-24 13:38:14,241][11215] Updated weights for policy 0, policy_version 4450 (0.0022) [2023-02-24 13:38:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18235392. Throughput: 0: 921.5. Samples: 4558672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:38:17,879][00205] Avg episode reward: [(0, '31.288')] [2023-02-24 13:38:22,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 18255872. Throughput: 0: 943.7. Samples: 4564060. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:38:22,872][00205] Avg episode reward: [(0, '31.926')] [2023-02-24 13:38:25,306][11215] Updated weights for policy 0, policy_version 4460 (0.0025) [2023-02-24 13:38:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18276352. Throughput: 0: 971.2. Samples: 4567464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:38:27,873][00205] Avg episode reward: [(0, '33.241')] [2023-02-24 13:38:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18296832. Throughput: 0: 953.2. Samples: 4573748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:38:32,880][00205] Avg episode reward: [(0, '32.765')] [2023-02-24 13:38:36,710][11215] Updated weights for policy 0, policy_version 4470 (0.0011) [2023-02-24 13:38:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3776.6). Total num frames: 18309120. Throughput: 0: 912.5. Samples: 4578120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:38:37,877][00205] Avg episode reward: [(0, '32.218')] [2023-02-24 13:38:42,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18329600. Throughput: 0: 920.8. Samples: 4580616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:38:42,873][00205] Avg episode reward: [(0, '32.247')] [2023-02-24 13:38:46,660][11215] Updated weights for policy 0, policy_version 4480 (0.0034) [2023-02-24 13:38:47,870][00205] Fps is (10 sec: 4505.4, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18354176. Throughput: 0: 966.6. Samples: 4587504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:38:47,872][00205] Avg episode reward: [(0, '30.376')] [2023-02-24 13:38:47,882][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth... 
[2023-02-24 13:38:48,010][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004260_17448960.pth [2023-02-24 13:38:52,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18370560. Throughput: 0: 938.8. Samples: 4593266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:38:52,875][00205] Avg episode reward: [(0, '29.966')] [2023-02-24 13:38:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18386944. Throughput: 0: 912.1. Samples: 4595396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:38:57,872][00205] Avg episode reward: [(0, '30.101')] [2023-02-24 13:38:59,192][11215] Updated weights for policy 0, policy_version 4490 (0.0016) [2023-02-24 13:39:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18407424. Throughput: 0: 929.7. Samples: 4600508. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:39:02,877][00205] Avg episode reward: [(0, '29.211')] [2023-02-24 13:39:07,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18427904. Throughput: 0: 965.3. Samples: 4607500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:39:07,874][00205] Avg episode reward: [(0, '31.558')] [2023-02-24 13:39:08,138][11215] Updated weights for policy 0, policy_version 4500 (0.0017) [2023-02-24 13:39:12,875][00205] Fps is (10 sec: 3684.4, 60 sec: 3686.1, 300 sec: 3776.6). Total num frames: 18444288. Throughput: 0: 957.4. Samples: 4610554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:39:12,880][00205] Avg episode reward: [(0, '30.965')] [2023-02-24 13:39:17,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18460672. Throughput: 0: 914.6. Samples: 4614906. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:39:17,877][00205] Avg episode reward: [(0, '31.455')] [2023-02-24 13:39:20,479][11215] Updated weights for policy 0, policy_version 4510 (0.0017) [2023-02-24 13:39:22,870][00205] Fps is (10 sec: 3688.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18481152. Throughput: 0: 946.6. Samples: 4620718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:39:22,882][00205] Avg episode reward: [(0, '32.129')] [2023-02-24 13:39:27,870][00205] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18505728. Throughput: 0: 968.6. Samples: 4624202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:39:27,881][00205] Avg episode reward: [(0, '32.640')] [2023-02-24 13:39:29,696][11215] Updated weights for policy 0, policy_version 4520 (0.0014) [2023-02-24 13:39:32,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18522112. Throughput: 0: 947.4. Samples: 4630138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:39:32,874][00205] Avg episode reward: [(0, '33.105')] [2023-02-24 13:39:37,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18534400. Throughput: 0: 916.4. Samples: 4634506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:39:37,879][00205] Avg episode reward: [(0, '33.413')] [2023-02-24 13:39:41,942][11215] Updated weights for policy 0, policy_version 4530 (0.0016) [2023-02-24 13:39:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18558976. Throughput: 0: 930.5. Samples: 4637266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:39:42,875][00205] Avg episode reward: [(0, '31.860')] [2023-02-24 13:39:47,870][00205] Fps is (10 sec: 4505.9, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18579456. Throughput: 0: 973.5. Samples: 4644316. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:39:47,871][00205] Avg episode reward: [(0, '33.947')] [2023-02-24 13:39:51,730][11215] Updated weights for policy 0, policy_version 4540 (0.0011) [2023-02-24 13:39:52,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18595840. Throughput: 0: 941.1. Samples: 4649848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:39:52,877][00205] Avg episode reward: [(0, '33.288')] [2023-02-24 13:39:57,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18612224. Throughput: 0: 921.8. Samples: 4652028. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:39:57,876][00205] Avg episode reward: [(0, '32.833')] [2023-02-24 13:40:02,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18632704. Throughput: 0: 941.5. Samples: 4657272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:40:02,872][00205] Avg episode reward: [(0, '32.925')] [2023-02-24 13:40:03,409][11215] Updated weights for policy 0, policy_version 4550 (0.0014) [2023-02-24 13:40:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18657280. Throughput: 0: 968.4. Samples: 4664296. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:40:07,875][00205] Avg episode reward: [(0, '32.421')] [2023-02-24 13:40:12,871][00205] Fps is (10 sec: 4095.4, 60 sec: 3823.2, 300 sec: 3776.6). Total num frames: 18673664. Throughput: 0: 952.5. Samples: 4667064. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:40:12,874][00205] Avg episode reward: [(0, '32.367')] [2023-02-24 13:40:14,224][11215] Updated weights for policy 0, policy_version 4560 (0.0014) [2023-02-24 13:40:17,871][00205] Fps is (10 sec: 2867.0, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18685952. Throughput: 0: 916.9. Samples: 4671398. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:40:17,874][00205] Avg episode reward: [(0, '33.540')] [2023-02-24 13:40:22,870][00205] Fps is (10 sec: 3686.9, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18710528. Throughput: 0: 956.8. Samples: 4677562. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:40:22,872][00205] Avg episode reward: [(0, '31.933')] [2023-02-24 13:40:24,288][11215] Updated weights for policy 0, policy_version 4570 (0.0020) [2023-02-24 13:40:27,870][00205] Fps is (10 sec: 4505.8, 60 sec: 3754.6, 300 sec: 3776.7). Total num frames: 18731008. Throughput: 0: 974.3. Samples: 4681112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:40:27,872][00205] Avg episode reward: [(0, '30.706')] [2023-02-24 13:40:32,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 18747392. Throughput: 0: 941.7. Samples: 4686694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:40:32,877][00205] Avg episode reward: [(0, '31.070')] [2023-02-24 13:40:36,269][11215] Updated weights for policy 0, policy_version 4580 (0.0018) [2023-02-24 13:40:37,870][00205] Fps is (10 sec: 3276.9, 60 sec: 3823.0, 300 sec: 3776.6). Total num frames: 18763776. Throughput: 0: 917.0. Samples: 4691112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:40:37,872][00205] Avg episode reward: [(0, '30.738')] [2023-02-24 13:40:42,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18784256. Throughput: 0: 934.1. Samples: 4694062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:40:42,873][00205] Avg episode reward: [(0, '31.084')] [2023-02-24 13:40:45,892][11215] Updated weights for policy 0, policy_version 4590 (0.0022) [2023-02-24 13:40:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18808832. Throughput: 0: 974.7. Samples: 4701134. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:40:47,872][00205] Avg episode reward: [(0, '30.620')] [2023-02-24 13:40:47,880][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth... [2023-02-24 13:40:48,029][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004370_17899520.pth [2023-02-24 13:40:52,871][00205] Fps is (10 sec: 3685.9, 60 sec: 3754.6, 300 sec: 3776.6). Total num frames: 18821120. Throughput: 0: 937.5. Samples: 4706484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:40:52,875][00205] Avg episode reward: [(0, '30.381')] [2023-02-24 13:40:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18837504. Throughput: 0: 923.6. Samples: 4708624. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:40:57,874][00205] Avg episode reward: [(0, '31.576')] [2023-02-24 13:40:58,500][11215] Updated weights for policy 0, policy_version 4600 (0.0011) [2023-02-24 13:41:02,870][00205] Fps is (10 sec: 4096.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18862080. Throughput: 0: 950.1. Samples: 4714150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:41:02,880][00205] Avg episode reward: [(0, '31.787')] [2023-02-24 13:41:07,349][11215] Updated weights for policy 0, policy_version 4610 (0.0026) [2023-02-24 13:41:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18882560. Throughput: 0: 969.6. Samples: 4721196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:41:07,873][00205] Avg episode reward: [(0, '31.181')] [2023-02-24 13:41:12,870][00205] Fps is (10 sec: 3686.5, 60 sec: 3754.8, 300 sec: 3776.7). Total num frames: 18898944. Throughput: 0: 949.5. Samples: 4723840. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:41:12,873][00205] Avg episode reward: [(0, '32.176')] [2023-02-24 13:41:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 18911232. Throughput: 0: 924.0. Samples: 4728272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:41:17,874][00205] Avg episode reward: [(0, '31.709')] [2023-02-24 13:41:19,673][11215] Updated weights for policy 0, policy_version 4620 (0.0025) [2023-02-24 13:41:22,870][00205] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3762.8). Total num frames: 18935808. Throughput: 0: 963.2. Samples: 4734458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:41:22,873][00205] Avg episode reward: [(0, '31.460')] [2023-02-24 13:41:27,870][00205] Fps is (10 sec: 4915.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 18960384. Throughput: 0: 975.9. Samples: 4737980. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:41:27,881][00205] Avg episode reward: [(0, '32.309')] [2023-02-24 13:41:28,487][11215] Updated weights for policy 0, policy_version 4630 (0.0022) [2023-02-24 13:41:32,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 18976768. Throughput: 0: 945.1. Samples: 4743662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:41:32,872][00205] Avg episode reward: [(0, '31.874')] [2023-02-24 13:41:37,870][00205] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 18989056. Throughput: 0: 922.5. Samples: 4747996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:41:37,875][00205] Avg episode reward: [(0, '33.078')] [2023-02-24 13:41:40,808][11215] Updated weights for policy 0, policy_version 4640 (0.0026) [2023-02-24 13:41:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19013632. Throughput: 0: 944.5. Samples: 4751128. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:41:42,876][00205] Avg episode reward: [(0, '32.005')] [2023-02-24 13:41:47,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19034112. Throughput: 0: 977.3. Samples: 4758130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:41:47,875][00205] Avg episode reward: [(0, '30.901')] [2023-02-24 13:41:50,662][11215] Updated weights for policy 0, policy_version 4650 (0.0016) [2023-02-24 13:41:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 19050496. Throughput: 0: 937.7. Samples: 4763392. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:41:52,881][00205] Avg episode reward: [(0, '31.801')] [2023-02-24 13:41:57,872][00205] Fps is (10 sec: 3276.0, 60 sec: 3822.8, 300 sec: 3790.5). Total num frames: 19066880. Throughput: 0: 926.2. Samples: 4765520. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:41:57,879][00205] Avg episode reward: [(0, '32.167')] [2023-02-24 13:42:02,148][11215] Updated weights for policy 0, policy_version 4660 (0.0017) [2023-02-24 13:42:02,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19087360. Throughput: 0: 955.6. Samples: 4771276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:42:02,873][00205] Avg episode reward: [(0, '29.330')] [2023-02-24 13:42:07,870][00205] Fps is (10 sec: 4506.7, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19111936. Throughput: 0: 972.3. Samples: 4778212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:42:07,873][00205] Avg episode reward: [(0, '29.882')] [2023-02-24 13:42:12,785][11215] Updated weights for policy 0, policy_version 4670 (0.0015) [2023-02-24 13:42:12,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19128320. Throughput: 0: 949.2. Samples: 4780692. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:12,872][00205] Avg episode reward: [(0, '30.252')] [2023-02-24 13:42:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19140608. Throughput: 0: 917.6. Samples: 4784952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:17,872][00205] Avg episode reward: [(0, '30.203')] [2023-02-24 13:42:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3776.7). Total num frames: 19165184. Throughput: 0: 960.3. Samples: 4791208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:42:22,873][00205] Avg episode reward: [(0, '29.424')] [2023-02-24 13:42:23,628][11215] Updated weights for policy 0, policy_version 4680 (0.0020) [2023-02-24 13:42:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19185664. Throughput: 0: 969.2. Samples: 4794740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:42:27,872][00205] Avg episode reward: [(0, '28.006')] [2023-02-24 13:42:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.6). Total num frames: 19202048. Throughput: 0: 935.9. Samples: 4800244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:32,872][00205] Avg episode reward: [(0, '28.539')] [2023-02-24 13:42:35,105][11215] Updated weights for policy 0, policy_version 4690 (0.0016) [2023-02-24 13:42:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19218432. Throughput: 0: 918.4. Samples: 4804718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:37,881][00205] Avg episode reward: [(0, '29.047')] [2023-02-24 13:42:42,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 19238912. Throughput: 0: 945.2. Samples: 4808052. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:42,872][00205] Avg episode reward: [(0, '27.868')] [2023-02-24 13:42:44,645][11215] Updated weights for policy 0, policy_version 4700 (0.0011) [2023-02-24 13:42:47,874][00205] Fps is (10 sec: 4503.7, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19263488. Throughput: 0: 973.2. Samples: 4815076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:42:47,877][00205] Avg episode reward: [(0, '28.748')] [2023-02-24 13:42:47,885][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth... [2023-02-24 13:42:48,040][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004481_18354176.pth [2023-02-24 13:42:52,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19275776. Throughput: 0: 929.0. Samples: 4820018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:42:52,878][00205] Avg episode reward: [(0, '27.869')] [2023-02-24 13:42:57,042][11215] Updated weights for policy 0, policy_version 4710 (0.0023) [2023-02-24 13:42:57,870][00205] Fps is (10 sec: 2868.4, 60 sec: 3754.8, 300 sec: 3776.6). Total num frames: 19292160. Throughput: 0: 922.1. Samples: 4822186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:42:57,873][00205] Avg episode reward: [(0, '28.164')] [2023-02-24 13:43:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19316736. Throughput: 0: 957.1. Samples: 4828020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:43:02,873][00205] Avg episode reward: [(0, '28.988')] [2023-02-24 13:43:06,185][11215] Updated weights for policy 0, policy_version 4720 (0.0012) [2023-02-24 13:43:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19337216. Throughput: 0: 974.3. Samples: 4835052. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:43:07,873][00205] Avg episode reward: [(0, '27.640')] [2023-02-24 13:43:12,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19353600. Throughput: 0: 947.6. Samples: 4837380. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:43:12,873][00205] Avg episode reward: [(0, '28.411')] [2023-02-24 13:43:17,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19369984. Throughput: 0: 924.0. Samples: 4841824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:17,873][00205] Avg episode reward: [(0, '28.179')] [2023-02-24 13:43:18,659][11215] Updated weights for policy 0, policy_version 4730 (0.0034) [2023-02-24 13:43:22,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19390464. Throughput: 0: 969.9. Samples: 4848362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:22,873][00205] Avg episode reward: [(0, '28.688')] [2023-02-24 13:43:27,183][11215] Updated weights for policy 0, policy_version 4740 (0.0017) [2023-02-24 13:43:27,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19415040. Throughput: 0: 973.9. Samples: 4851878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:43:27,872][00205] Avg episode reward: [(0, '28.992')] [2023-02-24 13:43:32,870][00205] Fps is (10 sec: 3686.2, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 19427328. Throughput: 0: 935.0. Samples: 4857146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:32,875][00205] Avg episode reward: [(0, '29.533')] [2023-02-24 13:43:37,872][00205] Fps is (10 sec: 2866.5, 60 sec: 3754.5, 300 sec: 3776.6). Total num frames: 19443712. Throughput: 0: 921.0. Samples: 4861466. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:37,875][00205] Avg episode reward: [(0, '29.846')] [2023-02-24 13:43:39,807][11215] Updated weights for policy 0, policy_version 4750 (0.0029) [2023-02-24 13:43:42,870][00205] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19468288. Throughput: 0: 949.6. Samples: 4864920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:42,872][00205] Avg episode reward: [(0, '29.126')] [2023-02-24 13:43:47,873][00205] Fps is (10 sec: 4505.2, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19488768. Throughput: 0: 975.4. Samples: 4871916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:43:47,876][00205] Avg episode reward: [(0, '30.106')] [2023-02-24 13:43:49,625][11215] Updated weights for policy 0, policy_version 4760 (0.0016) [2023-02-24 13:43:52,874][00205] Fps is (10 sec: 3684.9, 60 sec: 3822.7, 300 sec: 3790.5). Total num frames: 19505152. Throughput: 0: 924.6. Samples: 4876662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:43:52,876][00205] Avg episode reward: [(0, '30.569')] [2023-02-24 13:43:57,870][00205] Fps is (10 sec: 3277.9, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 19521536. Throughput: 0: 920.2. Samples: 4878788. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:43:57,872][00205] Avg episode reward: [(0, '30.493')] [2023-02-24 13:44:00,981][11215] Updated weights for policy 0, policy_version 4770 (0.0026) [2023-02-24 13:44:02,870][00205] Fps is (10 sec: 4097.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19546112. Throughput: 0: 961.9. Samples: 4885108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:44:02,872][00205] Avg episode reward: [(0, '29.534')] [2023-02-24 13:44:07,870][00205] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 19566592. Throughput: 0: 967.4. Samples: 4891896. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:44:07,872][00205] Avg episode reward: [(0, '30.184')] [2023-02-24 13:44:11,976][11215] Updated weights for policy 0, policy_version 4780 (0.0011) [2023-02-24 13:44:12,870][00205] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19578880. Throughput: 0: 936.6. Samples: 4894026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:44:12,874][00205] Avg episode reward: [(0, '31.095')] [2023-02-24 13:44:17,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19595264. Throughput: 0: 916.0. Samples: 4898364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:44:17,872][00205] Avg episode reward: [(0, '30.332')] [2023-02-24 13:44:22,239][11215] Updated weights for policy 0, policy_version 4790 (0.0017) [2023-02-24 13:44:22,870][00205] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19619840. Throughput: 0: 977.0. Samples: 4905430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:44:22,873][00205] Avg episode reward: [(0, '30.039')] [2023-02-24 13:44:27,874][00205] Fps is (10 sec: 4503.6, 60 sec: 3754.4, 300 sec: 3790.5). Total num frames: 19640320. Throughput: 0: 978.0. Samples: 4908936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:44:27,877][00205] Avg episode reward: [(0, '29.960')] [2023-02-24 13:44:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 19656704. Throughput: 0: 931.4. Samples: 4913828. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:44:32,878][00205] Avg episode reward: [(0, '29.704')] [2023-02-24 13:44:33,987][11215] Updated weights for policy 0, policy_version 4800 (0.0026) [2023-02-24 13:44:37,870][00205] Fps is (10 sec: 3278.2, 60 sec: 3823.1, 300 sec: 3776.7). Total num frames: 19673088. Throughput: 0: 934.4. Samples: 4918708. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 13:44:37,873][00205] Avg episode reward: [(0, '30.977')] [2023-02-24 13:44:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 19697664. Throughput: 0: 962.5. Samples: 4922100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:44:42,872][00205] Avg episode reward: [(0, '32.188')] [2023-02-24 13:44:43,543][11215] Updated weights for policy 0, policy_version 4810 (0.0024) [2023-02-24 13:44:47,870][00205] Fps is (10 sec: 4505.5, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 19718144. Throughput: 0: 978.1. Samples: 4929124. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:44:47,874][00205] Avg episode reward: [(0, '30.658')] [2023-02-24 13:44:47,886][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004814_19718144.pth... [2023-02-24 13:44:48,035][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004592_18808832.pth [2023-02-24 13:44:52,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19730432. Throughput: 0: 923.7. Samples: 4933462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:44:52,880][00205] Avg episode reward: [(0, '32.048')] [2023-02-24 13:44:56,130][11215] Updated weights for policy 0, policy_version 4820 (0.0019) [2023-02-24 13:44:57,870][00205] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19746816. Throughput: 0: 923.3. Samples: 4935574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 13:44:57,872][00205] Avg episode reward: [(0, '31.180')] [2023-02-24 13:45:02,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19771392. Throughput: 0: 963.9. Samples: 4941740. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:45:02,872][00205] Avg episode reward: [(0, '30.934')] [2023-02-24 13:45:05,453][11215] Updated weights for policy 0, policy_version 4830 (0.0019) [2023-02-24 13:45:07,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 19787776. Throughput: 0: 946.6. Samples: 4948026. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:45:07,874][00205] Avg episode reward: [(0, '30.586')] [2023-02-24 13:45:12,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19804160. Throughput: 0: 916.7. Samples: 4950184. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 13:45:12,874][00205] Avg episode reward: [(0, '31.153')] [2023-02-24 13:45:17,852][11215] Updated weights for policy 0, policy_version 4840 (0.0012) [2023-02-24 13:45:17,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19824640. Throughput: 0: 908.3. Samples: 4954702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:45:17,878][00205] Avg episode reward: [(0, '29.579')] [2023-02-24 13:45:22,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19845120. Throughput: 0: 957.3. Samples: 4961788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:45:22,879][00205] Avg episode reward: [(0, '30.000')] [2023-02-24 13:45:27,014][11215] Updated weights for policy 0, policy_version 4850 (0.0021) [2023-02-24 13:45:27,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 19865600. Throughput: 0: 959.2. Samples: 4965266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 13:45:27,878][00205] Avg episode reward: [(0, '29.956')] [2023-02-24 13:45:32,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 19881984. Throughput: 0: 906.3. Samples: 4969906. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:45:32,874][00205] Avg episode reward: [(0, '31.605')] [2023-02-24 13:45:37,870][00205] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19898368. Throughput: 0: 920.4. Samples: 4974880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 13:45:37,878][00205] Avg episode reward: [(0, '30.971')] [2023-02-24 13:45:39,131][11215] Updated weights for policy 0, policy_version 4860 (0.0014) [2023-02-24 13:45:42,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19922944. Throughput: 0: 951.0. Samples: 4978368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:45:42,879][00205] Avg episode reward: [(0, '30.475')] [2023-02-24 13:45:47,870][00205] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.6). Total num frames: 19939328. Throughput: 0: 963.4. Samples: 4985094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 13:45:47,880][00205] Avg episode reward: [(0, '30.778')] [2023-02-24 13:45:49,491][11215] Updated weights for policy 0, policy_version 4870 (0.0015) [2023-02-24 13:45:52,875][00205] Fps is (10 sec: 3275.0, 60 sec: 3754.3, 300 sec: 3790.5). Total num frames: 19955712. Throughput: 0: 921.1. Samples: 4989480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 13:45:52,879][00205] Avg episode reward: [(0, '31.618')] [2023-02-24 13:45:57,870][00205] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 19976192. Throughput: 0: 922.4. Samples: 4991690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:45:57,880][00205] Avg episode reward: [(0, '31.637')] [2023-02-24 13:46:00,362][11215] Updated weights for policy 0, policy_version 4880 (0.0016) [2023-02-24 13:46:02,870][00205] Fps is (10 sec: 4098.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 19996672. Throughput: 0: 975.7. Samples: 4998610. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 13:46:02,882][00205] Avg episode reward: [(0, '31.607')] [2023-02-24 13:46:03,954][11201] Stopping Batcher_0... [2023-02-24 13:46:03,954][11201] Loop batcher_evt_loop terminating... [2023-02-24 13:46:03,954][00205] Component Batcher_0 stopped! [2023-02-24 13:46:03,960][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... [2023-02-24 13:46:04,002][11215] Weights refcount: 2 0 [2023-02-24 13:46:04,006][00205] Component InferenceWorker_p0-w0 stopped! [2023-02-24 13:46:04,017][11215] Stopping InferenceWorker_p0-w0... [2023-02-24 13:46:04,017][11215] Loop inference_proc0-0_evt_loop terminating... [2023-02-24 13:46:04,036][11221] Stopping RolloutWorker_w1... [2023-02-24 13:46:04,031][11226] Stopping RolloutWorker_w7... [2023-02-24 13:46:04,032][00205] Component RolloutWorker_w7 stopped! [2023-02-24 13:46:04,038][00205] Component RolloutWorker_w1 stopped! [2023-02-24 13:46:04,041][00205] Component RolloutWorker_w5 stopped! [2023-02-24 13:46:04,041][11224] Stopping RolloutWorker_w5... [2023-02-24 13:46:04,044][11224] Loop rollout_proc5_evt_loop terminating... [2023-02-24 13:46:04,038][11221] Loop rollout_proc1_evt_loop terminating... [2023-02-24 13:46:04,045][11226] Loop rollout_proc7_evt_loop terminating... [2023-02-24 13:46:04,047][11223] Stopping RolloutWorker_w3... [2023-02-24 13:46:04,048][11223] Loop rollout_proc3_evt_loop terminating... [2023-02-24 13:46:04,049][00205] Component RolloutWorker_w3 stopped! [2023-02-24 13:46:04,064][11225] Stopping RolloutWorker_w6... [2023-02-24 13:46:04,064][00205] Component RolloutWorker_w6 stopped! [2023-02-24 13:46:04,072][11225] Loop rollout_proc6_evt_loop terminating... [2023-02-24 13:46:04,078][11222] Stopping RolloutWorker_w2... [2023-02-24 13:46:04,079][11222] Loop rollout_proc2_evt_loop terminating... [2023-02-24 13:46:04,078][00205] Component RolloutWorker_w2 stopped! 
[2023-02-24 13:46:04,088][11216] Stopping RolloutWorker_w0... [2023-02-24 13:46:04,088][00205] Component RolloutWorker_w0 stopped! [2023-02-24 13:46:04,093][11227] Stopping RolloutWorker_w4... [2023-02-24 13:46:04,093][00205] Component RolloutWorker_w4 stopped! [2023-02-24 13:46:04,095][11227] Loop rollout_proc4_evt_loop terminating... [2023-02-24 13:46:04,099][11216] Loop rollout_proc0_evt_loop terminating... [2023-02-24 13:46:04,139][11201] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004703_19263488.pth [2023-02-24 13:46:04,151][11201] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... [2023-02-24 13:46:04,323][00205] Component LearnerWorker_p0 stopped! [2023-02-24 13:46:04,328][00205] Waiting for process learner_proc0 to stop... [2023-02-24 13:46:04,343][11201] Stopping LearnerWorker_p0... [2023-02-24 13:46:04,344][11201] Loop learner_proc0_evt_loop terminating... [2023-02-24 13:46:06,567][00205] Waiting for process inference_proc0-0 to join... [2023-02-24 13:46:07,271][00205] Waiting for process rollout_proc0 to join... [2023-02-24 13:46:08,011][00205] Waiting for process rollout_proc1 to join... [2023-02-24 13:46:08,013][00205] Waiting for process rollout_proc2 to join... [2023-02-24 13:46:08,022][00205] Waiting for process rollout_proc3 to join... [2023-02-24 13:46:08,024][00205] Waiting for process rollout_proc4 to join... [2023-02-24 13:46:08,025][00205] Waiting for process rollout_proc5 to join... [2023-02-24 13:46:08,026][00205] Waiting for process rollout_proc6 to join... [2023-02-24 13:46:08,028][00205] Waiting for process rollout_proc7 to join... 
[2023-02-24 13:46:08,030][00205] Batcher 0 profile tree view:
batching: 125.6692, releasing_batches: 0.1240
[2023-02-24 13:46:08,032][00205] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0001
  wait_policy_total: 2737.0422
update_model: 37.9221
  weight_update: 0.0016
one_step: 0.0118
  handle_policy_step: 2483.2568
    deserialize: 75.3816, stack: 14.8616, obs_to_device_normalize: 559.5526, forward: 1184.2792, send_messages: 130.3140
    prepare_outputs: 395.5048
      to_cpu: 242.4212
[2023-02-24 13:46:08,034][00205] Learner 0 profile tree view:
misc: 0.0290, prepare_batch: 60.7874
train: 366.6477
  epoch_init: 0.0621, minibatch_init: 0.0458, losses_postprocess: 2.9527, kl_divergence: 2.8728, after_optimizer: 162.1073
  calculate_losses: 130.0243
    losses_init: 0.0470, forward_head: 7.9970, bptt_initial: 86.1198, tail: 5.1093, advantages_returns: 1.4157, losses: 16.7013
    bptt: 11.0390
      bptt_forward_core: 10.5633
  update: 65.4562
    clip: 7.0785
[2023-02-24 13:46:08,036][00205] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.6596, enqueue_policy_requests: 779.0498, env_step: 4091.0265, overhead: 107.0679, complete_rollouts: 33.4033
save_policy_outputs: 98.6476
  split_output_tensors: 47.6182
[2023-02-24 13:46:08,038][00205] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 1.8929, enqueue_policy_requests: 766.4891, env_step: 4101.5740, overhead: 109.4752, complete_rollouts: 35.1840
save_policy_outputs: 99.1619
  split_output_tensors: 48.3378
[2023-02-24 13:46:08,040][00205] Loop Runner_EvtLoop terminating...
[2023-02-24 13:46:08,043][00205] Runner profile tree view: main_loop: 5474.8034 [2023-02-24 13:46:08,055][00205] Collected {0: 20004864}, FPS: 3654.0 [2023-02-24 14:12:39,442][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-24 14:12:39,445][00205] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-24 14:12:39,447][00205] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-24 14:12:39,449][00205] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-24 14:12:39,451][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 14:12:39,452][00205] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-24 14:12:39,456][00205] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 14:12:39,457][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-24 14:12:39,458][00205] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-24 14:12:39,459][00205] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-24 14:12:39,462][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-24 14:12:39,463][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-24 14:12:39,465][00205] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-24 14:12:39,466][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-24 14:12:39,467][00205] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-24 14:12:39,510][00205] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-24 14:12:39,514][00205] RunningMeanStd input shape: (3, 72, 128) [2023-02-24 14:12:39,519][00205] RunningMeanStd input shape: (1,) [2023-02-24 14:12:39,546][00205] ConvEncoder: input_channels=3 [2023-02-24 14:12:40,349][00205] Conv encoder output size: 512 [2023-02-24 14:12:40,352][00205] Policy head output size: 512 [2023-02-24 14:12:43,178][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... [2023-02-24 14:12:44,426][00205] Num frames 100... [2023-02-24 14:12:44,538][00205] Num frames 200... [2023-02-24 14:12:44,654][00205] Num frames 300... [2023-02-24 14:12:44,765][00205] Num frames 400... [2023-02-24 14:12:44,880][00205] Num frames 500... [2023-02-24 14:12:44,989][00205] Num frames 600... [2023-02-24 14:12:45,102][00205] Num frames 700... [2023-02-24 14:12:45,213][00205] Num frames 800... [2023-02-24 14:12:45,331][00205] Num frames 900... [2023-02-24 14:12:45,443][00205] Num frames 1000... [2023-02-24 14:12:45,561][00205] Num frames 1100... [2023-02-24 14:12:45,677][00205] Num frames 1200... [2023-02-24 14:12:45,797][00205] Num frames 1300... [2023-02-24 14:12:45,909][00205] Num frames 1400... [2023-02-24 14:12:46,022][00205] Num frames 1500... [2023-02-24 14:12:46,083][00205] Avg episode rewards: #0: 36.040, true rewards: #0: 15.040 [2023-02-24 14:12:46,085][00205] Avg episode reward: 36.040, avg true_objective: 15.040 [2023-02-24 14:12:46,195][00205] Num frames 1600... [2023-02-24 14:12:46,307][00205] Num frames 1700... [2023-02-24 14:12:46,416][00205] Num frames 1800... [2023-02-24 14:12:46,529][00205] Num frames 1900... [2023-02-24 14:12:46,651][00205] Num frames 2000... [2023-02-24 14:12:46,762][00205] Num frames 2100... [2023-02-24 14:12:46,873][00205] Num frames 2200... 
[2023-02-24 14:12:46,990][00205] Num frames 2300... [2023-02-24 14:12:47,109][00205] Num frames 2400... [2023-02-24 14:12:47,229][00205] Num frames 2500... [2023-02-24 14:12:47,338][00205] Num frames 2600... [2023-02-24 14:12:47,450][00205] Num frames 2700... [2023-02-24 14:12:47,531][00205] Avg episode rewards: #0: 32.600, true rewards: #0: 13.600 [2023-02-24 14:12:47,533][00205] Avg episode reward: 32.600, avg true_objective: 13.600 [2023-02-24 14:12:47,623][00205] Num frames 2800... [2023-02-24 14:12:47,739][00205] Num frames 2900... [2023-02-24 14:12:47,850][00205] Num frames 3000... [2023-02-24 14:12:47,973][00205] Num frames 3100... [2023-02-24 14:12:48,083][00205] Num frames 3200... [2023-02-24 14:12:48,194][00205] Num frames 3300... [2023-02-24 14:12:48,304][00205] Num frames 3400... [2023-02-24 14:12:48,419][00205] Num frames 3500... [2023-02-24 14:12:48,564][00205] Avg episode rewards: #0: 27.613, true rewards: #0: 11.947 [2023-02-24 14:12:48,565][00205] Avg episode reward: 27.613, avg true_objective: 11.947 [2023-02-24 14:12:48,589][00205] Num frames 3600... [2023-02-24 14:12:48,704][00205] Num frames 3700... [2023-02-24 14:12:48,821][00205] Num frames 3800... [2023-02-24 14:12:48,929][00205] Num frames 3900... [2023-02-24 14:12:49,048][00205] Num frames 4000... [2023-02-24 14:12:49,166][00205] Num frames 4100... [2023-02-24 14:12:49,278][00205] Num frames 4200... [2023-02-24 14:12:49,390][00205] Num frames 4300... [2023-02-24 14:12:49,502][00205] Num frames 4400... [2023-02-24 14:12:49,616][00205] Num frames 4500... [2023-02-24 14:12:49,733][00205] Num frames 4600... [2023-02-24 14:12:49,842][00205] Num frames 4700... [2023-02-24 14:12:49,955][00205] Num frames 4800... [2023-02-24 14:12:50,066][00205] Num frames 4900... [2023-02-24 14:12:50,178][00205] Num frames 5000... [2023-02-24 14:12:50,290][00205] Num frames 5100... [2023-02-24 14:12:50,405][00205] Num frames 5200... [2023-02-24 14:12:50,523][00205] Num frames 5300... 
[2023-02-24 14:12:50,633][00205] Num frames 5400... [2023-02-24 14:12:50,778][00205] Avg episode rewards: #0: 32.680, true rewards: #0: 13.680 [2023-02-24 14:12:50,779][00205] Avg episode reward: 32.680, avg true_objective: 13.680 [2023-02-24 14:12:50,815][00205] Num frames 5500... [2023-02-24 14:12:50,926][00205] Num frames 5600... [2023-02-24 14:12:51,041][00205] Num frames 5700... [2023-02-24 14:12:51,148][00205] Num frames 5800... [2023-02-24 14:12:51,258][00205] Num frames 5900... [2023-02-24 14:12:51,336][00205] Avg episode rewards: #0: 27.240, true rewards: #0: 11.840 [2023-02-24 14:12:51,339][00205] Avg episode reward: 27.240, avg true_objective: 11.840 [2023-02-24 14:12:51,437][00205] Num frames 6000... [2023-02-24 14:12:51,547][00205] Num frames 6100... [2023-02-24 14:12:51,658][00205] Num frames 6200... [2023-02-24 14:12:51,774][00205] Num frames 6300... [2023-02-24 14:12:51,889][00205] Num frames 6400... [2023-02-24 14:12:52,001][00205] Num frames 6500... [2023-02-24 14:12:52,112][00205] Num frames 6600... [2023-02-24 14:12:52,223][00205] Num frames 6700... [2023-02-24 14:12:52,391][00205] Num frames 6800... [2023-02-24 14:12:52,551][00205] Num frames 6900... [2023-02-24 14:12:52,704][00205] Num frames 7000... [2023-02-24 14:12:52,875][00205] Num frames 7100... [2023-02-24 14:12:53,035][00205] Num frames 7200... [2023-02-24 14:12:53,190][00205] Num frames 7300... [2023-02-24 14:12:53,354][00205] Num frames 7400... [2023-02-24 14:12:53,432][00205] Avg episode rewards: #0: 29.685, true rewards: #0: 12.352 [2023-02-24 14:12:53,437][00205] Avg episode reward: 29.685, avg true_objective: 12.352 [2023-02-24 14:12:53,581][00205] Num frames 7500... [2023-02-24 14:12:53,741][00205] Num frames 7600... [2023-02-24 14:12:53,896][00205] Num frames 7700... [2023-02-24 14:12:54,052][00205] Num frames 7800... [2023-02-24 14:12:54,215][00205] Num frames 7900... [2023-02-24 14:12:54,378][00205] Num frames 8000... [2023-02-24 14:12:54,538][00205] Num frames 8100... 
[2023-02-24 14:12:54,696][00205] Num frames 8200... [2023-02-24 14:12:54,861][00205] Num frames 8300... [2023-02-24 14:12:55,032][00205] Avg episode rewards: #0: 28.530, true rewards: #0: 11.959 [2023-02-24 14:12:55,034][00205] Avg episode reward: 28.530, avg true_objective: 11.959 [2023-02-24 14:12:55,091][00205] Num frames 8400... [2023-02-24 14:12:55,262][00205] Num frames 8500... [2023-02-24 14:12:55,425][00205] Num frames 8600... [2023-02-24 14:12:55,584][00205] Num frames 8700... [2023-02-24 14:12:55,744][00205] Num frames 8800... [2023-02-24 14:12:55,863][00205] Avg episode rewards: #0: 26.321, true rewards: #0: 11.071 [2023-02-24 14:12:55,866][00205] Avg episode reward: 26.321, avg true_objective: 11.071 [2023-02-24 14:12:55,918][00205] Num frames 8900... [2023-02-24 14:12:56,027][00205] Num frames 9000... [2023-02-24 14:12:56,143][00205] Num frames 9100... [2023-02-24 14:12:56,255][00205] Num frames 9200... [2023-02-24 14:12:56,369][00205] Num frames 9300... [2023-02-24 14:12:56,482][00205] Num frames 9400... [2023-02-24 14:12:56,600][00205] Num frames 9500... [2023-02-24 14:12:56,711][00205] Num frames 9600... [2023-02-24 14:12:56,835][00205] Num frames 9700... [2023-02-24 14:12:56,966][00205] Num frames 9800... [2023-02-24 14:12:57,076][00205] Num frames 9900... [2023-02-24 14:12:57,148][00205] Avg episode rewards: #0: 25.903, true rewards: #0: 11.014 [2023-02-24 14:12:57,150][00205] Avg episode reward: 25.903, avg true_objective: 11.014 [2023-02-24 14:12:57,248][00205] Num frames 10000... [2023-02-24 14:12:57,368][00205] Num frames 10100... [2023-02-24 14:12:57,479][00205] Num frames 10200... [2023-02-24 14:12:57,591][00205] Num frames 10300... [2023-02-24 14:12:57,701][00205] Num frames 10400... [2023-02-24 14:12:57,814][00205] Num frames 10500... [2023-02-24 14:12:57,930][00205] Num frames 10600... [2023-02-24 14:12:58,040][00205] Num frames 10700... [2023-02-24 14:12:58,151][00205] Num frames 10800... 
[2023-02-24 14:12:58,227][00205] Avg episode rewards: #0: 25.217, true rewards: #0: 10.817 [2023-02-24 14:12:58,229][00205] Avg episode reward: 25.217, avg true_objective: 10.817 [2023-02-24 14:14:02,653][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-24 14:22:15,412][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-24 14:22:15,415][00205] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-24 14:22:15,417][00205] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-24 14:22:15,419][00205] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-24 14:22:15,421][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 14:22:15,423][00205] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-24 14:22:15,425][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-24 14:22:15,427][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-24 14:22:15,433][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-24 14:22:15,434][00205] Adding new argument 'hf_repository'='parsasam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-24 14:22:15,436][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-24 14:22:15,439][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-24 14:22:15,441][00205] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-24 14:22:15,443][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-24 14:22:15,445][00205] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-24 14:22:15,463][00205] RunningMeanStd input shape: (3, 72, 128) [2023-02-24 14:22:15,466][00205] RunningMeanStd input shape: (1,) [2023-02-24 14:22:15,480][00205] ConvEncoder: input_channels=3 [2023-02-24 14:22:15,518][00205] Conv encoder output size: 512 [2023-02-24 14:22:15,520][00205] Policy head output size: 512 [2023-02-24 14:22:15,540][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... [2023-02-24 14:22:15,996][00205] Num frames 100... [2023-02-24 14:22:16,126][00205] Num frames 200... [2023-02-24 14:22:16,252][00205] Num frames 300... [2023-02-24 14:22:16,374][00205] Num frames 400... [2023-02-24 14:22:16,493][00205] Num frames 500... [2023-02-24 14:22:16,613][00205] Num frames 600... [2023-02-24 14:22:16,738][00205] Num frames 700... [2023-02-24 14:22:16,854][00205] Num frames 800... [2023-02-24 14:22:16,974][00205] Num frames 900... [2023-02-24 14:22:17,099][00205] Num frames 1000... [2023-02-24 14:22:17,235][00205] Num frames 1100... [2023-02-24 14:22:17,354][00205] Num frames 1200... [2023-02-24 14:22:17,484][00205] Num frames 1300... [2023-02-24 14:22:17,606][00205] Num frames 1400... [2023-02-24 14:22:17,733][00205] Num frames 1500... [2023-02-24 14:22:17,857][00205] Num frames 1600... [2023-02-24 14:22:18,032][00205] Avg episode rewards: #0: 41.980, true rewards: #0: 16.980 [2023-02-24 14:22:18,034][00205] Avg episode reward: 41.980, avg true_objective: 16.980 [2023-02-24 14:22:18,040][00205] Num frames 1700... [2023-02-24 14:22:18,166][00205] Num frames 1800... [2023-02-24 14:22:18,286][00205] Num frames 1900... [2023-02-24 14:22:18,408][00205] Num frames 2000... [2023-02-24 14:22:18,519][00205] Num frames 2100... [2023-02-24 14:22:18,628][00205] Num frames 2200... [2023-02-24 14:22:18,738][00205] Num frames 2300... [2023-02-24 14:22:18,856][00205] Num frames 2400... 
[2023-02-24 14:22:18,968][00205] Num frames 2500... [2023-02-24 14:22:19,082][00205] Num frames 2600... [2023-02-24 14:22:19,202][00205] Num frames 2700... [2023-02-24 14:22:19,322][00205] Num frames 2800... [2023-02-24 14:22:19,435][00205] Num frames 2900... [2023-02-24 14:22:19,549][00205] Num frames 3000... [2023-02-24 14:22:19,622][00205] Avg episode rewards: #0: 36.555, true rewards: #0: 15.055 [2023-02-24 14:22:19,624][00205] Avg episode reward: 36.555, avg true_objective: 15.055 [2023-02-24 14:22:19,732][00205] Num frames 3100... [2023-02-24 14:22:19,843][00205] Num frames 3200... [2023-02-24 14:22:19,967][00205] Num frames 3300... [2023-02-24 14:22:20,082][00205] Num frames 3400... [2023-02-24 14:22:20,201][00205] Num frames 3500... [2023-02-24 14:22:20,315][00205] Num frames 3600... [2023-02-24 14:22:20,430][00205] Num frames 3700... [2023-02-24 14:22:20,544][00205] Num frames 3800... [2023-02-24 14:22:20,660][00205] Num frames 3900... [2023-02-24 14:22:20,775][00205] Num frames 4000... [2023-02-24 14:22:20,890][00205] Num frames 4100... [2023-02-24 14:22:21,004][00205] Num frames 4200... [2023-02-24 14:22:21,134][00205] Avg episode rewards: #0: 34.890, true rewards: #0: 14.223 [2023-02-24 14:22:21,135][00205] Avg episode reward: 34.890, avg true_objective: 14.223 [2023-02-24 14:22:21,192][00205] Num frames 4300... [2023-02-24 14:22:21,310][00205] Num frames 4400... [2023-02-24 14:22:21,426][00205] Num frames 4500... [2023-02-24 14:22:21,542][00205] Num frames 4600... [2023-02-24 14:22:21,666][00205] Num frames 4700... [2023-02-24 14:22:21,782][00205] Num frames 4800... [2023-02-24 14:22:21,908][00205] Num frames 4900... [2023-02-24 14:22:22,022][00205] Num frames 5000... [2023-02-24 14:22:22,135][00205] Num frames 5100... [2023-02-24 14:22:22,260][00205] Num frames 5200... [2023-02-24 14:22:22,373][00205] Num frames 5300... [2023-02-24 14:22:22,489][00205] Num frames 5400... [2023-02-24 14:22:22,603][00205] Num frames 5500... 
[2023-02-24 14:22:22,720][00205] Num frames 5600... [2023-02-24 14:22:22,835][00205] Num frames 5700... [2023-02-24 14:22:22,942][00205] Avg episode rewards: #0: 36.850, true rewards: #0: 14.350 [2023-02-24 14:22:22,944][00205] Avg episode reward: 36.850, avg true_objective: 14.350 [2023-02-24 14:22:23,016][00205] Num frames 5800... [2023-02-24 14:22:23,132][00205] Num frames 5900... [2023-02-24 14:22:23,254][00205] Num frames 6000... [2023-02-24 14:22:23,366][00205] Num frames 6100... [2023-02-24 14:22:23,511][00205] Avg episode rewards: #0: 30.960, true rewards: #0: 12.360 [2023-02-24 14:22:23,513][00205] Avg episode reward: 30.960, avg true_objective: 12.360 [2023-02-24 14:22:23,542][00205] Num frames 6200... [2023-02-24 14:22:23,655][00205] Num frames 6300... [2023-02-24 14:22:23,776][00205] Num frames 6400... [2023-02-24 14:22:23,901][00205] Num frames 6500... [2023-02-24 14:22:24,018][00205] Num frames 6600... [2023-02-24 14:22:24,132][00205] Num frames 6700... [2023-02-24 14:22:24,255][00205] Num frames 6800... [2023-02-24 14:22:24,376][00205] Num frames 6900... [2023-02-24 14:22:24,490][00205] Num frames 7000... [2023-02-24 14:22:24,610][00205] Num frames 7100... [2023-02-24 14:22:24,737][00205] Num frames 7200... [2023-02-24 14:22:24,852][00205] Num frames 7300... [2023-02-24 14:22:24,970][00205] Num frames 7400... [2023-02-24 14:22:25,086][00205] Num frames 7500... [2023-02-24 14:22:25,244][00205] Num frames 7600... [2023-02-24 14:22:25,425][00205] Num frames 7700... [2023-02-24 14:22:25,592][00205] Num frames 7800... [2023-02-24 14:22:25,756][00205] Num frames 7900... [2023-02-24 14:22:25,919][00205] Num frames 8000... [2023-02-24 14:22:26,008][00205] Avg episode rewards: #0: 33.361, true rewards: #0: 13.362 [2023-02-24 14:22:26,015][00205] Avg episode reward: 33.361, avg true_objective: 13.362 [2023-02-24 14:22:26,151][00205] Num frames 8100... [2023-02-24 14:22:26,324][00205] Num frames 8200... [2023-02-24 14:22:26,489][00205] Num frames 8300... 
[2023-02-24 14:22:26,641][00205] Num frames 8400... [2023-02-24 14:22:26,803][00205] Num frames 8500... [2023-02-24 14:22:26,968][00205] Num frames 8600... [2023-02-24 14:22:27,120][00205] Avg episode rewards: #0: 30.653, true rewards: #0: 12.367 [2023-02-24 14:22:27,123][00205] Avg episode reward: 30.653, avg true_objective: 12.367 [2023-02-24 14:22:27,202][00205] Num frames 8700... [2023-02-24 14:22:27,379][00205] Num frames 8800... [2023-02-24 14:22:27,549][00205] Num frames 8900... [2023-02-24 14:22:27,718][00205] Num frames 9000... [2023-02-24 14:22:27,893][00205] Num frames 9100... [2023-02-24 14:22:28,060][00205] Num frames 9200... [2023-02-24 14:22:28,227][00205] Num frames 9300... [2023-02-24 14:22:28,398][00205] Num frames 9400... [2023-02-24 14:22:28,576][00205] Num frames 9500... [2023-02-24 14:22:28,741][00205] Num frames 9600... [2023-02-24 14:22:28,860][00205] Num frames 9700... [2023-02-24 14:22:28,987][00205] Num frames 9800... [2023-02-24 14:22:29,095][00205] Avg episode rewards: #0: 30.429, true rewards: #0: 12.304 [2023-02-24 14:22:29,096][00205] Avg episode reward: 30.429, avg true_objective: 12.304 [2023-02-24 14:22:29,167][00205] Num frames 9900... [2023-02-24 14:22:29,289][00205] Num frames 10000... [2023-02-24 14:22:29,402][00205] Num frames 10100... [2023-02-24 14:22:29,522][00205] Num frames 10200... [2023-02-24 14:22:29,639][00205] Num frames 10300... [2023-02-24 14:22:29,752][00205] Num frames 10400... [2023-02-24 14:22:29,869][00205] Num frames 10500... [2023-02-24 14:22:29,982][00205] Num frames 10600... [2023-02-24 14:22:30,093][00205] Avg episode rewards: #0: 28.937, true rewards: #0: 11.826 [2023-02-24 14:22:30,096][00205] Avg episode reward: 28.937, avg true_objective: 11.826 [2023-02-24 14:22:30,163][00205] Num frames 10700... [2023-02-24 14:22:30,280][00205] Num frames 10800... [2023-02-24 14:22:30,399][00205] Num frames 10900... [2023-02-24 14:22:30,527][00205] Num frames 11000... 
[2023-02-24 14:22:30,640][00205] Num frames 11100... [2023-02-24 14:22:30,756][00205] Num frames 11200... [2023-02-24 14:22:30,874][00205] Num frames 11300... [2023-02-24 14:22:30,988][00205] Num frames 11400... [2023-02-24 14:22:31,105][00205] Num frames 11500... [2023-02-24 14:22:31,218][00205] Num frames 11600... [2023-02-24 14:22:31,341][00205] Num frames 11700... [2023-02-24 14:22:31,461][00205] Num frames 11800... [2023-02-24 14:22:31,580][00205] Num frames 11900... [2023-02-24 14:22:31,701][00205] Num frames 12000... [2023-02-24 14:22:31,781][00205] Avg episode rewards: #0: 29.819, true rewards: #0: 12.019 [2023-02-24 14:22:31,782][00205] Avg episode reward: 29.819, avg true_objective: 12.019 [2023-02-24 14:23:45,181][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-24 14:28:54,610][00205] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-24 14:28:54,613][00205] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-24 14:28:54,616][00205] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-24 14:28:54,619][00205] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-24 14:28:54,621][00205] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 14:28:54,624][00205] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-24 14:28:54,625][00205] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-24 14:28:54,626][00205] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-24 14:28:54,628][00205] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-24 14:28:54,629][00205] Adding new argument 'hf_repository'='parsasam/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! 
[2023-02-24 14:28:54,630][00205] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-24 14:28:54,632][00205] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-24 14:28:54,633][00205] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-24 14:28:54,635][00205] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-24 14:28:54,636][00205] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-24 14:28:54,667][00205] RunningMeanStd input shape: (3, 72, 128) [2023-02-24 14:28:54,669][00205] RunningMeanStd input shape: (1,) [2023-02-24 14:28:54,686][00205] ConvEncoder: input_channels=3 [2023-02-24 14:28:54,746][00205] Conv encoder output size: 512 [2023-02-24 14:28:54,747][00205] Policy head output size: 512 [2023-02-24 14:28:54,769][00205] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000004884_20004864.pth... [2023-02-24 14:28:55,225][00205] Num frames 100... [2023-02-24 14:28:55,343][00205] Num frames 200... [2023-02-24 14:28:55,454][00205] Num frames 300... [2023-02-24 14:28:55,583][00205] Num frames 400... [2023-02-24 14:28:55,694][00205] Num frames 500... [2023-02-24 14:28:55,818][00205] Num frames 600... [2023-02-24 14:28:55,945][00205] Num frames 700... [2023-02-24 14:28:56,066][00205] Num frames 800... [2023-02-24 14:28:56,179][00205] Num frames 900... [2023-02-24 14:28:56,290][00205] Num frames 1000... [2023-02-24 14:28:56,401][00205] Num frames 1100... [2023-02-24 14:28:56,516][00205] Num frames 1200... [2023-02-24 14:28:56,638][00205] Num frames 1300... [2023-02-24 14:28:56,749][00205] Num frames 1400... [2023-02-24 14:28:56,861][00205] Num frames 1500... [2023-02-24 14:28:56,973][00205] Num frames 1600... [2023-02-24 14:28:57,087][00205] Num frames 1700... [2023-02-24 14:28:57,207][00205] Num frames 1800... [2023-02-24 14:28:57,332][00205] Num frames 1900... 
[2023-02-24 14:28:57,450][00205] Num frames 2000... [2023-02-24 14:28:57,575][00205] Num frames 2100... [2023-02-24 14:28:57,628][00205] Avg episode rewards: #0: 64.999, true rewards: #0: 21.000 [2023-02-24 14:28:57,629][00205] Avg episode reward: 64.999, avg true_objective: 21.000 [2023-02-24 14:28:57,748][00205] Num frames 2200... [2023-02-24 14:28:57,873][00205] Num frames 2300... [2023-02-24 14:28:57,988][00205] Num frames 2400... [2023-02-24 14:28:58,102][00205] Num frames 2500... [2023-02-24 14:28:58,215][00205] Num frames 2600... [2023-02-24 14:28:58,339][00205] Num frames 2700... [2023-02-24 14:28:58,493][00205] Num frames 2800... [2023-02-24 14:28:58,617][00205] Num frames 2900... [2023-02-24 14:28:58,732][00205] Num frames 3000... [2023-02-24 14:28:58,851][00205] Num frames 3100... [2023-02-24 14:28:58,964][00205] Num frames 3200... [2023-02-24 14:28:59,134][00205] Num frames 3300... [2023-02-24 14:28:59,299][00205] Num frames 3400... [2023-02-24 14:28:59,421][00205] Avg episode rewards: #0: 49.200, true rewards: #0: 17.200 [2023-02-24 14:28:59,427][00205] Avg episode reward: 49.200, avg true_objective: 17.200 [2023-02-24 14:28:59,527][00205] Num frames 3500... [2023-02-24 14:28:59,691][00205] Num frames 3600... [2023-02-24 14:28:59,846][00205] Num frames 3700... [2023-02-24 14:29:00,000][00205] Num frames 3800... [2023-02-24 14:29:00,169][00205] Num frames 3900... [2023-02-24 14:29:00,327][00205] Num frames 4000... [2023-02-24 14:29:00,485][00205] Num frames 4100... [2023-02-24 14:29:00,654][00205] Num frames 4200... [2023-02-24 14:29:00,749][00205] Avg episode rewards: #0: 39.740, true rewards: #0: 14.073 [2023-02-24 14:29:00,751][00205] Avg episode reward: 39.740, avg true_objective: 14.073 [2023-02-24 14:29:00,877][00205] Num frames 4300... [2023-02-24 14:29:01,032][00205] Num frames 4400... [2023-02-24 14:29:01,190][00205] Num frames 4500... [2023-02-24 14:29:01,356][00205] Num frames 4600... [2023-02-24 14:29:01,519][00205] Num frames 4700... 
[2023-02-24 14:29:01,687][00205] Num frames 4800... [2023-02-24 14:29:01,850][00205] Num frames 4900... [2023-02-24 14:29:02,013][00205] Num frames 5000... [2023-02-24 14:29:02,209][00205] Avg episode rewards: #0: 33.965, true rewards: #0: 12.715 [2023-02-24 14:29:02,212][00205] Avg episode reward: 33.965, avg true_objective: 12.715 [2023-02-24 14:29:02,243][00205] Num frames 5100... [2023-02-24 14:29:02,413][00205] Num frames 5200... [2023-02-24 14:29:02,533][00205] Num frames 5300... [2023-02-24 14:29:02,647][00205] Num frames 5400... [2023-02-24 14:29:02,766][00205] Num frames 5500... [2023-02-24 14:29:02,880][00205] Num frames 5600... [2023-02-24 14:29:02,997][00205] Num frames 5700... [2023-02-24 14:29:03,113][00205] Num frames 5800... [2023-02-24 14:29:03,230][00205] Num frames 5900... [2023-02-24 14:29:03,340][00205] Num frames 6000... [2023-02-24 14:29:03,451][00205] Num frames 6100... [2023-02-24 14:29:03,565][00205] Num frames 6200... [2023-02-24 14:29:03,684][00205] Num frames 6300... [2023-02-24 14:29:03,805][00205] Num frames 6400... [2023-02-24 14:29:03,920][00205] Num frames 6500... [2023-02-24 14:29:04,053][00205] Avg episode rewards: #0: 35.338, true rewards: #0: 13.138 [2023-02-24 14:29:04,055][00205] Avg episode reward: 35.338, avg true_objective: 13.138 [2023-02-24 14:29:04,093][00205] Num frames 6600... [2023-02-24 14:29:04,214][00205] Num frames 6700... [2023-02-24 14:29:04,326][00205] Num frames 6800... [2023-02-24 14:29:04,449][00205] Num frames 6900... [2023-02-24 14:29:04,562][00205] Num frames 7000... [2023-02-24 14:29:04,673][00205] Num frames 7100... [2023-02-24 14:29:04,798][00205] Num frames 7200... [2023-02-24 14:29:04,918][00205] Num frames 7300... [2023-02-24 14:29:04,982][00205] Avg episode rewards: #0: 31.675, true rewards: #0: 12.175 [2023-02-24 14:29:04,986][00205] Avg episode reward: 31.675, avg true_objective: 12.175 [2023-02-24 14:29:05,094][00205] Num frames 7400... [2023-02-24 14:29:05,206][00205] Num frames 7500... 
[2023-02-24 14:29:05,326][00205] Num frames 7600... [2023-02-24 14:29:05,438][00205] Num frames 7700... [2023-02-24 14:29:05,553][00205] Num frames 7800... [2023-02-24 14:29:05,669][00205] Num frames 7900... [2023-02-24 14:29:05,795][00205] Num frames 8000... [2023-02-24 14:29:05,909][00205] Num frames 8100... [2023-02-24 14:29:06,023][00205] Num frames 8200... [2023-02-24 14:29:06,134][00205] Num frames 8300... [2023-02-24 14:29:06,250][00205] Num frames 8400... [2023-02-24 14:29:06,361][00205] Num frames 8500... [2023-02-24 14:29:06,480][00205] Num frames 8600... [2023-02-24 14:29:06,559][00205] Avg episode rewards: #0: 31.314, true rewards: #0: 12.314 [2023-02-24 14:29:06,560][00205] Avg episode reward: 31.314, avg true_objective: 12.314 [2023-02-24 14:29:06,657][00205] Num frames 8700... [2023-02-24 14:29:06,785][00205] Num frames 8800... [2023-02-24 14:29:06,903][00205] Num frames 8900... [2023-02-24 14:29:07,018][00205] Num frames 9000... [2023-02-24 14:29:07,138][00205] Num frames 9100... [2023-02-24 14:29:07,248][00205] Num frames 9200... [2023-02-24 14:29:07,363][00205] Num frames 9300... [2023-02-24 14:29:07,479][00205] Num frames 9400... [2023-02-24 14:29:07,595][00205] Num frames 9500... [2023-02-24 14:29:07,707][00205] Num frames 9600... [2023-02-24 14:29:07,833][00205] Num frames 9700... [2023-02-24 14:29:07,946][00205] Num frames 9800... [2023-02-24 14:29:08,060][00205] Num frames 9900... [2023-02-24 14:29:08,175][00205] Avg episode rewards: #0: 31.062, true rewards: #0: 12.437 [2023-02-24 14:29:08,178][00205] Avg episode reward: 31.062, avg true_objective: 12.437 [2023-02-24 14:29:08,238][00205] Num frames 10000... [2023-02-24 14:29:08,357][00205] Num frames 10100... [2023-02-24 14:29:08,469][00205] Num frames 10200... [2023-02-24 14:29:08,584][00205] Num frames 10300... 
[2023-02-24 14:29:08,715][00205] Avg episode rewards: #0: 28.296, true rewards: #0: 11.518 [2023-02-24 14:29:08,717][00205] Avg episode reward: 28.296, avg true_objective: 11.518 [2023-02-24 14:29:08,760][00205] Num frames 10400... [2023-02-24 14:29:08,882][00205] Num frames 10500... [2023-02-24 14:29:09,002][00205] Num frames 10600... [2023-02-24 14:29:09,118][00205] Num frames 10700... [2023-02-24 14:29:09,230][00205] Num frames 10800... [2023-02-24 14:29:09,342][00205] Num frames 10900... [2023-02-24 14:29:09,458][00205] Num frames 11000... [2023-02-24 14:29:09,579][00205] Num frames 11100... [2023-02-24 14:29:09,691][00205] Num frames 11200... [2023-02-24 14:29:09,802][00205] Num frames 11300... [2023-02-24 14:29:09,923][00205] Num frames 11400... [2023-02-24 14:29:10,035][00205] Num frames 11500... [2023-02-24 14:29:10,152][00205] Num frames 11600... [2023-02-24 14:29:10,244][00205] Avg episode rewards: #0: 28.832, true rewards: #0: 11.632 [2023-02-24 14:29:10,246][00205] Avg episode reward: 28.832, avg true_objective: 11.632 [2023-02-24 14:30:20,364][00205] Replay video saved to /content/train_dir/default_experiment/replay.mp4!