[2023-02-24 06:47:27,135][00771] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-24 06:47:27,139][00771] Rollout worker 0 uses device cpu
[2023-02-24 06:47:27,140][00771] Rollout worker 1 uses device cpu
[2023-02-24 06:47:27,142][00771] Rollout worker 2 uses device cpu
[2023-02-24 06:47:27,144][00771] Rollout worker 3 uses device cpu
[2023-02-24 06:47:27,145][00771] Rollout worker 4 uses device cpu
[2023-02-24 06:47:27,147][00771] Rollout worker 5 uses device cpu
[2023-02-24 06:47:27,148][00771] Rollout worker 6 uses device cpu
[2023-02-24 06:47:27,150][00771] Rollout worker 7 uses device cpu
[2023-02-24 06:47:27,342][00771] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 06:47:27,344][00771] InferenceWorker_p0-w0: min num requests: 2
[2023-02-24 06:47:27,379][00771] Starting all processes...
[2023-02-24 06:47:27,381][00771] Starting process learner_proc0
[2023-02-24 06:47:27,439][00771] Starting all processes...
[2023-02-24 06:47:27,448][00771] Starting process inference_proc0-0
[2023-02-24 06:47:27,449][00771] Starting process rollout_proc0
[2023-02-24 06:47:27,451][00771] Starting process rollout_proc1
[2023-02-24 06:47:27,451][00771] Starting process rollout_proc2
[2023-02-24 06:47:27,451][00771] Starting process rollout_proc3
[2023-02-24 06:47:27,451][00771] Starting process rollout_proc4
[2023-02-24 06:47:27,451][00771] Starting process rollout_proc5
[2023-02-24 06:47:27,452][00771] Starting process rollout_proc6
[2023-02-24 06:47:27,452][00771] Starting process rollout_proc7
[2023-02-24 06:47:39,903][12909] Worker 3 uses CPU cores [1]
[2023-02-24 06:47:40,006][12892] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 06:47:40,007][12892] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-24 06:47:40,043][12911] Worker 4 uses CPU cores [0]
[2023-02-24 06:47:40,119][12907] Worker 1 uses CPU cores [1]
[2023-02-24 06:47:40,161][12913] Worker 6 uses CPU cores [0]
[2023-02-24 06:47:40,257][12908] Worker 0 uses CPU cores [0]
[2023-02-24 06:47:40,418][12910] Worker 2 uses CPU cores [0]
[2023-02-24 06:47:40,446][12906] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 06:47:40,450][12914] Worker 7 uses CPU cores [1]
[2023-02-24 06:47:40,450][12906] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-24 06:47:40,598][12912] Worker 5 uses CPU cores [1]
[2023-02-24 06:47:40,921][12892] Num visible devices: 1
[2023-02-24 06:47:40,921][12906] Num visible devices: 1
[2023-02-24 06:47:40,930][12892] Starting seed is not provided
[2023-02-24 06:47:40,930][12892] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 06:47:40,931][12892] Initializing actor-critic model on device cuda:0
[2023-02-24 06:47:40,931][12892] RunningMeanStd input shape: (3, 72, 128)
[2023-02-24 06:47:40,933][12892] RunningMeanStd input shape: (1,)
[2023-02-24 06:47:40,946][12892] ConvEncoder: input_channels=3
[2023-02-24 06:47:41,239][12892] Conv encoder output size: 512
[2023-02-24 06:47:41,239][12892] Policy head output size: 512
[2023-02-24 06:47:41,293][12892] Created Actor Critic model with architecture:
[2023-02-24 06:47:41,293][12892] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-24 06:47:47,334][00771] Heartbeat connected on Batcher_0
[2023-02-24 06:47:47,342][00771] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-24 06:47:47,353][00771] Heartbeat connected on RolloutWorker_w0
[2023-02-24 06:47:47,356][00771] Heartbeat connected on RolloutWorker_w1
[2023-02-24 06:47:47,360][00771] Heartbeat connected on RolloutWorker_w2
[2023-02-24 06:47:47,363][00771] Heartbeat connected on RolloutWorker_w3
[2023-02-24 06:47:47,368][00771] Heartbeat connected on RolloutWorker_w4
[2023-02-24 06:47:47,372][00771] Heartbeat connected on RolloutWorker_w5
[2023-02-24 06:47:47,377][00771] Heartbeat connected on RolloutWorker_w6
[2023-02-24 06:47:47,380][00771] Heartbeat connected on RolloutWorker_w7
[2023-02-24 06:47:48,647][12892] Using optimizer
[2023-02-24 06:47:48,648][12892] No checkpoints found
[2023-02-24 06:47:48,648][12892] Did not load from checkpoint, starting from scratch!
[2023-02-24 06:47:48,648][12892] Initialized policy 0 weights for model version 0
[2023-02-24 06:47:48,654][12892] LearnerWorker_p0 finished initialization!
[2023-02-24 06:47:48,656][00771] Heartbeat connected on LearnerWorker_p0
[2023-02-24 06:47:48,660][12892] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-24 06:47:48,957][12906] RunningMeanStd input shape: (3, 72, 128)
[2023-02-24 06:47:48,958][12906] RunningMeanStd input shape: (1,)
[2023-02-24 06:47:48,979][12906] ConvEncoder: input_channels=3
[2023-02-24 06:47:49,141][12906] Conv encoder output size: 512
[2023-02-24 06:47:49,142][12906] Policy head output size: 512
[2023-02-24 06:47:52,354][00771] Inference worker 0-0 is ready!
[2023-02-24 06:47:52,359][00771] All inference workers are ready! Signal rollout workers to start!
[2023-02-24 06:47:52,552][12907] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,555][12909] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,551][12908] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,573][12912] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,573][12914] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,604][12911] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,602][12910] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,624][12913] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-24 06:47:52,787][00771] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 06:47:53,780][12907] Decorrelating experience for 0 frames...
[2023-02-24 06:47:53,778][12909] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,092][12908] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,099][12910] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,102][12913] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,705][12914] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,713][12912] Decorrelating experience for 0 frames...
[2023-02-24 06:47:54,808][12908] Decorrelating experience for 32 frames...
[2023-02-24 06:47:54,950][12911] Decorrelating experience for 0 frames...
[2023-02-24 06:47:55,082][12909] Decorrelating experience for 32 frames...
[2023-02-24 06:47:55,500][12914] Decorrelating experience for 32 frames...
[2023-02-24 06:47:55,792][12912] Decorrelating experience for 32 frames...
[2023-02-24 06:47:55,907][12913] Decorrelating experience for 32 frames...
[2023-02-24 06:47:56,015][12911] Decorrelating experience for 32 frames...
[2023-02-24 06:47:56,026][12908] Decorrelating experience for 64 frames...
[2023-02-24 06:47:56,821][12907] Decorrelating experience for 32 frames...
[2023-02-24 06:47:56,835][12914] Decorrelating experience for 64 frames...
[2023-02-24 06:47:57,364][12912] Decorrelating experience for 64 frames...
[2023-02-24 06:47:57,423][12910] Decorrelating experience for 32 frames...
[2023-02-24 06:47:57,657][12913] Decorrelating experience for 64 frames...
[2023-02-24 06:47:57,695][12908] Decorrelating experience for 96 frames...
[2023-02-24 06:47:57,787][00771] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 06:47:57,858][12911] Decorrelating experience for 64 frames...
[2023-02-24 06:47:58,808][12907] Decorrelating experience for 64 frames...
[2023-02-24 06:47:58,880][12912] Decorrelating experience for 96 frames...
[2023-02-24 06:47:58,940][12910] Decorrelating experience for 64 frames...
[2023-02-24 06:47:58,955][12909] Decorrelating experience for 64 frames...
[2023-02-24 06:47:59,285][12911] Decorrelating experience for 96 frames...
[2023-02-24 06:47:59,509][12914] Decorrelating experience for 96 frames...
[2023-02-24 06:48:00,106][12907] Decorrelating experience for 96 frames...
[2023-02-24 06:48:00,185][12909] Decorrelating experience for 96 frames...
[2023-02-24 06:48:00,327][12913] Decorrelating experience for 96 frames...
[2023-02-24 06:48:00,506][12910] Decorrelating experience for 96 frames...
[2023-02-24 06:48:02,787][00771] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 6.4. Samples: 64. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 06:48:02,789][00771] Avg episode reward: [(0, '1.160')]
[2023-02-24 06:48:05,560][12892] Signal inference workers to stop experience collection...
[2023-02-24 06:48:05,585][12906] InferenceWorker_p0-w0: stopping experience collection
[2023-02-24 06:48:07,787][00771] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 159.2. Samples: 2388. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-24 06:48:07,789][00771] Avg episode reward: [(0, '1.932')]
[2023-02-24 06:48:08,306][12892] Signal inference workers to resume experience collection...
[2023-02-24 06:48:08,307][12906] InferenceWorker_p0-w0: resuming experience collection
[2023-02-24 06:48:12,787][00771] Fps is (10 sec: 1228.8, 60 sec: 614.4, 300 sec: 614.4). Total num frames: 12288. Throughput: 0: 159.8. Samples: 3196. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-02-24 06:48:12,790][00771] Avg episode reward: [(0, '3.053')]
[2023-02-24 06:48:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 1310.8, 300 sec: 1310.8). Total num frames: 32768. Throughput: 0: 325.0. Samples: 8126. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:48:17,789][00771] Avg episode reward: [(0, '3.747')]
[2023-02-24 06:48:19,087][12906] Updated weights for policy 0, policy_version 10 (0.0019)
[2023-02-24 06:48:22,787][00771] Fps is (10 sec: 4096.0, 60 sec: 1775.0, 300 sec: 1775.0). Total num frames: 53248. Throughput: 0: 470.4. Samples: 14112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:48:22,794][00771] Avg episode reward: [(0, '4.393')]
[2023-02-24 06:48:27,787][00771] Fps is (10 sec: 3276.8, 60 sec: 1872.5, 300 sec: 1872.5). Total num frames: 65536. Throughput: 0: 456.1. Samples: 15964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:48:27,793][00771] Avg episode reward: [(0, '4.382')]
[2023-02-24 06:48:32,787][00771] Fps is (10 sec: 2457.6, 60 sec: 1945.6, 300 sec: 1945.6). Total num frames: 77824. Throughput: 0: 494.6. Samples: 19782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:48:32,793][00771] Avg episode reward: [(0, '4.263')]
[2023-02-24 06:48:32,846][12906] Updated weights for policy 0, policy_version 20 (0.0018)
[2023-02-24 06:48:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 2184.6, 300 sec: 2184.6). Total num frames: 98304. Throughput: 0: 568.0. Samples: 25560. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-24 06:48:37,789][00771] Avg episode reward: [(0, '4.319')]
[2023-02-24 06:48:42,787][00771] Fps is (10 sec: 4095.9, 60 sec: 2375.7, 300 sec: 2375.7). Total num frames: 118784. Throughput: 0: 634.4. Samples: 28548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:48:42,794][00771] Avg episode reward: [(0, '4.473')]
[2023-02-24 06:48:42,808][12892] Saving new best policy, reward=4.473!
[2023-02-24 06:48:43,200][12906] Updated weights for policy 0, policy_version 30 (0.0013)
[2023-02-24 06:48:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 2383.2, 300 sec: 2383.2). Total num frames: 131072. Throughput: 0: 737.6. Samples: 33256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:48:47,789][00771] Avg episode reward: [(0, '4.420')]
[2023-02-24 06:48:52,789][00771] Fps is (10 sec: 2047.7, 60 sec: 2321.0, 300 sec: 2321.0). Total num frames: 139264. Throughput: 0: 733.6. Samples: 35402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:48:52,795][00771] Avg episode reward: [(0, '4.391')]
[2023-02-24 06:48:57,787][00771] Fps is (10 sec: 2867.2, 60 sec: 2662.4, 300 sec: 2457.6). Total num frames: 159744. Throughput: 0: 779.8. Samples: 38288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:48:57,794][00771] Avg episode reward: [(0, '4.365')]
[2023-02-24 06:48:58,586][12906] Updated weights for policy 0, policy_version 40 (0.0052)
[2023-02-24 06:49:02,787][00771] Fps is (10 sec: 4096.8, 60 sec: 3003.7, 300 sec: 2574.6). Total num frames: 180224. Throughput: 0: 805.0. Samples: 44352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:49:02,789][00771] Avg episode reward: [(0, '4.156')]
[2023-02-24 06:49:07,791][00771] Fps is (10 sec: 3275.4, 60 sec: 3208.3, 300 sec: 2566.7). Total num frames: 192512. Throughput: 0: 767.8. Samples: 48664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:49:07,795][00771] Avg episode reward: [(0, '4.232')]
[2023-02-24 06:49:12,234][12906] Updated weights for policy 0, policy_version 50 (0.0018)
[2023-02-24 06:49:12,787][00771] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 2560.0). Total num frames: 204800. Throughput: 0: 768.7. Samples: 50558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:49:12,794][00771] Avg episode reward: [(0, '4.305')]
[2023-02-24 06:49:17,787][00771] Fps is (10 sec: 3278.2, 60 sec: 3208.5, 300 sec: 2650.4). Total num frames: 225280. Throughput: 0: 798.8. Samples: 55730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:49:17,790][00771] Avg episode reward: [(0, '4.352')]
[2023-02-24 06:49:22,788][00771] Fps is (10 sec: 3686.0, 60 sec: 3140.2, 300 sec: 2685.1). Total num frames: 241664. Throughput: 0: 801.9. Samples: 61648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:49:22,791][00771] Avg episode reward: [(0, '4.433')]
[2023-02-24 06:49:22,867][12906] Updated weights for policy 0, policy_version 60 (0.0023)
[2023-02-24 06:49:22,869][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000060_245760.pth...
[2023-02-24 06:49:27,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2673.2). Total num frames: 253952. Throughput: 0: 771.2. Samples: 63252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:49:27,792][00771] Avg episode reward: [(0, '4.443')]
[2023-02-24 06:49:32,787][00771] Fps is (10 sec: 2458.0, 60 sec: 3140.3, 300 sec: 2662.4). Total num frames: 266240. Throughput: 0: 734.7. Samples: 66318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:49:32,793][00771] Avg episode reward: [(0, '4.607')]
[2023-02-24 06:49:32,799][12892] Saving new best policy, reward=4.607!
[2023-02-24 06:49:37,787][00771] Fps is (10 sec: 2047.9, 60 sec: 2935.4, 300 sec: 2613.6). Total num frames: 274432. Throughput: 0: 763.5. Samples: 69758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:49:37,793][00771] Avg episode reward: [(0, '4.666')]
[2023-02-24 06:49:37,795][12892] Saving new best policy, reward=4.666!
[2023-02-24 06:49:39,940][12906] Updated weights for policy 0, policy_version 70 (0.0041)
[2023-02-24 06:49:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3003.8, 300 sec: 2718.3). Total num frames: 299008. Throughput: 0: 766.0. Samples: 72758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:49:42,789][00771] Avg episode reward: [(0, '4.685')]
[2023-02-24 06:49:42,800][12892] Saving new best policy, reward=4.685!
[2023-02-24 06:49:47,787][00771] Fps is (10 sec: 4505.8, 60 sec: 3140.3, 300 sec: 2778.2). Total num frames: 319488. Throughput: 0: 778.0. Samples: 79360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:49:47,794][00771] Avg episode reward: [(0, '4.560')]
[2023-02-24 06:49:49,764][12906] Updated weights for policy 0, policy_version 80 (0.0015)
[2023-02-24 06:49:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3276.9, 300 sec: 2798.9). Total num frames: 335872. Throughput: 0: 786.6. Samples: 84056. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-24 06:49:52,789][00771] Avg episode reward: [(0, '4.592')]
[2023-02-24 06:49:57,787][00771] Fps is (10 sec: 2867.0, 60 sec: 3140.2, 300 sec: 2785.3). Total num frames: 348160. Throughput: 0: 790.3. Samples: 86122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:49:57,794][00771] Avg episode reward: [(0, '4.525')]
[2023-02-24 06:50:02,114][12906] Updated weights for policy 0, policy_version 90 (0.0027)
[2023-02-24 06:50:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 2835.7). Total num frames: 368640. Throughput: 0: 803.9. Samples: 91904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:50:02,790][00771] Avg episode reward: [(0, '4.485')]
[2023-02-24 06:50:07,787][00771] Fps is (10 sec: 4505.9, 60 sec: 3345.3, 300 sec: 2912.7). Total num frames: 393216. Throughput: 0: 820.5. Samples: 98570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-24 06:50:07,792][00771] Avg episode reward: [(0, '4.635')]
[2023-02-24 06:50:12,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2896.5). Total num frames: 405504. Throughput: 0: 832.7. Samples: 100722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:50:12,791][00771] Avg episode reward: [(0, '4.528')]
[2023-02-24 06:50:13,407][12906] Updated weights for policy 0, policy_version 100 (0.0041)
[2023-02-24 06:50:17,788][00771] Fps is (10 sec: 2866.8, 60 sec: 3276.7, 300 sec: 2909.6). Total num frames: 421888. Throughput: 0: 860.3. Samples: 105032. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:50:17,790][00771] Avg episode reward: [(0, '4.515')]
[2023-02-24 06:50:22,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2949.1). Total num frames: 442368. Throughput: 0: 921.6. Samples: 111228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:50:22,789][00771] Avg episode reward: [(0, '4.421')]
[2023-02-24 06:50:24,073][12906] Updated weights for policy 0, policy_version 110 (0.0019)
[2023-02-24 06:50:27,787][00771] Fps is (10 sec: 4506.2, 60 sec: 3549.9, 300 sec: 3012.6). Total num frames: 466944. Throughput: 0: 931.9. Samples: 114694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:50:27,789][00771] Avg episode reward: [(0, '4.540')]
[2023-02-24 06:50:32,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 2995.2). Total num frames: 479232. Throughput: 0: 908.3. Samples: 120232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:50:32,790][00771] Avg episode reward: [(0, '4.684')]
[2023-02-24 06:50:36,033][12906] Updated weights for policy 0, policy_version 120 (0.0017)
[2023-02-24 06:50:37,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3003.7). Total num frames: 495616. Throughput: 0: 899.2. Samples: 124518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:50:37,792][00771] Avg episode reward: [(0, '4.669')]
[2023-02-24 06:50:42,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3035.9). Total num frames: 516096. Throughput: 0: 922.2. Samples: 127620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:50:42,790][00771] Avg episode reward: [(0, '4.606')]
[2023-02-24 06:50:45,554][12906] Updated weights for policy 0, policy_version 130 (0.0012)
[2023-02-24 06:50:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3089.6). Total num frames: 540672. Throughput: 0: 948.8. Samples: 134598. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:50:47,791][00771] Avg episode reward: [(0, '4.474')]
[2023-02-24 06:50:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3094.8). Total num frames: 557056. Throughput: 0: 911.4. Samples: 139582. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:50:52,790][00771] Avg episode reward: [(0, '4.408')]
[2023-02-24 06:50:57,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3077.5). Total num frames: 569344. Throughput: 0: 910.8. Samples: 141708. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:50:57,795][00771] Avg episode reward: [(0, '4.570')]
[2023-02-24 06:50:58,021][12906] Updated weights for policy 0, policy_version 140 (0.0031)
[2023-02-24 06:51:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3125.9). Total num frames: 593920. Throughput: 0: 948.3. Samples: 147706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:51:02,789][00771] Avg episode reward: [(0, '4.588')]
[2023-02-24 06:51:06,856][12906] Updated weights for policy 0, policy_version 150 (0.0013)
[2023-02-24 06:51:07,787][00771] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3171.8). Total num frames: 618496. Throughput: 0: 969.5. Samples: 154854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:51:07,789][00771] Avg episode reward: [(0, '4.554')]
[2023-02-24 06:51:12,788][00771] Fps is (10 sec: 3685.8, 60 sec: 3754.6, 300 sec: 3153.9). Total num frames: 630784. Throughput: 0: 945.8. Samples: 157256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:51:12,796][00771] Avg episode reward: [(0, '4.741')]
[2023-02-24 06:51:12,809][12892] Saving new best policy, reward=4.741!
[2023-02-24 06:51:17,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3754.8, 300 sec: 3156.9). Total num frames: 647168. Throughput: 0: 921.3. Samples: 161690. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:51:17,795][00771] Avg episode reward: [(0, '4.694')]
[2023-02-24 06:51:19,057][12906] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-02-24 06:51:22,787][00771] Fps is (10 sec: 4096.6, 60 sec: 3822.9, 300 sec: 3198.8). Total num frames: 671744. Throughput: 0: 973.1. Samples: 168306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:51:22,792][00771] Avg episode reward: [(0, '4.586')]
[2023-02-24 06:51:22,801][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000164_671744.pth...
[2023-02-24 06:51:27,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3219.7). Total num frames: 692224. Throughput: 0: 979.9. Samples: 171716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:51:27,790][00771] Avg episode reward: [(0, '4.634')]
[2023-02-24 06:51:28,059][12906] Updated weights for policy 0, policy_version 170 (0.0018)
[2023-02-24 06:51:32,787][00771] Fps is (10 sec: 3686.2, 60 sec: 3822.9, 300 sec: 3220.9). Total num frames: 708608. Throughput: 0: 946.3. Samples: 177182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:51:32,792][00771] Avg episode reward: [(0, '4.631')]
[2023-02-24 06:51:37,787][00771] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3222.2). Total num frames: 724992. Throughput: 0: 938.2. Samples: 181802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:51:37,794][00771] Avg episode reward: [(0, '4.602')]
[2023-02-24 06:51:40,060][12906] Updated weights for policy 0, policy_version 180 (0.0019)
[2023-02-24 06:51:42,787][00771] Fps is (10 sec: 4096.3, 60 sec: 3891.2, 300 sec: 3259.0). Total num frames: 749568. Throughput: 0: 968.0. Samples: 185266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:51:42,790][00771] Avg episode reward: [(0, '4.644')]
[2023-02-24 06:51:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 770048. Throughput: 0: 991.8. Samples: 192336. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:51:47,789][00771] Avg episode reward: [(0, '4.935')]
[2023-02-24 06:51:47,792][12892] Saving new best policy, reward=4.935!
[2023-02-24 06:51:49,556][12906] Updated weights for policy 0, policy_version 190 (0.0013)
[2023-02-24 06:51:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 786432. Throughput: 0: 940.7. Samples: 197184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:51:52,790][00771] Avg episode reward: [(0, '5.034')]
[2023-02-24 06:51:52,814][12892] Saving new best policy, reward=5.034!
[2023-02-24 06:51:57,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3276.8). Total num frames: 802816. Throughput: 0: 935.1. Samples: 199334. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-24 06:51:57,795][00771] Avg episode reward: [(0, '4.999')]
[2023-02-24 06:52:01,182][12906] Updated weights for policy 0, policy_version 200 (0.0017)
[2023-02-24 06:52:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3293.2). Total num frames: 823296. Throughput: 0: 976.7. Samples: 205642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:52:02,789][00771] Avg episode reward: [(0, '4.785')]
[2023-02-24 06:52:07,789][00771] Fps is (10 sec: 4504.7, 60 sec: 3822.8, 300 sec: 3325.0). Total num frames: 847872. Throughput: 0: 985.6. Samples: 212662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-24 06:52:07,799][00771] Avg episode reward: [(0, '5.032')]
[2023-02-24 06:52:11,255][12906] Updated weights for policy 0, policy_version 210 (0.0022)
[2023-02-24 06:52:12,789][00771] Fps is (10 sec: 4095.1, 60 sec: 3891.2, 300 sec: 3324.0). Total num frames: 864256. Throughput: 0: 958.4. Samples: 214844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:52:12,798][00771] Avg episode reward: [(0, '4.995')]
[2023-02-24 06:52:17,787][00771] Fps is (10 sec: 2867.8, 60 sec: 3822.9, 300 sec: 3307.7). Total num frames: 876544. Throughput: 0: 933.3. Samples: 219182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:52:17,789][00771] Avg episode reward: [(0, '4.864')]
[2023-02-24 06:52:22,364][12906] Updated weights for policy 0, policy_version 220 (0.0014)
[2023-02-24 06:52:22,787][00771] Fps is (10 sec: 3687.2, 60 sec: 3822.9, 300 sec: 3337.5). Total num frames: 901120. Throughput: 0: 979.5. Samples: 225878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-24 06:52:22,789][00771] Avg episode reward: [(0, '5.138')]
[2023-02-24 06:52:22,800][12892] Saving new best policy, reward=5.138!
[2023-02-24 06:52:27,792][00771] Fps is (10 sec: 4912.8, 60 sec: 3890.9, 300 sec: 3366.1). Total num frames: 925696. Throughput: 0: 979.4. Samples: 229346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:52:27,800][00771] Avg episode reward: [(0, '5.195')]
[2023-02-24 06:52:27,801][12892] Saving new best policy, reward=5.195!
[2023-02-24 06:52:32,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3350.0). Total num frames: 937984. Throughput: 0: 937.9. Samples: 234542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:52:32,789][00771] Avg episode reward: [(0, '5.114')]
[2023-02-24 06:52:33,317][12906] Updated weights for policy 0, policy_version 230 (0.0020)
[2023-02-24 06:52:37,787][00771] Fps is (10 sec: 2868.6, 60 sec: 3822.9, 300 sec: 3348.7). Total num frames: 954368. Throughput: 0: 933.7. Samples: 239200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:52:37,793][00771] Avg episode reward: [(0, '5.131')]
[2023-02-24 06:52:42,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3375.7). Total num frames: 978944. Throughput: 0: 965.2. Samples: 242768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:52:42,792][00771] Avg episode reward: [(0, '4.960')]
[2023-02-24 06:52:43,407][12906] Updated weights for policy 0, policy_version 240 (0.0021)
[2023-02-24 06:52:47,790][00771] Fps is (10 sec: 4504.1, 60 sec: 3822.7, 300 sec: 3387.8). Total num frames: 999424. Throughput: 0: 983.5. Samples: 249902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:52:47,793][00771] Avg episode reward: [(0, '5.083')]
[2023-02-24 06:52:52,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3429.5). Total num frames: 1011712. Throughput: 0: 909.3. Samples: 253578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:52:52,790][00771] Avg episode reward: [(0, '5.178')]
[2023-02-24 06:52:57,210][12906] Updated weights for policy 0, policy_version 250 (0.0021)
[2023-02-24 06:52:57,787][00771] Fps is (10 sec: 2458.4, 60 sec: 3686.4, 300 sec: 3471.2). Total num frames: 1024000. Throughput: 0: 900.0. Samples: 255342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-24 06:52:57,789][00771] Avg episode reward: [(0, '5.234')]
[2023-02-24 06:52:57,794][12892] Saving new best policy, reward=5.234!
[2023-02-24 06:53:02,787][00771] Fps is (10 sec: 2867.3, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 1040384. Throughput: 0: 890.4. Samples: 259252. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-24 06:53:02,790][00771] Avg episode reward: [(0, '5.338')]
[2023-02-24 06:53:02,799][12892] Saving new best policy, reward=5.338!
[2023-02-24 06:53:07,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3554.5). Total num frames: 1060864. Throughput: 0: 896.2. Samples: 266206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-24 06:53:07,794][00771] Avg episode reward: [(0, '5.452')]
[2023-02-24 06:53:07,800][12892] Saving new best policy, reward=5.452!
[2023-02-24 06:53:08,049][12906] Updated weights for policy 0, policy_version 260 (0.0035)
[2023-02-24 06:53:12,793][00771] Fps is (10 sec: 4093.4, 60 sec: 3617.9, 300 sec: 3554.4). Total num frames: 1081344. Throughput: 0: 896.3. Samples: 269680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:12,795][00771] Avg episode reward: [(0, '5.596')]
[2023-02-24 06:53:12,820][12892] Saving new best policy, reward=5.596!
[2023-02-24 06:53:17,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3540.6). Total num frames: 1097728. Throughput: 0: 889.4. Samples: 274566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:17,793][00771] Avg episode reward: [(0, '5.492')]
[2023-02-24 06:53:19,787][12906] Updated weights for policy 0, policy_version 270 (0.0025)
[2023-02-24 06:53:22,787][00771] Fps is (10 sec: 3278.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1114112. Throughput: 0: 895.7. Samples: 279506. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:22,789][00771] Avg episode reward: [(0, '5.742')]
[2023-02-24 06:53:22,797][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000272_1114112.pth...
[2023-02-24 06:53:22,906][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000060_245760.pth
[2023-02-24 06:53:22,915][12892] Saving new best policy, reward=5.742!
[2023-02-24 06:53:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3550.2, 300 sec: 3596.2). Total num frames: 1138688. Throughput: 0: 892.7. Samples: 282940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:27,788][00771] Avg episode reward: [(0, '5.731')]
[2023-02-24 06:53:28,887][12906] Updated weights for policy 0, policy_version 280 (0.0013)
[2023-02-24 06:53:32,787][00771] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 1163264. Throughput: 0: 895.3. Samples: 290186. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:53:32,794][00771] Avg episode reward: [(0, '5.529')]
[2023-02-24 06:53:37,787][00771] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 1175552. Throughput: 0: 913.7. Samples: 294694. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:53:37,790][00771] Avg episode reward: [(0, '5.205')]
[2023-02-24 06:53:41,144][12906] Updated weights for policy 0, policy_version 290 (0.0022)
[2023-02-24 06:53:42,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 1191936. Throughput: 0: 926.0. Samples: 297012. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-24 06:53:42,793][00771] Avg episode reward: [(0, '5.536')]
[2023-02-24 06:53:47,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3651.7). Total num frames: 1216512. Throughput: 0: 990.8. Samples: 303840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-24 06:53:47,794][00771] Avg episode reward: [(0, '5.660')]
[2023-02-24 06:53:49,673][12906] Updated weights for policy 0, policy_version 300 (0.0015)
[2023-02-24 06:53:52,788][00771] Fps is (10 sec: 4914.6, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 1241088. Throughput: 0: 988.8. Samples: 310702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:52,792][00771] Avg episode reward: [(0, '6.062')]
[2023-02-24 06:53:52,811][12892] Saving new best policy, reward=6.062!
[2023-02-24 06:53:57,789][00771] Fps is (10 sec: 3685.3, 60 sec: 3822.7, 300 sec: 3637.8). Total num frames: 1253376. Throughput: 0: 961.7. Samples: 312954. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-24 06:53:57,793][00771] Avg episode reward: [(0, '6.043')]
[2023-02-24 06:54:01,611][12906] Updated weights for policy 0, policy_version 310 (0.0026)
[2023-02-24 06:54:02,787][00771] Fps is (10 sec: 3277.3, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 1273856. Throughput: 0: 959.1. Samples: 317726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-24 06:54:02,792][00771] Avg episode reward: [(0, '6.385')]
[2023-02-24 06:54:02,804][12892] Saving new best policy, reward=6.385!
[2023-02-24 06:54:07,787][00771] Fps is (10 sec: 4506.9, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 1298432. Throughput: 0: 1009.6. Samples: 324936. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-24 06:54:07,792][00771] Avg episode reward: [(0, '6.536')]
[2023-02-24 06:54:07,794][12892] Saving new best policy, reward=6.536!
[2023-02-24 06:54:10,299][12906] Updated weights for policy 0, policy_version 320 (0.0018) [2023-02-24 06:54:12,789][00771] Fps is (10 sec: 4504.5, 60 sec: 3959.7, 300 sec: 3707.2). Total num frames: 1318912. Throughput: 0: 1011.1. Samples: 328442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:54:12,794][00771] Avg episode reward: [(0, '6.490')] [2023-02-24 06:54:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3693.4). Total num frames: 1331200. Throughput: 0: 954.5. Samples: 333138. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 06:54:17,794][00771] Avg episode reward: [(0, '6.918')] [2023-02-24 06:54:17,797][12892] Saving new best policy, reward=6.918! [2023-02-24 06:54:22,454][12906] Updated weights for policy 0, policy_version 330 (0.0027) [2023-02-24 06:54:22,787][00771] Fps is (10 sec: 3277.5, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 1351680. Throughput: 0: 973.5. Samples: 338500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:54:22,795][00771] Avg episode reward: [(0, '7.004')] [2023-02-24 06:54:22,809][12892] Saving new best policy, reward=7.004! [2023-02-24 06:54:27,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 1376256. Throughput: 0: 998.7. Samples: 341954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:54:27,793][00771] Avg episode reward: [(0, '6.922')] [2023-02-24 06:54:31,282][12906] Updated weights for policy 0, policy_version 340 (0.0033) [2023-02-24 06:54:32,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1396736. Throughput: 0: 998.9. Samples: 348790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:54:32,795][00771] Avg episode reward: [(0, '6.918')] [2023-02-24 06:54:37,790][00771] Fps is (10 sec: 3275.7, 60 sec: 3891.0, 300 sec: 3762.7). Total num frames: 1409024. Throughput: 0: 948.7. Samples: 353396. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:54:37,793][00771] Avg episode reward: [(0, '7.340')] [2023-02-24 06:54:37,795][12892] Saving new best policy, reward=7.340! [2023-02-24 06:54:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 1429504. Throughput: 0: 949.3. Samples: 355668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:54:42,789][00771] Avg episode reward: [(0, '7.742')] [2023-02-24 06:54:42,802][12892] Saving new best policy, reward=7.742! [2023-02-24 06:54:43,337][12906] Updated weights for policy 0, policy_version 350 (0.0024) [2023-02-24 06:54:47,787][00771] Fps is (10 sec: 4507.2, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1454080. Throughput: 0: 998.1. Samples: 362640. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 06:54:47,792][00771] Avg episode reward: [(0, '8.865')] [2023-02-24 06:54:47,794][12892] Saving new best policy, reward=8.865! [2023-02-24 06:54:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 1470464. Throughput: 0: 974.4. Samples: 368782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:54:52,792][00771] Avg episode reward: [(0, '9.635')] [2023-02-24 06:54:52,802][12892] Saving new best policy, reward=9.635! [2023-02-24 06:54:53,049][12906] Updated weights for policy 0, policy_version 360 (0.0013) [2023-02-24 06:54:57,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.4, 300 sec: 3790.5). Total num frames: 1486848. Throughput: 0: 945.1. Samples: 370970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:54:57,793][00771] Avg episode reward: [(0, '10.392')] [2023-02-24 06:54:57,800][12892] Saving new best policy, reward=10.392! [2023-02-24 06:55:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 1507328. Throughput: 0: 954.9. Samples: 376108. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:55:02,792][00771] Avg episode reward: [(0, '10.325')] [2023-02-24 06:55:04,190][12906] Updated weights for policy 0, policy_version 370 (0.0021) [2023-02-24 06:55:07,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1531904. Throughput: 0: 993.9. Samples: 383224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:55:07,789][00771] Avg episode reward: [(0, '10.805')] [2023-02-24 06:55:07,793][12892] Saving new best policy, reward=10.805! [2023-02-24 06:55:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3818.3). Total num frames: 1548288. Throughput: 0: 989.4. Samples: 386476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:55:12,797][00771] Avg episode reward: [(0, '10.717')] [2023-02-24 06:55:14,525][12906] Updated weights for policy 0, policy_version 380 (0.0018) [2023-02-24 06:55:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1564672. Throughput: 0: 937.2. Samples: 390966. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:55:17,791][00771] Avg episode reward: [(0, '10.921')] [2023-02-24 06:55:17,799][12892] Saving new best policy, reward=10.921! [2023-02-24 06:55:22,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1585152. Throughput: 0: 960.4. Samples: 396610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 06:55:22,789][00771] Avg episode reward: [(0, '11.410')] [2023-02-24 06:55:22,800][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000387_1585152.pth... [2023-02-24 06:55:22,915][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000164_671744.pth [2023-02-24 06:55:22,930][12892] Saving new best policy, reward=11.410! 
[2023-02-24 06:55:25,285][12906] Updated weights for policy 0, policy_version 390 (0.0013) [2023-02-24 06:55:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1605632. Throughput: 0: 986.6. Samples: 400064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:55:27,795][00771] Avg episode reward: [(0, '11.660')] [2023-02-24 06:55:27,864][12892] Saving new best policy, reward=11.660! [2023-02-24 06:55:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1626112. Throughput: 0: 973.0. Samples: 406424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 06:55:32,793][00771] Avg episode reward: [(0, '11.791')] [2023-02-24 06:55:32,806][12892] Saving new best policy, reward=11.791! [2023-02-24 06:55:36,328][12906] Updated weights for policy 0, policy_version 400 (0.0021) [2023-02-24 06:55:37,789][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.4, 300 sec: 3818.3). Total num frames: 1642496. Throughput: 0: 936.3. Samples: 410914. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:55:37,791][00771] Avg episode reward: [(0, '12.101')] [2023-02-24 06:55:37,794][12892] Saving new best policy, reward=12.101! [2023-02-24 06:55:42,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1662976. Throughput: 0: 945.0. Samples: 413494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:55:42,789][00771] Avg episode reward: [(0, '11.628')] [2023-02-24 06:55:46,135][12906] Updated weights for policy 0, policy_version 410 (0.0021) [2023-02-24 06:55:47,787][00771] Fps is (10 sec: 4505.2, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 1687552. Throughput: 0: 991.7. Samples: 420736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:55:47,794][00771] Avg episode reward: [(0, '12.098')] [2023-02-24 06:55:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1703936. Throughput: 0: 967.1. 
Samples: 426744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:55:52,789][00771] Avg episode reward: [(0, '13.473')] [2023-02-24 06:55:52,802][12892] Saving new best policy, reward=13.473! [2023-02-24 06:55:57,637][12906] Updated weights for policy 0, policy_version 420 (0.0029) [2023-02-24 06:55:57,787][00771] Fps is (10 sec: 3277.1, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1720320. Throughput: 0: 942.5. Samples: 428890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:55:57,790][00771] Avg episode reward: [(0, '13.684')] [2023-02-24 06:55:57,792][12892] Saving new best policy, reward=13.684! [2023-02-24 06:56:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1740800. Throughput: 0: 965.2. Samples: 434398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:56:02,789][00771] Avg episode reward: [(0, '14.375')] [2023-02-24 06:56:02,797][12892] Saving new best policy, reward=14.375! [2023-02-24 06:56:06,909][12906] Updated weights for policy 0, policy_version 430 (0.0029) [2023-02-24 06:56:07,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1765376. Throughput: 0: 997.6. Samples: 441502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:56:07,789][00771] Avg episode reward: [(0, '16.052')] [2023-02-24 06:56:07,795][12892] Saving new best policy, reward=16.052! [2023-02-24 06:56:12,788][00771] Fps is (10 sec: 4095.4, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 1781760. Throughput: 0: 988.3. Samples: 444540. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:56:12,791][00771] Avg episode reward: [(0, '15.954')] [2023-02-24 06:56:17,790][00771] Fps is (10 sec: 2866.2, 60 sec: 3822.7, 300 sec: 3804.4). Total num frames: 1794048. Throughput: 0: 935.5. Samples: 448524. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:56:17,793][00771] Avg episode reward: [(0, '16.028')] [2023-02-24 06:56:20,686][12906] Updated weights for policy 0, policy_version 440 (0.0030) [2023-02-24 06:56:22,787][00771] Fps is (10 sec: 2457.9, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 1806336. Throughput: 0: 917.4. Samples: 452198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:56:22,795][00771] Avg episode reward: [(0, '16.888')] [2023-02-24 06:56:22,812][12892] Saving new best policy, reward=16.888! [2023-02-24 06:56:27,788][00771] Fps is (10 sec: 3277.5, 60 sec: 3686.3, 300 sec: 3790.5). Total num frames: 1826816. Throughput: 0: 909.7. Samples: 454432. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 06:56:27,792][00771] Avg episode reward: [(0, '17.063')] [2023-02-24 06:56:27,799][12892] Saving new best policy, reward=17.063! [2023-02-24 06:56:31,253][12906] Updated weights for policy 0, policy_version 450 (0.0011) [2023-02-24 06:56:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1847296. Throughput: 0: 908.3. Samples: 461610. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:56:32,792][00771] Avg episode reward: [(0, '15.743')] [2023-02-24 06:56:37,787][00771] Fps is (10 sec: 3686.9, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 1863680. Throughput: 0: 888.0. Samples: 466706. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 06:56:37,789][00771] Avg episode reward: [(0, '16.359')] [2023-02-24 06:56:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3762.8). Total num frames: 1880064. Throughput: 0: 892.1. Samples: 469034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:56:42,789][00771] Avg episode reward: [(0, '16.769')] [2023-02-24 06:56:43,059][12906] Updated weights for policy 0, policy_version 460 (0.0015) [2023-02-24 06:56:47,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3790.5). 
Total num frames: 1904640. Throughput: 0: 919.1. Samples: 475756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:56:47,789][00771] Avg episode reward: [(0, '16.622')] [2023-02-24 06:56:51,395][12906] Updated weights for policy 0, policy_version 470 (0.0014) [2023-02-24 06:56:52,787][00771] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 1929216. Throughput: 0: 923.7. Samples: 483070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:56:52,791][00771] Avg episode reward: [(0, '17.615')] [2023-02-24 06:56:52,806][12892] Saving new best policy, reward=17.615! [2023-02-24 06:56:57,787][00771] Fps is (10 sec: 4095.8, 60 sec: 3754.6, 300 sec: 3804.4). Total num frames: 1945600. Throughput: 0: 907.7. Samples: 485384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:56:57,792][00771] Avg episode reward: [(0, '17.883')] [2023-02-24 06:56:57,795][12892] Saving new best policy, reward=17.883! [2023-02-24 06:57:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3776.7). Total num frames: 1961984. Throughput: 0: 923.6. Samples: 490082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:57:02,791][00771] Avg episode reward: [(0, '17.140')] [2023-02-24 06:57:03,326][12906] Updated weights for policy 0, policy_version 480 (0.0013) [2023-02-24 06:57:07,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1986560. Throughput: 0: 998.0. Samples: 497108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:57:07,795][00771] Avg episode reward: [(0, '18.205')] [2023-02-24 06:57:07,802][12892] Saving new best policy, reward=18.205! [2023-02-24 06:57:11,773][12906] Updated weights for policy 0, policy_version 490 (0.0012) [2023-02-24 06:57:12,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3754.8, 300 sec: 3832.2). Total num frames: 2007040. Throughput: 0: 1025.8. Samples: 500592. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:57:12,791][00771] Avg episode reward: [(0, '18.699')] [2023-02-24 06:57:12,808][12892] Saving new best policy, reward=18.699! [2023-02-24 06:57:17,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 2023424. Throughput: 0: 981.6. Samples: 505784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:57:17,792][00771] Avg episode reward: [(0, '18.490')] [2023-02-24 06:57:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 2039808. Throughput: 0: 975.0. Samples: 510580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:57:22,793][00771] Avg episode reward: [(0, '17.916')] [2023-02-24 06:57:22,802][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000498_2039808.pth... [2023-02-24 06:57:22,937][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000272_1114112.pth [2023-02-24 06:57:24,221][12906] Updated weights for policy 0, policy_version 500 (0.0032) [2023-02-24 06:57:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3818.3). Total num frames: 2064384. Throughput: 0: 997.6. Samples: 513926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:57:27,788][00771] Avg episode reward: [(0, '17.594')] [2023-02-24 06:57:32,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2084864. Throughput: 0: 1002.5. Samples: 520870. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:57:32,790][00771] Avg episode reward: [(0, '17.705')] [2023-02-24 06:57:33,778][12906] Updated weights for policy 0, policy_version 510 (0.0016) [2023-02-24 06:57:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 2097152. Throughput: 0: 940.9. Samples: 525410. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:57:37,791][00771] Avg episode reward: [(0, '17.608')] [2023-02-24 06:57:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.6). Total num frames: 2117632. Throughput: 0: 938.6. Samples: 527622. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:57:42,789][00771] Avg episode reward: [(0, '17.223')] [2023-02-24 06:57:45,405][12906] Updated weights for policy 0, policy_version 520 (0.0015) [2023-02-24 06:57:47,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2138112. Throughput: 0: 978.7. Samples: 534122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:57:47,793][00771] Avg episode reward: [(0, '17.882')] [2023-02-24 06:57:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2158592. Throughput: 0: 964.8. Samples: 540522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:57:52,791][00771] Avg episode reward: [(0, '17.705')] [2023-02-24 06:57:56,123][12906] Updated weights for policy 0, policy_version 530 (0.0016) [2023-02-24 06:57:57,789][00771] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3846.1). Total num frames: 2174976. Throughput: 0: 934.1. Samples: 542630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:57:57,798][00771] Avg episode reward: [(0, '17.769')] [2023-02-24 06:58:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2191360. Throughput: 0: 919.4. Samples: 547156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:58:02,793][00771] Avg episode reward: [(0, '18.006')] [2023-02-24 06:58:06,973][12906] Updated weights for policy 0, policy_version 540 (0.0024) [2023-02-24 06:58:07,787][00771] Fps is (10 sec: 3687.1, 60 sec: 3754.7, 300 sec: 3832.3). Total num frames: 2211840. Throughput: 0: 967.7. Samples: 554126. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:58:07,792][00771] Avg episode reward: [(0, '18.783')] [2023-02-24 06:58:07,823][12892] Saving new best policy, reward=18.783! [2023-02-24 06:58:12,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2236416. Throughput: 0: 972.9. Samples: 557708. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 06:58:12,791][00771] Avg episode reward: [(0, '17.738')] [2023-02-24 06:58:17,787][00771] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2248704. Throughput: 0: 930.7. Samples: 562752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:58:17,791][00771] Avg episode reward: [(0, '18.361')] [2023-02-24 06:58:17,914][12906] Updated weights for policy 0, policy_version 550 (0.0021) [2023-02-24 06:58:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2269184. Throughput: 0: 945.2. Samples: 567944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:58:22,794][00771] Avg episode reward: [(0, '17.937')] [2023-02-24 06:58:27,467][12906] Updated weights for policy 0, policy_version 560 (0.0020) [2023-02-24 06:58:27,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2293760. Throughput: 0: 976.5. Samples: 571564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:58:27,790][00771] Avg episode reward: [(0, '17.673')] [2023-02-24 06:58:32,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2314240. Throughput: 0: 990.8. Samples: 578708. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:58:32,791][00771] Avg episode reward: [(0, '17.062')] [2023-02-24 06:58:37,789][00771] Fps is (10 sec: 3685.6, 60 sec: 3891.0, 300 sec: 3859.9). Total num frames: 2330624. Throughput: 0: 948.1. Samples: 583190. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:58:37,792][00771] Avg episode reward: [(0, '17.039')] [2023-02-24 06:58:38,796][12906] Updated weights for policy 0, policy_version 570 (0.0011) [2023-02-24 06:58:42,787][00771] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2351104. Throughput: 0: 955.2. Samples: 585614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:58:42,788][00771] Avg episode reward: [(0, '17.672')] [2023-02-24 06:58:47,787][00771] Fps is (10 sec: 4097.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2371584. Throughput: 0: 1012.0. Samples: 592694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:58:47,788][00771] Avg episode reward: [(0, '17.403')] [2023-02-24 06:58:47,854][12906] Updated weights for policy 0, policy_version 580 (0.0016) [2023-02-24 06:58:52,787][00771] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2392064. Throughput: 0: 1003.6. Samples: 599290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:58:52,789][00771] Avg episode reward: [(0, '17.747')] [2023-02-24 06:58:57,788][00771] Fps is (10 sec: 3686.0, 60 sec: 3891.3, 300 sec: 3846.1). Total num frames: 2408448. Throughput: 0: 974.6. Samples: 601566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:58:57,792][00771] Avg episode reward: [(0, '20.598')] [2023-02-24 06:58:57,796][12892] Saving new best policy, reward=20.598! [2023-02-24 06:58:59,774][12906] Updated weights for policy 0, policy_version 590 (0.0038) [2023-02-24 06:59:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2428928. Throughput: 0: 970.5. Samples: 606424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:59:02,795][00771] Avg episode reward: [(0, '21.911')] [2023-02-24 06:59:02,807][12892] Saving new best policy, reward=21.911! [2023-02-24 06:59:07,787][00771] Fps is (10 sec: 4096.5, 60 sec: 3959.5, 300 sec: 3832.2). 
Total num frames: 2449408. Throughput: 0: 1011.6. Samples: 613468. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 06:59:07,789][00771] Avg episode reward: [(0, '22.084')] [2023-02-24 06:59:07,858][12892] Saving new best policy, reward=22.084! [2023-02-24 06:59:08,843][12906] Updated weights for policy 0, policy_version 600 (0.0028) [2023-02-24 06:59:12,787][00771] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2469888. Throughput: 0: 1007.3. Samples: 616894. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:59:12,790][00771] Avg episode reward: [(0, '23.448')] [2023-02-24 06:59:12,801][12892] Saving new best policy, reward=23.448! [2023-02-24 06:59:17,787][00771] Fps is (10 sec: 3686.2, 60 sec: 3959.4, 300 sec: 3846.1). Total num frames: 2486272. Throughput: 0: 944.0. Samples: 621188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:59:17,791][00771] Avg episode reward: [(0, '23.909')] [2023-02-24 06:59:17,793][12892] Saving new best policy, reward=23.909! [2023-02-24 06:59:21,320][12906] Updated weights for policy 0, policy_version 610 (0.0030) [2023-02-24 06:59:22,787][00771] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2502656. Throughput: 0: 962.5. Samples: 626502. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:59:22,789][00771] Avg episode reward: [(0, '22.252')] [2023-02-24 06:59:22,802][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000611_2502656.pth... [2023-02-24 06:59:22,961][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000387_1585152.pth [2023-02-24 06:59:27,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2527232. Throughput: 0: 985.1. Samples: 629944. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 06:59:27,790][00771] Avg episode reward: [(0, '21.883')] [2023-02-24 06:59:30,065][12906] Updated weights for policy 0, policy_version 620 (0.0014) [2023-02-24 06:59:32,792][00771] Fps is (10 sec: 4093.8, 60 sec: 3822.6, 300 sec: 3846.0). Total num frames: 2543616. Throughput: 0: 968.2. Samples: 636266. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:59:32,794][00771] Avg episode reward: [(0, '22.988')] [2023-02-24 06:59:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3832.2). Total num frames: 2560000. Throughput: 0: 914.8. Samples: 640458. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 06:59:37,789][00771] Avg episode reward: [(0, '23.798')] [2023-02-24 06:59:42,787][00771] Fps is (10 sec: 2868.6, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 2572288. Throughput: 0: 903.4. Samples: 642218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 06:59:42,789][00771] Avg episode reward: [(0, '23.614')] [2023-02-24 06:59:45,134][12906] Updated weights for policy 0, policy_version 630 (0.0021) [2023-02-24 06:59:47,788][00771] Fps is (10 sec: 2457.3, 60 sec: 3549.8, 300 sec: 3776.6). Total num frames: 2584576. Throughput: 0: 893.8. Samples: 646644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 06:59:47,791][00771] Avg episode reward: [(0, '23.913')] [2023-02-24 06:59:47,853][12892] Saving new best policy, reward=23.913! [2023-02-24 06:59:52,793][00771] Fps is (10 sec: 3274.9, 60 sec: 3549.5, 300 sec: 3790.5). Total num frames: 2605056. Throughput: 0: 859.4. Samples: 652148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 06:59:52,795][00771] Avg episode reward: [(0, '23.739')] [2023-02-24 06:59:57,510][12906] Updated weights for policy 0, policy_version 640 (0.0012) [2023-02-24 06:59:57,787][00771] Fps is (10 sec: 3686.9, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 2621440. Throughput: 0: 831.6. Samples: 654316. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 06:59:57,790][00771] Avg episode reward: [(0, '23.322')] [2023-02-24 07:00:02,791][00771] Fps is (10 sec: 3277.4, 60 sec: 3481.3, 300 sec: 3748.8). Total num frames: 2637824. Throughput: 0: 841.1. Samples: 659040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:00:02,794][00771] Avg episode reward: [(0, '22.380')] [2023-02-24 07:00:07,479][12906] Updated weights for policy 0, policy_version 650 (0.0016) [2023-02-24 07:00:07,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 2662400. Throughput: 0: 879.6. Samples: 666082. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:00:07,795][00771] Avg episode reward: [(0, '21.513')] [2023-02-24 07:00:12,787][00771] Fps is (10 sec: 4507.6, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 2682880. Throughput: 0: 878.1. Samples: 669460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:00:12,793][00771] Avg episode reward: [(0, '20.955')] [2023-02-24 07:00:17,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3762.8). Total num frames: 2695168. Throughput: 0: 839.6. Samples: 674042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:00:17,795][00771] Avg episode reward: [(0, '22.433')] [2023-02-24 07:00:19,598][12906] Updated weights for policy 0, policy_version 660 (0.0020) [2023-02-24 07:00:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3762.8). Total num frames: 2715648. Throughput: 0: 857.9. Samples: 679062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:00:22,789][00771] Avg episode reward: [(0, '21.450')] [2023-02-24 07:00:27,787][00771] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3762.8). Total num frames: 2736128. Throughput: 0: 893.7. Samples: 682436. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:00:27,794][00771] Avg episode reward: [(0, '21.420')] [2023-02-24 07:00:29,095][12906] Updated weights for policy 0, policy_version 670 (0.0038) [2023-02-24 07:00:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3550.2, 300 sec: 3776.7). Total num frames: 2756608. Throughput: 0: 943.1. Samples: 689084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:00:32,790][00771] Avg episode reward: [(0, '20.588')] [2023-02-24 07:00:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3748.9). Total num frames: 2768896. Throughput: 0: 916.7. Samples: 693394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:00:37,793][00771] Avg episode reward: [(0, '20.251')] [2023-02-24 07:00:41,471][12906] Updated weights for policy 0, policy_version 680 (0.0019) [2023-02-24 07:00:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3735.0). Total num frames: 2789376. Throughput: 0: 919.7. Samples: 695702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:00:42,792][00771] Avg episode reward: [(0, '19.096')] [2023-02-24 07:00:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3762.8). Total num frames: 2813952. Throughput: 0: 968.4. Samples: 702614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:00:47,793][00771] Avg episode reward: [(0, '19.655')] [2023-02-24 07:00:50,284][12906] Updated weights for policy 0, policy_version 690 (0.0012) [2023-02-24 07:00:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3755.1, 300 sec: 3762.8). Total num frames: 2830336. Throughput: 0: 949.5. Samples: 708808. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:00:52,789][00771] Avg episode reward: [(0, '19.475')] [2023-02-24 07:00:57,791][00771] Fps is (10 sec: 3275.4, 60 sec: 3754.4, 300 sec: 3748.8). Total num frames: 2846720. Throughput: 0: 923.2. Samples: 711008. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:00:57,797][00771] Avg episode reward: [(0, '20.006')] [2023-02-24 07:01:02,553][12906] Updated weights for policy 0, policy_version 700 (0.0029) [2023-02-24 07:01:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.2, 300 sec: 3735.0). Total num frames: 2867200. Throughput: 0: 931.2. Samples: 715944. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:01:02,789][00771] Avg episode reward: [(0, '21.191')] [2023-02-24 07:01:07,787][00771] Fps is (10 sec: 4097.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 2887680. Throughput: 0: 974.4. Samples: 722910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:01:07,792][00771] Avg episode reward: [(0, '21.748')] [2023-02-24 07:01:12,172][12906] Updated weights for policy 0, policy_version 710 (0.0020) [2023-02-24 07:01:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 2908160. Throughput: 0: 973.4. Samples: 726238. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:01:12,793][00771] Avg episode reward: [(0, '22.711')] [2023-02-24 07:01:17,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 2924544. Throughput: 0: 921.6. Samples: 730556. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:01:17,789][00771] Avg episode reward: [(0, '23.046')] [2023-02-24 07:01:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 2940928. Throughput: 0: 945.1. Samples: 735922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:01:22,792][00771] Avg episode reward: [(0, '22.689')] [2023-02-24 07:01:22,799][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000718_2940928.pth... 
[2023-02-24 07:01:22,912][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000498_2039808.pth [2023-02-24 07:01:23,981][12906] Updated weights for policy 0, policy_version 720 (0.0014) [2023-02-24 07:01:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2965504. Throughput: 0: 970.0. Samples: 739350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:01:27,792][00771] Avg episode reward: [(0, '22.859')] [2023-02-24 07:01:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 2981888. Throughput: 0: 957.2. Samples: 745686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:01:32,791][00771] Avg episode reward: [(0, '22.428')] [2023-02-24 07:01:34,601][12906] Updated weights for policy 0, policy_version 730 (0.0015) [2023-02-24 07:01:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 2998272. Throughput: 0: 916.0. Samples: 750030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:01:37,789][00771] Avg episode reward: [(0, '23.590')] [2023-02-24 07:01:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3014656. Throughput: 0: 920.8. Samples: 752440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:01:42,793][00771] Avg episode reward: [(0, '23.623')] [2023-02-24 07:01:45,654][12906] Updated weights for policy 0, policy_version 740 (0.0021) [2023-02-24 07:01:47,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3039232. Throughput: 0: 963.3. Samples: 759294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:01:47,793][00771] Avg episode reward: [(0, '23.934')] [2023-02-24 07:01:47,796][12892] Saving new best policy, reward=23.934! [2023-02-24 07:01:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3055616. Throughput: 0: 935.3. Samples: 764998. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:01:52,791][00771] Avg episode reward: [(0, '24.055')] [2023-02-24 07:01:52,801][12892] Saving new best policy, reward=24.055! [2023-02-24 07:01:57,280][12906] Updated weights for policy 0, policy_version 750 (0.0018) [2023-02-24 07:01:57,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3762.8). Total num frames: 3072000. Throughput: 0: 908.8. Samples: 767136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:01:57,794][00771] Avg episode reward: [(0, '23.534')] [2023-02-24 07:02:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3092480. Throughput: 0: 928.0. Samples: 772316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:02:02,795][00771] Avg episode reward: [(0, '23.975')] [2023-02-24 07:02:06,837][12906] Updated weights for policy 0, policy_version 760 (0.0022) [2023-02-24 07:02:07,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3117056. Throughput: 0: 970.5. Samples: 779594. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:07,788][00771] Avg episode reward: [(0, '24.775')] [2023-02-24 07:02:07,800][12892] Saving new best policy, reward=24.775! [2023-02-24 07:02:12,788][00771] Fps is (10 sec: 4095.4, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 3133440. Throughput: 0: 964.9. Samples: 782770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:12,790][00771] Avg episode reward: [(0, '25.179')] [2023-02-24 07:02:12,809][12892] Saving new best policy, reward=25.179! [2023-02-24 07:02:17,787][00771] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3748.9). Total num frames: 3145728. Throughput: 0: 918.3. Samples: 787008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:02:17,795][00771] Avg episode reward: [(0, '25.277')] [2023-02-24 07:02:17,820][12892] Saving new best policy, reward=25.277! 
[2023-02-24 07:02:19,405][12906] Updated weights for policy 0, policy_version 770 (0.0015) [2023-02-24 07:02:22,787][00771] Fps is (10 sec: 3277.2, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 3166208. Throughput: 0: 946.6. Samples: 792626. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:02:22,792][00771] Avg episode reward: [(0, '26.901')] [2023-02-24 07:02:22,801][12892] Saving new best policy, reward=26.901! [2023-02-24 07:02:27,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3190784. Throughput: 0: 968.2. Samples: 796008. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:02:27,789][00771] Avg episode reward: [(0, '27.159')] [2023-02-24 07:02:27,791][12892] Saving new best policy, reward=27.159! [2023-02-24 07:02:28,209][12906] Updated weights for policy 0, policy_version 780 (0.0028) [2023-02-24 07:02:32,790][00771] Fps is (10 sec: 4094.6, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 3207168. Throughput: 0: 949.7. Samples: 802034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:02:32,796][00771] Avg episode reward: [(0, '25.310')] [2023-02-24 07:02:37,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3754.6, 300 sec: 3748.9). Total num frames: 3223552. Throughput: 0: 918.9. Samples: 806350. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:02:37,796][00771] Avg episode reward: [(0, '24.466')] [2023-02-24 07:02:40,729][12906] Updated weights for policy 0, policy_version 790 (0.0015) [2023-02-24 07:02:42,787][00771] Fps is (10 sec: 3687.6, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3244032. Throughput: 0: 933.4. Samples: 809138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:42,789][00771] Avg episode reward: [(0, '24.298')] [2023-02-24 07:02:47,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3264512. Throughput: 0: 973.0. Samples: 816102. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:47,792][00771] Avg episode reward: [(0, '23.035')] [2023-02-24 07:02:49,923][12906] Updated weights for policy 0, policy_version 800 (0.0032) [2023-02-24 07:02:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3284992. Throughput: 0: 935.6. Samples: 821696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:52,789][00771] Avg episode reward: [(0, '22.490')] [2023-02-24 07:02:57,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3297280. Throughput: 0: 914.0. Samples: 823898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:02:57,791][00771] Avg episode reward: [(0, '23.052')] [2023-02-24 07:03:02,145][12906] Updated weights for policy 0, policy_version 810 (0.0036) [2023-02-24 07:03:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3317760. Throughput: 0: 937.6. Samples: 829202. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:03:02,789][00771] Avg episode reward: [(0, '22.987')] [2023-02-24 07:03:07,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 3338240. Throughput: 0: 947.6. Samples: 835270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:07,789][00771] Avg episode reward: [(0, '22.327')] [2023-02-24 07:03:12,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3618.2, 300 sec: 3735.0). Total num frames: 3350528. Throughput: 0: 915.5. Samples: 837208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:12,792][00771] Avg episode reward: [(0, '23.437')] [2023-02-24 07:03:15,645][12906] Updated weights for policy 0, policy_version 820 (0.0022) [2023-02-24 07:03:17,792][00771] Fps is (10 sec: 2456.3, 60 sec: 3617.8, 300 sec: 3707.2). Total num frames: 3362816. Throughput: 0: 860.0. Samples: 840734. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:17,798][00771] Avg episode reward: [(0, '22.916')] [2023-02-24 07:03:22,787][00771] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 3379200. Throughput: 0: 867.0. Samples: 845366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:22,795][00771] Avg episode reward: [(0, '22.906')] [2023-02-24 07:03:22,804][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000825_3379200.pth... [2023-02-24 07:03:22,922][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000611_2502656.pth [2023-02-24 07:03:26,886][12906] Updated weights for policy 0, policy_version 830 (0.0017) [2023-02-24 07:03:27,787][00771] Fps is (10 sec: 4098.2, 60 sec: 3549.9, 300 sec: 3693.3). Total num frames: 3403776. Throughput: 0: 880.5. Samples: 848760. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:27,789][00771] Avg episode reward: [(0, '23.032')] [2023-02-24 07:03:32,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3707.3). Total num frames: 3424256. Throughput: 0: 880.8. Samples: 855740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:32,791][00771] Avg episode reward: [(0, '23.945')] [2023-02-24 07:03:37,790][00771] Fps is (10 sec: 3275.7, 60 sec: 3549.7, 300 sec: 3679.4). Total num frames: 3436544. Throughput: 0: 855.3. Samples: 860186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:37,793][00771] Avg episode reward: [(0, '25.330')] [2023-02-24 07:03:38,119][12906] Updated weights for policy 0, policy_version 840 (0.0013) [2023-02-24 07:03:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 3457024. Throughput: 0: 855.9. Samples: 862412. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:42,788][00771] Avg episode reward: [(0, '25.253')] [2023-02-24 07:03:47,787][00771] Fps is (10 sec: 4097.4, 60 sec: 3549.9, 300 sec: 3679.5). Total num frames: 3477504. Throughput: 0: 889.8. Samples: 869244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:47,795][00771] Avg episode reward: [(0, '25.531')] [2023-02-24 07:03:47,930][12906] Updated weights for policy 0, policy_version 850 (0.0022) [2023-02-24 07:03:52,788][00771] Fps is (10 sec: 4095.6, 60 sec: 3549.8, 300 sec: 3693.3). Total num frames: 3497984. Throughput: 0: 897.6. Samples: 875664. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:03:52,791][00771] Avg episode reward: [(0, '25.432')] [2023-02-24 07:03:57,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 3514368. Throughput: 0: 903.5. Samples: 877864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:03:57,791][00771] Avg episode reward: [(0, '25.134')] [2023-02-24 07:03:59,890][12906] Updated weights for policy 0, policy_version 860 (0.0027) [2023-02-24 07:04:02,787][00771] Fps is (10 sec: 3277.1, 60 sec: 3549.9, 300 sec: 3665.6). Total num frames: 3530752. Throughput: 0: 929.6. Samples: 882562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:04:02,789][00771] Avg episode reward: [(0, '23.867')] [2023-02-24 07:04:07,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 3555328. Throughput: 0: 978.5. Samples: 889400. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:04:07,789][00771] Avg episode reward: [(0, '24.028')] [2023-02-24 07:04:09,166][12906] Updated weights for policy 0, policy_version 870 (0.0015) [2023-02-24 07:04:12,791][00771] Fps is (10 sec: 4503.6, 60 sec: 3754.4, 300 sec: 3693.3). Total num frames: 3575808. Throughput: 0: 979.2. Samples: 892826. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:04:12,793][00771] Avg episode reward: [(0, '24.401')] [2023-02-24 07:04:17,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3755.0, 300 sec: 3679.5). Total num frames: 3588096. Throughput: 0: 926.1. Samples: 897416. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:04:17,789][00771] Avg episode reward: [(0, '24.322')] [2023-02-24 07:04:21,568][12906] Updated weights for policy 0, policy_version 880 (0.0029) [2023-02-24 07:04:22,787][00771] Fps is (10 sec: 3278.1, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 3608576. Throughput: 0: 942.9. Samples: 902614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:04:22,794][00771] Avg episode reward: [(0, '24.023')] [2023-02-24 07:04:27,787][00771] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3629056. Throughput: 0: 969.1. Samples: 906022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:04:27,790][00771] Avg episode reward: [(0, '24.383')] [2023-02-24 07:04:30,500][12906] Updated weights for policy 0, policy_version 890 (0.0024) [2023-02-24 07:04:32,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3649536. Throughput: 0: 968.4. Samples: 912820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:04:32,795][00771] Avg episode reward: [(0, '24.654')] [2023-02-24 07:04:37,789][00771] Fps is (10 sec: 3685.5, 60 sec: 3823.0, 300 sec: 3707.2). Total num frames: 3665920. Throughput: 0: 921.7. Samples: 917144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:04:37,792][00771] Avg episode reward: [(0, '24.095')] [2023-02-24 07:04:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 3682304. Throughput: 0: 922.0. Samples: 919356. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 07:04:42,794][00771] Avg episode reward: [(0, '24.775')] [2023-02-24 07:04:43,046][12906] Updated weights for policy 0, policy_version 900 (0.0033) [2023-02-24 07:04:47,787][00771] Fps is (10 sec: 4097.1, 60 sec: 3822.9, 300 sec: 3735.1). Total num frames: 3706880. Throughput: 0: 973.1. Samples: 926350. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:04:47,789][00771] Avg episode reward: [(0, '25.140')] [2023-02-24 07:04:52,286][12906] Updated weights for policy 0, policy_version 910 (0.0021) [2023-02-24 07:04:52,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3748.9). Total num frames: 3727360. Throughput: 0: 959.3. Samples: 932568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:04:52,791][00771] Avg episode reward: [(0, '25.784')] [2023-02-24 07:04:57,792][00771] Fps is (10 sec: 3275.1, 60 sec: 3754.3, 300 sec: 3735.0). Total num frames: 3739648. Throughput: 0: 930.2. Samples: 934688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:04:57,794][00771] Avg episode reward: [(0, '26.430')] [2023-02-24 07:05:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 3760128. Throughput: 0: 938.8. Samples: 939660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:05:02,789][00771] Avg episode reward: [(0, '26.217')] [2023-02-24 07:05:03,960][12906] Updated weights for policy 0, policy_version 920 (0.0021) [2023-02-24 07:05:07,787][00771] Fps is (10 sec: 4508.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 3784704. Throughput: 0: 979.3. Samples: 946682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:05:07,793][00771] Avg episode reward: [(0, '26.630')] [2023-02-24 07:05:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3748.9). Total num frames: 3801088. Throughput: 0: 977.7. Samples: 950018. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:05:12,789][00771] Avg episode reward: [(0, '26.663')] [2023-02-24 07:05:14,174][12906] Updated weights for policy 0, policy_version 930 (0.0024) [2023-02-24 07:05:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3735.0). Total num frames: 3817472. Throughput: 0: 923.4. Samples: 954374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:05:17,795][00771] Avg episode reward: [(0, '26.017')] [2023-02-24 07:05:22,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3735.0). Total num frames: 3837952. Throughput: 0: 948.1. Samples: 959806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:05:22,790][00771] Avg episode reward: [(0, '25.432')] [2023-02-24 07:05:22,801][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000937_3837952.pth... [2023-02-24 07:05:22,936][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000718_2940928.pth [2023-02-24 07:05:25,293][12906] Updated weights for policy 0, policy_version 940 (0.0015) [2023-02-24 07:05:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 3858432. Throughput: 0: 974.7. Samples: 963218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:05:27,789][00771] Avg episode reward: [(0, '24.261')] [2023-02-24 07:05:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 3878912. Throughput: 0: 955.2. Samples: 969336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:05:32,793][00771] Avg episode reward: [(0, '24.429')] [2023-02-24 07:05:36,819][12906] Updated weights for policy 0, policy_version 950 (0.0034) [2023-02-24 07:05:37,789][00771] Fps is (10 sec: 3275.9, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 3891200. Throughput: 0: 914.5. Samples: 973722. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:05:37,793][00771] Avg episode reward: [(0, '23.738')] [2023-02-24 07:05:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 3911680. Throughput: 0: 924.3. Samples: 976276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:05:42,798][00771] Avg episode reward: [(0, '24.637')] [2023-02-24 07:05:46,757][12906] Updated weights for policy 0, policy_version 960 (0.0020) [2023-02-24 07:05:47,787][00771] Fps is (10 sec: 4506.9, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 3936256. Throughput: 0: 965.9. Samples: 983126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:05:47,796][00771] Avg episode reward: [(0, '24.599')] [2023-02-24 07:05:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3952640. Throughput: 0: 938.0. Samples: 988892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:05:52,794][00771] Avg episode reward: [(0, '24.617')] [2023-02-24 07:05:57,787][00771] Fps is (10 sec: 2867.1, 60 sec: 3755.0, 300 sec: 3721.1). Total num frames: 3964928. Throughput: 0: 912.4. Samples: 991076. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:05:57,789][00771] Avg episode reward: [(0, '25.156')] [2023-02-24 07:05:59,157][12906] Updated weights for policy 0, policy_version 970 (0.0016) [2023-02-24 07:06:02,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 3985408. Throughput: 0: 927.4. Samples: 996108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:06:02,793][00771] Avg episode reward: [(0, '26.779')] [2023-02-24 07:06:07,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 4009984. Throughput: 0: 956.3. Samples: 1002838. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:06:07,794][00771] Avg episode reward: [(0, '26.552')] [2023-02-24 07:06:08,477][12906] Updated weights for policy 0, policy_version 980 (0.0020) [2023-02-24 07:06:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 4026368. Throughput: 0: 947.8. Samples: 1005868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:06:12,789][00771] Avg episode reward: [(0, '25.838')] [2023-02-24 07:06:17,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 4038656. Throughput: 0: 904.2. Samples: 1010024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:06:17,788][00771] Avg episode reward: [(0, '26.403')] [2023-02-24 07:06:21,293][12906] Updated weights for policy 0, policy_version 990 (0.0016) [2023-02-24 07:06:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 4059136. Throughput: 0: 926.6. Samples: 1015418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:06:22,792][00771] Avg episode reward: [(0, '26.566')] [2023-02-24 07:06:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3721.1). Total num frames: 4079616. Throughput: 0: 945.2. Samples: 1018808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:06:27,794][00771] Avg episode reward: [(0, '26.104')] [2023-02-24 07:06:31,765][12906] Updated weights for policy 0, policy_version 1000 (0.0022) [2023-02-24 07:06:32,790][00771] Fps is (10 sec: 3685.2, 60 sec: 3617.9, 300 sec: 3721.1). Total num frames: 4096000. Throughput: 0: 913.9. Samples: 1024254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:06:32,793][00771] Avg episode reward: [(0, '25.936')] [2023-02-24 07:06:37,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.3, 300 sec: 3707.2). Total num frames: 4108288. Throughput: 0: 856.6. Samples: 1027440. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:06:37,789][00771] Avg episode reward: [(0, '25.725')] [2023-02-24 07:06:42,787][00771] Fps is (10 sec: 2048.6, 60 sec: 3413.3, 300 sec: 3651.7). Total num frames: 4116480. Throughput: 0: 844.9. Samples: 1029096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:06:42,792][00771] Avg episode reward: [(0, '25.359')] [2023-02-24 07:06:47,193][12906] Updated weights for policy 0, policy_version 1010 (0.0025) [2023-02-24 07:06:47,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3665.6). Total num frames: 4136960. Throughput: 0: 839.0. Samples: 1033864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:06:47,791][00771] Avg episode reward: [(0, '25.233')] [2023-02-24 07:06:52,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 4161536. Throughput: 0: 839.6. Samples: 1040622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:06:52,790][00771] Avg episode reward: [(0, '24.401')] [2023-02-24 07:06:57,424][12906] Updated weights for policy 0, policy_version 1020 (0.0014) [2023-02-24 07:06:57,791][00771] Fps is (10 sec: 4094.2, 60 sec: 3549.6, 300 sec: 3679.4). Total num frames: 4177920. Throughput: 0: 837.7. Samples: 1043570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:06:57,794][00771] Avg episode reward: [(0, '24.348')] [2023-02-24 07:07:02,787][00771] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 4190208. Throughput: 0: 835.5. Samples: 1047622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:07:02,795][00771] Avg episode reward: [(0, '22.914')] [2023-02-24 07:07:07,787][00771] Fps is (10 sec: 3278.2, 60 sec: 3345.1, 300 sec: 3651.7). Total num frames: 4210688. Throughput: 0: 839.6. Samples: 1053198. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:07:07,795][00771] Avg episode reward: [(0, '23.169')] [2023-02-24 07:07:09,240][12906] Updated weights for policy 0, policy_version 1030 (0.0032) [2023-02-24 07:07:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3679.5). Total num frames: 4231168. Throughput: 0: 837.0. Samples: 1056474. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:07:12,789][00771] Avg episode reward: [(0, '23.319')] [2023-02-24 07:07:17,789][00771] Fps is (10 sec: 3685.6, 60 sec: 3481.5, 300 sec: 3665.5). Total num frames: 4247552. Throughput: 0: 846.2. Samples: 1062334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:07:17,795][00771] Avg episode reward: [(0, '23.606')] [2023-02-24 07:07:20,890][12906] Updated weights for policy 0, policy_version 1040 (0.0020) [2023-02-24 07:07:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 4263936. Throughput: 0: 868.1. Samples: 1066504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:07:22,796][00771] Avg episode reward: [(0, '23.448')] [2023-02-24 07:07:22,808][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001041_4263936.pth... [2023-02-24 07:07:22,943][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000825_3379200.pth [2023-02-24 07:07:27,787][00771] Fps is (10 sec: 3687.3, 60 sec: 3413.3, 300 sec: 3651.7). Total num frames: 4284416. Throughput: 0: 891.9. Samples: 1069232. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:07:27,789][00771] Avg episode reward: [(0, '24.226')] [2023-02-24 07:07:31,317][12906] Updated weights for policy 0, policy_version 1050 (0.0020) [2023-02-24 07:07:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3481.8, 300 sec: 3665.6). Total num frames: 4304896. Throughput: 0: 930.9. Samples: 1075754. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:07:32,789][00771] Avg episode reward: [(0, '26.015')] [2023-02-24 07:07:37,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 4321280. Throughput: 0: 899.6. Samples: 1081102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:07:37,789][00771] Avg episode reward: [(0, '25.838')] [2023-02-24 07:07:42,787][00771] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 4333568. Throughput: 0: 878.6. Samples: 1083102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:07:42,791][00771] Avg episode reward: [(0, '26.101')] [2023-02-24 07:07:44,340][12906] Updated weights for policy 0, policy_version 1060 (0.0029) [2023-02-24 07:07:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 4354048. Throughput: 0: 904.0. Samples: 1088304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:07:47,789][00771] Avg episode reward: [(0, '27.559')] [2023-02-24 07:07:47,792][12892] Saving new best policy, reward=27.559! [2023-02-24 07:07:52,787][00771] Fps is (10 sec: 4505.8, 60 sec: 3618.2, 300 sec: 3665.6). Total num frames: 4378624. Throughput: 0: 926.7. Samples: 1094900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:07:52,788][00771] Avg episode reward: [(0, '25.999')] [2023-02-24 07:07:53,632][12906] Updated weights for policy 0, policy_version 1070 (0.0015) [2023-02-24 07:07:57,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3550.1, 300 sec: 3637.8). Total num frames: 4390912. Throughput: 0: 912.4. Samples: 1097532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:07:57,789][00771] Avg episode reward: [(0, '25.899')] [2023-02-24 07:08:02,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 4407296. Throughput: 0: 874.4. Samples: 1101682. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:02,788][00771] Avg episode reward: [(0, '26.150')] [2023-02-24 07:08:06,397][12906] Updated weights for policy 0, policy_version 1080 (0.0019) [2023-02-24 07:08:07,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 4427776. Throughput: 0: 912.2. Samples: 1107552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:08:07,789][00771] Avg episode reward: [(0, '26.027')] [2023-02-24 07:08:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 4448256. Throughput: 0: 926.6. Samples: 1110928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:12,789][00771] Avg episode reward: [(0, '27.086')] [2023-02-24 07:08:16,812][12906] Updated weights for policy 0, policy_version 1090 (0.0014) [2023-02-24 07:08:17,788][00771] Fps is (10 sec: 3686.0, 60 sec: 3618.2, 300 sec: 3679.4). Total num frames: 4464640. Throughput: 0: 902.5. Samples: 1116366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:08:17,794][00771] Avg episode reward: [(0, '26.496')] [2023-02-24 07:08:22,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 4476928. Throughput: 0: 875.1. Samples: 1120482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:08:22,789][00771] Avg episode reward: [(0, '26.481')] [2023-02-24 07:08:27,787][00771] Fps is (10 sec: 3686.8, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 4501504. Throughput: 0: 896.7. Samples: 1123454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:27,789][00771] Avg episode reward: [(0, '29.459')] [2023-02-24 07:08:27,794][12892] Saving new best policy, reward=29.459! [2023-02-24 07:08:28,594][12906] Updated weights for policy 0, policy_version 1100 (0.0019) [2023-02-24 07:08:32,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3679.5). Total num frames: 4521984. Throughput: 0: 926.4. Samples: 1129990. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:08:32,789][00771] Avg episode reward: [(0, '28.646')] [2023-02-24 07:08:37,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 4538368. Throughput: 0: 887.8. Samples: 1134852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:08:37,793][00771] Avg episode reward: [(0, '28.879')] [2023-02-24 07:08:40,750][12906] Updated weights for policy 0, policy_version 1110 (0.0012) [2023-02-24 07:08:42,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 4550656. Throughput: 0: 875.0. Samples: 1136908. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:08:42,794][00771] Avg episode reward: [(0, '29.051')] [2023-02-24 07:08:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 4571136. Throughput: 0: 909.9. Samples: 1142628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:47,793][00771] Avg episode reward: [(0, '28.060')] [2023-02-24 07:08:50,669][12906] Updated weights for policy 0, policy_version 1120 (0.0021) [2023-02-24 07:08:52,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 4595712. Throughput: 0: 929.2. Samples: 1149366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:52,789][00771] Avg episode reward: [(0, '28.154')] [2023-02-24 07:08:57,789][00771] Fps is (10 sec: 3685.5, 60 sec: 3618.0, 300 sec: 3651.7). Total num frames: 4608000. Throughput: 0: 902.7. Samples: 1151552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:08:57,792][00771] Avg episode reward: [(0, '27.829')] [2023-02-24 07:09:02,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 4624384. Throughput: 0: 875.0. Samples: 1155742. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:09:02,789][00771] Avg episode reward: [(0, '26.394')] [2023-02-24 07:09:03,434][12906] Updated weights for policy 0, policy_version 1130 (0.0020) [2023-02-24 07:09:07,787][00771] Fps is (10 sec: 3687.3, 60 sec: 3618.1, 300 sec: 3624.0). Total num frames: 4644864. Throughput: 0: 920.8. Samples: 1161920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:09:07,794][00771] Avg episode reward: [(0, '25.131')] [2023-02-24 07:09:12,683][12906] Updated weights for policy 0, policy_version 1140 (0.0016) [2023-02-24 07:09:12,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 4669440. Throughput: 0: 931.2. Samples: 1165360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:09:12,792][00771] Avg episode reward: [(0, '25.424')] [2023-02-24 07:09:17,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 4681728. Throughput: 0: 901.4. Samples: 1170552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:09:17,792][00771] Avg episode reward: [(0, '25.509')] [2023-02-24 07:09:22,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 4698112. Throughput: 0: 887.7. Samples: 1174800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:09:22,789][00771] Avg episode reward: [(0, '24.091')] [2023-02-24 07:09:22,799][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001147_4698112.pth... [2023-02-24 07:09:22,932][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000937_3837952.pth [2023-02-24 07:09:25,496][12906] Updated weights for policy 0, policy_version 1150 (0.0016) [2023-02-24 07:09:27,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 4718592. Throughput: 0: 914.5. Samples: 1178062. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:09:27,795][00771] Avg episode reward: [(0, '22.451')] [2023-02-24 07:09:32,789][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 4739072. Throughput: 0: 935.5. Samples: 1184724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:09:32,791][00771] Avg episode reward: [(0, '23.302')] [2023-02-24 07:09:36,259][12906] Updated weights for policy 0, policy_version 1160 (0.0017) [2023-02-24 07:09:37,790][00771] Fps is (10 sec: 3685.2, 60 sec: 3617.9, 300 sec: 3637.8). Total num frames: 4755456. Throughput: 0: 885.8. Samples: 1189230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:09:37,797][00771] Avg episode reward: [(0, '23.217')] [2023-02-24 07:09:42,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 4767744. Throughput: 0: 883.7. Samples: 1191316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:09:42,790][00771] Avg episode reward: [(0, '24.166')] [2023-02-24 07:09:47,543][12906] Updated weights for policy 0, policy_version 1170 (0.0014) [2023-02-24 07:09:47,787][00771] Fps is (10 sec: 3687.6, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 4792320. Throughput: 0: 925.4. Samples: 1197386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:09:47,794][00771] Avg episode reward: [(0, '25.217')] [2023-02-24 07:09:52,789][00771] Fps is (10 sec: 4504.5, 60 sec: 3618.0, 300 sec: 3637.8). Total num frames: 4812800. Throughput: 0: 935.3. Samples: 1204012. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:09:52,792][00771] Avg episode reward: [(0, '25.669')] [2023-02-24 07:09:57,791][00771] Fps is (10 sec: 3275.3, 60 sec: 3618.0, 300 sec: 3610.0). Total num frames: 4825088. Throughput: 0: 906.2. Samples: 1206142. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:09:57,796][00771] Avg episode reward: [(0, '26.737')] [2023-02-24 07:10:00,127][12906] Updated weights for policy 0, policy_version 1180 (0.0016) [2023-02-24 07:10:02,790][00771] Fps is (10 sec: 2457.4, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 4837376. Throughput: 0: 868.1. Samples: 1209618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:10:02,793][00771] Avg episode reward: [(0, '27.164')] [2023-02-24 07:10:07,787][00771] Fps is (10 sec: 2458.7, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 4849664. Throughput: 0: 853.8. Samples: 1213220. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:10:07,792][00771] Avg episode reward: [(0, '28.635')] [2023-02-24 07:10:12,787][00771] Fps is (10 sec: 3277.8, 60 sec: 3345.1, 300 sec: 3568.4). Total num frames: 4870144. Throughput: 0: 843.1. Samples: 1216000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:10:12,794][00771] Avg episode reward: [(0, '27.634')] [2023-02-24 07:10:13,079][12906] Updated weights for policy 0, policy_version 1190 (0.0037) [2023-02-24 07:10:17,792][00771] Fps is (10 sec: 3684.4, 60 sec: 3413.0, 300 sec: 3554.4). Total num frames: 4886528. Throughput: 0: 831.4. Samples: 1222142. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:10:17,796][00771] Avg episode reward: [(0, '27.964')] [2023-02-24 07:10:22,789][00771] Fps is (10 sec: 3276.0, 60 sec: 3413.2, 300 sec: 3540.6). Total num frames: 4902912. Throughput: 0: 825.3. Samples: 1226366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:10:22,795][00771] Avg episode reward: [(0, '27.393')] [2023-02-24 07:10:25,472][12906] Updated weights for policy 0, policy_version 1200 (0.0017) [2023-02-24 07:10:27,787][00771] Fps is (10 sec: 3688.4, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 4923392. Throughput: 0: 840.5. Samples: 1229138. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:10:27,794][00771] Avg episode reward: [(0, '27.290')] [2023-02-24 07:10:32,787][00771] Fps is (10 sec: 4506.7, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 4947968. Throughput: 0: 857.9. Samples: 1235990. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:10:32,798][00771] Avg episode reward: [(0, '26.969')] [2023-02-24 07:10:34,665][12906] Updated weights for policy 0, policy_version 1210 (0.0011) [2023-02-24 07:10:37,788][00771] Fps is (10 sec: 4095.5, 60 sec: 3481.7, 300 sec: 3568.4). Total num frames: 4964352. Throughput: 0: 836.6. Samples: 1241658. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:10:37,797][00771] Avg episode reward: [(0, '25.775')] [2023-02-24 07:10:42,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 4980736. Throughput: 0: 837.3. Samples: 1243818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:10:42,789][00771] Avg episode reward: [(0, '27.522')] [2023-02-24 07:10:46,465][12906] Updated weights for policy 0, policy_version 1220 (0.0016) [2023-02-24 07:10:47,787][00771] Fps is (10 sec: 3686.9, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 5001216. Throughput: 0: 894.1. Samples: 1249850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:10:47,796][00771] Avg episode reward: [(0, '27.233')] [2023-02-24 07:10:52,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3550.0, 300 sec: 3596.2). Total num frames: 5025792. Throughput: 0: 969.0. Samples: 1256826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:10:52,795][00771] Avg episode reward: [(0, '28.131')] [2023-02-24 07:10:56,328][12906] Updated weights for policy 0, policy_version 1230 (0.0022) [2023-02-24 07:10:57,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.4, 300 sec: 3582.3). Total num frames: 5042176. Throughput: 0: 963.7. Samples: 1259368. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:10:57,794][00771] Avg episode reward: [(0, '27.860')] [2023-02-24 07:11:02,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3686.6, 300 sec: 3554.5). Total num frames: 5058560. Throughput: 0: 926.8. Samples: 1263844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:02,795][00771] Avg episode reward: [(0, '27.987')] [2023-02-24 07:11:07,138][12906] Updated weights for policy 0, policy_version 1240 (0.0014) [2023-02-24 07:11:07,787][00771] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3568.4). Total num frames: 5079040. Throughput: 0: 982.7. Samples: 1270586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:07,793][00771] Avg episode reward: [(0, '30.139')] [2023-02-24 07:11:07,799][12892] Saving new best policy, reward=30.139! [2023-02-24 07:11:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3596.1). Total num frames: 5099520. Throughput: 0: 997.5. Samples: 1274026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:12,803][00771] Avg episode reward: [(0, '28.461')] [2023-02-24 07:11:17,787][00771] Fps is (10 sec: 3686.5, 60 sec: 3823.3, 300 sec: 3582.3). Total num frames: 5115904. Throughput: 0: 955.8. Samples: 1279000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:17,792][00771] Avg episode reward: [(0, '28.135')] [2023-02-24 07:11:18,380][12906] Updated weights for policy 0, policy_version 1250 (0.0014) [2023-02-24 07:11:22,787][00771] Fps is (10 sec: 3276.9, 60 sec: 3823.1, 300 sec: 3568.4). Total num frames: 5132288. Throughput: 0: 945.9. Samples: 1284220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:11:22,794][00771] Avg episode reward: [(0, '27.209')] [2023-02-24 07:11:22,831][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001254_5136384.pth... 
[2023-02-24 07:11:22,975][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001041_4263936.pth [2023-02-24 07:11:27,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3596.2). Total num frames: 5156864. Throughput: 0: 974.9. Samples: 1287688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:27,793][00771] Avg episode reward: [(0, '27.204')] [2023-02-24 07:11:28,140][12906] Updated weights for policy 0, policy_version 1260 (0.0022) [2023-02-24 07:11:32,790][00771] Fps is (10 sec: 4504.1, 60 sec: 3822.7, 300 sec: 3623.9). Total num frames: 5177344. Throughput: 0: 989.3. Samples: 1294370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:11:32,795][00771] Avg episode reward: [(0, '27.648')] [2023-02-24 07:11:37,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3651.7). Total num frames: 5193728. Throughput: 0: 932.6. Samples: 1298792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:11:37,791][00771] Avg episode reward: [(0, '27.460')] [2023-02-24 07:11:40,050][12906] Updated weights for policy 0, policy_version 1270 (0.0023) [2023-02-24 07:11:42,787][00771] Fps is (10 sec: 3687.6, 60 sec: 3891.2, 300 sec: 3651.7). Total num frames: 5214208. Throughput: 0: 935.5. Samples: 1301464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:42,796][00771] Avg episode reward: [(0, '28.493')] [2023-02-24 07:11:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3651.7). Total num frames: 5238784. Throughput: 0: 996.6. Samples: 1308690. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:11:47,794][00771] Avg episode reward: [(0, '30.308')] [2023-02-24 07:11:47,800][12892] Saving new best policy, reward=30.308! [2023-02-24 07:11:48,716][12906] Updated weights for policy 0, policy_version 1280 (0.0014) [2023-02-24 07:11:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3651.7). Total num frames: 5255168. Throughput: 0: 978.3. Samples: 1314608. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:11:52,797][00771] Avg episode reward: [(0, '29.121')] [2023-02-24 07:11:57,788][00771] Fps is (10 sec: 3276.2, 60 sec: 3822.8, 300 sec: 3665.6). Total num frames: 5271552. Throughput: 0: 951.6. Samples: 1316848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:11:57,795][00771] Avg episode reward: [(0, '28.838')] [2023-02-24 07:12:00,352][12906] Updated weights for policy 0, policy_version 1290 (0.0012) [2023-02-24 07:12:02,787][00771] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 5292032. Throughput: 0: 971.9. Samples: 1322738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:12:02,798][00771] Avg episode reward: [(0, '28.844')] [2023-02-24 07:12:07,787][00771] Fps is (10 sec: 4506.4, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 5316608. Throughput: 0: 1014.1. Samples: 1329856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:12:07,794][00771] Avg episode reward: [(0, '28.693')] [2023-02-24 07:12:09,378][12906] Updated weights for policy 0, policy_version 1300 (0.0011) [2023-02-24 07:12:12,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 5332992. Throughput: 0: 996.8. Samples: 1332544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:12:12,791][00771] Avg episode reward: [(0, '27.405')] [2023-02-24 07:12:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 5349376. Throughput: 0: 947.5. Samples: 1337006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:12:17,789][00771] Avg episode reward: [(0, '27.959')] [2023-02-24 07:12:21,014][12906] Updated weights for policy 0, policy_version 1310 (0.0059) [2023-02-24 07:12:22,787][00771] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 3693.3). Total num frames: 5373952. Throughput: 0: 999.5. Samples: 1343768. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:12:22,789][00771] Avg episode reward: [(0, '27.831')] [2023-02-24 07:12:27,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 5394432. Throughput: 0: 1019.1. Samples: 1347322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:12:27,792][00771] Avg episode reward: [(0, '29.333')] [2023-02-24 07:12:30,779][12906] Updated weights for policy 0, policy_version 1320 (0.0035) [2023-02-24 07:12:32,787][00771] Fps is (10 sec: 3686.5, 60 sec: 3891.4, 300 sec: 3693.3). Total num frames: 5410816. Throughput: 0: 977.1. Samples: 1352660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:12:32,792][00771] Avg episode reward: [(0, '29.429')] [2023-02-24 07:12:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 5427200. Throughput: 0: 959.3. Samples: 1357778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:12:37,794][00771] Avg episode reward: [(0, '28.909')] [2023-02-24 07:12:41,292][12906] Updated weights for policy 0, policy_version 1330 (0.0019) [2023-02-24 07:12:42,789][00771] Fps is (10 sec: 4095.0, 60 sec: 3959.3, 300 sec: 3721.1). Total num frames: 5451776. Throughput: 0: 990.3. Samples: 1361410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:12:42,792][00771] Avg episode reward: [(0, '29.554')] [2023-02-24 07:12:47,792][00771] Fps is (10 sec: 4912.6, 60 sec: 3959.1, 300 sec: 3721.0). Total num frames: 5476352. Throughput: 0: 1019.0. Samples: 1368596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:12:47,794][00771] Avg episode reward: [(0, '29.305')] [2023-02-24 07:12:51,449][12906] Updated weights for policy 0, policy_version 1340 (0.0011) [2023-02-24 07:12:52,787][00771] Fps is (10 sec: 3687.3, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 5488640. Throughput: 0: 962.9. Samples: 1373188. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:12:52,790][00771] Avg episode reward: [(0, '28.944')] [2023-02-24 07:12:57,787][00771] Fps is (10 sec: 3278.6, 60 sec: 3959.6, 300 sec: 3735.0). Total num frames: 5509120. Throughput: 0: 956.2. Samples: 1375572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:12:57,794][00771] Avg episode reward: [(0, '27.743')] [2023-02-24 07:13:01,508][12906] Updated weights for policy 0, policy_version 1350 (0.0012) [2023-02-24 07:13:02,787][00771] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 3748.9). Total num frames: 5533696. Throughput: 0: 1015.5. Samples: 1382702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:13:02,792][00771] Avg episode reward: [(0, '28.373')] [2023-02-24 07:13:07,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 5554176. Throughput: 0: 1002.6. Samples: 1388886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:07,789][00771] Avg episode reward: [(0, '27.094')] [2023-02-24 07:13:12,788][00771] Fps is (10 sec: 3276.6, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 5566464. Throughput: 0: 972.4. Samples: 1391082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:12,794][00771] Avg episode reward: [(0, '26.755')] [2023-02-24 07:13:13,136][12906] Updated weights for policy 0, policy_version 1360 (0.0015) [2023-02-24 07:13:17,787][00771] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3776.7). Total num frames: 5591040. Throughput: 0: 980.8. Samples: 1396798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:17,789][00771] Avg episode reward: [(0, '27.479')] [2023-02-24 07:13:21,875][12906] Updated weights for policy 0, policy_version 1370 (0.0016) [2023-02-24 07:13:22,787][00771] Fps is (10 sec: 4915.4, 60 sec: 4027.7, 300 sec: 3776.6). Total num frames: 5615616. Throughput: 0: 1029.3. Samples: 1404096. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:13:22,789][00771] Avg episode reward: [(0, '27.243')] [2023-02-24 07:13:22,802][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001371_5615616.pth... [2023-02-24 07:13:22,914][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001147_4698112.pth [2023-02-24 07:13:27,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 5627904. Throughput: 0: 1003.8. Samples: 1406580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:27,789][00771] Avg episode reward: [(0, '27.727')] [2023-02-24 07:13:32,789][00771] Fps is (10 sec: 2457.1, 60 sec: 3822.8, 300 sec: 3735.0). Total num frames: 5640192. Throughput: 0: 924.2. Samples: 1410180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:32,792][00771] Avg episode reward: [(0, '27.010')] [2023-02-24 07:13:36,777][12906] Updated weights for policy 0, policy_version 1380 (0.0012) [2023-02-24 07:13:37,787][00771] Fps is (10 sec: 2867.1, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 5656576. Throughput: 0: 912.3. Samples: 1414244. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:13:37,789][00771] Avg episode reward: [(0, '26.886')] [2023-02-24 07:13:42,787][00771] Fps is (10 sec: 3687.3, 60 sec: 3754.8, 300 sec: 3748.9). Total num frames: 5677056. Throughput: 0: 941.3. Samples: 1417930. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:13:42,792][00771] Avg episode reward: [(0, '27.484')] [2023-02-24 07:13:45,364][12906] Updated weights for policy 0, policy_version 1390 (0.0035) [2023-02-24 07:13:47,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3755.0, 300 sec: 3748.9). Total num frames: 5701632. Throughput: 0: 944.9. Samples: 1425222. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:13:47,794][00771] Avg episode reward: [(0, '27.085')] [2023-02-24 07:13:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 5718016. Throughput: 0: 917.4. Samples: 1430170. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:13:52,788][00771] Avg episode reward: [(0, '27.661')] [2023-02-24 07:13:56,961][12906] Updated weights for policy 0, policy_version 1400 (0.0025) [2023-02-24 07:13:57,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 5738496. Throughput: 0: 922.1. Samples: 1432576. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:13:57,789][00771] Avg episode reward: [(0, '27.053')] [2023-02-24 07:14:02,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 5763072. Throughput: 0: 952.4. Samples: 1439658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:02,792][00771] Avg episode reward: [(0, '26.744')] [2023-02-24 07:14:05,137][12906] Updated weights for policy 0, policy_version 1410 (0.0015) [2023-02-24 07:14:07,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 5783552. Throughput: 0: 940.9. Samples: 1446436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:07,792][00771] Avg episode reward: [(0, '27.424')] [2023-02-24 07:14:12,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 5799936. Throughput: 0: 935.3. Samples: 1448668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:14:12,795][00771] Avg episode reward: [(0, '28.205')] [2023-02-24 07:14:16,887][12906] Updated weights for policy 0, policy_version 1420 (0.0066) [2023-02-24 07:14:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 5816320. Throughput: 0: 976.1. Samples: 1454102. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:17,794][00771] Avg episode reward: [(0, '28.796')] [2023-02-24 07:14:22,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 5840896. Throughput: 0: 1044.0. Samples: 1461224. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:14:22,788][00771] Avg episode reward: [(0, '30.066')] [2023-02-24 07:14:25,686][12906] Updated weights for policy 0, policy_version 1430 (0.0011) [2023-02-24 07:14:27,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 5861376. Throughput: 0: 1037.7. Samples: 1464626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:27,792][00771] Avg episode reward: [(0, '30.743')] [2023-02-24 07:14:27,802][12892] Saving new best policy, reward=30.743! [2023-02-24 07:14:32,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3804.5). Total num frames: 5877760. Throughput: 0: 975.5. Samples: 1469118. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:14:32,794][00771] Avg episode reward: [(0, '31.519')] [2023-02-24 07:14:32,808][12892] Saving new best policy, reward=31.519! [2023-02-24 07:14:37,063][12906] Updated weights for policy 0, policy_version 1440 (0.0028) [2023-02-24 07:14:37,787][00771] Fps is (10 sec: 3686.5, 60 sec: 4027.8, 300 sec: 3832.2). Total num frames: 5898240. Throughput: 0: 1007.0. Samples: 1475484. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:14:37,791][00771] Avg episode reward: [(0, '29.769')] [2023-02-24 07:14:42,787][00771] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3832.2). Total num frames: 5922816. Throughput: 0: 1033.6. Samples: 1479088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:42,790][00771] Avg episode reward: [(0, '30.382')] [2023-02-24 07:14:46,850][12906] Updated weights for policy 0, policy_version 1450 (0.0020) [2023-02-24 07:14:47,787][00771] Fps is (10 sec: 4095.9, 60 sec: 3959.4, 300 sec: 3818.3). 
Total num frames: 5939200. Throughput: 0: 1006.0. Samples: 1484928. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:14:47,793][00771] Avg episode reward: [(0, '30.265')] [2023-02-24 07:14:52,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 5955584. Throughput: 0: 954.7. Samples: 1489398. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:14:52,790][00771] Avg episode reward: [(0, '28.510')] [2023-02-24 07:14:57,604][12906] Updated weights for policy 0, policy_version 1460 (0.0012) [2023-02-24 07:14:57,787][00771] Fps is (10 sec: 4096.1, 60 sec: 4027.7, 300 sec: 3873.9). Total num frames: 5980160. Throughput: 0: 982.4. Samples: 1492874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:14:57,792][00771] Avg episode reward: [(0, '28.438')] [2023-02-24 07:15:02,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 6000640. Throughput: 0: 1018.2. Samples: 1499922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:15:02,789][00771] Avg episode reward: [(0, '28.438')] [2023-02-24 07:15:07,788][00771] Fps is (10 sec: 3686.0, 60 sec: 3891.1, 300 sec: 3887.7). Total num frames: 6017024. Throughput: 0: 965.2. Samples: 1504660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:15:07,793][00771] Avg episode reward: [(0, '29.227')] [2023-02-24 07:15:08,930][12906] Updated weights for policy 0, policy_version 1470 (0.0016) [2023-02-24 07:15:12,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.8). Total num frames: 6033408. Throughput: 0: 939.1. Samples: 1506886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:15:12,795][00771] Avg episode reward: [(0, '28.425')] [2023-02-24 07:15:17,787][00771] Fps is (10 sec: 4096.5, 60 sec: 4027.7, 300 sec: 3915.5). Total num frames: 6057984. Throughput: 0: 988.0. Samples: 1513580. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:15:17,791][00771] Avg episode reward: [(0, '27.388')] [2023-02-24 07:15:18,492][12906] Updated weights for policy 0, policy_version 1480 (0.0013) [2023-02-24 07:15:22,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 6078464. Throughput: 0: 991.4. Samples: 1520096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:15:22,794][00771] Avg episode reward: [(0, '27.939')] [2023-02-24 07:15:22,802][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001484_6078464.pth... [2023-02-24 07:15:23,009][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001254_5136384.pth [2023-02-24 07:15:27,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 6090752. Throughput: 0: 958.6. Samples: 1522224. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:15:27,797][00771] Avg episode reward: [(0, '27.767')] [2023-02-24 07:15:30,666][12906] Updated weights for policy 0, policy_version 1490 (0.0018) [2023-02-24 07:15:32,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 6111232. Throughput: 0: 941.4. Samples: 1527292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:15:32,790][00771] Avg episode reward: [(0, '27.590')] [2023-02-24 07:15:37,787][00771] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 6135808. Throughput: 0: 998.5. Samples: 1534332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:15:37,788][00771] Avg episode reward: [(0, '27.399')] [2023-02-24 07:15:39,380][12906] Updated weights for policy 0, policy_version 1500 (0.0015) [2023-02-24 07:15:42,787][00771] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6152192. Throughput: 0: 993.2. Samples: 1537566. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:15:42,789][00771] Avg episode reward: [(0, '28.011')] [2023-02-24 07:15:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3873.8). Total num frames: 6168576. Throughput: 0: 932.9. Samples: 1541904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:15:47,789][00771] Avg episode reward: [(0, '27.671')] [2023-02-24 07:15:51,311][12906] Updated weights for policy 0, policy_version 1510 (0.0016) [2023-02-24 07:15:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 6189056. Throughput: 0: 962.9. Samples: 1547990. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:15:52,789][00771] Avg episode reward: [(0, '26.749')] [2023-02-24 07:15:57,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3915.5). Total num frames: 6213632. Throughput: 0: 992.6. Samples: 1551552. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:15:57,788][00771] Avg episode reward: [(0, '26.996')] [2023-02-24 07:16:00,898][12906] Updated weights for policy 0, policy_version 1520 (0.0021) [2023-02-24 07:16:02,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6230016. Throughput: 0: 973.8. Samples: 1557400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:16:02,789][00771] Avg episode reward: [(0, '28.238')] [2023-02-24 07:16:07,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3873.8). Total num frames: 6242304. Throughput: 0: 927.2. Samples: 1561822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:16:07,789][00771] Avg episode reward: [(0, '28.153')] [2023-02-24 07:16:12,056][12906] Updated weights for policy 0, policy_version 1530 (0.0032) [2023-02-24 07:16:12,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 6266880. Throughput: 0: 955.3. Samples: 1565210. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:16:12,794][00771] Avg episode reward: [(0, '27.185')] [2023-02-24 07:16:17,787][00771] Fps is (10 sec: 4915.0, 60 sec: 3891.2, 300 sec: 3929.4). Total num frames: 6291456. Throughput: 0: 997.7. Samples: 1572190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:16:17,794][00771] Avg episode reward: [(0, '28.483')] [2023-02-24 07:16:22,787][00771] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3887.7). Total num frames: 6303744. Throughput: 0: 948.7. Samples: 1577024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:16:22,789][00771] Avg episode reward: [(0, '29.810')] [2023-02-24 07:16:23,069][12906] Updated weights for policy 0, policy_version 1540 (0.0015) [2023-02-24 07:16:27,787][00771] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3887.8). Total num frames: 6324224. Throughput: 0: 925.5. Samples: 1579214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:16:27,789][00771] Avg episode reward: [(0, '29.147')] [2023-02-24 07:16:32,787][00771] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 6344704. Throughput: 0: 973.6. Samples: 1585714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:16:32,790][00771] Avg episode reward: [(0, '29.868')] [2023-02-24 07:16:33,238][12906] Updated weights for policy 0, policy_version 1550 (0.0018) [2023-02-24 07:16:37,788][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3901.6). Total num frames: 6365184. Throughput: 0: 978.5. Samples: 1592024. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:16:37,791][00771] Avg episode reward: [(0, '29.646')] [2023-02-24 07:16:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3860.0). Total num frames: 6377472. Throughput: 0: 944.9. Samples: 1594074. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:16:42,790][00771] Avg episode reward: [(0, '29.890')] [2023-02-24 07:16:45,541][12906] Updated weights for policy 0, policy_version 1560 (0.0018) [2023-02-24 07:16:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 6397952. Throughput: 0: 922.8. Samples: 1598924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:16:47,789][00771] Avg episode reward: [(0, '29.908')] [2023-02-24 07:16:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3873.9). Total num frames: 6414336. Throughput: 0: 948.6. Samples: 1604510. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:16:52,791][00771] Avg episode reward: [(0, '28.430')] [2023-02-24 07:16:57,789][00771] Fps is (10 sec: 2866.6, 60 sec: 3549.7, 300 sec: 3846.1). Total num frames: 6426624. Throughput: 0: 918.4. Samples: 1606538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:16:57,794][00771] Avg episode reward: [(0, '27.921')] [2023-02-24 07:16:58,346][12906] Updated weights for policy 0, policy_version 1570 (0.0021) [2023-02-24 07:17:02,788][00771] Fps is (10 sec: 2457.3, 60 sec: 3481.5, 300 sec: 3804.4). Total num frames: 6438912. Throughput: 0: 844.3. Samples: 1610186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:17:02,790][00771] Avg episode reward: [(0, '28.681')] [2023-02-24 07:17:07,789][00771] Fps is (10 sec: 3276.7, 60 sec: 3618.0, 300 sec: 3818.3). Total num frames: 6459392. Throughput: 0: 851.9. Samples: 1615360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:17:07,792][00771] Avg episode reward: [(0, '28.810')] [2023-02-24 07:17:10,212][12906] Updated weights for policy 0, policy_version 1580 (0.0013) [2023-02-24 07:17:12,787][00771] Fps is (10 sec: 4096.5, 60 sec: 3549.9, 300 sec: 3832.2). Total num frames: 6479872. Throughput: 0: 876.3. Samples: 1618646. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:17:12,789][00771] Avg episode reward: [(0, '28.371')] [2023-02-24 07:17:17,787][00771] Fps is (10 sec: 3687.3, 60 sec: 3413.4, 300 sec: 3804.4). Total num frames: 6496256. Throughput: 0: 864.8. Samples: 1624628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:17:17,792][00771] Avg episode reward: [(0, '27.389')] [2023-02-24 07:17:22,137][12906] Updated weights for policy 0, policy_version 1590 (0.0012) [2023-02-24 07:17:22,789][00771] Fps is (10 sec: 3276.0, 60 sec: 3481.5, 300 sec: 3790.5). Total num frames: 6512640. Throughput: 0: 821.0. Samples: 1628970. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:17:22,794][00771] Avg episode reward: [(0, '27.783')] [2023-02-24 07:17:22,810][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001590_6512640.pth... [2023-02-24 07:17:22,931][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001371_5615616.pth [2023-02-24 07:17:27,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3804.4). Total num frames: 6533120. Throughput: 0: 836.8. Samples: 1631732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:17:27,792][00771] Avg episode reward: [(0, '29.377')] [2023-02-24 07:17:31,474][12906] Updated weights for policy 0, policy_version 1600 (0.0018) [2023-02-24 07:17:32,787][00771] Fps is (10 sec: 4506.7, 60 sec: 3549.9, 300 sec: 3832.2). Total num frames: 6557696. Throughput: 0: 885.6. Samples: 1638776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:17:32,789][00771] Avg episode reward: [(0, '28.320')] [2023-02-24 07:17:37,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3804.5). Total num frames: 6574080. Throughput: 0: 883.6. Samples: 1644274. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:17:37,797][00771] Avg episode reward: [(0, '27.825')] [2023-02-24 07:17:42,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 6590464. Throughput: 0: 887.3. Samples: 1646464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:17:42,794][00771] Avg episode reward: [(0, '28.080')] [2023-02-24 07:17:43,584][12906] Updated weights for policy 0, policy_version 1610 (0.0036) [2023-02-24 07:17:47,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3804.4). Total num frames: 6610944. Throughput: 0: 938.4. Samples: 1652414. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:17:47,789][00771] Avg episode reward: [(0, '28.271')] [2023-02-24 07:17:52,507][12906] Updated weights for policy 0, policy_version 1620 (0.0016) [2023-02-24 07:17:52,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 6635520. Throughput: 0: 977.7. Samples: 1659352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:17:52,792][00771] Avg episode reward: [(0, '29.715')] [2023-02-24 07:17:57,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3776.7). Total num frames: 6647808. Throughput: 0: 958.2. Samples: 1661764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:17:57,792][00771] Avg episode reward: [(0, '30.763')] [2023-02-24 07:18:02,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 6664192. Throughput: 0: 921.0. Samples: 1666072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:02,796][00771] Avg episode reward: [(0, '29.327')] [2023-02-24 07:18:04,781][12906] Updated weights for policy 0, policy_version 1630 (0.0023) [2023-02-24 07:18:07,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3804.4). Total num frames: 6688768. Throughput: 0: 970.5. Samples: 1672640. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:18:07,789][00771] Avg episode reward: [(0, '30.398')] [2023-02-24 07:18:12,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 6709248. Throughput: 0: 982.9. Samples: 1675962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:12,794][00771] Avg episode reward: [(0, '28.876')] [2023-02-24 07:18:15,011][12906] Updated weights for policy 0, policy_version 1640 (0.0030) [2023-02-24 07:18:17,794][00771] Fps is (10 sec: 3274.4, 60 sec: 3754.2, 300 sec: 3748.8). Total num frames: 6721536. Throughput: 0: 932.2. Samples: 1680730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:18:17,799][00771] Avg episode reward: [(0, '27.295')] [2023-02-24 07:18:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3776.7). Total num frames: 6742016. Throughput: 0: 919.5. Samples: 1685650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:18:22,789][00771] Avg episode reward: [(0, '27.488')] [2023-02-24 07:18:26,573][12906] Updated weights for policy 0, policy_version 1650 (0.0018) [2023-02-24 07:18:27,787][00771] Fps is (10 sec: 4099.0, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 6762496. Throughput: 0: 945.8. Samples: 1689026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:27,794][00771] Avg episode reward: [(0, '25.245')] [2023-02-24 07:18:32,791][00771] Fps is (10 sec: 3684.8, 60 sec: 3686.1, 300 sec: 3804.4). Total num frames: 6778880. Throughput: 0: 955.1. Samples: 1695396. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:18:32,793][00771] Avg episode reward: [(0, '26.092')] [2023-02-24 07:18:37,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 6795264. Throughput: 0: 895.4. Samples: 1699644. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:37,789][00771] Avg episode reward: [(0, '26.133')] [2023-02-24 07:18:38,491][12906] Updated weights for policy 0, policy_version 1660 (0.0026) [2023-02-24 07:18:42,787][00771] Fps is (10 sec: 3688.0, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 6815744. Throughput: 0: 899.3. Samples: 1702234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:18:42,797][00771] Avg episode reward: [(0, '25.128')] [2023-02-24 07:18:47,659][12906] Updated weights for policy 0, policy_version 1670 (0.0012) [2023-02-24 07:18:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 6840320. Throughput: 0: 959.0. Samples: 1709228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:47,789][00771] Avg episode reward: [(0, '27.825')] [2023-02-24 07:18:52,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 6856704. Throughput: 0: 942.5. Samples: 1715052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:18:52,790][00771] Avg episode reward: [(0, '27.516')] [2023-02-24 07:18:57,788][00771] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3762.7). Total num frames: 6873088. Throughput: 0: 916.9. Samples: 1717222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:18:57,792][00771] Avg episode reward: [(0, '28.110')] [2023-02-24 07:18:59,590][12906] Updated weights for policy 0, policy_version 1680 (0.0036) [2023-02-24 07:19:02,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 6893568. Throughput: 0: 938.4. Samples: 1722950. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:02,793][00771] Avg episode reward: [(0, '26.792')] [2023-02-24 07:19:07,787][00771] Fps is (10 sec: 4506.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 6918144. Throughput: 0: 986.4. Samples: 1730038. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:07,794][00771] Avg episode reward: [(0, '27.740')] [2023-02-24 07:19:08,422][12906] Updated weights for policy 0, policy_version 1690 (0.0011) [2023-02-24 07:19:12,788][00771] Fps is (10 sec: 4095.3, 60 sec: 3754.6, 300 sec: 3790.5). Total num frames: 6934528. Throughput: 0: 968.3. Samples: 1732600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:12,790][00771] Avg episode reward: [(0, '26.615')] [2023-02-24 07:19:17,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3823.4, 300 sec: 3762.8). Total num frames: 6950912. Throughput: 0: 922.8. Samples: 1736920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:19:17,789][00771] Avg episode reward: [(0, '27.060')] [2023-02-24 07:19:20,555][12906] Updated weights for policy 0, policy_version 1700 (0.0033) [2023-02-24 07:19:22,787][00771] Fps is (10 sec: 3687.0, 60 sec: 3822.9, 300 sec: 3762.8). Total num frames: 6971392. Throughput: 0: 978.0. Samples: 1743654. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:19:22,797][00771] Avg episode reward: [(0, '26.175')] [2023-02-24 07:19:22,810][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001702_6971392.pth... [2023-02-24 07:19:22,927][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001484_6078464.pth [2023-02-24 07:19:27,792][00771] Fps is (10 sec: 4503.2, 60 sec: 3890.9, 300 sec: 3790.5). Total num frames: 6995968. Throughput: 0: 998.7. Samples: 1747182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:27,795][00771] Avg episode reward: [(0, '26.968')] [2023-02-24 07:19:30,676][12906] Updated weights for policy 0, policy_version 1710 (0.0027) [2023-02-24 07:19:32,790][00771] Fps is (10 sec: 3685.2, 60 sec: 3823.0, 300 sec: 3762.7). Total num frames: 7008256. Throughput: 0: 956.2. Samples: 1752262. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:32,793][00771] Avg episode reward: [(0, '26.707')] [2023-02-24 07:19:37,787][00771] Fps is (10 sec: 2868.7, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 7024640. Throughput: 0: 938.8. Samples: 1757298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:19:37,796][00771] Avg episode reward: [(0, '28.402')] [2023-02-24 07:19:41,285][12906] Updated weights for policy 0, policy_version 1720 (0.0016) [2023-02-24 07:19:42,787][00771] Fps is (10 sec: 4097.4, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 7049216. Throughput: 0: 968.6. Samples: 1760806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:42,794][00771] Avg episode reward: [(0, '29.337')] [2023-02-24 07:19:47,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 7069696. Throughput: 0: 995.5. Samples: 1767746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:19:47,790][00771] Avg episode reward: [(0, '28.321')] [2023-02-24 07:19:52,314][12906] Updated weights for policy 0, policy_version 1730 (0.0011) [2023-02-24 07:19:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3748.9). Total num frames: 7086080. Throughput: 0: 934.8. Samples: 1772106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:19:52,790][00771] Avg episode reward: [(0, '28.259')] [2023-02-24 07:19:57,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3748.9). Total num frames: 7106560. Throughput: 0: 935.2. Samples: 1774684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:19:57,794][00771] Avg episode reward: [(0, '28.771')] [2023-02-24 07:20:01,944][12906] Updated weights for policy 0, policy_version 1740 (0.0013) [2023-02-24 07:20:02,787][00771] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 7131136. Throughput: 0: 995.3. Samples: 1781708. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:20:02,790][00771] Avg episode reward: [(0, '28.365')] [2023-02-24 07:20:07,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 7147520. Throughput: 0: 977.6. Samples: 1787648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:20:07,789][00771] Avg episode reward: [(0, '28.142')] [2023-02-24 07:20:12,787][00771] Fps is (10 sec: 2867.0, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 7159808. Throughput: 0: 947.7. Samples: 1789824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:20:12,796][00771] Avg episode reward: [(0, '28.216')] [2023-02-24 07:20:13,958][12906] Updated weights for policy 0, policy_version 1750 (0.0020) [2023-02-24 07:20:17,788][00771] Fps is (10 sec: 3276.4, 60 sec: 3822.8, 300 sec: 3735.0). Total num frames: 7180288. Throughput: 0: 952.9. Samples: 1795140. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:20:17,792][00771] Avg episode reward: [(0, '28.700')] [2023-02-24 07:20:22,787][00771] Fps is (10 sec: 3686.7, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 7196672. Throughput: 0: 939.2. Samples: 1799564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:20:22,793][00771] Avg episode reward: [(0, '30.058')] [2023-02-24 07:20:27,045][12906] Updated weights for policy 0, policy_version 1760 (0.0025) [2023-02-24 07:20:27,787][00771] Fps is (10 sec: 2867.6, 60 sec: 3550.2, 300 sec: 3721.1). Total num frames: 7208960. Throughput: 0: 905.7. Samples: 1801564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:20:27,792][00771] Avg episode reward: [(0, '30.544')] [2023-02-24 07:20:32,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3618.3, 300 sec: 3693.3). Total num frames: 7225344. Throughput: 0: 846.8. Samples: 1805854. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:20:32,789][00771] Avg episode reward: [(0, '28.675')] [2023-02-24 07:20:37,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 7245824. Throughput: 0: 892.8. Samples: 1812284. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 07:20:37,792][00771] Avg episode reward: [(0, '29.495')] [2023-02-24 07:20:38,219][12906] Updated weights for policy 0, policy_version 1770 (0.0024) [2023-02-24 07:20:42,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3735.0). Total num frames: 7270400. Throughput: 0: 914.8. Samples: 1815848. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-24 07:20:42,789][00771] Avg episode reward: [(0, '30.113')] [2023-02-24 07:20:47,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 7282688. Throughput: 0: 881.0. Samples: 1821352. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-02-24 07:20:47,789][00771] Avg episode reward: [(0, '29.442')] [2023-02-24 07:20:49,207][12906] Updated weights for policy 0, policy_version 1780 (0.0021) [2023-02-24 07:20:52,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 7303168. Throughput: 0: 856.8. Samples: 1826206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:20:52,788][00771] Avg episode reward: [(0, '29.405')] [2023-02-24 07:20:57,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 7323648. Throughput: 0: 885.7. Samples: 1829682. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:20:57,789][00771] Avg episode reward: [(0, '30.013')] [2023-02-24 07:20:58,741][12906] Updated weights for policy 0, policy_version 1790 (0.0016) [2023-02-24 07:21:02,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3748.9). Total num frames: 7348224. Throughput: 0: 923.0. Samples: 1836676. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:21:02,790][00771] Avg episode reward: [(0, '31.404')] [2023-02-24 07:21:07,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3707.2). Total num frames: 7360512. Throughput: 0: 927.3. Samples: 1841294. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 07:21:07,788][00771] Avg episode reward: [(0, '31.381')] [2023-02-24 07:21:10,600][12906] Updated weights for policy 0, policy_version 1800 (0.0046) [2023-02-24 07:21:12,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 7380992. Throughput: 0: 932.5. Samples: 1843526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:21:12,794][00771] Avg episode reward: [(0, '30.718')] [2023-02-24 07:21:17,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3735.0). Total num frames: 7405568. Throughput: 0: 993.1. Samples: 1850542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:21:17,789][00771] Avg episode reward: [(0, '30.005')] [2023-02-24 07:21:19,299][12906] Updated weights for policy 0, policy_version 1810 (0.0014) [2023-02-24 07:21:22,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 7426048. Throughput: 0: 988.2. Samples: 1856752. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-24 07:21:22,788][00771] Avg episode reward: [(0, '30.367')] [2023-02-24 07:21:22,800][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001813_7426048.pth... [2023-02-24 07:21:22,938][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001590_6512640.pth [2023-02-24 07:21:27,788][00771] Fps is (10 sec: 3276.3, 60 sec: 3822.8, 300 sec: 3707.2). Total num frames: 7438336. Throughput: 0: 956.9. Samples: 1858908. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:21:27,795][00771] Avg episode reward: [(0, '29.787')] [2023-02-24 07:21:31,348][12906] Updated weights for policy 0, policy_version 1820 (0.0024) [2023-02-24 07:21:32,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 7458816. Throughput: 0: 953.3. Samples: 1864252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:21:32,795][00771] Avg episode reward: [(0, '27.622')] [2023-02-24 07:21:37,787][00771] Fps is (10 sec: 4506.3, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 7483392. Throughput: 0: 1002.4. Samples: 1871316. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:21:37,792][00771] Avg episode reward: [(0, '27.375')] [2023-02-24 07:21:40,613][12906] Updated weights for policy 0, policy_version 1830 (0.0011) [2023-02-24 07:21:42,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3735.0). Total num frames: 7499776. Throughput: 0: 992.4. Samples: 1874340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:21:42,789][00771] Avg episode reward: [(0, '27.945')] [2023-02-24 07:21:47,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 7516160. Throughput: 0: 933.5. Samples: 1878682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:21:47,793][00771] Avg episode reward: [(0, '27.344')] [2023-02-24 07:21:52,193][12906] Updated weights for policy 0, policy_version 1840 (0.0017) [2023-02-24 07:21:52,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 7536640. Throughput: 0: 972.6. Samples: 1885060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:21:52,789][00771] Avg episode reward: [(0, '28.455')] [2023-02-24 07:21:57,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 7561216. Throughput: 0: 1000.9. Samples: 1888566. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:21:57,791][00771] Avg episode reward: [(0, '27.931')] [2023-02-24 07:22:02,515][12906] Updated weights for policy 0, policy_version 1850 (0.0013) [2023-02-24 07:22:02,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.6). Total num frames: 7577600. Throughput: 0: 965.7. Samples: 1893998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:02,789][00771] Avg episode reward: [(0, '28.166')] [2023-02-24 07:22:07,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 7593984. Throughput: 0: 936.5. Samples: 1898894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:07,795][00771] Avg episode reward: [(0, '28.767')] [2023-02-24 07:22:12,729][12906] Updated weights for policy 0, policy_version 1860 (0.0012) [2023-02-24 07:22:12,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 7618560. Throughput: 0: 966.8. Samples: 1902412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:12,794][00771] Avg episode reward: [(0, '27.818')] [2023-02-24 07:22:17,789][00771] Fps is (10 sec: 4505.2, 60 sec: 3891.1, 300 sec: 3818.3). Total num frames: 7639040. Throughput: 0: 998.5. Samples: 1909184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:22:17,794][00771] Avg episode reward: [(0, '27.532')] [2023-02-24 07:22:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3790.5). Total num frames: 7651328. Throughput: 0: 942.9. Samples: 1913746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:22,792][00771] Avg episode reward: [(0, '28.103')] [2023-02-24 07:22:24,752][12906] Updated weights for policy 0, policy_version 1870 (0.0028) [2023-02-24 07:22:27,787][00771] Fps is (10 sec: 3277.0, 60 sec: 3891.3, 300 sec: 3776.6). Total num frames: 7671808. Throughput: 0: 925.2. Samples: 1915972. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:22:27,789][00771] Avg episode reward: [(0, '28.875')] [2023-02-24 07:22:32,787][00771] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 7692288. Throughput: 0: 981.7. Samples: 1922858. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-24 07:22:32,789][00771] Avg episode reward: [(0, '29.400')] [2023-02-24 07:22:33,889][12906] Updated weights for policy 0, policy_version 1880 (0.0014) [2023-02-24 07:22:37,787][00771] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 7712768. Throughput: 0: 977.6. Samples: 1929054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:22:37,790][00771] Avg episode reward: [(0, '28.891')] [2023-02-24 07:22:42,787][00771] Fps is (10 sec: 3686.5, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 7729152. Throughput: 0: 949.6. Samples: 1931298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:42,795][00771] Avg episode reward: [(0, '29.150')] [2023-02-24 07:22:45,832][12906] Updated weights for policy 0, policy_version 1890 (0.0011) [2023-02-24 07:22:47,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 7749632. Throughput: 0: 947.4. Samples: 1936630. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:22:47,798][00771] Avg episode reward: [(0, '28.955')] [2023-02-24 07:22:52,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 7774208. Throughput: 0: 997.3. Samples: 1943772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-24 07:22:52,792][00771] Avg episode reward: [(0, '28.744')] [2023-02-24 07:22:54,434][12906] Updated weights for policy 0, policy_version 1900 (0.0014) [2023-02-24 07:22:57,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 7790592. Throughput: 0: 987.5. Samples: 1946848. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:22:57,789][00771] Avg episode reward: [(0, '28.159')] [2023-02-24 07:23:02,787][00771] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3776.7). Total num frames: 7802880. Throughput: 0: 932.0. Samples: 1951124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:23:02,794][00771] Avg episode reward: [(0, '28.028')] [2023-02-24 07:23:06,575][12906] Updated weights for policy 0, policy_version 1910 (0.0023) [2023-02-24 07:23:07,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 7827456. Throughput: 0: 967.2. Samples: 1957268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:23:07,793][00771] Avg episode reward: [(0, '28.964')] [2023-02-24 07:23:12,787][00771] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 7847936. Throughput: 0: 991.2. Samples: 1960574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:23:12,788][00771] Avg episode reward: [(0, '28.681')] [2023-02-24 07:23:17,415][12906] Updated weights for policy 0, policy_version 1920 (0.0022) [2023-02-24 07:23:17,788][00771] Fps is (10 sec: 3686.0, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 7864320. Throughput: 0: 958.7. Samples: 1965998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:23:17,797][00771] Avg episode reward: [(0, '29.853')] [2023-02-24 07:23:22,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 7880704. Throughput: 0: 917.9. Samples: 1970358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:23:22,793][00771] Avg episode reward: [(0, '31.307')] [2023-02-24 07:23:22,804][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001924_7880704.pth... 
[2023-02-24 07:23:22,927][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001702_6971392.pth [2023-02-24 07:23:27,787][00771] Fps is (10 sec: 3686.8, 60 sec: 3822.9, 300 sec: 3804.5). Total num frames: 7901184. Throughput: 0: 941.2. Samples: 1973650. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-02-24 07:23:27,788][00771] Avg episode reward: [(0, '31.671')] [2023-02-24 07:23:27,799][12892] Saving new best policy, reward=31.671! [2023-02-24 07:23:28,466][12906] Updated weights for policy 0, policy_version 1930 (0.0013) [2023-02-24 07:23:32,787][00771] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 7921664. Throughput: 0: 969.5. Samples: 1980256. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-24 07:23:32,789][00771] Avg episode reward: [(0, '32.080')] [2023-02-24 07:23:32,798][12892] Saving new best policy, reward=32.080! [2023-02-24 07:23:37,787][00771] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3804.4). Total num frames: 7938048. Throughput: 0: 914.6. Samples: 1984928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:23:37,792][00771] Avg episode reward: [(0, '32.212')] [2023-02-24 07:23:37,798][12892] Saving new best policy, reward=32.212! [2023-02-24 07:23:40,406][12906] Updated weights for policy 0, policy_version 1940 (0.0016) [2023-02-24 07:23:42,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 7954432. Throughput: 0: 895.3. Samples: 1987136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-02-24 07:23:42,792][00771] Avg episode reward: [(0, '31.185')] [2023-02-24 07:23:47,787][00771] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3776.6). Total num frames: 7970816. Throughput: 0: 927.9. Samples: 1992878. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:23:47,791][00771] Avg episode reward: [(0, '30.457')] [2023-02-24 07:23:52,510][12906] Updated weights for policy 0, policy_version 1950 (0.0028) [2023-02-24 07:23:52,787][00771] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 7987200. Throughput: 0: 890.2. Samples: 1997326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-24 07:23:52,791][00771] Avg episode reward: [(0, '29.879')] [2023-02-24 07:23:57,787][00771] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3748.9). Total num frames: 7999488. Throughput: 0: 864.1. Samples: 1999460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-24 07:23:57,792][00771] Avg episode reward: [(0, '28.805')] [2023-02-24 07:23:59,386][00771] Component Batcher_0 stopped! [2023-02-24 07:23:59,390][12892] Stopping Batcher_0... [2023-02-24 07:23:59,391][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-24 07:23:59,400][12892] Loop batcher_evt_loop terminating... [2023-02-24 07:23:59,451][12906] Weights refcount: 2 0 [2023-02-24 07:23:59,466][00771] Component InferenceWorker_p0-w0 stopped! [2023-02-24 07:23:59,472][12906] Stopping InferenceWorker_p0-w0... [2023-02-24 07:23:59,472][12906] Loop inference_proc0-0_evt_loop terminating... [2023-02-24 07:23:59,478][00771] Component RolloutWorker_w2 stopped! [2023-02-24 07:23:59,480][12910] Stopping RolloutWorker_w2... [2023-02-24 07:23:59,491][12910] Loop rollout_proc2_evt_loop terminating... [2023-02-24 07:23:59,504][00771] Component RolloutWorker_w4 stopped! [2023-02-24 07:23:59,507][12911] Stopping RolloutWorker_w4... [2023-02-24 07:23:59,511][00771] Component RolloutWorker_w0 stopped! [2023-02-24 07:23:59,513][12908] Stopping RolloutWorker_w0... [2023-02-24 07:23:59,513][12908] Loop rollout_proc0_evt_loop terminating... [2023-02-24 07:23:59,520][12914] Stopping RolloutWorker_w7... 
[2023-02-24 07:23:59,520][12914] Loop rollout_proc7_evt_loop terminating... [2023-02-24 07:23:59,520][00771] Component RolloutWorker_w7 stopped! [2023-02-24 07:23:59,523][12911] Loop rollout_proc4_evt_loop terminating... [2023-02-24 07:23:59,532][12912] Stopping RolloutWorker_w5... [2023-02-24 07:23:59,534][00771] Component RolloutWorker_w5 stopped! [2023-02-24 07:23:59,548][12907] Stopping RolloutWorker_w1... [2023-02-24 07:23:59,548][00771] Component RolloutWorker_w1 stopped! [2023-02-24 07:23:59,553][00771] Component RolloutWorker_w6 stopped! [2023-02-24 07:23:59,555][12913] Stopping RolloutWorker_w6... [2023-02-24 07:23:59,555][12913] Loop rollout_proc6_evt_loop terminating... [2023-02-24 07:23:59,561][12912] Loop rollout_proc5_evt_loop terminating... [2023-02-24 07:23:59,571][12909] Stopping RolloutWorker_w3... [2023-02-24 07:23:59,572][12909] Loop rollout_proc3_evt_loop terminating... [2023-02-24 07:23:59,572][00771] Component RolloutWorker_w3 stopped! [2023-02-24 07:23:59,548][12907] Loop rollout_proc1_evt_loop terminating... [2023-02-24 07:23:59,621][12892] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001813_7426048.pth [2023-02-24 07:23:59,644][12892] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-24 07:23:59,927][00771] Component LearnerWorker_p0 stopped! [2023-02-24 07:23:59,936][00771] Waiting for process learner_proc0 to stop... [2023-02-24 07:23:59,940][12892] Stopping LearnerWorker_p0... [2023-02-24 07:23:59,940][12892] Loop learner_proc0_evt_loop terminating... [2023-02-24 07:24:01,677][00771] Waiting for process inference_proc0-0 to join... [2023-02-24 07:24:02,071][00771] Waiting for process rollout_proc0 to join... [2023-02-24 07:24:02,475][00771] Waiting for process rollout_proc1 to join... [2023-02-24 07:24:02,477][00771] Waiting for process rollout_proc2 to join... [2023-02-24 07:24:02,490][00771] Waiting for process rollout_proc3 to join... 
[2023-02-24 07:24:02,491][00771] Waiting for process rollout_proc4 to join... [2023-02-24 07:24:02,492][00771] Waiting for process rollout_proc5 to join... [2023-02-24 07:24:02,493][00771] Waiting for process rollout_proc6 to join... [2023-02-24 07:24:02,494][00771] Waiting for process rollout_proc7 to join... [2023-02-24 07:24:02,496][00771] Batcher 0 profile tree view: batching: 51.4234, releasing_batches: 0.0545 [2023-02-24 07:24:02,498][00771] InferenceWorker_p0-w0 profile tree view: wait_policy: 0.0000 wait_policy_total: 1080.2081 update_model: 15.1741 weight_update: 0.0033 one_step: 0.0178 handle_policy_step: 993.1754 deserialize: 28.9346, stack: 6.0801, obs_to_device_normalize: 222.7689, forward: 473.1568, send_messages: 50.7500 prepare_outputs: 161.7248 to_cpu: 100.7197 [2023-02-24 07:24:02,499][00771] Learner 0 profile tree view: misc: 0.0110, prepare_batch: 29.1224 train: 148.9686 epoch_init: 0.0177, minibatch_init: 0.0176, losses_postprocess: 1.1845, kl_divergence: 1.0895, after_optimizer: 65.8050 calculate_losses: 52.7925 losses_init: 0.0079, forward_head: 3.2178, bptt_initial: 35.0260, tail: 1.9852, advantages_returns: 0.6668, losses: 7.0349 bptt: 4.2661 bptt_forward_core: 4.0976 update: 26.8045 clip: 2.8206 [2023-02-24 07:24:02,504][00771] RolloutWorker_w0 profile tree view: wait_for_trajectories: 0.7599, enqueue_policy_requests: 291.8986, env_step: 1640.3976, overhead: 40.9853, complete_rollouts: 13.6656 save_policy_outputs: 38.7930 split_output_tensors: 18.8159 [2023-02-24 07:24:02,505][00771] RolloutWorker_w7 profile tree view: wait_for_trajectories: 0.6516, enqueue_policy_requests: 293.1380, env_step: 1637.1701, overhead: 40.9755, complete_rollouts: 13.7541 save_policy_outputs: 39.4261 split_output_tensors: 19.2718 [2023-02-24 07:24:02,506][00771] Loop Runner_EvtLoop terminating... 
[2023-02-24 07:24:02,508][00771] Runner profile tree view: main_loop: 2195.1283 [2023-02-24 07:24:02,509][00771] Collected {0: 8007680}, FPS: 3647.9 [2023-02-24 07:24:02,649][00771] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-24 07:24:02,651][00771] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-24 07:24:02,653][00771] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-24 07:24:02,656][00771] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-24 07:24:02,658][00771] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 07:24:02,662][00771] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-24 07:24:02,664][00771] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 07:24:02,666][00771] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-24 07:24:02,667][00771] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-24 07:24:02,669][00771] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-24 07:24:02,670][00771] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-24 07:24:02,672][00771] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-24 07:24:02,673][00771] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-24 07:24:02,674][00771] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-24 07:24:02,675][00771] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-24 07:24:02,710][00771] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-24 07:24:02,714][00771] RunningMeanStd input shape: (3, 72, 128) [2023-02-24 07:24:02,717][00771] RunningMeanStd input shape: (1,) [2023-02-24 07:24:02,739][00771] ConvEncoder: input_channels=3 [2023-02-24 07:24:03,429][00771] Conv encoder output size: 512 [2023-02-24 07:24:03,431][00771] Policy head output size: 512 [2023-02-24 07:24:05,787][00771] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-24 07:24:07,027][00771] Num frames 100... [2023-02-24 07:24:07,138][00771] Num frames 200... [2023-02-24 07:24:07,252][00771] Num frames 300... [2023-02-24 07:24:07,367][00771] Num frames 400... [2023-02-24 07:24:07,478][00771] Num frames 500... [2023-02-24 07:24:07,594][00771] Num frames 600... [2023-02-24 07:24:07,704][00771] Num frames 700... [2023-02-24 07:24:07,821][00771] Num frames 800... [2023-02-24 07:24:07,939][00771] Num frames 900... [2023-02-24 07:24:08,052][00771] Num frames 1000... [2023-02-24 07:24:08,166][00771] Num frames 1100... [2023-02-24 07:24:08,278][00771] Num frames 1200... [2023-02-24 07:24:08,395][00771] Num frames 1300... [2023-02-24 07:24:08,506][00771] Num frames 1400... [2023-02-24 07:24:08,617][00771] Num frames 1500... [2023-02-24 07:24:08,729][00771] Num frames 1600... [2023-02-24 07:24:08,848][00771] Num frames 1700... [2023-02-24 07:24:08,959][00771] Num frames 1800... [2023-02-24 07:24:09,076][00771] Num frames 1900... [2023-02-24 07:24:09,191][00771] Num frames 2000... [2023-02-24 07:24:09,314][00771] Num frames 2100... [2023-02-24 07:24:09,366][00771] Avg episode rewards: #0: 57.999, true rewards: #0: 21.000 [2023-02-24 07:24:09,369][00771] Avg episode reward: 57.999, avg true_objective: 21.000 [2023-02-24 07:24:09,480][00771] Num frames 2200... 
[2023-02-24 07:24:09,595][00771] Num frames 2300... [2023-02-24 07:24:09,702][00771] Num frames 2400... [2023-02-24 07:24:09,812][00771] Num frames 2500... [2023-02-24 07:24:09,928][00771] Num frames 2600... [2023-02-24 07:24:10,044][00771] Num frames 2700... [2023-02-24 07:24:10,194][00771] Num frames 2800... [2023-02-24 07:24:10,357][00771] Num frames 2900... [2023-02-24 07:24:10,511][00771] Num frames 3000... [2023-02-24 07:24:10,615][00771] Avg episode rewards: #0: 38.139, true rewards: #0: 15.140 [2023-02-24 07:24:10,616][00771] Avg episode reward: 38.139, avg true_objective: 15.140 [2023-02-24 07:24:10,727][00771] Num frames 3100... [2023-02-24 07:24:10,892][00771] Num frames 3200... [2023-02-24 07:24:11,045][00771] Num frames 3300... [2023-02-24 07:24:11,212][00771] Num frames 3400... [2023-02-24 07:24:11,368][00771] Num frames 3500... [2023-02-24 07:24:11,523][00771] Num frames 3600... [2023-02-24 07:24:11,678][00771] Num frames 3700... [2023-02-24 07:24:11,835][00771] Num frames 3800... [2023-02-24 07:24:12,012][00771] Num frames 3900... [2023-02-24 07:24:12,183][00771] Num frames 4000... [2023-02-24 07:24:12,348][00771] Num frames 4100... [2023-02-24 07:24:12,507][00771] Num frames 4200... [2023-02-24 07:24:12,672][00771] Num frames 4300... [2023-02-24 07:24:12,838][00771] Num frames 4400... [2023-02-24 07:24:12,983][00771] Num frames 4500... [2023-02-24 07:24:13,096][00771] Num frames 4600... [2023-02-24 07:24:13,216][00771] Num frames 4700... [2023-02-24 07:24:13,334][00771] Num frames 4800... [2023-02-24 07:24:13,450][00771] Num frames 4900... [2023-02-24 07:24:13,570][00771] Num frames 5000... [2023-02-24 07:24:13,682][00771] Num frames 5100... [2023-02-24 07:24:13,769][00771] Avg episode rewards: #0: 45.759, true rewards: #0: 17.093 [2023-02-24 07:24:13,771][00771] Avg episode reward: 45.759, avg true_objective: 17.093 [2023-02-24 07:24:13,857][00771] Num frames 5200... [2023-02-24 07:24:13,971][00771] Num frames 5300... 
[2023-02-24 07:24:14,084][00771] Num frames 5400... [2023-02-24 07:24:14,201][00771] Num frames 5500... [2023-02-24 07:24:14,313][00771] Num frames 5600... [2023-02-24 07:24:14,428][00771] Num frames 5700... [2023-02-24 07:24:14,542][00771] Num frames 5800... [2023-02-24 07:24:14,654][00771] Num frames 5900... [2023-02-24 07:24:14,767][00771] Num frames 6000... [2023-02-24 07:24:14,878][00771] Num frames 6100... [2023-02-24 07:24:15,002][00771] Num frames 6200... [2023-02-24 07:24:15,126][00771] Avg episode rewards: #0: 41.149, true rewards: #0: 15.650 [2023-02-24 07:24:15,128][00771] Avg episode reward: 41.149, avg true_objective: 15.650 [2023-02-24 07:24:15,178][00771] Num frames 6300... [2023-02-24 07:24:15,290][00771] Num frames 6400... [2023-02-24 07:24:15,401][00771] Num frames 6500... [2023-02-24 07:24:15,514][00771] Num frames 6600... [2023-02-24 07:24:15,625][00771] Num frames 6700... [2023-02-24 07:24:15,739][00771] Num frames 6800... [2023-02-24 07:24:15,850][00771] Num frames 6900... [2023-02-24 07:24:15,967][00771] Num frames 7000... [2023-02-24 07:24:16,082][00771] Num frames 7100... [2023-02-24 07:24:16,192][00771] Num frames 7200... [2023-02-24 07:24:16,306][00771] Num frames 7300... [2023-02-24 07:24:16,415][00771] Num frames 7400... [2023-02-24 07:24:16,486][00771] Avg episode rewards: #0: 38.024, true rewards: #0: 14.824 [2023-02-24 07:24:16,487][00771] Avg episode reward: 38.024, avg true_objective: 14.824 [2023-02-24 07:24:16,590][00771] Num frames 7500... [2023-02-24 07:24:16,712][00771] Num frames 7600... [2023-02-24 07:24:16,821][00771] Num frames 7700... [2023-02-24 07:24:16,933][00771] Num frames 7800... [2023-02-24 07:24:17,054][00771] Num frames 7900... [2023-02-24 07:24:17,168][00771] Num frames 8000... [2023-02-24 07:24:17,285][00771] Num frames 8100... [2023-02-24 07:24:17,401][00771] Num frames 8200... [2023-02-24 07:24:17,520][00771] Num frames 8300... [2023-02-24 07:24:17,648][00771] Num frames 8400... 
[2023-02-24 07:24:17,766][00771] Num frames 8500... [2023-02-24 07:24:17,887][00771] Num frames 8600... [2023-02-24 07:24:18,003][00771] Num frames 8700... [2023-02-24 07:24:18,127][00771] Num frames 8800... [2023-02-24 07:24:18,242][00771] Num frames 8900... [2023-02-24 07:24:18,352][00771] Num frames 9000... [2023-02-24 07:24:18,474][00771] Num frames 9100... [2023-02-24 07:24:18,590][00771] Num frames 9200... [2023-02-24 07:24:18,708][00771] Num frames 9300... [2023-02-24 07:24:18,820][00771] Num frames 9400... [2023-02-24 07:24:18,935][00771] Num frames 9500... [2023-02-24 07:24:19,005][00771] Avg episode rewards: #0: 41.186, true rewards: #0: 15.853 [2023-02-24 07:24:19,006][00771] Avg episode reward: 41.186, avg true_objective: 15.853 [2023-02-24 07:24:19,110][00771] Num frames 9600... [2023-02-24 07:24:19,230][00771] Num frames 9700... [2023-02-24 07:24:19,344][00771] Num frames 9800... [2023-02-24 07:24:19,458][00771] Num frames 9900... [2023-02-24 07:24:19,568][00771] Num frames 10000... [2023-02-24 07:24:19,678][00771] Num frames 10100... [2023-02-24 07:24:19,790][00771] Num frames 10200... [2023-02-24 07:24:19,913][00771] Num frames 10300... [2023-02-24 07:24:20,025][00771] Num frames 10400... [2023-02-24 07:24:20,151][00771] Num frames 10500... [2023-02-24 07:24:20,262][00771] Num frames 10600... [2023-02-24 07:24:20,374][00771] Num frames 10700... [2023-02-24 07:24:20,493][00771] Num frames 10800... [2023-02-24 07:24:20,612][00771] Avg episode rewards: #0: 39.222, true rewards: #0: 15.509 [2023-02-24 07:24:20,613][00771] Avg episode reward: 39.222, avg true_objective: 15.509 [2023-02-24 07:24:20,673][00771] Num frames 10900... [2023-02-24 07:24:20,784][00771] Num frames 11000... [2023-02-24 07:24:20,895][00771] Num frames 11100... [2023-02-24 07:24:21,010][00771] Num frames 11200... [2023-02-24 07:24:21,128][00771] Num frames 11300... [2023-02-24 07:24:21,253][00771] Num frames 11400... [2023-02-24 07:24:21,369][00771] Num frames 11500... 
[2023-02-24 07:24:21,483][00771] Num frames 11600... [2023-02-24 07:24:21,594][00771] Num frames 11700... [2023-02-24 07:24:21,705][00771] Num frames 11800... [2023-02-24 07:24:21,819][00771] Num frames 11900... [2023-02-24 07:24:21,934][00771] Num frames 12000... [2023-02-24 07:24:22,046][00771] Num frames 12100... [2023-02-24 07:24:22,174][00771] Num frames 12200... [2023-02-24 07:24:22,286][00771] Num frames 12300... [2023-02-24 07:24:22,398][00771] Num frames 12400... [2023-02-24 07:24:22,490][00771] Avg episode rewards: #0: 39.405, true rewards: #0: 15.530 [2023-02-24 07:24:22,493][00771] Avg episode reward: 39.405, avg true_objective: 15.530 [2023-02-24 07:24:22,584][00771] Num frames 12500... [2023-02-24 07:24:22,702][00771] Num frames 12600... [2023-02-24 07:24:22,817][00771] Num frames 12700... [2023-02-24 07:24:22,931][00771] Num frames 12800... [2023-02-24 07:24:23,105][00771] Num frames 12900... [2023-02-24 07:24:23,265][00771] Num frames 13000... [2023-02-24 07:24:23,424][00771] Num frames 13100... [2023-02-24 07:24:23,496][00771] Avg episode rewards: #0: 36.675, true rewards: #0: 14.564 [2023-02-24 07:24:23,498][00771] Avg episode reward: 36.675, avg true_objective: 14.564 [2023-02-24 07:24:23,648][00771] Num frames 13200... [2023-02-24 07:24:23,809][00771] Num frames 13300... [2023-02-24 07:24:23,963][00771] Num frames 13400... [2023-02-24 07:24:24,120][00771] Num frames 13500... [2023-02-24 07:24:24,281][00771] Num frames 13600... [2023-02-24 07:24:24,440][00771] Num frames 13700... [2023-02-24 07:24:24,597][00771] Num frames 13800... [2023-02-24 07:24:24,765][00771] Num frames 13900... [2023-02-24 07:24:24,925][00771] Num frames 14000... [2023-02-24 07:24:25,086][00771] Num frames 14100... [2023-02-24 07:24:25,257][00771] Num frames 14200... [2023-02-24 07:24:25,418][00771] Num frames 14300... [2023-02-24 07:24:25,579][00771] Num frames 14400... [2023-02-24 07:24:25,751][00771] Num frames 14500... 
[2023-02-24 07:24:25,885][00771] Num frames 14600... [2023-02-24 07:24:26,000][00771] Num frames 14700... [2023-02-24 07:24:26,116][00771] Num frames 14800... [2023-02-24 07:24:26,237][00771] Num frames 14900... [2023-02-24 07:24:26,350][00771] Num frames 15000... [2023-02-24 07:24:26,468][00771] Num frames 15100... [2023-02-24 07:24:26,622][00771] Avg episode rewards: #0: 38.288, true rewards: #0: 15.188 [2023-02-24 07:24:26,624][00771] Avg episode reward: 38.288, avg true_objective: 15.188 [2023-02-24 07:25:51,895][00771] Replay video saved to /content/train_dir/default_experiment/replay.mp4! [2023-02-24 07:27:10,640][00771] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-24 07:27:10,642][00771] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-24 07:27:10,643][00771] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-24 07:27:10,645][00771] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-24 07:27:10,646][00771] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-24 07:27:10,648][00771] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-24 07:27:10,651][00771] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-24 07:27:10,653][00771] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-24 07:27:10,655][00771] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-24 07:27:10,657][00771] Adding new argument 'hf_repository'='lnros/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-24 07:27:10,659][00771] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-24 07:27:10,661][00771] Adding new argument 'eval_deterministic'=False that is not in the saved config file! 
[2023-02-24 07:27:10,663][00771] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-24 07:27:10,664][00771] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-24 07:27:10,665][00771] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-24 07:27:10,705][00771] RunningMeanStd input shape: (3, 72, 128) [2023-02-24 07:27:10,710][00771] RunningMeanStd input shape: (1,) [2023-02-24 07:27:10,731][00771] ConvEncoder: input_channels=3 [2023-02-24 07:27:10,802][00771] Conv encoder output size: 512 [2023-02-24 07:27:10,806][00771] Policy head output size: 512 [2023-02-24 07:27:10,838][00771] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth... [2023-02-24 07:27:11,647][00771] Num frames 100... [2023-02-24 07:27:11,813][00771] Num frames 200... [2023-02-24 07:27:12,006][00771] Num frames 300... [2023-02-24 07:27:12,179][00771] Num frames 400... [2023-02-24 07:27:12,354][00771] Num frames 500... [2023-02-24 07:27:12,528][00771] Num frames 600... [2023-02-24 07:27:12,705][00771] Num frames 700... [2023-02-24 07:27:12,889][00771] Num frames 800... [2023-02-24 07:27:13,059][00771] Num frames 900... [2023-02-24 07:27:13,245][00771] Num frames 1000... [2023-02-24 07:27:13,409][00771] Num frames 1100... [2023-02-24 07:27:13,577][00771] Num frames 1200... [2023-02-24 07:27:13,761][00771] Avg episode rewards: #0: 28.800, true rewards: #0: 12.800 [2023-02-24 07:27:13,764][00771] Avg episode reward: 28.800, avg true_objective: 12.800 [2023-02-24 07:27:13,804][00771] Num frames 1300... [2023-02-24 07:27:13,957][00771] Num frames 1400... [2023-02-24 07:27:14,116][00771] Num frames 1500... [2023-02-24 07:27:14,278][00771] Num frames 1600... [2023-02-24 07:27:14,413][00771] Num frames 1700... [2023-02-24 07:27:14,535][00771] Num frames 1800... [2023-02-24 07:27:14,656][00771] Num frames 1900... 
[2023-02-24 07:27:14,783][00771] Num frames 2000... [2023-02-24 07:27:14,895][00771] Num frames 2100... [2023-02-24 07:27:15,014][00771] Num frames 2200... [2023-02-24 07:27:15,134][00771] Num frames 2300... [2023-02-24 07:27:15,246][00771] Num frames 2400... [2023-02-24 07:27:15,370][00771] Num frames 2500... [2023-02-24 07:27:15,480][00771] Num frames 2600... [2023-02-24 07:27:15,598][00771] Num frames 2700... [2023-02-24 07:27:15,690][00771] Avg episode rewards: #0: 33.655, true rewards: #0: 13.655 [2023-02-24 07:27:15,691][00771] Avg episode reward: 33.655, avg true_objective: 13.655 [2023-02-24 07:27:15,771][00771] Num frames 2800... [2023-02-24 07:27:15,890][00771] Num frames 2900... [2023-02-24 07:27:15,999][00771] Num frames 3000... [2023-02-24 07:27:16,111][00771] Num frames 3100... [2023-02-24 07:27:16,224][00771] Num frames 3200... [2023-02-24 07:27:16,335][00771] Num frames 3300... [2023-02-24 07:27:16,452][00771] Num frames 3400... [2023-02-24 07:27:16,546][00771] Avg episode rewards: #0: 26.117, true rewards: #0: 11.450 [2023-02-24 07:27:16,549][00771] Avg episode reward: 26.117, avg true_objective: 11.450 [2023-02-24 07:27:16,630][00771] Num frames 3500... [2023-02-24 07:27:16,741][00771] Num frames 3600... [2023-02-24 07:27:16,859][00771] Num frames 3700... [2023-02-24 07:27:16,977][00771] Num frames 3800... [2023-02-24 07:27:17,085][00771] Num frames 3900... [2023-02-24 07:27:17,198][00771] Num frames 4000... [2023-02-24 07:27:17,310][00771] Num frames 4100... [2023-02-24 07:27:17,423][00771] Num frames 4200... [2023-02-24 07:27:17,545][00771] Num frames 4300... [2023-02-24 07:27:17,662][00771] Num frames 4400... [2023-02-24 07:27:17,772][00771] Num frames 4500... [2023-02-24 07:27:17,885][00771] Num frames 4600... [2023-02-24 07:27:18,005][00771] Num frames 4700... [2023-02-24 07:27:18,119][00771] Num frames 4800... 
[2023-02-24 07:27:18,239][00771] Avg episode rewards: #0: 28.393, true rewards: #0: 12.142 [2023-02-24 07:27:18,241][00771] Avg episode reward: 28.393, avg true_objective: 12.142 [2023-02-24 07:27:18,291][00771] Num frames 4900... [2023-02-24 07:27:18,408][00771] Num frames 5000... [2023-02-24 07:27:18,522][00771] Num frames 5100... [2023-02-24 07:27:18,645][00771] Num frames 5200... [2023-02-24 07:27:18,773][00771] Num frames 5300... [2023-02-24 07:27:18,885][00771] Num frames 5400... [2023-02-24 07:27:19,000][00771] Num frames 5500... [2023-02-24 07:27:19,127][00771] Num frames 5600... [2023-02-24 07:27:19,242][00771] Num frames 5700... [2023-02-24 07:27:19,359][00771] Num frames 5800... [2023-02-24 07:27:19,469][00771] Num frames 5900... [2023-02-24 07:27:19,591][00771] Num frames 6000... [2023-02-24 07:27:19,704][00771] Num frames 6100... [2023-02-24 07:27:19,767][00771] Avg episode rewards: #0: 28.010, true rewards: #0: 12.210 [2023-02-24 07:27:19,768][00771] Avg episode reward: 28.010, avg true_objective: 12.210 [2023-02-24 07:27:19,874][00771] Num frames 6200... [2023-02-24 07:27:19,992][00771] Num frames 6300... [2023-02-24 07:27:20,115][00771] Num frames 6400... [2023-02-24 07:27:20,238][00771] Num frames 6500... [2023-02-24 07:27:20,353][00771] Avg episode rewards: #0: 24.255, true rewards: #0: 10.922 [2023-02-24 07:27:20,354][00771] Avg episode reward: 24.255, avg true_objective: 10.922 [2023-02-24 07:27:20,411][00771] Num frames 6600... [2023-02-24 07:27:20,521][00771] Num frames 6700... [2023-02-24 07:27:20,633][00771] Num frames 6800... [2023-02-24 07:27:20,754][00771] Num frames 6900... [2023-02-24 07:27:20,870][00771] Num frames 7000... [2023-02-24 07:27:20,980][00771] Num frames 7100... [2023-02-24 07:27:21,096][00771] Num frames 7200... [2023-02-24 07:27:21,213][00771] Num frames 7300... [2023-02-24 07:27:21,331][00771] Num frames 7400... [2023-02-24 07:27:21,457][00771] Num frames 7500... [2023-02-24 07:27:21,582][00771] Num frames 7600... 
[2023-02-24 07:27:21,701][00771] Num frames 7700... [2023-02-24 07:27:21,822][00771] Num frames 7800... [2023-02-24 07:27:21,940][00771] Num frames 7900... [2023-02-24 07:27:22,049][00771] Num frames 8000... [2023-02-24 07:27:22,158][00771] Num frames 8100... [2023-02-24 07:27:22,270][00771] Num frames 8200... [2023-02-24 07:27:22,396][00771] Num frames 8300... [2023-02-24 07:27:22,517][00771] Num frames 8400... [2023-02-24 07:27:22,646][00771] Num frames 8500... [2023-02-24 07:27:22,783][00771] Num frames 8600... [2023-02-24 07:27:22,911][00771] Avg episode rewards: #0: 29.504, true rewards: #0: 12.361 [2023-02-24 07:27:22,913][00771] Avg episode reward: 29.504, avg true_objective: 12.361 [2023-02-24 07:27:22,976][00771] Num frames 8700... [2023-02-24 07:27:23,090][00771] Num frames 8800... [2023-02-24 07:27:23,208][00771] Num frames 8900... [2023-02-24 07:27:23,319][00771] Num frames 9000... [2023-02-24 07:27:23,481][00771] Num frames 9100... [2023-02-24 07:27:23,637][00771] Num frames 9200... [2023-02-24 07:27:23,938][00771] Num frames 9300... [2023-02-24 07:27:24,097][00771] Num frames 9400... [2023-02-24 07:27:24,237][00771] Avg episode rewards: #0: 27.691, true rewards: #0: 11.816 [2023-02-24 07:27:24,240][00771] Avg episode reward: 27.691, avg true_objective: 11.816 [2023-02-24 07:27:24,321][00771] Num frames 9500... [2023-02-24 07:27:24,488][00771] Num frames 9600... [2023-02-24 07:27:24,661][00771] Num frames 9700... [2023-02-24 07:27:24,832][00771] Num frames 9800... [2023-02-24 07:27:24,989][00771] Num frames 9900... [2023-02-24 07:27:25,149][00771] Num frames 10000... [2023-02-24 07:27:25,309][00771] Num frames 10100... [2023-02-24 07:27:25,475][00771] Num frames 10200... [2023-02-24 07:27:25,646][00771] Num frames 10300... [2023-02-24 07:27:25,809][00771] Num frames 10400... [2023-02-24 07:27:25,975][00771] Num frames 10500... [2023-02-24 07:27:26,145][00771] Num frames 10600... [2023-02-24 07:27:26,309][00771] Num frames 10700... 
[2023-02-24 07:27:26,476][00771] Num frames 10800... [2023-02-24 07:27:26,640][00771] Num frames 10900... [2023-02-24 07:27:26,806][00771] Num frames 11000... [2023-02-24 07:27:26,972][00771] Num frames 11100... [2023-02-24 07:27:27,092][00771] Num frames 11200... [2023-02-24 07:27:27,211][00771] Num frames 11300... [2023-02-24 07:27:27,336][00771] Num frames 11400... [2023-02-24 07:27:27,401][00771] Avg episode rewards: #0: 30.783, true rewards: #0: 12.672 [2023-02-24 07:27:27,403][00771] Avg episode reward: 30.783, avg true_objective: 12.672 [2023-02-24 07:27:27,507][00771] Num frames 11500... [2023-02-24 07:27:27,621][00771] Num frames 11600... [2023-02-24 07:27:27,739][00771] Num frames 11700... [2023-02-24 07:27:27,865][00771] Num frames 11800... [2023-02-24 07:27:27,978][00771] Num frames 11900... [2023-02-24 07:27:28,091][00771] Num frames 12000... [2023-02-24 07:27:28,211][00771] Num frames 12100... [2023-02-24 07:27:28,325][00771] Num frames 12200... [2023-02-24 07:27:28,438][00771] Num frames 12300... [2023-02-24 07:27:28,557][00771] Num frames 12400... [2023-02-24 07:27:28,677][00771] Num frames 12500... [2023-02-24 07:27:28,792][00771] Num frames 12600... [2023-02-24 07:27:28,919][00771] Num frames 12700... [2023-02-24 07:27:29,041][00771] Num frames 12800... [2023-02-24 07:27:29,156][00771] Num frames 12900... [2023-02-24 07:27:29,278][00771] Num frames 13000... [2023-02-24 07:27:29,396][00771] Num frames 13100... [2023-02-24 07:27:29,527][00771] Num frames 13200... [2023-02-24 07:27:29,639][00771] Num frames 13300... [2023-02-24 07:27:29,756][00771] Num frames 13400... [2023-02-24 07:27:29,875][00771] Avg episode rewards: #0: 34.056, true rewards: #0: 13.456 [2023-02-24 07:27:29,877][00771] Avg episode reward: 34.056, avg true_objective: 13.456 [2023-02-24 07:28:47,212][00771] Replay video saved to /content/train_dir/default_experiment/replay.mp4!