[2023-02-23 11:02:52,065][05868] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 11:02:52,068][05868] Rollout worker 0 uses device cpu
[2023-02-23 11:02:52,073][05868] Rollout worker 1 uses device cpu
[2023-02-23 11:02:52,074][05868] Rollout worker 2 uses device cpu
[2023-02-23 11:02:52,075][05868] Rollout worker 3 uses device cpu
[2023-02-23 11:02:52,076][05868] Rollout worker 4 uses device cpu
[2023-02-23 11:02:52,077][05868] Rollout worker 5 uses device cpu
[2023-02-23 11:02:52,082][05868] Rollout worker 6 uses device cpu
[2023-02-23 11:02:52,084][05868] Rollout worker 7 uses device cpu
[2023-02-23 11:02:52,321][05868] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:02:52,326][05868] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 11:02:52,370][05868] Starting all processes...
[2023-02-23 11:02:52,373][05868] Starting process learner_proc0
[2023-02-23 11:02:52,450][05868] Starting all processes...
[2023-02-23 11:02:52,466][05868] Starting process inference_proc0-0
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc0
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc1
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc2
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc3
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc4
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc5
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc6
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc7
[2023-02-23 11:03:02,845][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:02,849][17728] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 11:03:03,045][17741] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:03,046][17741] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 11:03:03,057][17744] Worker 1 uses CPU cores [1]
[2023-02-23 11:03:03,061][17750] Worker 7 uses CPU cores [1]
[2023-02-23 11:03:03,141][17745] Worker 2 uses CPU cores [0]
[2023-02-23 11:03:03,311][17747] Worker 4 uses CPU cores [0]
[2023-02-23 11:03:03,365][17748] Worker 5 uses CPU cores [1]
[2023-02-23 11:03:03,391][17746] Worker 3 uses CPU cores [1]
[2023-02-23 11:03:03,531][17749] Worker 6 uses CPU cores [0]
[2023-02-23 11:03:03,534][17743] Worker 0 uses CPU cores [0]
[2023-02-23 11:03:03,643][17728] Num visible devices: 1
[2023-02-23 11:03:03,643][17741] Num visible devices: 1
[2023-02-23 11:03:03,653][17728] Starting seed is not provided
[2023-02-23 11:03:03,653][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:03,654][17728] Initializing actor-critic model on device cuda:0
[2023-02-23 11:03:03,655][17728] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:03:03,657][17728] RunningMeanStd input shape: (1,)
[2023-02-23 11:03:03,669][17728] ConvEncoder: input_channels=3
[2023-02-23 11:03:03,941][17728] Conv encoder output size: 512
[2023-02-23 11:03:03,941][17728] Policy head output size: 512
[2023-02-23 11:03:03,988][17728] Created Actor Critic model with architecture:
[2023-02-23 11:03:03,988][17728] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-23 11:03:11,326][17728] Using optimizer
[2023-02-23 11:03:11,327][17728] No checkpoints found
[2023-02-23 11:03:11,327][17728] Did not load from checkpoint, starting from scratch!
[2023-02-23 11:03:11,327][17728] Initialized policy 0 weights for model version 0
[2023-02-23 11:03:11,331][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:11,338][17728] LearnerWorker_p0 finished initialization!
[2023-02-23 11:03:11,448][17741] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:03:11,449][17741] RunningMeanStd input shape: (1,)
[2023-02-23 11:03:11,461][17741] ConvEncoder: input_channels=3
[2023-02-23 11:03:11,572][17741] Conv encoder output size: 512
[2023-02-23 11:03:11,573][17741] Policy head output size: 512
[2023-02-23 11:03:12,309][05868] Heartbeat connected on Batcher_0
[2023-02-23 11:03:12,316][05868] Heartbeat connected on LearnerWorker_p0
[2023-02-23 11:03:12,332][05868] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:12,337][05868] Heartbeat connected on RolloutWorker_w0
[2023-02-23 11:03:12,345][05868] Heartbeat connected on RolloutWorker_w1
[2023-02-23 11:03:12,347][05868] Heartbeat connected on RolloutWorker_w2
[2023-02-23 11:03:12,352][05868] Heartbeat connected on RolloutWorker_w3
[2023-02-23 11:03:12,356][05868] Heartbeat connected on RolloutWorker_w4
[2023-02-23 11:03:12,359][05868] Heartbeat connected on RolloutWorker_w5
[2023-02-23 11:03:12,366][05868] Heartbeat connected on RolloutWorker_w6
[2023-02-23 11:03:12,371][05868] Heartbeat connected on RolloutWorker_w7
[2023-02-23 11:03:13,890][05868] Inference worker 0-0 is ready!
[2023-02-23 11:03:13,892][05868] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 11:03:13,902][05868] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 11:03:14,009][17743] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,020][17747] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,028][17745] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,038][17746] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,051][17750] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,047][17749] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,056][17744] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,069][17748] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:15,235][17750] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17744] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17746] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,236][17745] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,237][17747] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17743] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,920][17749] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,924][17745] Decorrelating experience for 32 frames...
[2023-02-23 11:03:15,934][17746] Decorrelating experience for 32 frames...
[2023-02-23 11:03:15,936][17744] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,725][17750] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,853][17744] Decorrelating experience for 64 frames...
[2023-02-23 11:03:16,979][17743] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,993][17749] Decorrelating experience for 32 frames...
[2023-02-23 11:03:17,100][17745] Decorrelating experience for 64 frames...
[2023-02-23 11:03:17,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:17,608][17750] Decorrelating experience for 64 frames...
[2023-02-23 11:03:17,689][17744] Decorrelating experience for 96 frames...
[2023-02-23 11:03:18,141][17747] Decorrelating experience for 32 frames...
[2023-02-23 11:03:18,364][17745] Decorrelating experience for 96 frames...
[2023-02-23 11:03:18,389][17743] Decorrelating experience for 64 frames...
[2023-02-23 11:03:18,843][17748] Decorrelating experience for 0 frames...
[2023-02-23 11:03:19,121][17750] Decorrelating experience for 96 frames...
[2023-02-23 11:03:19,491][17748] Decorrelating experience for 32 frames...
[2023-02-23 11:03:20,068][17749] Decorrelating experience for 64 frames...
[2023-02-23 11:03:20,794][17747] Decorrelating experience for 64 frames...
[2023-02-23 11:03:20,898][17743] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,273][17746] Decorrelating experience for 64 frames...
[2023-02-23 11:03:21,315][17749] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,599][17747] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,710][17748] Decorrelating experience for 64 frames...
[2023-02-23 11:03:22,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:22,593][17746] Decorrelating experience for 96 frames...
[2023-02-23 11:03:22,768][17748] Decorrelating experience for 96 frames...
[2023-02-23 11:03:27,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 93.7. Samples: 1406. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:27,340][05868] Avg episode reward: [(0, '1.377')]
[2023-02-23 11:03:27,825][17728] Signal inference workers to stop experience collection...
[2023-02-23 11:03:27,854][17741] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 11:03:30,259][17728] Signal inference workers to resume experience collection...
[2023-02-23 11:03:30,260][17741] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 11:03:32,332][05868] Fps is (10 sec: 1228.8, 60 sec: 614.4, 300 sec: 614.4). Total num frames: 12288. Throughput: 0: 161.4. Samples: 3228. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-23 11:03:32,334][05868] Avg episode reward: [(0, '3.061')]
[2023-02-23 11:03:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 1310.7, 300 sec: 1310.7). Total num frames: 32768. Throughput: 0: 250.7. Samples: 6268. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 11:03:37,338][05868] Avg episode reward: [(0, '3.835')]
[2023-02-23 11:03:39,390][17741] Updated weights for policy 0, policy_version 10 (0.0012)
[2023-02-23 11:03:42,338][05868] Fps is (10 sec: 3684.1, 60 sec: 1638.1, 300 sec: 1638.1). Total num frames: 49152. Throughput: 0: 386.2. Samples: 11588. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:03:42,346][05868] Avg episode reward: [(0, '4.414')]
[2023-02-23 11:03:47,335][05868] Fps is (10 sec: 2866.3, 60 sec: 1755.3, 300 sec: 1755.3). Total num frames: 61440. Throughput: 0: 446.0. Samples: 15610. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 11:03:47,342][05868] Avg episode reward: [(0, '4.585')]
[2023-02-23 11:03:52,333][05868] Fps is (10 sec: 2868.6, 60 sec: 1945.5, 300 sec: 1945.5). Total num frames: 77824. Throughput: 0: 450.3. Samples: 18012. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:03:52,342][05868] Avg episode reward: [(0, '4.452')]
[2023-02-23 11:03:52,845][17741] Updated weights for policy 0, policy_version 20 (0.0022)
[2023-02-23 11:03:57,332][05868] Fps is (10 sec: 2868.1, 60 sec: 2002.5, 300 sec: 2002.5). Total num frames: 90112. Throughput: 0: 516.8. Samples: 23254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:03:57,339][05868] Avg episode reward: [(0, '4.402')]
[2023-02-23 11:04:02,332][05868] Fps is (10 sec: 2867.5, 60 sec: 2129.9, 300 sec: 2129.9). Total num frames: 106496. Throughput: 0: 619.9. Samples: 27896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:02,337][05868] Avg episode reward: [(0, '4.340')]
[2023-02-23 11:04:02,346][17728] Saving new best policy, reward=4.340!
[2023-02-23 11:04:07,137][17741] Updated weights for policy 0, policy_version 30 (0.0018)
[2023-02-23 11:04:07,338][05868] Fps is (10 sec: 3274.8, 60 sec: 2233.9, 300 sec: 2233.9). Total num frames: 122880. Throughput: 0: 664.0. Samples: 29882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:07,341][05868] Avg episode reward: [(0, '4.508')]
[2023-02-23 11:04:07,355][17728] Saving new best policy, reward=4.508!
[2023-02-23 11:04:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 2321.1, 300 sec: 2321.1). Total num frames: 139264. Throughput: 0: 729.7. Samples: 34242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:12,336][05868] Avg episode reward: [(0, '4.442')]
[2023-02-23 11:04:17,332][05868] Fps is (10 sec: 3688.7, 60 sec: 2662.4, 300 sec: 2457.6). Total num frames: 159744. Throughput: 0: 826.3. Samples: 40412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:17,340][05868] Avg episode reward: [(0, '4.209')]
[2023-02-23 11:04:18,055][17741] Updated weights for policy 0, policy_version 40 (0.0016)
[2023-02-23 11:04:22,332][05868] Fps is (10 sec: 3686.4, 60 sec: 2935.5, 300 sec: 2516.1). Total num frames: 176128. Throughput: 0: 828.7. Samples: 43558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:22,340][05868] Avg episode reward: [(0, '4.239')]
[2023-02-23 11:04:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2512.2). Total num frames: 188416. Throughput: 0: 799.6. Samples: 47564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:27,337][05868] Avg episode reward: [(0, '4.367')]
[2023-02-23 11:04:31,480][17741] Updated weights for policy 0, policy_version 50 (0.0017)
[2023-02-23 11:04:32,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2560.0). Total num frames: 204800. Throughput: 0: 810.9. Samples: 52100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:04:32,341][05868] Avg episode reward: [(0, '4.453')]
[2023-02-23 11:04:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2650.4). Total num frames: 225280. Throughput: 0: 829.1. Samples: 55320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:37,334][05868] Avg episode reward: [(0, '4.451')]
[2023-02-23 11:04:41,539][17741] Updated weights for policy 0, policy_version 60 (0.0013)
[2023-02-23 11:04:42,333][05868] Fps is (10 sec: 4095.6, 60 sec: 3277.1, 300 sec: 2730.6). Total num frames: 245760. Throughput: 0: 849.9. Samples: 61500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:42,339][05868] Avg episode reward: [(0, '4.333')]
[2023-02-23 11:04:47,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3277.0, 300 sec: 2716.3). Total num frames: 258048. Throughput: 0: 834.3. Samples: 65440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:47,334][05868] Avg episode reward: [(0, '4.338')]
[2023-02-23 11:04:47,352][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth...
[2023-02-23 11:04:52,332][05868] Fps is (10 sec: 2867.5, 60 sec: 3276.9, 300 sec: 2744.3). Total num frames: 274432. Throughput: 0: 834.4. Samples: 67424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:52,334][05868] Avg episode reward: [(0, '4.545')]
[2023-02-23 11:04:52,343][17728] Saving new best policy, reward=4.545!
[2023-02-23 11:04:55,280][17741] Updated weights for policy 0, policy_version 70 (0.0025)
[2023-02-23 11:04:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2808.7). Total num frames: 294912. Throughput: 0: 856.8. Samples: 72800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:57,339][05868] Avg episode reward: [(0, '4.657')]
[2023-02-23 11:04:57,348][17728] Saving new best policy, reward=4.657!
[2023-02-23 11:05:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2830.0). Total num frames: 311296. Throughput: 0: 856.0. Samples: 78932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:02,337][05868] Avg episode reward: [(0, '4.463')]
[2023-02-23 11:05:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3345.4, 300 sec: 2813.8). Total num frames: 323584. Throughput: 0: 829.2. Samples: 80870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:07,338][05868] Avg episode reward: [(0, '4.246')]
[2023-02-23 11:05:07,361][17741] Updated weights for policy 0, policy_version 80 (0.0014)
[2023-02-23 11:05:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2833.1). Total num frames: 339968. Throughput: 0: 831.5. Samples: 84982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:05:12,334][05868] Avg episode reward: [(0, '4.238')]
[2023-02-23 11:05:17,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2883.6). Total num frames: 360448. Throughput: 0: 862.7. Samples: 90922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:05:17,334][05868] Avg episode reward: [(0, '4.374')]
[2023-02-23 11:05:18,793][17741] Updated weights for policy 0, policy_version 90 (0.0026)
[2023-02-23 11:05:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 2930.2). Total num frames: 380928. Throughput: 0: 861.5. Samples: 94088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:05:22,335][05868] Avg episode reward: [(0, '4.597')]
[2023-02-23 11:05:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 2912.7). Total num frames: 393216. Throughput: 0: 831.4. Samples: 98910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:27,339][05868] Avg episode reward: [(0, '4.706')]
[2023-02-23 11:05:27,407][17728] Saving new best policy, reward=4.706!
[2023-02-23 11:05:31,923][17741] Updated weights for policy 0, policy_version 100 (0.0023)
[2023-02-23 11:05:32,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 409600. Throughput: 0: 832.5. Samples: 102904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:05:32,334][05868] Avg episode reward: [(0, '4.805')]
[2023-02-23 11:05:32,340][17728] Saving new best policy, reward=4.805!
[2023-02-23 11:05:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 847.0. Samples: 105538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:05:37,334][05868] Avg episode reward: [(0, '4.709')]
[2023-02-23 11:05:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2976.4). Total num frames: 446464. Throughput: 0: 857.9. Samples: 111404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:05:42,335][05868] Avg episode reward: [(0, '4.589')]
[2023-02-23 11:05:42,568][17741] Updated weights for policy 0, policy_version 110 (0.0028)
[2023-02-23 11:05:47,334][05868] Fps is (10 sec: 3685.7, 60 sec: 3413.2, 300 sec: 2986.1). Total num frames: 462848. Throughput: 0: 825.8. Samples: 116094. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:05:47,343][05868] Avg episode reward: [(0, '4.439')]
[2023-02-23 11:05:52,333][05868] Fps is (10 sec: 2866.9, 60 sec: 3345.0, 300 sec: 2969.6). Total num frames: 475136. Throughput: 0: 828.4. Samples: 118148. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:05:52,335][05868] Avg episode reward: [(0, '4.335')]
[2023-02-23 11:05:55,992][17741] Updated weights for policy 0, policy_version 120 (0.0025)
[2023-02-23 11:05:57,332][05868] Fps is (10 sec: 3277.4, 60 sec: 3345.1, 300 sec: 3003.7). Total num frames: 495616. Throughput: 0: 851.8. Samples: 123314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:05:57,335][05868] Avg episode reward: [(0, '4.686')]
[2023-02-23 11:06:02,332][05868] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3035.9). Total num frames: 516096. Throughput: 0: 861.3. Samples: 129682. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:02,338][05868] Avg episode reward: [(0, '4.838')]
[2023-02-23 11:06:02,341][17728] Saving new best policy, reward=4.838!
[2023-02-23 11:06:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3019.3). Total num frames: 528384. Throughput: 0: 846.2. Samples: 132166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:07,337][05868] Avg episode reward: [(0, '4.824')]
[2023-02-23 11:06:07,371][17741] Updated weights for policy 0, policy_version 130 (0.0018)
[2023-02-23 11:06:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3026.5). Total num frames: 544768. Throughput: 0: 829.6. Samples: 136242. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:12,334][05868] Avg episode reward: [(0, '4.722')]
[2023-02-23 11:06:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3033.3). Total num frames: 561152. Throughput: 0: 859.5. Samples: 141582. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:17,334][05868] Avg episode reward: [(0, '4.635')]
[2023-02-23 11:06:19,398][17741] Updated weights for policy 0, policy_version 140 (0.0033)
[2023-02-23 11:06:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3082.8). Total num frames: 585728. Throughput: 0: 868.6. Samples: 144626. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:06:22,339][05868] Avg episode reward: [(0, '4.842')]
[2023-02-23 11:06:22,342][17728] Saving new best policy, reward=4.842!
[2023-02-23 11:06:27,339][05868] Fps is (10 sec: 3683.8, 60 sec: 3412.9, 300 sec: 3066.6). Total num frames: 598016. Throughput: 0: 861.2. Samples: 150166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:27,356][05868] Avg episode reward: [(0, '4.740')]
[2023-02-23 11:06:31,831][17741] Updated weights for policy 0, policy_version 150 (0.0020)
[2023-02-23 11:06:32,332][05868] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3072.0). Total num frames: 614400. Throughput: 0: 846.1. Samples: 154166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:06:32,342][05868] Avg episode reward: [(0, '4.719')]
[2023-02-23 11:06:37,332][05868] Fps is (10 sec: 3279.1, 60 sec: 3413.3, 300 sec: 3077.0). Total num frames: 630784. Throughput: 0: 848.6. Samples: 156332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:06:37,339][05868] Avg episode reward: [(0, '4.890')]
[2023-02-23 11:06:37,351][17728] Saving new best policy, reward=4.890!
[2023-02-23 11:06:42,332][05868] Fps is (10 sec: 3686.6, 60 sec: 3413.3, 300 sec: 3101.3). Total num frames: 651264. Throughput: 0: 872.5. Samples: 162576. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:42,339][05868] Avg episode reward: [(0, '4.729')]
[2023-02-23 11:06:42,579][17741] Updated weights for policy 0, policy_version 160 (0.0021)
[2023-02-23 11:06:47,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3413.4, 300 sec: 3105.3). Total num frames: 667648. Throughput: 0: 854.1. Samples: 168118. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:06:47,336][05868] Avg episode reward: [(0, '4.562')]
[2023-02-23 11:06:47,353][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth...
[2023-02-23 11:06:52,338][05868] Fps is (10 sec: 3274.8, 60 sec: 3481.3, 300 sec: 3109.2). Total num frames: 684032. Throughput: 0: 841.7. Samples: 170046. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:52,341][05868] Avg episode reward: [(0, '4.765')]
[2023-02-23 11:06:56,289][17741] Updated weights for policy 0, policy_version 170 (0.0019)
[2023-02-23 11:06:57,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3113.0). Total num frames: 700416. Throughput: 0: 847.5. Samples: 174378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:57,336][05868] Avg episode reward: [(0, '4.879')]
[2023-02-23 11:07:02,332][05868] Fps is (10 sec: 3688.6, 60 sec: 3413.3, 300 sec: 3134.3). Total num frames: 720896. Throughput: 0: 869.2. Samples: 180694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:07:02,337][05868] Avg episode reward: [(0, '4.949')]
[2023-02-23 11:07:02,343][17728] Saving new best policy, reward=4.949!
[2023-02-23 11:07:06,236][17741] Updated weights for policy 0, policy_version 180 (0.0030)
[2023-02-23 11:07:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3137.4). Total num frames: 737280. Throughput: 0: 872.0. Samples: 183868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:07,336][05868] Avg episode reward: [(0, '4.993')]
[2023-02-23 11:07:07,350][17728] Saving new best policy, reward=4.993!
[2023-02-23 11:07:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3140.3). Total num frames: 753664. Throughput: 0: 841.4. Samples: 188022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:12,334][05868] Avg episode reward: [(0, '5.144')]
[2023-02-23 11:07:12,338][17728] Saving new best policy, reward=5.144!
[2023-02-23 11:07:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3126.3). Total num frames: 765952. Throughput: 0: 852.2. Samples: 192516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:17,334][05868] Avg episode reward: [(0, '5.245')]
[2023-02-23 11:07:17,342][17728] Saving new best policy, reward=5.245!
[2023-02-23 11:07:19,580][17741] Updated weights for policy 0, policy_version 190 (0.0017)
[2023-02-23 11:07:22,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3145.7). Total num frames: 786432. Throughput: 0: 871.2. Samples: 195538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:22,335][05868] Avg episode reward: [(0, '5.567')]
[2023-02-23 11:07:22,341][17728] Saving new best policy, reward=5.567!
[2023-02-23 11:07:27,335][05868] Fps is (10 sec: 4094.7, 60 sec: 3481.8, 300 sec: 3164.3). Total num frames: 806912. Throughput: 0: 870.1. Samples: 201732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:07:27,337][05868] Avg episode reward: [(0, '5.600')]
[2023-02-23 11:07:27,358][17728] Saving new best policy, reward=5.600!
[2023-02-23 11:07:31,249][17741] Updated weights for policy 0, policy_version 200 (0.0024)
[2023-02-23 11:07:32,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3150.8). Total num frames: 819200. Throughput: 0: 836.2. Samples: 205746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:07:32,335][05868] Avg episode reward: [(0, '5.755')]
[2023-02-23 11:07:32,337][17728] Saving new best policy, reward=5.755!
[2023-02-23 11:07:37,332][05868] Fps is (10 sec: 2868.1, 60 sec: 3413.3, 300 sec: 3153.1). Total num frames: 835584. Throughput: 0: 835.6. Samples: 207642. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:37,334][05868] Avg episode reward: [(0, '5.518')]
[2023-02-23 11:07:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3170.6). Total num frames: 856064. Throughput: 0: 868.6. Samples: 213466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:42,338][05868] Avg episode reward: [(0, '5.578')]
[2023-02-23 11:07:42,990][17741] Updated weights for policy 0, policy_version 210 (0.0020)
[2023-02-23 11:07:47,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3187.4). Total num frames: 876544. Throughput: 0: 867.1. Samples: 219714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:47,339][05868] Avg episode reward: [(0, '5.792')]
[2023-02-23 11:07:47,349][17728] Saving new best policy, reward=5.792!
[2023-02-23 11:07:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.7, 300 sec: 3174.4). Total num frames: 888832. Throughput: 0: 841.1. Samples: 221718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:52,337][05868] Avg episode reward: [(0, '5.530')]
[2023-02-23 11:07:56,082][17741] Updated weights for policy 0, policy_version 220 (0.0016)
[2023-02-23 11:07:57,332][05868] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3161.8). Total num frames: 901120. Throughput: 0: 839.9. Samples: 225816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:57,337][05868] Avg episode reward: [(0, '5.666')]
[2023-02-23 11:08:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3192.1). Total num frames: 925696. Throughput: 0: 869.6. Samples: 231648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:02,335][05868] Avg episode reward: [(0, '5.742')]
[2023-02-23 11:08:06,121][17741] Updated weights for policy 0, policy_version 230 (0.0015)
[2023-02-23 11:08:07,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3207.4). Total num frames: 946176. Throughput: 0: 873.9. Samples: 234864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:07,337][05868] Avg episode reward: [(0, '6.159')]
[2023-02-23 11:08:07,348][17728] Saving new best policy, reward=6.159!
[2023-02-23 11:08:12,332][05868] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 958464. Throughput: 0: 847.4. Samples: 239862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:08:12,338][05868] Avg episode reward: [(0, '5.804')]
[2023-02-23 11:08:17,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 847.8. Samples: 243896. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:08:17,339][05868] Avg episode reward: [(0, '5.908')]
[2023-02-23 11:08:19,525][17741] Updated weights for policy 0, policy_version 240 (0.0013)
[2023-02-23 11:08:22,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 991232. Throughput: 0: 866.5. Samples: 246634. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:08:22,340][05868] Avg episode reward: [(0, '5.740')]
[2023-02-23 11:08:27,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3413.5, 300 sec: 3387.9). Total num frames: 1011712. Throughput: 0: 881.5. Samples: 253134. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:08:27,334][05868] Avg episode reward: [(0, '5.308')]
[2023-02-23 11:08:29,852][17741] Updated weights for policy 0, policy_version 250 (0.0018)
[2023-02-23 11:08:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1028096. Throughput: 0: 851.8. Samples: 258046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:32,337][05868] Avg episode reward: [(0, '5.206')]
[2023-02-23 11:08:37,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3360.2). Total num frames: 1040384. Throughput: 0: 851.2. Samples: 260022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:37,335][05868] Avg episode reward: [(0, '5.178')]
[2023-02-23 11:08:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 1060864. Throughput: 0: 872.8. Samples: 265090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:42,339][05868] Avg episode reward: [(0, '5.483')]
[2023-02-23 11:08:42,361][17741] Updated weights for policy 0, policy_version 260 (0.0023)
[2023-02-23 11:08:47,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 1085440. Throughput: 0: 888.1. Samples: 271612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:08:47,339][05868] Avg episode reward: [(0, '5.861')]
[2023-02-23 11:08:47,350][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth...
[2023-02-23 11:08:47,490][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth
[2023-02-23 11:08:52,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1097728. Throughput: 0: 874.5. Samples: 274218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:52,335][05868] Avg episode reward: [(0, '5.548')]
[2023-02-23 11:08:54,095][17741] Updated weights for policy 0, policy_version 270 (0.0018)
[2023-02-23 11:08:57,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 1114112. Throughput: 0: 853.1. Samples: 278252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:57,339][05868] Avg episode reward: [(0, '5.219')]
[2023-02-23 11:09:02,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 1130496. Throughput: 0: 877.3. Samples: 283376. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:09:02,340][05868] Avg episode reward: [(0, '5.420')]
[2023-02-23 11:09:05,537][17741] Updated weights for policy 0, policy_version 280 (0.0047)
[2023-02-23 11:09:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1150976. Throughput: 0: 888.8. Samples: 286628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:07,335][05868] Avg episode reward: [(0, '5.818')]
[2023-02-23 11:09:12,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1171456. Throughput: 0: 872.8. Samples: 292412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:12,334][05868] Avg episode reward: [(0, '5.745')]
[2023-02-23 11:09:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1183744. Throughput: 0: 853.7. Samples: 296464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:09:17,342][05868] Avg episode reward: [(0, '5.726')]
[2023-02-23 11:09:18,446][17741] Updated weights for policy 0, policy_version 290 (0.0025)
[2023-02-23 11:09:22,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1200128. Throughput: 0: 857.7. Samples: 298618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:22,340][05868] Avg episode reward: [(0, '5.916')]
[2023-02-23 11:09:27,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1220608. Throughput: 0: 886.9. Samples: 305000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:27,335][05868] Avg episode reward: [(0, '6.153')]
[2023-02-23 11:09:28,514][17741] Updated weights for policy 0, policy_version 300 (0.0024)
[2023-02-23 11:09:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1241088. Throughput: 0: 869.0. Samples: 310716. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:09:32,336][05868] Avg episode reward: [(0, '5.931')]
[2023-02-23 11:09:37,336][05868] Fps is (10 sec: 3275.5, 60 sec: 3549.6, 300 sec: 3415.6). Total num frames: 1253376. Throughput: 0: 856.1. Samples: 312748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:09:37,342][05868] Avg episode reward: [(0, '5.748')]
[2023-02-23 11:09:41,920][17741] Updated weights for policy 0, policy_version 310 (0.0030)
[2023-02-23 11:09:42,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1269760. Throughput: 0: 859.5. Samples: 316928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:42,335][05868] Avg episode reward: [(0, '5.567')]
[2023-02-23 11:09:47,332][05868] Fps is (10 sec: 3687.9, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1290240. Throughput: 0: 888.4. Samples: 323356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:47,339][05868] Avg episode reward: [(0, '6.119')]
[2023-02-23 11:09:52,127][17741] Updated weights for policy 0, policy_version 320 (0.0025)
[2023-02-23 11:09:52,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1310720. Throughput: 0: 887.6. Samples: 326572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:52,334][05868] Avg episode reward: [(0, '6.457')]
[2023-02-23 11:09:52,342][17728] Saving new best policy, reward=6.457!
[2023-02-23 11:09:57,335][05868] Fps is (10 sec: 3275.7, 60 sec: 3481.4, 300 sec: 3429.5). Total num frames: 1323008. Throughput: 0: 854.8. Samples: 330880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:57,340][05868] Avg episode reward: [(0, '6.971')]
[2023-02-23 11:09:57,355][17728] Saving new best policy, reward=6.971!
[2023-02-23 11:10:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1339392. Throughput: 0: 860.6. Samples: 335192.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-23 11:10:02,335][05868] Avg episode reward: [(0, '6.893')] [2023-02-23 11:10:04,976][17741] Updated weights for policy 0, policy_version 330 (0.0023) [2023-02-23 11:10:07,333][05868] Fps is (10 sec: 3687.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 1359872. Throughput: 0: 884.2. Samples: 338408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:10:07,336][05868] Avg episode reward: [(0, '6.895')] [2023-02-23 11:10:12,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 1380352. Throughput: 0: 883.2. Samples: 344744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-02-23 11:10:12,340][05868] Avg episode reward: [(0, '6.748')] [2023-02-23 11:10:16,532][17741] Updated weights for policy 0, policy_version 340 (0.0013) [2023-02-23 11:10:17,349][05868] Fps is (10 sec: 3271.6, 60 sec: 3480.6, 300 sec: 3429.3). Total num frames: 1392640. Throughput: 0: 849.9. Samples: 348976. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-23 11:10:17,352][05868] Avg episode reward: [(0, '6.543')] [2023-02-23 11:10:22,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1409024. Throughput: 0: 851.2. Samples: 351048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:10:22,338][05868] Avg episode reward: [(0, '6.803')] [2023-02-23 11:10:27,332][05868] Fps is (10 sec: 3692.7, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1429504. Throughput: 0: 884.2. Samples: 356716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:10:27,340][05868] Avg episode reward: [(0, '6.990')] [2023-02-23 11:10:27,350][17728] Saving new best policy, reward=6.990! [2023-02-23 11:10:28,155][17741] Updated weights for policy 0, policy_version 350 (0.0020) [2023-02-23 11:10:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1449984. Throughput: 0: 881.5. Samples: 363022. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:10:32,337][05868] Avg episode reward: [(0, '7.446')] [2023-02-23 11:10:32,340][17728] Saving new best policy, reward=7.446! [2023-02-23 11:10:37,333][05868] Fps is (10 sec: 3276.5, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 1462272. Throughput: 0: 855.1. Samples: 365052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-23 11:10:37,336][05868] Avg episode reward: [(0, '7.268')] [2023-02-23 11:10:41,006][17741] Updated weights for policy 0, policy_version 360 (0.0017) [2023-02-23 11:10:42,332][05868] Fps is (10 sec: 2457.5, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 1474560. Throughput: 0: 849.6. Samples: 369110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-02-23 11:10:42,348][05868] Avg episode reward: [(0, '7.553')] [2023-02-23 11:10:42,359][17728] Saving new best policy, reward=7.553! [2023-02-23 11:10:47,332][05868] Fps is (10 sec: 3277.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1495040. Throughput: 0: 880.1. Samples: 374796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-23 11:10:47,338][05868] Avg episode reward: [(0, '7.076')] [2023-02-23 11:10:47,350][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000365_1495040.pth... [2023-02-23 11:10:47,477][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth [2023-02-23 11:10:51,394][17741] Updated weights for policy 0, policy_version 370 (0.0014) [2023-02-23 11:10:52,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1515520. Throughput: 0: 879.0. Samples: 377964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-02-23 11:10:52,334][05868] Avg episode reward: [(0, '7.711')] [2023-02-23 11:10:52,347][17728] Saving new best policy, reward=7.711! [2023-02-23 11:10:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 1531904. Throughput: 0: 847.8. 
Samples: 382892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-23 11:10:57,337][05868] Avg episode reward: [(0, '7.627')] [2023-02-23 11:11:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1544192. Throughput: 0: 847.6. Samples: 387104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:11:02,343][05868] Avg episode reward: [(0, '7.819')] [2023-02-23 11:11:02,345][17728] Saving new best policy, reward=7.819! [2023-02-23 11:11:04,839][17741] Updated weights for policy 0, policy_version 380 (0.0052) [2023-02-23 11:11:07,333][05868] Fps is (10 sec: 3276.3, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1564672. Throughput: 0: 861.3. Samples: 389810. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-02-23 11:11:07,336][05868] Avg episode reward: [(0, '7.461')] [2023-02-23 11:11:12,332][05868] Fps is (10 sec: 4505.7, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 1589248. Throughput: 0: 879.5. Samples: 396294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-02-23 11:11:12,334][05868] Avg episode reward: [(0, '8.033')] [2023-02-23 11:11:12,338][17728] Saving new best policy, reward=8.033! [2023-02-23 11:11:14,820][17741] Updated weights for policy 0, policy_version 390 (0.0020) [2023-02-23 11:11:17,332][05868] Fps is (10 sec: 3686.9, 60 sec: 3482.6, 300 sec: 3443.4). Total num frames: 1601536. Throughput: 0: 848.4. Samples: 401202. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-02-23 11:11:17,337][05868] Avg episode reward: [(0, '8.168')] [2023-02-23 11:11:17,348][17728] Saving new best policy, reward=8.168! [2023-02-23 11:11:22,334][05868] Fps is (10 sec: 2457.1, 60 sec: 3413.2, 300 sec: 3443.5). Total num frames: 1613824. Throughput: 0: 847.7. Samples: 403200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-02-23 11:11:22,337][05868] Avg episode reward: [(0, '8.240')] [2023-02-23 11:11:22,339][17728] Saving new best policy, reward=8.240! 
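The "Fps is ..." and "Avg episode reward" lines above have a regular shape, so the learning curve can be recovered from the raw log with two regular expressions. The sketch below is illustrative only (the `parse_log` helper and its pairing logic are not part of Sample Factory); the field names mirror the entries visible in this excerpt.

```python
import re

# Throughput lines: "Fps is (10 sec: X, 60 sec: Y, 300 sec: Z). Total num frames: N. ..."
FPS_RE = re.compile(
    r"Fps is \(10 sec: (?P<fps10>[\d.]+), 60 sec: (?P<fps60>[\d.]+), "
    r"300 sec: (?P<fps300>[\d.]+)\)\. Total num frames: (?P<frames>\d+)"
)
# Reward lines: "Avg episode reward: [(0, 'R')]"
REWARD_RE = re.compile(r"Avg episode reward: \[\(0, '(?P<reward>[-\d.]+)'\)\]")

def parse_log(lines):
    """Pair each frame count with the reward reported just after it."""
    points, frames = [], None
    for line in lines:
        m = FPS_RE.search(line)
        if m:
            frames = int(m.group("frames"))
            continue
        m = REWARD_RE.search(line)
        if m and frames is not None:
            points.append((frames, float(m.group("reward"))))
            frames = None
    return points

sample = [
    "[2023-02-23 11:08:07,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, "
    "300 sec: 3207.4). Total num frames: 946176. Throughput: 0: 873.9. Samples: 234864.",
    "[2023-02-23 11:08:07,337][05868] Avg episode reward: [(0, '6.159')]",
]
print(parse_log(sample))  # [(946176, 6.159)]
```

The resulting `(frames, reward)` pairs can be fed directly to a plotting library to visualize training progress.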
[2023-02-23 11:11:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1634304. Throughput: 0: 863.9. Samples: 407986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:27,339][05868] Avg episode reward: [(0, '9.142')]
[2023-02-23 11:11:27,349][17728] Saving new best policy, reward=9.142!
[2023-02-23 11:11:27,853][17741] Updated weights for policy 0, policy_version 400 (0.0016)
[2023-02-23 11:11:32,332][05868] Fps is (10 sec: 4096.6, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1654784. Throughput: 0: 881.9. Samples: 414484. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:32,335][05868] Avg episode reward: [(0, '9.759')]
[2023-02-23 11:11:32,338][17728] Saving new best policy, reward=9.759!
[2023-02-23 11:11:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3457.3). Total num frames: 1671168. Throughput: 0: 868.6. Samples: 417052. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:11:37,342][05868] Avg episode reward: [(0, '9.882')]
[2023-02-23 11:11:37,365][17728] Saving new best policy, reward=9.882!
[2023-02-23 11:11:39,797][17741] Updated weights for policy 0, policy_version 410 (0.0014)
[2023-02-23 11:11:42,332][05868] Fps is (10 sec: 2867.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1683456. Throughput: 0: 847.4. Samples: 421026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:42,337][05868] Avg episode reward: [(0, '10.195')]
[2023-02-23 11:11:42,347][17728] Saving new best policy, reward=10.195!
[2023-02-23 11:11:47,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.4). Total num frames: 1703936. Throughput: 0: 867.8. Samples: 426156. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:11:47,334][05868] Avg episode reward: [(0, '9.819')]
[2023-02-23 11:11:50,985][17741] Updated weights for policy 0, policy_version 420 (0.0027)
[2023-02-23 11:11:52,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1724416. Throughput: 0: 881.1. Samples: 429460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:11:52,340][05868] Avg episode reward: [(0, '10.216')]
[2023-02-23 11:11:52,346][17728] Saving new best policy, reward=10.216!
[2023-02-23 11:11:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1740800. Throughput: 0: 862.9. Samples: 435126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:11:57,338][05868] Avg episode reward: [(0, '10.346')]
[2023-02-23 11:11:57,351][17728] Saving new best policy, reward=10.346!
[2023-02-23 11:12:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1753088. Throughput: 0: 842.8. Samples: 439128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:12:02,338][05868] Avg episode reward: [(0, '11.437')]
[2023-02-23 11:12:02,343][17728] Saving new best policy, reward=11.437!
[2023-02-23 11:12:04,584][17741] Updated weights for policy 0, policy_version 430 (0.0014)
[2023-02-23 11:12:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3443.4). Total num frames: 1769472. Throughput: 0: 845.8. Samples: 441258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:07,341][05868] Avg episode reward: [(0, '12.061')]
[2023-02-23 11:12:07,425][17728] Saving new best policy, reward=12.061!
[2023-02-23 11:12:12,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1794048. Throughput: 0: 882.4. Samples: 447696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:12,334][05868] Avg episode reward: [(0, '13.771')]
[2023-02-23 11:12:12,341][17728] Saving new best policy, reward=13.771!
[2023-02-23 11:12:14,176][17741] Updated weights for policy 0, policy_version 440 (0.0014)
[2023-02-23 11:12:17,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1810432. Throughput: 0: 862.3. Samples: 453286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:12:17,334][05868] Avg episode reward: [(0, '12.964')]
[2023-02-23 11:12:22,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3443.5). Total num frames: 1822720. Throughput: 0: 850.3. Samples: 455314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:22,338][05868] Avg episode reward: [(0, '13.367')]
[2023-02-23 11:12:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1839104. Throughput: 0: 854.0. Samples: 459456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:27,334][05868] Avg episode reward: [(0, '13.969')]
[2023-02-23 11:12:27,348][17728] Saving new best policy, reward=13.969!
[2023-02-23 11:12:27,617][17741] Updated weights for policy 0, policy_version 450 (0.0033)
[2023-02-23 11:12:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 1859584. Throughput: 0: 883.2. Samples: 465898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:32,339][05868] Avg episode reward: [(0, '13.638')]
[2023-02-23 11:12:37,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1880064. Throughput: 0: 880.2. Samples: 469070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:37,335][05868] Avg episode reward: [(0, '14.639')]
[2023-02-23 11:12:37,342][17728] Saving new best policy, reward=14.639!
[2023-02-23 11:12:38,494][17741] Updated weights for policy 0, policy_version 460 (0.0013)
[2023-02-23 11:12:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1892352. Throughput: 0: 846.9. Samples: 473238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:42,341][05868] Avg episode reward: [(0, '14.796')]
[2023-02-23 11:12:42,349][17728] Saving new best policy, reward=14.796!
[2023-02-23 11:12:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1908736. Throughput: 0: 855.8. Samples: 477640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:47,335][05868] Avg episode reward: [(0, '14.496')]
[2023-02-23 11:12:47,343][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000466_1908736.pth...
[2023-02-23 11:12:47,459][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth
[2023-02-23 11:12:50,774][17741] Updated weights for policy 0, policy_version 470 (0.0014)
[2023-02-23 11:12:52,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1929216. Throughput: 0: 879.2. Samples: 480824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:52,335][05868] Avg episode reward: [(0, '13.569')]
[2023-02-23 11:12:57,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 1949696. Throughput: 0: 879.5. Samples: 487276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:57,337][05868] Avg episode reward: [(0, '14.671')]
[2023-02-23 11:13:02,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1961984. Throughput: 0: 847.8. Samples: 491438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:13:02,337][05868] Avg episode reward: [(0, '15.165')]
[2023-02-23 11:13:02,342][17728] Saving new best policy, reward=15.165!
[2023-02-23 11:13:02,795][17741] Updated weights for policy 0, policy_version 480 (0.0015)
[2023-02-23 11:13:07,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1978368. Throughput: 0: 847.0. Samples: 493428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:07,334][05868] Avg episode reward: [(0, '16.150')]
[2023-02-23 11:13:07,344][17728] Saving new best policy, reward=16.150!
[2023-02-23 11:13:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1998848. Throughput: 0: 884.5. Samples: 499258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:12,335][05868] Avg episode reward: [(0, '17.154')]
[2023-02-23 11:13:12,337][17728] Saving new best policy, reward=17.154!
[2023-02-23 11:13:13,815][17741] Updated weights for policy 0, policy_version 490 (0.0020)
[2023-02-23 11:13:17,335][05868] Fps is (10 sec: 4094.8, 60 sec: 3481.4, 300 sec: 3485.0). Total num frames: 2019328. Throughput: 0: 879.2. Samples: 505466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:17,340][05868] Avg episode reward: [(0, '18.178')]
[2023-02-23 11:13:17,357][17728] Saving new best policy, reward=18.178!
[2023-02-23 11:13:22,332][05868] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2031616. Throughput: 0: 852.7. Samples: 507440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:22,339][05868] Avg episode reward: [(0, '17.746')]
[2023-02-23 11:13:27,332][05868] Fps is (10 sec: 2458.3, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2043904. Throughput: 0: 851.3. Samples: 511548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:27,339][05868] Avg episode reward: [(0, '17.289')]
[2023-02-23 11:13:27,416][17741] Updated weights for policy 0, policy_version 500 (0.0021)
[2023-02-23 11:13:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2068480. Throughput: 0: 886.4. Samples: 517528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:32,340][05868] Avg episode reward: [(0, '17.500')]
[2023-02-23 11:13:36,900][17741] Updated weights for policy 0, policy_version 510 (0.0020)
[2023-02-23 11:13:37,332][05868] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2088960. Throughput: 0: 886.7. Samples: 520726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:37,337][05868] Avg episode reward: [(0, '17.833')]
[2023-02-23 11:13:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2101248. Throughput: 0: 857.1. Samples: 525844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:13:42,334][05868] Avg episode reward: [(0, '18.844')]
[2023-02-23 11:13:42,340][17728] Saving new best policy, reward=18.844!
[2023-02-23 11:13:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2117632. Throughput: 0: 855.6. Samples: 529940. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:13:47,339][05868] Avg episode reward: [(0, '20.101')]
[2023-02-23 11:13:47,357][17728] Saving new best policy, reward=20.101!
[2023-02-23 11:13:50,263][17741] Updated weights for policy 0, policy_version 520 (0.0023)
[2023-02-23 11:13:52,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2138112. Throughput: 0: 871.6. Samples: 532650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:13:52,337][05868] Avg episode reward: [(0, '21.236')]
[2023-02-23 11:13:52,343][17728] Saving new best policy, reward=21.236!
[2023-02-23 11:13:57,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 2158592. Throughput: 0: 885.0. Samples: 539084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:57,341][05868] Avg episode reward: [(0, '20.925')]
[2023-02-23 11:14:00,890][17741] Updated weights for policy 0, policy_version 530 (0.0025)
[2023-02-23 11:14:02,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2170880. Throughput: 0: 855.7. Samples: 543972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:14:02,335][05868] Avg episode reward: [(0, '20.021')]
[2023-02-23 11:14:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2187264. Throughput: 0: 857.4. Samples: 546022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:14:07,342][05868] Avg episode reward: [(0, '20.288')]
[2023-02-23 11:14:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2207744. Throughput: 0: 882.1. Samples: 551242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:12,335][05868] Avg episode reward: [(0, '18.943')]
[2023-02-23 11:14:13,036][17741] Updated weights for policy 0, policy_version 540 (0.0025)
[2023-02-23 11:14:17,332][05868] Fps is (10 sec: 4096.2, 60 sec: 3481.8, 300 sec: 3485.1). Total num frames: 2228224. Throughput: 0: 890.8. Samples: 557616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:17,335][05868] Avg episode reward: [(0, '18.880')]
[2023-02-23 11:14:22,333][05868] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 2244608. Throughput: 0: 880.9. Samples: 560366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:22,341][05868] Avg episode reward: [(0, '19.490')]
[2023-02-23 11:14:24,727][17741] Updated weights for policy 0, policy_version 550 (0.0013)
[2023-02-23 11:14:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2256896. Throughput: 0: 860.2. Samples: 564554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:27,335][05868] Avg episode reward: [(0, '20.118')]
[2023-02-23 11:14:32,332][05868] Fps is (10 sec: 3277.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2277376. Throughput: 0: 883.4. Samples: 569692. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:14:32,341][05868] Avg episode reward: [(0, '22.288')]
[2023-02-23 11:14:32,347][17728] Saving new best policy, reward=22.288!
[2023-02-23 11:14:36,098][17741] Updated weights for policy 0, policy_version 560 (0.0020)
[2023-02-23 11:14:37,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2297856. Throughput: 0: 892.7. Samples: 572820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:14:37,339][05868] Avg episode reward: [(0, '22.625')]
[2023-02-23 11:14:37,350][17728] Saving new best policy, reward=22.625!
[2023-02-23 11:14:42,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2314240. Throughput: 0: 875.7. Samples: 578490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:42,337][05868] Avg episode reward: [(0, '21.654')]
[2023-02-23 11:14:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2326528. Throughput: 0: 858.2. Samples: 582590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:14:47,337][05868] Avg episode reward: [(0, '21.633')]
[2023-02-23 11:14:47,355][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000568_2326528.pth...
[2023-02-23 11:14:47,480][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000365_1495040.pth
[2023-02-23 11:14:49,535][17741] Updated weights for policy 0, policy_version 570 (0.0024)
[2023-02-23 11:14:52,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 2342912. Throughput: 0: 857.6. Samples: 584612. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:14:52,335][05868] Avg episode reward: [(0, '21.507')]
[2023-02-23 11:14:57,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2367488. Throughput: 0: 886.1. Samples: 591118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:57,342][05868] Avg episode reward: [(0, '21.775')]
[2023-02-23 11:14:59,139][17741] Updated weights for policy 0, policy_version 580 (0.0023)
[2023-02-23 11:15:02,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2383872. Throughput: 0: 869.0. Samples: 596722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:02,337][05868] Avg episode reward: [(0, '21.856')]
[2023-02-23 11:15:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2396160. Throughput: 0: 853.6. Samples: 598778. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:07,338][05868] Avg episode reward: [(0, '21.823')]
[2023-02-23 11:15:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.5). Total num frames: 2412544. Throughput: 0: 855.8. Samples: 603064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:12,334][05868] Avg episode reward: [(0, '22.925')]
[2023-02-23 11:15:12,342][17728] Saving new best policy, reward=22.925!
[2023-02-23 11:15:12,591][17741] Updated weights for policy 0, policy_version 590 (0.0037)
[2023-02-23 11:15:17,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2437120. Throughput: 0: 884.8. Samples: 609508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:17,337][05868] Avg episode reward: [(0, '23.108')]
[2023-02-23 11:15:17,347][17728] Saving new best policy, reward=23.108!
[2023-02-23 11:15:22,335][05868] Fps is (10 sec: 4094.8, 60 sec: 3481.5, 300 sec: 3471.1). Total num frames: 2453504. Throughput: 0: 886.1. Samples: 612696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:22,340][05868] Avg episode reward: [(0, '21.876')]
[2023-02-23 11:15:23,029][17741] Updated weights for policy 0, policy_version 600 (0.0022)
[2023-02-23 11:15:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2465792. Throughput: 0: 855.4. Samples: 616982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:27,337][05868] Avg episode reward: [(0, '21.930')]
[2023-02-23 11:15:32,332][05868] Fps is (10 sec: 2868.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 2482176. Throughput: 0: 863.1. Samples: 621430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:32,340][05868] Avg episode reward: [(0, '21.419')]
[2023-02-23 11:15:35,391][17741] Updated weights for policy 0, policy_version 610 (0.0024)
[2023-02-23 11:15:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2502656. Throughput: 0: 890.1. Samples: 624666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:37,334][05868] Avg episode reward: [(0, '20.474')]
[2023-02-23 11:15:42,333][05868] Fps is (10 sec: 4095.6, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 2523136. Throughput: 0: 891.7. Samples: 631246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:42,338][05868] Avg episode reward: [(0, '20.016')]
[2023-02-23 11:15:47,022][17741] Updated weights for policy 0, policy_version 620 (0.0041)
[2023-02-23 11:15:47,334][05868] Fps is (10 sec: 3685.7, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 2539520. Throughput: 0: 858.7. Samples: 635366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:47,336][05868] Avg episode reward: [(0, '20.280')]
[2023-02-23 11:15:52,332][05868] Fps is (10 sec: 2867.5, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2551808. Throughput: 0: 858.2. Samples: 637396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:52,334][05868] Avg episode reward: [(0, '20.559')]
[2023-02-23 11:15:57,332][05868] Fps is (10 sec: 3687.1, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2576384. Throughput: 0: 892.8. Samples: 643240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:15:57,339][05868] Avg episode reward: [(0, '20.358')]
[2023-02-23 11:15:58,350][17741] Updated weights for policy 0, policy_version 630 (0.0026)
[2023-02-23 11:16:02,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2596864. Throughput: 0: 890.6. Samples: 649584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:02,338][05868] Avg episode reward: [(0, '20.223')]
[2023-02-23 11:16:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2609152. Throughput: 0: 864.7. Samples: 651604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:07,336][05868] Avg episode reward: [(0, '19.544')]
[2023-02-23 11:16:11,287][17741] Updated weights for policy 0, policy_version 640 (0.0026)
[2023-02-23 11:16:12,336][05868] Fps is (10 sec: 2456.6, 60 sec: 3481.4, 300 sec: 3457.3). Total num frames: 2621440. Throughput: 0: 860.9. Samples: 655728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:12,341][05868] Avg episode reward: [(0, '20.280')]
[2023-02-23 11:16:17,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2646016. Throughput: 0: 898.8. Samples: 661878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:17,339][05868] Avg episode reward: [(0, '20.011')]
[2023-02-23 11:16:20,808][17741] Updated weights for policy 0, policy_version 650 (0.0026)
[2023-02-23 11:16:22,332][05868] Fps is (10 sec: 4507.2, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 2666496. Throughput: 0: 899.4. Samples: 665140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:16:22,340][05868] Avg episode reward: [(0, '20.920')]
[2023-02-23 11:16:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2678784. Throughput: 0: 864.7. Samples: 670156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:27,334][05868] Avg episode reward: [(0, '21.276')]
[2023-02-23 11:16:32,332][05868] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2695168. Throughput: 0: 860.6. Samples: 674090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:32,335][05868] Avg episode reward: [(0, '22.487')]
[2023-02-23 11:16:34,384][17741] Updated weights for policy 0, policy_version 660 (0.0019)
[2023-02-23 11:16:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2711552. Throughput: 0: 878.4. Samples: 676924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:37,335][05868] Avg episode reward: [(0, '22.476')]
[2023-02-23 11:16:42,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2736128. Throughput: 0: 892.6. Samples: 683408. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-23 11:16:42,339][05868] Avg episode reward: [(0, '23.677')]
[2023-02-23 11:16:42,344][17728] Saving new best policy, reward=23.677!
[2023-02-23 11:16:44,953][17741] Updated weights for policy 0, policy_version 670 (0.0025)
[2023-02-23 11:16:47,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3471.2). Total num frames: 2748416. Throughput: 0: 858.8. Samples: 688230. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:16:47,337][05868] Avg episode reward: [(0, '23.689')]
[2023-02-23 11:16:47,359][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth...
[2023-02-23 11:16:47,597][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000466_1908736.pth
[2023-02-23 11:16:47,626][17728] Saving new best policy, reward=23.689!
[2023-02-23 11:16:52,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2764800. Throughput: 0: 856.3. Samples: 690138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:52,337][05868] Avg episode reward: [(0, '22.950')]
[2023-02-23 11:16:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2785280. Throughput: 0: 878.3. Samples: 695250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:57,335][17741] Updated weights for policy 0, policy_version 680 (0.0034)
[2023-02-23 11:16:57,334][05868] Avg episode reward: [(0, '23.028')]
[2023-02-23 11:17:02,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3512.8). Total num frames: 2805760. Throughput: 0: 884.4. Samples: 701678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:02,341][05868] Avg episode reward: [(0, '23.798')]
[2023-02-23 11:17:02,344][17728] Saving new best policy, reward=23.798!
[2023-02-23 11:17:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2818048. Throughput: 0: 866.9. Samples: 704152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:07,340][05868] Avg episode reward: [(0, '22.594')]
[2023-02-23 11:17:09,158][17741] Updated weights for policy 0, policy_version 690 (0.0020)
[2023-02-23 11:17:12,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3550.1, 300 sec: 3471.2). Total num frames: 2834432. Throughput: 0: 846.6. Samples: 708252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:17:12,339][05868] Avg episode reward: [(0, '21.835')]
[2023-02-23 11:17:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 876.7. Samples: 713542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:17,339][05868] Avg episode reward: [(0, '22.307')]
[2023-02-23 11:17:20,372][17741] Updated weights for policy 0, policy_version 700 (0.0033)
[2023-02-23 11:17:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 2875392. Throughput: 0: 886.2. Samples: 716802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:22,335][05868] Avg episode reward: [(0, '21.407')]
[2023-02-23 11:17:27,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2887680. Throughput: 0: 868.6. Samples: 722494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:27,337][05868] Avg episode reward: [(0, '20.878')]
[2023-02-23 11:17:32,336][05868] Fps is (10 sec: 2866.0, 60 sec: 3481.4, 300 sec: 3471.1). Total num frames: 2904064. Throughput: 0: 851.3. Samples: 726540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:32,348][05868] Avg episode reward: [(0, '20.648')]
[2023-02-23 11:17:33,498][17741] Updated weights for policy 0, policy_version 710 (0.0018)
[2023-02-23 11:17:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2920448. Throughput: 0: 858.2. Samples: 728756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:37,341][05868] Avg episode reward: [(0, '21.664')]
[2023-02-23 11:17:42,332][05868] Fps is (10 sec: 3687.9, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 2940928. Throughput: 0: 887.9. Samples: 735204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:42,340][05868] Avg episode reward: [(0, '22.011')]
[2023-02-23 11:17:43,452][17741] Updated weights for policy 0, policy_version 720 (0.0013)
[2023-02-23 11:17:47,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2961408. Throughput: 0: 872.0. Samples: 740916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:47,337][05868] Avg episode reward: [(0, '22.829')]
[2023-02-23 11:17:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2973696. Throughput: 0: 862.4. Samples: 742958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:52,339][05868] Avg episode reward: [(0, '23.605')]
[2023-02-23 11:17:56,678][17741] Updated weights for policy 0, policy_version 730 (0.0047)
[2023-02-23 11:17:57,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2990080. Throughput: 0: 868.1. Samples: 747318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:57,339][05868] Avg episode reward: [(0, '23.932')]
[2023-02-23 11:17:57,351][17728] Saving new best policy, reward=23.932!
[2023-02-23 11:18:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3499.0). Total num frames: 3010560. Throughput: 0: 891.1. Samples: 753642. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:18:02,340][05868] Avg episode reward: [(0, '25.403')]
[2023-02-23 11:18:02,346][17728] Saving new best policy, reward=25.403!
[2023-02-23 11:18:06,913][17741] Updated weights for policy 0, policy_version 740 (0.0016)
[2023-02-23 11:18:07,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3031040. Throughput: 0: 889.9. Samples: 756846. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:18:07,337][05868] Avg episode reward: [(0, '25.534')]
[2023-02-23 11:18:07,355][17728] Saving new best policy, reward=25.534!
[2023-02-23 11:18:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3043328. Throughput: 0: 857.9. Samples: 761098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:18:12,338][05868] Avg episode reward: [(0, '25.039')]
[2023-02-23 11:18:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3059712. Throughput: 0: 868.7. Samples: 765628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:18:17,339][05868] Avg episode reward: [(0, '26.727')]
[2023-02-23 11:18:17,350][17728] Saving new best policy, reward=26.727!
[2023-02-23 11:18:19,547][17741] Updated weights for policy 0, policy_version 750 (0.0016)
[2023-02-23 11:18:22,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 3080192. Throughput: 0: 889.8. Samples: 768796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:22,335][05868] Avg episode reward: [(0, '25.943')]
[2023-02-23 11:18:27,338][05868] Fps is (10 sec: 4093.5, 60 sec: 3549.5, 300 sec: 3498.9). Total num frames: 3100672. Throughput: 0: 890.5. Samples: 775282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:18:27,341][05868] Avg episode reward: [(0, '25.010')]
[2023-02-23 11:18:31,353][17741] Updated weights for policy 0, policy_version 760 (0.0012)
[2023-02-23 11:18:32,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 3471.2). Total num frames: 3112960. Throughput: 0: 853.1. Samples: 779304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:32,334][05868] Avg episode reward: [(0, '23.966')]
[2023-02-23 11:18:37,332][05868] Fps is (10 sec: 2868.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3129344. Throughput: 0: 853.8. Samples: 781380. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:37,335][05868] Avg episode reward: [(0, '22.245')]
[2023-02-23 11:18:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3149824. Throughput: 0: 886.5. Samples: 787210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:42,338][05868] Avg episode reward: [(0, '21.979')]
[2023-02-23 11:18:42,723][17741] Updated weights for policy 0, policy_version 770 (0.0027)
[2023-02-23 11:18:47,339][05868] Fps is (10 sec: 4093.2, 60 sec: 3481.2, 300 sec: 3498.9). Total num frames: 3170304. Throughput: 0: 885.7. Samples: 793504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:18:47,345][05868] Avg episode reward: [(0, '22.896')]
[2023-02-23 11:18:47,366][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth...
[2023-02-23 11:18:47,527][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000568_2326528.pth
[2023-02-23 11:18:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3182592. Throughput: 0: 858.9. Samples: 795498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:52,338][05868] Avg episode reward: [(0, '22.633')]
[2023-02-23 11:18:55,226][17741] Updated weights for policy 0, policy_version 780 (0.0016)
[2023-02-23 11:18:57,332][05868] Fps is (10 sec: 2869.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3198976. Throughput: 0: 857.6. Samples: 799688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:57,335][05868] Avg episode reward: [(0, '25.097')]
[2023-02-23 11:19:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3219456. Throughput: 0: 885.8. Samples: 805490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:02,339][05868] Avg episode reward: [(0, '26.970')]
[2023-02-23 11:19:02,345][17728] Saving new best policy, reward=26.970!
[2023-02-23 11:19:05,822][17741] Updated weights for policy 0, policy_version 790 (0.0031)
[2023-02-23 11:19:07,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3239936. Throughput: 0: 885.6. Samples: 808646. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:19:07,337][05868] Avg episode reward: [(0, '28.138')]
[2023-02-23 11:19:07,348][17728] Saving new best policy, reward=28.138!
[2023-02-23 11:19:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3256320. Throughput: 0: 855.3. Samples: 813766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:12,339][05868] Avg episode reward: [(0, '28.800')]
[2023-02-23 11:19:12,344][17728] Saving new best policy, reward=28.800!
[2023-02-23 11:19:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3268608. Throughput: 0: 857.2. Samples: 817878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:17,341][05868] Avg episode reward: [(0, '28.317')]
[2023-02-23 11:19:19,029][17741] Updated weights for policy 0, policy_version 800 (0.0039)
[2023-02-23 11:19:22,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3289088. Throughput: 0: 874.3. Samples: 820722. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:19:22,335][05868] Avg episode reward: [(0, '26.619')]
[2023-02-23 11:19:27,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3482.0, 300 sec: 3499.0). Total num frames: 3309568. Throughput: 0: 886.9. Samples: 827122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:19:27,340][05868] Avg episode reward: [(0, '24.254')]
[2023-02-23 11:19:28,979][17741] Updated weights for policy 0, policy_version 810 (0.0019)
[2023-02-23 11:19:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3325952. Throughput: 0: 855.4. Samples: 831992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:32,334][05868] Avg episode reward: [(0, '25.178')]
[2023-02-23 11:19:37,334][05868] Fps is (10 sec: 2866.6, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 3338240. Throughput: 0: 855.9. Samples: 834016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:19:37,337][05868] Avg episode reward: [(0, '25.592')]
[2023-02-23 11:19:42,132][17741] Updated weights for policy 0, policy_version 820 (0.0050)
[2023-02-23 11:19:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3358720. Throughput: 0: 873.2. Samples: 838984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:42,335][05868] Avg episode reward: [(0, '24.701')]
[2023-02-23 11:19:47,332][05868] Fps is (10 sec: 4096.9, 60 sec: 3482.0, 300 sec: 3512.8). Total num frames: 3379200. Throughput: 0: 887.7. Samples: 845436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:19:47,335][05868] Avg episode reward: [(0, '24.032')]
[2023-02-23 11:19:52,335][05868] Fps is (10 sec: 3685.3, 60 sec: 3549.7, 300 sec: 3485.0). Total num frames: 3395584. Throughput: 0: 879.6. Samples: 848232. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:19:52,338][05868] Avg episode reward: [(0, '23.430')]
[2023-02-23 11:19:53,234][17741] Updated weights for policy 0, policy_version 830 (0.0021)
[2023-02-23 11:19:57,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3407872. Throughput: 0: 855.4. Samples: 852258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:57,341][05868] Avg episode reward: [(0, '24.109')]
[2023-02-23 11:20:02,332][05868] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3428352. Throughput: 0: 877.5. Samples: 857364. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:20:02,339][05868] Avg episode reward: [(0, '22.084')]
[2023-02-23 11:20:05,155][17741] Updated weights for policy 0, policy_version 840 (0.0029)
[2023-02-23 11:20:07,332][05868] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3448832. Throughput: 0: 885.6. Samples: 860574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:07,334][05868] Avg episode reward: [(0, '21.356')]
[2023-02-23 11:20:12,337][05868] Fps is (10 sec: 3684.5, 60 sec: 3481.3, 300 sec: 3485.0). Total num frames: 3465216. Throughput: 0: 874.6. Samples: 866484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:12,340][05868] Avg episode reward: [(0, '21.887')]
[2023-02-23 11:20:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3477504. Throughput: 0: 857.1. Samples: 870560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:17,337][05868] Avg episode reward: [(0, '22.613')]
[2023-02-23 11:20:17,533][17741] Updated weights for policy 0, policy_version 850 (0.0018)
[2023-02-23 11:20:22,332][05868] Fps is (10 sec: 3278.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3497984. Throughput: 0: 858.8. Samples: 872662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:22,335][05868] Avg episode reward: [(0, '22.359')]
[2023-02-23 11:20:27,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3518464. Throughput: 0: 893.8. Samples: 879206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:27,335][05868] Avg episode reward: [(0, '23.042')]
[2023-02-23 11:20:27,816][17741] Updated weights for policy 0, policy_version 860 (0.0022)
[2023-02-23 11:20:32,338][05868] Fps is (10 sec: 3684.2, 60 sec: 3481.2, 300 sec: 3498.9). Total num frames: 3534848. Throughput: 0: 873.7. Samples: 884756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:32,345][05868] Avg episode reward: [(0, '23.889')]
[2023-02-23 11:20:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3485.1). Total num frames: 3551232. Throughput: 0: 856.8. Samples: 886784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:20:37,336][05868] Avg episode reward: [(0, '24.255')]
[2023-02-23 11:20:41,279][17741] Updated weights for policy 0, policy_version 870 (0.0045)
[2023-02-23 11:20:42,332][05868] Fps is (10 sec: 3278.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3567616. Throughput: 0: 861.4. Samples: 891022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:42,335][05868] Avg episode reward: [(0, '25.370')]
[2023-02-23 11:20:47,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3588096. Throughput: 0: 891.7. Samples: 897490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:47,339][05868] Avg episode reward: [(0, '25.678')]
[2023-02-23 11:20:47,353][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000876_3588096.pth...
[2023-02-23 11:20:47,474][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth
[2023-02-23 11:20:50,726][17741] Updated weights for policy 0, policy_version 880 (0.0021)
[2023-02-23 11:20:52,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 3608576. Throughput: 0: 892.0. Samples: 900714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:20:52,335][05868] Avg episode reward: [(0, '26.738')]
[2023-02-23 11:20:57,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 3620864. Throughput: 0: 859.4. Samples: 905152. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:20:57,338][05868] Avg episode reward: [(0, '27.404')]
[2023-02-23 11:21:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3637248. Throughput: 0: 865.4. Samples: 909504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:02,334][05868] Avg episode reward: [(0, '27.393')]
[2023-02-23 11:21:04,234][17741] Updated weights for policy 0, policy_version 890 (0.0026)
[2023-02-23 11:21:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.9). Total num frames: 3657728. Throughput: 0: 888.7. Samples: 912654. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:07,334][05868] Avg episode reward: [(0, '26.510')]
[2023-02-23 11:21:12,338][05868] Fps is (10 sec: 4093.5, 60 sec: 3549.8, 300 sec: 3498.9). Total num frames: 3678208. Throughput: 0: 886.0. Samples: 919080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:12,346][05868] Avg episode reward: [(0, '27.735')]
[2023-02-23 11:21:14,998][17741] Updated weights for policy 0, policy_version 900 (0.0012)
[2023-02-23 11:21:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 3690496. Throughput: 0: 858.5. Samples: 923382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:21:17,341][05868] Avg episode reward: [(0, '28.213')]
[2023-02-23 11:21:22,332][05868] Fps is (10 sec: 2868.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3706880. Throughput: 0: 858.0. Samples: 925392. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:22,337][05868] Avg episode reward: [(0, '27.407')]
[2023-02-23 11:21:27,082][17741] Updated weights for policy 0, policy_version 910 (0.0015)
[2023-02-23 11:21:27,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3727360. Throughput: 0: 889.7. Samples: 931058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:21:27,335][05868] Avg episode reward: [(0, '26.676')]
[2023-02-23 11:21:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3550.2, 300 sec: 3512.8). Total num frames: 3747840. Throughput: 0: 890.9. Samples: 937582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:21:32,335][05868] Avg episode reward: [(0, '25.324')]
[2023-02-23 11:21:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3760128. Throughput: 0: 862.1. Samples: 939508. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:21:37,338][05868] Avg episode reward: [(0, '27.323')]
[2023-02-23 11:21:39,371][17741] Updated weights for policy 0, policy_version 920 (0.0034)
[2023-02-23 11:21:42,333][05868] Fps is (10 sec: 2457.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 3772416. Throughput: 0: 853.1. Samples: 943542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:42,336][05868] Avg episode reward: [(0, '25.412')]
[2023-02-23 11:21:47,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3796992. Throughput: 0: 888.0. Samples: 949462. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:47,338][05868] Avg episode reward: [(0, '24.885')]
[2023-02-23 11:21:50,046][17741] Updated weights for policy 0, policy_version 930 (0.0023)
[2023-02-23 11:21:52,332][05868] Fps is (10 sec: 4506.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3817472. Throughput: 0: 890.3. Samples: 952718. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:21:52,341][05868] Avg episode reward: [(0, '24.711')]
[2023-02-23 11:21:57,337][05868] Fps is (10 sec: 3684.4, 60 sec: 3549.5, 300 sec: 3485.0). Total num frames: 3833856. Throughput: 0: 864.3. Samples: 957974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:57,340][05868] Avg episode reward: [(0, '23.671')]
[2023-02-23 11:22:02,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3846144. Throughput: 0: 861.5. Samples: 962152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:22:02,341][05868] Avg episode reward: [(0, '24.008')]
[2023-02-23 11:22:03,158][17741] Updated weights for policy 0, policy_version 940 (0.0037)
[2023-02-23 11:22:07,332][05868] Fps is (10 sec: 3278.5, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3866624. Throughput: 0: 875.5. Samples: 964790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:07,335][05868] Avg episode reward: [(0, '23.997')]
[2023-02-23 11:22:12,332][05868] Fps is (10 sec: 4096.3, 60 sec: 3482.0, 300 sec: 3512.8). Total num frames: 3887104. Throughput: 0: 893.3. Samples: 971258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:22:12,335][05868] Avg episode reward: [(0, '23.835')]
[2023-02-23 11:22:12,783][17741] Updated weights for policy 0, policy_version 950 (0.0013)
[2023-02-23 11:22:17,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3903488. Throughput: 0: 861.9. Samples: 976368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:22:17,339][05868] Avg episode reward: [(0, '24.691')]
[2023-02-23 11:22:22,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3915776. Throughput: 0: 864.8. Samples: 978426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:22,341][05868] Avg episode reward: [(0, '24.660')]
[2023-02-23 11:22:26,189][17741] Updated weights for policy 0, policy_version 960 (0.0019)
[2023-02-23 11:22:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3936256. Throughput: 0: 886.9. Samples: 983454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:27,340][05868] Avg episode reward: [(0, '24.786')]
[2023-02-23 11:22:32,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3956736. Throughput: 0: 896.4. Samples: 989800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:22:32,335][05868] Avg episode reward: [(0, '25.239')]
[2023-02-23 11:22:36,758][17741] Updated weights for policy 0, policy_version 970 (0.0041)
[2023-02-23 11:22:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3973120. Throughput: 0: 885.8. Samples: 992578. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:22:37,338][05868] Avg episode reward: [(0, '25.539')]
[2023-02-23 11:22:42,334][05868] Fps is (10 sec: 2866.6, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3985408. Throughput: 0: 858.2. Samples: 996592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:22:42,337][05868] Avg episode reward: [(0, '26.679')]
[2023-02-23 11:22:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 4001792. Throughput: 0: 874.6. Samples: 1001508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:47,335][05868] Avg episode reward: [(0, '27.634')]
[2023-02-23 11:22:47,344][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000977_4001792.pth...
[2023-02-23 11:22:47,491][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth
[2023-02-23 11:22:47,595][17728] Stopping Batcher_0...
[2023-02-23 11:22:47,596][17728] Loop batcher_evt_loop terminating...
[2023-02-23 11:22:47,596][05868] Component Batcher_0 stopped!
[2023-02-23 11:22:47,607][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:22:47,661][17746] Stopping RolloutWorker_w3...
[2023-02-23 11:22:47,659][05868] Component RolloutWorker_w6 stopped!
[2023-02-23 11:22:47,666][05868] Component RolloutWorker_w3 stopped!
[2023-02-23 11:22:47,671][17749] Stopping RolloutWorker_w6...
[2023-02-23 11:22:47,672][17749] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 11:22:47,662][17746] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 11:22:47,677][17748] Stopping RolloutWorker_w5...
[2023-02-23 11:22:47,677][05868] Component RolloutWorker_w5 stopped!
[2023-02-23 11:22:47,683][17744] Stopping RolloutWorker_w1...
[2023-02-23 11:22:47,683][05868] Component RolloutWorker_w1 stopped!
[2023-02-23 11:22:47,678][17748] Loop rollout_proc5_evt_loop terminating...
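Each `Fps is (10 sec: …, 60 sec: …, 300 sec: …)` entry above reports the frame rate averaged over three trailing windows: frames advanced divided by elapsed time within the window. A small, self-contained sketch of that kind of windowed rate computation (illustrative names; not the actual Sample Factory code):

```python
def windowed_fps(samples, window):
    """Average frames/sec over the trailing `window` seconds.

    `samples` is a list of (timestamp, total_frames) pairs, oldest first.
    The rate is taken between the newest sample and the oldest sample
    still inside the window.
    """
    t_now, f_now = samples[-1]
    # keep only samples that fall within the trailing window
    past = [(t, f) for (t, f) in samples if t_now - t <= window]
    t_old, f_old = past[0]
    if t_now == t_old:
        return 0.0  # not enough history to measure a rate
    return (f_now - f_old) / (t_now - t_old)
```

With samples every 5 seconds and three window sizes, this yields the three numbers each stats line prints; the 10-second figure reacts quickly while the 300-second figure smooths over rollout and learner stalls.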
[2023-02-23 11:22:47,693][17745] Stopping RolloutWorker_w2...
[2023-02-23 11:22:47,693][05868] Component RolloutWorker_w2 stopped!
[2023-02-23 11:22:47,687][17744] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 11:22:47,702][17747] Stopping RolloutWorker_w4...
[2023-02-23 11:22:47,702][05868] Component RolloutWorker_w4 stopped!
[2023-02-23 11:22:47,708][17747] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 11:22:47,710][17743] Stopping RolloutWorker_w0...
[2023-02-23 11:22:47,711][17743] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 11:22:47,709][05868] Component RolloutWorker_w0 stopped!
[2023-02-23 11:22:47,706][17745] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 11:22:47,721][17741] Weights refcount: 2 0
[2023-02-23 11:22:47,724][05868] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 11:22:47,729][17741] Stopping InferenceWorker_p0-w0...
[2023-02-23 11:22:47,730][17741] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 11:22:47,753][17750] Stopping RolloutWorker_w7...
[2023-02-23 11:22:47,753][05868] Component RolloutWorker_w7 stopped!
[2023-02-23 11:22:47,753][17750] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 11:22:47,818][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000876_3588096.pth
[2023-02-23 11:22:47,832][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:22:48,038][05868] Component LearnerWorker_p0 stopped!
[2023-02-23 11:22:48,046][05868] Waiting for process learner_proc0 to stop...
[2023-02-23 11:22:48,051][17728] Stopping LearnerWorker_p0...
[2023-02-23 11:22:48,051][17728] Loop learner_proc0_evt_loop terminating...
[2023-02-23 11:22:49,828][05868] Waiting for process inference_proc0-0 to join...
[2023-02-23 11:22:50,147][05868] Waiting for process rollout_proc0 to join...
[2023-02-23 11:22:50,673][05868] Waiting for process rollout_proc1 to join...
[2023-02-23 11:22:50,676][05868] Waiting for process rollout_proc2 to join...
[2023-02-23 11:22:50,685][05868] Waiting for process rollout_proc3 to join...
[2023-02-23 11:22:50,686][05868] Waiting for process rollout_proc4 to join...
[2023-02-23 11:22:50,687][05868] Waiting for process rollout_proc5 to join...
[2023-02-23 11:22:50,690][05868] Waiting for process rollout_proc6 to join...
[2023-02-23 11:22:50,691][05868] Waiting for process rollout_proc7 to join...
[2023-02-23 11:22:50,692][05868] Batcher 0 profile tree view:
batching: 26.9477, releasing_batches: 0.0262
[2023-02-23 11:22:50,694][05868] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 553.1603
update_model: 8.1936
  weight_update: 0.0044
one_step: 0.0270
  handle_policy_step: 565.2490
    deserialize: 15.9555, stack: 3.1990, obs_to_device_normalize: 121.1091, forward: 278.3161, send_messages: 27.2497
    prepare_outputs: 90.4579
      to_cpu: 56.4857
[2023-02-23 11:22:50,697][05868] Learner 0 profile tree view:
misc: 0.0062, prepare_batch: 17.5318
train: 77.1227
  epoch_init: 0.0112, minibatch_init: 0.0109, losses_postprocess: 0.6452, kl_divergence: 0.5561, after_optimizer: 32.9744
  calculate_losses: 27.2532
    losses_init: 0.0040, forward_head: 1.7816, bptt_initial: 17.8857, tail: 1.2159, advantages_returns: 0.3498, losses: 3.3914
    bptt: 2.3067
      bptt_forward_core: 2.2317
  update: 15.0318
    clip: 1.4671
[2023-02-23 11:22:50,698][05868] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4025, enqueue_policy_requests: 158.4457, env_step: 876.3397, overhead: 24.1461, complete_rollouts: 7.9883
save_policy_outputs: 21.9482
  split_output_tensors: 10.5565
[2023-02-23 11:22:50,700][05868] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4428, enqueue_policy_requests: 156.8326, env_step: 876.6256, overhead: 23.6488, complete_rollouts: 6.9232
save_policy_outputs: 22.0293
  split_output_tensors: 10.5762
[2023-02-23 11:22:50,701][05868] Loop Runner_EvtLoop terminating...
[2023-02-23 11:22:50,703][05868] Runner profile tree view:
main_loop: 1198.3333
[2023-02-23 11:22:50,704][05868] Collected {0: 4005888}, FPS: 3342.9
[2023-02-23 11:27:41,816][05868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 11:27:41,818][05868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 11:27:41,822][05868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 11:27:41,825][05868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 11:27:41,829][05868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:27:41,831][05868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 11:27:41,833][05868] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:27:41,836][05868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 11:27:41,838][05868] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 11:27:41,841][05868] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 11:27:41,844][05868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 11:27:41,846][05868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 11:27:41,847][05868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 11:27:41,849][05868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
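The `Overriding arg …` and `Adding new argument … that is not in the saved config file!` messages above come from merging the saved experiment config with evaluation-time arguments. A hedged sketch of that merge logic (simplified; not Sample Factory's actual implementation):

```python
def merge_config(saved_cfg, new_args):
    """Merge evaluation-time arguments into a saved experiment config dict.

    Returns the merged config plus messages mirroring the log output:
    keys already in the saved config are overridden; unknown keys are
    added with a warning.
    """
    cfg = dict(saved_cfg)
    messages = []
    for key, value in new_args.items():
        if key in cfg:
            if cfg[key] != value:
                messages.append(
                    f"Overriding arg '{key}' with value {value} passed from command line"
                )
        else:
            messages.append(
                f"Adding new argument '{key}'={value} that is not in the saved config file!"
            )
        cfg[key] = value
    return cfg, messages
```

For instance, merging `{"num_workers": 1, "no_render": True}` into a saved config that only defines `num_workers` produces one override message and one "Adding new argument" message, matching the pattern logged above.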
[2023-02-23 11:27:41,850][05868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 11:27:41,875][05868] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:27:41,878][05868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:27:41,882][05868] RunningMeanStd input shape: (1,)
[2023-02-23 11:27:41,899][05868] ConvEncoder: input_channels=3
[2023-02-23 11:27:42,594][05868] Conv encoder output size: 512
[2023-02-23 11:27:42,597][05868] Policy head output size: 512
[2023-02-23 11:27:44,939][05868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:27:46,207][05868] Num frames 100...
[2023-02-23 11:27:46,342][05868] Num frames 200...
[2023-02-23 11:27:46,464][05868] Num frames 300...
[2023-02-23 11:27:46,582][05868] Num frames 400...
[2023-02-23 11:27:46,694][05868] Num frames 500...
[2023-02-23 11:27:46,805][05868] Num frames 600...
[2023-02-23 11:27:46,926][05868] Num frames 700...
[2023-02-23 11:27:47,021][05868] Avg episode rewards: #0: 16.360, true rewards: #0: 7.360
[2023-02-23 11:27:47,022][05868] Avg episode reward: 16.360, avg true_objective: 7.360
[2023-02-23 11:27:47,101][05868] Num frames 800...
[2023-02-23 11:27:47,237][05868] Num frames 900...
[2023-02-23 11:27:47,366][05868] Num frames 1000...
[2023-02-23 11:27:47,478][05868] Num frames 1100...
[2023-02-23 11:27:47,588][05868] Num frames 1200...
[2023-02-23 11:27:47,697][05868] Num frames 1300...
[2023-02-23 11:27:47,812][05868] Num frames 1400...
[2023-02-23 11:27:47,926][05868] Num frames 1500...
[2023-02-23 11:27:48,047][05868] Num frames 1600...
[2023-02-23 11:27:48,140][05868] Avg episode rewards: #0: 17.660, true rewards: #0: 8.160
[2023-02-23 11:27:48,141][05868] Avg episode reward: 17.660, avg true_objective: 8.160
[2023-02-23 11:27:48,239][05868] Num frames 1700...
[2023-02-23 11:27:48,361][05868] Num frames 1800...
[2023-02-23 11:27:48,483][05868] Num frames 1900...
[2023-02-23 11:27:48,593][05868] Num frames 2000...
[2023-02-23 11:27:48,719][05868] Num frames 2100...
[2023-02-23 11:27:48,874][05868] Avg episode rewards: #0: 15.253, true rewards: #0: 7.253
[2023-02-23 11:27:48,877][05868] Avg episode reward: 15.253, avg true_objective: 7.253
[2023-02-23 11:27:48,911][05868] Num frames 2200...
[2023-02-23 11:27:49,033][05868] Num frames 2300...
[2023-02-23 11:27:49,149][05868] Num frames 2400...
[2023-02-23 11:27:49,262][05868] Num frames 2500...
[2023-02-23 11:27:49,385][05868] Num frames 2600...
[2023-02-23 11:27:49,495][05868] Num frames 2700...
[2023-02-23 11:27:49,618][05868] Num frames 2800...
[2023-02-23 11:27:49,730][05868] Num frames 2900...
[2023-02-23 11:27:49,855][05868] Num frames 3000...
[2023-02-23 11:27:49,949][05868] Avg episode rewards: #0: 16.833, true rewards: #0: 7.582
[2023-02-23 11:27:49,952][05868] Avg episode reward: 16.833, avg true_objective: 7.582
[2023-02-23 11:27:50,035][05868] Num frames 3100...
[2023-02-23 11:27:50,152][05868] Num frames 3200...
[2023-02-23 11:27:50,285][05868] Num frames 3300...
[2023-02-23 11:27:50,418][05868] Num frames 3400...
[2023-02-23 11:27:50,555][05868] Num frames 3500...
[2023-02-23 11:27:50,677][05868] Num frames 3600...
[2023-02-23 11:27:50,797][05868] Num frames 3700...
[2023-02-23 11:27:50,917][05868] Num frames 3800...
[2023-02-23 11:27:50,976][05868] Avg episode rewards: #0: 17.002, true rewards: #0: 7.602
[2023-02-23 11:27:50,977][05868] Avg episode reward: 17.002, avg true_objective: 7.602
[2023-02-23 11:27:51,096][05868] Num frames 3900...
[2023-02-23 11:27:51,218][05868] Num frames 4000...
[2023-02-23 11:27:51,345][05868] Num frames 4100...
[2023-02-23 11:27:51,528][05868] Num frames 4200...
[2023-02-23 11:27:51,684][05868] Num frames 4300...
[2023-02-23 11:27:51,840][05868] Num frames 4400...
[2023-02-23 11:27:51,996][05868] Num frames 4500...
[2023-02-23 11:27:52,157][05868] Num frames 4600...
[2023-02-23 11:27:52,319][05868] Num frames 4700...
[2023-02-23 11:27:52,485][05868] Num frames 4800...
[2023-02-23 11:27:52,654][05868] Num frames 4900...
[2023-02-23 11:27:52,812][05868] Num frames 5000...
[2023-02-23 11:27:52,970][05868] Num frames 5100...
[2023-02-23 11:27:53,134][05868] Num frames 5200...
[2023-02-23 11:27:53,301][05868] Num frames 5300...
[2023-02-23 11:27:53,459][05868] Num frames 5400...
[2023-02-23 11:27:53,632][05868] Num frames 5500...
[2023-02-23 11:27:53,797][05868] Num frames 5600...
[2023-02-23 11:27:53,971][05868] Num frames 5700...
[2023-02-23 11:27:54,150][05868] Num frames 5800...
[2023-02-23 11:27:54,317][05868] Num frames 5900...
[2023-02-23 11:27:54,382][05868] Avg episode rewards: #0: 23.168, true rewards: #0: 9.835
[2023-02-23 11:27:54,384][05868] Avg episode reward: 23.168, avg true_objective: 9.835
[2023-02-23 11:27:54,550][05868] Num frames 6000...
[2023-02-23 11:27:54,714][05868] Num frames 6100...
[2023-02-23 11:27:54,886][05868] Num frames 6200...
[2023-02-23 11:27:55,010][05868] Num frames 6300...
[2023-02-23 11:27:55,122][05868] Num frames 6400...
[2023-02-23 11:27:55,242][05868] Num frames 6500...
[2023-02-23 11:27:55,352][05868] Num frames 6600...
[2023-02-23 11:27:55,490][05868] Avg episode rewards: #0: 21.813, true rewards: #0: 9.527
[2023-02-23 11:27:55,492][05868] Avg episode reward: 21.813, avg true_objective: 9.527
[2023-02-23 11:27:55,536][05868] Num frames 6700...
[2023-02-23 11:27:55,666][05868] Num frames 6800...
[2023-02-23 11:27:55,794][05868] Num frames 6900...
[2023-02-23 11:27:55,914][05868] Num frames 7000...
[2023-02-23 11:27:56,024][05868] Num frames 7100...
[2023-02-23 11:27:56,135][05868] Num frames 7200...
[2023-02-23 11:27:56,248][05868] Num frames 7300...
[2023-02-23 11:27:56,359][05868] Num frames 7400...
[2023-02-23 11:27:56,471][05868] Num frames 7500...
[2023-02-23 11:27:56,589][05868] Num frames 7600...
[2023-02-23 11:27:56,679][05868] Avg episode rewards: #0: 22.150, true rewards: #0: 9.525
[2023-02-23 11:27:56,681][05868] Avg episode reward: 22.150, avg true_objective: 9.525
[2023-02-23 11:27:56,784][05868] Num frames 7700...
[2023-02-23 11:27:56,906][05868] Num frames 7800...
[2023-02-23 11:27:57,018][05868] Num frames 7900...
[2023-02-23 11:27:57,139][05868] Num frames 8000...
[2023-02-23 11:27:57,252][05868] Num frames 8100...
[2023-02-23 11:27:57,363][05868] Num frames 8200...
[2023-02-23 11:27:57,456][05868] Avg episode rewards: #0: 20.698, true rewards: #0: 9.142
[2023-02-23 11:27:57,458][05868] Avg episode reward: 20.698, avg true_objective: 9.142
[2023-02-23 11:27:57,562][05868] Num frames 8300...
[2023-02-23 11:27:57,696][05868] Num frames 8400...
[2023-02-23 11:27:57,808][05868] Num frames 8500...
[2023-02-23 11:27:57,923][05868] Num frames 8600...
[2023-02-23 11:27:58,036][05868] Num frames 8700...
[2023-02-23 11:27:58,155][05868] Num frames 8800...
[2023-02-23 11:27:58,271][05868] Num frames 8900...
[2023-02-23 11:27:58,389][05868] Num frames 9000...
[2023-02-23 11:27:58,513][05868] Num frames 9100...
[2023-02-23 11:27:58,626][05868] Num frames 9200...
[2023-02-23 11:27:58,745][05868] Num frames 9300...
[2023-02-23 11:27:58,857][05868] Num frames 9400...
[2023-02-23 11:27:58,966][05868] Num frames 9500...
[2023-02-23 11:27:59,086][05868] Num frames 9600...
[2023-02-23 11:27:59,205][05868] Num frames 9700...
[2023-02-23 11:27:59,326][05868] Num frames 9800...
[2023-02-23 11:27:59,396][05868] Avg episode rewards: #0: 22.309, true rewards: #0: 9.809
[2023-02-23 11:27:59,398][05868] Avg episode reward: 22.309, avg true_objective: 9.809
[2023-02-23 11:29:02,301][05868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 11:31:46,553][05868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 11:31:46,555][05868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 11:31:46,556][05868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 11:31:46,559][05868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 11:31:46,561][05868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:31:46,562][05868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 11:31:46,563][05868] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 11:31:46,565][05868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 11:31:46,566][05868] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 11:31:46,567][05868] Adding new argument 'hf_repository'='iubeda/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 11:31:46,568][05868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 11:31:46,569][05868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 11:31:46,570][05868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 11:31:46,572][05868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 11:31:46,573][05868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 11:31:46,602][05868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:31:46,604][05868] RunningMeanStd input shape: (1,)
[2023-02-23 11:31:46,620][05868] ConvEncoder: input_channels=3
[2023-02-23 11:31:46,658][05868] Conv encoder output size: 512
[2023-02-23 11:31:46,659][05868] Policy head output size: 512
[2023-02-23 11:31:46,682][05868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:31:47,129][05868] Num frames 100...
[2023-02-23 11:31:47,275][05868] Num frames 200...
[2023-02-23 11:31:47,397][05868] Num frames 300...
[2023-02-23 11:31:47,514][05868] Num frames 400...
[2023-02-23 11:31:47,630][05868] Num frames 500...
[2023-02-23 11:31:47,753][05868] Num frames 600...
[2023-02-23 11:31:47,884][05868] Num frames 700...
[2023-02-23 11:31:48,007][05868] Num frames 800...
[2023-02-23 11:31:48,130][05868] Num frames 900...
[2023-02-23 11:31:48,245][05868] Num frames 1000...
[2023-02-23 11:31:48,362][05868] Num frames 1100...
[2023-02-23 11:31:48,480][05868] Num frames 1200...
[2023-02-23 11:31:48,606][05868] Num frames 1300...
[2023-02-23 11:31:48,750][05868] Avg episode rewards: #0: 33.760, true rewards: #0: 13.760
[2023-02-23 11:31:48,752][05868] Avg episode reward: 33.760, avg true_objective: 13.760
[2023-02-23 11:31:48,786][05868] Num frames 1400...
[2023-02-23 11:31:48,903][05868] Num frames 1500...
[2023-02-23 11:31:49,020][05868] Num frames 1600...
[2023-02-23 11:31:49,135][05868] Num frames 1700...
[2023-02-23 11:31:49,254][05868] Num frames 1800...
[2023-02-23 11:31:49,389][05868] Num frames 1900...
[2023-02-23 11:31:49,511][05868] Num frames 2000...
[2023-02-23 11:31:49,621][05868] Num frames 2100...
[2023-02-23 11:31:49,734][05868] Num frames 2200...
[2023-02-23 11:31:49,852][05868] Num frames 2300...
[2023-02-23 11:31:49,970][05868] Num frames 2400...
[2023-02-23 11:31:50,104][05868] Num frames 2500...
[2023-02-23 11:31:50,241][05868] Num frames 2600...
[2023-02-23 11:31:50,356][05868] Num frames 2700...
[2023-02-23 11:31:50,467][05868] Num frames 2800...
[2023-02-23 11:31:50,581][05868] Num frames 2900...
[2023-02-23 11:31:50,696][05868] Num frames 3000...
[2023-02-23 11:31:50,841][05868] Avg episode rewards: #0: 38.335, true rewards: #0: 15.335
[2023-02-23 11:31:50,844][05868] Avg episode reward: 38.335, avg true_objective: 15.335
[2023-02-23 11:31:50,896][05868] Num frames 3100...
[2023-02-23 11:31:51,026][05868] Num frames 3200...
[2023-02-23 11:31:51,146][05868] Num frames 3300...
[2023-02-23 11:31:51,258][05868] Num frames 3400...
[2023-02-23 11:31:51,377][05868] Num frames 3500...
[2023-02-23 11:31:51,495][05868] Num frames 3600...
[2023-02-23 11:31:51,609][05868] Num frames 3700...
[2023-02-23 11:31:51,745][05868] Num frames 3800...
[2023-02-23 11:31:51,861][05868] Num frames 3900...
[2023-02-23 11:31:51,980][05868] Num frames 4000...
[2023-02-23 11:31:52,108][05868] Avg episode rewards: #0: 33.197, true rewards: #0: 13.530
[2023-02-23 11:31:52,112][05868] Avg episode reward: 33.197, avg true_objective: 13.530
[2023-02-23 11:31:52,164][05868] Num frames 4100...
[2023-02-23 11:31:52,297][05868] Num frames 4200...
[2023-02-23 11:31:52,422][05868] Num frames 4300...
[2023-02-23 11:31:52,568][05868] Num frames 4400...
[2023-02-23 11:31:52,691][05868] Num frames 4500...
[2023-02-23 11:31:52,820][05868] Num frames 4600...
[2023-02-23 11:31:52,932][05868] Num frames 4700...
[2023-02-23 11:31:53,072][05868] Num frames 4800...
[2023-02-23 11:31:53,197][05868] Avg episode rewards: #0: 29.647, true rewards: #0: 12.147
[2023-02-23 11:31:53,199][05868] Avg episode reward: 29.647, avg true_objective: 12.147
[2023-02-23 11:31:53,316][05868] Num frames 4900...
[2023-02-23 11:31:53,581][05868] Num frames 5000...
[2023-02-23 11:31:53,820][05868] Num frames 5100...
[2023-02-23 11:31:54,066][05868] Num frames 5200...
[2023-02-23 11:31:54,344][05868] Num frames 5300...
[2023-02-23 11:31:54,635][05868] Num frames 5400...
[2023-02-23 11:31:54,856][05868] Num frames 5500...
[2023-02-23 11:31:55,055][05868] Num frames 5600...
[2023-02-23 11:31:55,280][05868] Num frames 5700...
[2023-02-23 11:31:55,614][05868] Num frames 5800...
[2023-02-23 11:31:55,878][05868] Num frames 5900...
[2023-02-23 11:31:56,194][05868] Avg episode rewards: #0: 28.158, true rewards: #0: 11.958
[2023-02-23 11:31:56,206][05868] Avg episode reward: 28.158, avg true_objective: 11.958
[2023-02-23 11:31:56,289][05868] Num frames 6000...
[2023-02-23 11:31:56,618][05868] Num frames 6100...
[2023-02-23 11:31:56,835][05868] Num frames 6200...
[2023-02-23 11:31:56,994][05868] Num frames 6300...
[2023-02-23 11:31:57,166][05868] Num frames 6400...
[2023-02-23 11:31:57,337][05868] Num frames 6500...
[2023-02-23 11:31:57,532][05868] Avg episode rewards: #0: 25.312, true rewards: #0: 10.978
[2023-02-23 11:31:57,537][05868] Avg episode reward: 25.312, avg true_objective: 10.978
[2023-02-23 11:31:57,568][05868] Num frames 6600...
[2023-02-23 11:31:57,746][05868] Num frames 6700...
[2023-02-23 11:31:57,908][05868] Num frames 6800...
[2023-02-23 11:31:58,071][05868] Num frames 6900...
[2023-02-23 11:31:58,254][05868] Num frames 7000...
[2023-02-23 11:31:58,420][05868] Num frames 7100...
[2023-02-23 11:31:58,574][05868] Avg episode rewards: #0: 23.224, true rewards: #0: 10.224
[2023-02-23 11:31:58,577][05868] Avg episode reward: 23.224, avg true_objective: 10.224
[2023-02-23 11:31:58,652][05868] Num frames 7200...
[2023-02-23 11:31:58,810][05868] Num frames 7300...
[2023-02-23 11:31:58,970][05868] Num frames 7400...
[2023-02-23 11:31:59,133][05868] Num frames 7500...
[2023-02-23 11:31:59,309][05868] Num frames 7600...
[2023-02-23 11:31:59,477][05868] Num frames 7700...
[2023-02-23 11:31:59,639][05868] Avg episode rewards: #0: 21.456, true rewards: #0: 9.706
[2023-02-23 11:31:59,641][05868] Avg episode reward: 21.456, avg true_objective: 9.706
[2023-02-23 11:31:59,700][05868] Num frames 7800...
[2023-02-23 11:31:59,823][05868] Num frames 7900...
[2023-02-23 11:31:59,933][05868] Num frames 8000...
[2023-02-23 11:32:00,048][05868] Num frames 8100...
[2023-02-23 11:32:00,167][05868] Num frames 8200...
[2023-02-23 11:32:00,297][05868] Num frames 8300...
[2023-02-23 11:32:00,414][05868] Num frames 8400...
[2023-02-23 11:32:00,525][05868] Num frames 8500...
[2023-02-23 11:32:00,621][05868] Avg episode rewards: #0: 20.814, true rewards: #0: 9.481
[2023-02-23 11:32:00,623][05868] Avg episode reward: 20.814, avg true_objective: 9.481
[2023-02-23 11:32:00,711][05868] Num frames 8600...
[2023-02-23 11:32:00,833][05868] Num frames 8700...
[2023-02-23 11:32:00,945][05868] Num frames 8800...
[2023-02-23 11:32:01,064][05868] Num frames 8900...
[2023-02-23 11:32:01,189][05868] Num frames 9000...
[2023-02-23 11:32:01,322][05868] Num frames 9100...
[2023-02-23 11:32:01,434][05868] Num frames 9200...
[2023-02-23 11:32:01,543][05868] Num frames 9300...
[2023-02-23 11:32:01,656][05868] Num frames 9400...
[2023-02-23 11:32:01,769][05868] Num frames 9500...
[2023-02-23 11:32:01,889][05868] Num frames 9600...
[2023-02-23 11:32:02,015][05868] Num frames 9700...
[2023-02-23 11:32:02,139][05868] Num frames 9800...
[2023-02-23 11:32:02,255][05868] Num frames 9900...
[2023-02-23 11:32:02,379][05868] Num frames 10000...
[2023-02-23 11:32:02,493][05868] Num frames 10100...
[2023-02-23 11:32:02,625][05868] Num frames 10200...
[2023-02-23 11:32:02,743][05868] Num frames 10300...
[2023-02-23 11:32:02,868][05868] Num frames 10400...
[2023-02-23 11:32:02,980][05868] Num frames 10500...
[2023-02-23 11:32:03,097][05868] Num frames 10600...
[2023-02-23 11:32:03,192][05868] Avg episode rewards: #0: 24.633, true rewards: #0: 10.633
[2023-02-23 11:32:03,194][05868] Avg episode reward: 24.633, avg true_objective: 10.633
[2023-02-23 11:33:14,321][05868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!