[2023-02-23 11:02:52,065][05868] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 11:02:52,068][05868] Rollout worker 0 uses device cpu
[2023-02-23 11:02:52,073][05868] Rollout worker 1 uses device cpu
[2023-02-23 11:02:52,074][05868] Rollout worker 2 uses device cpu
[2023-02-23 11:02:52,075][05868] Rollout worker 3 uses device cpu
[2023-02-23 11:02:52,076][05868] Rollout worker 4 uses device cpu
[2023-02-23 11:02:52,077][05868] Rollout worker 5 uses device cpu
[2023-02-23 11:02:52,082][05868] Rollout worker 6 uses device cpu
[2023-02-23 11:02:52,084][05868] Rollout worker 7 uses device cpu
[2023-02-23 11:02:52,321][05868] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:02:52,326][05868] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 11:02:52,370][05868] Starting all processes...
[2023-02-23 11:02:52,373][05868] Starting process learner_proc0
[2023-02-23 11:02:52,450][05868] Starting all processes...
[2023-02-23 11:02:52,466][05868] Starting process inference_proc0-0
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc0
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc1
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc2
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc3
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc4
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc5
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc6
[2023-02-23 11:02:52,471][05868] Starting process rollout_proc7
[2023-02-23 11:03:02,845][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:02,849][17728] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 11:03:03,045][17741] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:03,046][17741] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 11:03:03,057][17744] Worker 1 uses CPU cores [1]
[2023-02-23 11:03:03,061][17750] Worker 7 uses CPU cores [1]
[2023-02-23 11:03:03,141][17745] Worker 2 uses CPU cores [0]
[2023-02-23 11:03:03,311][17747] Worker 4 uses CPU cores [0]
[2023-02-23 11:03:03,365][17748] Worker 5 uses CPU cores [1]
[2023-02-23 11:03:03,391][17746] Worker 3 uses CPU cores [1]
[2023-02-23 11:03:03,531][17749] Worker 6 uses CPU cores [0]
[2023-02-23 11:03:03,534][17743] Worker 0 uses CPU cores [0]
[2023-02-23 11:03:03,643][17728] Num visible devices: 1
[2023-02-23 11:03:03,643][17741] Num visible devices: 1
[2023-02-23 11:03:03,653][17728] Starting seed is not provided
[2023-02-23 11:03:03,653][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:03,654][17728] Initializing actor-critic model on device cuda:0
[2023-02-23 11:03:03,655][17728] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:03:03,657][17728] RunningMeanStd input shape: (1,)
[2023-02-23 11:03:03,669][17728] ConvEncoder: input_channels=3
[2023-02-23 11:03:03,941][17728] Conv encoder output size: 512
[2023-02-23 11:03:03,941][17728] Policy head output size: 512
[2023-02-23 11:03:03,988][17728] Created Actor Critic model with architecture:
[2023-02-23 11:03:03,988][17728] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
[2023-02-23 11:03:11,326][17728] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 11:03:11,327][17728] No checkpoints found
[2023-02-23 11:03:11,327][17728] Did not load from checkpoint, starting from scratch!
[2023-02-23 11:03:11,327][17728] Initialized policy 0 weights for model version 0
[2023-02-23 11:03:11,331][17728] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 11:03:11,338][17728] LearnerWorker_p0 finished initialization!
[2023-02-23 11:03:11,448][17741] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:03:11,449][17741] RunningMeanStd input shape: (1,)
[2023-02-23 11:03:11,461][17741] ConvEncoder: input_channels=3
[2023-02-23 11:03:11,572][17741] Conv encoder output size: 512
[2023-02-23 11:03:11,573][17741] Policy head output size: 512
[2023-02-23 11:03:12,309][05868] Heartbeat connected on Batcher_0
[2023-02-23 11:03:12,316][05868] Heartbeat connected on LearnerWorker_p0
[2023-02-23 11:03:12,332][05868] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:12,337][05868] Heartbeat connected on RolloutWorker_w0
[2023-02-23 11:03:12,345][05868] Heartbeat connected on RolloutWorker_w1
[2023-02-23 11:03:12,347][05868] Heartbeat connected on RolloutWorker_w2
[2023-02-23 11:03:12,352][05868] Heartbeat connected on RolloutWorker_w3
[2023-02-23 11:03:12,356][05868] Heartbeat connected on RolloutWorker_w4
[2023-02-23 11:03:12,359][05868] Heartbeat connected on RolloutWorker_w5
[2023-02-23 11:03:12,366][05868] Heartbeat connected on RolloutWorker_w6
[2023-02-23 11:03:12,371][05868] Heartbeat connected on RolloutWorker_w7
[2023-02-23 11:03:13,890][05868] Inference worker 0-0 is ready!
[2023-02-23 11:03:13,892][05868] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 11:03:13,902][05868] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 11:03:14,009][17743] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,020][17747] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,028][17745] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,038][17746] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,051][17750] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,047][17749] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,056][17744] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:14,069][17748] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:03:15,235][17750] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17744] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17746] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,236][17745] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,237][17747] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,234][17743] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,920][17749] Decorrelating experience for 0 frames...
[2023-02-23 11:03:15,924][17745] Decorrelating experience for 32 frames...
[2023-02-23 11:03:15,934][17746] Decorrelating experience for 32 frames...
[2023-02-23 11:03:15,936][17744] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,725][17750] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,853][17744] Decorrelating experience for 64 frames...
[2023-02-23 11:03:16,979][17743] Decorrelating experience for 32 frames...
[2023-02-23 11:03:16,993][17749] Decorrelating experience for 32 frames...
[2023-02-23 11:03:17,100][17745] Decorrelating experience for 64 frames...
[2023-02-23 11:03:17,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:17,608][17750] Decorrelating experience for 64 frames...
[2023-02-23 11:03:17,689][17744] Decorrelating experience for 96 frames...
[2023-02-23 11:03:18,141][17747] Decorrelating experience for 32 frames...
[2023-02-23 11:03:18,364][17745] Decorrelating experience for 96 frames...
[2023-02-23 11:03:18,389][17743] Decorrelating experience for 64 frames...
[2023-02-23 11:03:18,843][17748] Decorrelating experience for 0 frames...
[2023-02-23 11:03:19,121][17750] Decorrelating experience for 96 frames...
[2023-02-23 11:03:19,491][17748] Decorrelating experience for 32 frames...
[2023-02-23 11:03:20,068][17749] Decorrelating experience for 64 frames...
[2023-02-23 11:03:20,794][17747] Decorrelating experience for 64 frames...
[2023-02-23 11:03:20,898][17743] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,273][17746] Decorrelating experience for 64 frames...
[2023-02-23 11:03:21,315][17749] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,599][17747] Decorrelating experience for 96 frames...
[2023-02-23 11:03:21,710][17748] Decorrelating experience for 64 frames...
[2023-02-23 11:03:22,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:22,593][17746] Decorrelating experience for 96 frames...
[2023-02-23 11:03:22,768][17748] Decorrelating experience for 96 frames...
[2023-02-23 11:03:27,332][05868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 93.7. Samples: 1406. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 11:03:27,340][05868] Avg episode reward: [(0, '1.377')]
[2023-02-23 11:03:27,825][17728] Signal inference workers to stop experience collection...
[2023-02-23 11:03:27,854][17741] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 11:03:30,259][17728] Signal inference workers to resume experience collection...
[2023-02-23 11:03:30,260][17741] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 11:03:32,332][05868] Fps is (10 sec: 1228.8, 60 sec: 614.4, 300 sec: 614.4). Total num frames: 12288. Throughput: 0: 161.4. Samples: 3228. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-23 11:03:32,334][05868] Avg episode reward: [(0, '3.061')]
[2023-02-23 11:03:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 1310.7, 300 sec: 1310.7). Total num frames: 32768. Throughput: 0: 250.7. Samples: 6268. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-23 11:03:37,338][05868] Avg episode reward: [(0, '3.835')]
[2023-02-23 11:03:39,390][17741] Updated weights for policy 0, policy_version 10 (0.0012)
[2023-02-23 11:03:42,338][05868] Fps is (10 sec: 3684.1, 60 sec: 1638.1, 300 sec: 1638.1). Total num frames: 49152. Throughput: 0: 386.2. Samples: 11588. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:03:42,346][05868] Avg episode reward: [(0, '4.414')]
[2023-02-23 11:03:47,335][05868] Fps is (10 sec: 2866.3, 60 sec: 1755.3, 300 sec: 1755.3). Total num frames: 61440. Throughput: 0: 446.0. Samples: 15610. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-23 11:03:47,342][05868] Avg episode reward: [(0, '4.585')]
[2023-02-23 11:03:52,333][05868] Fps is (10 sec: 2868.6, 60 sec: 1945.5, 300 sec: 1945.5). Total num frames: 77824. Throughput: 0: 450.3. Samples: 18012. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:03:52,342][05868] Avg episode reward: [(0, '4.452')]
[2023-02-23 11:03:52,845][17741] Updated weights for policy 0, policy_version 20 (0.0022)
[2023-02-23 11:03:57,332][05868] Fps is (10 sec: 2868.1, 60 sec: 2002.5, 300 sec: 2002.5). Total num frames: 90112. Throughput: 0: 516.8. Samples: 23254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:03:57,339][05868] Avg episode reward: [(0, '4.402')]
[2023-02-23 11:04:02,332][05868] Fps is (10 sec: 2867.5, 60 sec: 2129.9, 300 sec: 2129.9). Total num frames: 106496. Throughput: 0: 619.9. Samples: 27896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:02,337][05868] Avg episode reward: [(0, '4.340')]
[2023-02-23 11:04:02,346][17728] Saving new best policy, reward=4.340!
[2023-02-23 11:04:07,137][17741] Updated weights for policy 0, policy_version 30 (0.0018)
[2023-02-23 11:04:07,338][05868] Fps is (10 sec: 3274.8, 60 sec: 2233.9, 300 sec: 2233.9). Total num frames: 122880. Throughput: 0: 664.0. Samples: 29882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:07,341][05868] Avg episode reward: [(0, '4.508')]
[2023-02-23 11:04:07,355][17728] Saving new best policy, reward=4.508!
[2023-02-23 11:04:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 2321.1, 300 sec: 2321.1). Total num frames: 139264. Throughput: 0: 729.7. Samples: 34242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:12,336][05868] Avg episode reward: [(0, '4.442')]
[2023-02-23 11:04:17,332][05868] Fps is (10 sec: 3688.7, 60 sec: 2662.4, 300 sec: 2457.6). Total num frames: 159744. Throughput: 0: 826.3. Samples: 40412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:17,340][05868] Avg episode reward: [(0, '4.209')]
[2023-02-23 11:04:18,055][17741] Updated weights for policy 0, policy_version 40 (0.0016)
[2023-02-23 11:04:22,332][05868] Fps is (10 sec: 3686.4, 60 sec: 2935.5, 300 sec: 2516.1). Total num frames: 176128. Throughput: 0: 828.7. Samples: 43558. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:22,340][05868] Avg episode reward: [(0, '4.239')]
[2023-02-23 11:04:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2512.2). Total num frames: 188416. Throughput: 0: 799.6. Samples: 47564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:27,337][05868] Avg episode reward: [(0, '4.367')]
[2023-02-23 11:04:31,480][17741] Updated weights for policy 0, policy_version 50 (0.0017)
[2023-02-23 11:04:32,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2560.0). Total num frames: 204800. Throughput: 0: 810.9. Samples: 52100. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:04:32,341][05868] Avg episode reward: [(0, '4.453')]
[2023-02-23 11:04:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2650.4). Total num frames: 225280. Throughput: 0: 829.1. Samples: 55320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:37,334][05868] Avg episode reward: [(0, '4.451')]
[2023-02-23 11:04:41,539][17741] Updated weights for policy 0, policy_version 60 (0.0013)
[2023-02-23 11:04:42,333][05868] Fps is (10 sec: 4095.6, 60 sec: 3277.1, 300 sec: 2730.6). Total num frames: 245760. Throughput: 0: 849.9. Samples: 61500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:42,339][05868] Avg episode reward: [(0, '4.333')]
[2023-02-23 11:04:47,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3277.0, 300 sec: 2716.3). Total num frames: 258048. Throughput: 0: 834.3. Samples: 65440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:04:47,334][05868] Avg episode reward: [(0, '4.338')]
[2023-02-23 11:04:47,352][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth...
[2023-02-23 11:04:52,332][05868] Fps is (10 sec: 2867.5, 60 sec: 3276.9, 300 sec: 2744.3). Total num frames: 274432. Throughput: 0: 834.4. Samples: 67424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:04:52,334][05868] Avg episode reward: [(0, '4.545')]
[2023-02-23 11:04:52,343][17728] Saving new best policy, reward=4.545!
[2023-02-23 11:04:55,280][17741] Updated weights for policy 0, policy_version 70 (0.0025)
[2023-02-23 11:04:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2808.7). Total num frames: 294912. Throughput: 0: 856.8. Samples: 72800. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:04:57,339][05868] Avg episode reward: [(0, '4.657')]
[2023-02-23 11:04:57,348][17728] Saving new best policy, reward=4.657!
[2023-02-23 11:05:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2830.0). Total num frames: 311296. Throughput: 0: 856.0. Samples: 78932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:02,337][05868] Avg episode reward: [(0, '4.463')]
[2023-02-23 11:05:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3345.4, 300 sec: 2813.8). Total num frames: 323584. Throughput: 0: 829.2. Samples: 80870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:07,338][05868] Avg episode reward: [(0, '4.246')]
[2023-02-23 11:05:07,361][17741] Updated weights for policy 0, policy_version 80 (0.0014)
[2023-02-23 11:05:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2833.1). Total num frames: 339968. Throughput: 0: 831.5. Samples: 84982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:05:12,334][05868] Avg episode reward: [(0, '4.238')]
[2023-02-23 11:05:17,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2883.6). Total num frames: 360448. Throughput: 0: 862.7. Samples: 90922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:05:17,334][05868] Avg episode reward: [(0, '4.374')]
[2023-02-23 11:05:18,793][17741] Updated weights for policy 0, policy_version 90 (0.0026)
[2023-02-23 11:05:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 2930.2). Total num frames: 380928. Throughput: 0: 861.5. Samples: 94088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:05:22,335][05868] Avg episode reward: [(0, '4.597')]
[2023-02-23 11:05:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 2912.7). Total num frames: 393216. Throughput: 0: 831.4. Samples: 98910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:05:27,339][05868] Avg episode reward: [(0, '4.706')]
[2023-02-23 11:05:27,407][17728] Saving new best policy, reward=4.706!
[2023-02-23 11:05:31,923][17741] Updated weights for policy 0, policy_version 100 (0.0023)
[2023-02-23 11:05:32,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 409600. Throughput: 0: 832.5. Samples: 102904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:05:32,334][05868] Avg episode reward: [(0, '4.805')]
[2023-02-23 11:05:32,340][17728] Saving new best policy, reward=4.805!
[2023-02-23 11:05:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 847.0. Samples: 105538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:05:37,334][05868] Avg episode reward: [(0, '4.709')]
[2023-02-23 11:05:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2976.4). Total num frames: 446464. Throughput: 0: 857.9. Samples: 111404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:05:42,335][05868] Avg episode reward: [(0, '4.589')]
[2023-02-23 11:05:42,568][17741] Updated weights for policy 0, policy_version 110 (0.0028)
[2023-02-23 11:05:47,334][05868] Fps is (10 sec: 3685.7, 60 sec: 3413.2, 300 sec: 2986.1). Total num frames: 462848. Throughput: 0: 825.8. Samples: 116094. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:05:47,343][05868] Avg episode reward: [(0, '4.439')]
[2023-02-23 11:05:52,333][05868] Fps is (10 sec: 2866.9, 60 sec: 3345.0, 300 sec: 2969.6). Total num frames: 475136. Throughput: 0: 828.4. Samples: 118148. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:05:52,335][05868] Avg episode reward: [(0, '4.335')]
[2023-02-23 11:05:55,992][17741] Updated weights for policy 0, policy_version 120 (0.0025)
[2023-02-23 11:05:57,332][05868] Fps is (10 sec: 3277.4, 60 sec: 3345.1, 300 sec: 3003.7). Total num frames: 495616. Throughput: 0: 851.8. Samples: 123314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:05:57,335][05868] Avg episode reward: [(0, '4.686')]
[2023-02-23 11:06:02,332][05868] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3035.9). Total num frames: 516096. Throughput: 0: 861.3. Samples: 129682. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:02,338][05868] Avg episode reward: [(0, '4.838')]
[2023-02-23 11:06:02,341][17728] Saving new best policy, reward=4.838!
[2023-02-23 11:06:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3019.3). Total num frames: 528384. Throughput: 0: 846.2. Samples: 132166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:07,337][05868] Avg episode reward: [(0, '4.824')]
[2023-02-23 11:06:07,371][17741] Updated weights for policy 0, policy_version 130 (0.0018)
[2023-02-23 11:06:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3026.5). Total num frames: 544768. Throughput: 0: 829.6. Samples: 136242. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:12,334][05868] Avg episode reward: [(0, '4.722')]
[2023-02-23 11:06:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3033.3). Total num frames: 561152. Throughput: 0: 859.5. Samples: 141582. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:17,334][05868] Avg episode reward: [(0, '4.635')]
[2023-02-23 11:06:19,398][17741] Updated weights for policy 0, policy_version 140 (0.0033)
[2023-02-23 11:06:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3082.8). Total num frames: 585728. Throughput: 0: 868.6. Samples: 144626. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:06:22,339][05868] Avg episode reward: [(0, '4.842')]
[2023-02-23 11:06:22,342][17728] Saving new best policy, reward=4.842!
[2023-02-23 11:06:27,339][05868] Fps is (10 sec: 3683.8, 60 sec: 3412.9, 300 sec: 3066.6). Total num frames: 598016. Throughput: 0: 861.2. Samples: 150166. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:27,356][05868] Avg episode reward: [(0, '4.740')]
[2023-02-23 11:06:31,831][17741] Updated weights for policy 0, policy_version 150 (0.0020)
[2023-02-23 11:06:32,332][05868] Fps is (10 sec: 2867.0, 60 sec: 3413.3, 300 sec: 3072.0). Total num frames: 614400. Throughput: 0: 846.1. Samples: 154166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:06:32,342][05868] Avg episode reward: [(0, '4.719')]
[2023-02-23 11:06:37,332][05868] Fps is (10 sec: 3279.1, 60 sec: 3413.3, 300 sec: 3077.0). Total num frames: 630784. Throughput: 0: 848.6. Samples: 156332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:06:37,339][05868] Avg episode reward: [(0, '4.890')]
[2023-02-23 11:06:37,351][17728] Saving new best policy, reward=4.890!
[2023-02-23 11:06:42,332][05868] Fps is (10 sec: 3686.6, 60 sec: 3413.3, 300 sec: 3101.3). Total num frames: 651264. Throughput: 0: 872.5. Samples: 162576. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:42,339][05868] Avg episode reward: [(0, '4.729')]
[2023-02-23 11:06:42,579][17741] Updated weights for policy 0, policy_version 160 (0.0021)
[2023-02-23 11:06:47,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3413.4, 300 sec: 3105.3). Total num frames: 667648. Throughput: 0: 854.1. Samples: 168118. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:06:47,336][05868] Avg episode reward: [(0, '4.562')]
[2023-02-23 11:06:47,353][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth...
[2023-02-23 11:06:52,338][05868] Fps is (10 sec: 3274.8, 60 sec: 3481.3, 300 sec: 3109.2). Total num frames: 684032. Throughput: 0: 841.7. Samples: 170046. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:52,341][05868] Avg episode reward: [(0, '4.765')]
[2023-02-23 11:06:56,289][17741] Updated weights for policy 0, policy_version 170 (0.0019)
[2023-02-23 11:06:57,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3113.0). Total num frames: 700416. Throughput: 0: 847.5. Samples: 174378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:06:57,336][05868] Avg episode reward: [(0, '4.879')]
[2023-02-23 11:07:02,332][05868] Fps is (10 sec: 3688.6, 60 sec: 3413.3, 300 sec: 3134.3). Total num frames: 720896. Throughput: 0: 869.2. Samples: 180694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:07:02,337][05868] Avg episode reward: [(0, '4.949')]
[2023-02-23 11:07:02,343][17728] Saving new best policy, reward=4.949!
[2023-02-23 11:07:06,236][17741] Updated weights for policy 0, policy_version 180 (0.0030)
[2023-02-23 11:07:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3137.4). Total num frames: 737280. Throughput: 0: 872.0. Samples: 183868. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:07,336][05868] Avg episode reward: [(0, '4.993')]
[2023-02-23 11:07:07,350][17728] Saving new best policy, reward=4.993!
[2023-02-23 11:07:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3140.3). Total num frames: 753664. Throughput: 0: 841.4. Samples: 188022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:12,334][05868] Avg episode reward: [(0, '5.144')]
[2023-02-23 11:07:12,338][17728] Saving new best policy, reward=5.144!
[2023-02-23 11:07:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3126.3). Total num frames: 765952. Throughput: 0: 852.2. Samples: 192516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:17,334][05868] Avg episode reward: [(0, '5.245')]
[2023-02-23 11:07:17,342][17728] Saving new best policy, reward=5.245!
[2023-02-23 11:07:19,580][17741] Updated weights for policy 0, policy_version 190 (0.0017)
[2023-02-23 11:07:22,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3145.7). Total num frames: 786432. Throughput: 0: 871.2. Samples: 195538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:22,335][05868] Avg episode reward: [(0, '5.567')]
[2023-02-23 11:07:22,341][17728] Saving new best policy, reward=5.567!
[2023-02-23 11:07:27,335][05868] Fps is (10 sec: 4094.7, 60 sec: 3481.8, 300 sec: 3164.3). Total num frames: 806912. Throughput: 0: 870.1. Samples: 201732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:07:27,337][05868] Avg episode reward: [(0, '5.600')]
[2023-02-23 11:07:27,358][17728] Saving new best policy, reward=5.600!
[2023-02-23 11:07:31,249][17741] Updated weights for policy 0, policy_version 200 (0.0024)
[2023-02-23 11:07:32,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3150.8). Total num frames: 819200. Throughput: 0: 836.2. Samples: 205746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:07:32,335][05868] Avg episode reward: [(0, '5.755')]
[2023-02-23 11:07:32,337][17728] Saving new best policy, reward=5.755!
[2023-02-23 11:07:37,332][05868] Fps is (10 sec: 2868.1, 60 sec: 3413.3, 300 sec: 3153.1). Total num frames: 835584. Throughput: 0: 835.6. Samples: 207642. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:37,334][05868] Avg episode reward: [(0, '5.518')]
[2023-02-23 11:07:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3170.6). Total num frames: 856064. Throughput: 0: 868.6. Samples: 213466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:42,338][05868] Avg episode reward: [(0, '5.578')]
[2023-02-23 11:07:42,990][17741] Updated weights for policy 0, policy_version 210 (0.0020)
[2023-02-23 11:07:47,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3187.4). Total num frames: 876544. Throughput: 0: 867.1. Samples: 219714. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:07:47,339][05868] Avg episode reward: [(0, '5.792')]
[2023-02-23 11:07:47,349][17728] Saving new best policy, reward=5.792!
[2023-02-23 11:07:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.7, 300 sec: 3174.4). Total num frames: 888832. Throughput: 0: 841.1. Samples: 221718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:07:52,337][05868] Avg episode reward: [(0, '5.530')]
[2023-02-23 11:07:56,082][17741] Updated weights for policy 0, policy_version 220 (0.0016)
[2023-02-23 11:07:57,332][05868] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3161.8). Total num frames: 901120. Throughput: 0: 839.9. Samples: 225816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:07:57,337][05868] Avg episode reward: [(0, '5.666')]
[2023-02-23 11:08:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3192.1). Total num frames: 925696. Throughput: 0: 869.6. Samples: 231648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:02,335][05868] Avg episode reward: [(0, '5.742')]
[2023-02-23 11:08:06,121][17741] Updated weights for policy 0, policy_version 230 (0.0015)
[2023-02-23 11:08:07,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3207.4). Total num frames: 946176. Throughput: 0: 873.9. Samples: 234864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:07,337][05868] Avg episode reward: [(0, '6.159')]
[2023-02-23 11:08:07,348][17728] Saving new best policy, reward=6.159!
[2023-02-23 11:08:12,332][05868] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 958464. Throughput: 0: 847.4. Samples: 239862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:08:12,338][05868] Avg episode reward: [(0, '5.804')]
[2023-02-23 11:08:17,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 847.8. Samples: 243896. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:08:17,339][05868] Avg episode reward: [(0, '5.908')]
[2023-02-23 11:08:19,525][17741] Updated weights for policy 0, policy_version 240 (0.0013)
[2023-02-23 11:08:22,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 991232. Throughput: 0: 866.5. Samples: 246634. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:08:22,340][05868] Avg episode reward: [(0, '5.740')]
[2023-02-23 11:08:27,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3413.5, 300 sec: 3387.9). Total num frames: 1011712. Throughput: 0: 881.5. Samples: 253134. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:08:27,334][05868] Avg episode reward: [(0, '5.308')]
[2023-02-23 11:08:29,852][17741] Updated weights for policy 0, policy_version 250 (0.0018)
[2023-02-23 11:08:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1028096. Throughput: 0: 851.8. Samples: 258046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:32,337][05868] Avg episode reward: [(0, '5.206')]
[2023-02-23 11:08:37,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3360.2). Total num frames: 1040384. Throughput: 0: 851.2. Samples: 260022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:37,335][05868] Avg episode reward: [(0, '5.178')]
[2023-02-23 11:08:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 1060864. Throughput: 0: 872.8. Samples: 265090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:42,339][05868] Avg episode reward: [(0, '5.483')]
[2023-02-23 11:08:42,361][17741] Updated weights for policy 0, policy_version 260 (0.0023)
[2023-02-23 11:08:47,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 1085440. Throughput: 0: 888.1. Samples: 271612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:08:47,339][05868] Avg episode reward: [(0, '5.861')]
[2023-02-23 11:08:47,350][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth...
[2023-02-23 11:08:47,490][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000063_258048.pth
[2023-02-23 11:08:52,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1097728. Throughput: 0: 874.5. Samples: 274218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:52,335][05868] Avg episode reward: [(0, '5.548')]
[2023-02-23 11:08:54,095][17741] Updated weights for policy 0, policy_version 270 (0.0018)
[2023-02-23 11:08:57,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 1114112. Throughput: 0: 853.1. Samples: 278252. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:08:57,339][05868] Avg episode reward: [(0, '5.219')]
[2023-02-23 11:09:02,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 1130496. Throughput: 0: 877.3. Samples: 283376. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:09:02,340][05868] Avg episode reward: [(0, '5.420')]
[2023-02-23 11:09:05,537][17741] Updated weights for policy 0, policy_version 280 (0.0047)
[2023-02-23 11:09:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1150976. Throughput: 0: 888.8. Samples: 286628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:07,335][05868] Avg episode reward: [(0, '5.818')]
[2023-02-23 11:09:12,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1171456. Throughput: 0: 872.8. Samples: 292412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:12,334][05868] Avg episode reward: [(0, '5.745')]
[2023-02-23 11:09:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1183744. Throughput: 0: 853.7. Samples: 296464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:09:17,342][05868] Avg episode reward: [(0, '5.726')]
[2023-02-23 11:09:18,446][17741] Updated weights for policy 0, policy_version 290 (0.0025)
[2023-02-23 11:09:22,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1200128. Throughput: 0: 857.7. Samples: 298618. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:22,340][05868] Avg episode reward: [(0, '5.916')]
[2023-02-23 11:09:27,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1220608. Throughput: 0: 886.9. Samples: 305000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:27,335][05868] Avg episode reward: [(0, '6.153')]
[2023-02-23 11:09:28,514][17741] Updated weights for policy 0, policy_version 300 (0.0024)
[2023-02-23 11:09:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1241088. Throughput: 0: 869.0. Samples: 310716. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:09:32,336][05868] Avg episode reward: [(0, '5.931')]
[2023-02-23 11:09:37,336][05868] Fps is (10 sec: 3275.5, 60 sec: 3549.6, 300 sec: 3415.6). Total num frames: 1253376. Throughput: 0: 856.1. Samples: 312748. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:09:37,342][05868] Avg episode reward: [(0, '5.748')]
[2023-02-23 11:09:41,920][17741] Updated weights for policy 0, policy_version 310 (0.0030)
[2023-02-23 11:09:42,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1269760. Throughput: 0: 859.5. Samples: 316928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:42,335][05868] Avg episode reward: [(0, '5.567')]
[2023-02-23 11:09:47,332][05868] Fps is (10 sec: 3687.9, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1290240. Throughput: 0: 888.4. Samples: 323356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:47,339][05868] Avg episode reward: [(0, '6.119')]
[2023-02-23 11:09:52,127][17741] Updated weights for policy 0, policy_version 320 (0.0025)
[2023-02-23 11:09:52,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1310720. Throughput: 0: 887.6. Samples: 326572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:09:52,334][05868] Avg episode reward: [(0, '6.457')]
[2023-02-23 11:09:52,342][17728] Saving new best policy, reward=6.457!
[2023-02-23 11:09:57,335][05868] Fps is (10 sec: 3275.7, 60 sec: 3481.4, 300 sec: 3429.5). Total num frames: 1323008. Throughput: 0: 854.8. Samples: 330880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:09:57,340][05868] Avg episode reward: [(0, '6.971')]
[2023-02-23 11:09:57,355][17728] Saving new best policy, reward=6.971!
[2023-02-23 11:10:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1339392. Throughput: 0: 860.6. Samples: 335192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:10:02,335][05868] Avg episode reward: [(0, '6.893')]
[2023-02-23 11:10:04,976][17741] Updated weights for policy 0, policy_version 330 (0.0023)
[2023-02-23 11:10:07,333][05868] Fps is (10 sec: 3687.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 1359872. Throughput: 0: 884.2. Samples: 338408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:10:07,336][05868] Avg episode reward: [(0, '6.895')]
[2023-02-23 11:10:12,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 1380352. Throughput: 0: 883.2. Samples: 344744. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:10:12,340][05868] Avg episode reward: [(0, '6.748')]
[2023-02-23 11:10:16,532][17741] Updated weights for policy 0, policy_version 340 (0.0013)
[2023-02-23 11:10:17,349][05868] Fps is (10 sec: 3271.6, 60 sec: 3480.6, 300 sec: 3429.3). Total num frames: 1392640. Throughput: 0: 849.9. Samples: 348976. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:10:17,352][05868] Avg episode reward: [(0, '6.543')]
[2023-02-23 11:10:22,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1409024. Throughput: 0: 851.2. Samples: 351048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:10:22,338][05868] Avg episode reward: [(0, '6.803')]
[2023-02-23 11:10:27,332][05868] Fps is (10 sec: 3692.7, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1429504. Throughput: 0: 884.2. Samples: 356716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:10:27,340][05868] Avg episode reward: [(0, '6.990')]
[2023-02-23 11:10:27,350][17728] Saving new best policy, reward=6.990!
[2023-02-23 11:10:28,155][17741] Updated weights for policy 0, policy_version 350 (0.0020)
[2023-02-23 11:10:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1449984. Throughput: 0: 881.5. Samples: 363022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:10:32,337][05868] Avg episode reward: [(0, '7.446')]
[2023-02-23 11:10:32,340][17728] Saving new best policy, reward=7.446!
[2023-02-23 11:10:37,333][05868] Fps is (10 sec: 3276.5, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 1462272. Throughput: 0: 855.1. Samples: 365052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:10:37,336][05868] Avg episode reward: [(0, '7.268')]
[2023-02-23 11:10:41,006][17741] Updated weights for policy 0, policy_version 360 (0.0017)
[2023-02-23 11:10:42,332][05868] Fps is (10 sec: 2457.5, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 1474560. Throughput: 0: 849.6. Samples: 369110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:10:42,348][05868] Avg episode reward: [(0, '7.553')]
[2023-02-23 11:10:42,359][17728] Saving new best policy, reward=7.553!
[2023-02-23 11:10:47,332][05868] Fps is (10 sec: 3277.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1495040. Throughput: 0: 880.1. Samples: 374796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:10:47,338][05868] Avg episode reward: [(0, '7.076')]
[2023-02-23 11:10:47,350][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000365_1495040.pth...
[2023-02-23 11:10:47,477][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth
[2023-02-23 11:10:51,394][17741] Updated weights for policy 0, policy_version 370 (0.0014)
[2023-02-23 11:10:52,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1515520. Throughput: 0: 879.0. Samples: 377964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:10:52,334][05868] Avg episode reward: [(0, '7.711')]
[2023-02-23 11:10:52,347][17728] Saving new best policy, reward=7.711!
[2023-02-23 11:10:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.8, 300 sec: 3443.4). Total num frames: 1531904. Throughput: 0: 847.8. Samples: 382892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:10:57,337][05868] Avg episode reward: [(0, '7.627')]
[2023-02-23 11:11:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1544192. Throughput: 0: 847.6. Samples: 387104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:11:02,343][05868] Avg episode reward: [(0, '7.819')]
[2023-02-23 11:11:02,345][17728] Saving new best policy, reward=7.819!
[2023-02-23 11:11:04,839][17741] Updated weights for policy 0, policy_version 380 (0.0052)
[2023-02-23 11:11:07,333][05868] Fps is (10 sec: 3276.3, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1564672. Throughput: 0: 861.3. Samples: 389810. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:11:07,336][05868] Avg episode reward: [(0, '7.461')]
[2023-02-23 11:11:12,332][05868] Fps is (10 sec: 4505.7, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 1589248. Throughput: 0: 879.5. Samples: 396294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:11:12,334][05868] Avg episode reward: [(0, '8.033')]
[2023-02-23 11:11:12,338][17728] Saving new best policy, reward=8.033!
[2023-02-23 11:11:14,820][17741] Updated weights for policy 0, policy_version 390 (0.0020)
[2023-02-23 11:11:17,332][05868] Fps is (10 sec: 3686.9, 60 sec: 3482.6, 300 sec: 3443.4). Total num frames: 1601536. Throughput: 0: 848.4. Samples: 401202. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:17,337][05868] Avg episode reward: [(0, '8.168')]
[2023-02-23 11:11:17,348][17728] Saving new best policy, reward=8.168!
[2023-02-23 11:11:22,334][05868] Fps is (10 sec: 2457.1, 60 sec: 3413.2, 300 sec: 3443.5). Total num frames: 1613824. Throughput: 0: 847.7. Samples: 403200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:11:22,337][05868] Avg episode reward: [(0, '8.240')]
[2023-02-23 11:11:22,339][17728] Saving new best policy, reward=8.240!
[2023-02-23 11:11:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1634304. Throughput: 0: 863.9. Samples: 407986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:27,339][05868] Avg episode reward: [(0, '9.142')]
[2023-02-23 11:11:27,349][17728] Saving new best policy, reward=9.142!
[2023-02-23 11:11:27,853][17741] Updated weights for policy 0, policy_version 400 (0.0016)
[2023-02-23 11:11:32,332][05868] Fps is (10 sec: 4096.6, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1654784. Throughput: 0: 881.9. Samples: 414484. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:32,335][05868] Avg episode reward: [(0, '9.759')]
[2023-02-23 11:11:32,338][17728] Saving new best policy, reward=9.759!
[2023-02-23 11:11:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3457.3). Total num frames: 1671168. Throughput: 0: 868.6. Samples: 417052. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:11:37,342][05868] Avg episode reward: [(0, '9.882')]
[2023-02-23 11:11:37,365][17728] Saving new best policy, reward=9.882!
[2023-02-23 11:11:39,797][17741] Updated weights for policy 0, policy_version 410 (0.0014)
[2023-02-23 11:11:42,332][05868] Fps is (10 sec: 2867.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1683456. Throughput: 0: 847.4. Samples: 421026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:11:42,337][05868] Avg episode reward: [(0, '10.195')]
[2023-02-23 11:11:42,347][17728] Saving new best policy, reward=10.195!
[2023-02-23 11:11:47,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.4). Total num frames: 1703936. Throughput: 0: 867.8. Samples: 426156. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:11:47,334][05868] Avg episode reward: [(0, '9.819')]
[2023-02-23 11:11:50,985][17741] Updated weights for policy 0, policy_version 420 (0.0027)
[2023-02-23 11:11:52,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1724416. Throughput: 0: 881.1. Samples: 429460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:11:52,340][05868] Avg episode reward: [(0, '10.216')]
[2023-02-23 11:11:52,346][17728] Saving new best policy, reward=10.216!
[2023-02-23 11:11:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1740800. Throughput: 0: 862.9. Samples: 435126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:11:57,338][05868] Avg episode reward: [(0, '10.346')]
[2023-02-23 11:11:57,351][17728] Saving new best policy, reward=10.346!
[2023-02-23 11:12:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1753088. Throughput: 0: 842.8. Samples: 439128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:12:02,338][05868] Avg episode reward: [(0, '11.437')]
[2023-02-23 11:12:02,343][17728] Saving new best policy, reward=11.437!
[2023-02-23 11:12:04,584][17741] Updated weights for policy 0, policy_version 430 (0.0014)
[2023-02-23 11:12:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3443.4). Total num frames: 1769472. Throughput: 0: 845.8. Samples: 441258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:07,341][05868] Avg episode reward: [(0, '12.061')]
[2023-02-23 11:12:07,425][17728] Saving new best policy, reward=12.061!
[2023-02-23 11:12:12,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1794048. Throughput: 0: 882.4. Samples: 447696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:12,334][05868] Avg episode reward: [(0, '13.771')]
[2023-02-23 11:12:12,341][17728] Saving new best policy, reward=13.771!
[2023-02-23 11:12:14,176][17741] Updated weights for policy 0, policy_version 440 (0.0014)
[2023-02-23 11:12:17,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1810432. Throughput: 0: 862.3. Samples: 453286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:12:17,334][05868] Avg episode reward: [(0, '12.964')]
[2023-02-23 11:12:22,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.7, 300 sec: 3443.5). Total num frames: 1822720. Throughput: 0: 850.3. Samples: 455314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:22,338][05868] Avg episode reward: [(0, '13.367')]
[2023-02-23 11:12:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1839104. Throughput: 0: 854.0. Samples: 459456. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:27,334][05868] Avg episode reward: [(0, '13.969')]
[2023-02-23 11:12:27,348][17728] Saving new best policy, reward=13.969!
[2023-02-23 11:12:27,617][17741] Updated weights for policy 0, policy_version 450 (0.0033)
[2023-02-23 11:12:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 1859584. Throughput: 0: 883.2. Samples: 465898. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:32,339][05868] Avg episode reward: [(0, '13.638')]
[2023-02-23 11:12:37,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 1880064. Throughput: 0: 880.2. Samples: 469070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:12:37,335][05868] Avg episode reward: [(0, '14.639')]
[2023-02-23 11:12:37,342][17728] Saving new best policy, reward=14.639!
[2023-02-23 11:12:38,494][17741] Updated weights for policy 0, policy_version 460 (0.0013)
[2023-02-23 11:12:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1892352. Throughput: 0: 846.9. Samples: 473238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:42,341][05868] Avg episode reward: [(0, '14.796')]
[2023-02-23 11:12:42,349][17728] Saving new best policy, reward=14.796!
[2023-02-23 11:12:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 1908736. Throughput: 0: 855.8. Samples: 477640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:47,335][05868] Avg episode reward: [(0, '14.496')]
[2023-02-23 11:12:47,343][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000466_1908736.pth...
[2023-02-23 11:12:47,459][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000265_1085440.pth
[2023-02-23 11:12:50,774][17741] Updated weights for policy 0, policy_version 470 (0.0014)
[2023-02-23 11:12:52,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 1929216. Throughput: 0: 879.2. Samples: 480824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:52,335][05868] Avg episode reward: [(0, '13.569')]
[2023-02-23 11:12:57,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 1949696. Throughput: 0: 879.5. Samples: 487276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:12:57,337][05868] Avg episode reward: [(0, '14.671')]
[2023-02-23 11:13:02,332][05868] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1961984. Throughput: 0: 847.8. Samples: 491438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:13:02,337][05868] Avg episode reward: [(0, '15.165')]
[2023-02-23 11:13:02,342][17728] Saving new best policy, reward=15.165!
[2023-02-23 11:13:02,795][17741] Updated weights for policy 0, policy_version 480 (0.0015)
[2023-02-23 11:13:07,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 1978368. Throughput: 0: 847.0. Samples: 493428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:07,334][05868] Avg episode reward: [(0, '16.150')]
[2023-02-23 11:13:07,344][17728] Saving new best policy, reward=16.150!
[2023-02-23 11:13:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1998848. Throughput: 0: 884.5. Samples: 499258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:12,335][05868] Avg episode reward: [(0, '17.154')]
[2023-02-23 11:13:12,337][17728] Saving new best policy, reward=17.154!
[2023-02-23 11:13:13,815][17741] Updated weights for policy 0, policy_version 490 (0.0020)
[2023-02-23 11:13:17,335][05868] Fps is (10 sec: 4094.8, 60 sec: 3481.4, 300 sec: 3485.0). Total num frames: 2019328. Throughput: 0: 879.2. Samples: 505466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:17,340][05868] Avg episode reward: [(0, '18.178')]
[2023-02-23 11:13:17,357][17728] Saving new best policy, reward=18.178!
[2023-02-23 11:13:22,332][05868] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2031616. Throughput: 0: 852.7. Samples: 507440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:22,339][05868] Avg episode reward: [(0, '17.746')]
[2023-02-23 11:13:27,332][05868] Fps is (10 sec: 2458.3, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2043904. Throughput: 0: 851.3. Samples: 511548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:27,339][05868] Avg episode reward: [(0, '17.289')]
[2023-02-23 11:13:27,416][17741] Updated weights for policy 0, policy_version 500 (0.0021)
[2023-02-23 11:13:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2068480. Throughput: 0: 886.4. Samples: 517528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:32,340][05868] Avg episode reward: [(0, '17.500')]
[2023-02-23 11:13:36,900][17741] Updated weights for policy 0, policy_version 510 (0.0020)
[2023-02-23 11:13:37,332][05868] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2088960. Throughput: 0: 886.7. Samples: 520726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:13:37,337][05868] Avg episode reward: [(0, '17.833')]
[2023-02-23 11:13:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2101248. Throughput: 0: 857.1. Samples: 525844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:13:42,334][05868] Avg episode reward: [(0, '18.844')]
[2023-02-23 11:13:42,340][17728] Saving new best policy, reward=18.844!
[2023-02-23 11:13:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2117632. Throughput: 0: 855.6. Samples: 529940. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:13:47,339][05868] Avg episode reward: [(0, '20.101')]
[2023-02-23 11:13:47,357][17728] Saving new best policy, reward=20.101!
[2023-02-23 11:13:50,263][17741] Updated weights for policy 0, policy_version 520 (0.0023)
[2023-02-23 11:13:52,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2138112. Throughput: 0: 871.6. Samples: 532650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:13:52,337][05868] Avg episode reward: [(0, '21.236')]
[2023-02-23 11:13:52,343][17728] Saving new best policy, reward=21.236!
[2023-02-23 11:13:57,332][05868] Fps is (10 sec: 4095.9, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 2158592. Throughput: 0: 885.0. Samples: 539084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:13:57,341][05868] Avg episode reward: [(0, '20.925')]
[2023-02-23 11:14:00,890][17741] Updated weights for policy 0, policy_version 530 (0.0025)
[2023-02-23 11:14:02,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2170880. Throughput: 0: 855.7. Samples: 543972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:14:02,335][05868] Avg episode reward: [(0, '20.021')]
[2023-02-23 11:14:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2187264. Throughput: 0: 857.4. Samples: 546022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:14:07,342][05868] Avg episode reward: [(0, '20.288')]
[2023-02-23 11:14:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2207744. Throughput: 0: 882.1. Samples: 551242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:12,335][05868] Avg episode reward: [(0, '18.943')]
[2023-02-23 11:14:13,036][17741] Updated weights for policy 0, policy_version 540 (0.0025)
[2023-02-23 11:14:17,332][05868] Fps is (10 sec: 4096.2, 60 sec: 3481.8, 300 sec: 3485.1). Total num frames: 2228224. Throughput: 0: 890.8. Samples: 557616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:17,335][05868] Avg episode reward: [(0, '18.880')]
[2023-02-23 11:14:22,333][05868] Fps is (10 sec: 3686.0, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 2244608. Throughput: 0: 880.9. Samples: 560366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:22,341][05868] Avg episode reward: [(0, '19.490')]
[2023-02-23 11:14:24,727][17741] Updated weights for policy 0, policy_version 550 (0.0013)
[2023-02-23 11:14:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2256896. Throughput: 0: 860.2. Samples: 564554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:14:27,335][05868] Avg episode reward: [(0, '20.118')]
[2023-02-23 11:14:32,332][05868] Fps is (10 sec: 3277.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2277376. Throughput: 0: 883.4. Samples: 569692. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:14:32,341][05868] Avg episode reward: [(0, '22.288')]
[2023-02-23 11:14:32,347][17728] Saving new best policy, reward=22.288!
[2023-02-23 11:14:36,098][17741] Updated weights for policy 0, policy_version 560 (0.0020)
[2023-02-23 11:14:37,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2297856. Throughput: 0: 892.7. Samples: 572820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:14:37,339][05868] Avg episode reward: [(0, '22.625')]
[2023-02-23 11:14:37,350][17728] Saving new best policy, reward=22.625!
[2023-02-23 11:14:42,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2314240. Throughput: 0: 875.7. Samples: 578490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:42,337][05868] Avg episode reward: [(0, '21.654')]
[2023-02-23 11:14:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2326528. Throughput: 0: 858.2. Samples: 582590. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:14:47,337][05868] Avg episode reward: [(0, '21.633')]
[2023-02-23 11:14:47,355][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000568_2326528.pth...
[2023-02-23 11:14:47,480][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000365_1495040.pth
[2023-02-23 11:14:49,535][17741] Updated weights for policy 0, policy_version 570 (0.0024)
[2023-02-23 11:14:52,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 2342912. Throughput: 0: 857.6. Samples: 584612. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:14:52,335][05868] Avg episode reward: [(0, '21.507')]
[2023-02-23 11:14:57,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2367488. Throughput: 0: 886.1. Samples: 591118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:14:57,342][05868] Avg episode reward: [(0, '21.775')]
[2023-02-23 11:14:59,139][17741] Updated weights for policy 0, policy_version 580 (0.0023)
[2023-02-23 11:15:02,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2383872. Throughput: 0: 869.0. Samples: 596722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:02,337][05868] Avg episode reward: [(0, '21.856')]
[2023-02-23 11:15:07,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2396160. Throughput: 0: 853.6. Samples: 598778. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:07,338][05868] Avg episode reward: [(0, '21.823')]
[2023-02-23 11:15:12,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3457.5). Total num frames: 2412544. Throughput: 0: 855.8. Samples: 603064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:12,334][05868] Avg episode reward: [(0, '22.925')]
[2023-02-23 11:15:12,342][17728] Saving new best policy, reward=22.925!
[2023-02-23 11:15:12,591][17741] Updated weights for policy 0, policy_version 590 (0.0037)
[2023-02-23 11:15:17,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2437120. Throughput: 0: 884.8. Samples: 609508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:17,337][05868] Avg episode reward: [(0, '23.108')]
[2023-02-23 11:15:17,347][17728] Saving new best policy, reward=23.108!
[2023-02-23 11:15:22,335][05868] Fps is (10 sec: 4094.8, 60 sec: 3481.5, 300 sec: 3471.1). Total num frames: 2453504. Throughput: 0: 886.1. Samples: 612696. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:22,340][05868] Avg episode reward: [(0, '21.876')]
[2023-02-23 11:15:23,029][17741] Updated weights for policy 0, policy_version 600 (0.0022)
[2023-02-23 11:15:27,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2465792. Throughput: 0: 855.4. Samples: 616982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:27,337][05868] Avg episode reward: [(0, '21.930')]
[2023-02-23 11:15:32,332][05868] Fps is (10 sec: 2868.1, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 2482176. Throughput: 0: 863.1. Samples: 621430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:32,340][05868] Avg episode reward: [(0, '21.419')]
[2023-02-23 11:15:35,391][17741] Updated weights for policy 0, policy_version 610 (0.0024)
[2023-02-23 11:15:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2502656. Throughput: 0: 890.1. Samples: 624666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:37,334][05868] Avg episode reward: [(0, '20.474')]
[2023-02-23 11:15:42,333][05868] Fps is (10 sec: 4095.6, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 2523136. Throughput: 0: 891.7. Samples: 631246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:15:42,338][05868] Avg episode reward: [(0, '20.016')]
[2023-02-23 11:15:47,022][17741] Updated weights for policy 0, policy_version 620 (0.0041)
[2023-02-23 11:15:47,334][05868] Fps is (10 sec: 3685.7, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 2539520. Throughput: 0: 858.7. Samples: 635366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:15:47,336][05868] Avg episode reward: [(0, '20.280')]
[2023-02-23 11:15:52,332][05868] Fps is (10 sec: 2867.5, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2551808. Throughput: 0: 858.2. Samples: 637396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:15:52,334][05868] Avg episode reward: [(0, '20.559')]
[2023-02-23 11:15:57,332][05868] Fps is (10 sec: 3687.1, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2576384. Throughput: 0: 892.8. Samples: 643240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:15:57,339][05868] Avg episode reward: [(0, '20.358')]
[2023-02-23 11:15:58,350][17741] Updated weights for policy 0, policy_version 630 (0.0026)
[2023-02-23 11:16:02,332][05868] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2596864. Throughput: 0: 890.6. Samples: 649584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:02,338][05868] Avg episode reward: [(0, '20.223')]
[2023-02-23 11:16:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2609152. Throughput: 0: 864.7. Samples: 651604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:07,336][05868] Avg episode reward: [(0, '19.544')]
[2023-02-23 11:16:11,287][17741] Updated weights for policy 0, policy_version 640 (0.0026)
[2023-02-23 11:16:12,336][05868] Fps is (10 sec: 2456.6, 60 sec: 3481.4, 300 sec: 3457.3). Total num frames: 2621440. Throughput: 0: 860.9. Samples: 655728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:12,341][05868] Avg episode reward: [(0, '20.280')]
[2023-02-23 11:16:17,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2646016. Throughput: 0: 898.8. Samples: 661878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:17,339][05868] Avg episode reward: [(0, '20.011')]
[2023-02-23 11:16:20,808][17741] Updated weights for policy 0, policy_version 650 (0.0026)
[2023-02-23 11:16:22,332][05868] Fps is (10 sec: 4507.2, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 2666496. Throughput: 0: 899.4. Samples: 665140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:16:22,340][05868] Avg episode reward: [(0, '20.920')]
[2023-02-23 11:16:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2678784. Throughput: 0: 864.7. Samples: 670156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:27,334][05868] Avg episode reward: [(0, '21.276')]
[2023-02-23 11:16:32,332][05868] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2695168. Throughput: 0: 860.6. Samples: 674090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:32,335][05868] Avg episode reward: [(0, '22.487')]
[2023-02-23 11:16:34,384][17741] Updated weights for policy 0, policy_version 660 (0.0019)
[2023-02-23 11:16:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2711552. Throughput: 0: 878.4. Samples: 676924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:16:37,335][05868] Avg episode reward: [(0, '22.476')]
[2023-02-23 11:16:42,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2736128. Throughput: 0: 892.6. Samples: 683408. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-23 11:16:42,339][05868] Avg episode reward: [(0, '23.677')]
[2023-02-23 11:16:42,344][17728] Saving new best policy, reward=23.677!
[2023-02-23 11:16:44,953][17741] Updated weights for policy 0, policy_version 670 (0.0025)
[2023-02-23 11:16:47,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3471.2). Total num frames: 2748416. Throughput: 0: 858.8. Samples: 688230. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:16:47,337][05868] Avg episode reward: [(0, '23.689')]
[2023-02-23 11:16:47,359][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth...
[2023-02-23 11:16:47,597][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000466_1908736.pth
[2023-02-23 11:16:47,626][17728] Saving new best policy, reward=23.689!
[2023-02-23 11:16:52,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2764800. Throughput: 0: 856.3. Samples: 690138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:16:52,337][05868] Avg episode reward: [(0, '22.950')]
[2023-02-23 11:16:57,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 2785280. Throughput: 0: 878.3. Samples: 695250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:16:57,335][17741] Updated weights for policy 0, policy_version 680 (0.0034)
[2023-02-23 11:16:57,334][05868] Avg episode reward: [(0, '23.028')]
[2023-02-23 11:17:02,334][05868] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3512.8). Total num frames: 2805760. Throughput: 0: 884.4. Samples: 701678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:02,341][05868] Avg episode reward: [(0, '23.798')]
[2023-02-23 11:17:02,344][17728] Saving new best policy, reward=23.798!
[2023-02-23 11:17:07,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2818048. Throughput: 0: 866.9. Samples: 704152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:07,340][05868] Avg episode reward: [(0, '22.594')]
[2023-02-23 11:17:09,158][17741] Updated weights for policy 0, policy_version 690 (0.0020)
[2023-02-23 11:17:12,332][05868] Fps is (10 sec: 2867.8, 60 sec: 3550.1, 300 sec: 3471.2). Total num frames: 2834432. Throughput: 0: 846.6. Samples: 708252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:17:12,339][05868] Avg episode reward: [(0, '21.835')]
[2023-02-23 11:17:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2850816. Throughput: 0: 876.7. Samples: 713542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:17,339][05868] Avg episode reward: [(0, '22.307')]
[2023-02-23 11:17:20,372][17741] Updated weights for policy 0, policy_version 700 (0.0033)
[2023-02-23 11:17:22,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 2875392. Throughput: 0: 886.2. Samples: 716802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:22,335][05868] Avg episode reward: [(0, '21.407')]
[2023-02-23 11:17:27,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2887680. Throughput: 0: 868.6. Samples: 722494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:27,337][05868] Avg episode reward: [(0, '20.878')]
[2023-02-23 11:17:32,336][05868] Fps is (10 sec: 2866.0, 60 sec: 3481.4, 300 sec: 3471.1). Total num frames: 2904064. Throughput: 0: 851.3. Samples: 726540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:32,348][05868] Avg episode reward: [(0, '20.648')]
[2023-02-23 11:17:33,498][17741] Updated weights for policy 0, policy_version 710 (0.0018)
[2023-02-23 11:17:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 2920448. Throughput: 0: 858.2. Samples: 728756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:37,341][05868] Avg episode reward: [(0, '21.664')]
[2023-02-23 11:17:42,332][05868] Fps is (10 sec: 3687.9, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 2940928. Throughput: 0: 887.9. Samples: 735204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:42,340][05868] Avg episode reward: [(0, '22.011')]
[2023-02-23 11:17:43,452][17741] Updated weights for policy 0, policy_version 720 (0.0013)
[2023-02-23 11:17:47,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 2961408. Throughput: 0: 872.0. Samples: 740916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:47,337][05868] Avg episode reward: [(0, '22.829')]
[2023-02-23 11:17:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2973696. Throughput: 0: 862.4. Samples: 742958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:17:52,339][05868] Avg episode reward: [(0, '23.605')]
[2023-02-23 11:17:56,678][17741] Updated weights for policy 0, policy_version 730 (0.0047)
[2023-02-23 11:17:57,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 2990080. Throughput: 0: 868.1. Samples: 747318. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:17:57,339][05868] Avg episode reward: [(0, '23.932')]
[2023-02-23 11:17:57,351][17728] Saving new best policy, reward=23.932!
[2023-02-23 11:18:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3499.0). Total num frames: 3010560. Throughput: 0: 891.1. Samples: 753642. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:18:02,340][05868] Avg episode reward: [(0, '25.403')]
[2023-02-23 11:18:02,346][17728] Saving new best policy, reward=25.403!
[2023-02-23 11:18:06,913][17741] Updated weights for policy 0, policy_version 740 (0.0016)
[2023-02-23 11:18:07,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3031040. Throughput: 0: 889.9. Samples: 756846. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:18:07,337][05868] Avg episode reward: [(0, '25.534')]
[2023-02-23 11:18:07,355][17728] Saving new best policy, reward=25.534!
[2023-02-23 11:18:12,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3043328. Throughput: 0: 857.9. Samples: 761098. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:18:12,338][05868] Avg episode reward: [(0, '25.039')]
[2023-02-23 11:18:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3059712. Throughput: 0: 868.7. Samples: 765628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:18:17,339][05868] Avg episode reward: [(0, '26.727')]
[2023-02-23 11:18:17,350][17728] Saving new best policy, reward=26.727!
[2023-02-23 11:18:19,547][17741] Updated weights for policy 0, policy_version 750 (0.0016)
[2023-02-23 11:18:22,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 3080192. Throughput: 0: 889.8. Samples: 768796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:22,335][05868] Avg episode reward: [(0, '25.943')]
[2023-02-23 11:18:27,338][05868] Fps is (10 sec: 4093.5, 60 sec: 3549.5, 300 sec: 3498.9). Total num frames: 3100672. Throughput: 0: 890.5. Samples: 775282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:18:27,341][05868] Avg episode reward: [(0, '25.010')]
[2023-02-23 11:18:31,353][17741] Updated weights for policy 0, policy_version 760 (0.0012)
[2023-02-23 11:18:32,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 3471.2). Total num frames: 3112960. Throughput: 0: 853.1. Samples: 779304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:32,334][05868] Avg episode reward: [(0, '23.966')]
[2023-02-23 11:18:37,332][05868] Fps is (10 sec: 2868.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3129344. Throughput: 0: 853.8. Samples: 781380. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:37,335][05868] Avg episode reward: [(0, '22.245')]
[2023-02-23 11:18:42,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3149824. Throughput: 0: 886.5. Samples: 787210. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:18:42,338][05868] Avg episode reward: [(0, '21.979')]
[2023-02-23 11:18:42,723][17741] Updated weights for policy 0, policy_version 770 (0.0027)
[2023-02-23 11:18:47,339][05868] Fps is (10 sec: 4093.2, 60 sec: 3481.2, 300 sec: 3498.9). Total num frames: 3170304. Throughput: 0: 885.7. Samples: 793504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:18:47,345][05868] Avg episode reward: [(0, '22.896')]
[2023-02-23 11:18:47,366][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth...
[2023-02-23 11:18:47,527][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000568_2326528.pth
[2023-02-23 11:18:52,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3182592. Throughput: 0: 858.9. Samples: 795498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:52,338][05868] Avg episode reward: [(0, '22.633')]
[2023-02-23 11:18:55,226][17741] Updated weights for policy 0, policy_version 780 (0.0016)
[2023-02-23 11:18:57,332][05868] Fps is (10 sec: 2869.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3198976. Throughput: 0: 857.6. Samples: 799688. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:18:57,335][05868] Avg episode reward: [(0, '25.097')]
[2023-02-23 11:19:02,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3219456. Throughput: 0: 885.8. Samples: 805490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:02,339][05868] Avg episode reward: [(0, '26.970')]
[2023-02-23 11:19:02,345][17728] Saving new best policy, reward=26.970!
[2023-02-23 11:19:05,822][17741] Updated weights for policy 0, policy_version 790 (0.0031)
[2023-02-23 11:19:07,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3239936. Throughput: 0: 885.6. Samples: 808646. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 11:19:07,337][05868] Avg episode reward: [(0, '28.138')]
[2023-02-23 11:19:07,348][17728] Saving new best policy, reward=28.138!
[2023-02-23 11:19:12,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3256320. Throughput: 0: 855.3. Samples: 813766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:12,339][05868] Avg episode reward: [(0, '28.800')]
[2023-02-23 11:19:12,344][17728] Saving new best policy, reward=28.800!
[2023-02-23 11:19:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3268608. Throughput: 0: 857.2. Samples: 817878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:17,341][05868] Avg episode reward: [(0, '28.317')]
[2023-02-23 11:19:19,029][17741] Updated weights for policy 0, policy_version 800 (0.0039)
[2023-02-23 11:19:22,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3289088. Throughput: 0: 874.3. Samples: 820722. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 11:19:22,335][05868] Avg episode reward: [(0, '26.619')]
[2023-02-23 11:19:27,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3482.0, 300 sec: 3499.0). Total num frames: 3309568. Throughput: 0: 886.9. Samples: 827122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:19:27,340][05868] Avg episode reward: [(0, '24.254')]
[2023-02-23 11:19:28,979][17741] Updated weights for policy 0, policy_version 810 (0.0019)
[2023-02-23 11:19:32,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3325952. Throughput: 0: 855.4. Samples: 831992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:32,334][05868] Avg episode reward: [(0, '25.178')]
[2023-02-23 11:19:37,334][05868] Fps is (10 sec: 2866.6, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 3338240. Throughput: 0: 855.9. Samples: 834016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:19:37,337][05868] Avg episode reward: [(0, '25.592')]
[2023-02-23 11:19:42,132][17741] Updated weights for policy 0, policy_version 820 (0.0050)
[2023-02-23 11:19:42,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3358720. Throughput: 0: 873.2. Samples: 838984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:42,335][05868] Avg episode reward: [(0, '24.701')]
[2023-02-23 11:19:47,332][05868] Fps is (10 sec: 4096.9, 60 sec: 3482.0, 300 sec: 3512.8). Total num frames: 3379200. Throughput: 0: 887.7. Samples: 845436. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:19:47,335][05868] Avg episode reward: [(0, '24.032')]
[2023-02-23 11:19:52,335][05868] Fps is (10 sec: 3685.3, 60 sec: 3549.7, 300 sec: 3485.0). Total num frames: 3395584. Throughput: 0: 879.6. Samples: 848232. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:19:52,338][05868] Avg episode reward: [(0, '23.430')]
[2023-02-23 11:19:53,234][17741] Updated weights for policy 0, policy_version 830 (0.0021)
[2023-02-23 11:19:57,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3407872. Throughput: 0: 855.4. Samples: 852258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:19:57,341][05868] Avg episode reward: [(0, '24.109')]
[2023-02-23 11:20:02,332][05868] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3428352. Throughput: 0: 877.5. Samples: 857364. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:20:02,339][05868] Avg episode reward: [(0, '22.084')]
[2023-02-23 11:20:05,155][17741] Updated weights for policy 0, policy_version 840 (0.0029)
[2023-02-23 11:20:07,332][05868] Fps is (10 sec: 4096.2, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3448832. Throughput: 0: 885.6. Samples: 860574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:07,334][05868] Avg episode reward: [(0, '21.356')]
[2023-02-23 11:20:12,337][05868] Fps is (10 sec: 3684.5, 60 sec: 3481.3, 300 sec: 3485.0). Total num frames: 3465216. Throughput: 0: 874.6. Samples: 866484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:12,340][05868] Avg episode reward: [(0, '21.887')]
[2023-02-23 11:20:17,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3477504. Throughput: 0: 857.1. Samples: 870560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:17,337][05868] Avg episode reward: [(0, '22.613')]
[2023-02-23 11:20:17,533][17741] Updated weights for policy 0, policy_version 850 (0.0018)
[2023-02-23 11:20:22,332][05868] Fps is (10 sec: 3278.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3497984. Throughput: 0: 858.8. Samples: 872662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:22,335][05868] Avg episode reward: [(0, '22.359')]
[2023-02-23 11:20:27,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3518464. Throughput: 0: 893.8. Samples: 879206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:27,335][05868] Avg episode reward: [(0, '23.042')]
[2023-02-23 11:20:27,816][17741] Updated weights for policy 0, policy_version 860 (0.0022)
[2023-02-23 11:20:32,338][05868] Fps is (10 sec: 3684.2, 60 sec: 3481.2, 300 sec: 3498.9). Total num frames: 3534848. Throughput: 0: 873.7. Samples: 884756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:32,345][05868] Avg episode reward: [(0, '23.889')]
[2023-02-23 11:20:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3485.1). Total num frames: 3551232. Throughput: 0: 856.8. Samples: 886784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:20:37,336][05868] Avg episode reward: [(0, '24.255')]
[2023-02-23 11:20:41,279][17741] Updated weights for policy 0, policy_version 870 (0.0045)
[2023-02-23 11:20:42,332][05868] Fps is (10 sec: 3278.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3567616. Throughput: 0: 861.4. Samples: 891022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:42,335][05868] Avg episode reward: [(0, '25.370')]
[2023-02-23 11:20:47,332][05868] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3588096. Throughput: 0: 891.7. Samples: 897490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:20:47,339][05868] Avg episode reward: [(0, '25.678')]
[2023-02-23 11:20:47,353][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000876_3588096.pth...
[2023-02-23 11:20:47,474][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000671_2748416.pth
[2023-02-23 11:20:50,726][17741] Updated weights for policy 0, policy_version 880 (0.0021)
[2023-02-23 11:20:52,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3499.0). Total num frames: 3608576. Throughput: 0: 892.0. Samples: 900714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:20:52,335][05868] Avg episode reward: [(0, '26.738')]
[2023-02-23 11:20:57,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 3620864. Throughput: 0: 859.4. Samples: 905152. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:20:57,338][05868] Avg episode reward: [(0, '27.404')]
[2023-02-23 11:21:02,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3637248. Throughput: 0: 865.4. Samples: 909504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:02,334][05868] Avg episode reward: [(0, '27.393')]
[2023-02-23 11:21:04,234][17741] Updated weights for policy 0, policy_version 890 (0.0026)
[2023-02-23 11:21:07,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.9). Total num frames: 3657728. Throughput: 0: 888.7. Samples: 912654. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:07,334][05868] Avg episode reward: [(0, '26.510')]
[2023-02-23 11:21:12,338][05868] Fps is (10 sec: 4093.5, 60 sec: 3549.8, 300 sec: 3498.9). Total num frames: 3678208. Throughput: 0: 886.0. Samples: 919080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:12,346][05868] Avg episode reward: [(0, '27.735')]
[2023-02-23 11:21:14,998][17741] Updated weights for policy 0, policy_version 900 (0.0012)
[2023-02-23 11:21:17,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 3690496. Throughput: 0: 858.5. Samples: 923382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:21:17,341][05868] Avg episode reward: [(0, '28.213')]
[2023-02-23 11:21:22,332][05868] Fps is (10 sec: 2868.9, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3706880. Throughput: 0: 858.0. Samples: 925392. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:22,337][05868] Avg episode reward: [(0, '27.407')]
[2023-02-23 11:21:27,082][17741] Updated weights for policy 0, policy_version 910 (0.0015)
[2023-02-23 11:21:27,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3727360. Throughput: 0: 889.7. Samples: 931058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:21:27,335][05868] Avg episode reward: [(0, '26.676')]
[2023-02-23 11:21:32,332][05868] Fps is (10 sec: 4096.0, 60 sec: 3550.2, 300 sec: 3512.8). Total num frames: 3747840. Throughput: 0: 890.9. Samples: 937582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:21:32,335][05868] Avg episode reward: [(0, '25.324')]
[2023-02-23 11:21:37,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 3760128. Throughput: 0: 862.1. Samples: 939508. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 11:21:37,338][05868] Avg episode reward: [(0, '27.323')]
[2023-02-23 11:21:39,371][17741] Updated weights for policy 0, policy_version 920 (0.0034)
[2023-02-23 11:21:42,333][05868] Fps is (10 sec: 2457.4, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 3772416. Throughput: 0: 853.1. Samples: 943542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:42,336][05868] Avg episode reward: [(0, '25.412')]
[2023-02-23 11:21:47,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3796992. Throughput: 0: 888.0. Samples: 949462. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 11:21:47,338][05868] Avg episode reward: [(0, '24.885')]
[2023-02-23 11:21:50,046][17741] Updated weights for policy 0, policy_version 930 (0.0023)
[2023-02-23 11:21:52,332][05868] Fps is (10 sec: 4506.0, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3817472. Throughput: 0: 890.3. Samples: 952718. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:21:52,341][05868] Avg episode reward: [(0, '24.711')]
[2023-02-23 11:21:57,337][05868] Fps is (10 sec: 3684.4, 60 sec: 3549.5, 300 sec: 3485.0). Total num frames: 3833856. Throughput: 0: 864.3. Samples: 957974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:21:57,340][05868] Avg episode reward: [(0, '23.671')]
[2023-02-23 11:22:02,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3846144. Throughput: 0: 861.5. Samples: 962152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 11:22:02,341][05868] Avg episode reward: [(0, '24.008')]
[2023-02-23 11:22:03,158][17741] Updated weights for policy 0, policy_version 940 (0.0037)
[2023-02-23 11:22:07,332][05868] Fps is (10 sec: 3278.5, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3866624. Throughput: 0: 875.5. Samples: 964790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:07,335][05868] Avg episode reward: [(0, '23.997')]
[2023-02-23 11:22:12,332][05868] Fps is (10 sec: 4096.3, 60 sec: 3482.0, 300 sec: 3512.8). Total num frames: 3887104. Throughput: 0: 893.3. Samples: 971258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:22:12,335][05868] Avg episode reward: [(0, '23.835')]
[2023-02-23 11:22:12,783][17741] Updated weights for policy 0, policy_version 950 (0.0013)
[2023-02-23 11:22:17,332][05868] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3903488. Throughput: 0: 861.9. Samples: 976368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:22:17,339][05868] Avg episode reward: [(0, '24.691')]
[2023-02-23 11:22:22,332][05868] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3915776. Throughput: 0: 864.8. Samples: 978426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:22,341][05868] Avg episode reward: [(0, '24.660')]
[2023-02-23 11:22:26,189][17741] Updated weights for policy 0, policy_version 960 (0.0019)
[2023-02-23 11:22:27,332][05868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 3936256. Throughput: 0: 886.9. Samples: 983454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:27,340][05868] Avg episode reward: [(0, '24.786')]
[2023-02-23 11:22:32,332][05868] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 3956736. Throughput: 0: 896.4. Samples: 989800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 11:22:32,335][05868] Avg episode reward: [(0, '25.239')]
[2023-02-23 11:22:36,758][17741] Updated weights for policy 0, policy_version 970 (0.0041)
[2023-02-23 11:22:37,332][05868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3973120. Throughput: 0: 885.8. Samples: 992578. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 11:22:37,338][05868] Avg episode reward: [(0, '25.539')]
[2023-02-23 11:22:42,334][05868] Fps is (10 sec: 2866.6, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3985408. Throughput: 0: 858.2. Samples: 996592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 11:22:42,337][05868] Avg episode reward: [(0, '26.679')]
[2023-02-23 11:22:47,332][05868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 4001792. Throughput: 0: 874.6. Samples: 1001508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 11:22:47,335][05868] Avg episode reward: [(0, '27.634')]
[2023-02-23 11:22:47,344][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000977_4001792.pth...
[2023-02-23 11:22:47,491][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth
[2023-02-23 11:22:47,595][17728] Stopping Batcher_0...
[2023-02-23 11:22:47,596][17728] Loop batcher_evt_loop terminating...
[2023-02-23 11:22:47,596][05868] Component Batcher_0 stopped!
[2023-02-23 11:22:47,607][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:22:47,661][17746] Stopping RolloutWorker_w3...
[2023-02-23 11:22:47,659][05868] Component RolloutWorker_w6 stopped!
[2023-02-23 11:22:47,666][05868] Component RolloutWorker_w3 stopped!
[2023-02-23 11:22:47,671][17749] Stopping RolloutWorker_w6...
[2023-02-23 11:22:47,672][17749] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 11:22:47,662][17746] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 11:22:47,677][17748] Stopping RolloutWorker_w5...
[2023-02-23 11:22:47,677][05868] Component RolloutWorker_w5 stopped!
[2023-02-23 11:22:47,683][17744] Stopping RolloutWorker_w1...
[2023-02-23 11:22:47,683][05868] Component RolloutWorker_w1 stopped!
[2023-02-23 11:22:47,678][17748] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 11:22:47,693][17745] Stopping RolloutWorker_w2...
[2023-02-23 11:22:47,693][05868] Component RolloutWorker_w2 stopped!
[2023-02-23 11:22:47,687][17744] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 11:22:47,702][17747] Stopping RolloutWorker_w4...
[2023-02-23 11:22:47,702][05868] Component RolloutWorker_w4 stopped!
[2023-02-23 11:22:47,708][17747] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 11:22:47,710][17743] Stopping RolloutWorker_w0...
[2023-02-23 11:22:47,711][17743] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 11:22:47,709][05868] Component RolloutWorker_w0 stopped!
[2023-02-23 11:22:47,706][17745] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 11:22:47,721][17741] Weights refcount: 2 0
[2023-02-23 11:22:47,724][05868] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 11:22:47,729][17741] Stopping InferenceWorker_p0-w0...
[2023-02-23 11:22:47,730][17741] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 11:22:47,753][17750] Stopping RolloutWorker_w7...
[2023-02-23 11:22:47,753][05868] Component RolloutWorker_w7 stopped!
[2023-02-23 11:22:47,753][17750] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 11:22:47,818][17728] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000876_3588096.pth
[2023-02-23 11:22:47,832][17728] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:22:48,038][05868] Component LearnerWorker_p0 stopped!
[2023-02-23 11:22:48,046][05868] Waiting for process learner_proc0 to stop...
[2023-02-23 11:22:48,051][17728] Stopping LearnerWorker_p0...
[2023-02-23 11:22:48,051][17728] Loop learner_proc0_evt_loop terminating...
[2023-02-23 11:22:49,828][05868] Waiting for process inference_proc0-0 to join...
[2023-02-23 11:22:50,147][05868] Waiting for process rollout_proc0 to join...
[2023-02-23 11:22:50,673][05868] Waiting for process rollout_proc1 to join...
[2023-02-23 11:22:50,676][05868] Waiting for process rollout_proc2 to join...
[2023-02-23 11:22:50,685][05868] Waiting for process rollout_proc3 to join...
[2023-02-23 11:22:50,686][05868] Waiting for process rollout_proc4 to join...
[2023-02-23 11:22:50,687][05868] Waiting for process rollout_proc5 to join...
[2023-02-23 11:22:50,690][05868] Waiting for process rollout_proc6 to join...
[2023-02-23 11:22:50,691][05868] Waiting for process rollout_proc7 to join...
[2023-02-23 11:22:50,692][05868] Batcher 0 profile tree view:
batching: 26.9477, releasing_batches: 0.0262
[2023-02-23 11:22:50,694][05868] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 553.1603
update_model: 8.1936
weight_update: 0.0044
one_step: 0.0270
handle_policy_step: 565.2490
deserialize: 15.9555, stack: 3.1990, obs_to_device_normalize: 121.1091, forward: 278.3161, send_messages: 27.2497
prepare_outputs: 90.4579
to_cpu: 56.4857
[2023-02-23 11:22:50,697][05868] Learner 0 profile tree view:
misc: 0.0062, prepare_batch: 17.5318
train: 77.1227
epoch_init: 0.0112, minibatch_init: 0.0109, losses_postprocess: 0.6452, kl_divergence: 0.5561, after_optimizer: 32.9744
calculate_losses: 27.2532
losses_init: 0.0040, forward_head: 1.7816, bptt_initial: 17.8857, tail: 1.2159, advantages_returns: 0.3498, losses: 3.3914
bptt: 2.3067
bptt_forward_core: 2.2317
update: 15.0318
clip: 1.4671
[2023-02-23 11:22:50,698][05868] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4025, enqueue_policy_requests: 158.4457, env_step: 876.3397, overhead: 24.1461, complete_rollouts: 7.9883
save_policy_outputs: 21.9482
split_output_tensors: 10.5565
[2023-02-23 11:22:50,700][05868] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4428, enqueue_policy_requests: 156.8326, env_step: 876.6256, overhead: 23.6488, complete_rollouts: 6.9232
save_policy_outputs: 22.0293
split_output_tensors: 10.5762
[2023-02-23 11:22:50,701][05868] Loop Runner_EvtLoop terminating...
[2023-02-23 11:22:50,703][05868] Runner profile tree view:
main_loop: 1198.3333
[2023-02-23 11:22:50,704][05868] Collected {0: 4005888}, FPS: 3342.9
[2023-02-23 11:27:41,816][05868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 11:27:41,818][05868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 11:27:41,822][05868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 11:27:41,825][05868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 11:27:41,829][05868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:27:41,831][05868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 11:27:41,833][05868] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:27:41,836][05868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 11:27:41,838][05868] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 11:27:41,841][05868] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 11:27:41,844][05868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 11:27:41,846][05868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 11:27:41,847][05868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 11:27:41,849][05868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 11:27:41,850][05868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 11:27:41,875][05868] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 11:27:41,878][05868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:27:41,882][05868] RunningMeanStd input shape: (1,)
[2023-02-23 11:27:41,899][05868] ConvEncoder: input_channels=3
[2023-02-23 11:27:42,594][05868] Conv encoder output size: 512
[2023-02-23 11:27:42,597][05868] Policy head output size: 512
[2023-02-23 11:27:44,939][05868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:27:46,207][05868] Num frames 100...
[2023-02-23 11:27:46,342][05868] Num frames 200...
[2023-02-23 11:27:46,464][05868] Num frames 300...
[2023-02-23 11:27:46,582][05868] Num frames 400...
[2023-02-23 11:27:46,694][05868] Num frames 500...
[2023-02-23 11:27:46,805][05868] Num frames 600...
[2023-02-23 11:27:46,926][05868] Num frames 700...
[2023-02-23 11:27:47,021][05868] Avg episode rewards: #0: 16.360, true rewards: #0: 7.360
[2023-02-23 11:27:47,022][05868] Avg episode reward: 16.360, avg true_objective: 7.360
[2023-02-23 11:27:47,101][05868] Num frames 800...
[2023-02-23 11:27:47,237][05868] Num frames 900...
[2023-02-23 11:27:47,366][05868] Num frames 1000...
[2023-02-23 11:27:47,478][05868] Num frames 1100...
[2023-02-23 11:27:47,588][05868] Num frames 1200...
[2023-02-23 11:27:47,697][05868] Num frames 1300...
[2023-02-23 11:27:47,812][05868] Num frames 1400...
[2023-02-23 11:27:47,926][05868] Num frames 1500...
[2023-02-23 11:27:48,047][05868] Num frames 1600...
[2023-02-23 11:27:48,140][05868] Avg episode rewards: #0: 17.660, true rewards: #0: 8.160
[2023-02-23 11:27:48,141][05868] Avg episode reward: 17.660, avg true_objective: 8.160
[2023-02-23 11:27:48,239][05868] Num frames 1700...
[2023-02-23 11:27:48,361][05868] Num frames 1800...
[2023-02-23 11:27:48,483][05868] Num frames 1900...
[2023-02-23 11:27:48,593][05868] Num frames 2000...
[2023-02-23 11:27:48,719][05868] Num frames 2100...
[2023-02-23 11:27:48,874][05868] Avg episode rewards: #0: 15.253, true rewards: #0: 7.253
[2023-02-23 11:27:48,877][05868] Avg episode reward: 15.253, avg true_objective: 7.253
[2023-02-23 11:27:48,911][05868] Num frames 2200...
[2023-02-23 11:27:49,033][05868] Num frames 2300...
[2023-02-23 11:27:49,149][05868] Num frames 2400...
[2023-02-23 11:27:49,262][05868] Num frames 2500...
[2023-02-23 11:27:49,385][05868] Num frames 2600...
[2023-02-23 11:27:49,495][05868] Num frames 2700...
[2023-02-23 11:27:49,618][05868] Num frames 2800...
[2023-02-23 11:27:49,730][05868] Num frames 2900...
[2023-02-23 11:27:49,855][05868] Num frames 3000...
[2023-02-23 11:27:49,949][05868] Avg episode rewards: #0: 16.833, true rewards: #0: 7.582
[2023-02-23 11:27:49,952][05868] Avg episode reward: 16.833, avg true_objective: 7.582
[2023-02-23 11:27:50,035][05868] Num frames 3100...
[2023-02-23 11:27:50,152][05868] Num frames 3200...
[2023-02-23 11:27:50,285][05868] Num frames 3300...
[2023-02-23 11:27:50,418][05868] Num frames 3400...
[2023-02-23 11:27:50,555][05868] Num frames 3500...
[2023-02-23 11:27:50,677][05868] Num frames 3600...
[2023-02-23 11:27:50,797][05868] Num frames 3700...
[2023-02-23 11:27:50,917][05868] Num frames 3800...
[2023-02-23 11:27:50,976][05868] Avg episode rewards: #0: 17.002, true rewards: #0: 7.602
[2023-02-23 11:27:50,977][05868] Avg episode reward: 17.002, avg true_objective: 7.602
[2023-02-23 11:27:51,096][05868] Num frames 3900...
[2023-02-23 11:27:51,218][05868] Num frames 4000...
[2023-02-23 11:27:51,345][05868] Num frames 4100...
[2023-02-23 11:27:51,528][05868] Num frames 4200...
[2023-02-23 11:27:51,684][05868] Num frames 4300...
[2023-02-23 11:27:51,840][05868] Num frames 4400...
[2023-02-23 11:27:51,996][05868] Num frames 4500...
[2023-02-23 11:27:52,157][05868] Num frames 4600...
[2023-02-23 11:27:52,319][05868] Num frames 4700...
[2023-02-23 11:27:52,485][05868] Num frames 4800...
[2023-02-23 11:27:52,654][05868] Num frames 4900...
[2023-02-23 11:27:52,812][05868] Num frames 5000...
[2023-02-23 11:27:52,970][05868] Num frames 5100...
[2023-02-23 11:27:53,134][05868] Num frames 5200...
[2023-02-23 11:27:53,301][05868] Num frames 5300...
[2023-02-23 11:27:53,459][05868] Num frames 5400...
[2023-02-23 11:27:53,632][05868] Num frames 5500...
[2023-02-23 11:27:53,797][05868] Num frames 5600...
[2023-02-23 11:27:53,971][05868] Num frames 5700...
[2023-02-23 11:27:54,150][05868] Num frames 5800...
[2023-02-23 11:27:54,317][05868] Num frames 5900...
[2023-02-23 11:27:54,382][05868] Avg episode rewards: #0: 23.168, true rewards: #0: 9.835
[2023-02-23 11:27:54,384][05868] Avg episode reward: 23.168, avg true_objective: 9.835
[2023-02-23 11:27:54,550][05868] Num frames 6000...
[2023-02-23 11:27:54,714][05868] Num frames 6100...
[2023-02-23 11:27:54,886][05868] Num frames 6200...
[2023-02-23 11:27:55,010][05868] Num frames 6300...
[2023-02-23 11:27:55,122][05868] Num frames 6400...
[2023-02-23 11:27:55,242][05868] Num frames 6500...
[2023-02-23 11:27:55,352][05868] Num frames 6600...
[2023-02-23 11:27:55,490][05868] Avg episode rewards: #0: 21.813, true rewards: #0: 9.527
[2023-02-23 11:27:55,492][05868] Avg episode reward: 21.813, avg true_objective: 9.527
[2023-02-23 11:27:55,536][05868] Num frames 6700...
[2023-02-23 11:27:55,666][05868] Num frames 6800...
[2023-02-23 11:27:55,794][05868] Num frames 6900...
[2023-02-23 11:27:55,914][05868] Num frames 7000...
[2023-02-23 11:27:56,024][05868] Num frames 7100...
[2023-02-23 11:27:56,135][05868] Num frames 7200...
[2023-02-23 11:27:56,248][05868] Num frames 7300...
[2023-02-23 11:27:56,359][05868] Num frames 7400...
[2023-02-23 11:27:56,471][05868] Num frames 7500...
[2023-02-23 11:27:56,589][05868] Num frames 7600...
[2023-02-23 11:27:56,679][05868] Avg episode rewards: #0: 22.150, true rewards: #0: 9.525
[2023-02-23 11:27:56,681][05868] Avg episode reward: 22.150, avg true_objective: 9.525
[2023-02-23 11:27:56,784][05868] Num frames 7700...
[2023-02-23 11:27:56,906][05868] Num frames 7800...
[2023-02-23 11:27:57,018][05868] Num frames 7900...
[2023-02-23 11:27:57,139][05868] Num frames 8000...
[2023-02-23 11:27:57,252][05868] Num frames 8100...
[2023-02-23 11:27:57,363][05868] Num frames 8200...
[2023-02-23 11:27:57,456][05868] Avg episode rewards: #0: 20.698, true rewards: #0: 9.142
[2023-02-23 11:27:57,458][05868] Avg episode reward: 20.698, avg true_objective: 9.142
[2023-02-23 11:27:57,562][05868] Num frames 8300...
[2023-02-23 11:27:57,696][05868] Num frames 8400...
[2023-02-23 11:27:57,808][05868] Num frames 8500...
[2023-02-23 11:27:57,923][05868] Num frames 8600...
[2023-02-23 11:27:58,036][05868] Num frames 8700...
[2023-02-23 11:27:58,155][05868] Num frames 8800...
[2023-02-23 11:27:58,271][05868] Num frames 8900...
[2023-02-23 11:27:58,389][05868] Num frames 9000...
[2023-02-23 11:27:58,513][05868] Num frames 9100...
[2023-02-23 11:27:58,626][05868] Num frames 9200...
[2023-02-23 11:27:58,745][05868] Num frames 9300...
[2023-02-23 11:27:58,857][05868] Num frames 9400...
[2023-02-23 11:27:58,966][05868] Num frames 9500...
[2023-02-23 11:27:59,086][05868] Num frames 9600...
[2023-02-23 11:27:59,205][05868] Num frames 9700...
[2023-02-23 11:27:59,326][05868] Num frames 9800...
[2023-02-23 11:27:59,396][05868] Avg episode rewards: #0: 22.309, true rewards: #0: 9.809
[2023-02-23 11:27:59,398][05868] Avg episode reward: 22.309, avg true_objective: 9.809
[2023-02-23 11:29:02,301][05868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 11:31:46,553][05868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 11:31:46,555][05868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 11:31:46,556][05868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 11:31:46,559][05868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 11:31:46,561][05868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 11:31:46,562][05868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 11:31:46,563][05868] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 11:31:46,565][05868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 11:31:46,566][05868] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 11:31:46,567][05868] Adding new argument 'hf_repository'='iubeda/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 11:31:46,568][05868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 11:31:46,569][05868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 11:31:46,570][05868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 11:31:46,572][05868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 11:31:46,573][05868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 11:31:46,602][05868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 11:31:46,604][05868] RunningMeanStd input shape: (1,)
[2023-02-23 11:31:46,620][05868] ConvEncoder: input_channels=3
[2023-02-23 11:31:46,658][05868] Conv encoder output size: 512
[2023-02-23 11:31:46,659][05868] Policy head output size: 512
[2023-02-23 11:31:46,682][05868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 11:31:47,129][05868] Num frames 100...
[2023-02-23 11:31:47,275][05868] Num frames 200...
[2023-02-23 11:31:47,397][05868] Num frames 300...
[2023-02-23 11:31:47,514][05868] Num frames 400...
[2023-02-23 11:31:47,630][05868] Num frames 500...
[2023-02-23 11:31:47,753][05868] Num frames 600...
[2023-02-23 11:31:47,884][05868] Num frames 700...
[2023-02-23 11:31:48,007][05868] Num frames 800...
[2023-02-23 11:31:48,130][05868] Num frames 900...
[2023-02-23 11:31:48,245][05868] Num frames 1000...
[2023-02-23 11:31:48,362][05868] Num frames 1100...
[2023-02-23 11:31:48,480][05868] Num frames 1200...
[2023-02-23 11:31:48,606][05868] Num frames 1300...
[2023-02-23 11:31:48,750][05868] Avg episode rewards: #0: 33.760, true rewards: #0: 13.760
[2023-02-23 11:31:48,752][05868] Avg episode reward: 33.760, avg true_objective: 13.760
[2023-02-23 11:31:48,786][05868] Num frames 1400...
[2023-02-23 11:31:48,903][05868] Num frames 1500...
[2023-02-23 11:31:49,020][05868] Num frames 1600...
[2023-02-23 11:31:49,135][05868] Num frames 1700...
[2023-02-23 11:31:49,254][05868] Num frames 1800...
[2023-02-23 11:31:49,389][05868] Num frames 1900...
[2023-02-23 11:31:49,511][05868] Num frames 2000...
[2023-02-23 11:31:49,621][05868] Num frames 2100...
[2023-02-23 11:31:49,734][05868] Num frames 2200...
[2023-02-23 11:31:49,852][05868] Num frames 2300...
[2023-02-23 11:31:49,970][05868] Num frames 2400...
[2023-02-23 11:31:50,104][05868] Num frames 2500...
[2023-02-23 11:31:50,241][05868] Num frames 2600...
[2023-02-23 11:31:50,356][05868] Num frames 2700...
[2023-02-23 11:31:50,467][05868] Num frames 2800...
[2023-02-23 11:31:50,581][05868] Num frames 2900...
[2023-02-23 11:31:50,696][05868] Num frames 3000...
[2023-02-23 11:31:50,841][05868] Avg episode rewards: #0: 38.335, true rewards: #0: 15.335
[2023-02-23 11:31:50,844][05868] Avg episode reward: 38.335, avg true_objective: 15.335
[2023-02-23 11:31:50,896][05868] Num frames 3100...
[2023-02-23 11:31:51,026][05868] Num frames 3200...
[2023-02-23 11:31:51,146][05868] Num frames 3300...
[2023-02-23 11:31:51,258][05868] Num frames 3400...
[2023-02-23 11:31:51,377][05868] Num frames 3500...
[2023-02-23 11:31:51,495][05868] Num frames 3600...
[2023-02-23 11:31:51,609][05868] Num frames 3700...
[2023-02-23 11:31:51,745][05868] Num frames 3800...
[2023-02-23 11:31:51,861][05868] Num frames 3900...
[2023-02-23 11:31:51,980][05868] Num frames 4000...
[2023-02-23 11:31:52,108][05868] Avg episode rewards: #0: 33.197, true rewards: #0: 13.530
[2023-02-23 11:31:52,112][05868] Avg episode reward: 33.197, avg true_objective: 13.530
[2023-02-23 11:31:52,164][05868] Num frames 4100...
[2023-02-23 11:31:52,297][05868] Num frames 4200...
[2023-02-23 11:31:52,422][05868] Num frames 4300...
[2023-02-23 11:31:52,568][05868] Num frames 4400...
[2023-02-23 11:31:52,691][05868] Num frames 4500...
[2023-02-23 11:31:52,820][05868] Num frames 4600...
[2023-02-23 11:31:52,932][05868] Num frames 4700...
[2023-02-23 11:31:53,072][05868] Num frames 4800...
[2023-02-23 11:31:53,197][05868] Avg episode rewards: #0: 29.647, true rewards: #0: 12.147
[2023-02-23 11:31:53,199][05868] Avg episode reward: 29.647, avg true_objective: 12.147
[2023-02-23 11:31:53,316][05868] Num frames 4900...
[2023-02-23 11:31:53,581][05868] Num frames 5000...
[2023-02-23 11:31:53,820][05868] Num frames 5100...
[2023-02-23 11:31:54,066][05868] Num frames 5200...
[2023-02-23 11:31:54,344][05868] Num frames 5300...
[2023-02-23 11:31:54,635][05868] Num frames 5400...
[2023-02-23 11:31:54,856][05868] Num frames 5500...
[2023-02-23 11:31:55,055][05868] Num frames 5600...
[2023-02-23 11:31:55,280][05868] Num frames 5700...
[2023-02-23 11:31:55,614][05868] Num frames 5800...
[2023-02-23 11:31:55,878][05868] Num frames 5900...
[2023-02-23 11:31:56,194][05868] Avg episode rewards: #0: 28.158, true rewards: #0: 11.958
[2023-02-23 11:31:56,206][05868] Avg episode reward: 28.158, avg true_objective: 11.958
[2023-02-23 11:31:56,289][05868] Num frames 6000...
[2023-02-23 11:31:56,618][05868] Num frames 6100...
[2023-02-23 11:31:56,835][05868] Num frames 6200...
[2023-02-23 11:31:56,994][05868] Num frames 6300...
[2023-02-23 11:31:57,166][05868] Num frames 6400...
[2023-02-23 11:31:57,337][05868] Num frames 6500...
[2023-02-23 11:31:57,532][05868] Avg episode rewards: #0: 25.312, true rewards: #0: 10.978
[2023-02-23 11:31:57,537][05868] Avg episode reward: 25.312, avg true_objective: 10.978
[2023-02-23 11:31:57,568][05868] Num frames 6600...
[2023-02-23 11:31:57,746][05868] Num frames 6700...
[2023-02-23 11:31:57,908][05868] Num frames 6800...
[2023-02-23 11:31:58,071][05868] Num frames 6900...
[2023-02-23 11:31:58,254][05868] Num frames 7000...
[2023-02-23 11:31:58,420][05868] Num frames 7100...
[2023-02-23 11:31:58,574][05868] Avg episode rewards: #0: 23.224, true rewards: #0: 10.224
[2023-02-23 11:31:58,577][05868] Avg episode reward: 23.224, avg true_objective: 10.224
[2023-02-23 11:31:58,652][05868] Num frames 7200...
[2023-02-23 11:31:58,810][05868] Num frames 7300...
[2023-02-23 11:31:58,970][05868] Num frames 7400...
[2023-02-23 11:31:59,133][05868] Num frames 7500...
[2023-02-23 11:31:59,309][05868] Num frames 7600...
[2023-02-23 11:31:59,477][05868] Num frames 7700...
[2023-02-23 11:31:59,639][05868] Avg episode rewards: #0: 21.456, true rewards: #0: 9.706
[2023-02-23 11:31:59,641][05868] Avg episode reward: 21.456, avg true_objective: 9.706
[2023-02-23 11:31:59,700][05868] Num frames 7800...
[2023-02-23 11:31:59,823][05868] Num frames 7900...
[2023-02-23 11:31:59,933][05868] Num frames 8000...
[2023-02-23 11:32:00,048][05868] Num frames 8100...
[2023-02-23 11:32:00,167][05868] Num frames 8200...
[2023-02-23 11:32:00,297][05868] Num frames 8300...
[2023-02-23 11:32:00,414][05868] Num frames 8400...
[2023-02-23 11:32:00,525][05868] Num frames 8500...
[2023-02-23 11:32:00,621][05868] Avg episode rewards: #0: 20.814, true rewards: #0: 9.481
[2023-02-23 11:32:00,623][05868] Avg episode reward: 20.814, avg true_objective: 9.481
[2023-02-23 11:32:00,711][05868] Num frames 8600...
[2023-02-23 11:32:00,833][05868] Num frames 8700...
[2023-02-23 11:32:00,945][05868] Num frames 8800...
[2023-02-23 11:32:01,064][05868] Num frames 8900...
[2023-02-23 11:32:01,189][05868] Num frames 9000...
[2023-02-23 11:32:01,322][05868] Num frames 9100...
[2023-02-23 11:32:01,434][05868] Num frames 9200...
[2023-02-23 11:32:01,543][05868] Num frames 9300...
[2023-02-23 11:32:01,656][05868] Num frames 9400...
[2023-02-23 11:32:01,769][05868] Num frames 9500...
[2023-02-23 11:32:01,889][05868] Num frames 9600...
[2023-02-23 11:32:02,015][05868] Num frames 9700...
[2023-02-23 11:32:02,139][05868] Num frames 9800...
[2023-02-23 11:32:02,255][05868] Num frames 9900...
[2023-02-23 11:32:02,379][05868] Num frames 10000...
[2023-02-23 11:32:02,493][05868] Num frames 10100...
[2023-02-23 11:32:02,625][05868] Num frames 10200...
[2023-02-23 11:32:02,743][05868] Num frames 10300...
[2023-02-23 11:32:02,868][05868] Num frames 10400...
[2023-02-23 11:32:02,980][05868] Num frames 10500...
[2023-02-23 11:32:03,097][05868] Num frames 10600...
[2023-02-23 11:32:03,192][05868] Avg episode rewards: #0: 24.633, true rewards: #0: 10.633
[2023-02-23 11:32:03,194][05868] Avg episode reward: 24.633, avg true_objective: 10.633
[2023-02-23 11:33:14,321][05868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!