[2023-02-28 15:42:34,827][11028] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-28 15:42:34,831][11028] Rollout worker 0 uses device cpu
[2023-02-28 15:42:34,832][11028] Rollout worker 1 uses device cpu
[2023-02-28 15:42:34,835][11028] Rollout worker 2 uses device cpu
[2023-02-28 15:42:34,836][11028] Rollout worker 3 uses device cpu
[2023-02-28 15:42:34,837][11028] Rollout worker 4 uses device cpu
[2023-02-28 15:42:34,838][11028] Rollout worker 5 uses device cpu
[2023-02-28 15:42:34,841][11028] Rollout worker 6 uses device cpu
[2023-02-28 15:42:34,842][11028] Rollout worker 7 uses device cpu
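
The startup block above is Sample Factory 2.x bringing up its asynchronous PPO components: one learner, one GPU inference worker, and eight CPU rollout workers, all configured from the config.json written to /content/train_dir/default_experiment. A launch along the following lines would reproduce this layout; it is a sketch only, the scenario name is an assumption (the log never prints which doom_* env was used), and register_vizdoom_components / parse_sf_args are the Sample Factory 2.x entry points as of early 2023.

    # Hypothetical launch matching this log: 8 rollout workers, default
    # experiment name, Colab-style train_dir. The env name is an assumption.
    from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
    from sample_factory.train import run_rl
    from sf_examples.vizdoom.train_vizdoom import register_vizdoom_components

    argv = [
        "--env=doom_health_gathering_supreme",  # assumed scenario
        "--num_workers=8",                      # -> "Rollout worker 0..7" above
        "--train_dir=/content/train_dir",
        "--experiment=default_experiment",
    ]
    register_vizdoom_components()
    parser, _ = parse_sf_args(argv)
    cfg = parse_full_cfg(parser, argv)
    run_rl(cfg)
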
[2023-02-28 15:42:35,070][11028] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 15:42:35,075][11028] InferenceWorker_p0-w0: min num requests: 2
[2023-02-28 15:42:35,120][11028] Starting all processes...
[2023-02-28 15:42:35,122][11028] Starting process learner_proc0
[2023-02-28 15:42:35,196][11028] Starting all processes...
[2023-02-28 15:42:35,210][11028] Starting process inference_proc0-0
[2023-02-28 15:42:35,211][11028] Starting process rollout_proc0
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc1
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc2
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc3
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc4
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc5
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc6
[2023-02-28 15:42:35,213][11028] Starting process rollout_proc7
[2023-02-28 15:42:44,636][11239] Worker 4 uses CPU cores [0]
[2023-02-28 15:42:44,683][11217] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 15:42:44,683][11217] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-28 15:42:44,764][11235] Worker 3 uses CPU cores [1]
[2023-02-28 15:42:44,816][11234] Worker 2 uses CPU cores [0]
[2023-02-28 15:42:44,885][11230] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 15:42:44,885][11230] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-28 15:42:44,937][11241] Worker 5 uses CPU cores [1]
[2023-02-28 15:42:44,952][11242] Worker 6 uses CPU cores [0]
[2023-02-28 15:42:45,001][11231] Worker 0 uses CPU cores [0]
[2023-02-28 15:42:45,041][11232] Worker 1 uses CPU cores [1]
[2023-02-28 15:42:45,065][11243] Worker 7 uses CPU cores [1]
[2023-02-28 15:42:45,487][11217] Num visible devices: 1
[2023-02-28 15:42:45,489][11230] Num visible devices: 1
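
The GPU lines read oddly but do one simple thing: each CUDA-bound process pins itself to physical GPU 0 by exporting CUDA_VISIBLE_DEVICES before touching CUDA, which is why both the learner and the inference worker then report "Num visible devices: 1". A minimal sketch of the mechanism:

    # CUDA_VISIBLE_DEVICES must be set before the first CUDA call; afterwards the
    # process sees only the listed GPUs, renumbered from 0 (so cuda:0 == GPU 0).
    import os
    os.environ["CUDA_VISIBLE_DEVICES"] = "0"

    import torch
    print(torch.cuda.device_count())  # prints 1 on any machine with a GPU
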
[2023-02-28 15:42:45,496][11217] Starting seed is not provided
[2023-02-28 15:42:45,496][11217] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 15:42:45,497][11217] Initializing actor-critic model on device cuda:0
[2023-02-28 15:42:45,497][11217] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 15:42:45,499][11217] RunningMeanStd input shape: (1,)
[2023-02-28 15:42:45,520][11217] ConvEncoder: input_channels=3
[2023-02-28 15:42:45,909][11217] Conv encoder output size: 512
[2023-02-28 15:42:45,909][11217] Policy head output size: 512
[2023-02-28 15:42:45,964][11217] Created Actor Critic model with architecture:
[2023-02-28 15:42:45,965][11217] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
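
Stripped of the TorchScript wrappers, this is a compact pixel actor-critic: a three-layer conv encoder flattened into a 512-d embedding, a GRU core, and linear value and policy heads over 5 discrete actions. A plain PyTorch sketch of the same shape follows; the 512-d sizes, GRU(512, 512), the 1-d value head, and the 5 action logits come straight from the printout, while the conv channel/kernel numbers mirror Sample Factory's default Doom encoder and should be treated as assumptions.

    # Sketch of the printed ActorCriticSharedWeights, minus normalizers and
    # TorchScript. Conv hyperparameters are assumptions; head sizes are from
    # the log ("Conv encoder output size: 512", "Policy head output size: 512").
    import torch
    import torch.nn as nn

    class SketchActorCritic(nn.Module):
        def __init__(self, num_actions: int = 5):
            super().__init__()
            self.conv_head = nn.Sequential(               # input: (B, 3, 72, 128)
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
            )
            self.mlp_layers = nn.Sequential(nn.Flatten(), nn.LazyLinear(512), nn.ELU())
            self.core = nn.GRU(512, 512)                  # ModelCoreRNN
            self.critic_linear = nn.Linear(512, 1)        # value head
            self.distribution_linear = nn.Linear(512, num_actions)  # action logits

        def forward(self, obs, rnn_state=None):
            x = self.mlp_layers(self.conv_head(obs))      # -> (B, 512)
            x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
            x = x.squeeze(0)
            return self.distribution_linear(x), self.critic_linear(x), rnn_state
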
[2023-02-28 15:42:53,384][11217] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-28 15:42:53,385][11217] No checkpoints found
[2023-02-28 15:42:53,386][11217] Did not load from checkpoint, starting from scratch!
[2023-02-28 15:42:53,386][11217] Initialized policy 0 weights for model version 0
[2023-02-28 15:42:53,389][11217] LearnerWorker_p0 finished initialization!
[2023-02-28 15:42:53,392][11217] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 15:42:53,485][11230] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 15:42:53,488][11230] RunningMeanStd input shape: (1,)
[2023-02-28 15:42:53,505][11230] ConvEncoder: input_channels=3
[2023-02-28 15:42:53,603][11230] Conv encoder output size: 512
[2023-02-28 15:42:53,604][11230] Policy head output size: 512
[2023-02-28 15:42:55,055][11028] Heartbeat connected on Batcher_0
[2023-02-28 15:42:55,063][11028] Heartbeat connected on LearnerWorker_p0
[2023-02-28 15:42:55,075][11028] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 15:42:55,089][11028] Heartbeat connected on RolloutWorker_w0
[2023-02-28 15:42:55,093][11028] Heartbeat connected on RolloutWorker_w1
[2023-02-28 15:42:55,100][11028] Heartbeat connected on RolloutWorker_w2
[2023-02-28 15:42:55,108][11028] Heartbeat connected on RolloutWorker_w3
[2023-02-28 15:42:55,111][11028] Heartbeat connected on RolloutWorker_w4
[2023-02-28 15:42:55,114][11028] Heartbeat connected on RolloutWorker_w5
[2023-02-28 15:42:55,119][11028] Heartbeat connected on RolloutWorker_w6
[2023-02-28 15:42:55,123][11028] Heartbeat connected on RolloutWorker_w7
[2023-02-28 15:42:55,878][11028] Inference worker 0-0 is ready!
[2023-02-28 15:42:55,881][11028] All inference workers are ready! Signal rollout workers to start!
[2023-02-28 15:42:55,891][11028] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-28 15:42:56,022][11235] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,024][11232] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,047][11241] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,055][11239] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,056][11231] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,057][11242] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,054][11243] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:56,073][11234] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 15:42:57,280][11243] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,282][11239] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,283][11242] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,280][11231] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,284][11241] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,283][11232] Decorrelating experience for 0 frames...
[2023-02-28 15:42:57,992][11242] Decorrelating experience for 32 frames...
[2023-02-28 15:42:57,998][11231] Decorrelating experience for 32 frames...
[2023-02-28 15:42:58,294][11232] Decorrelating experience for 32 frames...
[2023-02-28 15:42:58,297][11241] Decorrelating experience for 32 frames...
[2023-02-28 15:42:58,306][11243] Decorrelating experience for 32 frames...
[2023-02-28 15:42:59,048][11239] Decorrelating experience for 32 frames...
[2023-02-28 15:42:59,318][11242] Decorrelating experience for 64 frames...
[2023-02-28 15:42:59,401][11235] Decorrelating experience for 0 frames...
[2023-02-28 15:42:59,536][11231] Decorrelating experience for 64 frames...
[2023-02-28 15:42:59,701][11232] Decorrelating experience for 64 frames...
[2023-02-28 15:43:00,075][11028] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 15:43:01,198][11243] Decorrelating experience for 64 frames...
[2023-02-28 15:43:01,243][11239] Decorrelating experience for 64 frames...
[2023-02-28 15:43:01,411][11242] Decorrelating experience for 96 frames...
[2023-02-28 15:43:01,664][11235] Decorrelating experience for 32 frames...
[2023-02-28 15:43:01,784][11234] Decorrelating experience for 0 frames...
[2023-02-28 15:43:02,022][11241] Decorrelating experience for 64 frames...
[2023-02-28 15:43:02,027][11231] Decorrelating experience for 96 frames...
[2023-02-28 15:43:02,123][11232] Decorrelating experience for 96 frames...
[2023-02-28 15:43:02,480][11243] Decorrelating experience for 96 frames...
[2023-02-28 15:43:03,318][11239] Decorrelating experience for 96 frames...
[2023-02-28 15:43:03,319][11234] Decorrelating experience for 32 frames...
[2023-02-28 15:43:03,941][11241] Decorrelating experience for 96 frames...
[2023-02-28 15:43:04,697][11235] Decorrelating experience for 64 frames...
[2023-02-28 15:43:04,750][11234] Decorrelating experience for 64 frames...
[2023-02-28 15:43:05,076][11028] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 15:43:05,165][11235] Decorrelating experience for 96 frames...
[2023-02-28 15:43:05,614][11234] Decorrelating experience for 96 frames...
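
The "Decorrelating experience" phase above staggers the workers before real collection starts: each one burns 0, then 32, 64, and 96 random-action frames, so the eight environments begin training out of phase instead of feeding the learner eight near-identical episode prefixes. The idea in miniature (CartPole stands in for Doom purely to keep the sketch self-contained; Sample Factory's actual schedule is per-env within each worker):

    # Sketch of experience decorrelation: give each worker a different warm-up
    # offset of random frames. Offsets 0/32/64/96 mirror the log above.
    import gymnasium as gym

    for worker_id in range(8):
        env = gym.make("CartPole-v1")      # stand-in for the VizDoom env
        env.reset(seed=worker_id)
        for _ in range(32 * (worker_id % 4)):
            _, _, terminated, truncated, _ = env.step(env.action_space.sample())
            if terminated or truncated:
                env.reset()
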
[2023-02-28 15:43:09,532][11217] Signal inference workers to stop experience collection...
[2023-02-28 15:43:09,551][11230] InferenceWorker_p0-w0: stopping experience collection
[2023-02-28 15:43:10,075][11028] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 67.7. Samples: 1016. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 15:43:10,078][11028] Avg episode reward: [(0, '2.085')]
[2023-02-28 15:43:12,109][11217] Signal inference workers to resume experience collection...
[2023-02-28 15:43:12,111][11230] InferenceWorker_p0-w0: resuming experience collection
[2023-02-28 15:43:15,075][11028] Fps is (10 sec: 1638.5, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 163.5. Samples: 3270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:43:15,084][11028] Avg episode reward: [(0, '3.129')]
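
From here on the run settles into its steady rhythm: every five seconds the main process prints rolling FPS over 10/60/300-second windows, the cumulative env-frame count, per-policy throughput and sample totals, the actor-vs-learner policy-version lag, and the current average episode reward. For turning a log like this into a learning curve, a small parser sketch (the regexes are written against the two line shapes above; the filename is hypothetical):

    # Sketch: pull (total_frames, avg_episode_reward) pairs out of this log.
    import re

    frames_re = re.compile(r"Total num frames: (\d+)\.")
    reward_re = re.compile(r"Avg episode reward: \[\(0, '([-\d.]+)'\)\]")

    curve, frames = [], None
    with open("sf_log.txt") as f:          # hypothetical name for this log file
        for line in f:
            if (m := frames_re.search(line)):
                frames = int(m.group(1))
            elif (m := reward_re.search(line)) and frames is not None:
                curve.append((frames, float(m.group(1))))
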
[2023-02-28 15:43:20,081][11028] Fps is (10 sec: 3275.0, 60 sec: 1310.4, 300 sec: 1310.4). Total num frames: 32768. Throughput: 0: 352.7. Samples: 8820. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:43:20,089][11028] Avg episode reward: [(0, '3.742')]
[2023-02-28 15:43:22,277][11230] Updated weights for policy 0, policy_version 10 (0.0352)
[2023-02-28 15:43:25,075][11028] Fps is (10 sec: 2867.2, 60 sec: 1501.9, 300 sec: 1501.9). Total num frames: 45056. Throughput: 0: 362.7. Samples: 10882. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:43:25,080][11028] Avg episode reward: [(0, '4.260')]
[2023-02-28 15:43:30,075][11028] Fps is (10 sec: 2868.8, 60 sec: 1755.4, 300 sec: 1755.4). Total num frames: 61440. Throughput: 0: 419.1. Samples: 14670. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-28 15:43:30,078][11028] Avg episode reward: [(0, '4.333')]
[2023-02-28 15:43:34,459][11230] Updated weights for policy 0, policy_version 20 (0.0016)
[2023-02-28 15:43:35,075][11028] Fps is (10 sec: 3686.4, 60 sec: 2048.0, 300 sec: 2048.0). Total num frames: 81920. Throughput: 0: 533.2. Samples: 21328. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:43:35,080][11028] Avg episode reward: [(0, '4.392')]
[2023-02-28 15:43:40,076][11028] Fps is (10 sec: 4095.8, 60 sec: 2275.5, 300 sec: 2275.5). Total num frames: 102400. Throughput: 0: 548.5. Samples: 24684. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:43:40,083][11028] Avg episode reward: [(0, '4.313')]
[2023-02-28 15:43:40,100][11217] Saving new best policy, reward=4.313!
[2023-02-28 15:43:45,075][11028] Fps is (10 sec: 3276.8, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 114688. Throughput: 0: 645.4. Samples: 29042. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:43:45,079][11028] Avg episode reward: [(0, '4.487')]
[2023-02-28 15:43:45,086][11217] Saving new best policy, reward=4.487!
[2023-02-28 15:43:47,187][11230] Updated weights for policy 0, policy_version 30 (0.0024)
[2023-02-28 15:43:50,075][11028] Fps is (10 sec: 2867.3, 60 sec: 2383.1, 300 sec: 2383.1). Total num frames: 131072. Throughput: 0: 740.1. Samples: 33304. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:43:50,082][11028] Avg episode reward: [(0, '4.605')]
[2023-02-28 15:43:50,098][11217] Saving new best policy, reward=4.605!
[2023-02-28 15:43:55,075][11028] Fps is (10 sec: 3686.3, 60 sec: 2525.9, 300 sec: 2525.9). Total num frames: 151552. Throughput: 0: 790.6. Samples: 36592. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:43:55,082][11028] Avg episode reward: [(0, '4.486')]
[2023-02-28 15:43:57,356][11230] Updated weights for policy 0, policy_version 40 (0.0014)
[2023-02-28 15:44:00,078][11028] Fps is (10 sec: 4094.7, 60 sec: 2867.0, 300 sec: 2646.5). Total num frames: 172032. Throughput: 0: 888.6. Samples: 43258. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:44:00,085][11028] Avg episode reward: [(0, '4.400')]
[2023-02-28 15:44:05,075][11028] Fps is (10 sec: 3276.9, 60 sec: 3072.0, 300 sec: 2633.1). Total num frames: 184320. Throughput: 0: 859.4. Samples: 47488. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:44:05,078][11028] Avg episode reward: [(0, '4.377')]
[2023-02-28 15:44:10,075][11028] Fps is (10 sec: 2868.2, 60 sec: 3345.1, 300 sec: 2676.1). Total num frames: 200704. Throughput: 0: 860.0. Samples: 49584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:44:10,078][11028] Avg episode reward: [(0, '4.324')]
[2023-02-28 15:44:10,729][11230] Updated weights for policy 0, policy_version 50 (0.0043)
[2023-02-28 15:44:15,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2764.8). Total num frames: 221184. Throughput: 0: 905.8. Samples: 55432. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:44:15,078][11028] Avg episode reward: [(0, '4.236')]
[2023-02-28 15:44:20,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3481.9, 300 sec: 2843.1). Total num frames: 241664. Throughput: 0: 901.6. Samples: 61902. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:44:20,082][11028] Avg episode reward: [(0, '4.513')]
[2023-02-28 15:44:20,300][11230] Updated weights for policy 0, policy_version 60 (0.0020)
[2023-02-28 15:44:25,081][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 2867.2). Total num frames: 258048. Throughput: 0: 874.4. Samples: 64032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:44:25,083][11028] Avg episode reward: [(0, '4.607')]
[2023-02-28 15:44:25,096][11217] Saving new best policy, reward=4.607!
[2023-02-28 15:44:30,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 2888.8). Total num frames: 274432. Throughput: 0: 874.3. Samples: 68384. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:44:30,077][11028] Avg episode reward: [(0, '4.611')]
[2023-02-28 15:44:30,095][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth...
[2023-02-28 15:44:30,228][11217] Saving new best policy, reward=4.611!
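
Checkpoint filenames encode both the policy version and the env-frame count at save time: checkpoint_000000067_274432.pth is policy version 67 at 274,432 frames, and 274432 / 67 = 4096, so each policy update in this run consumes exactly 4096 environment frames (every later checkpoint in the log keeps the same ratio). A tiny decoder sketch:

    # Sketch: decode Sample Factory checkpoint names,
    # e.g. checkpoint_000000067_274432.pth -> (version=67, env_frames=274432).
    import re

    def parse_checkpoint_name(name: str) -> tuple[int, int]:
        m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", name)
        if m is None:
            raise ValueError(f"not a checkpoint filename: {name}")
        return int(m.group(1)), int(m.group(2))

    version, env_frames = parse_checkpoint_name("checkpoint_000000067_274432.pth")
    assert env_frames // version == 4096   # frames per policy update in this run
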
[2023-02-28 15:44:32,929][11230] Updated weights for policy 0, policy_version 70 (0.0037)
[2023-02-28 15:44:35,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 2949.1). Total num frames: 294912. Throughput: 0: 910.7. Samples: 74286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:44:35,078][11028] Avg episode reward: [(0, '4.558')]
[2023-02-28 15:44:40,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3003.7). Total num frames: 315392. Throughput: 0: 911.0. Samples: 77588. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:44:40,078][11028] Avg episode reward: [(0, '4.319')]
[2023-02-28 15:44:43,339][11230] Updated weights for policy 0, policy_version 80 (0.0012)
[2023-02-28 15:44:45,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3016.1). Total num frames: 331776. Throughput: 0: 883.1. Samples: 82996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:44:45,079][11028] Avg episode reward: [(0, '4.387')]
[2023-02-28 15:44:50,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 2991.9). Total num frames: 344064. Throughput: 0: 884.3. Samples: 87280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:44:50,078][11028] Avg episode reward: [(0, '4.321')]
[2023-02-28 15:44:55,079][11028] Fps is (10 sec: 3275.4, 60 sec: 3549.6, 300 sec: 3037.8). Total num frames: 364544. Throughput: 0: 901.1. Samples: 90136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:44:55,081][11028] Avg episode reward: [(0, '4.209')]
[2023-02-28 15:44:55,188][11230] Updated weights for policy 0, policy_version 90 (0.0016)
[2023-02-28 15:45:00,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3113.0). Total num frames: 389120. Throughput: 0: 922.7. Samples: 96954. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:45:00,081][11028] Avg episode reward: [(0, '4.433')]
[2023-02-28 15:45:05,080][11028] Fps is (10 sec: 4095.9, 60 sec: 3686.1, 300 sec: 3119.2). Total num frames: 405504. Throughput: 0: 893.6. Samples: 102116. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:45:05,088][11028] Avg episode reward: [(0, '4.487')]
[2023-02-28 15:45:05,922][11230] Updated weights for policy 0, policy_version 100 (0.0021)
[2023-02-28 15:45:10,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3094.8). Total num frames: 417792. Throughput: 0: 893.0. Samples: 104218. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:45:10,080][11028] Avg episode reward: [(0, '4.517')]
[2023-02-28 15:45:15,075][11028] Fps is (10 sec: 3278.3, 60 sec: 3618.1, 300 sec: 3130.5). Total num frames: 438272. Throughput: 0: 913.0. Samples: 109470. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:45:15,084][11028] Avg episode reward: [(0, '4.681')]
[2023-02-28 15:45:15,086][11217] Saving new best policy, reward=4.681!
[2023-02-28 15:45:17,388][11230] Updated weights for policy 0, policy_version 110 (0.0018)
[2023-02-28 15:45:20,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3163.8). Total num frames: 458752. Throughput: 0: 928.3. Samples: 116058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:45:20,077][11028] Avg episode reward: [(0, '4.725')]
[2023-02-28 15:45:20,112][11217] Saving new best policy, reward=4.725!
[2023-02-28 15:45:25,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3194.9). Total num frames: 479232. Throughput: 0: 915.3. Samples: 118776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:45:25,081][11028] Avg episode reward: [(0, '4.551')]
[2023-02-28 15:45:29,551][11230] Updated weights for policy 0, policy_version 120 (0.0026)
[2023-02-28 15:45:30,076][11028] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3171.1). Total num frames: 491520. Throughput: 0: 889.4. Samples: 123018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:45:30,082][11028] Avg episode reward: [(0, '4.541')]
[2023-02-28 15:45:35,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3174.4). Total num frames: 507904. Throughput: 0: 910.7. Samples: 128262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:45:35,082][11028] Avg episode reward: [(0, '4.495')]
[2023-02-28 15:45:39,569][11230] Updated weights for policy 0, policy_version 130 (0.0020)
[2023-02-28 15:45:40,075][11028] Fps is (10 sec: 4096.2, 60 sec: 3618.1, 300 sec: 3227.2). Total num frames: 532480. Throughput: 0: 922.7. Samples: 131654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:45:40,084][11028] Avg episode reward: [(0, '4.643')]
[2023-02-28 15:45:45,078][11028] Fps is (10 sec: 4094.6, 60 sec: 3617.9, 300 sec: 3228.5). Total num frames: 548864. Throughput: 0: 901.9. Samples: 137542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:45:45,083][11028] Avg episode reward: [(0, '4.599')]
[2023-02-28 15:45:50,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3206.6). Total num frames: 561152. Throughput: 0: 880.9. Samples: 141754. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:45:50,081][11028] Avg episode reward: [(0, '4.723')]
[2023-02-28 15:45:52,891][11230] Updated weights for policy 0, policy_version 140 (0.0012)
[2023-02-28 15:45:55,075][11028] Fps is (10 sec: 3277.9, 60 sec: 3618.4, 300 sec: 3231.3). Total num frames: 581632. Throughput: 0: 883.8. Samples: 143990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:45:55,077][11028] Avg episode reward: [(0, '4.683')]
[2023-02-28 15:46:00,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 606208. Throughput: 0: 917.2. Samples: 150742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:00,077][11028] Avg episode reward: [(0, '4.782')]
[2023-02-28 15:46:00,091][11217] Saving new best policy, reward=4.782!
[2023-02-28 15:46:02,016][11230] Updated weights for policy 0, policy_version 150 (0.0019)
[2023-02-28 15:46:05,078][11028] Fps is (10 sec: 4094.6, 60 sec: 3618.2, 300 sec: 3276.7). Total num frames: 622592. Throughput: 0: 891.8. Samples: 156190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:46:05,081][11028] Avg episode reward: [(0, '4.811')]
[2023-02-28 15:46:05,084][11217] Saving new best policy, reward=4.811!
[2023-02-28 15:46:10,077][11028] Fps is (10 sec: 2866.7, 60 sec: 3618.0, 300 sec: 3255.8). Total num frames: 634880. Throughput: 0: 876.4. Samples: 158214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:46:10,081][11028] Avg episode reward: [(0, '4.688')]
[2023-02-28 15:46:15,075][11028] Fps is (10 sec: 2868.2, 60 sec: 3549.9, 300 sec: 3256.3). Total num frames: 651264. Throughput: 0: 882.5. Samples: 162728. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:46:15,083][11028] Avg episode reward: [(0, '4.584')]
[2023-02-28 15:46:15,340][11230] Updated weights for policy 0, policy_version 160 (0.0022)
[2023-02-28 15:46:20,075][11028] Fps is (10 sec: 4096.7, 60 sec: 3618.1, 300 sec: 3296.8). Total num frames: 675840. Throughput: 0: 913.7. Samples: 169378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:46:20,078][11028] Avg episode reward: [(0, '4.601')]
[2023-02-28 15:46:25,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3296.3). Total num frames: 692224. Throughput: 0: 911.8. Samples: 172684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:46:25,080][11028] Avg episode reward: [(0, '4.753')]
[2023-02-28 15:46:25,323][11230] Updated weights for policy 0, policy_version 170 (0.0013)
[2023-02-28 15:46:30,075][11028] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3295.8). Total num frames: 708608. Throughput: 0: 878.9. Samples: 177088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:30,079][11028] Avg episode reward: [(0, '4.675')]
[2023-02-28 15:46:30,098][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000173_708608.pth...
[2023-02-28 15:46:35,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3295.4). Total num frames: 724992. Throughput: 0: 889.9. Samples: 181798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:35,078][11028] Avg episode reward: [(0, '4.608')]
[2023-02-28 15:46:37,582][11230] Updated weights for policy 0, policy_version 180 (0.0015)
[2023-02-28 15:46:40,075][11028] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3313.2). Total num frames: 745472. Throughput: 0: 916.9. Samples: 185250. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:46:40,082][11028] Avg episode reward: [(0, '4.564')]
[2023-02-28 15:46:45,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3330.2). Total num frames: 765952. Throughput: 0: 911.8. Samples: 191774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:45,080][11028] Avg episode reward: [(0, '4.771')]
[2023-02-28 15:46:48,807][11230] Updated weights for policy 0, policy_version 190 (0.0024)
[2023-02-28 15:46:50,078][11028] Fps is (10 sec: 3275.9, 60 sec: 3618.0, 300 sec: 3311.6). Total num frames: 778240. Throughput: 0: 883.1. Samples: 195928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:50,084][11028] Avg episode reward: [(0, '4.772')]
[2023-02-28 15:46:55,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3310.9). Total num frames: 794624. Throughput: 0: 887.3. Samples: 198142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:46:55,077][11028] Avg episode reward: [(0, '4.941')]
[2023-02-28 15:46:55,089][11217] Saving new best policy, reward=4.941!
[2023-02-28 15:46:59,999][11230] Updated weights for policy 0, policy_version 200 (0.0019)
[2023-02-28 15:47:00,075][11028] Fps is (10 sec: 4097.2, 60 sec: 3549.9, 300 sec: 3343.7). Total num frames: 819200. Throughput: 0: 920.8. Samples: 204162. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:47:00,077][11028] Avg episode reward: [(0, '5.160')]
[2023-02-28 15:47:00,087][11217] Saving new best policy, reward=5.160!
[2023-02-28 15:47:05,077][11028] Fps is (10 sec: 4504.5, 60 sec: 3618.2, 300 sec: 3358.7). Total num frames: 839680. Throughput: 0: 916.1. Samples: 210606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:47:05,080][11028] Avg episode reward: [(0, '4.793')]
[2023-02-28 15:47:10,082][11028] Fps is (10 sec: 3274.6, 60 sec: 3617.8, 300 sec: 3341.0). Total num frames: 851968. Throughput: 0: 889.0. Samples: 212694. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:47:10,084][11028] Avg episode reward: [(0, '4.666')]
[2023-02-28 15:47:12,100][11230] Updated weights for policy 0, policy_version 210 (0.0019)
[2023-02-28 15:47:15,075][11028] Fps is (10 sec: 2867.9, 60 sec: 3618.1, 300 sec: 3339.8). Total num frames: 868352. Throughput: 0: 884.7. Samples: 216900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:47:15,078][11028] Avg episode reward: [(0, '4.849')]
[2023-02-28 15:47:20,075][11028] Fps is (10 sec: 3688.9, 60 sec: 3549.9, 300 sec: 3354.1). Total num frames: 888832. Throughput: 0: 917.9. Samples: 223104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:47:20,077][11028] Avg episode reward: [(0, '4.923')]
[2023-02-28 15:47:22,243][11230] Updated weights for policy 0, policy_version 220 (0.0023)
[2023-02-28 15:47:25,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3383.0). Total num frames: 913408. Throughput: 0: 917.7. Samples: 226548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:47:25,078][11028] Avg episode reward: [(0, '4.588')]
[2023-02-28 15:47:30,076][11028] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3366.2). Total num frames: 925696. Throughput: 0: 888.7. Samples: 231764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:47:30,083][11028] Avg episode reward: [(0, '4.719')]
[2023-02-28 15:47:34,996][11230] Updated weights for policy 0, policy_version 230 (0.0019)
[2023-02-28 15:47:35,081][11028] Fps is (10 sec: 2865.4, 60 sec: 3617.8, 300 sec: 3364.5). Total num frames: 942080. Throughput: 0: 892.6. Samples: 236100. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:47:35,087][11028] Avg episode reward: [(0, '4.849')]
[2023-02-28 15:47:40,075][11028] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3377.4). Total num frames: 962560. Throughput: 0: 912.3. Samples: 239196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:47:40,078][11028] Avg episode reward: [(0, '4.890')]
[2023-02-28 15:47:45,075][11028] Fps is (10 sec: 3688.7, 60 sec: 3549.9, 300 sec: 3375.7). Total num frames: 978944. Throughput: 0: 909.9. Samples: 245106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:47:45,078][11028] Avg episode reward: [(0, '5.183')]
[2023-02-28 15:47:45,081][11217] Saving new best policy, reward=5.183!
[2023-02-28 15:47:45,796][11230] Updated weights for policy 0, policy_version 240 (0.0014)
[2023-02-28 15:47:50,076][11028] Fps is (10 sec: 2866.8, 60 sec: 3550.0, 300 sec: 3360.1). Total num frames: 991232. Throughput: 0: 843.0. Samples: 248540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:47:50,079][11028] Avg episode reward: [(0, '5.119')]
[2023-02-28 15:47:55,075][11028] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1003520. Throughput: 0: 834.7. Samples: 250250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:47:55,084][11028] Avg episode reward: [(0, '5.000')]
[2023-02-28 15:48:00,075][11028] Fps is (10 sec: 2457.9, 60 sec: 3276.8, 300 sec: 3443.4). Total num frames: 1015808. Throughput: 0: 831.3. Samples: 254310. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:48:00,078][11028] Avg episode reward: [(0, '4.866')]
[2023-02-28 15:48:01,013][11230] Updated weights for policy 0, policy_version 250 (0.0047)
[2023-02-28 15:48:05,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3526.7). Total num frames: 1040384. Throughput: 0: 837.6. Samples: 260798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:48:05,081][11028] Avg episode reward: [(0, '5.042')]
[2023-02-28 15:48:10,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3482.0, 300 sec: 3540.6). Total num frames: 1060864. Throughput: 0: 835.2. Samples: 264132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:48:10,081][11028] Avg episode reward: [(0, '5.281')]
[2023-02-28 15:48:10,100][11217] Saving new best policy, reward=5.281!
[2023-02-28 15:48:10,867][11230] Updated weights for policy 0, policy_version 260 (0.0025)
[2023-02-28 15:48:15,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3526.8). Total num frames: 1073152. Throughput: 0: 823.4. Samples: 268816. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:48:15,078][11028] Avg episode reward: [(0, '5.120')]
[2023-02-28 15:48:20,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3540.6). Total num frames: 1089536. Throughput: 0: 822.6. Samples: 273110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:48:20,080][11028] Avg episode reward: [(0, '5.042')]
[2023-02-28 15:48:23,512][11230] Updated weights for policy 0, policy_version 270 (0.0028)
[2023-02-28 15:48:25,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3554.5). Total num frames: 1110016. Throughput: 0: 828.2. Samples: 276466. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:48:25,078][11028] Avg episode reward: [(0, '5.023')]
[2023-02-28 15:48:30,078][11028] Fps is (10 sec: 4504.5, 60 sec: 3481.5, 300 sec: 3568.3). Total num frames: 1134592. Throughput: 0: 849.4. Samples: 283330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:48:30,080][11028] Avg episode reward: [(0, '5.313')]
[2023-02-28 15:48:30,097][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000277_1134592.pth...
[2023-02-28 15:48:30,247][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000067_274432.pth
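
Note the rotation: every periodic save is paired with deletion of the oldest periodic checkpoint, so only the two most recent survive at any time (best-policy snapshots are tracked separately and never rotated here). The keep-last-N behavior in a few lines; N=2 is inferred from this log, not read from the config:

    # Sketch of the keep-last-N checkpoint rotation visible above. Zero-padded
    # version numbers make lexicographic sort equal to chronological sort.
    import glob
    import os

    def rotate_checkpoints(ckpt_dir: str, keep_last: int = 2) -> None:
        ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
        for old in ckpts[:-keep_last]:
            os.remove(old)
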
[2023-02-28 15:48:30,261][11217] Saving new best policy, reward=5.313!
[2023-02-28 15:48:33,900][11230] Updated weights for policy 0, policy_version 280 (0.0026)
[2023-02-28 15:48:35,076][11028] Fps is (10 sec: 3686.4, 60 sec: 3413.7, 300 sec: 3540.6). Total num frames: 1146880. Throughput: 0: 875.0. Samples: 287916. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:48:35,079][11028] Avg episode reward: [(0, '5.277')]
[2023-02-28 15:48:40,075][11028] Fps is (10 sec: 2867.9, 60 sec: 3345.1, 300 sec: 3554.5). Total num frames: 1163264. Throughput: 0: 885.0. Samples: 290074. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:48:40,078][11028] Avg episode reward: [(0, '5.491')]
[2023-02-28 15:48:40,087][11217] Saving new best policy, reward=5.491!
[2023-02-28 15:48:45,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 1183744. Throughput: 0: 922.4. Samples: 295818. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:48:45,082][11028] Avg episode reward: [(0, '5.701')]
[2023-02-28 15:48:45,086][11217] Saving new best policy, reward=5.701!
[2023-02-28 15:48:45,523][11230] Updated weights for policy 0, policy_version 290 (0.0018)
[2023-02-28 15:48:50,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 1208320. Throughput: 0: 925.1. Samples: 302428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:48:50,083][11028] Avg episode reward: [(0, '6.092')]
[2023-02-28 15:48:50,094][11217] Saving new best policy, reward=6.092!
[2023-02-28 15:48:55,075][11028] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1220608. Throughput: 0: 898.4. Samples: 304560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:48:55,080][11028] Avg episode reward: [(0, '6.095')]
[2023-02-28 15:48:55,089][11217] Saving new best policy, reward=6.095!
[2023-02-28 15:48:57,383][11230] Updated weights for policy 0, policy_version 300 (0.0013)
[2023-02-28 15:49:00,078][11028] Fps is (10 sec: 2457.0, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 1232896. Throughput: 0: 886.1. Samples: 308692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:49:00,081][11028] Avg episode reward: [(0, '5.765')]
[2023-02-28 15:49:05,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 1253376. Throughput: 0: 920.9. Samples: 314552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:49:05,077][11028] Avg episode reward: [(0, '5.659')]
[2023-02-28 15:49:08,014][11230] Updated weights for policy 0, policy_version 310 (0.0028)
[2023-02-28 15:49:10,075][11028] Fps is (10 sec: 4506.7, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1277952. Throughput: 0: 919.4. Samples: 317840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:49:10,078][11028] Avg episode reward: [(0, '6.049')]
[2023-02-28 15:49:15,075][11028] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1290240. Throughput: 0: 888.0. Samples: 323288. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:49:15,083][11028] Avg episode reward: [(0, '6.456')]
[2023-02-28 15:49:15,095][11217] Saving new best policy, reward=6.456!
[2023-02-28 15:49:20,079][11028] Fps is (10 sec: 2866.1, 60 sec: 3617.9, 300 sec: 3554.4). Total num frames: 1306624. Throughput: 0: 877.1. Samples: 327388. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:49:20,088][11028] Avg episode reward: [(0, '6.496')]
[2023-02-28 15:49:20,107][11217] Saving new best policy, reward=6.496!
[2023-02-28 15:49:21,239][11230] Updated weights for policy 0, policy_version 320 (0.0014)
[2023-02-28 15:49:25,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1327104. Throughput: 0: 889.6. Samples: 330106. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:49:25,082][11028] Avg episode reward: [(0, '6.828')]
[2023-02-28 15:49:25,087][11217] Saving new best policy, reward=6.828!
[2023-02-28 15:49:30,075][11028] Fps is (10 sec: 4097.6, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 1347584. Throughput: 0: 912.4. Samples: 336876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:49:30,079][11028] Avg episode reward: [(0, '6.597')]
[2023-02-28 15:49:30,346][11230] Updated weights for policy 0, policy_version 330 (0.0024)
[2023-02-28 15:49:35,076][11028] Fps is (10 sec: 3685.9, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 1363968. Throughput: 0: 886.2. Samples: 342306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:49:35,081][11028] Avg episode reward: [(0, '6.616')]
[2023-02-28 15:49:40,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1380352. Throughput: 0: 885.1. Samples: 344390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:49:40,079][11028] Avg episode reward: [(0, '6.804')]
[2023-02-28 15:49:43,239][11230] Updated weights for policy 0, policy_version 340 (0.0027)
[2023-02-28 15:49:45,075][11028] Fps is (10 sec: 3686.9, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1400832. Throughput: 0: 906.2. Samples: 349468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:49:45,078][11028] Avg episode reward: [(0, '7.516')]
[2023-02-28 15:49:45,082][11217] Saving new best policy, reward=7.516!
[2023-02-28 15:49:50,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 1421312. Throughput: 0: 920.7. Samples: 355982. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:49:50,078][11028] Avg episode reward: [(0, '7.915')]
[2023-02-28 15:49:50,090][11217] Saving new best policy, reward=7.915!
[2023-02-28 15:49:52,860][11230] Updated weights for policy 0, policy_version 350 (0.0013)
[2023-02-28 15:49:55,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 1437696. Throughput: 0: 912.4. Samples: 358898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:49:55,077][11028] Avg episode reward: [(0, '8.225')]
[2023-02-28 15:49:55,093][11217] Saving new best policy, reward=8.225!
[2023-02-28 15:50:00,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.3, 300 sec: 3540.7). Total num frames: 1449984. Throughput: 0: 884.1. Samples: 363074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:50:00,084][11028] Avg episode reward: [(0, '8.255')]
[2023-02-28 15:50:00,095][11217] Saving new best policy, reward=8.255!
[2023-02-28 15:50:05,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 1470464. Throughput: 0: 913.9. Samples: 368510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:50:05,083][11028] Avg episode reward: [(0, '8.344')]
[2023-02-28 15:50:05,087][11217] Saving new best policy, reward=8.344!
[2023-02-28 15:50:05,582][11230] Updated weights for policy 0, policy_version 360 (0.0017)
[2023-02-28 15:50:10,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1490944. Throughput: 0: 926.1. Samples: 371782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:50:10,083][11028] Avg episode reward: [(0, '9.197')]
[2023-02-28 15:50:10,098][11217] Saving new best policy, reward=9.197!
[2023-02-28 15:50:15,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1511424. Throughput: 0: 909.6. Samples: 377806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:50:15,079][11028] Avg episode reward: [(0, '9.565')]
[2023-02-28 15:50:15,084][11217] Saving new best policy, reward=9.565!
[2023-02-28 15:50:16,352][11230] Updated weights for policy 0, policy_version 370 (0.0018)
[2023-02-28 15:50:20,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.4, 300 sec: 3540.6). Total num frames: 1523712. Throughput: 0: 882.7. Samples: 382028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:50:20,081][11028] Avg episode reward: [(0, '9.278')]
[2023-02-28 15:50:25,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1540096. Throughput: 0: 886.2. Samples: 384268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:50:25,082][11028] Avg episode reward: [(0, '8.894')]
[2023-02-28 15:50:27,863][11230] Updated weights for policy 0, policy_version 380 (0.0038)
[2023-02-28 15:50:30,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1564672. Throughput: 0: 918.5. Samples: 390802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:50:30,077][11028] Avg episode reward: [(0, '8.840')]
[2023-02-28 15:50:30,090][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000382_1564672.pth...
[2023-02-28 15:50:30,225][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000173_708608.pth
[2023-02-28 15:50:35,077][11028] Fps is (10 sec: 4504.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1585152. Throughput: 0: 907.8. Samples: 396834. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:50:35,079][11028] Avg episode reward: [(0, '9.662')]
[2023-02-28 15:50:35,082][11217] Saving new best policy, reward=9.662!
[2023-02-28 15:50:39,233][11230] Updated weights for policy 0, policy_version 390 (0.0027)
[2023-02-28 15:50:40,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1597440. Throughput: 0: 888.8. Samples: 398896. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:50:40,081][11028] Avg episode reward: [(0, '9.933')]
[2023-02-28 15:50:40,105][11217] Saving new best policy, reward=9.933!
[2023-02-28 15:50:45,075][11028] Fps is (10 sec: 2867.7, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1613824. Throughput: 0: 892.5. Samples: 403238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:50:45,078][11028] Avg episode reward: [(0, '10.415')]
[2023-02-28 15:50:45,082][11217] Saving new best policy, reward=10.415!
[2023-02-28 15:50:50,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1634304. Throughput: 0: 915.7. Samples: 409716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:50:50,081][11028] Avg episode reward: [(0, '10.198')]
[2023-02-28 15:50:50,349][11230] Updated weights for policy 0, policy_version 400 (0.0018)
[2023-02-28 15:50:55,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1654784. Throughput: 0: 916.6. Samples: 413030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:50:55,082][11028] Avg episode reward: [(0, '9.616')]
[2023-02-28 15:51:00,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 1671168. Throughput: 0: 891.4. Samples: 417920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:51:00,082][11028] Avg episode reward: [(0, '9.799')]
[2023-02-28 15:51:02,658][11230] Updated weights for policy 0, policy_version 410 (0.0024)
[2023-02-28 15:51:05,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1683456. Throughput: 0: 895.7. Samples: 422336. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:51:05,077][11028] Avg episode reward: [(0, '9.665')]
[2023-02-28 15:51:10,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1708032. Throughput: 0: 920.5. Samples: 425692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:51:10,078][11028] Avg episode reward: [(0, '10.797')]
[2023-02-28 15:51:10,093][11217] Saving new best policy, reward=10.797!
[2023-02-28 15:51:12,272][11230] Updated weights for policy 0, policy_version 420 (0.0012)
[2023-02-28 15:51:15,077][11028] Fps is (10 sec: 4504.8, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 1728512. Throughput: 0: 922.5. Samples: 432316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:51:15,085][11028] Avg episode reward: [(0, '10.901')]
[2023-02-28 15:51:15,087][11217] Saving new best policy, reward=10.901!
[2023-02-28 15:51:20,078][11028] Fps is (10 sec: 3275.7, 60 sec: 3617.9, 300 sec: 3554.5). Total num frames: 1740800. Throughput: 0: 888.5. Samples: 436818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:51:20,080][11028] Avg episode reward: [(0, '11.118')]
[2023-02-28 15:51:20,144][11217] Saving new best policy, reward=11.118!
[2023-02-28 15:51:25,075][11028] Fps is (10 sec: 2867.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1757184. Throughput: 0: 886.5. Samples: 438788. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-28 15:51:25,078][11028] Avg episode reward: [(0, '11.591')]
[2023-02-28 15:51:25,082][11217] Saving new best policy, reward=11.591!
[2023-02-28 15:51:25,762][11230] Updated weights for policy 0, policy_version 430 (0.0027)
[2023-02-28 15:51:30,075][11028] Fps is (10 sec: 3687.6, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1777664. Throughput: 0: 921.1. Samples: 444688. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-28 15:51:30,081][11028] Avg episode reward: [(0, '11.123')]
[2023-02-28 15:51:34,741][11230] Updated weights for policy 0, policy_version 440 (0.0017)
[2023-02-28 15:51:35,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 1802240. Throughput: 0: 929.2. Samples: 451528. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:51:35,078][11028] Avg episode reward: [(0, '12.126')]
[2023-02-28 15:51:35,084][11217] Saving new best policy, reward=12.126!
[2023-02-28 15:51:40,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1818624. Throughput: 0: 903.3. Samples: 453680. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:51:40,086][11028] Avg episode reward: [(0, '11.465')]
[2023-02-28 15:51:45,075][11028] Fps is (10 sec: 2867.1, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1830912. Throughput: 0: 891.0. Samples: 458014. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:51:45,087][11028] Avg episode reward: [(0, '11.191')]
[2023-02-28 15:51:47,581][11230] Updated weights for policy 0, policy_version 450 (0.0021)
[2023-02-28 15:51:50,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1851392. Throughput: 0: 927.1. Samples: 464054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:51:50,083][11028] Avg episode reward: [(0, '12.184')]
[2023-02-28 15:51:50,093][11217] Saving new best policy, reward=12.184!
[2023-02-28 15:51:55,075][11028] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 1875968. Throughput: 0: 925.5. Samples: 467340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:51:55,078][11028] Avg episode reward: [(0, '11.804')]
[2023-02-28 15:51:57,541][11230] Updated weights for policy 0, policy_version 460 (0.0013)
[2023-02-28 15:52:00,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1888256. Throughput: 0: 893.6. Samples: 472528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:52:00,081][11028] Avg episode reward: [(0, '12.050')]
[2023-02-28 15:52:05,076][11028] Fps is (10 sec: 2867.0, 60 sec: 3686.4, 300 sec: 3568.5). Total num frames: 1904640. Throughput: 0: 888.5. Samples: 476798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:05,080][11028] Avg episode reward: [(0, '10.837')]
[2023-02-28 15:52:09,599][11230] Updated weights for policy 0, policy_version 470 (0.0022)
[2023-02-28 15:52:10,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1925120. Throughput: 0: 910.4. Samples: 479754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:52:10,081][11028] Avg episode reward: [(0, '11.540')]
[2023-02-28 15:52:15,075][11028] Fps is (10 sec: 4096.2, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 1945600. Throughput: 0: 932.0. Samples: 486630. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:52:15,083][11028] Avg episode reward: [(0, '12.094')]
[2023-02-28 15:52:20,077][11028] Fps is (10 sec: 3685.5, 60 sec: 3686.5, 300 sec: 3554.5). Total num frames: 1961984. Throughput: 0: 892.7. Samples: 491700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:20,085][11028] Avg episode reward: [(0, '12.275')]
[2023-02-28 15:52:20,103][11217] Saving new best policy, reward=12.275!
[2023-02-28 15:52:20,730][11230] Updated weights for policy 0, policy_version 480 (0.0035)
[2023-02-28 15:52:25,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1978368. Throughput: 0: 890.4. Samples: 493748. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:25,080][11028] Avg episode reward: [(0, '12.958')]
[2023-02-28 15:52:25,084][11217] Saving new best policy, reward=12.958!
[2023-02-28 15:52:30,075][11028] Fps is (10 sec: 3687.2, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 1998848. Throughput: 0: 915.3. Samples: 499204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:30,078][11028] Avg episode reward: [(0, '12.487')]
[2023-02-28 15:52:30,092][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000488_1998848.pth...
[2023-02-28 15:52:30,233][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000277_1134592.pth
[2023-02-28 15:52:31,840][11230] Updated weights for policy 0, policy_version 490 (0.0041)
[2023-02-28 15:52:35,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2019328. Throughput: 0: 930.4. Samples: 505922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:52:35,078][11028] Avg episode reward: [(0, '12.654')]
[2023-02-28 15:52:40,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2035712. Throughput: 0: 921.2. Samples: 508796. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:52:40,080][11028] Avg episode reward: [(0, '12.802')]
[2023-02-28 15:52:43,401][11230] Updated weights for policy 0, policy_version 500 (0.0028)
[2023-02-28 15:52:45,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 2052096. Throughput: 0: 899.8. Samples: 513020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:52:45,084][11028] Avg episode reward: [(0, '12.836')]
[2023-02-28 15:52:50,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 2072576. Throughput: 0: 931.6. Samples: 518720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:50,078][11028] Avg episode reward: [(0, '13.196')]
[2023-02-28 15:52:50,091][11217] Saving new best policy, reward=13.196!
[2023-02-28 15:52:53,615][11230] Updated weights for policy 0, policy_version 510 (0.0021)
[2023-02-28 15:52:55,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 2093056. Throughput: 0: 938.8. Samples: 522002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:52:55,083][11028] Avg episode reward: [(0, '13.692')]
[2023-02-28 15:52:55,088][11217] Saving new best policy, reward=13.692!
[2023-02-28 15:53:00,077][11028] Fps is (10 sec: 3685.6, 60 sec: 3686.3, 300 sec: 3623.9). Total num frames: 2109440. Throughput: 0: 915.7. Samples: 527840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:53:00,080][11028] Avg episode reward: [(0, '14.161')]
[2023-02-28 15:53:00,091][11217] Saving new best policy, reward=14.161!
[2023-02-28 15:53:05,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2125824. Throughput: 0: 896.5. Samples: 532042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:05,081][11028] Avg episode reward: [(0, '14.522')]
[2023-02-28 15:53:05,088][11217] Saving new best policy, reward=14.522!
[2023-02-28 15:53:06,286][11230] Updated weights for policy 0, policy_version 520 (0.0025)
[2023-02-28 15:53:10,075][11028] Fps is (10 sec: 3277.6, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 2142208. Throughput: 0: 907.7. Samples: 534596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:53:10,081][11028] Avg episode reward: [(0, '14.886')]
[2023-02-28 15:53:10,095][11217] Saving new best policy, reward=14.886!
[2023-02-28 15:53:15,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2166784. Throughput: 0: 934.4. Samples: 541254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:53:15,080][11028] Avg episode reward: [(0, '15.806')]
[2023-02-28 15:53:15,083][11217] Saving new best policy, reward=15.806!
[2023-02-28 15:53:15,699][11230] Updated weights for policy 0, policy_version 530 (0.0016)
[2023-02-28 15:53:20,076][11028] Fps is (10 sec: 4095.6, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 2183168. Throughput: 0: 906.6. Samples: 546722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:20,087][11028] Avg episode reward: [(0, '16.664')]
[2023-02-28 15:53:20,103][11217] Saving new best policy, reward=16.664!
[2023-02-28 15:53:25,077][11028] Fps is (10 sec: 2866.7, 60 sec: 3618.0, 300 sec: 3596.2). Total num frames: 2195456. Throughput: 0: 888.4. Samples: 548776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:25,079][11028] Avg episode reward: [(0, '16.939')]
[2023-02-28 15:53:25,090][11217] Saving new best policy, reward=16.939!
[2023-02-28 15:53:28,625][11230] Updated weights for policy 0, policy_version 540 (0.0018)
[2023-02-28 15:53:30,075][11028] Fps is (10 sec: 3277.1, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 2215936. Throughput: 0: 904.8. Samples: 553738. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:30,077][11028] Avg episode reward: [(0, '16.136')]
[2023-02-28 15:53:35,077][11028] Fps is (10 sec: 4505.4, 60 sec: 3686.3, 300 sec: 3651.7). Total num frames: 2240512. Throughput: 0: 933.8. Samples: 560744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:53:35,083][11028] Avg episode reward: [(0, '14.489')]
[2023-02-28 15:53:37,450][11230] Updated weights for policy 0, policy_version 550 (0.0013)
[2023-02-28 15:53:40,079][11028] Fps is (10 sec: 4094.6, 60 sec: 3686.2, 300 sec: 3637.8). Total num frames: 2256896. Throughput: 0: 934.9. Samples: 564076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:40,081][11028] Avg episode reward: [(0, '14.455')]
[2023-02-28 15:53:45,075][11028] Fps is (10 sec: 3277.5, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2273280. Throughput: 0: 900.3. Samples: 568350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:53:45,081][11028] Avg episode reward: [(0, '14.103')]
[2023-02-28 15:53:50,075][11028] Fps is (10 sec: 3277.9, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 2289664. Throughput: 0: 925.3. Samples: 573682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:53:50,080][11028] Avg episode reward: [(0, '14.111')]
[2023-02-28 15:53:50,265][11230] Updated weights for policy 0, policy_version 560 (0.0020)
[2023-02-28 15:53:55,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2314240. Throughput: 0: 944.6. Samples: 577102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:53:55,081][11028] Avg episode reward: [(0, '15.782')]
[2023-02-28 15:54:00,076][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.5, 300 sec: 3651.7). Total num frames: 2330624. Throughput: 0: 936.4. Samples: 583394. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:54:00,081][11028] Avg episode reward: [(0, '15.783')]
[2023-02-28 15:54:00,286][11230] Updated weights for policy 0, policy_version 570 (0.0013)
[2023-02-28 15:54:05,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 2347008. Throughput: 0: 911.0. Samples: 587718. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:54:05,082][11028] Avg episode reward: [(0, '16.143')]
[2023-02-28 15:54:10,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 2367488. Throughput: 0: 916.1. Samples: 590000. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:54:10,081][11028] Avg episode reward: [(0, '15.853')]
[2023-02-28 15:54:11,937][11230] Updated weights for policy 0, policy_version 580 (0.0018)
[2023-02-28 15:54:15,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2387968. Throughput: 0: 955.8. Samples: 596750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:54:15,077][11028] Avg episode reward: [(0, '16.099')]
[2023-02-28 15:54:20,078][11028] Fps is (10 sec: 3685.2, 60 sec: 3686.2, 300 sec: 3651.6). Total num frames: 2404352. Throughput: 0: 930.5. Samples: 602616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:54:20,081][11028] Avg episode reward: [(0, '16.867')]
[2023-02-28 15:54:22,919][11230] Updated weights for policy 0, policy_version 590 (0.0017)
[2023-02-28 15:54:25,080][11028] Fps is (10 sec: 3275.3, 60 sec: 3754.5, 300 sec: 3637.7). Total num frames: 2420736. Throughput: 0: 905.4. Samples: 604818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:54:25,099][11028] Avg episode reward: [(0, '16.433')]
[2023-02-28 15:54:30,075][11028] Fps is (10 sec: 3277.9, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2437120. Throughput: 0: 910.4. Samples: 609316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:54:30,078][11028] Avg episode reward: [(0, '16.472')]
[2023-02-28 15:54:30,090][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000595_2437120.pth...
[2023-02-28 15:54:30,205][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000382_1564672.pth
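
Checkpoint filenames encode the policy version and the total env-frame count: checkpoint_000000595_2437120.pth is policy_version 595 at 2,437,120 frames, and with keep_checkpoints=2 (see the config dumped later in this log) each save is followed by deleting the oldest file. A sketch of that retention logic, assuming only the naming scheme visible above (not Sample Factory's actual code):

import os
import re

CKPT_RE = re.compile(r"checkpoint_(\d+)_(\d+)\.pth$")

def parse_checkpoint_name(name):
    # 'checkpoint_000000595_2437120.pth' -> (595, 2437120)
    m = CKPT_RE.search(name)
    return int(m.group(1)), int(m.group(2))

def prune_checkpoints(ckpt_dir, keep=2):
    # Zero-padded names sort chronologically, so dropping the head of the
    # sorted list removes the oldest checkpoints, as in the log above.
    ckpts = sorted(f for f in os.listdir(ckpt_dir) if CKPT_RE.search(f))
    for old in ckpts[:-keep]:
        os.remove(os.path.join(ckpt_dir, old))
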
[2023-02-28 15:54:34,072][11230] Updated weights for policy 0, policy_version 600 (0.0030)
[2023-02-28 15:54:35,075][11028] Fps is (10 sec: 4097.8, 60 sec: 3686.5, 300 sec: 3665.6). Total num frames: 2461696. Throughput: 0: 939.5. Samples: 615958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:54:35,078][11028] Avg episode reward: [(0, '17.693')]
[2023-02-28 15:54:35,087][11217] Saving new best policy, reward=17.693!
[2023-02-28 15:54:40,075][11028] Fps is (10 sec: 4096.1, 60 sec: 3686.6, 300 sec: 3651.7). Total num frames: 2478080. Throughput: 0: 935.6. Samples: 619202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:54:40,080][11028] Avg episode reward: [(0, '17.505')]
[2023-02-28 15:54:45,075][11028] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2494464. Throughput: 0: 901.4. Samples: 623956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:54:45,079][11028] Avg episode reward: [(0, '18.024')]
[2023-02-28 15:54:45,085][11217] Saving new best policy, reward=18.024!
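
'Saving new best policy' fires whenever the running average reward beats the previous best; per the config dumped later in this log, the check is presumably gated by save_best_after=100000 frames and rate-limited by save_best_every_sec=5. A minimal sketch of that trigger under those assumptions, with hypothetical names:

import time

class BestPolicySaver:
    def __init__(self, save_best_after=100_000, save_best_every_sec=5):
        self.best = float("-inf")
        self.after = save_best_after
        self.every = save_best_every_sec
        self.last = 0.0

    def maybe_save(self, avg_reward, env_frames, save_fn):
        now = time.monotonic()
        if (env_frames >= self.after
                and avg_reward > self.best
                and now - self.last >= self.every):
            self.best, self.last = avg_reward, now
            save_fn()  # write the 'best_...' checkpoint here
            print(f"Saving new best policy, reward={avg_reward:.3f}!")
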
[2023-02-28 15:54:45,797][11230] Updated weights for policy 0, policy_version 610 (0.0026)
[2023-02-28 15:54:50,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2510848. Throughput: 0: 909.7. Samples: 628654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:54:50,077][11028] Avg episode reward: [(0, '17.172')]
[2023-02-28 15:54:55,075][11028] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 2535424. Throughput: 0: 934.8. Samples: 632068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:54:55,078][11028] Avg episode reward: [(0, '15.935')]
[2023-02-28 15:54:55,902][11230] Updated weights for policy 0, policy_version 620 (0.0016)
[2023-02-28 15:55:00,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2555904. Throughput: 0: 933.7. Samples: 638766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:55:00,086][11028] Avg episode reward: [(0, '15.418')]
[2023-02-28 15:55:05,076][11028] Fps is (10 sec: 3276.6, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2568192. Throughput: 0: 902.3. Samples: 643216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:55:05,081][11028] Avg episode reward: [(0, '15.210')]
[2023-02-28 15:55:08,259][11230] Updated weights for policy 0, policy_version 630 (0.0018)
[2023-02-28 15:55:10,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 2584576. Throughput: 0: 902.0. Samples: 645406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:55:10,078][11028] Avg episode reward: [(0, '15.277')]
[2023-02-28 15:55:15,075][11028] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 2605056. Throughput: 0: 937.3. Samples: 651494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:55:15,078][11028] Avg episode reward: [(0, '16.451')]
[2023-02-28 15:55:17,921][11230] Updated weights for policy 0, policy_version 640 (0.0021)
[2023-02-28 15:55:20,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3754.9, 300 sec: 3693.3). Total num frames: 2629632. Throughput: 0: 938.5. Samples: 658192. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-28 15:55:20,077][11028] Avg episode reward: [(0, '16.100')]
[2023-02-28 15:55:25,078][11028] Fps is (10 sec: 3685.5, 60 sec: 3686.5, 300 sec: 3651.7). Total num frames: 2641920. Throughput: 0: 915.4. Samples: 660398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:55:25,082][11028] Avg episode reward: [(0, '16.718')]
[2023-02-28 15:55:30,078][11028] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2658304. Throughput: 0: 902.1. Samples: 664550. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:55:30,082][11028] Avg episode reward: [(0, '17.707')]
[2023-02-28 15:55:30,869][11230] Updated weights for policy 0, policy_version 650 (0.0016)
[2023-02-28 15:55:35,075][11028] Fps is (10 sec: 3687.3, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 2678784. Throughput: 0: 940.8. Samples: 670988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:55:35,082][11028] Avg episode reward: [(0, '17.657')]
[2023-02-28 15:55:39,632][11230] Updated weights for policy 0, policy_version 660 (0.0022)
[2023-02-28 15:55:40,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 2703360. Throughput: 0: 943.6. Samples: 674528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:55:40,082][11028] Avg episode reward: [(0, '17.678')]
[2023-02-28 15:55:45,075][11028] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2719744. Throughput: 0: 910.5. Samples: 679738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:55:45,078][11028] Avg episode reward: [(0, '18.798')]
[2023-02-28 15:55:45,081][11217] Saving new best policy, reward=18.798!
[2023-02-28 15:55:50,075][11028] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2732032. Throughput: 0: 906.0. Samples: 683984. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:55:50,083][11028] Avg episode reward: [(0, '19.250')]
[2023-02-28 15:55:50,103][11217] Saving new best policy, reward=19.250!
[2023-02-28 15:55:52,615][11230] Updated weights for policy 0, policy_version 670 (0.0017)
[2023-02-28 15:55:55,075][11028] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 2752512. Throughput: 0: 922.4. Samples: 686916. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:55:55,083][11028] Avg episode reward: [(0, '18.018')]
[2023-02-28 15:56:00,075][11028] Fps is (10 sec: 4505.7, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 2777088. Throughput: 0: 937.5. Samples: 693682. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:56:00,077][11028] Avg episode reward: [(0, '17.516')]
[2023-02-28 15:56:02,016][11230] Updated weights for policy 0, policy_version 680 (0.0025)
[2023-02-28 15:56:05,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2793472. Throughput: 0: 901.4. Samples: 698754. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 15:56:05,078][11028] Avg episode reward: [(0, '17.235')]
[2023-02-28 15:56:10,076][11028] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2805760. Throughput: 0: 899.2. Samples: 700860. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-28 15:56:10,085][11028] Avg episode reward: [(0, '16.873')]
[2023-02-28 15:56:14,410][11230] Updated weights for policy 0, policy_version 690 (0.0044)
[2023-02-28 15:56:15,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 2826240. Throughput: 0: 930.1. Samples: 706406. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:56:15,082][11028] Avg episode reward: [(0, '16.367')]
[2023-02-28 15:56:20,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 2846720. Throughput: 0: 935.4. Samples: 713082. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:56:20,078][11028] Avg episode reward: [(0, '16.854')]
[2023-02-28 15:56:25,075][11028] Fps is (10 sec: 3686.3, 60 sec: 3686.5, 300 sec: 3679.5). Total num frames: 2863104. Throughput: 0: 910.9. Samples: 715520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:56:25,081][11028] Avg episode reward: [(0, '19.066')]
[2023-02-28 15:56:25,541][11230] Updated weights for policy 0, policy_version 700 (0.0018)
[2023-02-28 15:56:30,076][11028] Fps is (10 sec: 3276.6, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2879488. Throughput: 0: 889.5. Samples: 719764. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:56:30,080][11028] Avg episode reward: [(0, '19.649')]
[2023-02-28 15:56:30,095][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000703_2879488.pth...
[2023-02-28 15:56:30,238][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000488_1998848.pth
[2023-02-28 15:56:30,255][11217] Saving new best policy, reward=19.649!
[2023-02-28 15:56:35,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2899968. Throughput: 0: 925.0. Samples: 725610. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:56:35,077][11028] Avg episode reward: [(0, '18.937')]
[2023-02-28 15:56:36,532][11230] Updated weights for policy 0, policy_version 710 (0.0039)
[2023-02-28 15:56:40,075][11028] Fps is (10 sec: 4096.3, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 2920448. Throughput: 0: 935.8. Samples: 729026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:56:40,080][11028] Avg episode reward: [(0, '19.545')]
[2023-02-28 15:56:45,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3679.5). Total num frames: 2936832. Throughput: 0: 913.6. Samples: 734794. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:56:45,078][11028] Avg episode reward: [(0, '19.678')]
[2023-02-28 15:56:45,080][11217] Saving new best policy, reward=19.678!
[2023-02-28 15:56:48,370][11230] Updated weights for policy 0, policy_version 720 (0.0016)
[2023-02-28 15:56:50,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 2953216. Throughput: 0: 892.6. Samples: 738922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:56:50,081][11028] Avg episode reward: [(0, '18.819')]
[2023-02-28 15:56:55,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 2965504. Throughput: 0: 884.5. Samples: 740662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:56:55,080][11028] Avg episode reward: [(0, '17.972')]
[2023-02-28 15:57:00,075][11028] Fps is (10 sec: 2457.7, 60 sec: 3345.1, 300 sec: 3637.8). Total num frames: 2977792. Throughput: 0: 854.2. Samples: 744846. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:57:00,081][11028] Avg episode reward: [(0, '19.307')]
[2023-02-28 15:57:02,307][11230] Updated weights for policy 0, policy_version 730 (0.0033)
[2023-02-28 15:57:05,077][11028] Fps is (10 sec: 3276.3, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 2998272. Throughput: 0: 823.8. Samples: 750156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:57:05,081][11028] Avg episode reward: [(0, '18.943')]
[2023-02-28 15:57:10,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3610.0). Total num frames: 3010560. Throughput: 0: 818.9. Samples: 752370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:57:10,078][11028] Avg episode reward: [(0, '18.384')]
[2023-02-28 15:57:15,062][11230] Updated weights for policy 0, policy_version 740 (0.0014)
[2023-02-28 15:57:15,075][11028] Fps is (10 sec: 3277.2, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 3031040. Throughput: 0: 827.1. Samples: 756982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:57:15,080][11028] Avg episode reward: [(0, '19.359')]
[2023-02-28 15:57:20,075][11028] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 3051520. Throughput: 0: 847.1. Samples: 763728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:57:20,081][11028] Avg episode reward: [(0, '20.376')]
[2023-02-28 15:57:20,093][11217] Saving new best policy, reward=20.376!
[2023-02-28 15:57:24,642][11230] Updated weights for policy 0, policy_version 750 (0.0014)
[2023-02-28 15:57:25,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 3072000. Throughput: 0: 845.6. Samples: 767080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:57:25,078][11028] Avg episode reward: [(0, '22.173')]
[2023-02-28 15:57:25,087][11217] Saving new best policy, reward=22.173!
[2023-02-28 15:57:30,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3610.0). Total num frames: 3084288. Throughput: 0: 812.7. Samples: 771366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:57:30,079][11028] Avg episode reward: [(0, '21.494')]
[2023-02-28 15:57:35,075][11028] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3610.0). Total num frames: 3100672. Throughput: 0: 830.0. Samples: 776270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:57:35,077][11028] Avg episode reward: [(0, '21.814')]
[2023-02-28 15:57:37,108][11230] Updated weights for policy 0, policy_version 760 (0.0015)
[2023-02-28 15:57:40,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3637.8). Total num frames: 3125248. Throughput: 0: 869.6. Samples: 779796. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:57:40,078][11028] Avg episode reward: [(0, '21.806')]
[2023-02-28 15:57:45,075][11028] Fps is (10 sec: 4505.4, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 3145728. Throughput: 0: 929.3. Samples: 786666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:57:45,078][11028] Avg episode reward: [(0, '20.597')]
[2023-02-28 15:57:47,245][11230] Updated weights for policy 0, policy_version 770 (0.0032)
[2023-02-28 15:57:50,079][11028] Fps is (10 sec: 3275.7, 60 sec: 3413.2, 300 sec: 3610.0). Total num frames: 3158016. Throughput: 0: 906.3. Samples: 790942. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:57:50,084][11028] Avg episode reward: [(0, '21.074')]
[2023-02-28 15:57:55,075][11028] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3610.1). Total num frames: 3174400. Throughput: 0: 905.6. Samples: 793120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:57:55,084][11028] Avg episode reward: [(0, '20.474')]
[2023-02-28 15:57:58,904][11230] Updated weights for policy 0, policy_version 780 (0.0018)
[2023-02-28 15:58:00,075][11028] Fps is (10 sec: 4097.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3198976. Throughput: 0: 943.2. Samples: 799426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:58:00,083][11028] Avg episode reward: [(0, '21.947')]
[2023-02-28 15:58:05,081][11028] Fps is (10 sec: 4503.1, 60 sec: 3686.1, 300 sec: 3651.6). Total num frames: 3219456. Throughput: 0: 933.4. Samples: 805738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:58:05,083][11028] Avg episode reward: [(0, '21.589')]
[2023-02-28 15:58:10,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3231744. Throughput: 0: 905.2. Samples: 807816. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:58:10,082][11028] Avg episode reward: [(0, '22.278')]
[2023-02-28 15:58:10,094][11217] Saving new best policy, reward=22.278!
[2023-02-28 15:58:10,454][11230] Updated weights for policy 0, policy_version 790 (0.0012)
[2023-02-28 15:58:15,080][11028] Fps is (10 sec: 2867.3, 60 sec: 3617.8, 300 sec: 3610.0). Total num frames: 3248128. Throughput: 0: 905.1. Samples: 812098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:58:15,082][11028] Avg episode reward: [(0, '22.015')]
[2023-02-28 15:58:20,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3272704. Throughput: 0: 942.0. Samples: 818658. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:58:20,078][11028] Avg episode reward: [(0, '21.924')]
[2023-02-28 15:58:20,874][11230] Updated weights for policy 0, policy_version 800 (0.0022)
[2023-02-28 15:58:25,076][11028] Fps is (10 sec: 4507.3, 60 sec: 3686.3, 300 sec: 3651.7). Total num frames: 3293184. Throughput: 0: 941.4. Samples: 822158. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 15:58:25,082][11028] Avg episode reward: [(0, '22.042')]
[2023-02-28 15:58:30,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 3309568. Throughput: 0: 900.7. Samples: 827196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:58:30,081][11028] Avg episode reward: [(0, '21.591')]
[2023-02-28 15:58:30,096][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000808_3309568.pth...
[2023-02-28 15:58:30,271][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000595_2437120.pth
[2023-02-28 15:58:32,812][11230] Updated weights for policy 0, policy_version 810 (0.0012)
[2023-02-28 15:58:35,075][11028] Fps is (10 sec: 2867.6, 60 sec: 3686.4, 300 sec: 3610.1). Total num frames: 3321856. Throughput: 0: 902.4. Samples: 831548. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:58:35,078][11028] Avg episode reward: [(0, '21.774')]
[2023-02-28 15:58:40,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3346432. Throughput: 0: 931.0. Samples: 835014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:58:40,078][11028] Avg episode reward: [(0, '21.136')]
[2023-02-28 15:58:42,569][11230] Updated weights for policy 0, policy_version 820 (0.0017)
[2023-02-28 15:58:45,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3366912. Throughput: 0: 940.2. Samples: 841734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 15:58:45,081][11028] Avg episode reward: [(0, '20.347')]
[2023-02-28 15:58:50,077][11028] Fps is (10 sec: 3685.8, 60 sec: 3754.8, 300 sec: 3623.9). Total num frames: 3383296. Throughput: 0: 905.1. Samples: 846464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:58:50,079][11028] Avg episode reward: [(0, '19.532')]
[2023-02-28 15:58:55,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3395584. Throughput: 0: 906.6. Samples: 848612. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:58:55,080][11028] Avg episode reward: [(0, '19.507')]
[2023-02-28 15:58:55,583][11230] Updated weights for policy 0, policy_version 830 (0.0016)
[2023-02-28 15:59:00,075][11028] Fps is (10 sec: 3277.3, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3416064. Throughput: 0: 939.4. Samples: 854366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:59:00,080][11028] Avg episode reward: [(0, '20.377')]
[2023-02-28 15:59:04,619][11230] Updated weights for policy 0, policy_version 840 (0.0017)
[2023-02-28 15:59:05,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3686.7, 300 sec: 3637.8). Total num frames: 3440640. Throughput: 0: 946.2. Samples: 861238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 15:59:05,078][11028] Avg episode reward: [(0, '20.965')]
[2023-02-28 15:59:10,077][11028] Fps is (10 sec: 4095.3, 60 sec: 3754.6, 300 sec: 3623.9). Total num frames: 3457024. Throughput: 0: 921.9. Samples: 863646. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:59:10,080][11028] Avg episode reward: [(0, '19.822')]
[2023-02-28 15:59:15,078][11028] Fps is (10 sec: 2866.5, 60 sec: 3686.6, 300 sec: 3610.0). Total num frames: 3469312. Throughput: 0: 906.5. Samples: 867990. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:59:15,086][11028] Avg episode reward: [(0, '22.136')]
[2023-02-28 15:59:17,408][11230] Updated weights for policy 0, policy_version 850 (0.0017)
[2023-02-28 15:59:20,075][11028] Fps is (10 sec: 3277.3, 60 sec: 3618.1, 300 sec: 3624.0). Total num frames: 3489792. Throughput: 0: 942.8. Samples: 873972. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:59:20,078][11028] Avg episode reward: [(0, '23.026')]
[2023-02-28 15:59:20,095][11217] Saving new best policy, reward=23.026!
[2023-02-28 15:59:25,075][11028] Fps is (10 sec: 4506.7, 60 sec: 3686.5, 300 sec: 3651.7). Total num frames: 3514368. Throughput: 0: 941.7. Samples: 877392. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-28 15:59:25,077][11028] Avg episode reward: [(0, '23.522')]
[2023-02-28 15:59:25,083][11217] Saving new best policy, reward=23.522!
[2023-02-28 15:59:26,707][11230] Updated weights for policy 0, policy_version 860 (0.0021)
[2023-02-28 15:59:30,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3530752. Throughput: 0: 915.1. Samples: 882912. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 15:59:30,082][11028] Avg episode reward: [(0, '23.820')]
[2023-02-28 15:59:30,097][11217] Saving new best policy, reward=23.820!
[2023-02-28 15:59:35,075][11028] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3543040. Throughput: 0: 904.6. Samples: 887168. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-28 15:59:35,078][11028] Avg episode reward: [(0, '23.374')]
[2023-02-28 15:59:39,171][11230] Updated weights for policy 0, policy_version 870 (0.0019)
[2023-02-28 15:59:40,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3567616. Throughput: 0: 922.6. Samples: 890128. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:59:40,078][11028] Avg episode reward: [(0, '24.945')]
[2023-02-28 15:59:40,090][11217] Saving new best policy, reward=24.945!
[2023-02-28 15:59:45,075][11028] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3588096. Throughput: 0: 944.0. Samples: 896846. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 15:59:45,085][11028] Avg episode reward: [(0, '25.451')]
[2023-02-28 15:59:45,087][11217] Saving new best policy, reward=25.451!
[2023-02-28 15:59:49,240][11230] Updated weights for policy 0, policy_version 880 (0.0027)
[2023-02-28 15:59:50,077][11028] Fps is (10 sec: 3685.5, 60 sec: 3686.3, 300 sec: 3623.9). Total num frames: 3604480. Throughput: 0: 906.7. Samples: 902042. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 15:59:50,080][11028] Avg episode reward: [(0, '23.852')]
[2023-02-28 15:59:55,079][11028] Fps is (10 sec: 2866.1, 60 sec: 3686.2, 300 sec: 3596.1). Total num frames: 3616768. Throughput: 0: 899.6. Samples: 904128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 15:59:55,087][11028] Avg episode reward: [(0, '23.875')]
[2023-02-28 16:00:00,077][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.3, 300 sec: 3623.9). Total num frames: 3637248. Throughput: 0: 921.3. Samples: 909450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:00:00,079][11028] Avg episode reward: [(0, '23.750')]
[2023-02-28 16:00:01,044][11230] Updated weights for policy 0, policy_version 890 (0.0016)
[2023-02-28 16:00:05,076][11028] Fps is (10 sec: 4507.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3661824. Throughput: 0: 940.9. Samples: 916312. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:00:05,080][11028] Avg episode reward: [(0, '22.588')]
[2023-02-28 16:00:10,075][11028] Fps is (10 sec: 4097.0, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 3678208. Throughput: 0: 925.1. Samples: 919022. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:00:10,077][11028] Avg episode reward: [(0, '21.334')]
[2023-02-28 16:00:12,403][11230] Updated weights for policy 0, policy_version 900 (0.0029)
[2023-02-28 16:00:15,077][11028] Fps is (10 sec: 2866.9, 60 sec: 3686.5, 300 sec: 3596.1). Total num frames: 3690496. Throughput: 0: 896.5. Samples: 923254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:00:15,081][11028] Avg episode reward: [(0, '21.474')]
[2023-02-28 16:00:20,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3710976. Throughput: 0: 926.8. Samples: 928874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:00:20,077][11028] Avg episode reward: [(0, '20.988')]
[2023-02-28 16:00:22,846][11230] Updated weights for policy 0, policy_version 910 (0.0028)
[2023-02-28 16:00:25,075][11028] Fps is (10 sec: 4506.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3735552. Throughput: 0: 940.3. Samples: 932440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:00:25,083][11028] Avg episode reward: [(0, '20.505')]
[2023-02-28 16:00:30,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3751936. Throughput: 0: 926.1. Samples: 938522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:00:30,085][11028] Avg episode reward: [(0, '20.526')]
[2023-02-28 16:00:30,101][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000916_3751936.pth...
[2023-02-28 16:00:30,260][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000703_2879488.pth
[2023-02-28 16:00:34,706][11230] Updated weights for policy 0, policy_version 920 (0.0025)
[2023-02-28 16:00:35,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 3768320. Throughput: 0: 903.6. Samples: 942702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:00:35,078][11028] Avg episode reward: [(0, '21.453')]
[2023-02-28 16:00:40,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 3784704. Throughput: 0: 914.1. Samples: 945258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:00:40,078][11028] Avg episode reward: [(0, '20.953')]
[2023-02-28 16:00:44,566][11230] Updated weights for policy 0, policy_version 930 (0.0012)
[2023-02-28 16:00:45,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3809280. Throughput: 0: 949.7. Samples: 952182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:00:45,077][11028] Avg episode reward: [(0, '22.913')]
[2023-02-28 16:00:50,078][11028] Fps is (10 sec: 4504.5, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3829760. Throughput: 0: 922.2. Samples: 957814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:00:50,081][11028] Avg episode reward: [(0, '22.317')]
[2023-02-28 16:00:55,076][11028] Fps is (10 sec: 3276.7, 60 sec: 3754.9, 300 sec: 3610.0). Total num frames: 3842048. Throughput: 0: 909.8. Samples: 959964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:00:55,085][11028] Avg episode reward: [(0, '20.630')]
[2023-02-28 16:00:57,432][11230] Updated weights for policy 0, policy_version 940 (0.0034)
[2023-02-28 16:01:00,075][11028] Fps is (10 sec: 2867.9, 60 sec: 3686.5, 300 sec: 3610.0). Total num frames: 3858432. Throughput: 0: 924.6. Samples: 964858. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:01:00,078][11028] Avg episode reward: [(0, '21.287')]
[2023-02-28 16:01:05,075][11028] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3883008. Throughput: 0: 943.8. Samples: 971346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:01:05,078][11028] Avg episode reward: [(0, '21.355')]
[2023-02-28 16:01:06,720][11230] Updated weights for policy 0, policy_version 950 (0.0017)
[2023-02-28 16:01:10,075][11028] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3899392. Throughput: 0: 934.5. Samples: 974494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:01:10,083][11028] Avg episode reward: [(0, '21.814')]
[2023-02-28 16:01:15,077][11028] Fps is (10 sec: 3276.3, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 3915776. Throughput: 0: 894.2. Samples: 978760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:01:15,080][11028] Avg episode reward: [(0, '20.759')]
[2023-02-28 16:01:19,678][11230] Updated weights for policy 0, policy_version 960 (0.0013)
[2023-02-28 16:01:20,075][11028] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3932160. Throughput: 0: 915.0. Samples: 983878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:01:20,078][11028] Avg episode reward: [(0, '22.124')]
[2023-02-28 16:01:25,075][11028] Fps is (10 sec: 3686.9, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 3952640. Throughput: 0: 931.1. Samples: 987156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:01:25,082][11028] Avg episode reward: [(0, '23.933')]
[2023-02-28 16:01:29,279][11230] Updated weights for policy 0, policy_version 970 (0.0016)
[2023-02-28 16:01:30,081][11028] Fps is (10 sec: 4093.8, 60 sec: 3686.1, 300 sec: 3637.7). Total num frames: 3973120. Throughput: 0: 918.4. Samples: 993516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:01:30,083][11028] Avg episode reward: [(0, '23.391')]
[2023-02-28 16:01:35,075][11028] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3989504. Throughput: 0: 890.0. Samples: 997864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:01:35,078][11028] Avg episode reward: [(0, '23.083')]
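
Note that "Total num frames" runs at roughly four times "Samples" throughout these reports: the Doom envs use env_frameskip=4 (see the config later in this log) with summaries_use_frameskip=True, so each policy sample advances the environment four frames. Checking against the report just above:

env_frameskip = 4
samples = 997_864         # 'Samples:' in the last report
total_frames = 3_989_504  # 'Total num frames:' in the last report

print(samples * env_frameskip)  # 3991456, within ~0.05% of total_frames
# The small gap is expected: frame and sample counters are updated
# asynchronously (note the nonzero 'Policy #0 lag' values).
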
[2023-02-28 16:01:39,685][11217] Stopping Batcher_0...
[2023-02-28 16:01:39,694][11217] Loop batcher_evt_loop terminating...
[2023-02-28 16:01:39,687][11028] Component Batcher_0 stopped!
[2023-02-28 16:01:39,696][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-28 16:01:39,736][11230] Weights refcount: 2 0
[2023-02-28 16:01:39,748][11230] Stopping InferenceWorker_p0-w0...
[2023-02-28 16:01:39,745][11028] Component RolloutWorker_w3 stopped!
[2023-02-28 16:01:39,752][11230] Loop inference_proc0-0_evt_loop terminating...
[2023-02-28 16:01:39,750][11028] Component InferenceWorker_p0-w0 stopped!
[2023-02-28 16:01:39,745][11235] Stopping RolloutWorker_w3...
[2023-02-28 16:01:39,765][11235] Loop rollout_proc3_evt_loop terminating...
[2023-02-28 16:01:39,778][11243] Stopping RolloutWorker_w7...
[2023-02-28 16:01:39,777][11028] Component RolloutWorker_w6 stopped!
[2023-02-28 16:01:39,779][11241] Stopping RolloutWorker_w5...
[2023-02-28 16:01:39,779][11232] Stopping RolloutWorker_w1...
[2023-02-28 16:01:39,782][11028] Component RolloutWorker_w7 stopped!
[2023-02-28 16:01:39,786][11028] Component RolloutWorker_w5 stopped!
[2023-02-28 16:01:39,787][11028] Component RolloutWorker_w1 stopped!
[2023-02-28 16:01:39,793][11242] Stopping RolloutWorker_w6...
[2023-02-28 16:01:39,781][11243] Loop rollout_proc7_evt_loop terminating...
[2023-02-28 16:01:39,784][11241] Loop rollout_proc5_evt_loop terminating...
[2023-02-28 16:01:39,785][11232] Loop rollout_proc1_evt_loop terminating...
[2023-02-28 16:01:39,803][11028] Component RolloutWorker_w2 stopped!
[2023-02-28 16:01:39,808][11234] Stopping RolloutWorker_w2...
[2023-02-28 16:01:39,793][11242] Loop rollout_proc6_evt_loop terminating...
[2023-02-28 16:01:39,815][11028] Component RolloutWorker_w4 stopped!
[2023-02-28 16:01:39,818][11239] Stopping RolloutWorker_w4...
[2023-02-28 16:01:39,821][11239] Loop rollout_proc4_evt_loop terminating...
[2023-02-28 16:01:39,822][11234] Loop rollout_proc2_evt_loop terminating...
[2023-02-28 16:01:39,837][11028] Component RolloutWorker_w0 stopped!
[2023-02-28 16:01:39,841][11231] Stopping RolloutWorker_w0...
[2023-02-28 16:01:39,847][11231] Loop rollout_proc0_evt_loop terminating...
[2023-02-28 16:01:39,877][11217] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000808_3309568.pth
[2023-02-28 16:01:39,885][11217] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-28 16:01:40,100][11217] Stopping LearnerWorker_p0...
[2023-02-28 16:01:40,101][11217] Loop learner_proc0_evt_loop terminating...
[2023-02-28 16:01:40,100][11028] Component LearnerWorker_p0 stopped!
[2023-02-28 16:01:40,106][11028] Waiting for process learner_proc0 to stop...
[2023-02-28 16:01:41,949][11028] Waiting for process inference_proc0-0 to join...
[2023-02-28 16:01:42,229][11028] Waiting for process rollout_proc0 to join...
[2023-02-28 16:01:42,235][11028] Waiting for process rollout_proc1 to join...
[2023-02-28 16:01:42,938][11028] Waiting for process rollout_proc2 to join...
[2023-02-28 16:01:42,939][11028] Waiting for process rollout_proc3 to join...
[2023-02-28 16:01:42,941][11028] Waiting for process rollout_proc4 to join...
[2023-02-28 16:01:42,942][11028] Waiting for process rollout_proc5 to join...
[2023-02-28 16:01:42,943][11028] Waiting for process rollout_proc6 to join...
[2023-02-28 16:01:42,944][11028] Waiting for process rollout_proc7 to join...
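
The shutdown above is the standard stop-then-join pattern: each component is told to stop, its event loop terminates, and the parent then waits on the worker processes in order. A generic sketch of the pattern with plain multiprocessing (not Sample Factory's actual event-loop code):

import multiprocessing as mp
import time

def worker(stop_event):
    while not stop_event.is_set():
        time.sleep(0.01)  # rollout / inference work would happen here
    print("loop terminating...")

if __name__ == "__main__":
    stop = mp.Event()
    procs = [mp.Process(target=worker, args=(stop,)) for _ in range(8)]
    for p in procs:
        p.start()
    stop.set()        # the 'Stopping ...' phase
    for p in procs:
        p.join()      # the 'Waiting for process ... to join' phase
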
[2023-02-28 16:01:42,945][11028] Batcher 0 profile tree view:
batching: 25.6160, releasing_batches: 0.0244
[2023-02-28 16:01:42,947][11028] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 536.9757
update_model: 8.0150
  weight_update: 0.0026
one_step: 0.0106
  handle_policy_step: 535.6958
    deserialize: 15.0421, stack: 2.9491, obs_to_device_normalize: 115.8096, forward: 261.3291, send_messages: 26.1026
    prepare_outputs: 87.6382
      to_cpu: 55.1796
[2023-02-28 16:01:42,948][11028] Learner 0 profile tree view:
misc: 0.0055, prepare_batch: 16.2617
train: 76.6077
  epoch_init: 0.0083, minibatch_init: 0.0109, losses_postprocess: 0.5747, kl_divergence: 0.6282, after_optimizer: 33.1540
  calculate_losses: 27.1419
    losses_init: 0.0047, forward_head: 1.7557, bptt_initial: 17.9712, tail: 1.1279, advantages_returns: 0.3304, losses: 3.3024
    bptt: 2.3486
      bptt_forward_core: 2.2612
  update: 14.4677
    clip: 1.3797
[2023-02-28 16:01:42,949][11028] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4108, enqueue_policy_requests: 145.1765, env_step: 843.8149, overhead: 22.4390, complete_rollouts: 7.1154
save_policy_outputs: 20.5511
  split_output_tensors: 10.2239
[2023-02-28 16:01:42,951][11028] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3180, enqueue_policy_requests: 147.7048, env_step: 841.0940, overhead: 22.0080, complete_rollouts: 7.8867
save_policy_outputs: 20.3061
  split_output_tensors: 9.9457
[2023-02-28 16:01:42,952][11028] Loop Runner_EvtLoop terminating...
[2023-02-28 16:01:42,954][11028] Runner profile tree view:
main_loop: 1147.8341
[2023-02-28 16:01:42,955][11028] Collected {0: 4005888}, FPS: 3490.0
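
The summary line is internally consistent: 4,005,888 env frames over the 1147.8-second main loop reported just above works out to the stated ~3490 FPS.

frames = 4_005_888          # Collected {0: 4005888}
main_loop_sec = 1147.8341   # Runner profile, main_loop
print(f"{frames / main_loop_sec:.1f}")  # 3490.0
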
[2023-02-28 16:02:10,307][11028] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-28 16:02:10,310][11028] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-28 16:02:10,312][11028] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-28 16:02:10,315][11028] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-28 16:02:10,316][11028] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:02:10,319][11028] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-28 16:02:10,321][11028] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:02:10,322][11028] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-28 16:02:10,323][11028] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-28 16:02:10,325][11028] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-28 16:02:10,329][11028] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-28 16:02:10,330][11028] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-28 16:02:10,332][11028] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-28 16:02:10,333][11028] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-28 16:02:10,334][11028] Using frameskip 1 and render_action_repeat=4 for evaluation
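
This block is Sample Factory's evaluation entry point (enjoy) reloading config.json and layering the command-line overrides listed above on top. parse_sf_args, parse_full_cfg and enjoy are real sample-factory 2.x entry points, but the exact wiring below (and the prior registration of the Doom envs, already done earlier in this process) is assumed from the Deep RL course notebook that produces logs like this one:

from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.enjoy import enjoy

argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_episodes=10",
]
parser, _ = parse_sf_args(argv=argv, evaluation=True)  # adds eval-only args
cfg = parse_full_cfg(parser, argv=argv)
status = enjoy(cfg)
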
[2023-02-28 16:02:10,359][11028] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:02:10,361][11028] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:02:10,366][11028] RunningMeanStd input shape: (1,)
[2023-02-28 16:02:10,382][11028] ConvEncoder: input_channels=3
[2023-02-28 16:02:11,028][11028] Conv encoder output size: 512
[2023-02-28 16:02:11,031][11028] Policy head output size: 512
[2023-02-28 16:02:13,597][11028] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
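
The checkpoint is an ordinary torch pickle, so it can be inspected directly; the exact contents are Sample Factory internals and are not guaranteed by this sketch:

import torch

path = "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth"
ckpt = torch.load(path, map_location="cpu")
# Typically a dict carrying the model state_dict plus training counters
# (env steps, policy version, optimizer state, ...) — print and verify.
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))
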
[2023-02-28 16:02:15,393][11028] Num frames 100...
[2023-02-28 16:02:15,552][11028] Num frames 200...
[2023-02-28 16:02:15,716][11028] Num frames 300...
[2023-02-28 16:02:15,883][11028] Num frames 400...
[2023-02-28 16:02:16,029][11028] Avg episode rewards: #0: 8.560, true rewards: #0: 4.560
[2023-02-28 16:02:16,032][11028] Avg episode reward: 8.560, avg true_objective: 4.560
[2023-02-28 16:02:16,104][11028] Num frames 500...
[2023-02-28 16:02:16,230][11028] Num frames 600...
[2023-02-28 16:02:16,349][11028] Num frames 700...
[2023-02-28 16:02:16,457][11028] Num frames 800...
[2023-02-28 16:02:16,567][11028] Num frames 900...
[2023-02-28 16:02:16,681][11028] Num frames 1000...
[2023-02-28 16:02:16,790][11028] Num frames 1100...
[2023-02-28 16:02:16,904][11028] Num frames 1200...
[2023-02-28 16:02:17,016][11028] Num frames 1300...
[2023-02-28 16:02:17,129][11028] Num frames 1400...
[2023-02-28 16:02:17,239][11028] Avg episode rewards: #0: 15.240, true rewards: #0: 7.240
[2023-02-28 16:02:17,245][11028] Avg episode reward: 15.240, avg true_objective: 7.240
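
'Avg episode rewards' is a running mean over the episodes finished so far, so individual returns can be recovered from consecutive reports: the mean was 8.560 after one episode and 15.240 after two, which implies episode 2 returned 2 * 15.240 - 1 * 8.560 = 21.92.

def episode_return(n, mean_n, mean_prev):
    # Return of episode n, given running means after n and n-1 episodes.
    return n * mean_n - (n - 1) * mean_prev

print(round(episode_return(2, 15.240, 8.560), 3))  # 21.92
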
[2023-02-28 16:02:17,308][11028] Num frames 1500...
[2023-02-28 16:02:17,421][11028] Num frames 1600...
[2023-02-28 16:02:17,536][11028] Num frames 1700...
[2023-02-28 16:02:17,650][11028] Num frames 1800...
[2023-02-28 16:02:17,763][11028] Num frames 1900...
[2023-02-28 16:02:17,919][11028] Avg episode rewards: #0: 12.640, true rewards: #0: 6.640
[2023-02-28 16:02:17,927][11028] Avg episode reward: 12.640, avg true_objective: 6.640
[2023-02-28 16:02:17,940][11028] Num frames 2000...
[2023-02-28 16:02:18,052][11028] Num frames 2100...
[2023-02-28 16:02:18,163][11028] Num frames 2200...
[2023-02-28 16:02:18,274][11028] Num frames 2300...
[2023-02-28 16:02:18,394][11028] Num frames 2400...
[2023-02-28 16:02:18,503][11028] Num frames 2500...
[2023-02-28 16:02:18,614][11028] Num frames 2600...
[2023-02-28 16:02:18,792][11028] Avg episode rewards: #0: 12.490, true rewards: #0: 6.740
[2023-02-28 16:02:18,795][11028] Avg episode reward: 12.490, avg true_objective: 6.740
[2023-02-28 16:02:18,803][11028] Num frames 2700...
[2023-02-28 16:02:18,937][11028] Num frames 2800...
[2023-02-28 16:02:19,048][11028] Num frames 2900...
[2023-02-28 16:02:19,157][11028] Num frames 3000...
[2023-02-28 16:02:19,265][11028] Num frames 3100...
[2023-02-28 16:02:19,387][11028] Num frames 3200...
[2023-02-28 16:02:19,522][11028] Avg episode rewards: #0: 11.946, true rewards: #0: 6.546
[2023-02-28 16:02:19,524][11028] Avg episode reward: 11.946, avg true_objective: 6.546
[2023-02-28 16:02:19,558][11028] Num frames 3300...
[2023-02-28 16:02:19,668][11028] Num frames 3400...
[2023-02-28 16:02:19,788][11028] Num frames 3500...
[2023-02-28 16:02:19,903][11028] Num frames 3600...
[2023-02-28 16:02:20,015][11028] Num frames 3700...
[2023-02-28 16:02:20,127][11028] Num frames 3800...
[2023-02-28 16:02:20,240][11028] Num frames 3900...
[2023-02-28 16:02:20,355][11028] Num frames 4000...
[2023-02-28 16:02:20,472][11028] Num frames 4100...
[2023-02-28 16:02:20,582][11028] Num frames 4200...
[2023-02-28 16:02:20,693][11028] Num frames 4300...
[2023-02-28 16:02:20,806][11028] Num frames 4400...
[2023-02-28 16:02:20,918][11028] Num frames 4500...
[2023-02-28 16:02:20,988][11028] Avg episode rewards: #0: 14.352, true rewards: #0: 7.518
[2023-02-28 16:02:20,990][11028] Avg episode reward: 14.352, avg true_objective: 7.518
[2023-02-28 16:02:21,096][11028] Num frames 4600...
[2023-02-28 16:02:21,205][11028] Num frames 4700...
[2023-02-28 16:02:21,321][11028] Num frames 4800...
[2023-02-28 16:02:21,437][11028] Num frames 4900...
[2023-02-28 16:02:21,548][11028] Num frames 5000...
[2023-02-28 16:02:21,662][11028] Num frames 5100...
[2023-02-28 16:02:21,783][11028] Num frames 5200...
[2023-02-28 16:02:21,895][11028] Num frames 5300...
[2023-02-28 16:02:22,007][11028] Num frames 5400...
[2023-02-28 16:02:22,119][11028] Num frames 5500...
[2023-02-28 16:02:22,231][11028] Num frames 5600...
[2023-02-28 16:02:22,310][11028] Avg episode rewards: #0: 16.173, true rewards: #0: 8.030
[2023-02-28 16:02:22,314][11028] Avg episode reward: 16.173, avg true_objective: 8.030
[2023-02-28 16:02:22,408][11028] Num frames 5700...
[2023-02-28 16:02:22,529][11028] Num frames 5800...
[2023-02-28 16:02:22,639][11028] Num frames 5900...
[2023-02-28 16:02:22,751][11028] Num frames 6000...
[2023-02-28 16:02:22,869][11028] Num frames 6100...
[2023-02-28 16:02:22,989][11028] Num frames 6200...
[2023-02-28 16:02:23,106][11028] Num frames 6300...
[2023-02-28 16:02:23,214][11028] Num frames 6400...
[2023-02-28 16:02:23,324][11028] Num frames 6500...
[2023-02-28 16:02:23,433][11028] Num frames 6600...
[2023-02-28 16:02:23,563][11028] Num frames 6700...
[2023-02-28 16:02:23,671][11028] Avg episode rewards: #0: 17.051, true rewards: #0: 8.426
[2023-02-28 16:02:23,673][11028] Avg episode reward: 17.051, avg true_objective: 8.426
[2023-02-28 16:02:23,743][11028] Num frames 6800...
[2023-02-28 16:02:23,855][11028] Num frames 6900...
[2023-02-28 16:02:23,975][11028] Num frames 7000...
[2023-02-28 16:02:24,096][11028] Num frames 7100...
[2023-02-28 16:02:24,207][11028] Num frames 7200...
[2023-02-28 16:02:24,324][11028] Num frames 7300...
[2023-02-28 16:02:24,442][11028] Num frames 7400...
[2023-02-28 16:02:24,561][11028] Num frames 7500...
[2023-02-28 16:02:24,674][11028] Num frames 7600...
[2023-02-28 16:02:24,787][11028] Num frames 7700...
[2023-02-28 16:02:24,908][11028] Num frames 7800...
[2023-02-28 16:02:25,020][11028] Num frames 7900...
[2023-02-28 16:02:25,141][11028] Num frames 8000...
[2023-02-28 16:02:25,255][11028] Num frames 8100...
[2023-02-28 16:02:25,372][11028] Num frames 8200...
[2023-02-28 16:02:25,488][11028] Num frames 8300...
[2023-02-28 16:02:25,599][11028] Num frames 8400...
[2023-02-28 16:02:25,746][11028] Avg episode rewards: #0: 20.315, true rewards: #0: 9.426
[2023-02-28 16:02:25,748][11028] Avg episode reward: 20.315, avg true_objective: 9.426
[2023-02-28 16:02:25,774][11028] Num frames 8500...
[2023-02-28 16:02:25,894][11028] Num frames 8600...
[2023-02-28 16:02:26,007][11028] Num frames 8700...
[2023-02-28 16:02:26,123][11028] Num frames 8800...
[2023-02-28 16:02:26,279][11028] Num frames 8900...
[2023-02-28 16:02:26,435][11028] Num frames 9000...
[2023-02-28 16:02:26,590][11028] Num frames 9100...
[2023-02-28 16:02:26,754][11028] Num frames 9200...
[2023-02-28 16:02:26,919][11028] Num frames 9300...
[2023-02-28 16:02:27,083][11028] Num frames 9400...
[2023-02-28 16:02:27,241][11028] Num frames 9500...
[2023-02-28 16:02:27,399][11028] Num frames 9600...
[2023-02-28 16:02:27,558][11028] Num frames 9700...
[2023-02-28 16:02:27,767][11028] Avg episode rewards: #0: 21.296, true rewards: #0: 9.796
[2023-02-28 16:02:27,774][11028] Avg episode reward: 21.296, avg true_objective: 9.796
[2023-02-28 16:03:30,667][11028] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
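
In Colab the saved replay can be previewed inline with the usual base64-embedded video snippet (a standard notebook pattern, not part of Sample Factory):

from base64 import b64encode
from IPython.display import HTML

mp4 = open("/content/train_dir/default_experiment/replay.mp4", "rb").read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f'<video width=640 controls><source src="{data_url}" type="video/mp4"></video>')
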
[2023-02-28 16:13:30,731][11028] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-28 16:13:30,735][11028] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-28 16:13:30,737][11028] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-28 16:13:30,740][11028] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-28 16:13:30,743][11028] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:13:30,746][11028] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-28 16:13:30,750][11028] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-28 16:13:30,752][11028] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-28 16:13:30,754][11028] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-28 16:13:30,756][11028] Adding new argument 'hf_repository'='bonadio/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-28 16:13:30,757][11028] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-28 16:13:30,759][11028] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-28 16:13:30,761][11028] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-28 16:13:30,763][11028] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-28 16:13:30,764][11028] Using frameskip 1 and render_action_repeat=4 for evaluation
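
This second evaluation run differs from the earlier one only in the upload arguments: max_num_frames is capped at 100000, and push_to_hub plus hf_repository make enjoy upload the experiment folder (checkpoints, config, replay video) once evaluation finishes. Roughly, under the same assumptions as the earlier evaluation sketch:

from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.enjoy import enjoy

argv = [
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--no_render",
    "--save_video",
    "--max_num_episodes=10",
    "--max_num_frames=100000",
    "--push_to_hub",
    "--hf_repository=bonadio/rl_course_vizdoom_health_gathering_supreme",
]
parser, _ = parse_sf_args(argv=argv, evaluation=True)
cfg = parse_full_cfg(parser, argv=argv)
status = enjoy(cfg)
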
[2023-02-28 16:13:30,799][11028] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:13:30,803][11028] RunningMeanStd input shape: (1,)
[2023-02-28 16:13:30,819][11028] ConvEncoder: input_channels=3
[2023-02-28 16:13:30,860][11028] Conv encoder output size: 512
[2023-02-28 16:13:30,861][11028] Policy head output size: 512
[2023-02-28 16:13:30,886][11028] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-28 16:13:31,312][11028] Num frames 100...
[2023-02-28 16:13:31,438][11028] Num frames 200...
[2023-02-28 16:13:31,552][11028] Num frames 300...
[2023-02-28 16:13:31,668][11028] Num frames 400...
[2023-02-28 16:13:31,778][11028] Avg episode rewards: #0: 5.480, true rewards: #0: 4.480
[2023-02-28 16:13:31,785][11028] Avg episode reward: 5.480, avg true_objective: 4.480
[2023-02-28 16:13:31,853][11028] Num frames 500...
[2023-02-28 16:13:31,972][11028] Num frames 600...
[2023-02-28 16:13:32,083][11028] Num frames 700...
[2023-02-28 16:13:32,198][11028] Num frames 800...
[2023-02-28 16:13:32,310][11028] Num frames 900...
[2023-02-28 16:13:32,388][11028] Avg episode rewards: #0: 6.095, true rewards: #0: 4.595
[2023-02-28 16:13:32,390][11028] Avg episode reward: 6.095, avg true_objective: 4.595
[2023-02-28 16:13:32,494][11028] Num frames 1000...
[2023-02-28 16:13:32,607][11028] Num frames 1100...
[2023-02-28 16:13:32,727][11028] Num frames 1200...
[2023-02-28 16:13:32,838][11028] Num frames 1300...
[2023-02-28 16:13:32,954][11028] Num frames 1400...
[2023-02-28 16:13:33,064][11028] Num frames 1500...
[2023-02-28 16:13:33,176][11028] Num frames 1600...
[2023-02-28 16:13:33,294][11028] Num frames 1700...
[2023-02-28 16:13:33,450][11028] Num frames 1800...
[2023-02-28 16:13:33,609][11028] Num frames 1900...
[2023-02-28 16:13:33,683][11028] Avg episode rewards: #0: 10.037, true rewards: #0: 6.370
[2023-02-28 16:13:33,685][11028] Avg episode reward: 10.037, avg true_objective: 6.370
[2023-02-28 16:13:33,820][11028] Num frames 2000...
[2023-02-28 16:13:33,979][11028] Num frames 2100...
[2023-02-28 16:13:34,132][11028] Num frames 2200...
[2023-02-28 16:13:34,444][11028] Num frames 2300...
[2023-02-28 16:13:34,880][11028] Num frames 2400...
[2023-02-28 16:13:35,260][11028] Num frames 2500...
[2023-02-28 16:13:35,471][11028] Avg episode rewards: #0: 10.878, true rewards: #0: 6.377
[2023-02-28 16:13:35,473][11028] Avg episode reward: 10.878, avg true_objective: 6.377
[2023-02-28 16:13:35,667][11028] Num frames 2600...
[2023-02-28 16:13:35,949][11028] Num frames 2700...
[2023-02-28 16:13:36,309][11028] Num frames 2800...
[2023-02-28 16:13:36,633][11028] Num frames 2900...
[2023-02-28 16:13:37,031][11028] Num frames 3000...
[2023-02-28 16:13:37,410][11028] Num frames 3100...
[2023-02-28 16:13:37,609][11028] Num frames 3200...
[2023-02-28 16:13:37,780][11028] Num frames 3300...
[2023-02-28 16:13:37,991][11028] Num frames 3400...
[2023-02-28 16:13:38,206][11028] Num frames 3500...
[2023-02-28 16:13:38,425][11028] Num frames 3600...
[2023-02-28 16:13:38,501][11028] Avg episode rewards: #0: 14.214, true rewards: #0: 7.214
[2023-02-28 16:13:38,517][11028] Avg episode reward: 14.214, avg true_objective: 7.214
[2023-02-28 16:13:38,714][11028] Num frames 3700...
[2023-02-28 16:13:38,939][11028] Num frames 3800...
[2023-02-28 16:13:39,224][11028] Num frames 3900...
[2023-02-28 16:13:39,364][11028] Num frames 4000...
[2023-02-28 16:13:39,472][11028] Num frames 4100...
[2023-02-28 16:13:39,583][11028] Num frames 4200...
[2023-02-28 16:13:39,710][11028] Num frames 4300...
[2023-02-28 16:13:39,821][11028] Num frames 4400...
[2023-02-28 16:13:39,939][11028] Num frames 4500...
[2023-02-28 16:13:40,051][11028] Num frames 4600...
[2023-02-28 16:13:40,164][11028] Num frames 4700...
[2023-02-28 16:13:40,292][11028] Avg episode rewards: #0: 15.932, true rewards: #0: 7.932
[2023-02-28 16:13:40,294][11028] Avg episode reward: 15.932, avg true_objective: 7.932
[2023-02-28 16:13:40,346][11028] Num frames 4800...
[2023-02-28 16:13:40,469][11028] Num frames 4900...
[2023-02-28 16:13:40,585][11028] Num frames 5000...
[2023-02-28 16:13:40,706][11028] Num frames 5100...
[2023-02-28 16:13:40,822][11028] Num frames 5200...
[2023-02-28 16:13:40,933][11028] Num frames 5300...
[2023-02-28 16:13:41,051][11028] Num frames 5400...
[2023-02-28 16:13:41,166][11028] Num frames 5500...
[2023-02-28 16:13:41,277][11028] Num frames 5600...
[2023-02-28 16:13:41,388][11028] Num frames 5700...
[2023-02-28 16:13:41,506][11028] Num frames 5800...
[2023-02-28 16:13:41,614][11028] Num frames 5900...
[2023-02-28 16:13:41,731][11028] Num frames 6000...
[2023-02-28 16:13:41,840][11028] Num frames 6100...
[2023-02-28 16:13:41,952][11028] Num frames 6200...
[2023-02-28 16:13:42,064][11028] Num frames 6300...
[2023-02-28 16:13:42,177][11028] Num frames 6400...
[2023-02-28 16:13:42,305][11028] Num frames 6500...
[2023-02-28 16:13:42,418][11028] Num frames 6600...
[2023-02-28 16:13:42,531][11028] Num frames 6700...
[2023-02-28 16:13:42,652][11028] Num frames 6800...
[2023-02-28 16:13:42,789][11028] Avg episode rewards: #0: 22.227, true rewards: #0: 9.799
[2023-02-28 16:13:42,791][11028] Avg episode reward: 22.227, avg true_objective: 9.799
[2023-02-28 16:13:42,842][11028] Num frames 6900...
[2023-02-28 16:13:42,952][11028] Num frames 7000...
[2023-02-28 16:13:43,061][11028] Num frames 7100...
[2023-02-28 16:13:43,170][11028] Num frames 7200...
[2023-02-28 16:13:43,279][11028] Num frames 7300...
[2023-02-28 16:13:43,343][11028] Avg episode rewards: #0: 20.134, true rewards: #0: 9.134
[2023-02-28 16:13:43,346][11028] Avg episode reward: 20.134, avg true_objective: 9.134
[2023-02-28 16:13:43,452][11028] Num frames 7400...
[2023-02-28 16:13:43,574][11028] Num frames 7500...
[2023-02-28 16:13:43,699][11028] Num frames 7600...
[2023-02-28 16:13:43,832][11028] Num frames 7700...
[2023-02-28 16:13:43,956][11028] Num frames 7800...
[2023-02-28 16:13:44,074][11028] Num frames 7900...
[2023-02-28 16:13:44,183][11028] Num frames 8000...
[2023-02-28 16:13:44,294][11028] Num frames 8100...
[2023-02-28 16:13:44,404][11028] Num frames 8200...
[2023-02-28 16:13:44,512][11028] Num frames 8300...
[2023-02-28 16:13:44,627][11028] Num frames 8400...
[2023-02-28 16:13:44,748][11028] Num frames 8500...
[2023-02-28 16:13:44,861][11028] Num frames 8600...
[2023-02-28 16:13:44,979][11028] Num frames 8700...
[2023-02-28 16:13:45,097][11028] Num frames 8800...
[2023-02-28 16:13:45,220][11028] Num frames 8900...
[2023-02-28 16:13:45,332][11028] Num frames 9000...
[2023-02-28 16:13:45,445][11028] Num frames 9100...
[2023-02-28 16:13:45,514][11028] Avg episode rewards: #0: 23.233, true rewards: #0: 10.122
[2023-02-28 16:13:45,515][11028] Avg episode reward: 23.233, avg true_objective: 10.122
[2023-02-28 16:13:45,620][11028] Num frames 9200...
[2023-02-28 16:13:45,730][11028] Num frames 9300...
[2023-02-28 16:13:45,846][11028] Num frames 9400...
[2023-02-28 16:13:45,958][11028] Num frames 9500...
[2023-02-28 16:13:46,077][11028] Avg episode rewards: #0: 21.458, true rewards: #0: 9.558
[2023-02-28 16:13:46,082][11028] Avg episode reward: 21.458, avg true_objective: 9.558
[2023-02-28 16:14:45,318][11028] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-28 16:14:57,424][11028] The model has been pushed to https://huggingface.co/bonadio/rl_course_vizdoom_health_gathering_supreme
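
The pushed repo can be pulled back down either with sample-factory's documented Hub helper (python -m sample_factory.huggingface.load_from_hub) or, equivalently, with a plain huggingface_hub download:

from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bonadio/rl_course_vizdoom_health_gathering_supreme",
    local_dir="train_dir/rl_course_vizdoom_health_gathering_supreme",
)
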
[2023-02-28 16:21:11,335][11028] Environment doom_basic already registered, overwriting...
[2023-02-28 16:21:11,338][11028] Environment doom_two_colors_easy already registered, overwriting...
[2023-02-28 16:21:11,341][11028] Environment doom_two_colors_hard already registered, overwriting...
[2023-02-28 16:21:11,342][11028] Environment doom_dm already registered, overwriting...
[2023-02-28 16:21:11,345][11028] Environment doom_dwango5 already registered, overwriting...
[2023-02-28 16:21:11,346][11028] Environment doom_my_way_home_flat_actions already registered, overwriting...
[2023-02-28 16:21:11,347][11028] Environment doom_defend_the_center_flat_actions already registered, overwriting...
[2023-02-28 16:21:11,349][11028] Environment doom_my_way_home already registered, overwriting...
[2023-02-28 16:21:11,352][11028] Environment doom_deadly_corridor already registered, overwriting...
[2023-02-28 16:21:11,353][11028] Environment doom_defend_the_center already registered, overwriting...
[2023-02-28 16:21:11,354][11028] Environment doom_defend_the_line already registered, overwriting...
[2023-02-28 16:21:11,355][11028] Environment doom_health_gathering already registered, overwriting...
[2023-02-28 16:21:11,356][11028] Environment doom_health_gathering_supreme already registered, overwriting...
[2023-02-28 16:21:11,359][11028] Environment doom_battle already registered, overwriting...
[2023-02-28 16:21:11,360][11028] Environment doom_battle2 already registered, overwriting...
[2023-02-28 16:21:11,361][11028] Environment doom_duel_bots already registered, overwriting...
[2023-02-28 16:21:11,363][11028] Environment doom_deathmatch_bots already registered, overwriting...
[2023-02-28 16:21:11,364][11028] Environment doom_duel already registered, overwriting...
[2023-02-28 16:21:11,365][11028] Environment doom_deathmatch_full already registered, overwriting...
[2023-02-28 16:21:11,366][11028] Environment doom_benchmark already registered, overwriting...
[2023-02-28 16:21:11,368][11028] register_encoder_factory: <function make_vizdoom_encoder at 0x7fcdced2d8b0>
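
The block above is the built-in Doom env suite being re-registered (idempotently, hence the 'overwriting' notices) together with the ViZDoom encoder factory. A custom setup would use the same two hooks; register_env is the documented sample-factory 2.x API and the model-factory call mirrors the make_vizdoom_encoder line above, but the module paths here are taken from the docs from memory and should be verified against your installed version:

from sample_factory.algo.utils.context import global_model_factory
from sample_factory.envs.env_utils import register_env

def make_my_env(full_env_name, cfg=None, env_config=None, render_mode=None):
    raise NotImplementedError  # construct and return your gym.Env here

def make_my_encoder(cfg, obs_space):
    raise NotImplementedError  # construct and return an Encoder here

register_env("my_custom_env", make_my_env)
global_model_factory().register_encoder_factory(make_my_encoder)
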
[2023-02-28 16:21:11,396][11028] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-28 16:21:11,397][11028] Overriding arg 'train_for_env_steps' with value 8000000 passed from command line
[2023-02-28 16:21:11,407][11028] Experiment dir /content/train_dir/default_experiment already exists!
[2023-02-28 16:21:11,408][11028] Resuming existing experiment from /content/train_dir/default_experiment...
[2023-02-28 16:21:11,409][11028] Weights and Biases integration disabled
[2023-02-28 16:21:11,414][11028] Environment var CUDA_VISIBLE_DEVICES is 0
[2023-02-28 16:21:13,595][11028] Starting experiment with the following configuration:
help=False
algo=APPO
env=doom_health_gathering_supreme
experiment=default_experiment
train_dir=/content/train_dir
restart_behavior=resume
device=gpu
seed=None
num_policies=1
async_rl=True
serial_mode=False
batched_sampling=False
num_batches_to_accumulate=2
worker_num_splits=2
policy_workers_per_policy=1
max_policy_lag=1000
num_workers=8
num_envs_per_worker=4
batch_size=1024
num_batches_per_epoch=1
num_epochs=1
rollout=32
recurrence=32
shuffle_minibatches=False
gamma=0.99
reward_scale=1.0
reward_clip=1000.0
value_bootstrap=False
normalize_returns=True
exploration_loss_coeff=0.001
value_loss_coeff=0.5
kl_loss_coeff=0.0
exploration_loss=symmetric_kl
gae_lambda=0.95
ppo_clip_ratio=0.1
ppo_clip_value=0.2
with_vtrace=False
vtrace_rho=1.0
vtrace_c=1.0
optimizer=adam
adam_eps=1e-06
adam_beta1=0.9
adam_beta2=0.999
max_grad_norm=4.0
learning_rate=0.0001
lr_schedule=constant
lr_schedule_kl_threshold=0.008
lr_adaptive_min=1e-06
lr_adaptive_max=0.01
obs_subtract_mean=0.0
obs_scale=255.0
normalize_input=True
normalize_input_keys=None
decorrelate_experience_max_seconds=0
decorrelate_envs_on_one_worker=True
actor_worker_gpus=[]
set_workers_cpu_affinity=True
force_envs_single_thread=False
default_niceness=0
log_to_file=True
experiment_summaries_interval=10
flush_summaries_interval=30
stats_avg=100
summaries_use_frameskip=True
heartbeat_interval=20
heartbeat_reporting_interval=600
train_for_env_steps=8000000
train_for_seconds=10000000000
save_every_sec=120
keep_checkpoints=2
load_checkpoint_kind=latest
save_milestones_sec=-1
save_best_every_sec=5
save_best_metric=reward
save_best_after=100000
benchmark=False
encoder_mlp_layers=[512, 512]
encoder_conv_architecture=convnet_simple
encoder_conv_mlp_layers=[512]
use_rnn=True
rnn_size=512
rnn_type=gru
rnn_num_layers=1
decoder_mlp_layers=[]
nonlinearity=elu
policy_initialization=orthogonal
policy_init_gain=1.0
actor_critic_share_weights=True
adaptive_stddev=True
continuous_tanh_scale=0.0
initial_stddev=1.0
use_env_info_cache=False
env_gpu_actions=False
env_gpu_observations=True
env_frameskip=4
env_framestack=1
pixel_format=CHW
use_record_episode_statistics=False
with_wandb=False
wandb_user=None
wandb_project=sample_factory
wandb_group=None
wandb_job_type=SF
wandb_tags=[]
with_pbt=False
pbt_mix_policies_in_one_env=True
pbt_period_env_steps=5000000
pbt_start_mutation=20000000
pbt_replace_fraction=0.3
pbt_mutation_rate=0.15
pbt_replace_reward_gap=0.1
pbt_replace_reward_gap_absolute=1e-06
pbt_optimize_gamma=False
pbt_target_objective=true_objective
pbt_perturb_min=1.1
pbt_perturb_max=1.5
num_agents=-1
num_humans=0
num_bots=-1
start_bot_difficulty=None
timelimit=None
res_w=128
res_h=72
wide_aspect_ratio=False
eval_env_frameskip=1
fps=35
command_line=--env=doom_health_gathering_supreme --num_workers=8 --num_envs_per_worker=4 --train_for_env_steps=4000000
cli_args={'env': 'doom_health_gathering_supreme', 'num_workers': 8, 'num_envs_per_worker': 4, 'train_for_env_steps': 4000000}
git_hash=unknown
git_repo_name=not a git repository
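The dump above records a resumed run: the stored command line asked for 4,000,000 env steps, and --train_for_env_steps=8000000 from the new invocation overrides it. A rough sketch of how such a resume is launched with the Sample Factory 2.x Python API (the import paths follow the sf_examples layout that produces log lines like these, and the environment registration seen in the "already registered, overwriting" lines above is assumed to have run first):

    from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
    from sample_factory.train import run_rl
    from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults

    argv = [
        "--env=doom_health_gathering_supreme",
        "--num_workers=8",
        "--num_envs_per_worker=4",
        "--train_for_env_steps=8000000",  # overrides the 4000000 stored in config.json
    ]
    parser, _ = parse_sf_args(argv=argv)
    add_doom_env_args(parser)        # ViZDoom-specific CLI arguments
    doom_override_defaults(parser)   # ViZDoom-tuned defaults
    cfg = parse_full_cfg(parser, argv)
    status = run_rl(cfg)  # restart_behavior=resume picks up the existing experiment dir

With restart_behavior=resume and an existing /content/train_dir/default_experiment, the trainer reloads config.json and continues from the latest checkpoint rather than starting fresh.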
[2023-02-28 16:21:13,598][11028] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-28 16:21:13,602][11028] Rollout worker 0 uses device cpu
[2023-02-28 16:21:13,606][11028] Rollout worker 1 uses device cpu
[2023-02-28 16:21:13,607][11028] Rollout worker 2 uses device cpu
[2023-02-28 16:21:13,609][11028] Rollout worker 3 uses device cpu
[2023-02-28 16:21:13,610][11028] Rollout worker 4 uses device cpu
[2023-02-28 16:21:13,612][11028] Rollout worker 5 uses device cpu
[2023-02-28 16:21:13,613][11028] Rollout worker 6 uses device cpu
[2023-02-28 16:21:13,615][11028] Rollout worker 7 uses device cpu
[2023-02-28 16:21:13,759][11028] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 16:21:13,762][11028] InferenceWorker_p0-w0: min num requests: 2
[2023-02-28 16:21:13,800][11028] Starting all processes...
[2023-02-28 16:21:13,802][11028] Starting process learner_proc0
[2023-02-28 16:21:13,990][11028] Starting all processes...
[2023-02-28 16:21:14,005][11028] Starting process inference_proc0-0
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc0
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc1
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc2
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc3
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc4
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc5
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc6
[2023-02-28 16:21:14,006][11028] Starting process rollout_proc7
[2023-02-28 16:21:22,750][24147] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 16:21:22,756][24147] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-28 16:21:22,810][24147] Num visible devices: 1
[2023-02-28 16:21:22,858][24147] Starting seed is not provided
[2023-02-28 16:21:22,859][24147] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 16:21:22,860][24147] Initializing actor-critic model on device cuda:0
[2023-02-28 16:21:22,861][24147] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:21:22,862][24147] RunningMeanStd input shape: (1,)
[2023-02-28 16:21:22,944][24147] ConvEncoder: input_channels=3
[2023-02-28 16:21:23,504][24166] Worker 1 uses CPU cores [1]
[2023-02-28 16:21:23,692][24147] Conv encoder output size: 512
[2023-02-28 16:21:23,697][24147] Policy head output size: 512
[2023-02-28 16:21:23,798][24147] Created Actor Critic model with architecture:
[2023-02-28 16:21:23,805][24147] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
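For readability, here is a minimal PyTorch sketch of the same shared-weights network written out by hand. Shapes come from the repr above (3x72x128 observations, a 512-unit encoder MLP, GRU(512, 512), a 1-unit critic head, a 5-way action head); the 32/64/128 conv filter stack is an assumption based on the encoder_conv_architecture=convnet_simple setting, and this is an illustration, not Sample Factory's implementation:

    import torch
    from torch import nn

    class SharedActorCritic(nn.Module):
        def __init__(self, num_actions: int = 5):
            super().__init__()
            # Three Conv2d+ELU pairs, as in the conv_head repr above.
            self.conv_head = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
            )
            with torch.no_grad():  # infer the flattened conv output size
                n_flat = self.conv_head(torch.zeros(1, 3, 72, 128)).numel()
            self.mlp = nn.Sequential(nn.Linear(n_flat, 512), nn.ELU())  # "Conv encoder output size: 512"
            self.core = nn.GRU(512, 512)                     # ModelCoreRNN
            self.critic_linear = nn.Linear(512, 1)           # value head
            self.action_head = nn.Linear(512, num_actions)   # distribution_linear

        def forward(self, obs, rnn_state=None):
            # obs: [batch, 3, 72, 128]; rnn_state: [1, batch, 512] or None
            x = self.mlp(self.conv_head(obs).flatten(1))
            core_out, rnn_state = self.core(x.unsqueeze(0), rnn_state)
            core_out = core_out.squeeze(0)
            return self.action_head(core_out), self.critic_linear(core_out), rnn_state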
[2023-02-28 16:21:24,019][24165] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 16:21:24,020][24165] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-28 16:21:24,072][24165] Num visible devices: 1
[2023-02-28 16:21:24,143][24169] Worker 3 uses CPU cores [1]
[2023-02-28 16:21:24,308][24167] Worker 0 uses CPU cores [0]
[2023-02-28 16:21:24,629][24176] Worker 2 uses CPU cores [0]
[2023-02-28 16:21:24,696][24178] Worker 4 uses CPU cores [0]
[2023-02-28 16:21:24,738][24180] Worker 6 uses CPU cores [0]
[2023-02-28 16:21:24,820][24182] Worker 5 uses CPU cores [1]
[2023-02-28 16:21:24,921][24188] Worker 7 uses CPU cores [1]
[2023-02-28 16:21:27,571][24147] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-28 16:21:27,572][24147] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-28 16:21:27,629][24147] Loading model from checkpoint
[2023-02-28 16:21:27,636][24147] Loaded experiment state at self.train_step=978, self.env_steps=4005888
[2023-02-28 16:21:27,637][24147] Initialized policy 0 weights for model version 978
[2023-02-28 16:21:27,646][24147] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-28 16:21:27,654][24147] LearnerWorker_p0 finished initialization!
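The optimizer and checkpoint lines above map directly onto the config dump (optimizer=adam, learning_rate=0.0001, adam_beta1=0.9, adam_beta2=0.999, adam_eps=1e-06), and the restore brings back the training counters (train_step=978, env_steps=4005888), not just model weights. A rough sketch of the equivalent PyTorch calls; the checkpoint dictionary keys are assumptions, since the log never prints them:

    import torch

    def build_optimizer(model):
        # Hyperparameters taken from the config dump above.
        return torch.optim.Adam(model.parameters(), lr=1e-4,
                                betas=(0.9, 0.999), eps=1e-6)

    def load_learner_state(model, optimizer, ckpt_path):
        checkpoint = torch.load(ckpt_path, map_location="cuda:0")
        model.load_state_dict(checkpoint["model"])          # key names are illustrative
        optimizer.load_state_dict(checkpoint["optimizer"])
        return checkpoint["train_step"], checkpoint["env_steps"]  # 978, 4005888 here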
[2023-02-28 16:21:27,883][24165] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:21:27,885][24165] RunningMeanStd input shape: (1,)
[2023-02-28 16:21:27,905][24165] ConvEncoder: input_channels=3
[2023-02-28 16:21:28,076][24165] Conv encoder output size: 512
[2023-02-28 16:21:28,077][24165] Policy head output size: 512
[2023-02-28 16:21:30,886][11028] Inference worker 0-0 is ready!
[2023-02-28 16:21:30,888][11028] All inference workers are ready! Signal rollout workers to start!
[2023-02-28 16:21:31,013][24188] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,022][24166] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,025][24182] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,053][24169] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,055][24180] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,061][24167] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,063][24176] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,053][24178] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-28 16:21:31,414][11028] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
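Each "Fps is (...)" line reports frame throughput averaged over sliding 10-, 60- and 300-second windows; this first report shows nan because no history has accumulated yet, and Total num frames starts at 4005888 because the run resumed there. A small sketch of a windowed-FPS computation of this general shape (illustrative, not Sample Factory's actual reporting code):

    import time
    from collections import deque

    history = deque(maxlen=1000)  # (timestamp, total_frames) samples

    def record(total_frames: int) -> None:
        history.append((time.time(), total_frames))

    def fps(window_sec: float) -> float:
        """Average FPS over the last window_sec seconds (nan if not enough data)."""
        if len(history) < 2:
            return float("nan")
        now, frames_now = history[-1]
        window = [(t, f) for t, f in history if now - t <= window_sec]
        if len(window) < 2:
            return float("nan")
        t0, f0 = window[0]
        return (frames_now - f0) / (now - t0)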
[2023-02-28 16:21:31,928][24176] Decorrelating experience for 0 frames...
[2023-02-28 16:21:31,934][24167] Decorrelating experience for 0 frames...
[2023-02-28 16:21:32,551][24188] Decorrelating experience for 0 frames...
[2023-02-28 16:21:32,558][24166] Decorrelating experience for 0 frames...
[2023-02-28 16:21:32,571][24182] Decorrelating experience for 0 frames...
[2023-02-28 16:21:32,588][24169] Decorrelating experience for 0 frames...
[2023-02-28 16:21:32,619][24176] Decorrelating experience for 32 frames...
[2023-02-28 16:21:32,633][24167] Decorrelating experience for 32 frames...
[2023-02-28 16:21:33,223][24166] Decorrelating experience for 32 frames...
[2023-02-28 16:21:33,413][24180] Decorrelating experience for 0 frames...
[2023-02-28 16:21:33,423][24169] Decorrelating experience for 32 frames...
[2023-02-28 16:21:33,433][24178] Decorrelating experience for 0 frames...
[2023-02-28 16:21:33,750][11028] Heartbeat connected on Batcher_0
[2023-02-28 16:21:33,756][11028] Heartbeat connected on LearnerWorker_p0
[2023-02-28 16:21:33,802][11028] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-28 16:21:34,099][24178] Decorrelating experience for 32 frames...
[2023-02-28 16:21:34,192][24180] Decorrelating experience for 32 frames...
[2023-02-28 16:21:34,489][24182] Decorrelating experience for 32 frames...
[2023-02-28 16:21:34,651][24166] Decorrelating experience for 64 frames...
[2023-02-28 16:21:34,721][24188] Decorrelating experience for 32 frames...
[2023-02-28 16:21:34,960][24169] Decorrelating experience for 64 frames...
[2023-02-28 16:21:35,014][24180] Decorrelating experience for 64 frames...
[2023-02-28 16:21:35,654][24167] Decorrelating experience for 64 frames...
[2023-02-28 16:21:35,945][24182] Decorrelating experience for 64 frames...
[2023-02-28 16:21:36,026][24166] Decorrelating experience for 96 frames...
[2023-02-28 16:21:36,066][24178] Decorrelating experience for 64 frames...
[2023-02-28 16:21:36,128][24180] Decorrelating experience for 96 frames...
[2023-02-28 16:21:36,225][11028] Heartbeat connected on RolloutWorker_w1
[2023-02-28 16:21:36,229][24188] Decorrelating experience for 64 frames...
[2023-02-28 16:21:36,332][11028] Heartbeat connected on RolloutWorker_w6
[2023-02-28 16:21:36,414][11028] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 16:21:36,841][24169] Decorrelating experience for 96 frames...
[2023-02-28 16:21:36,945][11028] Heartbeat connected on RolloutWorker_w3
[2023-02-28 16:21:37,173][24167] Decorrelating experience for 96 frames...
[2023-02-28 16:21:37,345][11028] Heartbeat connected on RolloutWorker_w0
[2023-02-28 16:21:37,378][24178] Decorrelating experience for 96 frames...
[2023-02-28 16:21:37,506][11028] Heartbeat connected on RolloutWorker_w4
[2023-02-28 16:21:37,918][24176] Decorrelating experience for 64 frames...
[2023-02-28 16:21:38,527][24188] Decorrelating experience for 96 frames...
[2023-02-28 16:21:38,853][11028] Heartbeat connected on RolloutWorker_w7
[2023-02-28 16:21:41,416][11028] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 154.0. Samples: 1540. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-28 16:21:41,423][11028] Avg episode reward: [(0, '2.912')]
[2023-02-28 16:21:42,122][24147] Signal inference workers to stop experience collection...
[2023-02-28 16:21:42,163][24165] InferenceWorker_p0-w0: stopping experience collection
[2023-02-28 16:21:42,227][24176] Decorrelating experience for 96 frames...
[2023-02-28 16:21:42,299][24182] Decorrelating experience for 96 frames...
[2023-02-28 16:21:42,401][11028] Heartbeat connected on RolloutWorker_w2
[2023-02-28 16:21:42,613][11028] Heartbeat connected on RolloutWorker_w5
[2023-02-28 16:21:45,965][24147] Signal inference workers to resume experience collection...
[2023-02-28 16:21:45,967][24165] InferenceWorker_p0-w0: resuming experience collection
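The "Decorrelating experience for N frames" lines reflect decorrelate_envs_on_one_worker=True: with rollout=32, each env on a worker is warmed up for a different multiple of the rollout length (0, 32, 64, 96 frames in this log) so that rollouts across envs do not start and end in lockstep. The stop/resume pair just above appears to be the learner pausing collection while it assembles its first training batches, then releasing the inference worker again. A minimal sketch of the offset schedule (the helper name is made up):

    # Offsets observed above: env 0 -> 0, env 1 -> 32, env 2 -> 64, env 3 -> 96.
    def decorrelation_frames(env_index: int, rollout: int = 32) -> int:
        """Frames to step env env_index before real collection starts."""
        return env_index * rollout

    for i in range(4):  # num_envs_per_worker=4
        print(f"Decorrelating experience for {decorrelation_frames(i)} frames...")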
[2023-02-28 16:21:46,415][11028] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4009984. Throughput: 0: 146.5. Samples: 2198. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-28 16:21:46,423][11028] Avg episode reward: [(0, '3.886')]
[2023-02-28 16:21:51,414][11028] Fps is (10 sec: 2048.5, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 4026368. Throughput: 0: 173.3. Samples: 3466. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-02-28 16:21:51,417][11028] Avg episode reward: [(0, '8.554')]
[2023-02-28 16:21:55,765][24165] Updated weights for policy 0, policy_version 988 (0.0020)
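"Updated weights for policy 0, policy_version 988 (0.0020)" is the inference worker picking up fresh weights from the learner; the version counter advances as the learner trains, and the "Policy #0 lag" figures in the Fps lines measure how many versions behind the collected samples are (capped by max_policy_lag=1000). Reading the trailing number as a timing in seconds is an assumption. A toy sketch of version-gated weight syncing; the shared dict stands in for whatever IPC the real system uses, which this log does not show:

    class InferencePolicy:
        """Toy inference-side policy that syncs when the learner publishes a newer version."""

        def __init__(self, model):
            self.model = model
            self.version = 0

        def maybe_sync(self, shared: dict) -> None:
            if shared["version"] > self.version:
                self.model.load_state_dict(shared["state_dict"])
                self.version = shared["version"]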
[2023-02-28 16:21:56,414][11028] Fps is (10 sec: 3686.7, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 4046848. Throughput: 0: 397.2. Samples: 9930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:21:56,423][11028] Avg episode reward: [(0, '12.095')]
[2023-02-28 16:22:01,415][11028] Fps is (10 sec: 3686.3, 60 sec: 1911.4, 300 sec: 1911.4). Total num frames: 4063232. Throughput: 0: 517.2. Samples: 15516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:01,419][11028] Avg episode reward: [(0, '14.756')]
[2023-02-28 16:22:06,414][11028] Fps is (10 sec: 3276.8, 60 sec: 2106.5, 300 sec: 2106.5). Total num frames: 4079616. Throughput: 0: 503.3. Samples: 17616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:06,421][11028] Avg episode reward: [(0, '15.916')]
[2023-02-28 16:22:08,774][24165] Updated weights for policy 0, policy_version 998 (0.0043)
[2023-02-28 16:22:11,414][11028] Fps is (10 sec: 3276.9, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 4096000. Throughput: 0: 544.9. Samples: 21796. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:22:11,417][11028] Avg episode reward: [(0, '16.337')]
[2023-02-28 16:22:16,414][11028] Fps is (10 sec: 3686.4, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 4116480. Throughput: 0: 629.0. Samples: 28306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:16,420][11028] Avg episode reward: [(0, '18.354')]
[2023-02-28 16:22:18,724][24165] Updated weights for policy 0, policy_version 1008 (0.0017)
[2023-02-28 16:22:21,417][11028] Fps is (10 sec: 4095.0, 60 sec: 2621.3, 300 sec: 2621.3). Total num frames: 4136960. Throughput: 0: 700.9. Samples: 31544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:21,425][11028] Avg episode reward: [(0, '20.990')]
[2023-02-28 16:22:26,414][11028] Fps is (10 sec: 3276.8, 60 sec: 2606.5, 300 sec: 2606.5). Total num frames: 4149248. Throughput: 0: 763.0. Samples: 35872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:22:26,421][11028] Avg episode reward: [(0, '20.924')]
[2023-02-28 16:22:31,415][11028] Fps is (10 sec: 2867.8, 60 sec: 2662.4, 300 sec: 2662.4). Total num frames: 4165632. Throughput: 0: 844.1. Samples: 40182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:22:31,417][11028] Avg episode reward: [(0, '21.171')]
[2023-02-28 16:22:32,192][24165] Updated weights for policy 0, policy_version 1018 (0.0016)
[2023-02-28 16:22:36,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3003.7, 300 sec: 2772.7). Total num frames: 4186112. Throughput: 0: 890.3. Samples: 43528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:36,418][11028] Avg episode reward: [(0, '21.014')]
[2023-02-28 16:22:41,414][11028] Fps is (10 sec: 4096.2, 60 sec: 3345.2, 300 sec: 2867.2). Total num frames: 4206592. Throughput: 0: 890.5. Samples: 50004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:22:41,417][11028] Avg episode reward: [(0, '22.052')]
[2023-02-28 16:22:42,165][24165] Updated weights for policy 0, policy_version 1028 (0.0013)
[2023-02-28 16:22:46,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 2839.9). Total num frames: 4218880. Throughput: 0: 865.2. Samples: 54448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:46,419][11028] Avg episode reward: [(0, '22.085')]
[2023-02-28 16:22:51,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 2867.2). Total num frames: 4235264. Throughput: 0: 863.8. Samples: 56486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:22:51,422][11028] Avg episode reward: [(0, '22.312')]
[2023-02-28 16:22:54,796][24165] Updated weights for policy 0, policy_version 1038 (0.0032)
[2023-02-28 16:22:56,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 2939.5). Total num frames: 4255744. Throughput: 0: 896.2. Samples: 62126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:22:56,417][11028] Avg episode reward: [(0, '23.057')]
[2023-02-28 16:23:01,431][11028] Fps is (10 sec: 4498.3, 60 sec: 3617.2, 300 sec: 3048.7). Total num frames: 4280320. Throughput: 0: 900.2. Samples: 68830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:23:01,440][11028] Avg episode reward: [(0, '23.540')]
[2023-02-28 16:23:05,780][24165] Updated weights for policy 0, policy_version 1048 (0.0012)
[2023-02-28 16:23:06,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3018.1). Total num frames: 4292608. Throughput: 0: 874.6. Samples: 70898. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:23:06,419][11028] Avg episode reward: [(0, '22.832')]
[2023-02-28 16:23:11,414][11028] Fps is (10 sec: 2461.6, 60 sec: 3481.6, 300 sec: 2990.1). Total num frames: 4304896. Throughput: 0: 871.2. Samples: 75078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:23:11,417][11028] Avg episode reward: [(0, '22.900')]
[2023-02-28 16:23:11,428][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001051_4304896.pth...
[2023-02-28 16:23:11,606][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000916_3751936.pth
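Saving checkpoint_000001051_4304896.pth and immediately removing checkpoint_000000916_3751936.pth is the keep_checkpoints=2 rotation from the config (a checkpoint every save_every_sec=120 seconds, oldest pruned beyond the last two). A sketch of that rotation; the file-name pattern follows the names in this log, while the pruning code itself is illustrative:

    import os
    from pathlib import Path
    import torch

    def save_and_prune(state, train_step, env_steps, ckpt_dir, keep=2):
        ckpt_dir = Path(ckpt_dir)
        name = f"checkpoint_{train_step:09d}_{env_steps}.pth"  # e.g. checkpoint_000001051_4304896.pth
        torch.save(state, ckpt_dir / name)
        # Zero-padded step counts make lexicographic order match chronological order.
        for old in sorted(ckpt_dir.glob("checkpoint_*.pth"))[:-keep]:
            os.remove(old)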
[2023-02-28 16:23:16,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3081.8). Total num frames: 4329472. Throughput: 0: 907.2. Samples: 81004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:23:16,420][11028] Avg episode reward: [(0, '21.526')]
[2023-02-28 16:23:17,254][24165] Updated weights for policy 0, policy_version 1058 (0.0021)
[2023-02-28 16:23:21,414][11028] Fps is (10 sec: 4505.6, 60 sec: 3550.0, 300 sec: 3127.9). Total num frames: 4349952. Throughput: 0: 906.6. Samples: 84324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:23:21,421][11028] Avg episode reward: [(0, '21.213')]
[2023-02-28 16:23:26,419][11028] Fps is (10 sec: 3275.2, 60 sec: 3549.6, 300 sec: 3098.6). Total num frames: 4362240. Throughput: 0: 876.9. Samples: 89468. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 16:23:26,427][11028] Avg episode reward: [(0, '21.610')]
[2023-02-28 16:23:29,626][24165] Updated weights for policy 0, policy_version 1068 (0.0022)
[2023-02-28 16:23:31,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3106.1). Total num frames: 4378624. Throughput: 0: 872.0. Samples: 93686. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:23:31,422][11028] Avg episode reward: [(0, '21.202')]
[2023-02-28 16:23:36,414][11028] Fps is (10 sec: 3688.2, 60 sec: 3549.9, 300 sec: 3145.7). Total num frames: 4399104. Throughput: 0: 890.4. Samples: 96556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:23:36,417][11028] Avg episode reward: [(0, '21.937')]
[2023-02-28 16:23:39,638][24165] Updated weights for policy 0, policy_version 1078 (0.0028)
[2023-02-28 16:23:41,417][11028] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3182.2). Total num frames: 4419584. Throughput: 0: 912.2. Samples: 103180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:23:41,420][11028] Avg episode reward: [(0, '21.347')]
[2023-02-28 16:23:46,420][11028] Fps is (10 sec: 3684.5, 60 sec: 3617.8, 300 sec: 3185.7). Total num frames: 4435968. Throughput: 0: 876.1. Samples: 108244. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:23:46,421][11028] Avg episode reward: [(0, '21.342')]
[2023-02-28 16:23:51,415][11028] Fps is (10 sec: 2867.9, 60 sec: 3549.8, 300 sec: 3159.8). Total num frames: 4448256. Throughput: 0: 877.2. Samples: 110372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:23:51,418][11028] Avg episode reward: [(0, '20.554')]
[2023-02-28 16:23:52,906][24165] Updated weights for policy 0, policy_version 1088 (0.0013)
[2023-02-28 16:23:56,414][11028] Fps is (10 sec: 3278.5, 60 sec: 3549.9, 300 sec: 3192.1). Total num frames: 4468736. Throughput: 0: 897.6. Samples: 115472. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:23:56,417][11028] Avg episode reward: [(0, '19.617')]
[2023-02-28 16:24:01,414][11028] Fps is (10 sec: 4505.9, 60 sec: 3550.8, 300 sec: 3249.5). Total num frames: 4493312. Throughput: 0: 912.8. Samples: 122078. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:24:01,417][11028] Avg episode reward: [(0, '18.843')]
[2023-02-28 16:24:02,272][24165] Updated weights for policy 0, policy_version 1098 (0.0015)
[2023-02-28 16:24:06,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3250.4). Total num frames: 4509696. Throughput: 0: 898.4. Samples: 124752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:24:06,421][11028] Avg episode reward: [(0, '18.215')]
[2023-02-28 16:24:11,417][11028] Fps is (10 sec: 2866.5, 60 sec: 3618.0, 300 sec: 3225.6). Total num frames: 4521984. Throughput: 0: 874.8. Samples: 128832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:24:11,420][11028] Avg episode reward: [(0, '18.799')]
[2023-02-28 16:24:15,405][24165] Updated weights for policy 0, policy_version 1108 (0.0021)
[2023-02-28 16:24:16,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3252.0). Total num frames: 4542464. Throughput: 0: 904.4. Samples: 134382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:24:16,422][11028] Avg episode reward: [(0, '19.822')]
[2023-02-28 16:24:21,414][11028] Fps is (10 sec: 4097.0, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 4562944. Throughput: 0: 913.1. Samples: 137644. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:24:21,417][11028] Avg episode reward: [(0, '20.535')]
[2023-02-28 16:24:25,714][24165] Updated weights for policy 0, policy_version 1118 (0.0018)
[2023-02-28 16:24:26,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.4, 300 sec: 3276.8). Total num frames: 4579328. Throughput: 0: 890.4. Samples: 143244. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:24:26,419][11028] Avg episode reward: [(0, '21.038')]
[2023-02-28 16:24:31,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3254.0). Total num frames: 4591616. Throughput: 0: 871.8. Samples: 147470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:24:31,417][11028] Avg episode reward: [(0, '22.333')]
[2023-02-28 16:24:36,416][11028] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3276.8). Total num frames: 4612096. Throughput: 0: 878.3. Samples: 149894. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:24:36,424][11028] Avg episode reward: [(0, '22.974')]
[2023-02-28 16:24:39,011][24165] Updated weights for policy 0, policy_version 1128 (0.0026)
[2023-02-28 16:24:41,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3255.2). Total num frames: 4624384. Throughput: 0: 876.9. Samples: 154932. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:24:41,418][11028] Avg episode reward: [(0, '22.614')]
[2023-02-28 16:24:46,414][11028] Fps is (10 sec: 2457.9, 60 sec: 3345.4, 300 sec: 3234.8). Total num frames: 4636672. Throughput: 0: 813.7. Samples: 158696. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:24:46,419][11028] Avg episode reward: [(0, '23.089')]
[2023-02-28 16:24:51,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3235.8). Total num frames: 4653056. Throughput: 0: 795.4. Samples: 160546. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:24:51,419][11028] Avg episode reward: [(0, '23.742')]
[2023-02-28 16:24:54,261][24165] Updated weights for policy 0, policy_version 1138 (0.0014)
[2023-02-28 16:24:56,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3216.9). Total num frames: 4665344. Throughput: 0: 796.8. Samples: 164688. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:24:56,417][11028] Avg episode reward: [(0, '24.350')]
[2023-02-28 16:25:01,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3257.3). Total num frames: 4689920. Throughput: 0: 824.7. Samples: 171492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:25:01,416][11028] Avg episode reward: [(0, '24.793')]
[2023-02-28 16:25:03,898][24165] Updated weights for policy 0, policy_version 1148 (0.0015)
[2023-02-28 16:25:06,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3257.7). Total num frames: 4706304. Throughput: 0: 823.6. Samples: 174706. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:25:06,420][11028] Avg episode reward: [(0, '25.193')]
[2023-02-28 16:25:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3345.2, 300 sec: 3258.2). Total num frames: 4722688. Throughput: 0: 798.4. Samples: 179170. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:25:11,423][11028] Avg episode reward: [(0, '25.185')]
[2023-02-28 16:25:11,443][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001153_4722688.pth...
[2023-02-28 16:25:11,693][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth
[2023-02-28 16:25:16,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3258.6). Total num frames: 4739072. Throughput: 0: 800.0. Samples: 183472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:25:16,416][11028] Avg episode reward: [(0, '24.908')]
[2023-02-28 16:25:17,169][24165] Updated weights for policy 0, policy_version 1158 (0.0032)
[2023-02-28 16:25:21,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 4759552. Throughput: 0: 818.0. Samples: 186702. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:25:21,422][11028] Avg episode reward: [(0, '24.847')]
[2023-02-28 16:25:26,418][11028] Fps is (10 sec: 4094.6, 60 sec: 3344.9, 300 sec: 3294.2). Total num frames: 4780032. Throughput: 0: 846.7. Samples: 193038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:25:26,420][11028] Avg episode reward: [(0, '24.761')]
[2023-02-28 16:25:27,471][24165] Updated weights for policy 0, policy_version 1168 (0.0022)
[2023-02-28 16:25:31,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4792320. Throughput: 0: 862.7. Samples: 197516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:25:31,424][11028] Avg episode reward: [(0, '24.594')]
[2023-02-28 16:25:36,414][11028] Fps is (10 sec: 2868.1, 60 sec: 3276.9, 300 sec: 3276.8). Total num frames: 4808704. Throughput: 0: 867.1. Samples: 199566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:25:36,416][11028] Avg episode reward: [(0, '25.247')]
[2023-02-28 16:25:40,059][24165] Updated weights for policy 0, policy_version 1178 (0.0039)
[2023-02-28 16:25:41,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3293.2). Total num frames: 4829184. Throughput: 0: 901.3. Samples: 205246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:25:41,417][11028] Avg episode reward: [(0, '24.944')]
[2023-02-28 16:25:46,415][11028] Fps is (10 sec: 4095.7, 60 sec: 3549.8, 300 sec: 3308.9). Total num frames: 4849664. Throughput: 0: 896.5. Samples: 211834. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:25:46,419][11028] Avg episode reward: [(0, '25.229')]
[2023-02-28 16:25:51,267][24165] Updated weights for policy 0, policy_version 1188 (0.0013)
[2023-02-28 16:25:51,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3308.3). Total num frames: 4866048. Throughput: 0: 872.0. Samples: 213944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:25:51,417][11028] Avg episode reward: [(0, '25.467')]
[2023-02-28 16:25:51,429][24147] Saving new best policy, reward=25.467!
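"Saving new best policy, reward=25.467!" comes from the save_best_* settings in the config: after save_best_after=100000 env steps, a separate best checkpoint is kept whenever the tracked metric (save_best_metric=reward) improves, checked at most every save_best_every_sec=5 seconds. A condensed sketch of that bookkeeping (illustrative, not the library's code):

    def maybe_save_best(avg_reward, best_so_far, env_steps, min_steps=100_000):
        """Return the updated best reward, "saving" when it improves."""
        if env_steps >= min_steps and (best_so_far is None or avg_reward > best_so_far):
            print(f"Saving new best policy, reward={avg_reward:.3f}!")
            return avg_reward
        return best_so_far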
[2023-02-28 16:25:56,417][11028] Fps is (10 sec: 2866.5, 60 sec: 3549.7, 300 sec: 3292.2). Total num frames: 4878336. Throughput: 0: 860.9. Samples: 217914. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:25:56,422][11028] Avg episode reward: [(0, '24.378')]
[2023-02-28 16:26:01,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3307.1). Total num frames: 4898816. Throughput: 0: 899.2. Samples: 223938. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:26:01,420][11028] Avg episode reward: [(0, '24.946')]
[2023-02-28 16:26:02,501][24165] Updated weights for policy 0, policy_version 1198 (0.0018)
[2023-02-28 16:26:06,414][11028] Fps is (10 sec: 4097.3, 60 sec: 3549.9, 300 sec: 3321.5). Total num frames: 4919296. Throughput: 0: 898.8. Samples: 227148. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:26:06,422][11028] Avg episode reward: [(0, '23.713')]
[2023-02-28 16:26:11,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3320.7). Total num frames: 4935680. Throughput: 0: 872.7. Samples: 232306. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:26:11,419][11028] Avg episode reward: [(0, '23.717')]
[2023-02-28 16:26:15,282][24165] Updated weights for policy 0, policy_version 1208 (0.0018)
[2023-02-28 16:26:16,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3305.5). Total num frames: 4947968. Throughput: 0: 869.2. Samples: 236630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:26:16,422][11028] Avg episode reward: [(0, '22.762')]
[2023-02-28 16:26:21,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3333.3). Total num frames: 4972544. Throughput: 0: 889.2. Samples: 239580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:26:21,416][11028] Avg episode reward: [(0, '21.834')]
[2023-02-28 16:26:25,165][24165] Updated weights for policy 0, policy_version 1218 (0.0024)
[2023-02-28 16:26:26,414][11028] Fps is (10 sec: 4505.6, 60 sec: 3550.1, 300 sec: 3346.2). Total num frames: 4993024. Throughput: 0: 905.1. Samples: 245974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:26:26,422][11028] Avg episode reward: [(0, '21.033')]
[2023-02-28 16:26:31,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3387.9). Total num frames: 5005312. Throughput: 0: 869.7. Samples: 250972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:26:31,419][11028] Avg episode reward: [(0, '22.915')]
[2023-02-28 16:26:36,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 5021696. Throughput: 0: 868.3. Samples: 253016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:26:36,422][11028] Avg episode reward: [(0, '23.989')]
[2023-02-28 16:26:38,517][24165] Updated weights for policy 0, policy_version 1228 (0.0016)
[2023-02-28 16:26:41,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5042176. Throughput: 0: 891.2. Samples: 258016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:26:41,416][11028] Avg episode reward: [(0, '24.043')]
[2023-02-28 16:26:46,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5062656. Throughput: 0: 906.7. Samples: 264738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:26:46,422][11028] Avg episode reward: [(0, '25.748')]
[2023-02-28 16:26:46,425][24147] Saving new best policy, reward=25.748!
[2023-02-28 16:26:48,296][24165] Updated weights for policy 0, policy_version 1238 (0.0012)
[2023-02-28 16:26:51,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5079040. Throughput: 0: 891.5. Samples: 267266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:26:51,417][11028] Avg episode reward: [(0, '26.043')]
[2023-02-28 16:26:51,426][24147] Saving new best policy, reward=26.043!
[2023-02-28 16:26:56,416][11028] Fps is (10 sec: 2866.6, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 5091328. Throughput: 0: 864.2. Samples: 271196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:26:56,424][11028] Avg episode reward: [(0, '25.831')]
[2023-02-28 16:27:01,415][11028] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 5107712. Throughput: 0: 882.2. Samples: 276330. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:27:01,417][11028] Avg episode reward: [(0, '24.540')]
[2023-02-28 16:27:01,578][24165] Updated weights for policy 0, policy_version 1248 (0.0036)
[2023-02-28 16:27:06,414][11028] Fps is (10 sec: 4096.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5132288. Throughput: 0: 889.4. Samples: 279602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:27:06,417][11028] Avg episode reward: [(0, '23.838')]
[2023-02-28 16:27:11,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5148672. Throughput: 0: 878.9. Samples: 285526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:27:11,417][11028] Avg episode reward: [(0, '24.936')]
[2023-02-28 16:27:11,431][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001257_5148672.pth...
[2023-02-28 16:27:11,720][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001051_4304896.pth
[2023-02-28 16:27:12,511][24165] Updated weights for policy 0, policy_version 1258 (0.0012)
[2023-02-28 16:27:16,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5160960. Throughput: 0: 858.8. Samples: 289616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:27:16,417][11028] Avg episode reward: [(0, '25.179')]
[2023-02-28 16:27:21,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 5177344. Throughput: 0: 858.5. Samples: 291648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:27:21,423][11028] Avg episode reward: [(0, '24.361')]
[2023-02-28 16:27:24,337][24165] Updated weights for policy 0, policy_version 1268 (0.0025)
[2023-02-28 16:27:26,414][11028] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 5201920. Throughput: 0: 890.0. Samples: 298064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:27:26,419][11028] Avg episode reward: [(0, '23.113')]
[2023-02-28 16:27:31,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5218304. Throughput: 0: 866.3. Samples: 303722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:27:31,418][11028] Avg episode reward: [(0, '23.920')]
[2023-02-28 16:27:36,358][24165] Updated weights for policy 0, policy_version 1278 (0.0012)
[2023-02-28 16:27:36,415][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 5234688. Throughput: 0: 858.2. Samples: 305886. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:27:36,418][11028] Avg episode reward: [(0, '23.715')]
[2023-02-28 16:27:41,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3485.1). Total num frames: 5246976. Throughput: 0: 865.9. Samples: 310160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:27:41,420][11028] Avg episode reward: [(0, '23.256')]
[2023-02-28 16:27:46,414][11028] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 5271552. Throughput: 0: 897.6. Samples: 316722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:27:46,420][11028] Avg episode reward: [(0, '21.766')]
[2023-02-28 16:27:46,989][24165] Updated weights for policy 0, policy_version 1288 (0.0020)
[2023-02-28 16:27:51,415][11028] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 5287936. Throughput: 0: 899.4. Samples: 320074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:27:51,426][11028] Avg episode reward: [(0, '21.476')]
[2023-02-28 16:27:56,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3471.4). Total num frames: 5304320. Throughput: 0: 863.3. Samples: 324376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:27:56,419][11028] Avg episode reward: [(0, '22.677')]
[2023-02-28 16:28:00,396][24165] Updated weights for policy 0, policy_version 1298 (0.0018)
[2023-02-28 16:28:01,414][11028] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 5320704. Throughput: 0: 872.7. Samples: 328886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:28:01,416][11028] Avg episode reward: [(0, '22.988')]
[2023-02-28 16:28:06,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 5341184. Throughput: 0: 902.9. Samples: 332278. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:28:06,417][11028] Avg episode reward: [(0, '21.687')]
[2023-02-28 16:28:09,870][24165] Updated weights for policy 0, policy_version 1308 (0.0017)
[2023-02-28 16:28:11,419][11028] Fps is (10 sec: 4094.0, 60 sec: 3549.6, 300 sec: 3498.9). Total num frames: 5361664. Throughput: 0: 901.7. Samples: 338644. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:28:11,431][11028] Avg episode reward: [(0, '21.406')]
[2023-02-28 16:28:16,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5373952. Throughput: 0: 873.6. Samples: 343032. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:28:16,421][11028] Avg episode reward: [(0, '21.211')]
[2023-02-28 16:28:21,414][11028] Fps is (10 sec: 2868.6, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 5390336. Throughput: 0: 871.9. Samples: 345120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:28:21,418][11028] Avg episode reward: [(0, '22.289')]
[2023-02-28 16:28:22,988][24165] Updated weights for policy 0, policy_version 1318 (0.0017)
[2023-02-28 16:28:26,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 5410816. Throughput: 0: 905.2. Samples: 350896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:28:26,421][11028] Avg episode reward: [(0, '20.349')]
[2023-02-28 16:28:31,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5431296. Throughput: 0: 906.2. Samples: 357502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:28:31,419][11028] Avg episode reward: [(0, '21.507')]
[2023-02-28 16:28:32,877][24165] Updated weights for policy 0, policy_version 1328 (0.0017)
[2023-02-28 16:28:36,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 5447680. Throughput: 0: 878.6. Samples: 359610. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:28:36,422][11028] Avg episode reward: [(0, '21.685')]
[2023-02-28 16:28:41,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 5464064. Throughput: 0: 877.1. Samples: 363846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:28:41,417][11028] Avg episode reward: [(0, '22.057')]
[2023-02-28 16:28:45,079][24165] Updated weights for policy 0, policy_version 1338 (0.0034)
[2023-02-28 16:28:46,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5484544. Throughput: 0: 915.9. Samples: 370102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:28:46,423][11028] Avg episode reward: [(0, '23.342')]
[2023-02-28 16:28:51,417][11028] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3512.8). Total num frames: 5505024. Throughput: 0: 911.5. Samples: 373296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:28:51,419][11028] Avg episode reward: [(0, '23.538')]
[2023-02-28 16:28:56,416][11028] Fps is (10 sec: 3276.1, 60 sec: 3549.7, 300 sec: 3471.2). Total num frames: 5517312. Throughput: 0: 883.5. Samples: 378400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:28:56,420][11028] Avg episode reward: [(0, '24.290')]
[2023-02-28 16:28:56,516][24165] Updated weights for policy 0, policy_version 1348 (0.0015)
[2023-02-28 16:29:01,415][11028] Fps is (10 sec: 2867.8, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 5533696. Throughput: 0: 880.1. Samples: 382636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:29:01,417][11028] Avg episode reward: [(0, '23.161')]
[2023-02-28 16:29:06,414][11028] Fps is (10 sec: 3687.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5554176. Throughput: 0: 899.1. Samples: 385580. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:29:06,419][11028] Avg episode reward: [(0, '22.763')]
[2023-02-28 16:29:07,468][24165] Updated weights for policy 0, policy_version 1358 (0.0015)
[2023-02-28 16:29:11,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3550.2, 300 sec: 3499.0). Total num frames: 5574656. Throughput: 0: 915.1. Samples: 392076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:29:11,417][11028] Avg episode reward: [(0, '23.088')]
[2023-02-28 16:29:11,444][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001362_5578752.pth...
[2023-02-28 16:29:11,626][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001153_4722688.pth
[2023-02-28 16:29:16,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 5591040. Throughput: 0: 873.9. Samples: 396828. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:29:16,418][11028] Avg episode reward: [(0, '21.741')]
[2023-02-28 16:29:20,194][24165] Updated weights for policy 0, policy_version 1368 (0.0029)
[2023-02-28 16:29:21,415][11028] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 5603328. Throughput: 0: 874.5. Samples: 398964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:29:21,418][11028] Avg episode reward: [(0, '20.985')]
[2023-02-28 16:29:26,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 5623808. Throughput: 0: 894.9. Samples: 404118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:29:26,422][11028] Avg episode reward: [(0, '20.563')]
[2023-02-28 16:29:30,268][24165] Updated weights for policy 0, policy_version 1378 (0.0016)
[2023-02-28 16:29:31,414][11028] Fps is (10 sec: 4505.7, 60 sec: 3618.1, 300 sec: 3512.9). Total num frames: 5648384. Throughput: 0: 904.4. Samples: 410798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:29:31,418][11028] Avg episode reward: [(0, '23.483')]
[2023-02-28 16:29:36,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 5660672. Throughput: 0: 891.1. Samples: 413394. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:29:36,419][11028] Avg episode reward: [(0, '22.833')]
[2023-02-28 16:29:41,416][11028] Fps is (10 sec: 2866.9, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 5677056. Throughput: 0: 867.4. Samples: 417430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:29:41,421][11028] Avg episode reward: [(0, '21.976')]
[2023-02-28 16:29:43,536][24165] Updated weights for policy 0, policy_version 1388 (0.0017)
[2023-02-28 16:29:46,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5697536. Throughput: 0: 897.9. Samples: 423040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:29:46,417][11028] Avg episode reward: [(0, '23.220')]
[2023-02-28 16:29:51,414][11028] Fps is (10 sec: 4096.5, 60 sec: 3550.0, 300 sec: 3568.4). Total num frames: 5718016. Throughput: 0: 906.1. Samples: 426354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:29:51,417][11028] Avg episode reward: [(0, '23.718')]
[2023-02-28 16:29:52,960][24165] Updated weights for policy 0, policy_version 1398 (0.0025)
[2023-02-28 16:29:56,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3540.6). Total num frames: 5734400. Throughput: 0: 887.8. Samples: 432028. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:29:56,420][11028] Avg episode reward: [(0, '23.415')]
[2023-02-28 16:30:01,417][11028] Fps is (10 sec: 2866.4, 60 sec: 3549.7, 300 sec: 3526.7). Total num frames: 5746688. Throughput: 0: 874.9. Samples: 436202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:30:01,422][11028] Avg episode reward: [(0, '23.101')]
[2023-02-28 16:30:06,086][24165] Updated weights for policy 0, policy_version 1408 (0.0024)
[2023-02-28 16:30:06,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5767168. Throughput: 0: 880.6. Samples: 438590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:06,417][11028] Avg episode reward: [(0, '22.720')]
[2023-02-28 16:30:11,414][11028] Fps is (10 sec: 4097.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 5787648. Throughput: 0: 911.5. Samples: 445136. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:30:11,423][11028] Avg episode reward: [(0, '22.200')]
[2023-02-28 16:30:16,418][11028] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3540.6). Total num frames: 5804032. Throughput: 0: 885.3. Samples: 450638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:16,426][11028] Avg episode reward: [(0, '22.743')]
[2023-02-28 16:30:16,632][24165] Updated weights for policy 0, policy_version 1418 (0.0014)
[2023-02-28 16:30:21,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3526.8). Total num frames: 5820416. Throughput: 0: 874.2. Samples: 452732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:21,419][11028] Avg episode reward: [(0, '22.339')]
[2023-02-28 16:30:26,414][11028] Fps is (10 sec: 3277.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5836800. Throughput: 0: 886.1. Samples: 457302. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:26,420][11028] Avg episode reward: [(0, '23.462')]
[2023-02-28 16:30:28,517][24165] Updated weights for policy 0, policy_version 1428 (0.0017)
[2023-02-28 16:30:31,414][11028] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5861376. Throughput: 0: 910.3. Samples: 464006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:31,420][11028] Avg episode reward: [(0, '22.978')]
[2023-02-28 16:30:36,416][11028] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3554.5). Total num frames: 5877760. Throughput: 0: 910.7. Samples: 467338. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:30:36,419][11028] Avg episode reward: [(0, '23.622')]
[2023-02-28 16:30:40,047][24165] Updated weights for policy 0, policy_version 1438 (0.0020)
[2023-02-28 16:30:41,414][11028] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 5890048. Throughput: 0: 878.4. Samples: 471554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:30:41,421][11028] Avg episode reward: [(0, '23.706')]
[2023-02-28 16:30:46,414][11028] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 5906432. Throughput: 0: 888.9. Samples: 476198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:30:46,423][11028] Avg episode reward: [(0, '24.329')]
[2023-02-28 16:30:50,797][24165] Updated weights for policy 0, policy_version 1448 (0.0020)
[2023-02-28 16:30:51,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 5931008. Throughput: 0: 913.8. Samples: 479710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:30:51,417][11028] Avg episode reward: [(0, '23.676')]
[2023-02-28 16:30:56,414][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 5951488. Throughput: 0: 910.5. Samples: 486110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:30:56,423][11028] Avg episode reward: [(0, '23.383')]
[2023-02-28 16:31:01,415][11028] Fps is (10 sec: 3276.4, 60 sec: 3618.2, 300 sec: 3540.6). Total num frames: 5963776. Throughput: 0: 882.1. Samples: 490332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:01,418][11028] Avg episode reward: [(0, '23.530')]
[2023-02-28 16:31:03,367][24165] Updated weights for policy 0, policy_version 1458 (0.0031)
[2023-02-28 16:31:06,414][11028] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 5980160. Throughput: 0: 883.1. Samples: 492472. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:31:06,422][11028] Avg episode reward: [(0, '22.733')]
[2023-02-28 16:31:11,414][11028] Fps is (10 sec: 3686.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6000640. Throughput: 0: 917.5. Samples: 498588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:31:11,417][11028] Avg episode reward: [(0, '22.985')]
[2023-02-28 16:31:11,425][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001465_6000640.pth...
[2023-02-28 16:31:11,611][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001257_5148672.pth
[2023-02-28 16:31:13,443][24165] Updated weights for policy 0, policy_version 1468 (0.0025)
[2023-02-28 16:31:16,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3618.3, 300 sec: 3554.5). Total num frames: 6021120. Throughput: 0: 905.5. Samples: 504752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:31:16,423][11028] Avg episode reward: [(0, '22.263')]
[2023-02-28 16:31:21,414][11028] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6037504. Throughput: 0: 877.1. Samples: 506804. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:21,422][11028] Avg episode reward: [(0, '22.489')]
[2023-02-28 16:31:26,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6049792. Throughput: 0: 871.7. Samples: 510782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:26,423][11028] Avg episode reward: [(0, '21.547')]
[2023-02-28 16:31:26,910][24165] Updated weights for policy 0, policy_version 1478 (0.0023)
[2023-02-28 16:31:31,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6074368. Throughput: 0: 910.8. Samples: 517182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:31,420][11028] Avg episode reward: [(0, '21.913')]
[2023-02-28 16:31:36,107][24165] Updated weights for policy 0, policy_version 1488 (0.0012)
[2023-02-28 16:31:36,415][11028] Fps is (10 sec: 4505.4, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 6094848. Throughput: 0: 906.0. Samples: 520482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:31:36,422][11028] Avg episode reward: [(0, '21.213')]
[2023-02-28 16:31:41,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6107136. Throughput: 0: 874.8. Samples: 525476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:31:41,422][11028] Avg episode reward: [(0, '20.799')]
[2023-02-28 16:31:46,415][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 6123520. Throughput: 0: 873.7. Samples: 529646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:31:46,417][11028] Avg episode reward: [(0, '21.503')]
[2023-02-28 16:31:49,233][24165] Updated weights for policy 0, policy_version 1498 (0.0018)
[2023-02-28 16:31:51,415][11028] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6144000. Throughput: 0: 895.6. Samples: 532774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:51,417][11028] Avg episode reward: [(0, '21.758')]
[2023-02-28 16:31:56,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6164480. Throughput: 0: 904.6. Samples: 539294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:31:56,417][11028] Avg episode reward: [(0, '23.213')]
[2023-02-28 16:31:59,725][24165] Updated weights for policy 0, policy_version 1508 (0.0021)
[2023-02-28 16:32:01,414][11028] Fps is (10 sec: 3686.5, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 6180864. Throughput: 0: 874.8. Samples: 544118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:32:01,417][11028] Avg episode reward: [(0, '23.515')]
[2023-02-28 16:32:06,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6193152. Throughput: 0: 874.4. Samples: 546150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:32:06,421][11028] Avg episode reward: [(0, '21.841')]
[2023-02-28 16:32:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6213632. Throughput: 0: 904.1. Samples: 551468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:32:11,416][11028] Avg episode reward: [(0, '22.442')]
[2023-02-28 16:32:11,836][24165] Updated weights for policy 0, policy_version 1518 (0.0037)
[2023-02-28 16:32:16,414][11028] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 6238208. Throughput: 0: 908.2. Samples: 558050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:32:16,421][11028] Avg episode reward: [(0, '22.358')]
[2023-02-28 16:32:21,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6250496. Throughput: 0: 891.2. Samples: 560586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:32:21,416][11028] Avg episode reward: [(0, '22.592')]
[2023-02-28 16:32:23,809][24165] Updated weights for policy 0, policy_version 1528 (0.0019)
[2023-02-28 16:32:26,416][11028] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6262784. Throughput: 0: 868.7. Samples: 564566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:32:26,423][11028] Avg episode reward: [(0, '22.703')]
[2023-02-28 16:32:31,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 6283264. Throughput: 0: 901.6. Samples: 570216. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:32:31,420][11028] Avg episode reward: [(0, '22.019')]
[2023-02-28 16:32:34,148][24165] Updated weights for policy 0, policy_version 1538 (0.0014)
[2023-02-28 16:32:36,414][11028] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 6307840. Throughput: 0: 906.7. Samples: 573574. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-28 16:32:36,421][11028] Avg episode reward: [(0, '23.283')]
[2023-02-28 16:32:41,416][11028] Fps is (10 sec: 4095.1, 60 sec: 3618.0, 300 sec: 3568.4). Total num frames: 6324224. Throughput: 0: 886.1. Samples: 579170. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:32:41,419][11028] Avg episode reward: [(0, '24.498')]
[2023-02-28 16:32:46,415][11028] Fps is (10 sec: 2867.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 6336512. Throughput: 0: 871.1. Samples: 583318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:32:46,419][11028] Avg episode reward: [(0, '23.753')]
[2023-02-28 16:32:47,230][24165] Updated weights for policy 0, policy_version 1548 (0.0015)
[2023-02-28 16:32:51,414][11028] Fps is (10 sec: 3277.5, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6356992. Throughput: 0: 880.0. Samples: 585750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:32:51,422][11028] Avg episode reward: [(0, '23.572')]
[2023-02-28 16:32:56,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6377472. Throughput: 0: 909.2. Samples: 592384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:32:56,421][11028] Avg episode reward: [(0, '23.117')]
[2023-02-28 16:32:56,752][24165] Updated weights for policy 0, policy_version 1558 (0.0012)
[2023-02-28 16:33:01,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6393856. Throughput: 0: 885.3. Samples: 597888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:01,423][11028] Avg episode reward: [(0, '22.895')]
[2023-02-28 16:33:06,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.6). Total num frames: 6410240. Throughput: 0: 874.5. Samples: 599938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:33:06,428][11028] Avg episode reward: [(0, '22.801')]
[2023-02-28 16:33:09,933][24165] Updated weights for policy 0, policy_version 1568 (0.0013)
[2023-02-28 16:33:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 6426624. Throughput: 0: 891.7. Samples: 604694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:11,417][11028] Avg episode reward: [(0, '23.564')]
[2023-02-28 16:33:11,429][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001569_6426624.pth...
[2023-02-28 16:33:11,596][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001362_5578752.pth
[2023-02-28 16:33:16,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 6447104. Throughput: 0: 910.2. Samples: 611174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:33:16,416][11028] Avg episode reward: [(0, '22.702')]
[2023-02-28 16:33:19,815][24165] Updated weights for policy 0, policy_version 1578 (0.0025)
[2023-02-28 16:33:21,415][11028] Fps is (10 sec: 4095.5, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 6467584. Throughput: 0: 908.1. Samples: 614440. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:21,426][11028] Avg episode reward: [(0, '22.659')]
[2023-02-28 16:33:26,415][11028] Fps is (10 sec: 3276.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 6479872. Throughput: 0: 874.1. Samples: 618504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:26,418][11028] Avg episode reward: [(0, '23.438')]
[2023-02-28 16:33:31,417][11028] Fps is (10 sec: 2866.7, 60 sec: 3549.7, 300 sec: 3554.5). Total num frames: 6496256. Throughput: 0: 895.1. Samples: 623600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:33:31,420][11028] Avg episode reward: [(0, '24.476')]
[2023-02-28 16:33:32,338][24165] Updated weights for policy 0, policy_version 1588 (0.0019)
[2023-02-28 16:33:36,414][11028] Fps is (10 sec: 4096.5, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 6520832. Throughput: 0: 914.1. Samples: 626886. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:36,420][11028] Avg episode reward: [(0, '24.812')]
[2023-02-28 16:33:41,419][11028] Fps is (10 sec: 4095.5, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 6537216. Throughput: 0: 905.8. Samples: 633148. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:33:41,421][11028] Avg episode reward: [(0, '26.187')]
[2023-02-28 16:33:41,437][24147] Saving new best policy, reward=26.187!
[2023-02-28 16:33:43,396][24165] Updated weights for policy 0, policy_version 1598 (0.0017)
[2023-02-28 16:33:46,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6549504. Throughput: 0: 872.0. Samples: 637126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:46,417][11028] Avg episode reward: [(0, '25.967')]
[2023-02-28 16:33:51,414][11028] Fps is (10 sec: 2458.6, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 6561792. Throughput: 0: 863.7. Samples: 638806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:51,418][11028] Avg episode reward: [(0, '26.326')]
[2023-02-28 16:33:51,433][24147] Saving new best policy, reward=26.326!
[2023-02-28 16:33:56,414][11028] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 6574080. Throughput: 0: 837.5. Samples: 642380. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:33:56,418][11028] Avg episode reward: [(0, '26.603')]
[2023-02-28 16:33:56,422][24147] Saving new best policy, reward=26.603!
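Separately from the rolling checkpoints, the learner tracks the best average episode reward seen so far and snapshots the policy each time it improves (26.187 → 26.326 → 26.603 in this stretch). A minimal sketch of that check, with illustrative names:

    best_reward = float("-inf")

    def maybe_save_best(avg_episode_reward, save_fn):
        """Snapshot the policy whenever the running reward sets a new high."""
        global best_reward
        if avg_episode_reward > best_reward:
            best_reward = avg_episode_reward
            save_fn()  # e.g. torch.save(...) to a separate 'best' file
            print(f"Saving new best policy, reward={best_reward:.3f}!")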
[2023-02-28 16:33:58,917][24165] Updated weights for policy 0, policy_version 1608 (0.0014)
[2023-02-28 16:34:01,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 6594560. Throughput: 0: 819.6. Samples: 648054. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:34:01,417][11028] Avg episode reward: [(0, '25.164')]
[2023-02-28 16:34:06,416][11028] Fps is (10 sec: 3685.6, 60 sec: 3345.0, 300 sec: 3512.8). Total num frames: 6610944. Throughput: 0: 805.6. Samples: 650692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:34:06,421][11028] Avg episode reward: [(0, '24.510')]
[2023-02-28 16:34:11,415][11028] Fps is (10 sec: 2867.1, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 6623232. Throughput: 0: 806.1. Samples: 654778. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:34:11,425][11028] Avg episode reward: [(0, '24.871')]
[2023-02-28 16:34:11,452][24165] Updated weights for policy 0, policy_version 1618 (0.0014)
[2023-02-28 16:34:16,414][11028] Fps is (10 sec: 3277.5, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 6643712. Throughput: 0: 810.6. Samples: 660074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:34:16,417][11028] Avg episode reward: [(0, '23.561')]
[2023-02-28 16:34:21,414][11028] Fps is (10 sec: 4096.2, 60 sec: 3276.9, 300 sec: 3526.7). Total num frames: 6664192. Throughput: 0: 809.1. Samples: 663296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:34:21,420][11028] Avg episode reward: [(0, '23.748')]
[2023-02-28 16:34:21,595][24165] Updated weights for policy 0, policy_version 1628 (0.0030)
[2023-02-28 16:34:26,414][11028] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 6680576. Throughput: 0: 793.7. Samples: 668862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:34:26,420][11028] Avg episode reward: [(0, '23.372')]
[2023-02-28 16:34:31,415][11028] Fps is (10 sec: 3276.7, 60 sec: 3345.2, 300 sec: 3512.8). Total num frames: 6696960. Throughput: 0: 799.6. Samples: 673110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:34:31,422][11028] Avg episode reward: [(0, '21.344')]
[2023-02-28 16:34:34,732][24165] Updated weights for policy 0, policy_version 1638 (0.0025)
[2023-02-28 16:34:36,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3512.9). Total num frames: 6713344. Throughput: 0: 815.5. Samples: 675504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:34:36,426][11028] Avg episode reward: [(0, '20.965')]
[2023-02-28 16:34:41,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3345.3, 300 sec: 3526.7). Total num frames: 6737920. Throughput: 0: 884.1. Samples: 682164. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:34:41,421][11028] Avg episode reward: [(0, '21.430')]
[2023-02-28 16:34:44,203][24165] Updated weights for policy 0, policy_version 1648 (0.0013)
[2023-02-28 16:34:46,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 6754304. Throughput: 0: 879.7. Samples: 687642. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:34:46,416][11028] Avg episode reward: [(0, '20.892')]
[2023-02-28 16:34:51,421][11028] Fps is (10 sec: 2865.4, 60 sec: 3413.0, 300 sec: 3498.9). Total num frames: 6766592. Throughput: 0: 868.5. Samples: 689776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:34:51,428][11028] Avg episode reward: [(0, '20.823')]
[2023-02-28 16:34:56,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3512.9). Total num frames: 6782976. Throughput: 0: 879.4. Samples: 694350. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:34:56,417][11028] Avg episode reward: [(0, '21.440')]
[2023-02-28 16:34:57,288][24165] Updated weights for policy 0, policy_version 1658 (0.0018)
[2023-02-28 16:35:01,414][11028] Fps is (10 sec: 4098.6, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6807552. Throughput: 0: 907.7. Samples: 700922. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:35:01,423][11028] Avg episode reward: [(0, '22.441')]
[2023-02-28 16:35:06,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3550.0, 300 sec: 3512.8). Total num frames: 6823936. Throughput: 0: 908.8. Samples: 704192. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:35:06,417][11028] Avg episode reward: [(0, '22.487')]
[2023-02-28 16:35:08,221][24165] Updated weights for policy 0, policy_version 1668 (0.0015)
[2023-02-28 16:35:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3512.9). Total num frames: 6840320. Throughput: 0: 874.8. Samples: 708230. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:35:11,418][11028] Avg episode reward: [(0, '22.498')]
[2023-02-28 16:35:11,437][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001670_6840320.pth...
[2023-02-28 16:35:11,778][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001465_6000640.pth
[2023-02-28 16:35:16,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 6856704. Throughput: 0: 883.3. Samples: 712860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:35:16,421][11028] Avg episode reward: [(0, '22.041')]
[2023-02-28 16:35:19,747][24165] Updated weights for policy 0, policy_version 1678 (0.0025)
[2023-02-28 16:35:21,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 6877184. Throughput: 0: 906.9. Samples: 716316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:35:21,418][11028] Avg episode reward: [(0, '22.721')]
[2023-02-28 16:35:26,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 6897664. Throughput: 0: 899.4. Samples: 722636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:35:26,422][11028] Avg episode reward: [(0, '23.629')]
[2023-02-28 16:35:31,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 6909952. Throughput: 0: 872.4. Samples: 726898. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-28 16:35:31,420][11028] Avg episode reward: [(0, '24.021')]
[2023-02-28 16:35:31,887][24165] Updated weights for policy 0, policy_version 1688 (0.0018)
[2023-02-28 16:35:36,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 6926336. Throughput: 0: 873.8. Samples: 729092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:35:36,421][11028] Avg episode reward: [(0, '24.315')]
[2023-02-28 16:35:41,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 6950912. Throughput: 0: 910.0. Samples: 735298. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:35:41,421][11028] Avg episode reward: [(0, '23.571')]
[2023-02-28 16:35:42,311][24165] Updated weights for policy 0, policy_version 1698 (0.0013)
[2023-02-28 16:35:46,419][11028] Fps is (10 sec: 4094.2, 60 sec: 3549.6, 300 sec: 3512.8). Total num frames: 6967296. Throughput: 0: 903.2. Samples: 741572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:35:46,421][11028] Avg episode reward: [(0, '23.775')]
[2023-02-28 16:35:51,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.5, 300 sec: 3499.0). Total num frames: 6983680. Throughput: 0: 876.0. Samples: 743612. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:35:51,417][11028] Avg episode reward: [(0, '22.830')]
[2023-02-28 16:35:55,173][24165] Updated weights for policy 0, policy_version 1708 (0.0019)
[2023-02-28 16:35:56,414][11028] Fps is (10 sec: 3278.2, 60 sec: 3618.1, 300 sec: 3512.9). Total num frames: 7000064. Throughput: 0: 880.9. Samples: 747872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:35:56,421][11028] Avg episode reward: [(0, '22.135')]
[2023-02-28 16:36:01,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 7020544. Throughput: 0: 922.0. Samples: 754348. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:36:01,421][11028] Avg episode reward: [(0, '22.204')]
[2023-02-28 16:36:04,506][24165] Updated weights for policy 0, policy_version 1718 (0.0013)
[2023-02-28 16:36:06,419][11028] Fps is (10 sec: 4093.9, 60 sec: 3617.8, 300 sec: 3526.7). Total num frames: 7041024. Throughput: 0: 919.7. Samples: 757706. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:36:06,422][11028] Avg episode reward: [(0, '21.710')]
[2023-02-28 16:36:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7053312. Throughput: 0: 884.9. Samples: 762458. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:36:11,421][11028] Avg episode reward: [(0, '21.072')]
[2023-02-28 16:36:16,414][11028] Fps is (10 sec: 2868.7, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7069696. Throughput: 0: 884.7. Samples: 766708. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:36:16,421][11028] Avg episode reward: [(0, '22.470')]
[2023-02-28 16:36:17,621][24165] Updated weights for policy 0, policy_version 1728 (0.0016)
[2023-02-28 16:36:21,415][11028] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 7094272. Throughput: 0: 909.7. Samples: 770030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:36:21,421][11028] Avg episode reward: [(0, '23.050')]
[2023-02-28 16:36:26,422][11028] Fps is (10 sec: 4502.4, 60 sec: 3617.7, 300 sec: 3526.6). Total num frames: 7114752. Throughput: 0: 914.6. Samples: 776460. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:36:26,425][11028] Avg episode reward: [(0, '22.625')]
[2023-02-28 16:36:27,738][24165] Updated weights for policy 0, policy_version 1738 (0.0016)
[2023-02-28 16:36:31,414][11028] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 7127040. Throughput: 0: 874.7. Samples: 780932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:36:31,417][11028] Avg episode reward: [(0, '22.872')]
[2023-02-28 16:36:36,414][11028] Fps is (10 sec: 2869.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 7143424. Throughput: 0: 876.7. Samples: 783062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:36:36,417][11028] Avg episode reward: [(0, '22.451')]
[2023-02-28 16:36:40,211][24165] Updated weights for policy 0, policy_version 1748 (0.0025)
[2023-02-28 16:36:41,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 7163904. Throughput: 0: 909.9. Samples: 788818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:36:41,420][11028] Avg episode reward: [(0, '23.526')]
[2023-02-28 16:36:46,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3618.4, 300 sec: 3526.7). Total num frames: 7184384. Throughput: 0: 912.6. Samples: 795414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:36:46,421][11028] Avg episode reward: [(0, '23.604')]
[2023-02-28 16:36:50,960][24165] Updated weights for policy 0, policy_version 1758 (0.0027)
[2023-02-28 16:36:51,415][11028] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 7200768. Throughput: 0: 889.2. Samples: 797714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:36:51,418][11028] Avg episode reward: [(0, '23.562')]
[2023-02-28 16:36:56,415][11028] Fps is (10 sec: 2866.9, 60 sec: 3549.8, 300 sec: 3498.9). Total num frames: 7213056. Throughput: 0: 873.8. Samples: 801782. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:36:56,419][11028] Avg episode reward: [(0, '24.195')]
[2023-02-28 16:37:01,414][11028] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 7237632. Throughput: 0: 914.0. Samples: 807836. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:37:01,423][11028] Avg episode reward: [(0, '25.830')]
[2023-02-28 16:37:02,315][24165] Updated weights for policy 0, policy_version 1768 (0.0023)
[2023-02-28 16:37:06,414][11028] Fps is (10 sec: 4506.1, 60 sec: 3618.4, 300 sec: 3540.6). Total num frames: 7258112. Throughput: 0: 914.2. Samples: 811168. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:37:06,421][11028] Avg episode reward: [(0, '26.019')]
[2023-02-28 16:37:11,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 7270400. Throughput: 0: 890.4. Samples: 816522. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:37:11,421][11028] Avg episode reward: [(0, '25.245')]
[2023-02-28 16:37:11,434][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001775_7270400.pth...
[2023-02-28 16:37:11,706][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001569_6426624.pth
[2023-02-28 16:37:14,540][24165] Updated weights for policy 0, policy_version 1778 (0.0014)
[2023-02-28 16:37:16,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 7286784. Throughput: 0: 881.2. Samples: 820586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:37:16,417][11028] Avg episode reward: [(0, '24.636')]
[2023-02-28 16:37:21,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 7303168. Throughput: 0: 892.4. Samples: 823220. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:37:21,421][11028] Avg episode reward: [(0, '23.780')]
[2023-02-28 16:37:25,386][24165] Updated weights for policy 0, policy_version 1788 (0.0023)
[2023-02-28 16:37:26,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3550.3, 300 sec: 3540.6). Total num frames: 7327744. Throughput: 0: 906.6. Samples: 829614. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:37:26,421][11028] Avg episode reward: [(0, '22.077')]
[2023-02-28 16:37:31,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7340032. Throughput: 0: 874.7. Samples: 834776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:37:31,417][11028] Avg episode reward: [(0, '21.374')]
[2023-02-28 16:37:36,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7356416. Throughput: 0: 870.6. Samples: 836892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:37:36,420][11028] Avg episode reward: [(0, '21.815')]
[2023-02-28 16:37:38,447][24165] Updated weights for policy 0, policy_version 1798 (0.0014)
[2023-02-28 16:37:41,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 7376896. Throughput: 0: 894.0. Samples: 842012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:37:41,417][11028] Avg episode reward: [(0, '21.394')]
[2023-02-28 16:37:46,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 7397376. Throughput: 0: 904.6. Samples: 848544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:37:46,425][11028] Avg episode reward: [(0, '21.660')]
[2023-02-28 16:37:47,714][24165] Updated weights for policy 0, policy_version 1808 (0.0012)
[2023-02-28 16:37:51,414][11028] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 7413760. Throughput: 0: 894.3. Samples: 851410. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:37:51,420][11028] Avg episode reward: [(0, '21.242')]
[2023-02-28 16:37:56,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7426048. Throughput: 0: 866.3. Samples: 855504. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:37:56,417][11028] Avg episode reward: [(0, '21.969')]
[2023-02-28 16:38:00,928][24165] Updated weights for policy 0, policy_version 1818 (0.0019)
[2023-02-28 16:38:01,415][11028] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 7446528. Throughput: 0: 891.8. Samples: 860716. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-28 16:38:01,422][11028] Avg episode reward: [(0, '22.336')]
[2023-02-28 16:38:06,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 7467008. Throughput: 0: 907.4. Samples: 864054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:38:06,422][11028] Avg episode reward: [(0, '23.110')]
[2023-02-28 16:38:11,312][24165] Updated weights for policy 0, policy_version 1828 (0.0018)
[2023-02-28 16:38:11,419][11028] Fps is (10 sec: 4094.1, 60 sec: 3617.8, 300 sec: 3526.7). Total num frames: 7487488. Throughput: 0: 897.4. Samples: 870002. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:38:11,422][11028] Avg episode reward: [(0, '23.240')]
[2023-02-28 16:38:16,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 7499776. Throughput: 0: 875.9. Samples: 874190. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-28 16:38:16,418][11028] Avg episode reward: [(0, '24.493')]
[2023-02-28 16:38:21,415][11028] Fps is (10 sec: 2868.6, 60 sec: 3549.8, 300 sec: 3512.9). Total num frames: 7516160. Throughput: 0: 876.2. Samples: 876320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:38:21,417][11028] Avg episode reward: [(0, '24.891')]
[2023-02-28 16:38:23,517][24165] Updated weights for policy 0, policy_version 1838 (0.0025)
[2023-02-28 16:38:26,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7540736. Throughput: 0: 906.4. Samples: 882798. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:38:26,417][11028] Avg episode reward: [(0, '25.502')]
[2023-02-28 16:38:31,414][11028] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 7557120. Throughput: 0: 891.5. Samples: 888662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:38:31,422][11028] Avg episode reward: [(0, '24.923')]
[2023-02-28 16:38:34,620][24165] Updated weights for policy 0, policy_version 1848 (0.0012)
[2023-02-28 16:38:36,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3512.9). Total num frames: 7573504. Throughput: 0: 875.7. Samples: 890818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:38:36,418][11028] Avg episode reward: [(0, '24.150')]
[2023-02-28 16:38:41,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 7589888. Throughput: 0: 880.8. Samples: 895138. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:38:41,416][11028] Avg episode reward: [(0, '24.190')]
[2023-02-28 16:38:45,817][24165] Updated weights for policy 0, policy_version 1858 (0.0024)
[2023-02-28 16:38:46,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7610368. Throughput: 0: 911.7. Samples: 901742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:38:46,420][11028] Avg episode reward: [(0, '23.238')]
[2023-02-28 16:38:51,415][11028] Fps is (10 sec: 4095.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 7630848. Throughput: 0: 907.5. Samples: 904892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:38:51,418][11028] Avg episode reward: [(0, '22.141')]
[2023-02-28 16:38:56,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 7643136. Throughput: 0: 872.2. Samples: 909248. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:38:56,417][11028] Avg episode reward: [(0, '22.778')]
[2023-02-28 16:38:58,552][24165] Updated weights for policy 0, policy_version 1868 (0.0022)
[2023-02-28 16:39:01,414][11028] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7659520. Throughput: 0: 877.6. Samples: 913680. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:39:01,417][11028] Avg episode reward: [(0, '22.647')]
[2023-02-28 16:39:06,414][11028] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7680000. Throughput: 0: 904.3. Samples: 917014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:39:06,416][11028] Avg episode reward: [(0, '22.892')]
[2023-02-28 16:39:08,476][24165] Updated weights for policy 0, policy_version 1878 (0.0018)
[2023-02-28 16:39:11,421][11028] Fps is (10 sec: 4093.4, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 7700480. Throughput: 0: 903.8. Samples: 923476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:39:11,426][11028] Avg episode reward: [(0, '22.383')]
[2023-02-28 16:39:11,445][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001880_7700480.pth...
[2023-02-28 16:39:11,635][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001670_6840320.pth
[2023-02-28 16:39:16,414][11028] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7712768. Throughput: 0: 868.1. Samples: 927728. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-28 16:39:16,421][11028] Avg episode reward: [(0, '23.943')]
[2023-02-28 16:39:21,414][11028] Fps is (10 sec: 2869.0, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7729152. Throughput: 0: 867.6. Samples: 929862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-28 16:39:21,422][11028] Avg episode reward: [(0, '24.328')]
[2023-02-28 16:39:22,022][24165] Updated weights for policy 0, policy_version 1888 (0.0040)
[2023-02-28 16:39:26,415][11028] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 7749632. Throughput: 0: 896.5. Samples: 935480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:39:26,417][11028] Avg episode reward: [(0, '23.838')]
[2023-02-28 16:39:31,414][11028] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7770112. Throughput: 0: 894.2. Samples: 941982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:39:31,417][11028] Avg episode reward: [(0, '23.706')]
[2023-02-28 16:39:31,947][24165] Updated weights for policy 0, policy_version 1898 (0.0022)
[2023-02-28 16:39:36,414][11028] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7786496. Throughput: 0: 870.0. Samples: 944040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:39:36,416][11028] Avg episode reward: [(0, '24.774')]
[2023-02-28 16:39:41,415][11028] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 7798784. Throughput: 0: 864.5. Samples: 948152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:39:41,418][11028] Avg episode reward: [(0, '25.116')]
[2023-02-28 16:39:44,713][24165] Updated weights for policy 0, policy_version 1908 (0.0020)
[2023-02-28 16:39:46,414][11028] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3568.5). Total num frames: 7819264. Throughput: 0: 902.8. Samples: 954308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:39:46,417][11028] Avg episode reward: [(0, '22.716')]
[2023-02-28 16:39:51,414][11028] Fps is (10 sec: 4505.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 7843840. Throughput: 0: 905.2. Samples: 957746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-28 16:39:51,421][11028] Avg episode reward: [(0, '21.961')]
[2023-02-28 16:39:55,240][24165] Updated weights for policy 0, policy_version 1918 (0.0012)
[2023-02-28 16:39:56,416][11028] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7856128. Throughput: 0: 874.5. Samples: 962826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-28 16:39:56,419][11028] Avg episode reward: [(0, '21.993')]
[2023-02-28 16:40:01,416][11028] Fps is (10 sec: 2866.7, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 7872512. Throughput: 0: 875.8. Samples: 967140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:40:01,420][11028] Avg episode reward: [(0, '21.847')]
[2023-02-28 16:40:06,414][11028] Fps is (10 sec: 3687.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7892992. Throughput: 0: 892.2. Samples: 970010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:40:06,417][11028] Avg episode reward: [(0, '20.349')]
[2023-02-28 16:40:07,114][24165] Updated weights for policy 0, policy_version 1928 (0.0033)
[2023-02-28 16:40:11,414][11028] Fps is (10 sec: 4096.8, 60 sec: 3550.2, 300 sec: 3582.3). Total num frames: 7913472. Throughput: 0: 910.0. Samples: 976428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-28 16:40:11,417][11028] Avg episode reward: [(0, '21.786')]
[2023-02-28 16:40:16,419][11028] Fps is (10 sec: 3684.8, 60 sec: 3617.9, 300 sec: 3568.3). Total num frames: 7929856. Throughput: 0: 877.1. Samples: 981456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:40:16,430][11028] Avg episode reward: [(0, '21.234')]
[2023-02-28 16:40:18,716][24165] Updated weights for policy 0, policy_version 1938 (0.0026)
[2023-02-28 16:40:21,414][11028] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 7942144. Throughput: 0: 878.5. Samples: 983574. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:40:21,422][11028] Avg episode reward: [(0, '22.184')]
[2023-02-28 16:40:26,414][11028] Fps is (10 sec: 3278.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 7962624. Throughput: 0: 897.3. Samples: 988528. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-28 16:40:26,421][11028] Avg episode reward: [(0, '23.090')]
[2023-02-28 16:40:29,767][24165] Updated weights for policy 0, policy_version 1948 (0.0016)
[2023-02-28 16:40:31,414][11028] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 7983104. Throughput: 0: 905.8. Samples: 995068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:40:31,420][11028] Avg episode reward: [(0, '25.249')]
[2023-02-28 16:40:36,414][11028] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 7999488. Throughput: 0: 891.8. Samples: 997876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-28 16:40:36,416][11028] Avg episode reward: [(0, '26.339')]
[2023-02-28 16:40:38,234][24147] Stopping Batcher_0...
[2023-02-28 16:40:38,236][24147] Loop batcher_evt_loop terminating...
[2023-02-28 16:40:38,237][11028] Component Batcher_0 stopped!
[2023-02-28 16:40:38,245][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth...
[2023-02-28 16:40:38,315][24165] Weights refcount: 2 0
[2023-02-28 16:40:38,336][24165] Stopping InferenceWorker_p0-w0...
[2023-02-28 16:40:38,344][11028] Component InferenceWorker_p0-w0 stopped!
[2023-02-28 16:40:38,355][24165] Loop inference_proc0-0_evt_loop terminating...
[2023-02-28 16:40:38,436][24147] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001775_7270400.pth
[2023-02-28 16:40:38,450][24147] Saving new best policy, reward=26.678!
[2023-02-28 16:40:38,545][11028] Component RolloutWorker_w1 stopped!
[2023-02-28 16:40:38,547][24167] Stopping RolloutWorker_w0...
[2023-02-28 16:40:38,548][24167] Loop rollout_proc0_evt_loop terminating...
[2023-02-28 16:40:38,548][11028] Component RolloutWorker_w0 stopped!
[2023-02-28 16:40:38,550][24166] Stopping RolloutWorker_w1...
[2023-02-28 16:40:38,551][24166] Loop rollout_proc1_evt_loop terminating...
[2023-02-28 16:40:38,553][11028] Component RolloutWorker_w7 stopped!
[2023-02-28 16:40:38,556][24188] Stopping RolloutWorker_w7...
[2023-02-28 16:40:38,557][24188] Loop rollout_proc7_evt_loop terminating...
[2023-02-28 16:40:38,559][11028] Component RolloutWorker_w5 stopped!
[2023-02-28 16:40:38,561][24182] Stopping RolloutWorker_w5...
[2023-02-28 16:40:38,561][24182] Loop rollout_proc5_evt_loop terminating...
[2023-02-28 16:40:38,571][11028] Component RolloutWorker_w3 stopped!
[2023-02-28 16:40:38,573][24169] Stopping RolloutWorker_w3...
[2023-02-28 16:40:38,574][24169] Loop rollout_proc3_evt_loop terminating...
[2023-02-28 16:40:38,600][24180] Stopping RolloutWorker_w6...
[2023-02-28 16:40:38,600][11028] Component RolloutWorker_w6 stopped!
[2023-02-28 16:40:38,601][24180] Loop rollout_proc6_evt_loop terminating...
[2023-02-28 16:40:38,608][24178] Stopping RolloutWorker_w4...
[2023-02-28 16:40:38,608][11028] Component RolloutWorker_w4 stopped!
[2023-02-28 16:40:38,616][24178] Loop rollout_proc4_evt_loop terminating...
[2023-02-28 16:40:38,624][11028] Component RolloutWorker_w2 stopped!
[2023-02-28 16:40:38,625][24176] Stopping RolloutWorker_w2...
[2023-02-28 16:40:38,640][24176] Loop rollout_proc2_evt_loop terminating...
[2023-02-28 16:40:38,741][24147] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth...
[2023-02-28 16:40:39,155][11028] Component LearnerWorker_p0 stopped!
[2023-02-28 16:40:39,156][24147] Stopping LearnerWorker_p0...
[2023-02-28 16:40:39,156][24147] Loop learner_proc0_evt_loop terminating...
[2023-02-28 16:40:39,157][11028] Waiting for process learner_proc0 to stop...
[2023-02-28 16:40:42,156][11028] Waiting for process inference_proc0-0 to join...
[2023-02-28 16:40:42,252][11028] Waiting for process rollout_proc0 to join...
[2023-02-28 16:40:42,640][11028] Waiting for process rollout_proc1 to join...
[2023-02-28 16:40:42,641][11028] Waiting for process rollout_proc2 to join...
[2023-02-28 16:40:42,643][11028] Waiting for process rollout_proc3 to join...
[2023-02-28 16:40:42,644][11028] Waiting for process rollout_proc4 to join...
[2023-02-28 16:40:42,648][11028] Waiting for process rollout_proc5 to join...
[2023-02-28 16:40:42,651][11028] Waiting for process rollout_proc6 to join...
[2023-02-28 16:40:42,653][11028] Waiting for process rollout_proc7 to join...
[2023-02-28 16:40:42,654][11028] Batcher 0 profile tree view:
batching: 26.9247, releasing_batches: 0.0312
[2023-02-28 16:40:42,655][11028] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 553.7172
update_model: 7.6484
weight_update: 0.0013
one_step: 0.0164
handle_policy_step: 540.8768
deserialize: 15.5778, stack: 3.1170, obs_to_device_normalize: 117.9313, forward: 260.9584, send_messages: 26.2955
prepare_outputs: 90.1002
to_cpu: 56.7629
[2023-02-28 16:40:42,657][11028] Learner 0 profile tree view:
misc: 0.0057, prepare_batch: 18.6312
train: 80.7552
epoch_init: 0.0187, minibatch_init: 0.0079, losses_postprocess: 0.5913, kl_divergence: 0.6280, after_optimizer: 3.3947
calculate_losses: 26.2648
losses_init: 0.0037, forward_head: 1.7459, bptt_initial: 17.2009, tail: 1.2050, advantages_returns: 0.3071, losses: 3.2979
bptt: 2.1612
bptt_forward_core: 2.0733
update: 49.0283
clip: 1.4631
[2023-02-28 16:40:42,658][11028] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3530, enqueue_policy_requests: 154.2610, env_step: 858.2450, overhead: 23.1643, complete_rollouts: 7.6389
save_policy_outputs: 22.8860
split_output_tensors: 10.8809
[2023-02-28 16:40:42,659][11028] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4068, enqueue_policy_requests: 151.8858, env_step: 859.3934, overhead: 23.5089, complete_rollouts: 7.6703
save_policy_outputs: 22.3224
split_output_tensors: 10.9180
[2023-02-28 16:40:42,661][11028] Loop Runner_EvtLoop terminating...
[2023-02-28 16:40:42,662][11028] Runner profile tree view:
main_loop: 1168.8627
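Read together, the profile trees say most wall time went to simulation: each rollout worker spent roughly 858 s of the 1168.9 s main loop inside env.step, with another ~152-154 s enqueueing policy requests, while the inference worker's 540.9 s handle_policy_step was dominated by the forward pass (261.0 s) and observation normalization (117.9 s). A quick share-of-total check using RolloutWorker_w0's numbers:

    main_loop = 1168.8627  # Runner main loop, seconds
    env_step = 858.2450    # RolloutWorker_w0 time in env.step
    enqueue = 154.2610     # RolloutWorker_w0 enqueue_policy_requests

    print(f"env_step: {env_step / main_loop:.1%}")  # ~73.4% of the run
    print(f"enqueue:  {enqueue / main_loop:.1%}")   # ~13.2% of the run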
[2023-02-28 16:40:42,663][11028] Collected {0: 8007680}, FPS: 3423.7
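The closing FPS figure is frames collected during this run divided by main-loop time. Frame totals here are multiples of 4096 (policy_version × 4096: 1955 × 4096 = 8,007,680), and the numbers only reconcile if this process resumed from a checkpoint at 4,005,888 frames (policy version 978); that resume point is an inference from the arithmetic, not something the log states:

    total_frames = 8_007_680     # Collected {0: 8007680}
    resumed_frames = 4_005_888   # assumed resume point: 978 * 4096
    main_loop = 1168.8627        # seconds

    fps = (total_frames - resumed_frames) / main_loop
    print(f"{fps:.1f}")  # 3423.7, matching the reported figure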
[2023-02-28 16:40:42,718][11028] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-28 16:40:42,720][11028] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-28 16:40:42,722][11028] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-28 16:40:42,725][11028] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-28 16:40:42,727][11028] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:40:42,728][11028] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-28 16:40:42,732][11028] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:40:42,734][11028] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-28 16:40:42,735][11028] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-28 16:40:42,737][11028] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-28 16:40:42,738][11028] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-28 16:40:42,742][11028] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-28 16:40:42,745][11028] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-28 16:40:42,747][11028] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-28 16:40:42,749][11028] Using frameskip 1 and render_action_repeat=4 for evaluation
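Frameskip 1 with render_action_repeat=4 means the evaluation loop repeats each chosen action for four consecutive frames, matching the effective dynamics of training while still rendering every frame for the video. A sketch of that repeat loop under a Gym-style step API (illustrative; the actual enjoy script differs):

    def step_with_repeat(env, action, repeat=4):
        """Apply one action for `repeat` frames, accumulating reward."""
        total_reward, done, info = 0.0, False, {}
        for _ in range(repeat):
            obs, reward, done, info = env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, info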
[2023-02-28 16:40:42,775][11028] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:40:42,783][11028] RunningMeanStd input shape: (1,)
[2023-02-28 16:40:42,808][11028] ConvEncoder: input_channels=3
[2023-02-28 16:40:42,956][11028] Conv encoder output size: 512
[2023-02-28 16:40:42,960][11028] Policy head output size: 512
[2023-02-28 16:40:43,055][11028] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth...
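The checkpoint is a regular PyTorch file, so it can be inspected directly; the keys are whatever Sample Factory stored and vary between versions, so nothing below assumes specific contents:

    import torch

    ckpt = torch.load(
        "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth",
        map_location="cpu",  # on PyTorch >= 2.6 you may also need weights_only=False
    )
    print(type(ckpt))  # typically a dict
    for key in ckpt:   # e.g. model weights, optimizer state, frame counters
        print(key)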
[2023-02-28 16:40:44,019][11028] Num frames 100...
[2023-02-28 16:40:44,131][11028] Num frames 200...
[2023-02-28 16:40:44,241][11028] Num frames 300...
[2023-02-28 16:40:44,352][11028] Num frames 400...
[2023-02-28 16:40:44,474][11028] Num frames 500...
[2023-02-28 16:40:44,591][11028] Avg episode rewards: #0: 10.540, true rewards: #0: 5.540
[2023-02-28 16:40:44,594][11028] Avg episode reward: 10.540, avg true_objective: 5.540
[2023-02-28 16:40:44,653][11028] Num frames 600...
[2023-02-28 16:40:44,774][11028] Num frames 700...
[2023-02-28 16:40:44,893][11028] Num frames 800...
[2023-02-28 16:40:45,014][11028] Num frames 900...
[2023-02-28 16:40:45,133][11028] Num frames 1000...
[2023-02-28 16:40:45,200][11028] Avg episode rewards: #0: 10.530, true rewards: #0: 5.030
[2023-02-28 16:40:45,201][11028] Avg episode reward: 10.530, avg true_objective: 5.030
[2023-02-28 16:40:45,314][11028] Num frames 1100...
[2023-02-28 16:40:45,425][11028] Num frames 1200...
[2023-02-28 16:40:45,535][11028] Num frames 1300...
[2023-02-28 16:40:45,649][11028] Num frames 1400...
[2023-02-28 16:40:45,765][11028] Num frames 1500...
[2023-02-28 16:40:45,883][11028] Num frames 1600...
[2023-02-28 16:40:45,998][11028] Num frames 1700...
[2023-02-28 16:40:46,114][11028] Num frames 1800...
[2023-02-28 16:40:46,234][11028] Num frames 1900...
[2023-02-28 16:40:46,397][11028] Avg episode rewards: #0: 15.993, true rewards: #0: 6.660
[2023-02-28 16:40:46,399][11028] Avg episode reward: 15.993, avg true_objective: 6.660
[2023-02-28 16:40:46,404][11028] Num frames 2000...
[2023-02-28 16:40:46,525][11028] Num frames 2100...
[2023-02-28 16:40:46,643][11028] Num frames 2200...
[2023-02-28 16:40:46,759][11028] Num frames 2300...
[2023-02-28 16:40:46,878][11028] Num frames 2400...
[2023-02-28 16:40:46,998][11028] Num frames 2500...
[2023-02-28 16:40:47,121][11028] Num frames 2600...
[2023-02-28 16:40:47,233][11028] Num frames 2700...
[2023-02-28 16:40:47,345][11028] Num frames 2800...
[2023-02-28 16:40:47,459][11028] Num frames 2900...
[2023-02-28 16:40:47,549][11028] Avg episode rewards: #0: 16.565, true rewards: #0: 7.315
[2023-02-28 16:40:47,550][11028] Avg episode reward: 16.565, avg true_objective: 7.315
[2023-02-28 16:40:47,650][11028] Num frames 3000...
[2023-02-28 16:40:47,759][11028] Num frames 3100...
[2023-02-28 16:40:47,871][11028] Num frames 3200...
[2023-02-28 16:40:47,987][11028] Num frames 3300...
[2023-02-28 16:40:48,101][11028] Num frames 3400...
[2023-02-28 16:40:48,223][11028] Num frames 3500...
[2023-02-28 16:40:48,333][11028] Num frames 3600...
[2023-02-28 16:40:48,444][11028] Num frames 3700...
[2023-02-28 16:40:48,556][11028] Num frames 3800...
[2023-02-28 16:40:48,668][11028] Num frames 3900...
[2023-02-28 16:40:48,789][11028] Num frames 4000...
[2023-02-28 16:40:48,901][11028] Num frames 4100...
[2023-02-28 16:40:49,022][11028] Num frames 4200...
[2023-02-28 16:40:49,135][11028] Num frames 4300...
[2023-02-28 16:40:49,250][11028] Num frames 4400...
[2023-02-28 16:40:49,364][11028] Num frames 4500...
[2023-02-28 16:40:49,481][11028] Num frames 4600...
[2023-02-28 16:40:49,593][11028] Num frames 4700...
[2023-02-28 16:40:49,707][11028] Num frames 4800...
[2023-02-28 16:40:49,827][11028] Num frames 4900...
[2023-02-28 16:40:49,943][11028] Num frames 5000...
[2023-02-28 16:40:50,034][11028] Avg episode rewards: #0: 23.452, true rewards: #0: 10.052
[2023-02-28 16:40:50,036][11028] Avg episode reward: 23.452, avg true_objective: 10.052
[2023-02-28 16:40:50,125][11028] Num frames 5100...
[2023-02-28 16:40:50,240][11028] Num frames 5200...
[2023-02-28 16:40:50,355][11028] Num frames 5300...
[2023-02-28 16:40:50,496][11028] Avg episode rewards: #0: 20.463, true rewards: #0: 8.963
[2023-02-28 16:40:50,498][11028] Avg episode reward: 20.463, avg true_objective: 8.963
[2023-02-28 16:40:50,527][11028] Num frames 5400...
[2023-02-28 16:40:50,639][11028] Num frames 5500...
[2023-02-28 16:40:50,757][11028] Num frames 5600...
[2023-02-28 16:40:50,869][11028] Num frames 5700...
[2023-02-28 16:40:50,997][11028] Avg episode rewards: #0: 18.517, true rewards: #0: 8.231
[2023-02-28 16:40:50,998][11028] Avg episode reward: 18.517, avg true_objective: 8.231
[2023-02-28 16:40:51,049][11028] Num frames 5800...
[2023-02-28 16:40:51,162][11028] Num frames 5900...
[2023-02-28 16:40:51,273][11028] Num frames 6000...
[2023-02-28 16:40:51,381][11028] Num frames 6100...
[2023-02-28 16:40:51,506][11028] Num frames 6200...
[2023-02-28 16:40:51,641][11028] Avg episode rewards: #0: 17.217, true rewards: #0: 7.842
[2023-02-28 16:40:51,643][11028] Avg episode reward: 17.217, avg true_objective: 7.842
[2023-02-28 16:40:51,679][11028] Num frames 6300...
[2023-02-28 16:40:51,791][11028] Num frames 6400...
[2023-02-28 16:40:51,901][11028] Num frames 6500...
[2023-02-28 16:40:52,025][11028] Num frames 6600...
[2023-02-28 16:40:52,145][11028] Num frames 6700...
[2023-02-28 16:40:52,259][11028] Num frames 6800...
[2023-02-28 16:40:52,404][11028] Num frames 6900...
[2023-02-28 16:40:52,562][11028] Num frames 7000...
[2023-02-28 16:40:52,715][11028] Avg episode rewards: #0: 16.959, true rewards: #0: 7.848
[2023-02-28 16:40:52,717][11028] Avg episode reward: 16.959, avg true_objective: 7.848
[2023-02-28 16:40:52,781][11028] Num frames 7100...
[2023-02-28 16:40:52,943][11028] Num frames 7200...
[2023-02-28 16:40:53,115][11028] Num frames 7300...
[2023-02-28 16:40:53,269][11028] Num frames 7400...
[2023-02-28 16:40:53,402][11028] Avg episode rewards: #0: 15.647, true rewards: #0: 7.447
[2023-02-28 16:40:53,405][11028] Avg episode reward: 15.647, avg true_objective: 7.447
[2023-02-28 16:41:42,076][11028] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
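The "Avg episode rewards" lines are running means over the episodes finished so far, so individual episode scores can be recovered from consecutive lines via r_k = k·avg_k − (k−1)·avg_{k−1}. Episode 2 of this first evaluation, for instance, scored 2 × 10.530 − 10.540 = 10.520:

    def episode_reward(k, avg_k, avg_prev):
        """Recover episode k's reward from consecutive running averages."""
        return k * avg_k - (k - 1) * avg_prev

    print(episode_reward(2, 10.530, 10.540))  # 10.520
    print(episode_reward(3, 15.993, 10.530))  # 26.919 (the long episode 3)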
[2023-02-28 16:41:42,107][11028] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-28 16:41:42,108][11028] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-28 16:41:42,109][11028] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-28 16:41:42,113][11028] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-28 16:41:42,114][11028] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-28 16:41:42,115][11028] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-28 16:41:42,116][11028] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-28 16:41:42,118][11028] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-28 16:41:42,119][11028] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-28 16:41:42,120][11028] Adding new argument 'hf_repository'='bonadio/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-28 16:41:42,121][11028] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-28 16:41:42,122][11028] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-28 16:41:42,123][11028] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-28 16:41:42,124][11028] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-28 16:41:42,125][11028] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-28 16:41:42,149][11028] RunningMeanStd input shape: (3, 72, 128)
[2023-02-28 16:41:42,152][11028] RunningMeanStd input shape: (1,)
[2023-02-28 16:41:42,168][11028] ConvEncoder: input_channels=3
[2023-02-28 16:41:42,204][11028] Conv encoder output size: 512
[2023-02-28 16:41:42,206][11028] Policy head output size: 512
[2023-02-28 16:41:42,226][11028] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001955_8007680.pth...
[2023-02-28 16:41:42,668][11028] Num frames 100...
[2023-02-28 16:41:42,780][11028] Num frames 200...
[2023-02-28 16:41:42,891][11028] Num frames 300...
[2023-02-28 16:41:43,000][11028] Num frames 400...
[2023-02-28 16:41:43,112][11028] Num frames 500...
[2023-02-28 16:41:43,222][11028] Num frames 600...
[2023-02-28 16:41:43,287][11028] Avg episode rewards: #0: 12.080, true rewards: #0: 6.080
[2023-02-28 16:41:43,289][11028] Avg episode reward: 12.080, avg true_objective: 6.080
[2023-02-28 16:41:43,407][11028] Num frames 700...
[2023-02-28 16:41:43,524][11028] Num frames 800...
[2023-02-28 16:41:43,644][11028] Num frames 900...
[2023-02-28 16:41:43,765][11028] Num frames 1000...
[2023-02-28 16:41:43,878][11028] Num frames 1100...
[2023-02-28 16:41:43,996][11028] Num frames 1200...
[2023-02-28 16:41:44,110][11028] Num frames 1300...
[2023-02-28 16:41:44,223][11028] Num frames 1400...
[2023-02-28 16:41:44,339][11028] Num frames 1500...
[2023-02-28 16:41:44,437][11028] Avg episode rewards: #0: 16.180, true rewards: #0: 7.680
[2023-02-28 16:41:44,439][11028] Avg episode reward: 16.180, avg true_objective: 7.680
[2023-02-28 16:41:44,524][11028] Num frames 1600...
[2023-02-28 16:41:44,640][11028] Num frames 1700...
[2023-02-28 16:41:44,753][11028] Num frames 1800...
[2023-02-28 16:41:44,864][11028] Num frames 1900...
[2023-02-28 16:41:44,985][11028] Num frames 2000...
[2023-02-28 16:41:45,104][11028] Num frames 2100...
[2023-02-28 16:41:45,227][11028] Num frames 2200...
[2023-02-28 16:41:45,347][11028] Num frames 2300...
[2023-02-28 16:41:45,461][11028] Num frames 2400...
[2023-02-28 16:41:45,620][11028] Avg episode rewards: #0: 17.943, true rewards: #0: 8.277
[2023-02-28 16:41:45,622][11028] Avg episode reward: 17.943, avg true_objective: 8.277
[2023-02-28 16:41:45,645][11028] Num frames 2500...
[2023-02-28 16:41:45,764][11028] Num frames 2600...
[2023-02-28 16:41:45,881][11028] Num frames 2700...
[2023-02-28 16:41:46,000][11028] Num frames 2800...
[2023-02-28 16:41:46,114][11028] Num frames 2900...
[2023-02-28 16:41:46,225][11028] Num frames 3000...
[2023-02-28 16:41:46,349][11028] Num frames 3100...
[2023-02-28 16:41:46,476][11028] Num frames 3200...
[2023-02-28 16:41:46,601][11028] Num frames 3300...
[2023-02-28 16:41:46,713][11028] Num frames 3400...
[2023-02-28 16:41:46,831][11028] Num frames 3500...
[2023-02-28 16:41:46,950][11028] Num frames 3600...
[2023-02-28 16:41:47,064][11028] Num frames 3700...
[2023-02-28 16:41:47,187][11028] Num frames 3800...
[2023-02-28 16:41:47,302][11028] Num frames 3900...
[2023-02-28 16:41:47,422][11028] Num frames 4000...
[2023-02-28 16:41:47,545][11028] Num frames 4100...
[2023-02-28 16:41:47,671][11028] Num frames 4200...
[2023-02-28 16:41:47,783][11028] Num frames 4300...
[2023-02-28 16:41:47,897][11028] Num frames 4400...
[2023-02-28 16:41:48,029][11028] Avg episode rewards: #0: 28.167, true rewards: #0: 11.167
[2023-02-28 16:41:48,031][11028] Avg episode reward: 28.167, avg true_objective: 11.167
[2023-02-28 16:41:48,076][11028] Num frames 4500...
[2023-02-28 16:41:48,188][11028] Num frames 4600...
[2023-02-28 16:41:48,301][11028] Num frames 4700...
[2023-02-28 16:41:48,414][11028] Num frames 4800...
[2023-02-28 16:41:48,531][11028] Num frames 4900...
[2023-02-28 16:41:48,649][11028] Num frames 5000...
[2023-02-28 16:41:48,719][11028] Avg episode rewards: #0: 24.022, true rewards: #0: 10.022
[2023-02-28 16:41:48,721][11028] Avg episode reward: 24.022, avg true_objective: 10.022
[2023-02-28 16:41:48,837][11028] Num frames 5100...
[2023-02-28 16:41:48,964][11028] Num frames 5200...
[2023-02-28 16:41:49,081][11028] Num frames 5300...
[2023-02-28 16:41:49,191][11028] Num frames 5400...
[2023-02-28 16:41:49,300][11028] Num frames 5500...
[2023-02-28 16:41:49,452][11028] Avg episode rewards: #0: 21.645, true rewards: #0: 9.312
[2023-02-28 16:41:49,454][11028] Avg episode reward: 21.645, avg true_objective: 9.312
[2023-02-28 16:41:49,473][11028] Num frames 5600...
[2023-02-28 16:41:49,585][11028] Num frames 5700...
[2023-02-28 16:41:49,700][11028] Num frames 5800...
[2023-02-28 16:41:49,812][11028] Num frames 5900...
[2023-02-28 16:41:49,929][11028] Num frames 6000...
[2023-02-28 16:41:50,036][11028] Num frames 6100...
[2023-02-28 16:41:50,202][11028] Num frames 6200...
[2023-02-28 16:41:50,366][11028] Num frames 6300...
[2023-02-28 16:41:50,527][11028] Num frames 6400...
[2023-02-28 16:41:50,689][11028] Num frames 6500...
[2023-02-28 16:41:50,845][11028] Num frames 6600...
[2023-02-28 16:41:50,998][11028] Num frames 6700...
[2023-02-28 16:41:51,155][11028] Num frames 6800...
[2023-02-28 16:41:51,311][11028] Num frames 6900...
[2023-02-28 16:41:51,463][11028] Num frames 7000...
[2023-02-28 16:41:51,625][11028] Num frames 7100...
[2023-02-28 16:41:51,784][11028] Num frames 7200...
[2023-02-28 16:41:51,950][11028] Num frames 7300...
[2023-02-28 16:41:52,110][11028] Num frames 7400...
[2023-02-28 16:41:52,267][11028] Num frames 7500...
[2023-02-28 16:41:52,428][11028] Num frames 7600...
[2023-02-28 16:41:52,633][11028] Avg episode rewards: #0: 26.981, true rewards: #0: 10.981
[2023-02-28 16:41:52,635][11028] Avg episode reward: 26.981, avg true_objective: 10.981
[2023-02-28 16:41:52,660][11028] Num frames 7700...
[2023-02-28 16:41:52,831][11028] Num frames 7800...
[2023-02-28 16:41:53,005][11028] Num frames 7900...
[2023-02-28 16:41:53,179][11028] Num frames 8000...
[2023-02-28 16:41:53,345][11028] Num frames 8100...
[2023-02-28 16:41:53,518][11028] Num frames 8200...
[2023-02-28 16:41:53,671][11028] Num frames 8300...
[2023-02-28 16:41:53,789][11028] Num frames 8400...
[2023-02-28 16:41:53,900][11028] Num frames 8500...
[2023-02-28 16:41:54,015][11028] Num frames 8600...
[2023-02-28 16:41:54,143][11028] Num frames 8700...
[2023-02-28 16:41:54,264][11028] Num frames 8800...
[2023-02-28 16:41:54,375][11028] Num frames 8900...
[2023-02-28 16:41:54,487][11028] Num frames 9000...
[2023-02-28 16:41:54,603][11028] Num frames 9100...
[2023-02-28 16:41:54,717][11028] Num frames 9200...
[2023-02-28 16:41:54,850][11028] Num frames 9300...
[2023-02-28 16:41:54,980][11028] Num frames 9400...
[2023-02-28 16:41:55,094][11028] Num frames 9500...
[2023-02-28 16:41:55,205][11028] Num frames 9600...
[2023-02-28 16:41:55,319][11028] Num frames 9700...
[2023-02-28 16:41:55,380][11028] Avg episode rewards: #0: 29.503, true rewards: #0: 12.129
[2023-02-28 16:41:55,382][11028] Avg episode reward: 29.503, avg true_objective: 12.129
[2023-02-28 16:41:55,503][11028] Num frames 9800...
[2023-02-28 16:41:55,619][11028] Num frames 9900...
[2023-02-28 16:41:55,738][11028] Num frames 10000...
[2023-02-28 16:41:55,857][11028] Num frames 10100...
[2023-02-28 16:41:55,974][11028] Num frames 10200...
[2023-02-28 16:41:56,090][11028] Num frames 10300...
[2023-02-28 16:41:56,212][11028] Num frames 10400...
[2023-02-28 16:41:56,323][11028] Num frames 10500...
[2023-02-28 16:41:56,444][11028] Num frames 10600...
[2023-02-28 16:41:56,555][11028] Num frames 10700...
[2023-02-28 16:41:56,670][11028] Num frames 10800...
[2023-02-28 16:41:56,809][11028] Num frames 10900...
[2023-02-28 16:41:56,926][11028] Num frames 11000...
[2023-02-28 16:41:57,038][11028] Num frames 11100...
[2023-02-28 16:41:57,155][11028] Num frames 11200...
[2023-02-28 16:41:57,271][11028] Num frames 11300...
[2023-02-28 16:41:57,386][11028] Num frames 11400...
[2023-02-28 16:41:57,507][11028] Num frames 11500...
[2023-02-28 16:41:57,620][11028] Num frames 11600...
[2023-02-28 16:41:57,745][11028] Num frames 11700...
[2023-02-28 16:41:57,861][11028] Avg episode rewards: #0: 32.056, true rewards: #0: 13.057
[2023-02-28 16:41:57,863][11028] Avg episode reward: 32.056, avg true_objective: 13.057
[2023-02-28 16:41:57,921][11028] Num frames 11800...
[2023-02-28 16:41:58,035][11028] Num frames 11900...
[2023-02-28 16:41:58,148][11028] Num frames 12000...
[2023-02-28 16:41:58,256][11028] Num frames 12100...
[2023-02-28 16:41:58,368][11028] Num frames 12200...
[2023-02-28 16:41:58,495][11028] Num frames 12300...
[2023-02-28 16:41:58,609][11028] Num frames 12400...
[2023-02-28 16:41:58,725][11028] Num frames 12500...
[2023-02-28 16:41:58,843][11028] Num frames 12600...
[2023-02-28 16:41:58,960][11028] Num frames 12700...
[2023-02-28 16:41:59,077][11028] Num frames 12800...
[2023-02-28 16:41:59,190][11028] Num frames 12900...
[2023-02-28 16:41:59,284][11028] Avg episode rewards: #0: 31.235, true rewards: #0: 12.935
[2023-02-28 16:41:59,286][11028] Avg episode reward: 31.235, avg true_objective: 12.935
[2023-02-28 16:43:22,983][11028] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
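This second evaluation ran with push_to_hub=True and hf_repository='bonadio/rl_course_vizdoom_health_gathering_supreme', so its artifacts (config.json, checkpoints, replay.mp4) are what end up in the Hub repo this log was downloaded from. Sample Factory performs that upload itself; a manual equivalent with the huggingface_hub client would look roughly like this sketch:

    from huggingface_hub import HfApi

    HfApi().upload_folder(
        folder_path="/content/train_dir/default_experiment",
        repo_id="bonadio/rl_course_vizdoom_health_gathering_supreme",
        repo_type="model",
    )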