[2023-02-25 19:56:11,636][05610] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-25 19:56:11,638][05610] Rollout worker 0 uses device cpu
[2023-02-25 19:56:11,642][05610] Rollout worker 1 uses device cpu
[2023-02-25 19:56:11,644][05610] Rollout worker 2 uses device cpu
[2023-02-25 19:56:11,647][05610] Rollout worker 3 uses device cpu
[2023-02-25 19:56:11,649][05610] Rollout worker 4 uses device cpu
[2023-02-25 19:56:11,651][05610] Rollout worker 5 uses device cpu
[2023-02-25 19:56:11,652][05610] Rollout worker 6 uses device cpu
[2023-02-25 19:56:11,654][05610] Rollout worker 7 uses device cpu
[2023-02-25 19:56:11,843][05610] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 19:56:11,844][05610] InferenceWorker_p0-w0: min num requests: 2
[2023-02-25 19:56:11,879][05610] Starting all processes...
[2023-02-25 19:56:11,884][05610] Starting process learner_proc0
[2023-02-25 19:56:11,964][05610] Starting all processes...
[2023-02-25 19:56:12,065][05610] Starting process inference_proc0-0
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc0
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc1
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc2
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc3
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc4
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc5
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc6
[2023-02-25 19:56:12,077][05610] Starting process rollout_proc7
[2023-02-25 19:56:24,094][16658] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 19:56:24,098][16658] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-25 19:56:24,368][16672] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 19:56:24,378][16672] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-25 19:56:24,638][16683] Worker 4 uses CPU cores [0]
[2023-02-25 19:56:24,644][16680] Worker 3 uses CPU cores [1]
[2023-02-25 19:56:24,699][16679] Worker 1 uses CPU cores [1]
[2023-02-25 19:56:24,743][16678] Worker 2 uses CPU cores [0]
[2023-02-25 19:56:24,774][16677] Worker 0 uses CPU cores [0]
[2023-02-25 19:56:24,864][16681] Worker 6 uses CPU cores [0]
[2023-02-25 19:56:25,124][16682] Worker 5 uses CPU cores [1]
[2023-02-25 19:56:25,126][16684] Worker 7 uses CPU cores [1]
[2023-02-25 19:56:25,240][16672] Num visible devices: 1
[2023-02-25 19:56:25,247][16658] Num visible devices: 1
[2023-02-25 19:56:25,257][16658] Starting seed is not provided
[2023-02-25 19:56:25,258][16658] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 19:56:25,259][16658] Initializing actor-critic model on device cuda:0
[2023-02-25 19:56:25,260][16658] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 19:56:25,262][16658] RunningMeanStd input shape: (1,)
[2023-02-25 19:56:25,281][16658] ConvEncoder: input_channels=3
[2023-02-25 19:56:25,700][16658] Conv encoder output size: 512
[2023-02-25 19:56:25,701][16658] Policy head output size: 512
[2023-02-25 19:56:25,791][16658] Created Actor Critic model with architecture:
[2023-02-25 19:56:25,792][16658] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
[2023-02-25 19:56:31,836][05610] Heartbeat connected on Batcher_0
[2023-02-25 19:56:31,843][05610] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-25 19:56:31,853][05610] Heartbeat connected on RolloutWorker_w0
[2023-02-25 19:56:31,857][05610] Heartbeat connected on RolloutWorker_w1
[2023-02-25 19:56:31,862][05610] Heartbeat connected on RolloutWorker_w2
[2023-02-25 19:56:31,866][05610] Heartbeat connected on RolloutWorker_w3
[2023-02-25 19:56:31,868][05610] Heartbeat connected on RolloutWorker_w4
[2023-02-25 19:56:31,875][05610] Heartbeat connected on RolloutWorker_w5
[2023-02-25 19:56:31,877][05610] Heartbeat connected on RolloutWorker_w6
[2023-02-25 19:56:31,881][05610] Heartbeat connected on RolloutWorker_w7
[2023-02-25 19:56:34,907][16658] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-25 19:56:34,908][16658] No checkpoints found
[2023-02-25 19:56:34,908][16658] Did not load from checkpoint, starting from scratch!
[2023-02-25 19:56:34,909][16658] Initialized policy 0 weights for model version 0
[2023-02-25 19:56:34,913][16658] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 19:56:34,919][16658] LearnerWorker_p0 finished initialization!
[2023-02-25 19:56:34,920][05610] Heartbeat connected on LearnerWorker_p0
[2023-02-25 19:56:35,108][16672] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 19:56:35,109][16672] RunningMeanStd input shape: (1,)
[2023-02-25 19:56:35,121][16672] ConvEncoder: input_channels=3
[2023-02-25 19:56:35,222][16672] Conv encoder output size: 512
[2023-02-25 19:56:35,223][16672] Policy head output size: 512
[2023-02-25 19:56:37,310][05610] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 19:56:37,496][05610] Inference worker 0-0 is ready!
[2023-02-25 19:56:37,498][05610] All inference workers are ready! Signal rollout workers to start!
[2023-02-25 19:56:37,652][16683] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,652][16680] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,661][16681] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,666][16677] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,671][16678] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,673][16682] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,689][16684] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:37,688][16679] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 19:56:38,184][16680] Decorrelating experience for 0 frames...
[2023-02-25 19:56:38,538][16680] Decorrelating experience for 32 frames...
[2023-02-25 19:56:38,698][16677] Decorrelating experience for 0 frames...
[2023-02-25 19:56:38,703][16681] Decorrelating experience for 0 frames...
[2023-02-25 19:56:38,705][16683] Decorrelating experience for 0 frames...
[2023-02-25 19:56:39,376][16680] Decorrelating experience for 64 frames...
[2023-02-25 19:56:39,469][16684] Decorrelating experience for 0 frames...
[2023-02-25 19:56:39,793][16677] Decorrelating experience for 32 frames...
[2023-02-25 19:56:39,799][16680] Decorrelating experience for 96 frames...
[2023-02-25 19:56:39,803][16681] Decorrelating experience for 32 frames...
[2023-02-25 19:56:39,810][16683] Decorrelating experience for 32 frames...
[2023-02-25 19:56:40,631][16678] Decorrelating experience for 0 frames...
[2023-02-25 19:56:40,845][16684] Decorrelating experience for 32 frames...
[2023-02-25 19:56:40,872][16679] Decorrelating experience for 0 frames...
[2023-02-25 19:56:41,094][16677] Decorrelating experience for 64 frames...
[2023-02-25 19:56:41,179][16683] Decorrelating experience for 64 frames...
[2023-02-25 19:56:41,647][16679] Decorrelating experience for 32 frames...
[2023-02-25 19:56:41,682][16682] Decorrelating experience for 0 frames...
[2023-02-25 19:56:42,310][05610] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 19:56:42,793][16682] Decorrelating experience for 32 frames...
[2023-02-25 19:56:42,858][16684] Decorrelating experience for 64 frames...
[2023-02-25 19:56:43,068][16679] Decorrelating experience for 64 frames...
[2023-02-25 19:56:43,501][16678] Decorrelating experience for 32 frames...
[2023-02-25 19:56:44,002][16677] Decorrelating experience for 96 frames...
[2023-02-25 19:56:44,215][16684] Decorrelating experience for 96 frames...
[2023-02-25 19:56:44,382][16683] Decorrelating experience for 96 frames...
[2023-02-25 19:56:44,404][16682] Decorrelating experience for 64 frames...
[2023-02-25 19:56:44,453][16679] Decorrelating experience for 96 frames...
[2023-02-25 19:56:45,468][16682] Decorrelating experience for 96 frames...
[2023-02-25 19:56:47,284][16681] Decorrelating experience for 64 frames...
[2023-02-25 19:56:47,310][05610] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 32.4. Samples: 324. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 19:56:47,313][05610] Avg episode reward: [(0, '1.173')]
[2023-02-25 19:56:51,425][16658] Signal inference workers to stop experience collection...
[2023-02-25 19:56:51,446][16672] InferenceWorker_p0-w0: stopping experience collection
[2023-02-25 19:56:51,566][16678] Decorrelating experience for 64 frames...
[2023-02-25 19:56:51,648][16681] Decorrelating experience for 96 frames...
[2023-02-25 19:56:52,153][16678] Decorrelating experience for 96 frames...
[2023-02-25 19:56:52,310][05610] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 114.3. Samples: 1714. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 19:56:52,318][05610] Avg episode reward: [(0, '2.636')]
[2023-02-25 19:56:53,466][16658] Signal inference workers to resume experience collection...
[2023-02-25 19:56:53,467][16672] InferenceWorker_p0-w0: resuming experience collection
[2023-02-25 19:56:57,310][05610] Fps is (10 sec: 2048.1, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 20480. Throughput: 0: 200.1. Samples: 4002. Policy #0 lag: (min: 0.0, avg: 1.0, max: 2.0)
[2023-02-25 19:56:57,315][05610] Avg episode reward: [(0, '3.381')]
[2023-02-25 19:57:00,944][16672] Updated weights for policy 0, policy_version 10 (0.0023)
[2023-02-25 19:57:02,310][05610] Fps is (10 sec: 4505.6, 60 sec: 1802.2, 300 sec: 1802.2). Total num frames: 45056. Throughput: 0: 445.8. Samples: 11144. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 19:57:02,319][05610] Avg episode reward: [(0, '4.172')]
[2023-02-25 19:57:07,310][05610] Fps is (10 sec: 4096.0, 60 sec: 2048.0, 300 sec: 2048.0). Total num frames: 61440. Throughput: 0: 455.5. Samples: 13666. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 19:57:07,313][05610] Avg episode reward: [(0, '4.328')]
[2023-02-25 19:57:12,310][05610] Fps is (10 sec: 2457.6, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 486.4. Samples: 17024. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 19:57:12,312][05610] Avg episode reward: [(0, '4.257')]
[2023-02-25 19:57:14,595][16672] Updated weights for policy 0, policy_version 20 (0.0026)
[2023-02-25 19:57:17,311][05610] Fps is (10 sec: 3276.3, 60 sec: 2355.1, 300 sec: 2355.1). Total num frames: 94208. Throughput: 0: 581.6. Samples: 23264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 19:57:17,317][05610] Avg episode reward: [(0, '4.233')]
[2023-02-25 19:57:22,310][05610] Fps is (10 sec: 4505.6, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 114688. Throughput: 0: 594.8. Samples: 26764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:57:22,312][05610] Avg episode reward: [(0, '4.376')]
[2023-02-25 19:57:22,319][16658] Saving new best policy, reward=4.376!
[2023-02-25 19:57:24,082][16672] Updated weights for policy 0, policy_version 30 (0.0012)
[2023-02-25 19:57:27,310][05610] Fps is (10 sec: 3687.0, 60 sec: 2621.4, 300 sec: 2621.4). Total num frames: 131072. Throughput: 0: 717.1. Samples: 32270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:57:27,319][05610] Avg episode reward: [(0, '4.330')]
[2023-02-25 19:57:32,310][05610] Fps is (10 sec: 3276.8, 60 sec: 2681.0, 300 sec: 2681.0). Total num frames: 147456. Throughput: 0: 810.0. Samples: 36772. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 19:57:32,312][05610] Avg episode reward: [(0, '4.275')]
[2023-02-25 19:57:35,567][16672] Updated weights for policy 0, policy_version 40 (0.0015)
[2023-02-25 19:57:37,310][05610] Fps is (10 sec: 3686.4, 60 sec: 2798.9, 300 sec: 2798.9). Total num frames: 167936. Throughput: 0: 854.2. Samples: 40152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 19:57:37,313][05610] Avg episode reward: [(0, '4.417')]
[2023-02-25 19:57:37,331][16658] Saving new best policy, reward=4.417!
[2023-02-25 19:57:42,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3208.5, 300 sec: 2961.7). Total num frames: 192512. Throughput: 0: 963.6. Samples: 47362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:57:42,316][05610] Avg episode reward: [(0, '4.469')]
[2023-02-25 19:57:42,318][16658] Saving new best policy, reward=4.469!
[2023-02-25 19:57:45,620][16672] Updated weights for policy 0, policy_version 50 (0.0017)
[2023-02-25 19:57:47,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2984.2). Total num frames: 208896. Throughput: 0: 916.8. Samples: 52400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:57:47,315][05610] Avg episode reward: [(0, '4.422')]
[2023-02-25 19:57:52,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3003.7). Total num frames: 225280. Throughput: 0: 911.6. Samples: 54690. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 19:57:52,316][05610] Avg episode reward: [(0, '4.420')]
[2023-02-25 19:57:56,223][16672] Updated weights for policy 0, policy_version 60 (0.0018)
[2023-02-25 19:57:57,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3123.2). Total num frames: 249856. Throughput: 0: 979.8. Samples: 61116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:57:57,313][05610] Avg episode reward: [(0, '4.714')]
[2023-02-25 19:57:57,325][16658] Saving new best policy, reward=4.714!
[2023-02-25 19:58:02,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3180.4). Total num frames: 270336. Throughput: 0: 999.4. Samples: 68236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:58:02,317][05610] Avg episode reward: [(0, '4.653')]
[2023-02-25 19:58:06,681][16672] Updated weights for policy 0, policy_version 70 (0.0012)
[2023-02-25 19:58:07,310][05610] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3185.8). Total num frames: 286720. Throughput: 0: 974.2. Samples: 70604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:58:07,312][05610] Avg episode reward: [(0, '4.542')]
[2023-02-25 19:58:07,324][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000070_286720.pth...
[2023-02-25 19:58:12,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3190.6). Total num frames: 303104. Throughput: 0: 948.0. Samples: 74928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:58:12,319][05610] Avg episode reward: [(0, '4.463')]
[2023-02-25 19:58:17,310][05610] Fps is (10 sec: 3686.5, 60 sec: 3823.0, 300 sec: 3235.8). Total num frames: 323584. Throughput: 0: 992.1. Samples: 81416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:58:17,320][05610] Avg episode reward: [(0, '4.514')]
[2023-02-25 19:58:17,552][16672] Updated weights for policy 0, policy_version 80 (0.0020)
[2023-02-25 19:58:22,310][05610] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3315.8). Total num frames: 348160. Throughput: 0: 990.8. Samples: 84740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:58:22,318][05610] Avg episode reward: [(0, '4.418')]
[2023-02-25 19:58:27,312][05610] Fps is (10 sec: 3685.6, 60 sec: 3822.8, 300 sec: 3276.7). Total num frames: 360448. Throughput: 0: 948.4. Samples: 90044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:58:27,315][05610] Avg episode reward: [(0, '4.321')]
[2023-02-25 19:58:28,792][16672] Updated weights for policy 0, policy_version 90 (0.0028)
[2023-02-25 19:58:32,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3312.4). Total num frames: 380928. Throughput: 0: 942.4. Samples: 94810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:58:32,316][05610] Avg episode reward: [(0, '4.526')]
[2023-02-25 19:58:37,310][05610] Fps is (10 sec: 4096.9, 60 sec: 3891.2, 300 sec: 3345.1). Total num frames: 401408. Throughput: 0: 971.2. Samples: 98392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:58:37,320][05610] Avg episode reward: [(0, '4.778')]
[2023-02-25 19:58:37,327][16658] Saving new best policy, reward=4.778!
[2023-02-25 19:58:38,391][16672] Updated weights for policy 0, policy_version 100 (0.0025)
[2023-02-25 19:58:42,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3407.9). Total num frames: 425984. Throughput: 0: 985.9. Samples: 105480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:58:42,322][05610] Avg episode reward: [(0, '4.555')]
[2023-02-25 19:58:47,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3371.3). Total num frames: 438272. Throughput: 0: 937.0. Samples: 110400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:58:47,316][05610] Avg episode reward: [(0, '4.449')]
[2023-02-25 19:58:50,266][16672] Updated weights for policy 0, policy_version 110 (0.0016)
[2023-02-25 19:58:52,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3398.2). Total num frames: 458752. Throughput: 0: 934.4. Samples: 112652. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:58:52,313][05610] Avg episode reward: [(0, '4.433')]
[2023-02-25 19:58:57,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3423.1). Total num frames: 479232. Throughput: 0: 984.2. Samples: 119216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:58:57,316][05610] Avg episode reward: [(0, '4.375')]
[2023-02-25 19:58:59,018][16672] Updated weights for policy 0, policy_version 120 (0.0018)
[2023-02-25 19:59:02,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3474.5). Total num frames: 503808. Throughput: 0: 998.1. Samples: 126330. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:59:02,318][05610] Avg episode reward: [(0, '4.333')]
[2023-02-25 19:59:07,313][05610] Fps is (10 sec: 4094.5, 60 sec: 3891.0, 300 sec: 3467.9). Total num frames: 520192. Throughput: 0: 975.1. Samples: 128624. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:59:07,321][05610] Avg episode reward: [(0, '4.509')]
[2023-02-25 19:59:11,019][16672] Updated weights for policy 0, policy_version 130 (0.0020)
[2023-02-25 19:59:12,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3461.8). Total num frames: 536576. Throughput: 0: 962.0. Samples: 133334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:59:12,316][05610] Avg episode reward: [(0, '4.597')]
[2023-02-25 19:59:17,310][05610] Fps is (10 sec: 4097.5, 60 sec: 3959.5, 300 sec: 3507.2). Total num frames: 561152. Throughput: 0: 1013.0. Samples: 140394. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:59:17,317][05610] Avg episode reward: [(0, '4.607')]
[2023-02-25 19:59:19,490][16672] Updated weights for policy 0, policy_version 140 (0.0016)
[2023-02-25 19:59:22,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3525.0). Total num frames: 581632. Throughput: 0: 1013.2. Samples: 143986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:59:22,313][05610] Avg episode reward: [(0, '4.550')]
[2023-02-25 19:59:27,315][05610] Fps is (10 sec: 3684.4, 60 sec: 3959.2, 300 sec: 3517.6). Total num frames: 598016. Throughput: 0: 972.3. Samples: 149238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:59:27,318][05610] Avg episode reward: [(0, '4.530')]
[2023-02-25 19:59:31,547][16672] Updated weights for policy 0, policy_version 150 (0.0047)
[2023-02-25 19:59:32,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3510.9). Total num frames: 614400. Throughput: 0: 972.8. Samples: 154176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 19:59:32,318][05610] Avg episode reward: [(0, '4.491')]
[2023-02-25 19:59:37,310][05610] Fps is (10 sec: 3278.6, 60 sec: 3822.9, 300 sec: 3504.4). Total num frames: 630784. Throughput: 0: 976.3. Samples: 156586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 19:59:37,316][05610] Avg episode reward: [(0, '4.393')]
[2023-02-25 19:59:42,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3498.2). Total num frames: 647168. Throughput: 0: 930.8. Samples: 161102. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 19:59:42,318][05610] Avg episode reward: [(0, '4.436')]
[2023-02-25 19:59:45,261][16672] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-02-25 19:59:47,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3470.8). Total num frames: 659456. Throughput: 0: 869.2. Samples: 165442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 19:59:47,317][05610] Avg episode reward: [(0, '4.605')]
[2023-02-25 19:59:52,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3465.8). Total num frames: 675840. Throughput: 0: 869.6. Samples: 167754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 19:59:52,316][05610] Avg episode reward: [(0, '4.910')]
[2023-02-25 19:59:52,320][16658] Saving new best policy, reward=4.910!
[2023-02-25 19:59:55,939][16672] Updated weights for policy 0, policy_version 170 (0.0040)
[2023-02-25 19:59:57,310][05610] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3502.1). Total num frames: 700416. Throughput: 0: 910.0. Samples: 174286. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 19:59:57,318][05610] Avg episode reward: [(0, '4.792')]
[2023-02-25 20:00:02,315][05610] Fps is (10 sec: 4912.7, 60 sec: 3686.1, 300 sec: 3536.5). Total num frames: 724992. Throughput: 0: 909.8. Samples: 181338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:00:02,317][05610] Avg episode reward: [(0, '4.700')]
[2023-02-25 20:00:06,161][16672] Updated weights for policy 0, policy_version 180 (0.0039)
[2023-02-25 20:00:07,313][05610] Fps is (10 sec: 3685.2, 60 sec: 3618.2, 300 sec: 3510.8). Total num frames: 737280. Throughput: 0: 879.4. Samples: 183560. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:00:07,328][05610] Avg episode reward: [(0, '4.680')]
[2023-02-25 20:00:07,351][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth...
[2023-02-25 20:00:12,310][05610] Fps is (10 sec: 3278.5, 60 sec: 3686.4, 300 sec: 3524.5). Total num frames: 757760. Throughput: 0: 866.0. Samples: 188202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:00:12,312][05610] Avg episode reward: [(0, '4.814')]
[2023-02-25 20:00:16,368][16672] Updated weights for policy 0, policy_version 190 (0.0019)
[2023-02-25 20:00:17,310][05610] Fps is (10 sec: 4097.4, 60 sec: 3618.1, 300 sec: 3537.5). Total num frames: 778240. Throughput: 0: 914.6. Samples: 195332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:00:17,317][05610] Avg episode reward: [(0, '5.091')]
[2023-02-25 20:00:17,335][16658] Saving new best policy, reward=5.091!
[2023-02-25 20:00:22,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3568.1). Total num frames: 802816. Throughput: 0: 940.7. Samples: 198918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:00:22,318][05610] Avg episode reward: [(0, '5.166')]
[2023-02-25 20:00:22,323][16658] Saving new best policy, reward=5.166!
[2023-02-25 20:00:27,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3618.5, 300 sec: 3543.9). Total num frames: 815104. Throughput: 0: 953.8. Samples: 204024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:00:27,321][05610] Avg episode reward: [(0, '5.027')]
[2023-02-25 20:00:27,366][16672] Updated weights for policy 0, policy_version 200 (0.0012)
[2023-02-25 20:00:32,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3555.7). Total num frames: 835584. Throughput: 0: 970.3. Samples: 209106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:00:32,314][05610] Avg episode reward: [(0, '5.076')]
[2023-02-25 20:00:36,952][16672] Updated weights for policy 0, policy_version 210 (0.0012)
[2023-02-25 20:00:37,311][05610] Fps is (10 sec: 4504.9, 60 sec: 3822.8, 300 sec: 3584.0). Total num frames: 860160. Throughput: 0: 998.6. Samples: 212694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:00:37,315][05610] Avg episode reward: [(0, '5.506')]
[2023-02-25 20:00:37,327][16658] Saving new best policy, reward=5.506!
[2023-02-25 20:00:42,310][05610] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3594.4). Total num frames: 880640. Throughput: 0: 1012.6. Samples: 219854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:00:42,315][05610] Avg episode reward: [(0, '6.021')]
[2023-02-25 20:00:42,327][16658] Saving new best policy, reward=6.021!
[2023-02-25 20:00:47,310][05610] Fps is (10 sec: 3687.0, 60 sec: 3959.5, 300 sec: 3588.1). Total num frames: 897024. Throughput: 0: 953.8. Samples: 224256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:00:47,323][05610] Avg episode reward: [(0, '5.792')]
[2023-02-25 20:00:48,528][16672] Updated weights for policy 0, policy_version 220 (0.0025)
[2023-02-25 20:00:52,310][05610] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3582.0). Total num frames: 913408. Throughput: 0: 957.3. Samples: 226636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:00:52,317][05610] Avg episode reward: [(0, '5.910')]
[2023-02-25 20:00:57,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3607.6). Total num frames: 937984. Throughput: 0: 1010.0. Samples: 233654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:00:57,319][05610] Avg episode reward: [(0, '5.839')]
[2023-02-25 20:00:57,567][16672] Updated weights for policy 0, policy_version 230 (0.0014)
[2023-02-25 20:01:02,312][05610] Fps is (10 sec: 4504.6, 60 sec: 3891.4, 300 sec: 3616.8). Total num frames: 958464. Throughput: 0: 997.3. Samples: 240214. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:01:02,320][05610] Avg episode reward: [(0, '5.822')]
[2023-02-25 20:01:07,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.7, 300 sec: 3610.5). Total num frames: 974848. Throughput: 0: 968.5. Samples: 242502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:01:07,317][05610] Avg episode reward: [(0, '6.148')]
[2023-02-25 20:01:07,326][16658] Saving new best policy, reward=6.148!
[2023-02-25 20:01:09,538][16672] Updated weights for policy 0, policy_version 240 (0.0024)
[2023-02-25 20:01:12,310][05610] Fps is (10 sec: 3687.2, 60 sec: 3959.5, 300 sec: 3619.4). Total num frames: 995328. Throughput: 0: 963.3. Samples: 247374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:01:12,318][05610] Avg episode reward: [(0, '6.083')]
[2023-02-25 20:01:17,310][05610] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 3642.5). Total num frames: 1019904. Throughput: 0: 1010.4. Samples: 254574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:01:17,317][05610] Avg episode reward: [(0, '6.314')]
[2023-02-25 20:01:17,328][16658] Saving new best policy, reward=6.314!
[2023-02-25 20:01:18,106][16672] Updated weights for policy 0, policy_version 250 (0.0021)
[2023-02-25 20:01:22,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3636.1). Total num frames: 1036288. Throughput: 0: 1011.5. Samples: 258208. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:01:22,312][05610] Avg episode reward: [(0, '6.462')]
[2023-02-25 20:01:22,314][16658] Saving new best policy, reward=6.462!
[2023-02-25 20:01:27,310][05610] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3629.9). Total num frames: 1052672. Throughput: 0: 955.8. Samples: 262864. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 20:01:27,315][05610] Avg episode reward: [(0, '6.040')]
[2023-02-25 20:01:30,138][16672] Updated weights for policy 0, policy_version 260 (0.0015)
[2023-02-25 20:01:32,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3637.8). Total num frames: 1073152. Throughput: 0: 981.7. Samples: 268432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:01:32,315][05610] Avg episode reward: [(0, '5.881')]
[2023-02-25 20:01:37,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3721.1). Total num frames: 1097728. Throughput: 0: 1008.4. Samples: 272014. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:01:37,312][05610] Avg episode reward: [(0, '6.255')]
[2023-02-25 20:01:38,546][16672] Updated weights for policy 0, policy_version 270 (0.0018)
[2023-02-25 20:01:42,310][05610] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1118208. Throughput: 0: 1005.9. Samples: 278918. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:01:42,312][05610] Avg episode reward: [(0, '6.261')]
[2023-02-25 20:01:47,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 1130496. Throughput: 0: 958.5. Samples: 283346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:01:47,312][05610] Avg episode reward: [(0, '6.404')]
[2023-02-25 20:01:50,683][16672] Updated weights for policy 0, policy_version 280 (0.0033)
[2023-02-25 20:01:52,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1150976. Throughput: 0: 960.4. Samples: 285722. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:01:52,319][05610] Avg episode reward: [(0, '6.197')]
[2023-02-25 20:01:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1175552. Throughput: 0: 1014.8. Samples: 293040. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:01:57,313][05610] Avg episode reward: [(0, '6.628')]
[2023-02-25 20:01:57,332][16658] Saving new best policy, reward=6.628!
[2023-02-25 20:01:59,255][16672] Updated weights for policy 0, policy_version 290 (0.0021)
[2023-02-25 20:02:02,314][05610] Fps is (10 sec: 4503.6, 60 sec: 3959.3, 300 sec: 3846.0). Total num frames: 1196032. Throughput: 0: 994.0. Samples: 299310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:02:02,322][05610] Avg episode reward: [(0, '6.613')]
[2023-02-25 20:02:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1208320. Throughput: 0: 963.1. Samples: 301546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:02:07,320][05610] Avg episode reward: [(0, '6.753')]
[2023-02-25 20:02:07,337][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000295_1208320.pth...
[2023-02-25 20:02:07,493][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000070_286720.pth
[2023-02-25 20:02:07,510][16658] Saving new best policy, reward=6.753!
[2023-02-25 20:02:11,479][16672] Updated weights for policy 0, policy_version 300 (0.0019)
[2023-02-25 20:02:12,310][05610] Fps is (10 sec: 3688.1, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1232896. Throughput: 0: 974.2. Samples: 306704. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:12,313][05610] Avg episode reward: [(0, '6.453')]
[2023-02-25 20:02:17,310][05610] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1257472. Throughput: 0: 1010.2. Samples: 313890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:02:17,317][05610] Avg episode reward: [(0, '7.007')]
[2023-02-25 20:02:17,329][16658] Saving new best policy, reward=7.007!
[2023-02-25 20:02:20,572][16672] Updated weights for policy 0, policy_version 310 (0.0013)
[2023-02-25 20:02:22,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1273856. Throughput: 0: 1005.1. Samples: 317242. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:02:22,312][05610] Avg episode reward: [(0, '7.896')]
[2023-02-25 20:02:22,318][16658] Saving new best policy, reward=7.896!
[2023-02-25 20:02:27,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1286144. Throughput: 0: 952.7. Samples: 321788. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:27,316][05610] Avg episode reward: [(0, '8.331')]
[2023-02-25 20:02:27,334][16658] Saving new best policy, reward=8.331!
[2023-02-25 20:02:32,085][16672] Updated weights for policy 0, policy_version 320 (0.0047)
[2023-02-25 20:02:32,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1310720. Throughput: 0: 982.6. Samples: 327564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:32,312][05610] Avg episode reward: [(0, '9.245')]
[2023-02-25 20:02:32,317][16658] Saving new best policy, reward=9.245!
[2023-02-25 20:02:37,310][05610] Fps is (10 sec: 4915.1, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1335296. Throughput: 0: 1009.2. Samples: 331138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:37,312][05610] Avg episode reward: [(0, '9.557')]
[2023-02-25 20:02:37,330][16658] Saving new best policy, reward=9.557!
[2023-02-25 20:02:41,612][16672] Updated weights for policy 0, policy_version 330 (0.0022)
[2023-02-25 20:02:42,312][05610] Fps is (10 sec: 4095.2, 60 sec: 3891.1, 300 sec: 3873.8). Total num frames: 1351680. Throughput: 0: 992.9. Samples: 337722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:42,319][05610] Avg episode reward: [(0, '9.212')]
[2023-02-25 20:02:47,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1368064. Throughput: 0: 955.5. Samples: 342304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:02:47,319][05610] Avg episode reward: [(0, '9.469')]
[2023-02-25 20:02:52,310][05610] Fps is (10 sec: 3687.1, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1388544. Throughput: 0: 963.2. Samples: 344892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:02:52,320][05610] Avg episode reward: [(0, '9.444')]
[2023-02-25 20:02:52,668][16672] Updated weights for policy 0, policy_version 340 (0.0020)
[2023-02-25 20:02:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1413120. Throughput: 0: 1010.0. Samples: 352154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:02:57,317][05610] Avg episode reward: [(0, '10.820')]
[2023-02-25 20:02:57,326][16658] Saving new best policy, reward=10.820!
[2023-02-25 20:03:02,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3891.5, 300 sec: 3873.8). Total num frames: 1429504. Throughput: 0: 987.4. Samples: 358324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:03:02,312][05610] Avg episode reward: [(0, '11.404')]
[2023-02-25 20:03:02,320][16658] Saving new best policy, reward=11.404!
[2023-02-25 20:03:02,691][16672] Updated weights for policy 0, policy_version 350 (0.0026)
[2023-02-25 20:03:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1445888. Throughput: 0: 961.8. Samples: 360522. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:03:07,312][05610] Avg episode reward: [(0, '11.572')]
[2023-02-25 20:03:07,336][16658] Saving new best policy, reward=11.572!
[2023-02-25 20:03:12,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1466368. Throughput: 0: 981.8. Samples: 365968. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-25 20:03:12,317][05610] Avg episode reward: [(0, '11.467')]
[2023-02-25 20:03:13,333][16672] Updated weights for policy 0, policy_version 360 (0.0013)
[2023-02-25 20:03:17,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1490944. Throughput: 0: 1015.8. Samples: 373274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:03:17,325][05610] Avg episode reward: [(0, '12.384')]
[2023-02-25 20:03:17,411][16658] Saving new best policy, reward=12.384!
[2023-02-25 20:03:22,312][05610] Fps is (10 sec: 4504.7, 60 sec: 3959.3, 300 sec: 3901.6). Total num frames: 1511424. Throughput: 0: 1006.8. Samples: 376446. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:03:22,320][05610] Avg episode reward: [(0, '13.537')]
[2023-02-25 20:03:22,323][16658] Saving new best policy, reward=13.537!
[2023-02-25 20:03:23,543][16672] Updated weights for policy 0, policy_version 370 (0.0012)
[2023-02-25 20:03:27,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 1523712. Throughput: 0: 962.3. Samples: 381024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:03:27,312][05610] Avg episode reward: [(0, '14.218')]
[2023-02-25 20:03:27,418][16658] Saving new best policy, reward=14.218!
[2023-02-25 20:03:32,310][05610] Fps is (10 sec: 3687.1, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 1548288. Throughput: 0: 990.0. Samples: 386852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:03:32,319][05610] Avg episode reward: [(0, '15.737')]
[2023-02-25 20:03:32,325][16658] Saving new best policy, reward=15.737!
[2023-02-25 20:03:34,127][16672] Updated weights for policy 0, policy_version 380 (0.0018)
[2023-02-25 20:03:37,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1568768. Throughput: 0: 1010.4. Samples: 390360. Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0)
[2023-02-25 20:03:37,312][05610] Avg episode reward: [(0, '16.011')]
[2023-02-25 20:03:37,328][16658] Saving new best policy, reward=16.011!
[2023-02-25 20:03:42,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3901.6). Total num frames: 1589248. Throughput: 0: 991.6. Samples: 396778. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 20:03:42,316][05610] Avg episode reward: [(0, '14.512')]
[2023-02-25 20:03:44,893][16672] Updated weights for policy 0, policy_version 390 (0.0014)
[2023-02-25 20:03:47,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1601536. Throughput: 0: 953.6. Samples: 401234. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:03:47,317][05610] Avg episode reward: [(0, '14.906')]
[2023-02-25 20:03:52,310][05610] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 1626112. Throughput: 0: 966.8. Samples: 404030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:03:52,319][05610] Avg episode reward: [(0, '14.459')]
[2023-02-25 20:03:54,505][16672] Updated weights for policy 0, policy_version 400 (0.0023)
[2023-02-25 20:03:57,310][05610] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 1650688. Throughput: 0: 1009.7. Samples: 411404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:03:57,318][05610] Avg episode reward: [(0, '13.899')]
[2023-02-25 20:04:02,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3887.8). Total num frames: 1667072. Throughput: 0: 976.0. Samples: 417194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:02,316][05610] Avg episode reward: [(0, '14.212')]
[2023-02-25 20:04:06,480][16672] Updated weights for policy 0, policy_version 410 (0.0019)
[2023-02-25 20:04:07,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1679360. Throughput: 0: 952.2. Samples: 419292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:07,312][05610] Avg episode reward: [(0, '14.401')]
[2023-02-25 20:04:07,331][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000410_1679360.pth...
[2023-02-25 20:04:07,490][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000180_737280.pth
[2023-02-25 20:04:12,310][05610] Fps is (10 sec: 2457.6, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 1691648. Throughput: 0: 931.6. Samples: 422948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:12,312][05610] Avg episode reward: [(0, '14.525')]
[2023-02-25 20:04:17,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3832.2). Total num frames: 1712128. Throughput: 0: 916.4. Samples: 428090. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:17,313][05610] Avg episode reward: [(0, '15.567')]
[2023-02-25 20:04:18,615][16672] Updated weights for policy 0, policy_version 420 (0.0019)
[2023-02-25 20:04:22,316][05610] Fps is (10 sec: 4093.5, 60 sec: 3686.1, 300 sec: 3846.1). Total num frames: 1732608. Throughput: 0: 918.9. Samples: 431718. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:04:22,318][05610] Avg episode reward: [(0, '16.156')]
[2023-02-25 20:04:22,321][16658] Saving new best policy, reward=16.156!
[2023-02-25 20:04:27,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 1748992. Throughput: 0: 888.4. Samples: 436758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:04:27,325][05610] Avg episode reward: [(0, '18.121')]
[2023-02-25 20:04:27,344][16658] Saving new best policy, reward=18.121!
[2023-02-25 20:04:30,731][16672] Updated weights for policy 0, policy_version 430 (0.0013)
[2023-02-25 20:04:32,310][05610] Fps is (10 sec: 3278.8, 60 sec: 3618.1, 300 sec: 3846.1). Total num frames: 1765376. Throughput: 0: 904.8. Samples: 441952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:32,318][05610] Avg episode reward: [(0, '18.396')]
[2023-02-25 20:04:32,324][16658] Saving new best policy, reward=18.396!
[2023-02-25 20:04:37,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3873.8). Total num frames: 1789952. Throughput: 0: 920.0. Samples: 445432. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:37,316][05610] Avg episode reward: [(0, '17.971')]
[2023-02-25 20:04:39,278][16672] Updated weights for policy 0, policy_version 440 (0.0011)
[2023-02-25 20:04:42,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3901.6). Total num frames: 1810432. Throughput: 0: 916.4. Samples: 452644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:04:42,318][05610] Avg episode reward: [(0, '18.425')]
[2023-02-25 20:04:42,326][16658] Saving new best policy, reward=18.425!
[2023-02-25 20:04:47,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3901.6). Total num frames: 1826816. Throughput: 0: 886.9. Samples: 457106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:47,317][05610] Avg episode reward: [(0, '17.283')]
[2023-02-25 20:04:51,444][16672] Updated weights for policy 0, policy_version 450 (0.0027)
[2023-02-25 20:04:52,316][05610] Fps is (10 sec: 3274.9, 60 sec: 3617.8, 300 sec: 3873.8). Total num frames: 1843200. Throughput: 0: 889.8. Samples: 459340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:04:52,322][05610] Avg episode reward: [(0, '17.335')]
[2023-02-25 20:04:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3887.8). Total num frames: 1871872. Throughput: 0: 965.6. Samples: 466402. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:04:57,321][05610] Avg episode reward: [(0, '17.392')]
[2023-02-25 20:04:59,689][16672] Updated weights for policy 0, policy_version 460 (0.0020)
[2023-02-25 20:05:02,314][05610] Fps is (10 sec: 4915.8, 60 sec: 3754.4, 300 sec: 3915.5). Total num frames: 1892352. Throughput: 0: 1001.3. Samples: 473152. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:05:02,320][05610] Avg episode reward: [(0, '16.453')]
[2023-02-25 20:05:07,311][05610] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3887.7). Total num frames: 1904640. Throughput: 0: 970.6. Samples: 475392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:05:07,315][05610] Avg episode reward: [(0, '17.200')]
[2023-02-25 20:05:11,914][16672] Updated weights for policy 0, policy_version 470 (0.0019)
[2023-02-25 20:05:12,310][05610] Fps is (10 sec: 3278.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 1925120. Throughput: 0: 965.2. Samples: 480192. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:05:12,315][05610] Avg episode reward: [(0, '15.989')]
[2023-02-25 20:05:17,310][05610] Fps is (10 sec: 4506.1, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 1949696. Throughput: 0: 1013.2. Samples: 487546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:05:17,317][05610] Avg episode reward: [(0, '16.533')]
[2023-02-25 20:05:20,203][16672] Updated weights for policy 0, policy_version 480 (0.0016)
[2023-02-25 20:05:22,313][05610] Fps is (10 sec: 4504.1, 60 sec: 3959.7, 300 sec: 3915.5). Total num frames: 1970176. Throughput: 0: 1016.2. Samples: 491164. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:05:22,322][05610] Avg episode reward: [(0, '16.612')]
[2023-02-25 20:05:27,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 1986560. Throughput: 0: 967.2. Samples: 496166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:05:27,312][05610] Avg episode reward: [(0, '17.416')]
[2023-02-25 20:05:32,286][16672] Updated weights for policy 0, policy_version 490 (0.0013)
[2023-02-25 20:05:32,310][05610] Fps is (10 sec: 3687.6, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 2007040. Throughput: 0: 986.4. Samples: 501496. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-25 20:05:32,314][05610] Avg episode reward: [(0, '18.864')]
[2023-02-25 20:05:32,318][16658] Saving new best policy, reward=18.864!
[2023-02-25 20:05:37,310][05610] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3901.6). Total num frames: 2031616. Throughput: 0: 1016.5. Samples: 505078. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:05:37,311][05610] Avg episode reward: [(0, '17.903')]
[2023-02-25 20:05:41,036][16672] Updated weights for policy 0, policy_version 500 (0.0012)
[2023-02-25 20:05:42,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 2048000. Throughput: 0: 1015.1. Samples: 512082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:05:42,315][05610] Avg episode reward: [(0, '18.410')]
[2023-02-25 20:05:47,311][05610] Fps is (10 sec: 3276.4, 60 sec: 3959.4, 300 sec: 3901.6). Total num frames: 2064384. Throughput: 0: 966.7. Samples: 516652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:05:47,315][05610] Avg episode reward: [(0, '17.966')]
[2023-02-25 20:05:52,310][05610] Fps is (10 sec: 3686.4, 60 sec: 4028.1, 300 sec: 3887.7). Total num frames: 2084864. Throughput: 0: 968.4. Samples: 518968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:05:52,318][05610] Avg episode reward: [(0, '17.815')]
[2023-02-25 20:05:52,686][16672] Updated weights for policy 0, policy_version 510 (0.0021)
[2023-02-25 20:05:57,310][05610] Fps is (10 sec: 4506.1, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 2109440. Throughput: 0: 1018.9. Samples: 526044. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:05:57,312][05610] Avg episode reward: [(0, '18.001')]
[2023-02-25 20:06:02,207][16672] Updated weights for policy 0, policy_version 520 (0.0011)
[2023-02-25 20:06:02,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.8, 300 sec: 3915.5). Total num frames: 2129920. Throughput: 0: 998.9. Samples: 532496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:06:02,311][05610] Avg episode reward: [(0, '18.611')]
[2023-02-25 20:06:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 2142208. Throughput: 0: 969.8. Samples: 534802. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:06:07,314][05610] Avg episode reward: [(0, '18.851')]
[2023-02-25 20:06:07,325][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000523_2142208.pth...
[2023-02-25 20:06:07,490][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000295_1208320.pth
[2023-02-25 20:06:12,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2162688. Throughput: 0: 966.8. Samples: 539670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:06:12,320][05610] Avg episode reward: [(0, '18.781')]
[2023-02-25 20:06:13,659][16672] Updated weights for policy 0, policy_version 530 (0.0017)
[2023-02-25 20:06:17,312][05610] Fps is (10 sec: 4504.8, 60 sec: 3959.3, 300 sec: 3901.6). Total num frames: 2187264. Throughput: 0: 1008.0. Samples: 546860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:06:17,313][05610] Avg episode reward: [(0, '18.415')]
[2023-02-25 20:06:22,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.7, 300 sec: 3915.5). Total num frames: 2207744. Throughput: 0: 1006.4. Samples: 550368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:06:22,314][05610] Avg episode reward: [(0, '17.522')]
[2023-02-25 20:06:23,736][16672] Updated weights for policy 0, policy_version 540 (0.0011)
[2023-02-25 20:06:27,317][05610] Fps is (10 sec: 3275.1, 60 sec: 3890.7, 300 sec: 3887.6). Total num frames: 2220032. Throughput: 0: 953.4. Samples: 554992. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:06:27,323][05610] Avg episode reward: [(0, '18.107')]
[2023-02-25 20:06:32,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2240512. Throughput: 0: 973.0. Samples: 560434. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:06:32,325][05610] Avg episode reward: [(0, '18.113')]
[2023-02-25 20:06:34,300][16672] Updated weights for policy 0, policy_version 550 (0.0029)
[2023-02-25 20:06:37,310][05610] Fps is (10 sec: 4508.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2265088. Throughput: 0: 1001.3. Samples: 564026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:06:37,318][05610] Avg episode reward: [(0, '18.420')]
[2023-02-25 20:06:42,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 2285568. Throughput: 0: 999.8. Samples: 571034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:06:42,315][05610] Avg episode reward: [(0, '18.512')]
[2023-02-25 20:06:44,576][16672] Updated weights for policy 0, policy_version 560 (0.0024)
[2023-02-25 20:06:47,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 2301952. Throughput: 0: 958.2. Samples: 575614. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 20:06:47,312][05610] Avg episode reward: [(0, '18.046')]
[2023-02-25 20:06:52,310][05610] Fps is (10 sec: 3686.3, 60 sec: 3959.4, 300 sec: 3887.7). Total num frames: 2322432. Throughput: 0: 959.5. Samples: 577980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:06:52,320][05610] Avg episode reward: [(0, '18.854')]
[2023-02-25 20:06:54,717][16672] Updated weights for policy 0, policy_version 570 (0.0014)
[2023-02-25 20:06:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3901.7). Total num frames: 2347008. Throughput: 0: 1011.2. Samples: 585172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:06:57,320][05610] Avg episode reward: [(0, '18.751')]
[2023-02-25 20:07:02,311][05610] Fps is (10 sec: 4095.6, 60 sec: 3891.1, 300 sec: 3915.5). Total num frames: 2363392. Throughput: 0: 995.2. Samples: 591644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:02,315][05610] Avg episode reward: [(0, '19.322')]
[2023-02-25 20:07:02,320][16658] Saving new best policy, reward=19.322!
[2023-02-25 20:07:05,396][16672] Updated weights for policy 0, policy_version 580 (0.0012)
[2023-02-25 20:07:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 2379776. Throughput: 0: 967.0. Samples: 593884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:07:07,314][05610] Avg episode reward: [(0, '19.395')]
[2023-02-25 20:07:07,338][16658] Saving new best policy, reward=19.395!
[2023-02-25 20:07:12,310][05610] Fps is (10 sec: 3686.8, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2400256. Throughput: 0: 976.3. Samples: 598918. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 20:07:12,314][05610] Avg episode reward: [(0, '19.644')]
[2023-02-25 20:07:12,319][16658] Saving new best policy, reward=19.644!
[2023-02-25 20:07:15,500][16672] Updated weights for policy 0, policy_version 590 (0.0018)
[2023-02-25 20:07:17,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3901.6). Total num frames: 2424832. Throughput: 0: 1014.5. Samples: 606088. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:07:17,312][05610] Avg episode reward: [(0, '20.029')]
[2023-02-25 20:07:17,328][16658] Saving new best policy, reward=20.029!
[2023-02-25 20:07:22,316][05610] Fps is (10 sec: 4093.6, 60 sec: 3890.8, 300 sec: 3915.4). Total num frames: 2441216. Throughput: 0: 1012.5. Samples: 609596. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:07:22,318][05610] Avg episode reward: [(0, '20.652')]
[2023-02-25 20:07:22,324][16658] Saving new best policy, reward=20.652!
[2023-02-25 20:07:26,576][16672] Updated weights for policy 0, policy_version 600 (0.0011)
[2023-02-25 20:07:27,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.9, 300 sec: 3887.7). Total num frames: 2457600. Throughput: 0: 956.9. Samples: 614096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:27,314][05610] Avg episode reward: [(0, '19.958')]
[2023-02-25 20:07:32,310][05610] Fps is (10 sec: 3688.6, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2478080. Throughput: 0: 982.0. Samples: 619802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:32,311][05610] Avg episode reward: [(0, '20.920')]
[2023-02-25 20:07:32,321][16658] Saving new best policy, reward=20.920!
[2023-02-25 20:07:36,262][16672] Updated weights for policy 0, policy_version 610 (0.0017)
[2023-02-25 20:07:37,312][05610] Fps is (10 sec: 4504.4, 60 sec: 3959.3, 300 sec: 3901.6). Total num frames: 2502656. Throughput: 0: 1008.2. Samples: 623350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:07:37,317][05610] Avg episode reward: [(0, '22.232')]
[2023-02-25 20:07:37,325][16658] Saving new best policy, reward=22.232!
[2023-02-25 20:07:42,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 2519040. Throughput: 0: 994.9. Samples: 629944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:07:42,312][05610] Avg episode reward: [(0, '21.163')]
[2023-02-25 20:07:47,310][05610] Fps is (10 sec: 3277.4, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2535424. Throughput: 0: 949.9. Samples: 634390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:47,313][05610] Avg episode reward: [(0, '20.871')]
[2023-02-25 20:07:47,860][16672] Updated weights for policy 0, policy_version 620 (0.0014)
[2023-02-25 20:07:52,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2555904. Throughput: 0: 955.3. Samples: 636872. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:52,311][05610] Avg episode reward: [(0, '20.216')]
[2023-02-25 20:07:56,748][16672] Updated weights for policy 0, policy_version 630 (0.0017)
[2023-02-25 20:07:57,310][05610] Fps is (10 sec: 4506.0, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 2580480. Throughput: 0: 1004.1. Samples: 644102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:07:57,311][05610] Avg episode reward: [(0, '19.724')]
[2023-02-25 20:08:02,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3915.5). Total num frames: 2600960. Throughput: 0: 983.2. Samples: 650332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:08:02,314][05610] Avg episode reward: [(0, '19.071')]
[2023-02-25 20:08:07,310][05610] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 2613248. Throughput: 0: 955.8. Samples: 652602. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:08:07,319][05610] Avg episode reward: [(0, '19.162')]
[2023-02-25 20:08:07,330][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_2613248.pth...
[2023-02-25 20:08:07,466][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000410_1679360.pth
[2023-02-25 20:08:08,764][16672] Updated weights for policy 0, policy_version 640 (0.0023)
[2023-02-25 20:08:12,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3887.7). Total num frames: 2637824. Throughput: 0: 973.8. Samples: 657918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:08:12,311][05610] Avg episode reward: [(0, '18.238')]
[2023-02-25 20:08:17,311][05610] Fps is (10 sec: 4504.9, 60 sec: 3891.1, 300 sec: 3887.7). Total num frames: 2658304. Throughput: 0: 1008.5. Samples: 665186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:08:17,313][05610] Avg episode reward: [(0, '18.470')]
[2023-02-25 20:08:17,332][16672] Updated weights for policy 0, policy_version 650 (0.0011)
[2023-02-25 20:08:22,310][05610] Fps is (10 sec: 4095.9, 60 sec: 3959.8, 300 sec: 3915.5). Total num frames: 2678784. Throughput: 0: 1003.7. Samples: 668516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:08:22,312][05610] Avg episode reward: [(0, '19.793')]
[2023-02-25 20:08:27,310][05610] Fps is (10 sec: 2867.7, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2686976. Throughput: 0: 934.0. Samples: 671974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:08:27,323][05610] Avg episode reward: [(0, '20.119')]
[2023-02-25 20:08:32,312][05610] Fps is (10 sec: 2047.4, 60 sec: 3686.2, 300 sec: 3832.2). Total num frames: 2699264. Throughput: 0: 911.9. Samples: 675428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:08:32,321][05610] Avg episode reward: [(0, '21.254')]
[2023-02-25 20:08:32,631][16672] Updated weights for policy 0, policy_version 660 (0.0022)
[2023-02-25 20:08:37,310][05610] Fps is (10 sec: 2457.6, 60 sec: 3481.7, 300 sec: 3804.4). Total num frames: 2711552. Throughput: 0: 908.4. Samples: 677750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:08:37,321][05610] Avg episode reward: [(0, '22.156')]
[2023-02-25 20:08:42,310][05610] Fps is (10 sec: 2868.0, 60 sec: 3481.6, 300 sec: 3818.3). Total num frames: 2727936. Throughput: 0: 826.1. Samples: 681276. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:08:42,313][05610] Avg episode reward: [(0, '22.300')]
[2023-02-25 20:08:42,319][16658] Saving new best policy, reward=22.300!
[2023-02-25 20:08:47,311][05610] Fps is (10 sec: 2457.2, 60 sec: 3345.0, 300 sec: 3762.7). Total num frames: 2736128. Throughput: 0: 759.7. Samples: 684518. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:08:47,322][05610] Avg episode reward: [(0, '23.600')]
[2023-02-25 20:08:47,336][16658] Saving new best policy, reward=23.600!
[2023-02-25 20:08:48,744][16672] Updated weights for policy 0, policy_version 670 (0.0023)
[2023-02-25 20:08:52,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3748.9). Total num frames: 2756608. Throughput: 0: 775.6. Samples: 687506. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:08:52,312][05610] Avg episode reward: [(0, '24.182')]
[2023-02-25 20:08:52,316][16658] Saving new best policy, reward=24.182!
[2023-02-25 20:08:57,310][05610] Fps is (10 sec: 4096.7, 60 sec: 3276.8, 300 sec: 3762.8). Total num frames: 2777088. Throughput: 0: 792.8. Samples: 693594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:08:57,313][05610] Avg episode reward: [(0, '26.211')]
[2023-02-25 20:08:57,322][16658] Saving new best policy, reward=26.211!
[2023-02-25 20:08:59,015][16672] Updated weights for policy 0, policy_version 680 (0.0047)
[2023-02-25 20:09:02,314][05610] Fps is (10 sec: 3684.7, 60 sec: 3208.3, 300 sec: 3776.6). Total num frames: 2793472. Throughput: 0: 745.2. Samples: 698724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:09:02,316][05610] Avg episode reward: [(0, '25.549')]
[2023-02-25 20:09:07,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3776.6). Total num frames: 2805760. Throughput: 0: 712.1. Samples: 700562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:09:07,317][05610] Avg episode reward: [(0, '25.053')]
[2023-02-25 20:09:12,310][05610] Fps is (10 sec: 2868.5, 60 sec: 3072.0, 300 sec: 3762.8). Total num frames: 2822144. Throughput: 0: 743.6. Samples: 705436. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:09:12,321][05610] Avg episode reward: [(0, '23.699')]
[2023-02-25 20:09:12,536][16672] Updated weights for policy 0, policy_version 690 (0.0018)
[2023-02-25 20:09:17,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3072.1, 300 sec: 3762.8). Total num frames: 2842624. Throughput: 0: 800.2. Samples: 711434. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:09:17,321][05610] Avg episode reward: [(0, '22.535')]
[2023-02-25 20:09:22,311][05610] Fps is (10 sec: 3276.4, 60 sec: 2935.4, 300 sec: 3748.9). Total num frames: 2854912. Throughput: 0: 791.4. Samples: 713362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:09:22,319][05610] Avg episode reward: [(0, '22.471')]
[2023-02-25 20:09:25,462][16672] Updated weights for policy 0, policy_version 700 (0.0028)
[2023-02-25 20:09:27,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3748.9). Total num frames: 2871296. Throughput: 0: 802.0. Samples: 717366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:09:27,312][05610] Avg episode reward: [(0, '21.576')]
[2023-02-25 20:09:32,310][05610] Fps is (10 sec: 3277.2, 60 sec: 3140.4, 300 sec: 3721.1). Total num frames: 2887680. Throughput: 0: 838.6. Samples: 722252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:09:32,312][05610] Avg episode reward: [(0, '20.798')]
[2023-02-25 20:09:36,911][16672] Updated weights for policy 0, policy_version 710 (0.0024)
[2023-02-25 20:09:37,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3721.1). Total num frames: 2908160. Throughput: 0: 827.0. Samples: 724720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:09:37,318][05610] Avg episode reward: [(0, '20.404')]
[2023-02-25 20:09:42,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3721.1). Total num frames: 2924544. Throughput: 0: 823.0. Samples: 730630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:09:42,312][05610] Avg episode reward: [(0, '19.827')]
[2023-02-25 20:09:47,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3345.2, 300 sec: 3707.3). Total num frames: 2936832. Throughput: 0: 811.1. Samples: 735220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:09:47,312][05610] Avg episode reward: [(0, '20.551')]
[2023-02-25 20:09:48,944][16672] Updated weights for policy 0, policy_version 720 (0.0019)
[2023-02-25 20:09:52,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 2961408. Throughput: 0: 842.8. Samples: 738490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:09:52,316][05610] Avg episode reward: [(0, '22.207')]
[2023-02-25 20:09:57,310][05610] Fps is (10 sec: 4915.2, 60 sec: 3481.6, 300 sec: 3707.3). Total num frames: 2985984. Throughput: 0: 894.8. Samples: 745700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:09:57,312][05610] Avg episode reward: [(0, '23.698')]
[2023-02-25 20:09:57,666][16672] Updated weights for policy 0, policy_version 730 (0.0019)
[2023-02-25 20:10:02,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3413.6, 300 sec: 3707.2). Total num frames: 2998272. Throughput: 0: 873.4. Samples: 750738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:10:02,312][05610] Avg episode reward: [(0, '24.394')]
[2023-02-25 20:10:07,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 3014656. Throughput: 0: 862.8. Samples: 752188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:10:07,313][05610] Avg episode reward: [(0, '24.609')]
[2023-02-25 20:10:07,327][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000736_3014656.pth...
[2023-02-25 20:10:07,433][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000523_2142208.pth
[2023-02-25 20:10:10,690][16672] Updated weights for policy 0, policy_version 740 (0.0021)
[2023-02-25 20:10:12,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 3039232. Throughput: 0: 908.3. Samples: 758240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:10:12,320][05610] Avg episode reward: [(0, '25.993')]
[2023-02-25 20:10:17,316][05610] Fps is (10 sec: 4912.3, 60 sec: 3686.0, 300 sec: 3707.2). Total num frames: 3063808. Throughput: 0: 960.9. Samples: 765496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:10:17,326][05610] Avg episode reward: [(0, '25.977')]
[2023-02-25 20:10:20,080][16672] Updated weights for policy 0, policy_version 750 (0.0012)
[2023-02-25 20:10:22,310][05610] Fps is (10 sec: 3686.2, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 3076096. Throughput: 0: 964.7. Samples: 768130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:10:22,317][05610] Avg episode reward: [(0, '25.594')]
[2023-02-25 20:10:27,310][05610] Fps is (10 sec: 2868.9, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 3092480. Throughput: 0: 933.0. Samples: 772616. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:10:27,321][05610] Avg episode reward: [(0, '25.067')]
[2023-02-25 20:10:31,167][16672] Updated weights for policy 0, policy_version 760 (0.0012)
[2023-02-25 20:10:32,310][05610] Fps is (10 sec: 4096.2, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 3117056. Throughput: 0: 979.8. Samples: 779312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:10:32,312][05610] Avg episode reward: [(0, '25.216')]
[2023-02-25 20:10:37,310][05610] Fps is (10 sec: 4915.2, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 3141632. Throughput: 0: 987.0. Samples: 782904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:10:37,313][05610] Avg episode reward: [(0, '25.194')]
[2023-02-25 20:10:41,203][16672] Updated weights for policy 0, policy_version 770 (0.0024)
[2023-02-25 20:10:42,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3693.4). Total num frames: 3153920. Throughput: 0: 952.1. Samples: 788546. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:10:42,316][05610] Avg episode reward: [(0, '25.115')]
[2023-02-25 20:10:47,310][05610] Fps is (10 sec: 2867.1, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3170304. Throughput: 0: 940.4. Samples: 793054. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:10:47,316][05610] Avg episode reward: [(0, '24.622')]
[2023-02-25 20:10:52,210][16672] Updated weights for policy 0, policy_version 780 (0.0018)
[2023-02-25 20:10:52,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3194880. Throughput: 0: 985.6. Samples: 796540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:10:52,313][05610] Avg episode reward: [(0, '22.818')]
[2023-02-25 20:10:57,311][05610] Fps is (10 sec: 4505.2, 60 sec: 3822.9, 300 sec: 3679.4). Total num frames: 3215360. Throughput: 0: 996.0. Samples: 803060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:10:57,320][05610] Avg episode reward: [(0, '21.878')]
[2023-02-25 20:11:02,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 3231744. Throughput: 0: 946.3. Samples: 808072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:02,318][05610] Avg episode reward: [(0, '22.549')]
[2023-02-25 20:11:03,487][16672] Updated weights for policy 0, policy_version 790 (0.0016)
[2023-02-25 20:11:07,310][05610] Fps is (10 sec: 3277.2, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3248128. Throughput: 0: 938.9. Samples: 810382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:07,319][05610] Avg episode reward: [(0, '23.520')]
[2023-02-25 20:11:12,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3272704. Throughput: 0: 984.1. Samples: 816900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:12,312][05610] Avg episode reward: [(0, '24.547')]
[2023-02-25 20:11:12,983][16672] Updated weights for policy 0, policy_version 800 (0.0023)
[2023-02-25 20:11:17,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3823.3, 300 sec: 3679.5). Total num frames: 3293184. Throughput: 0: 990.3. Samples: 823876. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:11:17,312][05610] Avg episode reward: [(0, '25.994')]
[2023-02-25 20:11:22,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3693.4). Total num frames: 3309568. Throughput: 0: 960.0. Samples: 826102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:11:22,315][05610] Avg episode reward: [(0, '26.133')]
[2023-02-25 20:11:24,840][16672] Updated weights for policy 0, policy_version 810 (0.0012)
[2023-02-25 20:11:27,310][05610] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3325952. Throughput: 0: 934.8. Samples: 830612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:11:27,318][05610] Avg episode reward: [(0, '26.265')]
[2023-02-25 20:11:27,333][16658] Saving new best policy, reward=26.265!
[2023-02-25 20:11:32,310][05610] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3350528. Throughput: 0: 985.6. Samples: 837408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:11:32,318][05610] Avg episode reward: [(0, '23.739')]
[2023-02-25 20:11:33,848][16672] Updated weights for policy 0, policy_version 820 (0.0011)
[2023-02-25 20:11:37,310][05610] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3366912. Throughput: 0: 981.9. Samples: 840724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:37,312][05610] Avg episode reward: [(0, '22.131')]
[2023-02-25 20:11:42,310][05610] Fps is (10 sec: 2867.3, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3379200. Throughput: 0: 926.2. Samples: 844736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:42,315][05610] Avg episode reward: [(0, '21.957')]
[2023-02-25 20:11:47,310][05610] Fps is (10 sec: 2457.6, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3391488. Throughput: 0: 892.0. Samples: 848214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:11:47,320][05610] Avg episode reward: [(0, '21.021')]
[2023-02-25 20:11:48,514][16672] Updated weights for policy 0, policy_version 830 (0.0021)
[2023-02-25 20:11:52,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3416064. Throughput: 0: 917.6. Samples: 851674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:11:52,314][05610] Avg episode reward: [(0, '23.366')]
[2023-02-25 20:11:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 3436544. Throughput: 0: 930.8. Samples: 858784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:11:57,323][05610] Avg episode reward: [(0, '24.230')]
[2023-02-25 20:11:58,097][16672] Updated weights for policy 0, policy_version 840 (0.0037)
[2023-02-25 20:12:02,312][05610] Fps is (10 sec: 2866.4, 60 sec: 3549.7, 300 sec: 3610.0). Total num frames: 3444736. Throughput: 0: 855.3. Samples: 862368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:12:02,314][05610] Avg episode reward: [(0, '24.148')]
[2023-02-25 20:12:07,310][05610] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 3461120. Throughput: 0: 842.3. Samples: 864004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:12:07,316][05610] Avg episode reward: [(0, '24.974')]
[2023-02-25 20:12:07,327][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000845_3461120.pth...
[2023-02-25 20:12:07,447][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_2613248.pth
[2023-02-25 20:12:11,137][16672] Updated weights for policy 0, policy_version 850 (0.0014)
[2023-02-25 20:12:12,310][05610] Fps is (10 sec: 4097.1, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 3485696. Throughput: 0: 879.1. Samples: 870172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:12,315][05610] Avg episode reward: [(0, '28.237')]
[2023-02-25 20:12:12,319][16658] Saving new best policy, reward=28.237!
[2023-02-25 20:12:17,310][05610] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3596.2). Total num frames: 3502080. Throughput: 0: 861.3. Samples: 876166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:17,315][05610] Avg episode reward: [(0, '27.938')]
[2023-02-25 20:12:22,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3582.3). Total num frames: 3514368. Throughput: 0: 828.5. Samples: 878006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:12:22,317][05610] Avg episode reward: [(0, '27.792')]
[2023-02-25 20:12:24,585][16672] Updated weights for policy 0, policy_version 860 (0.0017)
[2023-02-25 20:12:27,310][05610] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3530752. Throughput: 0: 829.6. Samples: 882070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:27,319][05610] Avg episode reward: [(0, '26.686')]
[2023-02-25 20:12:32,310][05610] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3568.4). Total num frames: 3555328. Throughput: 0: 910.2. Samples: 889172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:12:32,320][05610] Avg episode reward: [(0, '27.867')]
[2023-02-25 20:12:33,478][16672] Updated weights for policy 0, policy_version 870 (0.0012)
[2023-02-25 20:12:37,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 3575808. Throughput: 0: 913.2. Samples: 892768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:37,318][05610] Avg episode reward: [(0, '27.105')]
[2023-02-25 20:12:42,314][05610] Fps is (10 sec: 3684.8, 60 sec: 3549.6, 300 sec: 3582.2). Total num frames: 3592192. Throughput: 0: 868.6. Samples: 897876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:12:42,317][05610] Avg episode reward: [(0, '26.327')]
[2023-02-25 20:12:45,454][16672] Updated weights for policy 0, policy_version 880 (0.0033)
[2023-02-25 20:12:47,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 3612672. Throughput: 0: 902.7. Samples: 902988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:12:47,315][05610] Avg episode reward: [(0, '27.120')]
[2023-02-25 20:12:52,310][05610] Fps is (10 sec: 3688.1, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 3629056. Throughput: 0: 931.9. Samples: 905938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:52,319][05610] Avg episode reward: [(0, '27.736')]
[2023-02-25 20:12:57,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3526.7). Total num frames: 3641344. Throughput: 0: 896.8. Samples: 910528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:12:57,316][05610] Avg episode reward: [(0, '27.443')]
[2023-02-25 20:12:57,936][16672] Updated weights for policy 0, policy_version 890 (0.0013)
[2023-02-25 20:13:02,310][05610] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 3657728. Throughput: 0: 853.7. Samples: 914580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:13:02,313][05610] Avg episode reward: [(0, '25.940')]
[2023-02-25 20:13:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 3674112. Throughput: 0: 863.5. Samples: 916862. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:13:07,311][05610] Avg episode reward: [(0, '25.520')]
[2023-02-25 20:13:09,443][16672] Updated weights for policy 0, policy_version 900 (0.0019)
[2023-02-25 20:13:12,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 3698688. Throughput: 0: 919.2. Samples: 923434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:13:12,314][05610] Avg episode reward: [(0, '25.780')]
[2023-02-25 20:13:17,312][05610] Fps is (10 sec: 4504.6, 60 sec: 3618.0, 300 sec: 3526.7). Total num frames: 3719168. Throughput: 0: 916.6. Samples: 930420. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:13:17,316][05610] Avg episode reward: [(0, '25.165')]
[2023-02-25 20:13:19,219][16672] Updated weights for policy 0, policy_version 910 (0.0018)
[2023-02-25 20:13:22,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 3735552. Throughput: 0: 887.1. Samples: 932688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:13:22,314][05610] Avg episode reward: [(0, '25.460')]
[2023-02-25 20:13:27,310][05610] Fps is (10 sec: 3277.5, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 3751936. Throughput: 0: 875.7. Samples: 937280. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:13:27,313][05610] Avg episode reward: [(0, '25.550')]
[2023-02-25 20:13:29,968][16672] Updated weights for policy 0, policy_version 920 (0.0026)
[2023-02-25 20:13:32,310][05610] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3776512. Throughput: 0: 923.4. Samples: 944542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 20:13:32,314][05610] Avg episode reward: [(0, '25.568')]
[2023-02-25 20:13:37,310][05610] Fps is (10 sec: 4915.1, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 3801088. Throughput: 0: 937.6. Samples: 948130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:13:37,315][05610] Avg episode reward: [(0, '25.189')]
[2023-02-25 20:13:40,126][16672] Updated weights for policy 0, policy_version 930 (0.0025)
[2023-02-25 20:13:42,310][05610] Fps is (10 sec: 3686.4, 60 sec: 3686.7, 300 sec: 3651.7). Total num frames: 3813376. Throughput: 0: 946.0. Samples: 953098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:13:42,316][05610] Avg episode reward: [(0, '25.625')]
[2023-02-25 20:13:47,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3833856. Throughput: 0: 971.3. Samples: 958288. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:13:47,312][05610] Avg episode reward: [(0, '24.231')]
[2023-02-25 20:13:50,491][16672] Updated weights for policy 0, policy_version 940 (0.0028)
[2023-02-25 20:13:52,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3665.6). Total num frames: 3858432. Throughput: 0: 1000.9. Samples: 961904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:13:52,312][05610] Avg episode reward: [(0, '23.996')]
[2023-02-25 20:13:57,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 3878912. Throughput: 0: 1012.3. Samples: 968988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 20:13:57,312][05610] Avg episode reward: [(0, '23.484')]
[2023-02-25 20:14:01,250][16672] Updated weights for policy 0, policy_version 950 (0.0019)
[2023-02-25 20:14:02,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3891200. Throughput: 0: 956.8. Samples: 973476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 20:14:02,316][05610] Avg episode reward: [(0, '24.328')]
[2023-02-25 20:14:07,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 3911680. Throughput: 0: 958.6. Samples: 975824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 20:14:07,312][05610] Avg episode reward: [(0, '25.055')]
[2023-02-25 20:14:07,328][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000955_3911680.pth...
[2023-02-25 20:14:07,442][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000736_3014656.pth
[2023-02-25 20:14:11,154][16672] Updated weights for policy 0, policy_version 960 (0.0022)
[2023-02-25 20:14:12,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 3936256. Throughput: 0: 1009.4. Samples: 982702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 20:14:12,317][05610] Avg episode reward: [(0, '25.301')]
[2023-02-25 20:14:17,310][05610] Fps is (10 sec: 4505.6, 60 sec: 3959.6, 300 sec: 3735.0). Total num frames: 3956736. Throughput: 0: 990.5. Samples: 989116. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 20:14:17,317][05610] Avg episode reward: [(0, '25.203')]
[2023-02-25 20:14:22,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 3969024. Throughput: 0: 961.6. Samples: 991404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:14:22,318][05610] Avg episode reward: [(0, '24.632')]
[2023-02-25 20:14:22,689][16672] Updated weights for policy 0, policy_version 970 (0.0013)
[2023-02-25 20:14:27,310][05610] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 3989504. Throughput: 0: 963.4. Samples: 996450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 20:14:27,318][05610] Avg episode reward: [(0, '23.832')]
[2023-02-25 20:14:30,207][16658] Stopping Batcher_0...
[2023-02-25 20:14:30,208][16658] Loop batcher_evt_loop terminating...
[2023-02-25 20:14:30,210][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 20:14:30,209][05610] Component Batcher_0 stopped!
[2023-02-25 20:14:30,261][16683] Stopping RolloutWorker_w4...
[2023-02-25 20:14:30,263][16672] Weights refcount: 2 0
[2023-02-25 20:14:30,261][05610] Component RolloutWorker_w4 stopped!
[2023-02-25 20:14:30,267][16677] Stopping RolloutWorker_w0...
[2023-02-25 20:14:30,267][16677] Loop rollout_proc0_evt_loop terminating...
[2023-02-25 20:14:30,269][05610] Component RolloutWorker_w0 stopped!
[2023-02-25 20:14:30,274][05610] Component InferenceWorker_p0-w0 stopped!
[2023-02-25 20:14:30,273][16672] Stopping InferenceWorker_p0-w0...
[2023-02-25 20:14:30,279][16672] Loop inference_proc0-0_evt_loop terminating...
[2023-02-25 20:14:30,280][16683] Loop rollout_proc4_evt_loop terminating...
[2023-02-25 20:14:30,294][05610] Component RolloutWorker_w5 stopped!
[2023-02-25 20:14:30,296][16682] Stopping RolloutWorker_w5...
[2023-02-25 20:14:30,300][16682] Loop rollout_proc5_evt_loop terminating...
[2023-02-25 20:14:30,309][16678] Stopping RolloutWorker_w2...
[2023-02-25 20:14:30,309][05610] Component RolloutWorker_w7 stopped!
[2023-02-25 20:14:30,315][05610] Component RolloutWorker_w2 stopped!
[2023-02-25 20:14:30,310][16678] Loop rollout_proc2_evt_loop terminating...
[2023-02-25 20:14:30,327][05610] Component RolloutWorker_w6 stopped!
[2023-02-25 20:14:30,331][16681] Stopping RolloutWorker_w6...
[2023-02-25 20:14:30,333][16681] Loop rollout_proc6_evt_loop terminating...
[2023-02-25 20:14:30,311][16684] Stopping RolloutWorker_w7...
[2023-02-25 20:14:30,336][16679] Stopping RolloutWorker_w1...
[2023-02-25 20:14:30,336][05610] Component RolloutWorker_w1 stopped!
[2023-02-25 20:14:30,337][16684] Loop rollout_proc7_evt_loop terminating...
[2023-02-25 20:14:30,338][16679] Loop rollout_proc1_evt_loop terminating...
[2023-02-25 20:14:30,370][16680] Stopping RolloutWorker_w3...
[2023-02-25 20:14:30,371][05610] Component RolloutWorker_w3 stopped!
[2023-02-25 20:14:30,377][16680] Loop rollout_proc3_evt_loop terminating...
[2023-02-25 20:14:30,407][16658] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000845_3461120.pth
[2023-02-25 20:14:30,419][16658] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 20:14:30,600][05610] Component LearnerWorker_p0 stopped!
[2023-02-25 20:14:30,606][05610] Waiting for process learner_proc0 to stop...
[2023-02-25 20:14:30,611][16658] Stopping LearnerWorker_p0...
[2023-02-25 20:14:30,612][16658] Loop learner_proc0_evt_loop terminating...
[2023-02-25 20:14:32,354][05610] Waiting for process inference_proc0-0 to join...
[2023-02-25 20:14:32,776][05610] Waiting for process rollout_proc0 to join...
[2023-02-25 20:14:33,178][05610] Waiting for process rollout_proc1 to join...
[2023-02-25 20:14:33,180][05610] Waiting for process rollout_proc2 to join...
[2023-02-25 20:14:33,182][05610] Waiting for process rollout_proc3 to join...
[2023-02-25 20:14:33,185][05610] Waiting for process rollout_proc4 to join...
[2023-02-25 20:14:33,186][05610] Waiting for process rollout_proc5 to join...
[2023-02-25 20:14:33,187][05610] Waiting for process rollout_proc6 to join...
[2023-02-25 20:14:33,191][05610] Waiting for process rollout_proc7 to join...
[2023-02-25 20:14:33,192][05610] Batcher 0 profile tree view:
batching: 25.8851, releasing_batches: 0.0223
[2023-02-25 20:14:33,194][05610] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0077
wait_policy_total: 531.2915
update_model: 7.4833
weight_update: 0.0018
one_step: 0.0022
handle_policy_step: 493.8315
deserialize: 13.9820, stack: 2.8324, obs_to_device_normalize: 110.2465, forward: 236.3438, send_messages: 25.5712
prepare_outputs: 80.2336
to_cpu: 50.0203
[2023-02-25 20:14:33,195][05610] Learner 0 profile tree view:
misc: 0.0055, prepare_batch: 17.0581
train: 75.4877
epoch_init: 0.0130, minibatch_init: 0.0122, losses_postprocess: 0.6063, kl_divergence: 0.5708, after_optimizer: 33.3927
calculate_losses: 26.3803
losses_init: 0.0034, forward_head: 1.8221, bptt_initial: 17.4893, tail: 1.0093, advantages_returns: 0.2996, losses: 3.4782
bptt: 1.9922
bptt_forward_core: 1.9115
update: 13.9414
clip: 1.3912
[2023-02-25 20:14:33,197][05610] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3386, enqueue_policy_requests: 141.5231, env_step: 808.5703, overhead: 19.3346, complete_rollouts: 6.8000
save_policy_outputs: 19.4148
split_output_tensors: 9.5390
[2023-02-25 20:14:33,198][05610] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3427, enqueue_policy_requests: 146.2585, env_step: 802.5299, overhead: 19.9726, complete_rollouts: 7.1920
save_policy_outputs: 19.1870
split_output_tensors: 9.3799
[2023-02-25 20:14:33,200][05610] Loop Runner_EvtLoop terminating...
[2023-02-25 20:14:33,202][05610] Runner profile tree view:
main_loop: 1101.3230
[2023-02-25 20:14:33,203][05610] Collected {0: 4005888}, FPS: 3637.3
[2023-02-25 20:14:34,295][05610] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 20:14:34,297][05610] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 20:14:34,299][05610] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 20:14:34,302][05610] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 20:14:34,307][05610] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 20:14:34,308][05610] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 20:14:34,310][05610] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 20:14:34,311][05610] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 20:14:34,315][05610] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-25 20:14:34,316][05610] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-25 20:14:34,318][05610] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 20:14:34,319][05610] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 20:14:34,320][05610] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 20:14:34,321][05610] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 20:14:34,322][05610] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 20:14:34,348][05610] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 20:14:34,350][05610] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 20:14:34,352][05610] RunningMeanStd input shape: (1,)
[2023-02-25 20:14:34,368][05610] ConvEncoder: input_channels=3
[2023-02-25 20:14:35,150][05610] Conv encoder output size: 512
[2023-02-25 20:14:35,153][05610] Policy head output size: 512
[2023-02-25 20:14:38,290][05610] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 20:14:39,528][05610] Num frames 100...
[2023-02-25 20:14:39,644][05610] Num frames 200...
[2023-02-25 20:14:39,756][05610] Num frames 300...
[2023-02-25 20:14:39,874][05610] Num frames 400...
[2023-02-25 20:14:39,987][05610] Num frames 500...
[2023-02-25 20:14:40,098][05610] Num frames 600...
[2023-02-25 20:14:40,220][05610] Num frames 700...
[2023-02-25 20:14:40,335][05610] Num frames 800...
[2023-02-25 20:14:40,451][05610] Num frames 900...
[2023-02-25 20:14:40,567][05610] Num frames 1000...
[2023-02-25 20:14:40,685][05610] Num frames 1100...
[2023-02-25 20:14:40,797][05610] Num frames 1200...
[2023-02-25 20:14:40,909][05610] Num frames 1300...
[2023-02-25 20:14:41,023][05610] Num frames 1400...
[2023-02-25 20:14:41,083][05610] Avg episode rewards: #0: 34.030, true rewards: #0: 14.030
[2023-02-25 20:14:41,089][05610] Avg episode reward: 34.030, avg true_objective: 14.030
[2023-02-25 20:14:41,196][05610] Num frames 1500...
[2023-02-25 20:14:41,308][05610] Num frames 1600...
[2023-02-25 20:14:41,433][05610] Num frames 1700...
[2023-02-25 20:14:41,558][05610] Num frames 1800...
[2023-02-25 20:14:41,676][05610] Num frames 1900...
[2023-02-25 20:14:41,793][05610] Num frames 2000...
[2023-02-25 20:14:41,905][05610] Num frames 2100...
[2023-02-25 20:14:41,971][05610] Avg episode rewards: #0: 25.535, true rewards: #0: 10.535
[2023-02-25 20:14:41,974][05610] Avg episode reward: 25.535, avg true_objective: 10.535
[2023-02-25 20:14:42,074][05610] Num frames 2200...
[2023-02-25 20:14:42,185][05610] Num frames 2300...
[2023-02-25 20:14:42,299][05610] Num frames 2400...
[2023-02-25 20:14:42,419][05610] Num frames 2500...
[2023-02-25 20:14:42,529][05610] Num frames 2600...
[2023-02-25 20:14:42,642][05610] Num frames 2700...
[2023-02-25 20:14:42,750][05610] Avg episode rewards: #0: 21.490, true rewards: #0: 9.157
[2023-02-25 20:14:42,752][05610] Avg episode reward: 21.490, avg true_objective: 9.157
[2023-02-25 20:14:42,816][05610] Num frames 2800...
[2023-02-25 20:14:42,932][05610] Num frames 2900...
[2023-02-25 20:14:43,044][05610] Num frames 3000...
[2023-02-25 20:14:43,157][05610] Num frames 3100...
[2023-02-25 20:14:43,278][05610] Num frames 3200...
[2023-02-25 20:14:43,392][05610] Num frames 3300...
[2023-02-25 20:14:43,504][05610] Num frames 3400...
[2023-02-25 20:14:43,619][05610] Num frames 3500...
[2023-02-25 20:14:43,735][05610] Num frames 3600...
[2023-02-25 20:14:43,854][05610] Num frames 3700...
[2023-02-25 20:14:43,935][05610] Avg episode rewards: #0: 21.553, true rewards: #0: 9.302
[2023-02-25 20:14:43,936][05610] Avg episode reward: 21.553, avg true_objective: 9.302
[2023-02-25 20:14:44,025][05610] Num frames 3800...
[2023-02-25 20:14:44,137][05610] Num frames 3900...
[2023-02-25 20:14:44,248][05610] Num frames 4000...
[2023-02-25 20:14:44,370][05610] Num frames 4100...
[2023-02-25 20:14:44,488][05610] Num frames 4200...
[2023-02-25 20:14:44,600][05610] Num frames 4300...
[2023-02-25 20:14:44,715][05610] Num frames 4400...
[2023-02-25 20:14:44,819][05610] Avg episode rewards: #0: 20.076, true rewards: #0: 8.876
[2023-02-25 20:14:44,821][05610] Avg episode reward: 20.076, avg true_objective: 8.876
[2023-02-25 20:14:44,891][05610] Num frames 4500...
[2023-02-25 20:14:45,013][05610] Num frames 4600...
[2023-02-25 20:14:45,138][05610] Num frames 4700...
[2023-02-25 20:14:45,253][05610] Num frames 4800...
[2023-02-25 20:14:45,372][05610] Num frames 4900...
[2023-02-25 20:14:45,484][05610] Num frames 5000...
[2023-02-25 20:14:45,601][05610] Num frames 5100...
[2023-02-25 20:14:45,712][05610] Num frames 5200...
[2023-02-25 20:14:45,820][05610] Avg episode rewards: #0: 20.238, true rewards: #0: 8.738
[2023-02-25 20:14:45,822][05610] Avg episode reward: 20.238, avg true_objective: 8.738
[2023-02-25 20:14:45,891][05610] Num frames 5300...
[2023-02-25 20:14:46,002][05610] Num frames 5400...
[2023-02-25 20:14:46,115][05610] Num frames 5500...
[2023-02-25 20:14:46,230][05610] Num frames 5600...
[2023-02-25 20:14:46,348][05610] Num frames 5700...
[2023-02-25 20:14:46,459][05610] Num frames 5800...
[2023-02-25 20:14:46,578][05610] Num frames 5900...
[2023-02-25 20:14:46,687][05610] Num frames 6000...
[2023-02-25 20:14:46,803][05610] Num frames 6100...
[2023-02-25 20:14:46,915][05610] Num frames 6200...
[2023-02-25 20:14:47,027][05610] Num frames 6300...
[2023-02-25 20:14:47,146][05610] Num frames 6400...
[2023-02-25 20:14:47,234][05610] Avg episode rewards: #0: 20.467, true rewards: #0: 9.181
[2023-02-25 20:14:47,235][05610] Avg episode reward: 20.467, avg true_objective: 9.181
[2023-02-25 20:14:47,320][05610] Num frames 6500...
[2023-02-25 20:14:47,441][05610] Num frames 6600...
[2023-02-25 20:14:47,555][05610] Num frames 6700...
[2023-02-25 20:14:47,666][05610] Num frames 6800...
[2023-02-25 20:14:47,787][05610] Num frames 6900...
[2023-02-25 20:14:47,926][05610] Num frames 7000...
[2023-02-25 20:14:48,094][05610] Num frames 7100...
[2023-02-25 20:14:48,249][05610] Num frames 7200...
[2023-02-25 20:14:48,405][05610] Num frames 7300...
[2023-02-25 20:14:48,569][05610] Num frames 7400...
[2023-02-25 20:14:48,728][05610] Num frames 7500...
[2023-02-25 20:14:48,899][05610] Num frames 7600...
[2023-02-25 20:14:49,057][05610] Num frames 7700...
[2023-02-25 20:14:49,214][05610] Num frames 7800...
[2023-02-25 20:14:49,376][05610] Num frames 7900...
[2023-02-25 20:14:49,535][05610] Num frames 8000...
[2023-02-25 20:14:49,697][05610] Num frames 8100...
[2023-02-25 20:14:49,863][05610] Num frames 8200...
[2023-02-25 20:14:50,029][05610] Num frames 8300...
[2023-02-25 20:14:50,206][05610] Num frames 8400...
[2023-02-25 20:14:50,377][05610] Num frames 8500...
[2023-02-25 20:14:50,479][05610] Avg episode rewards: #0: 24.659, true rewards: #0: 10.659
[2023-02-25 20:14:50,480][05610] Avg episode reward: 24.659, avg true_objective: 10.659
[2023-02-25 20:14:50,595][05610] Num frames 8600...
[2023-02-25 20:14:50,760][05610] Num frames 8700...
[2023-02-25 20:14:50,927][05610] Num frames 8800...
[2023-02-25 20:14:51,088][05610] Num frames 8900...
[2023-02-25 20:14:51,249][05610] Num frames 9000...
[2023-02-25 20:14:51,403][05610] Num frames 9100...
[2023-02-25 20:14:51,498][05610] Avg episode rewards: #0: 23.261, true rewards: #0: 10.150
[2023-02-25 20:14:51,503][05610] Avg episode reward: 23.261, avg true_objective: 10.150
[2023-02-25 20:14:51,578][05610] Num frames 9200...
[2023-02-25 20:14:51,695][05610] Num frames 9300...
[2023-02-25 20:14:51,815][05610] Avg episode rewards: #0: 21.259, true rewards: #0: 9.359
[2023-02-25 20:14:51,816][05610] Avg episode reward: 21.259, avg true_objective: 9.359
[2023-02-25 20:15:46,071][05610] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-25 20:15:46,517][05610] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 20:15:46,520][05610] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 20:15:46,523][05610] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 20:15:46,526][05610] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 20:15:46,528][05610] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 20:15:46,533][05610] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 20:15:46,534][05610] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-25 20:15:46,537][05610] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 20:15:46,538][05610] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-25 20:15:46,540][05610] Adding new argument 'hf_repository'='atorre/atorre/SampleFactory-ppo-doom_health_gathering_supreme' that is not in the saved config file!
[2023-02-25 20:15:46,542][05610] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 20:15:46,543][05610] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 20:15:46,544][05610] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 20:15:46,545][05610] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 20:15:46,546][05610] Using frameskip 1 and render_action_repeat=4 for evaluation
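The block above shows the evaluation script loading the saved experiment config and layering command-line arguments on top of it, logging "Overriding arg" for keys already present and "Adding new argument" for keys that are not. A minimal sketch of that merge pattern, with hypothetical names (this is not Sample Factory's actual API):

```python
# Hypothetical sketch of the config-override pattern in the log above:
# start from the saved config, apply CLI values, and report whether each
# key overrides an existing entry or adds a new one.
def merge_config(saved: dict, cli_args: dict) -> dict:
    cfg = dict(saved)
    for key, value in cli_args.items():
        if key in cfg:
            print(f"Overriding arg '{key}' with value {value!r} passed from command line")
        else:
            print(f"Adding new argument '{key}'={value!r} that is not in the saved config file!")
        cfg[key] = value
    return cfg

# Illustrative values only, echoing the arguments seen in the log.
saved = {"num_workers": 8, "env": "doom_health_gathering_supreme"}
cli = {"num_workers": 1, "no_render": True, "push_to_hub": True}
cfg = merge_config(saved, cli)
```

The saved config stays the source of truth for anything not mentioned on the command line, which is why only the overridden and added keys are logged.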
[2023-02-25 20:15:46,573][05610] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 20:15:46,576][05610] RunningMeanStd input shape: (1,)
[2023-02-25 20:15:46,596][05610] ConvEncoder: input_channels=3
[2023-02-25 20:15:46,656][05610] Conv encoder output size: 512
[2023-02-25 20:15:46,660][05610] Policy head output size: 512
[2023-02-25 20:15:46,690][05610] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
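The two `RunningMeanStd` lines above (input shapes `(3, 72, 128)` for observations and `(1,)` for returns) refer to running normalization statistics updated batch by batch. A minimal scalar sketch of the usual parallel mean/variance update behind such a normalizer; the class name and fields are illustrative, not Sample Factory's actual implementation:

```python
# Minimal sketch of a RunningMeanStd-style normalizer (scalar case).
# Uses the standard parallel-variance merge of the running statistics
# with a new batch's mean and variance.
class RunningMeanStd:
    def __init__(self):
        # Small initial count avoids division by zero before any update.
        self.mean, self.var, self.count = 0.0, 1.0, 1e-4

    def update(self, batch):
        b_mean = sum(batch) / len(batch)
        b_var = sum((x - b_mean) ** 2 for x in batch) / len(batch)
        b_count = len(batch)
        delta = b_mean - self.mean
        total = self.count + b_count
        # Merge batch statistics into the running statistics.
        m_a = self.var * self.count
        m_b = b_var * b_count
        m2 = m_a + m_b + delta ** 2 * self.count * b_count / total
        self.mean += delta * b_count / total
        self.var = m2 / total
        self.count = total

rms = RunningMeanStd()
rms.update([1.0, 2.0, 3.0])
```

In the real trainer the same update runs elementwise over the full observation tensor shape rather than a scalar.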
[2023-02-25 20:15:47,537][05610] Num frames 100...
[2023-02-25 20:15:47,717][05610] Num frames 200...
[2023-02-25 20:15:47,910][05610] Num frames 300...
[2023-02-25 20:15:48,173][05610] Avg episode rewards: #0: 5.840, true rewards: #0: 3.840
[2023-02-25 20:15:48,174][05610] Avg episode reward: 5.840, avg true_objective: 3.840
[2023-02-25 20:15:48,215][05610] Num frames 400...
[2023-02-25 20:15:48,417][05610] Num frames 500...
[2023-02-25 20:15:48,608][05610] Num frames 600...
[2023-02-25 20:15:48,764][05610] Num frames 700...
[2023-02-25 20:15:48,920][05610] Num frames 800...
[2023-02-25 20:15:48,976][05610] Avg episode rewards: #0: 7.500, true rewards: #0: 4.000
[2023-02-25 20:15:48,978][05610] Avg episode reward: 7.500, avg true_objective: 4.000
[2023-02-25 20:15:49,131][05610] Num frames 900...
[2023-02-25 20:15:49,293][05610] Num frames 1000...
[2023-02-25 20:15:49,454][05610] Num frames 1100...
[2023-02-25 20:15:49,622][05610] Num frames 1200...
[2023-02-25 20:15:49,779][05610] Num frames 1300...
[2023-02-25 20:15:49,944][05610] Num frames 1400...
[2023-02-25 20:15:50,097][05610] Num frames 1500...
[2023-02-25 20:15:50,277][05610] Num frames 1600...
[2023-02-25 20:15:50,439][05610] Num frames 1700...
[2023-02-25 20:15:50,600][05610] Num frames 1800...
[2023-02-25 20:15:50,781][05610] Num frames 1900...
[2023-02-25 20:15:50,955][05610] Num frames 2000...
[2023-02-25 20:15:51,153][05610] Num frames 2100...
[2023-02-25 20:15:51,331][05610] Num frames 2200...
[2023-02-25 20:15:51,496][05610] Num frames 2300...
[2023-02-25 20:15:51,654][05610] Num frames 2400...
[2023-02-25 20:15:51,808][05610] Num frames 2500...
[2023-02-25 20:15:51,971][05610] Num frames 2600...
[2023-02-25 20:15:52,132][05610] Num frames 2700...
[2023-02-25 20:15:52,295][05610] Num frames 2800...
[2023-02-25 20:15:52,459][05610] Num frames 2900...
[2023-02-25 20:15:52,512][05610] Avg episode rewards: #0: 23.666, true rewards: #0: 9.667
[2023-02-25 20:15:52,514][05610] Avg episode reward: 23.666, avg true_objective: 9.667
[2023-02-25 20:15:52,634][05610] Num frames 3000...
[2023-02-25 20:15:52,745][05610] Num frames 3100...
[2023-02-25 20:15:52,859][05610] Num frames 3200...
[2023-02-25 20:15:52,972][05610] Num frames 3300...
[2023-02-25 20:15:53,086][05610] Num frames 3400...
[2023-02-25 20:15:53,197][05610] Num frames 3500...
[2023-02-25 20:15:53,310][05610] Num frames 3600...
[2023-02-25 20:15:53,424][05610] Num frames 3700...
[2023-02-25 20:15:53,541][05610] Num frames 3800...
[2023-02-25 20:15:53,655][05610] Num frames 3900...
[2023-02-25 20:15:53,768][05610] Num frames 4000...
[2023-02-25 20:15:53,882][05610] Num frames 4100...
[2023-02-25 20:15:53,997][05610] Num frames 4200...
[2023-02-25 20:15:54,115][05610] Num frames 4300...
[2023-02-25 20:15:54,230][05610] Num frames 4400...
[2023-02-25 20:15:54,345][05610] Num frames 4500...
[2023-02-25 20:15:54,457][05610] Num frames 4600...
[2023-02-25 20:15:54,578][05610] Num frames 4700...
[2023-02-25 20:15:54,693][05610] Num frames 4800...
[2023-02-25 20:15:54,809][05610] Num frames 4900...
[2023-02-25 20:15:54,924][05610] Num frames 5000...
[2023-02-25 20:15:54,976][05610] Avg episode rewards: #0: 32.499, true rewards: #0: 12.500
[2023-02-25 20:15:54,978][05610] Avg episode reward: 32.499, avg true_objective: 12.500
[2023-02-25 20:15:55,093][05610] Num frames 5100...
[2023-02-25 20:15:55,205][05610] Num frames 5200...
[2023-02-25 20:15:55,328][05610] Num frames 5300...
[2023-02-25 20:15:55,438][05610] Num frames 5400...
[2023-02-25 20:15:55,561][05610] Num frames 5500...
[2023-02-25 20:15:55,680][05610] Num frames 5600...
[2023-02-25 20:15:55,792][05610] Num frames 5700...
[2023-02-25 20:15:55,907][05610] Num frames 5800...
[2023-02-25 20:15:56,020][05610] Num frames 5900...
[2023-02-25 20:15:56,137][05610] Num frames 6000...
[2023-02-25 20:15:56,250][05610] Num frames 6100...
[2023-02-25 20:15:56,329][05610] Avg episode rewards: #0: 31.440, true rewards: #0: 12.240
[2023-02-25 20:15:56,333][05610] Avg episode reward: 31.440, avg true_objective: 12.240
[2023-02-25 20:15:56,425][05610] Num frames 6200...
[2023-02-25 20:15:56,537][05610] Num frames 6300...
[2023-02-25 20:15:56,655][05610] Num frames 6400...
[2023-02-25 20:15:56,764][05610] Num frames 6500...
[2023-02-25 20:15:56,878][05610] Num frames 6600...
[2023-02-25 20:15:56,990][05610] Num frames 6700...
[2023-02-25 20:15:57,104][05610] Num frames 6800...
[2023-02-25 20:15:57,216][05610] Num frames 6900...
[2023-02-25 20:15:57,332][05610] Num frames 7000...
[2023-02-25 20:15:57,451][05610] Num frames 7100...
[2023-02-25 20:15:57,555][05610] Avg episode rewards: #0: 29.240, true rewards: #0: 11.907
[2023-02-25 20:15:57,557][05610] Avg episode reward: 29.240, avg true_objective: 11.907
[2023-02-25 20:15:57,626][05610] Num frames 7200...
[2023-02-25 20:15:57,741][05610] Num frames 7300...
[2023-02-25 20:15:57,852][05610] Num frames 7400...
[2023-02-25 20:15:57,963][05610] Num frames 7500...
[2023-02-25 20:15:58,079][05610] Num frames 7600...
[2023-02-25 20:15:58,191][05610] Num frames 7700...
[2023-02-25 20:15:58,308][05610] Num frames 7800...
[2023-02-25 20:15:58,384][05610] Avg episode rewards: #0: 27.165, true rewards: #0: 11.166
[2023-02-25 20:15:58,386][05610] Avg episode reward: 27.165, avg true_objective: 11.166
[2023-02-25 20:15:58,481][05610] Num frames 7900...
[2023-02-25 20:15:58,626][05610] Num frames 8000...
[2023-02-25 20:15:58,785][05610] Num frames 8100...
[2023-02-25 20:15:58,946][05610] Num frames 8200...
[2023-02-25 20:15:59,103][05610] Num frames 8300...
[2023-02-25 20:15:59,259][05610] Num frames 8400...
[2023-02-25 20:15:59,421][05610] Num frames 8500...
[2023-02-25 20:15:59,596][05610] Num frames 8600...
[2023-02-25 20:15:59,684][05610] Avg episode rewards: #0: 26.020, true rewards: #0: 10.770
[2023-02-25 20:15:59,687][05610] Avg episode reward: 26.020, avg true_objective: 10.770
[2023-02-25 20:15:59,836][05610] Num frames 8700...
[2023-02-25 20:16:00,000][05610] Num frames 8800...
[2023-02-25 20:16:00,163][05610] Num frames 8900...
[2023-02-25 20:16:00,373][05610] Num frames 9000...
[2023-02-25 20:16:00,541][05610] Num frames 9100...
[2023-02-25 20:16:00,722][05610] Avg episode rewards: #0: 23.955, true rewards: #0: 10.178
[2023-02-25 20:16:00,724][05610] Avg episode reward: 23.955, avg true_objective: 10.178
[2023-02-25 20:16:00,790][05610] Num frames 9200...
[2023-02-25 20:16:00,948][05610] Num frames 9300...
[2023-02-25 20:16:01,116][05610] Num frames 9400...
[2023-02-25 20:16:01,272][05610] Num frames 9500...
[2023-02-25 20:16:01,428][05610] Num frames 9600...
[2023-02-25 20:16:01,587][05610] Num frames 9700...
[2023-02-25 20:16:01,653][05610] Avg episode rewards: #0: 22.304, true rewards: #0: 9.704
[2023-02-25 20:16:01,654][05610] Avg episode reward: 22.304, avg true_objective: 9.704
[2023-02-25 20:16:56,515][05610] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
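Each "Avg episode rewards" line in the runs above is a running mean over the episodes completed so far, which is why the value drifts as more episodes finish. A minimal sketch of that bookkeeping, using the first two episode totals implied by this log's second run (5.840, then an average of 7.500 after two episodes):

```python
# Running average of per-episode rewards, as reported in the log:
# after each finished episode, print the mean over all episodes so far.
episode_rewards = []

def report(reward: float) -> float:
    episode_rewards.append(reward)
    avg = sum(episode_rewards) / len(episode_rewards)
    print(f"Avg episode reward: {avg:.3f}")
    return avg

report(5.84)        # after episode 1 -> 5.840
avg = report(9.16)  # after episode 2 -> (5.84 + 9.16) / 2 = 7.500
```

The "true rewards" / "avg true_objective" figures follow the same running-mean scheme, just over the environment's raw objective instead of the shaped reward.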
[2023-02-25 20:31:27,549][05610] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 20:31:27,551][05610] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 20:31:27,553][05610] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 20:31:27,555][05610] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 20:31:27,557][05610] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 20:31:27,558][05610] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 20:31:27,560][05610] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-25 20:31:27,562][05610] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 20:31:27,563][05610] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-25 20:31:27,564][05610] Adding new argument 'hf_repository'='atorre/SampleFactory-ppo-doom_health_gathering_supreme' that is not in the saved config file!
[2023-02-25 20:31:27,566][05610] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 20:31:27,567][05610] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 20:31:27,568][05610] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 20:31:27,570][05610] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 20:31:27,571][05610] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 20:31:27,604][05610] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 20:31:27,608][05610] RunningMeanStd input shape: (1,)
[2023-02-25 20:31:27,623][05610] ConvEncoder: input_channels=3
[2023-02-25 20:31:27,660][05610] Conv encoder output size: 512
[2023-02-25 20:31:27,662][05610] Policy head output size: 512
[2023-02-25 20:31:27,683][05610] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 20:31:28,137][05610] Num frames 100...
[2023-02-25 20:31:28,253][05610] Num frames 200...
[2023-02-25 20:31:28,380][05610] Num frames 300...
[2023-02-25 20:31:28,501][05610] Num frames 400...
[2023-02-25 20:31:28,616][05610] Num frames 500...
[2023-02-25 20:31:28,770][05610] Avg episode rewards: #0: 10.760, true rewards: #0: 5.760
[2023-02-25 20:31:28,772][05610] Avg episode reward: 10.760, avg true_objective: 5.760
[2023-02-25 20:31:28,805][05610] Num frames 600...
[2023-02-25 20:31:28,921][05610] Num frames 700...
[2023-02-25 20:31:29,038][05610] Num frames 800...
[2023-02-25 20:31:29,162][05610] Num frames 900...
[2023-02-25 20:31:29,280][05610] Num frames 1000...
[2023-02-25 20:31:29,406][05610] Num frames 1100...
[2023-02-25 20:31:29,519][05610] Num frames 1200...
[2023-02-25 20:31:29,632][05610] Num frames 1300...
[2023-02-25 20:31:29,753][05610] Num frames 1400...
[2023-02-25 20:31:29,875][05610] Num frames 1500...
[2023-02-25 20:31:29,991][05610] Num frames 1600...
[2023-02-25 20:31:30,110][05610] Num frames 1700...
[2023-02-25 20:31:30,263][05610] Num frames 1800...
[2023-02-25 20:31:30,437][05610] Num frames 1900...
[2023-02-25 20:31:30,603][05610] Num frames 2000...
[2023-02-25 20:31:30,761][05610] Num frames 2100...
[2023-02-25 20:31:30,921][05610] Num frames 2200...
[2023-02-25 20:31:31,081][05610] Num frames 2300...
[2023-02-25 20:31:31,242][05610] Num frames 2400...
[2023-02-25 20:31:31,415][05610] Num frames 2500...
[2023-02-25 20:31:31,622][05610] Avg episode rewards: #0: 33.959, true rewards: #0: 12.960
[2023-02-25 20:31:31,624][05610] Avg episode reward: 33.959, avg true_objective: 12.960
[2023-02-25 20:31:31,641][05610] Num frames 2600...
[2023-02-25 20:31:31,794][05610] Num frames 2700...
[2023-02-25 20:31:31,969][05610] Num frames 2800...
[2023-02-25 20:31:32,135][05610] Num frames 2900...
[2023-02-25 20:31:32,287][05610] Num frames 3000...
[2023-02-25 20:31:32,457][05610] Num frames 3100...
[2023-02-25 20:31:32,617][05610] Num frames 3200...
[2023-02-25 20:31:32,776][05610] Num frames 3300...
[2023-02-25 20:31:32,935][05610] Num frames 3400...
[2023-02-25 20:31:33,095][05610] Num frames 3500...
[2023-02-25 20:31:33,252][05610] Num frames 3600...
[2023-02-25 20:31:33,327][05610] Avg episode rewards: #0: 30.703, true rewards: #0: 12.037
[2023-02-25 20:31:33,329][05610] Avg episode reward: 30.703, avg true_objective: 12.037
[2023-02-25 20:31:33,469][05610] Num frames 3700...
[2023-02-25 20:31:33,625][05610] Num frames 3800...
[2023-02-25 20:31:33,785][05610] Num frames 3900...
[2023-02-25 20:31:33,936][05610] Avg episode rewards: #0: 24.157, true rewards: #0: 9.908
[2023-02-25 20:31:33,937][05610] Avg episode reward: 24.157, avg true_objective: 9.908
[2023-02-25 20:31:33,982][05610] Num frames 4000...
[2023-02-25 20:31:34,098][05610] Num frames 4100...
[2023-02-25 20:31:34,212][05610] Num frames 4200...
[2023-02-25 20:31:34,330][05610] Num frames 4300...
[2023-02-25 20:31:34,448][05610] Num frames 4400...
[2023-02-25 20:31:34,565][05610] Num frames 4500...
[2023-02-25 20:31:34,685][05610] Num frames 4600...
[2023-02-25 20:31:34,798][05610] Num frames 4700...
[2023-02-25 20:31:34,925][05610] Avg episode rewards: #0: 22.526, true rewards: #0: 9.526
[2023-02-25 20:31:34,927][05610] Avg episode reward: 22.526, avg true_objective: 9.526
[2023-02-25 20:31:34,977][05610] Num frames 4800...
[2023-02-25 20:31:35,093][05610] Num frames 4900...
[2023-02-25 20:31:35,201][05610] Num frames 5000...
[2023-02-25 20:31:35,310][05610] Num frames 5100...
[2023-02-25 20:31:35,425][05610] Num frames 5200...
[2023-02-25 20:31:35,539][05610] Num frames 5300...
[2023-02-25 20:31:35,653][05610] Num frames 5400...
[2023-02-25 20:31:35,757][05610] Avg episode rewards: #0: 20.725, true rewards: #0: 9.058
[2023-02-25 20:31:35,759][05610] Avg episode reward: 20.725, avg true_objective: 9.058
[2023-02-25 20:31:35,831][05610] Num frames 5500...
[2023-02-25 20:31:35,942][05610] Num frames 5600...
[2023-02-25 20:31:36,055][05610] Num frames 5700...
[2023-02-25 20:31:36,169][05610] Num frames 5800...
[2023-02-25 20:31:36,289][05610] Num frames 5900...
[2023-02-25 20:31:36,403][05610] Num frames 6000...
[2023-02-25 20:31:36,522][05610] Num frames 6100...
[2023-02-25 20:31:36,633][05610] Num frames 6200...
[2023-02-25 20:31:36,749][05610] Num frames 6300...
[2023-02-25 20:31:36,860][05610] Num frames 6400...
[2023-02-25 20:31:37,019][05610] Avg episode rewards: #0: 20.559, true rewards: #0: 9.273
[2023-02-25 20:31:37,020][05610] Avg episode reward: 20.559, avg true_objective: 9.273
[2023-02-25 20:31:37,036][05610] Num frames 6500...
[2023-02-25 20:31:37,149][05610] Num frames 6600...
[2023-02-25 20:31:37,263][05610] Num frames 6700...
[2023-02-25 20:31:37,376][05610] Num frames 6800...
[2023-02-25 20:31:37,505][05610] Num frames 6900...
[2023-02-25 20:31:37,618][05610] Num frames 7000...
[2023-02-25 20:31:37,735][05610] Num frames 7100...
[2023-02-25 20:31:37,848][05610] Num frames 7200...
[2023-02-25 20:31:37,959][05610] Num frames 7300...
[2023-02-25 20:31:38,078][05610] Num frames 7400...
[2023-02-25 20:31:38,223][05610] Avg episode rewards: #0: 20.854, true rewards: #0: 9.354
[2023-02-25 20:31:38,225][05610] Avg episode reward: 20.854, avg true_objective: 9.354
[2023-02-25 20:31:38,246][05610] Num frames 7500...
[2023-02-25 20:31:38,357][05610] Num frames 7600...
[2023-02-25 20:31:38,469][05610] Num frames 7700...
[2023-02-25 20:31:38,592][05610] Num frames 7800...
[2023-02-25 20:31:38,705][05610] Num frames 7900...
[2023-02-25 20:31:38,820][05610] Num frames 8000...
[2023-02-25 20:31:38,933][05610] Num frames 8100...
[2023-02-25 20:31:39,048][05610] Num frames 8200...
[2023-02-25 20:31:39,161][05610] Num frames 8300...
[2023-02-25 20:31:39,282][05610] Num frames 8400...
[2023-02-25 20:31:39,395][05610] Num frames 8500...
[2023-02-25 20:31:39,516][05610] Num frames 8600...
[2023-02-25 20:31:39,647][05610] Num frames 8700...
[2023-02-25 20:31:39,722][05610] Avg episode rewards: #0: 22.130, true rewards: #0: 9.686
[2023-02-25 20:31:39,725][05610] Avg episode reward: 22.130, avg true_objective: 9.686
[2023-02-25 20:31:39,825][05610] Num frames 8800...
[2023-02-25 20:31:39,936][05610] Num frames 8900...
[2023-02-25 20:31:40,047][05610] Num frames 9000...
[2023-02-25 20:31:40,175][05610] Num frames 9100...
[2023-02-25 20:31:40,285][05610] Num frames 9200...
[2023-02-25 20:31:40,399][05610] Num frames 9300...
[2023-02-25 20:31:40,510][05610] Num frames 9400...
[2023-02-25 20:31:40,637][05610] Num frames 9500...
[2023-02-25 20:31:40,750][05610] Num frames 9600...
[2023-02-25 20:31:40,860][05610] Num frames 9700...
[2023-02-25 20:31:40,978][05610] Num frames 9800...
[2023-02-25 20:31:41,093][05610] Num frames 9900...
[2023-02-25 20:31:41,207][05610] Num frames 10000...
[2023-02-25 20:31:41,335][05610] Num frames 10100...
[2023-02-25 20:31:41,458][05610] Num frames 10200...
[2023-02-25 20:31:41,586][05610] Num frames 10300...
[2023-02-25 20:31:41,707][05610] Num frames 10400...
[2023-02-25 20:31:41,824][05610] Num frames 10500...
[2023-02-25 20:31:41,956][05610] Avg episode rewards: #0: 25.665, true rewards: #0: 10.565
[2023-02-25 20:31:41,958][05610] Avg episode reward: 25.665, avg true_objective: 10.565
[2023-02-25 20:32:44,791][05610] Replay video saved to /content/train_dir/default_experiment/replay.mp4!