[2023-12-28 17:31:27,010][00255] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-12-28 17:31:27,016][00255] Rollout worker 0 uses device cpu
[2023-12-28 17:31:27,018][00255] Rollout worker 1 uses device cpu
[2023-12-28 17:31:27,020][00255] Rollout worker 2 uses device cpu
[2023-12-28 17:31:27,021][00255] Rollout worker 3 uses device cpu
[2023-12-28 17:31:27,024][00255] Rollout worker 4 uses device cpu
[2023-12-28 17:31:27,027][00255] Rollout worker 5 uses device cpu
[2023-12-28 17:31:27,037][00255] Rollout worker 6 uses device cpu
[2023-12-28 17:31:27,049][00255] Rollout worker 7 uses device cpu
[2023-12-28 17:31:27,253][00255] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-12-28 17:31:27,259][00255] InferenceWorker_p0-w0: min num requests: 2
[2023-12-28 17:31:27,307][00255] Starting all processes...
[2023-12-28 17:31:27,311][00255] Starting process learner_proc0
[2023-12-28 17:31:27,412][00255] Starting all processes...
[2023-12-28 17:31:27,552][00255] Starting process inference_proc0-0
[2023-12-28 17:31:27,553][00255] Starting process rollout_proc0
[2023-12-28 17:31:27,556][00255] Starting process rollout_proc1
[2023-12-28 17:31:27,556][00255] Starting process rollout_proc2
[2023-12-28 17:31:27,557][00255] Starting process rollout_proc3
[2023-12-28 17:31:27,559][00255] Starting process rollout_proc4
[2023-12-28 17:31:27,559][00255] Starting process rollout_proc5
[2023-12-28 17:31:27,559][00255] Starting process rollout_proc6
[2023-12-28 17:31:27,559][00255] Starting process rollout_proc7
[2023-12-28 17:31:47,869][00812] Worker 3 uses CPU cores [1]
[2023-12-28 17:31:47,865][00795] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-12-28 17:31:47,870][00795] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-12-28 17:31:47,974][00795] Num visible devices: 1
[2023-12-28 17:31:48,003][00795] Starting seed is not provided
[2023-12-28 17:31:48,003][00795] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-12-28 17:31:48,004][00795] Initializing actor-critic model on device cuda:0
[2023-12-28 17:31:48,005][00795] RunningMeanStd input shape: (3, 72, 128)
[2023-12-28 17:31:48,007][00255] Heartbeat connected on Batcher_0
[2023-12-28 17:31:48,013][00795] RunningMeanStd input shape: (1,)
[2023-12-28 17:31:48,032][00255] Heartbeat connected on RolloutWorker_w3
[2023-12-28 17:31:48,157][00795] ConvEncoder: input_channels=3
[2023-12-28 17:31:48,213][00816] Worker 7 uses CPU cores [1]
[2023-12-28 17:31:48,271][00813] Worker 4 uses CPU cores [0]
[2023-12-28 17:31:48,318][00255] Heartbeat connected on RolloutWorker_w7
[2023-12-28 17:31:48,356][00814] Worker 5 uses CPU cores [1]
[2023-12-28 17:31:48,382][00255] Heartbeat connected on RolloutWorker_w5
[2023-12-28 17:31:48,494][00255] Heartbeat connected on RolloutWorker_w4
[2023-12-28 17:31:48,636][00810] Worker 0 uses CPU cores [0]
[2023-12-28 17:31:48,673][00811] Worker 2 uses CPU cores [0]
[2023-12-28 17:31:48,695][00808] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-12-28 17:31:48,697][00808] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-12-28 17:31:48,728][00255] Heartbeat connected on RolloutWorker_w0
[2023-12-28 17:31:48,750][00808] Num visible devices: 1
[2023-12-28 17:31:48,765][00255] Heartbeat connected on RolloutWorker_w2
[2023-12-28 17:31:48,781][00255] Heartbeat connected on InferenceWorker_p0-w0
[2023-12-28 17:31:48,826][00815] Worker 6 uses CPU cores [0]
[2023-12-28 17:31:48,846][00809] Worker 1 uses CPU cores [1]
[2023-12-28 17:31:48,861][00255] Heartbeat connected on RolloutWorker_w6
[2023-12-28 17:31:48,882][00255] Heartbeat connected on RolloutWorker_w1
[2023-12-28 17:31:48,904][00795] Conv encoder output size: 512
[2023-12-28 17:31:48,904][00795] Policy head output size: 512
[2023-12-28 17:31:48,964][00795] Created Actor Critic model with architecture:
[2023-12-28 17:31:48,964][00795] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-12-28 17:31:49,409][00795] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-12-28 17:31:51,109][00795] No checkpoints found
[2023-12-28 17:31:51,109][00795] Did not load from checkpoint, starting from scratch!
[2023-12-28 17:31:51,109][00795] Initialized policy 0 weights for model version 0
[2023-12-28 17:31:51,114][00795] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-12-28 17:31:51,122][00795] LearnerWorker_p0 finished initialization!
[2023-12-28 17:31:51,126][00255] Heartbeat connected on LearnerWorker_p0
[2023-12-28 17:31:51,359][00808] RunningMeanStd input shape: (3, 72, 128)
[2023-12-28 17:31:51,360][00808] RunningMeanStd input shape: (1,)
[2023-12-28 17:31:51,382][00808] ConvEncoder: input_channels=3
[2023-12-28 17:31:51,546][00808] Conv encoder output size: 512
[2023-12-28 17:31:51,546][00808] Policy head output size: 512
[2023-12-28 17:31:51,645][00255] Inference worker 0-0 is ready!
[2023-12-28 17:31:51,647][00255] All inference workers are ready! Signal rollout workers to start!
[2023-12-28 17:31:51,925][00814] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,928][00816] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,929][00812] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,935][00809] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,957][00810] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,960][00813] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,967][00811] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:51,984][00815] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:31:53,466][00809] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,465][00812] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,468][00814] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,596][00811] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,598][00813] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,605][00810] Decorrelating experience for 0 frames...
[2023-12-28 17:31:53,935][00255] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-12-28 17:31:54,279][00812] Decorrelating experience for 32 frames...
[2023-12-28 17:31:54,284][00814] Decorrelating experience for 32 frames...
[2023-12-28 17:31:54,316][00815] Decorrelating experience for 0 frames...
[2023-12-28 17:31:54,376][00811] Decorrelating experience for 32 frames...
[2023-12-28 17:31:55,194][00815] Decorrelating experience for 32 frames...
[2023-12-28 17:31:55,278][00813] Decorrelating experience for 32 frames...
[2023-12-28 17:31:55,716][00809] Decorrelating experience for 32 frames...
[2023-12-28 17:31:55,731][00816] Decorrelating experience for 0 frames...
[2023-12-28 17:31:56,047][00812] Decorrelating experience for 64 frames...
[2023-12-28 17:31:56,431][00815] Decorrelating experience for 64 frames...
[2023-12-28 17:31:56,446][00814] Decorrelating experience for 64 frames...
[2023-12-28 17:31:56,562][00813] Decorrelating experience for 64 frames...
[2023-12-28 17:31:56,952][00811] Decorrelating experience for 64 frames...
[2023-12-28 17:31:57,240][00812] Decorrelating experience for 96 frames...
[2023-12-28 17:31:57,770][00815] Decorrelating experience for 96 frames...
[2023-12-28 17:31:57,907][00813] Decorrelating experience for 96 frames...
[2023-12-28 17:31:57,996][00816] Decorrelating experience for 32 frames...
[2023-12-28 17:31:58,173][00811] Decorrelating experience for 96 frames...
[2023-12-28 17:31:58,312][00809] Decorrelating experience for 64 frames...
[2023-12-28 17:31:58,593][00814] Decorrelating experience for 96 frames...
[2023-12-28 17:31:58,935][00255] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-12-28 17:31:59,116][00809] Decorrelating experience for 96 frames...
[2023-12-28 17:31:59,344][00816] Decorrelating experience for 64 frames...
[2023-12-28 17:32:01,836][00810] Decorrelating experience for 32 frames...
[2023-12-28 17:32:02,567][00816] Decorrelating experience for 96 frames...
[2023-12-28 17:32:03,935][00255] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 214.2. Samples: 2142. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-12-28 17:32:03,953][00255] Avg episode reward: [(0, '1.945')]
[2023-12-28 17:32:05,840][00795] Signal inference workers to stop experience collection...
[2023-12-28 17:32:05,879][00808] InferenceWorker_p0-w0: stopping experience collection
[2023-12-28 17:32:06,217][00810] Decorrelating experience for 64 frames...
[2023-12-28 17:32:07,605][00810] Decorrelating experience for 96 frames...
[2023-12-28 17:32:08,521][00795] Signal inference workers to resume experience collection...
[2023-12-28 17:32:08,522][00808] InferenceWorker_p0-w0: resuming experience collection
[2023-12-28 17:32:08,937][00255] Fps is (10 sec: 409.5, 60 sec: 273.0, 300 sec: 273.0). Total num frames: 4096. Throughput: 0: 182.9. Samples: 2744. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-12-28 17:32:08,939][00255] Avg episode reward: [(0, '2.653')]
[2023-12-28 17:32:13,935][00255] Fps is (10 sec: 2048.0, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 20480. Throughput: 0: 228.9. Samples: 4578. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:32:13,937][00255] Avg episode reward: [(0, '3.399')]
[2023-12-28 17:32:18,866][00808] Updated weights for policy 0, policy_version 10 (0.0884)
[2023-12-28 17:32:18,935][00255] Fps is (10 sec: 3687.1, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 40960. Throughput: 0: 425.3. Samples: 10632. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:32:18,941][00255] Avg episode reward: [(0, '3.953')]
[2023-12-28 17:32:23,935][00255] Fps is (10 sec: 3686.4, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 57344. Throughput: 0: 459.8. Samples: 13794. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:32:23,938][00255] Avg episode reward: [(0, '4.529')]
[2023-12-28 17:32:28,935][00255] Fps is (10 sec: 2867.2, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 514.6. Samples: 18010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:32:28,941][00255] Avg episode reward: [(0, '4.510')]
[2023-12-28 17:32:33,935][00255] Fps is (10 sec: 2048.0, 60 sec: 1945.6, 300 sec: 1945.6). Total num frames: 77824. Throughput: 0: 518.0. Samples: 20722. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:32:33,938][00255] Avg episode reward: [(0, '4.379')]
[2023-12-28 17:32:34,542][00808] Updated weights for policy 0, policy_version 20 (0.0024)
[2023-12-28 17:32:38,935][00255] Fps is (10 sec: 2457.6, 60 sec: 2093.5, 300 sec: 2093.5). Total num frames: 94208. Throughput: 0: 496.4. Samples: 22340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:32:38,940][00255] Avg episode reward: [(0, '4.095')]
[2023-12-28 17:32:43,935][00255] Fps is (10 sec: 3276.8, 60 sec: 2211.8, 300 sec: 2211.8). Total num frames: 110592. Throughput: 0: 617.1. Samples: 27768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:32:43,941][00255] Avg episode reward: [(0, '4.110')]
[2023-12-28 17:32:43,943][00795] Saving new best policy, reward=4.110!
[2023-12-28 17:32:48,017][00808] Updated weights for policy 0, policy_version 30 (0.0021)
[2023-12-28 17:32:48,935][00255] Fps is (10 sec: 2867.2, 60 sec: 2234.2, 300 sec: 2234.2). Total num frames: 122880. Throughput: 0: 655.2. Samples: 31624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:32:48,941][00255] Avg episode reward: [(0, '4.277')]
[2023-12-28 17:32:48,964][00795] Saving new best policy, reward=4.277!
[2023-12-28 17:32:53,935][00255] Fps is (10 sec: 2457.6, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 135168. Throughput: 0: 674.7. Samples: 33102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:32:53,942][00255] Avg episode reward: [(0, '4.432')]
[2023-12-28 17:32:53,945][00795] Saving new best policy, reward=4.432!
[2023-12-28 17:32:58,935][00255] Fps is (10 sec: 2867.2, 60 sec: 2525.9, 300 sec: 2331.6). Total num frames: 151552. Throughput: 0: 737.8. Samples: 37778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:32:58,941][00255] Avg episode reward: [(0, '4.500')]
[2023-12-28 17:32:58,950][00795] Saving new best policy, reward=4.500!
[2023-12-28 17:33:01,441][00808] Updated weights for policy 0, policy_version 40 (0.0033)
[2023-12-28 17:33:03,935][00255] Fps is (10 sec: 3276.8, 60 sec: 2798.9, 300 sec: 2399.1). Total num frames: 167936. Throughput: 0: 726.4. Samples: 43322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:33:03,949][00255] Avg episode reward: [(0, '4.401')]
[2023-12-28 17:33:08,937][00255] Fps is (10 sec: 2866.7, 60 sec: 2935.5, 300 sec: 2402.9). Total num frames: 180224. Throughput: 0: 691.0. Samples: 44890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:33:08,946][00255] Avg episode reward: [(0, '4.304')]
[2023-12-28 17:33:13,935][00255] Fps is (10 sec: 2457.6, 60 sec: 2867.2, 300 sec: 2406.4). Total num frames: 192512. Throughput: 0: 679.8. Samples: 48602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:33:13,938][00255] Avg episode reward: [(0, '4.316')]
[2023-12-28 17:33:16,027][00808] Updated weights for policy 0, policy_version 50 (0.0035)
[2023-12-28 17:33:18,935][00255] Fps is (10 sec: 3277.3, 60 sec: 2867.2, 300 sec: 2505.8). Total num frames: 212992. Throughput: 0: 744.7. Samples: 54234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:33:18,941][00255] Avg episode reward: [(0, '4.347')]
[2023-12-28 17:33:18,953][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000052_212992.pth...
[2023-12-28 17:33:23,937][00255] Fps is (10 sec: 3685.7, 60 sec: 2867.1, 300 sec: 2548.6). Total num frames: 229376. Throughput: 0: 771.2. Samples: 57046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:33:23,939][00255] Avg episode reward: [(0, '4.473')]
[2023-12-28 17:33:28,935][00255] Fps is (10 sec: 2867.2, 60 sec: 2867.2, 300 sec: 2543.8). Total num frames: 241664. Throughput: 0: 724.8. Samples: 60384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:33:28,943][00255] Avg episode reward: [(0, '4.488')]
[2023-12-28 17:33:30,045][00808] Updated weights for policy 0, policy_version 60 (0.0018)
[2023-12-28 17:33:33,935][00255] Fps is (10 sec: 2458.1, 60 sec: 2935.5, 300 sec: 2539.5). Total num frames: 253952. Throughput: 0: 724.4. Samples: 64224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:33:33,937][00255] Avg episode reward: [(0, '4.568')]
[2023-12-28 17:33:33,943][00795] Saving new best policy, reward=4.568!
[2023-12-28 17:33:38,935][00255] Fps is (10 sec: 3276.7, 60 sec: 3003.7, 300 sec: 2613.6). Total num frames: 274432. Throughput: 0: 762.2. Samples: 67402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:33:38,937][00255] Avg episode reward: [(0, '4.420')]
[2023-12-28 17:33:41,037][00808] Updated weights for policy 0, policy_version 70 (0.0022)
[2023-12-28 17:33:43,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3072.0, 300 sec: 2681.0). Total num frames: 294912. Throughput: 0: 802.7. Samples: 73900. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:33:43,938][00255] Avg episode reward: [(0, '4.400')]
[2023-12-28 17:33:48,936][00255] Fps is (10 sec: 3686.0, 60 sec: 3140.2, 300 sec: 2706.9). Total num frames: 311296. Throughput: 0: 776.2. Samples: 78254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:33:48,944][00255] Avg episode reward: [(0, '4.423')]
[2023-12-28 17:33:53,935][00255] Fps is (10 sec: 2867.1, 60 sec: 3140.3, 300 sec: 2696.5). Total num frames: 323584. Throughput: 0: 786.0. Samples: 80258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:33:53,943][00255] Avg episode reward: [(0, '4.367')]
[2023-12-28 17:33:54,498][00808] Updated weights for policy 0, policy_version 80 (0.0018)
[2023-12-28 17:33:58,935][00255] Fps is (10 sec: 3277.2, 60 sec: 3208.5, 300 sec: 2752.5). Total num frames: 344064. Throughput: 0: 826.8. Samples: 85810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:33:58,938][00255] Avg episode reward: [(0, '4.408')]
[2023-12-28 17:34:03,935][00255] Fps is (10 sec: 4096.1, 60 sec: 3276.8, 300 sec: 2804.2). Total num frames: 364544. Throughput: 0: 837.5. Samples: 91922. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:34:03,941][00255] Avg episode reward: [(0, '4.427')]
[2023-12-28 17:34:04,877][00808] Updated weights for policy 0, policy_version 90 (0.0021)
[2023-12-28 17:34:08,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 2791.3). Total num frames: 376832. Throughput: 0: 817.9. Samples: 93852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:08,937][00255] Avg episode reward: [(0, '4.482')]
[2023-12-28 17:34:13,936][00255] Fps is (10 sec: 2457.3, 60 sec: 3276.7, 300 sec: 2779.4). Total num frames: 389120. Throughput: 0: 831.8. Samples: 97814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:13,939][00255] Avg episode reward: [(0, '4.553')]
[2023-12-28 17:34:18,006][00808] Updated weights for policy 0, policy_version 100 (0.0014)
[2023-12-28 17:34:18,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 2824.8). Total num frames: 409600. Throughput: 0: 874.2. Samples: 103564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:34:18,938][00255] Avg episode reward: [(0, '4.406')]
[2023-12-28 17:34:23,935][00255] Fps is (10 sec: 4096.5, 60 sec: 3345.2, 300 sec: 2867.2). Total num frames: 430080. Throughput: 0: 871.8. Samples: 106632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:23,937][00255] Avg episode reward: [(0, '4.315')]
[2023-12-28 17:34:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2854.0). Total num frames: 442368. Throughput: 0: 830.9. Samples: 111292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:28,942][00255] Avg episode reward: [(0, '4.521')]
[2023-12-28 17:34:30,619][00808] Updated weights for policy 0, policy_version 110 (0.0028)
[2023-12-28 17:34:33,936][00255] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 2867.2). Total num frames: 458752. Throughput: 0: 824.9. Samples: 115374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:34:33,940][00255] Avg episode reward: [(0, '4.603')]
[2023-12-28 17:34:33,943][00795] Saving new best policy, reward=4.603!
[2023-12-28 17:34:38,935][00255] Fps is (10 sec: 3276.7, 60 sec: 3345.1, 300 sec: 2879.6). Total num frames: 475136. Throughput: 0: 839.0. Samples: 118014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:34:38,937][00255] Avg episode reward: [(0, '4.530')]
[2023-12-28 17:34:42,564][00808] Updated weights for policy 0, policy_version 120 (0.0053)
[2023-12-28 17:34:43,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 2915.4). Total num frames: 495616. Throughput: 0: 838.4. Samples: 123540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:34:43,940][00255] Avg episode reward: [(0, '4.450')]
[2023-12-28 17:34:48,935][00255] Fps is (10 sec: 3276.9, 60 sec: 3276.9, 300 sec: 2902.3). Total num frames: 507904. Throughput: 0: 800.4. Samples: 127938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:34:48,938][00255] Avg episode reward: [(0, '4.428')]
[2023-12-28 17:34:53,938][00255] Fps is (10 sec: 2456.9, 60 sec: 3276.7, 300 sec: 2889.9). Total num frames: 520192. Throughput: 0: 797.8. Samples: 129756. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:53,944][00255] Avg episode reward: [(0, '4.577')]
[2023-12-28 17:34:57,377][00808] Updated weights for policy 0, policy_version 130 (0.0025)
[2023-12-28 17:34:58,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2900.4). Total num frames: 536576. Throughput: 0: 802.6. Samples: 133930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:34:58,942][00255] Avg episode reward: [(0, '4.578')]
[2023-12-28 17:35:03,935][00255] Fps is (10 sec: 3687.5, 60 sec: 3208.5, 300 sec: 2931.9). Total num frames: 557056. Throughput: 0: 803.8. Samples: 139736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:35:03,941][00255] Avg episode reward: [(0, '4.493')]
[2023-12-28 17:35:08,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2919.7). Total num frames: 569344. Throughput: 0: 788.4. Samples: 142108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:35:08,939][00255] Avg episode reward: [(0, '4.485')]
[2023-12-28 17:35:09,565][00808] Updated weights for policy 0, policy_version 140 (0.0018)
[2023-12-28 17:35:13,940][00255] Fps is (10 sec: 2456.4, 60 sec: 3208.3, 300 sec: 2908.1). Total num frames: 581632. Throughput: 0: 767.9. Samples: 145850. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:35:13,943][00255] Avg episode reward: [(0, '4.533')]
[2023-12-28 17:35:18,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2937.1). Total num frames: 602112. Throughput: 0: 788.8. Samples: 150870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:35:18,941][00255] Avg episode reward: [(0, '4.564')]
[2023-12-28 17:35:18,953][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000147_602112.pth...
[2023-12-28 17:35:21,914][00808] Updated weights for policy 0, policy_version 150 (0.0013)
[2023-12-28 17:35:23,935][00255] Fps is (10 sec: 4098.0, 60 sec: 3208.5, 300 sec: 2964.7). Total num frames: 622592. Throughput: 0: 799.7. Samples: 154002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:35:23,937][00255] Avg episode reward: [(0, '4.625')]
[2023-12-28 17:35:23,940][00795] Saving new best policy, reward=4.625!
[2023-12-28 17:35:28,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 2972.0). Total num frames: 638976. Throughput: 0: 803.5. Samples: 159698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:35:28,938][00255] Avg episode reward: [(0, '4.690')]
[2023-12-28 17:35:28,953][00795] Saving new best policy, reward=4.690!
[2023-12-28 17:35:33,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2960.3). Total num frames: 651264. Throughput: 0: 791.2. Samples: 163542. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:35:33,937][00255] Avg episode reward: [(0, '4.687')]
[2023-12-28 17:35:34,793][00808] Updated weights for policy 0, policy_version 160 (0.0018)
[2023-12-28 17:35:38,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2967.3). Total num frames: 667648. Throughput: 0: 796.5. Samples: 165594. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:35:38,938][00255] Avg episode reward: [(0, '4.569')]
[2023-12-28 17:35:43,936][00255] Fps is (10 sec: 3686.3, 60 sec: 3208.5, 300 sec: 2991.9). Total num frames: 688128. Throughput: 0: 843.3. Samples: 171880. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:35:43,943][00255] Avg episode reward: [(0, '4.509')]
[2023-12-28 17:35:45,898][00808] Updated weights for policy 0, policy_version 170 (0.0015)
[2023-12-28 17:35:48,939][00255] Fps is (10 sec: 3685.0, 60 sec: 3276.6, 300 sec: 2997.9). Total num frames: 704512. Throughput: 0: 822.5. Samples: 176752. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:35:48,948][00255] Avg episode reward: [(0, '4.452')]
[2023-12-28 17:35:53,938][00255] Fps is (10 sec: 2866.6, 60 sec: 3276.8, 300 sec: 2986.6). Total num frames: 716800. Throughput: 0: 812.0. Samples: 178650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:35:53,945][00255] Avg episode reward: [(0, '4.412')]
[2023-12-28 17:35:58,935][00255] Fps is (10 sec: 2868.3, 60 sec: 3276.8, 300 sec: 2992.6). Total num frames: 733184. Throughput: 0: 819.6. Samples: 182726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:35:58,942][00255] Avg episode reward: [(0, '4.501')]
[2023-12-28 17:35:59,898][00808] Updated weights for policy 0, policy_version 180 (0.0015)
[2023-12-28 17:36:03,935][00255] Fps is (10 sec: 3277.6, 60 sec: 3208.5, 300 sec: 2998.3). Total num frames: 749568. Throughput: 0: 839.0. Samples: 188626. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:36:03,942][00255] Avg episode reward: [(0, '4.831')]
[2023-12-28 17:36:03,969][00795] Saving new best policy, reward=4.831!
[2023-12-28 17:36:08,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3003.7). Total num frames: 765952. Throughput: 0: 833.6. Samples: 191516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:36:08,941][00255] Avg episode reward: [(0, '4.986')]
[2023-12-28 17:36:09,012][00795] Saving new best policy, reward=4.986!
[2023-12-28 17:36:12,340][00808] Updated weights for policy 0, policy_version 190 (0.0014)
[2023-12-28 17:36:13,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3277.1, 300 sec: 2993.2). Total num frames: 778240. Throughput: 0: 787.7. Samples: 195144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:36:13,937][00255] Avg episode reward: [(0, '4.966')]
[2023-12-28 17:36:18,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2998.6). Total num frames: 794624. Throughput: 0: 794.3. Samples: 199286. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:36:18,937][00255] Avg episode reward: [(0, '5.108')]
[2023-12-28 17:36:18,950][00795] Saving new best policy, reward=5.108!
[2023-12-28 17:36:23,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3003.7). Total num frames: 811008. Throughput: 0: 812.1. Samples: 202140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:36:23,938][00255] Avg episode reward: [(0, '5.069')]
[2023-12-28 17:36:24,996][00808] Updated weights for policy 0, policy_version 200 (0.0030)
[2023-12-28 17:36:28,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3023.6). Total num frames: 831488. Throughput: 0: 800.1. Samples: 207884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:36:28,938][00255] Avg episode reward: [(0, '5.128')]
[2023-12-28 17:36:28,947][00795] Saving new best policy, reward=5.128!
[2023-12-28 17:36:33,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3013.5). Total num frames: 843776. Throughput: 0: 775.7. Samples: 211656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:36:33,937][00255] Avg episode reward: [(0, '4.889')]
[2023-12-28 17:36:38,862][00808] Updated weights for policy 0, policy_version 210 (0.0024)
[2023-12-28 17:36:38,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3018.1). Total num frames: 860160. Throughput: 0: 777.0. Samples: 213614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:36:38,938][00255] Avg episode reward: [(0, '5.165')]
[2023-12-28 17:36:38,949][00795] Saving new best policy, reward=5.165!
[2023-12-28 17:36:43,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3208.6, 300 sec: 3036.7). Total num frames: 880640. Throughput: 0: 815.9. Samples: 219442. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:36:43,942][00255] Avg episode reward: [(0, '5.363')]
[2023-12-28 17:36:43,945][00795] Saving new best policy, reward=5.363!
[2023-12-28 17:36:48,937][00255] Fps is (10 sec: 3276.2, 60 sec: 3140.4, 300 sec: 3026.9). Total num frames: 892928. Throughput: 0: 792.0. Samples: 224266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:36:48,944][00255] Avg episode reward: [(0, '5.371')]
[2023-12-28 17:36:48,955][00795] Saving new best policy, reward=5.371!
[2023-12-28 17:36:52,018][00808] Updated weights for policy 0, policy_version 220 (0.0013)
[2023-12-28 17:36:53,935][00255] Fps is (10 sec: 2048.0, 60 sec: 3072.1, 300 sec: 3054.6). Total num frames: 901120. Throughput: 0: 758.8. Samples: 225660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:36:53,938][00255] Avg episode reward: [(0, '5.257')]
[2023-12-28 17:36:58,935][00255] Fps is (10 sec: 2048.4, 60 sec: 3003.7, 300 sec: 3096.3). Total num frames: 913408. Throughput: 0: 745.5. Samples: 228690. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:36:58,938][00255] Avg episode reward: [(0, '5.238')]
[2023-12-28 17:37:03,935][00255] Fps is (10 sec: 2457.6, 60 sec: 2935.5, 300 sec: 3124.1). Total num frames: 925696. Throughput: 0: 734.0. Samples: 232314. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:37:03,938][00255] Avg episode reward: [(0, '5.014')]
[2023-12-28 17:37:07,125][00808] Updated weights for policy 0, policy_version 230 (0.0032)
[2023-12-28 17:37:08,936][00255] Fps is (10 sec: 3276.7, 60 sec: 3003.7, 300 sec: 3137.9). Total num frames: 946176. Throughput: 0: 739.2. Samples: 235402. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-12-28 17:37:08,937][00255] Avg episode reward: [(0, '4.976')]
[2023-12-28 17:37:13,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 3138.0). Total num frames: 966656. Throughput: 0: 747.6. Samples: 241526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:37:13,939][00255] Avg episode reward: [(0, '5.306')]
[2023-12-28 17:37:18,935][00255] Fps is (10 sec: 3276.9, 60 sec: 3072.0, 300 sec: 3124.1). Total num frames: 978944. Throughput: 0: 748.4. Samples: 245336. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:37:18,942][00255] Avg episode reward: [(0, '5.316')]
[2023-12-28 17:37:18,956][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000239_978944.pth...
[2023-12-28 17:37:19,119][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000052_212992.pth
[2023-12-28 17:37:20,345][00808] Updated weights for policy 0, policy_version 240 (0.0034)
[2023-12-28 17:37:23,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3003.7, 300 sec: 3124.1). Total num frames: 991232. Throughput: 0: 745.2. Samples: 247146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:37:23,942][00255] Avg episode reward: [(0, '5.385')]
[2023-12-28 17:37:23,945][00795] Saving new best policy, reward=5.385!
[2023-12-28 17:37:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3003.7, 300 sec: 3165.7). Total num frames: 1011712. Throughput: 0: 735.7. Samples: 252550. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:37:28,940][00255] Avg episode reward: [(0, '5.108')]
[2023-12-28 17:37:31,459][00808] Updated weights for policy 0, policy_version 250 (0.0028)
[2023-12-28 17:37:33,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 3179.6). Total num frames: 1032192. Throughput: 0: 761.9. Samples: 258552. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:37:33,938][00255] Avg episode reward: [(0, '5.271')]
[2023-12-28 17:37:38,935][00255] Fps is (10 sec: 3276.9, 60 sec: 3072.0, 300 sec: 3165.7). Total num frames: 1044480. Throughput: 0: 774.4. Samples: 260508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:37:38,937][00255] Avg episode reward: [(0, '5.469')]
[2023-12-28 17:37:38,947][00795] Saving new best policy, reward=5.469!
[2023-12-28 17:37:43,935][00255] Fps is (10 sec: 2457.6, 60 sec: 2935.5, 300 sec: 3165.7). Total num frames: 1056768. Throughput: 0: 793.3. Samples: 264388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:37:43,940][00255] Avg episode reward: [(0, '5.549')]
[2023-12-28 17:37:43,943][00795] Saving new best policy, reward=5.549!
[2023-12-28 17:37:45,448][00808] Updated weights for policy 0, policy_version 260 (0.0026)
[2023-12-28 17:37:48,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3072.1, 300 sec: 3193.5). Total num frames: 1077248. Throughput: 0: 839.0. Samples: 270068. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:37:48,938][00255] Avg episode reward: [(0, '5.491')]
[2023-12-28 17:37:53,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3207.4). Total num frames: 1097728. Throughput: 0: 839.1. Samples: 273162. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:37:53,938][00255] Avg episode reward: [(0, '5.789')]
[2023-12-28 17:37:53,940][00795] Saving new best policy, reward=5.789!
[2023-12-28 17:37:56,313][00808] Updated weights for policy 0, policy_version 270 (0.0017)
[2023-12-28 17:37:58,939][00255] Fps is (10 sec: 3275.5, 60 sec: 3276.6, 300 sec: 3193.4). Total num frames: 1110016. Throughput: 0: 804.6. Samples: 277736. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:37:58,947][00255] Avg episode reward: [(0, '5.489')]
[2023-12-28 17:38:03,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3193.5). Total num frames: 1122304. Throughput: 0: 808.8. Samples: 281730. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:38:03,937][00255] Avg episode reward: [(0, '5.506')]
[2023-12-28 17:38:08,777][00808] Updated weights for policy 0, policy_version 280 (0.0019)
[2023-12-28 17:38:08,935][00255] Fps is (10 sec: 3687.9, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1146880. Throughput: 0: 834.6. Samples: 284702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:38:08,943][00255] Avg episode reward: [(0, '5.721')]
[2023-12-28 17:38:13,935][00255] Fps is (10 sec: 4505.6, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1167360. Throughput: 0: 862.3. Samples: 291352. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:38:13,941][00255] Avg episode reward: [(0, '5.902')]
[2023-12-28 17:38:13,944][00795] Saving new best policy, reward=5.902!
[2023-12-28 17:38:18,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 1179648. Throughput: 0: 835.8. Samples: 296164. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:38:18,939][00255] Avg episode reward: [(0, '5.784')]
[2023-12-28 17:38:20,884][00808] Updated weights for policy 0, policy_version 290 (0.0023)
[2023-12-28 17:38:23,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 1196032. Throughput: 0: 838.7. Samples: 298248. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-12-28 17:38:23,939][00255] Avg episode reward: [(0, '5.872')]
[2023-12-28 17:38:28,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1216512. Throughput: 0: 869.8. Samples: 303530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:38:28,944][00255] Avg episode reward: [(0, '5.842')]
[2023-12-28 17:38:31,626][00808] Updated weights for policy 0, policy_version 300 (0.0021)
[2023-12-28 17:38:33,937][00255] Fps is (10 sec: 4095.1, 60 sec: 3413.2, 300 sec: 3262.9). Total num frames: 1236992. Throughput: 0: 884.3. Samples: 309864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:38:33,943][00255] Avg episode reward: [(0, '5.829')]
[2023-12-28 17:38:38,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 1249280. Throughput: 0: 866.3. Samples: 312144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:38:38,941][00255] Avg episode reward: [(0, '5.993')]
[2023-12-28 17:38:38,952][00795] Saving new best policy, reward=5.993!
[2023-12-28 17:38:43,935][00255] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3235.2). Total num frames: 1265664. Throughput: 0: 850.3. Samples: 315998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:38:43,937][00255] Avg episode reward: [(0, '6.367')]
[2023-12-28 17:38:43,942][00795] Saving new best policy, reward=6.367!
[2023-12-28 17:38:45,490][00808] Updated weights for policy 0, policy_version 310 (0.0023)
[2023-12-28 17:38:48,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 1282048. Throughput: 0: 879.0. Samples: 321284. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:38:48,938][00255] Avg episode reward: [(0, '6.376')]
[2023-12-28 17:38:48,955][00795] Saving new best policy, reward=6.376!
[2023-12-28 17:38:53,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 1302528. Throughput: 0: 881.7. Samples: 324378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:38:53,940][00255] Avg episode reward: [(0, '6.268')]
[2023-12-28 17:38:56,075][00808] Updated weights for policy 0, policy_version 320 (0.0024)
[2023-12-28 17:38:58,936][00255] Fps is (10 sec: 3276.5, 60 sec: 3413.5, 300 sec: 3221.2). Total num frames: 1314816. Throughput: 0: 843.0. Samples: 329290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:38:58,938][00255] Avg episode reward: [(0, '6.170')]
[2023-12-28 17:39:03,942][00255] Fps is (10 sec: 2455.9, 60 sec: 3412.9, 300 sec: 3221.2). Total num frames: 1327104. Throughput: 0: 816.9. Samples: 332928. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:39:03,948][00255] Avg episode reward: [(0, '6.059')]
[2023-12-28 17:39:08,935][00255] Fps is (10 sec: 2867.5, 60 sec: 3276.8, 300 sec: 3235.2). Total num frames: 1343488. Throughput: 0: 821.8. Samples: 335230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:39:08,938][00255] Avg episode reward: [(0, '6.254')]
[2023-12-28 17:39:09,988][00808] Updated weights for policy 0, policy_version 330 (0.0028)
[2023-12-28 17:39:13,936][00255] Fps is (10 sec: 4098.7, 60 sec: 3345.0, 300 sec: 3249.0). Total num frames: 1368064. Throughput: 0: 840.3. Samples: 341344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:39:13,940][00255] Avg episode reward: [(0, '6.382')]
[2023-12-28 17:39:13,946][00795] Saving new best policy, reward=6.382!
[2023-12-28 17:39:18,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 1380352. Throughput: 0: 808.3. Samples: 346234. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:39:18,938][00255] Avg episode reward: [(0, '6.331')]
[2023-12-28 17:39:18,953][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000337_1380352.pth...
[2023-12-28 17:39:19,114][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000147_602112.pth
[2023-12-28 17:39:22,390][00808] Updated weights for policy 0, policy_version 340 (0.0025)
[2023-12-28 17:39:23,935][00255] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1396736. Throughput: 0: 800.0. Samples: 348144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:39:23,942][00255] Avg episode reward: [(0, '6.163')]
[2023-12-28 17:39:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1413120. Throughput: 0: 821.6. Samples: 352972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:28,937][00255] Avg episode reward: [(0, '6.052')]
[2023-12-28 17:39:32,972][00808] Updated weights for policy 0, policy_version 350 (0.0027)
[2023-12-28 17:39:33,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 1437696. Throughput: 0: 853.7. Samples: 359702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:33,937][00255] Avg episode reward: [(0, '6.432')]
[2023-12-28 17:39:33,941][00795] Saving new best policy, reward=6.432!
[2023-12-28 17:39:38,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1449984. Throughput: 0: 847.2. Samples: 362504. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:38,940][00255] Avg episode reward: [(0, '6.955')]
[2023-12-28 17:39:38,955][00795] Saving new best policy, reward=6.955!
[2023-12-28 17:39:43,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1466368. Throughput: 0: 829.1. Samples: 366598. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:43,939][00255] Avg episode reward: [(0, '7.139')]
[2023-12-28 17:39:43,943][00795] Saving new best policy, reward=7.139!
[2023-12-28 17:39:46,482][00808] Updated weights for policy 0, policy_version 360 (0.0017)
[2023-12-28 17:39:48,936][00255] Fps is (10 sec: 3276.7, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 1482752. Throughput: 0: 859.1. Samples: 371580. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:48,941][00255] Avg episode reward: [(0, '7.324')]
[2023-12-28 17:39:48,952][00795] Saving new best policy, reward=7.324!
[2023-12-28 17:39:53,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 1503232. Throughput: 0: 877.6. Samples: 374724. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:39:53,941][00255] Avg episode reward: [(0, '7.307')]
[2023-12-28 17:39:56,215][00808] Updated weights for policy 0, policy_version 370 (0.0030)
[2023-12-28 17:39:58,938][00255] Fps is (10 sec: 3685.3, 60 sec: 3413.2, 300 sec: 3262.9). Total num frames: 1519616. Throughput: 0: 868.7. Samples: 380438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:39:58,941][00255] Avg episode reward: [(0, '8.075')]
[2023-12-28 17:39:58,958][00795] Saving new best policy, reward=8.075!
[2023-12-28 17:40:03,939][00255] Fps is (10 sec: 2866.1, 60 sec: 3413.5, 300 sec: 3262.9). Total num frames: 1531904. Throughput: 0: 847.5. Samples: 384374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:03,941][00255] Avg episode reward: [(0, '7.914')]
[2023-12-28 17:40:08,935][00255] Fps is (10 sec: 2868.1, 60 sec: 3413.3, 300 sec: 3276.9). Total num frames: 1548288. Throughput: 0: 848.1. Samples: 386308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:08,938][00255] Avg episode reward: [(0, '8.117')]
[2023-12-28 17:40:08,945][00795] Saving new best policy, reward=8.117!
[2023-12-28 17:40:10,287][00808] Updated weights for policy 0, policy_version 380 (0.0030)
[2023-12-28 17:40:13,938][00255] Fps is (10 sec: 3686.9, 60 sec: 3344.9, 300 sec: 3276.8). Total num frames: 1568768. Throughput: 0: 866.1. Samples: 391950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:13,943][00255] Avg episode reward: [(0, '8.500')]
[2023-12-28 17:40:13,948][00795] Saving new best policy, reward=8.500!
[2023-12-28 17:40:18,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1585152. Throughput: 0: 841.5. Samples: 397570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:18,938][00255] Avg episode reward: [(0, '8.645')]
[2023-12-28 17:40:18,957][00795] Saving new best policy, reward=8.645!
[2023-12-28 17:40:22,180][00808] Updated weights for policy 0, policy_version 390 (0.0023)
[2023-12-28 17:40:23,935][00255] Fps is (10 sec: 3277.6, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1601536. Throughput: 0: 823.1. Samples: 399544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:23,940][00255] Avg episode reward: [(0, '8.780')]
[2023-12-28 17:40:23,942][00795] Saving new best policy, reward=8.780!
[2023-12-28 17:40:28,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 1613824. Throughput: 0: 820.8. Samples: 403534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:40:28,943][00255] Avg episode reward: [(0, '8.315')]
[2023-12-28 17:40:33,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1634304. Throughput: 0: 846.1. Samples: 409656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:40:33,941][00255] Avg episode reward: [(0, '8.735')]
[2023-12-28 17:40:34,227][00808] Updated weights for policy 0, policy_version 400 (0.0025)
[2023-12-28 17:40:38,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 1654784. Throughput: 0: 844.4. Samples: 412720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:40:38,938][00255] Avg episode reward: [(0, '8.351')]
[2023-12-28 17:40:43,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3263.0). Total num frames: 1667072. Throughput: 0: 813.1. Samples: 417024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:40:43,942][00255] Avg episode reward: [(0, '8.824')]
[2023-12-28 17:40:43,949][00795] Saving new best policy, reward=8.824!
[2023-12-28 17:40:48,055][00808] Updated weights for policy 0, policy_version 410 (0.0033)
[2023-12-28 17:40:48,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 1679360. Throughput: 0: 817.5. Samples: 421158. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:40:48,941][00255] Avg episode reward: [(0, '8.400')]
[2023-12-28 17:40:53,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1699840. Throughput: 0: 840.7. Samples: 424138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:40:53,937][00255] Avg episode reward: [(0, '8.727')]
[2023-12-28 17:40:58,360][00808] Updated weights for policy 0, policy_version 420 (0.0031)
[2023-12-28 17:40:58,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3345.2, 300 sec: 3290.7). Total num frames: 1720320. Throughput: 0: 846.4. Samples: 430038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:40:58,938][00255] Avg episode reward: [(0, '8.620')]
[2023-12-28 17:41:03,938][00255] Fps is (10 sec: 3275.9, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 1732608. Throughput: 0: 808.0. Samples: 433932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:41:03,941][00255] Avg episode reward: [(0, '9.349')]
[2023-12-28 17:41:03,945][00795] Saving new best policy, reward=9.349!
[2023-12-28 17:41:08,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1744896. Throughput: 0: 805.0. Samples: 435768. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:41:08,937][00255] Avg episode reward: [(0, '9.561')]
[2023-12-28 17:41:08,950][00795] Saving new best policy, reward=9.561!
[2023-12-28 17:41:12,533][00808] Updated weights for policy 0, policy_version 430 (0.0013)
[2023-12-28 17:41:13,935][00255] Fps is (10 sec: 3277.7, 60 sec: 3276.9, 300 sec: 3290.7). Total num frames: 1765376. Throughput: 0: 831.9. Samples: 440970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:41:13,938][00255] Avg episode reward: [(0, '10.113')]
[2023-12-28 17:41:13,944][00795] Saving new best policy, reward=10.113!
[2023-12-28 17:41:18,937][00255] Fps is (10 sec: 3685.7, 60 sec: 3276.7, 300 sec: 3290.7). Total num frames: 1781760. Throughput: 0: 828.7. Samples: 446948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:41:18,946][00255] Avg episode reward: [(0, '11.030')]
[2023-12-28 17:41:18,959][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000436_1785856.pth...
[2023-12-28 17:41:19,126][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000239_978944.pth
[2023-12-28 17:41:19,150][00795] Saving new best policy, reward=11.030!
[2023-12-28 17:41:23,935][00255] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1798144. Throughput: 0: 798.7. Samples: 448662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:41:23,941][00255] Avg episode reward: [(0, '11.130')]
[2023-12-28 17:41:23,947][00795] Saving new best policy, reward=11.130!
[2023-12-28 17:41:25,628][00808] Updated weights for policy 0, policy_version 440 (0.0022)
[2023-12-28 17:41:28,935][00255] Fps is (10 sec: 2867.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 1810432. Throughput: 0: 786.4. Samples: 452412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:41:28,938][00255] Avg episode reward: [(0, '11.524')]
[2023-12-28 17:41:28,947][00795] Saving new best policy, reward=11.524!
[2023-12-28 17:41:33,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 1826816. Throughput: 0: 814.0. Samples: 457790. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:41:33,941][00255] Avg episode reward: [(0, '10.917')]
[2023-12-28 17:41:37,274][00808] Updated weights for policy 0, policy_version 450 (0.0032)
[2023-12-28 17:41:38,937][00255] Fps is (10 sec: 3685.8, 60 sec: 3208.4, 300 sec: 3276.8). Total num frames: 1847296. Throughput: 0: 810.4. Samples: 460606. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:41:38,939][00255] Avg episode reward: [(0, '10.977')]
[2023-12-28 17:41:43,938][00255] Fps is (10 sec: 3275.9, 60 sec: 3208.4, 300 sec: 3276.8). Total num frames: 1859584. Throughput: 0: 780.9. Samples: 465180. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:41:43,940][00255] Avg episode reward: [(0, '11.011')]
[2023-12-28 17:41:48,937][00255] Fps is (10 sec: 2457.6, 60 sec: 3208.4, 300 sec: 3290.7). Total num frames: 1871872. Throughput: 0: 777.8. Samples: 468932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:41:48,943][00255] Avg episode reward: [(0, '10.602')]
[2023-12-28 17:41:51,628][00808] Updated weights for policy 0, policy_version 460 (0.0043)
[2023-12-28 17:41:53,935][00255] Fps is (10 sec: 3277.7, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 1892352. Throughput: 0: 796.1. Samples: 471594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:41:53,938][00255] Avg episode reward: [(0, '11.219')]
[2023-12-28 17:41:58,935][00255] Fps is (10 sec: 4096.7, 60 sec: 3208.5, 300 sec: 3346.2). Total num frames: 1912832. Throughput: 0: 814.6. Samples: 477626. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:41:58,938][00255] Avg episode reward: [(0, '11.373')]
[2023-12-28 17:42:03,498][00808] Updated weights for policy 0, policy_version 470 (0.0040)
[2023-12-28 17:42:03,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3208.7, 300 sec: 3318.5). Total num frames: 1925120. Throughput: 0: 776.4. Samples: 481884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:42:03,941][00255] Avg episode reward: [(0, '12.285')]
[2023-12-28 17:42:03,943][00795] Saving new best policy, reward=12.285!
[2023-12-28 17:42:08,936][00255] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 3290.7). Total num frames: 1937408. Throughput: 0: 777.9. Samples: 483670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:42:08,941][00255] Avg episode reward: [(0, '12.658')]
[2023-12-28 17:42:08,954][00795] Saving new best policy, reward=12.658!
[2023-12-28 17:42:13,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3304.6). Total num frames: 1953792. Throughput: 0: 799.5. Samples: 488390. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:42:13,942][00255] Avg episode reward: [(0, '13.400')]
[2023-12-28 17:42:13,951][00795] Saving new best policy, reward=13.400!
[2023-12-28 17:42:16,660][00808] Updated weights for policy 0, policy_version 480 (0.0013)
[2023-12-28 17:42:18,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3208.6, 300 sec: 3332.3). Total num frames: 1974272. Throughput: 0: 804.8. Samples: 494008. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:42:18,943][00255] Avg episode reward: [(0, '13.395')]
[2023-12-28 17:42:23,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3304.6). Total num frames: 1986560. Throughput: 0: 792.3. Samples: 496256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:42:23,942][00255] Avg episode reward: [(0, '14.593')]
[2023-12-28 17:42:23,945][00795] Saving new best policy, reward=14.593!
[2023-12-28 17:42:28,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 1998848. Throughput: 0: 770.6. Samples: 499854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:42:28,939][00255] Avg episode reward: [(0, '14.443')]
[2023-12-28 17:42:31,063][00808] Updated weights for policy 0, policy_version 490 (0.0023)
[2023-12-28 17:42:33,942][00255] Fps is (10 sec: 2865.1, 60 sec: 3139.9, 300 sec: 3290.6). Total num frames: 2015232. Throughput: 0: 796.7. Samples: 504788. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:42:33,945][00255] Avg episode reward: [(0, '14.903')]
[2023-12-28 17:42:33,947][00795] Saving new best policy, reward=14.903!
[2023-12-28 17:42:38,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3140.4, 300 sec: 3318.5). Total num frames: 2035712. Throughput: 0: 800.6. Samples: 507622. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:42:38,943][00255] Avg episode reward: [(0, '15.509')]
[2023-12-28 17:42:38,958][00795] Saving new best policy, reward=15.509!
[2023-12-28 17:42:42,719][00808] Updated weights for policy 0, policy_version 500 (0.0029)
[2023-12-28 17:42:43,935][00255] Fps is (10 sec: 3279.2, 60 sec: 3140.4, 300 sec: 3290.7). Total num frames: 2048000. Throughput: 0: 778.6. Samples: 512662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:42:43,939][00255] Avg episode reward: [(0, '15.945')]
[2023-12-28 17:42:43,941][00795] Saving new best policy, reward=15.945!
[2023-12-28 17:42:48,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3262.9). Total num frames: 2060288. Throughput: 0: 763.9. Samples: 516258. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-12-28 17:42:48,940][00255] Avg episode reward: [(0, '16.270')]
[2023-12-28 17:42:48,949][00795] Saving new best policy, reward=16.270!
[2023-12-28 17:42:53,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3276.8). Total num frames: 2076672. Throughput: 0: 771.7. Samples: 518398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:42:53,938][00255] Avg episode reward: [(0, '16.805')]
[2023-12-28 17:42:54,020][00795] Saving new best policy, reward=16.805!
[2023-12-28 17:42:56,374][00808] Updated weights for policy 0, policy_version 510 (0.0016)
[2023-12-28 17:42:58,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3072.0, 300 sec: 3304.6). Total num frames: 2097152. Throughput: 0: 794.6. Samples: 524148. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:42:58,937][00255] Avg episode reward: [(0, '17.000')]
[2023-12-28 17:42:58,952][00795] Saving new best policy, reward=17.000!
[2023-12-28 17:43:03,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 2113536. Throughput: 0: 775.2. Samples: 528894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:43:03,940][00255] Avg episode reward: [(0, '16.913')]
[2023-12-28 17:43:08,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 2125824. Throughput: 0: 766.7. Samples: 530758. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:43:08,943][00255] Avg episode reward: [(0, '17.169')]
[2023-12-28 17:43:08,956][00795] Saving new best policy, reward=17.169!
[2023-12-28 17:43:10,225][00808] Updated weights for policy 0, policy_version 520 (0.0021)
[2023-12-28 17:43:13,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 2142208. Throughput: 0: 784.9. Samples: 535176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:43:13,943][00255] Avg episode reward: [(0, '16.625')]
[2023-12-28 17:43:18,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 2162688. Throughput: 0: 812.7. Samples: 541356. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:43:18,941][00255] Avg episode reward: [(0, '16.557')]
[2023-12-28 17:43:18,953][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000528_2162688.pth...
[2023-12-28 17:43:19,086][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000337_1380352.pth
[2023-12-28 17:43:20,589][00808] Updated weights for policy 0, policy_version 530 (0.0023)
[2023-12-28 17:43:23,938][00255] Fps is (10 sec: 3685.5, 60 sec: 3208.4, 300 sec: 3262.9). Total num frames: 2179072. Throughput: 0: 814.2. Samples: 544262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:43:23,940][00255] Avg episode reward: [(0, '15.746')]
[2023-12-28 17:43:28,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3235.2). Total num frames: 2191360. Throughput: 0: 793.6. Samples: 548372. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:43:28,940][00255] Avg episode reward: [(0, '16.158')]
[2023-12-28 17:43:33,876][00808] Updated weights for policy 0, policy_version 540 (0.0012)
[2023-12-28 17:43:33,935][00255] Fps is (10 sec: 3277.6, 60 sec: 3277.2, 300 sec: 3262.9). Total num frames: 2211840. Throughput: 0: 823.1. Samples: 553298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:43:33,938][00255] Avg episode reward: [(0, '16.880')]
[2023-12-28 17:43:38,935][00255] Fps is (10 sec: 4095.9, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 2232320. Throughput: 0: 846.5. Samples: 556490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:43:38,938][00255] Avg episode reward: [(0, '16.997')]
[2023-12-28 17:43:43,939][00255] Fps is (10 sec: 3684.8, 60 sec: 3344.8, 300 sec: 3276.8). Total num frames: 2248704. Throughput: 0: 857.2. Samples: 562724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:43:43,942][00255] Avg episode reward: [(0, '17.987')]
[2023-12-28 17:43:43,951][00795] Saving new best policy, reward=17.987!
[2023-12-28 17:43:44,283][00808] Updated weights for policy 0, policy_version 550 (0.0020)
[2023-12-28 17:43:48,939][00255] Fps is (10 sec: 3275.5, 60 sec: 3413.1, 300 sec: 3262.9). Total num frames: 2265088. Throughput: 0: 842.3. Samples: 566800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:43:48,943][00255] Avg episode reward: [(0, '19.954')]
[2023-12-28 17:43:48,957][00795] Saving new best policy, reward=19.954!
[2023-12-28 17:43:53,935][00255] Fps is (10 sec: 2868.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 2277376. Throughput: 0: 843.7. Samples: 568726. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:43:53,937][00255] Avg episode reward: [(0, '20.771')]
[2023-12-28 17:43:53,948][00795] Saving new best policy, reward=20.771!
[2023-12-28 17:43:56,972][00808] Updated weights for policy 0, policy_version 560 (0.0014)
[2023-12-28 17:43:58,935][00255] Fps is (10 sec: 3278.1, 60 sec: 3345.1, 300 sec: 3290.8). Total num frames: 2297856. Throughput: 0: 879.0. Samples: 574730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:43:58,942][00255] Avg episode reward: [(0, '19.836')]
[2023-12-28 17:44:03,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3304.6). Total num frames: 2318336. Throughput: 0: 865.7. Samples: 580314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:44:03,938][00255] Avg episode reward: [(0, '19.648')]
[2023-12-28 17:44:08,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 2330624. Throughput: 0: 844.6. Samples: 582268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:44:08,938][00255] Avg episode reward: [(0, '20.396')]
[2023-12-28 17:44:09,745][00808] Updated weights for policy 0, policy_version 570 (0.0021)
[2023-12-28 17:44:13,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 2347008. Throughput: 0: 842.1. Samples: 586266. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:44:13,943][00255] Avg episode reward: [(0, '19.650')]
[2023-12-28 17:44:18,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 2367488. Throughput: 0: 874.0. Samples: 592630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:44:18,943][00255] Avg episode reward: [(0, '18.466')]
[2023-12-28 17:44:20,374][00808] Updated weights for policy 0, policy_version 580 (0.0013)
[2023-12-28 17:44:23,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3481.7, 300 sec: 3304.6). Total num frames: 2387968. Throughput: 0: 876.0. Samples: 595908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:44:23,944][00255] Avg episode reward: [(0, '18.138')]
[2023-12-28 17:44:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3262.9). Total num frames: 2400256. Throughput: 0: 836.7. Samples: 600372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:44:28,937][00255] Avg episode reward: [(0, '18.111')]
[2023-12-28 17:44:33,748][00808] Updated weights for policy 0, policy_version 590 (0.0032)
[2023-12-28 17:44:33,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 2416640. Throughput: 0: 841.4. Samples: 604660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:44:33,938][00255] Avg episode reward: [(0, '17.470')]
[2023-12-28 17:44:38,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 2437120. Throughput: 0: 868.7. Samples: 607816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:44:38,938][00255] Avg episode reward: [(0, '18.358')]
[2023-12-28 17:44:43,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.6, 300 sec: 3290.7). Total num frames: 2453504. Throughput: 0: 869.1. Samples: 613840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:44:43,940][00255] Avg episode reward: [(0, '18.525')]
[2023-12-28 17:44:44,192][00808] Updated weights for policy 0, policy_version 600 (0.0020)
[2023-12-28 17:44:48,936][00255] Fps is (10 sec: 2866.9, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 2465792. Throughput: 0: 832.4. Samples: 617774. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:44:48,942][00255] Avg episode reward: [(0, '17.778')]
[2023-12-28 17:44:53,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3263.0). Total num frames: 2482176. Throughput: 0: 833.6. Samples: 619778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:44:53,938][00255] Avg episode reward: [(0, '18.788')]
[2023-12-28 17:44:57,326][00808] Updated weights for policy 0, policy_version 610 (0.0030)
[2023-12-28 17:44:58,935][00255] Fps is (10 sec: 3686.8, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 2502656. Throughput: 0: 873.4. Samples: 625570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:44:58,940][00255] Avg episode reward: [(0, '17.873')]
[2023-12-28 17:45:03,941][00255] Fps is (10 sec: 4502.8, 60 sec: 3481.2, 300 sec: 3318.4). Total num frames: 2527232. Throughput: 0: 878.0. Samples: 632146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:45:03,943][00255] Avg episode reward: [(0, '16.710')]
[2023-12-28 17:45:08,496][00808] Updated weights for policy 0, policy_version 620 (0.0026)
[2023-12-28 17:45:08,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3290.7). Total num frames: 2539520. Throughput: 0: 850.4. Samples: 634174. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:45:08,942][00255] Avg episode reward: [(0, '16.771')]
[2023-12-28 17:45:13,937][00255] Fps is (10 sec: 2458.7, 60 sec: 3413.2, 300 sec: 3276.8). Total num frames: 2551808. Throughput: 0: 842.5. Samples: 638284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:45:13,945][00255] Avg episode reward: [(0, '18.180')]
[2023-12-28 17:45:18,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 2572288. Throughput: 0: 879.6. Samples: 644240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:45:18,938][00255] Avg episode reward: [(0, '18.880')]
[2023-12-28 17:45:18,951][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth...
[2023-12-28 17:45:19,106][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000436_1785856.pth
[2023-12-28 17:45:20,068][00808] Updated weights for policy 0, policy_version 630 (0.0028)
[2023-12-28 17:45:23,935][00255] Fps is (10 sec: 4096.8, 60 sec: 3413.3, 300 sec: 3318.5). Total num frames: 2592768. Throughput: 0: 877.2. Samples: 647290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:45:23,940][00255] Avg episode reward: [(0, '19.399')]
[2023-12-28 17:45:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 2605056. Throughput: 0: 847.9. Samples: 651994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:45:28,939][00255] Avg episode reward: [(0, '19.803')]
[2023-12-28 17:45:33,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 2617344. Throughput: 0: 840.5. Samples: 655594. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:45:33,938][00255] Avg episode reward: [(0, '20.558')]
[2023-12-28 17:45:34,053][00808] Updated weights for policy 0, policy_version 640 (0.0013)
[2023-12-28 17:45:38,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3290.7). Total num frames: 2637824. Throughput: 0: 852.4. Samples: 658134. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:45:38,937][00255] Avg episode reward: [(0, '18.920')]
[2023-12-28 17:45:43,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3318.5). Total num frames: 2658304. Throughput: 0: 856.2. Samples: 664098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:45:43,937][00255] Avg episode reward: [(0, '18.585')]
[2023-12-28 17:45:44,924][00808] Updated weights for policy 0, policy_version 650 (0.0019)
[2023-12-28 17:45:48,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3290.7). Total num frames: 2670592. Throughput: 0: 807.8. Samples: 668494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:45:48,938][00255] Avg episode reward: [(0, '20.001')]
[2023-12-28 17:45:53,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 2682880. Throughput: 0: 803.4. Samples: 670328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:45:53,940][00255] Avg episode reward: [(0, '20.653')]
[2023-12-28 17:45:58,914][00808] Updated weights for policy 0, policy_version 660 (0.0035)
[2023-12-28 17:45:58,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3290.7). Total num frames: 2703360. Throughput: 0: 818.9. Samples: 675134. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:45:58,937][00255] Avg episode reward: [(0, '20.488')]
[2023-12-28 17:46:03,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3277.1, 300 sec: 3318.5). Total num frames: 2723840. Throughput: 0: 824.4. Samples: 681336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:46:03,937][00255] Avg episode reward: [(0, '20.289')]
[2023-12-28 17:46:08,939][00255] Fps is (10 sec: 3275.5, 60 sec: 3276.6, 300 sec: 3290.6). Total num frames: 2736128. Throughput: 0: 809.3. Samples: 683710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:46:08,945][00255] Avg episode reward: [(0, '21.344')]
[2023-12-28 17:46:08,966][00795] Saving new best policy, reward=21.344!
[2023-12-28 17:46:11,029][00808] Updated weights for policy 0, policy_version 670 (0.0013)
[2023-12-28 17:46:13,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3276.9, 300 sec: 3276.8). Total num frames: 2748416. Throughput: 0: 792.5. Samples: 687656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:46:13,938][00255] Avg episode reward: [(0, '21.237')]
[2023-12-28 17:46:18,935][00255] Fps is (10 sec: 3278.1, 60 sec: 3276.8, 300 sec: 3290.7). Total num frames: 2768896. Throughput: 0: 823.8. Samples: 692666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:46:18,938][00255] Avg episode reward: [(0, '19.055')]
[2023-12-28 17:46:22,920][00808] Updated weights for policy 0, policy_version 680 (0.0014)
[2023-12-28 17:46:23,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3304.6). Total num frames: 2785280. Throughput: 0: 833.3. Samples: 695632. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:46:23,938][00255] Avg episode reward: [(0, '19.430')]
[2023-12-28 17:46:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 2801664. Throughput: 0: 816.3. Samples: 700830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:46:28,937][00255] Avg episode reward: [(0, '19.012')]
[2023-12-28 17:46:33,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 2813952. Throughput: 0: 801.0. Samples: 704540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:46:33,940][00255] Avg episode reward: [(0, '19.258')]
[2023-12-28 17:46:37,255][00808] Updated weights for policy 0, policy_version 690 (0.0025)
[2023-12-28 17:46:38,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3290.7). Total num frames: 2830336. Throughput: 0: 805.4. Samples: 706570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:46:38,942][00255] Avg episode reward: [(0, '18.966')]
[2023-12-28 17:46:43,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 2850816. Throughput: 0: 828.9. Samples: 712436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:46:43,938][00255] Avg episode reward: [(0, '19.415')]
[2023-12-28 17:46:48,368][00808] Updated weights for policy 0, policy_version 700 (0.0013)
[2023-12-28 17:46:48,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3304.6). Total num frames: 2867200. Throughput: 0: 801.9. Samples: 717420. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:46:48,939][00255] Avg episode reward: [(0, '20.308')]
[2023-12-28 17:46:53,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 2879488. Throughput: 0: 791.1. Samples: 719308. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:46:53,941][00255] Avg episode reward: [(0, '19.452')]
[2023-12-28 17:46:58,936][00255] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3290.7). Total num frames: 2895872. Throughput: 0: 796.9. Samples: 723516. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:46:58,938][00255] Avg episode reward: [(0, '19.740')]
[2023-12-28 17:47:01,614][00808] Updated weights for policy 0, policy_version 710 (0.0015)
[2023-12-28 17:47:03,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3318.5). Total num frames: 2916352. Throughput: 0: 820.8. Samples: 729600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:47:03,938][00255] Avg episode reward: [(0, '19.890')]
[2023-12-28 17:47:08,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3277.0, 300 sec: 3318.5). Total num frames: 2932736. Throughput: 0: 822.8. Samples: 732658. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:08,937][00255] Avg episode reward: [(0, '20.294')]
[2023-12-28 17:47:13,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3290.7). Total num frames: 2945024. Throughput: 0: 798.7. Samples: 736770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:13,938][00255] Avg episode reward: [(0, '20.722')]
[2023-12-28 17:47:14,016][00808] Updated weights for policy 0, policy_version 720 (0.0019)
[2023-12-28 17:47:18,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3304.6). Total num frames: 2961408. Throughput: 0: 821.7. Samples: 741516. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:47:18,940][00255] Avg episode reward: [(0, '19.981')]
[2023-12-28 17:47:18,953][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000723_2961408.pth...
[2023-12-28 17:47:19,103][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000528_2162688.pth
[2023-12-28 17:47:23,935][00255] Fps is (10 sec: 3686.3, 60 sec: 3276.8, 300 sec: 3332.3). Total num frames: 2981888. Throughput: 0: 846.0. Samples: 744640. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:23,943][00255] Avg episode reward: [(0, '20.190')]
[2023-12-28 17:47:24,943][00808] Updated weights for policy 0, policy_version 730 (0.0023)
[2023-12-28 17:47:28,935][00255] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3346.3). Total num frames: 3002368. Throughput: 0: 851.3. Samples: 750746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:28,941][00255] Avg episode reward: [(0, '21.800')]
[2023-12-28 17:47:28,953][00795] Saving new best policy, reward=21.800!
[2023-12-28 17:47:33,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3318.5). Total num frames: 3014656. Throughput: 0: 830.3. Samples: 754782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:33,944][00255] Avg episode reward: [(0, '21.542')]
[2023-12-28 17:47:38,509][00808] Updated weights for policy 0, policy_version 740 (0.0021)
[2023-12-28 17:47:38,935][00255] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 3332.3). Total num frames: 3031040. Throughput: 0: 834.1. Samples: 756844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:38,940][00255] Avg episode reward: [(0, '21.856')]
[2023-12-28 17:47:38,949][00795] Saving new best policy, reward=21.856!
[2023-12-28 17:47:43,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3051520. Throughput: 0: 869.9. Samples: 762660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:47:43,945][00255] Avg episode reward: [(0, '20.698')]
[2023-12-28 17:47:48,820][00808] Updated weights for policy 0, policy_version 750 (0.0021)
[2023-12-28 17:47:48,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3072000. Throughput: 0: 864.4. Samples: 768500. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:47:48,938][00255] Avg episode reward: [(0, '20.995')]
[2023-12-28 17:47:53,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3346.2). Total num frames: 3084288. Throughput: 0: 839.9. Samples: 770454. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:47:53,945][00255] Avg episode reward: [(0, '21.684')]
[2023-12-28 17:47:58,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3332.3). Total num frames: 3096576. Throughput: 0: 835.2. Samples: 774356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:47:58,938][00255] Avg episode reward: [(0, '20.698')]
[2023-12-28 17:48:02,075][00808] Updated weights for policy 0, policy_version 760 (0.0026)
[2023-12-28 17:48:03,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 3117056. Throughput: 0: 862.9. Samples: 780348. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:48:03,943][00255] Avg episode reward: [(0, '19.410')]
[2023-12-28 17:48:08,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3137536. Throughput: 0: 863.5. Samples: 783496. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:48:08,938][00255] Avg episode reward: [(0, '19.813')]
[2023-12-28 17:48:13,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3346.2). Total num frames: 3149824. Throughput: 0: 830.2. Samples: 788106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:48:13,941][00255] Avg episode reward: [(0, '19.066')]
[2023-12-28 17:48:14,110][00808] Updated weights for policy 0, policy_version 770 (0.0017)
[2023-12-28 17:48:18,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3346.3). Total num frames: 3166208. Throughput: 0: 829.9. Samples: 792126. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:48:18,944][00255] Avg episode reward: [(0, '17.824')]
[2023-12-28 17:48:23,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3186688. Throughput: 0: 848.6. Samples: 795030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:48:23,938][00255] Avg episode reward: [(0, '17.344')]
[2023-12-28 17:48:25,958][00808] Updated weights for policy 0, policy_version 780 (0.0015)
[2023-12-28 17:48:28,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3207168. Throughput: 0: 854.4. Samples: 801106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:48:28,940][00255] Avg episode reward: [(0, '17.894')]
[2023-12-28 17:48:33,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3332.3). Total num frames: 3215360. Throughput: 0: 818.5. Samples: 805332. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:48:33,941][00255] Avg episode reward: [(0, '19.840')]
[2023-12-28 17:48:38,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3332.4). Total num frames: 3231744. Throughput: 0: 818.8. Samples: 807302. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:48:38,941][00255] Avg episode reward: [(0, '20.065')]
[2023-12-28 17:48:39,874][00808] Updated weights for policy 0, policy_version 790 (0.0033)
[2023-12-28 17:48:43,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3346.3). Total num frames: 3252224. Throughput: 0: 851.0. Samples: 812650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:48:43,937][00255] Avg episode reward: [(0, '20.373')]
[2023-12-28 17:48:48,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 3272704. Throughput: 0: 864.3. Samples: 819242. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:48:48,943][00255] Avg episode reward: [(0, '21.806')]
[2023-12-28 17:48:49,156][00808] Updated weights for policy 0, policy_version 800 (0.0018)
[2023-12-28 17:48:53,937][00255] Fps is (10 sec: 3685.7, 60 sec: 3413.2, 300 sec: 3360.1). Total num frames: 3289088. Throughput: 0: 847.5. Samples: 821636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:48:53,944][00255] Avg episode reward: [(0, '21.665')]
[2023-12-28 17:48:58,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 3301376. Throughput: 0: 836.3. Samples: 825738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:48:58,939][00255] Avg episode reward: [(0, '21.234')]
[2023-12-28 17:49:02,446][00808] Updated weights for policy 0, policy_version 810 (0.0025)
[2023-12-28 17:49:03,935][00255] Fps is (10 sec: 3277.4, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 3321856. Throughput: 0: 871.2. Samples: 831328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:49:03,944][00255] Avg episode reward: [(0, '21.153')]
[2023-12-28 17:49:08,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3342336. Throughput: 0: 876.1. Samples: 834456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:49:08,939][00255] Avg episode reward: [(0, '20.431')]
[2023-12-28 17:49:13,504][00808] Updated weights for policy 0, policy_version 820 (0.0016)
[2023-12-28 17:49:13,940][00255] Fps is (10 sec: 3684.5, 60 sec: 3481.3, 300 sec: 3360.1). Total num frames: 3358720. Throughput: 0: 856.8. Samples: 839668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:49:13,946][00255] Avg episode reward: [(0, '20.526')]
[2023-12-28 17:49:18,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 3371008. Throughput: 0: 853.0. Samples: 843718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:49:18,940][00255] Avg episode reward: [(0, '20.464')]
[2023-12-28 17:49:18,954][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000823_3371008.pth...
[2023-12-28 17:49:19,121][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth
[2023-12-28 17:49:23,935][00255] Fps is (10 sec: 3278.5, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 3391488. Throughput: 0: 860.7. Samples: 846032. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:49:23,938][00255] Avg episode reward: [(0, '21.013')]
[2023-12-28 17:49:25,846][00808] Updated weights for policy 0, policy_version 830 (0.0023)
[2023-12-28 17:49:28,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3411968. Throughput: 0: 880.5. Samples: 852274. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:49:28,942][00255] Avg episode reward: [(0, '19.778')]
[2023-12-28 17:49:33,938][00255] Fps is (10 sec: 3275.9, 60 sec: 3481.5, 300 sec: 3346.2). Total num frames: 3424256. Throughput: 0: 846.0. Samples: 857314. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:49:33,940][00255] Avg episode reward: [(0, '19.626')]
[2023-12-28 17:49:38,743][00808] Updated weights for policy 0, policy_version 840 (0.0031)
[2023-12-28 17:49:38,936][00255] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 3440640. Throughput: 0: 836.6. Samples: 859282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:49:38,938][00255] Avg episode reward: [(0, '20.512')]
[2023-12-28 17:49:43,935][00255] Fps is (10 sec: 3277.6, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 3457024. Throughput: 0: 847.2. Samples: 863860. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:49:43,938][00255] Avg episode reward: [(0, '20.560')]
[2023-12-28 17:49:48,935][00255] Fps is (10 sec: 3686.5, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3477504. Throughput: 0: 865.3. Samples: 870268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:49:48,938][00255] Avg episode reward: [(0, '19.558')]
[2023-12-28 17:49:49,286][00808] Updated weights for policy 0, policy_version 850 (0.0017)
[2023-12-28 17:49:53,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3360.1). Total num frames: 3493888. Throughput: 0: 861.4. Samples: 873218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:49:53,942][00255] Avg episode reward: [(0, '20.157')]
[2023-12-28 17:49:58,935][00255] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3318.5). Total num frames: 3506176. Throughput: 0: 835.0. Samples: 877240. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-12-28 17:49:58,938][00255] Avg episode reward: [(0, '20.648')]
[2023-12-28 17:50:03,026][00808] Updated weights for policy 0, policy_version 860 (0.0015)
[2023-12-28 17:50:03,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3346.2). Total num frames: 3526656. Throughput: 0: 851.5. Samples: 882036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-12-28 17:50:03,941][00255] Avg episode reward: [(0, '20.474')]
[2023-12-28 17:50:08,936][00255] Fps is (10 sec: 4095.8, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 3547136. Throughput: 0: 870.8. Samples: 885220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:08,944][00255] Avg episode reward: [(0, '20.931')]
[2023-12-28 17:50:13,099][00808] Updated weights for policy 0, policy_version 870 (0.0029)
[2023-12-28 17:50:13,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3413.6, 300 sec: 3360.1). Total num frames: 3563520. Throughput: 0: 862.5. Samples: 891086. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:13,938][00255] Avg episode reward: [(0, '20.818')]
[2023-12-28 17:50:18,935][00255] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 3575808. Throughput: 0: 839.2. Samples: 895076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:18,938][00255] Avg episode reward: [(0, '20.952')]
[2023-12-28 17:50:23,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3332.3). Total num frames: 3588096. Throughput: 0: 836.9. Samples: 896944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:23,938][00255] Avg episode reward: [(0, '19.848')]
[2023-12-28 17:50:27,692][00808] Updated weights for policy 0, policy_version 880 (0.0013)
[2023-12-28 17:50:28,935][00255] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 3608576. Throughput: 0: 839.7. Samples: 901648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:28,938][00255] Avg episode reward: [(0, '20.172')]
[2023-12-28 17:50:33,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3346.2). Total num frames: 3624960. Throughput: 0: 812.5. Samples: 906830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:33,943][00255] Avg episode reward: [(0, '19.360')]
[2023-12-28 17:50:38,937][00255] Fps is (10 sec: 2457.2, 60 sec: 3208.5, 300 sec: 3304.5). Total num frames: 3633152. Throughput: 0: 781.3. Samples: 908380. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:50:38,939][00255] Avg episode reward: [(0, '18.815')]
[2023-12-28 17:50:43,101][00808] Updated weights for policy 0, policy_version 890 (0.0020)
[2023-12-28 17:50:43,935][00255] Fps is (10 sec: 2048.0, 60 sec: 3140.3, 300 sec: 3304.6). Total num frames: 3645440. Throughput: 0: 765.9. Samples: 911704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-12-28 17:50:43,937][00255] Avg episode reward: [(0, '18.380')]
[2023-12-28 17:50:48,935][00255] Fps is (10 sec: 2867.7, 60 sec: 3072.0, 300 sec: 3318.5). Total num frames: 3661824. Throughput: 0: 758.3. Samples: 916160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:50:48,938][00255] Avg episode reward: [(0, '19.157')]
[2023-12-28 17:50:53,937][00255] Fps is (10 sec: 3276.1, 60 sec: 3071.9, 300 sec: 3304.5). Total num frames: 3678208. Throughput: 0: 748.3. Samples: 918896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:50:53,943][00255] Avg episode reward: [(0, '19.779')]
[2023-12-28 17:50:56,065][00808] Updated weights for policy 0, policy_version 900 (0.0027)
[2023-12-28 17:50:58,937][00255] Fps is (10 sec: 2866.7, 60 sec: 3071.9, 300 sec: 3276.8). Total num frames: 3690496. Throughput: 0: 714.0. Samples: 923216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:50:58,942][00255] Avg episode reward: [(0, '19.993')]
[2023-12-28 17:51:03,935][00255] Fps is (10 sec: 2867.8, 60 sec: 3003.7, 300 sec: 3290.7). Total num frames: 3706880. Throughput: 0: 710.4. Samples: 927044. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:51:03,939][00255] Avg episode reward: [(0, '20.746')]
[2023-12-28 17:51:08,935][00255] Fps is (10 sec: 2867.7, 60 sec: 2867.2, 300 sec: 3290.7). Total num frames: 3719168. Throughput: 0: 711.4. Samples: 928956. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-12-28 17:51:08,943][00255] Avg episode reward: [(0, '21.705')]
[2023-12-28 17:51:10,662][00808] Updated weights for policy 0, policy_version 910 (0.0017)
[2023-12-28 17:51:13,937][00255] Fps is (10 sec: 3276.1, 60 sec: 2935.4, 300 sec: 3290.7). Total num frames: 3739648. Throughput: 0: 729.5. Samples: 934478. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:51:13,941][00255] Avg episode reward: [(0, '22.242')]
[2023-12-28 17:51:13,946][00795] Saving new best policy, reward=22.242!
[2023-12-28 17:51:18,938][00255] Fps is (10 sec: 3275.9, 60 sec: 2935.3, 300 sec: 3276.8). Total num frames: 3751936. Throughput: 0: 723.3. Samples: 939382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:51:18,941][00255] Avg episode reward: [(0, '21.866')]
[2023-12-28 17:51:19,025][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000917_3756032.pth...
[2023-12-28 17:51:19,261][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000723_2961408.pth
[2023-12-28 17:51:23,778][00808] Updated weights for policy 0, policy_version 920 (0.0033)
[2023-12-28 17:51:23,935][00255] Fps is (10 sec: 2867.8, 60 sec: 3003.7, 300 sec: 3276.8). Total num frames: 3768320. Throughput: 0: 729.4. Samples: 941200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:51:23,938][00255] Avg episode reward: [(0, '20.979')]
[2023-12-28 17:51:28,935][00255] Fps is (10 sec: 2868.0, 60 sec: 2867.2, 300 sec: 3276.8). Total num frames: 3780608. Throughput: 0: 746.5. Samples: 945296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:51:28,938][00255] Avg episode reward: [(0, '22.019')]
[2023-12-28 17:51:33,935][00255] Fps is (10 sec: 3276.8, 60 sec: 2935.5, 300 sec: 3290.7). Total num frames: 3801088. Throughput: 0: 788.8. Samples: 951658. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:51:33,937][00255] Avg episode reward: [(0, '21.044')]
[2023-12-28 17:51:34,969][00808] Updated weights for policy 0, policy_version 930 (0.0025)
[2023-12-28 17:51:38,935][00255] Fps is (10 sec: 4096.0, 60 sec: 3140.4, 300 sec: 3290.7). Total num frames: 3821568. Throughput: 0: 796.0. Samples: 954714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:51:38,946][00255] Avg episode reward: [(0, '20.353')]
[2023-12-28 17:51:43,936][00255] Fps is (10 sec: 3276.4, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3833856. Throughput: 0: 793.7. Samples: 958932. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:51:43,951][00255] Avg episode reward: [(0, '20.139')]
[2023-12-28 17:51:48,764][00808] Updated weights for policy 0, policy_version 940 (0.0024)
[2023-12-28 17:51:48,935][00255] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3290.7). Total num frames: 3850240. Throughput: 0: 799.3. Samples: 963012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-12-28 17:51:48,938][00255] Avg episode reward: [(0, '19.159')]
[2023-12-28 17:51:53,936][00255] Fps is (10 sec: 3686.6, 60 sec: 3208.6, 300 sec: 3304.6). Total num frames: 3870720. Throughput: 0: 822.9. Samples: 965988. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:51:53,938][00255] Avg episode reward: [(0, '18.944')]
[2023-12-28 17:51:58,935][00255] Fps is (10 sec: 3686.4, 60 sec: 3276.9, 300 sec: 3290.7). Total num frames: 3887104. Throughput: 0: 836.4. Samples: 972114. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:51:58,938][00255] Avg episode reward: [(0, '19.720')]
[2023-12-28 17:51:59,265][00808] Updated weights for policy 0, policy_version 950 (0.0024)
[2023-12-28 17:52:03,939][00255] Fps is (10 sec: 2866.3, 60 sec: 3208.3, 300 sec: 3276.8). Total num frames: 3899392. Throughput: 0: 812.7. Samples: 975956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:52:03,942][00255] Avg episode reward: [(0, '19.356')]
[2023-12-28 17:52:08,935][00255] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3911680. Throughput: 0: 812.0. Samples: 977740. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-12-28 17:52:08,946][00255] Avg episode reward: [(0, '19.955')]
[2023-12-28 17:52:13,321][00808] Updated weights for policy 0, policy_version 960 (0.0028)
[2023-12-28 17:52:13,938][00255] Fps is (10 sec: 3277.2, 60 sec: 3208.5, 300 sec: 3290.7). Total num frames: 3932160. Throughput: 0: 834.0. Samples: 982830. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:52:13,941][00255] Avg episode reward: [(0, '20.062')]
[2023-12-28 17:52:18,938][00255] Fps is (10 sec: 4094.8, 60 sec: 3345.1, 300 sec: 3290.7). Total num frames: 3952640. Throughput: 0: 837.1. Samples: 989328. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:52:18,940][00255] Avg episode reward: [(0, '21.802')]
[2023-12-28 17:52:23,937][00255] Fps is (10 sec: 3686.7, 60 sec: 3345.0, 300 sec: 3276.8). Total num frames: 3969024. Throughput: 0: 822.9. Samples: 991748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-12-28 17:52:23,940][00255] Avg episode reward: [(0, '22.859')]
[2023-12-28 17:52:23,946][00795] Saving new best policy, reward=22.859!
[2023-12-28 17:52:24,698][00808] Updated weights for policy 0, policy_version 970 (0.0037)
[2023-12-28 17:52:28,936][00255] Fps is (10 sec: 2868.0, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3981312. Throughput: 0: 813.1. Samples: 995520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:52:28,944][00255] Avg episode reward: [(0, '22.055')]
[2023-12-28 17:52:33,935][00255] Fps is (10 sec: 2458.1, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3993600. Throughput: 0: 794.3. Samples: 998754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-12-28 17:52:33,940][00255] Avg episode reward: [(0, '21.657')]
[2023-12-28 17:52:38,932][00795] Stopping Batcher_0...
[2023-12-28 17:52:38,932][00795] Loop batcher_evt_loop terminating...
[2023-12-28 17:52:38,934][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-12-28 17:52:38,933][00255] Component Batcher_0 stopped!
[2023-12-28 17:52:38,999][00255] Component RolloutWorker_w4 stopped!
[2023-12-28 17:52:39,007][00813] Stopping RolloutWorker_w4...
[2023-12-28 17:52:39,007][00813] Loop rollout_proc4_evt_loop terminating...
[2023-12-28 17:52:39,014][00255] Component RolloutWorker_w2 stopped!
[2023-12-28 17:52:39,021][00811] Stopping RolloutWorker_w2...
[2023-12-28 17:52:39,022][00811] Loop rollout_proc2_evt_loop terminating...
[2023-12-28 17:52:39,028][00255] Component RolloutWorker_w6 stopped!
[2023-12-28 17:52:39,035][00815] Stopping RolloutWorker_w6...
[2023-12-28 17:52:39,036][00815] Loop rollout_proc6_evt_loop terminating...
[2023-12-28 17:52:39,052][00255] Component RolloutWorker_w0 stopped!
[2023-12-28 17:52:39,059][00810] Stopping RolloutWorker_w0...
[2023-12-28 17:52:39,060][00810] Loop rollout_proc0_evt_loop terminating...
[2023-12-28 17:52:39,088][00808] Weights refcount: 2 0
[2023-12-28 17:52:39,110][00808] Stopping InferenceWorker_p0-w0...
[2023-12-28 17:52:39,111][00808] Loop inference_proc0-0_evt_loop terminating...
[2023-12-28 17:52:39,111][00255] Component InferenceWorker_p0-w0 stopped!
[2023-12-28 17:52:39,137][00812] Stopping RolloutWorker_w3...
[2023-12-28 17:52:39,140][00814] Stopping RolloutWorker_w5...
[2023-12-28 17:52:39,137][00255] Component RolloutWorker_w3 stopped!
[2023-12-28 17:52:39,138][00812] Loop rollout_proc3_evt_loop terminating...
[2023-12-28 17:52:39,147][00255] Component RolloutWorker_w5 stopped!
[2023-12-28 17:52:39,141][00814] Loop rollout_proc5_evt_loop terminating...
[2023-12-28 17:52:39,161][00809] Stopping RolloutWorker_w1...
[2023-12-28 17:52:39,161][00809] Loop rollout_proc1_evt_loop terminating...
[2023-12-28 17:52:39,162][00255] Component RolloutWorker_w1 stopped!
[2023-12-28 17:52:39,168][00816] Stopping RolloutWorker_w7...
[2023-12-28 17:52:39,168][00255] Component RolloutWorker_w7 stopped!
[2023-12-28 17:52:39,169][00816] Loop rollout_proc7_evt_loop terminating...
[2023-12-28 17:52:39,213][00795] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000823_3371008.pth
[2023-12-28 17:52:39,240][00795] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-12-28 17:52:39,518][00255] Component LearnerWorker_p0 stopped!
[2023-12-28 17:52:39,520][00255] Waiting for process learner_proc0 to stop...
[2023-12-28 17:52:39,522][00795] Stopping LearnerWorker_p0...
[2023-12-28 17:52:39,527][00795] Loop learner_proc0_evt_loop terminating...
[2023-12-28 17:52:41,803][00255] Waiting for process inference_proc0-0 to join...
[2023-12-28 17:52:41,809][00255] Waiting for process rollout_proc0 to join...
[2023-12-28 17:52:45,004][00255] Waiting for process rollout_proc1 to join...
[2023-12-28 17:52:45,051][00255] Waiting for process rollout_proc2 to join...
[2023-12-28 17:52:45,053][00255] Waiting for process rollout_proc3 to join...
[2023-12-28 17:52:45,055][00255] Waiting for process rollout_proc4 to join...
[2023-12-28 17:52:45,056][00255] Waiting for process rollout_proc5 to join...
[2023-12-28 17:52:45,059][00255] Waiting for process rollout_proc6 to join...
[2023-12-28 17:52:45,061][00255] Waiting for process rollout_proc7 to join...
[2023-12-28 17:52:45,064][00255] Batcher 0 profile tree view:
batching: 28.0898, releasing_batches: 0.0312
[2023-12-28 17:52:45,067][00255] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0028
wait_policy_total: 585.9297
update_model: 9.9226
weight_update: 0.0030
one_step: 0.0126
handle_policy_step: 602.7236
deserialize: 16.5802, stack: 3.3224, obs_to_device_normalize: 121.3483, forward: 322.0175, send_messages: 29.0396
prepare_outputs: 79.7466
to_cpu: 45.1381
[2023-12-28 17:52:45,068][00255] Learner 0 profile tree view:
misc: 0.0058, prepare_batch: 14.7778
train: 76.8353
epoch_init: 0.0077, minibatch_init: 0.0076, losses_postprocess: 0.5790, kl_divergence: 0.7600, after_optimizer: 34.5202
calculate_losses: 27.8262
losses_init: 0.0092, forward_head: 1.3088, bptt_initial: 18.9094, tail: 1.2852, advantages_returns: 0.2868, losses: 3.6225
bptt: 2.0795
bptt_forward_core: 1.9783
update: 12.4267
clip: 0.9647
[2023-12-28 17:52:45,069][00255] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4220, enqueue_policy_requests: 170.7749, env_step: 929.0903, overhead: 24.9393, complete_rollouts: 7.5572
save_policy_outputs: 21.9222
split_output_tensors: 10.3722
[2023-12-28 17:52:45,070][00255] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4138, enqueue_policy_requests: 177.6018, env_step: 923.1664, overhead: 25.6922, complete_rollouts: 7.9335
save_policy_outputs: 22.3020
split_output_tensors: 10.8770
[2023-12-28 17:52:45,074][00255] Loop Runner_EvtLoop terminating...
[2023-12-28 17:52:45,075][00255] Runner profile tree view:
main_loop: 1277.7687
[2023-12-28 17:52:45,078][00255] Collected {0: 4005888}, FPS: 3135.1
[2023-12-28 17:52:45,406][00255] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-12-28 17:52:45,408][00255] Overriding arg 'num_workers' with value 1 passed from command line
[2023-12-28 17:52:45,410][00255] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-12-28 17:52:45,414][00255] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-12-28 17:52:45,416][00255] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-12-28 17:52:45,417][00255] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-12-28 17:52:45,420][00255] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-12-28 17:52:45,421][00255] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-12-28 17:52:45,423][00255] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-12-28 17:52:45,426][00255] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-12-28 17:52:45,429][00255] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-12-28 17:52:45,430][00255] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-12-28 17:52:45,431][00255] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-12-28 17:52:45,432][00255] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-12-28 17:52:45,433][00255] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-12-28 17:52:45,493][00255] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-12-28 17:52:45,498][00255] RunningMeanStd input shape: (3, 72, 128)
[2023-12-28 17:52:45,502][00255] RunningMeanStd input shape: (1,)
[2023-12-28 17:52:45,527][00255] ConvEncoder: input_channels=3
[2023-12-28 17:52:45,701][00255] Conv encoder output size: 512
[2023-12-28 17:52:45,705][00255] Policy head output size: 512
[2023-12-28 17:52:46,080][00255] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-12-28 17:52:47,245][00255] Num frames 100...
[2023-12-28 17:52:47,452][00255] Num frames 200...
[2023-12-28 17:52:47,665][00255] Num frames 300...
[2023-12-28 17:52:47,854][00255] Num frames 400...
[2023-12-28 17:52:48,023][00255] Num frames 500...
[2023-12-28 17:52:48,222][00255] Avg episode rewards: #0: 9.760, true rewards: #0: 5.760
[2023-12-28 17:52:48,224][00255] Avg episode reward: 9.760, avg true_objective: 5.760
[2023-12-28 17:52:48,258][00255] Num frames 600...
[2023-12-28 17:52:48,399][00255] Num frames 700...
[2023-12-28 17:52:48,566][00255] Num frames 800...
[2023-12-28 17:52:48,730][00255] Num frames 900...
[2023-12-28 17:52:48,879][00255] Num frames 1000...
[2023-12-28 17:52:49,032][00255] Num frames 1100...
[2023-12-28 17:52:49,201][00255] Num frames 1200...
[2023-12-28 17:52:49,360][00255] Num frames 1300...
[2023-12-28 17:52:49,505][00255] Num frames 1400...
[2023-12-28 17:52:49,641][00255] Num frames 1500...
[2023-12-28 17:52:49,764][00255] Num frames 1600...
[2023-12-28 17:52:49,895][00255] Num frames 1700...
[2023-12-28 17:52:50,020][00255] Num frames 1800...
[2023-12-28 17:52:50,148][00255] Num frames 1900...
[2023-12-28 17:52:50,281][00255] Num frames 2000...
[2023-12-28 17:52:50,357][00255] Avg episode rewards: #0: 26.080, true rewards: #0: 10.080
[2023-12-28 17:52:50,359][00255] Avg episode reward: 26.080, avg true_objective: 10.080
[2023-12-28 17:52:50,467][00255] Num frames 2100...
[2023-12-28 17:52:50,601][00255] Num frames 2200...
[2023-12-28 17:52:50,734][00255] Num frames 2300...
[2023-12-28 17:52:50,863][00255] Num frames 2400...
[2023-12-28 17:52:50,996][00255] Num frames 2500...
[2023-12-28 17:52:51,125][00255] Num frames 2600...
[2023-12-28 17:52:51,254][00255] Num frames 2700...
[2023-12-28 17:52:51,390][00255] Num frames 2800...
[2023-12-28 17:52:51,521][00255] Num frames 2900...
[2023-12-28 17:52:51,651][00255] Num frames 3000...
[2023-12-28 17:52:51,790][00255] Num frames 3100...
[2023-12-28 17:52:51,917][00255] Num frames 3200...
[2023-12-28 17:52:52,052][00255] Num frames 3300...
[2023-12-28 17:52:52,185][00255] Num frames 3400...
[2023-12-28 17:52:52,313][00255] Num frames 3500...
[2023-12-28 17:52:52,421][00255] Avg episode rewards: #0: 30.123, true rewards: #0: 11.790
[2023-12-28 17:52:52,423][00255] Avg episode reward: 30.123, avg true_objective: 11.790
[2023-12-28 17:52:52,503][00255] Num frames 3600...
[2023-12-28 17:52:52,637][00255] Num frames 3700...
[2023-12-28 17:52:52,778][00255] Num frames 3800...
[2023-12-28 17:52:52,907][00255] Num frames 3900...
[2023-12-28 17:52:53,037][00255] Num frames 4000...
[2023-12-28 17:52:53,173][00255] Num frames 4100...
[2023-12-28 17:52:53,303][00255] Num frames 4200...
[2023-12-28 17:52:53,437][00255] Num frames 4300...
[2023-12-28 17:52:53,589][00255] Avg episode rewards: #0: 26.922, true rewards: #0: 10.922
[2023-12-28 17:52:53,591][00255] Avg episode reward: 26.922, avg true_objective: 10.922
[2023-12-28 17:52:53,633][00255] Num frames 4400...
[2023-12-28 17:52:53,772][00255] Num frames 4500...
[2023-12-28 17:52:53,903][00255] Num frames 4600...
[2023-12-28 17:52:54,032][00255] Num frames 4700...
[2023-12-28 17:52:54,162][00255] Num frames 4800...
[2023-12-28 17:52:54,289][00255] Num frames 4900...
[2023-12-28 17:52:54,423][00255] Num frames 5000...
[2023-12-28 17:52:54,556][00255] Num frames 5100...
[2023-12-28 17:52:54,705][00255] Avg episode rewards: #0: 24.350, true rewards: #0: 10.350
[2023-12-28 17:52:54,707][00255] Avg episode reward: 24.350, avg true_objective: 10.350
[2023-12-28 17:52:54,746][00255] Num frames 5200...
[2023-12-28 17:52:54,883][00255] Num frames 5300...
[2023-12-28 17:52:55,022][00255] Num frames 5400...
[2023-12-28 17:52:55,154][00255] Num frames 5500...
[2023-12-28 17:52:55,286][00255] Num frames 5600...
[2023-12-28 17:52:55,413][00255] Num frames 5700...
[2023-12-28 17:52:55,488][00255] Avg episode rewards: #0: 22.357, true rewards: #0: 9.523
[2023-12-28 17:52:55,489][00255] Avg episode reward: 22.357, avg true_objective: 9.523
[2023-12-28 17:52:55,604][00255] Num frames 5800...
[2023-12-28 17:52:55,729][00255] Num frames 5900...
[2023-12-28 17:52:55,865][00255] Num frames 6000...
[2023-12-28 17:52:55,998][00255] Num frames 6100...
[2023-12-28 17:52:56,126][00255] Num frames 6200...
[2023-12-28 17:52:56,255][00255] Num frames 6300...
[2023-12-28 17:52:56,383][00255] Num frames 6400...
[2023-12-28 17:52:56,509][00255] Num frames 6500...
[2023-12-28 17:52:56,643][00255] Num frames 6600...
[2023-12-28 17:52:56,753][00255] Avg episode rewards: #0: 21.917, true rewards: #0: 9.489
[2023-12-28 17:52:56,755][00255] Avg episode reward: 21.917, avg true_objective: 9.489
[2023-12-28 17:52:56,843][00255] Num frames 6700...
[2023-12-28 17:52:56,977][00255] Num frames 6800...
[2023-12-28 17:52:57,112][00255] Num frames 6900...
[2023-12-28 17:52:57,243][00255] Num frames 7000...
[2023-12-28 17:52:57,376][00255] Num frames 7100...
[2023-12-28 17:52:57,508][00255] Num frames 7200...
[2023-12-28 17:52:57,647][00255] Num frames 7300...
[2023-12-28 17:52:57,774][00255] Num frames 7400...
[2023-12-28 17:52:57,928][00255] Num frames 7500...
[2023-12-28 17:52:58,116][00255] Num frames 7600...
[2023-12-28 17:52:58,298][00255] Num frames 7700...
[2023-12-28 17:52:58,479][00255] Num frames 7800...
[2023-12-28 17:52:58,669][00255] Num frames 7900...
[2023-12-28 17:52:58,851][00255] Num frames 8000...
[2023-12-28 17:52:59,050][00255] Num frames 8100...
[2023-12-28 17:52:59,250][00255] Avg episode rewards: #0: 23.472, true rewards: #0: 10.222
[2023-12-28 17:52:59,253][00255] Avg episode reward: 23.472, avg true_objective: 10.222
[2023-12-28 17:52:59,296][00255] Num frames 8200...
[2023-12-28 17:52:59,476][00255] Num frames 8300...
[2023-12-28 17:52:59,649][00255] Num frames 8400...
[2023-12-28 17:52:59,838][00255] Num frames 8500...
[2023-12-28 17:53:00,057][00255] Num frames 8600...
[2023-12-28 17:53:00,200][00255] Avg episode rewards: #0: 21.938, true rewards: #0: 9.604
[2023-12-28 17:53:00,203][00255] Avg episode reward: 21.938, avg true_objective: 9.604
[2023-12-28 17:53:00,317][00255] Num frames 8700...
[2023-12-28 17:53:00,524][00255] Num frames 8800...
[2023-12-28 17:53:00,749][00255] Num frames 8900...
[2023-12-28 17:53:00,960][00255] Num frames 9000...
[2023-12-28 17:53:01,190][00255] Num frames 9100...
[2023-12-28 17:53:01,381][00255] Num frames 9200...
[2023-12-28 17:53:01,558][00255] Num frames 9300...
[2023-12-28 17:53:01,736][00255] Num frames 9400...
[2023-12-28 17:53:01,906][00255] Num frames 9500...
[2023-12-28 17:53:02,083][00255] Num frames 9600...
[2023-12-28 17:53:02,258][00255] Num frames 9700...
[2023-12-28 17:53:02,414][00255] Num frames 9800...
[2023-12-28 17:53:02,570][00255] Num frames 9900...
[2023-12-28 17:53:02,739][00255] Num frames 10000...
[2023-12-28 17:53:02,896][00255] Num frames 10100...
[2023-12-28 17:53:03,029][00255] Avg episode rewards: #0: 23.647, true rewards: #0: 10.147
[2023-12-28 17:53:03,031][00255] Avg episode reward: 23.647, avg true_objective: 10.147
[2023-12-28 17:54:08,960][00255] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-12-28 17:59:56,881][00255] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-12-28 17:59:56,883][00255] Overriding arg 'num_workers' with value 1 passed from command line
[2023-12-28 17:59:56,885][00255] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-12-28 17:59:56,887][00255] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-12-28 17:59:56,889][00255] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-12-28 17:59:56,891][00255] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-12-28 17:59:56,894][00255] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-12-28 17:59:56,896][00255] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-12-28 17:59:56,897][00255] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-12-28 17:59:56,898][00255] Adding new argument 'hf_repository'='andreatorch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-12-28 17:59:56,900][00255] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-12-28 17:59:56,901][00255] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-12-28 17:59:56,902][00255] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-12-28 17:59:56,903][00255] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-12-28 17:59:56,905][00255] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-12-28 17:59:56,939][00255] RunningMeanStd input shape: (3, 72, 128)
[2023-12-28 17:59:56,940][00255] RunningMeanStd input shape: (1,)
[2023-12-28 17:59:56,955][00255] ConvEncoder: input_channels=3
[2023-12-28 17:59:56,992][00255] Conv encoder output size: 512
[2023-12-28 17:59:56,993][00255] Policy head output size: 512
[2023-12-28 17:59:57,013][00255] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-12-28 17:59:57,427][00255] Num frames 100...
[2023-12-28 17:59:57,564][00255] Num frames 200...
[2023-12-28 17:59:57,691][00255] Num frames 300...
[2023-12-28 17:59:57,817][00255] Num frames 400...
[2023-12-28 17:59:57,944][00255] Num frames 500...
[2023-12-28 17:59:58,076][00255] Num frames 600...
[2023-12-28 17:59:58,202][00255] Num frames 700...
[2023-12-28 17:59:58,337][00255] Num frames 800...
[2023-12-28 17:59:58,442][00255] Avg episode rewards: #0: 17.320, true rewards: #0: 8.320
[2023-12-28 17:59:58,444][00255] Avg episode reward: 17.320, avg true_objective: 8.320
[2023-12-28 17:59:58,543][00255] Num frames 900...
[2023-12-28 17:59:58,666][00255] Num frames 1000...
[2023-12-28 17:59:58,792][00255] Num frames 1100...
[2023-12-28 17:59:58,920][00255] Num frames 1200...
[2023-12-28 17:59:59,050][00255] Num frames 1300...
[2023-12-28 17:59:59,109][00255] Avg episode rewards: #0: 13.005, true rewards: #0: 6.505
[2023-12-28 17:59:59,110][00255] Avg episode reward: 13.005, avg true_objective: 6.505
[2023-12-28 17:59:59,238][00255] Num frames 1400...
[2023-12-28 17:59:59,367][00255] Num frames 1500...
[2023-12-28 17:59:59,498][00255] Num frames 1600...
[2023-12-28 17:59:59,629][00255] Num frames 1700...
[2023-12-28 17:59:59,753][00255] Num frames 1800...
[2023-12-28 17:59:59,880][00255] Num frames 1900...
[2023-12-28 18:00:00,006][00255] Num frames 2000...
[2023-12-28 18:00:00,134][00255] Num frames 2100...
[2023-12-28 18:00:00,268][00255] Num frames 2200...
[2023-12-28 18:00:00,397][00255] Num frames 2300...
[2023-12-28 18:00:00,533][00255] Num frames 2400...
[2023-12-28 18:00:00,663][00255] Num frames 2500...
[2023-12-28 18:00:00,790][00255] Num frames 2600...
[2023-12-28 18:00:00,921][00255] Num frames 2700...
[2023-12-28 18:00:01,055][00255] Num frames 2800...
[2023-12-28 18:00:01,185][00255] Num frames 2900...
[2023-12-28 18:00:01,314][00255] Num frames 3000...
[2023-12-28 18:00:01,452][00255] Num frames 3100...
[2023-12-28 18:00:01,596][00255] Num frames 3200...
[2023-12-28 18:00:01,728][00255] Num frames 3300...
[2023-12-28 18:00:01,854][00255] Num frames 3400...
[2023-12-28 18:00:01,913][00255] Avg episode rewards: #0: 27.003, true rewards: #0: 11.337
[2023-12-28 18:00:01,914][00255] Avg episode reward: 27.003, avg true_objective: 11.337
[2023-12-28 18:00:02,063][00255] Num frames 3500...
[2023-12-28 18:00:02,268][00255] Num frames 3600...
[2023-12-28 18:00:02,447][00255] Num frames 3700...
[2023-12-28 18:00:02,638][00255] Num frames 3800...
[2023-12-28 18:00:02,815][00255] Num frames 3900...
[2023-12-28 18:00:02,997][00255] Num frames 4000...
[2023-12-28 18:00:03,180][00255] Num frames 4100...
[2023-12-28 18:00:03,372][00255] Num frames 4200...
[2023-12-28 18:00:03,548][00255] Num frames 4300...
[2023-12-28 18:00:03,727][00255] Num frames 4400...
[2023-12-28 18:00:03,905][00255] Num frames 4500...
[2023-12-28 18:00:04,095][00255] Num frames 4600...
[2023-12-28 18:00:04,297][00255] Num frames 4700...
[2023-12-28 18:00:04,493][00255] Num frames 4800...
[2023-12-28 18:00:04,678][00255] Num frames 4900...
[2023-12-28 18:00:04,861][00255] Num frames 5000...
[2023-12-28 18:00:05,048][00255] Num frames 5100...
[2023-12-28 18:00:05,186][00255] Num frames 5200...
[2023-12-28 18:00:05,320][00255] Num frames 5300...
[2023-12-28 18:00:05,405][00255] Avg episode rewards: #0: 31.802, true rewards: #0: 13.303
[2023-12-28 18:00:05,406][00255] Avg episode reward: 31.802, avg true_objective: 13.303
[2023-12-28 18:00:05,508][00255] Num frames 5400...
[2023-12-28 18:00:05,640][00255] Num frames 5500...
[2023-12-28 18:00:05,771][00255] Num frames 5600...
[2023-12-28 18:00:05,897][00255] Num frames 5700...
[2023-12-28 18:00:06,029][00255] Num frames 5800...
[2023-12-28 18:00:06,159][00255] Num frames 5900...
[2023-12-28 18:00:06,293][00255] Num frames 6000...
[2023-12-28 18:00:06,465][00255] Avg episode rewards: #0: 27.978, true rewards: #0: 12.178
[2023-12-28 18:00:06,467][00255] Avg episode reward: 27.978, avg true_objective: 12.178
[2023-12-28 18:00:06,485][00255] Num frames 6100...
[2023-12-28 18:00:06,612][00255] Num frames 6200...
[2023-12-28 18:00:06,742][00255] Num frames 6300...
[2023-12-28 18:00:06,867][00255] Num frames 6400...
[2023-12-28 18:00:07,002][00255] Num frames 6500...
[2023-12-28 18:00:07,130][00255] Num frames 6600...
[2023-12-28 18:00:07,256][00255] Num frames 6700...
[2023-12-28 18:00:07,386][00255] Num frames 6800...
[2023-12-28 18:00:07,508][00255] Num frames 6900...
[2023-12-28 18:00:07,636][00255] Num frames 7000...
[2023-12-28 18:00:07,716][00255] Avg episode rewards: #0: 26.695, true rewards: #0: 11.695
[2023-12-28 18:00:07,717][00255] Avg episode reward: 26.695, avg true_objective: 11.695
[2023-12-28 18:00:07,828][00255] Num frames 7100...
[2023-12-28 18:00:07,953][00255] Num frames 7200...
[2023-12-28 18:00:08,089][00255] Num frames 7300...
[2023-12-28 18:00:08,220][00255] Num frames 7400...
[2023-12-28 18:00:08,355][00255] Num frames 7500...
[2023-12-28 18:00:08,497][00255] Avg episode rewards: #0: 24.668, true rewards: #0: 10.811
[2023-12-28 18:00:08,498][00255] Avg episode reward: 24.668, avg true_objective: 10.811
[2023-12-28 18:00:08,550][00255] Num frames 7600...
[2023-12-28 18:00:08,691][00255] Num frames 7700...
[2023-12-28 18:00:08,829][00255] Num frames 7800...
[2023-12-28 18:00:08,955][00255] Num frames 7900...
[2023-12-28 18:00:09,087][00255] Num frames 8000...
[2023-12-28 18:00:09,215][00255] Num frames 8100...
[2023-12-28 18:00:09,341][00255] Num frames 8200...
[2023-12-28 18:00:09,472][00255] Num frames 8300...
[2023-12-28 18:00:09,597][00255] Num frames 8400...
[2023-12-28 18:00:09,721][00255] Num frames 8500...
[2023-12-28 18:00:09,859][00255] Num frames 8600...
[2023-12-28 18:00:09,984][00255] Num frames 8700...
[2023-12-28 18:00:10,143][00255] Num frames 8800...
[2023-12-28 18:00:10,275][00255] Num frames 8900...
[2023-12-28 18:00:10,408][00255] Num frames 9000...
[2023-12-28 18:00:10,537][00255] Num frames 9100...
[2023-12-28 18:00:10,668][00255] Num frames 9200...
[2023-12-28 18:00:10,797][00255] Num frames 9300...
[2023-12-28 18:00:10,941][00255] Num frames 9400...
[2023-12-28 18:00:11,138][00255] Avg episode rewards: #0: 27.621, true rewards: #0: 11.871
[2023-12-28 18:00:11,140][00255] Avg episode reward: 27.621, avg true_objective: 11.871
[2023-12-28 18:00:11,150][00255] Num frames 9500...
[2023-12-28 18:00:11,281][00255] Num frames 9600...
[2023-12-28 18:00:11,413][00255] Num frames 9700...
[2023-12-28 18:00:11,542][00255] Num frames 9800...
[2023-12-28 18:00:11,674][00255] Num frames 9900...
[2023-12-28 18:00:11,799][00255] Num frames 10000...
[2023-12-28 18:00:11,945][00255] Num frames 10100...
[2023-12-28 18:00:12,089][00255] Num frames 10200...
[2023-12-28 18:00:12,220][00255] Num frames 10300...
[2023-12-28 18:00:12,354][00255] Num frames 10400...
[2023-12-28 18:00:12,482][00255] Num frames 10500...
[2023-12-28 18:00:12,618][00255] Num frames 10600...
[2023-12-28 18:00:12,746][00255] Num frames 10700...
[2023-12-28 18:00:12,879][00255] Num frames 10800...
[2023-12-28 18:00:13,010][00255] Num frames 10900...
[2023-12-28 18:00:13,142][00255] Num frames 11000...
[2023-12-28 18:00:13,270][00255] Num frames 11100...
[2023-12-28 18:00:13,405][00255] Num frames 11200...
[2023-12-28 18:00:13,540][00255] Num frames 11300...
[2023-12-28 18:00:13,668][00255] Num frames 11400...
[2023-12-28 18:00:13,799][00255] Num frames 11500...
[2023-12-28 18:00:13,994][00255] Avg episode rewards: #0: 30.330, true rewards: #0: 12.886
[2023-12-28 18:00:13,995][00255] Avg episode reward: 30.330, avg true_objective: 12.886
[2023-12-28 18:00:14,004][00255] Num frames 11600...
[2023-12-28 18:00:14,137][00255] Num frames 11700...
[2023-12-28 18:00:14,267][00255] Num frames 11800...
[2023-12-28 18:00:14,402][00255] Num frames 11900...
[2023-12-28 18:00:14,529][00255] Num frames 12000...
[2023-12-28 18:00:14,659][00255] Num frames 12100...
[2023-12-28 18:00:14,787][00255] Num frames 12200...
[2023-12-28 18:00:14,890][00255] Avg episode rewards: #0: 28.537, true rewards: #0: 12.237
[2023-12-28 18:00:14,893][00255] Avg episode reward: 28.537, avg true_objective: 12.237
[2023-12-28 18:01:31,820][00255] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-12-28 18:02:48,857][00255] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-12-28 18:02:48,859][00255] Overriding arg 'num_workers' with value 1 passed from command line
[2023-12-28 18:02:48,861][00255] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-12-28 18:02:48,863][00255] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-12-28 18:02:48,865][00255] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-12-28 18:02:48,867][00255] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-12-28 18:02:48,869][00255] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-12-28 18:02:48,872][00255] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-12-28 18:02:48,873][00255] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-12-28 18:02:48,874][00255] Adding new argument 'hf_repository'='andreatorch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-12-28 18:02:48,876][00255] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-12-28 18:02:48,877][00255] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-12-28 18:02:48,880][00255] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-12-28 18:02:48,881][00255] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-12-28 18:02:48,882][00255] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-12-28 18:02:48,952][00255] RunningMeanStd input shape: (3, 72, 128)
[2023-12-28 18:02:48,956][00255] RunningMeanStd input shape: (1,)
[2023-12-28 18:02:48,979][00255] ConvEncoder: input_channels=3
[2023-12-28 18:02:49,053][00255] Conv encoder output size: 512
[2023-12-28 18:02:49,055][00255] Policy head output size: 512
[2023-12-28 18:02:49,090][00255] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-12-28 18:02:49,725][00255] Num frames 100...
[2023-12-28 18:02:49,911][00255] Num frames 200...
[2023-12-28 18:02:50,103][00255] Num frames 300...
[2023-12-28 18:02:50,294][00255] Num frames 400...
[2023-12-28 18:02:50,477][00255] Num frames 500...
[2023-12-28 18:02:50,643][00255] Num frames 600...
[2023-12-28 18:02:50,775][00255] Num frames 700...
[2023-12-28 18:02:50,913][00255] Num frames 800...
[2023-12-28 18:02:51,048][00255] Num frames 900...
[2023-12-28 18:02:51,194][00255] Num frames 1000...
[2023-12-28 18:02:51,333][00255] Num frames 1100...
[2023-12-28 18:02:51,465][00255] Num frames 1200...
[2023-12-28 18:02:51,594][00255] Num frames 1300...
[2023-12-28 18:02:51,722][00255] Num frames 1400...
[2023-12-28 18:02:51,850][00255] Num frames 1500...
[2023-12-28 18:02:51,979][00255] Num frames 1600...
[2023-12-28 18:02:52,113][00255] Num frames 1700...
[2023-12-28 18:02:52,258][00255] Num frames 1800...
[2023-12-28 18:02:52,392][00255] Num frames 1900...
[2023-12-28 18:02:52,521][00255] Num frames 2000...
[2023-12-28 18:02:52,654][00255] Num frames 2100...
[2023-12-28 18:02:52,707][00255] Avg episode rewards: #0: 50.999, true rewards: #0: 21.000
[2023-12-28 18:02:52,709][00255] Avg episode reward: 50.999, avg true_objective: 21.000
[2023-12-28 18:02:52,834][00255] Num frames 2200...
[2023-12-28 18:02:52,967][00255] Num frames 2300...
[2023-12-28 18:02:53,103][00255] Num frames 2400...
[2023-12-28 18:02:53,240][00255] Num frames 2500...
[2023-12-28 18:02:53,376][00255] Num frames 2600...
[2023-12-28 18:02:53,500][00255] Num frames 2700...
[2023-12-28 18:02:53,634][00255] Num frames 2800...
[2023-12-28 18:02:53,756][00255] Num frames 2900...
[2023-12-28 18:02:53,882][00255] Num frames 3000...
[2023-12-28 18:02:54,017][00255] Num frames 3100...
[2023-12-28 18:02:54,134][00255] Avg episode rewards: #0: 38.225, true rewards: #0: 15.725
[2023-12-28 18:02:54,136][00255] Avg episode reward: 38.225, avg true_objective: 15.725
[2023-12-28 18:02:54,211][00255] Num frames 3200...
[2023-12-28 18:02:54,357][00255] Num frames 3300...
[2023-12-28 18:02:54,484][00255] Num frames 3400...
[2023-12-28 18:02:54,615][00255] Num frames 3500...
[2023-12-28 18:02:54,741][00255] Num frames 3600...
[2023-12-28 18:02:54,868][00255] Num frames 3700...
[2023-12-28 18:02:54,997][00255] Num frames 3800...
[2023-12-28 18:02:55,128][00255] Num frames 3900...
[2023-12-28 18:02:55,266][00255] Num frames 4000...
[2023-12-28 18:02:55,397][00255] Num frames 4100...
[2023-12-28 18:02:55,473][00255] Avg episode rewards: #0: 32.050, true rewards: #0: 13.717
[2023-12-28 18:02:55,475][00255] Avg episode reward: 32.050, avg true_objective: 13.717
[2023-12-28 18:02:55,588][00255] Num frames 4200...
[2023-12-28 18:02:55,717][00255] Num frames 4300...
[2023-12-28 18:02:55,845][00255] Num frames 4400...
[2023-12-28 18:02:55,974][00255] Num frames 4500...
[2023-12-28 18:02:56,106][00255] Num frames 4600...
[2023-12-28 18:02:56,234][00255] Num frames 4700...
[2023-12-28 18:02:56,367][00255] Avg episode rewards: #0: 27.137, true rewards: #0: 11.888
[2023-12-28 18:02:56,369][00255] Avg episode reward: 27.137, avg true_objective: 11.888
[2023-12-28 18:02:56,427][00255] Num frames 4800...
[2023-12-28 18:02:56,554][00255] Num frames 4900...
[2023-12-28 18:02:56,680][00255] Num frames 5000...
[2023-12-28 18:02:56,807][00255] Num frames 5100...
[2023-12-28 18:02:56,936][00255] Num frames 5200...
[2023-12-28 18:02:56,998][00255] Avg episode rewards: #0: 22.806, true rewards: #0: 10.406
[2023-12-28 18:02:56,999][00255] Avg episode reward: 22.806, avg true_objective: 10.406
[2023-12-28 18:02:57,127][00255] Num frames 5300...
[2023-12-28 18:02:57,259][00255] Num frames 5400...
[2023-12-28 18:02:57,400][00255] Num frames 5500...
[2023-12-28 18:02:57,532][00255] Num frames 5600...
[2023-12-28 18:02:57,660][00255] Num frames 5700...
[2023-12-28 18:02:57,793][00255] Num frames 5800...
[2023-12-28 18:02:57,929][00255] Num frames 5900...
[2023-12-28 18:02:58,060][00255] Num frames 6000...
[2023-12-28 18:02:58,188][00255] Num frames 6100...
[2023-12-28 18:02:58,327][00255] Num frames 6200...
[2023-12-28 18:02:58,458][00255] Num frames 6300...
[2023-12-28 18:02:58,590][00255] Num frames 6400...
[2023-12-28 18:02:58,716][00255] Num frames 6500...
[2023-12-28 18:02:58,847][00255] Num frames 6600...
[2023-12-28 18:02:58,978][00255] Num frames 6700...
[2023-12-28 18:02:59,109][00255] Num frames 6800...
[2023-12-28 18:02:59,239][00255] Num frames 6900...
[2023-12-28 18:02:59,375][00255] Num frames 7000...
[2023-12-28 18:02:59,509][00255] Num frames 7100...
[2023-12-28 18:02:59,644][00255] Num frames 7200...
[2023-12-28 18:02:59,774][00255] Num frames 7300...
[2023-12-28 18:02:59,835][00255] Avg episode rewards: #0: 28.171, true rewards: #0: 12.172
[2023-12-28 18:02:59,836][00255] Avg episode reward: 28.171, avg true_objective: 12.172
[2023-12-28 18:02:59,976][00255] Num frames 7400...
[2023-12-28 18:03:00,111][00255] Num frames 7500...
[2023-12-28 18:03:00,240][00255] Num frames 7600...
[2023-12-28 18:03:00,326][00255] Avg episode rewards: #0: 25.033, true rewards: #0: 10.890
[2023-12-28 18:03:00,328][00255] Avg episode reward: 25.033, avg true_objective: 10.890
[2023-12-28 18:03:00,438][00255] Num frames 7700...
[2023-12-28 18:03:00,565][00255] Num frames 7800...
[2023-12-28 18:03:00,743][00255] Num frames 7900...
[2023-12-28 18:03:00,935][00255] Num frames 8000...
[2023-12-28 18:03:01,124][00255] Num frames 8100...
[2023-12-28 18:03:01,308][00255] Num frames 8200...
[2023-12-28 18:03:01,505][00255] Num frames 8300...
[2023-12-28 18:03:01,693][00255] Num frames 8400...
[2023-12-28 18:03:01,885][00255] Num frames 8500...
[2023-12-28 18:03:01,970][00255] Avg episode rewards: #0: 24.515, true rewards: #0: 10.640
[2023-12-28 18:03:01,973][00255] Avg episode reward: 24.515, avg true_objective: 10.640
[2023-12-28 18:03:02,166][00255] Num frames 8600...
[2023-12-28 18:03:02,353][00255] Num frames 8700...
[2023-12-28 18:03:02,545][00255] Num frames 8800...
[2023-12-28 18:03:02,732][00255] Num frames 8900...
[2023-12-28 18:03:02,927][00255] Num frames 9000...
[2023-12-28 18:03:03,127][00255] Num frames 9100...
[2023-12-28 18:03:03,317][00255] Num frames 9200...
[2023-12-28 18:03:03,510][00255] Num frames 9300...
[2023-12-28 18:03:03,690][00255] Num frames 9400...
[2023-12-28 18:03:03,816][00255] Num frames 9500...
[2023-12-28 18:03:03,953][00255] Num frames 9600...
[2023-12-28 18:03:04,006][00255] Avg episode rewards: #0: 24.222, true rewards: #0: 10.667
[2023-12-28 18:03:04,007][00255] Avg episode reward: 24.222, avg true_objective: 10.667
[2023-12-28 18:03:04,145][00255] Num frames 9700...
[2023-12-28 18:03:04,279][00255] Num frames 9800...
[2023-12-28 18:03:04,408][00255] Num frames 9900...
[2023-12-28 18:03:04,550][00255] Num frames 10000...
[2023-12-28 18:03:04,687][00255] Num frames 10100...
[2023-12-28 18:03:04,826][00255] Num frames 10200...
[2023-12-28 18:03:04,955][00255] Num frames 10300...
[2023-12-28 18:03:05,088][00255] Num frames 10400...
[2023-12-28 18:03:05,226][00255] Num frames 10500...
[2023-12-28 18:03:05,360][00255] Num frames 10600...
[2023-12-28 18:03:05,493][00255] Num frames 10700...
[2023-12-28 18:03:05,627][00255] Num frames 10800...
[2023-12-28 18:03:05,757][00255] Num frames 10900...
[2023-12-28 18:03:05,885][00255] Num frames 11000...
[2023-12-28 18:03:06,017][00255] Num frames 11100...
[2023-12-28 18:03:06,163][00255] Avg episode rewards: #0: 25.468, true rewards: #0: 11.168
[2023-12-28 18:03:06,165][00255] Avg episode reward: 25.468, avg true_objective: 11.168
[2023-12-28 18:04:15,463][00255] Replay video saved to /content/train_dir/default_experiment/replay.mp4!