[2023-02-23 16:13:31,446][00868] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 16:13:31,448][00868] Rollout worker 0 uses device cpu
[2023-02-23 16:13:31,451][00868] Rollout worker 1 uses device cpu
[2023-02-23 16:13:31,455][00868] Rollout worker 2 uses device cpu
[2023-02-23 16:13:31,458][00868] Rollout worker 3 uses device cpu
[2023-02-23 16:13:31,460][00868] Rollout worker 4 uses device cpu
[2023-02-23 16:13:31,464][00868] Rollout worker 5 uses device cpu
[2023-02-23 16:13:31,467][00868] Rollout worker 6 uses device cpu
[2023-02-23 16:13:31,470][00868] Rollout worker 7 uses device cpu
[2023-02-23 16:13:31,727][00868] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 16:13:31,739][00868] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 16:13:31,785][00868] Starting all processes...
[2023-02-23 16:13:31,789][00868] Starting process learner_proc0
[2023-02-23 16:13:31,888][00868] Starting all processes...
[2023-02-23 16:13:31,986][00868] Starting process inference_proc0-0
[2023-02-23 16:13:31,987][00868] Starting process rollout_proc0
[2023-02-23 16:13:31,987][00868] Starting process rollout_proc1
[2023-02-23 16:13:31,987][00868] Starting process rollout_proc2
[2023-02-23 16:13:31,988][00868] Starting process rollout_proc3
[2023-02-23 16:13:31,988][00868] Starting process rollout_proc4
[2023-02-23 16:13:31,988][00868] Starting process rollout_proc5
[2023-02-23 16:13:31,988][00868] Starting process rollout_proc6
[2023-02-23 16:13:31,989][00868] Starting process rollout_proc7
[2023-02-23 16:13:44,644][11073] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 16:13:44,649][11073] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 16:13:45,384][11089] Worker 1 uses CPU cores [1]
[2023-02-23 16:13:45,497][11094] Worker 6 uses CPU cores [0]
[2023-02-23 16:13:45,682][11090] Worker 2 uses CPU cores [0]
[2023-02-23 16:13:45,874][11095] Worker 7 uses CPU cores [1]
[2023-02-23 16:13:46,114][11091] Worker 3 uses CPU cores [1]
[2023-02-23 16:13:46,131][11087] Worker 0 uses CPU cores [0]
[2023-02-23 16:13:46,131][11092] Worker 4 uses CPU cores [0]
[2023-02-23 16:13:46,160][11088] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 16:13:46,162][11088] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 16:13:46,185][11093] Worker 5 uses CPU cores [1]
[2023-02-23 16:13:46,251][11073] Num visible devices: 1
[2023-02-23 16:13:46,253][11088] Num visible devices: 1
[2023-02-23 16:13:46,281][11073] Starting seed is not provided
[2023-02-23 16:13:46,282][11073] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 16:13:46,282][11073] Initializing actor-critic model on device cuda:0
[2023-02-23 16:13:46,282][11073] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 16:13:46,285][11073] RunningMeanStd input shape: (1,)
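The two `RunningMeanStd` lines above show the learner creating running normalizers for observations (shape `(3, 72, 128)`) and returns (shape `(1,)`). A minimal sketch of such a normalizer, using the standard parallel-variance merge rule (this is the textbook algorithm, not necessarily Sample Factory's exact in-place implementation):

```python
import numpy as np

class RunningMeanStd:
    """Running mean/variance over batches, merged with the
    parallel-variance (Chan et al.) update rule."""
    def __init__(self, shape, eps=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = eps  # small prior count avoids division by zero

    def update(self, batch):
        # batch has shape (batch_size, *shape)
        batch_mean = batch.mean(axis=0)
        batch_var = batch.var(axis=0)
        batch_count = batch.shape[0]
        delta = batch_mean - self.mean
        total = self.count + batch_count
        self.mean = self.mean + delta * batch_count / total
        m_a = self.var * self.count
        m_b = batch_var * batch_count
        self.var = (m_a + m_b + delta**2 * self.count * batch_count / total) / total
        self.count = total

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)
```

The same class covers both uses in the log: instantiate it with `(3, 72, 128)` for pixel observations and `(1,)` for returns.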
[2023-02-23 16:13:46,306][11073] ConvEncoder: input_channels=3
[2023-02-23 16:13:46,802][11073] Conv encoder output size: 512
[2023-02-23 16:13:46,802][11073] Policy head output size: 512
[2023-02-23 16:13:46,865][11073] Created Actor Critic model with architecture:
[2023-02-23 16:13:46,866][11073] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
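The architecture printout above can be reproduced as a compact PyTorch sketch. The log confirms the 512-dim encoder output, the `GRU(512, 512)` core, the 1-dim value head, and the 5-dim action head; the conv channel counts and kernel/stride choices (32/64/128, 8-4-3) are assumptions based on Sample Factory's default VizDoom encoder and are not printed in the log, so the flattened conv size is inferred at runtime rather than hard-coded:

```python
import torch
import torch.nn as nn

class ActorCriticSketch(nn.Module):
    """Sketch of ActorCriticSharedWeights: conv encoder -> MLP -> GRU core
    -> shared features feeding a value head and an action-logits head.
    Normalizers are omitted; conv channels/kernels are assumed defaults."""
    def __init__(self, obs_shape=(3, 72, 128), num_actions=5, hidden=512):
        super().__init__()
        c, h, w = obs_shape
        self.conv_head = nn.Sequential(
            nn.Conv2d(c, 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():  # infer flattened conv output size from a dummy pass
            n_flat = self.conv_head(torch.zeros(1, c, h, w)).flatten(1).shape[1]
        self.mlp_layers = nn.Sequential(nn.Linear(n_flat, hidden), nn.ELU())
        self.core = nn.GRU(hidden, hidden)                   # (core): GRU(512, 512)
        self.critic_linear = nn.Linear(hidden, 1)            # value head
        self.distribution_linear = nn.Linear(hidden, num_actions)  # 5 action logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp_layers(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)  # seq length 1
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state

model = ActorCriticSketch()
logits, value, state = model(torch.zeros(2, 3, 72, 128))
```

With the assumed conv stack, a 3x72x128 observation flattens to 128x3x6 = 2304 features before the `Linear(2304, 512)` layer, matching the "Conv encoder output size: 512" line.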
[2023-02-23 16:13:51,710][00868] Heartbeat connected on Batcher_0
[2023-02-23 16:13:51,727][00868] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 16:13:51,749][00868] Heartbeat connected on RolloutWorker_w0
[2023-02-23 16:13:51,756][00868] Heartbeat connected on RolloutWorker_w1
[2023-02-23 16:13:51,761][00868] Heartbeat connected on RolloutWorker_w2
[2023-02-23 16:13:51,765][00868] Heartbeat connected on RolloutWorker_w3
[2023-02-23 16:13:51,770][00868] Heartbeat connected on RolloutWorker_w4
[2023-02-23 16:13:51,775][00868] Heartbeat connected on RolloutWorker_w5
[2023-02-23 16:13:51,780][00868] Heartbeat connected on RolloutWorker_w6
[2023-02-23 16:13:51,784][00868] Heartbeat connected on RolloutWorker_w7
[2023-02-23 16:13:54,798][11073] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 16:13:54,799][11073] No checkpoints found
[2023-02-23 16:13:54,800][11073] Did not load from checkpoint, starting from scratch!
[2023-02-23 16:13:54,800][11073] Initialized policy 0 weights for model version 0
[2023-02-23 16:13:54,803][11073] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 16:13:54,810][11073] LearnerWorker_p0 finished initialization!
[2023-02-23 16:13:54,811][00868] Heartbeat connected on LearnerWorker_p0
[2023-02-23 16:13:55,018][11088] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 16:13:55,019][11088] RunningMeanStd input shape: (1,)
[2023-02-23 16:13:55,033][11088] ConvEncoder: input_channels=3
[2023-02-23 16:13:55,133][11088] Conv encoder output size: 512
[2023-02-23 16:13:55,134][11088] Policy head output size: 512
[2023-02-23 16:13:55,903][00868] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
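The `Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)` lines report throughput over three sliding windows, showing `nan` until a window has history. A minimal sketch of that kind of meter (the exact windowing in Sample Factory may differ; this just measures frames gained since the oldest sample inside each window):

```python
from collections import deque

class WindowedFps:
    """Frames-per-second over several sliding windows, reported as a dict;
    NaN when a window has no usable history yet."""
    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # (timestamp, total_frames), oldest first

    def record(self, now, total_frames):
        self.samples.append((now, total_frames))
        # drop samples older than the largest window
        while now - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self, now, total_frames):
        out = {}
        for w in self.windows:
            # oldest sample still inside this window
            oldest = next(((t, f) for t, f in self.samples if now - t <= w), None)
            if oldest is None or now == oldest[0]:
                out[w] = float('nan')
            else:
                out[w] = (total_frames - oldest[1]) / (now - oldest[0])
        return out
```

Fed the trajectory from this log (0 frames at t=0 and t=10, 8192 frames at t=20), it reproduces the first non-zero report: 819.2 over 10 s and 409.6 over the longer windows.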
[2023-02-23 16:13:57,450][00868] Inference worker 0-0 is ready!
[2023-02-23 16:13:57,451][00868] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 16:13:57,554][11089] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,597][11090] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,603][11092] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,602][11094] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,611][11087] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,609][11095] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,613][11091] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:57,616][11093] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:13:58,825][11089] Decorrelating experience for 0 frames...
[2023-02-23 16:13:58,826][11093] Decorrelating experience for 0 frames...
[2023-02-23 16:13:58,823][11091] Decorrelating experience for 0 frames...
[2023-02-23 16:13:58,823][11087] Decorrelating experience for 0 frames...
[2023-02-23 16:13:58,824][11090] Decorrelating experience for 0 frames...
[2023-02-23 16:13:58,825][11094] Decorrelating experience for 0 frames...
[2023-02-23 16:13:59,558][11089] Decorrelating experience for 32 frames...
[2023-02-23 16:13:59,607][11095] Decorrelating experience for 0 frames...
[2023-02-23 16:14:00,009][11095] Decorrelating experience for 32 frames...
[2023-02-23 16:14:00,048][11092] Decorrelating experience for 0 frames...
[2023-02-23 16:14:00,051][11087] Decorrelating experience for 32 frames...
[2023-02-23 16:14:00,053][11094] Decorrelating experience for 32 frames...
[2023-02-23 16:14:00,442][11095] Decorrelating experience for 64 frames...
[2023-02-23 16:14:00,815][11090] Decorrelating experience for 32 frames...
[2023-02-23 16:14:00,863][11095] Decorrelating experience for 96 frames...
[2023-02-23 16:14:00,903][00868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 16:14:01,184][11092] Decorrelating experience for 32 frames...
[2023-02-23 16:14:01,633][11091] Decorrelating experience for 32 frames...
[2023-02-23 16:14:02,154][11094] Decorrelating experience for 64 frames...
[2023-02-23 16:14:02,217][11093] Decorrelating experience for 32 frames...
[2023-02-23 16:14:03,535][11087] Decorrelating experience for 64 frames...
[2023-02-23 16:14:03,677][11091] Decorrelating experience for 64 frames...
[2023-02-23 16:14:03,848][11090] Decorrelating experience for 64 frames...
[2023-02-23 16:14:03,931][11089] Decorrelating experience for 64 frames...
[2023-02-23 16:14:04,206][11093] Decorrelating experience for 64 frames...
[2023-02-23 16:14:04,612][11092] Decorrelating experience for 64 frames...
[2023-02-23 16:14:05,100][11093] Decorrelating experience for 96 frames...
[2023-02-23 16:14:05,325][11089] Decorrelating experience for 96 frames...
[2023-02-23 16:14:05,609][11094] Decorrelating experience for 96 frames...
[2023-02-23 16:14:05,903][00868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 16:14:06,264][11087] Decorrelating experience for 96 frames...
[2023-02-23 16:14:07,244][11092] Decorrelating experience for 96 frames...
[2023-02-23 16:14:07,860][11090] Decorrelating experience for 96 frames...
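The "Decorrelating experience for N frames..." lines show each rollout worker stepping its environments with throwaway actions before real collection, so trajectories across workers don't start in lockstep. A heavily simplified sketch of that warm-up loop (the real scheduling of per-worker offsets is more involved; the 5-action space matches the policy head printed earlier, and `DummyEnv` is a stand-in):

```python
import random

class DummyEnv:
    """Hypothetical stand-in for a VizDoom env; real code would step the game."""
    def step(self, action):
        return None

def decorrelate_experience(env, num_frames, log_every=32, seed=0):
    """Step the env with random actions for num_frames before collection,
    logging progress every log_every frames like the workers above."""
    rng = random.Random(seed)
    for frame in range(num_frames):
        if frame % log_every == 0:
            print(f"Decorrelating experience for {frame} frames...")
        env.step(rng.randrange(5))  # 5 discrete actions
    return num_frames
```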
[2023-02-23 16:14:10,903][00868] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 104.7. Samples: 1570. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 16:14:10,910][00868] Avg episode reward: [(0, '2.193')]
[2023-02-23 16:14:11,781][11073] Signal inference workers to stop experience collection...
[2023-02-23 16:14:11,794][11088] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 16:14:11,971][11091] Decorrelating experience for 96 frames...
[2023-02-23 16:14:14,075][11073] Signal inference workers to resume experience collection...
[2023-02-23 16:14:14,076][11088] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 16:14:15,903][00868] Fps is (10 sec: 819.2, 60 sec: 409.6, 300 sec: 409.6). Total num frames: 8192. Throughput: 0: 174.7. Samples: 3494. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-23 16:14:15,908][00868] Avg episode reward: [(0, '3.218')]
[2023-02-23 16:14:20,903][00868] Fps is (10 sec: 3276.8, 60 sec: 1310.7, 300 sec: 1310.7). Total num frames: 32768. Throughput: 0: 256.4. Samples: 6410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:14:20,909][00868] Avg episode reward: [(0, '3.843')]
[2023-02-23 16:14:22,658][11088] Updated weights for policy 0, policy_version 10 (0.0020)
[2023-02-23 16:14:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 49152. Throughput: 0: 399.5. Samples: 11986. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:14:25,914][00868] Avg episode reward: [(0, '4.268')]
[2023-02-23 16:14:30,903][00868] Fps is (10 sec: 2867.2, 60 sec: 1755.4, 300 sec: 1755.4). Total num frames: 61440. Throughput: 0: 459.9. Samples: 16096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:14:30,909][00868] Avg episode reward: [(0, '4.489')]
[2023-02-23 16:14:35,636][11088] Updated weights for policy 0, policy_version 20 (0.0036)
[2023-02-23 16:14:35,903][00868] Fps is (10 sec: 3276.8, 60 sec: 2048.0, 300 sec: 2048.0). Total num frames: 81920. Throughput: 0: 465.9. Samples: 18636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:14:35,905][00868] Avg episode reward: [(0, '4.610')]
[2023-02-23 16:14:40,903][00868] Fps is (10 sec: 4096.0, 60 sec: 2275.6, 300 sec: 2275.6). Total num frames: 102400. Throughput: 0: 561.2. Samples: 25254. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:14:40,909][00868] Avg episode reward: [(0, '4.644')]
[2023-02-23 16:14:40,912][11073] Saving new best policy, reward=4.644!
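The "Saving new best policy, reward=4.644!" lines come from tracking the best average episode reward seen so far and saving whenever it improves. The logic is simple enough to sketch directly (the `save_fn` callback is a placeholder for the actual checkpoint write):

```python
import math

class BestPolicyTracker:
    """Save the policy whenever average episode reward beats the best so far."""
    def __init__(self, save_fn):
        self.best = -math.inf
        self.save_fn = save_fn  # called with the new best reward

    def update(self, avg_reward):
        if avg_reward > self.best:
            self.best = avg_reward
            self.save_fn(avg_reward)
            return True
        return False
```

Note in the log that the reward dipping afterwards (4.507, 4.366, ...) triggers no save; the next save only happens when 4.699 exceeds 4.644.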
[2023-02-23 16:14:45,907][00868] Fps is (10 sec: 3685.0, 60 sec: 2375.5, 300 sec: 2375.5). Total num frames: 118784. Throughput: 0: 672.3. Samples: 30258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:14:45,913][00868] Avg episode reward: [(0, '4.507')]
[2023-02-23 16:14:47,623][11088] Updated weights for policy 0, policy_version 30 (0.0026)
[2023-02-23 16:14:50,903][00868] Fps is (10 sec: 2867.2, 60 sec: 2383.1, 300 sec: 2383.1). Total num frames: 131072. Throughput: 0: 710.8. Samples: 31988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:14:50,908][00868] Avg episode reward: [(0, '4.366')]
[2023-02-23 16:14:55,903][00868] Fps is (10 sec: 2868.3, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 147456. Throughput: 0: 787.3. Samples: 36998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:14:55,905][00868] Avg episode reward: [(0, '4.281')]
[2023-02-23 16:14:58,623][11088] Updated weights for policy 0, policy_version 40 (0.0029)
[2023-02-23 16:15:00,903][00868] Fps is (10 sec: 4096.0, 60 sec: 2867.2, 300 sec: 2646.6). Total num frames: 172032. Throughput: 0: 890.7. Samples: 43576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:15:00,906][00868] Avg episode reward: [(0, '4.366')]
[2023-02-23 16:15:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3140.3, 300 sec: 2691.7). Total num frames: 188416. Throughput: 0: 889.0. Samples: 46414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:15:05,910][00868] Avg episode reward: [(0, '4.456')]
[2023-02-23 16:15:10,905][00868] Fps is (10 sec: 2866.7, 60 sec: 3345.0, 300 sec: 2676.0). Total num frames: 200704. Throughput: 0: 857.7. Samples: 50586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:15:10,908][00868] Avg episode reward: [(0, '4.474')]
[2023-02-23 16:15:11,165][11088] Updated weights for policy 0, policy_version 50 (0.0013)
[2023-02-23 16:15:15,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 2764.8). Total num frames: 221184. Throughput: 0: 882.8. Samples: 55820. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:15:15,911][00868] Avg episode reward: [(0, '4.314')]
[2023-02-23 16:15:20,903][00868] Fps is (10 sec: 4096.7, 60 sec: 3481.6, 300 sec: 2843.1). Total num frames: 241664. Throughput: 0: 900.4. Samples: 59154. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:15:20,906][00868] Avg episode reward: [(0, '4.171')]
[2023-02-23 16:15:21,077][11088] Updated weights for policy 0, policy_version 60 (0.0023)
[2023-02-23 16:15:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 2912.7). Total num frames: 262144. Throughput: 0: 888.3. Samples: 65228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:15:25,905][00868] Avg episode reward: [(0, '4.437')]
[2023-02-23 16:15:25,925][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000064_262144.pth...
[2023-02-23 16:15:30,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 2888.8). Total num frames: 274432. Throughput: 0: 868.1. Samples: 69320. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:15:30,906][00868] Avg episode reward: [(0, '4.434')]
[2023-02-23 16:15:34,027][11088] Updated weights for policy 0, policy_version 70 (0.0017)
[2023-02-23 16:15:35,906][00868] Fps is (10 sec: 3276.0, 60 sec: 3549.7, 300 sec: 2949.1). Total num frames: 294912. Throughput: 0: 878.6. Samples: 71528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:15:35,909][00868] Avg episode reward: [(0, '4.300')]
[2023-02-23 16:15:40,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3003.7). Total num frames: 315392. Throughput: 0: 917.6. Samples: 78288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:15:40,905][00868] Avg episode reward: [(0, '4.433')]
[2023-02-23 16:15:43,282][11088] Updated weights for policy 0, policy_version 80 (0.0019)
[2023-02-23 16:15:45,903][00868] Fps is (10 sec: 3687.2, 60 sec: 3550.1, 300 sec: 3016.1). Total num frames: 331776. Throughput: 0: 897.2. Samples: 83948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:15:45,909][00868] Avg episode reward: [(0, '4.404')]
[2023-02-23 16:15:50,904][00868] Fps is (10 sec: 3276.6, 60 sec: 3618.1, 300 sec: 3027.5). Total num frames: 348160. Throughput: 0: 880.7. Samples: 86048. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:15:50,909][00868] Avg episode reward: [(0, '4.351')]
[2023-02-23 16:15:55,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3037.9). Total num frames: 364544. Throughput: 0: 889.3. Samples: 90604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:15:55,909][00868] Avg episode reward: [(0, '4.335')]
[2023-02-23 16:15:56,369][11088] Updated weights for policy 0, policy_version 90 (0.0027)
[2023-02-23 16:16:00,903][00868] Fps is (10 sec: 3686.7, 60 sec: 3549.9, 300 sec: 3080.2). Total num frames: 385024. Throughput: 0: 919.7. Samples: 97206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:16:00,909][00868] Avg episode reward: [(0, '4.386')]
[2023-02-23 16:16:05,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3087.8). Total num frames: 401408. Throughput: 0: 915.1. Samples: 100332. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:16:05,911][00868] Avg episode reward: [(0, '4.386')]
[2023-02-23 16:16:07,748][11088] Updated weights for policy 0, policy_version 100 (0.0019)
[2023-02-23 16:16:10,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3094.8). Total num frames: 417792. Throughput: 0: 867.1. Samples: 104246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:16:10,910][00868] Avg episode reward: [(0, '4.382')]
[2023-02-23 16:16:15,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3072.0). Total num frames: 430080. Throughput: 0: 875.6. Samples: 108724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:16:15,905][00868] Avg episode reward: [(0, '4.328')]
[2023-02-23 16:16:19,645][11088] Updated weights for policy 0, policy_version 110 (0.0016)
[2023-02-23 16:16:20,903][00868] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3135.6). Total num frames: 454656. Throughput: 0: 898.1. Samples: 111940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:16:20,907][00868] Avg episode reward: [(0, '4.453')]
[2023-02-23 16:16:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3140.3). Total num frames: 471040. Throughput: 0: 884.4. Samples: 118084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:16:25,907][00868] Avg episode reward: [(0, '4.491')]
[2023-02-23 16:16:30,904][00868] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3144.7). Total num frames: 487424. Throughput: 0: 849.2. Samples: 122160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:16:30,908][00868] Avg episode reward: [(0, '4.601')]
[2023-02-23 16:16:32,477][11088] Updated weights for policy 0, policy_version 120 (0.0019)
[2023-02-23 16:16:35,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3413.5, 300 sec: 3123.2). Total num frames: 499712. Throughput: 0: 843.3. Samples: 123998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:16:35,910][00868] Avg episode reward: [(0, '4.555')]
[2023-02-23 16:16:40,903][00868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3152.7). Total num frames: 520192. Throughput: 0: 867.7. Samples: 129650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:16:40,906][00868] Avg episode reward: [(0, '4.496')]
[2023-02-23 16:16:43,487][11088] Updated weights for policy 0, policy_version 130 (0.0028)
[2023-02-23 16:16:45,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3156.3). Total num frames: 536576. Throughput: 0: 847.9. Samples: 135362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:16:45,913][00868] Avg episode reward: [(0, '4.412')]
[2023-02-23 16:16:50,905][00868] Fps is (10 sec: 3276.3, 60 sec: 3413.3, 300 sec: 3159.7). Total num frames: 552960. Throughput: 0: 820.4. Samples: 137252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:16:50,909][00868] Avg episode reward: [(0, '4.362')]
[2023-02-23 16:16:55,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3140.3). Total num frames: 565248. Throughput: 0: 816.8. Samples: 141004. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:16:55,906][00868] Avg episode reward: [(0, '4.450')]
[2023-02-23 16:16:57,545][11088] Updated weights for policy 0, policy_version 140 (0.0017)
[2023-02-23 16:17:00,903][00868] Fps is (10 sec: 3277.2, 60 sec: 3345.1, 300 sec: 3166.1). Total num frames: 585728. Throughput: 0: 851.2. Samples: 147026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:17:00,907][00868] Avg episode reward: [(0, '4.422')]
[2023-02-23 16:17:05,904][00868] Fps is (10 sec: 4095.8, 60 sec: 3413.3, 300 sec: 3190.6). Total num frames: 606208. Throughput: 0: 846.1. Samples: 150014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:17:05,911][00868] Avg episode reward: [(0, '4.267')]
[2023-02-23 16:17:08,024][11088] Updated weights for policy 0, policy_version 150 (0.0029)
[2023-02-23 16:17:10,903][00868] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3171.8). Total num frames: 618496. Throughput: 0: 821.9. Samples: 155068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:17:10,908][00868] Avg episode reward: [(0, '4.322')]
[2023-02-23 16:17:15,903][00868] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3174.4). Total num frames: 634880. Throughput: 0: 822.2. Samples: 159158. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:17:15,907][00868] Avg episode reward: [(0, '4.633')]
[2023-02-23 16:17:20,702][11088] Updated weights for policy 0, policy_version 160 (0.0029)
[2023-02-23 16:17:20,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3196.9). Total num frames: 655360. Throughput: 0: 845.5. Samples: 162044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:17:20,906][00868] Avg episode reward: [(0, '4.699')]
[2023-02-23 16:17:20,913][11073] Saving new best policy, reward=4.699!
[2023-02-23 16:17:25,904][00868] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3218.3). Total num frames: 675840. Throughput: 0: 863.0. Samples: 168484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:17:25,910][00868] Avg episode reward: [(0, '4.418')]
[2023-02-23 16:17:25,922][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000165_675840.pth...
[2023-02-23 16:17:30,905][00868] Fps is (10 sec: 3276.1, 60 sec: 3345.0, 300 sec: 3200.6). Total num frames: 688128. Throughput: 0: 841.9. Samples: 173250. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:17:30,913][00868] Avg episode reward: [(0, '4.470')]
[2023-02-23 16:17:32,595][11088] Updated weights for policy 0, policy_version 170 (0.0017)
[2023-02-23 16:17:35,905][00868] Fps is (10 sec: 2866.8, 60 sec: 3413.2, 300 sec: 3202.3). Total num frames: 704512. Throughput: 0: 844.4. Samples: 175250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:17:35,909][00868] Avg episode reward: [(0, '4.545')]
[2023-02-23 16:17:40,903][00868] Fps is (10 sec: 3687.2, 60 sec: 3413.3, 300 sec: 3222.2). Total num frames: 724992. Throughput: 0: 873.2. Samples: 180298. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:17:40,911][00868] Avg episode reward: [(0, '4.736')]
[2023-02-23 16:17:40,917][11073] Saving new best policy, reward=4.736!
[2023-02-23 16:17:43,908][11088] Updated weights for policy 0, policy_version 180 (0.0026)
[2023-02-23 16:17:45,909][00868] Fps is (10 sec: 4094.4, 60 sec: 3481.3, 300 sec: 3241.1). Total num frames: 745472. Throughput: 0: 875.9. Samples: 186448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:17:45,920][00868] Avg episode reward: [(0, '4.650')]
[2023-02-23 16:17:50,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3224.5). Total num frames: 757760. Throughput: 0: 869.0. Samples: 189118. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 16:17:50,911][00868] Avg episode reward: [(0, '4.723')]
[2023-02-23 16:17:55,907][00868] Fps is (10 sec: 2458.1, 60 sec: 3413.1, 300 sec: 3208.5). Total num frames: 770048. Throughput: 0: 843.4. Samples: 193024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:17:55,913][00868] Avg episode reward: [(0, '4.832')]
[2023-02-23 16:17:55,924][11073] Saving new best policy, reward=4.832!
[2023-02-23 16:17:57,676][11088] Updated weights for policy 0, policy_version 190 (0.0024)
[2023-02-23 16:18:00,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3226.6). Total num frames: 790528. Throughput: 0: 859.6. Samples: 197838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:00,906][00868] Avg episode reward: [(0, '4.873')]
[2023-02-23 16:18:00,908][11073] Saving new best policy, reward=4.873!
[2023-02-23 16:18:05,903][00868] Fps is (10 sec: 4097.4, 60 sec: 3413.4, 300 sec: 3244.0). Total num frames: 811008. Throughput: 0: 863.3. Samples: 200892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:05,907][00868] Avg episode reward: [(0, '4.821')]
[2023-02-23 16:18:07,849][11088] Updated weights for policy 0, policy_version 200 (0.0014)
[2023-02-23 16:18:10,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3244.7). Total num frames: 827392. Throughput: 0: 846.8. Samples: 206590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:10,912][00868] Avg episode reward: [(0, '4.875')]
[2023-02-23 16:18:10,914][11073] Saving new best policy, reward=4.875!
[2023-02-23 16:18:15,905][00868] Fps is (10 sec: 2866.7, 60 sec: 3413.2, 300 sec: 3229.5). Total num frames: 839680. Throughput: 0: 829.3. Samples: 210568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:18:15,910][00868] Avg episode reward: [(0, '4.813')]
[2023-02-23 16:18:20,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3230.4). Total num frames: 856064. Throughput: 0: 831.7. Samples: 212676. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:18:20,911][00868] Avg episode reward: [(0, '4.816')]
[2023-02-23 16:18:21,447][11088] Updated weights for policy 0, policy_version 210 (0.0027)
[2023-02-23 16:18:25,903][00868] Fps is (10 sec: 3687.1, 60 sec: 3345.1, 300 sec: 3246.5). Total num frames: 876544. Throughput: 0: 852.5. Samples: 218662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:25,908][00868] Avg episode reward: [(0, '4.870')]
[2023-02-23 16:18:30,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3413.5, 300 sec: 3247.0). Total num frames: 892928. Throughput: 0: 837.3. Samples: 224120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:30,906][00868] Avg episode reward: [(0, '4.925')]
[2023-02-23 16:18:30,910][11073] Saving new best policy, reward=4.925!
[2023-02-23 16:18:33,020][11088] Updated weights for policy 0, policy_version 220 (0.0022)
[2023-02-23 16:18:35,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3345.2, 300 sec: 3232.9). Total num frames: 905216. Throughput: 0: 819.5. Samples: 225996. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:18:35,911][00868] Avg episode reward: [(0, '4.895')]
[2023-02-23 16:18:40,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3233.7). Total num frames: 921600. Throughput: 0: 820.6. Samples: 229946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:18:40,906][00868] Avg episode reward: [(0, '4.706')]
[2023-02-23 16:18:45,765][11088] Updated weights for policy 0, policy_version 230 (0.0013)
[2023-02-23 16:18:45,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3277.1, 300 sec: 3248.6). Total num frames: 942080. Throughput: 0: 847.7. Samples: 235984. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:18:45,910][00868] Avg episode reward: [(0, '4.649')]
[2023-02-23 16:18:50,906][00868] Fps is (10 sec: 3685.4, 60 sec: 3344.9, 300 sec: 3249.0). Total num frames: 958464. Throughput: 0: 848.2. Samples: 239064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:18:50,913][00868] Avg episode reward: [(0, '4.460')]
[2023-02-23 16:18:55,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3345.3, 300 sec: 3290.7). Total num frames: 970752. Throughput: 0: 805.6. Samples: 242842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:18:55,907][00868] Avg episode reward: [(0, '4.447')]
[2023-02-23 16:18:59,559][11088] Updated weights for policy 0, policy_version 240 (0.0020)
[2023-02-23 16:19:00,903][00868] Fps is (10 sec: 2868.0, 60 sec: 3276.8, 300 sec: 3346.2). Total num frames: 987136. Throughput: 0: 808.1. Samples: 246932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:19:00,907][00868] Avg episode reward: [(0, '4.478')]
[2023-02-23 16:19:05,903][00868] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 1007616. Throughput: 0: 829.5. Samples: 250002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:19:05,906][00868] Avg episode reward: [(0, '4.607')]
[2023-02-23 16:19:09,517][11088] Updated weights for policy 0, policy_version 250 (0.0013)
[2023-02-23 16:19:10,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3457.3). Total num frames: 1028096. Throughput: 0: 840.6. Samples: 256488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:19:10,906][00868] Avg episode reward: [(0, '4.664')]
[2023-02-23 16:19:15,905][00868] Fps is (10 sec: 3276.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 1040384. Throughput: 0: 819.2. Samples: 260986. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:19:15,908][00868] Avg episode reward: [(0, '4.630')]
[2023-02-23 16:19:20,903][00868] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 1052672. Throughput: 0: 821.9. Samples: 262980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:19:20,907][00868] Avg episode reward: [(0, '4.676')]
[2023-02-23 16:19:24,927][11088] Updated weights for policy 0, policy_version 260 (0.0032)
[2023-02-23 16:19:25,903][00868] Fps is (10 sec: 2458.0, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 1064960. Throughput: 0: 809.3. Samples: 266366. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:19:25,905][00868] Avg episode reward: [(0, '4.765')]
[2023-02-23 16:19:25,924][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000260_1064960.pth...
[2023-02-23 16:19:26,072][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000064_262144.pth
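The pair of lines above shows periodic checkpoint rotation: a new `checkpoint_<version>_<frames>.pth` is written, then the oldest checkpoint is deleted so only the most recent few remain. A sketch of that rotation, assuming the keep-last-2 behavior implied by the log (the empty file write stands in for the real `torch.save` call):

```python
import os
import tempfile  # used in the demo below

def save_checkpoint(dirpath, policy_version, total_frames, keep_last=2):
    """Write checkpoint_<version>_<frames>.pth, then prune older checkpoints
    so at most keep_last remain. The 9-digit zero-padded version keeps
    lexicographic and chronological order in agreement."""
    name = f"checkpoint_{policy_version:09d}_{total_frames}.pth"
    path = os.path.join(dirpath, name)
    with open(path, "wb"):  # real code would torch.save(state_dict, path)
        pass
    checkpoints = sorted(f for f in os.listdir(dirpath) if f.startswith("checkpoint_"))
    for stale in checkpoints[:-keep_last]:
        os.remove(os.path.join(dirpath, stale))
    return path

# Demo with the versions/frames from this log:
with tempfile.TemporaryDirectory() as d:
    save_checkpoint(d, 64, 262144)      # later pruned, as in the log
    save_checkpoint(d, 165, 675840)
    latest = save_checkpoint(d, 260, 1064960)
    remaining = sorted(os.listdir(d))
```

After the third save, `checkpoint_000000064_262144.pth` is removed, mirroring the log lines above.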
[2023-02-23 16:19:30,903][00868] Fps is (10 sec: 2867.1, 60 sec: 3140.3, 300 sec: 3387.9). Total num frames: 1081344. Throughput: 0: 763.8. Samples: 270356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:19:30,908][00868] Avg episode reward: [(0, '4.844')]
[2023-02-23 16:19:35,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3360.1). Total num frames: 1093632. Throughput: 0: 759.8. Samples: 273252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:19:35,906][00868] Avg episode reward: [(0, '4.919')]
[2023-02-23 16:19:39,014][11088] Updated weights for policy 0, policy_version 270 (0.0027)
[2023-02-23 16:19:40,903][00868] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3360.2). Total num frames: 1110016. Throughput: 0: 765.7. Samples: 277300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:19:40,906][00868] Avg episode reward: [(0, '5.004')]
[2023-02-23 16:19:40,908][11073] Saving new best policy, reward=5.004!
[2023-02-23 16:19:45,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3374.0). Total num frames: 1126400. Throughput: 0: 781.1. Samples: 282080. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:19:45,910][00868] Avg episode reward: [(0, '5.262')]
[2023-02-23 16:19:45,922][11073] Saving new best policy, reward=5.262!
[2023-02-23 16:19:50,309][11088] Updated weights for policy 0, policy_version 280 (0.0020)
[2023-02-23 16:19:50,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3140.4, 300 sec: 3387.9). Total num frames: 1146880. Throughput: 0: 781.3. Samples: 285160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:19:50,909][00868] Avg episode reward: [(0, '5.308')]
[2023-02-23 16:19:50,916][11073] Saving new best policy, reward=5.308!
[2023-02-23 16:19:55,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 1163264. Throughput: 0: 772.1. Samples: 291232. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:19:55,912][00868] Avg episode reward: [(0, '5.115')]
[2023-02-23 16:20:00,908][00868] Fps is (10 sec: 3275.1, 60 sec: 3208.3, 300 sec: 3360.0). Total num frames: 1179648. Throughput: 0: 763.6. Samples: 295352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:20:00,915][00868] Avg episode reward: [(0, '5.058')]
[2023-02-23 16:20:03,482][11088] Updated weights for policy 0, policy_version 290 (0.0017)
[2023-02-23 16:20:05,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3374.0). Total num frames: 1196032. Throughput: 0: 762.9. Samples: 297310. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 16:20:05,910][00868] Avg episode reward: [(0, '5.118')]
[2023-02-23 16:20:10,903][00868] Fps is (10 sec: 3688.3, 60 sec: 3140.3, 300 sec: 3374.0). Total num frames: 1216512. Throughput: 0: 824.7. Samples: 303478. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:20:10,912][00868] Avg episode reward: [(0, '5.449')]
[2023-02-23 16:20:10,915][11073] Saving new best policy, reward=5.449!
[2023-02-23 16:20:13,563][11088] Updated weights for policy 0, policy_version 300 (0.0019)
[2023-02-23 16:20:15,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3208.6, 300 sec: 3360.1). Total num frames: 1232896. Throughput: 0: 867.3. Samples: 309384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:15,907][00868] Avg episode reward: [(0, '5.568')]
[2023-02-23 16:20:15,919][11073] Saving new best policy, reward=5.568!
[2023-02-23 16:20:20,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3346.2). Total num frames: 1249280. Throughput: 0: 849.2. Samples: 311466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:20,908][00868] Avg episode reward: [(0, '5.608')]
[2023-02-23 16:20:20,914][11073] Saving new best policy, reward=5.608!
[2023-02-23 16:20:25,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 1265664. Throughput: 0: 850.7. Samples: 315582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:20:25,906][00868] Avg episode reward: [(0, '5.769')]
[2023-02-23 16:20:25,918][11073] Saving new best policy, reward=5.769!
[2023-02-23 16:20:27,012][11088] Updated weights for policy 0, policy_version 310 (0.0026)
[2023-02-23 16:20:30,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 1286144. Throughput: 0: 882.9. Samples: 321810. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:30,906][00868] Avg episode reward: [(0, '6.098')]
[2023-02-23 16:20:30,911][11073] Saving new best policy, reward=6.098!
[2023-02-23 16:20:35,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3360.1). Total num frames: 1306624. Throughput: 0: 885.2. Samples: 324994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:35,911][00868] Avg episode reward: [(0, '6.298')]
[2023-02-23 16:20:35,939][11073] Saving new best policy, reward=6.298!
[2023-02-23 16:20:37,426][11088] Updated weights for policy 0, policy_version 320 (0.0013)
[2023-02-23 16:20:40,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 1318912. Throughput: 0: 854.7. Samples: 329694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:20:40,910][00868] Avg episode reward: [(0, '6.211')]
[2023-02-23 16:20:45,903][00868] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 1331200. Throughput: 0: 854.8. Samples: 333814. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:45,905][00868] Avg episode reward: [(0, '6.272')]
[2023-02-23 16:20:49,938][11088] Updated weights for policy 0, policy_version 330 (0.0050)
[2023-02-23 16:20:50,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3360.1). Total num frames: 1355776. Throughput: 0: 881.4. Samples: 336972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:20:50,912][00868] Avg episode reward: [(0, '6.205')]
[2023-02-23 16:20:55,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3360.1). Total num frames: 1376256. Throughput: 0: 888.9. Samples: 343478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:20:55,906][00868] Avg episode reward: [(0, '6.640')]
[2023-02-23 16:20:55,915][11073] Saving new best policy, reward=6.640!
[2023-02-23 16:21:00,904][00868] Fps is (10 sec: 3276.7, 60 sec: 3481.9, 300 sec: 3346.2). Total num frames: 1388544. Throughput: 0: 858.6. Samples: 348020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:21:00,911][00868] Avg episode reward: [(0, '6.670')]
[2023-02-23 16:21:00,913][11073] Saving new best policy, reward=6.670!
[2023-02-23 16:21:01,331][11088] Updated weights for policy 0, policy_version 340 (0.0018)
[2023-02-23 16:21:05,904][00868] Fps is (10 sec: 2457.3, 60 sec: 3413.3, 300 sec: 3332.3). Total num frames: 1400832. Throughput: 0: 856.1. Samples: 349990. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:21:05,909][00868] Avg episode reward: [(0, '6.481')]
[2023-02-23 16:21:10,903][00868] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 1421312. Throughput: 0: 883.4. Samples: 355336. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:21:10,911][00868] Avg episode reward: [(0, '6.228')]
[2023-02-23 16:21:12,841][11088] Updated weights for policy 0, policy_version 350 (0.0022)
[2023-02-23 16:21:15,903][00868] Fps is (10 sec: 4506.1, 60 sec: 3549.9, 300 sec: 3360.1). Total num frames: 1445888. Throughput: 0: 889.2. Samples: 361824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:21:15,905][00868] Avg episode reward: [(0, '6.480')]
[2023-02-23 16:21:20,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 1458176. Throughput: 0: 872.4. Samples: 364252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:21:20,906][00868] Avg episode reward: [(0, '7.023')]
[2023-02-23 16:21:20,908][11073] Saving new best policy, reward=7.023!
[2023-02-23 16:21:25,742][11088] Updated weights for policy 0, policy_version 360 (0.0013)
[2023-02-23 16:21:25,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 1474560. Throughput: 0: 857.1. Samples: 368264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:21:25,910][00868] Avg episode reward: [(0, '7.455')]
[2023-02-23 16:21:25,925][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000360_1474560.pth...
[2023-02-23 16:21:26,133][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000165_675840.pth
[2023-02-23 16:21:26,145][11073] Saving new best policy, reward=7.455!
[2023-02-23 16:21:30,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 1490944. Throughput: 0: 881.7. Samples: 373492. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:21:30,906][00868] Avg episode reward: [(0, '8.029')]
[2023-02-23 16:21:30,908][11073] Saving new best policy, reward=8.029!
[2023-02-23 16:21:35,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3360.1). Total num frames: 1511424. Throughput: 0: 880.0. Samples: 376574. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-23 16:21:35,909][00868] Avg episode reward: [(0, '8.001')]
[2023-02-23 16:21:36,359][11088] Updated weights for policy 0, policy_version 370 (0.0025)
[2023-02-23 16:21:40,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3360.1). Total num frames: 1527808. Throughput: 0: 862.0. Samples: 382266. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:21:40,910][00868] Avg episode reward: [(0, '8.616')]
[2023-02-23 16:21:40,913][11073] Saving new best policy, reward=8.616!
[2023-02-23 16:21:45,904][00868] Fps is (10 sec: 2867.0, 60 sec: 3481.6, 300 sec: 3346.2). Total num frames: 1540096. Throughput: 0: 848.9. Samples: 386222. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-23 16:21:45,908][00868] Avg episode reward: [(0, '8.896')]
[2023-02-23 16:21:45,926][11073] Saving new best policy, reward=8.896!
[2023-02-23 16:21:49,735][11088] Updated weights for policy 0, policy_version 380 (0.0016)
[2023-02-23 16:21:50,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 1560576. Throughput: 0: 854.0. Samples: 388418. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:21:50,906][00868] Avg episode reward: [(0, '8.872')]
[2023-02-23 16:21:55,903][00868] Fps is (10 sec: 4096.3, 60 sec: 3413.3, 300 sec: 3374.0). Total num frames: 1581056. Throughput: 0: 880.8. Samples: 394970. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-23 16:21:55,906][00868] Avg episode reward: [(0, '9.367')]
[2023-02-23 16:21:55,915][11073] Saving new best policy, reward=9.367!
[2023-02-23 16:21:59,844][11088] Updated weights for policy 0, policy_version 390 (0.0022)
[2023-02-23 16:22:00,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3360.1). Total num frames: 1597440. Throughput: 0: 857.6. Samples: 400414. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:22:00,910][00868] Avg episode reward: [(0, '9.182')]
[2023-02-23 16:22:05,905][00868] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3374.0). Total num frames: 1613824. Throughput: 0: 847.4. Samples: 402386. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:22:05,907][00868] Avg episode reward: [(0, '9.718')]
[2023-02-23 16:22:05,923][11073] Saving new best policy, reward=9.718!
[2023-02-23 16:22:10,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1630208. Throughput: 0: 856.9. Samples: 406826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:22:10,912][00868] Avg episode reward: [(0, '9.701')]
[2023-02-23 16:22:12,601][11088] Updated weights for policy 0, policy_version 400 (0.0039)
[2023-02-23 16:22:15,908][00868] Fps is (10 sec: 3684.8, 60 sec: 3413.1, 300 sec: 3373.9). Total num frames: 1650688. Throughput: 0: 885.4. Samples: 413338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:22:15,910][00868] Avg episode reward: [(0, '9.278')]
[2023-02-23 16:22:20,904][00868] Fps is (10 sec: 3686.0, 60 sec: 3481.5, 300 sec: 3360.1). Total num frames: 1667072. Throughput: 0: 889.4. Samples: 416598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:22:20,910][00868] Avg episode reward: [(0, '8.563')]
[2023-02-23 16:22:24,257][11088] Updated weights for policy 0, policy_version 410 (0.0015)
[2023-02-23 16:22:25,903][00868] Fps is (10 sec: 3278.3, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1683456. Throughput: 0: 855.5. Samples: 420764. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:22:25,906][00868] Avg episode reward: [(0, '8.018')]
[2023-02-23 16:22:30,906][00868] Fps is (10 sec: 3276.3, 60 sec: 3481.4, 300 sec: 3374.0). Total num frames: 1699840. Throughput: 0: 869.2. Samples: 425340. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:22:30,911][00868] Avg episode reward: [(0, '7.651')]
[2023-02-23 16:22:35,545][11088] Updated weights for policy 0, policy_version 420 (0.0027)
[2023-02-23 16:22:35,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1720320. Throughput: 0: 892.2. Samples: 428566. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:22:35,913][00868] Avg episode reward: [(0, '7.293')]
[2023-02-23 16:22:40,904][00868] Fps is (10 sec: 4096.8, 60 sec: 3549.8, 300 sec: 3374.0). Total num frames: 1740800. Throughput: 0: 889.9. Samples: 435016. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:22:40,908][00868] Avg episode reward: [(0, '8.779')]
[2023-02-23 16:22:45,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3374.0). Total num frames: 1753088. Throughput: 0: 859.7. Samples: 439100. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:22:45,906][00868] Avg episode reward: [(0, '9.633')]
[2023-02-23 16:22:48,073][11088] Updated weights for policy 0, policy_version 430 (0.0028)
[2023-02-23 16:22:50,903][00868] Fps is (10 sec: 2867.4, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1769472. Throughput: 0: 862.0. Samples: 441178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:22:50,908][00868] Avg episode reward: [(0, '10.696')]
[2023-02-23 16:22:50,917][11073] Saving new best policy, reward=10.696!
[2023-02-23 16:22:55,903][00868] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1789952. Throughput: 0: 899.0. Samples: 447282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:22:55,910][00868] Avg episode reward: [(0, '11.748')]
[2023-02-23 16:22:55,923][11073] Saving new best policy, reward=11.748!
[2023-02-23 16:22:58,082][11088] Updated weights for policy 0, policy_version 440 (0.0020)
[2023-02-23 16:23:00,903][00868] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3387.9). Total num frames: 1810432. Throughput: 0: 890.1. Samples: 453390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:00,912][00868] Avg episode reward: [(0, '12.262')]
[2023-02-23 16:23:00,914][11073] Saving new best policy, reward=12.262!
[2023-02-23 16:23:05,909][00868] Fps is (10 sec: 3275.0, 60 sec: 3481.3, 300 sec: 3373.9). Total num frames: 1822720. Throughput: 0: 861.6. Samples: 455374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:05,911][00868] Avg episode reward: [(0, '12.128')]
[2023-02-23 16:23:10,903][00868] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 1839104. Throughput: 0: 864.6. Samples: 459670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:10,906][00868] Avg episode reward: [(0, '12.457')]
[2023-02-23 16:23:10,909][11073] Saving new best policy, reward=12.457!
[2023-02-23 16:23:11,221][11088] Updated weights for policy 0, policy_version 450 (0.0027)
[2023-02-23 16:23:15,903][00868] Fps is (10 sec: 4098.3, 60 sec: 3550.1, 300 sec: 3415.6). Total num frames: 1863680. Throughput: 0: 907.3. Samples: 466166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:23:15,905][00868] Avg episode reward: [(0, '12.924')]
[2023-02-23 16:23:15,922][11073] Saving new best policy, reward=12.924!
[2023-02-23 16:23:20,591][11088] Updated weights for policy 0, policy_version 460 (0.0018)
[2023-02-23 16:23:20,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3415.6). Total num frames: 1884160. Throughput: 0: 910.0. Samples: 469518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:20,908][00868] Avg episode reward: [(0, '13.399')]
[2023-02-23 16:23:20,912][11073] Saving new best policy, reward=13.399!
[2023-02-23 16:23:25,907][00868] Fps is (10 sec: 3275.6, 60 sec: 3549.6, 300 sec: 3401.7). Total num frames: 1896448. Throughput: 0: 874.6. Samples: 474376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:25,910][00868] Avg episode reward: [(0, '13.383')]
[2023-02-23 16:23:25,925][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000463_1896448.pth...
[2023-02-23 16:23:26,057][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000260_1064960.pth
[2023-02-23 16:23:30,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3415.6). Total num frames: 1912832. Throughput: 0: 880.2. Samples: 478710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:23:30,906][00868] Avg episode reward: [(0, '13.859')]
[2023-02-23 16:23:30,910][11073] Saving new best policy, reward=13.859!
[2023-02-23 16:23:33,417][11088] Updated weights for policy 0, policy_version 470 (0.0027)
[2023-02-23 16:23:35,907][00868] Fps is (10 sec: 3686.4, 60 sec: 3549.7, 300 sec: 3429.5). Total num frames: 1933312. Throughput: 0: 905.4. Samples: 481924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:23:35,911][00868] Avg episode reward: [(0, '13.848')]
[2023-02-23 16:23:40,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3443.4). Total num frames: 1957888. Throughput: 0: 922.1. Samples: 488776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:23:40,909][00868] Avg episode reward: [(0, '13.708')]
[2023-02-23 16:23:43,710][11088] Updated weights for policy 0, policy_version 480 (0.0018)
[2023-02-23 16:23:45,903][00868] Fps is (10 sec: 3687.8, 60 sec: 3618.1, 300 sec: 3429.6). Total num frames: 1970176. Throughput: 0: 889.3. Samples: 493410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:23:45,906][00868] Avg episode reward: [(0, '13.864')]
[2023-02-23 16:23:45,923][11073] Saving new best policy, reward=13.864!
[2023-02-23 16:23:50,903][00868] Fps is (10 sec: 2457.6, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1982464. Throughput: 0: 891.1. Samples: 495468. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:23:50,906][00868] Avg episode reward: [(0, '14.273')]
[2023-02-23 16:23:50,930][11073] Saving new best policy, reward=14.273!
[2023-02-23 16:23:55,344][11088] Updated weights for policy 0, policy_version 490 (0.0025)
[2023-02-23 16:23:55,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 2007040. Throughput: 0: 925.2. Samples: 501306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:23:55,907][00868] Avg episode reward: [(0, '14.503')]
[2023-02-23 16:23:55,919][11073] Saving new best policy, reward=14.503!
[2023-02-23 16:24:00,905][00868] Fps is (10 sec: 4504.7, 60 sec: 3618.0, 300 sec: 3457.3). Total num frames: 2027520. Throughput: 0: 931.8. Samples: 508098. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:00,915][00868] Avg episode reward: [(0, '14.716')]
[2023-02-23 16:24:00,958][11073] Saving new best policy, reward=14.716!
[2023-02-23 16:24:05,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3686.7, 300 sec: 3443.4). Total num frames: 2043904. Throughput: 0: 902.8. Samples: 510144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:05,905][00868] Avg episode reward: [(0, '14.355')]
[2023-02-23 16:24:06,777][11088] Updated weights for policy 0, policy_version 500 (0.0021)
[2023-02-23 16:24:10,903][00868] Fps is (10 sec: 2867.7, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 2056192. Throughput: 0: 892.5. Samples: 514534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:10,905][00868] Avg episode reward: [(0, '15.221')]
[2023-02-23 16:24:10,911][11073] Saving new best policy, reward=15.221!
[2023-02-23 16:24:15,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 2080768. Throughput: 0: 932.5. Samples: 520674. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:24:15,906][00868] Avg episode reward: [(0, '16.187')]
[2023-02-23 16:24:15,919][11073] Saving new best policy, reward=16.187!
[2023-02-23 16:24:17,454][11088] Updated weights for policy 0, policy_version 510 (0.0012)
[2023-02-23 16:24:20,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 2101248. Throughput: 0: 935.5. Samples: 524020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:20,907][00868] Avg episode reward: [(0, '16.208')]
[2023-02-23 16:24:20,964][11073] Saving new best policy, reward=16.208!
[2023-02-23 16:24:25,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3686.6, 300 sec: 3512.8). Total num frames: 2117632. Throughput: 0: 902.8. Samples: 529400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:25,908][00868] Avg episode reward: [(0, '16.865')]
[2023-02-23 16:24:25,918][11073] Saving new best policy, reward=16.865!
[2023-02-23 16:24:29,480][11088] Updated weights for policy 0, policy_version 520 (0.0014)
[2023-02-23 16:24:30,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 2129920. Throughput: 0: 894.4. Samples: 533660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:30,910][00868] Avg episode reward: [(0, '15.538')]
[2023-02-23 16:24:35,904][00868] Fps is (10 sec: 3686.2, 60 sec: 3686.6, 300 sec: 3540.6). Total num frames: 2154496. Throughput: 0: 917.6. Samples: 536760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:24:35,906][00868] Avg episode reward: [(0, '14.106')]
[2023-02-23 16:24:39,038][11088] Updated weights for policy 0, policy_version 530 (0.0020)
[2023-02-23 16:24:40,903][00868] Fps is (10 sec: 4915.2, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 2179072. Throughput: 0: 941.9. Samples: 543692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:40,905][00868] Avg episode reward: [(0, '13.894')]
[2023-02-23 16:24:45,908][00868] Fps is (10 sec: 3684.8, 60 sec: 3686.1, 300 sec: 3540.6). Total num frames: 2191360. Throughput: 0: 903.5. Samples: 548758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:24:45,910][00868] Avg episode reward: [(0, '13.176')]
[2023-02-23 16:24:50,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3540.6). Total num frames: 2207744. Throughput: 0: 905.9. Samples: 550910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:24:50,911][00868] Avg episode reward: [(0, '13.737')]
[2023-02-23 16:24:51,827][11088] Updated weights for policy 0, policy_version 540 (0.0027)
[2023-02-23 16:24:55,903][00868] Fps is (10 sec: 3688.1, 60 sec: 3686.4, 300 sec: 3554.6). Total num frames: 2228224. Throughput: 0: 933.2. Samples: 556530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:24:55,911][00868] Avg episode reward: [(0, '14.988')]
[2023-02-23 16:25:00,814][11088] Updated weights for policy 0, policy_version 550 (0.0015)
[2023-02-23 16:25:00,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3754.8, 300 sec: 3582.3). Total num frames: 2252800. Throughput: 0: 951.2. Samples: 563480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:25:00,909][00868] Avg episode reward: [(0, '15.444')]
[2023-02-23 16:25:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 2269184. Throughput: 0: 933.6. Samples: 566032. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:25:05,911][00868] Avg episode reward: [(0, '16.093')]
[2023-02-23 16:25:10,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3554.5). Total num frames: 2281472. Throughput: 0: 910.8. Samples: 570384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:25:10,912][00868] Avg episode reward: [(0, '18.183')]
[2023-02-23 16:25:10,918][11073] Saving new best policy, reward=18.183!
[2023-02-23 16:25:13,612][11088] Updated weights for policy 0, policy_version 560 (0.0022)
[2023-02-23 16:25:15,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 2301952. Throughput: 0: 945.8. Samples: 576222. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:25:15,910][00868] Avg episode reward: [(0, '19.144')]
[2023-02-23 16:25:15,921][11073] Saving new best policy, reward=19.144!
[2023-02-23 16:25:20,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 2326528. Throughput: 0: 953.4. Samples: 579662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:25:20,908][00868] Avg episode reward: [(0, '19.207')]
[2023-02-23 16:25:20,916][11073] Saving new best policy, reward=19.207!
[2023-02-23 16:25:22,900][11088] Updated weights for policy 0, policy_version 570 (0.0022)
[2023-02-23 16:25:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 2342912. Throughput: 0: 925.1. Samples: 585322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:25:25,910][00868] Avg episode reward: [(0, '19.118')]
[2023-02-23 16:25:25,926][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000572_2342912.pth...
[2023-02-23 16:25:26,069][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000360_1474560.pth
[2023-02-23 16:25:30,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3554.5). Total num frames: 2355200. Throughput: 0: 906.2. Samples: 589534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:25:30,905][00868] Avg episode reward: [(0, '19.363')]
[2023-02-23 16:25:30,909][11073] Saving new best policy, reward=19.363!
[2023-02-23 16:25:35,444][11088] Updated weights for policy 0, policy_version 580 (0.0045)
[2023-02-23 16:25:35,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 2375680. Throughput: 0: 919.7. Samples: 592298. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:25:35,905][00868] Avg episode reward: [(0, '17.966')]
[2023-02-23 16:25:40,903][00868] Fps is (10 sec: 4505.5, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 2400256. Throughput: 0: 949.2. Samples: 599242. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:25:40,907][00868] Avg episode reward: [(0, '17.556')]
[2023-02-23 16:25:45,369][11088] Updated weights for policy 0, policy_version 590 (0.0012)
[2023-02-23 16:25:45,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3755.0, 300 sec: 3596.1). Total num frames: 2416640. Throughput: 0: 913.6. Samples: 604594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:25:45,906][00868] Avg episode reward: [(0, '18.286')]
[2023-02-23 16:25:50,904][00868] Fps is (10 sec: 2866.9, 60 sec: 3686.3, 300 sec: 3568.4). Total num frames: 2428928. Throughput: 0: 904.2. Samples: 606724. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-23 16:25:50,912][00868] Avg episode reward: [(0, '18.007')]
[2023-02-23 16:25:55,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 2449408. Throughput: 0: 925.4. Samples: 612028. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:25:55,909][00868] Avg episode reward: [(0, '18.989')]
[2023-02-23 16:25:57,028][11088] Updated weights for policy 0, policy_version 600 (0.0034)
[2023-02-23 16:26:00,903][00868] Fps is (10 sec: 4506.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2473984. Throughput: 0: 950.6. Samples: 619000. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:26:00,906][00868] Avg episode reward: [(0, '19.014')]
[2023-02-23 16:26:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 2490368. Throughput: 0: 938.6. Samples: 621900. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:26:05,908][00868] Avg episode reward: [(0, '19.402')]
[2023-02-23 16:26:05,921][11073] Saving new best policy, reward=19.402!
[2023-02-23 16:26:07,917][11088] Updated weights for policy 0, policy_version 610 (0.0023)
[2023-02-23 16:26:10,907][00868] Fps is (10 sec: 3275.6, 60 sec: 3754.4, 300 sec: 3596.1). Total num frames: 2506752. Throughput: 0: 906.4. Samples: 626114. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:26:10,910][00868] Avg episode reward: [(0, '19.274')]
[2023-02-23 16:26:15,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2523136. Throughput: 0: 938.7. Samples: 631776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:26:15,912][00868] Avg episode reward: [(0, '20.315')]
[2023-02-23 16:26:15,924][11073] Saving new best policy, reward=20.315!
[2023-02-23 16:26:18,729][11088] Updated weights for policy 0, policy_version 620 (0.0015)
[2023-02-23 16:26:20,903][00868] Fps is (10 sec: 4097.5, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2547712. Throughput: 0: 954.1. Samples: 635234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:26:20,906][00868] Avg episode reward: [(0, '19.785')]
[2023-02-23 16:26:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 2564096. Throughput: 0: 935.8. Samples: 641354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:26:25,906][00868] Avg episode reward: [(0, '20.508')]
[2023-02-23 16:26:25,915][11073] Saving new best policy, reward=20.508!
[2023-02-23 16:26:30,544][11088] Updated weights for policy 0, policy_version 630 (0.0019)
[2023-02-23 16:26:30,903][00868] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3623.9). Total num frames: 2580480. Throughput: 0: 912.7. Samples: 645664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:26:30,914][00868] Avg episode reward: [(0, '19.837')]
[2023-02-23 16:26:35,903][00868] Fps is (10 sec: 3686.3, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 2600960. Throughput: 0: 921.4. Samples: 648186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:26:35,905][00868] Avg episode reward: [(0, '19.254')]
[2023-02-23 16:26:40,070][11088] Updated weights for policy 0, policy_version 640 (0.0016)
[2023-02-23 16:26:40,903][00868] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2621440. Throughput: 0: 958.3. Samples: 655150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:26:40,906][00868] Avg episode reward: [(0, '18.775')]
[2023-02-23 16:26:45,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 2641920. Throughput: 0: 930.1. Samples: 660854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:26:45,906][00868] Avg episode reward: [(0, '18.701')]
[2023-02-23 16:26:50,904][00868] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 2654208. Throughput: 0: 914.0. Samples: 663030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:26:50,908][00868] Avg episode reward: [(0, '18.548')]
[2023-02-23 16:26:52,663][11088] Updated weights for policy 0, policy_version 650 (0.0024)
[2023-02-23 16:26:55,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 2674688. Throughput: 0: 933.5. Samples: 668116. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:26:55,906][00868] Avg episode reward: [(0, '19.559')]
[2023-02-23 16:27:00,903][00868] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2699264. Throughput: 0: 963.2. Samples: 675120. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:27:00,908][00868] Avg episode reward: [(0, '20.143')]
[2023-02-23 16:27:01,643][11088] Updated weights for policy 0, policy_version 660 (0.0021)
[2023-02-23 16:27:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2715648. Throughput: 0: 958.3. Samples: 678358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:05,906][00868] Avg episode reward: [(0, '21.302')]
[2023-02-23 16:27:05,921][11073] Saving new best policy, reward=21.302!
[2023-02-23 16:27:10,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.9, 300 sec: 3665.6). Total num frames: 2732032. Throughput: 0: 917.2. Samples: 682626. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:27:10,911][00868] Avg episode reward: [(0, '22.004')]
[2023-02-23 16:27:10,914][11073] Saving new best policy, reward=22.004!
[2023-02-23 16:27:14,333][11088] Updated weights for policy 0, policy_version 670 (0.0013)
[2023-02-23 16:27:15,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 2748416. Throughput: 0: 938.5. Samples: 687896. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:15,911][00868] Avg episode reward: [(0, '21.133')]
[2023-02-23 16:27:20,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 2772992. Throughput: 0: 959.6. Samples: 691366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:27:20,905][00868] Avg episode reward: [(0, '22.620')]
[2023-02-23 16:27:20,915][11073] Saving new best policy, reward=22.620!
[2023-02-23 16:27:23,293][11088] Updated weights for policy 0, policy_version 680 (0.0031)
[2023-02-23 16:27:25,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3693.4). Total num frames: 2789376. Throughput: 0: 946.0. Samples: 697722. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:25,913][00868] Avg episode reward: [(0, '23.127')]
[2023-02-23 16:27:25,926][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000681_2789376.pth...
[2023-02-23 16:27:26,074][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000463_1896448.pth
[2023-02-23 16:27:26,106][11073] Saving new best policy, reward=23.127!
[2023-02-23 16:27:30,904][00868] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 2805760. Throughput: 0: 912.2. Samples: 701904. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:27:30,909][00868] Avg episode reward: [(0, '23.648')]
[2023-02-23 16:27:30,917][11073] Saving new best policy, reward=23.648!
[2023-02-23 16:27:35,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 2822144. Throughput: 0: 909.8. Samples: 703972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:27:35,906][00868] Avg episode reward: [(0, '23.937')]
[2023-02-23 16:27:35,922][11073] Saving new best policy, reward=23.937!
[2023-02-23 16:27:36,267][11088] Updated weights for policy 0, policy_version 690 (0.0029)
[2023-02-23 16:27:40,903][00868] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 2846720. Throughput: 0: 948.4. Samples: 710794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:40,908][00868] Avg episode reward: [(0, '25.020')]
[2023-02-23 16:27:40,911][11073] Saving new best policy, reward=25.020!
[2023-02-23 16:27:45,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 2863104. Throughput: 0: 924.6. Samples: 716726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:45,912][00868] Avg episode reward: [(0, '22.999')]
[2023-02-23 16:27:46,266][11088] Updated weights for policy 0, policy_version 700 (0.0020)
[2023-02-23 16:27:50,905][00868] Fps is (10 sec: 3276.2, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 2879488. Throughput: 0: 901.4. Samples: 718924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:27:50,910][00868] Avg episode reward: [(0, '23.045')]
[2023-02-23 16:27:55,905][00868] Fps is (10 sec: 2866.6, 60 sec: 3618.0, 300 sec: 3665.5). Total num frames: 2891776. Throughput: 0: 897.2. Samples: 723004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:27:55,909][00868] Avg episode reward: [(0, '22.283')]
[2023-02-23 16:28:00,787][11088] Updated weights for policy 0, policy_version 710 (0.0023)
[2023-02-23 16:28:00,907][00868] Fps is (10 sec: 2866.7, 60 sec: 3481.4, 300 sec: 3679.5). Total num frames: 2908160. Throughput: 0: 877.8. Samples: 727398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:00,910][00868] Avg episode reward: [(0, '21.820')]
[2023-02-23 16:28:05,903][00868] Fps is (10 sec: 3277.5, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 2924544. Throughput: 0: 850.4. Samples: 729634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:28:05,908][00868] Avg episode reward: [(0, '21.148')]
[2023-02-23 16:28:10,910][00868] Fps is (10 sec: 2866.2, 60 sec: 3413.0, 300 sec: 3637.7). Total num frames: 2936832. Throughput: 0: 815.3. Samples: 734418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:28:10,913][00868] Avg episode reward: [(0, '20.552')]
[2023-02-23 16:28:14,185][11088] Updated weights for policy 0, policy_version 720 (0.0024)
[2023-02-23 16:28:15,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3623.9). Total num frames: 2953216. Throughput: 0: 828.2. Samples: 739172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:28:15,905][00868] Avg episode reward: [(0, '20.546')]
[2023-02-23 16:28:20,903][00868] Fps is (10 sec: 4098.8, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 2977792. Throughput: 0: 860.1. Samples: 742676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:20,911][00868] Avg episode reward: [(0, '22.280')]
[2023-02-23 16:28:23,093][11088] Updated weights for policy 0, policy_version 730 (0.0014)
[2023-02-23 16:28:25,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 2998272. Throughput: 0: 863.8. Samples: 749664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:25,906][00868] Avg episode reward: [(0, '22.910')]
[2023-02-23 16:28:30,909][00868] Fps is (10 sec: 3684.2, 60 sec: 3481.3, 300 sec: 3665.5). Total num frames: 3014656. Throughput: 0: 827.8. Samples: 753982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:28:30,917][00868] Avg episode reward: [(0, '23.010')]
[2023-02-23 16:28:35,767][11088] Updated weights for policy 0, policy_version 740 (0.0027)
[2023-02-23 16:28:35,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 3031040. Throughput: 0: 827.0. Samples: 756138. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:28:35,911][00868] Avg episode reward: [(0, '23.558')]
[2023-02-23 16:28:40,903][00868] Fps is (10 sec: 3688.6, 60 sec: 3413.3, 300 sec: 3665.6). Total num frames: 3051520. Throughput: 0: 878.2. Samples: 762522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:28:40,908][00868] Avg episode reward: [(0, '24.994')]
[2023-02-23 16:28:44,874][11088] Updated weights for policy 0, policy_version 750 (0.0015)
[2023-02-23 16:28:45,903][00868] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3693.3). Total num frames: 3072000. Throughput: 0: 926.8. Samples: 769102. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:45,910][00868] Avg episode reward: [(0, '24.896')]
[2023-02-23 16:28:50,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3665.6). Total num frames: 3088384. Throughput: 0: 925.2. Samples: 771270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:50,910][00868] Avg episode reward: [(0, '24.054')]
[2023-02-23 16:28:55,903][00868] Fps is (10 sec: 3276.9, 60 sec: 3550.0, 300 sec: 3651.7). Total num frames: 3104768. Throughput: 0: 916.0. Samples: 775634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:28:55,909][00868] Avg episode reward: [(0, '22.904')]
[2023-02-23 16:28:57,518][11088] Updated weights for policy 0, policy_version 760 (0.0032)
[2023-02-23 16:29:00,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3665.6). Total num frames: 3125248. Throughput: 0: 957.5. Samples: 782258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:29:00,911][00868] Avg episode reward: [(0, '21.920')]
[2023-02-23 16:29:05,909][00868] Fps is (10 sec: 4503.2, 60 sec: 3754.3, 300 sec: 3707.2). Total num frames: 3149824. Throughput: 0: 953.8. Samples: 785604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:29:05,919][00868] Avg episode reward: [(0, '21.985')]
[2023-02-23 16:29:07,385][11088] Updated weights for policy 0, policy_version 770 (0.0012)
[2023-02-23 16:29:10,906][00868] Fps is (10 sec: 3685.4, 60 sec: 3754.9, 300 sec: 3665.5). Total num frames: 3162112. Throughput: 0: 910.4. Samples: 790634. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:29:10,908][00868] Avg episode reward: [(0, '20.752')]
[2023-02-23 16:29:15,903][00868] Fps is (10 sec: 2868.7, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3178496. Throughput: 0: 911.8. Samples: 795006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:29:15,905][00868] Avg episode reward: [(0, '21.277')]
[2023-02-23 16:29:19,290][11088] Updated weights for policy 0, policy_version 780 (0.0026)
[2023-02-23 16:29:20,904][00868] Fps is (10 sec: 3687.1, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 3198976. Throughput: 0: 938.7. Samples: 798378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:29:20,908][00868] Avg episode reward: [(0, '22.130')]
[2023-02-23 16:29:25,904][00868] Fps is (10 sec: 4505.4, 60 sec: 3754.6, 300 sec: 3707.2). Total num frames: 3223552. Throughput: 0: 953.2. Samples: 805416. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:29:25,908][00868] Avg episode reward: [(0, '22.451')]
[2023-02-23 16:29:25,921][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000787_3223552.pth...
[2023-02-23 16:29:26,051][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000572_2342912.pth
[2023-02-23 16:29:29,594][11088] Updated weights for policy 0, policy_version 790 (0.0020)
[2023-02-23 16:29:30,906][00868] Fps is (10 sec: 3685.7, 60 sec: 3686.6, 300 sec: 3665.5). Total num frames: 3235840. Throughput: 0: 910.0. Samples: 810054. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:29:30,909][00868] Avg episode reward: [(0, '21.998')]
[2023-02-23 16:29:35,903][00868] Fps is (10 sec: 2867.3, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3252224. Throughput: 0: 908.8. Samples: 812168. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:29:35,911][00868] Avg episode reward: [(0, '22.544')]
[2023-02-23 16:29:40,832][11088] Updated weights for policy 0, policy_version 800 (0.0029)
[2023-02-23 16:29:40,903][00868] Fps is (10 sec: 4097.1, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3276800. Throughput: 0: 946.4. Samples: 818220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:29:40,906][00868] Avg episode reward: [(0, '22.189')]
[2023-02-23 16:29:45,903][00868] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3297280. Throughput: 0: 949.5. Samples: 824984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:29:45,906][00868] Avg episode reward: [(0, '22.960')]
[2023-02-23 16:29:50,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 3309568. Throughput: 0: 924.9. Samples: 827218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:29:50,906][00868] Avg episode reward: [(0, '23.468')]
[2023-02-23 16:29:52,267][11088] Updated weights for policy 0, policy_version 810 (0.0035)
[2023-02-23 16:29:55,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3325952. Throughput: 0: 909.8. Samples: 831574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:29:55,910][00868] Avg episode reward: [(0, '23.323')]
[2023-02-23 16:30:00,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3350528. Throughput: 0: 949.6. Samples: 837740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:30:00,906][00868] Avg episode reward: [(0, '23.866')]
[2023-02-23 16:30:02,742][11088] Updated weights for policy 0, policy_version 820 (0.0029)
[2023-02-23 16:30:05,906][00868] Fps is (10 sec: 4504.5, 60 sec: 3686.6, 300 sec: 3693.3). Total num frames: 3371008. Throughput: 0: 947.8. Samples: 841030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:30:05,913][00868] Avg episode reward: [(0, '24.164')]
[2023-02-23 16:30:10,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3679.5). Total num frames: 3387392. Throughput: 0: 914.8. Samples: 846582. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:30:10,905][00868] Avg episode reward: [(0, '24.291')]
[2023-02-23 16:30:14,745][11088] Updated weights for policy 0, policy_version 830 (0.0022)
[2023-02-23 16:30:15,903][00868] Fps is (10 sec: 2867.9, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3399680. Throughput: 0: 909.2. Samples: 850966. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:30:15,909][00868] Avg episode reward: [(0, '23.781')]
[2023-02-23 16:30:20,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3424256. Throughput: 0: 927.1. Samples: 853886. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 16:30:20,912][00868] Avg episode reward: [(0, '24.144')]
[2023-02-23 16:30:24,366][11088] Updated weights for policy 0, policy_version 840 (0.0016)
[2023-02-23 16:30:25,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 3444736. Throughput: 0: 948.0. Samples: 860882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:30:25,906][00868] Avg episode reward: [(0, '22.451')]
[2023-02-23 16:30:30,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3754.8, 300 sec: 3679.5). Total num frames: 3461120. Throughput: 0: 914.7. Samples: 866144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:30:30,906][00868] Avg episode reward: [(0, '22.366')]
[2023-02-23 16:30:35,903][00868] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3477504. Throughput: 0: 912.4. Samples: 868278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:30:35,911][00868] Avg episode reward: [(0, '23.498')]
[2023-02-23 16:30:37,020][11088] Updated weights for policy 0, policy_version 850 (0.0048)
[2023-02-23 16:30:40,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 3497984. Throughput: 0: 939.7. Samples: 873860. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:30:40,906][00868] Avg episode reward: [(0, '24.199')]
[2023-02-23 16:30:45,811][11088] Updated weights for policy 0, policy_version 860 (0.0024)
[2023-02-23 16:30:45,903][00868] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 3522560. Throughput: 0: 956.4. Samples: 880778. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 16:30:45,906][00868] Avg episode reward: [(0, '23.735')]
[2023-02-23 16:30:50,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3693.3). Total num frames: 3538944. Throughput: 0: 946.2. Samples: 883608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:30:50,906][00868] Avg episode reward: [(0, '24.411')]
[2023-02-23 16:30:55,903][00868] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3551232. Throughput: 0: 917.5. Samples: 887870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:30:55,910][00868] Avg episode reward: [(0, '25.743')]
[2023-02-23 16:30:55,922][11073] Saving new best policy, reward=25.743!
[2023-02-23 16:30:58,587][11088] Updated weights for policy 0, policy_version 870 (0.0034)
[2023-02-23 16:31:00,904][00868] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 3571712. Throughput: 0: 945.9. Samples: 893534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:31:00,912][00868] Avg episode reward: [(0, '26.451')]
[2023-02-23 16:31:00,914][11073] Saving new best policy, reward=26.451!
[2023-02-23 16:31:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3686.6, 300 sec: 3679.5). Total num frames: 3592192. Throughput: 0: 953.2. Samples: 896780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:31:05,910][00868] Avg episode reward: [(0, '24.412')]
[2023-02-23 16:31:07,898][11088] Updated weights for policy 0, policy_version 880 (0.0020)
[2023-02-23 16:31:10,905][00868] Fps is (10 sec: 4095.6, 60 sec: 3754.6, 300 sec: 3693.3). Total num frames: 3612672. Throughput: 0: 931.3. Samples: 902792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:31:10,907][00868] Avg episode reward: [(0, '25.042')]
[2023-02-23 16:31:15,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3624960. Throughput: 0: 909.7. Samples: 907082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:31:15,906][00868] Avg episode reward: [(0, '24.678')]
[2023-02-23 16:31:20,444][11088] Updated weights for policy 0, policy_version 890 (0.0035)
[2023-02-23 16:31:20,903][00868] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 3645440. Throughput: 0: 919.6. Samples: 909660. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:31:20,905][00868] Avg episode reward: [(0, '24.514')]
[2023-02-23 16:31:25,903][00868] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3670016. Throughput: 0: 952.0. Samples: 916700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:31:25,906][00868] Avg episode reward: [(0, '24.263')]
[2023-02-23 16:31:25,920][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000896_3670016.pth...
[2023-02-23 16:31:26,035][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000681_2789376.pth
[2023-02-23 16:31:30,192][11088] Updated weights for policy 0, policy_version 900 (0.0018)
[2023-02-23 16:31:30,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3686400. Throughput: 0: 922.7. Samples: 922298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:31:30,914][00868] Avg episode reward: [(0, '24.698')]
[2023-02-23 16:31:35,904][00868] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3665.6). Total num frames: 3702784. Throughput: 0: 908.8. Samples: 924506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:31:35,910][00868] Avg episode reward: [(0, '25.338')]
[2023-02-23 16:31:40,904][00868] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3651.7). Total num frames: 3719168. Throughput: 0: 929.9. Samples: 929716. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:31:40,906][00868] Avg episode reward: [(0, '26.484')]
[2023-02-23 16:31:40,981][11073] Saving new best policy, reward=26.484!
[2023-02-23 16:31:41,855][11088] Updated weights for policy 0, policy_version 910 (0.0023)
[2023-02-23 16:31:45,903][00868] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 3743744. Throughput: 0: 958.9. Samples: 936686. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:31:45,906][00868] Avg episode reward: [(0, '27.067')]
[2023-02-23 16:31:45,918][11073] Saving new best policy, reward=27.067!
[2023-02-23 16:31:50,903][00868] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3679.5). Total num frames: 3760128. Throughput: 0: 955.0. Samples: 939754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:31:50,910][00868] Avg episode reward: [(0, '27.464')]
[2023-02-23 16:31:50,914][11073] Saving new best policy, reward=27.464!
[2023-02-23 16:31:52,614][11088] Updated weights for policy 0, policy_version 920 (0.0022)
[2023-02-23 16:31:55,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 3776512. Throughput: 0: 917.2. Samples: 944064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:31:55,908][00868] Avg episode reward: [(0, '27.502')]
[2023-02-23 16:31:55,920][11073] Saving new best policy, reward=27.502!
[2023-02-23 16:32:00,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3796992. Throughput: 0: 944.7. Samples: 949594. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:32:00,906][00868] Avg episode reward: [(0, '29.239')]
[2023-02-23 16:32:00,911][11073] Saving new best policy, reward=29.239!
[2023-02-23 16:32:03,455][11088] Updated weights for policy 0, policy_version 930 (0.0017)
[2023-02-23 16:32:05,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3817472. Throughput: 0: 959.5. Samples: 952838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:32:05,906][00868] Avg episode reward: [(0, '27.535')]
[2023-02-23 16:32:10,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.8, 300 sec: 3693.3). Total num frames: 3837952. Throughput: 0: 943.8. Samples: 959170. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:32:10,908][00868] Avg episode reward: [(0, '27.526')]
[2023-02-23 16:32:14,691][11088] Updated weights for policy 0, policy_version 940 (0.0024)
[2023-02-23 16:32:15,904][00868] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3651.7). Total num frames: 3850240. Throughput: 0: 917.7. Samples: 963596. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:32:15,910][00868] Avg episode reward: [(0, '27.987')]
[2023-02-23 16:32:20,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3870720. Throughput: 0: 923.7. Samples: 966070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:32:20,911][00868] Avg episode reward: [(0, '27.453')]
[2023-02-23 16:32:24,638][11088] Updated weights for policy 0, policy_version 950 (0.0015)
[2023-02-23 16:32:25,903][00868] Fps is (10 sec: 4505.8, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3895296. Throughput: 0: 964.7. Samples: 973126. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 16:32:25,909][00868] Avg episode reward: [(0, '26.490')]
[2023-02-23 16:32:30,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 3911680. Throughput: 0: 942.8. Samples: 979112. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:32:30,912][00868] Avg episode reward: [(0, '26.766')]
[2023-02-23 16:32:35,903][00868] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3928064. Throughput: 0: 922.4. Samples: 981264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 16:32:35,907][00868] Avg episode reward: [(0, '26.889')]
[2023-02-23 16:32:36,704][11088] Updated weights for policy 0, policy_version 960 (0.0018)
[2023-02-23 16:32:40,903][00868] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3679.5). Total num frames: 3948544. Throughput: 0: 938.1. Samples: 986278. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 16:32:40,913][00868] Avg episode reward: [(0, '27.777')]
[2023-02-23 16:32:45,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3693.4). Total num frames: 3969024. Throughput: 0: 970.4. Samples: 993264. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 16:32:45,908][00868] Avg episode reward: [(0, '27.121')]
[2023-02-23 16:32:46,051][11088] Updated weights for policy 0, policy_version 970 (0.0032)
[2023-02-23 16:32:50,903][00868] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 3989504. Throughput: 0: 975.3. Samples: 996728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 16:32:50,910][00868] Avg episode reward: [(0, '27.208')]
[2023-02-23 16:32:55,746][11073] Stopping Batcher_0...
[2023-02-23 16:32:55,746][11073] Loop batcher_evt_loop terminating...
[2023-02-23 16:32:55,747][00868] Component Batcher_0 stopped!
[2023-02-23 16:32:55,754][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 16:32:55,834][00868] Component RolloutWorker_w7 stopped!
[2023-02-23 16:32:55,841][11095] Stopping RolloutWorker_w7...
[2023-02-23 16:32:55,841][11095] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 16:32:55,829][11088] Weights refcount: 2 0
[2023-02-23 16:32:55,858][00868] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 16:32:55,862][11088] Stopping InferenceWorker_p0-w0...
[2023-02-23 16:32:55,863][11088] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 16:32:55,888][11094] Stopping RolloutWorker_w6...
[2023-02-23 16:32:55,884][00868] Component RolloutWorker_w1 stopped!
[2023-02-23 16:32:55,890][00868] Component RolloutWorker_w6 stopped!
[2023-02-23 16:32:55,892][11089] Stopping RolloutWorker_w1...
[2023-02-23 16:32:55,893][11089] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 16:32:55,893][11092] Stopping RolloutWorker_w4...
[2023-02-23 16:32:55,893][00868] Component RolloutWorker_w4 stopped!
[2023-02-23 16:32:55,905][11092] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 16:32:55,888][11094] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 16:32:55,951][11087] Stopping RolloutWorker_w0...
[2023-02-23 16:32:55,951][00868] Component RolloutWorker_w0 stopped!
[2023-02-23 16:32:55,962][11087] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 16:32:55,964][00868] Component RolloutWorker_w5 stopped!
[2023-02-23 16:32:55,970][11093] Stopping RolloutWorker_w5...
[2023-02-23 16:32:55,970][11093] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 16:32:55,985][00868] Component RolloutWorker_w3 stopped!
[2023-02-23 16:32:55,989][11091] Stopping RolloutWorker_w3...
[2023-02-23 16:32:55,989][11091] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 16:32:55,992][11073] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000787_3223552.pth
[2023-02-23 16:32:56,010][11090] Stopping RolloutWorker_w2...
[2023-02-23 16:32:56,010][00868] Component RolloutWorker_w2 stopped!
[2023-02-23 16:32:56,009][11073] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 16:32:56,026][11090] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 16:32:56,300][00868] Component LearnerWorker_p0 stopped!
[2023-02-23 16:32:56,299][11073] Stopping LearnerWorker_p0...
[2023-02-23 16:32:56,304][00868] Waiting for process learner_proc0 to stop...
[2023-02-23 16:32:56,307][11073] Loop learner_proc0_evt_loop terminating...
[2023-02-23 16:32:58,785][00868] Waiting for process inference_proc0-0 to join...
[2023-02-23 16:32:59,153][00868] Waiting for process rollout_proc0 to join...
[2023-02-23 16:32:59,611][00868] Waiting for process rollout_proc1 to join...
[2023-02-23 16:32:59,616][00868] Waiting for process rollout_proc2 to join...
[2023-02-23 16:32:59,620][00868] Waiting for process rollout_proc3 to join...
[2023-02-23 16:32:59,624][00868] Waiting for process rollout_proc4 to join...
[2023-02-23 16:32:59,628][00868] Waiting for process rollout_proc5 to join...
[2023-02-23 16:32:59,631][00868] Waiting for process rollout_proc6 to join...
[2023-02-23 16:32:59,632][00868] Waiting for process rollout_proc7 to join...
[2023-02-23 16:32:59,634][00868] Batcher 0 profile tree view:
batching: 25.9210, releasing_batches: 0.0275
[2023-02-23 16:32:59,637][00868] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 546.7556
update_model: 8.3246
weight_update: 0.0015
one_step: 0.0179
handle_policy_step: 538.5335
deserialize: 15.4439, stack: 3.0940, obs_to_device_normalize: 118.5428, forward: 260.3327, send_messages: 28.5931
prepare_outputs: 85.2998
to_cpu: 52.2667
[2023-02-23 16:32:59,639][00868] Learner 0 profile tree view:
misc: 0.0053, prepare_batch: 16.9623
train: 76.5441
epoch_init: 0.0082, minibatch_init: 0.0063, losses_postprocess: 0.5995, kl_divergence: 0.6455, after_optimizer: 32.7253
calculate_losses: 27.2832
losses_init: 0.0035, forward_head: 1.8261, bptt_initial: 17.7524, tail: 1.0603, advantages_returns: 0.2539, losses: 3.6618
bptt: 2.3486
bptt_forward_core: 2.2690
update: 14.6765
clip: 1.4827
[2023-02-23 16:32:59,640][00868] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3552, enqueue_policy_requests: 149.1545, env_step: 853.5553, overhead: 22.2697, complete_rollouts: 7.4927
save_policy_outputs: 20.8590
split_output_tensors: 10.4682
[2023-02-23 16:32:59,642][00868] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3360, enqueue_policy_requests: 155.2829, env_step: 846.3109, overhead: 22.1155, complete_rollouts: 7.3596
save_policy_outputs: 21.3887
split_output_tensors: 10.2922
[2023-02-23 16:32:59,643][00868] Loop Runner_EvtLoop terminating...
[2023-02-23 16:32:59,644][00868] Runner profile tree view:
main_loop: 1167.8598
[2023-02-23 16:32:59,646][00868] Collected {0: 4005888}, FPS: 3430.1
[2023-02-23 16:32:59,773][00868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 16:32:59,775][00868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 16:32:59,777][00868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 16:32:59,779][00868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 16:32:59,782][00868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 16:32:59,784][00868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 16:32:59,785][00868] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 16:32:59,786][00868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 16:32:59,787][00868] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 16:32:59,789][00868] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 16:32:59,790][00868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 16:32:59,791][00868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 16:32:59,792][00868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 16:32:59,794][00868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 16:32:59,795][00868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 16:32:59,826][00868] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 16:32:59,828][00868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 16:32:59,831][00868] RunningMeanStd input shape: (1,)
[2023-02-23 16:32:59,849][00868] ConvEncoder: input_channels=3
[2023-02-23 16:33:00,561][00868] Conv encoder output size: 512
[2023-02-23 16:33:00,562][00868] Policy head output size: 512
[2023-02-23 16:33:03,069][00868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 16:33:04,349][00868] Num frames 100...
[2023-02-23 16:33:04,460][00868] Num frames 200...
[2023-02-23 16:33:04,576][00868] Num frames 300...
[2023-02-23 16:33:04,692][00868] Num frames 400...
[2023-02-23 16:33:04,813][00868] Num frames 500...
[2023-02-23 16:33:04,931][00868] Num frames 600...
[2023-02-23 16:33:05,055][00868] Num frames 700...
[2023-02-23 16:33:05,186][00868] Num frames 800...
[2023-02-23 16:33:05,312][00868] Num frames 900...
[2023-02-23 16:33:05,427][00868] Num frames 1000...
[2023-02-23 16:33:05,551][00868] Num frames 1100...
[2023-02-23 16:33:05,667][00868] Num frames 1200...
[2023-02-23 16:33:05,785][00868] Num frames 1300...
[2023-02-23 16:33:05,914][00868] Num frames 1400...
[2023-02-23 16:33:06,031][00868] Num frames 1500...
[2023-02-23 16:33:06,144][00868] Num frames 1600...
[2023-02-23 16:33:06,266][00868] Num frames 1700...
[2023-02-23 16:33:06,385][00868] Num frames 1800...
[2023-02-23 16:33:06,504][00868] Num frames 1900...
[2023-02-23 16:33:06,622][00868] Num frames 2000...
[2023-02-23 16:33:06,740][00868] Num frames 2100...
[2023-02-23 16:33:06,792][00868] Avg episode rewards: #0: 54.999, true rewards: #0: 21.000
[2023-02-23 16:33:06,794][00868] Avg episode reward: 54.999, avg true_objective: 21.000
[2023-02-23 16:33:06,922][00868] Num frames 2200...
[2023-02-23 16:33:07,044][00868] Num frames 2300...
[2023-02-23 16:33:07,159][00868] Num frames 2400...
[2023-02-23 16:33:07,276][00868] Num frames 2500...
[2023-02-23 16:33:07,392][00868] Num frames 2600...
[2023-02-23 16:33:07,511][00868] Num frames 2700...
[2023-02-23 16:33:07,624][00868] Num frames 2800...
[2023-02-23 16:33:07,750][00868] Num frames 2900...
[2023-02-23 16:33:07,883][00868] Num frames 3000...
[2023-02-23 16:33:08,001][00868] Num frames 3100...
[2023-02-23 16:33:08,138][00868] Avg episode rewards: #0: 40.335, true rewards: #0: 15.835
[2023-02-23 16:33:08,140][00868] Avg episode reward: 40.335, avg true_objective: 15.835
[2023-02-23 16:33:08,180][00868] Num frames 3200...
[2023-02-23 16:33:08,296][00868] Num frames 3300...
[2023-02-23 16:33:08,413][00868] Num frames 3400...
[2023-02-23 16:33:08,532][00868] Num frames 3500...
[2023-02-23 16:33:08,667][00868] Num frames 3600...
[2023-02-23 16:33:08,845][00868] Num frames 3700...
[2023-02-23 16:33:09,019][00868] Num frames 3800...
[2023-02-23 16:33:09,187][00868] Num frames 3900...
[2023-02-23 16:33:09,350][00868] Num frames 4000...
[2023-02-23 16:33:09,510][00868] Num frames 4100...
[2023-02-23 16:33:09,678][00868] Num frames 4200...
[2023-02-23 16:33:09,852][00868] Num frames 4300...
[2023-02-23 16:33:10,034][00868] Num frames 4400...
[2023-02-23 16:33:10,200][00868] Num frames 4500...
[2023-02-23 16:33:10,365][00868] Num frames 4600...
[2023-02-23 16:33:10,534][00868] Num frames 4700...
[2023-02-23 16:33:10,706][00868] Num frames 4800...
[2023-02-23 16:33:10,867][00868] Avg episode rewards: #0: 39.203, true rewards: #0: 16.203
[2023-02-23 16:33:10,868][00868] Avg episode reward: 39.203, avg true_objective: 16.203
[2023-02-23 16:33:10,934][00868] Num frames 4900...
[2023-02-23 16:33:11,108][00868] Num frames 5000...
[2023-02-23 16:33:11,276][00868] Num frames 5100...
[2023-02-23 16:33:11,444][00868] Num frames 5200...
[2023-02-23 16:33:11,620][00868] Num frames 5300...
[2023-02-23 16:33:11,789][00868] Num frames 5400...
[2023-02-23 16:33:11,952][00868] Num frames 5500...
[2023-02-23 16:33:12,125][00868] Num frames 5600...
[2023-02-23 16:33:12,282][00868] Num frames 5700...
[2023-02-23 16:33:12,403][00868] Num frames 5800...
[2023-02-23 16:33:12,523][00868] Num frames 5900...
[2023-02-23 16:33:12,672][00868] Avg episode rewards: #0: 35.202, true rewards: #0: 14.952
[2023-02-23 16:33:12,674][00868] Avg episode reward: 35.202, avg true_objective: 14.952
[2023-02-23 16:33:12,699][00868] Num frames 6000...
[2023-02-23 16:33:12,822][00868] Num frames 6100...
[2023-02-23 16:33:12,937][00868] Num frames 6200...
[2023-02-23 16:33:13,064][00868] Num frames 6300...
[2023-02-23 16:33:13,175][00868] Num frames 6400...
[2023-02-23 16:33:13,290][00868] Num frames 6500...
[2023-02-23 16:33:13,407][00868] Num frames 6600...
[2023-02-23 16:33:13,525][00868] Num frames 6700...
[2023-02-23 16:33:13,641][00868] Num frames 6800...
[2023-02-23 16:33:13,765][00868] Num frames 6900...
[2023-02-23 16:33:13,880][00868] Num frames 7000...
[2023-02-23 16:33:13,996][00868] Num frames 7100...
[2023-02-23 16:33:14,119][00868] Num frames 7200...
[2023-02-23 16:33:14,233][00868] Num frames 7300...
[2023-02-23 16:33:14,349][00868] Num frames 7400...
[2023-02-23 16:33:14,465][00868] Num frames 7500...
[2023-02-23 16:33:14,543][00868] Avg episode rewards: #0: 35.834, true rewards: #0: 15.034
[2023-02-23 16:33:14,546][00868] Avg episode reward: 35.834, avg true_objective: 15.034
[2023-02-23 16:33:14,646][00868] Num frames 7600...
[2023-02-23 16:33:14,763][00868] Num frames 7700...
[2023-02-23 16:33:14,877][00868] Num frames 7800...
[2023-02-23 16:33:14,993][00868] Num frames 7900...
[2023-02-23 16:33:15,122][00868] Num frames 8000...
[2023-02-23 16:33:15,244][00868] Num frames 8100...
[2023-02-23 16:33:15,330][00868] Avg episode rewards: #0: 31.875, true rewards: #0: 13.542
[2023-02-23 16:33:15,331][00868] Avg episode reward: 31.875, avg true_objective: 13.542
[2023-02-23 16:33:15,421][00868] Num frames 8200...
[2023-02-23 16:33:15,539][00868] Num frames 8300...
[2023-02-23 16:33:15,658][00868] Num frames 8400...
[2023-02-23 16:33:15,774][00868] Num frames 8500...
[2023-02-23 16:33:15,890][00868] Num frames 8600...
[2023-02-23 16:33:16,017][00868] Avg episode rewards: #0: 29.087, true rewards: #0: 12.373
[2023-02-23 16:33:16,018][00868] Avg episode reward: 29.087, avg true_objective: 12.373
[2023-02-23 16:33:16,073][00868] Num frames 8700...
[2023-02-23 16:33:16,187][00868] Num frames 8800...
[2023-02-23 16:33:16,301][00868] Num frames 8900...
[2023-02-23 16:33:16,416][00868] Num frames 9000...
[2023-02-23 16:33:16,529][00868] Num frames 9100...
[2023-02-23 16:33:16,639][00868] Num frames 9200...
[2023-02-23 16:33:16,758][00868] Num frames 9300...
[2023-02-23 16:33:16,873][00868] Num frames 9400...
[2023-02-23 16:33:16,984][00868] Num frames 9500...
[2023-02-23 16:33:17,116][00868] Num frames 9600...
[2023-02-23 16:33:17,197][00868] Avg episode rewards: #0: 28.651, true rewards: #0: 12.026
[2023-02-23 16:33:17,199][00868] Avg episode reward: 28.651, avg true_objective: 12.026
[2023-02-23 16:33:17,289][00868] Num frames 9700...
[2023-02-23 16:33:17,402][00868] Num frames 9800...
[2023-02-23 16:33:17,516][00868] Num frames 9900...
[2023-02-23 16:33:17,635][00868] Num frames 10000...
[2023-02-23 16:33:17,760][00868] Num frames 10100...
[2023-02-23 16:33:17,876][00868] Num frames 10200...
[2023-02-23 16:33:17,990][00868] Num frames 10300...
[2023-02-23 16:33:18,113][00868] Num frames 10400...
[2023-02-23 16:33:18,226][00868] Num frames 10500...
[2023-02-23 16:33:18,342][00868] Num frames 10600...
[2023-02-23 16:33:18,484][00868] Avg episode rewards: #0: 28.642, true rewards: #0: 11.864
[2023-02-23 16:33:18,485][00868] Avg episode reward: 28.642, avg true_objective: 11.864
[2023-02-23 16:33:18,514][00868] Num frames 10700...
[2023-02-23 16:33:18,627][00868] Num frames 10800...
[2023-02-23 16:33:18,737][00868] Num frames 10900...
[2023-02-23 16:33:18,856][00868] Num frames 11000...
[2023-02-23 16:33:18,970][00868] Num frames 11100...
[2023-02-23 16:33:19,056][00868] Avg episode rewards: #0: 26.326, true rewards: #0: 11.126
[2023-02-23 16:33:19,058][00868] Avg episode reward: 26.326, avg true_objective: 11.126
[2023-02-23 16:34:28,325][00868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 16:37:55,680][00868] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 16:37:55,686][00868] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 16:37:55,689][00868] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 16:37:55,693][00868] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 16:37:55,697][00868] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 16:37:55,699][00868] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 16:37:55,702][00868] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 16:37:55,704][00868] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 16:37:55,706][00868] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 16:37:55,708][00868] Adding new argument 'hf_repository'='SRobbins/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 16:37:55,709][00868] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 16:37:55,711][00868] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 16:37:55,713][00868] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 16:37:55,714][00868] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 16:37:55,716][00868] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 16:37:55,754][00868] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 16:37:55,757][00868] RunningMeanStd input shape: (1,)
[2023-02-23 16:37:55,777][00868] ConvEncoder: input_channels=3
[2023-02-23 16:37:55,847][00868] Conv encoder output size: 512
[2023-02-23 16:37:55,850][00868] Policy head output size: 512
[2023-02-23 16:37:55,881][00868] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 16:37:56,403][00868] Num frames 100...
[2023-02-23 16:37:56,523][00868] Num frames 200...
[2023-02-23 16:37:56,649][00868] Num frames 300...
[2023-02-23 16:37:56,773][00868] Num frames 400...
[2023-02-23 16:37:56,895][00868] Num frames 500...
[2023-02-23 16:37:57,010][00868] Num frames 600...
[2023-02-23 16:37:57,131][00868] Num frames 700...
[2023-02-23 16:37:57,265][00868] Avg episode rewards: #0: 20.680, true rewards: #0: 7.680
[2023-02-23 16:37:57,267][00868] Avg episode reward: 20.680, avg true_objective: 7.680
[2023-02-23 16:37:57,307][00868] Num frames 800...
[2023-02-23 16:37:57,421][00868] Num frames 900...
[2023-02-23 16:37:57,539][00868] Num frames 1000...
[2023-02-23 16:37:57,655][00868] Num frames 1100...
[2023-02-23 16:37:57,772][00868] Num frames 1200...
[2023-02-23 16:37:57,896][00868] Num frames 1300...
[2023-02-23 16:37:58,013][00868] Num frames 1400...
[2023-02-23 16:37:58,149][00868] Num frames 1500...
[2023-02-23 16:37:58,267][00868] Num frames 1600...
[2023-02-23 16:37:58,383][00868] Avg episode rewards: #0: 20.755, true rewards: #0: 8.255
[2023-02-23 16:37:58,385][00868] Avg episode reward: 20.755, avg true_objective: 8.255
[2023-02-23 16:37:58,456][00868] Num frames 1700...
[2023-02-23 16:37:58,597][00868] Num frames 1800...
[2023-02-23 16:37:58,712][00868] Num frames 1900...
[2023-02-23 16:37:58,841][00868] Num frames 2000...
[2023-02-23 16:37:58,902][00868] Avg episode rewards: #0: 15.343, true rewards: #0: 6.677
[2023-02-23 16:37:58,904][00868] Avg episode reward: 15.343, avg true_objective: 6.677
[2023-02-23 16:37:59,026][00868] Num frames 2100...
[2023-02-23 16:37:59,148][00868] Num frames 2200...
[2023-02-23 16:37:59,275][00868] Num frames 2300...
[2023-02-23 16:37:59,396][00868] Num frames 2400...
[2023-02-23 16:37:59,514][00868] Num frames 2500...
[2023-02-23 16:37:59,633][00868] Num frames 2600...
[2023-02-23 16:37:59,749][00868] Num frames 2700...
[2023-02-23 16:37:59,869][00868] Num frames 2800...
[2023-02-23 16:38:00,000][00868] Num frames 2900...
[2023-02-23 16:38:00,120][00868] Num frames 3000...
[2023-02-23 16:38:00,245][00868] Num frames 3100...
[2023-02-23 16:38:00,395][00868] Avg episode rewards: #0: 18.460, true rewards: #0: 7.960
[2023-02-23 16:38:00,398][00868] Avg episode reward: 18.460, avg true_objective: 7.960
[2023-02-23 16:38:00,424][00868] Num frames 3200...
[2023-02-23 16:38:00,540][00868] Num frames 3300...
[2023-02-23 16:38:00,672][00868] Num frames 3400...
[2023-02-23 16:38:00,791][00868] Num frames 3500...
[2023-02-23 16:38:00,933][00868] Num frames 3600...
[2023-02-23 16:38:01,052][00868] Num frames 3700...
[2023-02-23 16:38:01,166][00868] Num frames 3800...
[2023-02-23 16:38:01,284][00868] Num frames 3900...
[2023-02-23 16:38:01,400][00868] Num frames 4000...
[2023-02-23 16:38:01,518][00868] Num frames 4100...
[2023-02-23 16:38:01,646][00868] Num frames 4200...
[2023-02-23 16:38:01,793][00868] Avg episode rewards: #0: 19.544, true rewards: #0: 8.544
[2023-02-23 16:38:01,795][00868] Avg episode reward: 19.544, avg true_objective: 8.544
[2023-02-23 16:38:01,834][00868] Num frames 4300...
[2023-02-23 16:38:01,956][00868] Num frames 4400...
[2023-02-23 16:38:02,076][00868] Num frames 4500...
[2023-02-23 16:38:02,195][00868] Num frames 4600...
[2023-02-23 16:38:02,315][00868] Num frames 4700...
[2023-02-23 16:38:02,434][00868] Num frames 4800...
[2023-02-23 16:38:02,550][00868] Num frames 4900...
[2023-02-23 16:38:02,674][00868] Num frames 5000...
[2023-02-23 16:38:02,797][00868] Num frames 5100...
[2023-02-23 16:38:02,920][00868] Num frames 5200...
[2023-02-23 16:38:03,100][00868] Avg episode rewards: #0: 20.160, true rewards: #0: 8.827
[2023-02-23 16:38:03,102][00868] Avg episode reward: 20.160, avg true_objective: 8.827
[2023-02-23 16:38:03,112][00868] Num frames 5300...
[2023-02-23 16:38:03,247][00868] Num frames 5400...
[2023-02-23 16:38:03,368][00868] Num frames 5500...
[2023-02-23 16:38:03,484][00868] Num frames 5600...
[2023-02-23 16:38:03,602][00868] Num frames 5700...
[2023-02-23 16:38:03,722][00868] Num frames 5800...
[2023-02-23 16:38:03,845][00868] Num frames 5900...
[2023-02-23 16:38:03,978][00868] Num frames 6000...
[2023-02-23 16:38:04,102][00868] Num frames 6100...
[2023-02-23 16:38:04,221][00868] Num frames 6200...
[2023-02-23 16:38:04,305][00868] Avg episode rewards: #0: 20.320, true rewards: #0: 8.891
[2023-02-23 16:38:04,307][00868] Avg episode reward: 20.320, avg true_objective: 8.891
[2023-02-23 16:38:04,397][00868] Num frames 6300...
[2023-02-23 16:38:04,515][00868] Num frames 6400...
[2023-02-23 16:38:04,631][00868] Num frames 6500...
[2023-02-23 16:38:04,749][00868] Num frames 6600...
[2023-02-23 16:38:04,869][00868] Num frames 6700...
[2023-02-23 16:38:04,995][00868] Num frames 6800...
[2023-02-23 16:38:05,117][00868] Num frames 6900...
[2023-02-23 16:38:05,242][00868] Num frames 7000...
[2023-02-23 16:38:05,367][00868] Num frames 7100...
[2023-02-23 16:38:05,499][00868] Num frames 7200...
[2023-02-23 16:38:05,613][00868] Avg episode rewards: #0: 21.185, true rewards: #0: 9.060
[2023-02-23 16:38:05,614][00868] Avg episode reward: 21.185, avg true_objective: 9.060
[2023-02-23 16:38:05,692][00868] Num frames 7300...
[2023-02-23 16:38:05,811][00868] Num frames 7400...
[2023-02-23 16:38:05,937][00868] Num frames 7500...
[2023-02-23 16:38:06,094][00868] Num frames 7600...
[2023-02-23 16:38:06,260][00868] Num frames 7700...
[2023-02-23 16:38:06,434][00868] Num frames 7800...
[2023-02-23 16:38:06,594][00868] Num frames 7900...
[2023-02-23 16:38:06,759][00868] Num frames 8000...
[2023-02-23 16:38:06,927][00868] Num frames 8100...
[2023-02-23 16:38:06,987][00868] Avg episode rewards: #0: 20.779, true rewards: #0: 9.001
[2023-02-23 16:38:06,991][00868] Avg episode reward: 20.779, avg true_objective: 9.001
[2023-02-23 16:38:07,169][00868] Num frames 8200...
[2023-02-23 16:38:07,331][00868] Num frames 8300...
[2023-02-23 16:38:07,493][00868] Num frames 8400...
[2023-02-23 16:38:07,668][00868] Num frames 8500...
[2023-02-23 16:38:07,848][00868] Num frames 8600...
[2023-02-23 16:38:08,017][00868] Num frames 8700...
[2023-02-23 16:38:08,187][00868] Num frames 8800...
[2023-02-23 16:38:08,354][00868] Num frames 8900...
[2023-02-23 16:38:08,520][00868] Num frames 9000...
[2023-02-23 16:38:08,692][00868] Num frames 9100...
[2023-02-23 16:38:08,862][00868] Num frames 9200...
[2023-02-23 16:38:09,035][00868] Num frames 9300...
[2023-02-23 16:38:09,217][00868] Num frames 9400...
[2023-02-23 16:38:09,301][00868] Avg episode rewards: #0: 22.113, true rewards: #0: 9.413
[2023-02-23 16:38:09,304][00868] Avg episode reward: 22.113, avg true_objective: 9.413
[2023-02-23 16:39:10,612][00868] Replay video saved to /content/train_dir/default_experiment/replay.mp4!