[2023-02-22 17:22:05,815][00589] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-22 17:22:05,819][00589] Rollout worker 0 uses device cpu
[2023-02-22 17:22:05,820][00589] Rollout worker 1 uses device cpu
[2023-02-22 17:22:05,826][00589] Rollout worker 2 uses device cpu
[2023-02-22 17:22:05,833][00589] Rollout worker 3 uses device cpu
[2023-02-22 17:22:05,837][00589] Rollout worker 4 uses device cpu
[2023-02-22 17:22:05,838][00589] Rollout worker 5 uses device cpu
[2023-02-22 17:22:05,840][00589] Rollout worker 6 uses device cpu
[2023-02-22 17:22:05,845][00589] Rollout worker 7 uses device cpu
[2023-02-22 17:22:06,346][00589] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 17:22:06,359][00589] InferenceWorker_p0-w0: min num requests: 2
[2023-02-22 17:22:06,511][00589] Starting all processes...
[2023-02-22 17:22:06,518][00589] Starting process learner_proc0
[2023-02-22 17:22:06,686][00589] Starting all processes...
[2023-02-22 17:22:06,769][00589] Starting process inference_proc0-0
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc0
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc1
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc2
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc3
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc4
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc5
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc6
[2023-02-22 17:22:06,770][00589] Starting process rollout_proc7
[2023-02-22 17:22:18,781][12689] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 17:22:18,785][12689] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-22 17:22:19,063][12675] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 17:22:19,063][12675] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-22 17:22:19,195][12697] Worker 7 uses CPU cores [1]
[2023-02-22 17:22:19,396][12693] Worker 3 uses CPU cores [1]
[2023-02-22 17:22:19,402][12691] Worker 2 uses CPU cores [0]
[2023-02-22 17:22:19,450][12692] Worker 1 uses CPU cores [1]
[2023-02-22 17:22:19,568][12696] Worker 6 uses CPU cores [0]
[2023-02-22 17:22:19,572][12695] Worker 4 uses CPU cores [0]
[2023-02-22 17:22:19,611][12690] Worker 0 uses CPU cores [0]
[2023-02-22 17:22:19,623][12694] Worker 5 uses CPU cores [1]
[2023-02-22 17:22:20,136][12675] Num visible devices: 1
[2023-02-22 17:22:20,138][12675] Starting seed is not provided
[2023-02-22 17:22:20,138][12675] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 17:22:20,138][12675] Initializing actor-critic model on device cuda:0
[2023-02-22 17:22:20,139][12675] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 17:22:20,141][12675] RunningMeanStd input shape: (1,)
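The two RunningMeanStd shapes above are the observation normalizer (3x72x128 RGB frames) and the scalar returns normalizer. For reference, a minimal running mean/std tracker in the spirit of these modules — an illustrative sketch using the standard parallel-update rule, not sample-factory's in-place implementation — looks like this:

    import torch

    # Sketch of a running mean/std normalizer (Chan et al. parallel update).
    # Not sample-factory's code; shapes below come from the log lines above.
    class RunningMeanStd:
        def __init__(self, shape, eps=1e-4):
            self.mean = torch.zeros(shape)
            self.var = torch.ones(shape)
            self.count = eps

        def update(self, batch):
            b_mean = batch.mean(dim=0)
            b_var = batch.var(dim=0, unbiased=False)
            b_count = batch.shape[0]
            delta = b_mean - self.mean
            total = self.count + b_count
            self.mean = self.mean + delta * b_count / total
            m2 = (self.var * self.count + b_var * b_count
                  + delta.pow(2) * self.count * b_count / total)
            self.var = m2 / total
            self.count = total

        def normalize(self, x):
            return (x - self.mean) / torch.sqrt(self.var + 1e-8)

    obs_rms = RunningMeanStd((3, 72, 128))      # observation shape from the log
    obs_rms.update(torch.rand(16, 3, 72, 128))  # fake batch of 16 frames
    print(obs_rms.normalize(torch.rand(3, 72, 128)).shape)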
[2023-02-22 17:22:20,146][12689] Num visible devices: 1
[2023-02-22 17:22:20,164][12675] ConvEncoder: input_channels=3
[2023-02-22 17:22:20,512][12675] Conv encoder output size: 512
[2023-02-22 17:22:20,513][12675] Policy head output size: 512
[2023-02-22 17:22:20,568][12675] Created Actor Critic model with architecture:
[2023-02-22 17:22:20,568][12675] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
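For readers who want to mirror this topology outside sample-factory, a minimal stand-alone PyTorch sketch follows. Only the module layout, the 512-unit sizes, the GRU(512, 512) core, and the 5-way action head come from the printout above; the conv kernels, strides, and channel counts are assumptions (sample-factory's default "convnet_simple" settings) since they are not shown in this log.

    import torch
    from torch import nn

    class ActorCriticSketch(nn.Module):
        def __init__(self, num_actions=5):
            super().__init__()
            # Assumed conv settings; only ELU + three Conv2d layers are logged.
            self.conv_head = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
                nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
            )
            # (3, 72, 128) input -> (128, 3, 6) feature map -> 2304 flat features
            self.mlp_layers = nn.Sequential(nn.Flatten(), nn.Linear(2304, 512), nn.ELU())
            self.core = nn.GRU(512, 512)                            # ModelCoreRNN
            self.critic_linear = nn.Linear(512, 1)                  # value head
            self.distribution_linear = nn.Linear(512, num_actions)  # action logits

        def forward(self, obs, rnn_state=None):
            x = self.mlp_layers(self.conv_head(obs))       # (batch, 512)
            x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
            x = x.squeeze(0)
            return self.distribution_linear(x), self.critic_linear(x), rnn_state

    logits, value, _ = ActorCriticSketch()(torch.zeros(4, 3, 72, 128))
    print(logits.shape, value.shape)  # torch.Size([4, 5]) torch.Size([4, 1])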
[2023-02-22 17:22:26,299][00589] Heartbeat connected on Batcher_0
[2023-02-22 17:22:26,347][00589] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-22 17:22:26,388][00589] Heartbeat connected on RolloutWorker_w0
[2023-02-22 17:22:26,420][00589] Heartbeat connected on RolloutWorker_w1
[2023-02-22 17:22:26,428][00589] Heartbeat connected on RolloutWorker_w2
[2023-02-22 17:22:26,443][00589] Heartbeat connected on RolloutWorker_w3
[2023-02-22 17:22:26,466][00589] Heartbeat connected on RolloutWorker_w4
[2023-02-22 17:22:26,484][00589] Heartbeat connected on RolloutWorker_w5
[2023-02-22 17:22:26,499][00589] Heartbeat connected on RolloutWorker_w6
[2023-02-22 17:22:26,504][00589] Heartbeat connected on RolloutWorker_w7
[2023-02-22 17:22:27,753][12675] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-22 17:22:27,755][12675] No checkpoints found
[2023-02-22 17:22:27,755][12675] Did not load from checkpoint, starting from scratch!
[2023-02-22 17:22:27,755][12675] Initialized policy 0 weights for model version 0
[2023-02-22 17:22:27,760][12675] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-22 17:22:27,767][12675] LearnerWorker_p0 finished initialization!
[2023-02-22 17:22:27,772][00589] Heartbeat connected on LearnerWorker_p0
[2023-02-22 17:22:27,986][12689] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 17:22:27,987][12689] RunningMeanStd input shape: (1,)
[2023-02-22 17:22:28,004][12689] ConvEncoder: input_channels=3
[2023-02-22 17:22:28,131][12689] Conv encoder output size: 512
[2023-02-22 17:22:28,132][12689] Policy head output size: 512
[2023-02-22 17:22:28,403][00589] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
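Status lines in this format repeat every five seconds for the rest of the log (the nan values above just mean no frames have been collected yet). To plot throughput, a small parser is enough — a quick sketch assuming the exact "Fps is (...)" format shown here, not a sample-factory utility:

    import re

    FPS_RE = re.compile(
        r"Fps is \(10 sec: (nan|[\d.]+), 60 sec: (nan|[\d.]+), "
        r"300 sec: (nan|[\d.]+)\)\. Total num frames: (\d+)"
    )

    sample = ("[2023-02-22 17:22:48,403][00589] Fps is (10 sec: 409.6, "
              "60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096.")
    fps10, fps60, fps300, frames = map(float, FPS_RE.search(sample).groups())
    print(fps10, int(frames))  # 409.6 4096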
[2023-02-22 17:22:30,650][00589] Inference worker 0-0 is ready!
[2023-02-22 17:22:30,652][00589] All inference workers are ready! Signal rollout workers to start!
[2023-02-22 17:22:30,798][12691] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,812][12696] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,816][12695] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,817][12690] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,820][12697] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,816][12692] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,841][12694] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:30,853][12693] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:22:32,967][12693] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,969][12697] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,973][12694] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,968][12695] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,969][12691] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,970][12690] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,967][12696] Decorrelating experience for 0 frames...
[2023-02-22 17:22:32,980][12692] Decorrelating experience for 0 frames...
[2023-02-22 17:22:33,403][00589] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 17:22:34,251][12690] Decorrelating experience for 32 frames...
[2023-02-22 17:22:34,254][12691] Decorrelating experience for 32 frames...
[2023-02-22 17:22:34,990][12692] Decorrelating experience for 32 frames...
[2023-02-22 17:22:35,008][12694] Decorrelating experience for 32 frames...
[2023-02-22 17:22:35,010][12697] Decorrelating experience for 32 frames...
[2023-02-22 17:22:35,046][12693] Decorrelating experience for 32 frames...
[2023-02-22 17:22:36,659][12695] Decorrelating experience for 32 frames...
[2023-02-22 17:22:36,702][12694] Decorrelating experience for 64 frames...
[2023-02-22 17:22:36,852][12693] Decorrelating experience for 64 frames...
[2023-02-22 17:22:36,931][12690] Decorrelating experience for 64 frames...
[2023-02-22 17:22:36,934][12691] Decorrelating experience for 64 frames...
[2023-02-22 17:22:37,275][12696] Decorrelating experience for 32 frames...
[2023-02-22 17:22:38,293][12692] Decorrelating experience for 64 frames...
[2023-02-22 17:22:38,403][00589] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 17:22:38,523][12693] Decorrelating experience for 96 frames...
[2023-02-22 17:22:39,007][12697] Decorrelating experience for 64 frames...
[2023-02-22 17:22:39,019][12695] Decorrelating experience for 64 frames...
[2023-02-22 17:22:39,222][12691] Decorrelating experience for 96 frames...
[2023-02-22 17:22:39,546][12696] Decorrelating experience for 64 frames...
[2023-02-22 17:22:39,791][12690] Decorrelating experience for 96 frames...
[2023-02-22 17:22:40,416][12696] Decorrelating experience for 96 frames...
[2023-02-22 17:22:41,043][12692] Decorrelating experience for 96 frames...
[2023-02-22 17:22:41,052][12694] Decorrelating experience for 96 frames...
[2023-02-22 17:22:41,210][12697] Decorrelating experience for 96 frames...
[2023-02-22 17:22:41,505][12695] Decorrelating experience for 96 frames...
[2023-02-22 17:22:43,404][00589] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2.7. Samples: 40. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-22 17:22:43,411][00589] Avg episode reward: [(0, '0.480')]
[2023-02-22 17:22:45,484][12675] Signal inference workers to stop experience collection...
[2023-02-22 17:22:45,495][12689] InferenceWorker_p0-w0: stopping experience collection
[2023-02-22 17:22:47,914][12675] Signal inference workers to resume experience collection...
[2023-02-22 17:22:47,915][12689] InferenceWorker_p0-w0: resuming experience collection
[2023-02-22 17:22:48,403][00589] Fps is (10 sec: 409.6, 60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096. Throughput: 0: 113.5. Samples: 2270. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-22 17:22:48,411][00589] Avg episode reward: [(0, '1.841')]
[2023-02-22 17:22:53,403][00589] Fps is (10 sec: 2048.1, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 20480. Throughput: 0: 199.0. Samples: 4974. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-22 17:22:53,405][00589] Avg episode reward: [(0, '3.356')]
[2023-02-22 17:22:58,405][00589] Fps is (10 sec: 3276.2, 60 sec: 1228.7, 300 sec: 1228.7). Total num frames: 36864. Throughput: 0: 304.4. Samples: 9132. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-22 17:22:58,411][00589] Avg episode reward: [(0, '3.616')]
[2023-02-22 17:22:59,650][12689] Updated weights for policy 0, policy_version 10 (0.0386)
[2023-02-22 17:23:03,403][00589] Fps is (10 sec: 3276.8, 60 sec: 1521.4, 300 sec: 1521.4). Total num frames: 53248. Throughput: 0: 320.3. Samples: 11210. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-22 17:23:03,410][00589] Avg episode reward: [(0, '4.310')]
[2023-02-22 17:23:08,403][00589] Fps is (10 sec: 3687.1, 60 sec: 1843.2, 300 sec: 1843.2). Total num frames: 73728. Throughput: 0: 437.0. Samples: 17480. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:23:08,410][00589] Avg episode reward: [(0, '4.485')]
[2023-02-22 17:23:10,297][12689] Updated weights for policy 0, policy_version 20 (0.0015)
[2023-02-22 17:23:13,403][00589] Fps is (10 sec: 3686.3, 60 sec: 2002.5, 300 sec: 2002.5). Total num frames: 90112. Throughput: 0: 506.6. Samples: 22798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:23:13,410][00589] Avg episode reward: [(0, '4.413')]
[2023-02-22 17:23:18,403][00589] Fps is (10 sec: 2867.2, 60 sec: 2048.0, 300 sec: 2048.0). Total num frames: 102400. Throughput: 0: 550.4. Samples: 24766. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:23:18,406][00589] Avg episode reward: [(0, '4.373')]
[2023-02-22 17:23:18,416][12675] Saving new best policy, reward=4.373!
[2023-02-22 17:23:23,403][00589] Fps is (10 sec: 2867.2, 60 sec: 2159.7, 300 sec: 2159.7). Total num frames: 118784. Throughput: 0: 639.8. Samples: 28790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:23:23,411][00589] Avg episode reward: [(0, '4.429')]
[2023-02-22 17:23:23,414][12675] Saving new best policy, reward=4.429!
[2023-02-22 17:23:24,176][12689] Updated weights for policy 0, policy_version 30 (0.0026)
[2023-02-22 17:23:28,403][00589] Fps is (10 sec: 3686.4, 60 sec: 2321.1, 300 sec: 2321.1). Total num frames: 139264. Throughput: 0: 771.6. Samples: 34760. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-22 17:23:28,411][00589] Avg episode reward: [(0, '4.436')]
[2023-02-22 17:23:28,419][12675] Saving new best policy, reward=4.436!
[2023-02-22 17:23:33,403][00589] Fps is (10 sec: 3686.4, 60 sec: 2594.1, 300 sec: 2394.6). Total num frames: 155648. Throughput: 0: 795.7. Samples: 38078. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-22 17:23:33,409][00589] Avg episode reward: [(0, '4.334')]
[2023-02-22 17:23:35,228][12689] Updated weights for policy 0, policy_version 40 (0.0029)
[2023-02-22 17:23:38,406][00589] Fps is (10 sec: 2866.3, 60 sec: 2798.8, 300 sec: 2399.0). Total num frames: 167936. Throughput: 0: 833.9. Samples: 42502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:23:38,411][00589] Avg episode reward: [(0, '4.440')]
[2023-02-22 17:23:38,423][12675] Saving new best policy, reward=4.440!
[2023-02-22 17:23:43,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 2457.6). Total num frames: 184320. Throughput: 0: 815.7. Samples: 45838. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:23:43,409][00589] Avg episode reward: [(0, '4.346')]
[2023-02-22 17:23:47,998][12689] Updated weights for policy 0, policy_version 50 (0.0015)
[2023-02-22 17:23:48,404][00589] Fps is (10 sec: 3687.5, 60 sec: 3345.1, 300 sec: 2560.0). Total num frames: 204800. Throughput: 0: 842.1. Samples: 49104. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:23:48,406][00589] Avg episode reward: [(0, '4.306')]
[2023-02-22 17:23:53,404][00589] Fps is (10 sec: 4095.8, 60 sec: 3413.3, 300 sec: 2650.3). Total num frames: 225280. Throughput: 0: 845.4. Samples: 55524. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-22 17:23:53,412][00589] Avg episode reward: [(0, '4.312')]
[2023-02-22 17:23:58,403][00589] Fps is (10 sec: 3276.9, 60 sec: 3345.2, 300 sec: 2639.6). Total num frames: 237568. Throughput: 0: 826.1. Samples: 59974. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:23:58,408][00589] Avg episode reward: [(0, '4.376')]
[2023-02-22 17:23:58,434][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000058_237568.pth...
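Checkpoints like the one just saved can be inspected offline. A minimal sketch, assuming the .pth file is a standard torch.save payload (its keys are not shown in this log, so we simply list whatever is stored):

    import torch

    ckpt_path = ("/content/train_dir/default_experiment/checkpoint_p0/"
                 "checkpoint_000000058_237568.pth")
    ckpt = torch.load(ckpt_path, map_location="cpu")  # no GPU required
    print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))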
[2023-02-22 17:24:00,551][12689] Updated weights for policy 0, policy_version 60 (0.0012)
[2023-02-22 17:24:03,403][00589] Fps is (10 sec: 2867.3, 60 sec: 3345.1, 300 sec: 2673.2). Total num frames: 253952. Throughput: 0: 826.5. Samples: 61958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:24:03,409][00589] Avg episode reward: [(0, '4.356')]
[2023-02-22 17:24:08,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 2703.4). Total num frames: 270336. Throughput: 0: 857.1. Samples: 67358. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:24:08,408][00589] Avg episode reward: [(0, '4.307')]
[2023-02-22 17:24:11,473][12689] Updated weights for policy 0, policy_version 70 (0.0017)
[2023-02-22 17:24:13,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2769.7). Total num frames: 290816. Throughput: 0: 856.0. Samples: 73282. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:24:13,407][00589] Avg episode reward: [(0, '4.443')]
[2023-02-22 17:24:13,413][12675] Saving new best policy, reward=4.443!
[2023-02-22 17:24:18,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2755.5). Total num frames: 303104. Throughput: 0: 828.0. Samples: 75336. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:24:18,408][00589] Avg episode reward: [(0, '4.364')]
[2023-02-22 17:24:23,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 2778.2). Total num frames: 319488. Throughput: 0: 818.6. Samples: 79338. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:24:23,409][00589] Avg episode reward: [(0, '4.387')]
[2023-02-22 17:24:25,142][12689] Updated weights for policy 0, policy_version 80 (0.0033)
[2023-02-22 17:24:28,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2833.1). Total num frames: 339968. Throughput: 0: 872.3. Samples: 85092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:24:28,411][00589] Avg episode reward: [(0, '4.351')]
[2023-02-22 17:24:33,403][00589] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 2883.6). Total num frames: 360448. Throughput: 0: 867.6. Samples: 88144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:24:33,406][00589] Avg episode reward: [(0, '4.317')]
[2023-02-22 17:24:36,055][12689] Updated weights for policy 0, policy_version 90 (0.0034)
[2023-02-22 17:24:38,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 2867.2). Total num frames: 372736. Throughput: 0: 837.3. Samples: 93200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:24:38,407][00589] Avg episode reward: [(0, '4.419')]
[2023-02-22 17:24:43,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2882.4). Total num frames: 389120. Throughput: 0: 832.2. Samples: 97422. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:24:43,406][00589] Avg episode reward: [(0, '4.429')]
[2023-02-22 17:24:48,026][12689] Updated weights for policy 0, policy_version 100 (0.0032)
[2023-02-22 17:24:48,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 409600. Throughput: 0: 853.8. Samples: 100378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:24:48,407][00589] Avg episode reward: [(0, '4.456')]
[2023-02-22 17:24:48,418][12675] Saving new best policy, reward=4.456!
[2023-02-22 17:24:53,403][00589] Fps is (10 sec: 4096.1, 60 sec: 3413.4, 300 sec: 2966.1). Total num frames: 430080. Throughput: 0: 882.9. Samples: 107090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:24:53,406][00589] Avg episode reward: [(0, '4.478')]
[2023-02-22 17:24:53,425][12675] Saving new best policy, reward=4.478!
[2023-02-22 17:24:58,406][00589] Fps is (10 sec: 3685.5, 60 sec: 3481.5, 300 sec: 2976.4). Total num frames: 446464. Throughput: 0: 864.4. Samples: 112182. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:24:58,412][00589] Avg episode reward: [(0, '4.677')]
[2023-02-22 17:24:58,427][12675] Saving new best policy, reward=4.677!
[2023-02-22 17:24:59,704][12689] Updated weights for policy 0, policy_version 110 (0.0014)
[2023-02-22 17:25:03,406][00589] Fps is (10 sec: 2866.4, 60 sec: 3413.2, 300 sec: 2959.6). Total num frames: 458752. Throughput: 0: 854.7. Samples: 113800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:25:03,411][00589] Avg episode reward: [(0, '4.745')]
[2023-02-22 17:25:03,416][12675] Saving new best policy, reward=4.745!
[2023-02-22 17:25:08,403][00589] Fps is (10 sec: 2048.5, 60 sec: 3276.8, 300 sec: 2918.4). Total num frames: 466944. Throughput: 0: 837.9. Samples: 117042. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:25:08,406][00589] Avg episode reward: [(0, '4.764')]
[2023-02-22 17:25:08,419][12675] Saving new best policy, reward=4.764!
[2023-02-22 17:25:13,403][00589] Fps is (10 sec: 2868.0, 60 sec: 3276.8, 300 sec: 2954.1). Total num frames: 487424. Throughput: 0: 817.6. Samples: 121886. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:25:13,406][00589] Avg episode reward: [(0, '4.486')]
[2023-02-22 17:25:13,999][12689] Updated weights for policy 0, policy_version 120 (0.0028)
[2023-02-22 17:25:18,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 2987.7). Total num frames: 507904. Throughput: 0: 825.0. Samples: 125268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:25:18,412][00589] Avg episode reward: [(0, '4.367')]
[2023-02-22 17:25:23,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2972.5). Total num frames: 520192. Throughput: 0: 817.8. Samples: 130000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:25:23,408][00589] Avg episode reward: [(0, '4.532')]
[2023-02-22 17:25:26,812][12689] Updated weights for policy 0, policy_version 130 (0.0017)
[2023-02-22 17:25:28,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 2981.0). Total num frames: 536576. Throughput: 0: 820.2. Samples: 134332. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:25:28,406][00589] Avg episode reward: [(0, '4.616')]
[2023-02-22 17:25:33,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3011.1). Total num frames: 557056. Throughput: 0: 829.5. Samples: 137704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:25:33,410][00589] Avg episode reward: [(0, '4.637')]
[2023-02-22 17:25:36,196][12689] Updated weights for policy 0, policy_version 140 (0.0012)
[2023-02-22 17:25:38,408][00589] Fps is (10 sec: 4503.5, 60 sec: 3481.3, 300 sec: 3061.1). Total num frames: 581632. Throughput: 0: 830.4. Samples: 144462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:25:38,416][00589] Avg episode reward: [(0, '4.507')]
[2023-02-22 17:25:43,405][00589] Fps is (10 sec: 3685.9, 60 sec: 3413.3, 300 sec: 3045.7). Total num frames: 593920. Throughput: 0: 817.3. Samples: 148960. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:25:43,413][00589] Avg episode reward: [(0, '4.304')]
[2023-02-22 17:25:48,403][00589] Fps is (10 sec: 2458.7, 60 sec: 3276.8, 300 sec: 3031.0). Total num frames: 606208. Throughput: 0: 825.0. Samples: 150924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:25:48,411][00589] Avg episode reward: [(0, '4.402')]
[2023-02-22 17:25:49,609][12689] Updated weights for policy 0, policy_version 150 (0.0022)
[2023-02-22 17:25:53,403][00589] Fps is (10 sec: 3277.2, 60 sec: 3276.8, 300 sec: 3057.0). Total num frames: 626688. Throughput: 0: 873.5. Samples: 156348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:25:53,411][00589] Avg episode reward: [(0, '4.468')]
[2023-02-22 17:25:58,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3345.2, 300 sec: 3081.8). Total num frames: 647168. Throughput: 0: 904.7. Samples: 162598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:25:58,409][00589] Avg episode reward: [(0, '4.463')]
[2023-02-22 17:25:58,433][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000158_647168.pth...
[2023-02-22 17:26:00,586][12689] Updated weights for policy 0, policy_version 160 (0.0027)
[2023-02-22 17:26:03,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3345.2, 300 sec: 3067.2). Total num frames: 659456. Throughput: 0: 872.0. Samples: 164510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:26:03,410][00589] Avg episode reward: [(0, '4.615')]
[2023-02-22 17:26:08,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3072.0). Total num frames: 675840. Throughput: 0: 857.9. Samples: 168604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:26:08,406][00589] Avg episode reward: [(0, '4.679')]
[2023-02-22 17:26:12,976][12689] Updated weights for policy 0, policy_version 170 (0.0013)
[2023-02-22 17:26:13,404][00589] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3094.8). Total num frames: 696320. Throughput: 0: 890.1. Samples: 174386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:26:13,410][00589] Avg episode reward: [(0, '4.694')]
[2023-02-22 17:26:18,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3116.5). Total num frames: 716800. Throughput: 0: 887.7. Samples: 177650. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:26:18,411][00589] Avg episode reward: [(0, '4.695')]
[2023-02-22 17:26:23,403][00589] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3119.9). Total num frames: 733184. Throughput: 0: 859.2. Samples: 183120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:26:23,408][00589] Avg episode reward: [(0, '4.703')]
[2023-02-22 17:26:24,033][12689] Updated weights for policy 0, policy_version 180 (0.0017)
[2023-02-22 17:26:28,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3123.2). Total num frames: 749568. Throughput: 0: 853.7. Samples: 187376. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:26:28,407][00589] Avg episode reward: [(0, '4.703')]
[2023-02-22 17:26:33,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3143.1). Total num frames: 770048. Throughput: 0: 875.9. Samples: 190340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:26:33,411][00589] Avg episode reward: [(0, '4.616')]
[2023-02-22 17:26:34,881][12689] Updated weights for policy 0, policy_version 190 (0.0025)
[2023-02-22 17:26:38,404][00589] Fps is (10 sec: 4095.8, 60 sec: 3481.8, 300 sec: 3162.1). Total num frames: 790528. Throughput: 0: 906.3. Samples: 197132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:26:38,406][00589] Avg episode reward: [(0, '4.607')]
[2023-02-22 17:26:43,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3164.4). Total num frames: 806912. Throughput: 0: 880.0. Samples: 202200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:26:43,411][00589] Avg episode reward: [(0, '4.596')]
[2023-02-22 17:26:47,340][12689] Updated weights for policy 0, policy_version 200 (0.0013)
[2023-02-22 17:26:48,403][00589] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3150.8). Total num frames: 819200. Throughput: 0: 883.9. Samples: 204284. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:26:48,407][00589] Avg episode reward: [(0, '4.652')]
[2023-02-22 17:26:53,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3168.6). Total num frames: 839680. Throughput: 0: 907.6. Samples: 209448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:26:53,406][00589] Avg episode reward: [(0, '4.786')]
[2023-02-22 17:26:53,412][12675] Saving new best policy, reward=4.786!
[2023-02-22 17:26:57,305][12689] Updated weights for policy 0, policy_version 210 (0.0023)
[2023-02-22 17:26:58,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3200.9). Total num frames: 864256. Throughput: 0: 924.7. Samples: 215998. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:26:58,411][00589] Avg episode reward: [(0, '4.894')]
[2023-02-22 17:26:58,423][12675] Saving new best policy, reward=4.894!
[2023-02-22 17:27:03,404][00589] Fps is (10 sec: 3686.1, 60 sec: 3618.1, 300 sec: 3187.4). Total num frames: 876544. Throughput: 0: 909.6. Samples: 218582. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:03,413][00589] Avg episode reward: [(0, '4.752')]
[2023-02-22 17:27:08,404][00589] Fps is (10 sec: 1638.3, 60 sec: 3413.3, 300 sec: 3145.1). Total num frames: 880640. Throughput: 0: 835.2. Samples: 220704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:08,408][00589] Avg episode reward: [(0, '4.637')]
[2023-02-22 17:27:12,725][12689] Updated weights for policy 0, policy_version 220 (0.0017)
[2023-02-22 17:27:13,403][00589] Fps is (10 sec: 2457.8, 60 sec: 3413.3, 300 sec: 3161.8). Total num frames: 901120. Throughput: 0: 854.4. Samples: 225824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:13,406][00589] Avg episode reward: [(0, '4.626')]
[2023-02-22 17:27:18,403][00589] Fps is (10 sec: 4505.9, 60 sec: 3481.6, 300 sec: 3192.1). Total num frames: 925696. Throughput: 0: 862.9. Samples: 229172. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:18,406][00589] Avg episode reward: [(0, '4.737')]
[2023-02-22 17:27:22,953][12689] Updated weights for policy 0, policy_version 230 (0.0026)
[2023-02-22 17:27:23,403][00589] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3193.5). Total num frames: 942080. Throughput: 0: 843.3. Samples: 235082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:23,412][00589] Avg episode reward: [(0, '4.617')]
[2023-02-22 17:27:28,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 954368. Throughput: 0: 827.4. Samples: 239434. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:27:28,406][00589] Avg episode reward: [(0, '4.783')]
[2023-02-22 17:27:33,403][00589] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3304.6). Total num frames: 974848. Throughput: 0: 837.3. Samples: 241962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:27:33,406][00589] Avg episode reward: [(0, '4.735')]
[2023-02-22 17:27:34,707][12689] Updated weights for policy 0, policy_version 240 (0.0019)
[2023-02-22 17:27:38,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 999424. Throughput: 0: 872.1. Samples: 248692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:27:38,410][00589] Avg episode reward: [(0, '4.670')]
[2023-02-22 17:27:43,407][00589] Fps is (10 sec: 4094.7, 60 sec: 3481.4, 300 sec: 3429.5). Total num frames: 1015808. Throughput: 0: 850.1. Samples: 254256. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:27:43,410][00589] Avg episode reward: [(0, '4.891')]
[2023-02-22 17:27:45,831][12689] Updated weights for policy 0, policy_version 250 (0.0017)
[2023-02-22 17:27:48,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1028096. Throughput: 0: 839.2. Samples: 256346. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:27:48,405][00589] Avg episode reward: [(0, '5.186')]
[2023-02-22 17:27:48,423][12675] Saving new best policy, reward=5.186!
[2023-02-22 17:27:53,403][00589] Fps is (10 sec: 3277.9, 60 sec: 3481.6, 300 sec: 3429.6). Total num frames: 1048576. Throughput: 0: 898.2. Samples: 261122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:27:53,411][00589] Avg episode reward: [(0, '5.254')]
[2023-02-22 17:27:53,416][12675] Saving new best policy, reward=5.254!
[2023-02-22 17:27:56,727][12689] Updated weights for policy 0, policy_version 260 (0.0021)
[2023-02-22 17:27:58,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1069056. Throughput: 0: 932.4. Samples: 267784. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:27:58,412][00589] Avg episode reward: [(0, '5.293')]
[2023-02-22 17:27:58,423][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000261_1069056.pth...
[2023-02-22 17:27:58,566][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000058_237568.pth
[2023-02-22 17:27:58,580][12675] Saving new best policy, reward=5.293!
[2023-02-22 17:28:03,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1085440. Throughput: 0: 923.6. Samples: 270732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:28:03,407][00589] Avg episode reward: [(0, '5.332')]
[2023-02-22 17:28:03,413][12675] Saving new best policy, reward=5.332!
[2023-02-22 17:28:08,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3429.5). Total num frames: 1101824. Throughput: 0: 884.6. Samples: 274888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:28:08,405][00589] Avg episode reward: [(0, '5.758')]
[2023-02-22 17:28:08,420][12675] Saving new best policy, reward=5.758!
[2023-02-22 17:28:09,662][12689] Updated weights for policy 0, policy_version 270 (0.0042)
[2023-02-22 17:28:13,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 1118208. Throughput: 0: 903.5. Samples: 280090. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:28:13,406][00589] Avg episode reward: [(0, '5.961')]
[2023-02-22 17:28:13,415][12675] Saving new best policy, reward=5.961!
[2023-02-22 17:28:18,403][00589] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 1142784. Throughput: 0: 919.2. Samples: 283324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:28:18,406][00589] Avg episode reward: [(0, '5.875')]
[2023-02-22 17:28:19,052][12689] Updated weights for policy 0, policy_version 280 (0.0015)
[2023-02-22 17:28:23,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 1159168. Throughput: 0: 909.2. Samples: 289608. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:28:23,409][00589] Avg episode reward: [(0, '5.678')]
[2023-02-22 17:28:28,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 1171456. Throughput: 0: 879.7. Samples: 293840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:28:28,412][00589] Avg episode reward: [(0, '5.517')]
[2023-02-22 17:28:32,153][12689] Updated weights for policy 0, policy_version 290 (0.0021)
[2023-02-22 17:28:33,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 1191936. Throughput: 0: 877.7. Samples: 295844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:28:33,406][00589] Avg episode reward: [(0, '5.590')]
[2023-02-22 17:28:38,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1212416. Throughput: 0: 916.5. Samples: 302366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:28:38,408][00589] Avg episode reward: [(0, '5.449')]
[2023-02-22 17:28:42,309][12689] Updated weights for policy 0, policy_version 300 (0.0017)
[2023-02-22 17:28:43,408][00589] Fps is (10 sec: 3684.7, 60 sec: 3549.8, 300 sec: 3471.1). Total num frames: 1228800. Throughput: 0: 893.2. Samples: 307980. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:28:43,411][00589] Avg episode reward: [(0, '5.890')]
[2023-02-22 17:28:48,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 1245184. Throughput: 0: 874.4. Samples: 310082. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:28:48,410][00589] Avg episode reward: [(0, '6.042')]
[2023-02-22 17:28:48,429][12675] Saving new best policy, reward=6.042!
[2023-02-22 17:28:53,403][00589] Fps is (10 sec: 3278.3, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 1261568. Throughput: 0: 876.8. Samples: 314346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:28:53,406][00589] Avg episode reward: [(0, '6.918')]
[2023-02-22 17:28:53,409][12675] Saving new best policy, reward=6.918!
[2023-02-22 17:28:54,976][12689] Updated weights for policy 0, policy_version 310 (0.0019)
[2023-02-22 17:28:58,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 1282048. Throughput: 0: 909.6. Samples: 321020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:28:58,409][00589] Avg episode reward: [(0, '6.798')]
[2023-02-22 17:29:03,408][00589] Fps is (10 sec: 4093.9, 60 sec: 3617.8, 300 sec: 3498.9). Total num frames: 1302528. Throughput: 0: 914.2. Samples: 324466. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:29:03,416][00589] Avg episode reward: [(0, '6.972')]
[2023-02-22 17:29:03,418][12675] Saving new best policy, reward=6.972!
[2023-02-22 17:29:05,375][12689] Updated weights for policy 0, policy_version 320 (0.0045)
[2023-02-22 17:29:08,404][00589] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1318912. Throughput: 0: 877.4. Samples: 329092. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:29:08,408][00589] Avg episode reward: [(0, '6.987')]
[2023-02-22 17:29:08,422][12675] Saving new best policy, reward=6.987!
[2023-02-22 17:29:13,403][00589] Fps is (10 sec: 3278.5, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 1335296. Throughput: 0: 886.7. Samples: 333742. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:13,405][00589] Avg episode reward: [(0, '6.728')]
[2023-02-22 17:29:16,932][12689] Updated weights for policy 0, policy_version 330 (0.0043)
[2023-02-22 17:29:18,403][00589] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1355776. Throughput: 0: 919.0. Samples: 337200. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:29:18,406][00589] Avg episode reward: [(0, '6.537')]
[2023-02-22 17:29:23,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3512.8). Total num frames: 1376256. Throughput: 0: 926.1. Samples: 344040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:23,408][00589] Avg episode reward: [(0, '7.224')]
[2023-02-22 17:29:23,413][12675] Saving new best policy, reward=7.224!
[2023-02-22 17:29:28,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3485.1). Total num frames: 1388544. Throughput: 0: 894.2. Samples: 348216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:28,405][00589] Avg episode reward: [(0, '7.544')]
[2023-02-22 17:29:28,505][12675] Saving new best policy, reward=7.544!
[2023-02-22 17:29:28,511][12689] Updated weights for policy 0, policy_version 340 (0.0018)
[2023-02-22 17:29:33,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 1404928. Throughput: 0: 891.6. Samples: 350206. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:33,411][00589] Avg episode reward: [(0, '8.146')]
[2023-02-22 17:29:33,418][12675] Saving new best policy, reward=8.146!
[2023-02-22 17:29:38,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 1429504. Throughput: 0: 933.2. Samples: 356342. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:38,410][00589] Avg episode reward: [(0, '7.638')]
[2023-02-22 17:29:39,004][12689] Updated weights for policy 0, policy_version 350 (0.0020)
[2023-02-22 17:29:43,405][00589] Fps is (10 sec: 4504.8, 60 sec: 3686.6, 300 sec: 3526.7). Total num frames: 1449984. Throughput: 0: 929.3. Samples: 362842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:43,408][00589] Avg episode reward: [(0, '8.057')]
[2023-02-22 17:29:48,404][00589] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3498.9). Total num frames: 1462272. Throughput: 0: 897.5. Samples: 364850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:48,411][00589] Avg episode reward: [(0, '8.224')]
[2023-02-22 17:29:48,427][12675] Saving new best policy, reward=8.224!
[2023-02-22 17:29:52,039][12689] Updated weights for policy 0, policy_version 360 (0.0023)
[2023-02-22 17:29:53,403][00589] Fps is (10 sec: 2867.7, 60 sec: 3618.1, 300 sec: 3499.0). Total num frames: 1478656. Throughput: 0: 887.9. Samples: 369048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:29:53,411][00589] Avg episode reward: [(0, '8.691')]
[2023-02-22 17:29:53,416][12675] Saving new best policy, reward=8.691!
[2023-02-22 17:29:58,403][00589] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3526.8). Total num frames: 1499136. Throughput: 0: 925.0. Samples: 375368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:29:58,409][00589] Avg episode reward: [(0, '9.210')]
[2023-02-22 17:29:58,424][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000366_1499136.pth...
[2023-02-22 17:29:58,538][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000158_647168.pth
[2023-02-22 17:29:58,551][12675] Saving new best policy, reward=9.210!
[2023-02-22 17:30:01,554][12689] Updated weights for policy 0, policy_version 370 (0.0024)
[2023-02-22 17:30:03,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3618.4, 300 sec: 3568.4). Total num frames: 1519616. Throughput: 0: 917.8. Samples: 378500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:03,406][00589] Avg episode reward: [(0, '9.341')]
[2023-02-22 17:30:03,407][12675] Saving new best policy, reward=9.341!
[2023-02-22 17:30:08,404][00589] Fps is (10 sec: 3276.6, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1531904. Throughput: 0: 873.1. Samples: 383332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:08,411][00589] Avg episode reward: [(0, '9.595')]
[2023-02-22 17:30:08,428][12675] Saving new best policy, reward=9.595!
[2023-02-22 17:30:13,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1548288. Throughput: 0: 865.2. Samples: 387150. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:13,406][00589] Avg episode reward: [(0, '9.441')]
[2023-02-22 17:30:15,398][12689] Updated weights for policy 0, policy_version 380 (0.0039)
[2023-02-22 17:30:18,404][00589] Fps is (10 sec: 3686.3, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 1568768. Throughput: 0: 884.7. Samples: 390016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:30:18,411][00589] Avg episode reward: [(0, '8.165')]
[2023-02-22 17:30:23,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1589248. Throughput: 0: 887.6. Samples: 396282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:23,409][00589] Avg episode reward: [(0, '7.850')]
[2023-02-22 17:30:26,103][12689] Updated weights for policy 0, policy_version 390 (0.0020)
[2023-02-22 17:30:28,408][00589] Fps is (10 sec: 3275.4, 60 sec: 3549.6, 300 sec: 3540.6). Total num frames: 1601536. Throughput: 0: 849.0. Samples: 401048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:30:28,411][00589] Avg episode reward: [(0, '8.325')]
[2023-02-22 17:30:33,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3512.9). Total num frames: 1617920. Throughput: 0: 850.6. Samples: 403126. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:33,412][00589] Avg episode reward: [(0, '8.827')]
[2023-02-22 17:30:37,895][12689] Updated weights for policy 0, policy_version 400 (0.0018)
[2023-02-22 17:30:38,403][00589] Fps is (10 sec: 3688.2, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1638400. Throughput: 0: 883.5. Samples: 408806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:30:38,411][00589] Avg episode reward: [(0, '9.932')]
[2023-02-22 17:30:38,418][12675] Saving new best policy, reward=9.932!
[2023-02-22 17:30:43,408][00589] Fps is (10 sec: 4094.0, 60 sec: 3481.4, 300 sec: 3568.3). Total num frames: 1658880. Throughput: 0: 878.4. Samples: 414902. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:30:43,415][00589] Avg episode reward: [(0, '10.208')]
[2023-02-22 17:30:43,429][12675] Saving new best policy, reward=10.208!
[2023-02-22 17:30:48,408][00589] Fps is (10 sec: 2865.8, 60 sec: 3413.1, 300 sec: 3526.7). Total num frames: 1667072. Throughput: 0: 846.8. Samples: 416612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:30:48,411][00589] Avg episode reward: [(0, '10.851')]
[2023-02-22 17:30:48,435][12675] Saving new best policy, reward=10.851!
[2023-02-22 17:30:52,221][12689] Updated weights for policy 0, policy_version 410 (0.0014)
[2023-02-22 17:30:53,406][00589] Fps is (10 sec: 2048.4, 60 sec: 3344.9, 300 sec: 3498.9). Total num frames: 1679360. Throughput: 0: 814.6. Samples: 419992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:30:53,411][00589] Avg episode reward: [(0, '10.950')]
[2023-02-22 17:30:53,416][12675] Saving new best policy, reward=10.950!
[2023-02-22 17:30:58,403][00589] Fps is (10 sec: 2868.6, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 1695744. Throughput: 0: 815.2. Samples: 423832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:30:58,410][00589] Avg episode reward: [(0, '10.488')]
[2023-02-22 17:31:03,403][00589] Fps is (10 sec: 3687.5, 60 sec: 3276.8, 300 sec: 3526.7). Total num frames: 1716224. Throughput: 0: 828.4. Samples: 427294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:31:03,405][00589] Avg episode reward: [(0, '11.225')]
[2023-02-22 17:31:03,412][12675] Saving new best policy, reward=11.225!
[2023-02-22 17:31:03,664][12689] Updated weights for policy 0, policy_version 420 (0.0026)
[2023-02-22 17:31:08,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 1740800. Throughput: 0: 841.6. Samples: 434152. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:31:08,407][00589] Avg episode reward: [(0, '11.261')]
[2023-02-22 17:31:08,422][12675] Saving new best policy, reward=11.261!
[2023-02-22 17:31:13,404][00589] Fps is (10 sec: 3686.0, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 1753088. Throughput: 0: 839.9. Samples: 438842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:31:13,410][00589] Avg episode reward: [(0, '11.925')]
[2023-02-22 17:31:13,419][12675] Saving new best policy, reward=11.925!
[2023-02-22 17:31:15,850][12689] Updated weights for policy 0, policy_version 430 (0.0013)
[2023-02-22 17:31:18,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 1769472. Throughput: 0: 841.6. Samples: 440998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:31:18,411][00589] Avg episode reward: [(0, '12.301')]
[2023-02-22 17:31:18,422][12675] Saving new best policy, reward=12.301!
[2023-02-22 17:31:23,403][00589] Fps is (10 sec: 3686.8, 60 sec: 3345.1, 300 sec: 3526.7). Total num frames: 1789952. Throughput: 0: 847.9. Samples: 446962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:31:23,406][00589] Avg episode reward: [(0, '13.068')]
[2023-02-22 17:31:23,413][12675] Saving new best policy, reward=13.068!
[2023-02-22 17:31:25,445][12689] Updated weights for policy 0, policy_version 440 (0.0040)
[2023-02-22 17:31:28,404][00589] Fps is (10 sec: 4505.3, 60 sec: 3550.1, 300 sec: 3540.6). Total num frames: 1814528. Throughput: 0: 863.5. Samples: 453758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:31:28,414][00589] Avg episode reward: [(0, '14.467')]
[2023-02-22 17:31:28,422][12675] Saving new best policy, reward=14.467!
[2023-02-22 17:31:33,405][00589] Fps is (10 sec: 3685.7, 60 sec: 3481.5, 300 sec: 3512.8). Total num frames: 1826816. Throughput: 0: 872.9. Samples: 455892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:31:33,408][00589] Avg episode reward: [(0, '14.241')]
[2023-02-22 17:31:38,272][12689] Updated weights for policy 0, policy_version 450 (0.0018)
[2023-02-22 17:31:38,403][00589] Fps is (10 sec: 2867.4, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 1843200. Throughput: 0: 892.1. Samples: 460132. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:31:38,410][00589] Avg episode reward: [(0, '13.680')]
[2023-02-22 17:31:43,403][00589] Fps is (10 sec: 3687.1, 60 sec: 3413.6, 300 sec: 3540.6). Total num frames: 1863680. Throughput: 0: 945.8. Samples: 466394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:31:43,406][00589] Avg episode reward: [(0, '14.959')]
[2023-02-22 17:31:43,409][12675] Saving new best policy, reward=14.959!
[2023-02-22 17:31:47,238][12689] Updated weights for policy 0, policy_version 460 (0.0020)
[2023-02-22 17:31:48,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3686.7, 300 sec: 3554.5). Total num frames: 1888256. Throughput: 0: 946.5. Samples: 469888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:31:48,412][00589] Avg episode reward: [(0, '14.600')]
[2023-02-22 17:31:53,404][00589] Fps is (10 sec: 3686.1, 60 sec: 3686.5, 300 sec: 3512.8). Total num frames: 1900544. Throughput: 0: 913.1. Samples: 475242. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-22 17:31:53,411][00589] Avg episode reward: [(0, '15.031')]
[2023-02-22 17:31:53,415][12675] Saving new best policy, reward=15.031!
[2023-02-22 17:31:58,404][00589] Fps is (10 sec: 2867.1, 60 sec: 3686.4, 300 sec: 3526.7). Total num frames: 1916928. Throughput: 0: 908.9. Samples: 479742. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:31:58,414][00589] Avg episode reward: [(0, '15.263')]
[2023-02-22 17:31:58,431][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000468_1916928.pth...
[2023-02-22 17:31:58,564][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000261_1069056.pth
[2023-02-22 17:31:58,592][12675] Saving new best policy, reward=15.263!
[2023-02-22 17:31:59,624][12689] Updated weights for policy 0, policy_version 470 (0.0025)
[2023-02-22 17:32:03,403][00589] Fps is (10 sec: 4096.4, 60 sec: 3754.7, 300 sec: 3596.2). Total num frames: 1941504. Throughput: 0: 934.8. Samples: 483062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:32:03,411][00589] Avg episode reward: [(0, '15.285')]
[2023-02-22 17:32:03,417][12675] Saving new best policy, reward=15.285!
[2023-02-22 17:32:08,406][00589] Fps is (10 sec: 4504.4, 60 sec: 3686.2, 300 sec: 3596.1). Total num frames: 1961984. Throughput: 0: 955.6. Samples: 489968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:32:08,413][00589] Avg episode reward: [(0, '14.282')]
[2023-02-22 17:32:09,019][12689] Updated weights for policy 0, policy_version 480 (0.0026)
[2023-02-22 17:32:13,405][00589] Fps is (10 sec: 3276.2, 60 sec: 3686.4, 300 sec: 3554.5). Total num frames: 1974272. Throughput: 0: 905.7. Samples: 494516. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:32:13,415][00589] Avg episode reward: [(0, '13.271')]
[2023-02-22 17:32:18,403][00589] Fps is (10 sec: 3277.8, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 1994752. Throughput: 0: 907.9. Samples: 496748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:32:18,411][00589] Avg episode reward: [(0, '12.870')]
[2023-02-22 17:32:20,911][12689] Updated weights for policy 0, policy_version 490 (0.0016)
[2023-02-22 17:32:23,403][00589] Fps is (10 sec: 4096.8, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 2015232. Throughput: 0: 961.0. Samples: 503376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:32:23,416][00589] Avg episode reward: [(0, '13.326')]
[2023-02-22 17:32:28,404][00589] Fps is (10 sec: 4505.2, 60 sec: 3754.7, 300 sec: 3610.0). Total num frames: 2039808. Throughput: 0: 966.3. Samples: 509878. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:32:28,406][00589] Avg episode reward: [(0, '13.506')]
[2023-02-22 17:32:31,344][12689] Updated weights for policy 0, policy_version 500 (0.0013)
[2023-02-22 17:32:33,406][00589] Fps is (10 sec: 3685.3, 60 sec: 3754.6, 300 sec: 3568.3). Total num frames: 2052096. Throughput: 0: 935.4. Samples: 511982. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:32:33,409][00589] Avg episode reward: [(0, '14.786')]
[2023-02-22 17:32:38,403][00589] Fps is (10 sec: 2867.4, 60 sec: 3754.7, 300 sec: 3568.4). Total num frames: 2068480. Throughput: 0: 919.3. Samples: 516612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:32:38,411][00589] Avg episode reward: [(0, '16.854')]
[2023-02-22 17:32:38,426][12675] Saving new best policy, reward=16.854!
[2023-02-22 17:32:42,607][12689] Updated weights for policy 0, policy_version 510 (0.0016)
[2023-02-22 17:32:43,403][00589] Fps is (10 sec: 3687.5, 60 sec: 3754.7, 300 sec: 3596.1). Total num frames: 2088960. Throughput: 0: 960.0. Samples: 522944. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:32:43,411][00589] Avg episode reward: [(0, '15.503')]
[2023-02-22 17:32:48,410][00589] Fps is (10 sec: 4093.2, 60 sec: 3686.0, 300 sec: 3596.1). Total num frames: 2109440. Throughput: 0: 961.0. Samples: 526312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:32:48,412][00589] Avg episode reward: [(0, '15.547')]
[2023-02-22 17:32:53,404][00589] Fps is (10 sec: 3276.6, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 2121728. Throughput: 0: 903.1. Samples: 530604. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:32:53,407][00589] Avg episode reward: [(0, '16.027')]
[2023-02-22 17:32:55,221][12689] Updated weights for policy 0, policy_version 520 (0.0015)
[2023-02-22 17:32:58,403][00589] Fps is (10 sec: 3279.0, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 2142208. Throughput: 0: 914.8. Samples: 535680. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:32:58,415][00589] Avg episode reward: [(0, '15.610')]
[2023-02-22 17:33:03,403][00589] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 2162688. Throughput: 0: 935.8. Samples: 538860. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:33:03,411][00589] Avg episode reward: [(0, '15.039')]
[2023-02-22 17:33:04,662][12689] Updated weights for policy 0, policy_version 530 (0.0016)
[2023-02-22 17:33:08,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 2179072. Throughput: 0: 918.4. Samples: 544704. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:33:08,413][00589] Avg episode reward: [(0, '15.684')]
[2023-02-22 17:33:13,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3686.5, 300 sec: 3568.4). Total num frames: 2195456. Throughput: 0: 866.0. Samples: 548846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:33:13,406][00589] Avg episode reward: [(0, '16.771')]
[2023-02-22 17:33:17,849][12689] Updated weights for policy 0, policy_version 540 (0.0027)
[2023-02-22 17:33:18,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 2211840. Throughput: 0: 873.4. Samples: 551282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:33:18,406][00589] Avg episode reward: [(0, '16.436')]
[2023-02-22 17:33:23,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2232320. Throughput: 0: 911.9. Samples: 557648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:33:23,406][00589] Avg episode reward: [(0, '15.875')]
[2023-02-22 17:33:28,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 2248704. Throughput: 0: 882.1. Samples: 562640. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:33:28,413][00589] Avg episode reward: [(0, '18.328')]
[2023-02-22 17:33:28,429][12675] Saving new best policy, reward=18.328!
[2023-02-22 17:33:29,382][12689] Updated weights for policy 0, policy_version 550 (0.0027)
[2023-02-22 17:33:33,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3481.8, 300 sec: 3554.5). Total num frames: 2260992. Throughput: 0: 849.2. Samples: 564518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:33:33,410][00589] Avg episode reward: [(0, '18.808')]
[2023-02-22 17:33:33,415][12675] Saving new best policy, reward=18.808!
[2023-02-22 17:33:38,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 2281472. Throughput: 0: 862.9. Samples: 569432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:33:38,406][00589] Avg episode reward: [(0, '18.471')]
[2023-02-22 17:33:40,896][12689] Updated weights for policy 0, policy_version 560 (0.0019)
[2023-02-22 17:33:43,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2301952. Throughput: 0: 892.7. Samples: 575850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:33:43,406][00589] Avg episode reward: [(0, '18.461')]
[2023-02-22 17:33:48,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3413.7, 300 sec: 3568.4). Total num frames: 2314240. Throughput: 0: 876.1. Samples: 578286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:33:48,412][00589] Avg episode reward: [(0, '18.219')]
[2023-02-22 17:33:53,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 2330624. Throughput: 0: 836.0. Samples: 582322. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:33:53,412][00589] Avg episode reward: [(0, '17.478')]
[2023-02-22 17:33:54,112][12689] Updated weights for policy 0, policy_version 570 (0.0016)
[2023-02-22 17:33:58,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3554.6). Total num frames: 2351104. Throughput: 0: 870.8. Samples: 588034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:33:58,414][00589] Avg episode reward: [(0, '18.376')]
[2023-02-22 17:33:58,431][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth...
[2023-02-22 17:33:58,569][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000366_1499136.pth
[2023-02-22 17:34:03,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 2371584. Throughput: 0: 886.0. Samples: 591154. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:34:03,405][00589] Avg episode reward: [(0, '18.518')]
[2023-02-22 17:34:04,294][12689] Updated weights for policy 0, policy_version 580 (0.0015)
[2023-02-22 17:34:08,406][00589] Fps is (10 sec: 3275.9, 60 sec: 3413.2, 300 sec: 3554.5). Total num frames: 2383872. Throughput: 0: 855.6. Samples: 596152. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:34:08,413][00589] Avg episode reward: [(0, '19.904')]
[2023-02-22 17:34:08,433][12675] Saving new best policy, reward=19.904!
[2023-02-22 17:34:13,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 2400256. Throughput: 0: 832.5. Samples: 600102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:34:13,405][00589] Avg episode reward: [(0, '19.283')]
[2023-02-22 17:34:17,289][12689] Updated weights for policy 0, policy_version 590 (0.0060)
[2023-02-22 17:34:18,403][00589] Fps is (10 sec: 3687.3, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 2420736. Throughput: 0: 858.1. Samples: 603132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:18,408][00589] Avg episode reward: [(0, '20.068')]
[2023-02-22 17:34:18,424][12675] Saving new best policy, reward=20.068!
[2023-02-22 17:34:23,404][00589] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3568.4). Total num frames: 2441216. Throughput: 0: 886.3. Samples: 609314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:23,409][00589] Avg episode reward: [(0, '20.019')]
[2023-02-22 17:34:28,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 2453504. Throughput: 0: 841.1. Samples: 613700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:34:28,408][00589] Avg episode reward: [(0, '19.464')]
[2023-02-22 17:34:29,643][12689] Updated weights for policy 0, policy_version 600 (0.0012)
[2023-02-22 17:34:33,403][00589] Fps is (10 sec: 2867.3, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2469888. Throughput: 0: 831.0. Samples: 615682. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:33,410][00589] Avg episode reward: [(0, '20.248')]
[2023-02-22 17:34:33,413][12675] Saving new best policy, reward=20.248!
[2023-02-22 17:34:38,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2490368. Throughput: 0: 871.0. Samples: 621516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:38,406][00589] Avg episode reward: [(0, '21.823')]
[2023-02-22 17:34:38,420][12675] Saving new best policy, reward=21.823!
[2023-02-22 17:34:40,211][12689] Updated weights for policy 0, policy_version 610 (0.0028)
[2023-02-22 17:34:43,405][00589] Fps is (10 sec: 3685.8, 60 sec: 3413.2, 300 sec: 3540.6). Total num frames: 2506752. Throughput: 0: 877.2. Samples: 627510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:34:43,409][00589] Avg episode reward: [(0, '22.575')]
[2023-02-22 17:34:43,422][12675] Saving new best policy, reward=22.575!
[2023-02-22 17:34:48,405][00589] Fps is (10 sec: 3276.2, 60 sec: 3481.5, 300 sec: 3540.6). Total num frames: 2523136. Throughput: 0: 851.8. Samples: 629486. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:34:48,412][00589] Avg episode reward: [(0, '21.653')]
[2023-02-22 17:34:53,057][12689] Updated weights for policy 0, policy_version 620 (0.0040)
[2023-02-22 17:34:53,403][00589] Fps is (10 sec: 3277.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2539520. Throughput: 0: 837.1. Samples: 633818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:53,405][00589] Avg episode reward: [(0, '22.899')]
[2023-02-22 17:34:53,413][12675] Saving new best policy, reward=22.899!
[2023-02-22 17:34:58,403][00589] Fps is (10 sec: 3687.1, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2560000. Throughput: 0: 896.3. Samples: 640434. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:34:58,414][00589] Avg episode reward: [(0, '23.593')]
[2023-02-22 17:34:58,433][12675] Saving new best policy, reward=23.593!
[2023-02-22 17:35:03,248][12689] Updated weights for policy 0, policy_version 630 (0.0012)
[2023-02-22 17:35:03,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 2580480. Throughput: 0: 901.8. Samples: 643712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:35:03,406][00589] Avg episode reward: [(0, '21.315')]
[2023-02-22 17:35:08,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3540.6). Total num frames: 2592768. Throughput: 0: 860.8. Samples: 648052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:35:08,410][00589] Avg episode reward: [(0, '20.817')]
[2023-02-22 17:35:13,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 2613248. Throughput: 0: 887.2. Samples: 653624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:35:13,406][00589] Avg episode reward: [(0, '22.223')]
[2023-02-22 17:35:14,733][12689] Updated weights for policy 0, policy_version 640 (0.0017)
[2023-02-22 17:35:18,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 2637824. Throughput: 0: 920.3. Samples: 657096. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-22 17:35:18,406][00589] Avg episode reward: [(0, '22.000')]
[2023-02-22 17:35:23,406][00589] Fps is (10 sec: 4504.2, 60 sec: 3618.0, 300 sec: 3582.3). Total num frames: 2658304. Throughput: 0: 941.7. Samples: 663896. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-22 17:35:23,412][00589] Avg episode reward: [(0, '22.120')]
[2023-02-22 17:35:24,844][12689] Updated weights for policy 0, policy_version 650 (0.0021)
[2023-02-22 17:35:28,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 2670592. Throughput: 0: 907.0. Samples: 668322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:35:28,409][00589] Avg episode reward: [(0, '23.033')]
[2023-02-22 17:35:33,403][00589] Fps is (10 sec: 3687.6, 60 sec: 3754.7, 300 sec: 3582.3). Total num frames: 2695168. Throughput: 0: 929.9. Samples: 671330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:35:33,406][00589] Avg episode reward: [(0, '23.530')]
[2023-02-22 17:35:35,010][12689] Updated weights for policy 0, policy_version 660 (0.0028)
[2023-02-22 17:35:38,403][00589] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3596.2). Total num frames: 2719744. Throughput: 0: 992.4. Samples: 678478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:35:38,411][00589] Avg episode reward: [(0, '24.188')]
[2023-02-22 17:35:38,423][12675] Saving new best policy, reward=24.188!
[2023-02-22 17:35:43,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3624.0). Total num frames: 2736128. Throughput: 0: 971.0. Samples: 684128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:35:43,411][00589] Avg episode reward: [(0, '24.016')]
[2023-02-22 17:35:46,070][12689] Updated weights for policy 0, policy_version 670 (0.0023)
[2023-02-22 17:35:48,403][00589] Fps is (10 sec: 2867.2, 60 sec: 3754.8, 300 sec: 3624.0). Total num frames: 2748416. Throughput: 0: 948.1. Samples: 686378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:35:48,406][00589] Avg episode reward: [(0, '23.489')]
[2023-02-22 17:35:53,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3651.7). Total num frames: 2772992. Throughput: 0: 988.8. Samples: 692546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:35:53,406][00589] Avg episode reward: [(0, '24.020')]
[2023-02-22 17:35:55,280][12689] Updated weights for policy 0, policy_version 680 (0.0021)
[2023-02-22 17:35:58,404][00589] Fps is (10 sec: 4914.7, 60 sec: 3959.4, 300 sec: 3665.6). Total num frames: 2797568. Throughput: 0: 1026.2. Samples: 699806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:35:58,414][00589] Avg episode reward: [(0, '23.705')]
[2023-02-22 17:35:58,430][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000683_2797568.pth...
[2023-02-22 17:35:58,560][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000468_1916928.pth
[2023-02-22 17:36:03,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3637.8). Total num frames: 2813952. Throughput: 0: 1003.2. Samples: 702240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:36:03,408][00589] Avg episode reward: [(0, '23.421')]
[2023-02-22 17:36:06,897][12689] Updated weights for policy 0, policy_version 690 (0.0031)
[2023-02-22 17:36:08,403][00589] Fps is (10 sec: 3277.1, 60 sec: 3959.5, 300 sec: 3651.7). Total num frames: 2830336. Throughput: 0: 955.3. Samples: 706882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:36:08,413][00589] Avg episode reward: [(0, '22.907')]
[2023-02-22 17:36:13,403][00589] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3679.5). Total num frames: 2854912. Throughput: 0: 1014.1. Samples: 713958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:36:13,413][00589] Avg episode reward: [(0, '24.140')]
[2023-02-22 17:36:15,448][12689] Updated weights for policy 0, policy_version 700 (0.0012)
[2023-02-22 17:36:18,404][00589] Fps is (10 sec: 4505.4, 60 sec: 3959.4, 300 sec: 3679.5). Total num frames: 2875392. Throughput: 0: 1027.8. Samples: 717582. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:36:18,410][00589] Avg episode reward: [(0, '23.319')]
[2023-02-22 17:36:23,406][00589] Fps is (10 sec: 3685.4, 60 sec: 3891.2, 300 sec: 3651.7). Total num frames: 2891776. Throughput: 0: 979.9. Samples: 722574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:36:23,409][00589] Avg episode reward: [(0, '23.169')]
[2023-02-22 17:36:28,403][00589] Fps is (10 sec: 2867.3, 60 sec: 3891.2, 300 sec: 3651.7). Total num frames: 2904064. Throughput: 0: 938.1. Samples: 726342. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:36:28,408][00589] Avg episode reward: [(0, '23.619')]
[2023-02-22 17:36:29,726][12689] Updated weights for policy 0, policy_version 710 (0.0030)
[2023-02-22 17:36:33,403][00589] Fps is (10 sec: 2868.0, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 2920448. Throughput: 0: 933.5. Samples: 728384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:36:33,407][00589] Avg episode reward: [(0, '22.159')]
[2023-02-22 17:36:38,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 2945024. Throughput: 0: 947.9. Samples: 735200. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:36:38,412][00589] Avg episode reward: [(0, '22.145')]
[2023-02-22 17:36:38,797][12689] Updated weights for policy 0, policy_version 720 (0.0013)
[2023-02-22 17:36:43,404][00589] Fps is (10 sec: 4095.9, 60 sec: 3754.7, 300 sec: 3637.8). Total num frames: 2961408. Throughput: 0: 910.4. Samples: 740772. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:36:43,407][00589] Avg episode reward: [(0, '23.617')]
[2023-02-22 17:36:48,404][00589] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3651.7). Total num frames: 2977792. Throughput: 0: 908.1. Samples: 743106. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:36:48,414][00589] Avg episode reward: [(0, '23.244')]
[2023-02-22 17:36:50,625][12689] Updated weights for policy 0, policy_version 730 (0.0014)
[2023-02-22 17:36:53,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3679.5). Total num frames: 3002368. Throughput: 0: 942.4. Samples: 749288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:36:53,406][00589] Avg episode reward: [(0, '24.371')]
[2023-02-22 17:36:53,409][12675] Saving new best policy, reward=24.371!
[2023-02-22 17:36:58,403][00589] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3022848. Throughput: 0: 943.8. Samples: 756430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:36:58,406][00589] Avg episode reward: [(0, '24.645')]
[2023-02-22 17:36:58,415][12675] Saving new best policy, reward=24.645!
[2023-02-22 17:36:59,879][12689] Updated weights for policy 0, policy_version 740 (0.0014)
[2023-02-22 17:37:03,404][00589] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3651.7). Total num frames: 3039232. Throughput: 0: 916.8. Samples: 758840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:37:03,406][00589] Avg episode reward: [(0, '24.986')]
[2023-02-22 17:37:03,413][12675] Saving new best policy, reward=24.986!
[2023-02-22 17:37:08,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3665.6). Total num frames: 3055616. Throughput: 0: 904.4. Samples: 763270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:37:08,410][00589] Avg episode reward: [(0, '25.425')]
[2023-02-22 17:37:08,421][12675] Saving new best policy, reward=25.425!
[2023-02-22 17:37:11,266][12689] Updated weights for policy 0, policy_version 750 (0.0020)
[2023-02-22 17:37:13,403][00589] Fps is (10 sec: 4096.1, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3080192. Throughput: 0: 974.2. Samples: 770180. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:37:13,406][00589] Avg episode reward: [(0, '23.736')]
[2023-02-22 17:37:18,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3679.5). Total num frames: 3100672. Throughput: 0: 1008.5. Samples: 773768. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:37:18,409][00589] Avg episode reward: [(0, '22.817')]
[2023-02-22 17:37:21,346][12689] Updated weights for policy 0, policy_version 760 (0.0012)
[2023-02-22 17:37:23,408][00589] Fps is (10 sec: 3684.8, 60 sec: 3754.6, 300 sec: 3651.6). Total num frames: 3117056. Throughput: 0: 974.4. Samples: 779052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-22 17:37:23,410][00589] Avg episode reward: [(0, '22.182')]
[2023-02-22 17:37:28,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3137536. Throughput: 0: 963.4. Samples: 784124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-22 17:37:28,409][00589] Avg episode reward: [(0, '22.042')]
[2023-02-22 17:37:31,836][12689] Updated weights for policy 0, policy_version 770 (0.0031)
[2023-02-22 17:37:33,403][00589] Fps is (10 sec: 4507.6, 60 sec: 4027.7, 300 sec: 3707.2). Total num frames: 3162112. Throughput: 0: 991.3. Samples: 787716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:37:33,410][00589] Avg episode reward: [(0, '22.826')]
[2023-02-22 17:37:38,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3693.3). Total num frames: 3178496. Throughput: 0: 1009.0. Samples: 794694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:37:38,410][00589] Avg episode reward: [(0, '23.205')]
[2023-02-22 17:37:42,890][12689] Updated weights for policy 0, policy_version 780 (0.0011)
[2023-02-22 17:37:43,404][00589] Fps is (10 sec: 3276.5, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 3194880. Throughput: 0: 949.2. Samples: 799144. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:37:43,407][00589] Avg episode reward: [(0, '22.846')]
[2023-02-22 17:37:48,404][00589] Fps is (10 sec: 3686.2, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 3215360. Throughput: 0: 947.9. Samples: 801494. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:37:48,406][00589] Avg episode reward: [(0, '23.623')]
[2023-02-22 17:37:52,655][12689] Updated weights for policy 0, policy_version 790 (0.0024)
[2023-02-22 17:37:53,403][00589] Fps is (10 sec: 4096.4, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 3235840. Throughput: 0: 1003.1. Samples: 808410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:37:53,412][00589] Avg episode reward: [(0, '24.036')]
[2023-02-22 17:37:58,403][00589] Fps is (10 sec: 4096.2, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 3256320. Throughput: 0: 983.3. Samples: 814428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:37:58,407][00589] Avg episode reward: [(0, '23.591')]
[2023-02-22 17:37:58,425][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000795_3256320.pth...
[2023-02-22 17:37:58,558][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000574_2351104.pth
[2023-02-22 17:38:03,406][00589] Fps is (10 sec: 3275.9, 60 sec: 3822.8, 300 sec: 3693.3). Total num frames: 3268608. Throughput: 0: 951.4. Samples: 816584. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:38:03,418][00589] Avg episode reward: [(0, '22.713')]
[2023-02-22 17:38:04,828][12689] Updated weights for policy 0, policy_version 800 (0.0017)
[2023-02-22 17:38:08,403][00589] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 3293184. Throughput: 0: 955.0. Samples: 822024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:38:08,406][00589] Avg episode reward: [(0, '24.524')]
[2023-02-22 17:38:13,270][12689] Updated weights for policy 0, policy_version 810 (0.0021)
[2023-02-22 17:38:13,403][00589] Fps is (10 sec: 4916.5, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 3317760. Throughput: 0: 1004.6. Samples: 829330. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:38:13,405][00589] Avg episode reward: [(0, '24.841')]
[2023-02-22 17:38:18,406][00589] Fps is (10 sec: 4094.9, 60 sec: 3891.0, 300 sec: 3735.0). Total num frames: 3334144. Throughput: 0: 993.2. Samples: 832412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:38:18,409][00589] Avg episode reward: [(0, '23.713')]
[2023-02-22 17:38:23,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3891.5, 300 sec: 3735.0). Total num frames: 3350528. Throughput: 0: 939.5. Samples: 836972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:38:23,414][00589] Avg episode reward: [(0, '24.038')]
[2023-02-22 17:38:25,001][12689] Updated weights for policy 0, policy_version 820 (0.0030)
[2023-02-22 17:38:28,403][00589] Fps is (10 sec: 4097.1, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 3375104. Throughput: 0: 986.0. Samples: 843512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:38:28,405][00589] Avg episode reward: [(0, '24.380')]
[2023-02-22 17:38:33,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 3395584. Throughput: 0: 1014.9. Samples: 847164. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:38:33,410][00589] Avg episode reward: [(0, '23.492')]
[2023-02-22 17:38:33,429][12689] Updated weights for policy 0, policy_version 830 (0.0012)
[2023-02-22 17:38:38,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3762.8). Total num frames: 3411968. Throughput: 0: 989.7. Samples: 852948. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:38:38,412][00589] Avg episode reward: [(0, '22.954')]
[2023-02-22 17:38:43,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3891.3, 300 sec: 3776.7). Total num frames: 3428352. Throughput: 0: 959.7. Samples: 857614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:38:43,406][00589] Avg episode reward: [(0, '24.049')]
[2023-02-22 17:38:45,231][12689] Updated weights for policy 0, policy_version 840 (0.0012)
[2023-02-22 17:38:48,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 3452928. Throughput: 0: 992.7. Samples: 861254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:38:48,411][00589] Avg episode reward: [(0, '22.479')]
[2023-02-22 17:38:53,404][00589] Fps is (10 sec: 4914.8, 60 sec: 4027.7, 300 sec: 3818.3). Total num frames: 3477504. Throughput: 0: 1033.0. Samples: 868510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:38:53,408][00589] Avg episode reward: [(0, '23.316')]
[2023-02-22 17:38:54,179][12689] Updated weights for policy 0, policy_version 850 (0.0015)
[2023-02-22 17:38:58,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 3493888. Throughput: 0: 982.9. Samples: 873562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:38:58,405][00589] Avg episode reward: [(0, '23.468')]
[2023-02-22 17:39:03,403][00589] Fps is (10 sec: 3277.1, 60 sec: 4027.9, 300 sec: 3818.3). Total num frames: 3510272. Throughput: 0: 965.7. Samples: 875866. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:39:03,413][00589] Avg episode reward: [(0, '22.741')]
[2023-02-22 17:39:05,353][12689] Updated weights for policy 0, policy_version 860 (0.0036)
[2023-02-22 17:39:08,403][00589] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 3534848. Throughput: 0: 1016.8. Samples: 882730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:08,406][00589] Avg episode reward: [(0, '21.943')]
[2023-02-22 17:39:13,403][00589] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3555328. Throughput: 0: 1022.5. Samples: 889526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:13,406][00589] Avg episode reward: [(0, '22.553')]
[2023-02-22 17:39:15,172][12689] Updated weights for policy 0, policy_version 870 (0.0025)
[2023-02-22 17:39:18,404][00589] Fps is (10 sec: 3686.2, 60 sec: 3959.6, 300 sec: 3832.2). Total num frames: 3571712. Throughput: 0: 993.1. Samples: 891856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:18,410][00589] Avg episode reward: [(0, '23.088')]
[2023-02-22 17:39:23,403][00589] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 3592192. Throughput: 0: 976.1. Samples: 896874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:23,406][00589] Avg episode reward: [(0, '24.187')]
[2023-02-22 17:39:25,669][12689] Updated weights for policy 0, policy_version 880 (0.0021)
[2023-02-22 17:39:28,403][00589] Fps is (10 sec: 4505.8, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 3616768. Throughput: 0: 1033.8. Samples: 904136. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:39:28,412][00589] Avg episode reward: [(0, '24.280')]
[2023-02-22 17:39:33,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 3633152. Throughput: 0: 1031.5. Samples: 907670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:39:33,406][00589] Avg episode reward: [(0, '24.962')]
[2023-02-22 17:39:36,436][12689] Updated weights for policy 0, policy_version 890 (0.0017)
[2023-02-22 17:39:38,404][00589] Fps is (10 sec: 3276.5, 60 sec: 3959.4, 300 sec: 3873.9). Total num frames: 3649536. Throughput: 0: 968.3. Samples: 912084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:38,410][00589] Avg episode reward: [(0, '26.703')]
[2023-02-22 17:39:38,425][12675] Saving new best policy, reward=26.703!
[2023-02-22 17:39:43,403][00589] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3887.8). Total num frames: 3670016. Throughput: 0: 986.9. Samples: 917974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:39:43,406][00589] Avg episode reward: [(0, '27.663')]
[2023-02-22 17:39:43,414][12675] Saving new best policy, reward=27.663!
[2023-02-22 17:39:46,286][12689] Updated weights for policy 0, policy_version 900 (0.0015)
[2023-02-22 17:39:48,403][00589] Fps is (10 sec: 4506.0, 60 sec: 4027.7, 300 sec: 3915.5). Total num frames: 3694592. Throughput: 0: 1013.7. Samples: 921482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:39:48,406][00589] Avg episode reward: [(0, '26.223')]
[2023-02-22 17:39:53,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 3710976. Throughput: 0: 997.6. Samples: 927624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:39:53,406][00589] Avg episode reward: [(0, '25.515')]
[2023-02-22 17:39:58,062][12689] Updated weights for policy 0, policy_version 910 (0.0022)
[2023-02-22 17:39:58,403][00589] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3887.7). Total num frames: 3727360. Throughput: 0: 948.7. Samples: 932218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:39:58,405][00589] Avg episode reward: [(0, '26.117')]
[2023-02-22 17:39:58,424][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000910_3727360.pth...
[2023-02-22 17:39:58,545][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000683_2797568.pth
[2023-02-22 17:40:03,403][00589] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 3751936. Throughput: 0: 970.7. Samples: 935538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-22 17:40:03,406][00589] Avg episode reward: [(0, '25.489')]
[2023-02-22 17:40:06,510][12689] Updated weights for policy 0, policy_version 920 (0.0021)
[2023-02-22 17:40:08,409][00589] Fps is (10 sec: 4912.3, 60 sec: 4027.3, 300 sec: 3943.2). Total num frames: 3776512. Throughput: 0: 1020.1. Samples: 942786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:40:08,414][00589] Avg episode reward: [(0, '24.765')]
[2023-02-22 17:40:13,403][00589] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 3788800. Throughput: 0: 976.0. Samples: 948054. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:40:13,408][00589] Avg episode reward: [(0, '25.246')]
[2023-02-22 17:40:18,403][00589] Fps is (10 sec: 2868.9, 60 sec: 3891.2, 300 sec: 3887.8). Total num frames: 3805184. Throughput: 0: 946.7. Samples: 950272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:40:18,412][00589] Avg episode reward: [(0, '25.746')]
[2023-02-22 17:40:18,551][12689] Updated weights for policy 0, policy_version 930 (0.0018)
[2023-02-22 17:40:23,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 3829760. Throughput: 0: 992.9. Samples: 956764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:40:23,406][00589] Avg episode reward: [(0, '25.039')]
[2023-02-22 17:40:26,962][12689] Updated weights for policy 0, policy_version 940 (0.0020)
[2023-02-22 17:40:28,411][00589] Fps is (10 sec: 4911.4, 60 sec: 3959.0, 300 sec: 3929.3). Total num frames: 3854336. Throughput: 0: 1022.9. Samples: 964014. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:40:28,416][00589] Avg episode reward: [(0, '23.397')]
[2023-02-22 17:40:33,403][00589] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 3870720. Throughput: 0: 995.3. Samples: 966270. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-22 17:40:33,407][00589] Avg episode reward: [(0, '23.110')]
[2023-02-22 17:40:38,403][00589] Fps is (10 sec: 3279.4, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 3887104. Throughput: 0: 962.7. Samples: 970946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-22 17:40:38,412][00589] Avg episode reward: [(0, '22.880')]
[2023-02-22 17:40:38,742][12689] Updated weights for policy 0, policy_version 950 (0.0019)
[2023-02-22 17:40:43,403][00589] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 3911680. Throughput: 0: 1019.1. Samples: 978076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:40:43,412][00589] Avg episode reward: [(0, '22.442')]
[2023-02-22 17:40:47,941][12689] Updated weights for policy 0, policy_version 960 (0.0015)
[2023-02-22 17:40:48,404][00589] Fps is (10 sec: 4505.4, 60 sec: 3959.4, 300 sec: 3929.4). Total num frames: 3932160. Throughput: 0: 1023.6. Samples: 981600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-22 17:40:48,408][00589] Avg episode reward: [(0, '22.344')]
[2023-02-22 17:40:53,403][00589] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 3948544. Throughput: 0: 970.7. Samples: 986460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:40:53,409][00589] Avg episode reward: [(0, '22.692')]
[2023-02-22 17:40:58,403][00589] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3901.6). Total num frames: 3964928. Throughput: 0: 972.8. Samples: 991832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:40:58,405][00589] Avg episode reward: [(0, '22.651')]
[2023-02-22 17:40:59,451][12689] Updated weights for policy 0, policy_version 970 (0.0028)
[2023-02-22 17:41:03,403][00589] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3929.4). Total num frames: 3989504. Throughput: 0: 1001.8. Samples: 995354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-22 17:41:03,408][00589] Avg episode reward: [(0, '22.666')]
[2023-02-22 17:41:06,703][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-22 17:41:06,738][12675] Stopping Batcher_0...
[2023-02-22 17:41:06,739][12675] Loop batcher_evt_loop terminating...
[2023-02-22 17:41:06,739][00589] Component Batcher_0 stopped!
[2023-02-22 17:41:06,779][12689] Weights refcount: 2 0
[2023-02-22 17:41:06,796][12689] Stopping InferenceWorker_p0-w0...
[2023-02-22 17:41:06,799][12689] Loop inference_proc0-0_evt_loop terminating...
[2023-02-22 17:41:06,800][00589] Component InferenceWorker_p0-w0 stopped!
[2023-02-22 17:41:06,821][12697] Stopping RolloutWorker_w7...
[2023-02-22 17:41:06,822][12697] Loop rollout_proc7_evt_loop terminating...
[2023-02-22 17:41:06,823][00589] Component RolloutWorker_w7 stopped!
[2023-02-22 17:41:06,827][00589] Component RolloutWorker_w3 stopped!
[2023-02-22 17:41:06,830][12693] Stopping RolloutWorker_w3...
[2023-02-22 17:41:06,831][12693] Loop rollout_proc3_evt_loop terminating...
[2023-02-22 17:41:06,846][00589] Component RolloutWorker_w5 stopped!
[2023-02-22 17:41:06,848][12694] Stopping RolloutWorker_w5...
[2023-02-22 17:41:06,848][12694] Loop rollout_proc5_evt_loop terminating...
[2023-02-22 17:41:06,885][12690] Stopping RolloutWorker_w0...
[2023-02-22 17:41:06,885][12690] Loop rollout_proc0_evt_loop terminating...
[2023-02-22 17:41:06,885][00589] Component RolloutWorker_w0 stopped!
[2023-02-22 17:41:06,896][12695] Stopping RolloutWorker_w4...
[2023-02-22 17:41:06,897][12695] Loop rollout_proc4_evt_loop terminating...
[2023-02-22 17:41:06,896][00589] Component RolloutWorker_w4 stopped!
[2023-02-22 17:41:06,906][00589] Component RolloutWorker_w1 stopped!
[2023-02-22 17:41:06,908][12692] Stopping RolloutWorker_w1...
[2023-02-22 17:41:06,910][12692] Loop rollout_proc1_evt_loop terminating...
[2023-02-22 17:41:06,921][12691] Stopping RolloutWorker_w2...
[2023-02-22 17:41:06,922][12691] Loop rollout_proc2_evt_loop terminating...
[2023-02-22 17:41:06,921][00589] Component RolloutWorker_w2 stopped!
[2023-02-22 17:41:06,931][12675] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000795_3256320.pth
[2023-02-22 17:41:06,955][12675] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-22 17:41:06,964][12696] Stopping RolloutWorker_w6...
[2023-02-22 17:41:06,965][12696] Loop rollout_proc6_evt_loop terminating...
[2023-02-22 17:41:06,967][00589] Component RolloutWorker_w6 stopped!
[2023-02-22 17:41:07,274][12675] Stopping LearnerWorker_p0...
[2023-02-22 17:41:07,274][12675] Loop learner_proc0_evt_loop terminating...
[2023-02-22 17:41:07,274][00589] Component LearnerWorker_p0 stopped!
[2023-02-22 17:41:07,278][00589] Waiting for process learner_proc0 to stop...
[2023-02-22 17:41:09,520][00589] Waiting for process inference_proc0-0 to join...
[2023-02-22 17:41:10,179][00589] Waiting for process rollout_proc0 to join...
[2023-02-22 17:41:10,938][00589] Waiting for process rollout_proc1 to join...
[2023-02-22 17:41:10,939][00589] Waiting for process rollout_proc2 to join...
[2023-02-22 17:41:10,944][00589] Waiting for process rollout_proc3 to join...
[2023-02-22 17:41:10,947][00589] Waiting for process rollout_proc4 to join...
[2023-02-22 17:41:10,949][00589] Waiting for process rollout_proc5 to join...
[2023-02-22 17:41:10,956][00589] Waiting for process rollout_proc6 to join...
[2023-02-22 17:41:10,961][00589] Waiting for process rollout_proc7 to join...
[2023-02-22 17:41:10,962][00589] Batcher 0 profile tree view:
batching: 26.5266, releasing_batches: 0.0253
[2023-02-22 17:41:10,964][00589] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
  wait_policy_total: 534.0281
update_model: 8.4350
  weight_update: 0.0013
one_step: 0.0073
  handle_policy_step: 529.0341
    deserialize: 15.5449, stack: 3.1402, obs_to_device_normalize: 117.6378, forward: 252.0388, send_messages: 27.5362
    prepare_outputs: 85.9451
      to_cpu: 52.1010
[2023-02-22 17:41:10,965][00589] Learner 0 profile tree view:
misc: 0.0060, prepare_batch: 15.8904
train: 76.3720
  epoch_init: 0.0090, minibatch_init: 0.0130, losses_postprocess: 0.6074, kl_divergence: 0.6083, after_optimizer: 32.9428
  calculate_losses: 27.0177
    losses_init: 0.0039, forward_head: 1.8285, bptt_initial: 17.9685, tail: 1.0797, advantages_returns: 0.3301, losses: 3.2983
    bptt: 2.2186
      bptt_forward_core: 2.1128
  update: 14.5634
    clip: 1.4347
[2023-02-22 17:41:10,970][00589] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3836, enqueue_policy_requests: 139.7681, env_step: 844.0746, overhead: 20.7892, complete_rollouts: 6.9822
save_policy_outputs: 21.0669
  split_output_tensors: 9.8365
[2023-02-22 17:41:10,972][00589] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3352, enqueue_policy_requests: 143.5140, env_step: 839.4002, overhead: 20.7812, complete_rollouts: 6.9937
save_policy_outputs: 20.1290
  split_output_tensors: 9.9510
[2023-02-22 17:41:10,974][00589] Loop Runner_EvtLoop terminating...
[2023-02-22 17:41:10,975][00589] Runner profile tree view:
main_loop: 1144.4708
[2023-02-22 17:41:10,979][00589] Collected {0: 4005888}, FPS: 3500.2
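
For reference, a minimal sketch of the kind of Sample Factory 2.x training launch that produces a run like the one above: 8 rollout workers (matching rollout_proc0..rollout_proc7 in the shutdown messages) and roughly 4M environment steps (the run stops just past the 4,005,888 frames collected). The env name is inferred from the hf_repository recorded later in this log; parse_vizdoom_cfg is a notebook-style convenience wrapper, not a library API, and --num_envs_per_worker=4 is an assumption that cannot be recovered from the log itself.

# Hypothetical reconstruction of the training launch (Sample Factory 2.x API assumed).
from sample_factory.cfg.arguments import parse_full_cfg, parse_sf_args
from sample_factory.train import run_rl
from sf_examples.vizdoom.doom.doom_params import add_doom_env_args, doom_override_defaults
from sf_examples.vizdoom.train_vizdoom import register_vizdoom_components

def parse_vizdoom_cfg(argv=None, evaluation=False):
    # Standard two-pass Sample Factory config parsing with Doom-specific args/defaults.
    parser, _ = parse_sf_args(argv=argv, evaluation=evaluation)
    add_doom_env_args(parser)
    doom_override_defaults(parser)
    return parse_full_cfg(parser, argv)

register_vizdoom_components()
cfg = parse_vizdoom_cfg(argv=[
    "--env=doom_health_gathering_supreme",
    "--num_workers=8",                # matches rollout_proc0..rollout_proc7 above
    "--num_envs_per_worker=4",        # assumption, not recoverable from the log
    "--train_for_env_steps=4000000",  # the run stops just past 4,005,888 frames
])
status = run_rl(cfg)
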
[2023-02-22 17:54:21,737][00589] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-22 17:54:21,742][00589] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 17:54:21,747][00589] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 17:54:21,751][00589] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 17:54:21,753][00589] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 17:54:21,756][00589] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 17:54:21,760][00589] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 17:54:21,761][00589] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 17:54:21,762][00589] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-22 17:54:21,764][00589] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-22 17:54:21,765][00589] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 17:54:21,766][00589] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 17:54:21,770][00589] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 17:54:21,771][00589] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-22 17:54:21,772][00589] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-22 17:54:21,814][00589] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-22 17:54:21,818][00589] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 17:54:21,822][00589] RunningMeanStd input shape: (1,)
[2023-02-22 17:54:21,849][00589] ConvEncoder: input_channels=3
[2023-02-22 17:54:22,660][00589] Conv encoder output size: 512
[2023-02-22 17:54:22,664][00589] Policy head output size: 512
[2023-02-22 17:54:25,202][00589] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-22 17:54:26,493][00589] Num frames 100...
[2023-02-22 17:54:26,609][00589] Num frames 200...
[2023-02-22 17:54:26,728][00589] Num frames 300...
[2023-02-22 17:54:26,848][00589] Num frames 400...
[2023-02-22 17:54:26,963][00589] Num frames 500...
[2023-02-22 17:54:27,085][00589] Num frames 600...
[2023-02-22 17:54:27,211][00589] Num frames 700...
[2023-02-22 17:54:27,333][00589] Num frames 800...
[2023-02-22 17:54:27,475][00589] Num frames 900...
[2023-02-22 17:54:27,588][00589] Num frames 1000...
[2023-02-22 17:54:27,706][00589] Num frames 1100...
[2023-02-22 17:54:27,817][00589] Num frames 1200...
[2023-02-22 17:54:27,941][00589] Num frames 1300...
[2023-02-22 17:54:28,056][00589] Num frames 1400...
[2023-02-22 17:54:28,197][00589] Avg episode rewards: #0: 35.720, true rewards: #0: 14.720
[2023-02-22 17:54:28,202][00589] Avg episode reward: 35.720, avg true_objective: 14.720
[2023-02-22 17:54:28,240][00589] Num frames 1500...
[2023-02-22 17:54:28,369][00589] Num frames 1600...
[2023-02-22 17:54:28,510][00589] Num frames 1700...
[2023-02-22 17:54:28,626][00589] Num frames 1800...
[2023-02-22 17:54:28,740][00589] Num frames 1900...
[2023-02-22 17:54:28,862][00589] Num frames 2000...
[2023-02-22 17:54:28,977][00589] Num frames 2100...
[2023-02-22 17:54:29,103][00589] Num frames 2200...
[2023-02-22 17:54:29,218][00589] Num frames 2300...
[2023-02-22 17:54:29,336][00589] Num frames 2400...
[2023-02-22 17:54:29,460][00589] Num frames 2500...
[2023-02-22 17:54:29,581][00589] Num frames 2600...
[2023-02-22 17:54:29,663][00589] Avg episode rewards: #0: 32.120, true rewards: #0: 13.120
[2023-02-22 17:54:29,665][00589] Avg episode reward: 32.120, avg true_objective: 13.120
[2023-02-22 17:54:29,753][00589] Num frames 2700...
[2023-02-22 17:54:29,879][00589] Num frames 2800...
[2023-02-22 17:54:29,997][00589] Num frames 2900...
[2023-02-22 17:54:30,111][00589] Num frames 3000...
[2023-02-22 17:54:30,227][00589] Num frames 3100...
[2023-02-22 17:54:30,337][00589] Num frames 3200...
[2023-02-22 17:54:30,436][00589] Avg episode rewards: #0: 25.107, true rewards: #0: 10.773
[2023-02-22 17:54:30,438][00589] Avg episode reward: 25.107, avg true_objective: 10.773
[2023-02-22 17:54:30,524][00589] Num frames 3300...
[2023-02-22 17:54:30,643][00589] Num frames 3400...
[2023-02-22 17:54:30,759][00589] Num frames 3500...
[2023-02-22 17:54:30,876][00589] Num frames 3600...
[2023-02-22 17:54:31,000][00589] Num frames 3700...
[2023-02-22 17:54:31,121][00589] Num frames 3800...
[2023-02-22 17:54:31,187][00589] Avg episode rewards: #0: 21.270, true rewards: #0: 9.520
[2023-02-22 17:54:31,191][00589] Avg episode reward: 21.270, avg true_objective: 9.520
[2023-02-22 17:54:31,305][00589] Num frames 3900...
[2023-02-22 17:54:31,432][00589] Num frames 4000...
[2023-02-22 17:54:31,549][00589] Num frames 4100...
[2023-02-22 17:54:31,668][00589] Num frames 4200...
[2023-02-22 17:54:31,788][00589] Num frames 4300...
[2023-02-22 17:54:31,905][00589] Num frames 4400...
[2023-02-22 17:54:32,015][00589] Avg episode rewards: #0: 19.296, true rewards: #0: 8.896
[2023-02-22 17:54:32,017][00589] Avg episode reward: 19.296, avg true_objective: 8.896
[2023-02-22 17:54:32,081][00589] Num frames 4500...
[2023-02-22 17:54:32,211][00589] Num frames 4600...
[2023-02-22 17:54:32,372][00589] Num frames 4700...
[2023-02-22 17:54:32,530][00589] Num frames 4800...
[2023-02-22 17:54:32,683][00589] Num frames 4900...
[2023-02-22 17:54:32,836][00589] Num frames 5000...
[2023-02-22 17:54:32,992][00589] Num frames 5100...
[2023-02-22 17:54:33,146][00589] Num frames 5200...
[2023-02-22 17:54:33,311][00589] Num frames 5300...
[2023-02-22 17:54:33,478][00589] Num frames 5400...
[2023-02-22 17:54:33,634][00589] Num frames 5500...
[2023-02-22 17:54:33,700][00589] Avg episode rewards: #0: 20.007, true rewards: #0: 9.173
[2023-02-22 17:54:33,703][00589] Avg episode reward: 20.007, avg true_objective: 9.173
[2023-02-22 17:54:33,854][00589] Num frames 5600...
[2023-02-22 17:54:34,003][00589] Num frames 5700...
[2023-02-22 17:54:34,162][00589] Num frames 5800...
[2023-02-22 17:54:34,320][00589] Num frames 5900...
[2023-02-22 17:54:34,505][00589] Avg episode rewards: #0: 18.549, true rewards: #0: 8.549
[2023-02-22 17:54:34,507][00589] Avg episode reward: 18.549, avg true_objective: 8.549
[2023-02-22 17:54:34,538][00589] Num frames 6000...
[2023-02-22 17:54:34,690][00589] Num frames 6100...
[2023-02-22 17:54:34,843][00589] Num frames 6200...
[2023-02-22 17:54:34,996][00589] Num frames 6300...
[2023-02-22 17:54:35,151][00589] Num frames 6400...
[2023-02-22 17:54:35,311][00589] Num frames 6500...
[2023-02-22 17:54:35,477][00589] Num frames 6600...
[2023-02-22 17:54:35,645][00589] Num frames 6700...
[2023-02-22 17:54:35,807][00589] Num frames 6800...
[2023-02-22 17:54:35,944][00589] Num frames 6900...
[2023-02-22 17:54:36,057][00589] Num frames 7000...
[2023-02-22 17:54:36,173][00589] Num frames 7100...
[2023-02-22 17:54:36,292][00589] Num frames 7200...
[2023-02-22 17:54:36,414][00589] Num frames 7300...
[2023-02-22 17:54:36,529][00589] Num frames 7400...
[2023-02-22 17:54:36,646][00589] Num frames 7500...
[2023-02-22 17:54:36,756][00589] Num frames 7600...
[2023-02-22 17:54:36,877][00589] Num frames 7700...
[2023-02-22 17:54:36,997][00589] Num frames 7800...
[2023-02-22 17:54:37,129][00589] Num frames 7900...
[2023-02-22 17:54:37,251][00589] Num frames 8000...
[2023-02-22 17:54:37,351][00589] Avg episode rewards: #0: 24.174, true rewards: #0: 10.049
[2023-02-22 17:54:37,353][00589] Avg episode reward: 24.174, avg true_objective: 10.049
[2023-02-22 17:54:37,430][00589] Num frames 8100...
[2023-02-22 17:54:37,548][00589] Num frames 8200...
[2023-02-22 17:54:37,665][00589] Num frames 8300...
[2023-02-22 17:54:37,776][00589] Num frames 8400...
[2023-02-22 17:54:37,886][00589] Num frames 8500...
[2023-02-22 17:54:38,008][00589] Num frames 8600...
[2023-02-22 17:54:38,124][00589] Num frames 8700...
[2023-02-22 17:54:38,269][00589] Avg episode rewards: #0: 23.194, true rewards: #0: 9.750
[2023-02-22 17:54:38,271][00589] Avg episode reward: 23.194, avg true_objective: 9.750
[2023-02-22 17:54:38,302][00589] Num frames 8800...
[2023-02-22 17:54:38,422][00589] Num frames 8900...
[2023-02-22 17:54:38,541][00589] Num frames 9000...
[2023-02-22 17:54:38,661][00589] Num frames 9100...
[2023-02-22 17:54:38,782][00589] Num frames 9200...
[2023-02-22 17:54:38,895][00589] Num frames 9300...
[2023-02-22 17:54:39,009][00589] Num frames 9400...
[2023-02-22 17:54:39,121][00589] Num frames 9500...
[2023-02-22 17:54:39,241][00589] Num frames 9600...
[2023-02-22 17:54:39,363][00589] Num frames 9700...
[2023-02-22 17:54:39,538][00589] Avg episode rewards: #0: 22.799, true rewards: #0: 9.799
[2023-02-22 17:54:39,539][00589] Avg episode reward: 22.799, avg true_objective: 9.799
[2023-02-22 17:54:39,544][00589] Num frames 9800...
[2023-02-22 17:55:39,215][00589] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
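
A sketch of the enjoy call that matches the overrides logged at the start of the evaluation session above (num_workers=1, no_render, save_video, max_num_episodes=10), reusing the hypothetical parse_vizdoom_cfg wrapper from the training sketch:

from sample_factory.enjoy import enjoy

# Flags mirror the 'Overriding arg' / 'Adding new argument' lines in this session.
cfg = parse_vizdoom_cfg(argv=[
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--save_video",
    "--no_render",
    "--max_num_episodes=10",
], evaluation=True)
status = enjoy(cfg)  # loads the newest checkpoint and writes replay.mp4
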
[2023-02-22 17:57:09,588][00589] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-22 17:57:09,591][00589] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-22 17:57:09,592][00589] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-22 17:57:09,594][00589] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-22 17:57:09,595][00589] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-22 17:57:09,597][00589] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-22 17:57:09,606][00589] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-22 17:57:09,607][00589] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-22 17:57:09,608][00589] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-22 17:57:09,609][00589] Adding new argument 'hf_repository'='CoreyMorris/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-22 17:57:09,610][00589] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-22 17:57:09,611][00589] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-22 17:57:09,612][00589] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-22 17:57:09,614][00589] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-22 17:57:09,615][00589] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-22 17:57:09,639][00589] RunningMeanStd input shape: (3, 72, 128)
[2023-02-22 17:57:09,642][00589] RunningMeanStd input shape: (1,)
[2023-02-22 17:57:09,656][00589] ConvEncoder: input_channels=3
[2023-02-22 17:57:09,695][00589] Conv encoder output size: 512
[2023-02-22 17:57:09,696][00589] Policy head output size: 512
[2023-02-22 17:57:09,717][00589] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-22 17:57:10,177][00589] Num frames 100...
[2023-02-22 17:57:10,290][00589] Num frames 200...
[2023-02-22 17:57:10,420][00589] Num frames 300...
[2023-02-22 17:57:10,571][00589] Num frames 400...
[2023-02-22 17:57:10,723][00589] Num frames 500...
[2023-02-22 17:57:10,801][00589] Avg episode rewards: #0: 7.120, true rewards: #0: 5.120
[2023-02-22 17:57:10,803][00589] Avg episode reward: 7.120, avg true_objective: 5.120
[2023-02-22 17:57:10,932][00589] Num frames 600...
[2023-02-22 17:57:11,081][00589] Num frames 700...
[2023-02-22 17:57:11,238][00589] Num frames 800...
[2023-02-22 17:57:11,391][00589] Num frames 900...
[2023-02-22 17:57:11,548][00589] Num frames 1000...
[2023-02-22 17:57:11,708][00589] Num frames 1100...
[2023-02-22 17:57:11,866][00589] Num frames 1200...
[2023-02-22 17:57:12,026][00589] Num frames 1300...
[2023-02-22 17:57:12,176][00589] Num frames 1400...
[2023-02-22 17:57:12,337][00589] Num frames 1500...
[2023-02-22 17:57:12,505][00589] Num frames 1600...
[2023-02-22 17:57:12,678][00589] Num frames 1700...
[2023-02-22 17:57:12,841][00589] Num frames 1800...
[2023-02-22 17:57:12,998][00589] Num frames 1900...
[2023-02-22 17:57:13,154][00589] Num frames 2000...
[2023-02-22 17:57:13,313][00589] Num frames 2100...
[2023-02-22 17:57:13,475][00589] Num frames 2200...
[2023-02-22 17:57:13,648][00589] Num frames 2300...
[2023-02-22 17:57:13,808][00589] Num frames 2400...
[2023-02-22 17:57:13,984][00589] Num frames 2500...
[2023-02-22 17:57:14,139][00589] Num frames 2600...
[2023-02-22 17:57:14,210][00589] Avg episode rewards: #0: 34.559, true rewards: #0: 13.060
[2023-02-22 17:57:14,213][00589] Avg episode reward: 34.559, avg true_objective: 13.060
[2023-02-22 17:57:14,314][00589] Num frames 2700...
[2023-02-22 17:57:14,438][00589] Num frames 2800...
[2023-02-22 17:57:14,554][00589] Num frames 2900...
[2023-02-22 17:57:14,673][00589] Num frames 3000...
[2023-02-22 17:57:14,782][00589] Num frames 3100...
[2023-02-22 17:57:14,895][00589] Num frames 3200...
[2023-02-22 17:57:15,019][00589] Num frames 3300...
[2023-02-22 17:57:15,135][00589] Num frames 3400...
[2023-02-22 17:57:15,245][00589] Num frames 3500...
[2023-02-22 17:57:15,357][00589] Num frames 3600...
[2023-02-22 17:57:15,453][00589] Avg episode rewards: #0: 29.776, true rewards: #0: 12.110
[2023-02-22 17:57:15,455][00589] Avg episode reward: 29.776, avg true_objective: 12.110
[2023-02-22 17:57:15,536][00589] Num frames 3700...
[2023-02-22 17:57:15,665][00589] Num frames 3800...
[2023-02-22 17:57:15,787][00589] Num frames 3900...
[2023-02-22 17:57:15,920][00589] Num frames 4000...
[2023-02-22 17:57:16,034][00589] Num frames 4100...
[2023-02-22 17:57:16,151][00589] Num frames 4200...
[2023-02-22 17:57:16,265][00589] Num frames 4300...
[2023-02-22 17:57:16,384][00589] Num frames 4400...
[2023-02-22 17:57:16,480][00589] Avg episode rewards: #0: 26.082, true rewards: #0: 11.083
[2023-02-22 17:57:16,482][00589] Avg episode reward: 26.082, avg true_objective: 11.083
[2023-02-22 17:57:16,558][00589] Num frames 4500...
[2023-02-22 17:57:16,680][00589] Num frames 4600...
[2023-02-22 17:57:16,800][00589] Num frames 4700...
[2023-02-22 17:57:16,918][00589] Num frames 4800...
[2023-02-22 17:57:17,035][00589] Num frames 4900...
[2023-02-22 17:57:17,148][00589] Num frames 5000...
[2023-02-22 17:57:17,263][00589] Num frames 5100...
[2023-02-22 17:57:17,375][00589] Num frames 5200...
[2023-02-22 17:57:17,494][00589] Num frames 5300...
[2023-02-22 17:57:17,565][00589] Avg episode rewards: #0: 25.024, true rewards: #0: 10.624
[2023-02-22 17:57:17,567][00589] Avg episode reward: 25.024, avg true_objective: 10.624
[2023-02-22 17:57:17,675][00589] Num frames 5400...
[2023-02-22 17:57:17,798][00589] Num frames 5500...
[2023-02-22 17:57:17,919][00589] Num frames 5600...
[2023-02-22 17:57:18,033][00589] Num frames 5700...
[2023-02-22 17:57:18,156][00589] Num frames 5800...
[2023-02-22 17:57:18,272][00589] Num frames 5900...
[2023-02-22 17:57:18,391][00589] Num frames 6000...
[2023-02-22 17:57:18,512][00589] Num frames 6100...
[2023-02-22 17:57:18,632][00589] Num frames 6200...
[2023-02-22 17:57:18,755][00589] Num frames 6300...
[2023-02-22 17:57:18,873][00589] Num frames 6400...
[2023-02-22 17:57:18,992][00589] Num frames 6500...
[2023-02-22 17:57:19,106][00589] Num frames 6600...
[2023-02-22 17:57:19,225][00589] Num frames 6700...
[2023-02-22 17:57:19,337][00589] Num frames 6800...
[2023-02-22 17:57:19,449][00589] Num frames 6900...
[2023-02-22 17:57:19,558][00589] Num frames 7000...
[2023-02-22 17:57:19,692][00589] Avg episode rewards: #0: 28.453, true rewards: #0: 11.787
[2023-02-22 17:57:19,696][00589] Avg episode reward: 28.453, avg true_objective: 11.787
[2023-02-22 17:57:19,728][00589] Num frames 7100...
[2023-02-22 17:57:19,838][00589] Num frames 7200...
[2023-02-22 17:57:19,947][00589] Num frames 7300...
[2023-02-22 17:57:20,064][00589] Num frames 7400...
[2023-02-22 17:57:20,177][00589] Num frames 7500...
[2023-02-22 17:57:20,304][00589] Num frames 7600...
[2023-02-22 17:57:20,427][00589] Num frames 7700...
[2023-02-22 17:57:20,543][00589] Num frames 7800...
[2023-02-22 17:57:20,659][00589] Num frames 7900...
[2023-02-22 17:57:20,783][00589] Num frames 8000...
[2023-02-22 17:57:20,900][00589] Num frames 8100...
[2023-02-22 17:57:21,016][00589] Num frames 8200...
[2023-02-22 17:57:21,135][00589] Num frames 8300...
[2023-02-22 17:57:21,254][00589] Num frames 8400...
[2023-02-22 17:57:21,367][00589] Num frames 8500...
[2023-02-22 17:57:21,489][00589] Num frames 8600...
[2023-02-22 17:57:21,601][00589] Num frames 8700...
[2023-02-22 17:57:21,726][00589] Num frames 8800...
[2023-02-22 17:57:21,839][00589] Num frames 8900...
[2023-02-22 17:57:21,954][00589] Num frames 9000...
[2023-02-22 17:57:22,113][00589] Avg episode rewards: #0: 31.268, true rewards: #0: 12.983
[2023-02-22 17:57:22,116][00589] Avg episode reward: 31.268, avg true_objective: 12.983
[2023-02-22 17:57:22,142][00589] Num frames 9100...
[2023-02-22 17:57:22,260][00589] Num frames 9200...
[2023-02-22 17:57:22,375][00589] Num frames 9300...
[2023-02-22 17:57:22,496][00589] Num frames 9400...
[2023-02-22 17:57:22,614][00589] Num frames 9500...
[2023-02-22 17:57:22,737][00589] Num frames 9600...
[2023-02-22 17:57:22,854][00589] Num frames 9700...
[2023-02-22 17:57:22,978][00589] Num frames 9800...
[2023-02-22 17:57:23,093][00589] Num frames 9900...
[2023-02-22 17:57:23,213][00589] Num frames 10000...
[2023-02-22 17:57:23,327][00589] Num frames 10100...
[2023-02-22 17:57:23,453][00589] Num frames 10200...
[2023-02-22 17:57:23,557][00589] Avg episode rewards: #0: 30.800, true rewards: #0: 12.800
[2023-02-22 17:57:23,558][00589] Avg episode reward: 30.800, avg true_objective: 12.800
[2023-02-22 17:57:23,628][00589] Num frames 10300...
[2023-02-22 17:57:23,745][00589] Num frames 10400...
[2023-02-22 17:57:23,855][00589] Num frames 10500...
[2023-02-22 17:57:23,967][00589] Num frames 10600...
[2023-02-22 17:57:24,086][00589] Num frames 10700...
[2023-02-22 17:57:24,248][00589] Num frames 10800...
[2023-02-22 17:57:24,403][00589] Num frames 10900...
[2023-02-22 17:57:24,565][00589] Num frames 11000...
[2023-02-22 17:57:24,718][00589] Num frames 11100...
[2023-02-22 17:57:24,872][00589] Num frames 11200...
[2023-02-22 17:57:25,092][00589] Avg episode rewards: #0: 30.106, true rewards: #0: 12.551
[2023-02-22 17:57:25,095][00589] Avg episode reward: 30.106, avg true_objective: 12.551
[2023-02-22 17:57:25,107][00589] Num frames 11300...
[2023-02-22 17:57:25,265][00589] Num frames 11400...
[2023-02-22 17:57:25,422][00589] Num frames 11500...
[2023-02-22 17:57:25,572][00589] Num frames 11600...
[2023-02-22 17:57:25,728][00589] Num frames 11700...
[2023-02-22 17:57:25,883][00589] Num frames 11800...
[2023-02-22 17:57:26,047][00589] Num frames 11900...
[2023-02-22 17:57:26,201][00589] Num frames 12000...
[2023-02-22 17:57:26,358][00589] Num frames 12100...
[2023-02-22 17:57:26,511][00589] Avg episode rewards: #0: 28.860, true rewards: #0: 12.160
[2023-02-22 17:57:26,513][00589] Avg episode reward: 28.860, avg true_objective: 12.160
[2023-02-22 17:58:41,829][00589] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
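
Finally, a sketch of the push-to-hub variant matching the second evaluation session above (push_to_hub=True, hf_repository='CoreyMorris/rl_course_vizdoom_health_gathering_supreme', max_num_frames=100000), again assuming the same hypothetical wrapper:

from sample_factory.enjoy import enjoy

# Every flag below appears verbatim in the config lines of the final session.
cfg = parse_vizdoom_cfg(argv=[
    "--env=doom_health_gathering_supreme",
    "--num_workers=1",
    "--save_video",
    "--no_render",
    "--push_to_hub",
    "--hf_repository=CoreyMorris/rl_course_vizdoom_health_gathering_supreme",
    "--max_num_episodes=10",
    "--max_num_frames=100000",
], evaluation=True)
status = enjoy(cfg)  # evaluates 10 episodes, saves replay.mp4, then uploads to the Hub
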