[2023-02-20 21:06:54,627][00458] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-20 21:06:54,630][00458] Rollout worker 0 uses device cpu
[2023-02-20 21:06:54,632][00458] Rollout worker 1 uses device cpu
[2023-02-20 21:06:54,633][00458] Rollout worker 2 uses device cpu
[2023-02-20 21:06:54,635][00458] Rollout worker 3 uses device cpu
[2023-02-20 21:06:54,636][00458] Rollout worker 4 uses device cpu
[2023-02-20 21:06:54,637][00458] Rollout worker 5 uses device cpu
[2023-02-20 21:06:54,638][00458] Rollout worker 6 uses device cpu
[2023-02-20 21:06:54,639][00458] Rollout worker 7 uses device cpu
[2023-02-20 21:06:54,826][00458] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-20 21:06:54,828][00458] InferenceWorker_p0-w0: min num requests: 2
[2023-02-20 21:06:54,860][00458] Starting all processes...
[2023-02-20 21:06:54,861][00458] Starting process learner_proc0
[2023-02-20 21:06:54,919][00458] Starting all processes...
[2023-02-20 21:06:54,927][00458] Starting process inference_proc0-0
[2023-02-20 21:06:54,927][00458] Starting process rollout_proc0
[2023-02-20 21:06:54,928][00458] Starting process rollout_proc1
[2023-02-20 21:06:54,928][00458] Starting process rollout_proc2
[2023-02-20 21:06:54,928][00458] Starting process rollout_proc3
[2023-02-20 21:06:54,928][00458] Starting process rollout_proc4
[2023-02-20 21:06:54,932][00458] Starting process rollout_proc5
[2023-02-20 21:06:54,956][00458] Starting process rollout_proc6
[2023-02-20 21:06:54,968][00458] Starting process rollout_proc7
[2023-02-20 21:07:06,836][10885] Worker 0 uses CPU cores [0]
[2023-02-20 21:07:06,847][10870] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-20 21:07:06,855][10870] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-20 21:07:06,966][10886] Worker 1 uses CPU cores [1]
[2023-02-20 21:07:07,000][10889] Worker 4 uses CPU cores [0]
[2023-02-20 21:07:07,055][10891] Worker 7 uses CPU cores [1]
[2023-02-20 21:07:07,111][10887] Worker 2 uses CPU cores [0]
[2023-02-20 21:07:07,144][10884] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-20 21:07:07,144][10884] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-20 21:07:07,163][10888] Worker 3 uses CPU cores [1]
[2023-02-20 21:07:07,200][10890] Worker 5 uses CPU cores [1]
[2023-02-20 21:07:07,202][10892] Worker 6 uses CPU cores [0]
[2023-02-20 21:07:07,663][10870] Num visible devices: 1
[2023-02-20 21:07:07,663][10884] Num visible devices: 1
[2023-02-20 21:07:07,682][10870] Starting seed is not provided
[2023-02-20 21:07:07,682][10870] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-20 21:07:07,683][10870] Initializing actor-critic model on device cuda:0
[2023-02-20 21:07:07,684][10870] RunningMeanStd input shape: (3, 72, 128)
[2023-02-20 21:07:07,685][10870] RunningMeanStd input shape: (1,)
[2023-02-20 21:07:07,697][10870] ConvEncoder: input_channels=3
[2023-02-20 21:07:07,989][10870] Conv encoder output size: 512
[2023-02-20 21:07:07,989][10870] Policy head output size: 512
[2023-02-20 21:07:08,036][10870] Created Actor Critic model with architecture:
[2023-02-20 21:07:08,036][10870] ActorCriticSharedWeights(
(obs_normalizer): ObservationNormalizer(
(running_mean_std): RunningMeanStdDictInPlace(
(running_mean_std): ModuleDict(
(obs): RunningMeanStdInPlace()
)
)
)
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
(encoder): VizdoomEncoder(
(basic_encoder): ConvEncoder(
(enc): RecursiveScriptModule(
original_name=ConvEncoderImpl
(conv_head): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=ELU)
(2): RecursiveScriptModule(original_name=Conv2d)
(3): RecursiveScriptModule(original_name=ELU)
(4): RecursiveScriptModule(original_name=Conv2d)
(5): RecursiveScriptModule(original_name=ELU)
)
(mlp_layers): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Linear)
(1): RecursiveScriptModule(original_name=ELU)
)
)
)
)
(core): ModelCoreRNN(
(core): GRU(512, 512)
)
(decoder): MlpDecoder(
(mlp): Identity()
)
(critic_linear): Linear(in_features=512, out_features=1, bias=True)
(action_parameterization): ActionParameterizationDefault(
(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
)
)
[2023-02-20 21:07:14,819][00458] Heartbeat connected on Batcher_0
[2023-02-20 21:07:14,827][00458] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-20 21:07:14,836][00458] Heartbeat connected on RolloutWorker_w0
[2023-02-20 21:07:14,840][00458] Heartbeat connected on RolloutWorker_w1
[2023-02-20 21:07:14,843][00458] Heartbeat connected on RolloutWorker_w2
[2023-02-20 21:07:14,846][00458] Heartbeat connected on RolloutWorker_w3
[2023-02-20 21:07:14,849][00458] Heartbeat connected on RolloutWorker_w4
[2023-02-20 21:07:14,853][00458] Heartbeat connected on RolloutWorker_w5
[2023-02-20 21:07:14,855][00458] Heartbeat connected on RolloutWorker_w6
[2023-02-20 21:07:14,861][00458] Heartbeat connected on RolloutWorker_w7
[2023-02-20 21:07:15,534][10870] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-20 21:07:15,535][10870] No checkpoints found
[2023-02-20 21:07:15,535][10870] Did not load from checkpoint, starting from scratch!
[2023-02-20 21:07:15,535][10870] Initialized policy 0 weights for model version 0
[2023-02-20 21:07:15,543][10870] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-20 21:07:15,552][10870] LearnerWorker_p0 finished initialization!
[2023-02-20 21:07:15,556][00458] Heartbeat connected on LearnerWorker_p0
[2023-02-20 21:07:15,734][10884] RunningMeanStd input shape: (3, 72, 128)
[2023-02-20 21:07:15,736][10884] RunningMeanStd input shape: (1,)
[2023-02-20 21:07:15,755][10884] ConvEncoder: input_channels=3
[2023-02-20 21:07:15,924][10884] Conv encoder output size: 512
[2023-02-20 21:07:15,925][10884] Policy head output size: 512
[2023-02-20 21:07:18,342][00458] Inference worker 0-0 is ready!
[2023-02-20 21:07:18,344][00458] All inference workers are ready! Signal rollout workers to start!
[2023-02-20 21:07:18,479][10891] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,480][10886] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,490][10888] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,499][10890] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,500][10892] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,507][10885] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,506][10887] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:18,517][10889] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:07:19,393][10889] Decorrelating experience for 0 frames...
[2023-02-20 21:07:19,390][10885] Decorrelating experience for 0 frames...
[2023-02-20 21:07:19,659][10891] Decorrelating experience for 0 frames...
[2023-02-20 21:07:19,662][10886] Decorrelating experience for 0 frames...
[2023-02-20 21:07:19,664][10888] Decorrelating experience for 0 frames...
[2023-02-20 21:07:20,303][10885] Decorrelating experience for 32 frames...
[2023-02-20 21:07:20,316][10889] Decorrelating experience for 32 frames...
[2023-02-20 21:07:20,406][00458] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-20 21:07:20,703][10887] Decorrelating experience for 0 frames...
[2023-02-20 21:07:20,957][10889] Decorrelating experience for 64 frames...
[2023-02-20 21:07:21,026][10886] Decorrelating experience for 32 frames...
[2023-02-20 21:07:21,028][10891] Decorrelating experience for 32 frames...
[2023-02-20 21:07:21,032][10890] Decorrelating experience for 0 frames...
[2023-02-20 21:07:21,035][10888] Decorrelating experience for 32 frames...
[2023-02-20 21:07:21,706][10885] Decorrelating experience for 64 frames...
[2023-02-20 21:07:21,799][10892] Decorrelating experience for 0 frames...
[2023-02-20 21:07:22,187][10890] Decorrelating experience for 32 frames...
[2023-02-20 21:07:22,366][10886] Decorrelating experience for 64 frames...
[2023-02-20 21:07:22,375][10891] Decorrelating experience for 64 frames...
[2023-02-20 21:07:22,460][10889] Decorrelating experience for 96 frames...
[2023-02-20 21:07:22,532][10885] Decorrelating experience for 96 frames...
[2023-02-20 21:07:23,023][10892] Decorrelating experience for 32 frames...
[2023-02-20 21:07:23,323][10888] Decorrelating experience for 64 frames...
[2023-02-20 21:07:23,422][10891] Decorrelating experience for 96 frames...
[2023-02-20 21:07:23,921][10892] Decorrelating experience for 64 frames...
[2023-02-20 21:07:23,994][10890] Decorrelating experience for 64 frames...
[2023-02-20 21:07:24,326][10888] Decorrelating experience for 96 frames...
[2023-02-20 21:07:24,542][10892] Decorrelating experience for 96 frames...
[2023-02-20 21:07:24,904][10890] Decorrelating experience for 96 frames...
[2023-02-20 21:07:25,096][10886] Decorrelating experience for 96 frames...
[2023-02-20 21:07:25,407][00458] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-20 21:07:25,443][10887] Decorrelating experience for 32 frames...
[2023-02-20 21:07:25,776][10887] Decorrelating experience for 64 frames...
[2023-02-20 21:07:26,084][10887] Decorrelating experience for 96 frames...
[2023-02-20 21:07:30,407][00458] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 45.0. Samples: 450. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-20 21:07:30,413][00458] Avg episode reward: [(0, '1.138')]
[2023-02-20 21:07:31,346][10870] Signal inference workers to stop experience collection...
[2023-02-20 21:07:31,368][10884] InferenceWorker_p0-w0: stopping experience collection
[2023-02-20 21:07:33,844][10870] Signal inference workers to resume experience collection...
[2023-02-20 21:07:33,845][10884] InferenceWorker_p0-w0: resuming experience collection
[2023-02-20 21:07:35,406][00458] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 145.1. Samples: 2176. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-20 21:07:35,409][00458] Avg episode reward: [(0, '2.206')]
[2023-02-20 21:07:40,406][00458] Fps is (10 sec: 2867.2, 60 sec: 1433.6, 300 sec: 1433.6). Total num frames: 28672. Throughput: 0: 327.6. Samples: 6552. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2023-02-20 21:07:40,409][00458] Avg episode reward: [(0, '3.704')]
[2023-02-20 21:07:43,305][10884] Updated weights for policy 0, policy_version 10 (0.0015)
[2023-02-20 21:07:45,410][00458] Fps is (10 sec: 4094.4, 60 sec: 1802.0, 300 sec: 1802.0). Total num frames: 45056. Throughput: 0: 508.5. Samples: 12714. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-20 21:07:45,413][00458] Avg episode reward: [(0, '4.332')]
[2023-02-20 21:07:50,406][00458] Fps is (10 sec: 2867.2, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 57344. Throughput: 0: 490.0. Samples: 14700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:07:50,410][00458] Avg episode reward: [(0, '4.398')]
[2023-02-20 21:07:55,408][00458] Fps is (10 sec: 2458.2, 60 sec: 1989.4, 300 sec: 1989.4). Total num frames: 69632. Throughput: 0: 516.4. Samples: 18074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:07:55,415][00458] Avg episode reward: [(0, '4.518')]
[2023-02-20 21:07:57,621][10884] Updated weights for policy 0, policy_version 20 (0.0039)
[2023-02-20 21:08:00,407][00458] Fps is (10 sec: 2867.1, 60 sec: 2150.4, 300 sec: 2150.4). Total num frames: 86016. Throughput: 0: 576.5. Samples: 23060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:00,418][00458] Avg episode reward: [(0, '4.479')]
[2023-02-20 21:08:05,410][00458] Fps is (10 sec: 3276.2, 60 sec: 2275.4, 300 sec: 2275.4). Total num frames: 102400. Throughput: 0: 556.4. Samples: 25038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:08:05,413][00458] Avg episode reward: [(0, '4.534')]
[2023-02-20 21:08:05,422][10870] Saving new best policy, reward=4.534!
[2023-02-20 21:08:10,406][00458] Fps is (10 sec: 2457.7, 60 sec: 2211.8, 300 sec: 2211.8). Total num frames: 110592. Throughput: 0: 629.9. Samples: 28346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:10,409][00458] Avg episode reward: [(0, '4.411')]
[2023-02-20 21:08:13,845][10884] Updated weights for policy 0, policy_version 30 (0.0016)
[2023-02-20 21:08:15,408][00458] Fps is (10 sec: 2458.0, 60 sec: 2308.6, 300 sec: 2308.6). Total num frames: 126976. Throughput: 0: 704.3. Samples: 32146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:15,417][00458] Avg episode reward: [(0, '4.565')]
[2023-02-20 21:08:15,423][10870] Saving new best policy, reward=4.565!
[2023-02-20 21:08:20,407][00458] Fps is (10 sec: 3276.7, 60 sec: 2389.3, 300 sec: 2389.3). Total num frames: 143360. Throughput: 0: 715.2. Samples: 34360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:20,412][00458] Avg episode reward: [(0, '4.429')]
[2023-02-20 21:08:25,161][10884] Updated weights for policy 0, policy_version 40 (0.0029)
[2023-02-20 21:08:25,406][00458] Fps is (10 sec: 3687.1, 60 sec: 2730.7, 300 sec: 2520.6). Total num frames: 163840. Throughput: 0: 756.7. Samples: 40604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:25,414][00458] Avg episode reward: [(0, '4.470')]
[2023-02-20 21:08:30,406][00458] Fps is (10 sec: 3686.5, 60 sec: 3003.7, 300 sec: 2574.6). Total num frames: 180224. Throughput: 0: 737.1. Samples: 45880. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:08:30,411][00458] Avg episode reward: [(0, '4.544')]
[2023-02-20 21:08:35,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2566.8). Total num frames: 192512. Throughput: 0: 737.3. Samples: 47878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:35,416][00458] Avg episode reward: [(0, '4.520')]
[2023-02-20 21:08:38,697][10884] Updated weights for policy 0, policy_version 50 (0.0022)
[2023-02-20 21:08:40,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3003.7, 300 sec: 2611.2). Total num frames: 208896. Throughput: 0: 761.8. Samples: 52356. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:08:40,414][00458] Avg episode reward: [(0, '4.554')]
[2023-02-20 21:08:45,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3072.2, 300 sec: 2698.5). Total num frames: 229376. Throughput: 0: 792.0. Samples: 58700. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:08:45,409][00458] Avg episode reward: [(0, '4.472')]
[2023-02-20 21:08:48,568][10884] Updated weights for policy 0, policy_version 60 (0.0027)
[2023-02-20 21:08:50,408][00458] Fps is (10 sec: 4095.3, 60 sec: 3208.4, 300 sec: 2776.1). Total num frames: 249856. Throughput: 0: 817.9. Samples: 61840. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:08:50,415][00458] Avg episode reward: [(0, '4.392')]
[2023-02-20 21:08:50,424][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth...
[2023-02-20 21:08:55,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 2759.4). Total num frames: 262144. Throughput: 0: 836.0. Samples: 65964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:08:55,411][00458] Avg episode reward: [(0, '4.358')]
[2023-02-20 21:09:00,406][00458] Fps is (10 sec: 2867.7, 60 sec: 3208.6, 300 sec: 2785.3). Total num frames: 278528. Throughput: 0: 853.2. Samples: 70540. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:09:00,408][00458] Avg episode reward: [(0, '4.374')]
[2023-02-20 21:09:01,918][10884] Updated weights for policy 0, policy_version 70 (0.0031)
[2023-02-20 21:09:05,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3277.0, 300 sec: 2847.7). Total num frames: 299008. Throughput: 0: 876.6. Samples: 73808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:09:05,411][00458] Avg episode reward: [(0, '4.447')]
[2023-02-20 21:09:10,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2904.4). Total num frames: 319488. Throughput: 0: 884.4. Samples: 80402. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-20 21:09:10,409][00458] Avg episode reward: [(0, '4.508')]
[2023-02-20 21:09:12,393][10884] Updated weights for policy 0, policy_version 80 (0.0033)
[2023-02-20 21:09:15,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 2920.6). Total num frames: 335872. Throughput: 0: 857.6. Samples: 84470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:09:15,411][00458] Avg episode reward: [(0, '4.470')]
[2023-02-20 21:09:20,407][00458] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 2901.3). Total num frames: 348160. Throughput: 0: 860.7. Samples: 86608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:09:20,414][00458] Avg episode reward: [(0, '4.394')]
[2023-02-20 21:09:24,415][10884] Updated weights for policy 0, policy_version 90 (0.0030)
[2023-02-20 21:09:25,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 2981.9). Total num frames: 372736. Throughput: 0: 897.1. Samples: 92726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:09:25,413][00458] Avg episode reward: [(0, '4.284')]
[2023-02-20 21:09:30,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3024.7). Total num frames: 393216. Throughput: 0: 901.2. Samples: 99252. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:09:30,416][00458] Avg episode reward: [(0, '4.581')]
[2023-02-20 21:09:30,428][10870] Saving new best policy, reward=4.581!
[2023-02-20 21:09:35,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3003.7). Total num frames: 405504. Throughput: 0: 874.8. Samples: 101206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:09:35,412][00458] Avg episode reward: [(0, '4.443')]
[2023-02-20 21:09:35,942][10884] Updated weights for policy 0, policy_version 100 (0.0036)
[2023-02-20 21:09:40,409][00458] Fps is (10 sec: 2866.4, 60 sec: 3549.7, 300 sec: 3013.4). Total num frames: 421888. Throughput: 0: 875.2. Samples: 105350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:09:40,414][00458] Avg episode reward: [(0, '4.560')]
[2023-02-20 21:09:45,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3050.8). Total num frames: 442368. Throughput: 0: 906.5. Samples: 111332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:09:45,418][00458] Avg episode reward: [(0, '4.486')]
[2023-02-20 21:09:47,088][10884] Updated weights for policy 0, policy_version 110 (0.0041)
[2023-02-20 21:09:50,410][00458] Fps is (10 sec: 4095.8, 60 sec: 3549.8, 300 sec: 3085.6). Total num frames: 462848. Throughput: 0: 908.3. Samples: 114686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:09:50,414][00458] Avg episode reward: [(0, '4.446')]
[2023-02-20 21:09:55,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3091.8). Total num frames: 479232. Throughput: 0: 876.3. Samples: 119834. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:09:55,409][00458] Avg episode reward: [(0, '4.444')]
[2023-02-20 21:09:59,783][10884] Updated weights for policy 0, policy_version 120 (0.0021)
[2023-02-20 21:10:00,408][00458] Fps is (10 sec: 2867.9, 60 sec: 3549.8, 300 sec: 3072.0). Total num frames: 491520. Throughput: 0: 878.4. Samples: 123998. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:10:00,414][00458] Avg episode reward: [(0, '4.434')]
[2023-02-20 21:10:05,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3103.0). Total num frames: 512000. Throughput: 0: 891.9. Samples: 126742. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-20 21:10:05,408][00458] Avg episode reward: [(0, '4.325')]
[2023-02-20 21:10:09,647][10884] Updated weights for policy 0, policy_version 130 (0.0014)
[2023-02-20 21:10:10,406][00458] Fps is (10 sec: 4096.5, 60 sec: 3549.9, 300 sec: 3132.2). Total num frames: 532480. Throughput: 0: 904.7. Samples: 133436. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:10:10,409][00458] Avg episode reward: [(0, '4.497')]
[2023-02-20 21:10:15,409][00458] Fps is (10 sec: 3685.6, 60 sec: 3549.7, 300 sec: 3136.3). Total num frames: 548864. Throughput: 0: 872.4. Samples: 138514. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:10:15,415][00458] Avg episode reward: [(0, '4.578')]
[2023-02-20 21:10:20,407][00458] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3140.3). Total num frames: 565248. Throughput: 0: 874.0. Samples: 140536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:10:20,412][00458] Avg episode reward: [(0, '4.576')]
[2023-02-20 21:10:22,848][10884] Updated weights for policy 0, policy_version 140 (0.0043)
[2023-02-20 21:10:25,406][00458] Fps is (10 sec: 3277.6, 60 sec: 3481.6, 300 sec: 3144.0). Total num frames: 581632. Throughput: 0: 891.3. Samples: 145454. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:10:25,409][00458] Avg episode reward: [(0, '4.483')]
[2023-02-20 21:10:30,406][00458] Fps is (10 sec: 4096.4, 60 sec: 3549.9, 300 sec: 3190.6). Total num frames: 606208. Throughput: 0: 908.0. Samples: 152190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:10:30,413][00458] Avg episode reward: [(0, '4.197')]
[2023-02-20 21:10:32,029][10884] Updated weights for policy 0, policy_version 150 (0.0015)
[2023-02-20 21:10:35,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3192.8). Total num frames: 622592. Throughput: 0: 897.9. Samples: 155088. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:10:35,409][00458] Avg episode reward: [(0, '4.266')]
[2023-02-20 21:10:40,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3550.0, 300 sec: 3174.4). Total num frames: 634880. Throughput: 0: 877.9. Samples: 159340. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:10:40,409][00458] Avg episode reward: [(0, '4.345')]
[2023-02-20 21:10:45,266][10884] Updated weights for policy 0, policy_version 160 (0.0025)
[2023-02-20 21:10:45,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3196.9). Total num frames: 655360. Throughput: 0: 901.7. Samples: 164572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:10:45,412][00458] Avg episode reward: [(0, '4.457')]
[2023-02-20 21:10:50,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3550.1, 300 sec: 3218.3). Total num frames: 675840. Throughput: 0: 914.9. Samples: 167914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:10:50,412][00458] Avg episode reward: [(0, '4.606')]
[2023-02-20 21:10:50,424][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000165_675840.pth...
[2023-02-20 21:10:50,544][10870] Saving new best policy, reward=4.606!
[2023-02-20 21:10:55,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3219.6). Total num frames: 692224. Throughput: 0: 893.4. Samples: 173638. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:10:55,411][00458] Avg episode reward: [(0, '4.440')]
[2023-02-20 21:10:55,494][10884] Updated weights for policy 0, policy_version 170 (0.0026)
[2023-02-20 21:11:00,407][00458] Fps is (10 sec: 3276.6, 60 sec: 3618.2, 300 sec: 3220.9). Total num frames: 708608. Throughput: 0: 871.9. Samples: 177748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:11:00,414][00458] Avg episode reward: [(0, '4.408')]
[2023-02-20 21:11:05,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3222.2). Total num frames: 724992. Throughput: 0: 873.4. Samples: 179838. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:11:05,410][00458] Avg episode reward: [(0, '4.444')]
[2023-02-20 21:11:08,000][10884] Updated weights for policy 0, policy_version 180 (0.0036)
[2023-02-20 21:11:10,406][00458] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3241.2). Total num frames: 745472. Throughput: 0: 907.2. Samples: 186278. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:11:10,409][00458] Avg episode reward: [(0, '4.657')]
[2023-02-20 21:11:10,425][10870] Saving new best policy, reward=4.657!
[2023-02-20 21:11:15,408][00458] Fps is (10 sec: 4095.5, 60 sec: 3618.2, 300 sec: 3259.4). Total num frames: 765952. Throughput: 0: 889.6. Samples: 192224. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:11:15,412][00458] Avg episode reward: [(0, '4.671')]
[2023-02-20 21:11:15,416][10870] Saving new best policy, reward=4.671!
[2023-02-20 21:11:19,635][10884] Updated weights for policy 0, policy_version 190 (0.0019)
[2023-02-20 21:11:20,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3242.7). Total num frames: 778240. Throughput: 0: 870.9. Samples: 194280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:11:20,411][00458] Avg episode reward: [(0, '4.766')]
[2023-02-20 21:11:20,427][10870] Saving new best policy, reward=4.766!
[2023-02-20 21:11:25,407][00458] Fps is (10 sec: 2867.5, 60 sec: 3549.8, 300 sec: 3243.4). Total num frames: 794624. Throughput: 0: 870.4. Samples: 198510. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:11:25,413][00458] Avg episode reward: [(0, '4.871')]
[2023-02-20 21:11:25,416][10870] Saving new best policy, reward=4.871!
[2023-02-20 21:11:30,406][00458] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3260.4). Total num frames: 815104. Throughput: 0: 899.9. Samples: 205068. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:11:30,408][00458] Avg episode reward: [(0, '4.662')]
[2023-02-20 21:11:30,584][10884] Updated weights for policy 0, policy_version 200 (0.0018)
[2023-02-20 21:11:35,410][00458] Fps is (10 sec: 4094.7, 60 sec: 3549.7, 300 sec: 3276.8). Total num frames: 835584. Throughput: 0: 899.9. Samples: 208414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:11:35,413][00458] Avg episode reward: [(0, '4.366')]
[2023-02-20 21:11:40,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3276.8). Total num frames: 851968. Throughput: 0: 879.7. Samples: 213224. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:11:40,417][00458] Avg episode reward: [(0, '4.468')]
[2023-02-20 21:11:42,864][10884] Updated weights for policy 0, policy_version 210 (0.0022)
[2023-02-20 21:11:45,406][00458] Fps is (10 sec: 2868.2, 60 sec: 3481.6, 300 sec: 3261.3). Total num frames: 864256. Throughput: 0: 885.0. Samples: 217574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:11:45,414][00458] Avg episode reward: [(0, '4.493')]
[2023-02-20 21:11:50,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3292.0). Total num frames: 888832. Throughput: 0: 911.6. Samples: 220858. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-20 21:11:50,413][00458] Avg episode reward: [(0, '4.502')]
[2023-02-20 21:11:52,743][10884] Updated weights for policy 0, policy_version 220 (0.0029)
[2023-02-20 21:11:55,407][00458] Fps is (10 sec: 4505.5, 60 sec: 3618.1, 300 sec: 3306.6). Total num frames: 909312. Throughput: 0: 921.1. Samples: 227728. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-20 21:11:55,416][00458] Avg episode reward: [(0, '4.634')]
[2023-02-20 21:12:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3306.1). Total num frames: 925696. Throughput: 0: 891.7. Samples: 232350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:12:00,409][00458] Avg episode reward: [(0, '4.563')]
[2023-02-20 21:12:05,406][00458] Fps is (10 sec: 2867.3, 60 sec: 3549.9, 300 sec: 3291.2). Total num frames: 937984. Throughput: 0: 893.0. Samples: 234464. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:12:05,413][00458] Avg episode reward: [(0, '4.531')]
[2023-02-20 21:12:05,603][10884] Updated weights for policy 0, policy_version 230 (0.0024)
[2023-02-20 21:12:10,407][00458] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3319.2). Total num frames: 962560. Throughput: 0: 932.2. Samples: 240458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:12:10,409][00458] Avg episode reward: [(0, '4.659')]
[2023-02-20 21:12:14,520][10884] Updated weights for policy 0, policy_version 240 (0.0013)
[2023-02-20 21:12:15,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 3332.3). Total num frames: 983040. Throughput: 0: 937.7. Samples: 247266. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:15,408][00458] Avg episode reward: [(0, '4.463')]
[2023-02-20 21:12:20,409][00458] Fps is (10 sec: 3276.0, 60 sec: 3618.0, 300 sec: 3374.0). Total num frames: 995328. Throughput: 0: 906.1. Samples: 249186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:20,416][00458] Avg episode reward: [(0, '4.358')]
[2023-02-20 21:12:25,410][00458] Fps is (10 sec: 2456.7, 60 sec: 3549.7, 300 sec: 3415.6). Total num frames: 1007616. Throughput: 0: 872.5. Samples: 252490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:25,421][00458] Avg episode reward: [(0, '4.430')]
[2023-02-20 21:12:30,406][00458] Fps is (10 sec: 2458.3, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1019904. Throughput: 0: 855.2. Samples: 256060. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:30,412][00458] Avg episode reward: [(0, '4.613')]
[2023-02-20 21:12:31,152][10884] Updated weights for policy 0, policy_version 250 (0.0028)
[2023-02-20 21:12:35,406][00458] Fps is (10 sec: 3277.9, 60 sec: 3413.5, 300 sec: 3429.5). Total num frames: 1040384. Throughput: 0: 842.2. Samples: 258758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:35,409][00458] Avg episode reward: [(0, '4.813')]
[2023-02-20 21:12:40,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3443.5). Total num frames: 1060864. Throughput: 0: 841.3. Samples: 265588. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:12:40,412][00458] Avg episode reward: [(0, '4.519')]
[2023-02-20 21:12:40,533][10884] Updated weights for policy 0, policy_version 260 (0.0025)
[2023-02-20 21:12:45,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 1077248. Throughput: 0: 853.0. Samples: 270734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:12:45,416][00458] Avg episode reward: [(0, '4.601')]
[2023-02-20 21:12:50,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 1093632. Throughput: 0: 851.9. Samples: 272798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:12:50,412][00458] Avg episode reward: [(0, '4.691')]
[2023-02-20 21:12:50,431][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000267_1093632.pth...
[2023-02-20 21:12:50,603][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth
[2023-02-20 21:12:53,705][10884] Updated weights for policy 0, policy_version 270 (0.0014)
[2023-02-20 21:12:55,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3471.2). Total num frames: 1110016. Throughput: 0: 832.5. Samples: 277918. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:12:55,417][00458] Avg episode reward: [(0, '4.922')]
[2023-02-20 21:12:55,420][10870] Saving new best policy, reward=4.922!
[2023-02-20 21:13:00,407][00458] Fps is (10 sec: 4095.8, 60 sec: 3481.6, 300 sec: 3499.0). Total num frames: 1134592. Throughput: 0: 826.7. Samples: 284466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:13:00,410][00458] Avg episode reward: [(0, '4.956')]
[2023-02-20 21:13:00,420][10870] Saving new best policy, reward=4.956!
[2023-02-20 21:13:03,276][10884] Updated weights for policy 0, policy_version 280 (0.0020)
[2023-02-20 21:13:05,407][00458] Fps is (10 sec: 4095.9, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 1150976. Throughput: 0: 848.9. Samples: 287382. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:13:05,411][00458] Avg episode reward: [(0, '4.834')]
[2023-02-20 21:13:10,406][00458] Fps is (10 sec: 3276.9, 60 sec: 3413.4, 300 sec: 3526.7). Total num frames: 1167360. Throughput: 0: 868.1. Samples: 291550. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:13:10,413][00458] Avg episode reward: [(0, '5.015')]
[2023-02-20 21:13:10,429][10870] Saving new best policy, reward=5.015!
[2023-02-20 21:13:15,408][00458] Fps is (10 sec: 3276.4, 60 sec: 3345.0, 300 sec: 3526.7). Total num frames: 1183744. Throughput: 0: 903.2. Samples: 296706. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:13:15,410][00458] Avg episode reward: [(0, '5.079')]
[2023-02-20 21:13:15,419][10870] Saving new best policy, reward=5.079!
[2023-02-20 21:13:16,051][10884] Updated weights for policy 0, policy_version 290 (0.0024)
[2023-02-20 21:13:20,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.8, 300 sec: 3526.7). Total num frames: 1204224. Throughput: 0: 916.4. Samples: 299994. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:13:20,412][00458] Avg episode reward: [(0, '5.003')]
[2023-02-20 21:13:25,406][00458] Fps is (10 sec: 4096.6, 60 sec: 3618.3, 300 sec: 3540.6). Total num frames: 1224704. Throughput: 0: 905.7. Samples: 306344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:13:25,409][00458] Avg episode reward: [(0, '5.022')]
[2023-02-20 21:13:26,392][10884] Updated weights for policy 0, policy_version 300 (0.0013)
[2023-02-20 21:13:30,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1236992. Throughput: 0: 887.0. Samples: 310650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:13:30,414][00458] Avg episode reward: [(0, '5.257')]
[2023-02-20 21:13:30,422][10870] Saving new best policy, reward=5.257!
[2023-02-20 21:13:35,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1253376. Throughput: 0: 885.9. Samples: 312664. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:13:35,412][00458] Avg episode reward: [(0, '5.877')]
[2023-02-20 21:13:35,416][10870] Saving new best policy, reward=5.877!
[2023-02-20 21:13:38,418][10884] Updated weights for policy 0, policy_version 310 (0.0021)
[2023-02-20 21:13:40,407][00458] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1277952. Throughput: 0: 916.1. Samples: 319144. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:13:40,409][00458] Avg episode reward: [(0, '5.823')]
[2023-02-20 21:13:45,409][00458] Fps is (10 sec: 4504.6, 60 sec: 3686.3, 300 sec: 3554.5). Total num frames: 1298432. Throughput: 0: 906.2. Samples: 325248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:13:45,415][00458] Avg episode reward: [(0, '5.677')]
[2023-02-20 21:13:49,474][10884] Updated weights for policy 0, policy_version 320 (0.0017)
[2023-02-20 21:13:50,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1310720. Throughput: 0: 889.9. Samples: 327426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:13:50,409][00458] Avg episode reward: [(0, '6.037')]
[2023-02-20 21:13:50,423][10870] Saving new best policy, reward=6.037!
[2023-02-20 21:13:55,407][00458] Fps is (10 sec: 2867.8, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1327104. Throughput: 0: 888.5. Samples: 331534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:13:55,413][00458] Avg episode reward: [(0, '6.323')]
[2023-02-20 21:13:55,418][10870] Saving new best policy, reward=6.323!
[2023-02-20 21:14:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1347584. Throughput: 0: 923.1. Samples: 338246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:00,408][00458] Avg episode reward: [(0, '6.038')]
[2023-02-20 21:14:00,586][10884] Updated weights for policy 0, policy_version 330 (0.0013)
[2023-02-20 21:14:05,406][00458] Fps is (10 sec: 4505.8, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1372160. Throughput: 0: 924.4. Samples: 341592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:14:05,413][00458] Avg episode reward: [(0, '6.027')]
[2023-02-20 21:14:10,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1384448. Throughput: 0: 888.9. Samples: 346346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:10,411][00458] Avg episode reward: [(0, '5.971')]
[2023-02-20 21:14:12,674][10884] Updated weights for policy 0, policy_version 340 (0.0032)
[2023-02-20 21:14:15,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3618.2, 300 sec: 3568.4). Total num frames: 1400832. Throughput: 0: 889.3. Samples: 350668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:14:15,415][00458] Avg episode reward: [(0, '6.061')]
[2023-02-20 21:14:20,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1421312. Throughput: 0: 920.4. Samples: 354080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:20,409][00458] Avg episode reward: [(0, '6.335')]
[2023-02-20 21:14:20,433][10870] Saving new best policy, reward=6.335!
[2023-02-20 21:14:22,781][10884] Updated weights for policy 0, policy_version 350 (0.0016)
[2023-02-20 21:14:25,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1441792. Throughput: 0: 922.8. Samples: 360668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:25,410][00458] Avg episode reward: [(0, '6.414')]
[2023-02-20 21:14:25,414][10870] Saving new best policy, reward=6.414!
[2023-02-20 21:14:30,411][00458] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3568.4). Total num frames: 1458176. Throughput: 0: 887.3. Samples: 365174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:30,414][00458] Avg episode reward: [(0, '6.148')]
[2023-02-20 21:14:35,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1470464. Throughput: 0: 885.6. Samples: 367280. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:14:35,412][00458] Avg episode reward: [(0, '5.491')]
[2023-02-20 21:14:36,090][10884] Updated weights for policy 0, policy_version 360 (0.0040)
[2023-02-20 21:14:40,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1490944. Throughput: 0: 922.0. Samples: 373024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:40,411][00458] Avg episode reward: [(0, '5.368')]
[2023-02-20 21:14:45,113][10884] Updated weights for policy 0, policy_version 370 (0.0017)
[2023-02-20 21:14:45,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3568.4). Total num frames: 1515520. Throughput: 0: 921.7. Samples: 379724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:14:45,409][00458] Avg episode reward: [(0, '5.917')]
[2023-02-20 21:14:50,412][00458] Fps is (10 sec: 4093.7, 60 sec: 3686.1, 300 sec: 3568.3). Total num frames: 1531904. Throughput: 0: 898.0. Samples: 382006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:14:50,418][00458] Avg episode reward: [(0, '6.127')]
[2023-02-20 21:14:50,438][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000374_1531904.pth...
[2023-02-20 21:14:50,603][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000165_675840.pth
[2023-02-20 21:14:55,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1544192. Throughput: 0: 884.4. Samples: 386144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:14:55,411][00458] Avg episode reward: [(0, '6.565')]
[2023-02-20 21:14:55,415][10870] Saving new best policy, reward=6.565!
[2023-02-20 21:14:58,528][10884] Updated weights for policy 0, policy_version 380 (0.0017)
[2023-02-20 21:15:00,406][00458] Fps is (10 sec: 3278.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1564672. Throughput: 0: 915.2. Samples: 391850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:15:00,408][00458] Avg episode reward: [(0, '6.586')]
[2023-02-20 21:15:00,423][10870] Saving new best policy, reward=6.586!
[2023-02-20 21:15:05,407][00458] Fps is (10 sec: 4095.9, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 1585152. Throughput: 0: 912.8. Samples: 395156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:05,410][00458] Avg episode reward: [(0, '6.647')]
[2023-02-20 21:15:05,417][10870] Saving new best policy, reward=6.647!
[2023-02-20 21:15:08,345][10884] Updated weights for policy 0, policy_version 390 (0.0021)
[2023-02-20 21:15:10,410][00458] Fps is (10 sec: 3685.1, 60 sec: 3617.9, 300 sec: 3568.4). Total num frames: 1601536. Throughput: 0: 889.7. Samples: 400708. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:15:10,413][00458] Avg episode reward: [(0, '6.427')]
[2023-02-20 21:15:15,408][00458] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1617920. Throughput: 0: 885.8. Samples: 405038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:15,411][00458] Avg episode reward: [(0, '6.654')]
[2023-02-20 21:15:15,422][10870] Saving new best policy, reward=6.654!
[2023-02-20 21:15:20,406][00458] Fps is (10 sec: 3277.9, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1634304. Throughput: 0: 894.2. Samples: 407520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:15:20,409][00458] Avg episode reward: [(0, '6.351')]
[2023-02-20 21:15:20,780][10884] Updated weights for policy 0, policy_version 400 (0.0023)
[2023-02-20 21:15:25,406][00458] Fps is (10 sec: 4096.5, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1658880. Throughput: 0: 915.1. Samples: 414202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:25,410][00458] Avg episode reward: [(0, '6.013')]
[2023-02-20 21:15:30,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1675264. Throughput: 0: 893.2. Samples: 419920. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:30,413][00458] Avg episode reward: [(0, '6.044')]
[2023-02-20 21:15:30,955][10884] Updated weights for policy 0, policy_version 410 (0.0033)
[2023-02-20 21:15:35,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 1691648. Throughput: 0: 889.0. Samples: 422004. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:15:35,409][00458] Avg episode reward: [(0, '6.250')]
[2023-02-20 21:15:40,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1708032. Throughput: 0: 899.9. Samples: 426638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:15:40,414][00458] Avg episode reward: [(0, '6.477')]
[2023-02-20 21:15:42,986][10884] Updated weights for policy 0, policy_version 420 (0.0021)
[2023-02-20 21:15:45,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1728512. Throughput: 0: 924.5. Samples: 433452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:45,408][00458] Avg episode reward: [(0, '6.786')]
[2023-02-20 21:15:45,415][10870] Saving new best policy, reward=6.786!
[2023-02-20 21:15:50,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3618.5, 300 sec: 3582.3). Total num frames: 1748992. Throughput: 0: 922.7. Samples: 436676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:50,414][00458] Avg episode reward: [(0, '7.529')]
[2023-02-20 21:15:50,433][10870] Saving new best policy, reward=7.529!
[2023-02-20 21:15:54,637][10884] Updated weights for policy 0, policy_version 430 (0.0028)
[2023-02-20 21:15:55,407][00458] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1761280. Throughput: 0: 889.1. Samples: 440714. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:15:55,414][00458] Avg episode reward: [(0, '7.601')]
[2023-02-20 21:15:55,422][10870] Saving new best policy, reward=7.601!
[2023-02-20 21:16:00,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1777664. Throughput: 0: 894.8. Samples: 445304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:16:00,409][00458] Avg episode reward: [(0, '7.574')]
[2023-02-20 21:16:05,407][00458] Fps is (10 sec: 3686.7, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1798144. Throughput: 0: 913.2. Samples: 448612. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:16:05,414][00458] Avg episode reward: [(0, '7.366')]
[2023-02-20 21:16:05,640][10884] Updated weights for policy 0, policy_version 440 (0.0020)
[2023-02-20 21:16:10,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3618.3, 300 sec: 3568.4). Total num frames: 1818624. Throughput: 0: 914.4. Samples: 455348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:16:10,416][00458] Avg episode reward: [(0, '8.252')]
[2023-02-20 21:16:10,428][10870] Saving new best policy, reward=8.252!
[2023-02-20 21:16:15,407][00458] Fps is (10 sec: 3686.2, 60 sec: 3618.2, 300 sec: 3582.3). Total num frames: 1835008. Throughput: 0: 881.3. Samples: 459580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:15,412][00458] Avg episode reward: [(0, '8.803')]
[2023-02-20 21:16:15,421][10870] Saving new best policy, reward=8.803!
[2023-02-20 21:16:18,083][10884] Updated weights for policy 0, policy_version 450 (0.0017)
[2023-02-20 21:16:20,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1847296. Throughput: 0: 880.9. Samples: 461646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:20,414][00458] Avg episode reward: [(0, '9.694')]
[2023-02-20 21:16:20,432][10870] Saving new best policy, reward=9.694!
[2023-02-20 21:16:25,406][00458] Fps is (10 sec: 3686.7, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 1871872. Throughput: 0: 913.8. Samples: 467758. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:25,413][00458] Avg episode reward: [(0, '11.071')]
[2023-02-20 21:16:25,417][10870] Saving new best policy, reward=11.071!
[2023-02-20 21:16:27,927][10884] Updated weights for policy 0, policy_version 460 (0.0017)
[2023-02-20 21:16:30,407][00458] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 1892352. Throughput: 0: 905.4. Samples: 474196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:30,409][00458] Avg episode reward: [(0, '11.505')]
[2023-02-20 21:16:30,422][10870] Saving new best policy, reward=11.505!
[2023-02-20 21:16:35,408][00458] Fps is (10 sec: 3276.3, 60 sec: 3549.8, 300 sec: 3568.4). Total num frames: 1904640. Throughput: 0: 868.0. Samples: 475738. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:35,415][00458] Avg episode reward: [(0, '11.352')]
[2023-02-20 21:16:40,406][00458] Fps is (10 sec: 2048.0, 60 sec: 3413.3, 300 sec: 3554.5). Total num frames: 1912832. Throughput: 0: 853.1. Samples: 479102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:16:40,409][00458] Avg episode reward: [(0, '10.609')]
[2023-02-20 21:16:44,130][10884] Updated weights for policy 0, policy_version 470 (0.0048)
[2023-02-20 21:16:45,406][00458] Fps is (10 sec: 2048.3, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 1925120. Throughput: 0: 828.3. Samples: 482576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:45,409][00458] Avg episode reward: [(0, '10.485')]
[2023-02-20 21:16:50,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3512.8). Total num frames: 1945600. Throughput: 0: 812.3. Samples: 485166. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:16:50,415][00458] Avg episode reward: [(0, '10.270')]
[2023-02-20 21:16:50,428][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000475_1945600.pth...
[2023-02-20 21:16:50,553][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000267_1093632.pth
[2023-02-20 21:16:54,675][10884] Updated weights for policy 0, policy_version 480 (0.0034)
[2023-02-20 21:16:55,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3526.7). Total num frames: 1966080. Throughput: 0: 808.6. Samples: 491734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:16:55,413][00458] Avg episode reward: [(0, '9.898')]
[2023-02-20 21:17:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3540.6). Total num frames: 1982464. Throughput: 0: 828.8. Samples: 496876. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-20 21:17:00,409][00458] Avg episode reward: [(0, '10.448')]
[2023-02-20 21:17:05,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3512.8). Total num frames: 1998848. Throughput: 0: 828.1. Samples: 498910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:17:05,409][00458] Avg episode reward: [(0, '11.015')]
[2023-02-20 21:17:07,950][10884] Updated weights for policy 0, policy_version 490 (0.0039)
[2023-02-20 21:17:10,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 2015232. Throughput: 0: 802.1. Samples: 503854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:17:10,414][00458] Avg episode reward: [(0, '11.204')]
[2023-02-20 21:17:15,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3540.6). Total num frames: 2039808. Throughput: 0: 804.1. Samples: 510382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:17:15,412][00458] Avg episode reward: [(0, '12.105')]
[2023-02-20 21:17:15,416][10870] Saving new best policy, reward=12.105!
[2023-02-20 21:17:17,423][10884] Updated weights for policy 0, policy_version 500 (0.0013)
[2023-02-20 21:17:20,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3554.5). Total num frames: 2056192. Throughput: 0: 836.3. Samples: 513368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:17:20,413][00458] Avg episode reward: [(0, '11.872')]
[2023-02-20 21:17:25,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3554.5). Total num frames: 2068480. Throughput: 0: 853.2. Samples: 517498. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:17:25,411][00458] Avg episode reward: [(0, '11.425')]
[2023-02-20 21:17:30,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3540.6). Total num frames: 2084864. Throughput: 0: 885.6. Samples: 522426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:17:30,414][00458] Avg episode reward: [(0, '12.223')]
[2023-02-20 21:17:30,425][10870] Saving new best policy, reward=12.223!
[2023-02-20 21:17:30,899][10884] Updated weights for policy 0, policy_version 510 (0.0029)
[2023-02-20 21:17:35,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3540.6). Total num frames: 2105344. Throughput: 0: 899.2. Samples: 525628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:17:35,412][00458] Avg episode reward: [(0, '11.944')]
[2023-02-20 21:17:40,408][00458] Fps is (10 sec: 4095.5, 60 sec: 3549.8, 300 sec: 3554.5). Total num frames: 2125824. Throughput: 0: 890.9. Samples: 531826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:17:40,410][00458] Avg episode reward: [(0, '13.352')]
[2023-02-20 21:17:40,423][10870] Saving new best policy, reward=13.352!
[2023-02-20 21:17:41,014][10884] Updated weights for policy 0, policy_version 520 (0.0014)
[2023-02-20 21:17:45,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 2138112. Throughput: 0: 868.8. Samples: 535972. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:17:45,408][00458] Avg episode reward: [(0, '14.886')]
[2023-02-20 21:17:45,475][10870] Saving new best policy, reward=14.886!
[2023-02-20 21:17:50,406][00458] Fps is (10 sec: 2867.5, 60 sec: 3481.6, 300 sec: 3540.6). Total num frames: 2154496. Throughput: 0: 866.7. Samples: 537912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:17:50,413][00458] Avg episode reward: [(0, '14.886')]
[2023-02-20 21:17:53,644][10884] Updated weights for policy 0, policy_version 530 (0.0020)
[2023-02-20 21:17:55,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2174976. Throughput: 0: 894.8. Samples: 544122. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:17:55,409][00458] Avg episode reward: [(0, '15.759')]
[2023-02-20 21:17:55,412][10870] Saving new best policy, reward=15.759!
[2023-02-20 21:18:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 2191360. Throughput: 0: 864.5. Samples: 549286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:00,415][00458] Avg episode reward: [(0, '16.393')]
[2023-02-20 21:18:00,427][10870] Saving new best policy, reward=16.393!
[2023-02-20 21:18:05,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2199552. Throughput: 0: 820.5. Samples: 550290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:18:05,411][00458] Avg episode reward: [(0, '16.204')]
[2023-02-20 21:18:09,139][10884] Updated weights for policy 0, policy_version 540 (0.0040)
[2023-02-20 21:18:10,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2215936. Throughput: 0: 805.3. Samples: 553736. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:18:10,408][00458] Avg episode reward: [(0, '15.928')]
[2023-02-20 21:18:15,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3499.0). Total num frames: 2236416. Throughput: 0: 839.5. Samples: 560202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:15,408][00458] Avg episode reward: [(0, '16.190')]
[2023-02-20 21:18:18,595][10884] Updated weights for policy 0, policy_version 550 (0.0015)
[2023-02-20 21:18:20,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2256896. Throughput: 0: 843.5. Samples: 563586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:20,413][00458] Avg episode reward: [(0, '17.849')]
[2023-02-20 21:18:20,430][10870] Saving new best policy, reward=17.849!
[2023-02-20 21:18:25,408][00458] Fps is (10 sec: 3685.9, 60 sec: 3413.3, 300 sec: 3512.8). Total num frames: 2273280. Throughput: 0: 810.6. Samples: 568304. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:25,410][00458] Avg episode reward: [(0, '18.728')]
[2023-02-20 21:18:25,417][10870] Saving new best policy, reward=18.728!
[2023-02-20 21:18:30,407][00458] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3499.0). Total num frames: 2285568. Throughput: 0: 807.0. Samples: 572288. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:18:30,410][00458] Avg episode reward: [(0, '19.044')]
[2023-02-20 21:18:30,423][10870] Saving new best policy, reward=19.044!
[2023-02-20 21:18:32,155][10884] Updated weights for policy 0, policy_version 560 (0.0013)
[2023-02-20 21:18:35,406][00458] Fps is (10 sec: 3277.3, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2306048. Throughput: 0: 832.5. Samples: 575376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:35,414][00458] Avg episode reward: [(0, '19.778')]
[2023-02-20 21:18:35,419][10870] Saving new best policy, reward=19.778!
[2023-02-20 21:18:40,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2326528. Throughput: 0: 841.2. Samples: 581978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:18:40,411][00458] Avg episode reward: [(0, '20.306')]
[2023-02-20 21:18:40,441][10870] Saving new best policy, reward=20.306!
[2023-02-20 21:18:41,735][10884] Updated weights for policy 0, policy_version 570 (0.0038)
[2023-02-20 21:18:45,407][00458] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3499.0). Total num frames: 2342912. Throughput: 0: 829.6. Samples: 586620. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:18:45,420][00458] Avg episode reward: [(0, '20.148')]
[2023-02-20 21:18:50,407][00458] Fps is (10 sec: 2867.1, 60 sec: 3345.0, 300 sec: 3485.1). Total num frames: 2355200. Throughput: 0: 853.7. Samples: 588708. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:18:50,413][00458] Avg episode reward: [(0, '21.256')]
[2023-02-20 21:18:50,433][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000575_2355200.pth...
[2023-02-20 21:18:50,580][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000374_1531904.pth
[2023-02-20 21:18:50,598][10870] Saving new best policy, reward=21.256!
[2023-02-20 21:18:54,741][10884] Updated weights for policy 0, policy_version 580 (0.0018)
[2023-02-20 21:18:55,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3485.1). Total num frames: 2375680. Throughput: 0: 895.4. Samples: 594030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:18:55,408][00458] Avg episode reward: [(0, '20.754')]
[2023-02-20 21:19:00,406][00458] Fps is (10 sec: 4096.2, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 2396160. Throughput: 0: 896.9. Samples: 600564. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:19:00,413][00458] Avg episode reward: [(0, '20.831')]
[2023-02-20 21:19:05,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2412544. Throughput: 0: 879.8. Samples: 603178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:19:05,411][00458] Avg episode reward: [(0, '22.330')]
[2023-02-20 21:19:05,416][10870] Saving new best policy, reward=22.330!
[2023-02-20 21:19:05,745][10884] Updated weights for policy 0, policy_version 590 (0.0016)
[2023-02-20 21:19:10,407][00458] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2424832. Throughput: 0: 865.4. Samples: 607244. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:19:10,414][00458] Avg episode reward: [(0, '22.781')]
[2023-02-20 21:19:10,507][10870] Saving new best policy, reward=22.781!
[2023-02-20 21:19:15,410][00458] Fps is (10 sec: 3275.7, 60 sec: 3481.4, 300 sec: 3471.1). Total num frames: 2445312. Throughput: 0: 888.7. Samples: 612284. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:15,412][00458] Avg episode reward: [(0, '22.653')]
[2023-02-20 21:19:17,878][10884] Updated weights for policy 0, policy_version 600 (0.0017)
[2023-02-20 21:19:20,406][00458] Fps is (10 sec: 4096.1, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2465792. Throughput: 0: 894.6. Samples: 615634. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:19:20,409][00458] Avg episode reward: [(0, '21.981')]
[2023-02-20 21:19:25,409][00458] Fps is (10 sec: 3686.6, 60 sec: 3481.5, 300 sec: 3471.2). Total num frames: 2482176. Throughput: 0: 874.4. Samples: 621330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:25,418][00458] Avg episode reward: [(0, '21.628')]
[2023-02-20 21:19:30,147][10884] Updated weights for policy 0, policy_version 610 (0.0025)
[2023-02-20 21:19:30,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 2498560. Throughput: 0: 862.7. Samples: 625442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:30,411][00458] Avg episode reward: [(0, '21.877')]
[2023-02-20 21:19:35,406][00458] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 3471.2). Total num frames: 2514944. Throughput: 0: 865.7. Samples: 627662. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:35,408][00458] Avg episode reward: [(0, '20.151')]
[2023-02-20 21:19:40,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2535424. Throughput: 0: 893.7. Samples: 634246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:40,412][00458] Avg episode reward: [(0, '20.655')]
[2023-02-20 21:19:40,452][10884] Updated weights for policy 0, policy_version 620 (0.0022)
[2023-02-20 21:19:45,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.3). Total num frames: 2555904. Throughput: 0: 875.3. Samples: 639952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:19:45,412][00458] Avg episode reward: [(0, '21.474')]
[2023-02-20 21:19:50,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2568192. Throughput: 0: 862.8. Samples: 642006. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:19:50,414][00458] Avg episode reward: [(0, '22.362')]
[2023-02-20 21:19:53,782][10884] Updated weights for policy 0, policy_version 630 (0.0027)
[2023-02-20 21:19:55,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2584576. Throughput: 0: 867.6. Samples: 646286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:19:55,410][00458] Avg episode reward: [(0, '23.118')]
[2023-02-20 21:19:55,412][10870] Saving new best policy, reward=23.118!
[2023-02-20 21:20:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2605056. Throughput: 0: 897.6. Samples: 652674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:20:00,409][00458] Avg episode reward: [(0, '22.020')]
[2023-02-20 21:20:03,385][10884] Updated weights for policy 0, policy_version 640 (0.0013)
[2023-02-20 21:20:05,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 2625536. Throughput: 0: 895.1. Samples: 655912. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:05,409][00458] Avg episode reward: [(0, '22.641')]
[2023-02-20 21:20:10,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 2641920. Throughput: 0: 868.1. Samples: 660390. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:20:10,413][00458] Avg episode reward: [(0, '22.421')]
[2023-02-20 21:20:15,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3481.8, 300 sec: 3457.3). Total num frames: 2654208. Throughput: 0: 870.6. Samples: 664618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:15,409][00458] Avg episode reward: [(0, '21.930')]
[2023-02-20 21:20:16,690][10884] Updated weights for policy 0, policy_version 650 (0.0068)
[2023-02-20 21:20:20,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2674688. Throughput: 0: 894.5. Samples: 667916. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:20,408][00458] Avg episode reward: [(0, '20.090')]
[2023-02-20 21:20:25,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3471.2). Total num frames: 2699264. Throughput: 0: 893.2. Samples: 674442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:20:25,413][00458] Avg episode reward: [(0, '20.395')]
[2023-02-20 21:20:27,022][10884] Updated weights for policy 0, policy_version 660 (0.0020)
[2023-02-20 21:20:30,407][00458] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2711552. Throughput: 0: 863.5. Samples: 678810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:30,410][00458] Avg episode reward: [(0, '20.620')]
[2023-02-20 21:20:35,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2723840. Throughput: 0: 866.9. Samples: 681018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:35,413][00458] Avg episode reward: [(0, '20.622')]
[2023-02-20 21:20:39,193][10884] Updated weights for policy 0, policy_version 670 (0.0020)
[2023-02-20 21:20:40,406][00458] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2748416. Throughput: 0: 901.6. Samples: 686860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:40,409][00458] Avg episode reward: [(0, '21.457')]
[2023-02-20 21:20:45,413][00458] Fps is (10 sec: 4093.4, 60 sec: 3481.2, 300 sec: 3443.3). Total num frames: 2764800. Throughput: 0: 871.7. Samples: 691904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:20:45,417][00458] Avg episode reward: [(0, '22.277')]
[2023-02-20 21:20:50,407][00458] Fps is (10 sec: 2867.0, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2777088. Throughput: 0: 836.7. Samples: 693564. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:20:50,410][00458] Avg episode reward: [(0, '22.711')]
[2023-02-20 21:20:50,431][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000678_2777088.pth...
[2023-02-20 21:20:50,663][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000475_1945600.pth
[2023-02-20 21:20:54,329][10884] Updated weights for policy 0, policy_version 680 (0.0036)
[2023-02-20 21:20:55,406][00458] Fps is (10 sec: 2049.3, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2785280. Throughput: 0: 808.6. Samples: 696776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:20:55,414][00458] Avg episode reward: [(0, '23.320')]
[2023-02-20 21:20:55,421][10870] Saving new best policy, reward=23.320!
[2023-02-20 21:21:00,406][00458] Fps is (10 sec: 2457.8, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 2801664. Throughput: 0: 808.8. Samples: 701012. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:21:00,415][00458] Avg episode reward: [(0, '23.135')]
[2023-02-20 21:21:05,407][00458] Fps is (10 sec: 3686.3, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 2822144. Throughput: 0: 808.1. Samples: 704280. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:21:05,415][00458] Avg episode reward: [(0, '24.401')]
[2023-02-20 21:21:05,420][10870] Saving new best policy, reward=24.401!
[2023-02-20 21:21:06,036][10884] Updated weights for policy 0, policy_version 690 (0.0018)
[2023-02-20 21:21:10,409][00458] Fps is (10 sec: 4095.1, 60 sec: 3344.9, 300 sec: 3415.6). Total num frames: 2842624. Throughput: 0: 805.7. Samples: 710702. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:21:10,416][00458] Avg episode reward: [(0, '25.818')]
[2023-02-20 21:21:10,427][10870] Saving new best policy, reward=25.818!
[2023-02-20 21:21:15,406][00458] Fps is (10 sec: 3276.9, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2854912. Throughput: 0: 804.0. Samples: 714992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:21:15,410][00458] Avg episode reward: [(0, '24.637')]
[2023-02-20 21:21:18,815][10884] Updated weights for policy 0, policy_version 700 (0.0039)
[2023-02-20 21:21:20,406][00458] Fps is (10 sec: 2867.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2871296. Throughput: 0: 800.6. Samples: 717046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:21:20,412][00458] Avg episode reward: [(0, '24.307')]
[2023-02-20 21:21:25,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3387.9). Total num frames: 2891776. Throughput: 0: 800.6. Samples: 722888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:21:25,414][00458] Avg episode reward: [(0, '24.682')]
[2023-02-20 21:21:28,689][10884] Updated weights for policy 0, policy_version 710 (0.0030)
[2023-02-20 21:21:30,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 2916352. Throughput: 0: 837.2. Samples: 729574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:21:30,410][00458] Avg episode reward: [(0, '24.466')]
[2023-02-20 21:21:35,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2928640. Throughput: 0: 846.8. Samples: 731668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:21:35,409][00458] Avg episode reward: [(0, '25.180')]
[2023-02-20 21:21:40,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3443.4). Total num frames: 2940928. Throughput: 0: 870.2. Samples: 735934. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:21:40,415][00458] Avg episode reward: [(0, '23.605')]
[2023-02-20 21:21:41,641][10884] Updated weights for policy 0, policy_version 720 (0.0032)
[2023-02-20 21:21:45,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3345.4, 300 sec: 3457.3). Total num frames: 2965504. Throughput: 0: 912.1. Samples: 742058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:21:45,409][00458] Avg episode reward: [(0, '24.176')]
[2023-02-20 21:21:50,407][00458] Fps is (10 sec: 4505.5, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2985984. Throughput: 0: 915.6. Samples: 745482. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:21:50,410][00458] Avg episode reward: [(0, '22.539')]
[2023-02-20 21:21:50,833][10884] Updated weights for policy 0, policy_version 730 (0.0017)
[2023-02-20 21:21:55,407][00458] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 3002368. Throughput: 0: 887.9. Samples: 750654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:21:55,412][00458] Avg episode reward: [(0, '23.586')]
[2023-02-20 21:22:00,407][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.8, 300 sec: 3443.4). Total num frames: 3014656. Throughput: 0: 884.2. Samples: 754782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:22:00,411][00458] Avg episode reward: [(0, '25.131')]
[2023-02-20 21:22:04,154][10884] Updated weights for policy 0, policy_version 740 (0.0037)
[2023-02-20 21:22:05,406][00458] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 3035136. Throughput: 0: 897.8. Samples: 757448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:22:05,409][00458] Avg episode reward: [(0, '23.397')]
[2023-02-20 21:22:10,406][00458] Fps is (10 sec: 4096.2, 60 sec: 3550.0, 300 sec: 3443.4). Total num frames: 3055616. Throughput: 0: 914.9. Samples: 764060. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:22:10,408][00458] Avg episode reward: [(0, '23.195')]
[2023-02-20 21:22:14,413][10884] Updated weights for policy 0, policy_version 750 (0.0029)
[2023-02-20 21:22:15,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3443.4). Total num frames: 3072000. Throughput: 0: 881.9. Samples: 769260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:22:15,412][00458] Avg episode reward: [(0, '23.767')]
[2023-02-20 21:22:20,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3084288. Throughput: 0: 879.6. Samples: 771248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:22:20,408][00458] Avg episode reward: [(0, '25.755')]
[2023-02-20 21:22:25,406][00458] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 3104768. Throughput: 0: 887.5. Samples: 775870. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:22:25,409][00458] Avg episode reward: [(0, '24.433')]
[2023-02-20 21:22:27,075][10884] Updated weights for policy 0, policy_version 760 (0.0021)
[2023-02-20 21:22:30,408][00458] Fps is (10 sec: 4095.2, 60 sec: 3481.5, 300 sec: 3457.3). Total num frames: 3125248. Throughput: 0: 898.8. Samples: 782504. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:22:30,411][00458] Avg episode reward: [(0, '24.257')]
[2023-02-20 21:22:35,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3141632. Throughput: 0: 886.9. Samples: 785392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:22:35,412][00458] Avg episode reward: [(0, '24.061')]
[2023-02-20 21:22:38,726][10884] Updated weights for policy 0, policy_version 770 (0.0013)
[2023-02-20 21:22:40,406][00458] Fps is (10 sec: 3277.4, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 3158016. Throughput: 0: 860.0. Samples: 789354. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:22:40,415][00458] Avg episode reward: [(0, '24.455')]
[2023-02-20 21:22:45,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3174400. Throughput: 0: 875.7. Samples: 794190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:22:45,409][00458] Avg episode reward: [(0, '22.960')]
[2023-02-20 21:22:49,903][10884] Updated weights for policy 0, policy_version 780 (0.0024)
[2023-02-20 21:22:50,407][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3194880. Throughput: 0: 891.3. Samples: 797556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:22:50,413][00458] Avg episode reward: [(0, '22.919')]
[2023-02-20 21:22:50,425][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000780_3194880.pth...
[2023-02-20 21:22:50,547][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000575_2355200.pth
[2023-02-20 21:22:55,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3471.2). Total num frames: 3215360. Throughput: 0: 881.5. Samples: 803728. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:22:55,409][00458] Avg episode reward: [(0, '22.940')]
[2023-02-20 21:23:00,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3227648. Throughput: 0: 856.0. Samples: 807782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:00,412][00458] Avg episode reward: [(0, '23.422')]
[2023-02-20 21:23:02,941][10884] Updated weights for policy 0, policy_version 790 (0.0027)
[2023-02-20 21:23:05,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3413.3, 300 sec: 3471.2). Total num frames: 3239936. Throughput: 0: 856.7. Samples: 809798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:23:05,409][00458] Avg episode reward: [(0, '23.676')]
[2023-02-20 21:23:10,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3264512. Throughput: 0: 888.2. Samples: 815840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:10,409][00458] Avg episode reward: [(0, '24.307')]
[2023-02-20 21:23:12,869][10884] Updated weights for policy 0, policy_version 800 (0.0014)
[2023-02-20 21:23:15,406][00458] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3284992. Throughput: 0: 882.7. Samples: 822226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:15,409][00458] Avg episode reward: [(0, '24.335')]
[2023-02-20 21:23:20,407][00458] Fps is (10 sec: 3276.6, 60 sec: 3549.8, 300 sec: 3471.2). Total num frames: 3297280. Throughput: 0: 865.4. Samples: 824334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:23:20,415][00458] Avg episode reward: [(0, '24.871')]
[2023-02-20 21:23:25,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3313664. Throughput: 0: 870.1. Samples: 828508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:23:25,408][00458] Avg episode reward: [(0, '23.761')]
[2023-02-20 21:23:26,238][10884] Updated weights for policy 0, policy_version 810 (0.0028)
[2023-02-20 21:23:30,407][00458] Fps is (10 sec: 3686.6, 60 sec: 3481.7, 300 sec: 3485.1). Total num frames: 3334144. Throughput: 0: 901.1. Samples: 834738. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:23:30,417][00458] Avg episode reward: [(0, '22.701')]
[2023-02-20 21:23:35,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3354624. Throughput: 0: 896.9. Samples: 837918. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:23:35,420][00458] Avg episode reward: [(0, '21.752')]
[2023-02-20 21:23:35,852][10884] Updated weights for policy 0, policy_version 820 (0.0021)
[2023-02-20 21:23:40,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3371008. Throughput: 0: 867.9. Samples: 842782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:23:40,417][00458] Avg episode reward: [(0, '21.997')]
[2023-02-20 21:23:45,407][00458] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3383296. Throughput: 0: 868.7. Samples: 846872. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:45,413][00458] Avg episode reward: [(0, '22.366')]
[2023-02-20 21:23:49,202][10884] Updated weights for policy 0, policy_version 830 (0.0019)
[2023-02-20 21:23:50,408][00458] Fps is (10 sec: 3276.2, 60 sec: 3481.5, 300 sec: 3485.1). Total num frames: 3403776. Throughput: 0: 887.9. Samples: 849754. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:50,419][00458] Avg episode reward: [(0, '21.262')]
[2023-02-20 21:23:55,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3424256. Throughput: 0: 899.0. Samples: 856294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:23:55,408][00458] Avg episode reward: [(0, '21.768')]
[2023-02-20 21:24:00,056][10884] Updated weights for policy 0, policy_version 840 (0.0014)
[2023-02-20 21:24:00,407][00458] Fps is (10 sec: 3686.9, 60 sec: 3549.8, 300 sec: 3485.1). Total num frames: 3440640. Throughput: 0: 861.5. Samples: 860996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:24:00,414][00458] Avg episode reward: [(0, '21.464')]
[2023-02-20 21:24:05,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3452928. Throughput: 0: 859.6. Samples: 863016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:24:05,413][00458] Avg episode reward: [(0, '22.785')]
[2023-02-20 21:24:10,407][00458] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3473408. Throughput: 0: 878.3. Samples: 868032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:24:10,416][00458] Avg episode reward: [(0, '21.807')]
[2023-02-20 21:24:11,998][10884] Updated weights for policy 0, policy_version 850 (0.0022)
[2023-02-20 21:24:15,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3493888. Throughput: 0: 888.7. Samples: 874730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:24:15,415][00458] Avg episode reward: [(0, '21.500')]
[2023-02-20 21:24:20,407][00458] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3510272. Throughput: 0: 877.8. Samples: 877420. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:24:20,409][00458] Avg episode reward: [(0, '22.547')]
[2023-02-20 21:24:23,653][10884] Updated weights for policy 0, policy_version 860 (0.0013)
[2023-02-20 21:24:25,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3526656. Throughput: 0: 864.2. Samples: 881670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:24:25,409][00458] Avg episode reward: [(0, '23.299')]
[2023-02-20 21:24:30,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3485.1). Total num frames: 3543040. Throughput: 0: 888.7. Samples: 886862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:24:30,413][00458] Avg episode reward: [(0, '24.526')]
[2023-02-20 21:24:34,612][10884] Updated weights for policy 0, policy_version 870 (0.0018)
[2023-02-20 21:24:35,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3499.0). Total num frames: 3567616. Throughput: 0: 896.7. Samples: 890106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:24:35,409][00458] Avg episode reward: [(0, '25.215')]
[2023-02-20 21:24:40,413][00458] Fps is (10 sec: 4093.3, 60 sec: 3549.5, 300 sec: 3485.0). Total num frames: 3584000. Throughput: 0: 887.8. Samples: 896250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:24:40,416][00458] Avg episode reward: [(0, '25.758')]
[2023-02-20 21:24:45,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3485.1). Total num frames: 3596288. Throughput: 0: 875.8. Samples: 900406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:24:45,414][00458] Avg episode reward: [(0, '27.034')]
[2023-02-20 21:24:45,416][10870] Saving new best policy, reward=27.034!
[2023-02-20 21:24:47,310][10884] Updated weights for policy 0, policy_version 880 (0.0027)
[2023-02-20 21:24:50,406][00458] Fps is (10 sec: 2459.2, 60 sec: 3413.4, 300 sec: 3471.2). Total num frames: 3608576. Throughput: 0: 870.7. Samples: 902196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:24:50,409][00458] Avg episode reward: [(0, '28.208')]
[2023-02-20 21:24:50,430][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000881_3608576.pth...
[2023-02-20 21:24:50,653][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000678_2777088.pth
[2023-02-20 21:24:50,689][10870] Saving new best policy, reward=28.208!
[2023-02-20 21:24:55,414][00458] Fps is (10 sec: 2864.9, 60 sec: 3344.6, 300 sec: 3457.2). Total num frames: 3624960. Throughput: 0: 844.1. Samples: 906022. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:24:55,421][00458] Avg episode reward: [(0, '28.827')]
[2023-02-20 21:24:55,429][10870] Saving new best policy, reward=28.827!
[2023-02-20 21:25:00,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3429.5). Total num frames: 3637248. Throughput: 0: 787.7. Samples: 910178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:25:00,409][00458] Avg episode reward: [(0, '28.553')]
[2023-02-20 21:25:02,817][10884] Updated weights for policy 0, policy_version 890 (0.0042)
[2023-02-20 21:25:05,406][00458] Fps is (10 sec: 2459.6, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 3649536. Throughput: 0: 771.2. Samples: 912122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:25:05,411][00458] Avg episode reward: [(0, '27.757')]
[2023-02-20 21:25:10,407][00458] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3415.6). Total num frames: 3661824. Throughput: 0: 765.5. Samples: 916120. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:25:10,412][00458] Avg episode reward: [(0, '25.938')]
[2023-02-20 21:25:15,368][10884] Updated weights for policy 0, policy_version 900 (0.0024)
[2023-02-20 21:25:15,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3429.5). Total num frames: 3686400. Throughput: 0: 781.7. Samples: 922040. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:25:15,413][00458] Avg episode reward: [(0, '25.524')]
[2023-02-20 21:25:20,406][00458] Fps is (10 sec: 4505.8, 60 sec: 3276.8, 300 sec: 3415.6). Total num frames: 3706880. Throughput: 0: 782.9. Samples: 925338. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:25:20,414][00458] Avg episode reward: [(0, '23.802')]
[2023-02-20 21:25:25,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3415.6). Total num frames: 3719168. Throughput: 0: 757.4. Samples: 930328. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:25:25,409][00458] Avg episode reward: [(0, '23.296')]
[2023-02-20 21:25:27,438][10884] Updated weights for policy 0, policy_version 910 (0.0013)
[2023-02-20 21:25:30,409][00458] Fps is (10 sec: 2866.5, 60 sec: 3208.4, 300 sec: 3429.5). Total num frames: 3735552. Throughput: 0: 753.6. Samples: 934318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:25:30,418][00458] Avg episode reward: [(0, '22.911')]
[2023-02-20 21:25:35,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3401.8). Total num frames: 3751936. Throughput: 0: 775.1. Samples: 937076. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-20 21:25:35,408][00458] Avg episode reward: [(0, '22.280')]
[2023-02-20 21:25:38,488][10884] Updated weights for policy 0, policy_version 920 (0.0017)
[2023-02-20 21:25:40,406][00458] Fps is (10 sec: 3687.3, 60 sec: 3140.6, 300 sec: 3415.7). Total num frames: 3772416. Throughput: 0: 832.3. Samples: 943470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:25:40,412][00458] Avg episode reward: [(0, '25.240')]
[2023-02-20 21:25:45,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3401.8). Total num frames: 3780608. Throughput: 0: 811.1. Samples: 946678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:25:45,409][00458] Avg episode reward: [(0, '23.972')]
[2023-02-20 21:25:50,406][00458] Fps is (10 sec: 2048.0, 60 sec: 3072.0, 300 sec: 3415.6). Total num frames: 3792896. Throughput: 0: 795.7. Samples: 947928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:25:50,417][00458] Avg episode reward: [(0, '23.446')]
[2023-02-20 21:25:55,406][00458] Fps is (10 sec: 2457.6, 60 sec: 3004.1, 300 sec: 3401.8). Total num frames: 3805184. Throughput: 0: 790.6. Samples: 951698. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:25:55,413][00458] Avg episode reward: [(0, '24.193')]
[2023-02-20 21:25:55,450][10884] Updated weights for policy 0, policy_version 930 (0.0025)
[2023-02-20 21:26:00,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3415.7). Total num frames: 3829760. Throughput: 0: 801.7. Samples: 958116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:26:00,415][00458] Avg episode reward: [(0, '23.853')]
[2023-02-20 21:26:05,407][00458] Fps is (10 sec: 4095.9, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3846144. Throughput: 0: 792.3. Samples: 960990. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:26:05,416][00458] Avg episode reward: [(0, '23.678')]
[2023-02-20 21:26:06,357][10884] Updated weights for policy 0, policy_version 940 (0.0028)
[2023-02-20 21:26:10,406][00458] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3858432. Throughput: 0: 773.2. Samples: 965120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:26:10,412][00458] Avg episode reward: [(0, '22.893')]
[2023-02-20 21:26:15,406][00458] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 3874816. Throughput: 0: 793.2. Samples: 970008. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:26:15,411][00458] Avg episode reward: [(0, '21.090')]
[2023-02-20 21:26:18,692][10884] Updated weights for policy 0, policy_version 950 (0.0030)
[2023-02-20 21:26:20,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3401.8). Total num frames: 3895296. Throughput: 0: 802.4. Samples: 973182. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-20 21:26:20,410][00458] Avg episode reward: [(0, '23.982')]
[2023-02-20 21:26:25,406][00458] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 3915776. Throughput: 0: 796.8. Samples: 979324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:26:25,411][00458] Avg episode reward: [(0, '23.449')]
[2023-02-20 21:26:30,406][00458] Fps is (10 sec: 3276.8, 60 sec: 3208.7, 300 sec: 3387.9). Total num frames: 3928064. Throughput: 0: 816.9. Samples: 983438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-20 21:26:30,417][00458] Avg episode reward: [(0, '23.806')]
[2023-02-20 21:26:30,451][10884] Updated weights for policy 0, policy_version 960 (0.0033)
[2023-02-20 21:26:35,407][00458] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 3944448. Throughput: 0: 834.1. Samples: 985462. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-20 21:26:35,413][00458] Avg episode reward: [(0, '23.685')]
[2023-02-20 21:26:40,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3387.9). Total num frames: 3964928. Throughput: 0: 887.6. Samples: 991638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-20 21:26:40,409][00458] Avg episode reward: [(0, '24.393')]
[2023-02-20 21:26:41,502][10884] Updated weights for policy 0, policy_version 970 (0.0015)
[2023-02-20 21:26:45,412][00458] Fps is (10 sec: 4093.9, 60 sec: 3413.0, 300 sec: 3387.8). Total num frames: 3985408. Throughput: 0: 883.8. Samples: 997890. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-20 21:26:45,420][00458] Avg episode reward: [(0, '25.337')]
[2023-02-20 21:26:50,406][00458] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3387.9). Total num frames: 4001792. Throughput: 0: 865.9. Samples: 999956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-20 21:26:50,411][00458] Avg episode reward: [(0, '25.107')]
[2023-02-20 21:26:50,426][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000977_4001792.pth...
[2023-02-20 21:26:50,595][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000780_3194880.pth
[2023-02-20 21:26:51,588][00458] Component Batcher_0 stopped!
[2023-02-20 21:26:51,587][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-20 21:26:51,588][10870] Stopping Batcher_0...
[2023-02-20 21:26:51,612][10870] Loop batcher_evt_loop terminating...
[2023-02-20 21:26:51,671][10884] Weights refcount: 2 0
[2023-02-20 21:26:51,682][00458] Component InferenceWorker_p0-w0 stopped!
[2023-02-20 21:26:51,686][10884] Stopping InferenceWorker_p0-w0...
[2023-02-20 21:26:51,686][10884] Loop inference_proc0-0_evt_loop terminating...
[2023-02-20 21:26:51,699][00458] Component RolloutWorker_w5 stopped!
[2023-02-20 21:26:51,709][10888] Stopping RolloutWorker_w3...
[2023-02-20 21:26:51,710][10888] Loop rollout_proc3_evt_loop terminating...
[2023-02-20 21:26:51,706][00458] Component RolloutWorker_w3 stopped!
[2023-02-20 21:26:51,702][10890] Stopping RolloutWorker_w5...
[2023-02-20 21:26:51,714][10890] Loop rollout_proc5_evt_loop terminating...
[2023-02-20 21:26:51,784][10887] Stopping RolloutWorker_w2...
[2023-02-20 21:26:51,786][10887] Loop rollout_proc2_evt_loop terminating...
[2023-02-20 21:26:51,784][00458] Component RolloutWorker_w2 stopped!
[2023-02-20 21:26:51,784][10891] Stopping RolloutWorker_w7...
[2023-02-20 21:26:51,793][10891] Loop rollout_proc7_evt_loop terminating...
[2023-02-20 21:26:51,789][00458] Component RolloutWorker_w7 stopped!
[2023-02-20 21:26:51,799][10886] Stopping RolloutWorker_w1...
[2023-02-20 21:26:51,800][10886] Loop rollout_proc1_evt_loop terminating...
[2023-02-20 21:26:51,800][00458] Component RolloutWorker_w1 stopped!
[2023-02-20 21:26:51,848][00458] Component RolloutWorker_w4 stopped!
[2023-02-20 21:26:51,852][10889] Stopping RolloutWorker_w4...
[2023-02-20 21:26:51,852][10889] Loop rollout_proc4_evt_loop terminating...
[2023-02-20 21:26:51,893][00458] Component RolloutWorker_w6 stopped!
[2023-02-20 21:26:51,899][10892] Stopping RolloutWorker_w6...
[2023-02-20 21:26:51,899][10892] Loop rollout_proc6_evt_loop terminating...
[2023-02-20 21:26:51,959][00458] Component RolloutWorker_w0 stopped!
[2023-02-20 21:26:51,964][10885] Stopping RolloutWorker_w0...
[2023-02-20 21:26:51,965][10885] Loop rollout_proc0_evt_loop terminating...
[2023-02-20 21:26:51,986][10870] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000881_3608576.pth
[2023-02-20 21:26:52,004][10870] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-20 21:26:52,284][00458] Component LearnerWorker_p0 stopped!
[2023-02-20 21:26:52,287][00458] Waiting for process learner_proc0 to stop...
[2023-02-20 21:26:52,289][10870] Stopping LearnerWorker_p0...
[2023-02-20 21:26:52,292][10870] Loop learner_proc0_evt_loop terminating...
[2023-02-20 21:26:54,707][00458] Waiting for process inference_proc0-0 to join...
[2023-02-20 21:26:55,210][00458] Waiting for process rollout_proc0 to join...
[2023-02-20 21:26:55,675][00458] Waiting for process rollout_proc1 to join...
[2023-02-20 21:26:55,679][00458] Waiting for process rollout_proc2 to join...
[2023-02-20 21:26:55,681][00458] Waiting for process rollout_proc3 to join...
[2023-02-20 21:26:55,682][00458] Waiting for process rollout_proc4 to join...
[2023-02-20 21:26:55,683][00458] Waiting for process rollout_proc5 to join...
[2023-02-20 21:26:55,685][00458] Waiting for process rollout_proc6 to join...
[2023-02-20 21:26:55,687][00458] Waiting for process rollout_proc7 to join...
[2023-02-20 21:26:55,688][00458] Batcher 0 profile tree view:
batching: 27.8041, releasing_batches: 0.0235
[2023-02-20 21:26:55,689][00458] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 551.3347
update_model: 8.3240
weight_update: 0.0027
one_step: 0.0167
handle_policy_step: 566.8321
deserialize: 15.9885, stack: 3.1455, obs_to_device_normalize: 120.3291, forward: 278.2312, send_messages: 28.1971
prepare_outputs: 92.6830
to_cpu: 58.3667
[2023-02-20 21:26:55,691][00458] Learner 0 profile tree view:
misc: 0.0056, prepare_batch: 18.0879
train: 77.5379
epoch_init: 0.0061, minibatch_init: 0.0098, losses_postprocess: 0.5162, kl_divergence: 0.5900, after_optimizer: 32.8849
calculate_losses: 27.7127
losses_init: 0.0037, forward_head: 1.8966, bptt_initial: 18.0616, tail: 1.2041, advantages_returns: 0.3216, losses: 3.4305
bptt: 2.4043
bptt_forward_core: 2.3227
update: 15.1706
clip: 1.4722
[2023-02-20 21:26:55,694][00458] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3482, enqueue_policy_requests: 147.7739, env_step: 886.1147, overhead: 22.7254, complete_rollouts: 7.5822
save_policy_outputs: 23.1505
split_output_tensors: 11.1062
[2023-02-20 21:26:55,695][00458] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3417, enqueue_policy_requests: 152.4656, env_step: 885.2699, overhead: 22.2835, complete_rollouts: 7.2658
save_policy_outputs: 21.8572
split_output_tensors: 10.6206
[2023-02-20 21:26:55,697][00458] Loop Runner_EvtLoop terminating...
[2023-02-20 21:26:55,698][00458] Runner profile tree view:
main_loop: 1200.8388
[2023-02-20 21:26:55,700][00458] Collected {0: 4005888}, FPS: 3335.9
[2023-02-20 21:27:07,564][00458] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-20 21:27:07,566][00458] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-20 21:27:07,568][00458] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-20 21:27:07,570][00458] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-20 21:27:07,572][00458] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-20 21:27:07,577][00458] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-20 21:27:07,578][00458] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-20 21:27:07,581][00458] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-20 21:27:07,582][00458] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-20 21:27:07,583][00458] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-20 21:27:07,584][00458] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-20 21:27:07,585][00458] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-20 21:27:07,588][00458] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-20 21:27:07,589][00458] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-20 21:27:07,590][00458] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-20 21:27:07,619][00458] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-20 21:27:07,623][00458] RunningMeanStd input shape: (3, 72, 128)
[2023-02-20 21:27:07,626][00458] RunningMeanStd input shape: (1,)
[2023-02-20 21:27:07,654][00458] ConvEncoder: input_channels=3
[2023-02-20 21:27:08,429][00458] Conv encoder output size: 512
[2023-02-20 21:27:08,431][00458] Policy head output size: 512
[2023-02-20 21:27:10,765][00458] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-20 21:27:12,028][00458] Num frames 100...
[2023-02-20 21:27:12,139][00458] Num frames 200...
[2023-02-20 21:27:12,251][00458] Num frames 300...
[2023-02-20 21:27:12,379][00458] Num frames 400...
[2023-02-20 21:27:12,495][00458] Num frames 500...
[2023-02-20 21:27:12,610][00458] Num frames 600...
[2023-02-20 21:27:12,726][00458] Num frames 700...
[2023-02-20 21:27:12,842][00458] Num frames 800...
[2023-02-20 21:27:12,956][00458] Num frames 900...
[2023-02-20 21:27:13,065][00458] Num frames 1000...
[2023-02-20 21:27:13,180][00458] Num frames 1100...
[2023-02-20 21:27:13,310][00458] Num frames 1200...
[2023-02-20 21:27:13,428][00458] Num frames 1300...
[2023-02-20 21:27:13,550][00458] Num frames 1400...
[2023-02-20 21:27:13,669][00458] Num frames 1500...
[2023-02-20 21:27:13,787][00458] Num frames 1600...
[2023-02-20 21:27:13,923][00458] Avg episode rewards: #0: 41.689, true rewards: #0: 16.690
[2023-02-20 21:27:13,929][00458] Avg episode reward: 41.689, avg true_objective: 16.690
[2023-02-20 21:27:13,974][00458] Num frames 1700...
[2023-02-20 21:27:14,090][00458] Num frames 1800...
[2023-02-20 21:27:14,200][00458] Num frames 1900...
[2023-02-20 21:27:14,324][00458] Num frames 2000...
[2023-02-20 21:27:14,441][00458] Num frames 2100...
[2023-02-20 21:27:14,551][00458] Num frames 2200...
[2023-02-20 21:27:14,661][00458] Num frames 2300...
[2023-02-20 21:27:14,770][00458] Num frames 2400...
[2023-02-20 21:27:14,899][00458] Avg episode rewards: #0: 30.285, true rewards: #0: 12.285
[2023-02-20 21:27:14,901][00458] Avg episode reward: 30.285, avg true_objective: 12.285
[2023-02-20 21:27:14,955][00458] Num frames 2500...
[2023-02-20 21:27:15,066][00458] Num frames 2600...
[2023-02-20 21:27:15,179][00458] Num frames 2700...
[2023-02-20 21:27:15,293][00458] Num frames 2800...
[2023-02-20 21:27:15,414][00458] Num frames 2900...
[2023-02-20 21:27:15,475][00458] Avg episode rewards: #0: 22.680, true rewards: #0: 9.680
[2023-02-20 21:27:15,477][00458] Avg episode reward: 22.680, avg true_objective: 9.680
[2023-02-20 21:27:15,596][00458] Num frames 3000...
[2023-02-20 21:27:15,710][00458] Num frames 3100...
[2023-02-20 21:27:15,821][00458] Num frames 3200...
[2023-02-20 21:27:15,932][00458] Num frames 3300...
[2023-02-20 21:27:16,042][00458] Num frames 3400...
[2023-02-20 21:27:16,157][00458] Num frames 3500...
[2023-02-20 21:27:16,282][00458] Num frames 3600...
[2023-02-20 21:27:16,403][00458] Num frames 3700...
[2023-02-20 21:27:16,512][00458] Num frames 3800...
[2023-02-20 21:27:16,626][00458] Num frames 3900...
[2023-02-20 21:27:16,734][00458] Avg episode rewards: #0: 22.612, true rewards: #0: 9.862
[2023-02-20 21:27:16,736][00458] Avg episode reward: 22.612, avg true_objective: 9.862
[2023-02-20 21:27:16,811][00458] Num frames 4000...
[2023-02-20 21:27:16,928][00458] Num frames 4100...
[2023-02-20 21:27:17,037][00458] Num frames 4200...
[2023-02-20 21:27:17,155][00458] Num frames 4300...
[2023-02-20 21:27:17,264][00458] Num frames 4400...
[2023-02-20 21:27:17,382][00458] Num frames 4500...
[2023-02-20 21:27:17,490][00458] Num frames 4600...
[2023-02-20 21:27:17,617][00458] Num frames 4700...
[2023-02-20 21:27:17,728][00458] Num frames 4800...
[2023-02-20 21:27:17,836][00458] Avg episode rewards: #0: 21.694, true rewards: #0: 9.694
[2023-02-20 21:27:17,838][00458] Avg episode reward: 21.694, avg true_objective: 9.694
[2023-02-20 21:27:17,909][00458] Num frames 4900...
[2023-02-20 21:27:18,032][00458] Num frames 5000...
[2023-02-20 21:27:18,149][00458] Num frames 5100...
[2023-02-20 21:27:18,299][00458] Avg episode rewards: #0: 18.945, true rewards: #0: 8.612
[2023-02-20 21:27:18,301][00458] Avg episode reward: 18.945, avg true_objective: 8.612
[2023-02-20 21:27:18,362][00458] Num frames 5200...
[2023-02-20 21:27:18,525][00458] Num frames 5300...
[2023-02-20 21:27:18,683][00458] Num frames 5400...
[2023-02-20 21:27:18,839][00458] Num frames 5500...
[2023-02-20 21:27:18,994][00458] Num frames 5600...
[2023-02-20 21:27:19,159][00458] Num frames 5700...
[2023-02-20 21:27:19,318][00458] Num frames 5800...
[2023-02-20 21:27:19,488][00458] Num frames 5900...
[2023-02-20 21:27:19,649][00458] Num frames 6000...
[2023-02-20 21:27:19,806][00458] Num frames 6100...
[2023-02-20 21:27:19,965][00458] Num frames 6200...
[2023-02-20 21:27:20,118][00458] Num frames 6300...
[2023-02-20 21:27:20,276][00458] Num frames 6400...
[2023-02-20 21:27:20,436][00458] Num frames 6500...
[2023-02-20 21:27:20,599][00458] Num frames 6600...
[2023-02-20 21:27:20,767][00458] Num frames 6700...
[2023-02-20 21:27:20,935][00458] Num frames 6800...
[2023-02-20 21:27:21,101][00458] Num frames 6900...
[2023-02-20 21:27:21,268][00458] Num frames 7000...
[2023-02-20 21:27:21,443][00458] Num frames 7100...
[2023-02-20 21:27:21,638][00458] Avg episode rewards: #0: 23.688, true rewards: #0: 10.260
[2023-02-20 21:27:21,641][00458] Avg episode reward: 23.688, avg true_objective: 10.260
[2023-02-20 21:27:21,671][00458] Num frames 7200...
[2023-02-20 21:27:21,819][00458] Num frames 7300...
[2023-02-20 21:27:21,930][00458] Num frames 7400...
[2023-02-20 21:27:22,044][00458] Num frames 7500...
[2023-02-20 21:27:22,157][00458] Num frames 7600...
[2023-02-20 21:27:22,276][00458] Num frames 7700...
[2023-02-20 21:27:22,392][00458] Num frames 7800...
[2023-02-20 21:27:22,505][00458] Num frames 7900...
[2023-02-20 21:27:22,662][00458] Avg episode rewards: #0: 22.727, true rewards: #0: 9.977
[2023-02-20 21:27:22,664][00458] Avg episode reward: 22.727, avg true_objective: 9.977
[2023-02-20 21:27:22,687][00458] Num frames 8000...
[2023-02-20 21:27:22,809][00458] Num frames 8100...
[2023-02-20 21:27:22,923][00458] Num frames 8200...
[2023-02-20 21:27:23,039][00458] Num frames 8300...
[2023-02-20 21:27:23,154][00458] Num frames 8400...
[2023-02-20 21:27:23,269][00458] Num frames 8500...
[2023-02-20 21:27:23,383][00458] Num frames 8600...
[2023-02-20 21:27:23,498][00458] Num frames 8700...
[2023-02-20 21:27:23,619][00458] Num frames 8800...
[2023-02-20 21:27:23,734][00458] Num frames 8900...
[2023-02-20 21:27:23,848][00458] Num frames 9000...
[2023-02-20 21:27:23,990][00458] Avg episode rewards: #0: 23.194, true rewards: #0: 10.083
[2023-02-20 21:27:23,993][00458] Avg episode reward: 23.194, avg true_objective: 10.083
[2023-02-20 21:27:24,026][00458] Num frames 9100...
[2023-02-20 21:27:24,142][00458] Num frames 9200...
[2023-02-20 21:27:24,256][00458] Num frames 9300...
[2023-02-20 21:27:24,370][00458] Num frames 9400...
[2023-02-20 21:27:24,496][00458] Num frames 9500...
[2023-02-20 21:27:24,614][00458] Num frames 9600...
[2023-02-20 21:27:24,729][00458] Num frames 9700...
[2023-02-20 21:27:24,852][00458] Num frames 9800...
[2023-02-20 21:27:24,965][00458] Num frames 9900...
[2023-02-20 21:27:25,076][00458] Num frames 10000...
[2023-02-20 21:27:25,190][00458] Num frames 10100...
[2023-02-20 21:27:25,306][00458] Num frames 10200...
[2023-02-20 21:27:25,422][00458] Num frames 10300...
[2023-02-20 21:27:25,536][00458] Num frames 10400...
[2023-02-20 21:27:25,614][00458] Avg episode rewards: #0: 23.719, true rewards: #0: 10.419
[2023-02-20 21:27:25,616][00458] Avg episode reward: 23.719, avg true_objective: 10.419
[2023-02-20 21:28:34,823][00458] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-20 21:37:09,566][00458] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-20 21:37:09,568][00458] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-20 21:37:09,570][00458] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-20 21:37:09,572][00458] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-20 21:37:09,574][00458] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-20 21:37:09,576][00458] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-20 21:37:09,577][00458] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-20 21:37:09,578][00458] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-20 21:37:09,580][00458] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-20 21:37:09,581][00458] Adding new argument 'hf_repository'='ThomasSimonini/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-20 21:37:09,582][00458] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-20 21:37:09,583][00458] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-20 21:37:09,584][00458] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-20 21:37:09,585][00458] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-20 21:37:09,587][00458] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-20 21:37:09,612][00458] RunningMeanStd input shape: (3, 72, 128)
[2023-02-20 21:37:09,614][00458] RunningMeanStd input shape: (1,)
[2023-02-20 21:37:09,633][00458] ConvEncoder: input_channels=3
[2023-02-20 21:37:09,670][00458] Conv encoder output size: 512
[2023-02-20 21:37:09,673][00458] Policy head output size: 512
[2023-02-20 21:37:09,693][00458] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-20 21:37:10,130][00458] Num frames 100...
[2023-02-20 21:37:10,242][00458] Num frames 200...
[2023-02-20 21:37:10,352][00458] Num frames 300...
[2023-02-20 21:37:10,471][00458] Num frames 400...
[2023-02-20 21:37:10,591][00458] Num frames 500...
[2023-02-20 21:37:10,704][00458] Num frames 600...
[2023-02-20 21:37:10,816][00458] Num frames 700...
[2023-02-20 21:37:10,931][00458] Num frames 800...
[2023-02-20 21:37:11,055][00458] Num frames 900...
[2023-02-20 21:37:11,168][00458] Num frames 1000...
[2023-02-20 21:37:11,292][00458] Num frames 1100...
[2023-02-20 21:37:11,419][00458] Num frames 1200...
[2023-02-20 21:37:11,505][00458] Avg episode rewards: #0: 27.240, true rewards: #0: 12.240
[2023-02-20 21:37:11,508][00458] Avg episode reward: 27.240, avg true_objective: 12.240
[2023-02-20 21:37:11,604][00458] Num frames 1300...
[2023-02-20 21:37:11,728][00458] Num frames 1400...
[2023-02-20 21:37:11,965][00458] Num frames 1500...
[2023-02-20 21:37:12,177][00458] Num frames 1600...
[2023-02-20 21:37:12,366][00458] Num frames 1700...
[2023-02-20 21:37:12,633][00458] Num frames 1800...
[2023-02-20 21:37:12,828][00458] Num frames 1900...
[2023-02-20 21:37:13,017][00458] Num frames 2000...
[2023-02-20 21:37:13,240][00458] Avg episode rewards: #0: 22.940, true rewards: #0: 10.440
[2023-02-20 21:37:13,242][00458] Avg episode reward: 22.940, avg true_objective: 10.440
[2023-02-20 21:37:13,273][00458] Num frames 2100...
[2023-02-20 21:37:13,465][00458] Num frames 2200...
[2023-02-20 21:37:13,798][00458] Num frames 2300...
[2023-02-20 21:37:14,046][00458] Num frames 2400...
[2023-02-20 21:37:14,302][00458] Num frames 2500...
[2023-02-20 21:37:14,481][00458] Num frames 2600...
[2023-02-20 21:37:14,679][00458] Num frames 2700...
[2023-02-20 21:37:14,856][00458] Num frames 2800...
[2023-02-20 21:37:14,971][00458] Num frames 2900...
[2023-02-20 21:37:15,090][00458] Num frames 3000...
[2023-02-20 21:37:15,152][00458] Avg episode rewards: #0: 22.014, true rewards: #0: 10.013
[2023-02-20 21:37:15,155][00458] Avg episode reward: 22.014, avg true_objective: 10.013
[2023-02-20 21:37:15,279][00458] Num frames 3100...
[2023-02-20 21:37:15,403][00458] Num frames 3200...
[2023-02-20 21:37:15,515][00458] Num frames 3300...
[2023-02-20 21:37:15,628][00458] Num frames 3400...
[2023-02-20 21:37:15,778][00458] Avg episode rewards: #0: 18.210, true rewards: #0: 8.710
[2023-02-20 21:37:15,780][00458] Avg episode reward: 18.210, avg true_objective: 8.710
[2023-02-20 21:37:15,809][00458] Num frames 3500...
[2023-02-20 21:37:15,919][00458] Num frames 3600...
[2023-02-20 21:37:16,029][00458] Num frames 3700...
[2023-02-20 21:37:16,144][00458] Num frames 3800...
[2023-02-20 21:37:16,272][00458] Num frames 3900...
[2023-02-20 21:37:16,386][00458] Num frames 4000...
[2023-02-20 21:37:16,497][00458] Num frames 4100...
[2023-02-20 21:37:16,618][00458] Num frames 4200...
[2023-02-20 21:37:16,730][00458] Num frames 4300...
[2023-02-20 21:37:16,845][00458] Num frames 4400...
[2023-02-20 21:37:16,961][00458] Num frames 4500...
[2023-02-20 21:37:17,074][00458] Num frames 4600...
[2023-02-20 21:37:17,203][00458] Num frames 4700...
[2023-02-20 21:37:17,330][00458] Avg episode rewards: #0: 20.928, true rewards: #0: 9.528
[2023-02-20 21:37:17,333][00458] Avg episode reward: 20.928, avg true_objective: 9.528
[2023-02-20 21:37:17,403][00458] Num frames 4800...
[2023-02-20 21:37:17,561][00458] Num frames 4900...
[2023-02-20 21:37:17,714][00458] Num frames 5000...
[2023-02-20 21:37:17,873][00458] Num frames 5100...
[2023-02-20 21:37:18,028][00458] Num frames 5200...
[2023-02-20 21:37:18,183][00458] Num frames 5300...
[2023-02-20 21:37:18,339][00458] Num frames 5400...
[2023-02-20 21:37:18,501][00458] Num frames 5500...
[2023-02-20 21:37:18,655][00458] Num frames 5600...
[2023-02-20 21:37:18,810][00458] Num frames 5700...
[2023-02-20 21:37:18,962][00458] Num frames 5800...
[2023-02-20 21:37:19,115][00458] Num frames 5900...
[2023-02-20 21:37:19,278][00458] Num frames 6000...
[2023-02-20 21:37:19,441][00458] Num frames 6100...
[2023-02-20 21:37:19,510][00458] Avg episode rewards: #0: 22.680, true rewards: #0: 10.180
[2023-02-20 21:37:19,512][00458] Avg episode reward: 22.680, avg true_objective: 10.180
[2023-02-20 21:37:19,664][00458] Num frames 6200...
[2023-02-20 21:37:19,825][00458] Num frames 6300...
[2023-02-20 21:37:19,985][00458] Num frames 6400...
[2023-02-20 21:37:20,144][00458] Num frames 6500...
[2023-02-20 21:37:20,319][00458] Num frames 6600...
[2023-02-20 21:37:20,485][00458] Num frames 6700...
[2023-02-20 21:37:20,651][00458] Num frames 6800...
[2023-02-20 21:37:20,813][00458] Num frames 6900...
[2023-02-20 21:37:20,977][00458] Num frames 7000...
[2023-02-20 21:37:21,119][00458] Num frames 7100...
[2023-02-20 21:37:21,233][00458] Num frames 7200...
[2023-02-20 21:37:21,353][00458] Num frames 7300...
[2023-02-20 21:37:21,475][00458] Num frames 7400...
[2023-02-20 21:37:21,622][00458] Avg episode rewards: #0: 24.834, true rewards: #0: 10.691
[2023-02-20 21:37:21,624][00458] Avg episode reward: 24.834, avg true_objective: 10.691
[2023-02-20 21:37:21,648][00458] Num frames 7500...
[2023-02-20 21:37:21,763][00458] Num frames 7600...
[2023-02-20 21:37:21,877][00458] Num frames 7700...
[2023-02-20 21:37:21,988][00458] Num frames 7800...
[2023-02-20 21:37:22,102][00458] Num frames 7900...
[2023-02-20 21:37:22,213][00458] Num frames 8000...
[2023-02-20 21:37:22,334][00458] Num frames 8100...
[2023-02-20 21:37:22,453][00458] Num frames 8200...
[2023-02-20 21:37:22,563][00458] Num frames 8300...
[2023-02-20 21:37:22,678][00458] Num frames 8400...
[2023-02-20 21:37:22,792][00458] Num frames 8500...
[2023-02-20 21:37:22,904][00458] Num frames 8600...
[2023-02-20 21:37:23,016][00458] Num frames 8700...
[2023-02-20 21:37:23,134][00458] Num frames 8800...
[2023-02-20 21:37:23,251][00458] Num frames 8900...
[2023-02-20 21:37:23,389][00458] Num frames 9000...
[2023-02-20 21:37:23,506][00458] Num frames 9100...
[2023-02-20 21:37:23,616][00458] Num frames 9200...
[2023-02-20 21:37:23,738][00458] Num frames 9300...
[2023-02-20 21:37:23,852][00458] Num frames 9400...
[2023-02-20 21:37:23,973][00458] Num frames 9500...
[2023-02-20 21:37:24,126][00458] Avg episode rewards: #0: 28.605, true rewards: #0: 11.980
[2023-02-20 21:37:24,128][00458] Avg episode reward: 28.605, avg true_objective: 11.980
[2023-02-20 21:37:24,158][00458] Num frames 9600...
[2023-02-20 21:37:24,272][00458] Num frames 9700...
[2023-02-20 21:37:24,392][00458] Num frames 9800...
[2023-02-20 21:37:24,510][00458] Num frames 9900...
[2023-02-20 21:37:24,628][00458] Num frames 10000...
[2023-02-20 21:37:24,739][00458] Num frames 10100...
[2023-02-20 21:37:24,853][00458] Num frames 10200...
[2023-02-20 21:37:24,976][00458] Avg episode rewards: #0: 26.729, true rewards: #0: 11.396
[2023-02-20 21:37:24,977][00458] Avg episode reward: 26.729, avg true_objective: 11.396
[2023-02-20 21:37:25,032][00458] Num frames 10300...
[2023-02-20 21:37:25,146][00458] Num frames 10400...
[2023-02-20 21:37:25,264][00458] Num frames 10500...
[2023-02-20 21:37:25,405][00458] Num frames 10600...
[2023-02-20 21:37:25,523][00458] Num frames 10700...
[2023-02-20 21:37:25,636][00458] Num frames 10800...
[2023-02-20 21:37:25,756][00458] Num frames 10900...
[2023-02-20 21:37:25,878][00458] Num frames 11000...
[2023-02-20 21:37:25,993][00458] Num frames 11100...
[2023-02-20 21:37:26,107][00458] Num frames 11200...
[2023-02-20 21:37:26,221][00458] Num frames 11300...
[2023-02-20 21:37:26,338][00458] Num frames 11400...
[2023-02-20 21:37:26,463][00458] Num frames 11500...
[2023-02-20 21:37:26,585][00458] Num frames 11600...
[2023-02-20 21:37:26,697][00458] Num frames 11700...
[2023-02-20 21:37:26,815][00458] Num frames 11800...
[2023-02-20 21:37:26,929][00458] Num frames 11900...
[2023-02-20 21:37:27,081][00458] Avg episode rewards: #0: 28.575, true rewards: #0: 11.975
[2023-02-20 21:37:27,083][00458] Avg episode reward: 28.575, avg true_objective: 11.975
[2023-02-20 21:38:48,314][00458] Replay video saved to /content/train_dir/default_experiment/replay.mp4!