[2023-02-25 11:07:25,370][00922] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-25 11:07:25,376][00922] Rollout worker 0 uses device cpu
[2023-02-25 11:07:25,380][00922] Rollout worker 1 uses device cpu
[2023-02-25 11:07:25,384][00922] Rollout worker 2 uses device cpu
[2023-02-25 11:07:25,388][00922] Rollout worker 3 uses device cpu
[2023-02-25 11:07:25,392][00922] Rollout worker 4 uses device cpu
[2023-02-25 11:07:25,395][00922] Rollout worker 5 uses device cpu
[2023-02-25 11:07:25,399][00922] Rollout worker 6 uses device cpu
[2023-02-25 11:07:25,417][00922] Rollout worker 7 uses device cpu
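The run starts by dumping the full experiment configuration to `config.json` and spinning up eight CPU-bound rollout workers. A minimal sketch of that config-dump step, assuming a plain dict config (`save_config` and the sample keys are illustrative, not the library's actual API):

```python
import json
import os

def save_config(cfg: dict, train_dir: str, experiment: str) -> str:
    """Write the experiment configuration next to its checkpoints."""
    exp_dir = os.path.join(train_dir, experiment)
    os.makedirs(exp_dir, exist_ok=True)
    path = os.path.join(exp_dir, "config.json")
    with open(path, "w") as f:
        json.dump(cfg, f, indent=4)
    return path

# Mirrors the path in the first log line above.
print(save_config({"num_workers": 8, "device": "gpu"},
                  "/content/train_dir", "default_experiment"))
```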
[2023-02-25 11:07:25,667][00922] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 11:07:25,674][00922] InferenceWorker_p0-w0: min num requests: 2
[2023-02-25 11:07:25,716][00922] Starting all processes...
[2023-02-25 11:07:25,721][00922] Starting process learner_proc0
[2023-02-25 11:07:25,849][00922] Starting all processes...
[2023-02-25 11:07:25,940][00922] Starting process inference_proc0-0
[2023-02-25 11:07:25,941][00922] Starting process rollout_proc0
[2023-02-25 11:07:25,943][00922] Starting process rollout_proc1
[2023-02-25 11:07:25,943][00922] Starting process rollout_proc2
[2023-02-25 11:07:25,943][00922] Starting process rollout_proc3
[2023-02-25 11:07:25,945][00922] Starting process rollout_proc4
[2023-02-25 11:07:25,945][00922] Starting process rollout_proc5
[2023-02-25 11:07:25,945][00922] Starting process rollout_proc6
[2023-02-25 11:07:25,945][00922] Starting process rollout_proc7
[2023-02-25 11:07:39,034][10928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 11:07:39,034][10928] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-25 11:07:39,315][10944] Worker 2 uses CPU cores [0]
[2023-02-25 11:07:39,363][10947] Worker 4 uses CPU cores [0]
[2023-02-25 11:07:39,562][10950] Worker 7 uses CPU cores [1]
[2023-02-25 11:07:39,662][10942] Worker 0 uses CPU cores [0]
[2023-02-25 11:07:39,723][10943] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 11:07:39,724][10943] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-25 11:07:39,743][10945] Worker 3 uses CPU cores [1]
[2023-02-25 11:07:40,102][10949] Worker 6 uses CPU cores [0]
[2023-02-25 11:07:40,139][10946] Worker 1 uses CPU cores [1]
[2023-02-25 11:07:40,144][10948] Worker 5 uses CPU cores [1]
[2023-02-25 11:07:40,259][10943] Num visible devices: 1
[2023-02-25 11:07:40,262][10928] Num visible devices: 1
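Both the learner and the inference worker pin themselves to physical GPU 0 by setting `CUDA_VISIBLE_DEVICES` before CUDA is initialized, which is why each process then reports exactly one visible device. A sketch of the same pattern, assuming PyTorch:

```python
import os

# Must be set before torch initializes CUDA in this process:
# physical GPU 0 becomes the only visible device, mapped to cuda:0.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch

print(torch.cuda.device_count())  # -> 1 on this machine ("Num visible devices: 1")
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
```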
[2023-02-25 11:07:40,269][10928] Starting seed is not provided
[2023-02-25 11:07:40,270][10928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 11:07:40,270][10928] Initializing actor-critic model on device cuda:0
[2023-02-25 11:07:40,270][10928] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 11:07:40,272][10928] RunningMeanStd input shape: (1,)
[2023-02-25 11:07:40,305][10928] ConvEncoder: input_channels=3
[2023-02-25 11:07:40,601][10928] Conv encoder output size: 512
[2023-02-25 11:07:40,602][10928] Policy head output size: 512
[2023-02-25 11:07:40,655][10928] Created Actor Critic model with architecture:
[2023-02-25 11:07:40,655][10928] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
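The dump above is the shared-weights actor-critic: a three-layer conv head with ELU activations feeding a linear projection to a 512-d embedding, a GRU core, and separate critic and action heads. A hedged sketch that reproduces the "Conv encoder output size: 512" line; the kernel sizes and strides below are the common Atari-style defaults and are an assumption, since the log only shows the layer types:

```python
import torch
import torch.nn as nn

# Three Conv2d+ELU pairs as in the dump; exact kernels/strides assumed.
conv_head = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
)
with torch.no_grad():
    flat = conv_head(torch.zeros(1, 3, 72, 128)).flatten(1)  # obs shape from the log

# mlp_layers: Linear + ELU projecting the flattened conv features to 512.
mlp_layers = nn.Sequential(nn.Linear(flat.shape[1], 512), nn.ELU())
core = nn.GRU(512, 512)  # ModelCoreRNN
print(flat.shape[1], mlp_layers(flat).shape)  # conv feature dim, torch.Size([1, 512])
```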
[2023-02-25 11:07:45,658][00922] Heartbeat connected on Batcher_0
[2023-02-25 11:07:45,668][00922] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-25 11:07:45,684][00922] Heartbeat connected on RolloutWorker_w0
[2023-02-25 11:07:45,688][00922] Heartbeat connected on RolloutWorker_w1
[2023-02-25 11:07:45,693][00922] Heartbeat connected on RolloutWorker_w2
[2023-02-25 11:07:45,698][00922] Heartbeat connected on RolloutWorker_w3
[2023-02-25 11:07:45,702][00922] Heartbeat connected on RolloutWorker_w4
[2023-02-25 11:07:45,708][00922] Heartbeat connected on RolloutWorker_w5
[2023-02-25 11:07:45,715][00922] Heartbeat connected on RolloutWorker_w6
[2023-02-25 11:07:45,716][00922] Heartbeat connected on RolloutWorker_w7
[2023-02-25 11:07:48,114][10928] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-25 11:07:48,116][10928] No checkpoints found
[2023-02-25 11:07:48,116][10928] Did not load from checkpoint, starting from scratch!
[2023-02-25 11:07:48,117][10928] Initialized policy 0 weights for model version 0
[2023-02-25 11:07:48,120][10928] LearnerWorker_p0 finished initialization!
[2023-02-25 11:07:48,122][00922] Heartbeat connected on LearnerWorker_p0
[2023-02-25 11:07:48,126][10928] Using GPUs [0] for process 0 (actually maps to GPUs [0])
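With Adam as the optimizer, the learner looks for an existing checkpoint and, finding none, initializes the policy weights at version 0. A sketch of that restore-or-scratch branch, assuming a checkpoint dict with model/optimizer/policy_version keys (the real on-disk format is the library's own):

```python
import glob
import os

import torch

def load_latest_or_init(model, optimizer, ckpt_dir: str) -> int:
    """Return the restored policy version, or 0 when starting fresh."""
    ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
    if not ckpts:
        print("No checkpoints found")
        print("Did not load from checkpoint, starting from scratch!")
        return 0
    state = torch.load(ckpts[-1], map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["policy_version"]
```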
[2023-02-25 11:07:48,333][10943] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 11:07:48,334][10943] RunningMeanStd input shape: (1,)
[2023-02-25 11:07:48,348][10943] ConvEncoder: input_channels=3
[2023-02-25 11:07:48,446][10943] Conv encoder output size: 512
[2023-02-25 11:07:48,446][10943] Policy head output size: 512
[2023-02-25 11:07:48,696][00922] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 11:07:51,202][00922] Inference worker 0-0 is ready!
[2023-02-25 11:07:51,204][00922] All inference workers are ready! Signal rollout workers to start!
[2023-02-25 11:07:51,309][10950] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,316][10946] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,335][10945] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,378][10948] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,433][10949] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,434][10942] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,424][10944] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:07:51,449][10947] Doom resolution: 160x120, resize resolution: (128, 72)
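Every rollout worker renders Doom at 160x120 and downscales to 128x72 (width x height), which after channel-first transposition yields the (3, 72, 128) observation shape seen during model init. A sketch with OpenCV; the actual wrapper inside the framework may differ:

```python
import cv2
import numpy as np

frame = np.zeros((120, 160, 3), dtype=np.uint8)  # native Doom frame, H x W x C
small = cv2.resize(frame, (128, 72), interpolation=cv2.INTER_AREA)  # (W, H) order
obs = small.transpose(2, 0, 1)  # C x H x W
print(obs.shape)  # -> (3, 72, 128), matching the RunningMeanStd input shape
```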
[2023-02-25 11:07:52,432][10944] Decorrelating experience for 0 frames...
[2023-02-25 11:07:53,014][10950] Decorrelating experience for 0 frames...
[2023-02-25 11:07:53,018][10945] Decorrelating experience for 0 frames...
[2023-02-25 11:07:53,020][10946] Decorrelating experience for 0 frames...
[2023-02-25 11:07:53,025][10948] Decorrelating experience for 0 frames...
[2023-02-25 11:07:53,696][00922] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 11:07:53,816][10944] Decorrelating experience for 32 frames...
[2023-02-25 11:07:54,539][10945] Decorrelating experience for 32 frames...
[2023-02-25 11:07:54,544][10950] Decorrelating experience for 32 frames...
[2023-02-25 11:07:54,546][10946] Decorrelating experience for 32 frames...
[2023-02-25 11:07:54,575][10948] Decorrelating experience for 32 frames...
[2023-02-25 11:07:54,862][10944] Decorrelating experience for 64 frames...
[2023-02-25 11:07:55,668][10947] Decorrelating experience for 0 frames...
[2023-02-25 11:07:55,713][10942] Decorrelating experience for 0 frames...
[2023-02-25 11:07:55,943][10944] Decorrelating experience for 96 frames...
[2023-02-25 11:07:56,152][10945] Decorrelating experience for 64 frames...
[2023-02-25 11:07:56,159][10950] Decorrelating experience for 64 frames...
[2023-02-25 11:07:56,214][10948] Decorrelating experience for 64 frames...
[2023-02-25 11:07:56,283][10947] Decorrelating experience for 32 frames...
[2023-02-25 11:07:56,780][10942] Decorrelating experience for 32 frames...
[2023-02-25 11:07:57,139][10946] Decorrelating experience for 64 frames...
[2023-02-25 11:07:57,642][10949] Decorrelating experience for 0 frames...
[2023-02-25 11:07:57,738][10950] Decorrelating experience for 96 frames...
[2023-02-25 11:07:57,745][10945] Decorrelating experience for 96 frames...
[2023-02-25 11:07:58,292][10948] Decorrelating experience for 96 frames...
[2023-02-25 11:07:58,362][10949] Decorrelating experience for 32 frames...
[2023-02-25 11:07:58,607][10947] Decorrelating experience for 64 frames...
[2023-02-25 11:07:58,696][00922] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 11:07:58,714][10946] Decorrelating experience for 96 frames...
[2023-02-25 11:07:59,123][10942] Decorrelating experience for 64 frames...
[2023-02-25 11:07:59,226][10949] Decorrelating experience for 64 frames...
[2023-02-25 11:08:00,003][10942] Decorrelating experience for 96 frames...
[2023-02-25 11:08:00,073][10947] Decorrelating experience for 96 frames...
[2023-02-25 11:08:00,173][10949] Decorrelating experience for 96 frames...
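Before real collection starts, each worker plays a number of warm-up frames, logging progress at 0, 32, 64, and 96, so the eight environments begin out of phase and the learner's batches are not built from synchronized episode boundaries. Schematically (`env` is any Gym-style environment, purely illustrative):

```python
def decorrelate_experience(env, max_frames: int = 96, chunk: int = 32) -> None:
    """Warm up an env with random steps, logging every `chunk` frames."""
    env.reset()
    for done in range(0, max_frames + 1, chunk):
        print(f"Decorrelating experience for {done} frames...")
        for _ in range(chunk):
            env.step(env.action_space.sample())
```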
[2023-02-25 11:08:02,650][10928] Signal inference workers to stop experience collection...
[2023-02-25 11:08:02,660][10943] InferenceWorker_p0-w0: stopping experience collection
[2023-02-25 11:08:03,696][00922] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 146.3. Samples: 2194. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 11:08:03,698][00922] Avg episode reward: [(0, '1.847')]
[2023-02-25 11:08:05,428][10928] Signal inference workers to resume experience collection...
[2023-02-25 11:08:05,428][10943] InferenceWorker_p0-w0: resuming experience collection
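Once enough warm-up experience is queued, the learner briefly pauses the inference workers while it consumes the backlog, then resumes them; the pair of signal lines above is that handshake. A sketch using a shared Event flag (the framework's actual signaling mechanism is its own):

```python
import multiprocessing as mp

collect_experience = mp.Event()
collect_experience.set()  # workers collect while the flag is set

def signal_stop() -> None:    # "Signal inference workers to stop ..."
    collect_experience.clear()

def signal_resume() -> None:  # "Signal inference workers to resume ..."
    collect_experience.set()

# An inference worker would check collect_experience.is_set() between batches.
```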
[2023-02-25 11:08:08,696][00922] Fps is (10 sec: 1228.8, 60 sec: 614.4, 300 sec: 614.4). Total num frames: 12288. Throughput: 0: 128.1. Samples: 2562. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-02-25 11:08:08,707][00922] Avg episode reward: [(0, '2.778')]
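From here on the monitor prints a status line every five seconds: FPS over 10/60/300-second windows, total environment frames, sampling throughput, policy lag, and the running average episode reward. Those lines are regular enough to parse, e.g. for plotting a learning curve from this raw log (a small sketch, not part of the run):

```python
import re

PAT = re.compile(
    r"Fps is \(10 sec: (?P<fps>[\d.]+).*?"
    r"Total num frames: (?P<frames>\d+)"
)

line = ("[2023-02-25 11:08:08,696][00922] Fps is (10 sec: 1228.8, 60 sec: 614.4, "
        "300 sec: 614.4). Total num frames: 12288. Throughput: 0: 128.1.")
m = PAT.search(line)
print(float(m["fps"]), int(m["frames"]))  # -> 1228.8 12288
```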
[2023-02-25 11:08:13,698][00922] Fps is (10 sec: 2866.7, 60 sec: 1146.8, 300 sec: 1146.8). Total num frames: 28672. Throughput: 0: 252.3. Samples: 6308. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 11:08:13,709][00922] Avg episode reward: [(0, '3.701')]
[2023-02-25 11:08:16,514][10943] Updated weights for policy 0, policy_version 10 (0.0034)
[2023-02-25 11:08:18,696][00922] Fps is (10 sec: 3686.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 49152. Throughput: 0: 416.3. Samples: 12490. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 11:08:18,704][00922] Avg episode reward: [(0, '4.362')]
[2023-02-25 11:08:23,696][00922] Fps is (10 sec: 4506.4, 60 sec: 2106.5, 300 sec: 2106.5). Total num frames: 73728. Throughput: 0: 457.3. Samples: 16004. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:08:23,699][00922] Avg episode reward: [(0, '4.542')]
[2023-02-25 11:08:25,927][10943] Updated weights for policy 0, policy_version 20 (0.0019)
[2023-02-25 11:08:28,696][00922] Fps is (10 sec: 4096.0, 60 sec: 2252.8, 300 sec: 2252.8). Total num frames: 90112. Throughput: 0: 548.5. Samples: 21940. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 11:08:28,704][00922] Avg episode reward: [(0, '4.539')]
[2023-02-25 11:08:33,698][00922] Fps is (10 sec: 2866.8, 60 sec: 2275.5, 300 sec: 2275.5). Total num frames: 102400. Throughput: 0: 587.7. Samples: 26446. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:08:33,708][00922] Avg episode reward: [(0, '4.300')]
[2023-02-25 11:08:33,710][10928] Saving new best policy, reward=4.300!
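Whenever the running average episode reward exceeds the best value seen so far, the learner snapshots the weights as the new best policy. A sketch of that bookkeeping, with illustrative names:

```python
import torch

best_reward = float("-inf")

def maybe_save_best(model, avg_reward: float, path: str) -> None:
    """Snapshot weights whenever the average episode reward improves."""
    global best_reward
    if avg_reward > best_reward:
        best_reward = avg_reward
        print(f"Saving new best policy, reward={avg_reward:.3f}!")
        torch.save(model.state_dict(), path)
```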
[2023-02-25 11:08:37,458][10943] Updated weights for policy 0, policy_version 30 (0.0037)
[2023-02-25 11:08:38,696][00922] Fps is (10 sec: 3686.4, 60 sec: 2539.5, 300 sec: 2539.5). Total num frames: 126976. Throughput: 0: 656.5. Samples: 29542. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 11:08:38,699][00922] Avg episode reward: [(0, '4.363')]
[2023-02-25 11:08:38,710][10928] Saving new best policy, reward=4.363!
[2023-02-25 11:08:43,696][00922] Fps is (10 sec: 4506.2, 60 sec: 2681.0, 300 sec: 2681.0). Total num frames: 147456. Throughput: 0: 814.3. Samples: 36642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:08:43,702][00922] Avg episode reward: [(0, '4.485')]
[2023-02-25 11:08:43,755][10928] Saving new best policy, reward=4.485!
[2023-02-25 11:08:47,617][10943] Updated weights for policy 0, policy_version 40 (0.0016)
[2023-02-25 11:08:48,698][00922] Fps is (10 sec: 3685.7, 60 sec: 2730.6, 300 sec: 2730.6). Total num frames: 163840. Throughput: 0: 885.6. Samples: 42050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:08:48,712][00922] Avg episode reward: [(0, '4.354')]
[2023-02-25 11:08:53,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3003.7, 300 sec: 2772.7). Total num frames: 180224. Throughput: 0: 927.4. Samples: 44296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:08:53,699][00922] Avg episode reward: [(0, '4.374')]
[2023-02-25 11:08:58,538][10943] Updated weights for policy 0, policy_version 50 (0.0031)
[2023-02-25 11:08:58,696][00922] Fps is (10 sec: 4096.8, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 204800. Throughput: 0: 977.2. Samples: 50282. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:08:58,699][00922] Avg episode reward: [(0, '4.423')]
[2023-02-25 11:09:03,696][00922] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3058.4). Total num frames: 229376. Throughput: 0: 997.3. Samples: 57370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:09:03,700][00922] Avg episode reward: [(0, '4.381')]
[2023-02-25 11:09:08,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3020.8). Total num frames: 241664. Throughput: 0: 977.9. Samples: 60008. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:09:08,703][00922] Avg episode reward: [(0, '4.429')]
[2023-02-25 11:09:08,846][10943] Updated weights for policy 0, policy_version 60 (0.0016)
[2023-02-25 11:09:13,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3823.0, 300 sec: 3035.9). Total num frames: 258048. Throughput: 0: 948.1. Samples: 64604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:09:13,701][00922] Avg episode reward: [(0, '4.732')]
[2023-02-25 11:09:13,702][10928] Saving new best policy, reward=4.732!
[2023-02-25 11:09:18,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3140.3). Total num frames: 282624. Throughput: 0: 987.2. Samples: 70868. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:09:18,703][00922] Avg episode reward: [(0, '4.657')]
[2023-02-25 11:09:18,713][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000069_282624.pth...
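Periodic checkpoints encode both the policy version and the cumulative frame count in the filename; the file above is version 69 at 282,624 frames. The naming scheme is easy to reproduce (a sketch):

```python
def checkpoint_name(policy_version: int, total_frames: int) -> str:
    # Nine-digit zero-padded version, then the env frame count.
    return f"checkpoint_{policy_version:09d}_{total_frames}.pth"

print(checkpoint_name(69, 282624))  # -> checkpoint_000000069_282624.pth
```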
[2023-02-25 11:09:19,791][10943] Updated weights for policy 0, policy_version 70 (0.0033)
[2023-02-25 11:09:23,697][00922] Fps is (10 sec: 4505.3, 60 sec: 3822.9, 300 sec: 3190.5). Total num frames: 303104. Throughput: 0: 988.5. Samples: 74026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:09:23,705][00922] Avg episode reward: [(0, '4.604')]
[2023-02-25 11:09:28,698][00922] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3194.8). Total num frames: 319488. Throughput: 0: 957.6. Samples: 79736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:09:28,703][00922] Avg episode reward: [(0, '4.547')]
[2023-02-25 11:09:31,062][10943] Updated weights for policy 0, policy_version 80 (0.0022)
[2023-02-25 11:09:33,696][00922] Fps is (10 sec: 2867.4, 60 sec: 3823.0, 300 sec: 3159.8). Total num frames: 331776. Throughput: 0: 924.4. Samples: 83648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:09:33,700][00922] Avg episode reward: [(0, '4.427')]
[2023-02-25 11:09:38,697][00922] Fps is (10 sec: 2458.0, 60 sec: 3618.1, 300 sec: 3127.8). Total num frames: 344064. Throughput: 0: 914.5. Samples: 85450. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:09:38,704][00922] Avg episode reward: [(0, '4.282')]
[2023-02-25 11:09:43,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3170.0). Total num frames: 364544. Throughput: 0: 894.0. Samples: 90510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:09:43,705][00922] Avg episode reward: [(0, '4.385')]
[2023-02-25 11:09:44,037][10943] Updated weights for policy 0, policy_version 90 (0.0011)
[2023-02-25 11:09:48,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3686.5, 300 sec: 3208.5). Total num frames: 385024. Throughput: 0: 875.9. Samples: 96786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:09:48,701][00922] Avg episode reward: [(0, '4.729')]
[2023-02-25 11:09:53,700][00922] Fps is (10 sec: 3275.5, 60 sec: 3617.9, 300 sec: 3178.4). Total num frames: 397312. Throughput: 0: 866.0. Samples: 98980. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:09:53,710][00922] Avg episode reward: [(0, '4.895')]
[2023-02-25 11:09:53,724][10928] Saving new best policy, reward=4.895!
[2023-02-25 11:09:56,314][10943] Updated weights for policy 0, policy_version 100 (0.0020)
[2023-02-25 11:09:58,696][00922] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3213.8). Total num frames: 417792. Throughput: 0: 878.6. Samples: 104140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:09:58,699][00922] Avg episode reward: [(0, '4.625')]
[2023-02-25 11:10:03,696][00922] Fps is (10 sec: 4507.4, 60 sec: 3549.9, 300 sec: 3276.8). Total num frames: 442368. Throughput: 0: 898.4. Samples: 111296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:10:03,702][00922] Avg episode reward: [(0, '4.553')]
[2023-02-25 11:10:04,866][10943] Updated weights for policy 0, policy_version 110 (0.0017)
[2023-02-25 11:10:08,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3306.1). Total num frames: 462848. Throughput: 0: 904.8. Samples: 114740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:10:08,707][00922] Avg episode reward: [(0, '4.419')]
[2023-02-25 11:10:13,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3305.1). Total num frames: 479232. Throughput: 0: 881.0. Samples: 119380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:10:13,705][00922] Avg episode reward: [(0, '4.415')]
[2023-02-25 11:10:16,884][10943] Updated weights for policy 0, policy_version 120 (0.0011)
[2023-02-25 11:10:18,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3331.4). Total num frames: 499712. Throughput: 0: 919.7. Samples: 125034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:10:18,700][00922] Avg episode reward: [(0, '4.648')]
[2023-02-25 11:10:23,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3618.2, 300 sec: 3356.1). Total num frames: 520192. Throughput: 0: 958.9. Samples: 128598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:10:23,699][00922] Avg episode reward: [(0, '4.706')]
[2023-02-25 11:10:25,580][10943] Updated weights for policy 0, policy_version 130 (0.0012)
[2023-02-25 11:10:28,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3686.5, 300 sec: 3379.2). Total num frames: 540672. Throughput: 0: 990.8. Samples: 135098. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:10:28,703][00922] Avg episode reward: [(0, '4.595')]
[2023-02-25 11:10:33,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3376.1). Total num frames: 557056. Throughput: 0: 949.6. Samples: 139518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:10:33,702][00922] Avg episode reward: [(0, '4.562')]
[2023-02-25 11:10:37,809][10943] Updated weights for policy 0, policy_version 140 (0.0020)
[2023-02-25 11:10:38,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3397.3). Total num frames: 577536. Throughput: 0: 955.5. Samples: 141972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:10:38,702][00922] Avg episode reward: [(0, '4.509')]
[2023-02-25 11:10:43,696][00922] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3440.6). Total num frames: 602112. Throughput: 0: 1003.6. Samples: 149304. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:10:43,699][00922] Avg episode reward: [(0, '4.452')]
[2023-02-25 11:10:46,467][10943] Updated weights for policy 0, policy_version 150 (0.0014)
[2023-02-25 11:10:48,696][00922] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3436.1). Total num frames: 618496. Throughput: 0: 982.2. Samples: 155494. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:10:48,701][00922] Avg episode reward: [(0, '4.400')]
[2023-02-25 11:10:53,700][00922] Fps is (10 sec: 3275.7, 60 sec: 3959.5, 300 sec: 3431.7). Total num frames: 634880. Throughput: 0: 955.3. Samples: 157732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:10:53,702][00922] Avg episode reward: [(0, '4.374')]
[2023-02-25 11:10:58,400][10943] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-02-25 11:10:58,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3449.3). Total num frames: 655360. Throughput: 0: 968.8. Samples: 162978. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:10:58,705][00922] Avg episode reward: [(0, '4.677')]
[2023-02-25 11:11:03,696][00922] Fps is (10 sec: 4097.3, 60 sec: 3891.2, 300 sec: 3465.8). Total num frames: 675840. Throughput: 0: 999.5. Samples: 170010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:11:03,699][00922] Avg episode reward: [(0, '4.643')]
[2023-02-25 11:11:08,123][10943] Updated weights for policy 0, policy_version 170 (0.0016)
[2023-02-25 11:11:08,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3481.6). Total num frames: 696320. Throughput: 0: 995.3. Samples: 173386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:11:08,700][00922] Avg episode reward: [(0, '4.583')]
[2023-02-25 11:11:13,697][00922] Fps is (10 sec: 3686.2, 60 sec: 3891.2, 300 sec: 3476.6). Total num frames: 712704. Throughput: 0: 948.7. Samples: 177790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:11:13,705][00922] Avg episode reward: [(0, '4.678')]
[2023-02-25 11:11:18,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3491.4). Total num frames: 733184. Throughput: 0: 980.2. Samples: 183628. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:11:18,707][00922] Avg episode reward: [(0, '4.697')]
[2023-02-25 11:11:18,730][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000179_733184.pth...
[2023-02-25 11:11:19,412][10943] Updated weights for policy 0, policy_version 180 (0.0014)
[2023-02-25 11:11:23,696][00922] Fps is (10 sec: 4505.9, 60 sec: 3959.5, 300 sec: 3524.5). Total num frames: 757760. Throughput: 0: 1003.8. Samples: 187144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:11:23,700][00922] Avg episode reward: [(0, '4.540')]
[2023-02-25 11:11:28,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3518.8). Total num frames: 774144. Throughput: 0: 983.5. Samples: 193562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:11:28,699][00922] Avg episode reward: [(0, '4.741')]
[2023-02-25 11:11:29,251][10943] Updated weights for policy 0, policy_version 190 (0.0031)
[2023-02-25 11:11:33,697][00922] Fps is (10 sec: 3276.6, 60 sec: 3891.2, 300 sec: 3513.5). Total num frames: 790528. Throughput: 0: 946.9. Samples: 198106. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:11:33,704][00922] Avg episode reward: [(0, '4.813')]
[2023-02-25 11:11:38,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3526.1). Total num frames: 811008. Throughput: 0: 956.1. Samples: 200754. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:11:38,704][00922] Avg episode reward: [(0, '4.561')]
[2023-02-25 11:11:39,937][10943] Updated weights for policy 0, policy_version 200 (0.0019)
[2023-02-25 11:11:43,696][00922] Fps is (10 sec: 4505.9, 60 sec: 3891.2, 300 sec: 3555.7). Total num frames: 835584. Throughput: 0: 1001.6. Samples: 208052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:11:43,703][00922] Avg episode reward: [(0, '4.545')]
[2023-02-25 11:11:48,698][00922] Fps is (10 sec: 4505.0, 60 sec: 3959.4, 300 sec: 3566.9). Total num frames: 856064. Throughput: 0: 980.6. Samples: 214140. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 11:11:48,702][00922] Avg episode reward: [(0, '4.726')]
[2023-02-25 11:11:49,808][10943] Updated weights for policy 0, policy_version 210 (0.0031)
[2023-02-25 11:11:53,697][00922] Fps is (10 sec: 3276.6, 60 sec: 3891.4, 300 sec: 3544.3). Total num frames: 868352. Throughput: 0: 956.0. Samples: 216406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:11:53,703][00922] Avg episode reward: [(0, '5.083')]
[2023-02-25 11:11:53,706][10928] Saving new best policy, reward=5.083!
[2023-02-25 11:11:58,696][00922] Fps is (10 sec: 3277.3, 60 sec: 3891.2, 300 sec: 3555.3). Total num frames: 888832. Throughput: 0: 978.1. Samples: 221804. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:11:58,703][00922] Avg episode reward: [(0, '5.141')]
[2023-02-25 11:11:58,714][10928] Saving new best policy, reward=5.141!
[2023-02-25 11:12:00,685][10943] Updated weights for policy 0, policy_version 220 (0.0020)
[2023-02-25 11:12:03,696][00922] Fps is (10 sec: 4505.8, 60 sec: 3959.5, 300 sec: 3582.0). Total num frames: 913408. Throughput: 0: 1005.7. Samples: 228884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:12:03,702][00922] Avg episode reward: [(0, '4.913')]
[2023-02-25 11:12:08,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3576.1). Total num frames: 929792. Throughput: 0: 996.7. Samples: 231994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:12:08,704][00922] Avg episode reward: [(0, '4.780')]
[2023-02-25 11:12:11,559][10943] Updated weights for policy 0, policy_version 230 (0.0020)
[2023-02-25 11:12:13,696][00922] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3570.5). Total num frames: 946176. Throughput: 0: 956.3. Samples: 236596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:12:13,704][00922] Avg episode reward: [(0, '4.869')]
[2023-02-25 11:12:18,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3549.9). Total num frames: 958464. Throughput: 0: 945.5. Samples: 240654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:12:18,699][00922] Avg episode reward: [(0, '4.793')]
[2023-02-25 11:12:23,696][00922] Fps is (10 sec: 2867.3, 60 sec: 3618.1, 300 sec: 3544.9). Total num frames: 974848. Throughput: 0: 937.5. Samples: 242942. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:12:23,703][00922] Avg episode reward: [(0, '4.824')]
[2023-02-25 11:12:24,888][10943] Updated weights for policy 0, policy_version 240 (0.0035)
[2023-02-25 11:12:28,716][00922] Fps is (10 sec: 3679.3, 60 sec: 3685.2, 300 sec: 3554.5). Total num frames: 995328. Throughput: 0: 899.3. Samples: 248536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:12:28,718][00922] Avg episode reward: [(0, '4.806')]
[2023-02-25 11:12:33,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3535.5). Total num frames: 1007616. Throughput: 0: 861.9. Samples: 252924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:12:33,699][00922] Avg episode reward: [(0, '4.801')]
[2023-02-25 11:12:36,992][10943] Updated weights for policy 0, policy_version 250 (0.0027)
[2023-02-25 11:12:38,696][00922] Fps is (10 sec: 3283.2, 60 sec: 3618.1, 300 sec: 3545.2). Total num frames: 1028096. Throughput: 0: 868.5. Samples: 255490. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:12:38,698][00922] Avg episode reward: [(0, '4.898')]
[2023-02-25 11:12:43,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1052672. Throughput: 0: 910.4. Samples: 262772. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:12:43,698][00922] Avg episode reward: [(0, '5.027')]
[2023-02-25 11:12:45,653][10943] Updated weights for policy 0, policy_version 260 (0.0015)
[2023-02-25 11:12:48,696][00922] Fps is (10 sec: 4505.5, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 1073152. Throughput: 0: 888.2. Samples: 268854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:12:48,703][00922] Avg episode reward: [(0, '4.930')]
[2023-02-25 11:12:53,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1089536. Throughput: 0: 869.5. Samples: 271122. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:12:53,703][00922] Avg episode reward: [(0, '5.016')]
[2023-02-25 11:12:57,730][10943] Updated weights for policy 0, policy_version 270 (0.0014)
[2023-02-25 11:12:58,696][00922] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 1110016. Throughput: 0: 885.6. Samples: 276446. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:12:58,703][00922] Avg episode reward: [(0, '5.275')]
[2023-02-25 11:12:58,715][10928] Saving new best policy, reward=5.275!
[2023-02-25 11:13:03,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 1130496. Throughput: 0: 951.2. Samples: 283456. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:13:03,699][00922] Avg episode reward: [(0, '5.793')]
[2023-02-25 11:13:03,708][10928] Saving new best policy, reward=5.793!
[2023-02-25 11:13:06,925][10943] Updated weights for policy 0, policy_version 280 (0.0021)
[2023-02-25 11:13:08,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1150976. Throughput: 0: 974.7. Samples: 286804. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:13:08,701][00922] Avg episode reward: [(0, '5.681')]
[2023-02-25 11:13:13,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3790.5). Total num frames: 1167360. Throughput: 0: 952.1. Samples: 291364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:13:13,702][00922] Avg episode reward: [(0, '5.583')]
[2023-02-25 11:13:18,581][10943] Updated weights for policy 0, policy_version 290 (0.0012)
[2023-02-25 11:13:18,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3776.7). Total num frames: 1187840. Throughput: 0: 978.6. Samples: 296962. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:13:18,704][00922] Avg episode reward: [(0, '5.691')]
[2023-02-25 11:13:18,713][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000290_1187840.pth...
[2023-02-25 11:13:18,833][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000069_282624.pth
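Old periodic checkpoints are garbage-collected as new ones are written, so only the most recent few survive alongside the best-policy snapshots. A sketch of that retention step (`keep_n` is an assumption; the real limit comes from the run configuration):

```python
import glob
import os

def prune_checkpoints(ckpt_dir: str, keep_n: int = 2) -> None:
    """Delete all but the newest keep_n periodic checkpoints."""
    # Lexicographic sort works because versions are zero-padded.
    ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
    for old in ckpts[:-keep_n]:
        print(f"Removing {old}")
        os.remove(old)
```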
[2023-02-25 11:13:23,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1208320. Throughput: 0: 1001.1. Samples: 300540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:13:23,699][00922] Avg episode reward: [(0, '5.833')]
[2023-02-25 11:13:23,701][10928] Saving new best policy, reward=5.833!
[2023-02-25 11:13:28,113][10943] Updated weights for policy 0, policy_version 300 (0.0022)
[2023-02-25 11:13:28,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3892.5, 300 sec: 3818.3). Total num frames: 1228800. Throughput: 0: 982.9. Samples: 307002. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:13:28,701][00922] Avg episode reward: [(0, '5.428')]
[2023-02-25 11:13:33,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 1245184. Throughput: 0: 948.0. Samples: 311516. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:13:33,703][00922] Avg episode reward: [(0, '5.570')]
[2023-02-25 11:13:38,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 1261568. Throughput: 0: 951.6. Samples: 313942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 11:13:38,702][00922] Avg episode reward: [(0, '5.771')]
[2023-02-25 11:13:39,617][10943] Updated weights for policy 0, policy_version 310 (0.0025)
[2023-02-25 11:13:43,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1286144. Throughput: 0: 992.5. Samples: 321108. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:13:43,699][00922] Avg episode reward: [(0, '6.360')]
[2023-02-25 11:13:43,703][10928] Saving new best policy, reward=6.360!
[2023-02-25 11:13:48,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1306624. Throughput: 0: 976.0. Samples: 327374. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:13:48,699][00922] Avg episode reward: [(0, '6.249')]
[2023-02-25 11:13:49,536][10943] Updated weights for policy 0, policy_version 320 (0.0021)
[2023-02-25 11:13:53,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1323008. Throughput: 0: 952.5. Samples: 329666. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:13:53,699][00922] Avg episode reward: [(0, '6.787')]
[2023-02-25 11:13:53,702][10928] Saving new best policy, reward=6.787!
[2023-02-25 11:13:58,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 1343488. Throughput: 0: 963.3. Samples: 334712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:13:58,699][00922] Avg episode reward: [(0, '6.986')]
[2023-02-25 11:13:58,713][10928] Saving new best policy, reward=6.986!
[2023-02-25 11:14:00,278][10943] Updated weights for policy 0, policy_version 330 (0.0026)
[2023-02-25 11:14:03,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1363968. Throughput: 0: 997.4. Samples: 341846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:14:03,699][00922] Avg episode reward: [(0, '7.578')]
[2023-02-25 11:14:03,702][10928] Saving new best policy, reward=7.578!
[2023-02-25 11:14:08,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1384448. Throughput: 0: 993.2. Samples: 345236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:14:08,703][00922] Avg episode reward: [(0, '7.651')]
[2023-02-25 11:14:08,715][10928] Saving new best policy, reward=7.651!
[2023-02-25 11:14:10,960][10943] Updated weights for policy 0, policy_version 340 (0.0012)
[2023-02-25 11:14:13,697][00922] Fps is (10 sec: 3276.5, 60 sec: 3822.9, 300 sec: 3776.6). Total num frames: 1396736. Throughput: 0: 949.5. Samples: 349730. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:13,699][00922] Avg episode reward: [(0, '7.364')]
[2023-02-25 11:14:18,699][00922] Fps is (10 sec: 3685.6, 60 sec: 3891.0, 300 sec: 3790.5). Total num frames: 1421312. Throughput: 0: 977.4. Samples: 355502. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:18,701][00922] Avg episode reward: [(0, '6.968')]
[2023-02-25 11:14:21,225][10943] Updated weights for policy 0, policy_version 350 (0.0031)
[2023-02-25 11:14:23,696][00922] Fps is (10 sec: 4506.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1441792. Throughput: 0: 1002.9. Samples: 359072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:23,704][00922] Avg episode reward: [(0, '7.195')]
[2023-02-25 11:14:28,696][00922] Fps is (10 sec: 4096.9, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 1462272. Throughput: 0: 988.2. Samples: 365576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:28,706][00922] Avg episode reward: [(0, '7.324')]
[2023-02-25 11:14:31,851][10943] Updated weights for policy 0, policy_version 360 (0.0020)
[2023-02-25 11:14:33,698][00922] Fps is (10 sec: 3685.8, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 1478656. Throughput: 0: 951.4. Samples: 370188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:33,705][00922] Avg episode reward: [(0, '7.739')]
[2023-02-25 11:14:33,707][10928] Saving new best policy, reward=7.739!
[2023-02-25 11:14:38,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1499136. Throughput: 0: 956.4. Samples: 372704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:14:38,699][00922] Avg episode reward: [(0, '7.426')]
[2023-02-25 11:14:42,011][10943] Updated weights for policy 0, policy_version 370 (0.0028)
[2023-02-25 11:14:43,696][00922] Fps is (10 sec: 4506.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1523712. Throughput: 0: 1001.3. Samples: 379770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:43,699][00922] Avg episode reward: [(0, '7.648')]
[2023-02-25 11:14:48,696][00922] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3873.9). Total num frames: 1540096. Throughput: 0: 978.8. Samples: 385892. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:14:48,699][00922] Avg episode reward: [(0, '8.649')]
[2023-02-25 11:14:48,706][10928] Saving new best policy, reward=8.649!
[2023-02-25 11:14:53,301][10943] Updated weights for policy 0, policy_version 380 (0.0022)
[2023-02-25 11:14:53,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1556480. Throughput: 0: 951.7. Samples: 388062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:14:53,698][00922] Avg episode reward: [(0, '8.852')]
[2023-02-25 11:14:53,706][10928] Saving new best policy, reward=8.852!
[2023-02-25 11:14:58,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1572864. Throughput: 0: 956.9. Samples: 392790. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:14:58,703][00922] Avg episode reward: [(0, '9.174')]
[2023-02-25 11:14:58,715][10928] Saving new best policy, reward=9.174!
[2023-02-25 11:15:03,697][00922] Fps is (10 sec: 2867.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 1585152. Throughput: 0: 927.1. Samples: 397220. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:15:03,707][00922] Avg episode reward: [(0, '9.786')]
[2023-02-25 11:15:03,712][10928] Saving new best policy, reward=9.786!
[2023-02-25 11:15:06,896][10943] Updated weights for policy 0, policy_version 390 (0.0035)
[2023-02-25 11:15:08,700][00922] Fps is (10 sec: 2866.1, 60 sec: 3617.9, 300 sec: 3804.4). Total num frames: 1601536. Throughput: 0: 899.1. Samples: 399536. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:15:08,702][00922] Avg episode reward: [(0, '10.205')]
[2023-02-25 11:15:08,715][10928] Saving new best policy, reward=10.205!
[2023-02-25 11:15:13,696][00922] Fps is (10 sec: 3277.0, 60 sec: 3686.5, 300 sec: 3790.5). Total num frames: 1617920. Throughput: 0: 853.8. Samples: 403998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:15:13,701][00922] Avg episode reward: [(0, '10.223')]
[2023-02-25 11:15:13,710][10928] Saving new best policy, reward=10.223!
[2023-02-25 11:15:18,425][10943] Updated weights for policy 0, policy_version 400 (0.0013)
[2023-02-25 11:15:18,696][00922] Fps is (10 sec: 3687.8, 60 sec: 3618.3, 300 sec: 3790.5). Total num frames: 1638400. Throughput: 0: 881.4. Samples: 409848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:15:18,698][00922] Avg episode reward: [(0, '10.291')]
[2023-02-25 11:15:18,709][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000400_1638400.pth...
[2023-02-25 11:15:18,849][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000179_733184.pth
[2023-02-25 11:15:18,856][10928] Saving new best policy, reward=10.291!
[2023-02-25 11:15:23,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 1658880. Throughput: 0: 902.5. Samples: 413316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:15:23,704][00922] Avg episode reward: [(0, '10.733')]
[2023-02-25 11:15:23,713][10928] Saving new best policy, reward=10.733!
[2023-02-25 11:15:28,163][10943] Updated weights for policy 0, policy_version 410 (0.0022)
[2023-02-25 11:15:28,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1679360. Throughput: 0: 884.8. Samples: 419586. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:15:28,698][00922] Avg episode reward: [(0, '10.487')]
[2023-02-25 11:15:33,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3618.2, 300 sec: 3790.5). Total num frames: 1695744. Throughput: 0: 849.5. Samples: 424118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:15:33,699][00922] Avg episode reward: [(0, '10.376')]
[2023-02-25 11:15:38,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3776.7). Total num frames: 1716224. Throughput: 0: 860.3. Samples: 426774. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:15:38,704][00922] Avg episode reward: [(0, '10.635')]
[2023-02-25 11:15:39,389][10943] Updated weights for policy 0, policy_version 420 (0.0022)
[2023-02-25 11:15:43,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 1740800. Throughput: 0: 917.6. Samples: 434084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:15:43,699][00922] Avg episode reward: [(0, '11.243')]
[2023-02-25 11:15:43,705][10928] Saving new best policy, reward=11.243!
[2023-02-25 11:15:48,701][00922] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3804.5). Total num frames: 1757184. Throughput: 0: 948.3. Samples: 439894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:15:48,712][00922] Avg episode reward: [(0, '12.732')]
[2023-02-25 11:15:48,736][10928] Saving new best policy, reward=12.732!
[2023-02-25 11:15:49,691][10943] Updated weights for policy 0, policy_version 430 (0.0018)
[2023-02-25 11:15:53,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3776.7). Total num frames: 1769472. Throughput: 0: 944.6. Samples: 442038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:15:53,701][00922] Avg episode reward: [(0, '12.739')]
[2023-02-25 11:15:53,755][10928] Saving new best policy, reward=12.739!
[2023-02-25 11:15:58,696][00922] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3776.6). Total num frames: 1789952. Throughput: 0: 964.8. Samples: 447412. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:15:58,702][00922] Avg episode reward: [(0, '13.492')]
[2023-02-25 11:15:58,717][10928] Saving new best policy, reward=13.492!
[2023-02-25 11:16:00,408][10943] Updated weights for policy 0, policy_version 440 (0.0014)
[2023-02-25 11:16:03,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3823.0, 300 sec: 3790.5). Total num frames: 1814528. Throughput: 0: 993.3. Samples: 454546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:16:03,699][00922] Avg episode reward: [(0, '12.190')]
[2023-02-25 11:16:08,696][00922] Fps is (10 sec: 4505.7, 60 sec: 3891.4, 300 sec: 3804.4). Total num frames: 1835008. Throughput: 0: 987.4. Samples: 457750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:08,699][00922] Avg episode reward: [(0, '13.204')]
[2023-02-25 11:16:10,932][10943] Updated weights for policy 0, policy_version 450 (0.0032)
[2023-02-25 11:16:13,701][00922] Fps is (10 sec: 3275.4, 60 sec: 3822.7, 300 sec: 3776.6). Total num frames: 1847296. Throughput: 0: 951.2. Samples: 462396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:13,706][00922] Avg episode reward: [(0, '14.341')]
[2023-02-25 11:16:13,711][10928] Saving new best policy, reward=14.341!
[2023-02-25 11:16:18,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 1871872. Throughput: 0: 981.4. Samples: 468280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:16:18,699][00922] Avg episode reward: [(0, '14.945')]
[2023-02-25 11:16:18,707][10928] Saving new best policy, reward=14.945!
[2023-02-25 11:16:21,068][10943] Updated weights for policy 0, policy_version 460 (0.0027)
[2023-02-25 11:16:23,696][00922] Fps is (10 sec: 4917.4, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 1896448. Throughput: 0: 1001.8. Samples: 471856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:16:23,703][00922] Avg episode reward: [(0, '14.137')]
[2023-02-25 11:16:28,699][00922] Fps is (10 sec: 4095.0, 60 sec: 3891.1, 300 sec: 3804.4). Total num frames: 1912832. Throughput: 0: 981.3. Samples: 478246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:28,703][00922] Avg episode reward: [(0, '14.547')]
[2023-02-25 11:16:32,001][10943] Updated weights for policy 0, policy_version 470 (0.0012)
[2023-02-25 11:16:33,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 1929216. Throughput: 0: 953.2. Samples: 482790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:33,701][00922] Avg episode reward: [(0, '13.909')]
[2023-02-25 11:16:38,697][00922] Fps is (10 sec: 3687.2, 60 sec: 3891.2, 300 sec: 3776.6). Total num frames: 1949696. Throughput: 0: 967.9. Samples: 485596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:38,699][00922] Avg episode reward: [(0, '14.052')]
[2023-02-25 11:16:41,648][10943] Updated weights for policy 0, policy_version 480 (0.0025)
[2023-02-25 11:16:43,696][00922] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3790.6). Total num frames: 1974272. Throughput: 0: 1012.1. Samples: 492956. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:16:43,702][00922] Avg episode reward: [(0, '16.562')]
[2023-02-25 11:16:43,705][10928] Saving new best policy, reward=16.562!
[2023-02-25 11:16:48,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1990656. Throughput: 0: 980.6. Samples: 498672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:16:48,702][00922] Avg episode reward: [(0, '16.755')]
[2023-02-25 11:16:48,711][10928] Saving new best policy, reward=16.755!
[2023-02-25 11:16:52,981][10943] Updated weights for policy 0, policy_version 490 (0.0017)
[2023-02-25 11:16:53,699][00922] Fps is (10 sec: 3275.9, 60 sec: 3959.3, 300 sec: 3790.5). Total num frames: 2007040. Throughput: 0: 959.6. Samples: 500936. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:16:53,701][00922] Avg episode reward: [(0, '17.688')]
[2023-02-25 11:16:53,706][10928] Saving new best policy, reward=17.688!
[2023-02-25 11:16:58,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 2027520. Throughput: 0: 981.9. Samples: 506578. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:16:58,699][00922] Avg episode reward: [(0, '18.189')]
[2023-02-25 11:16:58,759][10928] Saving new best policy, reward=18.189!
[2023-02-25 11:17:02,214][10943] Updated weights for policy 0, policy_version 500 (0.0012)
[2023-02-25 11:17:03,696][00922] Fps is (10 sec: 4506.7, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2052096. Throughput: 0: 1009.8. Samples: 513720. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:03,699][00922] Avg episode reward: [(0, '16.437')]
[2023-02-25 11:17:08,696][00922] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 2068480. Throughput: 0: 995.9. Samples: 516672. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:08,701][00922] Avg episode reward: [(0, '15.480')]
[2023-02-25 11:17:13,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3959.8, 300 sec: 3818.3). Total num frames: 2084864. Throughput: 0: 958.1. Samples: 521360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:17:13,702][00922] Avg episode reward: [(0, '15.376')]
[2023-02-25 11:17:14,070][10943] Updated weights for policy 0, policy_version 510 (0.0017)
[2023-02-25 11:17:18,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 2109440. Throughput: 0: 993.1. Samples: 527480. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:17:18,698][00922] Avg episode reward: [(0, '15.154')]
[2023-02-25 11:17:18,708][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000515_2109440.pth...
[2023-02-25 11:17:18,834][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000290_1187840.pth
[2023-02-25 11:17:22,672][10943] Updated weights for policy 0, policy_version 520 (0.0021)
[2023-02-25 11:17:23,696][00922] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3860.2). Total num frames: 2134016. Throughput: 0: 1012.2. Samples: 531146. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:17:23,701][00922] Avg episode reward: [(0, '17.563')]
[2023-02-25 11:17:28,698][00922] Fps is (10 sec: 4095.5, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2150400. Throughput: 0: 985.5. Samples: 537304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:17:28,700][00922] Avg episode reward: [(0, '17.133')]
[2023-02-25 11:17:33,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2162688. Throughput: 0: 958.1. Samples: 541786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:33,706][00922] Avg episode reward: [(0, '16.988')]
[2023-02-25 11:17:34,871][10943] Updated weights for policy 0, policy_version 530 (0.0021)
[2023-02-25 11:17:38,696][00922] Fps is (10 sec: 3277.2, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2183168. Throughput: 0: 975.9. Samples: 544850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:38,699][00922] Avg episode reward: [(0, '17.828')]
[2023-02-25 11:17:43,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2203648. Throughput: 0: 978.0. Samples: 550588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:43,701][00922] Avg episode reward: [(0, '17.441')]
[2023-02-25 11:17:46,408][10943] Updated weights for policy 0, policy_version 540 (0.0015)
[2023-02-25 11:17:48,698][00922] Fps is (10 sec: 3276.4, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 2215936. Throughput: 0: 907.1. Samples: 554540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:17:48,700][00922] Avg episode reward: [(0, '15.580')]
[2023-02-25 11:17:53,696][00922] Fps is (10 sec: 2457.6, 60 sec: 3686.6, 300 sec: 3790.5). Total num frames: 2228224. Throughput: 0: 882.7. Samples: 556392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:17:53,702][00922] Avg episode reward: [(0, '15.834')]
[2023-02-25 11:17:58,697][00922] Fps is (10 sec: 2867.5, 60 sec: 3618.1, 300 sec: 3776.6). Total num frames: 2244608. Throughput: 0: 869.9. Samples: 560506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:17:58,699][00922] Avg episode reward: [(0, '15.788')]
[2023-02-25 11:17:59,967][10943] Updated weights for policy 0, policy_version 550 (0.0018)
[2023-02-25 11:18:03,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 2269184. Throughput: 0: 891.6. Samples: 567602. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:18:03,698][00922] Avg episode reward: [(0, '15.596')]
[2023-02-25 11:18:08,697][00922] Fps is (10 sec: 4505.4, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 2289664. Throughput: 0: 888.6. Samples: 571134. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:18:08,699][00922] Avg episode reward: [(0, '15.171')]
[2023-02-25 11:18:09,279][10943] Updated weights for policy 0, policy_version 560 (0.0012)
[2023-02-25 11:18:13,702][00922] Fps is (10 sec: 3684.4, 60 sec: 3686.1, 300 sec: 3790.5). Total num frames: 2306048. Throughput: 0: 865.2. Samples: 576240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:18:13,708][00922] Avg episode reward: [(0, '15.760')]
[2023-02-25 11:18:18,697][00922] Fps is (10 sec: 3276.8, 60 sec: 3549.8, 300 sec: 3776.6). Total num frames: 2322432. Throughput: 0: 876.7. Samples: 581236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:18:18,700][00922] Avg episode reward: [(0, '16.088')]
[2023-02-25 11:18:20,612][10943] Updated weights for policy 0, policy_version 570 (0.0032)
[2023-02-25 11:18:23,696][00922] Fps is (10 sec: 4098.2, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 2347008. Throughput: 0: 891.2. Samples: 584952. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:18:23,699][00922] Avg episode reward: [(0, '16.502')]
[2023-02-25 11:18:28,697][00922] Fps is (10 sec: 4505.8, 60 sec: 3618.2, 300 sec: 3804.4). Total num frames: 2367488. Throughput: 0: 926.2. Samples: 592266. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2023-02-25 11:18:28,699][00922] Avg episode reward: [(0, '18.530')]
[2023-02-25 11:18:28,730][10928] Saving new best policy, reward=18.530!
[2023-02-25 11:18:30,430][10943] Updated weights for policy 0, policy_version 580 (0.0019)
[2023-02-25 11:18:33,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 2383872. Throughput: 0: 939.6. Samples: 596820. Policy #0 lag: (min: 0.0, avg: 0.8, max: 1.0)
[2023-02-25 11:18:33,699][00922] Avg episode reward: [(0, '18.906')]
[2023-02-25 11:18:33,705][10928] Saving new best policy, reward=18.906!
[2023-02-25 11:18:38,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3776.6). Total num frames: 2400256. Throughput: 0: 948.1. Samples: 599058. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:18:38,705][00922] Avg episode reward: [(0, '18.408')]
[2023-02-25 11:18:41,273][10943] Updated weights for policy 0, policy_version 590 (0.0041)
[2023-02-25 11:18:43,698][00922] Fps is (10 sec: 4095.3, 60 sec: 3686.3, 300 sec: 3790.5). Total num frames: 2424832. Throughput: 0: 1008.1. Samples: 605874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:18:43,706][00922] Avg episode reward: [(0, '20.162')]
[2023-02-25 11:18:43,790][10928] Saving new best policy, reward=20.162!
[2023-02-25 11:18:48,696][00922] Fps is (10 sec: 4505.8, 60 sec: 3823.0, 300 sec: 3804.4). Total num frames: 2445312. Throughput: 0: 1000.6. Samples: 612628. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:18:48,712][00922] Avg episode reward: [(0, '21.149')]
[2023-02-25 11:18:48,770][10928] Saving new best policy, reward=21.149!
[2023-02-25 11:18:51,496][10943] Updated weights for policy 0, policy_version 600 (0.0013)
[2023-02-25 11:18:53,698][00922] Fps is (10 sec: 3686.3, 60 sec: 3891.1, 300 sec: 3790.5). Total num frames: 2461696. Throughput: 0: 970.7. Samples: 614818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:18:53,707][00922] Avg episode reward: [(0, '21.226')]
[2023-02-25 11:18:53,716][10928] Saving new best policy, reward=21.226!
[2023-02-25 11:18:58,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 2478080. Throughput: 0: 959.3. Samples: 619404. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:18:58,705][00922] Avg episode reward: [(0, '21.668')]
[2023-02-25 11:18:58,774][10928] Saving new best policy, reward=21.668!
[2023-02-25 11:19:02,298][10943] Updated weights for policy 0, policy_version 610 (0.0023)
[2023-02-25 11:19:03,696][00922] Fps is (10 sec: 4096.9, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 2502656. Throughput: 0: 1006.3. Samples: 626518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:19:03,699][00922] Avg episode reward: [(0, '20.923')]
[2023-02-25 11:19:08,696][00922] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2527232. Throughput: 0: 1005.4. Samples: 630194. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:19:08,702][00922] Avg episode reward: [(0, '20.631')]
[2023-02-25 11:19:12,390][10943] Updated weights for policy 0, policy_version 620 (0.0032)
[2023-02-25 11:19:13,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3959.8, 300 sec: 3804.4). Total num frames: 2543616. Throughput: 0: 956.4. Samples: 635302. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 11:19:13,698][00922] Avg episode reward: [(0, '19.858')]
[2023-02-25 11:19:18,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 2560000. Throughput: 0: 967.9. Samples: 640374. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:19:18,701][00922] Avg episode reward: [(0, '19.140')]
[2023-02-25 11:19:18,714][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000625_2560000.pth...
[2023-02-25 11:19:18,834][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000400_1638400.pth
[2023-02-25 11:19:22,610][10943] Updated weights for policy 0, policy_version 630 (0.0018)
[2023-02-25 11:19:23,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2584576. Throughput: 0: 996.6. Samples: 643906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:19:23,705][00922] Avg episode reward: [(0, '19.377')]
[2023-02-25 11:19:28,697][00922] Fps is (10 sec: 4505.1, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 2605056. Throughput: 0: 1006.4. Samples: 651160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:19:28,700][00922] Avg episode reward: [(0, '21.548')]
[2023-02-25 11:19:33,639][10943] Updated weights for policy 0, policy_version 640 (0.0018)
[2023-02-25 11:19:33,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 2621440. Throughput: 0: 957.5. Samples: 655716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:19:33,701][00922] Avg episode reward: [(0, '22.427')]
[2023-02-25 11:19:33,708][10928] Saving new best policy, reward=22.427!
[2023-02-25 11:19:38,696][00922] Fps is (10 sec: 3277.1, 60 sec: 3959.5, 300 sec: 3776.6). Total num frames: 2637824. Throughput: 0: 958.7. Samples: 657958. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:19:38,699][00922] Avg episode reward: [(0, '21.923')]
[2023-02-25 11:19:43,347][10943] Updated weights for policy 0, policy_version 650 (0.0018)
[2023-02-25 11:19:43,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3804.4). Total num frames: 2662400. Throughput: 0: 1010.8. Samples: 664890. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:19:43,705][00922] Avg episode reward: [(0, '21.782')]
[2023-02-25 11:19:48,698][00922] Fps is (10 sec: 4504.8, 60 sec: 3959.3, 300 sec: 3818.3). Total num frames: 2682880. Throughput: 0: 1001.2. Samples: 671574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:19:48,707][00922] Avg episode reward: [(0, '21.466')]
[2023-02-25 11:19:53,696][00922] Fps is (10 sec: 3686.3, 60 sec: 3959.6, 300 sec: 3818.3). Total num frames: 2699264. Throughput: 0: 969.9. Samples: 673840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:19:53,701][00922] Avg episode reward: [(0, '20.202')]
[2023-02-25 11:19:54,842][10943] Updated weights for policy 0, policy_version 660 (0.0020)
[2023-02-25 11:19:58,696][00922] Fps is (10 sec: 3277.5, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 2715648. Throughput: 0: 960.8. Samples: 678536. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:19:58,699][00922] Avg episode reward: [(0, '20.823')]
[2023-02-25 11:20:03,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2740224. Throughput: 0: 1006.6. Samples: 685670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:20:03,699][00922] Avg episode reward: [(0, '20.816')]
[2023-02-25 11:20:03,938][10943] Updated weights for policy 0, policy_version 670 (0.0012)
[2023-02-25 11:20:08,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2760704. Throughput: 0: 1008.8. Samples: 689300. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:20:08,702][00922] Avg episode reward: [(0, '21.068')]
[2023-02-25 11:20:13,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2777088. Throughput: 0: 957.7. Samples: 694254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:20:13,701][00922] Avg episode reward: [(0, '20.730')]
[2023-02-25 11:20:15,704][10943] Updated weights for policy 0, policy_version 680 (0.0022)
[2023-02-25 11:20:18,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2797568. Throughput: 0: 974.2. Samples: 699554. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:20:18,699][00922] Avg episode reward: [(0, '19.410')]
[2023-02-25 11:20:23,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 2822144. Throughput: 0: 1001.7. Samples: 703036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:20:23,703][00922] Avg episode reward: [(0, '18.720')]
[2023-02-25 11:20:24,557][10943] Updated weights for policy 0, policy_version 690 (0.0020)
[2023-02-25 11:20:28,700][00922] Fps is (10 sec: 4094.3, 60 sec: 3891.0, 300 sec: 3873.8). Total num frames: 2838528. Throughput: 0: 1007.1. Samples: 710216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:20:28,708][00922] Avg episode reward: [(0, '18.957')]
[2023-02-25 11:20:33,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2850816. Throughput: 0: 940.1. Samples: 713878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:20:33,702][00922] Avg episode reward: [(0, '18.121')]
[2023-02-25 11:20:38,697][00922] Fps is (10 sec: 2458.5, 60 sec: 3754.6, 300 sec: 3804.4). Total num frames: 2863104. Throughput: 0: 928.8. Samples: 715638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:20:38,703][00922] Avg episode reward: [(0, '18.862')]
[2023-02-25 11:20:39,158][10943] Updated weights for policy 0, policy_version 700 (0.0015)
[2023-02-25 11:20:43,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 2879488. Throughput: 0: 911.9. Samples: 719570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:20:43,699][00922] Avg episode reward: [(0, '18.359')]
[2023-02-25 11:20:48,696][00922] Fps is (10 sec: 4096.2, 60 sec: 3686.5, 300 sec: 3846.1). Total num frames: 2904064. Throughput: 0: 915.7. Samples: 726878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:20:48,704][00922] Avg episode reward: [(0, '18.712')]
[2023-02-25 11:20:48,916][10943] Updated weights for policy 0, policy_version 710 (0.0031)
[2023-02-25 11:20:53,696][00922] Fps is (10 sec: 4505.5, 60 sec: 3754.7, 300 sec: 3846.1). Total num frames: 2924544. Throughput: 0: 914.5. Samples: 730454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:20:53,699][00922] Avg episode reward: [(0, '18.343')]
[2023-02-25 11:20:58,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2940928. Throughput: 0: 908.0. Samples: 735116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:20:58,700][00922] Avg episode reward: [(0, '18.436')]
[2023-02-25 11:21:00,773][10943] Updated weights for policy 0, policy_version 720 (0.0023)
[2023-02-25 11:21:03,697][00922] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 2961408. Throughput: 0: 913.1. Samples: 740644. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:21:03,705][00922] Avg episode reward: [(0, '18.040')]
[2023-02-25 11:21:08,697][00922] Fps is (10 sec: 4505.3, 60 sec: 3754.6, 300 sec: 3860.0). Total num frames: 2985984. Throughput: 0: 915.5. Samples: 744234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:21:08,699][00922] Avg episode reward: [(0, '19.935')]
[2023-02-25 11:21:09,214][10943] Updated weights for policy 0, policy_version 730 (0.0012)
[2023-02-25 11:21:13,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3006464. Throughput: 0: 907.3. Samples: 751042. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:21:13,701][00922] Avg episode reward: [(0, '20.787')]
[2023-02-25 11:21:18,696][00922] Fps is (10 sec: 3277.0, 60 sec: 3686.4, 300 sec: 3804.4). Total num frames: 3018752. Throughput: 0: 927.7. Samples: 755626. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:21:18,704][00922] Avg episode reward: [(0, '19.882')]
[2023-02-25 11:21:18,808][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000738_3022848.pth...
[2023-02-25 11:21:18,959][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000515_2109440.pth
[2023-02-25 11:21:21,518][10943] Updated weights for policy 0, policy_version 740 (0.0011)
[2023-02-25 11:21:23,696][00922] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 3039232. Throughput: 0: 941.9. Samples: 758024. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:21:23,698][00922] Avg episode reward: [(0, '22.550')]
[2023-02-25 11:21:23,701][10928] Saving new best policy, reward=22.550!
[2023-02-25 11:21:28,696][00922] Fps is (10 sec: 4505.7, 60 sec: 3754.9, 300 sec: 3846.1). Total num frames: 3063808. Throughput: 0: 1014.0. Samples: 765202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:21:28,699][00922] Avg episode reward: [(0, '23.001')]
[2023-02-25 11:21:28,712][10928] Saving new best policy, reward=23.001!
[2023-02-25 11:21:30,009][10943] Updated weights for policy 0, policy_version 750 (0.0030)
[2023-02-25 11:21:33,697][00922] Fps is (10 sec: 4505.1, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 3084288. Throughput: 0: 990.0. Samples: 771428. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:21:33,700][00922] Avg episode reward: [(0, '21.664')]
[2023-02-25 11:21:38,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3096576. Throughput: 0: 961.2. Samples: 773708. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 11:21:38,699][00922] Avg episode reward: [(0, '22.141')]
[2023-02-25 11:21:42,179][10943] Updated weights for policy 0, policy_version 760 (0.0018)
[2023-02-25 11:21:43,696][00922] Fps is (10 sec: 3277.2, 60 sec: 3959.5, 300 sec: 3818.3). Total num frames: 3117056. Throughput: 0: 974.6. Samples: 778972. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:21:43,700][00922] Avg episode reward: [(0, '22.451')]
[2023-02-25 11:21:48,696][00922] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 3145728. Throughput: 0: 1013.9. Samples: 786268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:21:48,699][00922] Avg episode reward: [(0, '23.426')]
[2023-02-25 11:21:48,709][10928] Saving new best policy, reward=23.426!
[2023-02-25 11:21:50,529][10943] Updated weights for policy 0, policy_version 770 (0.0015)
[2023-02-25 11:21:53,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3162112. Throughput: 0: 1012.1. Samples: 789780. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:21:53,703][00922] Avg episode reward: [(0, '23.056')]
[2023-02-25 11:21:58,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 3178496. Throughput: 0: 959.5. Samples: 794220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:21:58,702][00922] Avg episode reward: [(0, '23.797')]
[2023-02-25 11:21:58,719][10928] Saving new best policy, reward=23.797!
[2023-02-25 11:22:02,917][10943] Updated weights for policy 0, policy_version 780 (0.0016)
[2023-02-25 11:22:03,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3194880. Throughput: 0: 980.3. Samples: 799742. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:22:03,699][00922] Avg episode reward: [(0, '24.410')]
[2023-02-25 11:22:03,744][10928] Saving new best policy, reward=24.410!
[2023-02-25 11:22:08,696][00922] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3219456. Throughput: 0: 1004.8. Samples: 803242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:22:08,702][00922] Avg episode reward: [(0, '24.690')]
[2023-02-25 11:22:08,710][10928] Saving new best policy, reward=24.690!
[2023-02-25 11:22:11,790][10943] Updated weights for policy 0, policy_version 790 (0.0021)
[2023-02-25 11:22:13,696][00922] Fps is (10 sec: 4505.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3239936. Throughput: 0: 989.1. Samples: 809710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:22:13,699][00922] Avg episode reward: [(0, '23.349')]
[2023-02-25 11:22:18,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3252224. Throughput: 0: 950.2. Samples: 814188. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:22:18,704][00922] Avg episode reward: [(0, '23.364')]
[2023-02-25 11:22:23,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3272704. Throughput: 0: 955.2. Samples: 816690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:22:23,703][00922] Avg episode reward: [(0, '22.983')]
[2023-02-25 11:22:23,717][10943] Updated weights for policy 0, policy_version 800 (0.0040)
[2023-02-25 11:22:28,696][00922] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3297280. Throughput: 0: 996.7. Samples: 823824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:22:28,702][00922] Avg episode reward: [(0, '21.880')]
[2023-02-25 11:22:33,487][10943] Updated weights for policy 0, policy_version 810 (0.0015)
[2023-02-25 11:22:33,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.3, 300 sec: 3846.1). Total num frames: 3317760. Throughput: 0: 969.6. Samples: 829902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:22:33,700][00922] Avg episode reward: [(0, '22.516')]
[2023-02-25 11:22:38,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3330048. Throughput: 0: 941.2. Samples: 832132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:22:38,699][00922] Avg episode reward: [(0, '23.209')]
[2023-02-25 11:22:43,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3350528. Throughput: 0: 959.5. Samples: 837396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:22:43,709][00922] Avg episode reward: [(0, '22.783')]
[2023-02-25 11:22:44,628][10943] Updated weights for policy 0, policy_version 820 (0.0014)
[2023-02-25 11:22:48,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3887.7). Total num frames: 3375104. Throughput: 0: 994.1. Samples: 844478. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:22:48,706][00922] Avg episode reward: [(0, '24.733')]
[2023-02-25 11:22:48,720][10928] Saving new best policy, reward=24.733!
[2023-02-25 11:22:53,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3901.6). Total num frames: 3395584. Throughput: 0: 990.1. Samples: 847798. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:22:53,701][00922] Avg episode reward: [(0, '25.698')]
[2023-02-25 11:22:53,707][10928] Saving new best policy, reward=25.698!
[2023-02-25 11:22:54,946][10943] Updated weights for policy 0, policy_version 830 (0.0027)
[2023-02-25 11:22:58,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 3407872. Throughput: 0: 944.1. Samples: 852194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:22:58,699][00922] Avg episode reward: [(0, '26.884')]
[2023-02-25 11:22:58,715][10928] Saving new best policy, reward=26.884!
[2023-02-25 11:23:03,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 3428352. Throughput: 0: 968.4. Samples: 857766. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:23:03,703][00922] Avg episode reward: [(0, '28.436')]
[2023-02-25 11:23:03,709][10928] Saving new best policy, reward=28.436!
[2023-02-25 11:23:05,768][10943] Updated weights for policy 0, policy_version 840 (0.0012)
[2023-02-25 11:23:08,697][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3887.8). Total num frames: 3452928. Throughput: 0: 992.3. Samples: 861344. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:23:08,703][00922] Avg episode reward: [(0, '28.099')]
[2023-02-25 11:23:13,702][00922] Fps is (10 sec: 4093.8, 60 sec: 3822.6, 300 sec: 3887.7). Total num frames: 3469312. Throughput: 0: 977.6. Samples: 867822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:23:13,703][00922] Avg episode reward: [(0, '28.524')]
[2023-02-25 11:23:13,707][10928] Saving new best policy, reward=28.524!
[2023-02-25 11:23:17,481][10943] Updated weights for policy 0, policy_version 850 (0.0012)
[2023-02-25 11:23:18,696][00922] Fps is (10 sec: 2867.3, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 3481600. Throughput: 0: 921.3. Samples: 871360. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:23:18,699][00922] Avg episode reward: [(0, '28.451')]
[2023-02-25 11:23:18,710][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000850_3481600.pth...
[2023-02-25 11:23:18,905][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000625_2560000.pth
[2023-02-25 11:23:23,696][00922] Fps is (10 sec: 2458.9, 60 sec: 3686.4, 300 sec: 3818.3). Total num frames: 3493888. Throughput: 0: 911.1. Samples: 873130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:23:23,699][00922] Avg episode reward: [(0, '27.567')]
[2023-02-25 11:23:28,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3832.2). Total num frames: 3514368. Throughput: 0: 894.4. Samples: 877642. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:23:28,699][00922] Avg episode reward: [(0, '26.958')]
[2023-02-25 11:23:30,198][10943] Updated weights for policy 0, policy_version 860 (0.0014)
[2023-02-25 11:23:33,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3860.0). Total num frames: 3538944. Throughput: 0: 895.4. Samples: 884772. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:23:33,699][00922] Avg episode reward: [(0, '25.805')]
[2023-02-25 11:23:38,701][00922] Fps is (10 sec: 4094.2, 60 sec: 3754.4, 300 sec: 3832.2). Total num frames: 3555328. Throughput: 0: 882.7. Samples: 887524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:23:38,704][00922] Avg episode reward: [(0, '25.013')]
[2023-02-25 11:23:41,174][10943] Updated weights for policy 0, policy_version 870 (0.0016)
[2023-02-25 11:23:43,696][00922] Fps is (10 sec: 2867.2, 60 sec: 3618.1, 300 sec: 3804.4). Total num frames: 3567616. Throughput: 0: 886.5. Samples: 892084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:23:43,699][00922] Avg episode reward: [(0, '24.749')]
[2023-02-25 11:23:48,696][00922] Fps is (10 sec: 3688.1, 60 sec: 3618.1, 300 sec: 3832.2). Total num frames: 3592192. Throughput: 0: 899.3. Samples: 898236. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:23:48,699][00922] Avg episode reward: [(0, '24.609')]
[2023-02-25 11:23:51,191][10943] Updated weights for policy 0, policy_version 880 (0.0022)
[2023-02-25 11:23:53,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3618.1, 300 sec: 3846.1). Total num frames: 3612672. Throughput: 0: 898.0. Samples: 901752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:23:53,703][00922] Avg episode reward: [(0, '25.504')]
[2023-02-25 11:23:58,700][00922] Fps is (10 sec: 4094.6, 60 sec: 3754.5, 300 sec: 3832.1). Total num frames: 3633152. Throughput: 0: 889.1. Samples: 907828. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 11:23:58,702][00922] Avg episode reward: [(0, '25.385')]
[2023-02-25 11:24:02,693][10943] Updated weights for policy 0, policy_version 890 (0.0022)
[2023-02-25 11:24:03,697][00922] Fps is (10 sec: 3276.7, 60 sec: 3618.1, 300 sec: 3790.5). Total num frames: 3645440. Throughput: 0: 910.3. Samples: 912324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:24:03,703][00922] Avg episode reward: [(0, '25.431')]
[2023-02-25 11:24:08,696][00922] Fps is (10 sec: 3687.6, 60 sec: 3618.1, 300 sec: 3818.3). Total num frames: 3670016. Throughput: 0: 938.9. Samples: 915380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:24:08,699][00922] Avg episode reward: [(0, '24.900')]
[2023-02-25 11:24:11,877][10943] Updated weights for policy 0, policy_version 900 (0.0017)
[2023-02-25 11:24:13,696][00922] Fps is (10 sec: 4915.4, 60 sec: 3755.0, 300 sec: 3846.1). Total num frames: 3694592. Throughput: 0: 999.6. Samples: 922622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:24:13,704][00922] Avg episode reward: [(0, '25.591')]
[2023-02-25 11:24:18,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3710976. Throughput: 0: 966.0. Samples: 928240. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:24:18,699][00922] Avg episode reward: [(0, '24.345')]
[2023-02-25 11:24:23,493][10943] Updated weights for policy 0, policy_version 910 (0.0025)
[2023-02-25 11:24:23,697][00922] Fps is (10 sec: 3276.6, 60 sec: 3891.1, 300 sec: 3804.4). Total num frames: 3727360. Throughput: 0: 955.0. Samples: 930494. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 11:24:23,710][00922] Avg episode reward: [(0, '24.425')]
[2023-02-25 11:24:28,696][00922] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3747840. Throughput: 0: 986.3. Samples: 936468. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 11:24:28,698][00922] Avg episode reward: [(0, '22.920')]
[2023-02-25 11:24:32,401][10943] Updated weights for policy 0, policy_version 920 (0.0022)
[2023-02-25 11:24:33,698][00922] Fps is (10 sec: 4505.0, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 3772416. Throughput: 0: 1010.6. Samples: 943716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:24:33,701][00922] Avg episode reward: [(0, '21.190')]
[2023-02-25 11:24:38,701][00922] Fps is (10 sec: 4094.2, 60 sec: 3891.2, 300 sec: 3818.2). Total num frames: 3788800. Throughput: 0: 995.0. Samples: 946532. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:24:38,703][00922] Avg episode reward: [(0, '20.169')]
[2023-02-25 11:24:43,696][00922] Fps is (10 sec: 3277.5, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 3805184. Throughput: 0: 961.5. Samples: 951094. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 11:24:43,702][00922] Avg episode reward: [(0, '21.749')]
[2023-02-25 11:24:44,529][10943] Updated weights for policy 0, policy_version 930 (0.0023)
[2023-02-25 11:24:48,696][00922] Fps is (10 sec: 3688.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3825664. Throughput: 0: 1001.0. Samples: 957368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:24:48,699][00922] Avg episode reward: [(0, '23.769')]
[2023-02-25 11:24:53,133][10943] Updated weights for policy 0, policy_version 940 (0.0022)
[2023-02-25 11:24:53,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3850240. Throughput: 0: 1011.1. Samples: 960878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:24:53,702][00922] Avg episode reward: [(0, '25.907')]
[2023-02-25 11:24:58,696][00922] Fps is (10 sec: 4095.9, 60 sec: 3891.4, 300 sec: 3818.3). Total num frames: 3866624. Throughput: 0: 980.4. Samples: 966742. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:24:58,704][00922] Avg episode reward: [(0, '25.893')]
[2023-02-25 11:25:03,696][00922] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3804.4). Total num frames: 3883008. Throughput: 0: 956.0. Samples: 971262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:25:03,702][00922] Avg episode reward: [(0, '25.626')]
[2023-02-25 11:25:05,396][10943] Updated weights for policy 0, policy_version 950 (0.0030)
[2023-02-25 11:25:08,700][00922] Fps is (10 sec: 3685.3, 60 sec: 3891.0, 300 sec: 3818.3). Total num frames: 3903488. Throughput: 0: 976.3. Samples: 974432. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:25:08,708][00922] Avg episode reward: [(0, '25.768')]
[2023-02-25 11:25:13,696][00922] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3928064. Throughput: 0: 1005.2. Samples: 981704. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:25:13,698][00922] Avg episode reward: [(0, '23.835')]
[2023-02-25 11:25:13,721][10943] Updated weights for policy 0, policy_version 960 (0.0026)
[2023-02-25 11:25:18,696][00922] Fps is (10 sec: 4097.3, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 3944448. Throughput: 0: 968.9. Samples: 987316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 11:25:18,703][00922] Avg episode reward: [(0, '22.216')]
[2023-02-25 11:25:18,728][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000964_3948544.pth...
[2023-02-25 11:25:18,907][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000738_3022848.pth
[2023-02-25 11:25:23,700][00922] Fps is (10 sec: 3275.7, 60 sec: 3891.0, 300 sec: 3804.4). Total num frames: 3960832. Throughput: 0: 955.7. Samples: 989536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 11:25:23,705][00922] Avg episode reward: [(0, '22.525')]
[2023-02-25 11:25:25,688][10943] Updated weights for policy 0, policy_version 970 (0.0015)
[2023-02-25 11:25:28,696][00922] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3985408. Throughput: 0: 987.5. Samples: 995530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 11:25:28,703][00922] Avg episode reward: [(0, '22.980')]
[2023-02-25 11:25:32,596][10928] Stopping Batcher_0...
[2023-02-25 11:25:32,598][10928] Loop batcher_evt_loop terminating...
[2023-02-25 11:25:32,598][00922] Component Batcher_0 stopped!
[2023-02-25 11:25:32,598][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 11:25:32,650][00922] Component RolloutWorker_w3 stopped!
[2023-02-25 11:25:32,658][00922] Component RolloutWorker_w7 stopped!
[2023-02-25 11:25:32,664][00922] Component RolloutWorker_w2 stopped!
[2023-02-25 11:25:32,666][10944] Stopping RolloutWorker_w2...
[2023-02-25 11:25:32,658][10950] Stopping RolloutWorker_w7...
[2023-02-25 11:25:32,656][10945] Stopping RolloutWorker_w3...
[2023-02-25 11:25:32,669][00922] Component RolloutWorker_w6 stopped!
[2023-02-25 11:25:32,675][00922] Component RolloutWorker_w0 stopped!
[2023-02-25 11:25:32,677][10942] Stopping RolloutWorker_w0...
[2023-02-25 11:25:32,678][10942] Loop rollout_proc0_evt_loop terminating...
[2023-02-25 11:25:32,672][10949] Stopping RolloutWorker_w6...
[2023-02-25 11:25:32,678][10949] Loop rollout_proc6_evt_loop terminating...
[2023-02-25 11:25:32,682][00922] Component RolloutWorker_w4 stopped!
[2023-02-25 11:25:32,683][10947] Stopping RolloutWorker_w4...
[2023-02-25 11:25:32,684][10950] Loop rollout_proc7_evt_loop terminating...
[2023-02-25 11:25:32,684][10947] Loop rollout_proc4_evt_loop terminating...
[2023-02-25 11:25:32,692][10946] Stopping RolloutWorker_w1...
[2023-02-25 11:25:32,692][00922] Component RolloutWorker_w1 stopped!
[2023-02-25 11:25:32,697][10943] Weights refcount: 2 0
[2023-02-25 11:25:32,690][10944] Loop rollout_proc2_evt_loop terminating...
[2023-02-25 11:25:32,671][10945] Loop rollout_proc3_evt_loop terminating...
[2023-02-25 11:25:32,704][10946] Loop rollout_proc1_evt_loop terminating...
[2023-02-25 11:25:32,706][10943] Stopping InferenceWorker_p0-w0...
[2023-02-25 11:25:32,707][00922] Component InferenceWorker_p0-w0 stopped!
[2023-02-25 11:25:32,712][10943] Loop inference_proc0-0_evt_loop terminating...
[2023-02-25 11:25:32,755][10928] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000850_3481600.pth
[2023-02-25 11:25:32,759][10948] Stopping RolloutWorker_w5...
[2023-02-25 11:25:32,759][00922] Component RolloutWorker_w5 stopped!
[2023-02-25 11:25:32,760][10948] Loop rollout_proc5_evt_loop terminating...
[2023-02-25 11:25:32,769][10928] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 11:25:32,966][10928] Stopping LearnerWorker_p0...
[2023-02-25 11:25:32,968][10928] Loop learner_proc0_evt_loop terminating...
[2023-02-25 11:25:32,966][00922] Component LearnerWorker_p0 stopped!
[2023-02-25 11:25:32,973][00922] Waiting for process learner_proc0 to stop...
[2023-02-25 11:25:34,914][00922] Waiting for process inference_proc0-0 to join...
[2023-02-25 11:25:35,463][00922] Waiting for process rollout_proc0 to join...
[2023-02-25 11:25:35,525][00922] Waiting for process rollout_proc1 to join...
[2023-02-25 11:25:36,228][00922] Waiting for process rollout_proc2 to join...
[2023-02-25 11:25:36,236][00922] Waiting for process rollout_proc3 to join...
[2023-02-25 11:25:36,238][00922] Waiting for process rollout_proc4 to join...
[2023-02-25 11:25:36,241][00922] Waiting for process rollout_proc5 to join...
[2023-02-25 11:25:36,242][00922] Waiting for process rollout_proc6 to join...
[2023-02-25 11:25:36,243][00922] Waiting for process rollout_proc7 to join...
[2023-02-25 11:25:36,245][00922] Batcher 0 profile tree view:
batching: 25.3283, releasing_batches: 0.0214
[2023-02-25 11:25:36,247][00922] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 517.2347
update_model: 7.6469
weight_update: 0.0026
one_step: 0.0023
handle_policy_step: 495.5379
deserialize: 13.9862, stack: 2.8855, obs_to_device_normalize: 112.0871, forward: 234.9928, send_messages: 25.4744
prepare_outputs: 81.0255
to_cpu: 50.8060
[2023-02-25 11:25:36,249][00922] Learner 0 profile tree view:
misc: 0.0083, prepare_batch: 16.8646
train: 75.2030
epoch_init: 0.0062, minibatch_init: 0.0065, losses_postprocess: 0.6230, kl_divergence: 0.5826, after_optimizer: 32.6935
calculate_losses: 26.4108
losses_init: 0.0036, forward_head: 1.7095, bptt_initial: 17.5575, tail: 1.0525, advantages_returns: 0.2275, losses: 3.3693
bptt: 2.1509
bptt_forward_core: 2.0627
update: 14.2731
clip: 1.4126
[2023-02-25 11:25:36,252][00922] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3730, enqueue_policy_requests: 136.7399, env_step: 799.3576, overhead: 19.5141, complete_rollouts: 6.9353
save_policy_outputs: 19.3820
split_output_tensors: 9.5286
[2023-02-25 11:25:36,253][00922] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3206, enqueue_policy_requests: 140.8423, env_step: 797.2201, overhead: 19.1502, complete_rollouts: 6.5918
save_policy_outputs: 19.3957
split_output_tensors: 9.3063
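The profile trees above are nested named timers accumulated per component; reading them, env_step dominates the rollout workers (~799 s) and the policy forward pass dominates the inference worker (~235 s of handle_policy_step). A minimal sketch of how such a timing tree can be accumulated with a context manager (illustrative only, not necessarily matching Sample Factory's own timing utility):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # timer name -> accumulated seconds

@contextmanager
def timer(name: str):
    """Accumulate wall-clock time under `name`, like the profile entries."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start

# usage: wrap the hot sections, then dump `timings` at shutdown
with timer("env_step"):
    time.sleep(0.01)  # stand-in for env.step(actions)
print(dict(timings))
```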
[2023-02-25 11:25:36,255][00922] Loop Runner_EvtLoop terminating...
[2023-02-25 11:25:36,257][00922] Runner profile tree view:
main_loop: 1090.5419
[2023-02-25 11:25:36,259][00922] Collected {0: 4005888}, FPS: 3673.3
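The final throughput line is consistent with the runner profile above: total collected environment frames divided by main-loop wall time reproduces the reported FPS.

```python
# Sanity check: the reported FPS is total frames over main-loop seconds.
total_frames = 4_005_888        # "Collected {0: 4005888}"
main_loop_seconds = 1090.5419   # "main_loop: 1090.5419" in the runner profile
print(f"{total_frames / main_loop_seconds:.1f}")  # 3673.3, matching the log
```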
[2023-02-25 11:25:36,420][00922] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 11:25:36,423][00922] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 11:25:36,425][00922] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 11:25:36,427][00922] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 11:25:36,429][00922] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 11:25:36,430][00922] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 11:25:36,432][00922] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 11:25:36,434][00922] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 11:25:36,437][00922] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-25 11:25:36,438][00922] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-25 11:25:36,439][00922] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 11:25:36,442][00922] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 11:25:36,443][00922] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 11:25:36,444][00922] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 11:25:36,445][00922] Using frameskip 1 and render_action_repeat=4 for evaluation
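With frameskip 1, the environment itself does not skip rendered frames; instead, render_action_repeat=4 repeats each chosen action for four frames during evaluation. A minimal action-repeat wrapper sketch, assuming a Gym-style step() API (illustrative only, not the exact wrapper Sample Factory applies to ViZDoom):

```python
# Hedged sketch: repeat each action `repeat` times, summing rewards.
class ActionRepeat:
    def __init__(self, env, repeat: int = 4):
        self.env, self.repeat = env, repeat

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total_reward += reward
            if terminated or truncated:
                break  # stop repeating once the episode ends
        return obs, total_reward, terminated, truncated, info
```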
[2023-02-25 11:25:36,491][00922] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 11:25:36,498][00922] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 11:25:36,501][00922] RunningMeanStd input shape: (1,)
[2023-02-25 11:25:36,530][00922] ConvEncoder: input_channels=3
[2023-02-25 11:25:37,313][00922] Conv encoder output size: 512
[2023-02-25 11:25:37,318][00922] Policy head output size: 512
[2023-02-25 11:25:39,997][00922] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
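Evaluation restores the actor-critic weights from the latest .pth checkpoint before rolling out episodes. A minimal sketch of such a restore, assuming a standard PyTorch checkpoint dict with a "model" key (the key layout is an assumption here, not confirmed by the log):

```python
import torch

# Hedged sketch: load a training checkpoint and restore model weights.
ckpt_path = ("/content/train_dir/default_experiment/checkpoint_p0/"
             "checkpoint_000000978_4005888.pth")
checkpoint = torch.load(ckpt_path, map_location="cpu")
# actor_critic.load_state_dict(checkpoint["model"])  # model construction omitted
```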
[2023-02-25 11:25:41,243][00922] Num frames 100...
[2023-02-25 11:25:41,361][00922] Num frames 200...
[2023-02-25 11:25:41,474][00922] Num frames 300...
[2023-02-25 11:25:41,582][00922] Num frames 400...
[2023-02-25 11:25:41,698][00922] Num frames 500...
[2023-02-25 11:25:41,815][00922] Num frames 600...
[2023-02-25 11:25:41,936][00922] Num frames 700...
[2023-02-25 11:25:42,049][00922] Num frames 800...
[2023-02-25 11:25:42,169][00922] Num frames 900...
[2023-02-25 11:25:42,285][00922] Num frames 1000...
[2023-02-25 11:25:42,412][00922] Num frames 1100...
[2023-02-25 11:25:42,524][00922] Num frames 1200...
[2023-02-25 11:25:42,645][00922] Num frames 1300...
[2023-02-25 11:25:42,759][00922] Num frames 1400...
[2023-02-25 11:25:42,883][00922] Num frames 1500...
[2023-02-25 11:25:43,000][00922] Num frames 1600...
[2023-02-25 11:25:43,101][00922] Avg episode rewards: #0: 47.349, true rewards: #0: 16.350
[2023-02-25 11:25:43,103][00922] Avg episode reward: 47.349, avg true_objective: 16.350
[2023-02-25 11:25:43,179][00922] Num frames 1700...
[2023-02-25 11:25:43,302][00922] Num frames 1800...
[2023-02-25 11:25:43,416][00922] Num frames 1900...
[2023-02-25 11:25:43,527][00922] Num frames 2000...
[2023-02-25 11:25:43,641][00922] Num frames 2100...
[2023-02-25 11:25:43,756][00922] Num frames 2200...
[2023-02-25 11:25:43,881][00922] Num frames 2300...
[2023-02-25 11:25:44,019][00922] Avg episode rewards: #0: 31.355, true rewards: #0: 11.855
[2023-02-25 11:25:44,021][00922] Avg episode reward: 31.355, avg true_objective: 11.855
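The "Avg episode rewards" line is a running mean over the episodes completed so far, so per-episode values can be recovered from consecutive averages:

```python
# Recover episode 2's reward from the running means logged above.
avg_after_1, avg_after_2 = 47.349, 31.355
ep2_reward = 2 * avg_after_2 - avg_after_1
print(round(ep2_reward, 3))  # 15.361
# Same for the true objective: 2 * 11.855 - 16.350 = 7.360
```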
[2023-02-25 11:25:44,056][00922] Num frames 2400...
[2023-02-25 11:25:44,169][00922] Num frames 2500...
[2023-02-25 11:25:44,283][00922] Num frames 2600...
[2023-02-25 11:25:44,408][00922] Num frames 2700...
[2023-02-25 11:25:44,526][00922] Num frames 2800...
[2023-02-25 11:25:44,648][00922] Num frames 2900...
[2023-02-25 11:25:44,763][00922] Num frames 3000...
[2023-02-25 11:25:44,885][00922] Num frames 3100...
[2023-02-25 11:25:45,005][00922] Num frames 3200...
[2023-02-25 11:25:45,127][00922] Num frames 3300...
[2023-02-25 11:25:45,243][00922] Num frames 3400...
[2023-02-25 11:25:45,367][00922] Num frames 3500...
[2023-02-25 11:25:45,483][00922] Num frames 3600...
[2023-02-25 11:25:45,602][00922] Num frames 3700...
[2023-02-25 11:25:45,720][00922] Num frames 3800...
[2023-02-25 11:25:45,837][00922] Num frames 3900...
[2023-02-25 11:25:45,936][00922] Avg episode rewards: #0: 31.463, true rewards: #0: 13.130
[2023-02-25 11:25:45,940][00922] Avg episode reward: 31.463, avg true_objective: 13.130
[2023-02-25 11:25:46,009][00922] Num frames 4000...
[2023-02-25 11:25:46,124][00922] Num frames 4100...
[2023-02-25 11:25:46,237][00922] Num frames 4200...
[2023-02-25 11:25:46,357][00922] Num frames 4300...
[2023-02-25 11:25:46,474][00922] Num frames 4400...
[2023-02-25 11:25:46,594][00922] Num frames 4500...
[2023-02-25 11:25:46,704][00922] Num frames 4600...
[2023-02-25 11:25:46,821][00922] Num frames 4700...
[2023-02-25 11:25:46,934][00922] Num frames 4800...
[2023-02-25 11:25:47,055][00922] Num frames 4900...
[2023-02-25 11:25:47,217][00922] Avg episode rewards: #0: 29.488, true rewards: #0: 12.487
[2023-02-25 11:25:47,222][00922] Avg episode reward: 29.488, avg true_objective: 12.487
[2023-02-25 11:25:47,233][00922] Num frames 5000...
[2023-02-25 11:25:47,349][00922] Num frames 5100...
[2023-02-25 11:25:47,473][00922] Num frames 5200...
[2023-02-25 11:25:47,591][00922] Num frames 5300...
[2023-02-25 11:25:47,708][00922] Num frames 5400...
[2023-02-25 11:25:47,820][00922] Num frames 5500...
[2023-02-25 11:25:47,940][00922] Num frames 5600...
[2023-02-25 11:25:48,060][00922] Num frames 5700...
[2023-02-25 11:25:48,178][00922] Num frames 5800...
[2023-02-25 11:25:48,291][00922] Num frames 5900...
[2023-02-25 11:25:48,416][00922] Num frames 6000...
[2023-02-25 11:25:48,528][00922] Num frames 6100...
[2023-02-25 11:25:48,685][00922] Num frames 6200...
[2023-02-25 11:25:48,853][00922] Num frames 6300...
[2023-02-25 11:25:49,016][00922] Num frames 6400...
[2023-02-25 11:25:49,174][00922] Num frames 6500...
[2023-02-25 11:25:49,329][00922] Num frames 6600...
[2023-02-25 11:25:49,489][00922] Num frames 6700...
[2023-02-25 11:25:49,646][00922] Num frames 6800...
[2023-02-25 11:25:49,809][00922] Num frames 6900...
[2023-02-25 11:25:49,985][00922] Num frames 7000...
[2023-02-25 11:25:50,198][00922] Avg episode rewards: #0: 36.390, true rewards: #0: 14.190
[2023-02-25 11:25:50,203][00922] Avg episode reward: 36.390, avg true_objective: 14.190
[2023-02-25 11:25:50,217][00922] Num frames 7100...
[2023-02-25 11:25:50,374][00922] Num frames 7200...
[2023-02-25 11:25:50,541][00922] Num frames 7300...
[2023-02-25 11:25:50,700][00922] Num frames 7400...
[2023-02-25 11:25:50,866][00922] Num frames 7500...
[2023-02-25 11:25:51,033][00922] Num frames 7600...
[2023-02-25 11:25:51,106][00922] Avg episode rewards: #0: 31.845, true rewards: #0: 12.678
[2023-02-25 11:25:51,108][00922] Avg episode reward: 31.845, avg true_objective: 12.678
[2023-02-25 11:25:51,262][00922] Num frames 7700...
[2023-02-25 11:25:51,430][00922] Num frames 7800...
[2023-02-25 11:25:51,605][00922] Num frames 7900...
[2023-02-25 11:25:51,781][00922] Num frames 8000...
[2023-02-25 11:25:51,940][00922] Num frames 8100...
[2023-02-25 11:25:52,104][00922] Num frames 8200...
[2023-02-25 11:25:52,252][00922] Num frames 8300...
[2023-02-25 11:25:52,366][00922] Num frames 8400...
[2023-02-25 11:25:52,482][00922] Num frames 8500...
[2023-02-25 11:25:52,601][00922] Num frames 8600...
[2023-02-25 11:25:52,719][00922] Num frames 8700...
[2023-02-25 11:25:52,839][00922] Avg episode rewards: #0: 30.798, true rewards: #0: 12.513
[2023-02-25 11:25:52,841][00922] Avg episode reward: 30.798, avg true_objective: 12.513
[2023-02-25 11:25:52,890][00922] Num frames 8800...
[2023-02-25 11:25:53,002][00922] Num frames 8900...
[2023-02-25 11:25:53,109][00922] Num frames 9000...
[2023-02-25 11:25:53,222][00922] Num frames 9100...
[2023-02-25 11:25:53,331][00922] Num frames 9200...
[2023-02-25 11:25:53,447][00922] Num frames 9300...
[2023-02-25 11:25:53,564][00922] Num frames 9400...
[2023-02-25 11:25:53,672][00922] Num frames 9500...
[2023-02-25 11:25:53,793][00922] Num frames 9600...
[2023-02-25 11:25:53,906][00922] Num frames 9700...
[2023-02-25 11:25:54,026][00922] Num frames 9800...
[2023-02-25 11:25:54,136][00922] Num frames 9900...
[2023-02-25 11:25:54,250][00922] Num frames 10000...
[2023-02-25 11:25:54,360][00922] Num frames 10100...
[2023-02-25 11:25:54,482][00922] Num frames 10200...
[2023-02-25 11:25:54,601][00922] Num frames 10300...
[2023-02-25 11:25:54,717][00922] Num frames 10400...
[2023-02-25 11:25:54,828][00922] Num frames 10500...
[2023-02-25 11:25:54,941][00922] Num frames 10600...
[2023-02-25 11:25:55,055][00922] Num frames 10700...
[2023-02-25 11:25:55,174][00922] Num frames 10800...
[2023-02-25 11:25:55,294][00922] Avg episode rewards: #0: 34.198, true rewards: #0: 13.574
[2023-02-25 11:25:55,296][00922] Avg episode reward: 34.198, avg true_objective: 13.574
[2023-02-25 11:25:55,345][00922] Num frames 10900...
[2023-02-25 11:25:55,462][00922] Num frames 11000...
[2023-02-25 11:25:55,575][00922] Num frames 11100...
[2023-02-25 11:25:55,698][00922] Num frames 11200...
[2023-02-25 11:25:55,810][00922] Num frames 11300...
[2023-02-25 11:25:55,927][00922] Num frames 11400...
[2023-02-25 11:25:56,036][00922] Num frames 11500...
[2023-02-25 11:25:56,150][00922] Num frames 11600...
[2023-02-25 11:25:56,268][00922] Num frames 11700...
[2023-02-25 11:25:56,380][00922] Num frames 11800...
[2023-02-25 11:25:56,507][00922] Num frames 11900...
[2023-02-25 11:25:56,634][00922] Num frames 12000...
[2023-02-25 11:25:56,762][00922] Num frames 12100...
[2023-02-25 11:25:56,889][00922] Num frames 12200...
[2023-02-25 11:25:57,015][00922] Num frames 12300...
[2023-02-25 11:25:57,131][00922] Num frames 12400...
[2023-02-25 11:25:57,284][00922] Num frames 12500...
[2023-02-25 11:25:57,447][00922] Num frames 12600...
[2023-02-25 11:25:57,633][00922] Avg episode rewards: #0: 35.981, true rewards: #0: 14.092
[2023-02-25 11:25:57,635][00922] Avg episode reward: 35.981, avg true_objective: 14.092
[2023-02-25 11:25:57,670][00922] Num frames 12700...
[2023-02-25 11:25:57,828][00922] Num frames 12800...
[2023-02-25 11:25:57,986][00922] Num frames 12900...
[2023-02-25 11:25:58,139][00922] Num frames 13000...
[2023-02-25 11:25:58,296][00922] Num frames 13100...
[2023-02-25 11:25:58,452][00922] Num frames 13200...
[2023-02-25 11:25:58,614][00922] Num frames 13300...
[2023-02-25 11:25:58,696][00922] Avg episode rewards: #0: 33.715, true rewards: #0: 13.315
[2023-02-25 11:25:58,700][00922] Avg episode reward: 33.715, avg true_objective: 13.315
[2023-02-25 11:27:19,205][00922] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-25 11:27:19,962][00922] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 11:27:19,966][00922] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 11:27:19,970][00922] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 11:27:19,972][00922] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 11:27:19,975][00922] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 11:27:19,977][00922] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 11:27:19,978][00922] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-25 11:27:19,979][00922] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 11:27:19,982][00922] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-25 11:27:19,983][00922] Adding new argument 'hf_repository'='OliP/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-25 11:27:19,984][00922] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 11:27:19,985][00922] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 11:27:19,987][00922] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 11:27:19,988][00922] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 11:27:19,989][00922] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 11:27:20,020][00922] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 11:27:20,022][00922] RunningMeanStd input shape: (1,)
[2023-02-25 11:27:20,046][00922] ConvEncoder: input_channels=3
[2023-02-25 11:27:20,102][00922] Conv encoder output size: 512
[2023-02-25 11:27:20,104][00922] Policy head output size: 512
[2023-02-25 11:27:20,131][00922] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 11:27:20,653][00922] Num frames 100...
[2023-02-25 11:27:20,768][00922] Num frames 200...
[2023-02-25 11:27:20,880][00922] Num frames 300...
[2023-02-25 11:27:21,009][00922] Num frames 400...
[2023-02-25 11:27:21,132][00922] Num frames 500...
[2023-02-25 11:27:21,239][00922] Num frames 600...
[2023-02-25 11:27:21,355][00922] Num frames 700...
[2023-02-25 11:27:21,465][00922] Num frames 800...
[2023-02-25 11:27:21,596][00922] Avg episode rewards: #0: 19.700, true rewards: #0: 8.700
[2023-02-25 11:27:21,598][00922] Avg episode reward: 19.700, avg true_objective: 8.700
[2023-02-25 11:27:21,633][00922] Num frames 900...
[2023-02-25 11:27:21,743][00922] Num frames 1000...
[2023-02-25 11:27:21,864][00922] Num frames 1100...
[2023-02-25 11:27:21,976][00922] Num frames 1200...
[2023-02-25 11:27:22,093][00922] Num frames 1300...
[2023-02-25 11:27:22,209][00922] Num frames 1400...
[2023-02-25 11:27:22,320][00922] Num frames 1500...
[2023-02-25 11:27:22,442][00922] Num frames 1600...
[2023-02-25 11:27:22,536][00922] Avg episode rewards: #0: 19.125, true rewards: #0: 8.125
[2023-02-25 11:27:22,537][00922] Avg episode reward: 19.125, avg true_objective: 8.125
[2023-02-25 11:27:22,623][00922] Num frames 1700...
[2023-02-25 11:27:22,739][00922] Num frames 1800...
[2023-02-25 11:27:22,855][00922] Num frames 1900...
[2023-02-25 11:27:22,972][00922] Num frames 2000...
[2023-02-25 11:27:23,088][00922] Num frames 2100...
[2023-02-25 11:27:23,205][00922] Num frames 2200...
[2023-02-25 11:27:23,313][00922] Num frames 2300...
[2023-02-25 11:27:23,426][00922] Num frames 2400...
[2023-02-25 11:27:23,535][00922] Num frames 2500...
[2023-02-25 11:27:23,649][00922] Num frames 2600...
[2023-02-25 11:27:23,761][00922] Num frames 2700...
[2023-02-25 11:27:23,874][00922] Num frames 2800...
[2023-02-25 11:27:23,994][00922] Num frames 2900...
[2023-02-25 11:27:24,115][00922] Num frames 3000...
[2023-02-25 11:27:24,233][00922] Num frames 3100...
[2023-02-25 11:27:24,346][00922] Num frames 3200...
[2023-02-25 11:27:24,513][00922] Avg episode rewards: #0: 27.637, true rewards: #0: 10.970
[2023-02-25 11:27:24,515][00922] Avg episode reward: 27.637, avg true_objective: 10.970
[2023-02-25 11:27:24,530][00922] Num frames 3300...
[2023-02-25 11:27:24,639][00922] Num frames 3400...
[2023-02-25 11:27:24,762][00922] Num frames 3500...
[2023-02-25 11:27:24,873][00922] Num frames 3600...
[2023-02-25 11:27:24,990][00922] Num frames 3700...
[2023-02-25 11:27:25,105][00922] Num frames 3800...
[2023-02-25 11:27:25,222][00922] Num frames 3900...
[2023-02-25 11:27:25,329][00922] Num frames 4000...
[2023-02-25 11:27:25,443][00922] Num frames 4100...
[2023-02-25 11:27:25,555][00922] Num frames 4200...
[2023-02-25 11:27:25,666][00922] Num frames 4300...
[2023-02-25 11:27:25,778][00922] Num frames 4400...
[2023-02-25 11:27:25,893][00922] Num frames 4500...
[2023-02-25 11:27:26,009][00922] Num frames 4600...
[2023-02-25 11:27:26,130][00922] Num frames 4700...
[2023-02-25 11:27:26,252][00922] Num frames 4800...
[2023-02-25 11:27:26,363][00922] Num frames 4900...
[2023-02-25 11:27:26,482][00922] Num frames 5000...
[2023-02-25 11:27:26,595][00922] Num frames 5100...
[2023-02-25 11:27:26,717][00922] Num frames 5200...
[2023-02-25 11:27:26,833][00922] Num frames 5300...
[2023-02-25 11:27:26,999][00922] Avg episode rewards: #0: 34.977, true rewards: #0: 13.478
[2023-02-25 11:27:27,001][00922] Avg episode reward: 34.977, avg true_objective: 13.478
[2023-02-25 11:27:27,017][00922] Num frames 5400...
[2023-02-25 11:27:27,172][00922] Num frames 5500...
[2023-02-25 11:27:27,326][00922] Num frames 5600...
[2023-02-25 11:27:27,485][00922] Num frames 5700...
[2023-02-25 11:27:27,644][00922] Num frames 5800...
[2023-02-25 11:27:27,798][00922] Num frames 5900...
[2023-02-25 11:27:27,957][00922] Num frames 6000...
[2023-02-25 11:27:28,116][00922] Num frames 6100...
[2023-02-25 11:27:28,282][00922] Num frames 6200...
[2023-02-25 11:27:28,445][00922] Num frames 6300...
[2023-02-25 11:27:28,604][00922] Num frames 6400...
[2023-02-25 11:27:28,764][00922] Num frames 6500...
[2023-02-25 11:27:28,939][00922] Avg episode rewards: #0: 33.550, true rewards: #0: 13.150
[2023-02-25 11:27:28,942][00922] Avg episode reward: 33.550, avg true_objective: 13.150
[2023-02-25 11:27:28,989][00922] Num frames 6600...
[2023-02-25 11:27:29,151][00922] Num frames 6700...
[2023-02-25 11:27:29,314][00922] Num frames 6800...
[2023-02-25 11:27:29,474][00922] Num frames 6900...
[2023-02-25 11:27:29,633][00922] Num frames 7000...
[2023-02-25 11:27:29,795][00922] Num frames 7100...
[2023-02-25 11:27:29,953][00922] Num frames 7200...
[2023-02-25 11:27:30,090][00922] Num frames 7300...
[2023-02-25 11:27:30,206][00922] Num frames 7400...
[2023-02-25 11:27:30,323][00922] Num frames 7500...
[2023-02-25 11:27:30,448][00922] Num frames 7600...
[2023-02-25 11:27:30,570][00922] Num frames 7700...
[2023-02-25 11:27:30,687][00922] Num frames 7800...
[2023-02-25 11:27:30,802][00922] Num frames 7900...
[2023-02-25 11:27:30,924][00922] Num frames 8000...
[2023-02-25 11:27:31,038][00922] Num frames 8100...
[2023-02-25 11:27:31,155][00922] Num frames 8200...
[2023-02-25 11:27:31,275][00922] Num frames 8300...
[2023-02-25 11:27:31,394][00922] Num frames 8400...
[2023-02-25 11:27:31,506][00922] Num frames 8500...
[2023-02-25 11:27:31,664][00922] Avg episode rewards: #0: 36.651, true rewards: #0: 14.318
[2023-02-25 11:27:31,666][00922] Avg episode reward: 36.651, avg true_objective: 14.318
[2023-02-25 11:27:31,681][00922] Num frames 8600...
[2023-02-25 11:27:31,796][00922] Num frames 8700...
[2023-02-25 11:27:31,921][00922] Num frames 8800...
[2023-02-25 11:27:32,042][00922] Num frames 8900...
[2023-02-25 11:27:32,162][00922] Num frames 9000...
[2023-02-25 11:27:32,282][00922] Num frames 9100...
[2023-02-25 11:27:32,403][00922] Num frames 9200...
[2023-02-25 11:27:32,516][00922] Num frames 9300...
[2023-02-25 11:27:32,636][00922] Num frames 9400...
[2023-02-25 11:27:32,748][00922] Num frames 9500...
[2023-02-25 11:27:32,867][00922] Num frames 9600...
[2023-02-25 11:27:32,982][00922] Num frames 9700...
[2023-02-25 11:27:33,098][00922] Num frames 9800...
[2023-02-25 11:27:33,212][00922] Num frames 9900...
[2023-02-25 11:27:33,338][00922] Num frames 10000...
[2023-02-25 11:27:33,452][00922] Num frames 10100...
[2023-02-25 11:27:33,570][00922] Num frames 10200...
[2023-02-25 11:27:33,682][00922] Num frames 10300...
[2023-02-25 11:27:33,808][00922] Num frames 10400...
[2023-02-25 11:27:33,922][00922] Num frames 10500...
[2023-02-25 11:27:34,037][00922] Num frames 10600...
[2023-02-25 11:27:34,194][00922] Avg episode rewards: #0: 39.558, true rewards: #0: 15.273
[2023-02-25 11:27:34,196][00922] Avg episode reward: 39.558, avg true_objective: 15.273
[2023-02-25 11:27:34,210][00922] Num frames 10700...
[2023-02-25 11:27:34,326][00922] Num frames 10800...
[2023-02-25 11:27:34,444][00922] Num frames 10900...
[2023-02-25 11:27:34,553][00922] Num frames 11000...
[2023-02-25 11:27:34,668][00922] Num frames 11100...
[2023-02-25 11:27:34,777][00922] Num frames 11200...
[2023-02-25 11:27:34,893][00922] Num frames 11300...
[2023-02-25 11:27:35,007][00922] Num frames 11400...
[2023-02-25 11:27:35,125][00922] Num frames 11500...
[2023-02-25 11:27:35,237][00922] Num frames 11600...
[2023-02-25 11:27:35,354][00922] Num frames 11700...
[2023-02-25 11:27:35,469][00922] Num frames 11800...
[2023-02-25 11:27:35,585][00922] Avg episode rewards: #0: 38.192, true rewards: #0: 14.818
[2023-02-25 11:27:35,587][00922] Avg episode reward: 38.192, avg true_objective: 14.818
[2023-02-25 11:27:35,641][00922] Num frames 11900...
[2023-02-25 11:27:35,754][00922] Num frames 12000...
[2023-02-25 11:27:35,872][00922] Num frames 12100...
[2023-02-25 11:27:35,988][00922] Num frames 12200...
[2023-02-25 11:27:36,107][00922] Num frames 12300...
[2023-02-25 11:27:36,227][00922] Num frames 12400...
[2023-02-25 11:27:36,342][00922] Num frames 12500...
[2023-02-25 11:27:36,455][00922] Num frames 12600...
[2023-02-25 11:27:36,565][00922] Num frames 12700...
[2023-02-25 11:27:36,678][00922] Num frames 12800...
[2023-02-25 11:27:36,820][00922] Avg episode rewards: #0: 36.197, true rewards: #0: 14.309
[2023-02-25 11:27:36,822][00922] Avg episode reward: 36.197, avg true_objective: 14.309
[2023-02-25 11:27:36,858][00922] Num frames 12900...
[2023-02-25 11:27:36,984][00922] Num frames 13000...
[2023-02-25 11:27:37,105][00922] Num frames 13100...
[2023-02-25 11:27:37,237][00922] Avg episode rewards: #0: 33.066, true rewards: #0: 13.166
[2023-02-25 11:27:37,238][00922] Avg episode reward: 33.066, avg true_objective: 13.166
[2023-02-25 11:28:55,074][00922] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-25 12:04:18,763][00922] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 12:04:18,767][00922] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 12:04:18,771][00922] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 12:04:18,775][00922] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 12:04:18,777][00922] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 12:04:18,782][00922] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 12:04:18,784][00922] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-25 12:04:18,785][00922] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 12:04:18,786][00922] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-25 12:04:18,790][00922] Adding new argument 'hf_repository'='OliP/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-25 12:04:18,791][00922] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 12:04:18,792][00922] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 12:04:18,793][00922] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 12:04:18,799][00922] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 12:04:18,803][00922] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 12:04:18,829][00922] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 12:04:18,834][00922] RunningMeanStd input shape: (1,)
[2023-02-25 12:04:18,857][00922] ConvEncoder: input_channels=3
[2023-02-25 12:04:18,923][00922] Conv encoder output size: 512
[2023-02-25 12:04:18,927][00922] Policy head output size: 512
[2023-02-25 12:04:18,957][00922] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 12:04:19,881][00922] Num frames 100...
[2023-02-25 12:04:20,068][00922] Num frames 200...
[2023-02-25 12:04:20,251][00922] Num frames 300...
[2023-02-25 12:04:20,446][00922] Num frames 400...
[2023-02-25 12:04:20,696][00922] Num frames 500...
[2023-02-25 12:04:20,861][00922] Num frames 600...
[2023-02-25 12:04:21,073][00922] Num frames 700...
[2023-02-25 12:04:21,288][00922] Num frames 800...
[2023-02-25 12:04:21,475][00922] Num frames 900...
[2023-02-25 12:04:21,666][00922] Num frames 1000...
[2023-02-25 12:04:21,856][00922] Num frames 1100...
[2023-02-25 12:04:22,032][00922] Num frames 1200...
[2023-02-25 12:04:22,366][00922] Num frames 1300...
[2023-02-25 12:04:22,587][00922] Num frames 1400...
[2023-02-25 12:04:22,830][00922] Avg episode rewards: #0: 37.930, true rewards: #0: 14.930
[2023-02-25 12:04:22,845][00922] Avg episode reward: 37.930, avg true_objective: 14.930
[2023-02-25 12:04:22,882][00922] Num frames 1500...
[2023-02-25 12:04:23,131][00922] Num frames 1600...
[2023-02-25 12:04:23,466][00922] Num frames 1700...
[2023-02-25 12:04:23,698][00922] Num frames 1800...
[2023-02-25 12:04:23,949][00922] Num frames 1900...
[2023-02-25 12:04:24,143][00922] Num frames 2000...
[2023-02-25 12:04:24,470][00922] Num frames 2100...
[2023-02-25 12:04:24,672][00922] Num frames 2200...
[2023-02-25 12:04:25,049][00922] Num frames 2300...
[2023-02-25 12:04:25,222][00922] Num frames 2400...
[2023-02-25 12:04:25,419][00922] Num frames 2500...
[2023-02-25 12:04:25,579][00922] Num frames 2600...
[2023-02-25 12:04:25,747][00922] Num frames 2700...
[2023-02-25 12:04:25,864][00922] Num frames 2800...
[2023-02-25 12:04:25,982][00922] Num frames 2900...
[2023-02-25 12:04:26,081][00922] Avg episode rewards: #0: 34.665, true rewards: #0: 14.665
[2023-02-25 12:04:26,084][00922] Avg episode reward: 34.665, avg true_objective: 14.665
[2023-02-25 12:04:26,174][00922] Num frames 3000...
[2023-02-25 12:04:26,286][00922] Num frames 3100...
[2023-02-25 12:04:26,401][00922] Num frames 3200...
[2023-02-25 12:04:26,510][00922] Num frames 3300...
[2023-02-25 12:04:26,621][00922] Num frames 3400...
[2023-02-25 12:04:26,744][00922] Num frames 3500...
[2023-02-25 12:04:26,834][00922] Avg episode rewards: #0: 27.097, true rewards: #0: 11.763
[2023-02-25 12:04:26,837][00922] Avg episode reward: 27.097, avg true_objective: 11.763
[2023-02-25 12:04:26,924][00922] Num frames 3600...
[2023-02-25 12:04:27,042][00922] Num frames 3700...
[2023-02-25 12:04:27,158][00922] Num frames 3800...
[2023-02-25 12:04:27,268][00922] Num frames 3900...
[2023-02-25 12:04:27,384][00922] Num frames 4000...
[2023-02-25 12:04:27,496][00922] Num frames 4100...
[2023-02-25 12:04:27,617][00922] Num frames 4200...
[2023-02-25 12:04:27,726][00922] Num frames 4300...
[2023-02-25 12:04:27,842][00922] Num frames 4400...
[2023-02-25 12:04:27,963][00922] Num frames 4500...
[2023-02-25 12:04:28,076][00922] Num frames 4600...
[2023-02-25 12:04:28,195][00922] Num frames 4700...
[2023-02-25 12:04:28,306][00922] Num frames 4800...
[2023-02-25 12:04:28,423][00922] Num frames 4900...
[2023-02-25 12:04:28,533][00922] Num frames 5000...
[2023-02-25 12:04:28,694][00922] Avg episode rewards: #0: 28.493, true rewards: #0: 12.742
[2023-02-25 12:04:28,696][00922] Avg episode reward: 28.493, avg true_objective: 12.742
[2023-02-25 12:04:28,703][00922] Num frames 5100...
[2023-02-25 12:04:28,825][00922] Num frames 5200...
[2023-02-25 12:04:28,945][00922] Num frames 5300...
[2023-02-25 12:04:29,063][00922] Num frames 5400...
[2023-02-25 12:04:29,173][00922] Num frames 5500...
[2023-02-25 12:04:29,290][00922] Num frames 5600...
[2023-02-25 12:04:29,410][00922] Avg episode rewards: #0: 25.114, true rewards: #0: 11.314
[2023-02-25 12:04:29,412][00922] Avg episode reward: 25.114, avg true_objective: 11.314
[2023-02-25 12:04:29,463][00922] Num frames 5700...
[2023-02-25 12:04:29,571][00922] Num frames 5800...
[2023-02-25 12:04:29,742][00922] Num frames 5900...
[2023-02-25 12:04:29,902][00922] Num frames 6000...
[2023-02-25 12:04:30,060][00922] Num frames 6100...
[2023-02-25 12:04:30,215][00922] Num frames 6200...
[2023-02-25 12:04:30,370][00922] Num frames 6300...
[2023-02-25 12:04:30,528][00922] Num frames 6400...
[2023-02-25 12:04:30,688][00922] Num frames 6500...
[2023-02-25 12:04:30,851][00922] Num frames 6600...
[2023-02-25 12:04:30,922][00922] Avg episode rewards: #0: 24.845, true rewards: #0: 11.012
[2023-02-25 12:04:30,924][00922] Avg episode reward: 24.845, avg true_objective: 11.012
[2023-02-25 12:04:31,081][00922] Num frames 6700...
[2023-02-25 12:04:31,264][00922] Num frames 6800...
[2023-02-25 12:04:31,429][00922] Num frames 6900...
[2023-02-25 12:04:31,586][00922] Num frames 7000...
[2023-02-25 12:04:31,744][00922] Num frames 7100...
[2023-02-25 12:04:31,904][00922] Num frames 7200...
[2023-02-25 12:04:32,074][00922] Num frames 7300...
[2023-02-25 12:04:32,235][00922] Num frames 7400...
[2023-02-25 12:04:32,390][00922] Num frames 7500...
[2023-02-25 12:04:32,547][00922] Num frames 7600...
[2023-02-25 12:04:32,727][00922] Num frames 7700...
[2023-02-25 12:04:32,892][00922] Num frames 7800...
[2023-02-25 12:04:33,062][00922] Num frames 7900...
[2023-02-25 12:04:33,211][00922] Num frames 8000...
[2023-02-25 12:04:33,323][00922] Num frames 8100...
[2023-02-25 12:04:33,437][00922] Num frames 8200...
[2023-02-25 12:04:33,552][00922] Num frames 8300...
[2023-02-25 12:04:33,664][00922] Num frames 8400...
[2023-02-25 12:04:33,784][00922] Num frames 8500...
[2023-02-25 12:04:33,925][00922] Num frames 8600...
[2023-02-25 12:04:34,052][00922] Avg episode rewards: #0: 28.221, true rewards: #0: 12.364
[2023-02-25 12:04:34,054][00922] Avg episode reward: 28.221, avg true_objective: 12.364
[2023-02-25 12:04:34,107][00922] Num frames 8700...
[2023-02-25 12:04:34,220][00922] Num frames 8800...
[2023-02-25 12:04:34,333][00922] Num frames 8900...
[2023-02-25 12:04:34,455][00922] Num frames 9000...
[2023-02-25 12:04:34,570][00922] Num frames 9100...
[2023-02-25 12:04:34,727][00922] Avg episode rewards: #0: 26.241, true rewards: #0: 11.491
[2023-02-25 12:04:34,730][00922] Avg episode reward: 26.241, avg true_objective: 11.491
[2023-02-25 12:04:34,742][00922] Num frames 9200...
[2023-02-25 12:04:34,859][00922] Num frames 9300...
[2023-02-25 12:04:34,980][00922] Num frames 9400...
[2023-02-25 12:04:35,096][00922] Num frames 9500...
[2023-02-25 12:04:35,205][00922] Num frames 9600...
[2023-02-25 12:04:35,330][00922] Num frames 9700...
[2023-02-25 12:04:35,445][00922] Num frames 9800...
[2023-02-25 12:04:35,560][00922] Num frames 9900...
[2023-02-25 12:04:35,683][00922] Avg episode rewards: #0: 25.179, true rewards: #0: 11.068
[2023-02-25 12:04:35,685][00922] Avg episode reward: 25.179, avg true_objective: 11.068
[2023-02-25 12:04:35,732][00922] Num frames 10000...
[2023-02-25 12:04:35,843][00922] Num frames 10100...
[2023-02-25 12:04:35,959][00922] Num frames 10200...
[2023-02-25 12:04:36,084][00922] Num frames 10300...
[2023-02-25 12:04:36,197][00922] Num frames 10400...
[2023-02-25 12:04:36,316][00922] Num frames 10500...
[2023-02-25 12:04:36,450][00922] Avg episode rewards: #0: 24.069, true rewards: #0: 10.569
[2023-02-25 12:04:36,452][00922] Avg episode reward: 24.069, avg true_objective: 10.569
[2023-02-25 12:05:37,936][00922] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
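(Annotation: the ten evaluation episodes end with an overall average reward of 24.069 and an average true objective of 10.569, and the replay video is saved to the path shown above. A small, hypothetical parser for pulling those per-episode summary pairs back out of a log like this one; the regex mirrors the exact line format above, and the log filename is illustrative:)

    import re

    # Matches lines like:
    # [...][00922] Avg episode reward: 24.069, avg true_objective: 10.569
    PATTERN = re.compile(
        r"Avg episode reward: ([\d.]+), avg true_objective: ([\d.]+)")

    def summary_rewards(log_path: str):
        """Return (avg_reward, avg_true_objective) pairs, one per episode."""
        with open(log_path) as f:
            return [tuple(map(float, m.groups()))
                    for line in f
                    if (m := PATTERN.search(line))]

    # The last pair is the final 10-episode average: (24.069, 10.569).
    # print(summary_rewards("console.log")[-1])  # filename is hypothetical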