[2023-02-23 09:07:07,037][00238] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 09:07:07,039][00238] Rollout worker 0 uses device cpu
[2023-02-23 09:07:07,041][00238] Rollout worker 1 uses device cpu
[2023-02-23 09:07:07,042][00238] Rollout worker 2 uses device cpu
[2023-02-23 09:07:07,043][00238] Rollout worker 3 uses device cpu
[2023-02-23 09:07:07,044][00238] Rollout worker 4 uses device cpu
[2023-02-23 09:07:07,046][00238] Rollout worker 5 uses device cpu
[2023-02-23 09:07:07,047][00238] Rollout worker 6 uses device cpu
[2023-02-23 09:07:07,048][00238] Rollout worker 7 uses device cpu
[2023-02-23 09:07:07,231][00238] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 09:07:07,234][00238] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 09:07:07,265][00238] Starting all processes...
[2023-02-23 09:07:07,266][00238] Starting process learner_proc0
[2023-02-23 09:07:07,321][00238] Starting all processes...
[2023-02-23 09:07:07,330][00238] Starting process inference_proc0-0
[2023-02-23 09:07:07,344][00238] Starting process rollout_proc0
[2023-02-23 09:07:07,346][00238] Starting process rollout_proc1
[2023-02-23 09:07:07,347][00238] Starting process rollout_proc2
[2023-02-23 09:07:07,347][00238] Starting process rollout_proc3
[2023-02-23 09:07:07,348][00238] Starting process rollout_proc4
[2023-02-23 09:07:07,348][00238] Starting process rollout_proc5
[2023-02-23 09:07:07,348][00238] Starting process rollout_proc6
[2023-02-23 09:07:07,348][00238] Starting process rollout_proc7
[2023-02-23 09:07:16,270][12156] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 09:07:16,279][12156] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 09:07:16,390][12176] Worker 1 uses CPU cores [1]
[2023-02-23 09:07:16,754][12182] Worker 7 uses CPU cores [1]
[2023-02-23 09:07:16,775][12175] Worker 0 uses CPU cores [0]
[2023-02-23 09:07:16,823][12178] Worker 3 uses CPU cores [1]
[2023-02-23 09:07:16,824][12179] Worker 4 uses CPU cores [0]
[2023-02-23 09:07:16,942][12170] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 09:07:16,949][12170] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 09:07:16,965][12180] Worker 5 uses CPU cores [1]
[2023-02-23 09:07:16,992][12181] Worker 6 uses CPU cores [0]
[2023-02-23 09:07:16,995][12177] Worker 2 uses CPU cores [0]
[2023-02-23 09:07:17,333][12170] Num visible devices: 1
[2023-02-23 09:07:17,334][12156] Num visible devices: 1
[2023-02-23 09:07:17,336][12156] Starting seed is not provided
[2023-02-23 09:07:17,336][12156] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 09:07:17,336][12156] Initializing actor-critic model on device cuda:0
[2023-02-23 09:07:17,336][12156] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 09:07:17,338][12156] RunningMeanStd input shape: (1,)
[2023-02-23 09:07:17,363][12156] ConvEncoder: input_channels=3
[2023-02-23 09:07:17,694][12156] Conv encoder output size: 512
[2023-02-23 09:07:17,694][12156] Policy head output size: 512
[2023-02-23 09:07:17,753][12156] Created Actor Critic model with architecture:
[2023-02-23 09:07:17,753][12156] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-23 09:07:24,655][12156] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 09:07:24,656][12156] No checkpoints found
[2023-02-23 09:07:24,657][12156] Did not load from checkpoint, starting from scratch!
[2023-02-23 09:07:24,658][12156] Initialized policy 0 weights for model version 0
[2023-02-23 09:07:24,662][12156] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 09:07:24,670][12156] LearnerWorker_p0 finished initialization!
[2023-02-23 09:07:24,868][12170] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 09:07:24,869][12170] RunningMeanStd input shape: (1,)
[2023-02-23 09:07:24,882][12170] ConvEncoder: input_channels=3
[2023-02-23 09:07:24,982][12170] Conv encoder output size: 512
[2023-02-23 09:07:24,983][12170] Policy head output size: 512
[2023-02-23 09:07:27,159][00238] Inference worker 0-0 is ready!
[2023-02-23 09:07:27,161][00238] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 09:07:27,225][00238] Heartbeat connected on Batcher_0
[2023-02-23 09:07:27,228][00238] Heartbeat connected on LearnerWorker_p0
[2023-02-23 09:07:27,237][00238] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 09:07:27,265][00238] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 09:07:27,285][12180] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,307][12182] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,319][12176] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,330][12181] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,332][12178] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,334][12177] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,344][12175] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:27,354][12179] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:07:28,163][12181] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,164][12177] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,734][12180] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,741][12182] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,746][12176] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,753][12178] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,925][12175] Decorrelating experience for 0 frames...
[2023-02-23 09:07:28,944][12181] Decorrelating experience for 32 frames...
[2023-02-23 09:07:29,411][12176] Decorrelating experience for 32 frames...
[2023-02-23 09:07:29,415][12178] Decorrelating experience for 32 frames...
[2023-02-23 09:07:29,940][12176] Decorrelating experience for 64 frames...
[2023-02-23 09:07:30,104][12177] Decorrelating experience for 32 frames...
[2023-02-23 09:07:30,144][12179] Decorrelating experience for 0 frames...
[2023-02-23 09:07:30,321][12175] Decorrelating experience for 32 frames...
[2023-02-23 09:07:30,512][12181] Decorrelating experience for 64 frames...
[2023-02-23 09:07:30,881][12176] Decorrelating experience for 96 frames...
[2023-02-23 09:07:30,894][12178] Decorrelating experience for 64 frames...
[2023-02-23 09:07:31,111][00238] Heartbeat connected on RolloutWorker_w1
[2023-02-23 09:07:31,548][12182] Decorrelating experience for 32 frames...
[2023-02-23 09:07:31,900][12179] Decorrelating experience for 32 frames...
[2023-02-23 09:07:32,039][12178] Decorrelating experience for 96 frames...
[2023-02-23 09:07:32,169][00238] Heartbeat connected on RolloutWorker_w3
[2023-02-23 09:07:32,234][00238] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 09:07:32,378][12182] Decorrelating experience for 64 frames...
[2023-02-23 09:07:32,821][12175] Decorrelating experience for 64 frames...
[2023-02-23 09:07:33,105][12182] Decorrelating experience for 96 frames...
[2023-02-23 09:07:33,208][00238] Heartbeat connected on RolloutWorker_w7
[2023-02-23 09:07:33,807][12177] Decorrelating experience for 64 frames...
[2023-02-23 09:07:34,348][12179] Decorrelating experience for 64 frames...
[2023-02-23 09:07:34,435][12180] Decorrelating experience for 32 frames...
[2023-02-23 09:07:34,828][12181] Decorrelating experience for 96 frames...
[2023-02-23 09:07:34,998][12180] Decorrelating experience for 64 frames...
[2023-02-23 09:07:35,480][12180] Decorrelating experience for 96 frames...
[2023-02-23 09:07:35,528][00238] Heartbeat connected on RolloutWorker_w6
[2023-02-23 09:07:35,624][00238] Heartbeat connected on RolloutWorker_w5
[2023-02-23 09:07:36,878][12177] Decorrelating experience for 96 frames...
[2023-02-23 09:07:36,997][12179] Decorrelating experience for 96 frames...
[2023-02-23 09:07:37,234][00238] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 2.8. Samples: 28. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 09:07:37,295][00238] Heartbeat connected on RolloutWorker_w2
[2023-02-23 09:07:37,391][12175] Decorrelating experience for 96 frames...
[2023-02-23 09:07:37,546][00238] Heartbeat connected on RolloutWorker_w4
[2023-02-23 09:07:38,010][00238] Heartbeat connected on RolloutWorker_w0
[2023-02-23 09:07:40,702][12156] Signal inference workers to stop experience collection...
[2023-02-23 09:07:40,718][12170] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 09:07:42,234][00238] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 182.2. Samples: 2732. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 09:07:42,237][00238] Avg episode reward: [(0, '2.451')]
[2023-02-23 09:07:43,068][12156] Signal inference workers to resume experience collection...
[2023-02-23 09:07:43,070][12170] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 09:07:47,234][00238] Fps is (10 sec: 2457.6, 60 sec: 1229.0, 300 sec: 1229.0). Total num frames: 24576. Throughput: 0: 218.2. Samples: 4364. Policy #0 lag: (min: 0.0, avg: 1.3, max: 3.0)
[2023-02-23 09:07:47,236][00238] Avg episode reward: [(0, '3.571')]
[2023-02-23 09:07:51,226][12170] Updated weights for policy 0, policy_version 10 (0.0026)
[2023-02-23 09:07:52,238][00238] Fps is (10 sec: 4094.3, 60 sec: 1638.3, 300 sec: 1638.3). Total num frames: 40960. Throughput: 0: 427.7. Samples: 10692. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:07:52,241][00238] Avg episode reward: [(0, '4.182')]
[2023-02-23 09:07:57,234][00238] Fps is (10 sec: 3276.8, 60 sec: 1911.6, 300 sec: 1911.6). Total num frames: 57344. Throughput: 0: 506.2. Samples: 15186. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:07:57,238][00238] Avg episode reward: [(0, '4.572')]
[2023-02-23 09:08:02,236][00238] Fps is (10 sec: 3687.3, 60 sec: 2223.6, 300 sec: 2223.6). Total num frames: 77824. Throughput: 0: 509.9. Samples: 17846. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 09:08:02,238][00238] Avg episode reward: [(0, '4.668')]
[2023-02-23 09:08:02,968][12170] Updated weights for policy 0, policy_version 20 (0.0011)
[2023-02-23 09:08:07,234][00238] Fps is (10 sec: 4096.0, 60 sec: 2457.8, 300 sec: 2457.8). Total num frames: 98304. Throughput: 0: 622.6. Samples: 24902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:08:07,237][00238] Avg episode reward: [(0, '4.597')]
[2023-02-23 09:08:12,234][00238] Fps is (10 sec: 4096.5, 60 sec: 2639.8, 300 sec: 2639.8). Total num frames: 118784. Throughput: 0: 684.7. Samples: 30808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:08:12,237][00238] Avg episode reward: [(0, '4.506')]
[2023-02-23 09:08:12,248][12156] Saving new best policy, reward=4.506!
[2023-02-23 09:08:13,018][12170] Updated weights for policy 0, policy_version 30 (0.0026)
[2023-02-23 09:08:17,235][00238] Fps is (10 sec: 3685.9, 60 sec: 2703.5, 300 sec: 2703.5). Total num frames: 135168. Throughput: 0: 733.2. Samples: 32996. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:08:17,239][00238] Avg episode reward: [(0, '4.445')]
[2023-02-23 09:08:22,234][00238] Fps is (10 sec: 3686.6, 60 sec: 2830.1, 300 sec: 2830.1). Total num frames: 155648. Throughput: 0: 854.2. Samples: 38468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:08:22,236][00238] Avg episode reward: [(0, '4.348')]
[2023-02-23 09:08:23,851][12170] Updated weights for policy 0, policy_version 40 (0.0028)
[2023-02-23 09:08:27,234][00238] Fps is (10 sec: 4506.2, 60 sec: 3003.9, 300 sec: 3003.9). Total num frames: 180224. Throughput: 0: 951.0. Samples: 45526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:08:27,239][00238] Avg episode reward: [(0, '4.330')]
[2023-02-23 09:08:32,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3024.9). Total num frames: 196608. Throughput: 0: 984.3. Samples: 48658. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:08:32,241][00238] Avg episode reward: [(0, '4.431')]
[2023-02-23 09:08:34,242][12170] Updated weights for policy 0, policy_version 50 (0.0014)
[2023-02-23 09:08:37,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3042.9). Total num frames: 212992. Throughput: 0: 945.3. Samples: 53228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:08:37,246][00238] Avg episode reward: [(0, '4.525')]
[2023-02-23 09:08:37,257][12156] Saving new best policy, reward=4.525!
[2023-02-23 09:08:42,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3113.1). Total num frames: 233472. Throughput: 0: 979.5. Samples: 59264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:08:42,242][00238] Avg episode reward: [(0, '4.497')]
[2023-02-23 09:08:44,591][12170] Updated weights for policy 0, policy_version 60 (0.0014)
[2023-02-23 09:08:47,234][00238] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3225.7). Total num frames: 258048. Throughput: 0: 998.8. Samples: 62790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:08:47,236][00238] Avg episode reward: [(0, '4.521')]
[2023-02-23 09:08:52,239][00238] Fps is (10 sec: 4093.9, 60 sec: 3891.1, 300 sec: 3228.5). Total num frames: 274432. Throughput: 0: 979.3. Samples: 68974. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:08:52,242][00238] Avg episode reward: [(0, '4.587')]
[2023-02-23 09:08:52,258][12156] Saving new best policy, reward=4.587!
[2023-02-23 09:08:55,669][12170] Updated weights for policy 0, policy_version 70 (0.0017)
[2023-02-23 09:08:57,234][00238] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3231.4). Total num frames: 290816. Throughput: 0: 947.5. Samples: 73444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:08:57,240][00238] Avg episode reward: [(0, '4.566')]
[2023-02-23 09:09:02,234][00238] Fps is (10 sec: 3688.3, 60 sec: 3891.3, 300 sec: 3276.9). Total num frames: 311296. Throughput: 0: 966.2. Samples: 76474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:09:02,236][00238] Avg episode reward: [(0, '4.418')]
[2023-02-23 09:09:02,249][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000076_311296.pth...
[2023-02-23 09:09:05,178][12170] Updated weights for policy 0, policy_version 80 (0.0019)
[2023-02-23 09:09:07,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3358.8). Total num frames: 335872. Throughput: 0: 1001.8. Samples: 83550. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:09:07,242][00238] Avg episode reward: [(0, '4.421')]
[2023-02-23 09:09:12,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3354.9). Total num frames: 352256. Throughput: 0: 961.6. Samples: 88798. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:09:12,237][00238] Avg episode reward: [(0, '4.332')]
[2023-02-23 09:09:17,234][00238] Fps is (10 sec: 2457.6, 60 sec: 3754.7, 300 sec: 3276.9). Total num frames: 360448. Throughput: 0: 932.7. Samples: 90630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:09:17,237][00238] Avg episode reward: [(0, '4.363')]
[2023-02-23 09:09:19,212][12170] Updated weights for policy 0, policy_version 90 (0.0016)
[2023-02-23 09:09:22,234][00238] Fps is (10 sec: 2047.9, 60 sec: 3618.1, 300 sec: 3241.3). Total num frames: 372736. Throughput: 0: 902.3. Samples: 93832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:09:22,241][00238] Avg episode reward: [(0, '4.402')]
[2023-02-23 09:09:27,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3276.9). Total num frames: 393216. Throughput: 0: 897.0. Samples: 99628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:09:27,236][00238] Avg episode reward: [(0, '4.524')]
[2023-02-23 09:09:30,773][12170] Updated weights for policy 0, policy_version 100 (0.0032)
[2023-02-23 09:09:32,234][00238] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3309.6). Total num frames: 413696. Throughput: 0: 884.1. Samples: 102576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:09:32,238][00238] Avg episode reward: [(0, '4.599')]
[2023-02-23 09:09:32,255][12156] Saving new best policy, reward=4.599!
[2023-02-23 09:09:37,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3276.9). Total num frames: 425984. Throughput: 0: 845.0. Samples: 106996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:09:37,237][00238] Avg episode reward: [(0, '4.601')]
[2023-02-23 09:09:37,244][12156] Saving new best policy, reward=4.601!
[2023-02-23 09:09:42,055][12170] Updated weights for policy 0, policy_version 110 (0.0013)
[2023-02-23 09:09:42,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3337.6). Total num frames: 450560. Throughput: 0: 885.2. Samples: 113276. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:09:42,236][00238] Avg episode reward: [(0, '4.624')]
[2023-02-23 09:09:42,246][12156] Saving new best policy, reward=4.624!
[2023-02-23 09:09:47,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3364.6). Total num frames: 471040. Throughput: 0: 896.7. Samples: 116824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:09:47,237][00238] Avg episode reward: [(0, '4.455')]
[2023-02-23 09:09:52,237][00238] Fps is (10 sec: 3685.4, 60 sec: 3550.0, 300 sec: 3361.5). Total num frames: 487424. Throughput: 0: 868.1. Samples: 122618. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:09:52,242][00238] Avg episode reward: [(0, '4.302')]
[2023-02-23 09:09:52,278][12170] Updated weights for policy 0, policy_version 120 (0.0017)
[2023-02-23 09:09:57,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3358.8). Total num frames: 503808. Throughput: 0: 853.6. Samples: 127212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:09:57,238][00238] Avg episode reward: [(0, '4.437')]
[2023-02-23 09:10:02,234][00238] Fps is (10 sec: 4097.1, 60 sec: 3618.1, 300 sec: 3409.0). Total num frames: 528384. Throughput: 0: 883.9. Samples: 130406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:10:02,237][00238] Avg episode reward: [(0, '4.537')]
[2023-02-23 09:10:02,753][12170] Updated weights for policy 0, policy_version 130 (0.0028)
[2023-02-23 09:10:07,234][00238] Fps is (10 sec: 4915.3, 60 sec: 3618.1, 300 sec: 3456.1). Total num frames: 552960. Throughput: 0: 970.9. Samples: 137522. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 09:10:07,236][00238] Avg episode reward: [(0, '4.547')]
[2023-02-23 09:10:12,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3450.6). Total num frames: 569344. Throughput: 0: 960.0. Samples: 142830. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 09:10:12,236][00238] Avg episode reward: [(0, '4.461')]
[2023-02-23 09:10:13,227][12170] Updated weights for policy 0, policy_version 140 (0.0021)
[2023-02-23 09:10:17,234][00238] Fps is (10 sec: 2867.0, 60 sec: 3686.4, 300 sec: 3421.4). Total num frames: 581632. Throughput: 0: 946.2. Samples: 145154. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 09:10:17,242][00238] Avg episode reward: [(0, '4.442')]
[2023-02-23 09:10:22,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3464.1). Total num frames: 606208. Throughput: 0: 985.1. Samples: 151324. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:10:22,242][00238] Avg episode reward: [(0, '4.436')]
[2023-02-23 09:10:23,646][12170] Updated weights for policy 0, policy_version 150 (0.0017)
[2023-02-23 09:10:27,234][00238] Fps is (10 sec: 4915.5, 60 sec: 3959.5, 300 sec: 3504.4). Total num frames: 630784. Throughput: 0: 1004.8. Samples: 158494. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:10:27,241][00238] Avg episode reward: [(0, '4.556')]
[2023-02-23 09:10:32,238][00238] Fps is (10 sec: 3684.8, 60 sec: 3822.7, 300 sec: 3476.0). Total num frames: 643072. Throughput: 0: 973.0. Samples: 160614. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:10:32,241][00238] Avg episode reward: [(0, '4.647')]
[2023-02-23 09:10:32,250][12156] Saving new best policy, reward=4.647!
[2023-02-23 09:10:35,456][12170] Updated weights for policy 0, policy_version 160 (0.0014)
[2023-02-23 09:10:37,234][00238] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3470.9). Total num frames: 659456. Throughput: 0: 943.5. Samples: 165072. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:10:37,237][00238] Avg episode reward: [(0, '4.556')]
[2023-02-23 09:10:42,234][00238] Fps is (10 sec: 4097.8, 60 sec: 3891.2, 300 sec: 3507.9). Total num frames: 684032. Throughput: 0: 987.9. Samples: 171668. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:10:42,236][00238] Avg episode reward: [(0, '4.341')]
[2023-02-23 09:10:44,692][12170] Updated weights for policy 0, policy_version 170 (0.0018)
[2023-02-23 09:10:47,235][00238] Fps is (10 sec: 4914.5, 60 sec: 3959.4, 300 sec: 3543.1). Total num frames: 708608. Throughput: 0: 996.5. Samples: 175248. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:10:47,239][00238] Avg episode reward: [(0, '4.462')]
[2023-02-23 09:10:52,237][00238] Fps is (10 sec: 3685.2, 60 sec: 3891.2, 300 sec: 3516.6). Total num frames: 720896. Throughput: 0: 956.5. Samples: 180566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:10:52,244][00238] Avg episode reward: [(0, '4.610')]
[2023-02-23 09:10:56,941][12170] Updated weights for policy 0, policy_version 180 (0.0016)
[2023-02-23 09:10:57,234][00238] Fps is (10 sec: 2867.6, 60 sec: 3891.2, 300 sec: 3510.9). Total num frames: 737280. Throughput: 0: 939.6. Samples: 185112. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:10:57,236][00238] Avg episode reward: [(0, '4.611')]
[2023-02-23 09:11:02,234][00238] Fps is (10 sec: 4097.3, 60 sec: 3891.2, 300 sec: 3543.6). Total num frames: 761856. Throughput: 0: 969.2. Samples: 188766. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:11:02,236][00238] Avg episode reward: [(0, '4.944')]
[2023-02-23 09:11:02,253][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000186_761856.pth...
[2023-02-23 09:11:02,361][12156] Saving new best policy, reward=4.944!
[2023-02-23 09:11:05,723][12170] Updated weights for policy 0, policy_version 190 (0.0020)
[2023-02-23 09:11:07,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3556.1). Total num frames: 782336. Throughput: 0: 984.0. Samples: 195602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:11:07,238][00238] Avg episode reward: [(0, '5.233')]
[2023-02-23 09:11:07,249][12156] Saving new best policy, reward=5.233!
[2023-02-23 09:11:12,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3549.9). Total num frames: 798720. Throughput: 0: 935.2. Samples: 200580. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:11:12,239][00238] Avg episode reward: [(0, '5.407')]
[2023-02-23 09:11:12,251][12156] Saving new best policy, reward=5.407!
[2023-02-23 09:11:17,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3544.0). Total num frames: 815104. Throughput: 0: 938.7. Samples: 202850. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 09:11:17,240][00238] Avg episode reward: [(0, '5.328')]
[2023-02-23 09:11:17,748][12170] Updated weights for policy 0, policy_version 200 (0.0021)
[2023-02-23 09:11:22,235][00238] Fps is (10 sec: 4095.4, 60 sec: 3891.1, 300 sec: 3573.1). Total num frames: 839680. Throughput: 0: 985.7. Samples: 209432. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:11:22,238][00238] Avg episode reward: [(0, '5.261')]
[2023-02-23 09:11:26,367][12170] Updated weights for policy 0, policy_version 210 (0.0021)
[2023-02-23 09:11:27,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3584.0). Total num frames: 860160. Throughput: 0: 1000.1. Samples: 216674. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:11:27,239][00238] Avg episode reward: [(0, '5.654')]
[2023-02-23 09:11:27,275][12156] Saving new best policy, reward=5.654!
[2023-02-23 09:11:32,234][00238] Fps is (10 sec: 3687.0, 60 sec: 3891.5, 300 sec: 3577.8). Total num frames: 876544. Throughput: 0: 969.7. Samples: 218884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:11:32,239][00238] Avg episode reward: [(0, '5.615')]
[2023-02-23 09:11:37,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3571.8). Total num frames: 892928. Throughput: 0: 954.2. Samples: 223504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:11:37,241][00238] Avg episode reward: [(0, '5.604')]
[2023-02-23 09:11:38,351][12170] Updated weights for policy 0, policy_version 220 (0.0024)
[2023-02-23 09:11:42,234][00238] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3598.1). Total num frames: 917504. Throughput: 0: 1012.2. Samples: 230662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:11:42,236][00238] Avg episode reward: [(0, '5.724')]
[2023-02-23 09:11:42,254][12156] Saving new best policy, reward=5.724!
[2023-02-23 09:11:46,933][12170] Updated weights for policy 0, policy_version 230 (0.0015)
[2023-02-23 09:11:47,234][00238] Fps is (10 sec: 4915.2, 60 sec: 3891.3, 300 sec: 3623.4). Total num frames: 942080. Throughput: 0: 1008.1. Samples: 234132. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:11:47,238][00238] Avg episode reward: [(0, '5.598')]
[2023-02-23 09:11:52,234][00238] Fps is (10 sec: 4096.1, 60 sec: 3959.7, 300 sec: 3616.9). Total num frames: 958464. Throughput: 0: 973.4. Samples: 239406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:11:52,236][00238] Avg episode reward: [(0, '5.357')]
[2023-02-23 09:11:57,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3610.6). Total num frames: 974848. Throughput: 0: 974.0. Samples: 244410. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:11:57,238][00238] Avg episode reward: [(0, '5.496')]
[2023-02-23 09:11:58,714][12170] Updated weights for policy 0, policy_version 240 (0.0017)
[2023-02-23 09:12:02,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3634.3). Total num frames: 999424. Throughput: 0: 1004.5. Samples: 248054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:12:02,236][00238] Avg episode reward: [(0, '5.773')]
[2023-02-23 09:12:02,248][12156] Saving new best policy, reward=5.773!
[2023-02-23 09:12:07,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3642.6). Total num frames: 1019904. Throughput: 0: 1016.3. Samples: 255162. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:12:07,238][00238] Avg episode reward: [(0, '6.191')]
[2023-02-23 09:12:07,241][12156] Saving new best policy, reward=6.191!
[2023-02-23 09:12:08,124][12170] Updated weights for policy 0, policy_version 250 (0.0021)
[2023-02-23 09:12:12,234][00238] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3636.1). Total num frames: 1036288. Throughput: 0: 955.5. Samples: 259670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:12:12,239][00238] Avg episode reward: [(0, '6.118')]
[2023-02-23 09:12:17,234][00238] Fps is (10 sec: 3276.7, 60 sec: 3959.5, 300 sec: 3629.9). Total num frames: 1052672. Throughput: 0: 957.9. Samples: 261988. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:12:17,237][00238] Avg episode reward: [(0, '6.363')]
[2023-02-23 09:12:17,241][12156] Saving new best policy, reward=6.363!
[2023-02-23 09:12:19,434][12170] Updated weights for policy 0, policy_version 260 (0.0024)
[2023-02-23 09:12:22,234][00238] Fps is (10 sec: 4096.1, 60 sec: 3959.6, 300 sec: 3651.7). Total num frames: 1077248. Throughput: 0: 1007.4. Samples: 268838. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:12:22,242][00238] Avg episode reward: [(0, '6.024')]
[2023-02-23 09:12:27,234][00238] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 1097728. Throughput: 0: 1000.3. Samples: 275674. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:12:27,240][00238] Avg episode reward: [(0, '6.246')]
[2023-02-23 09:12:29,113][12170] Updated weights for policy 0, policy_version 270 (0.0019)
[2023-02-23 09:12:32,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1114112. Throughput: 0: 974.1. Samples: 277968. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:12:32,242][00238] Avg episode reward: [(0, '6.459')]
[2023-02-23 09:12:32,265][12156] Saving new best policy, reward=6.459!
[2023-02-23 09:12:37,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3846.1). Total num frames: 1134592. Throughput: 0: 965.2. Samples: 282842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:12:37,240][00238] Avg episode reward: [(0, '7.253')]
[2023-02-23 09:12:37,242][12156] Saving new best policy, reward=7.253!
[2023-02-23 09:12:39,757][12170] Updated weights for policy 0, policy_version 280 (0.0030)
[2023-02-23 09:12:42,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3832.2). Total num frames: 1155072. Throughput: 0: 1014.3. Samples: 290054. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-23 09:12:42,239][00238] Avg episode reward: [(0, '7.522')]
[2023-02-23 09:12:42,267][12156] Saving new best policy, reward=7.522!
[2023-02-23 09:12:47,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1179648. Throughput: 0: 1012.8. Samples: 293630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:12:47,236][00238] Avg episode reward: [(0, '7.405')]
[2023-02-23 09:12:49,692][12170] Updated weights for policy 0, policy_version 290 (0.0022)
[2023-02-23 09:12:52,235][00238] Fps is (10 sec: 3685.9, 60 sec: 3891.1, 300 sec: 3846.1). Total num frames: 1191936. Throughput: 0: 962.3. Samples: 298466. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-23 09:12:52,242][00238] Avg episode reward: [(0, '7.736')]
[2023-02-23 09:12:52,255][12156] Saving new best policy, reward=7.736!
[2023-02-23 09:12:57,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1212416. Throughput: 0: 981.8. Samples: 303852. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:12:57,241][00238] Avg episode reward: [(0, '8.005')]
[2023-02-23 09:12:57,246][12156] Saving new best policy, reward=8.005!
[2023-02-23 09:13:00,332][12170] Updated weights for policy 0, policy_version 300 (0.0026)
[2023-02-23 09:13:02,234][00238] Fps is (10 sec: 4506.2, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1236992. Throughput: 0: 1011.0. Samples: 307482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:13:02,242][00238] Avg episode reward: [(0, '8.356')]
[2023-02-23 09:13:02,252][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000302_1236992.pth...
[2023-02-23 09:13:02,362][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000076_311296.pth
[2023-02-23 09:13:02,374][12156] Saving new best policy, reward=8.356!
[2023-02-23 09:13:07,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1257472. Throughput: 0: 1014.9. Samples: 314508. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:13:07,237][00238] Avg episode reward: [(0, '8.486')]
[2023-02-23 09:13:07,241][12156] Saving new best policy, reward=8.486!
[2023-02-23 09:13:10,859][12170] Updated weights for policy 0, policy_version 310 (0.0014)
[2023-02-23 09:13:12,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1269760. Throughput: 0: 961.7. Samples: 318952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:12,244][00238] Avg episode reward: [(0, '8.369')]
[2023-02-23 09:13:17,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1290240. Throughput: 0: 963.3. Samples: 321316. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:17,236][00238] Avg episode reward: [(0, '8.035')]
[2023-02-23 09:13:20,954][12170] Updated weights for policy 0, policy_version 320 (0.0015)
[2023-02-23 09:13:22,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1314816. Throughput: 0: 1016.8. Samples: 328598. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:22,240][00238] Avg episode reward: [(0, '7.392')]
[2023-02-23 09:13:27,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1335296. Throughput: 0: 1004.5. Samples: 335256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:13:27,236][00238] Avg episode reward: [(0, '7.968')]
[2023-02-23 09:13:31,439][12170] Updated weights for policy 0, policy_version 330 (0.0031)
[2023-02-23 09:13:32,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1351680. Throughput: 0: 976.5. Samples: 337572. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:13:32,236][00238] Avg episode reward: [(0, '8.104')]
[2023-02-23 09:13:37,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1372160. Throughput: 0: 986.3. Samples: 342848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:37,241][00238] Avg episode reward: [(0, '8.741')]
[2023-02-23 09:13:37,245][12156] Saving new best policy, reward=8.741!
[2023-02-23 09:13:40,975][12170] Updated weights for policy 0, policy_version 340 (0.0016)
[2023-02-23 09:13:42,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3860.0). Total num frames: 1396736. Throughput: 0: 1030.6. Samples: 350228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:42,236][00238] Avg episode reward: [(0, '8.847')]
[2023-02-23 09:13:42,250][12156] Saving new best policy, reward=8.847!
[2023-02-23 09:13:47,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3873.9). Total num frames: 1417216. Throughput: 0: 1032.0. Samples: 353920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:13:47,239][00238] Avg episode reward: [(0, '9.338')]
[2023-02-23 09:13:47,242][12156] Saving new best policy, reward=9.338!
[2023-02-23 09:13:51,617][12170] Updated weights for policy 0, policy_version 350 (0.0021)
[2023-02-23 09:13:52,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 3873.8). Total num frames: 1433600. Throughput: 0: 980.6. Samples: 358634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:13:52,239][00238] Avg episode reward: [(0, '9.607')]
[2023-02-23 09:13:52,256][12156] Saving new best policy, reward=9.607!
[2023-02-23 09:13:57,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 1454080. Throughput: 0: 1010.1. Samples: 364406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:13:57,240][00238] Avg episode reward: [(0, '9.484')]
[2023-02-23 09:14:01,148][12170] Updated weights for policy 0, policy_version 360 (0.0030)
[2023-02-23 09:14:02,234][00238] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 1478656. Throughput: 0: 1038.6. Samples: 368054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:14:02,237][00238] Avg episode reward: [(0, '9.081')]
[2023-02-23 09:14:07,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3887.7). Total num frames: 1499136. Throughput: 0: 1029.0. Samples: 374902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:14:07,241][00238] Avg episode reward: [(0, '8.714')]
[2023-02-23 09:14:11,950][12170] Updated weights for policy 0, policy_version 370 (0.0012)
[2023-02-23 09:14:12,234][00238] Fps is (10 sec: 3686.3, 60 sec: 4096.0, 300 sec: 3915.5). Total num frames: 1515520. Throughput: 0: 985.8. Samples: 379618. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:14:12,241][00238] Avg episode reward: [(0, '8.646')]
[2023-02-23 09:14:17,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3943.3). Total num frames: 1536000. Throughput: 0: 996.4. Samples: 382412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:14:17,236][00238] Avg episode reward: [(0, '8.956')]
[2023-02-23 09:14:20,969][12170] Updated weights for policy 0, policy_version 380 (0.0014)
[2023-02-23 09:14:22,234][00238] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1560576. Throughput: 0: 1047.5. Samples: 389984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:14:22,240][00238] Avg episode reward: [(0, '8.843')]
[2023-02-23 09:14:27,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1581056. Throughput: 0: 1024.0. Samples: 396306. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:14:27,244][00238] Avg episode reward: [(0, '8.543')]
[2023-02-23 09:14:31,947][12170] Updated weights for policy 0, policy_version 390 (0.0014)
[2023-02-23 09:14:32,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3971.0). Total num frames: 1597440. Throughput: 0: 991.9. Samples: 398556. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:14:32,236][00238] Avg episode reward: [(0, '9.061')]
[2023-02-23 09:14:37,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 1617920. Throughput: 0: 1014.7. Samples: 404296. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:14:37,238][00238] Avg episode reward: [(0, '10.138')]
[2023-02-23 09:14:37,242][12156] Saving new best policy, reward=10.138!
[2023-02-23 09:14:40,895][12170] Updated weights for policy 0, policy_version 400 (0.0022)
[2023-02-23 09:14:42,246][00238] Fps is (10 sec: 4500.2, 60 sec: 4095.2, 300 sec: 3970.9). Total num frames: 1642496. Throughput: 0: 1050.2. Samples: 411676. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:14:42,248][00238] Avg episode reward: [(0, '11.538')]
[2023-02-23 09:14:42,255][12156] Saving new best policy, reward=11.538!
[2023-02-23 09:14:47,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3985.0). Total num frames: 1662976. Throughput: 0: 1039.8. Samples: 414844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:14:47,236][00238] Avg episode reward: [(0, '11.993')]
[2023-02-23 09:14:47,239][12156] Saving new best policy, reward=11.993!
[2023-02-23 09:14:52,234][00238] Fps is (10 sec: 3280.7, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1675264. Throughput: 0: 989.0. Samples: 419408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:14:52,239][00238] Avg episode reward: [(0, '12.266')]
[2023-02-23 09:14:52,259][12156] Saving new best policy, reward=12.266!
[2023-02-23 09:14:52,603][12170] Updated weights for policy 0, policy_version 410 (0.0032)
[2023-02-23 09:14:57,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3971.0). Total num frames: 1699840. Throughput: 0: 1023.0. Samples: 425654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:14:57,236][00238] Avg episode reward: [(0, '12.119')]
[2023-02-23 09:15:01,272][12170] Updated weights for policy 0, policy_version 420 (0.0022)
[2023-02-23 09:15:02,238][00238] Fps is (10 sec: 4913.1, 60 sec: 4095.7, 300 sec: 3971.0). Total num frames: 1724416. Throughput: 0: 1042.9. Samples: 429348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:15:02,241][00238] Avg episode reward: [(0, '12.582')]
[2023-02-23 09:15:02,252][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000421_1724416.pth...
[2023-02-23 09:15:02,390][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000186_761856.pth
[2023-02-23 09:15:02,399][12156] Saving new best policy, reward=12.582!
[2023-02-23 09:15:07,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1740800. Throughput: 0: 1017.1. Samples: 435752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:15:07,236][00238] Avg episode reward: [(0, '13.373')]
[2023-02-23 09:15:07,243][12156] Saving new best policy, reward=13.373!
[2023-02-23 09:15:12,234][00238] Fps is (10 sec: 3278.2, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1757184. Throughput: 0: 980.7. Samples: 440436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:15:12,242][00238] Avg episode reward: [(0, '13.913')]
[2023-02-23 09:15:12,252][12156] Saving new best policy, reward=13.913!
[2023-02-23 09:15:12,808][12170] Updated weights for policy 0, policy_version 430 (0.0040)
[2023-02-23 09:15:17,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1777664. Throughput: 0: 996.0. Samples: 443374. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:15:17,237][00238] Avg episode reward: [(0, '15.161')]
[2023-02-23 09:15:17,241][12156] Saving new best policy, reward=15.161!
[2023-02-23 09:15:21,312][12170] Updated weights for policy 0, policy_version 440 (0.0024)
[2023-02-23 09:15:22,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 1802240. Throughput: 0: 1035.1. Samples: 450876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:15:22,236][00238] Avg episode reward: [(0, '16.626')]
[2023-02-23 09:15:22,290][12156] Saving new best policy, reward=16.626!
[2023-02-23 09:15:27,235][00238] Fps is (10 sec: 4505.1, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 1822720. Throughput: 0: 1003.8. Samples: 456836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:15:27,238][00238] Avg episode reward: [(0, '16.641')]
[2023-02-23 09:15:27,242][12156] Saving new best policy, reward=16.641!
[2023-02-23 09:15:32,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 1839104. Throughput: 0: 983.1. Samples: 459084. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:15:32,240][00238] Avg episode reward: [(0, '17.034')]
[2023-02-23 09:15:32,256][12156] Saving new best policy, reward=17.034!
[2023-02-23 09:15:32,996][12170] Updated weights for policy 0, policy_version 450 (0.0014)
[2023-02-23 09:15:37,234][00238] Fps is (10 sec: 3686.8, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 1859584. Throughput: 0: 1011.0. Samples: 464904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:15:37,240][00238] Avg episode reward: [(0, '16.550')]
[2023-02-23 09:15:41,612][12170] Updated weights for policy 0, policy_version 460 (0.0018)
[2023-02-23 09:15:42,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4028.5, 300 sec: 3984.9). Total num frames: 1884160. Throughput: 0: 1039.2. Samples: 472418. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:15:42,239][00238] Avg episode reward: [(0, '15.795')]
[2023-02-23 09:15:47,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 1904640. Throughput: 0: 1020.1. Samples: 475250. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:15:47,238][00238] Avg episode reward: [(0, '15.275')]
[2023-02-23 09:15:52,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 1916928. Throughput: 0: 978.8. Samples: 479800. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:15:52,248][00238] Avg episode reward: [(0, '15.155')]
[2023-02-23 09:15:53,609][12170] Updated weights for policy 0, policy_version 470 (0.0015)
[2023-02-23 09:15:57,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 1941504. Throughput: 0: 1018.6. Samples: 486274. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:15:57,236][00238] Avg episode reward: [(0, '13.594')]
[2023-02-23 09:16:01,789][12170] Updated weights for policy 0, policy_version 480 (0.0015)
[2023-02-23 09:16:02,234][00238] Fps is (10 sec: 4915.1, 60 sec: 4028.0, 300 sec: 4012.7). Total num frames: 1966080. Throughput: 0: 1037.3. Samples: 490052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:16:02,242][00238] Avg episode reward: [(0, '15.065')]
[2023-02-23 09:16:07,234][00238] Fps is (10 sec: 4095.7, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 1982464. Throughput: 0: 1003.8. Samples: 496046. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:16:07,239][00238] Avg episode reward: [(0, '15.464')]
[2023-02-23 09:16:12,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 1998848. Throughput: 0: 971.6. Samples: 500558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:16:12,236][00238] Avg episode reward: [(0, '16.176')]
[2023-02-23 09:16:14,074][12170] Updated weights for policy 0, policy_version 490 (0.0018)
[2023-02-23 09:16:17,234][00238] Fps is (10 sec: 3686.6, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 2019328. Throughput: 0: 991.7. Samples: 503712. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:16:17,237][00238] Avg episode reward: [(0, '17.618')]
[2023-02-23 09:16:17,294][12156] Saving new best policy, reward=17.618!
[2023-02-23 09:16:22,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2043904. Throughput: 0: 1026.8. Samples: 511112. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:16:22,236][00238] Avg episode reward: [(0, '16.243')]
[2023-02-23 09:16:22,279][12170] Updated weights for policy 0, policy_version 500 (0.0011)
[2023-02-23 09:16:27,238][00238] Fps is (10 sec: 4503.8, 60 sec: 4027.5, 300 sec: 4026.5). Total num frames: 2064384. Throughput: 0: 982.2. Samples: 516622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:16:27,240][00238] Avg episode reward: [(0, '16.607')]
[2023-02-23 09:16:32,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2076672. Throughput: 0: 969.4. Samples: 518874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:16:32,243][00238] Avg episode reward: [(0, '16.221')]
[2023-02-23 09:16:34,602][12170] Updated weights for policy 0, policy_version 510 (0.0011)
[2023-02-23 09:16:37,234][00238] Fps is (10 sec: 3687.9, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2101248. Throughput: 0: 1000.0. Samples: 524802. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:16:37,238][00238] Avg episode reward: [(0, '15.722')]
[2023-02-23 09:16:42,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2121728. Throughput: 0: 1017.0. Samples: 532038. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:16:42,241][00238] Avg episode reward: [(0, '18.147')]
[2023-02-23 09:16:42,277][12156] Saving new best policy, reward=18.147!
[2023-02-23 09:16:43,509][12170] Updated weights for policy 0, policy_version 520 (0.0011)
[2023-02-23 09:16:47,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2142208. Throughput: 0: 985.6. Samples: 534404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:16:47,237][00238] Avg episode reward: [(0, '18.653')]
[2023-02-23 09:16:47,242][12156] Saving new best policy, reward=18.653!
[2023-02-23 09:16:52,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2154496. Throughput: 0: 956.2. Samples: 539074. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:16:52,236][00238] Avg episode reward: [(0, '19.309')]
[2023-02-23 09:16:52,253][12156] Saving new best policy, reward=19.309!
[2023-02-23 09:16:55,004][12170] Updated weights for policy 0, policy_version 530 (0.0020)
[2023-02-23 09:16:57,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 2179072. Throughput: 0: 1010.6. Samples: 546036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:16:57,241][00238] Avg episode reward: [(0, '21.995')]
[2023-02-23 09:16:57,245][12156] Saving new best policy, reward=21.995!
[2023-02-23 09:17:02,234][00238] Fps is (10 sec: 4915.1, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2203648. Throughput: 0: 1020.7. Samples: 549644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:17:02,240][00238] Avg episode reward: [(0, '23.587')]
[2023-02-23 09:17:02,260][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000538_2203648.pth...
[2023-02-23 09:17:02,518][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000302_1236992.pth
[2023-02-23 09:17:02,528][12156] Saving new best policy, reward=23.587!
[2023-02-23 09:17:03,959][12170] Updated weights for policy 0, policy_version 540 (0.0031)
[2023-02-23 09:17:07,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2220032. Throughput: 0: 977.9. Samples: 555118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:17:07,241][00238] Avg episode reward: [(0, '23.481')]
[2023-02-23 09:17:12,234][00238] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2236416. Throughput: 0: 964.1. Samples: 560002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:17:12,236][00238] Avg episode reward: [(0, '22.087')]
[2023-02-23 09:17:15,322][12170] Updated weights for policy 0, policy_version 550 (0.0023)
[2023-02-23 09:17:17,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2260992. Throughput: 0: 995.6. Samples: 563676. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:17,241][00238] Avg episode reward: [(0, '21.207')]
[2023-02-23 09:17:22,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2285568. Throughput: 0: 1027.9. Samples: 571056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:22,236][00238] Avg episode reward: [(0, '20.655')]
[2023-02-23 09:17:24,722][12170] Updated weights for policy 0, policy_version 560 (0.0022)
[2023-02-23 09:17:27,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3891.5, 300 sec: 4012.7). Total num frames: 2297856. Throughput: 0: 976.6. Samples: 575984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:27,241][00238] Avg episode reward: [(0, '20.350')]
[2023-02-23 09:17:32,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2318336. Throughput: 0: 974.8. Samples: 578272. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:32,236][00238] Avg episode reward: [(0, '20.527')]
[2023-02-23 09:17:35,679][12170] Updated weights for policy 0, policy_version 570 (0.0017)
[2023-02-23 09:17:37,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2338816. Throughput: 0: 1019.8. Samples: 584966. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:37,236][00238] Avg episode reward: [(0, '20.414')]
[2023-02-23 09:17:42,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2363392. Throughput: 0: 1025.9. Samples: 592202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:42,240][00238] Avg episode reward: [(0, '21.190')]
[2023-02-23 09:17:45,307][12170] Updated weights for policy 0, policy_version 580 (0.0021)
[2023-02-23 09:17:47,234][00238] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2379776. Throughput: 0: 996.1. Samples: 594470. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:47,242][00238] Avg episode reward: [(0, '21.976')]
[2023-02-23 09:17:52,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2396160. Throughput: 0: 976.4. Samples: 599056. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:17:52,236][00238] Avg episode reward: [(0, '22.285')]
[2023-02-23 09:17:56,074][12170] Updated weights for policy 0, policy_version 590 (0.0021)
[2023-02-23 09:17:57,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2420736. Throughput: 0: 1025.6. Samples: 606152. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:17:57,240][00238] Avg episode reward: [(0, '22.255')]
[2023-02-23 09:18:02,234][00238] Fps is (10 sec: 4914.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2445312. Throughput: 0: 1026.9. Samples: 609886. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:18:02,241][00238] Avg episode reward: [(0, '23.318')]
[2023-02-23 09:18:05,878][12170] Updated weights for policy 0, policy_version 600 (0.0014)
[2023-02-23 09:18:07,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2457600. Throughput: 0: 976.9. Samples: 615016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:18:07,236][00238] Avg episode reward: [(0, '23.530')]
[2023-02-23 09:18:12,234][00238] Fps is (10 sec: 3277.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2478080. Throughput: 0: 986.8. Samples: 620388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 09:18:12,246][00238] Avg episode reward: [(0, '22.143')]
[2023-02-23 09:18:16,084][12170] Updated weights for policy 0, policy_version 610 (0.0019)
[2023-02-23 09:18:17,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2502656. Throughput: 0: 1017.3. Samples: 624052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:18:17,240][00238] Avg episode reward: [(0, '22.590')]
[2023-02-23 09:18:22,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2523136. Throughput: 0: 1032.7. Samples: 631438. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:18:22,239][00238] Avg episode reward: [(0, '23.664')]
[2023-02-23 09:18:22,276][12156] Saving new best policy, reward=23.664!
[2023-02-23 09:18:26,309][12170] Updated weights for policy 0, policy_version 620 (0.0031)
[2023-02-23 09:18:27,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2539520. Throughput: 0: 973.6. Samples: 636014. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 09:18:27,241][00238] Avg episode reward: [(0, '23.153')]
[2023-02-23 09:18:32,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2560000. Throughput: 0: 975.7. Samples: 638378. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:18:32,243][00238] Avg episode reward: [(0, '22.132')]
[2023-02-23 09:18:36,336][12170] Updated weights for policy 0, policy_version 630 (0.0020)
[2023-02-23 09:18:37,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2584576. Throughput: 0: 1033.4. Samples: 645558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:18:37,239][00238] Avg episode reward: [(0, '20.174')]
[2023-02-23 09:18:42,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2605056. Throughput: 0: 1024.0. Samples: 652232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:18:42,239][00238] Avg episode reward: [(0, '21.235')]
[2023-02-23 09:18:46,996][12170] Updated weights for policy 0, policy_version 640 (0.0019)
[2023-02-23 09:18:47,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2621440. Throughput: 0: 991.9. Samples: 654522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:18:47,236][00238] Avg episode reward: [(0, '20.799')]
[2023-02-23 09:18:52,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2641920. Throughput: 0: 995.4. Samples: 659808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:18:52,240][00238] Avg episode reward: [(0, '20.891')]
[2023-02-23 09:18:56,419][12170] Updated weights for policy 0, policy_version 650 (0.0013)
[2023-02-23 09:18:57,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2666496. Throughput: 0: 1040.5. Samples: 667212. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:18:57,239][00238] Avg episode reward: [(0, '21.049')]
[2023-02-23 09:19:02,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 2686976. Throughput: 0: 1039.8. Samples: 670842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:19:02,241][00238] Avg episode reward: [(0, '20.618')]
[2023-02-23 09:19:02,250][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth...
[2023-02-23 09:19:02,391][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000421_1724416.pth
[2023-02-23 09:19:07,241][00238] Fps is (10 sec: 3274.4, 60 sec: 4027.3, 300 sec: 4012.6). Total num frames: 2699264. Throughput: 0: 977.1. Samples: 675416. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:07,247][00238] Avg episode reward: [(0, '20.662')]
[2023-02-23 09:19:07,502][12170] Updated weights for policy 0, policy_version 660 (0.0027)
[2023-02-23 09:19:12,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2719744. Throughput: 0: 1008.7. Samples: 681404. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:12,237][00238] Avg episode reward: [(0, '21.221')]
[2023-02-23 09:19:16,419][12170] Updated weights for policy 0, policy_version 670 (0.0011)
[2023-02-23 09:19:17,234][00238] Fps is (10 sec: 4508.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2744320. Throughput: 0: 1039.2. Samples: 685142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:17,240][00238] Avg episode reward: [(0, '20.846')]
[2023-02-23 09:19:22,234][00238] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2764800. Throughput: 0: 1028.7. Samples: 691848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:22,238][00238] Avg episode reward: [(0, '22.061')]
[2023-02-23 09:19:27,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2781184. Throughput: 0: 986.3. Samples: 696616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:19:27,245][00238] Avg episode reward: [(0, '23.535')]
[2023-02-23 09:19:27,612][12170] Updated weights for policy 0, policy_version 680 (0.0018)
[2023-02-23 09:19:32,234][00238] Fps is (10 sec: 4096.1, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2805760. Throughput: 0: 998.8. Samples: 699468. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:32,236][00238] Avg episode reward: [(0, '22.051')]
[2023-02-23 09:19:36,411][12170] Updated weights for policy 0, policy_version 690 (0.0026)
[2023-02-23 09:19:37,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4026.7). Total num frames: 2830336. Throughput: 0: 1047.7. Samples: 706956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:37,236][00238] Avg episode reward: [(0, '19.980')]
[2023-02-23 09:19:42,234][00238] Fps is (10 sec: 4095.8, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2846720. Throughput: 0: 1018.4. Samples: 713040. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:19:42,238][00238] Avg episode reward: [(0, '20.217')]
[2023-02-23 09:19:47,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2863104. Throughput: 0: 990.1. Samples: 715396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:47,239][00238] Avg episode reward: [(0, '19.879')]
[2023-02-23 09:19:47,666][12170] Updated weights for policy 0, policy_version 700 (0.0031)
[2023-02-23 09:19:52,234][00238] Fps is (10 sec: 4096.2, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2887680. Throughput: 0: 1020.6. Samples: 721334. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:52,237][00238] Avg episode reward: [(0, '18.496')]
[2023-02-23 09:19:56,354][12170] Updated weights for policy 0, policy_version 710 (0.0019)
[2023-02-23 09:19:57,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2912256. Throughput: 0: 1052.1. Samples: 728748. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:19:57,236][00238] Avg episode reward: [(0, '19.989')]
[2023-02-23 09:20:02,236][00238] Fps is (10 sec: 4095.2, 60 sec: 4027.6, 300 sec: 4026.5). Total num frames: 2928640. Throughput: 0: 1036.4. Samples: 731780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:20:02,239][00238] Avg episode reward: [(0, '20.975')]
[2023-02-23 09:20:07,236][00238] Fps is (10 sec: 3276.2, 60 sec: 4096.4, 300 sec: 4026.6). Total num frames: 2945024. Throughput: 0: 991.2. Samples: 736452. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:20:07,242][00238] Avg episode reward: [(0, '21.878')]
[2023-02-23 09:20:08,019][12170] Updated weights for policy 0, policy_version 720 (0.0016)
[2023-02-23 09:20:12,234][00238] Fps is (10 sec: 4096.8, 60 sec: 4164.3, 300 sec: 4040.5). Total num frames: 2969600. Throughput: 0: 1029.0. Samples: 742922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:20:12,239][00238] Avg episode reward: [(0, '24.013')]
[2023-02-23 09:20:12,250][12156] Saving new best policy, reward=24.013!
[2023-02-23 09:20:16,411][12170] Updated weights for policy 0, policy_version 730 (0.0012)
[2023-02-23 09:20:17,234][00238] Fps is (10 sec: 4916.0, 60 sec: 4164.3, 300 sec: 4040.5). Total num frames: 2994176. Throughput: 0: 1046.5. Samples: 746562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:20:17,238][00238] Avg episode reward: [(0, '23.589')]
[2023-02-23 09:20:22,238][00238] Fps is (10 sec: 4094.2, 60 sec: 4095.7, 300 sec: 4026.5). Total num frames: 3010560. Throughput: 0: 1013.4. Samples: 752562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:20:22,245][00238] Avg episode reward: [(0, '23.732')]
[2023-02-23 09:20:27,234][00238] Fps is (10 sec: 2867.2, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3022848. Throughput: 0: 982.1. Samples: 757236. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:20:27,240][00238] Avg episode reward: [(0, '23.424')]
[2023-02-23 09:20:28,223][12170] Updated weights for policy 0, policy_version 740 (0.0017)
[2023-02-23 09:20:32,234][00238] Fps is (10 sec: 3688.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3047424. Throughput: 0: 1008.0. Samples: 760758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:20:32,237][00238] Avg episode reward: [(0, '22.101')]
[2023-02-23 09:20:36,457][12170] Updated weights for policy 0, policy_version 750 (0.0014)
[2023-02-23 09:20:37,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3072000. Throughput: 0: 1040.5. Samples: 768156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:20:37,237][00238] Avg episode reward: [(0, '21.298')]
[2023-02-23 09:20:42,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 4012.7). Total num frames: 3088384. Throughput: 0: 995.2. Samples: 773534. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:20:42,238][00238] Avg episode reward: [(0, '22.649')]
[2023-02-23 09:20:47,234][00238] Fps is (10 sec: 3276.7, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3104768. Throughput: 0: 979.1. Samples: 775838. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:20:47,239][00238] Avg episode reward: [(0, '21.421')]
[2023-02-23 09:20:48,278][12170] Updated weights for policy 0, policy_version 760 (0.0042)
[2023-02-23 09:20:52,234][00238] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3129344. Throughput: 0: 1021.2. Samples: 782404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:20:52,242][00238] Avg episode reward: [(0, '21.913')]
[2023-02-23 09:20:56,604][12170] Updated weights for policy 0, policy_version 770 (0.0012)
[2023-02-23 09:20:57,235][00238] Fps is (10 sec: 4914.8, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3153920. Throughput: 0: 1040.9. Samples: 789764. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:20:57,240][00238] Avg episode reward: [(0, '21.812')]
[2023-02-23 09:21:02,234][00238] Fps is (10 sec: 4096.1, 60 sec: 4027.9, 300 sec: 4026.6). Total num frames: 3170304. Throughput: 0: 1011.7. Samples: 792088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:21:02,241][00238] Avg episode reward: [(0, '22.223')]
[2023-02-23 09:21:02,260][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth...
[2023-02-23 09:21:02,387][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000538_2203648.pth
[2023-02-23 09:21:07,234][00238] Fps is (10 sec: 3277.2, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 3186688. Throughput: 0: 982.4. Samples: 796766. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:21:07,239][00238] Avg episode reward: [(0, '23.246')]
[2023-02-23 09:21:08,562][12170] Updated weights for policy 0, policy_version 780 (0.0015)
[2023-02-23 09:21:12,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3211264. Throughput: 0: 1036.6. Samples: 803882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:21:12,242][00238] Avg episode reward: [(0, '24.087')]
[2023-02-23 09:21:12,253][12156] Saving new best policy, reward=24.087!
[2023-02-23 09:21:16,770][12170] Updated weights for policy 0, policy_version 790 (0.0016)
[2023-02-23 09:21:17,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3235840. Throughput: 0: 1036.7. Samples: 807408. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:21:17,240][00238] Avg episode reward: [(0, '25.385')]
[2023-02-23 09:21:17,245][12156] Saving new best policy, reward=25.385!
[2023-02-23 09:21:22,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.8, 300 sec: 4012.7). Total num frames: 3248128. Throughput: 0: 990.5. Samples: 812728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:21:22,239][00238] Avg episode reward: [(0, '25.508')]
[2023-02-23 09:21:22,249][12156] Saving new best policy, reward=25.508!
[2023-02-23 09:21:27,234][00238] Fps is (10 sec: 2867.2, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3264512. Throughput: 0: 979.7. Samples: 817622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:21:27,236][00238] Avg episode reward: [(0, '25.618')]
[2023-02-23 09:21:27,250][12156] Saving new best policy, reward=25.618!
[2023-02-23 09:21:29,147][12170] Updated weights for policy 0, policy_version 800 (0.0026)
[2023-02-23 09:21:32,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3289088. Throughput: 0: 1002.8. Samples: 820964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:21:32,236][00238] Avg episode reward: [(0, '25.626')]
[2023-02-23 09:21:32,248][12156] Saving new best policy, reward=25.626!
[2023-02-23 09:21:37,234][00238] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3313664. Throughput: 0: 1016.1. Samples: 828130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:21:37,241][00238] Avg episode reward: [(0, '25.890')]
[2023-02-23 09:21:37,248][12156] Saving new best policy, reward=25.890!
[2023-02-23 09:21:38,536][12170] Updated weights for policy 0, policy_version 810 (0.0011)
[2023-02-23 09:21:42,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 3325952. Throughput: 0: 960.0. Samples: 832962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:21:42,236][00238] Avg episode reward: [(0, '23.734')]
[2023-02-23 09:21:47,234][00238] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3346432. Throughput: 0: 957.2. Samples: 835164. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:21:47,242][00238] Avg episode reward: [(0, '25.252')]
[2023-02-23 09:21:49,600][12170] Updated weights for policy 0, policy_version 820 (0.0015)
[2023-02-23 09:21:52,234][00238] Fps is (10 sec: 4505.5, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3371008. Throughput: 0: 1008.9. Samples: 842166. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:21:52,236][00238] Avg episode reward: [(0, '26.434')]
[2023-02-23 09:21:52,244][12156] Saving new best policy, reward=26.434!
[2023-02-23 09:21:57,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3391488. Throughput: 0: 996.5. Samples: 848726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:21:57,238][00238] Avg episode reward: [(0, '25.298')]
[2023-02-23 09:21:59,530][12170] Updated weights for policy 0, policy_version 830 (0.0019)
[2023-02-23 09:22:02,236][00238] Fps is (10 sec: 3276.2, 60 sec: 3891.1, 300 sec: 4012.7). Total num frames: 3403776. Throughput: 0: 968.0. Samples: 850970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:22:02,239][00238] Avg episode reward: [(0, '24.925')]
[2023-02-23 09:22:07,236][00238] Fps is (10 sec: 3276.1, 60 sec: 3959.3, 300 sec: 4026.5). Total num frames: 3424256. Throughput: 0: 958.8. Samples: 855876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:22:07,238][00238] Avg episode reward: [(0, '24.630')]
[2023-02-23 09:22:10,117][12170] Updated weights for policy 0, policy_version 840 (0.0022)
[2023-02-23 09:22:12,234][00238] Fps is (10 sec: 4506.6, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3448832. Throughput: 0: 1013.9. Samples: 863248. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:22:12,236][00238] Avg episode reward: [(0, '23.085')]
[2023-02-23 09:22:17,234][00238] Fps is (10 sec: 4506.5, 60 sec: 3891.2, 300 sec: 4012.7). Total num frames: 3469312. Throughput: 0: 1022.8. Samples: 866988. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:22:17,237][00238] Avg episode reward: [(0, '21.688')]
[2023-02-23 09:22:19,916][12170] Updated weights for policy 0, policy_version 850 (0.0012)
[2023-02-23 09:22:22,234][00238] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3485696. Throughput: 0: 973.7. Samples: 871946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:22:22,241][00238] Avg episode reward: [(0, '22.286')]
[2023-02-23 09:22:27,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3506176. Throughput: 0: 992.6. Samples: 877628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:22:27,237][00238] Avg episode reward: [(0, '20.871')]
[2023-02-23 09:22:30,077][12170] Updated weights for policy 0, policy_version 860 (0.0012)
[2023-02-23 09:22:32,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3530752. Throughput: 0: 1026.3. Samples: 881346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:22:32,239][00238] Avg episode reward: [(0, '20.451')]
[2023-02-23 09:22:37,234][00238] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3551232. Throughput: 0: 1025.3. Samples: 888304. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:22:37,240][00238] Avg episode reward: [(0, '20.702')]
[2023-02-23 09:22:40,435][12170] Updated weights for policy 0, policy_version 870 (0.0015)
[2023-02-23 09:22:42,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3567616. Throughput: 0: 981.7. Samples: 892902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:22:42,236][00238] Avg episode reward: [(0, '21.503')]
[2023-02-23 09:22:47,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3588096. Throughput: 0: 987.2. Samples: 895394. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:22:47,242][00238] Avg episode reward: [(0, '20.773')]
[2023-02-23 09:22:50,335][12170] Updated weights for policy 0, policy_version 880 (0.0031)
[2023-02-23 09:22:52,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 4040.5). Total num frames: 3612672. Throughput: 0: 1041.6. Samples: 902744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:22:52,236][00238] Avg episode reward: [(0, '23.370')]
[2023-02-23 09:22:57,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3633152. Throughput: 0: 1016.1. Samples: 908972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:22:57,239][00238] Avg episode reward: [(0, '24.757')]
[2023-02-23 09:23:01,137][12170] Updated weights for policy 0, policy_version 890 (0.0026)
[2023-02-23 09:23:02,235][00238] Fps is (10 sec: 3276.4, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 3645440. Throughput: 0: 984.2. Samples: 911276. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:02,237][00238] Avg episode reward: [(0, '24.384')]
[2023-02-23 09:23:02,254][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000890_3645440.pth...
[2023-02-23 09:23:02,408][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000656_2686976.pth
[2023-02-23 09:23:07,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.1, 300 sec: 4040.5). Total num frames: 3670016. Throughput: 0: 997.2. Samples: 916818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:07,237][00238] Avg episode reward: [(0, '22.985')]
[2023-02-23 09:23:10,523][12170] Updated weights for policy 0, policy_version 900 (0.0019)
[2023-02-23 09:23:12,234][00238] Fps is (10 sec: 4915.8, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3694592. Throughput: 0: 1035.6. Samples: 924232. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:12,236][00238] Avg episode reward: [(0, '23.968')]
[2023-02-23 09:23:17,234][00238] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3710976. Throughput: 0: 1027.3. Samples: 927574. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:23:17,241][00238] Avg episode reward: [(0, '23.635')]
[2023-02-23 09:23:21,405][12170] Updated weights for policy 0, policy_version 910 (0.0017)
[2023-02-23 09:23:22,235][00238] Fps is (10 sec: 3276.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3727360. Throughput: 0: 975.9. Samples: 932218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:22,239][00238] Avg episode reward: [(0, '23.584')]
[2023-02-23 09:23:27,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3747840. Throughput: 0: 1012.5. Samples: 938466. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:23:27,239][00238] Avg episode reward: [(0, '23.243')]
[2023-02-23 09:23:30,592][12170] Updated weights for policy 0, policy_version 920 (0.0014)
[2023-02-23 09:23:32,234][00238] Fps is (10 sec: 4505.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3772416. Throughput: 0: 1040.5. Samples: 942216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:32,241][00238] Avg episode reward: [(0, '23.055')]
[2023-02-23 09:23:37,237][00238] Fps is (10 sec: 4504.2, 60 sec: 4027.5, 300 sec: 4026.5). Total num frames: 3792896. Throughput: 0: 1017.8. Samples: 948550. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 09:23:37,240][00238] Avg episode reward: [(0, '24.654')]
[2023-02-23 09:23:41,670][12170] Updated weights for policy 0, policy_version 930 (0.0011)
[2023-02-23 09:23:42,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3809280. Throughput: 0: 982.8. Samples: 953200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:23:42,236][00238] Avg episode reward: [(0, '24.432')]
[2023-02-23 09:23:47,234][00238] Fps is (10 sec: 3687.5, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3829760. Throughput: 0: 998.2. Samples: 956194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:23:47,237][00238] Avg episode reward: [(0, '24.951')]
[2023-02-23 09:23:50,756][12170] Updated weights for policy 0, policy_version 940 (0.0025)
[2023-02-23 09:23:52,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3854336. Throughput: 0: 1042.3. Samples: 963722. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:52,237][00238] Avg episode reward: [(0, '25.501')]
[2023-02-23 09:23:57,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3874816. Throughput: 0: 1006.3. Samples: 969516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:23:57,236][00238] Avg episode reward: [(0, '26.046')]
[2023-02-23 09:24:02,147][12170] Updated weights for policy 0, policy_version 950 (0.0049)
[2023-02-23 09:24:02,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4096.1, 300 sec: 4040.6). Total num frames: 3891200. Throughput: 0: 984.7. Samples: 971884. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:24:02,236][00238] Avg episode reward: [(0, '25.233')]
[2023-02-23 09:24:07,234][00238] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3911680. Throughput: 0: 1013.2. Samples: 977812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 09:24:07,242][00238] Avg episode reward: [(0, '25.005')]
[2023-02-23 09:24:11,064][12170] Updated weights for policy 0, policy_version 960 (0.0013)
[2023-02-23 09:24:12,234][00238] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3936256. Throughput: 0: 1035.8. Samples: 985076. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 09:24:12,239][00238] Avg episode reward: [(0, '23.957')]
[2023-02-23 09:24:17,239][00238] Fps is (10 sec: 4093.9, 60 sec: 4027.4, 300 sec: 4026.5). Total num frames: 3952640. Throughput: 0: 1018.8. Samples: 988066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 09:24:17,241][00238] Avg episode reward: [(0, '24.600')]
[2023-02-23 09:24:22,235][00238] Fps is (10 sec: 3276.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3969024. Throughput: 0: 981.6. Samples: 992718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:24:22,237][00238] Avg episode reward: [(0, '23.685')]
[2023-02-23 09:24:22,312][12170] Updated weights for policy 0, policy_version 970 (0.0019)
[2023-02-23 09:24:27,234][00238] Fps is (10 sec: 4098.1, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3993600. Throughput: 0: 1026.0. Samples: 999370. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 09:24:27,240][00238] Avg episode reward: [(0, '25.469')]
[2023-02-23 09:24:29,283][12156] Stopping Batcher_0...
[2023-02-23 09:24:29,283][12156] Loop batcher_evt_loop terminating...
[2023-02-23 09:24:29,284][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 09:24:29,283][00238] Component Batcher_0 stopped!
[2023-02-23 09:24:29,333][12170] Weights refcount: 2 0
[2023-02-23 09:24:29,354][00238] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 09:24:29,357][12170] Stopping InferenceWorker_p0-w0...
[2023-02-23 09:24:29,358][12181] Stopping RolloutWorker_w6...
[2023-02-23 09:24:29,358][00238] Component RolloutWorker_w6 stopped!
[2023-02-23 09:24:29,366][00238] Component RolloutWorker_w0 stopped!
[2023-02-23 09:24:29,368][00238] Component RolloutWorker_w5 stopped!
[2023-02-23 09:24:29,371][12179] Stopping RolloutWorker_w4...
[2023-02-23 09:24:29,373][00238] Component RolloutWorker_w4 stopped!
[2023-02-23 09:24:29,377][12170] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 09:24:29,379][12180] Stopping RolloutWorker_w5...
[2023-02-23 09:24:29,379][12180] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 09:24:29,380][12177] Stopping RolloutWorker_w2...
[2023-02-23 09:24:29,380][12177] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 09:24:29,380][00238] Component RolloutWorker_w2 stopped!
[2023-02-23 09:24:29,386][00238] Component RolloutWorker_w3 stopped!
[2023-02-23 09:24:29,388][12178] Stopping RolloutWorker_w3...
[2023-02-23 09:24:29,366][12175] Stopping RolloutWorker_w0...
[2023-02-23 09:24:29,358][12181] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 09:24:29,371][12179] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 09:24:29,390][12175] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 09:24:29,403][12178] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 09:24:29,415][00238] Component RolloutWorker_w7 stopped!
[2023-02-23 09:24:29,417][12182] Stopping RolloutWorker_w7...
[2023-02-23 09:24:29,418][12182] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 09:24:29,424][00238] Component RolloutWorker_w1 stopped!
[2023-02-23 09:24:29,426][12176] Stopping RolloutWorker_w1...
[2023-02-23 09:24:29,427][12176] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 09:24:29,518][12156] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000774_3170304.pth
[2023-02-23 09:24:29,525][12156] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 09:24:29,706][00238] Component LearnerWorker_p0 stopped!
[2023-02-23 09:24:29,706][12156] Stopping LearnerWorker_p0...
[2023-02-23 09:24:29,709][00238] Waiting for process learner_proc0 to stop...
[2023-02-23 09:24:29,709][12156] Loop learner_proc0_evt_loop terminating...
[2023-02-23 09:24:31,475][00238] Waiting for process inference_proc0-0 to join...
[2023-02-23 09:24:31,902][00238] Waiting for process rollout_proc0 to join...
[2023-02-23 09:24:32,155][00238] Waiting for process rollout_proc1 to join...
[2023-02-23 09:24:32,157][00238] Waiting for process rollout_proc2 to join...
[2023-02-23 09:24:32,161][00238] Waiting for process rollout_proc3 to join...
[2023-02-23 09:24:32,163][00238] Waiting for process rollout_proc4 to join...
[2023-02-23 09:24:32,164][00238] Waiting for process rollout_proc5 to join...
[2023-02-23 09:24:32,167][00238] Waiting for process rollout_proc6 to join...
[2023-02-23 09:24:32,169][00238] Waiting for process rollout_proc7 to join...
[2023-02-23 09:24:32,171][00238] Batcher 0 profile tree view:
batching: 25.1050, releasing_batches: 0.0233
[2023-02-23 09:24:32,174][00238] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 495.8973
update_model: 7.2122
weight_update: 0.0012
one_step: 0.0023
handle_policy_step: 480.1854
deserialize: 14.0650, stack: 2.9727, obs_to_device_normalize: 109.0095, forward: 228.7281, send_messages: 24.7978
prepare_outputs: 76.3867
to_cpu: 48.2503
[2023-02-23 09:24:32,176][00238] Learner 0 profile tree view:
misc: 0.0048, prepare_batch: 15.4590
train: 74.2930
epoch_init: 0.0056, minibatch_init: 0.0090, losses_postprocess: 0.7113, kl_divergence: 0.5427, after_optimizer: 32.9408
calculate_losses: 26.0627
losses_init: 0.0044, forward_head: 1.6491, bptt_initial: 17.3955, tail: 1.0698, advantages_returns: 0.2418, losses: 3.3754
bptt: 1.9902
bptt_forward_core: 1.9092
update: 13.4492
clip: 1.3319
[2023-02-23 09:24:32,179][00238] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.2564, enqueue_policy_requests: 128.9847, env_step: 772.8201, overhead: 18.4395, complete_rollouts: 6.6934
save_policy_outputs: 18.7502
split_output_tensors: 9.4110
[2023-02-23 09:24:32,182][00238] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3186, enqueue_policy_requests: 126.0902, env_step: 777.8579, overhead: 17.8704, complete_rollouts: 6.7435
save_policy_outputs: 18.1966
split_output_tensors: 8.9460
[2023-02-23 09:24:32,184][00238] Loop Runner_EvtLoop terminating...
[2023-02-23 09:24:32,194][00238] Runner profile tree view:
main_loop: 1044.9298
[2023-02-23 09:24:32,196][00238] Collected {0: 4005888}, FPS: 3833.6
[2023-02-23 09:24:42,160][00238] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 09:24:42,162][00238] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 09:24:42,165][00238] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 09:24:42,168][00238] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 09:24:42,171][00238] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 09:24:42,173][00238] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 09:24:42,177][00238] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 09:24:42,178][00238] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 09:24:42,180][00238] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 09:24:42,181][00238] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 09:24:42,183][00238] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 09:24:42,186][00238] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 09:24:42,189][00238] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 09:24:42,190][00238] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 09:24:42,192][00238] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 09:24:42,215][00238] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 09:24:42,217][00238] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 09:24:42,221][00238] RunningMeanStd input shape: (1,)
[2023-02-23 09:24:42,237][00238] ConvEncoder: input_channels=3
[2023-02-23 09:24:42,924][00238] Conv encoder output size: 512
[2023-02-23 09:24:42,926][00238] Policy head output size: 512
[2023-02-23 09:24:45,282][00238] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 09:24:46,856][00238] Num frames 100...
[2023-02-23 09:24:47,013][00238] Num frames 200...
[2023-02-23 09:24:47,166][00238] Num frames 300...
[2023-02-23 09:24:47,321][00238] Num frames 400...
[2023-02-23 09:24:47,481][00238] Num frames 500...
[2023-02-23 09:24:47,643][00238] Num frames 600...
[2023-02-23 09:24:47,797][00238] Num frames 700...
[2023-02-23 09:24:47,962][00238] Num frames 800...
[2023-02-23 09:24:48,120][00238] Num frames 900...
[2023-02-23 09:24:48,277][00238] Num frames 1000...
[2023-02-23 09:24:48,436][00238] Num frames 1100...
[2023-02-23 09:24:48,612][00238] Num frames 1200...
[2023-02-23 09:24:48,771][00238] Num frames 1300...
[2023-02-23 09:24:48,931][00238] Num frames 1400...
[2023-02-23 09:24:49,086][00238] Num frames 1500...
[2023-02-23 09:24:49,197][00238] Num frames 1600...
[2023-02-23 09:24:49,304][00238] Num frames 1700...
[2023-02-23 09:24:49,416][00238] Num frames 1800...
[2023-02-23 09:24:49,527][00238] Num frames 1900...
[2023-02-23 09:24:49,675][00238] Avg episode rewards: #0: 48.839, true rewards: #0: 19.840
[2023-02-23 09:24:49,677][00238] Avg episode reward: 48.839, avg true_objective: 19.840
[2023-02-23 09:24:49,699][00238] Num frames 2000...
[2023-02-23 09:24:49,810][00238] Num frames 2100...
[2023-02-23 09:24:49,919][00238] Num frames 2200...
[2023-02-23 09:24:50,036][00238] Num frames 2300...
[2023-02-23 09:24:50,145][00238] Num frames 2400...
[2023-02-23 09:24:50,255][00238] Num frames 2500...
[2023-02-23 09:24:50,379][00238] Num frames 2600...
[2023-02-23 09:24:50,489][00238] Num frames 2700...
[2023-02-23 09:24:50,603][00238] Num frames 2800...
[2023-02-23 09:24:50,714][00238] Num frames 2900...
[2023-02-23 09:24:50,830][00238] Num frames 3000...
[2023-02-23 09:24:50,943][00238] Num frames 3100...
[2023-02-23 09:24:51,060][00238] Num frames 3200...
[2023-02-23 09:24:51,192][00238] Num frames 3300...
[2023-02-23 09:24:51,302][00238] Num frames 3400...
[2023-02-23 09:24:51,417][00238] Num frames 3500...
[2023-02-23 09:24:51,531][00238] Num frames 3600...
[2023-02-23 09:24:51,643][00238] Num frames 3700...
[2023-02-23 09:24:51,756][00238] Num frames 3800...
[2023-02-23 09:24:51,874][00238] Num frames 3900...
[2023-02-23 09:24:51,991][00238] Num frames 4000...
[2023-02-23 09:24:52,117][00238] Avg episode rewards: #0: 51.319, true rewards: #0: 20.320
[2023-02-23 09:24:52,118][00238] Avg episode reward: 51.319, avg true_objective: 20.320
[2023-02-23 09:24:52,161][00238] Num frames 4100...
[2023-02-23 09:24:52,270][00238] Num frames 4200...
[2023-02-23 09:24:52,381][00238] Num frames 4300...
[2023-02-23 09:24:52,498][00238] Num frames 4400...
[2023-02-23 09:24:52,610][00238] Num frames 4500...
[2023-02-23 09:24:52,722][00238] Num frames 4600...
[2023-02-23 09:24:52,830][00238] Num frames 4700...
[2023-02-23 09:24:52,940][00238] Num frames 4800...
[2023-02-23 09:24:53,060][00238] Num frames 4900...
[2023-02-23 09:24:53,171][00238] Num frames 5000...
[2023-02-23 09:24:53,279][00238] Num frames 5100...
[2023-02-23 09:24:53,388][00238] Num frames 5200...
[2023-02-23 09:24:53,500][00238] Num frames 5300...
[2023-02-23 09:24:53,607][00238] Avg episode rewards: #0: 43.813, true rewards: #0: 17.813
[2023-02-23 09:24:53,608][00238] Avg episode reward: 43.813, avg true_objective: 17.813
[2023-02-23 09:24:53,674][00238] Num frames 5400...
[2023-02-23 09:24:53,780][00238] Num frames 5500...
[2023-02-23 09:24:53,888][00238] Num frames 5600...
[2023-02-23 09:24:53,997][00238] Num frames 5700...
[2023-02-23 09:24:54,115][00238] Num frames 5800...
[2023-02-23 09:24:54,225][00238] Num frames 5900...
[2023-02-23 09:24:54,333][00238] Num frames 6000...
[2023-02-23 09:24:54,495][00238] Avg episode rewards: #0: 37.482, true rewards: #0: 15.233
[2023-02-23 09:24:54,497][00238] Avg episode reward: 37.482, avg true_objective: 15.233
[2023-02-23 09:24:54,508][00238] Num frames 6100...
[2023-02-23 09:24:54,633][00238] Num frames 6200...
[2023-02-23 09:24:54,742][00238] Num frames 6300...
[2023-02-23 09:24:54,859][00238] Num frames 6400...
[2023-02-23 09:24:54,969][00238] Num frames 6500...
[2023-02-23 09:24:55,087][00238] Num frames 6600...
[2023-02-23 09:24:55,194][00238] Num frames 6700...
[2023-02-23 09:24:55,310][00238] Num frames 6800...
[2023-02-23 09:24:55,422][00238] Num frames 6900...
[2023-02-23 09:24:55,530][00238] Num frames 7000...
[2023-02-23 09:24:55,627][00238] Avg episode rewards: #0: 33.876, true rewards: #0: 14.076
[2023-02-23 09:24:55,632][00238] Avg episode reward: 33.876, avg true_objective: 14.076
[2023-02-23 09:24:55,702][00238] Num frames 7100...
[2023-02-23 09:24:55,811][00238] Num frames 7200...
[2023-02-23 09:24:55,920][00238] Num frames 7300...
[2023-02-23 09:24:56,029][00238] Num frames 7400...
[2023-02-23 09:24:56,144][00238] Num frames 7500...
[2023-02-23 09:24:56,255][00238] Num frames 7600...
[2023-02-23 09:24:56,364][00238] Num frames 7700...
[2023-02-23 09:24:56,481][00238] Num frames 7800...
[2023-02-23 09:24:56,596][00238] Num frames 7900...
[2023-02-23 09:24:56,706][00238] Num frames 8000...
[2023-02-23 09:24:56,816][00238] Num frames 8100...
[2023-02-23 09:24:56,927][00238] Num frames 8200...
[2023-02-23 09:24:57,037][00238] Num frames 8300...
[2023-02-23 09:24:57,152][00238] Num frames 8400...
[2023-02-23 09:24:57,262][00238] Num frames 8500...
[2023-02-23 09:24:57,372][00238] Num frames 8600...
[2023-02-23 09:24:57,437][00238] Avg episode rewards: #0: 34.013, true rewards: #0: 14.347
[2023-02-23 09:24:57,439][00238] Avg episode reward: 34.013, avg true_objective: 14.347
[2023-02-23 09:24:57,548][00238] Num frames 8700...
[2023-02-23 09:24:57,659][00238] Num frames 8800...
[2023-02-23 09:24:57,769][00238] Num frames 8900...
[2023-02-23 09:24:57,879][00238] Num frames 9000...
[2023-02-23 09:24:57,997][00238] Num frames 9100...
[2023-02-23 09:24:58,115][00238] Num frames 9200...
[2023-02-23 09:24:58,227][00238] Num frames 9300...
[2023-02-23 09:24:58,336][00238] Num frames 9400...
[2023-02-23 09:24:58,448][00238] Num frames 9500...
[2023-02-23 09:24:58,560][00238] Num frames 9600...
[2023-02-23 09:24:58,678][00238] Num frames 9700...
[2023-02-23 09:24:58,791][00238] Num frames 9800...
[2023-02-23 09:24:58,905][00238] Num frames 9900...
[2023-02-23 09:24:59,013][00238] Num frames 10000...
[2023-02-23 09:24:59,157][00238] Num frames 10100...
[2023-02-23 09:24:59,305][00238] Num frames 10200...
[2023-02-23 09:24:59,462][00238] Num frames 10300...
[2023-02-23 09:24:59,619][00238] Num frames 10400...
[2023-02-23 09:24:59,770][00238] Num frames 10500...
[2023-02-23 09:24:59,886][00238] Avg episode rewards: #0: 36.055, true rewards: #0: 15.056
[2023-02-23 09:24:59,888][00238] Avg episode reward: 36.055, avg true_objective: 15.056
[2023-02-23 09:24:59,983][00238] Num frames 10600...
[2023-02-23 09:25:00,136][00238] Num frames 10700...
[2023-02-23 09:25:00,292][00238] Num frames 10800...
[2023-02-23 09:25:00,448][00238] Num frames 10900...
[2023-02-23 09:25:00,605][00238] Num frames 11000...
[2023-02-23 09:25:00,754][00238] Num frames 11100...
[2023-02-23 09:25:00,906][00238] Num frames 11200...
[2023-02-23 09:25:01,066][00238] Num frames 11300...
[2023-02-23 09:25:01,222][00238] Num frames 11400...
[2023-02-23 09:25:01,387][00238] Num frames 11500...
[2023-02-23 09:25:01,543][00238] Num frames 11600...
[2023-02-23 09:25:01,703][00238] Num frames 11700...
[2023-02-23 09:25:01,858][00238] Num frames 11800...
[2023-02-23 09:25:02,014][00238] Num frames 11900...
[2023-02-23 09:25:02,177][00238] Num frames 12000...
[2023-02-23 09:25:02,342][00238] Num frames 12100...
[2023-02-23 09:25:02,514][00238] Avg episode rewards: #0: 35.963, true rewards: #0: 15.214
[2023-02-23 09:25:02,516][00238] Avg episode reward: 35.963, avg true_objective: 15.214
[2023-02-23 09:25:02,560][00238] Num frames 12200...
[2023-02-23 09:25:02,680][00238] Num frames 12300...
[2023-02-23 09:25:02,791][00238] Num frames 12400...
[2023-02-23 09:25:02,901][00238] Num frames 12500...
[2023-02-23 09:25:03,016][00238] Num frames 12600...
[2023-02-23 09:25:03,125][00238] Num frames 12700...
[2023-02-23 09:25:03,247][00238] Num frames 12800...
[2023-02-23 09:25:03,317][00238] Avg episode rewards: #0: 33.456, true rewards: #0: 14.234
[2023-02-23 09:25:03,318][00238] Avg episode reward: 33.456, avg true_objective: 14.234
[2023-02-23 09:25:03,416][00238] Num frames 12900...
[2023-02-23 09:25:03,531][00238] Num frames 13000...
[2023-02-23 09:25:03,642][00238] Num frames 13100...
[2023-02-23 09:25:03,751][00238] Num frames 13200...
[2023-02-23 09:25:03,858][00238] Num frames 13300...
[2023-02-23 09:25:03,970][00238] Num frames 13400...
[2023-02-23 09:25:04,078][00238] Num frames 13500...
[2023-02-23 09:25:04,193][00238] Num frames 13600...
[2023-02-23 09:25:04,301][00238] Avg episode rewards: #0: 31.843, true rewards: #0: 13.643
[2023-02-23 09:25:04,303][00238] Avg episode reward: 31.843, avg true_objective: 13.643
[2023-02-23 09:26:21,578][00238] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 09:26:53,716][00238] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 09:26:53,718][00238] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 09:26:53,720][00238] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 09:26:53,723][00238] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 09:26:53,726][00238] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 09:26:53,728][00238] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 09:26:53,730][00238] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 09:26:53,731][00238] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 09:26:53,733][00238] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 09:26:53,734][00238] Adding new argument 'hf_repository'='besa2001/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 09:26:53,735][00238] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 09:26:53,737][00238] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 09:26:53,738][00238] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 09:26:53,739][00238] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 09:26:53,741][00238] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 09:26:53,762][00238] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 09:26:53,765][00238] RunningMeanStd input shape: (1,)
[2023-02-23 09:26:53,777][00238] ConvEncoder: input_channels=3
[2023-02-23 09:26:53,812][00238] Conv encoder output size: 512
[2023-02-23 09:26:53,814][00238] Policy head output size: 512
[2023-02-23 09:26:53,832][00238] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 09:26:54,379][00238] Num frames 100...
[2023-02-23 09:26:54,490][00238] Num frames 200...
[2023-02-23 09:26:54,600][00238] Num frames 300...
[2023-02-23 09:26:54,728][00238] Num frames 400...
[2023-02-23 09:26:54,838][00238] Num frames 500...
[2023-02-23 09:26:54,966][00238] Num frames 600...
[2023-02-23 09:26:55,076][00238] Num frames 700...
[2023-02-23 09:26:55,195][00238] Num frames 800...
[2023-02-23 09:26:55,305][00238] Num frames 900...
[2023-02-23 09:26:55,417][00238] Num frames 1000...
[2023-02-23 09:26:55,525][00238] Num frames 1100...
[2023-02-23 09:26:55,637][00238] Num frames 1200...
[2023-02-23 09:26:55,756][00238] Num frames 1300...
[2023-02-23 09:26:55,873][00238] Num frames 1400...
[2023-02-23 09:26:55,987][00238] Num frames 1500...
[2023-02-23 09:26:56,100][00238] Num frames 1600...
[2023-02-23 09:26:56,210][00238] Num frames 1700...
[2023-02-23 09:26:56,340][00238] Num frames 1800...
[2023-02-23 09:26:56,453][00238] Num frames 1900...
[2023-02-23 09:26:56,562][00238] Num frames 2000...
[2023-02-23 09:26:56,679][00238] Num frames 2100...
[2023-02-23 09:26:56,732][00238] Avg episode rewards: #0: 61.999, true rewards: #0: 21.000
[2023-02-23 09:26:56,734][00238] Avg episode reward: 61.999, avg true_objective: 21.000
[2023-02-23 09:26:56,853][00238] Num frames 2200...
[2023-02-23 09:26:56,964][00238] Num frames 2300...
[2023-02-23 09:26:57,074][00238] Num frames 2400...
[2023-02-23 09:26:57,188][00238] Num frames 2500...
[2023-02-23 09:26:57,302][00238] Num frames 2600...
[2023-02-23 09:26:57,424][00238] Num frames 2700...
[2023-02-23 09:26:57,540][00238] Num frames 2800...
[2023-02-23 09:26:57,651][00238] Num frames 2900...
[2023-02-23 09:26:57,769][00238] Num frames 3000...
[2023-02-23 09:26:57,883][00238] Num frames 3100...
[2023-02-23 09:26:58,000][00238] Avg episode rewards: #0: 41.779, true rewards: #0: 15.780
[2023-02-23 09:26:58,003][00238] Avg episode reward: 41.779, avg true_objective: 15.780
[2023-02-23 09:26:58,053][00238] Num frames 3200...
[2023-02-23 09:26:58,167][00238] Num frames 3300...
[2023-02-23 09:26:58,280][00238] Num frames 3400...
[2023-02-23 09:26:58,397][00238] Num frames 3500...
[2023-02-23 09:26:58,506][00238] Num frames 3600...
[2023-02-23 09:26:58,627][00238] Num frames 3700...
[2023-02-23 09:26:58,742][00238] Num frames 3800...
[2023-02-23 09:26:58,854][00238] Num frames 3900...
[2023-02-23 09:26:58,966][00238] Num frames 4000...
[2023-02-23 09:26:59,121][00238] Num frames 4100...
[2023-02-23 09:26:59,286][00238] Num frames 4200...
[2023-02-23 09:26:59,443][00238] Num frames 4300...
[2023-02-23 09:26:59,595][00238] Num frames 4400...
[2023-02-23 09:26:59,749][00238] Num frames 4500...
[2023-02-23 09:26:59,906][00238] Avg episode rewards: #0: 38.880, true rewards: #0: 15.213
[2023-02-23 09:26:59,912][00238] Avg episode reward: 38.880, avg true_objective: 15.213
[2023-02-23 09:26:59,979][00238] Num frames 4600...
[2023-02-23 09:27:00,386][00238] Num frames 4700...
[2023-02-23 09:27:00,703][00238] Num frames 4800...
[2023-02-23 09:27:00,951][00238] Avg episode rewards: #0: 30.130, true rewards: #0: 12.130
[2023-02-23 09:27:00,962][00238] Avg episode reward: 30.130, avg true_objective: 12.130
[2023-02-23 09:27:01,132][00238] Num frames 4900...
[2023-02-23 09:27:01,424][00238] Num frames 5000...
[2023-02-23 09:27:01,747][00238] Num frames 5100...
[2023-02-23 09:27:02,188][00238] Num frames 5200...
[2023-02-23 09:27:02,535][00238] Num frames 5300...
[2023-02-23 09:27:02,588][00238] Avg episode rewards: #0: 25.800, true rewards: #0: 10.600
[2023-02-23 09:27:02,590][00238] Avg episode reward: 25.800, avg true_objective: 10.600
[2023-02-23 09:27:02,926][00238] Num frames 5400...
[2023-02-23 09:27:03,247][00238] Num frames 5500...
[2023-02-23 09:27:03,437][00238] Num frames 5600...
[2023-02-23 09:27:03,608][00238] Num frames 5700...
[2023-02-23 09:27:03,809][00238] Num frames 5800...
[2023-02-23 09:27:04,035][00238] Avg episode rewards: #0: 23.310, true rewards: #0: 9.810
[2023-02-23 09:27:04,041][00238] Avg episode reward: 23.310, avg true_objective: 9.810
[2023-02-23 09:27:04,073][00238] Num frames 5900...
[2023-02-23 09:27:04,338][00238] Num frames 6000...
[2023-02-23 09:27:04,547][00238] Num frames 6100...
[2023-02-23 09:27:04,726][00238] Num frames 6200...
[2023-02-23 09:27:04,893][00238] Num frames 6300...
[2023-02-23 09:27:05,052][00238] Num frames 6400...
[2023-02-23 09:27:05,251][00238] Num frames 6500...
[2023-02-23 09:27:05,449][00238] Num frames 6600...
[2023-02-23 09:27:05,637][00238] Num frames 6700...
[2023-02-23 09:27:05,823][00238] Num frames 6800...
[2023-02-23 09:27:05,994][00238] Num frames 6900...
[2023-02-23 09:27:06,290][00238] Num frames 7000...
[2023-02-23 09:27:06,512][00238] Num frames 7100...
[2023-02-23 09:27:06,712][00238] Avg episode rewards: #0: 25.068, true rewards: #0: 10.211
[2023-02-23 09:27:06,720][00238] Avg episode reward: 25.068, avg true_objective: 10.211
[2023-02-23 09:27:06,831][00238] Num frames 7200...
[2023-02-23 09:27:07,018][00238] Num frames 7300...
[2023-02-23 09:27:07,302][00238] Num frames 7400...
[2023-02-23 09:27:07,503][00238] Num frames 7500...
[2023-02-23 09:27:07,699][00238] Num frames 7600...
[2023-02-23 09:27:07,922][00238] Num frames 7700...
[2023-02-23 09:27:08,037][00238] Num frames 7800...
[2023-02-23 09:27:08,149][00238] Num frames 7900...
[2023-02-23 09:27:08,262][00238] Num frames 8000...
[2023-02-23 09:27:08,378][00238] Num frames 8100...
[2023-02-23 09:27:08,500][00238] Num frames 8200...
[2023-02-23 09:27:08,635][00238] Avg episode rewards: #0: 25.210, true rewards: #0: 10.335
[2023-02-23 09:27:08,636][00238] Avg episode reward: 25.210, avg true_objective: 10.335
[2023-02-23 09:27:08,684][00238] Num frames 8300...
[2023-02-23 09:27:08,793][00238] Num frames 8400...
[2023-02-23 09:27:08,912][00238] Num frames 8500...
[2023-02-23 09:27:09,030][00238] Num frames 8600...
[2023-02-23 09:27:09,141][00238] Num frames 8700...
[2023-02-23 09:27:09,256][00238] Num frames 8800...
[2023-02-23 09:27:09,374][00238] Num frames 8900...
[2023-02-23 09:27:09,483][00238] Num frames 9000...
[2023-02-23 09:27:09,594][00238] Num frames 9100...
[2023-02-23 09:27:09,708][00238] Num frames 9200...
[2023-02-23 09:27:09,819][00238] Num frames 9300...
[2023-02-23 09:27:09,935][00238] Num frames 9400...
[2023-02-23 09:27:10,052][00238] Num frames 9500...
[2023-02-23 09:27:10,160][00238] Num frames 9600...
[2023-02-23 09:27:10,274][00238] Num frames 9700...
[2023-02-23 09:27:10,388][00238] Num frames 9800...
[2023-02-23 09:27:10,500][00238] Num frames 9900...
[2023-02-23 09:27:10,616][00238] Num frames 10000...
[2023-02-23 09:27:10,752][00238] Avg episode rewards: #0: 27.963, true rewards: #0: 11.186
[2023-02-23 09:27:10,754][00238] Avg episode reward: 27.963, avg true_objective: 11.186
[2023-02-23 09:27:10,794][00238] Num frames 10100...
[2023-02-23 09:27:10,904][00238] Num frames 10200...
[2023-02-23 09:27:11,021][00238] Num frames 10300...
[2023-02-23 09:27:11,135][00238] Num frames 10400...
[2023-02-23 09:27:11,247][00238] Num frames 10500...
[2023-02-23 09:27:11,359][00238] Num frames 10600...
[2023-02-23 09:27:11,469][00238] Num frames 10700...
[2023-02-23 09:27:11,583][00238] Num frames 10800...
[2023-02-23 09:27:11,696][00238] Num frames 10900...
[2023-02-23 09:27:11,806][00238] Num frames 11000...
[2023-02-23 09:27:11,914][00238] Num frames 11100...
[2023-02-23 09:27:12,033][00238] Num frames 11200...
[2023-02-23 09:27:12,144][00238] Num frames 11300...
[2023-02-23 09:27:12,258][00238] Num frames 11400...
[2023-02-23 09:27:12,371][00238] Num frames 11500...
[2023-02-23 09:27:12,484][00238] Num frames 11600...
[2023-02-23 09:27:12,612][00238] Num frames 11700...
[2023-02-23 09:27:12,732][00238] Num frames 11800...
[2023-02-23 09:27:12,844][00238] Avg episode rewards: #0: 29.750, true rewards: #0: 11.850
[2023-02-23 09:27:12,845][00238] Avg episode reward: 29.750, avg true_objective: 11.850
[2023-02-23 09:28:21,995][00238] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 09:28:54,262][00238] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 09:28:54,263][00238] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 09:28:54,265][00238] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 09:28:54,267][00238] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 09:28:54,269][00238] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 09:28:54,270][00238] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 09:28:54,272][00238] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 09:28:54,273][00238] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 09:28:54,275][00238] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 09:28:54,276][00238] Adding new argument 'hf_repository'='besa2001/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 09:28:54,277][00238] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 09:28:54,279][00238] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 09:28:54,280][00238] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 09:28:54,281][00238] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 09:28:54,283][00238] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 09:28:54,314][00238] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 09:28:54,318][00238] RunningMeanStd input shape: (1,)
[2023-02-23 09:28:54,336][00238] ConvEncoder: input_channels=3
[2023-02-23 09:28:54,372][00238] Conv encoder output size: 512
[2023-02-23 09:28:54,374][00238] Policy head output size: 512
[2023-02-23 09:28:54,401][00238] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 09:28:54,855][00238] Num frames 100...
[2023-02-23 09:28:54,964][00238] Num frames 200...
[2023-02-23 09:28:55,071][00238] Num frames 300...
[2023-02-23 09:28:55,179][00238] Num frames 400...
[2023-02-23 09:28:55,297][00238] Num frames 500...
[2023-02-23 09:28:55,442][00238] Avg episode rewards: #0: 9.760, true rewards: #0: 5.760
[2023-02-23 09:28:55,444][00238] Avg episode reward: 9.760, avg true_objective: 5.760
[2023-02-23 09:28:55,472][00238] Num frames 600...
[2023-02-23 09:28:55,588][00238] Num frames 700...
[2023-02-23 09:28:55,702][00238] Num frames 800...
[2023-02-23 09:28:55,810][00238] Num frames 900...
[2023-02-23 09:28:55,916][00238] Num frames 1000...
[2023-02-23 09:28:56,025][00238] Num frames 1100...
[2023-02-23 09:28:56,133][00238] Num frames 1200...
[2023-02-23 09:28:56,255][00238] Num frames 1300...
[2023-02-23 09:28:56,370][00238] Num frames 1400...
[2023-02-23 09:28:56,490][00238] Num frames 1500...
[2023-02-23 09:28:56,588][00238] Avg episode rewards: #0: 14.680, true rewards: #0: 7.680
[2023-02-23 09:28:56,590][00238] Avg episode reward: 14.680, avg true_objective: 7.680
[2023-02-23 09:28:56,672][00238] Num frames 1600...
[2023-02-23 09:28:56,785][00238] Num frames 1700...
[2023-02-23 09:28:56,895][00238] Num frames 1800...
[2023-02-23 09:28:57,003][00238] Num frames 1900...
[2023-02-23 09:28:57,114][00238] Num frames 2000...
[2023-02-23 09:28:57,221][00238] Num frames 2100...
[2023-02-23 09:28:57,330][00238] Num frames 2200...
[2023-02-23 09:28:57,417][00238] Avg episode rewards: #0: 14.750, true rewards: #0: 7.417
[2023-02-23 09:28:57,418][00238] Avg episode reward: 14.750, avg true_objective: 7.417
[2023-02-23 09:28:57,500][00238] Num frames 2300...
[2023-02-23 09:28:57,615][00238] Num frames 2400...
[2023-02-23 09:28:57,737][00238] Num frames 2500...
[2023-02-23 09:28:57,846][00238] Num frames 2600...
[2023-02-23 09:28:57,955][00238] Num frames 2700...
[2023-02-23 09:28:58,065][00238] Num frames 2800...
[2023-02-23 09:28:58,173][00238] Num frames 2900...
[2023-02-23 09:28:58,291][00238] Num frames 3000...
[2023-02-23 09:28:58,407][00238] Num frames 3100...
[2023-02-23 09:28:58,524][00238] Num frames 3200...
[2023-02-23 09:28:58,647][00238] Num frames 3300...
[2023-02-23 09:28:58,764][00238] Num frames 3400...
[2023-02-23 09:28:58,875][00238] Num frames 3500...
[2023-02-23 09:28:58,988][00238] Num frames 3600...
[2023-02-23 09:28:59,106][00238] Num frames 3700...
[2023-02-23 09:28:59,230][00238] Avg episode rewards: #0: 20.883, true rewards: #0: 9.382
[2023-02-23 09:28:59,231][00238] Avg episode reward: 20.883, avg true_objective: 9.382
[2023-02-23 09:28:59,285][00238] Num frames 3800...
[2023-02-23 09:28:59,408][00238] Num frames 3900...
[2023-02-23 09:28:59,521][00238] Num frames 4000...
[2023-02-23 09:28:59,639][00238] Num frames 4100...
[2023-02-23 09:28:59,749][00238] Num frames 4200...
[2023-02-23 09:28:59,859][00238] Num frames 4300...
[2023-02-23 09:28:59,972][00238] Num frames 4400...
[2023-02-23 09:29:00,083][00238] Num frames 4500...
[2023-02-23 09:29:00,195][00238] Num frames 4600...
[2023-02-23 09:29:00,312][00238] Num frames 4700...
[2023-02-23 09:29:00,432][00238] Num frames 4800...
[2023-02-23 09:29:00,544][00238] Num frames 4900...
[2023-02-23 09:29:00,658][00238] Num frames 5000...
[2023-02-23 09:29:00,778][00238] Num frames 5100...
[2023-02-23 09:29:00,839][00238] Avg episode rewards: #0: 23.208, true rewards: #0: 10.208
[2023-02-23 09:29:00,840][00238] Avg episode reward: 23.208, avg true_objective: 10.208
[2023-02-23 09:29:00,970][00238] Num frames 5200...
[2023-02-23 09:29:01,148][00238] Num frames 5300...
[2023-02-23 09:29:01,302][00238] Num frames 5400...
[2023-02-23 09:29:01,459][00238] Num frames 5500...
[2023-02-23 09:29:01,610][00238] Num frames 5600...
[2023-02-23 09:29:01,766][00238] Num frames 5700...
[2023-02-23 09:29:01,919][00238] Num frames 5800...
[2023-02-23 09:29:02,075][00238] Num frames 5900...
[2023-02-23 09:29:02,228][00238] Num frames 6000...
[2023-02-23 09:29:02,381][00238] Num frames 6100...
[2023-02-23 09:29:02,541][00238] Num frames 6200...
[2023-02-23 09:29:02,733][00238] Avg episode rewards: #0: 24.147, true rewards: #0: 10.480
[2023-02-23 09:29:02,736][00238] Avg episode reward: 24.147, avg true_objective: 10.480
[2023-02-23 09:29:02,761][00238] Num frames 6300...
[2023-02-23 09:29:02,918][00238] Num frames 6400...
[2023-02-23 09:29:03,077][00238] Num frames 6500...
[2023-02-23 09:29:03,234][00238] Num frames 6600...
[2023-02-23 09:29:03,393][00238] Num frames 6700...
[2023-02-23 09:29:03,554][00238] Num frames 6800...
[2023-02-23 09:29:03,716][00238] Num frames 6900...
[2023-02-23 09:29:03,819][00238] Avg episode rewards: #0: 22.040, true rewards: #0: 9.897
[2023-02-23 09:29:03,821][00238] Avg episode reward: 22.040, avg true_objective: 9.897
[2023-02-23 09:29:03,933][00238] Num frames 7000...
[2023-02-23 09:29:04,092][00238] Num frames 7100...
[2023-02-23 09:29:04,257][00238] Num frames 7200...
[2023-02-23 09:29:04,387][00238] Num frames 7300...
[2023-02-23 09:29:04,502][00238] Num frames 7400...
[2023-02-23 09:29:04,645][00238] Avg episode rewards: #0: 20.465, true rewards: #0: 9.340
[2023-02-23 09:29:04,648][00238] Avg episode reward: 20.465, avg true_objective: 9.340
[2023-02-23 09:29:04,683][00238] Num frames 7500...
[2023-02-23 09:29:04,795][00238] Num frames 7600...
[2023-02-23 09:29:04,906][00238] Num frames 7700...
[2023-02-23 09:29:05,022][00238] Num frames 7800...
[2023-02-23 09:29:05,146][00238] Num frames 7900...
[2023-02-23 09:29:05,273][00238] Num frames 8000...
[2023-02-23 09:29:05,347][00238] Avg episode rewards: #0: 19.240, true rewards: #0: 8.907
[2023-02-23 09:29:05,352][00238] Avg episode reward: 19.240, avg true_objective: 8.907
[2023-02-23 09:29:05,443][00238] Num frames 8100...
[2023-02-23 09:29:05,558][00238] Num frames 8200...
[2023-02-23 09:29:05,681][00238] Num frames 8300...
[2023-02-23 09:29:05,793][00238] Num frames 8400...
[2023-02-23 09:29:05,903][00238] Num frames 8500...
[2023-02-23 09:29:06,013][00238] Num frames 8600...
[2023-02-23 09:29:06,123][00238] Num frames 8700...
[2023-02-23 09:29:06,253][00238] Num frames 8800...
[2023-02-23 09:29:06,366][00238] Num frames 8900...
[2023-02-23 09:29:06,484][00238] Num frames 9000...
[2023-02-23 09:29:06,608][00238] Num frames 9100...
[2023-02-23 09:29:06,730][00238] Num frames 9200...
[2023-02-23 09:29:06,847][00238] Num frames 9300...
[2023-02-23 09:29:06,959][00238] Num frames 9400...
[2023-02-23 09:29:07,071][00238] Num frames 9500...
[2023-02-23 09:29:07,183][00238] Num frames 9600...
[2023-02-23 09:29:07,293][00238] Num frames 9700...
[2023-02-23 09:29:07,442][00238] Avg episode rewards: #0: 21.886, true rewards: #0: 9.786
[2023-02-23 09:29:07,444][00238] Avg episode reward: 21.886, avg true_objective: 9.786
[2023-02-23 09:30:03,509][00238] Replay video saved to /content/train_dir/default_experiment/replay.mp4!