[2023-02-25 17:05:20,778][08744] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-25 17:05:20,781][08744] Rollout worker 0 uses device cpu
[2023-02-25 17:05:20,783][08744] Rollout worker 1 uses device cpu
[2023-02-25 17:05:20,784][08744] Rollout worker 2 uses device cpu
[2023-02-25 17:05:20,785][08744] Rollout worker 3 uses device cpu
[2023-02-25 17:05:20,787][08744] Rollout worker 4 uses device cpu
[2023-02-25 17:05:20,788][08744] Rollout worker 5 uses device cpu
[2023-02-25 17:05:20,789][08744] Rollout worker 6 uses device cpu
[2023-02-25 17:05:20,790][08744] Rollout worker 7 uses device cpu
[2023-02-25 17:05:21,005][08744] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 17:05:21,007][08744] InferenceWorker_p0-w0: min num requests: 2
[2023-02-25 17:05:21,048][08744] Starting all processes...
[2023-02-25 17:05:21,055][08744] Starting process learner_proc0
[2023-02-25 17:05:21,143][08744] Starting all processes...
[2023-02-25 17:05:21,161][08744] Starting process inference_proc0-0
[2023-02-25 17:05:21,162][08744] Starting process rollout_proc0
[2023-02-25 17:05:21,168][08744] Starting process rollout_proc1
[2023-02-25 17:05:21,168][08744] Starting process rollout_proc2
[2023-02-25 17:05:21,168][08744] Starting process rollout_proc3
[2023-02-25 17:05:21,169][08744] Starting process rollout_proc5
[2023-02-25 17:05:21,169][08744] Starting process rollout_proc4
[2023-02-25 17:05:21,187][08744] Starting process rollout_proc6
[2023-02-25 17:05:21,198][08744] Starting process rollout_proc7
[2023-02-25 17:05:32,007][14418] Worker 3 uses CPU cores [1]
[2023-02-25 17:05:32,081][14400] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 17:05:32,083][14400] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-25 17:05:32,221][14416] Worker 0 uses CPU cores [0]
[2023-02-25 17:05:32,257][14414] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 17:05:32,258][14414] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-25 17:05:32,309][14417] Worker 2 uses CPU cores [0]
[2023-02-25 17:05:32,422][14422] Worker 7 uses CPU cores [1]
[2023-02-25 17:05:32,425][14420] Worker 4 uses CPU cores [0]
[2023-02-25 17:05:32,441][14415] Worker 1 uses CPU cores [1]
[2023-02-25 17:05:32,447][14421] Worker 6 uses CPU cores [0]
[2023-02-25 17:05:32,475][14419] Worker 5 uses CPU cores [1]
[2023-02-25 17:05:32,932][14400] Num visible devices: 1
[2023-02-25 17:05:32,932][14414] Num visible devices: 1
[2023-02-25 17:05:32,935][14400] Starting seed is not provided
[2023-02-25 17:05:32,935][14400] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 17:05:32,936][14400] Initializing actor-critic model on device cuda:0
[2023-02-25 17:05:32,936][14400] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 17:05:32,939][14400] RunningMeanStd input shape: (1,)
[2023-02-25 17:05:32,953][14400] ConvEncoder: input_channels=3
[2023-02-25 17:05:33,215][14400] Conv encoder output size: 512
[2023-02-25 17:05:33,216][14400] Policy head output size: 512
[2023-02-25 17:05:33,259][14400] Created Actor Critic model with architecture:
[2023-02-25 17:05:33,260][14400] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-25 17:05:40,182][14400] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-25 17:05:40,183][14400] No checkpoints found
[2023-02-25 17:05:40,183][14400] Did not load from checkpoint, starting from scratch!
[2023-02-25 17:05:40,184][14400] Initialized policy 0 weights for model version 0
[2023-02-25 17:05:40,187][14400] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-25 17:05:40,195][14400] LearnerWorker_p0 finished initialization!
[2023-02-25 17:05:40,305][14414] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 17:05:40,306][14414] RunningMeanStd input shape: (1,)
[2023-02-25 17:05:40,318][14414] ConvEncoder: input_channels=3
[2023-02-25 17:05:40,421][14414] Conv encoder output size: 512
[2023-02-25 17:05:40,421][14414] Policy head output size: 512
[2023-02-25 17:05:40,723][08744] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 17:05:40,994][08744] Heartbeat connected on Batcher_0
[2023-02-25 17:05:41,001][08744] Heartbeat connected on LearnerWorker_p0
[2023-02-25 17:05:41,016][08744] Heartbeat connected on RolloutWorker_w0
[2023-02-25 17:05:41,022][08744] Heartbeat connected on RolloutWorker_w1
[2023-02-25 17:05:41,027][08744] Heartbeat connected on RolloutWorker_w2
[2023-02-25 17:05:41,031][08744] Heartbeat connected on RolloutWorker_w3
[2023-02-25 17:05:41,036][08744] Heartbeat connected on RolloutWorker_w4
[2023-02-25 17:05:41,041][08744] Heartbeat connected on RolloutWorker_w5
[2023-02-25 17:05:41,046][08744] Heartbeat connected on RolloutWorker_w6
[2023-02-25 17:05:41,052][08744] Heartbeat connected on RolloutWorker_w7
[2023-02-25 17:05:42,793][08744] Inference worker 0-0 is ready!
[2023-02-25 17:05:42,795][08744] All inference workers are ready! Signal rollout workers to start!
[2023-02-25 17:05:42,799][08744] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-25 17:05:42,902][14416] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,914][14420] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,948][14417] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,953][14415] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,956][14422] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,958][14421] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,961][14418] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:42,983][14419] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:05:43,474][14415] Decorrelating experience for 0 frames...
[2023-02-25 17:05:43,815][14415] Decorrelating experience for 32 frames...
[2023-02-25 17:05:44,193][14416] Decorrelating experience for 0 frames...
[2023-02-25 17:05:44,201][14420] Decorrelating experience for 0 frames...
[2023-02-25 17:05:44,213][14421] Decorrelating experience for 0 frames...
[2023-02-25 17:05:44,216][14417] Decorrelating experience for 0 frames...
[2023-02-25 17:05:44,901][14422] Decorrelating experience for 0 frames...
[2023-02-25 17:05:44,928][14418] Decorrelating experience for 0 frames...
[2023-02-25 17:05:45,501][14420] Decorrelating experience for 32 frames...
[2023-02-25 17:05:45,506][14421] Decorrelating experience for 32 frames...
[2023-02-25 17:05:45,546][14417] Decorrelating experience for 32 frames...
[2023-02-25 17:05:45,604][14416] Decorrelating experience for 32 frames...
[2023-02-25 17:05:45,676][14415] Decorrelating experience for 64 frames...
[2023-02-25 17:05:45,710][14418] Decorrelating experience for 32 frames...
[2023-02-25 17:05:45,723][08744] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 17:05:46,688][14422] Decorrelating experience for 32 frames...
[2023-02-25 17:05:46,909][14419] Decorrelating experience for 0 frames...
[2023-02-25 17:05:46,979][14420] Decorrelating experience for 64 frames...
[2023-02-25 17:05:47,077][14421] Decorrelating experience for 64 frames...
[2023-02-25 17:05:47,150][14417] Decorrelating experience for 64 frames...
[2023-02-25 17:05:47,184][14418] Decorrelating experience for 64 frames...
[2023-02-25 17:05:48,569][14422] Decorrelating experience for 64 frames...
[2023-02-25 17:05:48,669][14419] Decorrelating experience for 32 frames...
[2023-02-25 17:05:48,940][14416] Decorrelating experience for 64 frames...
[2023-02-25 17:05:48,960][14418] Decorrelating experience for 96 frames...
[2023-02-25 17:05:49,305][14421] Decorrelating experience for 96 frames...
[2023-02-25 17:05:49,303][14420] Decorrelating experience for 96 frames...
[2023-02-25 17:05:49,521][14417] Decorrelating experience for 96 frames...
[2023-02-25 17:05:50,329][14415] Decorrelating experience for 96 frames...
[2023-02-25 17:05:50,450][14419] Decorrelating experience for 64 frames...
[2023-02-25 17:05:50,723][08744] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 17:05:50,953][14422] Decorrelating experience for 96 frames...
[2023-02-25 17:05:51,379][14419] Decorrelating experience for 96 frames...
[2023-02-25 17:05:52,079][14416] Decorrelating experience for 96 frames...
[2023-02-25 17:05:55,723][08744] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 25.6. Samples: 384. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-25 17:05:55,732][08744] Avg episode reward: [(0, '0.640')]
[2023-02-25 17:05:57,897][14400] Signal inference workers to stop experience collection...
[2023-02-25 17:05:57,907][14414] InferenceWorker_p0-w0: stopping experience collection
[2023-02-25 17:06:00,198][14400] Signal inference workers to resume experience collection...
[2023-02-25 17:06:00,199][14414] InferenceWorker_p0-w0: resuming experience collection
[2023-02-25 17:06:00,723][08744] Fps is (10 sec: 409.6, 60 sec: 204.8, 300 sec: 204.8). Total num frames: 4096. Throughput: 0: 111.2. Samples: 2224. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-02-25 17:06:00,726][08744] Avg episode reward: [(0, '2.229')]
[2023-02-25 17:06:05,730][08744] Fps is (10 sec: 2455.8, 60 sec: 982.7, 300 sec: 982.7). Total num frames: 24576. Throughput: 0: 195.3. Samples: 4884. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:06:05,733][08744] Avg episode reward: [(0, '3.567')]
[2023-02-25 17:06:10,723][08744] Fps is (10 sec: 2867.1, 60 sec: 1092.2, 300 sec: 1092.2). Total num frames: 32768. Throughput: 0: 298.1. Samples: 8942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:06:10,726][08744] Avg episode reward: [(0, '3.855')]
[2023-02-25 17:06:12,789][14414] Updated weights for policy 0, policy_version 10 (0.0025)
[2023-02-25 17:06:15,723][08744] Fps is (10 sec: 2049.5, 60 sec: 1287.3, 300 sec: 1287.3). Total num frames: 45056. Throughput: 0: 307.7. Samples: 10768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:06:15,725][08744] Avg episode reward: [(0, '4.478')]
[2023-02-25 17:06:20,723][08744] Fps is (10 sec: 3277.0, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 65536. Throughput: 0: 384.9. Samples: 15394. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:06:20,725][08744] Avg episode reward: [(0, '4.642')]
[2023-02-25 17:06:24,549][14414] Updated weights for policy 0, policy_version 20 (0.0017)
[2023-02-25 17:06:25,723][08744] Fps is (10 sec: 4095.9, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 86016. Throughput: 0: 470.2. Samples: 21158. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:06:25,731][08744] Avg episode reward: [(0, '4.277')]
[2023-02-25 17:06:30,725][08744] Fps is (10 sec: 3276.0, 60 sec: 1966.0, 300 sec: 1966.0). Total num frames: 98304. Throughput: 0: 523.0. Samples: 23536. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 17:06:30,731][08744] Avg episode reward: [(0, '4.172')]
[2023-02-25 17:06:35,725][08744] Fps is (10 sec: 2457.0, 60 sec: 2010.7, 300 sec: 2010.7). Total num frames: 110592. Throughput: 0: 603.8. Samples: 27174. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 17:06:35,733][08744] Avg episode reward: [(0, '4.203')]
[2023-02-25 17:06:35,749][14400] Saving new best policy, reward=4.203!
[2023-02-25 17:06:39,002][14414] Updated weights for policy 0, policy_version 30 (0.0015)
[2023-02-25 17:06:40,723][08744] Fps is (10 sec: 2867.9, 60 sec: 2116.3, 300 sec: 2116.3). Total num frames: 126976. Throughput: 0: 697.3. Samples: 31764. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2023-02-25 17:06:40,724][08744] Avg episode reward: [(0, '4.350')]
[2023-02-25 17:06:40,735][14400] Saving new best policy, reward=4.350!
[2023-02-25 17:06:45,723][08744] Fps is (10 sec: 3277.7, 60 sec: 2389.3, 300 sec: 2205.5). Total num frames: 143360. Throughput: 0: 717.1. Samples: 34492. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:06:45,725][08744] Avg episode reward: [(0, '4.386')]
[2023-02-25 17:06:45,748][14400] Saving new best policy, reward=4.386!
[2023-02-25 17:06:50,723][08744] Fps is (10 sec: 3276.8, 60 sec: 2662.4, 300 sec: 2282.1). Total num frames: 159744. Throughput: 0: 772.0. Samples: 39616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:06:50,728][08744] Avg episode reward: [(0, '4.530')]
[2023-02-25 17:06:50,732][14400] Saving new best policy, reward=4.530!
[2023-02-25 17:06:51,542][14414] Updated weights for policy 0, policy_version 40 (0.0020)
[2023-02-25 17:06:55,723][08744] Fps is (10 sec: 2867.1, 60 sec: 2867.2, 300 sec: 2293.8). Total num frames: 172032. Throughput: 0: 770.5. Samples: 43616. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:06:55,730][08744] Avg episode reward: [(0, '4.600')]
[2023-02-25 17:06:55,741][14400] Saving new best policy, reward=4.600!
[2023-02-25 17:07:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 2355.2). Total num frames: 188416. Throughput: 0: 773.2. Samples: 45564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:07:00,730][08744] Avg episode reward: [(0, '4.569')]
[2023-02-25 17:07:04,090][14414] Updated weights for policy 0, policy_version 50 (0.0021)
[2023-02-25 17:07:05,723][08744] Fps is (10 sec: 3686.5, 60 sec: 3072.4, 300 sec: 2457.6). Total num frames: 208896. Throughput: 0: 798.8. Samples: 51338. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:07:05,729][08744] Avg episode reward: [(0, '4.423')]
[2023-02-25 17:07:10,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.6, 300 sec: 2503.1). Total num frames: 225280. Throughput: 0: 784.7. Samples: 56470. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:07:10,730][08744] Avg episode reward: [(0, '4.351')]
[2023-02-25 17:07:15,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2500.7). Total num frames: 237568. Throughput: 0: 771.6. Samples: 58254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:07:15,732][08744] Avg episode reward: [(0, '4.314')]
[2023-02-25 17:07:15,751][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000058_237568.pth...
[2023-02-25 17:07:18,663][14414] Updated weights for policy 0, policy_version 60 (0.0017)
[2023-02-25 17:07:20,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 2498.6). Total num frames: 249856. Throughput: 0: 775.5. Samples: 62070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:07:20,725][08744] Avg episode reward: [(0, '4.342')]
[2023-02-25 17:07:25,724][08744] Fps is (10 sec: 3276.2, 60 sec: 3071.9, 300 sec: 2574.6). Total num frames: 270336. Throughput: 0: 804.1. Samples: 67952. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:07:25,732][08744] Avg episode reward: [(0, '4.377')]
[2023-02-25 17:07:29,843][14414] Updated weights for policy 0, policy_version 70 (0.0022)
[2023-02-25 17:07:30,730][08744] Fps is (10 sec: 3683.7, 60 sec: 3140.0, 300 sec: 2606.4). Total num frames: 286720. Throughput: 0: 810.1. Samples: 70952. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:07:30,735][08744] Avg episode reward: [(0, '4.406')]
[2023-02-25 17:07:35,723][08744] Fps is (10 sec: 2867.7, 60 sec: 3140.4, 300 sec: 2600.1). Total num frames: 299008. Throughput: 0: 782.0. Samples: 74808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:07:35,725][08744] Avg episode reward: [(0, '4.294')]
[2023-02-25 17:07:40,723][08744] Fps is (10 sec: 2869.3, 60 sec: 3140.3, 300 sec: 2628.3). Total num frames: 315392. Throughput: 0: 784.4. Samples: 78912. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:07:40,729][08744] Avg episode reward: [(0, '4.296')]
[2023-02-25 17:07:43,373][14414] Updated weights for policy 0, policy_version 80 (0.0028)
[2023-02-25 17:07:45,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2687.0). Total num frames: 335872. Throughput: 0: 805.6. Samples: 81818. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:07:45,729][08744] Avg episode reward: [(0, '4.178')]
[2023-02-25 17:07:50,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2709.7). Total num frames: 352256. Throughput: 0: 811.5. Samples: 87854. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:07:50,728][08744] Avg episode reward: [(0, '4.235')]
[2023-02-25 17:07:55,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2700.3). Total num frames: 364544. Throughput: 0: 780.3. Samples: 91582. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:07:55,731][08744] Avg episode reward: [(0, '4.349')]
[2023-02-25 17:07:56,403][14414] Updated weights for policy 0, policy_version 90 (0.0032)
[2023-02-25 17:08:00,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 2691.7). Total num frames: 376832. Throughput: 0: 782.7. Samples: 93476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:08:00,729][08744] Avg episode reward: [(0, '4.510')]
[2023-02-25 17:08:05,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 2740.1). Total num frames: 397312. Throughput: 0: 814.1. Samples: 98704. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:08:05,725][08744] Avg episode reward: [(0, '4.583')]
[2023-02-25 17:08:08,313][14414] Updated weights for policy 0, policy_version 100 (0.0021)
[2023-02-25 17:08:10,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 2758.0). Total num frames: 413696. Throughput: 0: 809.4. Samples: 104374. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:08:10,736][08744] Avg episode reward: [(0, '4.776')]
[2023-02-25 17:08:10,787][14400] Saving new best policy, reward=4.776!
[2023-02-25 17:08:15,724][08744] Fps is (10 sec: 2866.8, 60 sec: 3140.2, 300 sec: 2748.3). Total num frames: 425984. Throughput: 0: 780.8. Samples: 106084. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:08:15,732][08744] Avg episode reward: [(0, '4.512')]
[2023-02-25 17:08:20,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2764.8). Total num frames: 442368. Throughput: 0: 778.3. Samples: 109832. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:08:20,728][08744] Avg episode reward: [(0, '4.527')]
[2023-02-25 17:08:22,334][14414] Updated weights for policy 0, policy_version 110 (0.0021)
[2023-02-25 17:08:25,723][08744] Fps is (10 sec: 3686.9, 60 sec: 3208.6, 300 sec: 2805.1). Total num frames: 462848. Throughput: 0: 825.2. Samples: 116044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:08:25,731][08744] Avg episode reward: [(0, '4.627')]
[2023-02-25 17:08:30,725][08744] Fps is (10 sec: 3685.4, 60 sec: 3208.8, 300 sec: 2819.0). Total num frames: 479232. Throughput: 0: 830.4. Samples: 119188. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:08:30,728][08744] Avg episode reward: [(0, '4.615')]
[2023-02-25 17:08:35,726][08744] Fps is (10 sec: 2456.7, 60 sec: 3140.1, 300 sec: 2785.2). Total num frames: 487424. Throughput: 0: 766.7. Samples: 122358. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:08:35,729][08744] Avg episode reward: [(0, '4.539')]
[2023-02-25 17:08:36,154][14414] Updated weights for policy 0, policy_version 120 (0.0015)
[2023-02-25 17:08:40,723][08744] Fps is (10 sec: 2048.5, 60 sec: 3072.0, 300 sec: 2776.2). Total num frames: 499712. Throughput: 0: 750.8. Samples: 125368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:08:40,733][08744] Avg episode reward: [(0, '4.489')]
[2023-02-25 17:08:45,723][08744] Fps is (10 sec: 2458.5, 60 sec: 2935.5, 300 sec: 2767.6). Total num frames: 512000. Throughput: 0: 740.1. Samples: 126780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:08:45,728][08744] Avg episode reward: [(0, '4.580')]
[2023-02-25 17:08:50,279][14414] Updated weights for policy 0, policy_version 130 (0.0016)
[2023-02-25 17:08:50,728][08744] Fps is (10 sec: 3275.2, 60 sec: 3003.5, 300 sec: 2802.5). Total num frames: 532480. Throughput: 0: 744.3. Samples: 132202. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:08:50,734][08744] Avg episode reward: [(0, '4.607')]
[2023-02-25 17:08:55,723][08744] Fps is (10 sec: 3686.3, 60 sec: 3072.0, 300 sec: 2814.7). Total num frames: 548864. Throughput: 0: 744.2. Samples: 137862. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:08:55,728][08744] Avg episode reward: [(0, '4.758')]
[2023-02-25 17:09:00,727][08744] Fps is (10 sec: 2867.3, 60 sec: 3071.8, 300 sec: 2805.7). Total num frames: 561152. Throughput: 0: 748.7. Samples: 139776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:09:00,730][08744] Avg episode reward: [(0, '4.748')]
[2023-02-25 17:09:04,307][14414] Updated weights for policy 0, policy_version 140 (0.0033)
[2023-02-25 17:09:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3003.7, 300 sec: 2817.2). Total num frames: 577536. Throughput: 0: 748.9. Samples: 143534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:09:05,729][08744] Avg episode reward: [(0, '4.662')]
[2023-02-25 17:09:10,723][08744] Fps is (10 sec: 3278.3, 60 sec: 3003.7, 300 sec: 2828.2). Total num frames: 593920. Throughput: 0: 732.0. Samples: 148982. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:09:10,726][08744] Avg episode reward: [(0, '4.529')]
[2023-02-25 17:09:15,017][14414] Updated weights for policy 0, policy_version 150 (0.0021)
[2023-02-25 17:09:15,723][08744] Fps is (10 sec: 3686.5, 60 sec: 3140.3, 300 sec: 2857.7). Total num frames: 614400. Throughput: 0: 729.0. Samples: 151992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:09:15,728][08744] Avg episode reward: [(0, '4.614')]
[2023-02-25 17:09:15,740][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000150_614400.pth...
[2023-02-25 17:09:20,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 2848.6). Total num frames: 626688. Throughput: 0: 755.7. Samples: 156362. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:09:20,734][08744] Avg episode reward: [(0, '4.498')]
[2023-02-25 17:09:25,723][08744] Fps is (10 sec: 2457.6, 60 sec: 2935.5, 300 sec: 2839.9). Total num frames: 638976. Throughput: 0: 771.4. Samples: 160080. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-25 17:09:25,725][08744] Avg episode reward: [(0, '4.610')]
[2023-02-25 17:09:29,640][14414] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-02-25 17:09:30,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3003.9, 300 sec: 2867.2). Total num frames: 659456. Throughput: 0: 799.3. Samples: 162750. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:09:30,730][08744] Avg episode reward: [(0, '4.637')]
[2023-02-25 17:09:35,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.4, 300 sec: 2875.9). Total num frames: 675840. Throughput: 0: 808.6. Samples: 168584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:09:35,736][08744] Avg episode reward: [(0, '4.827')]
[2023-02-25 17:09:35,749][14400] Saving new best policy, reward=4.827!
[2023-02-25 17:09:40,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 2867.2). Total num frames: 688128. Throughput: 0: 781.2. Samples: 173016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:09:40,734][08744] Avg episode reward: [(0, '4.839')]
[2023-02-25 17:09:40,738][14400] Saving new best policy, reward=4.839!
[2023-02-25 17:09:42,225][14414] Updated weights for policy 0, policy_version 170 (0.0025)
[2023-02-25 17:09:45,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2875.6). Total num frames: 704512. Throughput: 0: 784.0. Samples: 175054. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:09:45,730][08744] Avg episode reward: [(0, '4.863')]
[2023-02-25 17:09:45,743][14400] Saving new best policy, reward=4.863!
[2023-02-25 17:09:50,726][08744] Fps is (10 sec: 3685.1, 60 sec: 3208.6, 300 sec: 2899.9). Total num frames: 724992. Throughput: 0: 815.1. Samples: 180218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:09:50,730][08744] Avg episode reward: [(0, '5.033')]
[2023-02-25 17:09:50,738][14400] Saving new best policy, reward=5.033!
[2023-02-25 17:09:53,707][14414] Updated weights for policy 0, policy_version 180 (0.0025)
[2023-02-25 17:09:55,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2907.4). Total num frames: 741376. Throughput: 0: 824.0. Samples: 186062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:09:55,730][08744] Avg episode reward: [(0, '5.025')]
[2023-02-25 17:10:00,723][08744] Fps is (10 sec: 3278.0, 60 sec: 3277.1, 300 sec: 2914.5). Total num frames: 757760. Throughput: 0: 810.1. Samples: 188446. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:10:00,727][08744] Avg episode reward: [(0, '4.923')]
[2023-02-25 17:10:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2905.8). Total num frames: 770048. Throughput: 0: 792.7. Samples: 192034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:10:05,729][08744] Avg episode reward: [(0, '5.099')]
[2023-02-25 17:10:05,744][14400] Saving new best policy, reward=5.099!
[2023-02-25 17:10:07,753][14414] Updated weights for policy 0, policy_version 190 (0.0023)
[2023-02-25 17:10:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 2912.7). Total num frames: 786432. Throughput: 0: 822.0. Samples: 197072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:10:10,730][08744] Avg episode reward: [(0, '5.299')]
[2023-02-25 17:10:10,732][14400] Saving new best policy, reward=5.299!
[2023-02-25 17:10:15,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2934.2). Total num frames: 806912. Throughput: 0: 827.3. Samples: 199978. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:10:15,728][08744] Avg episode reward: [(0, '5.744')]
[2023-02-25 17:10:15,739][14400] Saving new best policy, reward=5.744!
[2023-02-25 17:10:19,047][14414] Updated weights for policy 0, policy_version 200 (0.0013)
[2023-02-25 17:10:20,726][08744] Fps is (10 sec: 3275.6, 60 sec: 3208.3, 300 sec: 2925.7). Total num frames: 819200. Throughput: 0: 811.6. Samples: 205110. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:10:20,730][08744] Avg episode reward: [(0, '5.686')]
[2023-02-25 17:10:25,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 2917.5). Total num frames: 831488. Throughput: 0: 795.5. Samples: 208814. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:10:25,728][08744] Avg episode reward: [(0, '5.266')]
[2023-02-25 17:10:30,723][08744] Fps is (10 sec: 3278.0, 60 sec: 3208.5, 300 sec: 2937.8). Total num frames: 851968. Throughput: 0: 802.8. Samples: 211180. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:10:30,725][08744] Avg episode reward: [(0, '5.335')]
[2023-02-25 17:10:32,317][14414] Updated weights for policy 0, policy_version 210 (0.0022)
[2023-02-25 17:10:35,723][08744] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 2957.5). Total num frames: 872448. Throughput: 0: 818.6. Samples: 217054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:10:35,726][08744] Avg episode reward: [(0, '5.309')]
[2023-02-25 17:10:40,727][08744] Fps is (10 sec: 3275.3, 60 sec: 3276.5, 300 sec: 2999.1). Total num frames: 884736. Throughput: 0: 795.5. Samples: 221864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:10:40,731][08744] Avg episode reward: [(0, '5.515')]
[2023-02-25 17:10:45,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3040.8). Total num frames: 897024. Throughput: 0: 784.6. Samples: 223754. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:10:45,727][08744] Avg episode reward: [(0, '5.396')]
[2023-02-25 17:10:46,126][14414] Updated weights for policy 0, policy_version 220 (0.0021)
[2023-02-25 17:10:50,723][08744] Fps is (10 sec: 3278.2, 60 sec: 3208.7, 300 sec: 3110.2). Total num frames: 917504. Throughput: 0: 800.3. Samples: 228046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:10:50,726][08744] Avg episode reward: [(0, '5.215')]
[2023-02-25 17:10:55,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3151.8). Total num frames: 933888. Throughput: 0: 822.3. Samples: 234076. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:10:55,729][08744] Avg episode reward: [(0, '5.265')]
[2023-02-25 17:10:56,917][14414] Updated weights for policy 0, policy_version 230 (0.0016)
[2023-02-25 17:11:00,723][08744] Fps is (10 sec: 3276.9, 60 sec: 3208.5, 300 sec: 3138.0). Total num frames: 950272. Throughput: 0: 820.6. Samples: 236904. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:00,731][08744] Avg episode reward: [(0, '5.549')]
[2023-02-25 17:11:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3151.8). Total num frames: 962560. Throughput: 0: 789.2. Samples: 240620. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:05,728][08744] Avg episode reward: [(0, '5.692')]
[2023-02-25 17:11:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 978944. Throughput: 0: 804.1. Samples: 245000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:10,728][08744] Avg episode reward: [(0, '5.749')]
[2023-02-25 17:11:10,731][14400] Saving new best policy, reward=5.749!
[2023-02-25 17:11:11,258][14414] Updated weights for policy 0, policy_version 240 (0.0013)
[2023-02-25 17:11:15,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 999424. Throughput: 0: 818.8. Samples: 248026. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:15,724][08744] Avg episode reward: [(0, '6.110')]
[2023-02-25 17:11:15,741][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000244_999424.pth...
[2023-02-25 17:11:15,906][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000058_237568.pth
[2023-02-25 17:11:15,916][14400] Saving new best policy, reward=6.110!
[2023-02-25 17:11:20,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3277.0, 300 sec: 3151.8). Total num frames: 1015808. Throughput: 0: 818.4. Samples: 253882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:11:20,725][08744] Avg episode reward: [(0, '6.685')]
[2023-02-25 17:11:20,730][14400] Saving new best policy, reward=6.685!
[2023-02-25 17:11:23,217][14414] Updated weights for policy 0, policy_version 250 (0.0016)
[2023-02-25 17:11:25,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3151.9). Total num frames: 1028096. Throughput: 0: 790.3. Samples: 257424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:11:25,746][08744] Avg episode reward: [(0, '7.039')]
[2023-02-25 17:11:25,760][14400] Saving new best policy, reward=7.039!
[2023-02-25 17:11:30,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3165.8). Total num frames: 1044480. Throughput: 0: 790.6. Samples: 259332. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:11:30,730][08744] Avg episode reward: [(0, '7.019')]
[2023-02-25 17:11:35,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1060864. Throughput: 0: 812.3. Samples: 264600. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:11:35,730][08744] Avg episode reward: [(0, '6.927')]
[2023-02-25 17:11:36,031][14414] Updated weights for policy 0, policy_version 260 (0.0042)
[2023-02-25 17:11:40,729][08744] Fps is (10 sec: 3274.6, 60 sec: 3208.4, 300 sec: 3165.7). Total num frames: 1077248. Throughput: 0: 801.4. Samples: 270146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:40,732][08744] Avg episode reward: [(0, '7.292')]
[2023-02-25 17:11:40,734][14400] Saving new best policy, reward=7.292!
[2023-02-25 17:11:45,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3151.8). Total num frames: 1089536. Throughput: 0: 777.9. Samples: 271908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:45,727][08744] Avg episode reward: [(0, '7.455')]
[2023-02-25 17:11:45,750][14400] Saving new best policy, reward=7.455!
[2023-02-25 17:11:50,679][14414] Updated weights for policy 0, policy_version 270 (0.0028)
[2023-02-25 17:11:50,723][08744] Fps is (10 sec: 2869.1, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1105920. Throughput: 0: 774.8. Samples: 275484. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:11:50,725][08744] Avg episode reward: [(0, '7.914')]
[2023-02-25 17:11:50,732][14400] Saving new best policy, reward=7.914!
[2023-02-25 17:11:55,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1122304. Throughput: 0: 799.6. Samples: 280982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:11:55,725][08744] Avg episode reward: [(0, '8.110')]
[2023-02-25 17:11:55,736][14400] Saving new best policy, reward=8.110!
[2023-02-25 17:12:00,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 1138688. Throughput: 0: 799.9. Samples: 284022. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:12:00,729][08744] Avg episode reward: [(0, '8.484')]
[2023-02-25 17:12:00,757][14400] Saving new best policy, reward=8.484!
[2023-02-25 17:12:02,413][14414] Updated weights for policy 0, policy_version 280 (0.0026)
[2023-02-25 17:12:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3138.0). Total num frames: 1150976. Throughput: 0: 759.8. Samples: 288072. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:12:05,728][08744] Avg episode reward: [(0, '8.342')]
[2023-02-25 17:12:10,723][08744] Fps is (10 sec: 2457.5, 60 sec: 3072.0, 300 sec: 3137.9). Total num frames: 1163264. Throughput: 0: 760.3. Samples: 291636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:12:10,725][08744] Avg episode reward: [(0, '8.155')]
[2023-02-25 17:12:15,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3165.7). Total num frames: 1183744. Throughput: 0: 781.1. Samples: 294480. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 17:12:15,736][08744] Avg episode reward: [(0, '7.725')]
[2023-02-25 17:12:16,115][14414] Updated weights for policy 0, policy_version 290 (0.0027)
[2023-02-25 17:12:20,723][08744] Fps is (10 sec: 4096.2, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1204224. Throughput: 0: 792.1. Samples: 300246. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:12:20,730][08744] Avg episode reward: [(0, '7.452')]
[2023-02-25 17:12:25,725][08744] Fps is (10 sec: 3276.0, 60 sec: 3140.1, 300 sec: 3151.9). Total num frames: 1216512. Throughput: 0: 758.9. Samples: 304292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:12:25,728][08744] Avg episode reward: [(0, '7.537')]
[2023-02-25 17:12:30,646][14414] Updated weights for policy 0, policy_version 300 (0.0028)
[2023-02-25 17:12:30,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3151.8). Total num frames: 1228800. Throughput: 0: 758.7. Samples: 306048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:12:30,732][08744] Avg episode reward: [(0, '8.051')]
[2023-02-25 17:12:35,723][08744] Fps is (10 sec: 3277.7, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1249280. Throughput: 0: 791.9. Samples: 311118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:12:35,726][08744] Avg episode reward: [(0, '8.974')]
[2023-02-25 17:12:35,738][14400] Saving new best policy, reward=8.974!
[2023-02-25 17:12:40,723][08744] Fps is (10 sec: 3686.1, 60 sec: 3140.6, 300 sec: 3151.8). Total num frames: 1265664. Throughput: 0: 806.1. Samples: 317256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:12:40,726][08744] Avg episode reward: [(0, '9.192')]
[2023-02-25 17:12:40,735][14400] Saving new best policy, reward=9.192!
[2023-02-25 17:12:40,749][14414] Updated weights for policy 0, policy_version 310 (0.0022)
[2023-02-25 17:12:45,724][08744] Fps is (10 sec: 2866.9, 60 sec: 3140.2, 300 sec: 3137.9). Total num frames: 1277952. Throughput: 0: 782.2. Samples: 319220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:12:45,730][08744] Avg episode reward: [(0, '9.452')]
[2023-02-25 17:12:45,850][14400] Saving new best policy, reward=9.452!
[2023-02-25 17:12:50,723][08744] Fps is (10 sec: 2457.8, 60 sec: 3072.0, 300 sec: 3138.0). Total num frames: 1290240. Throughput: 0: 774.4. Samples: 322922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:12:50,729][08744] Avg episode reward: [(0, '9.020')]
[2023-02-25 17:12:55,429][14414] Updated weights for policy 0, policy_version 320 (0.0024)
[2023-02-25 17:12:55,723][08744] Fps is (10 sec: 3277.1, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1310720. Throughput: 0: 800.9. Samples: 327674. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:12:55,728][08744] Avg episode reward: [(0, '9.124')]
[2023-02-25 17:13:00,723][08744] Fps is (10 sec: 4096.1, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 1331200. Throughput: 0: 803.5. Samples: 330638. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:13:00,725][08744] Avg episode reward: [(0, '9.416')]
[2023-02-25 17:13:05,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3151.8). Total num frames: 1343488. Throughput: 0: 780.9. Samples: 335388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:13:05,729][08744] Avg episode reward: [(0, '9.770')]
[2023-02-25 17:13:05,741][14400] Saving new best policy, reward=9.770!
[2023-02-25 17:13:08,879][14414] Updated weights for policy 0, policy_version 330 (0.0019)
[2023-02-25 17:13:10,725][08744] Fps is (10 sec: 2457.0, 60 sec: 3208.4, 300 sec: 3151.8). Total num frames: 1355776. Throughput: 0: 770.9. Samples: 338980. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:13:10,727][08744] Avg episode reward: [(0, '10.805')]
[2023-02-25 17:13:10,736][14400] Saving new best policy, reward=10.805!
[2023-02-25 17:13:15,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 1372160. Throughput: 0: 779.9. Samples: 341142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:13:15,726][08744] Avg episode reward: [(0, '10.669')]
[2023-02-25 17:13:15,735][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000335_1372160.pth...
[2023-02-25 17:13:15,878][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000150_614400.pth
[2023-02-25 17:13:20,723][08744] Fps is (10 sec: 3277.6, 60 sec: 3072.0, 300 sec: 3138.0). Total num frames: 1388544. Throughput: 0: 792.1. Samples: 346762. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:13:20,730][08744] Avg episode reward: [(0, '11.135')]
[2023-02-25 17:13:20,734][14400] Saving new best policy, reward=11.135!
[2023-02-25 17:13:21,057][14414] Updated weights for policy 0, policy_version 340 (0.0031)
[2023-02-25 17:13:25,724][08744] Fps is (10 sec: 3276.5, 60 sec: 3140.4, 300 sec: 3138.0). Total num frames: 1404928. Throughput: 0: 760.7. Samples: 351488. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:13:25,731][08744] Avg episode reward: [(0, '11.089')]
[2023-02-25 17:13:30,723][08744] Fps is (10 sec: 2867.1, 60 sec: 3140.2, 300 sec: 3151.9). Total num frames: 1417216. Throughput: 0: 757.2. Samples: 353292. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:13:30,729][08744] Avg episode reward: [(0, '10.085')]
[2023-02-25 17:13:35,257][14414] Updated weights for policy 0, policy_version 350 (0.0025)
[2023-02-25 17:13:35,723][08744] Fps is (10 sec: 2867.5, 60 sec: 3072.0, 300 sec: 3165.7). Total num frames: 1433600. Throughput: 0: 767.1. Samples: 357440. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:13:35,725][08744] Avg episode reward: [(0, '10.197')]
[2023-02-25 17:13:40,723][08744] Fps is (10 sec: 3686.6, 60 sec: 3140.3, 300 sec: 3193.5). Total num frames: 1454080. Throughput: 0: 789.4. Samples: 363196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:13:40,725][08744] Avg episode reward: [(0, '9.878')]
[2023-02-25 17:13:45,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3165.8). Total num frames: 1466368. Throughput: 0: 787.6. Samples: 366080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:13:45,726][08744] Avg episode reward: [(0, '10.156')]
[2023-02-25 17:13:47,744][14414] Updated weights for policy 0, policy_version 360 (0.0016)
[2023-02-25 17:13:50,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 1478656. Throughput: 0: 763.6. Samples: 369752. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:13:50,725][08744] Avg episode reward: [(0, '10.745')]
[2023-02-25 17:13:55,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3165.8). Total num frames: 1495040. Throughput: 0: 776.5. Samples: 373920. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:13:55,725][08744] Avg episode reward: [(0, '11.914')]
[2023-02-25 17:13:55,741][14400] Saving new best policy, reward=11.914!
[2023-02-25 17:14:00,348][14414] Updated weights for policy 0, policy_version 370 (0.0020)
[2023-02-25 17:14:00,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3072.0, 300 sec: 3179.6). Total num frames: 1515520. Throughput: 0: 796.7. Samples: 376992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:14:00,730][08744] Avg episode reward: [(0, '11.641')]
[2023-02-25 17:14:05,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3179.6). Total num frames: 1531904. Throughput: 0: 794.6. Samples: 382520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:14:05,725][08744] Avg episode reward: [(0, '11.505')]
[2023-02-25 17:14:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.4, 300 sec: 3151.8). Total num frames: 1544192. Throughput: 0: 771.5. Samples: 386204. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:14:10,727][08744] Avg episode reward: [(0, '10.725')]
[2023-02-25 17:14:15,082][14414] Updated weights for policy 0, policy_version 380 (0.0018)
[2023-02-25 17:14:15,724][08744] Fps is (10 sec: 2457.4, 60 sec: 3072.0, 300 sec: 3151.8). Total num frames: 1556480. Throughput: 0: 770.5. Samples: 387964. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:14:15,730][08744] Avg episode reward: [(0, '10.763')]
[2023-02-25 17:14:20,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3179.6). Total num frames: 1576960. Throughput: 0: 801.9. Samples: 393526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:14:20,730][08744] Avg episode reward: [(0, '11.336')]
[2023-02-25 17:14:25,723][08744] Fps is (10 sec: 3686.7, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 1593344. Throughput: 0: 797.4. Samples: 399080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:14:25,733][08744] Avg episode reward: [(0, '11.515')]
[2023-02-25 17:14:25,941][14414] Updated weights for policy 0, policy_version 390 (0.0021)
[2023-02-25 17:14:30,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 1605632. Throughput: 0: 776.9. Samples: 401040. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:14:30,728][08744] Avg episode reward: [(0, '12.102')]
[2023-02-25 17:14:30,734][14400] Saving new best policy, reward=12.102!
[2023-02-25 17:14:35,723][08744] Fps is (10 sec: 2867.0, 60 sec: 3140.2, 300 sec: 3165.7). Total num frames: 1622016. Throughput: 0: 773.5. Samples: 404558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:14:35,729][08744] Avg episode reward: [(0, '11.995')]
[2023-02-25 17:14:39,854][14414] Updated weights for policy 0, policy_version 400 (0.0017)
[2023-02-25 17:14:40,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3165.7). Total num frames: 1638400. Throughput: 0: 808.0. Samples: 410282. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:14:40,729][08744] Avg episode reward: [(0, '11.900')]
[2023-02-25 17:14:45,727][08744] Fps is (10 sec: 2866.0, 60 sec: 3071.8, 300 sec: 3137.9). Total num frames: 1650688. Throughput: 0: 789.8. Samples: 412538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:14:45,730][08744] Avg episode reward: [(0, '12.644')]
[2023-02-25 17:14:45,751][14400] Saving new best policy, reward=12.644!
[2023-02-25 17:14:50,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3124.1). Total num frames: 1662976. Throughput: 0: 737.1. Samples: 415688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:14:50,729][08744] Avg episode reward: [(0, '13.302')]
[2023-02-25 17:14:50,732][14400] Saving new best policy, reward=13.302!
[2023-02-25 17:14:55,723][08744] Fps is (10 sec: 2048.9, 60 sec: 2935.5, 300 sec: 3096.3). Total num frames: 1671168. Throughput: 0: 718.6. Samples: 418542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:14:55,728][08744] Avg episode reward: [(0, '13.665')]
[2023-02-25 17:14:55,741][14400] Saving new best policy, reward=13.665!
[2023-02-25 17:14:57,799][14414] Updated weights for policy 0, policy_version 410 (0.0023)
[2023-02-25 17:15:00,723][08744] Fps is (10 sec: 2457.6, 60 sec: 2867.2, 300 sec: 3110.2). Total num frames: 1687552. Throughput: 0: 722.1. Samples: 420458. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:00,729][08744] Avg episode reward: [(0, '13.653')]
[2023-02-25 17:15:05,723][08744] Fps is (10 sec: 3276.8, 60 sec: 2867.2, 300 sec: 3110.2). Total num frames: 1703936. Throughput: 0: 717.1. Samples: 425794. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:15:05,730][08744] Avg episode reward: [(0, '13.800')]
[2023-02-25 17:15:05,743][14400] Saving new best policy, reward=13.800!
[2023-02-25 17:15:08,644][14414] Updated weights for policy 0, policy_version 420 (0.0013)
[2023-02-25 17:15:10,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3003.7, 300 sec: 3110.2). Total num frames: 1724416. Throughput: 0: 721.3. Samples: 431540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:10,726][08744] Avg episode reward: [(0, '13.722')]
[2023-02-25 17:15:15,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3003.8, 300 sec: 3110.2). Total num frames: 1736704. Throughput: 0: 724.8. Samples: 433656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:15,729][08744] Avg episode reward: [(0, '13.638')]
[2023-02-25 17:15:15,740][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth...
[2023-02-25 17:15:15,906][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000244_999424.pth
[2023-02-25 17:15:20,723][08744] Fps is (10 sec: 2867.2, 60 sec: 2935.5, 300 sec: 3124.1). Total num frames: 1753088. Throughput: 0: 735.3. Samples: 437644. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:20,735][08744] Avg episode reward: [(0, '14.002')]
[2023-02-25 17:15:20,742][14400] Saving new best policy, reward=14.002!
[2023-02-25 17:15:22,535][14414] Updated weights for policy 0, policy_version 430 (0.0026)
[2023-02-25 17:15:25,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3003.7, 300 sec: 3124.1). Total num frames: 1773568. Throughput: 0: 729.5. Samples: 443108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:15:25,731][08744] Avg episode reward: [(0, '14.465')]
[2023-02-25 17:15:25,744][14400] Saving new best policy, reward=14.465!
[2023-02-25 17:15:30,724][08744] Fps is (10 sec: 3686.0, 60 sec: 3071.9, 300 sec: 3110.2). Total num frames: 1789952. Throughput: 0: 742.5. Samples: 445950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:15:30,727][08744] Avg episode reward: [(0, '15.492')]
[2023-02-25 17:15:30,730][14400] Saving new best policy, reward=15.492!
[2023-02-25 17:15:35,730][08744] Fps is (10 sec: 2455.7, 60 sec: 2935.1, 300 sec: 3096.3). Total num frames: 1798144. Throughput: 0: 760.7. Samples: 449924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:15:35,737][08744] Avg episode reward: [(0, '15.255')]
[2023-02-25 17:15:35,818][14414] Updated weights for policy 0, policy_version 440 (0.0018)
[2023-02-25 17:15:40,723][08744] Fps is (10 sec: 2457.9, 60 sec: 2935.5, 300 sec: 3110.2). Total num frames: 1814528. Throughput: 0: 779.0. Samples: 453596. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:15:40,731][08744] Avg episode reward: [(0, '15.440')]
[2023-02-25 17:15:45,723][08744] Fps is (10 sec: 3279.2, 60 sec: 3003.9, 300 sec: 3096.3). Total num frames: 1830912. Throughput: 0: 799.5. Samples: 456436. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:45,725][08744] Avg episode reward: [(0, '14.834')]
[2023-02-25 17:15:47,996][14414] Updated weights for policy 0, policy_version 450 (0.0024)
[2023-02-25 17:15:50,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 1851392. Throughput: 0: 809.0. Samples: 462198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:15:50,728][08744] Avg episode reward: [(0, '14.948')]
[2023-02-25 17:15:55,723][08744] Fps is (10 sec: 3276.9, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 1863680. Throughput: 0: 769.0. Samples: 466146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:15:55,726][08744] Avg episode reward: [(0, '14.785')]
[2023-02-25 17:16:00,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 1875968. Throughput: 0: 760.6. Samples: 467882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:16:00,724][08744] Avg episode reward: [(0, '15.583')]
[2023-02-25 17:16:00,732][14400] Saving new best policy, reward=15.583!
[2023-02-25 17:16:02,654][14414] Updated weights for policy 0, policy_version 460 (0.0022)
[2023-02-25 17:16:05,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 1896448. Throughput: 0: 786.2. Samples: 473022. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:16:05,726][08744] Avg episode reward: [(0, '14.764')]
[2023-02-25 17:16:10,723][08744] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 1916928. Throughput: 0: 809.0. Samples: 479512. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:16:10,725][08744] Avg episode reward: [(0, '16.320')]
[2023-02-25 17:16:10,729][14400] Saving new best policy, reward=16.320!
[2023-02-25 17:16:13,238][14414] Updated weights for policy 0, policy_version 470 (0.0013)
[2023-02-25 17:16:15,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 1929216. Throughput: 0: 789.8. Samples: 481490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:16:15,725][08744] Avg episode reward: [(0, '16.161')]
[2023-02-25 17:16:20,723][08744] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 1941504. Throughput: 0: 783.9. Samples: 485194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:16:20,731][08744] Avg episode reward: [(0, '15.425')]
[2023-02-25 17:16:25,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 1957888. Throughput: 0: 807.4. Samples: 489930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:16:25,731][08744] Avg episode reward: [(0, '15.040')]
[2023-02-25 17:16:27,077][14414] Updated weights for policy 0, policy_version 480 (0.0013)
[2023-02-25 17:16:30,723][08744] Fps is (10 sec: 3686.5, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 1978368. Throughput: 0: 811.4. Samples: 492948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:16:30,731][08744] Avg episode reward: [(0, '15.778')]
[2023-02-25 17:16:35,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.9, 300 sec: 3096.4). Total num frames: 1990656. Throughput: 0: 792.6. Samples: 497866. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:16:35,726][08744] Avg episode reward: [(0, '15.815')]
[2023-02-25 17:16:40,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2002944. Throughput: 0: 786.0. Samples: 501518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:16:40,731][08744] Avg episode reward: [(0, '16.218')]
[2023-02-25 17:16:41,138][14414] Updated weights for policy 0, policy_version 490 (0.0023)
[2023-02-25 17:16:45,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 2023424. Throughput: 0: 795.2. Samples: 503668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:16:45,729][08744] Avg episode reward: [(0, '16.968')]
[2023-02-25 17:16:45,740][14400] Saving new best policy, reward=16.968!
[2023-02-25 17:16:50,723][08744] Fps is (10 sec: 3686.5, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 2039808. Throughput: 0: 810.2. Samples: 509482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:16:50,731][08744] Avg episode reward: [(0, '16.833')]
[2023-02-25 17:16:52,073][14414] Updated weights for policy 0, policy_version 500 (0.0014)
[2023-02-25 17:16:55,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 2056192. Throughput: 0: 770.4. Samples: 514178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:16:55,725][08744] Avg episode reward: [(0, '16.872')]
[2023-02-25 17:17:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 2068480. Throughput: 0: 765.7. Samples: 515948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:17:00,729][08744] Avg episode reward: [(0, '16.678')]
[2023-02-25 17:17:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 2084864. Throughput: 0: 774.6. Samples: 520050. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:17:05,726][08744] Avg episode reward: [(0, '16.460')]
[2023-02-25 17:17:06,715][14414] Updated weights for policy 0, policy_version 510 (0.0029)
[2023-02-25 17:17:10,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3110.2). Total num frames: 2101248. Throughput: 0: 797.4. Samples: 525812. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:17:10,725][08744] Avg episode reward: [(0, '15.644')]
[2023-02-25 17:17:15,725][08744] Fps is (10 sec: 3276.0, 60 sec: 3140.1, 300 sec: 3096.3). Total num frames: 2117632. Throughput: 0: 793.6. Samples: 528664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:17:15,731][08744] Avg episode reward: [(0, '17.008')]
[2023-02-25 17:17:15,746][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000517_2117632.pth...
[2023-02-25 17:17:15,912][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000335_1372160.pth
[2023-02-25 17:17:15,930][14400] Saving new best policy, reward=17.008!
[2023-02-25 17:17:20,240][14414] Updated weights for policy 0, policy_version 520 (0.0017)
[2023-02-25 17:17:20,724][08744] Fps is (10 sec: 2866.8, 60 sec: 3140.2, 300 sec: 3096.3). Total num frames: 2129920. Throughput: 0: 762.3. Samples: 532172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:17:20,730][08744] Avg episode reward: [(0, '17.734')]
[2023-02-25 17:17:20,736][14400] Saving new best policy, reward=17.734!
[2023-02-25 17:17:25,723][08744] Fps is (10 sec: 2867.9, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 2146304. Throughput: 0: 770.6. Samples: 536196. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:17:25,726][08744] Avg episode reward: [(0, '17.502')]
[2023-02-25 17:17:30,723][08744] Fps is (10 sec: 3277.3, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2162688. Throughput: 0: 786.8. Samples: 539072. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:17:30,730][08744] Avg episode reward: [(0, '19.060')]
[2023-02-25 17:17:30,738][14400] Saving new best policy, reward=19.060!
[2023-02-25 17:17:32,151][14414] Updated weights for policy 0, policy_version 530 (0.0025)
[2023-02-25 17:17:35,723][08744] Fps is (10 sec: 3276.7, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2179072. Throughput: 0: 780.3. Samples: 544598. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:17:35,725][08744] Avg episode reward: [(0, '19.363')]
[2023-02-25 17:17:35,733][14400] Saving new best policy, reward=19.363!
[2023-02-25 17:17:40,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2191360. Throughput: 0: 756.4. Samples: 548214. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-25 17:17:40,732][08744] Avg episode reward: [(0, '19.549')]
[2023-02-25 17:17:40,736][14400] Saving new best policy, reward=19.549!
[2023-02-25 17:17:45,723][08744] Fps is (10 sec: 2457.7, 60 sec: 3003.7, 300 sec: 3096.3). Total num frames: 2203648. Throughput: 0: 755.5. Samples: 549946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:17:45,732][08744] Avg episode reward: [(0, '20.077')]
[2023-02-25 17:17:45,744][14400] Saving new best policy, reward=20.077!
[2023-02-25 17:17:47,021][14414] Updated weights for policy 0, policy_version 540 (0.0034)
[2023-02-25 17:17:50,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2224128. Throughput: 0: 788.4. Samples: 555528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:17:50,729][08744] Avg episode reward: [(0, '19.937')]
[2023-02-25 17:17:55,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3072.0, 300 sec: 3082.4). Total num frames: 2240512. Throughput: 0: 779.2. Samples: 560878. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:17:55,728][08744] Avg episode reward: [(0, '19.433')]
[2023-02-25 17:17:59,244][14414] Updated weights for policy 0, policy_version 550 (0.0040)
[2023-02-25 17:18:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3082.4). Total num frames: 2252800. Throughput: 0: 758.2. Samples: 562780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:00,726][08744] Avg episode reward: [(0, '18.060')]
[2023-02-25 17:18:05,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2269184. Throughput: 0: 761.4. Samples: 566432. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:18:05,725][08744] Avg episode reward: [(0, '18.077')]
[2023-02-25 17:18:10,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2285568. Throughput: 0: 795.8. Samples: 572006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:10,726][08744] Avg episode reward: [(0, '17.103')]
[2023-02-25 17:18:12,052][14414] Updated weights for policy 0, policy_version 560 (0.0032)
[2023-02-25 17:18:15,724][08744] Fps is (10 sec: 3686.0, 60 sec: 3140.4, 300 sec: 3110.2). Total num frames: 2306048. Throughput: 0: 794.5. Samples: 574826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:15,726][08744] Avg episode reward: [(0, '18.251')]
[2023-02-25 17:18:20,726][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2318336. Throughput: 0: 771.0. Samples: 579292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:18:20,729][08744] Avg episode reward: [(0, '17.980')]
[2023-02-25 17:18:25,723][08744] Fps is (10 sec: 2457.8, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2330624. Throughput: 0: 769.2. Samples: 582826. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:18:25,730][08744] Avg episode reward: [(0, '17.932')]
[2023-02-25 17:18:26,486][14414] Updated weights for policy 0, policy_version 570 (0.0060)
[2023-02-25 17:18:30,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 2347008. Throughput: 0: 790.2. Samples: 585504. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:18:30,725][08744] Avg episode reward: [(0, '19.607')]
[2023-02-25 17:18:35,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2367488. Throughput: 0: 796.0. Samples: 591348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:35,725][08744] Avg episode reward: [(0, '19.250')]
[2023-02-25 17:18:37,848][14414] Updated weights for policy 0, policy_version 580 (0.0032)
[2023-02-25 17:18:40,729][08744] Fps is (10 sec: 3274.6, 60 sec: 3139.9, 300 sec: 3096.2). Total num frames: 2379776. Throughput: 0: 775.9. Samples: 595798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:40,731][08744] Avg episode reward: [(0, '19.932')]
[2023-02-25 17:18:45,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2392064. Throughput: 0: 774.0. Samples: 597612. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:45,728][08744] Avg episode reward: [(0, '20.006')]
[2023-02-25 17:18:50,723][08744] Fps is (10 sec: 3279.0, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 2412544. Throughput: 0: 799.3. Samples: 602400. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:50,725][08744] Avg episode reward: [(0, '20.122')]
[2023-02-25 17:18:50,732][14400] Saving new best policy, reward=20.122!
[2023-02-25 17:18:51,398][14414] Updated weights for policy 0, policy_version 590 (0.0030)
[2023-02-25 17:18:55,723][08744] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 2433024. Throughput: 0: 798.7. Samples: 607948. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:18:55,726][08744] Avg episode reward: [(0, '19.264')]
[2023-02-25 17:19:00,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 2445312. Throughput: 0: 791.4. Samples: 610440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:19:00,725][08744] Avg episode reward: [(0, '18.698')]
[2023-02-25 17:19:05,446][14414] Updated weights for policy 0, policy_version 600 (0.0048)
[2023-02-25 17:19:05,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 2457600. Throughput: 0: 771.1. Samples: 613992. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:19:05,729][08744] Avg episode reward: [(0, '19.630')]
[2023-02-25 17:19:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 2473984. Throughput: 0: 803.1. Samples: 618964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:19:10,732][08744] Avg episode reward: [(0, '19.213')]
[2023-02-25 17:19:15,387][14414] Updated weights for policy 0, policy_version 610 (0.0036)
[2023-02-25 17:19:15,723][08744] Fps is (10 sec: 4096.1, 60 sec: 3208.6, 300 sec: 3124.1). Total num frames: 2498560. Throughput: 0: 817.0. Samples: 622268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:19:15,730][08744] Avg episode reward: [(0, '18.684')]
[2023-02-25 17:19:15,741][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000610_2498560.pth...
[2023-02-25 17:19:15,873][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000424_1736704.pth
[2023-02-25 17:19:20,724][08744] Fps is (10 sec: 4095.3, 60 sec: 3276.7, 300 sec: 3124.0). Total num frames: 2514944. Throughput: 0: 821.3. Samples: 628310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:19:20,731][08744] Avg episode reward: [(0, '20.037')]
[2023-02-25 17:19:25,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3124.1). Total num frames: 2527232. Throughput: 0: 801.2. Samples: 631846. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:19:25,732][08744] Avg episode reward: [(0, '19.818')]
[2023-02-25 17:19:29,679][14414] Updated weights for policy 0, policy_version 620 (0.0015)
[2023-02-25 17:19:30,726][08744] Fps is (10 sec: 2457.1, 60 sec: 3208.3, 300 sec: 3110.2). Total num frames: 2539520. Throughput: 0: 800.4. Samples: 633634. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:19:30,733][08744] Avg episode reward: [(0, '20.364')]
[2023-02-25 17:19:30,739][14400] Saving new best policy, reward=20.364!
[2023-02-25 17:19:35,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 2560000. Throughput: 0: 819.2. Samples: 639262. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:19:35,725][08744] Avg episode reward: [(0, '20.080')]
[2023-02-25 17:19:40,723][08744] Fps is (10 sec: 3278.0, 60 sec: 3208.9, 300 sec: 3124.1). Total num frames: 2572288. Throughput: 0: 805.0. Samples: 644172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:19:40,729][08744] Avg episode reward: [(0, '20.810')]
[2023-02-25 17:19:40,761][14400] Saving new best policy, reward=20.810!
[2023-02-25 17:19:42,472][14414] Updated weights for policy 0, policy_version 630 (0.0017)
[2023-02-25 17:19:45,725][08744] Fps is (10 sec: 2866.5, 60 sec: 3276.7, 300 sec: 3137.9). Total num frames: 2588672. Throughput: 0: 788.5. Samples: 645926. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:19:45,731][08744] Avg episode reward: [(0, '21.370')]
[2023-02-25 17:19:45,751][14400] Saving new best policy, reward=21.370!
[2023-02-25 17:19:50,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 2605056. Throughput: 0: 802.8. Samples: 650118. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:19:50,730][08744] Avg episode reward: [(0, '21.636')]
[2023-02-25 17:19:50,733][14400] Saving new best policy, reward=21.636!
[2023-02-25 17:19:55,037][14414] Updated weights for policy 0, policy_version 640 (0.0018)
[2023-02-25 17:19:55,723][08744] Fps is (10 sec: 3277.6, 60 sec: 3140.3, 300 sec: 3165.7). Total num frames: 2621440. Throughput: 0: 818.4. Samples: 655790. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:19:55,725][08744] Avg episode reward: [(0, '21.917')]
[2023-02-25 17:19:55,738][14400] Saving new best policy, reward=21.917!
[2023-02-25 17:20:00,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 2637824. Throughput: 0: 809.9. Samples: 658714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:20:00,728][08744] Avg episode reward: [(0, '22.207')]
[2023-02-25 17:20:00,730][14400] Saving new best policy, reward=22.207!
[2023-02-25 17:20:05,727][08744] Fps is (10 sec: 2865.9, 60 sec: 3208.3, 300 sec: 3137.9). Total num frames: 2650112. Throughput: 0: 760.3. Samples: 662528. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:20:05,731][08744] Avg episode reward: [(0, '21.734')]
[2023-02-25 17:20:09,505][14414] Updated weights for policy 0, policy_version 650 (0.0021)
[2023-02-25 17:20:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3151.8). Total num frames: 2666496. Throughput: 0: 771.8. Samples: 666576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:20:10,728][08744] Avg episode reward: [(0, '22.612')]
[2023-02-25 17:20:10,731][14400] Saving new best policy, reward=22.612!
[2023-02-25 17:20:15,723][08744] Fps is (10 sec: 3278.3, 60 sec: 3072.0, 300 sec: 3151.8). Total num frames: 2682880. Throughput: 0: 795.2. Samples: 669416. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:20:15,725][08744] Avg episode reward: [(0, '23.123')]
[2023-02-25 17:20:15,740][14400] Saving new best policy, reward=23.123!
[2023-02-25 17:20:20,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.1, 300 sec: 3138.0). Total num frames: 2699264. Throughput: 0: 796.6. Samples: 675108. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 17:20:20,731][08744] Avg episode reward: [(0, '22.541')]
[2023-02-25 17:20:20,982][14414] Updated weights for policy 0, policy_version 660 (0.0016)
[2023-02-25 17:20:25,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3124.1). Total num frames: 2711552. Throughput: 0: 768.2. Samples: 678740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:20:25,726][08744] Avg episode reward: [(0, '22.754')]
[2023-02-25 17:20:30,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 3151.9). Total num frames: 2727936. Throughput: 0: 768.4. Samples: 680500. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:20:30,728][08744] Avg episode reward: [(0, '23.468')]
[2023-02-25 17:20:30,733][14400] Saving new best policy, reward=23.468!
[2023-02-25 17:20:34,883][14414] Updated weights for policy 0, policy_version 670 (0.0015)
[2023-02-25 17:20:35,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3072.0, 300 sec: 3151.8). Total num frames: 2744320. Throughput: 0: 788.2. Samples: 685588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:20:35,725][08744] Avg episode reward: [(0, '24.594')]
[2023-02-25 17:20:35,741][14400] Saving new best policy, reward=24.594!
[2023-02-25 17:20:40,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 2760704. Throughput: 0: 787.3. Samples: 691218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:20:40,731][08744] Avg episode reward: [(0, '25.368')]
[2023-02-25 17:20:40,739][14400] Saving new best policy, reward=25.368!
[2023-02-25 17:20:45,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3072.1, 300 sec: 3124.1). Total num frames: 2772992. Throughput: 0: 761.0. Samples: 692960. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:20:45,729][08744] Avg episode reward: [(0, '25.285')]
[2023-02-25 17:20:49,552][14414] Updated weights for policy 0, policy_version 680 (0.0038)
[2023-02-25 17:20:50,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3003.7, 300 sec: 3124.1). Total num frames: 2785280. Throughput: 0: 757.9. Samples: 696630. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-25 17:20:50,731][08744] Avg episode reward: [(0, '24.226')]
[2023-02-25 17:20:55,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3003.7, 300 sec: 3138.0). Total num frames: 2801664. Throughput: 0: 772.2. Samples: 701324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:20:55,725][08744] Avg episode reward: [(0, '22.987')]
[2023-02-25 17:21:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 2935.5, 300 sec: 3110.2). Total num frames: 2813952. Throughput: 0: 750.3. Samples: 703178. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:21:00,730][08744] Avg episode reward: [(0, '23.022')]
[2023-02-25 17:21:05,715][14414] Updated weights for policy 0, policy_version 690 (0.0025)
[2023-02-25 17:21:05,725][08744] Fps is (10 sec: 2457.3, 60 sec: 2935.6, 300 sec: 3082.4). Total num frames: 2826240. Throughput: 0: 691.8. Samples: 706242. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:21:05,730][08744] Avg episode reward: [(0, '21.652')]
[2023-02-25 17:21:10,723][08744] Fps is (10 sec: 2457.6, 60 sec: 2867.2, 300 sec: 3082.4). Total num frames: 2838528. Throughput: 0: 689.7. Samples: 709776. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:21:10,730][08744] Avg episode reward: [(0, '21.176')]
[2023-02-25 17:21:15,723][08744] Fps is (10 sec: 2867.6, 60 sec: 2867.2, 300 sec: 3096.3). Total num frames: 2854912. Throughput: 0: 698.5. Samples: 711932. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:21:15,725][08744] Avg episode reward: [(0, '19.658')]
[2023-02-25 17:21:15,736][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000697_2854912.pth...
[2023-02-25 17:21:15,848][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000517_2117632.pth
[2023-02-25 17:21:17,765][14414] Updated weights for policy 0, policy_version 700 (0.0013)
[2023-02-25 17:21:20,723][08744] Fps is (10 sec: 3686.4, 60 sec: 2935.5, 300 sec: 3110.2). Total num frames: 2875392. Throughput: 0: 731.2. Samples: 718492. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:21:20,725][08744] Avg episode reward: [(0, '18.954')]
[2023-02-25 17:21:25,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3003.7, 300 sec: 3096.3). Total num frames: 2891776. Throughput: 0: 718.2. Samples: 723536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:21:25,725][08744] Avg episode reward: [(0, '20.180')]
[2023-02-25 17:21:30,723][08744] Fps is (10 sec: 2867.2, 60 sec: 2935.5, 300 sec: 3096.3). Total num frames: 2904064. Throughput: 0: 720.1. Samples: 725364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:21:30,726][08744] Avg episode reward: [(0, '19.764')]
[2023-02-25 17:21:31,427][14414] Updated weights for policy 0, policy_version 710 (0.0042)
[2023-02-25 17:21:35,723][08744] Fps is (10 sec: 2867.2, 60 sec: 2935.5, 300 sec: 3110.2). Total num frames: 2920448. Throughput: 0: 722.2. Samples: 729130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:21:35,729][08744] Avg episode reward: [(0, '20.469')]
[2023-02-25 17:21:40,723][08744] Fps is (10 sec: 3276.8, 60 sec: 2935.5, 300 sec: 3096.3). Total num frames: 2936832. Throughput: 0: 745.6. Samples: 734876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:21:40,729][08744] Avg episode reward: [(0, '20.133')]
[2023-02-25 17:21:43,009][14414] Updated weights for policy 0, policy_version 720 (0.0016)
[2023-02-25 17:21:45,723][08744] Fps is (10 sec: 3276.6, 60 sec: 3003.7, 300 sec: 3096.3). Total num frames: 2953216. Throughput: 0: 768.3. Samples: 737750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:21:45,726][08744] Avg episode reward: [(0, '20.151')]
[2023-02-25 17:21:50,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3003.7, 300 sec: 3082.4). Total num frames: 2965504. Throughput: 0: 788.7. Samples: 741734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:21:50,731][08744] Avg episode reward: [(0, '20.098')]
[2023-02-25 17:21:55,723][08744] Fps is (10 sec: 2867.3, 60 sec: 3003.7, 300 sec: 3096.3). Total num frames: 2981888. Throughput: 0: 797.4. Samples: 745660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:21:55,725][08744] Avg episode reward: [(0, '20.816')]
[2023-02-25 17:21:57,326][14414] Updated weights for policy 0, policy_version 730 (0.0022)
[2023-02-25 17:22:00,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 3002368. Throughput: 0: 813.6. Samples: 748544. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:22:00,730][08744] Avg episode reward: [(0, '19.674')]
[2023-02-25 17:22:05,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.6, 300 sec: 3110.2). Total num frames: 3018752. Throughput: 0: 797.8. Samples: 754392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:22:05,726][08744] Avg episode reward: [(0, '19.218')]
[2023-02-25 17:22:10,111][14414] Updated weights for policy 0, policy_version 740 (0.0014)
[2023-02-25 17:22:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 3031040. Throughput: 0: 767.8. Samples: 758088. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:22:10,727][08744] Avg episode reward: [(0, '19.907')]
[2023-02-25 17:22:15,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 3043328. Throughput: 0: 766.3. Samples: 759846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:22:15,726][08744] Avg episode reward: [(0, '21.542')]
[2023-02-25 17:22:20,724][08744] Fps is (10 sec: 3276.2, 60 sec: 3140.2, 300 sec: 3110.2). Total num frames: 3063808. Throughput: 0: 798.5. Samples: 765066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:22:20,727][08744] Avg episode reward: [(0, '22.363')]
[2023-02-25 17:22:22,415][14414] Updated weights for policy 0, policy_version 750 (0.0035)
[2023-02-25 17:22:25,723][08744] Fps is (10 sec: 3686.3, 60 sec: 3140.2, 300 sec: 3110.2). Total num frames: 3080192. Throughput: 0: 795.6. Samples: 770678. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:22:25,728][08744] Avg episode reward: [(0, '23.141')]
[2023-02-25 17:22:30,723][08744] Fps is (10 sec: 2867.7, 60 sec: 3140.3, 300 sec: 3096.3). Total num frames: 3092480. Throughput: 0: 772.2. Samples: 772500. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:22:30,733][08744] Avg episode reward: [(0, '23.033')]
[2023-02-25 17:22:35,723][08744] Fps is (10 sec: 2457.7, 60 sec: 3072.0, 300 sec: 3096.3). Total num frames: 3104768. Throughput: 0: 763.8. Samples: 776104. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:22:35,725][08744] Avg episode reward: [(0, '23.687')]
[2023-02-25 17:22:37,069][14414] Updated weights for policy 0, policy_version 760 (0.0023)
[2023-02-25 17:22:40,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3125248. Throughput: 0: 803.8. Samples: 781830. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:22:40,725][08744] Avg episode reward: [(0, '23.476')]
[2023-02-25 17:22:45,723][08744] Fps is (10 sec: 4095.9, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 3145728. Throughput: 0: 802.7. Samples: 784664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:22:45,729][08744] Avg episode reward: [(0, '22.540')]
[2023-02-25 17:22:48,394][14414] Updated weights for policy 0, policy_version 770 (0.0036)
[2023-02-25 17:22:50,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3158016. Throughput: 0: 774.5. Samples: 789246. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:22:50,725][08744] Avg episode reward: [(0, '21.947')]
[2023-02-25 17:22:55,723][08744] Fps is (10 sec: 2457.7, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 3170304. Throughput: 0: 774.5. Samples: 792940. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:22:55,730][08744] Avg episode reward: [(0, '22.382')]
[2023-02-25 17:23:00,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3190784. Throughput: 0: 795.3. Samples: 795634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:23:00,726][08744] Avg episode reward: [(0, '20.137')]
[2023-02-25 17:23:01,283][14414] Updated weights for policy 0, policy_version 780 (0.0032)
[2023-02-25 17:23:05,723][08744] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3138.0). Total num frames: 3211264. Throughput: 0: 811.4. Samples: 801578. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:23:05,725][08744] Avg episode reward: [(0, '19.357')]
[2023-02-25 17:23:10,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3223552. Throughput: 0: 785.3. Samples: 806016. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:23:10,727][08744] Avg episode reward: [(0, '19.773')]
[2023-02-25 17:23:15,721][14414] Updated weights for policy 0, policy_version 790 (0.0037)
[2023-02-25 17:23:15,724][08744] Fps is (10 sec: 2457.4, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3235840. Throughput: 0: 782.3. Samples: 807706. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:23:15,727][08744] Avg episode reward: [(0, '20.982')]
[2023-02-25 17:23:15,737][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000790_3235840.pth...
[2023-02-25 17:23:15,910][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000610_2498560.pth
[2023-02-25 17:23:20,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.4, 300 sec: 3124.1). Total num frames: 3252224. Throughput: 0: 810.4. Samples: 812574. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:23:20,731][08744] Avg episode reward: [(0, '19.469')]
[2023-02-25 17:23:25,723][08744] Fps is (10 sec: 3686.7, 60 sec: 3208.5, 300 sec: 3138.0). Total num frames: 3272704. Throughput: 0: 813.6. Samples: 818444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:23:25,731][08744] Avg episode reward: [(0, '20.064')]
[2023-02-25 17:23:26,493][14414] Updated weights for policy 0, policy_version 800 (0.0033)
[2023-02-25 17:23:30,723][08744] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3284992. Throughput: 0: 799.1. Samples: 820622. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:23:30,726][08744] Avg episode reward: [(0, '21.453')]
[2023-02-25 17:23:35,724][08744] Fps is (10 sec: 2457.2, 60 sec: 3208.4, 300 sec: 3110.2). Total num frames: 3297280. Throughput: 0: 779.9. Samples: 824344. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:23:35,728][08744] Avg episode reward: [(0, '22.085')]
[2023-02-25 17:23:40,492][14414] Updated weights for policy 0, policy_version 810 (0.0034)
[2023-02-25 17:23:40,723][08744] Fps is (10 sec: 3276.9, 60 sec: 3208.5, 300 sec: 3138.0). Total num frames: 3317760. Throughput: 0: 811.9. Samples: 829474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:23:40,730][08744] Avg episode reward: [(0, '23.566')]
[2023-02-25 17:23:45,723][08744] Fps is (10 sec: 3687.0, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3334144. Throughput: 0: 814.0. Samples: 832262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:23:45,725][08744] Avg episode reward: [(0, '23.509')]
[2023-02-25 17:23:50,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3350528. Throughput: 0: 793.6. Samples: 837290. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:23:50,729][08744] Avg episode reward: [(0, '24.037')]
[2023-02-25 17:23:53,389][14414] Updated weights for policy 0, policy_version 820 (0.0023)
[2023-02-25 17:23:55,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3362816. Throughput: 0: 778.7. Samples: 841058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:23:55,725][08744] Avg episode reward: [(0, '24.403')]
[2023-02-25 17:24:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3379200. Throughput: 0: 790.0. Samples: 843256. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:24:00,731][08744] Avg episode reward: [(0, '23.389')]
[2023-02-25 17:24:05,131][14414] Updated weights for policy 0, policy_version 830 (0.0026)
[2023-02-25 17:24:05,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3138.0). Total num frames: 3399680. Throughput: 0: 813.5. Samples: 849182. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:24:05,731][08744] Avg episode reward: [(0, '23.387')]
[2023-02-25 17:24:10,723][08744] Fps is (10 sec: 3686.2, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3416064. Throughput: 0: 794.1. Samples: 854180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:24:10,726][08744] Avg episode reward: [(0, '23.680')]
[2023-02-25 17:24:15,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3096.3). Total num frames: 3428352. Throughput: 0: 785.2. Samples: 855956. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:24:15,725][08744] Avg episode reward: [(0, '22.665')]
[2023-02-25 17:24:19,194][14414] Updated weights for policy 0, policy_version 840 (0.0034)
[2023-02-25 17:24:20,723][08744] Fps is (10 sec: 2867.4, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3444736. Throughput: 0: 798.3. Samples: 860264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:24:20,726][08744] Avg episode reward: [(0, '22.173')]
[2023-02-25 17:24:25,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3138.0). Total num frames: 3465216. Throughput: 0: 816.4. Samples: 866214. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:24:25,728][08744] Avg episode reward: [(0, '21.007')]
[2023-02-25 17:24:30,560][14414] Updated weights for policy 0, policy_version 850 (0.0013)
[2023-02-25 17:24:30,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3124.1). Total num frames: 3481600. Throughput: 0: 815.7. Samples: 868968. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:24:30,726][08744] Avg episode reward: [(0, '21.696')]
[2023-02-25 17:24:35,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3208.6, 300 sec: 3110.2). Total num frames: 3489792. Throughput: 0: 786.9. Samples: 872702. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-25 17:24:35,729][08744] Avg episode reward: [(0, '22.429')]
[2023-02-25 17:24:40,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3110.2). Total num frames: 3506176. Throughput: 0: 802.6. Samples: 877174. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:24:40,728][08744] Avg episode reward: [(0, '23.166')]
[2023-02-25 17:24:44,046][14414] Updated weights for policy 0, policy_version 860 (0.0025)
[2023-02-25 17:24:45,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 3526656. Throughput: 0: 816.8. Samples: 880014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:24:45,730][08744] Avg episode reward: [(0, '23.388')]
[2023-02-25 17:24:50,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 3543040. Throughput: 0: 806.9. Samples: 885494. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:24:50,727][08744] Avg episode reward: [(0, '24.066')]
[2023-02-25 17:24:55,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3110.2). Total num frames: 3555328. Throughput: 0: 779.1. Samples: 889238. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:24:55,730][08744] Avg episode reward: [(0, '24.721')]
[2023-02-25 17:24:58,288][14414] Updated weights for policy 0, policy_version 870 (0.0025)
[2023-02-25 17:25:00,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 3571712. Throughput: 0: 779.7. Samples: 891044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:25:00,726][08744] Avg episode reward: [(0, '24.003')]
[2023-02-25 17:25:05,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3588096. Throughput: 0: 809.7. Samples: 896700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:25:05,725][08744] Avg episode reward: [(0, '24.185')]
[2023-02-25 17:25:09,168][14414] Updated weights for policy 0, policy_version 880 (0.0033)
[2023-02-25 17:25:10,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3604480. Throughput: 0: 799.2. Samples: 902176. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:25:10,727][08744] Avg episode reward: [(0, '23.144')]
[2023-02-25 17:25:15,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3124.1). Total num frames: 3620864. Throughput: 0: 777.8. Samples: 903968. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:25:15,729][08744] Avg episode reward: [(0, '22.641')]
[2023-02-25 17:25:15,743][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000884_3620864.pth...
[2023-02-25 17:25:15,911][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000697_2854912.pth
[2023-02-25 17:25:20,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3633152. Throughput: 0: 776.5. Samples: 907644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:25:20,726][08744] Avg episode reward: [(0, '22.378')]
[2023-02-25 17:25:23,288][14414] Updated weights for policy 0, policy_version 890 (0.0019)
[2023-02-25 17:25:25,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3140.3, 300 sec: 3138.0). Total num frames: 3653632. Throughput: 0: 804.3. Samples: 913368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:25:25,730][08744] Avg episode reward: [(0, '22.687')]
[2023-02-25 17:25:30,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3140.3, 300 sec: 3138.0). Total num frames: 3670016. Throughput: 0: 802.2. Samples: 916114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:25:30,725][08744] Avg episode reward: [(0, '24.259')]
[2023-02-25 17:25:35,725][08744] Fps is (10 sec: 2866.5, 60 sec: 3208.4, 300 sec: 3124.0). Total num frames: 3682304. Throughput: 0: 770.8. Samples: 920180. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:25:35,733][08744] Avg episode reward: [(0, '24.641')]
[2023-02-25 17:25:37,023][14414] Updated weights for policy 0, policy_version 900 (0.0027)
[2023-02-25 17:25:40,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3124.1). Total num frames: 3694592. Throughput: 0: 768.8. Samples: 923834. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:25:40,730][08744] Avg episode reward: [(0, '24.149')]
[2023-02-25 17:25:45,723][08744] Fps is (10 sec: 3277.6, 60 sec: 3140.3, 300 sec: 3151.8). Total num frames: 3715072. Throughput: 0: 796.7. Samples: 926896. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:25:45,729][08744] Avg episode reward: [(0, '23.459')]
[2023-02-25 17:25:48,032][14414] Updated weights for policy 0, policy_version 910 (0.0038)
[2023-02-25 17:25:50,726][08744] Fps is (10 sec: 4094.5, 60 sec: 3208.3, 300 sec: 3165.7). Total num frames: 3735552. Throughput: 0: 816.5. Samples: 933444. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-25 17:25:50,729][08744] Avg episode reward: [(0, '24.422')]
[2023-02-25 17:25:55,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3165.7). Total num frames: 3747840. Throughput: 0: 790.0. Samples: 937726. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-25 17:25:55,742][08744] Avg episode reward: [(0, '24.302')]
[2023-02-25 17:26:00,723][08744] Fps is (10 sec: 2868.2, 60 sec: 3208.5, 300 sec: 3179.6). Total num frames: 3764224. Throughput: 0: 795.4. Samples: 939760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-25 17:26:00,725][08744] Avg episode reward: [(0, '23.426')]
[2023-02-25 17:26:01,254][14414] Updated weights for policy 0, policy_version 920 (0.0026)
[2023-02-25 17:26:05,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3207.4). Total num frames: 3784704. Throughput: 0: 838.7. Samples: 945384. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:05,725][08744] Avg episode reward: [(0, '23.203')]
[2023-02-25 17:26:10,723][08744] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 3805184. Throughput: 0: 856.0. Samples: 951888. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-25 17:26:10,724][08744] Avg episode reward: [(0, '24.365')]
[2023-02-25 17:26:10,878][14414] Updated weights for policy 0, policy_version 930 (0.0019)
[2023-02-25 17:26:15,723][08744] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3193.5). Total num frames: 3817472. Throughput: 0: 842.5. Samples: 954026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-25 17:26:15,728][08744] Avg episode reward: [(0, '24.662')]
[2023-02-25 17:26:20,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3193.5). Total num frames: 3833856. Throughput: 0: 840.9. Samples: 958018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:20,730][08744] Avg episode reward: [(0, '23.894')]
[2023-02-25 17:26:24,118][14414] Updated weights for policy 0, policy_version 940 (0.0014)
[2023-02-25 17:26:25,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 3854336. Throughput: 0: 889.1. Samples: 963844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:25,726][08744] Avg episode reward: [(0, '24.688')]
[2023-02-25 17:26:30,725][08744] Fps is (10 sec: 4095.0, 60 sec: 3413.2, 300 sec: 3235.1). Total num frames: 3874816. Throughput: 0: 895.9. Samples: 967212. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:30,728][08744] Avg episode reward: [(0, '26.237')]
[2023-02-25 17:26:30,732][14400] Saving new best policy, reward=26.237!
[2023-02-25 17:26:35,190][14414] Updated weights for policy 0, policy_version 950 (0.0021)
[2023-02-25 17:26:35,729][08744] Fps is (10 sec: 3683.9, 60 sec: 3481.4, 300 sec: 3235.1). Total num frames: 3891200. Throughput: 0: 863.6. Samples: 972308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:35,736][08744] Avg episode reward: [(0, '25.852')]
[2023-02-25 17:26:40,723][08744] Fps is (10 sec: 2867.9, 60 sec: 3481.6, 300 sec: 3221.3). Total num frames: 3903488. Throughput: 0: 859.0. Samples: 976380. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:40,725][08744] Avg episode reward: [(0, '26.139')]
[2023-02-25 17:26:45,723][08744] Fps is (10 sec: 3279.0, 60 sec: 3481.6, 300 sec: 3249.0). Total num frames: 3923968. Throughput: 0: 874.4. Samples: 979108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:45,728][08744] Avg episode reward: [(0, '24.856')]
[2023-02-25 17:26:46,934][14414] Updated weights for policy 0, policy_version 960 (0.0017)
[2023-02-25 17:26:50,723][08744] Fps is (10 sec: 4505.6, 60 sec: 3550.1, 300 sec: 3276.8). Total num frames: 3948544. Throughput: 0: 895.7. Samples: 985690. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:26:50,728][08744] Avg episode reward: [(0, '23.068')]
[2023-02-25 17:26:55,723][08744] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3249.0). Total num frames: 3960832. Throughput: 0: 864.3. Samples: 990782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-25 17:26:55,728][08744] Avg episode reward: [(0, '21.279')]
[2023-02-25 17:26:59,384][14414] Updated weights for policy 0, policy_version 970 (0.0023)
[2023-02-25 17:27:00,723][08744] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3235.1). Total num frames: 3973120. Throughput: 0: 863.1. Samples: 992866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:27:00,727][08744] Avg episode reward: [(0, '21.030')]
[2023-02-25 17:27:05,724][08744] Fps is (10 sec: 2866.7, 60 sec: 3413.2, 300 sec: 3249.0). Total num frames: 3989504. Throughput: 0: 868.6. Samples: 997108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:27:05,730][08744] Avg episode reward: [(0, '20.007')]
[2023-02-25 17:27:10,723][08744] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 4001792. Throughput: 0: 832.0. Samples: 1001286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-25 17:27:10,728][08744] Avg episode reward: [(0, '19.957')]
[2023-02-25 17:27:10,794][14400] Stopping Batcher_0...
[2023-02-25 17:27:10,795][14400] Loop batcher_evt_loop terminating...
[2023-02-25 17:27:10,795][08744] Component Batcher_0 stopped!
[2023-02-25 17:27:10,800][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 17:27:10,842][14414] Weights refcount: 2 0
[2023-02-25 17:27:10,848][14414] Stopping InferenceWorker_p0-w0...
[2023-02-25 17:27:10,848][14414] Loop inference_proc0-0_evt_loop terminating...
[2023-02-25 17:27:10,855][08744] Component InferenceWorker_p0-w0 stopped!
[2023-02-25 17:27:10,877][14418] Stopping RolloutWorker_w3...
[2023-02-25 17:27:10,878][08744] Component RolloutWorker_w3 stopped!
[2023-02-25 17:27:10,879][14418] Loop rollout_proc3_evt_loop terminating...
[2023-02-25 17:27:10,887][14422] Stopping RolloutWorker_w7...
[2023-02-25 17:27:10,888][08744] Component RolloutWorker_w7 stopped!
[2023-02-25 17:27:10,896][14422] Loop rollout_proc7_evt_loop terminating...
[2023-02-25 17:27:10,899][08744] Component RolloutWorker_w1 stopped!
[2023-02-25 17:27:10,904][14415] Stopping RolloutWorker_w1...
[2023-02-25 17:27:10,905][14415] Loop rollout_proc1_evt_loop terminating...
[2023-02-25 17:27:10,912][08744] Component RolloutWorker_w5 stopped!
[2023-02-25 17:27:10,916][14419] Stopping RolloutWorker_w5...
[2023-02-25 17:27:10,918][14419] Loop rollout_proc5_evt_loop terminating...
[2023-02-25 17:27:11,023][14416] Stopping RolloutWorker_w0...
[2023-02-25 17:27:11,023][08744] Component RolloutWorker_w0 stopped!
[2023-02-25 17:27:11,024][14416] Loop rollout_proc0_evt_loop terminating...
[2023-02-25 17:27:11,033][14420] Stopping RolloutWorker_w4...
[2023-02-25 17:27:11,033][08744] Component RolloutWorker_w4 stopped!
[2023-02-25 17:27:11,039][14420] Loop rollout_proc4_evt_loop terminating...
[2023-02-25 17:27:11,044][14400] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000790_3235840.pth
[2023-02-25 17:27:11,053][14417] Stopping RolloutWorker_w2...
[2023-02-25 17:27:11,053][08744] Component RolloutWorker_w2 stopped!
[2023-02-25 17:27:11,054][14417] Loop rollout_proc2_evt_loop terminating...
[2023-02-25 17:27:11,059][14400] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 17:27:11,067][14421] Stopping RolloutWorker_w6...
[2023-02-25 17:27:11,068][14421] Loop rollout_proc6_evt_loop terminating...
[2023-02-25 17:27:11,067][08744] Component RolloutWorker_w6 stopped!
[2023-02-25 17:27:11,395][14400] Stopping LearnerWorker_p0...
[2023-02-25 17:27:11,395][08744] Component LearnerWorker_p0 stopped!
[2023-02-25 17:27:11,396][14400] Loop learner_proc0_evt_loop terminating...
[2023-02-25 17:27:11,397][08744] Waiting for process learner_proc0 to stop...
[2023-02-25 17:27:14,084][08744] Waiting for process inference_proc0-0 to join...
[2023-02-25 17:27:14,943][08744] Waiting for process rollout_proc0 to join...
[2023-02-25 17:27:15,577][08744] Waiting for process rollout_proc1 to join...
[2023-02-25 17:27:15,580][08744] Waiting for process rollout_proc2 to join...
[2023-02-25 17:27:15,581][08744] Waiting for process rollout_proc3 to join...
[2023-02-25 17:27:15,582][08744] Waiting for process rollout_proc4 to join...
[2023-02-25 17:27:15,583][08744] Waiting for process rollout_proc5 to join...
[2023-02-25 17:27:15,584][08744] Waiting for process rollout_proc6 to join...
[2023-02-25 17:27:15,585][08744] Waiting for process rollout_proc7 to join...
[2023-02-25 17:27:15,586][08744] Batcher 0 profile tree view:
batching: 27.0065, releasing_batches: 0.0307
[2023-02-25 17:27:15,589][08744] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0113
wait_policy_total: 625.5268
update_model: 9.3480
weight_update: 0.0030
one_step: 0.0025
handle_policy_step: 603.7016
deserialize: 17.6749, stack: 3.5419, obs_to_device_normalize: 130.3602, forward: 298.9829, send_messages: 30.3434
prepare_outputs: 93.4677
to_cpu: 56.6202
[2023-02-25 17:27:15,590][08744] Learner 0 profile tree view:
misc: 0.0068, prepare_batch: 16.5096
train: 77.2435
epoch_init: 0.0100, minibatch_init: 0.0202, losses_postprocess: 0.5078, kl_divergence: 0.6118, after_optimizer: 32.9877
calculate_losses: 27.5819
losses_init: 0.0060, forward_head: 2.1287, bptt_initial: 17.6677, tail: 1.2567, advantages_returns: 0.3092, losses: 3.4066
bptt: 2.4943
bptt_forward_core: 2.4010
update: 14.7871
clip: 1.4763
[2023-02-25 17:27:15,591][08744] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.4449, enqueue_policy_requests: 186.1918, env_step: 947.4893, overhead: 29.3215, complete_rollouts: 8.0498
save_policy_outputs: 24.4024
split_output_tensors: 12.1745
[2023-02-25 17:27:15,593][08744] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3633, enqueue_policy_requests: 188.2064, env_step: 945.1326, overhead: 27.9021, complete_rollouts: 8.6555
save_policy_outputs: 24.3528
split_output_tensors: 11.4950
[2023-02-25 17:27:15,594][08744] Loop Runner_EvtLoop terminating...
[2023-02-25 17:27:15,596][08744] Runner profile tree view:
main_loop: 1314.5485
[2023-02-25 17:27:15,601][08744] Collected {0: 4005888}, FPS: 3047.3
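(Editor's sketch, not part of the log.) The final summary line above is consistent with the runner profile: assuming the reported FPS is simply the total collected frame count divided by the runner's main_loop wall time, the numbers reproduce exactly:

```python
# Sketch: reproduce the "Collected ..., FPS: ..." summary from values in this log.
# Assumption: summary FPS = total env frames collected / runner main_loop seconds.
total_frames = 4_005_888       # from "Collected {0: 4005888}"
main_loop_seconds = 1314.5485  # from "main_loop: 1314.5485"

fps = total_frames / main_loop_seconds
print(f"Collected {{0: {total_frames}}}, FPS: {fps:.1f}")  # -> FPS: 3047.3
```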
[2023-02-25 17:27:37,701][08744] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 17:27:37,703][08744] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 17:27:37,706][08744] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 17:27:37,709][08744] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 17:27:37,711][08744] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 17:27:37,713][08744] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 17:27:37,716][08744] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 17:27:37,718][08744] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 17:27:37,719][08744] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-25 17:27:37,720][08744] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-25 17:27:37,721][08744] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 17:27:37,722][08744] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 17:27:37,725][08744] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 17:27:37,726][08744] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 17:27:37,729][08744] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 17:27:37,760][08744] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-25 17:27:37,763][08744] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 17:27:37,766][08744] RunningMeanStd input shape: (1,)
[2023-02-25 17:27:37,783][08744] ConvEncoder: input_channels=3
[2023-02-25 17:27:38,469][08744] Conv encoder output size: 512
[2023-02-25 17:27:38,471][08744] Policy head output size: 512
[2023-02-25 17:27:41,326][08744] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 17:27:42,994][08744] Num frames 100...
[2023-02-25 17:27:43,111][08744] Num frames 200...
[2023-02-25 17:27:43,222][08744] Num frames 300...
[2023-02-25 17:27:43,335][08744] Num frames 400...
[2023-02-25 17:27:43,443][08744] Num frames 500...
[2023-02-25 17:27:43,551][08744] Num frames 600...
[2023-02-25 17:27:43,661][08744] Num frames 700...
[2023-02-25 17:27:43,777][08744] Num frames 800...
[2023-02-25 17:27:43,890][08744] Num frames 900...
[2023-02-25 17:27:44,009][08744] Num frames 1000...
[2023-02-25 17:27:44,133][08744] Num frames 1100...
[2023-02-25 17:27:44,249][08744] Num frames 1200...
[2023-02-25 17:27:44,365][08744] Num frames 1300...
[2023-02-25 17:27:44,483][08744] Num frames 1400...
[2023-02-25 17:27:44,599][08744] Num frames 1500...
[2023-02-25 17:27:44,728][08744] Avg episode rewards: #0: 33.680, true rewards: #0: 15.680
[2023-02-25 17:27:44,729][08744] Avg episode reward: 33.680, avg true_objective: 15.680
[2023-02-25 17:27:44,779][08744] Num frames 1600...
[2023-02-25 17:27:44,905][08744] Num frames 1700...
[2023-02-25 17:27:45,031][08744] Num frames 1800...
[2023-02-25 17:27:45,145][08744] Num frames 1900...
[2023-02-25 17:27:45,253][08744] Num frames 2000...
[2023-02-25 17:27:45,363][08744] Num frames 2100...
[2023-02-25 17:27:45,473][08744] Num frames 2200...
[2023-02-25 17:27:45,588][08744] Num frames 2300...
[2023-02-25 17:27:45,697][08744] Num frames 2400...
[2023-02-25 17:27:45,808][08744] Num frames 2500...
[2023-02-25 17:27:45,918][08744] Num frames 2600...
[2023-02-25 17:27:46,040][08744] Num frames 2700...
[2023-02-25 17:27:46,159][08744] Num frames 2800...
[2023-02-25 17:27:46,274][08744] Num frames 2900...
[2023-02-25 17:27:46,391][08744] Num frames 3000...
[2023-02-25 17:27:46,514][08744] Num frames 3100...
[2023-02-25 17:27:46,635][08744] Num frames 3200...
[2023-02-25 17:27:46,795][08744] Avg episode rewards: #0: 38.980, true rewards: #0: 16.480
[2023-02-25 17:27:46,797][08744] Avg episode reward: 38.980, avg true_objective: 16.480
[2023-02-25 17:27:46,808][08744] Num frames 3300...
[2023-02-25 17:27:46,918][08744] Num frames 3400...
[2023-02-25 17:27:47,043][08744] Num frames 3500...
[2023-02-25 17:27:47,158][08744] Num frames 3600...
[2023-02-25 17:27:47,268][08744] Num frames 3700...
[2023-02-25 17:27:47,377][08744] Num frames 3800...
[2023-02-25 17:27:47,486][08744] Num frames 3900...
[2023-02-25 17:27:47,596][08744] Num frames 4000...
[2023-02-25 17:27:47,719][08744] Num frames 4100...
[2023-02-25 17:27:47,829][08744] Num frames 4200...
[2023-02-25 17:27:47,940][08744] Num frames 4300...
[2023-02-25 17:27:48,048][08744] Num frames 4400...
[2023-02-25 17:27:48,164][08744] Num frames 4500...
[2023-02-25 17:27:48,277][08744] Num frames 4600...
[2023-02-25 17:27:48,389][08744] Num frames 4700...
[2023-02-25 17:27:48,506][08744] Num frames 4800...
[2023-02-25 17:27:48,657][08744] Avg episode rewards: #0: 38.293, true rewards: #0: 16.293
[2023-02-25 17:27:48,658][08744] Avg episode reward: 38.293, avg true_objective: 16.293
[2023-02-25 17:27:48,684][08744] Num frames 4900...
[2023-02-25 17:27:48,793][08744] Num frames 5000...
[2023-02-25 17:27:48,904][08744] Num frames 5100...
[2023-02-25 17:27:49,025][08744] Num frames 5200...
[2023-02-25 17:27:49,142][08744] Num frames 5300...
[2023-02-25 17:27:49,259][08744] Num frames 5400...
[2023-02-25 17:27:49,370][08744] Num frames 5500...
[2023-02-25 17:27:49,477][08744] Num frames 5600...
[2023-02-25 17:27:49,603][08744] Avg episode rewards: #0: 32.665, true rewards: #0: 14.165
[2023-02-25 17:27:49,605][08744] Avg episode reward: 32.665, avg true_objective: 14.165
[2023-02-25 17:27:49,651][08744] Num frames 5700...
[2023-02-25 17:27:49,764][08744] Num frames 5800...
[2023-02-25 17:27:49,876][08744] Num frames 5900...
[2023-02-25 17:27:49,996][08744] Num frames 6000...
[2023-02-25 17:27:50,118][08744] Num frames 6100...
[2023-02-25 17:27:50,228][08744] Num frames 6200...
[2023-02-25 17:27:50,339][08744] Num frames 6300...
[2023-02-25 17:27:50,425][08744] Avg episode rewards: #0: 28.854, true rewards: #0: 12.654
[2023-02-25 17:27:50,427][08744] Avg episode reward: 28.854, avg true_objective: 12.654
[2023-02-25 17:27:50,512][08744] Num frames 6400...
[2023-02-25 17:27:50,629][08744] Num frames 6500...
[2023-02-25 17:27:50,751][08744] Num frames 6600...
[2023-02-25 17:27:50,868][08744] Num frames 6700...
[2023-02-25 17:27:50,986][08744] Num frames 6800...
[2023-02-25 17:27:51,095][08744] Num frames 6900...
[2023-02-25 17:27:51,155][08744] Avg episode rewards: #0: 25.672, true rewards: #0: 11.505
[2023-02-25 17:27:51,157][08744] Avg episode reward: 25.672, avg true_objective: 11.505
[2023-02-25 17:27:51,267][08744] Num frames 7000...
[2023-02-25 17:27:51,376][08744] Num frames 7100...
[2023-02-25 17:27:51,489][08744] Num frames 7200...
[2023-02-25 17:27:51,599][08744] Num frames 7300...
[2023-02-25 17:27:51,705][08744] Num frames 7400...
[2023-02-25 17:27:51,814][08744] Num frames 7500...
[2023-02-25 17:27:51,922][08744] Num frames 7600...
[2023-02-25 17:27:51,997][08744] Avg episode rewards: #0: 24.310, true rewards: #0: 10.881
[2023-02-25 17:27:51,999][08744] Avg episode reward: 24.310, avg true_objective: 10.881
[2023-02-25 17:27:52,104][08744] Num frames 7700...
[2023-02-25 17:27:52,218][08744] Num frames 7800...
[2023-02-25 17:27:52,331][08744] Num frames 7900...
[2023-02-25 17:27:52,441][08744] Num frames 8000...
[2023-02-25 17:27:52,551][08744] Num frames 8100...
[2023-02-25 17:27:52,661][08744] Num frames 8200...
[2023-02-25 17:27:52,770][08744] Num frames 8300...
[2023-02-25 17:27:52,927][08744] Num frames 8400...
[2023-02-25 17:27:53,094][08744] Num frames 8500...
[2023-02-25 17:27:53,250][08744] Num frames 8600...
[2023-02-25 17:27:53,409][08744] Num frames 8700...
[2023-02-25 17:27:53,567][08744] Num frames 8800...
[2023-02-25 17:27:53,727][08744] Num frames 8900...
[2023-02-25 17:27:53,903][08744] Num frames 9000...
[2023-02-25 17:27:54,016][08744] Avg episode rewards: #0: 25.281, true rewards: #0: 11.281
[2023-02-25 17:27:54,021][08744] Avg episode reward: 25.281, avg true_objective: 11.281
[2023-02-25 17:27:54,139][08744] Num frames 9100...
[2023-02-25 17:27:54,302][08744] Num frames 9200...
[2023-02-25 17:27:54,454][08744] Num frames 9300...
[2023-02-25 17:27:54,610][08744] Num frames 9400...
[2023-02-25 17:27:54,770][08744] Num frames 9500...
[2023-02-25 17:27:54,938][08744] Num frames 9600...
[2023-02-25 17:27:55,106][08744] Num frames 9700...
[2023-02-25 17:27:55,268][08744] Num frames 9800...
[2023-02-25 17:27:55,430][08744] Num frames 9900...
[2023-02-25 17:27:55,590][08744] Num frames 10000...
[2023-02-25 17:27:55,751][08744] Num frames 10100...
[2023-02-25 17:27:55,916][08744] Num frames 10200...
[2023-02-25 17:27:56,076][08744] Num frames 10300...
[2023-02-25 17:27:56,238][08744] Num frames 10400...
[2023-02-25 17:27:56,379][08744] Num frames 10500...
[2023-02-25 17:27:56,490][08744] Num frames 10600...
[2023-02-25 17:27:56,603][08744] Num frames 10700...
[2023-02-25 17:27:56,707][08744] Avg episode rewards: #0: 27.602, true rewards: #0: 11.936
[2023-02-25 17:27:56,711][08744] Avg episode reward: 27.602, avg true_objective: 11.936
[2023-02-25 17:27:56,778][08744] Num frames 10800...
[2023-02-25 17:27:56,902][08744] Num frames 10900...
[2023-02-25 17:27:57,012][08744] Num frames 11000...
[2023-02-25 17:27:57,120][08744] Num frames 11100...
[2023-02-25 17:27:57,228][08744] Num frames 11200...
[2023-02-25 17:27:57,345][08744] Num frames 11300...
[2023-02-25 17:27:57,453][08744] Num frames 11400...
[2023-02-25 17:27:57,560][08744] Num frames 11500...
[2023-02-25 17:27:57,675][08744] Num frames 11600...
[2023-02-25 17:27:57,785][08744] Num frames 11700...
[2023-02-25 17:27:57,898][08744] Num frames 11800...
[2023-02-25 17:27:58,006][08744] Num frames 11900...
[2023-02-25 17:27:58,118][08744] Num frames 12000...
[2023-02-25 17:27:58,231][08744] Num frames 12100...
[2023-02-25 17:27:58,352][08744] Num frames 12200...
[2023-02-25 17:27:58,464][08744] Num frames 12300...
[2023-02-25 17:27:58,576][08744] Num frames 12400...
[2023-02-25 17:27:58,692][08744] Num frames 12500...
[2023-02-25 17:27:58,804][08744] Num frames 12600...
[2023-02-25 17:27:58,925][08744] Num frames 12700...
[2023-02-25 17:27:59,043][08744] Num frames 12800...
[2023-02-25 17:27:59,146][08744] Avg episode rewards: #0: 30.342, true rewards: #0: 12.842
[2023-02-25 17:27:59,148][08744] Avg episode reward: 30.342, avg true_objective: 12.842
[2023-02-25 17:29:20,408][08744] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
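(Editor's sketch, not part of the log.) Both evaluation runs below restore checkpoint_000000978_4005888.pth. The filename appears to encode the policy version and the env-frame count; the `split_checkpoint_name` helper here is hypothetical (not a Sample Factory API), and the 4096-frames-per-version figure is just the exact ratio observed in this particular run:

```python
import re

def split_checkpoint_name(name: str) -> tuple[int, int]:
    """Hypothetical helper: pull (policy_version, env_frames) out of a
    checkpoint filename like 'checkpoint_000000978_4005888.pth'."""
    m = re.fullmatch(r"checkpoint_(\d+)_(\d+)\.pth", name)
    if m is None:
        raise ValueError(f"unexpected checkpoint name: {name}")
    return int(m.group(1)), int(m.group(2))

version, frames = split_checkpoint_name("checkpoint_000000978_4005888.pth")
print(version, frames)    # -> 978 4005888
# In this run the ratio is exact: 4005888 / 978 = 4096 frames per policy version,
# consistent with each update consuming a 4096-frame batch (an inference, not logged).
print(frames // version)  # -> 4096
```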
[2023-02-25 17:32:34,122][08744] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-25 17:32:34,128][08744] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-25 17:32:34,130][08744] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-25 17:32:34,133][08744] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-25 17:32:34,138][08744] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-25 17:32:34,139][08744] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-25 17:32:34,141][08744] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-25 17:32:34,142][08744] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-25 17:32:34,143][08744] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-25 17:32:34,147][08744] Adding new argument 'hf_repository'='akgeni/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-25 17:32:34,149][08744] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-25 17:32:34,151][08744] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-25 17:32:34,152][08744] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-25 17:32:34,154][08744] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-25 17:32:34,157][08744] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-25 17:32:34,192][08744] RunningMeanStd input shape: (3, 72, 128)
[2023-02-25 17:32:34,194][08744] RunningMeanStd input shape: (1,)
[2023-02-25 17:32:34,219][08744] ConvEncoder: input_channels=3
[2023-02-25 17:32:34,282][08744] Conv encoder output size: 512
[2023-02-25 17:32:34,284][08744] Policy head output size: 512
[2023-02-25 17:32:34,313][08744] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-25 17:32:34,933][08744] Num frames 100...
[2023-02-25 17:32:35,084][08744] Num frames 200...
[2023-02-25 17:32:35,248][08744] Num frames 300...
[2023-02-25 17:32:35,408][08744] Num frames 400...
[2023-02-25 17:32:35,559][08744] Num frames 500...
[2023-02-25 17:32:35,710][08744] Num frames 600...
[2023-02-25 17:32:35,782][08744] Avg episode rewards: #0: 9.080, true rewards: #0: 6.080
[2023-02-25 17:32:35,784][08744] Avg episode reward: 9.080, avg true_objective: 6.080
[2023-02-25 17:32:35,939][08744] Num frames 700...
[2023-02-25 17:32:36,097][08744] Num frames 800...
[2023-02-25 17:32:36,257][08744] Num frames 900...
[2023-02-25 17:32:36,428][08744] Num frames 1000...
[2023-02-25 17:32:36,520][08744] Avg episode rewards: #0: 7.610, true rewards: #0: 5.110
[2023-02-25 17:32:36,522][08744] Avg episode reward: 7.610, avg true_objective: 5.110
[2023-02-25 17:32:36,663][08744] Num frames 1100...
[2023-02-25 17:32:36,820][08744] Num frames 1200...
[2023-02-25 17:32:36,968][08744] Num frames 1300...
[2023-02-25 17:32:37,078][08744] Num frames 1400...
[2023-02-25 17:32:37,197][08744] Num frames 1500...
[2023-02-25 17:32:37,309][08744] Num frames 1600...
[2023-02-25 17:32:37,436][08744] Num frames 1700...
[2023-02-25 17:32:37,546][08744] Num frames 1800...
[2023-02-25 17:32:37,658][08744] Num frames 1900...
[2023-02-25 17:32:37,766][08744] Num frames 2000...
[2023-02-25 17:32:37,879][08744] Num frames 2100...
[2023-02-25 17:32:37,994][08744] Num frames 2200...
[2023-02-25 17:32:38,110][08744] Num frames 2300...
[2023-02-25 17:32:38,221][08744] Num frames 2400...
[2023-02-25 17:32:38,359][08744] Num frames 2500...
[2023-02-25 17:32:38,482][08744] Num frames 2600...
[2023-02-25 17:32:38,596][08744] Num frames 2700...
[2023-02-25 17:32:38,709][08744] Num frames 2800...
[2023-02-25 17:32:38,820][08744] Num frames 2900...
[2023-02-25 17:32:38,931][08744] Num frames 3000...
[2023-02-25 17:32:39,058][08744] Num frames 3100...
[2023-02-25 17:32:39,146][08744] Avg episode rewards: #0: 22.073, true rewards: #0: 10.407
[2023-02-25 17:32:39,151][08744] Avg episode reward: 22.073, avg true_objective: 10.407
[2023-02-25 17:32:39,239][08744] Num frames 3200...
[2023-02-25 17:32:39,355][08744] Num frames 3300...
[2023-02-25 17:32:39,473][08744] Num frames 3400...
[2023-02-25 17:32:39,541][08744] Avg episode rewards: #0: 17.775, true rewards: #0: 8.525
[2023-02-25 17:32:39,542][08744] Avg episode reward: 17.775, avg true_objective: 8.525
[2023-02-25 17:32:39,645][08744] Num frames 3500...
[2023-02-25 17:32:39,763][08744] Num frames 3600...
[2023-02-25 17:32:39,882][08744] Num frames 3700...
[2023-02-25 17:32:40,005][08744] Num frames 3800...
[2023-02-25 17:32:40,122][08744] Num frames 3900...
[2023-02-25 17:32:40,231][08744] Num frames 4000...
[2023-02-25 17:32:40,346][08744] Num frames 4100...
[2023-02-25 17:32:40,460][08744] Num frames 4200...
[2023-02-25 17:32:40,586][08744] Num frames 4300...
[2023-02-25 17:32:40,694][08744] Num frames 4400...
[2023-02-25 17:32:40,804][08744] Num frames 4500...
[2023-02-25 17:32:40,914][08744] Num frames 4600...
[2023-02-25 17:32:41,031][08744] Num frames 4700...
[2023-02-25 17:32:41,147][08744] Num frames 4800...
[2023-02-25 17:32:41,257][08744] Num frames 4900...
[2023-02-25 17:32:41,366][08744] Num frames 5000...
[2023-02-25 17:32:41,480][08744] Num frames 5100...
[2023-02-25 17:32:41,584][08744] Avg episode rewards: #0: 22.876, true rewards: #0: 10.276
[2023-02-25 17:32:41,586][08744] Avg episode reward: 22.876, avg true_objective: 10.276
[2023-02-25 17:32:41,656][08744] Num frames 5200...
[2023-02-25 17:32:41,773][08744] Num frames 5300...
[2023-02-25 17:32:41,903][08744] Num frames 5400...
[2023-02-25 17:32:42,010][08744] Avg episode rewards: #0: 19.898, true rewards: #0: 9.065
[2023-02-25 17:32:42,011][08744] Avg episode reward: 19.898, avg true_objective: 9.065
[2023-02-25 17:32:42,086][08744] Num frames 5500...
[2023-02-25 17:32:42,197][08744] Num frames 5600...
[2023-02-25 17:32:42,307][08744] Num frames 5700...
[2023-02-25 17:32:42,418][08744] Num frames 5800...
[2023-02-25 17:32:42,529][08744] Num frames 5900...
[2023-02-25 17:32:42,635][08744] Num frames 6000...
[2023-02-25 17:32:42,741][08744] Num frames 6100...
[2023-02-25 17:32:42,857][08744] Num frames 6200...
[2023-02-25 17:32:42,922][08744] Avg episode rewards: #0: 18.867, true rewards: #0: 8.867
[2023-02-25 17:32:42,923][08744] Avg episode reward: 18.867, avg true_objective: 8.867
[2023-02-25 17:32:43,030][08744] Num frames 6300...
[2023-02-25 17:32:43,139][08744] Num frames 6400...
[2023-02-25 17:32:43,249][08744] Num frames 6500...
[2023-02-25 17:32:43,365][08744] Num frames 6600...
[2023-02-25 17:32:43,493][08744] Num frames 6700...
[2023-02-25 17:32:43,616][08744] Num frames 6800...
[2023-02-25 17:32:43,734][08744] Num frames 6900...
[2023-02-25 17:32:43,855][08744] Avg episode rewards: #0: 18.696, true rewards: #0: 8.696
[2023-02-25 17:32:43,856][08744] Avg episode reward: 18.696, avg true_objective: 8.696
[2023-02-25 17:32:43,908][08744] Num frames 7000...
[2023-02-25 17:32:44,017][08744] Num frames 7100...
[2023-02-25 17:32:44,129][08744] Num frames 7200...
[2023-02-25 17:32:44,271][08744] Avg episode rewards: #0: 17.310, true rewards: #0: 8.088
[2023-02-25 17:32:44,274][08744] Avg episode reward: 17.310, avg true_objective: 8.088
[2023-02-25 17:32:44,302][08744] Num frames 7300...
[2023-02-25 17:32:44,410][08744] Num frames 7400...
[2023-02-25 17:32:44,534][08744] Num frames 7500...
[2023-02-25 17:32:44,643][08744] Num frames 7600...
[2023-02-25 17:32:44,765][08744] Avg episode rewards: #0: 15.963, true rewards: #0: 7.663
[2023-02-25 17:32:44,767][08744] Avg episode reward: 15.963, avg true_objective: 7.663
[2023-02-25 17:33:35,350][08744] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
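(Editor's sketch, not part of the log.) Each "Avg episode rewards" line above is a running mean over the episodes finished so far. A minimal sketch of that bookkeeping, using the two per-episode rewards that can be back-computed exactly from the final run's first two averages (9.080 and 7.610); individual episode rewards are not logged directly:

```python
# Sketch: "Avg episode rewards" as an incremental running mean,
#   avg_n = avg_{n-1} + (r_n - avg_{n-1}) / n
# episode_rewards is back-computed from this log (r2 = 2*7.610 - 9.080 = 6.140).
episode_rewards = [9.080, 6.140]

avg = 0.0
for n, r in enumerate(episode_rewards, start=1):
    avg += (r - avg) / n
    print(f"Avg episode rewards after {n} episode(s): {avg:.3f}")
# -> 9.080 after episode 1, 7.610 after episode 2, matching the log
```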