[2023-03-14 14:00:34,317][00372] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-03-14 14:00:34,320][00372] Rollout worker 0 uses device cpu
[2023-03-14 14:00:34,324][00372] Rollout worker 1 uses device cpu
[2023-03-14 14:00:34,325][00372] Rollout worker 2 uses device cpu
[2023-03-14 14:00:34,327][00372] Rollout worker 3 uses device cpu
[2023-03-14 14:00:34,328][00372] Rollout worker 4 uses device cpu
[2023-03-14 14:00:34,329][00372] Rollout worker 5 uses device cpu
[2023-03-14 14:00:34,330][00372] Rollout worker 6 uses device cpu
[2023-03-14 14:00:34,331][00372] Rollout worker 7 uses device cpu
[2023-03-14 14:00:34,528][00372] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-03-14 14:00:34,530][00372] InferenceWorker_p0-w0: min num requests: 2
[2023-03-14 14:00:34,772][00372] Starting all processes...
[2023-03-14 14:00:34,774][00372] Starting process learner_proc0
[2023-03-14 14:00:34,843][00372] Starting all processes...
[2023-03-14 14:00:34,850][00372] Starting process inference_proc0-0
[2023-03-14 14:00:34,852][00372] Starting process rollout_proc0
[2023-03-14 14:00:34,875][00372] Starting process rollout_proc1
[2023-03-14 14:00:34,877][00372] Starting process rollout_proc2
[2023-03-14 14:00:34,877][00372] Starting process rollout_proc3
[2023-03-14 14:00:34,877][00372] Starting process rollout_proc4
[2023-03-14 14:00:34,877][00372] Starting process rollout_proc5
[2023-03-14 14:00:34,882][00372] Starting process rollout_proc6
[2023-03-14 14:00:34,882][00372] Starting process rollout_proc7
[2023-03-14 14:00:47,488][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-03-14 14:00:47,490][13187] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-03-14 14:00:47,910][13205] Worker 0 uses CPU cores [0]
[2023-03-14 14:00:48,043][13202] Worker 1 uses CPU cores [1]
[2023-03-14 14:00:48,189][13200] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-03-14 14:00:48,197][13200] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-03-14 14:00:48,230][13207] Worker 4 uses CPU cores [0]
[2023-03-14 14:00:48,319][13209] Worker 7 uses CPU cores [1]
[2023-03-14 14:00:48,402][13208] Worker 3 uses CPU cores [1]
[2023-03-14 14:00:48,465][13211] Worker 6 uses CPU cores [0]
[2023-03-14 14:00:48,483][13210] Worker 5 uses CPU cores [1]
[2023-03-14 14:00:48,513][13204] Worker 2 uses CPU cores [0]
[2023-03-14 14:00:48,576][13187] Num visible devices: 1
[2023-03-14 14:00:48,577][13200] Num visible devices: 1
[2023-03-14 14:00:48,590][13187] Starting seed is not provided
[2023-03-14 14:00:48,591][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-03-14 14:00:48,591][13187] Initializing actor-critic model on device cuda:0
[2023-03-14 14:00:48,592][13187] RunningMeanStd input shape: (3, 72, 128)
[2023-03-14 14:00:48,593][13187] RunningMeanStd input shape: (1,)
[2023-03-14 14:00:48,606][13187] ConvEncoder: input_channels=3
[2023-03-14 14:00:48,868][13187] Conv encoder output size: 512
[2023-03-14 14:00:48,869][13187] Policy head output size: 512
[2023-03-14 14:00:48,915][13187] Created Actor Critic model with architecture:
[2023-03-14 14:00:48,916][13187] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-03-14 14:00:54,519][00372] Heartbeat connected on Batcher_0
[2023-03-14 14:00:54,529][00372] Heartbeat connected on InferenceWorker_p0-w0
[2023-03-14 14:00:54,539][00372] Heartbeat connected on RolloutWorker_w0
[2023-03-14 14:00:54,544][00372] Heartbeat connected on RolloutWorker_w1
[2023-03-14 14:00:54,548][00372] Heartbeat connected on RolloutWorker_w2
[2023-03-14 14:00:54,759][00372] Heartbeat connected on RolloutWorker_w3
[2023-03-14 14:00:54,764][00372] Heartbeat connected on RolloutWorker_w4
[2023-03-14 14:00:54,766][00372] Heartbeat connected on RolloutWorker_w5
[2023-03-14 14:00:54,769][00372] Heartbeat connected on RolloutWorker_w6
[2023-03-14 14:00:54,773][00372] Heartbeat connected on RolloutWorker_w7
[2023-03-14 14:00:55,602][13187] Using optimizer
[2023-03-14 14:00:55,603][13187] No checkpoints found
[2023-03-14 14:00:55,603][13187] Did not load from checkpoint, starting from scratch!
[2023-03-14 14:00:55,603][13187] Initialized policy 0 weights for model version 0
[2023-03-14 14:00:55,613][13187] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-03-14 14:00:55,623][13187] LearnerWorker_p0 finished initialization!
[2023-03-14 14:00:55,623][00372] Heartbeat connected on LearnerWorker_p0
[2023-03-14 14:00:55,807][13200] RunningMeanStd input shape: (3, 72, 128)
[2023-03-14 14:00:55,809][13200] RunningMeanStd input shape: (1,)
[2023-03-14 14:00:55,831][13200] ConvEncoder: input_channels=3
[2023-03-14 14:00:55,997][13200] Conv encoder output size: 512
[2023-03-14 14:00:55,999][13200] Policy head output size: 512
[2023-03-14 14:00:59,436][00372] Inference worker 0-0 is ready!
[2023-03-14 14:00:59,438][00372] All inference workers are ready! Signal rollout workers to start!
[2023-03-14 14:00:59,554][13205] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,559][13211] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,560][13204] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,578][13207] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,713][13209] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,720][13202] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,725][13208] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:00:59,723][13210] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:01:00,144][00372] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-03-14 14:01:00,912][13209] Decorrelating experience for 0 frames...
[2023-03-14 14:01:00,914][13202] Decorrelating experience for 0 frames...
[2023-03-14 14:01:01,406][13211] Decorrelating experience for 0 frames...
[2023-03-14 14:01:01,408][13204] Decorrelating experience for 0 frames...
[2023-03-14 14:01:01,412][13207] Decorrelating experience for 0 frames...
[2023-03-14 14:01:01,544][13209] Decorrelating experience for 32 frames...
[2023-03-14 14:01:02,126][13208] Decorrelating experience for 0 frames...
[2023-03-14 14:01:02,339][13204] Decorrelating experience for 32 frames...
[2023-03-14 14:01:02,431][13205] Decorrelating experience for 0 frames...
[2023-03-14 14:01:02,523][13209] Decorrelating experience for 64 frames...
[2023-03-14 14:01:02,655][13211] Decorrelating experience for 32 frames...
[2023-03-14 14:01:03,241][13210] Decorrelating experience for 0 frames...
[2023-03-14 14:01:03,249][13208] Decorrelating experience for 32 frames...
[2023-03-14 14:01:03,339][13205] Decorrelating experience for 32 frames...
[2023-03-14 14:01:03,486][13204] Decorrelating experience for 64 frames...
[2023-03-14 14:01:03,847][13209] Decorrelating experience for 96 frames...
[2023-03-14 14:01:04,539][13210] Decorrelating experience for 32 frames...
[2023-03-14 14:01:04,571][13207] Decorrelating experience for 32 frames...
[2023-03-14 14:01:04,738][13208] Decorrelating experience for 64 frames...
[2023-03-14 14:01:04,850][13211] Decorrelating experience for 64 frames...
[2023-03-14 14:01:04,953][13202] Decorrelating experience for 32 frames...
[2023-03-14 14:01:05,143][00372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-03-14 14:01:05,226][13204] Decorrelating experience for 96 frames...
[2023-03-14 14:01:05,563][13202] Decorrelating experience for 64 frames...
[2023-03-14 14:01:05,843][13208] Decorrelating experience for 96 frames...
[2023-03-14 14:01:06,116][13205] Decorrelating experience for 64 frames...
[2023-03-14 14:01:06,243][13211] Decorrelating experience for 96 frames...
[2023-03-14 14:01:06,720][13210] Decorrelating experience for 64 frames...
[2023-03-14 14:01:06,743][13205] Decorrelating experience for 96 frames...
[2023-03-14 14:01:07,121][13202] Decorrelating experience for 96 frames...
[2023-03-14 14:01:07,507][13210] Decorrelating experience for 96 frames...
[2023-03-14 14:01:07,957][13207] Decorrelating experience for 64 frames...
[2023-03-14 14:01:10,143][00372] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 53.0. Samples: 530. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-03-14 14:01:10,146][00372] Avg episode reward: [(0, '1.660')]
[2023-03-14 14:01:10,932][13187] Signal inference workers to stop experience collection...
[2023-03-14 14:01:10,941][13200] InferenceWorker_p0-w0: stopping experience collection
[2023-03-14 14:01:10,999][13207] Decorrelating experience for 96 frames...
[2023-03-14 14:01:14,007][13187] Signal inference workers to resume experience collection...
[2023-03-14 14:01:14,008][13200] InferenceWorker_p0-w0: resuming experience collection
[2023-03-14 14:01:15,143][00372] Fps is (10 sec: 409.6, 60 sec: 273.1, 300 sec: 273.1). Total num frames: 4096. Throughput: 0: 169.9. Samples: 2548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-03-14 14:01:15,150][00372] Avg episode reward: [(0, '2.686')]
[2023-03-14 14:01:20,143][00372] Fps is (10 sec: 2048.0, 60 sec: 1024.1, 300 sec: 1024.1). Total num frames: 20480. Throughput: 0: 180.6. Samples: 3612. Policy #0 lag: (min: 0.0, avg: 0.3, max: 3.0)
[2023-03-14 14:01:20,150][00372] Avg episode reward: [(0, '3.515')]
[2023-03-14 14:01:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 1310.8, 300 sec: 1310.8). Total num frames: 32768. Throughput: 0: 313.6. Samples: 7840. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-03-14 14:01:25,146][00372] Avg episode reward: [(0, '4.007')]
[2023-03-14 14:01:26,631][13200] Updated weights for policy 0, policy_version 10 (0.0013)
[2023-03-14 14:01:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 1775.0, 300 sec: 1775.0). Total num frames: 53248. Throughput: 0: 463.6. Samples: 13908. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-03-14 14:01:30,149][00372] Avg episode reward: [(0, '4.655')]
[2023-03-14 14:01:35,147][00372] Fps is (10 sec: 4094.1, 60 sec: 2106.3, 300 sec: 2106.3). Total num frames: 73728. Throughput: 0: 486.1. Samples: 17014.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:01:35,152][00372] Avg episode reward: [(0, '4.581')]
[2023-03-14 14:01:37,689][13200] Updated weights for policy 0, policy_version 20 (0.0033)
[2023-03-14 14:01:40,144][00372] Fps is (10 sec: 3276.3, 60 sec: 2150.4, 300 sec: 2150.4). Total num frames: 86016. Throughput: 0: 533.8. Samples: 21354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:01:40,148][00372] Avg episode reward: [(0, '4.389')]
[2023-03-14 14:01:45,144][00372] Fps is (10 sec: 2458.3, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 98304. Throughput: 0: 564.9. Samples: 25420. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-03-14 14:01:45,152][00372] Avg episode reward: [(0, '4.206')]
[2023-03-14 14:01:50,143][00372] Fps is (10 sec: 2867.6, 60 sec: 2293.8, 300 sec: 2293.8). Total num frames: 114688. Throughput: 0: 608.6. Samples: 27388. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-03-14 14:01:50,145][00372] Avg episode reward: [(0, '4.330')]
[2023-03-14 14:01:50,152][13187] Saving new best policy, reward=4.330!
[2023-03-14 14:01:51,670][13200] Updated weights for policy 0, policy_version 30 (0.0014)
[2023-03-14 14:01:55,143][00372] Fps is (10 sec: 3686.9, 60 sec: 2457.7, 300 sec: 2457.7). Total num frames: 135168. Throughput: 0: 727.6. Samples: 33272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-03-14 14:01:55,146][00372] Avg episode reward: [(0, '4.421')]
[2023-03-14 14:01:55,151][13187] Saving new best policy, reward=4.421!
[2023-03-14 14:02:00,144][00372] Fps is (10 sec: 3686.2, 60 sec: 2525.9, 300 sec: 2525.9). Total num frames: 151552. Throughput: 0: 811.1. Samples: 39048. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:00,146][00372] Avg episode reward: [(0, '4.232')]
[2023-03-14 14:02:03,230][13200] Updated weights for policy 0, policy_version 40 (0.0023)
[2023-03-14 14:02:05,143][00372] Fps is (10 sec: 3276.9, 60 sec: 2798.9, 300 sec: 2583.7). Total num frames: 167936.
Throughput: 0: 832.0. Samples: 41054. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:05,154][00372] Avg episode reward: [(0, '4.304')]
[2023-03-14 14:02:10,143][00372] Fps is (10 sec: 2867.4, 60 sec: 3003.7, 300 sec: 2574.7). Total num frames: 180224. Throughput: 0: 826.5. Samples: 45034. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:10,151][00372] Avg episode reward: [(0, '4.388')]
[2023-03-14 14:02:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 2621.5). Total num frames: 196608. Throughput: 0: 794.9. Samples: 49678. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:15,148][00372] Avg episode reward: [(0, '4.358')]
[2023-03-14 14:02:16,544][13200] Updated weights for policy 0, policy_version 50 (0.0040)
[2023-03-14 14:02:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 2713.7). Total num frames: 217088. Throughput: 0: 795.0. Samples: 52784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:20,148][00372] Avg episode reward: [(0, '4.380')]
[2023-03-14 14:02:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2746.8). Total num frames: 233472. Throughput: 0: 834.4. Samples: 58900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:25,149][00372] Avg episode reward: [(0, '4.276')]
[2023-03-14 14:02:28,154][13200] Updated weights for policy 0, policy_version 60 (0.0016)
[2023-03-14 14:02:30,148][00372] Fps is (10 sec: 3275.0, 60 sec: 3276.5, 300 sec: 2776.1). Total num frames: 249856. Throughput: 0: 833.0. Samples: 62910. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:02:30,156][00372] Avg episode reward: [(0, '4.309')]
[2023-03-14 14:02:30,170][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth...
[2023-03-14 14:02:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 2759.5). Total num frames: 262144. Throughput: 0: 832.0. Samples: 64830.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:02:35,147][00372] Avg episode reward: [(0, '4.342')]
[2023-03-14 14:02:40,143][00372] Fps is (10 sec: 2868.8, 60 sec: 3208.6, 300 sec: 2785.3). Total num frames: 278528. Throughput: 0: 798.6. Samples: 69208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:02:40,153][00372] Avg episode reward: [(0, '4.421')]
[2023-03-14 14:02:41,492][13200] Updated weights for policy 0, policy_version 70 (0.0017)
[2023-03-14 14:02:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 2847.7). Total num frames: 299008. Throughput: 0: 808.2. Samples: 75418. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-03-14 14:02:45,145][00372] Avg episode reward: [(0, '4.279')]
[2023-03-14 14:02:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2867.2). Total num frames: 315392. Throughput: 0: 832.9. Samples: 78536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:02:50,148][00372] Avg episode reward: [(0, '4.310')]
[2023-03-14 14:02:53,566][13200] Updated weights for policy 0, policy_version 80 (0.0034)
[2023-03-14 14:02:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 2885.0). Total num frames: 331776. Throughput: 0: 832.0. Samples: 82476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:02:55,145][00372] Avg episode reward: [(0, '4.472')]
[2023-03-14 14:02:55,151][13187] Saving new best policy, reward=4.472!
[2023-03-14 14:03:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 2867.2). Total num frames: 344064. Throughput: 0: 816.7. Samples: 86430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:03:00,145][00372] Avg episode reward: [(0, '4.478')]
[2023-03-14 14:03:00,153][13187] Saving new best policy, reward=4.478!
[2023-03-14 14:03:05,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 2883.6). Total num frames: 360448. Throughput: 0: 791.9. Samples: 88420.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:03:05,145][00372] Avg episode reward: [(0, '4.335')]
[2023-03-14 14:03:06,695][13200] Updated weights for policy 0, policy_version 90 (0.0027)
[2023-03-14 14:03:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 2930.2). Total num frames: 380928. Throughput: 0: 794.7. Samples: 94662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:03:10,151][00372] Avg episode reward: [(0, '4.322')]
[2023-03-14 14:03:15,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3345.0, 300 sec: 2943.0). Total num frames: 397312. Throughput: 0: 824.3. Samples: 100002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:03:15,149][00372] Avg episode reward: [(0, '4.364')]
[2023-03-14 14:03:18,947][13200] Updated weights for policy 0, policy_version 100 (0.0015)
[2023-03-14 14:03:20,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.3, 300 sec: 2925.7). Total num frames: 409600. Throughput: 0: 823.8. Samples: 101906. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:03:20,149][00372] Avg episode reward: [(0, '4.373')]
[2023-03-14 14:03:25,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3208.5, 300 sec: 2937.8). Total num frames: 425984. Throughput: 0: 817.2. Samples: 105984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:03:25,149][00372] Avg episode reward: [(0, '4.317')]
[2023-03-14 14:03:30,143][00372] Fps is (10 sec: 3277.9, 60 sec: 3208.8, 300 sec: 2949.1). Total num frames: 442368. Throughput: 0: 787.7. Samples: 110864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:03:30,150][00372] Avg episode reward: [(0, '4.578')]
[2023-03-14 14:03:30,160][13187] Saving new best policy, reward=4.578!
[2023-03-14 14:03:31,728][13200] Updated weights for policy 0, policy_version 110 (0.0035)
[2023-03-14 14:03:35,146][00372] Fps is (10 sec: 3685.3, 60 sec: 3344.9, 300 sec: 2986.1). Total num frames: 462848. Throughput: 0: 787.7. Samples: 113986.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:03:35,149][00372] Avg episode reward: [(0, '4.485')]
[2023-03-14 14:03:40,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 2995.2). Total num frames: 479232. Throughput: 0: 827.2. Samples: 119702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:03:40,152][00372] Avg episode reward: [(0, '4.286')]
[2023-03-14 14:03:44,115][13200] Updated weights for policy 0, policy_version 120 (0.0019)
[2023-03-14 14:03:45,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3208.5, 300 sec: 2978.9). Total num frames: 491520. Throughput: 0: 827.6. Samples: 123670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:03:45,148][00372] Avg episode reward: [(0, '4.310')]
[2023-03-14 14:03:50,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 2987.7). Total num frames: 507904. Throughput: 0: 826.8. Samples: 125628. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:03:50,145][00372] Avg episode reward: [(0, '4.462')]
[2023-03-14 14:03:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2996.0). Total num frames: 524288. Throughput: 0: 793.6. Samples: 130372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-03-14 14:03:55,146][00372] Avg episode reward: [(0, '4.788')]
[2023-03-14 14:03:55,153][13187] Saving new best policy, reward=4.788!
[2023-03-14 14:03:56,712][13200] Updated weights for policy 0, policy_version 130 (0.0016)
[2023-03-14 14:04:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3026.5). Total num frames: 544768. Throughput: 0: 810.2. Samples: 136460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:04:00,146][00372] Avg episode reward: [(0, '4.721')]
[2023-03-14 14:04:05,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3033.3). Total num frames: 561152. Throughput: 0: 830.4. Samples: 139270.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:04:05,145][00372] Avg episode reward: [(0, '4.687')]
[2023-03-14 14:04:09,762][13200] Updated weights for policy 0, policy_version 140 (0.0021)
[2023-03-14 14:04:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3018.1). Total num frames: 573440. Throughput: 0: 823.7. Samples: 143048. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:10,149][00372] Avg episode reward: [(0, '4.655')]
[2023-03-14 14:04:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3003.8). Total num frames: 585728. Throughput: 0: 799.6. Samples: 146846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:15,151][00372] Avg episode reward: [(0, '4.629')]
[2023-03-14 14:04:20,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.5, 300 sec: 3010.5). Total num frames: 602112. Throughput: 0: 780.0. Samples: 149088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:20,149][00372] Avg episode reward: [(0, '4.413')]
[2023-03-14 14:04:22,207][13200] Updated weights for policy 0, policy_version 150 (0.0018)
[2023-03-14 14:04:25,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3037.1). Total num frames: 622592. Throughput: 0: 792.9. Samples: 155382. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:25,145][00372] Avg episode reward: [(0, '4.350')]
[2023-03-14 14:04:30,144][00372] Fps is (10 sec: 3687.2, 60 sec: 3276.7, 300 sec: 3042.7). Total num frames: 638976. Throughput: 0: 819.4. Samples: 160544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:30,156][00372] Avg episode reward: [(0, '4.533')]
[2023-03-14 14:04:30,176][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000156_638976.pth...
[2023-03-14 14:04:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 3029.2). Total num frames: 651264. Throughput: 0: 817.4. Samples: 162410.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-03-14 14:04:35,145][00372] Avg episode reward: [(0, '4.534')]
[2023-03-14 14:04:35,593][13200] Updated weights for policy 0, policy_version 160 (0.0019)
[2023-03-14 14:04:40,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3140.2, 300 sec: 3034.8). Total num frames: 667648. Throughput: 0: 800.5. Samples: 166396. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:04:40,149][00372] Avg episode reward: [(0, '4.574')]
[2023-03-14 14:04:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3040.2). Total num frames: 684032. Throughput: 0: 784.0. Samples: 171742. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-03-14 14:04:45,149][00372] Avg episode reward: [(0, '4.492')]
[2023-03-14 14:04:47,244][13200] Updated weights for policy 0, policy_version 170 (0.0032)
[2023-03-14 14:04:50,143][00372] Fps is (10 sec: 3686.6, 60 sec: 3276.8, 300 sec: 3063.1). Total num frames: 704512. Throughput: 0: 791.4. Samples: 174882. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:04:50,149][00372] Avg episode reward: [(0, '4.543')]
[2023-03-14 14:04:55,143][00372] Fps is (10 sec: 3686.2, 60 sec: 3276.8, 300 sec: 3067.7). Total num frames: 720896. Throughput: 0: 823.9. Samples: 180124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:04:55,151][00372] Avg episode reward: [(0, '4.618')]
[2023-03-14 14:05:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3055.0). Total num frames: 733184. Throughput: 0: 826.7. Samples: 184046. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-03-14 14:05:00,146][00372] Avg episode reward: [(0, '4.672')]
[2023-03-14 14:05:00,735][13200] Updated weights for policy 0, policy_version 180 (0.0034)
[2023-03-14 14:05:05,143][00372] Fps is (10 sec: 2457.7, 60 sec: 3072.0, 300 sec: 3042.8). Total num frames: 745472. Throughput: 0: 822.3. Samples: 186088.
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-03-14 14:05:05,146][00372] Avg episode reward: [(0, '4.585')]
[2023-03-14 14:05:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3063.8). Total num frames: 765952. Throughput: 0: 796.8. Samples: 191238. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:05:10,145][00372] Avg episode reward: [(0, '4.510')]
[2023-03-14 14:05:12,262][13200] Updated weights for policy 0, policy_version 190 (0.0015)
[2023-03-14 14:05:15,143][00372] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3100.1). Total num frames: 790528. Throughput: 0: 820.9. Samples: 197482. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:15,151][00372] Avg episode reward: [(0, '4.737')]
[2023-03-14 14:05:20,145][00372] Fps is (10 sec: 3685.5, 60 sec: 3345.1, 300 sec: 3087.7). Total num frames: 802816. Throughput: 0: 834.1. Samples: 199946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:05:20,148][00372] Avg episode reward: [(0, '4.657')]
[2023-03-14 14:05:25,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 3075.9). Total num frames: 815104. Throughput: 0: 832.4. Samples: 203852. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:25,152][00372] Avg episode reward: [(0, '4.589')]
[2023-03-14 14:05:25,510][13200] Updated weights for policy 0, policy_version 200 (0.0030)
[2023-03-14 14:05:30,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.6, 300 sec: 3079.6). Total num frames: 831488. Throughput: 0: 803.6. Samples: 207904. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:30,146][00372] Avg episode reward: [(0, '4.492')]
[2023-03-14 14:05:35,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3083.2). Total num frames: 847872. Throughput: 0: 793.5. Samples: 210590.
Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:35,148][00372] Avg episode reward: [(0, '4.463')]
[2023-03-14 14:05:37,128][13200] Updated weights for policy 0, policy_version 210 (0.0034)
[2023-03-14 14:05:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.4, 300 sec: 3115.9). Total num frames: 872448. Throughput: 0: 816.9. Samples: 216884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:40,145][00372] Avg episode reward: [(0, '4.534')]
[2023-03-14 14:05:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3104.4). Total num frames: 884736. Throughput: 0: 835.5. Samples: 221642. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-03-14 14:05:45,148][00372] Avg episode reward: [(0, '4.562')]
[2023-03-14 14:05:50,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3208.5, 300 sec: 3093.2). Total num frames: 897024. Throughput: 0: 835.1. Samples: 223670. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:05:50,145][00372] Avg episode reward: [(0, '4.589')]
[2023-03-14 14:05:50,581][13200] Updated weights for policy 0, policy_version 220 (0.0015)
[2023-03-14 14:05:55,144][00372] Fps is (10 sec: 2866.9, 60 sec: 3208.5, 300 sec: 3096.3). Total num frames: 913408. Throughput: 0: 810.1. Samples: 227692. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:05:55,148][00372] Avg episode reward: [(0, '4.615')]
[2023-03-14 14:06:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3151.8). Total num frames: 929792. Throughput: 0: 793.6. Samples: 233194. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:06:00,150][00372] Avg episode reward: [(0, '4.408')]
[2023-03-14 14:06:02,180][13200] Updated weights for policy 0, policy_version 230 (0.0023)
[2023-03-14 14:06:05,143][00372] Fps is (10 sec: 4096.5, 60 sec: 3481.6, 300 sec: 3235.1). Total num frames: 954368. Throughput: 0: 808.7. Samples: 236334.
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:06:05,150][00372] Avg episode reward: [(0, '4.418')]
[2023-03-14 14:06:10,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 966656. Throughput: 0: 834.8. Samples: 241418. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-03-14 14:06:10,150][00372] Avg episode reward: [(0, '4.573')]
[2023-03-14 14:06:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 978944. Throughput: 0: 832.3. Samples: 245358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-03-14 14:06:15,145][00372] Avg episode reward: [(0, '4.674')]
[2023-03-14 14:06:15,917][13200] Updated weights for policy 0, policy_version 240 (0.0029)
[2023-03-14 14:06:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3249.0). Total num frames: 991232. Throughput: 0: 818.0. Samples: 247398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:06:20,146][00372] Avg episode reward: [(0, '4.651')]
[2023-03-14 14:06:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1011712. Throughput: 0: 794.9. Samples: 252654. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:06:25,146][00372] Avg episode reward: [(0, '4.397')]
[2023-03-14 14:06:27,287][13200] Updated weights for policy 0, policy_version 250 (0.0020)
[2023-03-14 14:06:30,145][00372] Fps is (10 sec: 4095.0, 60 sec: 3344.9, 300 sec: 3249.1). Total num frames: 1032192. Throughput: 0: 827.1. Samples: 258864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-03-14 14:06:30,153][00372] Avg episode reward: [(0, '4.469')]
[2023-03-14 14:06:30,191][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000253_1036288.pth...
[2023-03-14 14:06:30,384][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth
[2023-03-14 14:06:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9).
Total num frames: 1048576. Throughput: 0: 829.6. Samples: 261000. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:06:35,145][00372] Avg episode reward: [(0, '4.512')]
[2023-03-14 14:06:40,146][00372] Fps is (10 sec: 2867.9, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 1060864. Throughput: 0: 827.6. Samples: 264932. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:06:40,149][00372] Avg episode reward: [(0, '4.483')]
[2023-03-14 14:06:41,269][13200] Updated weights for policy 0, policy_version 260 (0.0023)
[2023-03-14 14:06:45,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1073152. Throughput: 0: 792.3. Samples: 268848. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:06:45,150][00372] Avg episode reward: [(0, '4.525')]
[2023-03-14 14:06:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1093632. Throughput: 0: 789.1. Samples: 271844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-03-14 14:06:50,145][00372] Avg episode reward: [(0, '4.628')]
[2023-03-14 14:06:52,412][13200] Updated weights for policy 0, policy_version 270 (0.0018)
[2023-03-14 14:06:55,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 1114112. Throughput: 0: 812.0. Samples: 277958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-03-14 14:06:55,150][00372] Avg episode reward: [(0, '4.876')]
[2023-03-14 14:06:55,155][13187] Saving new best policy, reward=4.876!
[2023-03-14 14:07:00,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1126400. Throughput: 0: 819.8. Samples: 282248. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-03-14 14:07:00,152][00372] Avg episode reward: [(0, '5.077')]
[2023-03-14 14:07:00,168][13187] Saving new best policy, reward=5.077!
[2023-03-14 14:07:05,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3249.0). Total num frames: 1138688. Throughput: 0: 817.0.
Samples: 284164. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:07:05,149][00372] Avg episode reward: [(0, '5.161')] [2023-03-14 14:07:05,154][13187] Saving new best policy, reward=5.161! [2023-03-14 14:07:07,096][13200] Updated weights for policy 0, policy_version 280 (0.0013) [2023-03-14 14:07:10,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1155072. Throughput: 0: 791.3. Samples: 288264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:07:10,145][00372] Avg episode reward: [(0, '4.802')] [2023-03-14 14:07:15,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1175552. Throughput: 0: 784.3. Samples: 294156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:07:15,151][00372] Avg episode reward: [(0, '4.633')] [2023-03-14 14:07:17,752][13200] Updated weights for policy 0, policy_version 290 (0.0013) [2023-03-14 14:07:20,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1196032. Throughput: 0: 805.0. Samples: 297226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:07:20,147][00372] Avg episode reward: [(0, '4.748')] [2023-03-14 14:07:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.1). Total num frames: 1208320. Throughput: 0: 822.8. Samples: 301958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:07:25,151][00372] Avg episode reward: [(0, '4.705')] [2023-03-14 14:07:30,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3249.0). Total num frames: 1220608. Throughput: 0: 822.4. Samples: 305854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:07:30,151][00372] Avg episode reward: [(0, '4.705')] [2023-03-14 14:07:32,215][13200] Updated weights for policy 0, policy_version 300 (0.0023) [2023-03-14 14:07:35,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1236992. Throughput: 0: 800.0. Samples: 307844. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:07:35,153][00372] Avg episode reward: [(0, '4.809')] [2023-03-14 14:07:40,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1257472. Throughput: 0: 791.3. Samples: 313568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:07:40,145][00372] Avg episode reward: [(0, '4.788')] [2023-03-14 14:07:42,619][13200] Updated weights for policy 0, policy_version 310 (0.0017) [2023-03-14 14:07:45,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 1277952. Throughput: 0: 830.7. Samples: 319628. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:07:45,145][00372] Avg episode reward: [(0, '4.939')] [2023-03-14 14:07:50,146][00372] Fps is (10 sec: 3275.7, 60 sec: 3276.6, 300 sec: 3249.0). Total num frames: 1290240. Throughput: 0: 831.8. Samples: 321596. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:07:50,151][00372] Avg episode reward: [(0, '4.881')] [2023-03-14 14:07:55,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 1302528. Throughput: 0: 826.0. Samples: 325436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:07:55,145][00372] Avg episode reward: [(0, '4.948')] [2023-03-14 14:07:57,511][13200] Updated weights for policy 0, policy_version 320 (0.0050) [2023-03-14 14:08:00,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 1318912. Throughput: 0: 790.2. Samples: 329714. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:08:00,147][00372] Avg episode reward: [(0, '4.761')] [2023-03-14 14:08:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1339392. Throughput: 0: 790.8. Samples: 332812. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:08:05,149][00372] Avg episode reward: [(0, '4.732')] [2023-03-14 14:08:07,835][13200] Updated weights for policy 0, policy_version 330 (0.0012) [2023-03-14 14:08:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1355776. Throughput: 0: 823.8. Samples: 339030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:08:10,150][00372] Avg episode reward: [(0, '4.674')] [2023-03-14 14:08:15,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 1368064. Throughput: 0: 824.3. Samples: 342946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:08:15,148][00372] Avg episode reward: [(0, '4.679')] [2023-03-14 14:08:20,160][00372] Fps is (10 sec: 2453.3, 60 sec: 3071.1, 300 sec: 3235.0). Total num frames: 1380352. Throughput: 0: 821.8. Samples: 344838. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:08:20,168][00372] Avg episode reward: [(0, '4.929')] [2023-03-14 14:08:23,057][13200] Updated weights for policy 0, policy_version 340 (0.0033) [2023-03-14 14:08:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 1396736. Throughput: 0: 786.0. Samples: 348938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:08:25,145][00372] Avg episode reward: [(0, '5.001')] [2023-03-14 14:08:30,143][00372] Fps is (10 sec: 3692.7, 60 sec: 3276.8, 300 sec: 3235.2). Total num frames: 1417216. Throughput: 0: 786.2. Samples: 355006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:08:30,148][00372] Avg episode reward: [(0, '4.809')] [2023-03-14 14:08:30,196][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000347_1421312.pth... 
[2023-03-14 14:08:30,333][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000156_638976.pth [2023-03-14 14:08:33,054][13200] Updated weights for policy 0, policy_version 350 (0.0022) [2023-03-14 14:08:35,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1437696. Throughput: 0: 810.3. Samples: 358058. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:08:35,147][00372] Avg episode reward: [(0, '4.563')] [2023-03-14 14:08:40,148][00372] Fps is (10 sec: 3275.1, 60 sec: 3208.2, 300 sec: 3249.0). Total num frames: 1449984. Throughput: 0: 822.7. Samples: 362460. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:08:40,155][00372] Avg episode reward: [(0, '4.525')] [2023-03-14 14:08:45,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 1462272. Throughput: 0: 818.3. Samples: 366538. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:08:45,157][00372] Avg episode reward: [(0, '4.564')] [2023-03-14 14:08:48,177][13200] Updated weights for policy 0, policy_version 360 (0.0017) [2023-03-14 14:08:50,143][00372] Fps is (10 sec: 2868.8, 60 sec: 3140.5, 300 sec: 3235.1). Total num frames: 1478656. Throughput: 0: 794.6. Samples: 368568. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:08:50,145][00372] Avg episode reward: [(0, '4.648')] [2023-03-14 14:08:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1499136. Throughput: 0: 789.3. Samples: 374550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:08:55,145][00372] Avg episode reward: [(0, '4.734')] [2023-03-14 14:08:58,167][13200] Updated weights for policy 0, policy_version 370 (0.0012) [2023-03-14 14:09:00,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1519616. Throughput: 0: 829.0. Samples: 380250. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:00,147][00372] Avg episode reward: [(0, '4.794')] [2023-03-14 14:09:05,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1531904. Throughput: 0: 828.8. Samples: 382118. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:09:05,151][00372] Avg episode reward: [(0, '4.553')] [2023-03-14 14:09:10,145][00372] Fps is (10 sec: 2457.0, 60 sec: 3140.1, 300 sec: 3249.0). Total num frames: 1544192. Throughput: 0: 820.9. Samples: 385880. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:09:10,149][00372] Avg episode reward: [(0, '4.453')] [2023-03-14 14:09:13,941][13200] Updated weights for policy 0, policy_version 380 (0.0026) [2023-03-14 14:09:15,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 1560576. Throughput: 0: 782.5. Samples: 390218. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:15,146][00372] Avg episode reward: [(0, '4.338')] [2023-03-14 14:09:20,143][00372] Fps is (10 sec: 3687.3, 60 sec: 3346.0, 300 sec: 3249.0). Total num frames: 1581056. Throughput: 0: 782.7. Samples: 393278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:20,150][00372] Avg episode reward: [(0, '4.581')] [2023-03-14 14:09:24,412][13200] Updated weights for policy 0, policy_version 390 (0.0013) [2023-03-14 14:09:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 1597440. Throughput: 0: 814.5. Samples: 399108. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:09:25,148][00372] Avg episode reward: [(0, '4.716')] [2023-03-14 14:09:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1609728. Throughput: 0: 810.7. Samples: 403020. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:09:30,150][00372] Avg episode reward: [(0, '4.786')] [2023-03-14 14:09:35,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 1622016. Throughput: 0: 809.0. Samples: 404972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:35,150][00372] Avg episode reward: [(0, '4.711')] [2023-03-14 14:09:39,256][13200] Updated weights for policy 0, policy_version 400 (0.0016) [2023-03-14 14:09:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.6, 300 sec: 3235.1). Total num frames: 1638400. Throughput: 0: 774.8. Samples: 409414. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:40,152][00372] Avg episode reward: [(0, '4.627')] [2023-03-14 14:09:45,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 1658880. Throughput: 0: 784.7. Samples: 415560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:45,151][00372] Avg episode reward: [(0, '4.710')] [2023-03-14 14:09:50,144][00372] Fps is (10 sec: 3685.8, 60 sec: 3276.7, 300 sec: 3235.1). Total num frames: 1675264. Throughput: 0: 808.9. Samples: 418518. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:09:50,146][00372] Avg episode reward: [(0, '4.747')] [2023-03-14 14:09:50,488][13200] Updated weights for policy 0, policy_version 410 (0.0020) [2023-03-14 14:09:55,145][00372] Fps is (10 sec: 2866.4, 60 sec: 3140.1, 300 sec: 3235.1). Total num frames: 1687552. Throughput: 0: 809.8. Samples: 422320. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:09:55,154][00372] Avg episode reward: [(0, '4.852')] [2023-03-14 14:10:00,144][00372] Fps is (10 sec: 2867.3, 60 sec: 3071.9, 300 sec: 3249.0). Total num frames: 1703936. Throughput: 0: 800.4. Samples: 426238. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:10:00,153][00372] Avg episode reward: [(0, '4.884')] [2023-03-14 14:10:04,502][13200] Updated weights for policy 0, policy_version 420 (0.0039) [2023-03-14 14:10:05,143][00372] Fps is (10 sec: 3277.6, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 1720320. Throughput: 0: 783.8. Samples: 428548. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:10:05,145][00372] Avg episode reward: [(0, '5.060')] [2023-03-14 14:10:10,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.9, 300 sec: 3221.3). Total num frames: 1740800. Throughput: 0: 793.3. Samples: 434808. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:10:10,150][00372] Avg episode reward: [(0, '5.365')] [2023-03-14 14:10:10,164][13187] Saving new best policy, reward=5.365! [2023-03-14 14:10:15,149][00372] Fps is (10 sec: 3684.0, 60 sec: 3276.4, 300 sec: 3235.1). Total num frames: 1757184. Throughput: 0: 824.9. Samples: 440144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:10:15,154][00372] Avg episode reward: [(0, '5.472')] [2023-03-14 14:10:15,159][13187] Saving new best policy, reward=5.472! [2023-03-14 14:10:15,539][13200] Updated weights for policy 0, policy_version 430 (0.0019) [2023-03-14 14:10:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 1773568. Throughput: 0: 825.9. Samples: 442138. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:10:20,151][00372] Avg episode reward: [(0, '5.414')] [2023-03-14 14:10:25,144][00372] Fps is (10 sec: 2868.8, 60 sec: 3140.2, 300 sec: 3235.1). Total num frames: 1785856. Throughput: 0: 818.8. Samples: 446260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:10:25,152][00372] Avg episode reward: [(0, '5.449')] [2023-03-14 14:10:29,073][13200] Updated weights for policy 0, policy_version 440 (0.0048) [2023-03-14 14:10:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). 
Total num frames: 1806336. Throughput: 0: 799.4. Samples: 451534. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:10:30,145][00372] Avg episode reward: [(0, '5.195')] [2023-03-14 14:10:30,160][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000441_1806336.pth... [2023-03-14 14:10:30,282][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000253_1036288.pth [2023-03-14 14:10:35,143][00372] Fps is (10 sec: 4096.4, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 1826816. Throughput: 0: 802.2. Samples: 454616. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:10:35,145][00372] Avg episode reward: [(0, '5.015')] [2023-03-14 14:10:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1839104. Throughput: 0: 840.2. Samples: 460128. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:10:40,146][00372] Avg episode reward: [(0, '5.231')] [2023-03-14 14:10:40,331][13200] Updated weights for policy 0, policy_version 450 (0.0020) [2023-03-14 14:10:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 1855488. Throughput: 0: 840.3. Samples: 464052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:10:45,147][00372] Avg episode reward: [(0, '5.159')] [2023-03-14 14:10:50,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.6, 300 sec: 3235.2). Total num frames: 1867776. Throughput: 0: 832.3. Samples: 466004. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:10:50,146][00372] Avg episode reward: [(0, '5.252')] [2023-03-14 14:10:54,323][13200] Updated weights for policy 0, policy_version 460 (0.0059) [2023-03-14 14:10:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.9, 300 sec: 3235.1). Total num frames: 1884160. Throughput: 0: 797.2. Samples: 470682. 
Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:10:55,145][00372] Avg episode reward: [(0, '5.608')] [2023-03-14 14:10:55,154][13187] Saving new best policy, reward=5.608! [2023-03-14 14:11:00,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 1904640. Throughput: 0: 811.7. Samples: 476666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:11:00,148][00372] Avg episode reward: [(0, '5.539')] [2023-03-14 14:11:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 1921024. Throughput: 0: 825.4. Samples: 479280. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:11:05,152][00372] Avg episode reward: [(0, '5.431')] [2023-03-14 14:11:06,456][13200] Updated weights for policy 0, policy_version 470 (0.0019) [2023-03-14 14:11:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 1933312. Throughput: 0: 818.1. Samples: 483074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:11:10,149][00372] Avg episode reward: [(0, '5.626')] [2023-03-14 14:11:10,166][13187] Saving new best policy, reward=5.626! [2023-03-14 14:11:15,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.6, 300 sec: 3235.1). Total num frames: 1945600. Throughput: 0: 787.8. Samples: 486984. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:11:15,151][00372] Avg episode reward: [(0, '5.505')] [2023-03-14 14:11:19,713][13200] Updated weights for policy 0, policy_version 480 (0.0061) [2023-03-14 14:11:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 1966080. Throughput: 0: 775.3. Samples: 489506. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:11:20,146][00372] Avg episode reward: [(0, '5.250')] [2023-03-14 14:11:25,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3235.2). Total num frames: 1986560. Throughput: 0: 792.1. Samples: 495774. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:11:25,150][00372] Avg episode reward: [(0, '5.645')] [2023-03-14 14:11:25,157][13187] Saving new best policy, reward=5.645! [2023-03-14 14:11:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 1998848. Throughput: 0: 811.6. Samples: 500574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:11:30,145][00372] Avg episode reward: [(0, '6.080')] [2023-03-14 14:11:30,161][13187] Saving new best policy, reward=6.080! [2023-03-14 14:11:32,139][13200] Updated weights for policy 0, policy_version 490 (0.0021) [2023-03-14 14:11:35,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2011136. Throughput: 0: 811.4. Samples: 502516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:11:35,149][00372] Avg episode reward: [(0, '6.293')] [2023-03-14 14:11:35,195][13187] Saving new best policy, reward=6.293! [2023-03-14 14:11:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2027520. Throughput: 0: 797.3. Samples: 506560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:11:40,146][00372] Avg episode reward: [(0, '6.281')] [2023-03-14 14:11:44,968][13200] Updated weights for policy 0, policy_version 500 (0.0021) [2023-03-14 14:11:45,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 2048000. Throughput: 0: 784.7. Samples: 511978. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:11:45,146][00372] Avg episode reward: [(0, '6.643')] [2023-03-14 14:11:45,148][13187] Saving new best policy, reward=6.643! [2023-03-14 14:11:50,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 2068480. Throughput: 0: 794.0. Samples: 515012. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:11:50,146][00372] Avg episode reward: [(0, '6.715')] [2023-03-14 14:11:50,160][13187] Saving new best policy, reward=6.715! [2023-03-14 14:11:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3235.2). Total num frames: 2080768. Throughput: 0: 823.5. Samples: 520130. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:11:55,147][00372] Avg episode reward: [(0, '6.622')] [2023-03-14 14:11:57,214][13200] Updated weights for policy 0, policy_version 510 (0.0023) [2023-03-14 14:12:00,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2093056. Throughput: 0: 821.5. Samples: 523950. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:12:00,150][00372] Avg episode reward: [(0, '6.371')] [2023-03-14 14:12:05,147][00372] Fps is (10 sec: 2456.6, 60 sec: 3071.8, 300 sec: 3221.2). Total num frames: 2105344. Throughput: 0: 809.3. Samples: 525928. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:12:05,150][00372] Avg episode reward: [(0, '6.637')] [2023-03-14 14:12:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2125824. Throughput: 0: 784.6. Samples: 531080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:12:10,146][00372] Avg episode reward: [(0, '6.483')] [2023-03-14 14:12:10,391][13200] Updated weights for policy 0, policy_version 520 (0.0041) [2023-03-14 14:12:15,143][00372] Fps is (10 sec: 4097.7, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2146304. Throughput: 0: 813.4. Samples: 537178. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:12:15,145][00372] Avg episode reward: [(0, '6.937')] [2023-03-14 14:12:15,168][13187] Saving new best policy, reward=6.937! [2023-03-14 14:12:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2162688. Throughput: 0: 820.3. Samples: 539430. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:12:20,156][00372] Avg episode reward: [(0, '7.011')] [2023-03-14 14:12:20,174][13187] Saving new best policy, reward=7.011! [2023-03-14 14:12:23,128][13200] Updated weights for policy 0, policy_version 530 (0.0017) [2023-03-14 14:12:25,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3235.1). Total num frames: 2174976. Throughput: 0: 815.8. Samples: 543270. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2023-03-14 14:12:25,146][00372] Avg episode reward: [(0, '7.531')] [2023-03-14 14:12:25,154][13187] Saving new best policy, reward=7.531! [2023-03-14 14:12:30,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2187264. Throughput: 0: 784.0. Samples: 547260. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:12:30,150][00372] Avg episode reward: [(0, '7.091')] [2023-03-14 14:12:30,172][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000534_2187264.pth... [2023-03-14 14:12:30,321][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000347_1421312.pth [2023-03-14 14:12:35,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2207744. Throughput: 0: 779.1. Samples: 550070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:12:35,151][00372] Avg episode reward: [(0, '6.819')] [2023-03-14 14:12:35,842][13200] Updated weights for policy 0, policy_version 540 (0.0036) [2023-03-14 14:12:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2228224. Throughput: 0: 801.2. Samples: 556184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:12:40,145][00372] Avg episode reward: [(0, '6.809')] [2023-03-14 14:12:45,143][00372] Fps is (10 sec: 3276.6, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2240512. Throughput: 0: 817.4. Samples: 560734. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:12:45,145][00372] Avg episode reward: [(0, '6.777')] [2023-03-14 14:12:49,019][13200] Updated weights for policy 0, policy_version 550 (0.0034) [2023-03-14 14:12:50,145][00372] Fps is (10 sec: 2457.2, 60 sec: 3071.9, 300 sec: 3221.2). Total num frames: 2252800. Throughput: 0: 818.3. Samples: 562750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:12:50,152][00372] Avg episode reward: [(0, '7.140')] [2023-03-14 14:12:55,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2269184. Throughput: 0: 795.0. Samples: 566854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:12:55,145][00372] Avg episode reward: [(0, '7.639')] [2023-03-14 14:12:55,151][13187] Saving new best policy, reward=7.639! [2023-03-14 14:13:00,143][00372] Fps is (10 sec: 3687.1, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2289664. Throughput: 0: 782.2. Samples: 572378. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:13:00,151][00372] Avg episode reward: [(0, '7.805')] [2023-03-14 14:13:00,166][13187] Saving new best policy, reward=7.805! [2023-03-14 14:13:01,173][13200] Updated weights for policy 0, policy_version 560 (0.0022) [2023-03-14 14:13:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.6, 300 sec: 3235.1). Total num frames: 2310144. Throughput: 0: 800.8. Samples: 575468. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:13:05,145][00372] Avg episode reward: [(0, '8.309')] [2023-03-14 14:13:05,148][13187] Saving new best policy, reward=8.309! [2023-03-14 14:13:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2322432. Throughput: 0: 821.2. Samples: 580226. 
Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:13:10,154][00372] Avg episode reward: [(0, '8.103')] [2023-03-14 14:13:14,629][13200] Updated weights for policy 0, policy_version 570 (0.0037) [2023-03-14 14:13:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3235.3). Total num frames: 2334720. Throughput: 0: 819.7. Samples: 584148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:13:15,148][00372] Avg episode reward: [(0, '7.802')] [2023-03-14 14:13:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2347008. Throughput: 0: 801.4. Samples: 586132. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:13:20,152][00372] Avg episode reward: [(0, '7.974')] [2023-03-14 14:13:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2367488. Throughput: 0: 786.7. Samples: 591586. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:13:25,145][00372] Avg episode reward: [(0, '7.424')] [2023-03-14 14:13:26,236][13200] Updated weights for policy 0, policy_version 580 (0.0013) [2023-03-14 14:13:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2387968. Throughput: 0: 818.4. Samples: 597560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:13:30,147][00372] Avg episode reward: [(0, '7.896')] [2023-03-14 14:13:35,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2400256. Throughput: 0: 817.4. Samples: 599530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:13:35,145][00372] Avg episode reward: [(0, '7.798')] [2023-03-14 14:13:40,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2412544. Throughput: 0: 811.7. Samples: 603382. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:13:40,152][00372] Avg episode reward: [(0, '7.567')] [2023-03-14 14:13:40,783][13200] Updated weights for policy 0, policy_version 590 (0.0026) [2023-03-14 14:13:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2428928. Throughput: 0: 782.0. Samples: 607570. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:13:45,152][00372] Avg episode reward: [(0, '7.653')] [2023-03-14 14:13:50,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.9, 300 sec: 3221.3). Total num frames: 2449408. Throughput: 0: 780.6. Samples: 610596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:13:50,145][00372] Avg episode reward: [(0, '7.783')] [2023-03-14 14:13:51,853][13200] Updated weights for policy 0, policy_version 600 (0.0023) [2023-03-14 14:13:55,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3207.4). Total num frames: 2465792. Throughput: 0: 809.9. Samples: 616670. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:13:55,147][00372] Avg episode reward: [(0, '8.113')] [2023-03-14 14:14:00,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2482176. Throughput: 0: 813.3. Samples: 620748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:14:00,148][00372] Avg episode reward: [(0, '8.021')] [2023-03-14 14:14:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2494464. Throughput: 0: 811.6. Samples: 622656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:14:05,147][00372] Avg episode reward: [(0, '7.975')] [2023-03-14 14:14:06,251][13200] Updated weights for policy 0, policy_version 610 (0.0014) [2023-03-14 14:14:10,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3072.0, 300 sec: 3207.4). Total num frames: 2506752. Throughput: 0: 775.3. Samples: 626474. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:10,145][00372] Avg episode reward: [(0, '7.801')] [2023-03-14 14:14:15,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 2527232. Throughput: 0: 770.5. Samples: 632232. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:15,150][00372] Avg episode reward: [(0, '7.768')] [2023-03-14 14:14:17,647][13200] Updated weights for policy 0, policy_version 620 (0.0012) [2023-03-14 14:14:20,150][00372] Fps is (10 sec: 4093.0, 60 sec: 3344.7, 300 sec: 3221.2). Total num frames: 2547712. Throughput: 0: 793.7. Samples: 635254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:20,153][00372] Avg episode reward: [(0, '7.218')] [2023-03-14 14:14:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2560000. Throughput: 0: 807.1. Samples: 639700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:25,146][00372] Avg episode reward: [(0, '7.239')] [2023-03-14 14:14:30,143][00372] Fps is (10 sec: 2459.4, 60 sec: 3072.0, 300 sec: 3221.3). Total num frames: 2572288. Throughput: 0: 804.0. Samples: 643748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:14:30,148][00372] Avg episode reward: [(0, '7.537')] [2023-03-14 14:14:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth... [2023-03-14 14:14:30,367][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000441_1806336.pth [2023-03-14 14:14:32,324][13200] Updated weights for policy 0, policy_version 630 (0.0024) [2023-03-14 14:14:35,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2588672. Throughput: 0: 779.8. Samples: 645686. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:35,146][00372] Avg episode reward: [(0, '8.336')] [2023-03-14 14:14:35,154][13187] Saving new best policy, reward=8.336! 
[2023-03-14 14:14:40,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2609152. Throughput: 0: 778.6. Samples: 651706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:40,146][00372] Avg episode reward: [(0, '8.934')] [2023-03-14 14:14:40,156][13187] Saving new best policy, reward=8.934! [2023-03-14 14:14:42,454][13200] Updated weights for policy 0, policy_version 640 (0.0023) [2023-03-14 14:14:45,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3235.2). Total num frames: 2629632. Throughput: 0: 818.2. Samples: 657566. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:14:45,145][00372] Avg episode reward: [(0, '9.972')] [2023-03-14 14:14:45,147][13187] Saving new best policy, reward=9.972! [2023-03-14 14:14:50,144][00372] Fps is (10 sec: 3276.5, 60 sec: 3208.5, 300 sec: 3235.2). Total num frames: 2641920. Throughput: 0: 818.4. Samples: 659484. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:14:50,146][00372] Avg episode reward: [(0, '9.165')] [2023-03-14 14:14:55,145][00372] Fps is (10 sec: 2457.0, 60 sec: 3140.1, 300 sec: 3221.2). Total num frames: 2654208. Throughput: 0: 820.0. Samples: 663376. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:14:55,150][00372] Avg episode reward: [(0, '10.121')] [2023-03-14 14:14:55,152][13187] Saving new best policy, reward=10.121! [2023-03-14 14:14:57,564][13200] Updated weights for policy 0, policy_version 650 (0.0021) [2023-03-14 14:15:00,143][00372] Fps is (10 sec: 2867.5, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2670592. Throughput: 0: 793.1. Samples: 667922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:15:00,146][00372] Avg episode reward: [(0, '9.747')] [2023-03-14 14:15:05,143][00372] Fps is (10 sec: 3687.3, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2691072. Throughput: 0: 795.0. Samples: 671022. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:05,148][00372] Avg episode reward: [(0, '9.345')] [2023-03-14 14:15:07,570][13200] Updated weights for policy 0, policy_version 660 (0.0014) [2023-03-14 14:15:10,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2707456. Throughput: 0: 833.3. Samples: 677198. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:15:10,154][00372] Avg episode reward: [(0, '9.946')] [2023-03-14 14:15:15,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3221.3). Total num frames: 2723840. Throughput: 0: 832.4. Samples: 681204. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:15:15,146][00372] Avg episode reward: [(0, '9.821')] [2023-03-14 14:15:20,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.7, 300 sec: 3221.3). Total num frames: 2736128. Throughput: 0: 834.7. Samples: 683248. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:20,151][00372] Avg episode reward: [(0, '10.564')] [2023-03-14 14:15:20,163][13187] Saving new best policy, reward=10.564! [2023-03-14 14:15:22,388][13200] Updated weights for policy 0, policy_version 670 (0.0018) [2023-03-14 14:15:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3208.5, 300 sec: 3207.4). Total num frames: 2752512. Throughput: 0: 798.4. Samples: 687636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:15:25,148][00372] Avg episode reward: [(0, '11.390')] [2023-03-14 14:15:25,154][13187] Saving new best policy, reward=11.390! [2023-03-14 14:15:30,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3207.4). Total num frames: 2772992. Throughput: 0: 809.2. Samples: 693982. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:15:30,145][00372] Avg episode reward: [(0, '11.470')] [2023-03-14 14:15:30,168][13187] Saving new best policy, reward=11.470! 
[2023-03-14 14:15:32,367][13200] Updated weights for policy 0, policy_version 680 (0.0028) [2023-03-14 14:15:35,146][00372] Fps is (10 sec: 3685.2, 60 sec: 3344.9, 300 sec: 3221.2). Total num frames: 2789376. Throughput: 0: 834.0. Samples: 697018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:35,149][00372] Avg episode reward: [(0, '12.260')] [2023-03-14 14:15:35,154][13187] Saving new best policy, reward=12.260! [2023-03-14 14:15:40,150][00372] Fps is (10 sec: 3274.4, 60 sec: 3276.4, 300 sec: 3221.2). Total num frames: 2805760. Throughput: 0: 835.1. Samples: 700960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:40,158][00372] Avg episode reward: [(0, '13.049')] [2023-03-14 14:15:40,172][13187] Saving new best policy, reward=13.049! [2023-03-14 14:15:45,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3140.3, 300 sec: 3221.3). Total num frames: 2818048. Throughput: 0: 824.0. Samples: 705000. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:45,147][00372] Avg episode reward: [(0, '13.164')] [2023-03-14 14:15:45,151][13187] Saving new best policy, reward=13.164! [2023-03-14 14:15:47,485][13200] Updated weights for policy 0, policy_version 690 (0.0026) [2023-03-14 14:15:50,143][00372] Fps is (10 sec: 2869.3, 60 sec: 3208.6, 300 sec: 3221.3). Total num frames: 2834432. Throughput: 0: 805.2. Samples: 707258. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:15:50,146][00372] Avg episode reward: [(0, '13.988')] [2023-03-14 14:15:50,154][13187] Saving new best policy, reward=13.988! [2023-03-14 14:15:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3221.3). Total num frames: 2854912. Throughput: 0: 806.6. Samples: 713496. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:15:55,145][00372] Avg episode reward: [(0, '14.562')] [2023-03-14 14:15:55,151][13187] Saving new best policy, reward=14.562! 
[2023-03-14 14:15:57,624][13200] Updated weights for policy 0, policy_version 700 (0.0047) [2023-03-14 14:16:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3221.3). Total num frames: 2871296. Throughput: 0: 832.6. Samples: 718670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-03-14 14:16:00,148][00372] Avg episode reward: [(0, '13.817')] [2023-03-14 14:16:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3221.3). Total num frames: 2883584. Throughput: 0: 831.1. Samples: 720648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:16:05,148][00372] Avg episode reward: [(0, '14.462')] [2023-03-14 14:16:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3235.1). Total num frames: 2899968. Throughput: 0: 824.2. Samples: 724724. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:16:10,145][00372] Avg episode reward: [(0, '14.085')] [2023-03-14 14:16:12,285][13200] Updated weights for policy 0, policy_version 710 (0.0031) [2023-03-14 14:16:15,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3235.1). Total num frames: 2920448. Throughput: 0: 799.3. Samples: 729950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:15,149][00372] Avg episode reward: [(0, '13.695')] [2023-03-14 14:16:20,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 2940928. Throughput: 0: 800.9. Samples: 733058. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:20,146][00372] Avg episode reward: [(0, '14.497')] [2023-03-14 14:16:22,310][13200] Updated weights for policy 0, policy_version 720 (0.0013) [2023-03-14 14:16:25,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 2953216. Throughput: 0: 835.8. Samples: 738566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:25,148][00372] Avg episode reward: [(0, '14.419')] [2023-03-14 14:16:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). 
Total num frames: 2969600. Throughput: 0: 835.9. Samples: 742614. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:16:30,150][00372] Avg episode reward: [(0, '13.953')] [2023-03-14 14:16:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000725_2969600.pth... [2023-03-14 14:16:30,331][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000534_2187264.pth [2023-03-14 14:16:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.7, 300 sec: 3235.1). Total num frames: 2981888. Throughput: 0: 829.1. Samples: 744568. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:35,147][00372] Avg episode reward: [(0, '14.644')] [2023-03-14 14:16:35,149][13187] Saving new best policy, reward=14.644! [2023-03-14 14:16:37,150][13200] Updated weights for policy 0, policy_version 730 (0.0015) [2023-03-14 14:16:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3277.2, 300 sec: 3235.1). Total num frames: 3002368. Throughput: 0: 801.2. Samples: 749550. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:40,145][00372] Avg episode reward: [(0, '15.058')] [2023-03-14 14:16:40,156][13187] Saving new best policy, reward=15.058! [2023-03-14 14:16:45,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3235.1). Total num frames: 3022848. Throughput: 0: 827.0. Samples: 755886. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:45,146][00372] Avg episode reward: [(0, '15.576')] [2023-03-14 14:16:45,152][13187] Saving new best policy, reward=15.576! [2023-03-14 14:16:47,228][13200] Updated weights for policy 0, policy_version 740 (0.0025) [2023-03-14 14:16:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3235.1). Total num frames: 3035136. Throughput: 0: 838.7. Samples: 758390. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:50,147][00372] Avg episode reward: [(0, '15.567')] [2023-03-14 14:16:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3051520. Throughput: 0: 838.9. Samples: 762476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:16:55,147][00372] Avg episode reward: [(0, '15.279')] [2023-03-14 14:17:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 3063808. Throughput: 0: 815.1. Samples: 766630. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:17:00,146][00372] Avg episode reward: [(0, '15.984')] [2023-03-14 14:17:00,168][13187] Saving new best policy, reward=15.984! [2023-03-14 14:17:01,939][13200] Updated weights for policy 0, policy_version 750 (0.0013) [2023-03-14 14:17:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 3084288. Throughput: 0: 801.5. Samples: 769124. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:17:05,146][00372] Avg episode reward: [(0, '15.766')] [2023-03-14 14:17:10,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3104768. Throughput: 0: 819.0. Samples: 775420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:17:10,149][00372] Avg episode reward: [(0, '15.274')] [2023-03-14 14:17:11,876][13200] Updated weights for policy 0, policy_version 760 (0.0013) [2023-03-14 14:17:15,144][00372] Fps is (10 sec: 3685.8, 60 sec: 3345.0, 300 sec: 3249.0). Total num frames: 3121152. Throughput: 0: 842.0. Samples: 780506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:17:15,151][00372] Avg episode reward: [(0, '15.076')] [2023-03-14 14:17:20,144][00372] Fps is (10 sec: 2866.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 3133440. Throughput: 0: 841.6. Samples: 782440. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:17:20,147][00372] Avg episode reward: [(0, '15.162')] [2023-03-14 14:17:25,147][00372] Fps is (10 sec: 2457.0, 60 sec: 3208.3, 300 sec: 3249.0). Total num frames: 3145728. Throughput: 0: 819.4. Samples: 786424. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:17:25,152][00372] Avg episode reward: [(0, '14.169')] [2023-03-14 14:17:26,724][13200] Updated weights for policy 0, policy_version 770 (0.0013) [2023-03-14 14:17:30,143][00372] Fps is (10 sec: 3277.2, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3166208. Throughput: 0: 803.1. Samples: 792024. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:17:30,150][00372] Avg episode reward: [(0, '15.131')] [2023-03-14 14:17:35,143][00372] Fps is (10 sec: 4097.7, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3186688. Throughput: 0: 814.7. Samples: 795052. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:17:35,148][00372] Avg episode reward: [(0, '16.771')] [2023-03-14 14:17:35,153][13187] Saving new best policy, reward=16.771! [2023-03-14 14:17:37,258][13200] Updated weights for policy 0, policy_version 780 (0.0018) [2023-03-14 14:17:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3198976. Throughput: 0: 834.9. Samples: 800046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:17:40,145][00372] Avg episode reward: [(0, '17.101')] [2023-03-14 14:17:40,261][13187] Saving new best policy, reward=17.101! [2023-03-14 14:17:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3215360. Throughput: 0: 830.4. Samples: 803998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:17:45,145][00372] Avg episode reward: [(0, '17.761')] [2023-03-14 14:17:45,147][13187] Saving new best policy, reward=17.761! [2023-03-14 14:17:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 3227648. 
Throughput: 0: 820.0. Samples: 806026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:17:50,146][00372] Avg episode reward: [(0, '18.537')] [2023-03-14 14:17:50,164][13187] Saving new best policy, reward=18.537! [2023-03-14 14:17:51,979][13200] Updated weights for policy 0, policy_version 790 (0.0043) [2023-03-14 14:17:55,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3248128. Throughput: 0: 793.9. Samples: 811144. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:17:55,145][00372] Avg episode reward: [(0, '19.508')] [2023-03-14 14:17:55,149][13187] Saving new best policy, reward=19.508! [2023-03-14 14:18:00,143][00372] Fps is (10 sec: 4095.9, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 3268608. Throughput: 0: 823.3. Samples: 817554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:18:00,145][00372] Avg episode reward: [(0, '18.626')] [2023-03-14 14:18:01,970][13200] Updated weights for policy 0, policy_version 800 (0.0028) [2023-03-14 14:18:05,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 3280896. Throughput: 0: 834.7. Samples: 820002. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:18:05,150][00372] Avg episode reward: [(0, '18.513')] [2023-03-14 14:18:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3297280. Throughput: 0: 833.5. Samples: 823930. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:18:10,149][00372] Avg episode reward: [(0, '19.400')] [2023-03-14 14:18:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 3309568. Throughput: 0: 802.0. Samples: 828114. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:18:15,148][00372] Avg episode reward: [(0, '20.029')] [2023-03-14 14:18:15,153][13187] Saving new best policy, reward=20.029! 
[2023-03-14 14:18:16,501][13200] Updated weights for policy 0, policy_version 810 (0.0030) [2023-03-14 14:18:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3262.9). Total num frames: 3330048. Throughput: 0: 795.9. Samples: 830868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:18:20,148][00372] Avg episode reward: [(0, '18.905')] [2023-03-14 14:18:25,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.6, 300 sec: 3262.9). Total num frames: 3350528. Throughput: 0: 825.3. Samples: 837184. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:18:25,146][00372] Avg episode reward: [(0, '18.714')] [2023-03-14 14:18:27,160][13200] Updated weights for policy 0, policy_version 820 (0.0030) [2023-03-14 14:18:30,146][00372] Fps is (10 sec: 3685.1, 60 sec: 3344.9, 300 sec: 3276.8). Total num frames: 3366912. Throughput: 0: 843.9. Samples: 841978. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:18:30,149][00372] Avg episode reward: [(0, '18.459')] [2023-03-14 14:18:30,166][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000822_3366912.pth... [2023-03-14 14:18:30,347][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000628_2572288.pth [2023-03-14 14:18:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3379200. Throughput: 0: 840.4. Samples: 843842. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:18:35,147][00372] Avg episode reward: [(0, '17.178')] [2023-03-14 14:18:40,143][00372] Fps is (10 sec: 2458.5, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3391488. Throughput: 0: 817.7. Samples: 847942. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:18:40,145][00372] Avg episode reward: [(0, '15.610')] [2023-03-14 14:18:41,517][13200] Updated weights for policy 0, policy_version 830 (0.0034) [2023-03-14 14:18:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). 
Total num frames: 3411968. Throughput: 0: 803.0. Samples: 853690. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:18:45,146][00372] Avg episode reward: [(0, '16.384')] [2023-03-14 14:18:50,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 3432448. Throughput: 0: 817.2. Samples: 856778. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:18:50,145][00372] Avg episode reward: [(0, '17.387')] [2023-03-14 14:18:51,883][13200] Updated weights for policy 0, policy_version 840 (0.0021) [2023-03-14 14:18:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3448832. Throughput: 0: 839.2. Samples: 861696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:18:55,150][00372] Avg episode reward: [(0, '17.480')] [2023-03-14 14:19:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3461120. Throughput: 0: 836.4. Samples: 865752. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:19:00,145][00372] Avg episode reward: [(0, '17.248')] [2023-03-14 14:19:05,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3473408. Throughput: 0: 820.6. Samples: 867794. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:19:05,145][00372] Avg episode reward: [(0, '17.249')] [2023-03-14 14:19:06,500][13200] Updated weights for policy 0, policy_version 850 (0.0031) [2023-03-14 14:19:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3493888. Throughput: 0: 794.5. Samples: 872938. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:19:10,145][00372] Avg episode reward: [(0, '18.821')] [2023-03-14 14:19:15,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3276.9). Total num frames: 3514368. Throughput: 0: 823.5. Samples: 879034. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:15,145][00372] Avg episode reward: [(0, '19.260')] [2023-03-14 14:19:17,585][13200] Updated weights for policy 0, policy_version 860 (0.0013) [2023-03-14 14:19:20,145][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3526656. Throughput: 0: 828.3. Samples: 881116. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:19:20,147][00372] Avg episode reward: [(0, '19.241')] [2023-03-14 14:19:25,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3538944. Throughput: 0: 824.5. Samples: 885044. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:19:25,146][00372] Avg episode reward: [(0, '19.037')] [2023-03-14 14:19:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.5, 300 sec: 3276.8). Total num frames: 3555328. Throughput: 0: 790.5. Samples: 889262. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:19:30,148][00372] Avg episode reward: [(0, '19.234')] [2023-03-14 14:19:31,699][13200] Updated weights for policy 0, policy_version 870 (0.0022) [2023-03-14 14:19:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3575808. Throughput: 0: 790.8. Samples: 892364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:35,145][00372] Avg episode reward: [(0, '18.772')] [2023-03-14 14:19:40,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3276.8). Total num frames: 3596288. Throughput: 0: 819.4. Samples: 898568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:40,149][00372] Avg episode reward: [(0, '18.643')] [2023-03-14 14:19:42,605][13200] Updated weights for policy 0, policy_version 880 (0.0023) [2023-03-14 14:19:45,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3608576. Throughput: 0: 825.9. Samples: 902916. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:45,153][00372] Avg episode reward: [(0, '18.709')] [2023-03-14 14:19:50,144][00372] Fps is (10 sec: 2457.4, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3620864. Throughput: 0: 824.7. Samples: 904908. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:50,148][00372] Avg episode reward: [(0, '19.604')] [2023-03-14 14:19:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3637248. Throughput: 0: 802.7. Samples: 909058. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:19:55,145][00372] Avg episode reward: [(0, '19.484')] [2023-03-14 14:19:56,598][13200] Updated weights for policy 0, policy_version 890 (0.0029) [2023-03-14 14:20:00,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3657728. Throughput: 0: 800.4. Samples: 915054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:20:00,152][00372] Avg episode reward: [(0, '20.453')] [2023-03-14 14:20:00,165][13187] Saving new best policy, reward=20.453! [2023-03-14 14:20:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 3678208. Throughput: 0: 822.6. Samples: 918134. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:20:05,147][00372] Avg episode reward: [(0, '20.899')] [2023-03-14 14:20:05,149][13187] Saving new best policy, reward=20.899! [2023-03-14 14:20:08,041][13200] Updated weights for policy 0, policy_version 900 (0.0016) [2023-03-14 14:20:10,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3690496. Throughput: 0: 833.9. Samples: 922568. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:20:10,149][00372] Avg episode reward: [(0, '21.631')] [2023-03-14 14:20:10,162][13187] Saving new best policy, reward=21.631! [2023-03-14 14:20:15,147][00372] Fps is (10 sec: 2456.8, 60 sec: 3140.1, 300 sec: 3276.8). Total num frames: 3702784. 
Throughput: 0: 825.9. Samples: 926430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:20:15,151][00372] Avg episode reward: [(0, '20.911')] [2023-03-14 14:20:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3719168. Throughput: 0: 800.8. Samples: 928402. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:20:20,145][00372] Avg episode reward: [(0, '21.338')] [2023-03-14 14:20:21,885][13200] Updated weights for policy 0, policy_version 910 (0.0041) [2023-03-14 14:20:25,143][00372] Fps is (10 sec: 3687.5, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3739648. Throughput: 0: 790.2. Samples: 934128. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:20:25,145][00372] Avg episode reward: [(0, '20.509')] [2023-03-14 14:20:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3290.7). Total num frames: 3760128. Throughput: 0: 828.3. Samples: 940190. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:20:30,156][00372] Avg episode reward: [(0, '20.659')] [2023-03-14 14:20:30,168][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000918_3760128.pth... [2023-03-14 14:20:30,304][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000725_2969600.pth [2023-03-14 14:20:33,133][13200] Updated weights for policy 0, policy_version 920 (0.0013) [2023-03-14 14:20:35,144][00372] Fps is (10 sec: 3276.3, 60 sec: 3276.7, 300 sec: 3276.9). Total num frames: 3772416. Throughput: 0: 826.9. Samples: 942120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:20:35,147][00372] Avg episode reward: [(0, '20.531')] [2023-03-14 14:20:40,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3784704. Throughput: 0: 824.0. Samples: 946138. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:20:40,151][00372] Avg episode reward: [(0, '19.511')] [2023-03-14 14:20:45,144][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 3801088. Throughput: 0: 791.8. Samples: 950684. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:20:45,148][00372] Avg episode reward: [(0, '19.759')] [2023-03-14 14:20:46,700][13200] Updated weights for policy 0, policy_version 930 (0.0034) [2023-03-14 14:20:50,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3821568. Throughput: 0: 792.1. Samples: 953780. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:20:50,146][00372] Avg episode reward: [(0, '19.582')] [2023-03-14 14:20:55,143][00372] Fps is (10 sec: 3687.0, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3837952. Throughput: 0: 830.6. Samples: 959946. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:20:55,149][00372] Avg episode reward: [(0, '19.445')] [2023-03-14 14:20:58,397][13200] Updated weights for policy 0, policy_version 940 (0.0018) [2023-03-14 14:21:00,145][00372] Fps is (10 sec: 3276.1, 60 sec: 3276.7, 300 sec: 3290.7). Total num frames: 3854336. Throughput: 0: 834.1. Samples: 963964. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:21:00,151][00372] Avg episode reward: [(0, '19.360')] [2023-03-14 14:21:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3276.8). Total num frames: 3866624. Throughput: 0: 834.4. Samples: 965948. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:21:05,147][00372] Avg episode reward: [(0, '19.113')] [2023-03-14 14:21:10,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 3883008. Throughput: 0: 802.1. Samples: 970222. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:10,145][00372] Avg episode reward: [(0, '18.789')] [2023-03-14 14:21:11,477][13200] Updated weights for policy 0, policy_version 950 (0.0036) [2023-03-14 14:21:15,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 3903488. Throughput: 0: 807.1. Samples: 976508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:15,146][00372] Avg episode reward: [(0, '18.265')] [2023-03-14 14:21:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 3919872. Throughput: 0: 834.8. Samples: 979684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:20,145][00372] Avg episode reward: [(0, '18.707')] [2023-03-14 14:21:23,182][13200] Updated weights for policy 0, policy_version 960 (0.0018) [2023-03-14 14:21:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3276.8). Total num frames: 3936256. Throughput: 0: 834.7. Samples: 983700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:25,147][00372] Avg episode reward: [(0, '19.918')] [2023-03-14 14:21:30,145][00372] Fps is (10 sec: 2866.7, 60 sec: 3140.2, 300 sec: 3276.8). Total num frames: 3948544. Throughput: 0: 823.2. Samples: 987730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:21:30,147][00372] Avg episode reward: [(0, '20.429')] [2023-03-14 14:21:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3262.9). Total num frames: 3964928. Throughput: 0: 802.4. Samples: 989888. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:35,146][00372] Avg episode reward: [(0, '20.354')] [2023-03-14 14:21:36,223][13200] Updated weights for policy 0, policy_version 970 (0.0019) [2023-03-14 14:21:40,143][00372] Fps is (10 sec: 3687.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 3985408. Throughput: 0: 804.1. Samples: 996132. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:21:40,145][00372] Avg episode reward: [(0, '21.814')] [2023-03-14 14:21:40,215][13187] Saving new best policy, reward=21.814! [2023-03-14 14:21:45,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4001792. Throughput: 0: 830.7. Samples: 1001342. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:45,150][00372] Avg episode reward: [(0, '22.687')] [2023-03-14 14:21:45,152][13187] Saving new best policy, reward=22.687! [2023-03-14 14:21:48,804][13200] Updated weights for policy 0, policy_version 980 (0.0017) [2023-03-14 14:21:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4014080. Throughput: 0: 829.2. Samples: 1003260. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:21:50,148][00372] Avg episode reward: [(0, '23.651')] [2023-03-14 14:21:50,165][13187] Saving new best policy, reward=23.651! [2023-03-14 14:21:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3276.8). Total num frames: 4030464. Throughput: 0: 819.3. Samples: 1007092. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:21:55,146][00372] Avg episode reward: [(0, '23.598')] [2023-03-14 14:22:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.7, 300 sec: 3262.9). Total num frames: 4046848. Throughput: 0: 793.1. Samples: 1012196. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:22:00,146][00372] Avg episode reward: [(0, '21.497')] [2023-03-14 14:22:01,428][13200] Updated weights for policy 0, policy_version 990 (0.0023) [2023-03-14 14:22:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4067328. Throughput: 0: 791.7. Samples: 1015312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:22:05,146][00372] Avg episode reward: [(0, '20.983')] [2023-03-14 14:22:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4083712. 
Throughput: 0: 823.6. Samples: 1020760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:22:10,147][00372] Avg episode reward: [(0, '19.705')] [2023-03-14 14:22:14,565][13200] Updated weights for policy 0, policy_version 1000 (0.0021) [2023-03-14 14:22:15,145][00372] Fps is (10 sec: 2866.6, 60 sec: 3208.4, 300 sec: 3262.9). Total num frames: 4096000. Throughput: 0: 822.2. Samples: 1024730. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-03-14 14:22:15,153][00372] Avg episode reward: [(0, '20.227')] [2023-03-14 14:22:20,148][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3263.0). Total num frames: 4108288. Throughput: 0: 818.7. Samples: 1026728. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:20,150][00372] Avg episode reward: [(0, '19.321')] [2023-03-14 14:22:25,143][00372] Fps is (10 sec: 3277.5, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4128768. Throughput: 0: 790.5. Samples: 1031704. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:25,145][00372] Avg episode reward: [(0, '19.285')] [2023-03-14 14:22:26,635][13200] Updated weights for policy 0, policy_version 1010 (0.0028) [2023-03-14 14:22:30,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.2, 300 sec: 3262.9). Total num frames: 4149248. Throughput: 0: 814.2. Samples: 1037982. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:30,150][00372] Avg episode reward: [(0, '19.720')] [2023-03-14 14:22:30,160][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth... [2023-03-14 14:22:30,290][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000822_3366912.pth [2023-03-14 14:22:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4165632. Throughput: 0: 826.9. Samples: 1040470. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:35,145][00372] Avg episode reward: [(0, '19.750')] [2023-03-14 14:22:39,723][13200] Updated weights for policy 0, policy_version 1020 (0.0014) [2023-03-14 14:22:40,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4177920. Throughput: 0: 828.6. Samples: 1044378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-03-14 14:22:40,151][00372] Avg episode reward: [(0, '20.221')] [2023-03-14 14:22:45,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4190208. Throughput: 0: 804.0. Samples: 1048374. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:45,150][00372] Avg episode reward: [(0, '20.149')] [2023-03-14 14:22:50,143][00372] Fps is (10 sec: 3277.0, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4210688. Throughput: 0: 798.0. Samples: 1051224. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:22:50,145][00372] Avg episode reward: [(0, '19.594')] [2023-03-14 14:22:51,425][13200] Updated weights for policy 0, policy_version 1030 (0.0026) [2023-03-14 14:22:55,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4231168. Throughput: 0: 813.5. Samples: 1057366. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:22:55,145][00372] Avg episode reward: [(0, '19.841')] [2023-03-14 14:23:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3276.8). Total num frames: 4247552. Throughput: 0: 829.5. Samples: 1062054. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:23:00,150][00372] Avg episode reward: [(0, '20.467')] [2023-03-14 14:23:04,800][13200] Updated weights for policy 0, policy_version 1040 (0.0020) [2023-03-14 14:23:05,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4259840. Throughput: 0: 827.6. Samples: 1063970. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:23:05,147][00372] Avg episode reward: [(0, '20.312')] [2023-03-14 14:23:10,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4272128. Throughput: 0: 804.6. Samples: 1067910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:23:10,146][00372] Avg episode reward: [(0, '19.500')] [2023-03-14 14:23:15,143][00372] Fps is (10 sec: 3276.9, 60 sec: 3276.9, 300 sec: 3262.9). Total num frames: 4292608. Throughput: 0: 790.3. Samples: 1073544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:23:15,145][00372] Avg episode reward: [(0, '19.971')] [2023-03-14 14:23:16,552][13200] Updated weights for policy 0, policy_version 1050 (0.0044) [2023-03-14 14:23:20,145][00372] Fps is (10 sec: 4095.0, 60 sec: 3413.2, 300 sec: 3262.9). Total num frames: 4313088. Throughput: 0: 805.1. Samples: 1076702. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:23:20,148][00372] Avg episode reward: [(0, '21.662')] [2023-03-14 14:23:25,155][00372] Fps is (10 sec: 3682.0, 60 sec: 3344.4, 300 sec: 3262.8). Total num frames: 4329472. Throughput: 0: 829.4. Samples: 1081712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:23:25,160][00372] Avg episode reward: [(0, '21.603')] [2023-03-14 14:23:29,925][13200] Updated weights for policy 0, policy_version 1060 (0.0033) [2023-03-14 14:23:30,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4341760. Throughput: 0: 828.7. Samples: 1085664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:23:30,148][00372] Avg episode reward: [(0, '21.446')] [2023-03-14 14:23:35,143][00372] Fps is (10 sec: 2460.5, 60 sec: 3140.2, 300 sec: 3262.9). Total num frames: 4354048. Throughput: 0: 811.1. Samples: 1087722. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:23:35,153][00372] Avg episode reward: [(0, '22.320')] [2023-03-14 14:23:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4374528. Throughput: 0: 796.8. Samples: 1093220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:23:40,151][00372] Avg episode reward: [(0, '22.747')] [2023-03-14 14:23:41,361][13200] Updated weights for policy 0, policy_version 1070 (0.0014) [2023-03-14 14:23:45,143][00372] Fps is (10 sec: 4096.2, 60 sec: 3413.4, 300 sec: 3262.9). Total num frames: 4395008. Throughput: 0: 832.8. Samples: 1099530. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:23:45,147][00372] Avg episode reward: [(0, '21.084')] [2023-03-14 14:23:50,146][00372] Fps is (10 sec: 3275.7, 60 sec: 3276.6, 300 sec: 3249.0). Total num frames: 4407296. Throughput: 0: 834.8. Samples: 1101538. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:23:50,149][00372] Avg episode reward: [(0, '21.179')] [2023-03-14 14:23:54,976][13200] Updated weights for policy 0, policy_version 1080 (0.0025) [2023-03-14 14:23:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4423680. Throughput: 0: 835.7. Samples: 1105516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:23:55,151][00372] Avg episode reward: [(0, '22.660')] [2023-03-14 14:24:00,143][00372] Fps is (10 sec: 2868.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4435968. Throughput: 0: 800.2. Samples: 1109552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:24:00,146][00372] Avg episode reward: [(0, '21.523')] [2023-03-14 14:24:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4456448. Throughput: 0: 798.9. Samples: 1112650. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:24:05,145][00372] Avg episode reward: [(0, '20.522')] [2023-03-14 14:24:06,391][13200] Updated weights for policy 0, policy_version 1090 (0.0020) [2023-03-14 14:24:10,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 4476928. Throughput: 0: 823.7. Samples: 1118770. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:10,151][00372] Avg episode reward: [(0, '21.747')] [2023-03-14 14:24:15,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4489216. Throughput: 0: 823.5. Samples: 1122720. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:15,152][00372] Avg episode reward: [(0, '22.149')] [2023-03-14 14:24:20,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.4, 300 sec: 3262.9). Total num frames: 4501504. Throughput: 0: 822.3. Samples: 1124726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:24:20,148][00372] Avg episode reward: [(0, '22.304')] [2023-03-14 14:24:20,803][13200] Updated weights for policy 0, policy_version 1100 (0.0033) [2023-03-14 14:24:25,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.9, 300 sec: 3262.9). Total num frames: 4517888. Throughput: 0: 788.4. Samples: 1128700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:25,151][00372] Avg episode reward: [(0, '20.465')] [2023-03-14 14:24:30,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4538368. Throughput: 0: 784.7. Samples: 1134840. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:24:30,151][00372] Avg episode reward: [(0, '21.047')] [2023-03-14 14:24:30,167][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001108_4538368.pth... 
[2023-03-14 14:24:30,304][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000918_3760128.pth [2023-03-14 14:24:31,920][13200] Updated weights for policy 0, policy_version 1110 (0.0014) [2023-03-14 14:24:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 4554752. Throughput: 0: 807.1. Samples: 1137854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:35,146][00372] Avg episode reward: [(0, '21.064')] [2023-03-14 14:24:40,144][00372] Fps is (10 sec: 3276.4, 60 sec: 3276.7, 300 sec: 3262.9). Total num frames: 4571136. Throughput: 0: 817.7. Samples: 1142312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:40,151][00372] Avg episode reward: [(0, '22.806')] [2023-03-14 14:24:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3262.9). Total num frames: 4583424. Throughput: 0: 816.4. Samples: 1146290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:24:45,151][00372] Avg episode reward: [(0, '22.617')] [2023-03-14 14:24:46,216][13200] Updated weights for policy 0, policy_version 1120 (0.0018) [2023-03-14 14:24:50,143][00372] Fps is (10 sec: 2867.6, 60 sec: 3208.7, 300 sec: 3262.9). Total num frames: 4599808. Throughput: 0: 793.9. Samples: 1148374. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:24:50,152][00372] Avg episode reward: [(0, '21.802')] [2023-03-14 14:24:55,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4620288. Throughput: 0: 791.0. Samples: 1154366. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:24:55,151][00372] Avg episode reward: [(0, '22.051')] [2023-03-14 14:24:56,685][13200] Updated weights for policy 0, policy_version 1130 (0.0029) [2023-03-14 14:25:00,148][00372] Fps is (10 sec: 3684.4, 60 sec: 3344.8, 300 sec: 3249.0). Total num frames: 4636672. Throughput: 0: 831.6. Samples: 1160148. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:25:00,150][00372] Avg episode reward: [(0, '22.586')] [2023-03-14 14:25:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 4653056. Throughput: 0: 831.0. Samples: 1162120. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:25:05,147][00372] Avg episode reward: [(0, '21.852')] [2023-03-14 14:25:10,143][00372] Fps is (10 sec: 2868.6, 60 sec: 3140.2, 300 sec: 3262.9). Total num frames: 4665344. Throughput: 0: 829.6. Samples: 1166034. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:25:10,147][00372] Avg episode reward: [(0, '20.633')] [2023-03-14 14:25:11,188][13200] Updated weights for policy 0, policy_version 1140 (0.0030) [2023-03-14 14:25:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4681728. Throughput: 0: 795.8. Samples: 1170652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:25:15,147][00372] Avg episode reward: [(0, '21.372')] [2023-03-14 14:25:20,143][00372] Fps is (10 sec: 3686.6, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4702208. Throughput: 0: 798.3. Samples: 1173776. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:25:20,146][00372] Avg episode reward: [(0, '22.690')] [2023-03-14 14:25:21,751][13200] Updated weights for policy 0, policy_version 1150 (0.0015) [2023-03-14 14:25:25,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 4718592. Throughput: 0: 834.3. Samples: 1179856. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:25:25,150][00372] Avg episode reward: [(0, '21.794')] [2023-03-14 14:25:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 4730880. Throughput: 0: 833.2. Samples: 1183786. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:25:30,147][00372] Avg episode reward: [(0, '21.458')] [2023-03-14 14:25:35,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4747264. Throughput: 0: 831.5. Samples: 1185792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:25:35,147][00372] Avg episode reward: [(0, '22.316')] [2023-03-14 14:25:36,509][13200] Updated weights for policy 0, policy_version 1160 (0.0017) [2023-03-14 14:25:40,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3262.9). Total num frames: 4763648. Throughput: 0: 795.4. Samples: 1190158. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:25:40,146][00372] Avg episode reward: [(0, '22.188')] [2023-03-14 14:25:45,143][00372] Fps is (10 sec: 3686.3, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4784128. Throughput: 0: 805.6. Samples: 1196394. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:25:45,145][00372] Avg episode reward: [(0, '21.684')] [2023-03-14 14:25:46,755][13200] Updated weights for policy 0, policy_version 1170 (0.0029) [2023-03-14 14:25:50,149][00372] Fps is (10 sec: 3684.1, 60 sec: 3344.7, 300 sec: 3262.8). Total num frames: 4800512. Throughput: 0: 827.7. Samples: 1199372. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:25:50,155][00372] Avg episode reward: [(0, '22.007')] [2023-03-14 14:25:55,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.1). Total num frames: 4812800. Throughput: 0: 828.4. Samples: 1203312. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:25:55,145][00372] Avg episode reward: [(0, '22.908')] [2023-03-14 14:26:00,143][00372] Fps is (10 sec: 2459.1, 60 sec: 3140.5, 300 sec: 3249.0). Total num frames: 4825088. Throughput: 0: 813.1. Samples: 1207242. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:26:00,151][00372] Avg episode reward: [(0, '23.199')] [2023-03-14 14:26:01,865][13200] Updated weights for policy 0, policy_version 1180 (0.0030) [2023-03-14 14:26:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4845568. Throughput: 0: 795.3. Samples: 1209566. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:26:05,151][00372] Avg episode reward: [(0, '23.918')] [2023-03-14 14:26:05,153][13187] Saving new best policy, reward=23.918! [2023-03-14 14:26:10,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4866048. Throughput: 0: 797.7. Samples: 1215754. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:26:10,149][00372] Avg episode reward: [(0, '24.095')] [2023-03-14 14:26:10,171][13187] Saving new best policy, reward=24.095! [2023-03-14 14:26:12,303][13200] Updated weights for policy 0, policy_version 1190 (0.0018) [2023-03-14 14:26:15,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 4878336. Throughput: 0: 821.7. Samples: 1220762. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:26:15,146][00372] Avg episode reward: [(0, '23.553')] [2023-03-14 14:26:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 4894720. Throughput: 0: 821.7. Samples: 1222768. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) [2023-03-14 14:26:20,149][00372] Avg episode reward: [(0, '22.819')] [2023-03-14 14:26:25,143][00372] Fps is (10 sec: 2867.3, 60 sec: 3140.3, 300 sec: 3249.1). Total num frames: 4907008. Throughput: 0: 813.1. Samples: 1226746. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-03-14 14:26:25,151][00372] Avg episode reward: [(0, '21.228')] [2023-03-14 14:26:26,897][13200] Updated weights for policy 0, policy_version 1200 (0.0018) [2023-03-14 14:26:30,144][00372] Fps is (10 sec: 3276.3, 60 sec: 3276.7, 300 sec: 3262.9). 
Total num frames: 4927488. Throughput: 0: 793.9. Samples: 1232120. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:26:30,147][00372] Avg episode reward: [(0, '21.901')] [2023-03-14 14:26:30,158][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001203_4927488.pth... [2023-03-14 14:26:30,285][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001013_4149248.pth [2023-03-14 14:26:35,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 4947968. Throughput: 0: 794.2. Samples: 1235108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:26:35,152][00372] Avg episode reward: [(0, '20.714')] [2023-03-14 14:26:37,042][13200] Updated weights for policy 0, policy_version 1210 (0.0036) [2023-03-14 14:26:40,152][00372] Fps is (10 sec: 3683.6, 60 sec: 3344.6, 300 sec: 3262.8). Total num frames: 4964352. Throughput: 0: 825.7. Samples: 1240474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:26:40,154][00372] Avg episode reward: [(0, '21.705')] [2023-03-14 14:26:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 4976640. Throughput: 0: 824.9. Samples: 1244362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:26:45,150][00372] Avg episode reward: [(0, '22.097')] [2023-03-14 14:26:50,143][00372] Fps is (10 sec: 2459.7, 60 sec: 3140.6, 300 sec: 3249.0). Total num frames: 4988928. Throughput: 0: 815.8. Samples: 1246278. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:26:50,147][00372] Avg episode reward: [(0, '21.970')] [2023-03-14 14:26:52,116][13200] Updated weights for policy 0, policy_version 1220 (0.0022) [2023-03-14 14:26:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5009408. Throughput: 0: 793.5. Samples: 1251462. 
Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:26:55,149][00372] Avg episode reward: [(0, '22.552')] [2023-03-14 14:27:00,143][00372] Fps is (10 sec: 4096.2, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 5029888. Throughput: 0: 820.3. Samples: 1257674. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:27:00,146][00372] Avg episode reward: [(0, '24.695')] [2023-03-14 14:27:00,161][13187] Saving new best policy, reward=24.695! [2023-03-14 14:27:02,688][13200] Updated weights for policy 0, policy_version 1230 (0.0013) [2023-03-14 14:27:05,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5042176. Throughput: 0: 824.8. Samples: 1259884. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:27:05,146][00372] Avg episode reward: [(0, '23.962')] [2023-03-14 14:27:10,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5054464. Throughput: 0: 822.5. Samples: 1263758. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:27:10,146][00372] Avg episode reward: [(0, '25.158')] [2023-03-14 14:27:10,168][13187] Saving new best policy, reward=25.158! [2023-03-14 14:27:15,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5066752. Throughput: 0: 791.8. Samples: 1267750. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:27:15,149][00372] Avg episode reward: [(0, '24.751')] [2023-03-14 14:27:17,135][13200] Updated weights for policy 0, policy_version 1240 (0.0017) [2023-03-14 14:27:20,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5091328. Throughput: 0: 790.3. Samples: 1270672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:27:20,150][00372] Avg episode reward: [(0, '23.312')] [2023-03-14 14:27:25,143][00372] Fps is (10 sec: 4505.4, 60 sec: 3413.3, 300 sec: 3262.9). Total num frames: 5111808. Throughput: 0: 812.2. Samples: 1277018. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:27:25,148][00372] Avg episode reward: [(0, '24.556')] [2023-03-14 14:27:28,076][13200] Updated weights for policy 0, policy_version 1250 (0.0028) [2023-03-14 14:27:30,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.9, 300 sec: 3249.0). Total num frames: 5124096. Throughput: 0: 827.0. Samples: 1281576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:27:30,147][00372] Avg episode reward: [(0, '22.490')] [2023-03-14 14:27:35,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3249.0). Total num frames: 5136384. Throughput: 0: 828.0. Samples: 1283536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:27:35,149][00372] Avg episode reward: [(0, '22.414')] [2023-03-14 14:27:40,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.7, 300 sec: 3262.9). Total num frames: 5152768. Throughput: 0: 803.1. Samples: 1287600. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:27:40,150][00372] Avg episode reward: [(0, '20.182')] [2023-03-14 14:27:42,028][13200] Updated weights for policy 0, policy_version 1260 (0.0025) [2023-03-14 14:27:45,143][00372] Fps is (10 sec: 3686.7, 60 sec: 3276.8, 300 sec: 3262.9). Total num frames: 5173248. Throughput: 0: 795.3. Samples: 1293462. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:27:45,149][00372] Avg episode reward: [(0, '20.726')] [2023-03-14 14:27:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5189632. Throughput: 0: 813.6. Samples: 1296494. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:27:50,145][00372] Avg episode reward: [(0, '20.565')] [2023-03-14 14:27:53,318][13200] Updated weights for policy 0, policy_version 1270 (0.0015) [2023-03-14 14:27:55,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5206016. Throughput: 0: 832.1. Samples: 1301202. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:27:55,145][00372] Avg episode reward: [(0, '20.306')] [2023-03-14 14:28:00,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5218304. Throughput: 0: 832.9. Samples: 1305230. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:28:00,150][00372] Avg episode reward: [(0, '21.171')] [2023-03-14 14:28:05,143][00372] Fps is (10 sec: 2867.1, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5234688. Throughput: 0: 812.8. Samples: 1307250. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:28:05,145][00372] Avg episode reward: [(0, '21.338')] [2023-03-14 14:28:06,752][13200] Updated weights for policy 0, policy_version 1280 (0.0036) [2023-03-14 14:28:10,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5255168. Throughput: 0: 798.7. Samples: 1312958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:28:10,148][00372] Avg episode reward: [(0, '21.994')] [2023-03-14 14:28:15,150][00372] Fps is (10 sec: 4093.4, 60 sec: 3481.2, 300 sec: 3262.9). Total num frames: 5275648. Throughput: 0: 832.2. Samples: 1319030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:28:15,153][00372] Avg episode reward: [(0, '22.574')] [2023-03-14 14:28:18,137][13200] Updated weights for policy 0, policy_version 1290 (0.0016) [2023-03-14 14:28:20,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.2). Total num frames: 5287936. Throughput: 0: 831.7. Samples: 1320960. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:28:20,148][00372] Avg episode reward: [(0, '22.764')] [2023-03-14 14:28:25,143][00372] Fps is (10 sec: 2459.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5300224. Throughput: 0: 828.4. Samples: 1324876. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:28:25,152][00372] Avg episode reward: [(0, '22.168')] [2023-03-14 14:28:30,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5316608. Throughput: 0: 793.6. Samples: 1329172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:28:30,151][00372] Avg episode reward: [(0, '20.893')] [2023-03-14 14:28:30,162][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001298_5316608.pth... [2023-03-14 14:28:30,302][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001108_4538368.pth [2023-03-14 14:28:32,143][13200] Updated weights for policy 0, policy_version 1300 (0.0014) [2023-03-14 14:28:35,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5337088. Throughput: 0: 794.9. Samples: 1332264. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) [2023-03-14 14:28:35,146][00372] Avg episode reward: [(0, '20.645')] [2023-03-14 14:28:40,143][00372] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5353472. Throughput: 0: 830.4. Samples: 1338570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:28:40,145][00372] Avg episode reward: [(0, '20.926')] [2023-03-14 14:28:43,934][13200] Updated weights for policy 0, policy_version 1310 (0.0016) [2023-03-14 14:28:45,149][00372] Fps is (10 sec: 2865.5, 60 sec: 3208.2, 300 sec: 3249.0). Total num frames: 5365760. Throughput: 0: 828.9. Samples: 1342536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:28:45,154][00372] Avg episode reward: [(0, '20.029')] [2023-03-14 14:28:50,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5382144. Throughput: 0: 827.7. Samples: 1344498. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:28:50,151][00372] Avg episode reward: [(0, '21.027')] [2023-03-14 14:28:55,143][00372] Fps is (10 sec: 3278.9, 60 sec: 3208.5, 300 sec: 3262.9). Total num frames: 5398528. Throughput: 0: 794.2. Samples: 1348696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:28:55,148][00372] Avg episode reward: [(0, '22.142')] [2023-03-14 14:28:56,978][13200] Updated weights for policy 0, policy_version 1320 (0.0019) [2023-03-14 14:29:00,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5419008. Throughput: 0: 797.7. Samples: 1354920. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:29:00,146][00372] Avg episode reward: [(0, '22.430')] [2023-03-14 14:29:05,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5435392. Throughput: 0: 824.1. Samples: 1358046. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:29:05,145][00372] Avg episode reward: [(0, '24.843')] [2023-03-14 14:29:08,847][13200] Updated weights for policy 0, policy_version 1330 (0.0026) [2023-03-14 14:29:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5447680. Throughput: 0: 827.7. Samples: 1362124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:29:10,150][00372] Avg episode reward: [(0, '25.161')] [2023-03-14 14:29:10,167][13187] Saving new best policy, reward=25.161! [2023-03-14 14:29:15,144][00372] Fps is (10 sec: 2457.4, 60 sec: 3072.3, 300 sec: 3249.0). Total num frames: 5459968. Throughput: 0: 812.7. Samples: 1365744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:29:15,148][00372] Avg episode reward: [(0, '25.140')] [2023-03-14 14:29:20,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5476352. Throughput: 0: 788.8. Samples: 1367758. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:29:20,153][00372] Avg episode reward: [(0, '25.306')] [2023-03-14 14:29:20,167][13187] Saving new best policy, reward=25.306! [2023-03-14 14:29:22,345][13200] Updated weights for policy 0, policy_version 1340 (0.0023) [2023-03-14 14:29:25,143][00372] Fps is (10 sec: 3686.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5496832. Throughput: 0: 784.1. Samples: 1373856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:29:25,145][00372] Avg episode reward: [(0, '25.035')] [2023-03-14 14:29:30,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5517312. Throughput: 0: 820.2. Samples: 1379438. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) [2023-03-14 14:29:30,151][00372] Avg episode reward: [(0, '24.574')] [2023-03-14 14:29:34,868][13200] Updated weights for policy 0, policy_version 1350 (0.0019) [2023-03-14 14:29:35,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 5529600. Throughput: 0: 818.9. Samples: 1381348. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:29:35,148][00372] Avg episode reward: [(0, '24.586')] [2023-03-14 14:29:40,143][00372] Fps is (10 sec: 2457.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5541888. Throughput: 0: 815.0. Samples: 1385372. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:29:40,145][00372] Avg episode reward: [(0, '23.317')] [2023-03-14 14:29:45,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.9, 300 sec: 3249.0). Total num frames: 5558272. Throughput: 0: 783.9. Samples: 1390196. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) [2023-03-14 14:29:45,149][00372] Avg episode reward: [(0, '24.084')] [2023-03-14 14:29:47,447][13200] Updated weights for policy 0, policy_version 1360 (0.0018) [2023-03-14 14:29:50,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 5578752. Throughput: 0: 782.6. Samples: 1393264. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:29:50,149][00372] Avg episode reward: [(0, '23.874')] [2023-03-14 14:29:55,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3276.7, 300 sec: 3249.1). Total num frames: 5595136. Throughput: 0: 820.8. Samples: 1399062. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:29:55,147][00372] Avg episode reward: [(0, '23.975')] [2023-03-14 14:30:00,052][13200] Updated weights for policy 0, policy_version 1370 (0.0015) [2023-03-14 14:30:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5611520. Throughput: 0: 827.9. Samples: 1403000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:30:00,149][00372] Avg episode reward: [(0, '23.838')] [2023-03-14 14:30:05,143][00372] Fps is (10 sec: 2867.6, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5623808. Throughput: 0: 826.7. Samples: 1404960. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:30:05,152][00372] Avg episode reward: [(0, '23.508')] [2023-03-14 14:30:10,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5640192. Throughput: 0: 795.4. Samples: 1409648. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:30:10,151][00372] Avg episode reward: [(0, '22.751')] [2023-03-14 14:30:12,340][13200] Updated weights for policy 0, policy_version 1380 (0.0024) [2023-03-14 14:30:15,143][00372] Fps is (10 sec: 3686.2, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5660672. Throughput: 0: 808.6. Samples: 1415826. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:30:15,153][00372] Avg episode reward: [(0, '21.424')] [2023-03-14 14:30:20,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5677056. Throughput: 0: 828.0. Samples: 1418610. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:30:20,148][00372] Avg episode reward: [(0, '20.753')] [2023-03-14 14:30:25,145][00372] Fps is (10 sec: 2866.6, 60 sec: 3208.4, 300 sec: 3249.0). Total num frames: 5689344. Throughput: 0: 825.7. Samples: 1422532. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:30:25,150][00372] Avg episode reward: [(0, '20.196')] [2023-03-14 14:30:25,600][13200] Updated weights for policy 0, policy_version 1390 (0.0022) [2023-03-14 14:30:30,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3072.0, 300 sec: 3235.1). Total num frames: 5701632. Throughput: 0: 805.3. Samples: 1426436. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:30:30,151][00372] Avg episode reward: [(0, '19.434')] [2023-03-14 14:30:30,165][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001392_5701632.pth... [2023-03-14 14:30:30,361][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001203_4927488.pth [2023-03-14 14:30:35,143][00372] Fps is (10 sec: 3277.6, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5722112. Throughput: 0: 792.4. Samples: 1428924. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:30:35,145][00372] Avg episode reward: [(0, '20.116')] [2023-03-14 14:30:37,500][13200] Updated weights for policy 0, policy_version 1400 (0.0044) [2023-03-14 14:30:40,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5742592. Throughput: 0: 802.9. Samples: 1435190. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:30:40,145][00372] Avg episode reward: [(0, '20.498')] [2023-03-14 14:30:45,144][00372] Fps is (10 sec: 3685.9, 60 sec: 3345.0, 300 sec: 3249.1). Total num frames: 5758976. Throughput: 0: 825.0. Samples: 1440128. 
Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:30:45,148][00372] Avg episode reward: [(0, '19.854')] [2023-03-14 14:30:50,146][00372] Fps is (10 sec: 2866.2, 60 sec: 3208.4, 300 sec: 3249.0). Total num frames: 5771264. Throughput: 0: 825.2. Samples: 1442096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:30:50,153][00372] Avg episode reward: [(0, '19.707')] [2023-03-14 14:30:51,024][13200] Updated weights for policy 0, policy_version 1410 (0.0025) [2023-03-14 14:30:55,143][00372] Fps is (10 sec: 2457.9, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5783552. Throughput: 0: 808.0. Samples: 1446006. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) [2023-03-14 14:30:55,155][00372] Avg episode reward: [(0, '21.100')] [2023-03-14 14:31:00,143][00372] Fps is (10 sec: 3277.9, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5804032. Throughput: 0: 793.4. Samples: 1451528. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:31:00,145][00372] Avg episode reward: [(0, '20.707')] [2023-03-14 14:31:02,453][13200] Updated weights for policy 0, policy_version 1420 (0.0026) [2023-03-14 14:31:05,143][00372] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3249.0). Total num frames: 5824512. Throughput: 0: 801.0. Samples: 1454654. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:31:05,145][00372] Avg episode reward: [(0, '20.869')] [2023-03-14 14:31:10,143][00372] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3262.9). Total num frames: 5840896. Throughput: 0: 827.2. Samples: 1459752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:31:10,147][00372] Avg episode reward: [(0, '20.677')] [2023-03-14 14:31:15,143][00372] Fps is (10 sec: 2867.2, 60 sec: 3208.6, 300 sec: 3249.0). Total num frames: 5853184. Throughput: 0: 825.6. Samples: 1463586. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:31:15,146][00372] Avg episode reward: [(0, '21.012')] [2023-03-14 14:31:16,330][13200] Updated weights for policy 0, policy_version 1430 (0.0044) [2023-03-14 14:31:20,143][00372] Fps is (10 sec: 2457.5, 60 sec: 3140.2, 300 sec: 3249.0). Total num frames: 5865472. Throughput: 0: 814.1. Samples: 1465560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:31:20,145][00372] Avg episode reward: [(0, '21.364')] [2023-03-14 14:31:25,143][00372] Fps is (10 sec: 3276.7, 60 sec: 3276.9, 300 sec: 3249.0). Total num frames: 5885952. Throughput: 0: 793.4. Samples: 1470892. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:31:25,152][00372] Avg episode reward: [(0, '19.767')] [2023-03-14 14:31:27,492][13200] Updated weights for policy 0, policy_version 1440 (0.0014) [2023-03-14 14:31:30,145][00372] Fps is (10 sec: 4095.2, 60 sec: 3413.2, 300 sec: 3249.0). Total num frames: 5906432. Throughput: 0: 823.7. Samples: 1477194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:31:30,152][00372] Avg episode reward: [(0, '20.242')] [2023-03-14 14:31:35,149][00372] Fps is (10 sec: 3274.7, 60 sec: 3276.4, 300 sec: 3235.2). Total num frames: 5918720. Throughput: 0: 826.4. Samples: 1479286. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:31:35,155][00372] Avg episode reward: [(0, '20.704')] [2023-03-14 14:31:40,143][00372] Fps is (10 sec: 2867.9, 60 sec: 3208.5, 300 sec: 3249.0). Total num frames: 5935104. Throughput: 0: 829.2. Samples: 1483318. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) [2023-03-14 14:31:40,146][00372] Avg episode reward: [(0, '20.893')] [2023-03-14 14:31:41,642][13200] Updated weights for policy 0, policy_version 1450 (0.0020) [2023-03-14 14:31:45,143][00372] Fps is (10 sec: 2869.0, 60 sec: 3140.3, 300 sec: 3249.0). Total num frames: 5947392. Throughput: 0: 800.4. Samples: 1487548. 
Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) [2023-03-14 14:31:45,146][00372] Avg episode reward: [(0, '22.108')] [2023-03-14 14:31:50,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3277.0, 300 sec: 3249.0). Total num frames: 5967872. Throughput: 0: 801.0. Samples: 1490698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:31:50,146][00372] Avg episode reward: [(0, '22.192')] [2023-03-14 14:31:52,073][13200] Updated weights for policy 0, policy_version 1460 (0.0028) [2023-03-14 14:31:55,143][00372] Fps is (10 sec: 4096.1, 60 sec: 3413.3, 300 sec: 3249.0). Total num frames: 5988352. Throughput: 0: 828.5. Samples: 1497036. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) [2023-03-14 14:31:55,146][00372] Avg episode reward: [(0, '22.800')] [2023-03-14 14:32:00,143][00372] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3249.0). Total num frames: 6000640. Throughput: 0: 833.1. Samples: 1501074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) [2023-03-14 14:32:00,147][00372] Avg episode reward: [(0, '23.851')] [2023-03-14 14:32:00,600][13187] Stopping Batcher_0... [2023-03-14 14:32:00,601][13187] Loop batcher_evt_loop terminating... [2023-03-14 14:32:00,602][00372] Component Batcher_0 stopped! [2023-03-14 14:32:00,605][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth... [2023-03-14 14:32:00,694][13200] Weights refcount: 2 0 [2023-03-14 14:32:00,701][00372] Component InferenceWorker_p0-w0 stopped! [2023-03-14 14:32:00,704][13200] Stopping InferenceWorker_p0-w0... [2023-03-14 14:32:00,704][13200] Loop inference_proc0-0_evt_loop terminating... [2023-03-14 14:32:00,750][00372] Component RolloutWorker_w1 stopped! [2023-03-14 14:32:00,752][13202] Stopping RolloutWorker_w1... [2023-03-14 14:32:00,758][00372] Component RolloutWorker_w3 stopped! [2023-03-14 14:32:00,760][13208] Stopping RolloutWorker_w3... [2023-03-14 14:32:00,764][13208] Loop rollout_proc3_evt_loop terminating... 
[2023-03-14 14:32:00,766][13202] Loop rollout_proc1_evt_loop terminating...
[2023-03-14 14:32:00,768][00372] Component RolloutWorker_w5 stopped!
[2023-03-14 14:32:00,770][13210] Stopping RolloutWorker_w5...
[2023-03-14 14:32:00,771][13210] Loop rollout_proc5_evt_loop terminating...
[2023-03-14 14:32:00,801][00372] Component RolloutWorker_w7 stopped!
[2023-03-14 14:32:00,803][13209] Stopping RolloutWorker_w7...
[2023-03-14 14:32:00,803][13209] Loop rollout_proc7_evt_loop terminating...
[2023-03-14 14:32:00,826][00372] Component RolloutWorker_w6 stopped!
[2023-03-14 14:32:00,839][13187] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001298_5316608.pth
[2023-03-14 14:32:00,826][13211] Stopping RolloutWorker_w6...
[2023-03-14 14:32:00,856][13187] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
[2023-03-14 14:32:00,867][13204] Stopping RolloutWorker_w2...
[2023-03-14 14:32:00,867][00372] Component RolloutWorker_w2 stopped!
[2023-03-14 14:32:00,876][00372] Component RolloutWorker_w0 stopped!
[2023-03-14 14:32:00,854][13211] Loop rollout_proc6_evt_loop terminating...
[2023-03-14 14:32:00,876][13205] Stopping RolloutWorker_w0...
[2023-03-14 14:32:00,899][13205] Loop rollout_proc0_evt_loop terminating...
[2023-03-14 14:32:00,868][13204] Loop rollout_proc2_evt_loop terminating...
[2023-03-14 14:32:00,940][13207] Stopping RolloutWorker_w4...
[2023-03-14 14:32:00,941][13207] Loop rollout_proc4_evt_loop terminating...
[2023-03-14 14:32:00,934][00372] Component RolloutWorker_w4 stopped!
[2023-03-14 14:32:01,221][13187] Stopping LearnerWorker_p0...
[2023-03-14 14:32:01,221][13187] Loop learner_proc0_evt_loop terminating...
[2023-03-14 14:32:01,220][00372] Component LearnerWorker_p0 stopped!
[2023-03-14 14:32:01,223][00372] Waiting for process learner_proc0 to stop...
[2023-03-14 14:32:03,670][00372] Waiting for process inference_proc0-0 to join...
[2023-03-14 14:32:04,191][00372] Waiting for process rollout_proc0 to join...
[2023-03-14 14:32:05,111][00372] Waiting for process rollout_proc1 to join...
[2023-03-14 14:32:05,115][00372] Waiting for process rollout_proc2 to join...
[2023-03-14 14:32:05,124][00372] Waiting for process rollout_proc3 to join...
[2023-03-14 14:32:05,128][00372] Waiting for process rollout_proc4 to join...
[2023-03-14 14:32:05,130][00372] Waiting for process rollout_proc5 to join...
[2023-03-14 14:32:05,133][00372] Waiting for process rollout_proc6 to join...
[2023-03-14 14:32:05,135][00372] Waiting for process rollout_proc7 to join...
[2023-03-14 14:32:05,136][00372] Batcher 0 profile tree view:
batching: 43.1679, releasing_batches: 0.0384
[2023-03-14 14:32:05,140][00372] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 891.3881
update_model: 12.5659
weight_update: 0.0029
one_step: 0.0024
handle_policy_step: 885.8978
deserialize: 25.2186, stack: 4.8805, obs_to_device_normalize: 189.4452, forward: 436.1786, send_messages: 44.4774
prepare_outputs: 139.8153
to_cpu: 87.7006
[2023-03-14 14:32:05,141][00372] Learner 0 profile tree view:
misc: 0.0119, prepare_batch: 24.1735
train: 116.4224
epoch_init: 0.0256, minibatch_init: 0.0136, losses_postprocess: 0.8965, kl_divergence: 0.9717, after_optimizer: 49.3341
calculate_losses: 41.5410
losses_init: 0.0059, forward_head: 2.7492, bptt_initial: 27.0971, tail: 1.8301, advantages_returns: 0.5186, losses: 5.0086
bptt: 3.7426
bptt_forward_core: 3.5580
update: 22.6093
clip: 2.2821
[2023-03-14 14:32:05,142][00372] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.7599, enqueue_policy_requests: 260.8927, env_step: 1378.4958, overhead: 39.9957, complete_rollouts: 11.3098
save_policy_outputs: 39.3351
split_output_tensors: 18.8414
[2023-03-14 14:32:05,143][00372] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.6716, enqueue_policy_requests: 269.8551, env_step: 1370.5238, overhead: 39.7292, complete_rollouts: 11.8346
save_policy_outputs: 37.5300
split_output_tensors: 18.0413
[2023-03-14 14:32:05,145][00372] Loop Runner_EvtLoop terminating...
[2023-03-14 14:32:05,150][00372] Runner profile tree view:
main_loop: 1890.3781
[2023-03-14 14:32:05,151][00372] Collected {0: 6004736}, FPS: 3176.5
[2023-03-14 14:33:00,806][00372] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-03-14 14:33:00,808][00372] Overriding arg 'num_workers' with value 1 passed from command line
[2023-03-14 14:33:00,810][00372] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-03-14 14:33:00,812][00372] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-03-14 14:33:00,817][00372] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-03-14 14:33:00,818][00372] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-03-14 14:33:00,820][00372] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-03-14 14:33:00,822][00372] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-03-14 14:33:00,824][00372] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-03-14 14:33:00,826][00372] Adding new argument 'hf_repository'='Kittitouch/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-03-14 14:33:00,827][00372] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-03-14 14:33:00,828][00372] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-03-14 14:33:00,829][00372] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-03-14 14:33:00,830][00372] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-03-14 14:33:00,831][00372] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-03-14 14:33:00,869][00372] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-03-14 14:33:00,874][00372] RunningMeanStd input shape: (3, 72, 128)
[2023-03-14 14:33:00,877][00372] RunningMeanStd input shape: (1,)
[2023-03-14 14:33:00,913][00372] ConvEncoder: input_channels=3
[2023-03-14 14:33:01,694][00372] Conv encoder output size: 512
[2023-03-14 14:33:01,699][00372] Policy head output size: 512
[2023-03-14 14:33:04,178][00372] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001466_6004736.pth...
[2023-03-14 14:33:05,495][00372] Num frames 100...
[2023-03-14 14:33:05,610][00372] Num frames 200...
[2023-03-14 14:33:05,729][00372] Num frames 300...
[2023-03-14 14:33:05,850][00372] Num frames 400...
[2023-03-14 14:33:05,964][00372] Num frames 500...
[2023-03-14 14:33:06,077][00372] Num frames 600...
[2023-03-14 14:33:06,193][00372] Num frames 700...
[2023-03-14 14:33:06,310][00372] Num frames 800...
[2023-03-14 14:33:06,425][00372] Num frames 900...
[2023-03-14 14:33:06,542][00372] Num frames 1000...
[2023-03-14 14:33:06,647][00372] Avg episode rewards: #0: 25.420, true rewards: #0: 10.420
[2023-03-14 14:33:06,648][00372] Avg episode reward: 25.420, avg true_objective: 10.420
[2023-03-14 14:33:06,725][00372] Num frames 1100...
[2023-03-14 14:33:06,859][00372] Num frames 1200...
[2023-03-14 14:33:06,979][00372] Num frames 1300...
[2023-03-14 14:33:07,093][00372] Num frames 1400...
[2023-03-14 14:33:07,206][00372] Num frames 1500...
[2023-03-14 14:33:07,324][00372] Num frames 1600...
[2023-03-14 14:33:07,439][00372] Num frames 1700...
[2023-03-14 14:33:07,554][00372] Num frames 1800...
[2023-03-14 14:33:07,672][00372] Num frames 1900...
[2023-03-14 14:33:07,791][00372] Num frames 2000...
[2023-03-14 14:33:07,915][00372] Num frames 2100...
[2023-03-14 14:33:08,029][00372] Num frames 2200...
[2023-03-14 14:33:08,144][00372] Num frames 2300...
[2023-03-14 14:33:08,260][00372] Num frames 2400...
[2023-03-14 14:33:08,376][00372] Num frames 2500...
[2023-03-14 14:33:08,496][00372] Num frames 2600...
[2023-03-14 14:33:08,613][00372] Num frames 2700...
[2023-03-14 14:33:08,734][00372] Num frames 2800...
[2023-03-14 14:33:08,857][00372] Num frames 2900...
[2023-03-14 14:33:08,979][00372] Num frames 3000...
[2023-03-14 14:33:09,097][00372] Num frames 3100...
[2023-03-14 14:33:09,201][00372] Avg episode rewards: #0: 40.709, true rewards: #0: 15.710
[2023-03-14 14:33:09,202][00372] Avg episode reward: 40.709, avg true_objective: 15.710
[2023-03-14 14:33:09,272][00372] Num frames 3200...
[2023-03-14 14:33:09,389][00372] Num frames 3300...
[2023-03-14 14:33:09,503][00372] Num frames 3400...
[2023-03-14 14:33:09,622][00372] Avg episode rewards: #0: 28.846, true rewards: #0: 11.513
[2023-03-14 14:33:09,624][00372] Avg episode reward: 28.846, avg true_objective: 11.513
[2023-03-14 14:33:09,683][00372] Num frames 3500...
[2023-03-14 14:33:09,798][00372] Num frames 3600...
[2023-03-14 14:33:09,919][00372] Num frames 3700...
[2023-03-14 14:33:10,034][00372] Num frames 3800...
[2023-03-14 14:33:10,149][00372] Num frames 3900...
[2023-03-14 14:33:10,266][00372] Num frames 4000...
[2023-03-14 14:33:10,385][00372] Num frames 4100...
[2023-03-14 14:33:10,503][00372] Num frames 4200...
[2023-03-14 14:33:10,618][00372] Num frames 4300...
[2023-03-14 14:33:10,747][00372] Num frames 4400...
[2023-03-14 14:33:10,874][00372] Num frames 4500...
[2023-03-14 14:33:10,997][00372] Num frames 4600...
[2023-03-14 14:33:11,062][00372] Avg episode rewards: #0: 28.515, true rewards: #0: 11.515
[2023-03-14 14:33:11,064][00372] Avg episode reward: 28.515, avg true_objective: 11.515
[2023-03-14 14:33:11,183][00372] Num frames 4700...
[2023-03-14 14:33:11,311][00372] Num frames 4800...
[2023-03-14 14:33:11,443][00372] Num frames 4900...
[2023-03-14 14:33:11,559][00372] Num frames 5000...
[2023-03-14 14:33:11,675][00372] Num frames 5100...
[2023-03-14 14:33:11,790][00372] Num frames 5200...
[2023-03-14 14:33:11,911][00372] Num frames 5300...
[2023-03-14 14:33:12,035][00372] Num frames 5400...
[2023-03-14 14:33:12,159][00372] Num frames 5500...
[2023-03-14 14:33:12,322][00372] Num frames 5600...
[2023-03-14 14:33:12,486][00372] Num frames 5700...
[2023-03-14 14:33:12,643][00372] Num frames 5800...
[2023-03-14 14:33:12,803][00372] Num frames 5900...
[2023-03-14 14:33:12,976][00372] Num frames 6000...
[2023-03-14 14:33:13,136][00372] Num frames 6100...
[2023-03-14 14:33:13,266][00372] Avg episode rewards: #0: 29.284, true rewards: #0: 12.284
[2023-03-14 14:33:13,267][00372] Avg episode reward: 29.284, avg true_objective: 12.284
[2023-03-14 14:33:13,368][00372] Num frames 6200...
[2023-03-14 14:33:13,535][00372] Num frames 6300...
[2023-03-14 14:33:13,696][00372] Num frames 6400...
[2023-03-14 14:33:13,856][00372] Num frames 6500...
[2023-03-14 14:33:14,030][00372] Num frames 6600...
[2023-03-14 14:33:14,186][00372] Num frames 6700...
[2023-03-14 14:33:14,345][00372] Num frames 6800...
[2023-03-14 14:33:14,508][00372] Num frames 6900...
[2023-03-14 14:33:14,675][00372] Num frames 7000...
[2023-03-14 14:33:14,839][00372] Num frames 7100...
[2023-03-14 14:33:15,006][00372] Num frames 7200...
[2023-03-14 14:33:15,172][00372] Num frames 7300...
[2023-03-14 14:33:15,340][00372] Num frames 7400...
[2023-03-14 14:33:15,505][00372] Num frames 7500...
[2023-03-14 14:33:15,677][00372] Num frames 7600...
[2023-03-14 14:33:15,846][00372] Num frames 7700...
[2023-03-14 14:33:16,025][00372] Num frames 7800...
[2023-03-14 14:33:16,203][00372] Num frames 7900...
[2023-03-14 14:33:16,373][00372] Num frames 8000...
[2023-03-14 14:33:16,550][00372] Num frames 8100...
[2023-03-14 14:33:16,725][00372] Num frames 8200...
[2023-03-14 14:33:16,859][00372] Avg episode rewards: #0: 33.070, true rewards: #0: 13.737
[2023-03-14 14:33:16,861][00372] Avg episode reward: 33.070, avg true_objective: 13.737
[2023-03-14 14:33:16,968][00372] Num frames 8300...
[2023-03-14 14:33:17,149][00372] Num frames 8400...
[2023-03-14 14:33:17,316][00372] Num frames 8500...
[2023-03-14 14:33:17,483][00372] Num frames 8600...
[2023-03-14 14:33:17,626][00372] Num frames 8700...
[2023-03-14 14:33:17,751][00372] Num frames 8800...
[2023-03-14 14:33:17,870][00372] Avg episode rewards: #0: 30.077, true rewards: #0: 12.649
[2023-03-14 14:33:17,872][00372] Avg episode reward: 30.077, avg true_objective: 12.649
[2023-03-14 14:33:17,929][00372] Num frames 8900...
[2023-03-14 14:33:18,043][00372] Num frames 9000...
[2023-03-14 14:33:18,163][00372] Num frames 9100...
[2023-03-14 14:33:18,277][00372] Num frames 9200...
[2023-03-14 14:33:18,390][00372] Num frames 9300...
[2023-03-14 14:33:18,501][00372] Num frames 9400...
[2023-03-14 14:33:18,616][00372] Num frames 9500...
[2023-03-14 14:33:18,728][00372] Num frames 9600...
[2023-03-14 14:33:18,825][00372] Avg episode rewards: #0: 28.289, true rewards: #0: 12.039
[2023-03-14 14:33:18,826][00372] Avg episode reward: 28.289, avg true_objective: 12.039
[2023-03-14 14:33:18,911][00372] Num frames 9700...
[2023-03-14 14:33:19,022][00372] Num frames 9800...
[2023-03-14 14:33:19,134][00372] Num frames 9900...
[2023-03-14 14:33:19,252][00372] Num frames 10000...
[2023-03-14 14:33:19,370][00372] Num frames 10100...
[2023-03-14 14:33:19,484][00372] Num frames 10200...
[2023-03-14 14:33:19,594][00372] Num frames 10300...
[2023-03-14 14:33:19,706][00372] Num frames 10400...
[2023-03-14 14:33:19,828][00372] Num frames 10500...
[2023-03-14 14:33:19,941][00372] Num frames 10600...
[2023-03-14 14:33:20,056][00372] Num frames 10700...
[2023-03-14 14:33:20,170][00372] Num frames 10800...
[2023-03-14 14:33:20,299][00372] Num frames 10900...
[2023-03-14 14:33:20,419][00372] Num frames 11000...
[2023-03-14 14:33:20,543][00372] Num frames 11100...
[2023-03-14 14:33:20,662][00372] Num frames 11200...
[2023-03-14 14:33:20,782][00372] Num frames 11300...
[2023-03-14 14:33:20,901][00372] Num frames 11400...
[2023-03-14 14:33:21,020][00372] Num frames 11500...
[2023-03-14 14:33:21,137][00372] Num frames 11600...
[2023-03-14 14:33:21,262][00372] Avg episode rewards: #0: 30.723, true rewards: #0: 12.946
[2023-03-14 14:33:21,264][00372] Avg episode reward: 30.723, avg true_objective: 12.946
[2023-03-14 14:33:21,341][00372] Num frames 11700...
[2023-03-14 14:33:21,464][00372] Num frames 11800...
[2023-03-14 14:33:21,592][00372] Num frames 11900...
[2023-03-14 14:33:21,720][00372] Num frames 12000...
[2023-03-14 14:33:21,897][00372] Avg episode rewards: #0: 28.199, true rewards: #0: 12.099
[2023-03-14 14:33:21,899][00372] Avg episode reward: 28.199, avg true_objective: 12.099
[2023-03-14 14:33:21,905][00372] Num frames 12100...
[2023-03-14 14:34:47,474][00372] Replay video saved to /content/train_dir/default_experiment/replay.mp4!