[2023-02-26 09:11:47,644][00488] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-26 09:11:47,646][00488] Rollout worker 0 uses device cpu
[2023-02-26 09:11:47,648][00488] Rollout worker 1 uses device cpu
[2023-02-26 09:11:47,649][00488] Rollout worker 2 uses device cpu
[2023-02-26 09:11:47,650][00488] Rollout worker 3 uses device cpu
[2023-02-26 09:11:47,651][00488] Rollout worker 4 uses device cpu
[2023-02-26 09:11:47,653][00488] Rollout worker 5 uses device cpu
[2023-02-26 09:11:47,654][00488] Rollout worker 6 uses device cpu
[2023-02-26 09:11:47,655][00488] Rollout worker 7 uses device cpu
[2023-02-26 09:11:47,656][00488] Rollout worker 8 uses device cpu
[2023-02-26 09:11:47,657][00488] Rollout worker 9 uses device cpu
[2023-02-26 09:11:47,658][00488] Rollout worker 10 uses device cpu
[2023-02-26 09:11:47,660][00488] Rollout worker 11 uses device cpu
[2023-02-26 09:11:47,661][00488] Rollout worker 12 uses device cpu
[2023-02-26 09:11:47,662][00488] Rollout worker 13 uses device cpu
[2023-02-26 09:11:47,663][00488] Rollout worker 14 uses device cpu
[2023-02-26 09:11:47,665][00488] Rollout worker 15 uses device cpu
[2023-02-26 09:11:48,290][00488] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-26 09:11:48,295][00488] InferenceWorker_p0-w0: min num requests: 5
[2023-02-26 09:11:48,374][00488] Starting all processes...
[2023-02-26 09:11:48,378][00488] Starting process learner_proc0
[2023-02-26 09:11:48,453][00488] Starting all processes...
[2023-02-26 09:11:48,466][00488] Starting process inference_proc0-0 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc0 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc1 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc2 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc3 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc4 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc5 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc6 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc7 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc8 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc9 [2023-02-26 09:11:48,466][00488] Starting process rollout_proc10 [2023-02-26 09:11:48,486][00488] Starting process rollout_proc11 [2023-02-26 09:11:48,486][00488] Starting process rollout_proc12 [2023-02-26 09:11:48,486][00488] Starting process rollout_proc13 [2023-02-26 09:11:48,486][00488] Starting process rollout_proc14 [2023-02-26 09:11:49,048][00488] Starting process rollout_proc15 [2023-02-26 09:12:07,218][11063] Worker 7 uses CPU cores [1] [2023-02-26 09:12:08,335][00488] Heartbeat connected on RolloutWorker_w7 [2023-02-26 09:12:08,428][11057] Worker 0 uses CPU cores [0] [2023-02-26 09:12:08,462][11056] Worker 1 uses CPU cores [1] [2023-02-26 09:12:08,593][11034] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-26 09:12:08,593][11034] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2023-02-26 09:12:08,701][11054] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-26 09:12:08,701][11054] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2023-02-26 09:12:08,713][11062] Worker 6 uses CPU cores [0] [2023-02-26 09:12:08,817][00488] Heartbeat connected on RolloutWorker_w1 [2023-02-26 09:12:08,867][00488] Heartbeat connected on RolloutWorker_w0 [2023-02-26 
09:12:08,963][11059] Worker 3 uses CPU cores [1] [2023-02-26 09:12:09,082][11071] Worker 11 uses CPU cores [1] [2023-02-26 09:12:09,091][11060] Worker 4 uses CPU cores [0] [2023-02-26 09:12:09,113][00488] Heartbeat connected on RolloutWorker_w6 [2023-02-26 09:12:09,199][11075] Worker 15 uses CPU cores [1] [2023-02-26 09:12:09,201][00488] Heartbeat connected on RolloutWorker_w3 [2023-02-26 09:12:09,231][00488] Heartbeat connected on RolloutWorker_w11 [2023-02-26 09:12:09,296][11070] Worker 10 uses CPU cores [0] [2023-02-26 09:12:09,318][11058] Worker 2 uses CPU cores [0] [2023-02-26 09:12:09,385][00488] Heartbeat connected on RolloutWorker_w4 [2023-02-26 09:12:09,390][11061] Worker 5 uses CPU cores [1] [2023-02-26 09:12:09,400][00488] Heartbeat connected on RolloutWorker_w15 [2023-02-26 09:12:09,443][11074] Worker 14 uses CPU cores [0] [2023-02-26 09:12:09,461][11068] Worker 8 uses CPU cores [0] [2023-02-26 09:12:09,460][00488] Heartbeat connected on RolloutWorker_w2 [2023-02-26 09:12:09,469][00488] Heartbeat connected on RolloutWorker_w10 [2023-02-26 09:12:09,607][11073] Worker 13 uses CPU cores [1] [2023-02-26 09:12:09,629][00488] Heartbeat connected on RolloutWorker_w5 [2023-02-26 09:12:09,653][00488] Heartbeat connected on RolloutWorker_w14 [2023-02-26 09:12:09,655][00488] Heartbeat connected on RolloutWorker_w8 [2023-02-26 09:12:09,793][11069] Worker 9 uses CPU cores [1] [2023-02-26 09:12:09,810][00488] Heartbeat connected on RolloutWorker_w13 [2023-02-26 09:12:09,835][00488] Heartbeat connected on RolloutWorker_w9 [2023-02-26 09:12:09,879][11072] Worker 12 uses CPU cores [0] [2023-02-26 09:12:09,891][00488] Heartbeat connected on RolloutWorker_w12 [2023-02-26 09:12:09,964][11054] Num visible devices: 1 [2023-02-26 09:12:09,965][11034] Num visible devices: 1 [2023-02-26 09:12:09,965][00488] Heartbeat connected on InferenceWorker_p0-w0 [2023-02-26 09:12:09,972][11034] Starting seed is not provided [2023-02-26 09:12:09,972][11034] Using GPUs [0] for process 0 
(actually maps to GPUs [0])
[2023-02-26 09:12:09,972][11034] Initializing actor-critic model on device cuda:0
[2023-02-26 09:12:09,973][11034] RunningMeanStd input shape: (3, 72, 128)
[2023-02-26 09:12:09,973][00488] Heartbeat connected on Batcher_0
[2023-02-26 09:12:09,976][11034] RunningMeanStd input shape: (1,)
[2023-02-26 09:12:10,001][11034] ConvEncoder: input_channels=3
[2023-02-26 09:12:10,361][11034] Conv encoder output size: 512
[2023-02-26 09:12:10,361][11034] Policy head output size: 512
[2023-02-26 09:12:10,425][11034] Created Actor Critic model with architecture:
[2023-02-26 09:12:10,425][11034] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-26 09:12:18,449][11034] Using optimizer
[2023-02-26 09:12:18,450][11034] No checkpoints found
[2023-02-26 09:12:18,451][11034] Did not load from checkpoint, starting from scratch!
[2023-02-26 09:12:18,451][11034] Initialized policy 0 weights for model version 0 [2023-02-26 09:12:18,455][11034] LearnerWorker_p0 finished initialization! [2023-02-26 09:12:18,459][11034] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2023-02-26 09:12:18,456][00488] Heartbeat connected on LearnerWorker_p0 [2023-02-26 09:12:18,586][11054] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 09:12:18,589][11054] RunningMeanStd input shape: (1,) [2023-02-26 09:12:18,604][11054] ConvEncoder: input_channels=3 [2023-02-26 09:12:18,700][11054] Conv encoder output size: 512 [2023-02-26 09:12:18,701][11054] Policy head output size: 512 [2023-02-26 09:12:21,991][00488] Inference worker 0-0 is ready! [2023-02-26 09:12:21,994][00488] All inference workers are ready! Signal rollout workers to start! [2023-02-26 09:12:22,368][11060] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,430][11069] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,443][11061] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,434][11074] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,453][11063] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,463][11073] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,469][11056] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,488][11062] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,485][11071] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,506][11068] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,512][11058] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,519][11075] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,504][11059] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,523][11072] Doom 
resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,535][11057] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:22,563][11070] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:12:23,348][00488] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:24,705][11070] Decorrelating experience for 0 frames... [2023-02-26 09:12:24,707][11062] Decorrelating experience for 0 frames... [2023-02-26 09:12:24,709][11060] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,508][11060] Decorrelating experience for 32 frames... [2023-02-26 09:12:25,526][11057] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,563][11061] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,572][11069] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,599][11059] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,612][11073] Decorrelating experience for 0 frames... [2023-02-26 09:12:25,618][11075] Decorrelating experience for 0 frames... [2023-02-26 09:12:26,619][11061] Decorrelating experience for 32 frames... [2023-02-26 09:12:26,621][11063] Decorrelating experience for 0 frames... [2023-02-26 09:12:26,664][11073] Decorrelating experience for 32 frames... [2023-02-26 09:12:26,819][11057] Decorrelating experience for 32 frames... [2023-02-26 09:12:27,056][11062] Decorrelating experience for 32 frames... [2023-02-26 09:12:27,151][11070] Decorrelating experience for 32 frames... [2023-02-26 09:12:27,160][11058] Decorrelating experience for 0 frames... [2023-02-26 09:12:27,216][11061] Decorrelating experience for 64 frames... [2023-02-26 09:12:27,920][11072] Decorrelating experience for 0 frames... [2023-02-26 09:12:27,923][11068] Decorrelating experience for 0 frames... [2023-02-26 09:12:27,937][11069] Decorrelating experience for 32 frames... 
[2023-02-26 09:12:28,109][11070] Decorrelating experience for 64 frames... [2023-02-26 09:12:28,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:28,580][11073] Decorrelating experience for 64 frames... [2023-02-26 09:12:28,627][11063] Decorrelating experience for 32 frames... [2023-02-26 09:12:28,650][11058] Decorrelating experience for 32 frames... [2023-02-26 09:12:28,787][11070] Decorrelating experience for 96 frames... [2023-02-26 09:12:28,911][11061] Decorrelating experience for 96 frames... [2023-02-26 09:12:29,596][11058] Decorrelating experience for 64 frames... [2023-02-26 09:12:29,706][11072] Decorrelating experience for 32 frames... [2023-02-26 09:12:29,764][11056] Decorrelating experience for 0 frames... [2023-02-26 09:12:30,400][11072] Decorrelating experience for 64 frames... [2023-02-26 09:12:30,504][11070] Decorrelating experience for 128 frames... [2023-02-26 09:12:30,507][11069] Decorrelating experience for 64 frames... [2023-02-26 09:12:30,514][11071] Decorrelating experience for 0 frames... [2023-02-26 09:12:30,535][11075] Decorrelating experience for 32 frames... [2023-02-26 09:12:30,630][11063] Decorrelating experience for 64 frames... [2023-02-26 09:12:31,369][11056] Decorrelating experience for 32 frames... [2023-02-26 09:12:31,638][11072] Decorrelating experience for 96 frames... [2023-02-26 09:12:31,670][11068] Decorrelating experience for 32 frames... [2023-02-26 09:12:31,716][11057] Decorrelating experience for 64 frames... [2023-02-26 09:12:32,081][11059] Decorrelating experience for 32 frames... [2023-02-26 09:12:32,181][11075] Decorrelating experience for 64 frames... [2023-02-26 09:12:32,191][11071] Decorrelating experience for 32 frames... [2023-02-26 09:12:32,808][11069] Decorrelating experience for 96 frames... [2023-02-26 09:12:33,043][11061] Decorrelating experience for 128 frames... 
[2023-02-26 09:12:33,158][11062] Decorrelating experience for 64 frames... [2023-02-26 09:12:33,161][11068] Decorrelating experience for 64 frames... [2023-02-26 09:12:33,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:33,528][11063] Decorrelating experience for 96 frames... [2023-02-26 09:12:33,570][11072] Decorrelating experience for 128 frames... [2023-02-26 09:12:33,580][11074] Decorrelating experience for 0 frames... [2023-02-26 09:12:33,871][11058] Decorrelating experience for 96 frames... [2023-02-26 09:12:34,599][11070] Decorrelating experience for 160 frames... [2023-02-26 09:12:34,615][11071] Decorrelating experience for 64 frames... [2023-02-26 09:12:34,740][11059] Decorrelating experience for 64 frames... [2023-02-26 09:12:34,995][11058] Decorrelating experience for 128 frames... [2023-02-26 09:12:35,203][11069] Decorrelating experience for 128 frames... [2023-02-26 09:12:35,250][11061] Decorrelating experience for 160 frames... [2023-02-26 09:12:36,448][11068] Decorrelating experience for 96 frames... [2023-02-26 09:12:37,369][11063] Decorrelating experience for 128 frames... [2023-02-26 09:12:37,380][11057] Decorrelating experience for 96 frames... [2023-02-26 09:12:37,390][11060] Decorrelating experience for 64 frames... [2023-02-26 09:12:37,380][11071] Decorrelating experience for 96 frames... [2023-02-26 09:12:37,630][11058] Decorrelating experience for 160 frames... [2023-02-26 09:12:37,893][11059] Decorrelating experience for 96 frames... [2023-02-26 09:12:38,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:38,458][11069] Decorrelating experience for 160 frames... [2023-02-26 09:12:38,667][11061] Decorrelating experience for 192 frames... 
[2023-02-26 09:12:39,347][11056] Decorrelating experience for 64 frames... [2023-02-26 09:12:40,340][11063] Decorrelating experience for 160 frames... [2023-02-26 09:12:40,664][11074] Decorrelating experience for 32 frames... [2023-02-26 09:12:40,667][11062] Decorrelating experience for 96 frames... [2023-02-26 09:12:40,703][11070] Decorrelating experience for 192 frames... [2023-02-26 09:12:40,932][11068] Decorrelating experience for 128 frames... [2023-02-26 09:12:41,339][11071] Decorrelating experience for 128 frames... [2023-02-26 09:12:41,391][11060] Decorrelating experience for 96 frames... [2023-02-26 09:12:41,669][11059] Decorrelating experience for 128 frames... [2023-02-26 09:12:43,004][11056] Decorrelating experience for 96 frames... [2023-02-26 09:12:43,320][11075] Decorrelating experience for 96 frames... [2023-02-26 09:12:43,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:43,446][11069] Decorrelating experience for 192 frames... [2023-02-26 09:12:43,896][11072] Decorrelating experience for 160 frames... [2023-02-26 09:12:43,989][11074] Decorrelating experience for 64 frames... [2023-02-26 09:12:44,228][11058] Decorrelating experience for 192 frames... [2023-02-26 09:12:44,588][11057] Decorrelating experience for 128 frames... [2023-02-26 09:12:44,801][11063] Decorrelating experience for 192 frames... [2023-02-26 09:12:44,964][11068] Decorrelating experience for 160 frames... [2023-02-26 09:12:46,135][11062] Decorrelating experience for 128 frames... [2023-02-26 09:12:46,599][11060] Decorrelating experience for 128 frames... [2023-02-26 09:12:46,754][11070] Decorrelating experience for 224 frames... [2023-02-26 09:12:46,828][11073] Decorrelating experience for 96 frames... [2023-02-26 09:12:47,271][11061] Decorrelating experience for 224 frames... [2023-02-26 09:12:47,542][11059] Decorrelating experience for 160 frames... 
[2023-02-26 09:12:47,847][11056] Decorrelating experience for 128 frames... [2023-02-26 09:12:47,952][11075] Decorrelating experience for 128 frames... [2023-02-26 09:12:48,351][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:48,885][11063] Decorrelating experience for 224 frames... [2023-02-26 09:12:49,369][11069] Decorrelating experience for 224 frames... [2023-02-26 09:12:50,343][11075] Decorrelating experience for 160 frames... [2023-02-26 09:12:50,375][11058] Decorrelating experience for 224 frames... [2023-02-26 09:12:51,080][11073] Decorrelating experience for 128 frames... [2023-02-26 09:12:51,355][11062] Decorrelating experience for 160 frames... [2023-02-26 09:12:51,798][11059] Decorrelating experience for 192 frames... [2023-02-26 09:12:51,809][11060] Decorrelating experience for 160 frames... [2023-02-26 09:12:52,610][11068] Decorrelating experience for 192 frames... [2023-02-26 09:12:52,637][11073] Decorrelating experience for 160 frames... [2023-02-26 09:12:52,705][11057] Decorrelating experience for 160 frames... [2023-02-26 09:12:52,923][11059] Decorrelating experience for 224 frames... [2023-02-26 09:12:53,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 1.6. Samples: 48. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:53,450][11072] Decorrelating experience for 192 frames... [2023-02-26 09:12:55,477][11071] Decorrelating experience for 160 frames... [2023-02-26 09:12:55,572][11074] Decorrelating experience for 96 frames... [2023-02-26 09:12:55,978][11075] Decorrelating experience for 192 frames... [2023-02-26 09:12:56,796][11062] Decorrelating experience for 192 frames... [2023-02-26 09:12:57,314][11073] Decorrelating experience for 192 frames... [2023-02-26 09:12:57,577][11057] Decorrelating experience for 192 frames... 
[2023-02-26 09:12:57,618][11068] Decorrelating experience for 224 frames... [2023-02-26 09:12:58,076][11034] Signal inference workers to stop experience collection... [2023-02-26 09:12:58,131][11054] InferenceWorker_p0-w0: stopping experience collection [2023-02-26 09:12:58,347][00488] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 77.4. Samples: 2708. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2023-02-26 09:12:58,358][00488] Avg episode reward: [(0, '1.320')] [2023-02-26 09:12:59,079][11060] Decorrelating experience for 192 frames... [2023-02-26 09:12:59,125][11072] Decorrelating experience for 224 frames... [2023-02-26 09:12:59,424][11071] Decorrelating experience for 192 frames... [2023-02-26 09:12:59,599][11075] Decorrelating experience for 224 frames... [2023-02-26 09:12:59,867][11073] Decorrelating experience for 224 frames... [2023-02-26 09:13:00,519][11071] Decorrelating experience for 224 frames... [2023-02-26 09:13:00,784][11057] Decorrelating experience for 224 frames... [2023-02-26 09:13:01,046][11074] Decorrelating experience for 128 frames... [2023-02-26 09:13:01,394][11060] Decorrelating experience for 224 frames... [2023-02-26 09:13:01,488][11034] Signal inference workers to resume experience collection... [2023-02-26 09:13:01,493][11054] InferenceWorker_p0-w0: resuming experience collection [2023-02-26 09:13:03,348][00488] Fps is (10 sec: 409.6, 60 sec: 102.4, 300 sec: 102.4). Total num frames: 4096. Throughput: 0: 69.2. Samples: 2768. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2023-02-26 09:13:03,355][00488] Avg episode reward: [(0, '1.450')] [2023-02-26 09:13:05,573][11062] Decorrelating experience for 224 frames... [2023-02-26 09:13:06,851][11074] Decorrelating experience for 160 frames... [2023-02-26 09:13:08,347][00488] Fps is (10 sec: 1638.4, 60 sec: 364.1, 300 sec: 364.1). Total num frames: 16384. Throughput: 0: 120.7. Samples: 5432. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 3.0) [2023-02-26 09:13:08,353][00488] Avg episode reward: [(0, '1.901')] [2023-02-26 09:13:13,347][00488] Fps is (10 sec: 2867.4, 60 sec: 655.4, 300 sec: 655.4). Total num frames: 32768. Throughput: 0: 219.1. Samples: 9860. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:13:13,354][00488] Avg episode reward: [(0, '2.308')] [2023-02-26 09:13:14,262][11056] Decorrelating experience for 160 frames... [2023-02-26 09:13:16,024][11054] Updated weights for policy 0, policy_version 10 (0.0711) [2023-02-26 09:13:16,436][11074] Decorrelating experience for 192 frames... [2023-02-26 09:13:18,347][00488] Fps is (10 sec: 2867.2, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 45056. Throughput: 0: 268.0. Samples: 12060. Policy #0 lag: (min: 0.0, avg: 1.2, max: 4.0) [2023-02-26 09:13:18,356][00488] Avg episode reward: [(0, '3.172')] [2023-02-26 09:13:23,347][00488] Fps is (10 sec: 2867.1, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 61440. Throughput: 0: 374.2. Samples: 16840. Policy #0 lag: (min: 0.0, avg: 2.4, max: 4.0) [2023-02-26 09:13:23,355][00488] Avg episode reward: [(0, '4.021')] [2023-02-26 09:13:23,506][11056] Decorrelating experience for 192 frames... [2023-02-26 09:13:25,919][11074] Decorrelating experience for 224 frames... [2023-02-26 09:13:27,404][11054] Updated weights for policy 0, policy_version 20 (0.0014) [2023-02-26 09:13:28,347][00488] Fps is (10 sec: 3686.4, 60 sec: 1365.3, 300 sec: 1260.3). Total num frames: 81920. Throughput: 0: 511.4. Samples: 23012. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:13:28,356][00488] Avg episode reward: [(0, '4.518')] [2023-02-26 09:13:30,386][11056] Decorrelating experience for 224 frames... [2023-02-26 09:13:33,347][00488] Fps is (10 sec: 4505.7, 60 sec: 1774.9, 300 sec: 1521.4). Total num frames: 106496. Throughput: 0: 590.8. Samples: 26584. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:13:33,355][00488] Avg episode reward: [(0, '4.439')] [2023-02-26 09:13:33,363][11034] Saving new best policy, reward=4.439! [2023-02-26 09:13:36,468][11054] Updated weights for policy 0, policy_version 30 (0.0012) [2023-02-26 09:13:38,347][00488] Fps is (10 sec: 4505.6, 60 sec: 2116.3, 300 sec: 1693.0). Total num frames: 126976. Throughput: 0: 729.9. Samples: 32892. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:13:38,350][00488] Avg episode reward: [(0, '4.216')] [2023-02-26 09:13:43,352][00488] Fps is (10 sec: 3684.6, 60 sec: 2389.1, 300 sec: 1791.9). Total num frames: 143360. Throughput: 0: 784.4. Samples: 38008. Policy #0 lag: (min: 0.0, avg: 2.4, max: 6.0) [2023-02-26 09:13:43,354][00488] Avg episode reward: [(0, '4.446')] [2023-02-26 09:13:43,361][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000035_143360.pth... [2023-02-26 09:13:43,670][11034] Saving new best policy, reward=4.446! [2023-02-26 09:13:48,349][00488] Fps is (10 sec: 3276.3, 60 sec: 2662.5, 300 sec: 1879.3). Total num frames: 159744. Throughput: 0: 835.4. Samples: 40360. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:13:48,353][00488] Avg episode reward: [(0, '4.652')] [2023-02-26 09:13:48,358][11034] Saving new best policy, reward=4.652! [2023-02-26 09:13:49,983][11054] Updated weights for policy 0, policy_version 40 (0.0012) [2023-02-26 09:13:53,347][00488] Fps is (10 sec: 3278.4, 60 sec: 2935.5, 300 sec: 1957.0). Total num frames: 176128. Throughput: 0: 888.9. Samples: 45432. Policy #0 lag: (min: 0.0, avg: 2.6, max: 5.0) [2023-02-26 09:13:53,354][00488] Avg episode reward: [(0, '4.545')] [2023-02-26 09:13:58,347][00488] Fps is (10 sec: 3277.3, 60 sec: 3208.5, 300 sec: 2026.5). Total num frames: 192512. Throughput: 0: 906.6. Samples: 50656. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:13:58,350][00488] Avg episode reward: [(0, '4.292')] [2023-02-26 09:14:00,175][11054] Updated weights for policy 0, policy_version 50 (0.0018) [2023-02-26 09:14:03,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3618.2, 300 sec: 2211.9). Total num frames: 221184. Throughput: 0: 940.5. Samples: 54384. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:14:03,356][00488] Avg episode reward: [(0, '4.568')] [2023-02-26 09:14:08,350][00488] Fps is (10 sec: 4913.9, 60 sec: 3754.5, 300 sec: 2301.5). Total num frames: 241664. Throughput: 0: 1001.2. Samples: 61896. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:14:08,353][00488] Avg episode reward: [(0, '4.419')] [2023-02-26 09:14:09,025][11054] Updated weights for policy 0, policy_version 60 (0.0012) [2023-02-26 09:14:13,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 2345.9). Total num frames: 258048. Throughput: 0: 988.2. Samples: 67480. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:14:13,353][00488] Avg episode reward: [(0, '4.390')] [2023-02-26 09:14:18,347][00488] Fps is (10 sec: 3277.7, 60 sec: 3822.9, 300 sec: 2386.4). Total num frames: 274432. Throughput: 0: 961.9. Samples: 69868. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:14:18,351][00488] Avg episode reward: [(0, '4.413')] [2023-02-26 09:14:20,401][11054] Updated weights for policy 0, policy_version 70 (0.0016) [2023-02-26 09:14:23,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 2423.5). Total num frames: 290816. Throughput: 0: 935.6. Samples: 74996. Policy #0 lag: (min: 0.0, avg: 2.5, max: 4.0) [2023-02-26 09:14:23,352][00488] Avg episode reward: [(0, '4.364')] [2023-02-26 09:14:28,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 2457.6). Total num frames: 307200. Throughput: 0: 938.8. Samples: 80248. 
Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:14:28,361][00488] Avg episode reward: [(0, '4.440')] [2023-02-26 09:14:32,639][11054] Updated weights for policy 0, policy_version 80 (0.0012) [2023-02-26 09:14:33,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 2552.1). Total num frames: 331776. Throughput: 0: 942.6. Samples: 82776. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:14:33,350][00488] Avg episode reward: [(0, '4.393')] [2023-02-26 09:14:38,347][00488] Fps is (10 sec: 4915.1, 60 sec: 3822.9, 300 sec: 2639.7). Total num frames: 356352. Throughput: 0: 988.3. Samples: 89904. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:14:38,349][00488] Avg episode reward: [(0, '4.457')] [2023-02-26 09:14:40,501][11054] Updated weights for policy 0, policy_version 90 (0.0012) [2023-02-26 09:14:43,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3891.5, 300 sec: 2691.7). Total num frames: 376832. Throughput: 0: 1030.6. Samples: 97032. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:14:43,358][00488] Avg episode reward: [(0, '4.601')] [2023-02-26 09:14:48,350][00488] Fps is (10 sec: 3685.5, 60 sec: 3891.1, 300 sec: 2711.8). Total num frames: 393216. Throughput: 0: 1000.3. Samples: 99400. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:14:48,364][00488] Avg episode reward: [(0, '4.585')] [2023-02-26 09:14:52,833][11054] Updated weights for policy 0, policy_version 100 (0.0014) [2023-02-26 09:14:53,355][00488] Fps is (10 sec: 3274.2, 60 sec: 3890.7, 300 sec: 2730.5). Total num frames: 409600. Throughput: 0: 945.0. Samples: 104424. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:14:53,358][00488] Avg episode reward: [(0, '4.579')] [2023-02-26 09:14:58,347][00488] Fps is (10 sec: 3277.6, 60 sec: 3891.2, 300 sec: 2748.3). Total num frames: 425984. Throughput: 0: 933.0. Samples: 109464. 
Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:14:58,349][00488] Avg episode reward: [(0, '4.640')] [2023-02-26 09:15:03,347][00488] Fps is (10 sec: 3279.4, 60 sec: 3686.4, 300 sec: 2764.8). Total num frames: 442368. Throughput: 0: 935.9. Samples: 111984. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:15:03,353][00488] Avg episode reward: [(0, '4.693')] [2023-02-26 09:15:03,366][11034] Saving new best policy, reward=4.693! [2023-02-26 09:15:05,612][11054] Updated weights for policy 0, policy_version 110 (0.0018) [2023-02-26 09:15:08,349][00488] Fps is (10 sec: 2866.6, 60 sec: 3549.9, 300 sec: 2755.5). Total num frames: 454656. Throughput: 0: 904.6. Samples: 115704. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:15:08,351][00488] Avg episode reward: [(0, '4.566')] [2023-02-26 09:15:13,347][00488] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 2770.8). Total num frames: 471040. Throughput: 0: 890.6. Samples: 120324. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:15:13,359][00488] Avg episode reward: [(0, '4.521')] [2023-02-26 09:15:18,352][00488] Fps is (10 sec: 3275.9, 60 sec: 3549.6, 300 sec: 2785.2). Total num frames: 487424. Throughput: 0: 890.8. Samples: 122868. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:15:18,355][00488] Avg episode reward: [(0, '4.527')] [2023-02-26 09:15:19,316][11054] Updated weights for policy 0, policy_version 120 (0.0015) [2023-02-26 09:15:23,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 2798.9). Total num frames: 503808. Throughput: 0: 825.0. Samples: 127028. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:15:23,353][00488] Avg episode reward: [(0, '4.692')] [2023-02-26 09:15:28,347][00488] Fps is (10 sec: 2458.7, 60 sec: 3413.3, 300 sec: 2767.6). Total num frames: 512000. Throughput: 0: 748.7. Samples: 130724. 
Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:15:28,349][00488] Avg episode reward: [(0, '4.669')] [2023-02-26 09:15:33,347][00488] Fps is (10 sec: 2457.6, 60 sec: 3276.8, 300 sec: 2781.0). Total num frames: 528384. Throughput: 0: 745.8. Samples: 132960. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:15:33,352][00488] Avg episode reward: [(0, '4.588')] [2023-02-26 09:15:33,495][11054] Updated weights for policy 0, policy_version 130 (0.0020) [2023-02-26 09:15:38,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 2814.7). Total num frames: 548864. Throughput: 0: 750.5. Samples: 138192. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:15:38,350][00488] Avg episode reward: [(0, '4.508')] [2023-02-26 09:15:43,348][00488] Fps is (10 sec: 3686.2, 60 sec: 3140.2, 300 sec: 2826.2). Total num frames: 565248. Throughput: 0: 752.9. Samples: 143344. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:15:43,351][00488] Avg episode reward: [(0, '4.436')] [2023-02-26 09:15:43,366][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000138_565248.pth... [2023-02-26 09:15:45,970][11054] Updated weights for policy 0, policy_version 140 (0.0017) [2023-02-26 09:15:48,347][00488] Fps is (10 sec: 3686.5, 60 sec: 3208.7, 300 sec: 2857.2). Total num frames: 585728. Throughput: 0: 758.8. Samples: 146128. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:15:48,353][00488] Avg episode reward: [(0, '4.606')] [2023-02-26 09:15:53,347][00488] Fps is (10 sec: 4505.8, 60 sec: 3345.5, 300 sec: 2906.2). Total num frames: 610304. Throughput: 0: 846.4. Samples: 153792. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:15:53,350][00488] Avg episode reward: [(0, '4.934')] [2023-02-26 09:15:53,363][11034] Saving new best policy, reward=4.934! 
[2023-02-26 09:15:53,713][11054] Updated weights for policy 0, policy_version 150 (0.0014) [2023-02-26 09:15:58,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3481.6, 300 sec: 2952.9). Total num frames: 634880. Throughput: 0: 891.3. Samples: 160432. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:15:58,354][00488] Avg episode reward: [(0, '5.012')] [2023-02-26 09:15:58,362][11034] Saving new best policy, reward=5.012! [2023-02-26 09:16:03,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2960.3). Total num frames: 651264. Throughput: 0: 891.6. Samples: 162984. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:16:03,350][00488] Avg episode reward: [(0, '5.013')] [2023-02-26 09:16:03,361][11034] Saving new best policy, reward=5.013! [2023-02-26 09:16:04,526][11054] Updated weights for policy 0, policy_version 160 (0.0016) [2023-02-26 09:16:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 2967.3). Total num frames: 667648. Throughput: 0: 911.7. Samples: 168056. Policy #0 lag: (min: 0.0, avg: 1.9, max: 6.0) [2023-02-26 09:16:08,354][00488] Avg episode reward: [(0, '4.915')] [2023-02-26 09:16:13,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 2974.1). Total num frames: 684032. Throughput: 0: 945.5. Samples: 173272. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:16:13,362][00488] Avg episode reward: [(0, '4.892')] [2023-02-26 09:16:16,664][11054] Updated weights for policy 0, policy_version 170 (0.0012) [2023-02-26 09:16:18,347][00488] Fps is (10 sec: 3276.7, 60 sec: 3550.1, 300 sec: 2980.5). Total num frames: 700416. Throughput: 0: 952.9. Samples: 175840. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:16:18,355][00488] Avg episode reward: [(0, '5.114')] [2023-02-26 09:16:18,359][11034] Saving new best policy, reward=5.114! [2023-02-26 09:16:23,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3020.8). Total num frames: 724992. Throughput: 0: 973.7. Samples: 182008. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:16:23,349][00488] Avg episode reward: [(0, '4.802')] [2023-02-26 09:16:25,413][11054] Updated weights for policy 0, policy_version 180 (0.0012) [2023-02-26 09:16:28,347][00488] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3059.5). Total num frames: 749568. Throughput: 0: 1031.3. Samples: 189752. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:16:28,353][00488] Avg episode reward: [(0, '4.736')] [2023-02-26 09:16:33,347][00488] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3080.2). Total num frames: 770048. Throughput: 0: 1040.4. Samples: 192944. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:16:33,350][00488] Avg episode reward: [(0, '4.781')] [2023-02-26 09:16:36,170][11054] Updated weights for policy 0, policy_version 190 (0.0025) [2023-02-26 09:16:38,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3084.1). Total num frames: 786432. Throughput: 0: 984.7. Samples: 198104. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:16:38,350][00488] Avg episode reward: [(0, '4.865')] [2023-02-26 09:16:43,348][00488] Fps is (10 sec: 3685.9, 60 sec: 4027.7, 300 sec: 3103.5). Total num frames: 806912. Throughput: 0: 954.0. Samples: 203364. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:16:43,358][00488] Avg episode reward: [(0, '4.846')] [2023-02-26 09:16:47,025][11054] Updated weights for policy 0, policy_version 200 (0.0016) [2023-02-26 09:16:48,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3106.8). Total num frames: 823296. Throughput: 0: 955.4. Samples: 205976. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:16:48,352][00488] Avg episode reward: [(0, '4.985')] [2023-02-26 09:16:53,347][00488] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3109.9). Total num frames: 839680. Throughput: 0: 957.1. Samples: 211124. 
Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:16:53,349][00488] Avg episode reward: [(0, '5.108')] [2023-02-26 09:16:57,482][11054] Updated weights for policy 0, policy_version 210 (0.0014) [2023-02-26 09:16:58,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3142.8). Total num frames: 864256. Throughput: 0: 1003.2. Samples: 218416. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:16:58,355][00488] Avg episode reward: [(0, '5.153')] [2023-02-26 09:16:58,359][11034] Saving new best policy, reward=5.153! [2023-02-26 09:17:03,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3174.4). Total num frames: 888832. Throughput: 0: 1030.9. Samples: 222232. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:17:03,349][00488] Avg episode reward: [(0, '5.195')] [2023-02-26 09:17:03,360][11034] Saving new best policy, reward=5.195! [2023-02-26 09:17:05,950][11054] Updated weights for policy 0, policy_version 220 (0.0018) [2023-02-26 09:17:08,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3176.2). Total num frames: 905216. Throughput: 0: 1026.1. Samples: 228184. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:17:08,357][00488] Avg episode reward: [(0, '5.353')] [2023-02-26 09:17:08,361][11034] Saving new best policy, reward=5.353! [2023-02-26 09:17:13,348][00488] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 3192.1). Total num frames: 925696. Throughput: 0: 969.4. Samples: 233376. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:17:13,355][00488] Avg episode reward: [(0, '5.369')] [2023-02-26 09:17:13,372][11034] Saving new best policy, reward=5.369! [2023-02-26 09:17:17,820][11054] Updated weights for policy 0, policy_version 230 (0.0012) [2023-02-26 09:17:18,353][00488] Fps is (10 sec: 3684.3, 60 sec: 4027.4, 300 sec: 3193.4). Total num frames: 942080. Throughput: 0: 956.9. Samples: 236008. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:17:18,357][00488] Avg episode reward: [(0, '5.713')] [2023-02-26 09:17:18,363][11034] Saving new best policy, reward=5.713! [2023-02-26 09:17:23,347][00488] Fps is (10 sec: 3276.9, 60 sec: 3891.2, 300 sec: 3249.0). Total num frames: 958464. Throughput: 0: 956.4. Samples: 241140. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:17:23,355][00488] Avg episode reward: [(0, '5.408')] [2023-02-26 09:17:28,347][00488] Fps is (10 sec: 3688.5, 60 sec: 3822.9, 300 sec: 3318.5). Total num frames: 978944. Throughput: 0: 967.6. Samples: 246904. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:17:28,353][00488] Avg episode reward: [(0, '5.266')] [2023-02-26 09:17:29,008][11054] Updated weights for policy 0, policy_version 240 (0.0011) [2023-02-26 09:17:33,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3401.8). Total num frames: 1003520. Throughput: 0: 996.4. Samples: 250816. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:17:33,353][00488] Avg episode reward: [(0, '5.103')] [2023-02-26 09:17:36,465][11054] Updated weights for policy 0, policy_version 250 (0.0012) [2023-02-26 09:17:38,356][00488] Fps is (10 sec: 4910.9, 60 sec: 4027.1, 300 sec: 3485.0). Total num frames: 1028096. Throughput: 0: 1055.2. Samples: 258616. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:17:38,361][00488] Avg episode reward: [(0, '4.999')] [2023-02-26 09:17:43,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 3540.7). Total num frames: 1044480. Throughput: 0: 1018.1. Samples: 264232. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:17:43,352][00488] Avg episode reward: [(0, '5.011')] [2023-02-26 09:17:43,366][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000255_1044480.pth... 
[2023-02-26 09:17:43,716][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000035_143360.pth [2023-02-26 09:17:47,925][11054] Updated weights for policy 0, policy_version 260 (0.0013) [2023-02-26 09:17:48,347][00488] Fps is (10 sec: 3689.7, 60 sec: 4027.7, 300 sec: 3610.0). Total num frames: 1064960. Throughput: 0: 989.1. Samples: 266740. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:17:48,351][00488] Avg episode reward: [(0, '5.153')] [2023-02-26 09:17:53,351][00488] Fps is (10 sec: 3685.0, 60 sec: 4027.5, 300 sec: 3665.5). Total num frames: 1081344. Throughput: 0: 971.5. Samples: 271904. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:17:53,354][00488] Avg episode reward: [(0, '5.351')] [2023-02-26 09:17:58,347][00488] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3693.4). Total num frames: 1093632. Throughput: 0: 953.5. Samples: 276284. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:17:58,350][00488] Avg episode reward: [(0, '5.321')] [2023-02-26 09:18:01,541][11054] Updated weights for policy 0, policy_version 270 (0.0013) [2023-02-26 09:18:03,351][00488] Fps is (10 sec: 2457.6, 60 sec: 3617.9, 300 sec: 3693.3). Total num frames: 1105920. Throughput: 0: 940.7. Samples: 278336. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:18:03,355][00488] Avg episode reward: [(0, '5.219')] [2023-02-26 09:18:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 1126400. Throughput: 0: 928.9. Samples: 282940. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:18:08,351][00488] Avg episode reward: [(0, '5.389')] [2023-02-26 09:18:13,347][00488] Fps is (10 sec: 3687.8, 60 sec: 3618.2, 300 sec: 3721.1). Total num frames: 1142784. Throughput: 0: 919.9. Samples: 288300. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 5.0) [2023-02-26 09:18:13,350][00488] Avg episode reward: [(0, '5.347')] [2023-02-26 09:18:13,535][11054] Updated weights for policy 0, policy_version 280 (0.0013) [2023-02-26 09:18:18,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3618.5, 300 sec: 3721.1). Total num frames: 1159168. Throughput: 0: 881.3. Samples: 290476. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:18:18,357][00488] Avg episode reward: [(0, '5.546')] [2023-02-26 09:18:23,347][00488] Fps is (10 sec: 2457.6, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 1167360. Throughput: 0: 795.6. Samples: 294412. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:18:23,350][00488] Avg episode reward: [(0, '5.759')] [2023-02-26 09:18:23,359][11034] Saving new best policy, reward=5.759! [2023-02-26 09:18:28,227][11054] Updated weights for policy 0, policy_version 290 (0.0013) [2023-02-26 09:18:28,347][00488] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3665.6). Total num frames: 1187840. Throughput: 0: 785.0. Samples: 299556. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:18:28,354][00488] Avg episode reward: [(0, '6.104')] [2023-02-26 09:18:28,360][11034] Saving new best policy, reward=6.104! [2023-02-26 09:18:33,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3651.7). Total num frames: 1204224. Throughput: 0: 784.4. Samples: 302040. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:18:33,353][00488] Avg episode reward: [(0, '6.076')] [2023-02-26 09:18:38,347][00488] Fps is (10 sec: 3686.3, 60 sec: 3277.3, 300 sec: 3665.6). Total num frames: 1224704. Throughput: 0: 789.7. Samples: 307436. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:18:38,354][00488] Avg episode reward: [(0, '5.930')] [2023-02-26 09:18:38,954][11054] Updated weights for policy 0, policy_version 300 (0.0013) [2023-02-26 09:18:43,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3413.3, 300 sec: 3693.4). 
Total num frames: 1249280. Throughput: 0: 837.4. Samples: 313968. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:18:43,354][00488] Avg episode reward: [(0, '6.034')] [2023-02-26 09:18:47,635][11054] Updated weights for policy 0, policy_version 310 (0.0012) [2023-02-26 09:18:48,347][00488] Fps is (10 sec: 4915.3, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 1273856. Throughput: 0: 878.0. Samples: 317844. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:18:48,349][00488] Avg episode reward: [(0, '6.359')] [2023-02-26 09:18:48,353][11034] Saving new best policy, reward=6.359! [2023-02-26 09:18:53,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3550.1, 300 sec: 3735.0). Total num frames: 1294336. Throughput: 0: 927.3. Samples: 324668. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:18:53,351][00488] Avg episode reward: [(0, '6.405')] [2023-02-26 09:18:53,376][11034] Saving new best policy, reward=6.405! [2023-02-26 09:18:57,770][11054] Updated weights for policy 0, policy_version 320 (0.0012) [2023-02-26 09:18:58,348][00488] Fps is (10 sec: 3686.1, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 1310720. Throughput: 0: 923.9. Samples: 329876. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) [2023-02-26 09:18:58,352][00488] Avg episode reward: [(0, '6.826')] [2023-02-26 09:18:58,361][11034] Saving new best policy, reward=6.826! [2023-02-26 09:19:03,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3754.9, 300 sec: 3693.4). Total num frames: 1331200. Throughput: 0: 935.2. Samples: 332560. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:19:03,350][00488] Avg episode reward: [(0, '6.745')] [2023-02-26 09:19:08,347][00488] Fps is (10 sec: 3686.7, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1347584. Throughput: 0: 965.4. Samples: 337856. 
Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) [2023-02-26 09:19:08,352][00488] Avg episode reward: [(0, '6.618')] [2023-02-26 09:19:09,529][11054] Updated weights for policy 0, policy_version 330 (0.0021) [2023-02-26 09:19:13,348][00488] Fps is (10 sec: 3276.5, 60 sec: 3686.3, 300 sec: 3693.3). Total num frames: 1363968. Throughput: 0: 969.6. Samples: 343188. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:19:13,352][00488] Avg episode reward: [(0, '6.598')] [2023-02-26 09:19:18,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 1388544. Throughput: 0: 992.0. Samples: 346680. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) [2023-02-26 09:19:18,354][00488] Avg episode reward: [(0, '6.780')] [2023-02-26 09:19:18,726][11054] Updated weights for policy 0, policy_version 340 (0.0012) [2023-02-26 09:19:23,347][00488] Fps is (10 sec: 4915.6, 60 sec: 4096.0, 300 sec: 3748.9). Total num frames: 1413120. Throughput: 0: 1044.5. Samples: 354440. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:19:23,355][00488] Avg episode reward: [(0, '6.743')] [2023-02-26 09:19:28,347][00488] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3721.1). Total num frames: 1429504. Throughput: 0: 1038.6. Samples: 360704. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:19:28,352][00488] Avg episode reward: [(0, '6.976')] [2023-02-26 09:19:28,396][11034] Saving new best policy, reward=6.976! [2023-02-26 09:19:28,409][11054] Updated weights for policy 0, policy_version 350 (0.0012) [2023-02-26 09:19:33,349][00488] Fps is (10 sec: 3685.8, 60 sec: 4095.9, 300 sec: 3707.2). Total num frames: 1449984. Throughput: 0: 1008.7. Samples: 363236. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:19:33,351][00488] Avg episode reward: [(0, '6.941')] [2023-02-26 09:19:38,347][00488] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3707.2). Total num frames: 1470464. Throughput: 0: 976.2. Samples: 368596. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 5.0) [2023-02-26 09:19:38,351][00488] Avg episode reward: [(0, '7.018')] [2023-02-26 09:19:38,361][11034] Saving new best policy, reward=7.018! [2023-02-26 09:19:39,339][11054] Updated weights for policy 0, policy_version 360 (0.0011) [2023-02-26 09:19:43,347][00488] Fps is (10 sec: 3686.9, 60 sec: 3959.5, 300 sec: 3707.3). Total num frames: 1486848. Throughput: 0: 977.2. Samples: 373848. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:19:43,354][00488] Avg episode reward: [(0, '7.284')] [2023-02-26 09:19:43,369][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000363_1486848.pth... [2023-02-26 09:19:43,742][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000138_565248.pth [2023-02-26 09:19:43,795][11034] Saving new best policy, reward=7.284! [2023-02-26 09:19:48,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3721.2). Total num frames: 1507328. Throughput: 0: 973.3. Samples: 376360. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:19:48,355][00488] Avg episode reward: [(0, '7.324')] [2023-02-26 09:19:48,360][11034] Saving new best policy, reward=7.324! [2023-02-26 09:19:50,370][11054] Updated weights for policy 0, policy_version 370 (0.0019) [2023-02-26 09:19:53,347][00488] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 1527808. Throughput: 0: 1010.0. Samples: 383308. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:19:53,349][00488] Avg episode reward: [(0, '7.627')] [2023-02-26 09:19:53,360][11034] Saving new best policy, reward=7.627! [2023-02-26 09:19:57,857][11054] Updated weights for policy 0, policy_version 380 (0.0012) [2023-02-26 09:19:58,347][00488] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3776.7). Total num frames: 1556480. Throughput: 0: 1061.3. Samples: 390944. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:19:58,354][00488] Avg episode reward: [(0, '8.139')] [2023-02-26 09:19:58,360][11034] Saving new best policy, reward=8.139! [2023-02-26 09:20:03,347][00488] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3790.6). Total num frames: 1572864. Throughput: 0: 1042.0. Samples: 393568. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:20:03,350][00488] Avg episode reward: [(0, '8.198')] [2023-02-26 09:20:03,365][11034] Saving new best policy, reward=8.198! [2023-02-26 09:20:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 1589248. Throughput: 0: 982.1. Samples: 398636. Policy #0 lag: (min: 0.0, avg: 1.7, max: 5.0) [2023-02-26 09:20:08,352][00488] Avg episode reward: [(0, '8.137')] [2023-02-26 09:20:10,561][11054] Updated weights for policy 0, policy_version 390 (0.0012) [2023-02-26 09:20:13,349][00488] Fps is (10 sec: 3276.1, 60 sec: 4027.7, 300 sec: 3790.6). Total num frames: 1605632. Throughput: 0: 957.6. Samples: 403796. Policy #0 lag: (min: 0.0, avg: 2.4, max: 4.0) [2023-02-26 09:20:13,352][00488] Avg episode reward: [(0, '8.684')] [2023-02-26 09:20:13,371][11034] Saving new best policy, reward=8.684! [2023-02-26 09:20:18,351][00488] Fps is (10 sec: 3275.6, 60 sec: 3891.0, 300 sec: 3790.5). Total num frames: 1622016. Throughput: 0: 957.1. Samples: 406308. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:20:18,358][00488] Avg episode reward: [(0, '8.753')] [2023-02-26 09:20:18,361][11034] Saving new best policy, reward=8.753! [2023-02-26 09:20:22,292][11054] Updated weights for policy 0, policy_version 400 (0.0013) [2023-02-26 09:20:23,347][00488] Fps is (10 sec: 3277.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 1638400. Throughput: 0: 951.0. Samples: 411392. 
Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:20:23,353][00488] Avg episode reward: [(0, '9.463')] [2023-02-26 09:20:23,365][11034] Saving new best policy, reward=9.463! [2023-02-26 09:20:28,347][00488] Fps is (10 sec: 4507.3, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1667072. Throughput: 0: 999.2. Samples: 418812. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:20:28,352][00488] Avg episode reward: [(0, '9.675')] [2023-02-26 09:20:28,357][11034] Saving new best policy, reward=9.675! [2023-02-26 09:20:30,517][11054] Updated weights for policy 0, policy_version 410 (0.0012) [2023-02-26 09:20:33,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.6, 300 sec: 3860.0). Total num frames: 1687552. Throughput: 0: 1028.5. Samples: 422644. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:20:33,351][00488] Avg episode reward: [(0, '10.220')] [2023-02-26 09:20:33,405][11034] Saving new best policy, reward=10.220! [2023-02-26 09:20:38,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.9). Total num frames: 1708032. Throughput: 0: 999.4. Samples: 428280. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:20:38,353][00488] Avg episode reward: [(0, '10.051')] [2023-02-26 09:20:41,333][11054] Updated weights for policy 0, policy_version 420 (0.0012) [2023-02-26 09:20:43,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 1724416. Throughput: 0: 947.1. Samples: 433564. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:20:43,357][00488] Avg episode reward: [(0, '10.170')] [2023-02-26 09:20:48,348][00488] Fps is (10 sec: 3276.5, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 1740800. Throughput: 0: 945.1. Samples: 436100. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:20:48,350][00488] Avg episode reward: [(0, '10.184')] [2023-02-26 09:20:53,352][00488] Fps is (10 sec: 2865.8, 60 sec: 3754.4, 300 sec: 3790.5). Total num frames: 1753088. 
Throughput: 0: 926.1. Samples: 440316. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:20:53,355][00488] Avg episode reward: [(0, '10.118')] [2023-02-26 09:20:55,560][11054] Updated weights for policy 0, policy_version 430 (0.0021) [2023-02-26 09:20:58,348][00488] Fps is (10 sec: 2867.2, 60 sec: 3549.8, 300 sec: 3790.5). Total num frames: 1769472. Throughput: 0: 899.0. Samples: 444252. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:20:58,350][00488] Avg episode reward: [(0, '10.125')] [2023-02-26 09:21:03,347][00488] Fps is (10 sec: 3278.4, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 1785856. Throughput: 0: 897.4. Samples: 446688. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:21:03,352][00488] Avg episode reward: [(0, '11.064')] [2023-02-26 09:21:03,374][11034] Saving new best policy, reward=11.064! [2023-02-26 09:21:07,912][11054] Updated weights for policy 0, policy_version 440 (0.0021) [2023-02-26 09:21:08,347][00488] Fps is (10 sec: 3277.1, 60 sec: 3549.9, 300 sec: 3790.5). Total num frames: 1802240. Throughput: 0: 901.6. Samples: 451964. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:21:08,350][00488] Avg episode reward: [(0, '11.027')] [2023-02-26 09:21:13,348][00488] Fps is (10 sec: 2867.0, 60 sec: 3481.7, 300 sec: 3776.6). Total num frames: 1814528. Throughput: 0: 833.0. Samples: 456296. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:21:13,353][00488] Avg episode reward: [(0, '11.010')] [2023-02-26 09:21:18,348][00488] Fps is (10 sec: 2866.8, 60 sec: 3481.7, 300 sec: 3748.9). Total num frames: 1830912. Throughput: 0: 791.9. Samples: 458280. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:21:18,351][00488] Avg episode reward: [(0, '11.171')] [2023-02-26 09:21:18,358][11034] Saving new best policy, reward=11.171! 
[2023-02-26 09:21:21,393][11054] Updated weights for policy 0, policy_version 450 (0.0015) [2023-02-26 09:21:23,347][00488] Fps is (10 sec: 3277.0, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 1847296. Throughput: 0: 780.2. Samples: 463388. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:21:23,350][00488] Avg episode reward: [(0, '11.178')] [2023-02-26 09:21:23,370][11034] Saving new best policy, reward=11.178! [2023-02-26 09:21:28,347][00488] Fps is (10 sec: 3277.2, 60 sec: 3276.8, 300 sec: 3707.2). Total num frames: 1863680. Throughput: 0: 774.6. Samples: 468420. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:21:28,350][00488] Avg episode reward: [(0, '10.580')] [2023-02-26 09:21:33,354][11054] Updated weights for policy 0, policy_version 460 (0.0012) [2023-02-26 09:21:33,347][00488] Fps is (10 sec: 3276.7, 60 sec: 3208.5, 300 sec: 3707.2). Total num frames: 1880064. Throughput: 0: 777.7. Samples: 471096. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:21:33,356][00488] Avg episode reward: [(0, '11.058')] [2023-02-26 09:21:38,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3276.8, 300 sec: 3721.1). Total num frames: 1904640. Throughput: 0: 822.1. Samples: 477308. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:21:38,349][00488] Avg episode reward: [(0, '11.886')] [2023-02-26 09:21:38,352][11034] Saving new best policy, reward=11.886! [2023-02-26 09:21:41,730][11054] Updated weights for policy 0, policy_version 470 (0.0017) [2023-02-26 09:21:43,347][00488] Fps is (10 sec: 4915.3, 60 sec: 3413.3, 300 sec: 3748.9). Total num frames: 1929216. Throughput: 0: 907.7. Samples: 485096. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:21:43,350][00488] Avg episode reward: [(0, '13.137')] [2023-02-26 09:21:43,373][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000472_1933312.pth... 
[2023-02-26 09:21:43,583][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000255_1044480.pth [2023-02-26 09:21:43,607][11034] Saving new best policy, reward=13.137! [2023-02-26 09:21:48,347][00488] Fps is (10 sec: 4505.7, 60 sec: 3481.6, 300 sec: 3762.8). Total num frames: 1949696. Throughput: 0: 923.4. Samples: 488240. Policy #0 lag: (min: 0.0, avg: 1.7, max: 5.0) [2023-02-26 09:21:48,358][00488] Avg episode reward: [(0, '12.644')] [2023-02-26 09:21:52,459][11054] Updated weights for policy 0, policy_version 480 (0.0015) [2023-02-26 09:21:53,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3618.4, 300 sec: 3748.9). Total num frames: 1970176. Throughput: 0: 919.3. Samples: 493332. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:21:53,350][00488] Avg episode reward: [(0, '13.244')] [2023-02-26 09:21:53,377][11034] Saving new best policy, reward=13.244! [2023-02-26 09:21:58,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3721.1). Total num frames: 1986560. Throughput: 0: 934.1. Samples: 498332. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:21:58,350][00488] Avg episode reward: [(0, '12.654')] [2023-02-26 09:22:03,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 2002944. Throughput: 0: 948.4. Samples: 500956. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:22:03,350][00488] Avg episode reward: [(0, '11.621')] [2023-02-26 09:22:04,173][11054] Updated weights for policy 0, policy_version 490 (0.0017) [2023-02-26 09:22:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 2019328. Throughput: 0: 952.3. Samples: 506240. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:22:08,351][00488] Avg episode reward: [(0, '12.515')] [2023-02-26 09:22:13,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3735.1). Total num frames: 2043904. Throughput: 0: 1000.7. Samples: 513452. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:22:13,354][00488] Avg episode reward: [(0, '11.686')] [2023-02-26 09:22:13,577][11054] Updated weights for policy 0, policy_version 500 (0.0013) [2023-02-26 09:22:18,347][00488] Fps is (10 sec: 5324.8, 60 sec: 4027.8, 300 sec: 3776.7). Total num frames: 2072576. Throughput: 0: 1029.3. Samples: 517416. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:22:18,350][00488] Avg episode reward: [(0, '12.568')] [2023-02-26 09:22:23,046][11054] Updated weights for policy 0, policy_version 510 (0.0011) [2023-02-26 09:22:23,347][00488] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3762.8). Total num frames: 2088960. Throughput: 0: 1033.3. Samples: 523808. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:22:23,350][00488] Avg episode reward: [(0, '12.314')] [2023-02-26 09:22:28,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3735.0). Total num frames: 2105344. Throughput: 0: 975.7. Samples: 529004. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:22:28,355][00488] Avg episode reward: [(0, '12.552')] [2023-02-26 09:22:33,347][00488] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3721.2). Total num frames: 2125824. Throughput: 0: 966.3. Samples: 531724. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:22:33,352][00488] Avg episode reward: [(0, '13.479')] [2023-02-26 09:22:33,370][11034] Saving new best policy, reward=13.479! [2023-02-26 09:22:33,903][11054] Updated weights for policy 0, policy_version 520 (0.0015) [2023-02-26 09:22:38,347][00488] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3735.0). Total num frames: 2146304. Throughput: 0: 972.1. Samples: 537076. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:22:38,355][00488] Avg episode reward: [(0, '13.210')] [2023-02-26 09:22:43,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 2162688. Throughput: 0: 988.6. Samples: 542820. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:22:43,356][00488] Avg episode reward: [(0, '14.117')] [2023-02-26 09:22:43,367][11034] Saving new best policy, reward=14.117! [2023-02-26 09:22:44,912][11054] Updated weights for policy 0, policy_version 530 (0.0015) [2023-02-26 09:22:48,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 2187264. Throughput: 0: 1017.2. Samples: 546732. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:22:48,354][00488] Avg episode reward: [(0, '14.607')] [2023-02-26 09:22:48,361][11034] Saving new best policy, reward=14.607! [2023-02-26 09:22:52,212][11054] Updated weights for policy 0, policy_version 540 (0.0012) [2023-02-26 09:22:53,347][00488] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 2211840. Throughput: 0: 1073.2. Samples: 554532. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:22:53,355][00488] Avg episode reward: [(0, '14.338')] [2023-02-26 09:22:58,348][00488] Fps is (10 sec: 4095.6, 60 sec: 4027.7, 300 sec: 3804.5). Total num frames: 2228224. Throughput: 0: 1030.0. Samples: 559804. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:22:58,351][00488] Avg episode reward: [(0, '14.613')] [2023-02-26 09:22:58,355][11034] Saving new best policy, reward=14.613! [2023-02-26 09:23:03,350][00488] Fps is (10 sec: 3275.8, 60 sec: 4027.5, 300 sec: 3790.5). Total num frames: 2244608. Throughput: 0: 995.3. Samples: 562208. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:23:03,353][00488] Avg episode reward: [(0, '14.539')] [2023-02-26 09:23:04,747][11054] Updated weights for policy 0, policy_version 550 (0.0013) [2023-02-26 09:23:08,347][00488] Fps is (10 sec: 3277.1, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 2260992. Throughput: 0: 971.6. Samples: 567528. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:23:08,355][00488] Avg episode reward: [(0, '14.937')] [2023-02-26 09:23:08,358][11034] Saving new best policy, reward=14.937! [2023-02-26 09:23:13,347][00488] Fps is (10 sec: 3687.4, 60 sec: 3959.4, 300 sec: 3804.4). Total num frames: 2281472. Throughput: 0: 969.3. Samples: 572624. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:23:13,351][00488] Avg episode reward: [(0, '15.112')] [2023-02-26 09:23:13,359][11034] Saving new best policy, reward=15.112! [2023-02-26 09:23:16,432][11054] Updated weights for policy 0, policy_version 560 (0.0016) [2023-02-26 09:23:18,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2301952. Throughput: 0: 966.8. Samples: 575228. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:23:18,350][00488] Avg episode reward: [(0, '14.842')] [2023-02-26 09:23:23,347][00488] Fps is (10 sec: 4505.7, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 2326528. Throughput: 0: 1015.1. Samples: 582756. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:23:23,352][00488] Avg episode reward: [(0, '16.638')] [2023-02-26 09:23:23,362][11034] Saving new best policy, reward=16.638! [2023-02-26 09:23:24,591][11054] Updated weights for policy 0, policy_version 570 (0.0016) [2023-02-26 09:23:28,347][00488] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 2347008. Throughput: 0: 1041.6. Samples: 589692. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:23:28,354][00488] Avg episode reward: [(0, '18.417')] [2023-02-26 09:23:28,357][11034] Saving new best policy, reward=18.417! [2023-02-26 09:23:33,347][00488] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 2367488. Throughput: 0: 1010.7. Samples: 592212. 
Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:23:33,349][00488] Avg episode reward: [(0, '18.137')] [2023-02-26 09:23:35,444][11054] Updated weights for policy 0, policy_version 580 (0.0012) [2023-02-26 09:23:38,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 2383872. Throughput: 0: 954.6. Samples: 597488. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:23:38,351][00488] Avg episode reward: [(0, '18.572')] [2023-02-26 09:23:38,358][11034] Saving new best policy, reward=18.572! [2023-02-26 09:23:43,349][00488] Fps is (10 sec: 3685.7, 60 sec: 4027.6, 300 sec: 3832.2). Total num frames: 2404352. Throughput: 0: 956.5. Samples: 602848. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:23:43,353][00488] Avg episode reward: [(0, '19.325')] [2023-02-26 09:23:43,365][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000587_2404352.pth... [2023-02-26 09:23:43,684][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000363_1486848.pth [2023-02-26 09:23:43,740][11034] Saving new best policy, reward=19.325! [2023-02-26 09:23:47,550][11054] Updated weights for policy 0, policy_version 590 (0.0012) [2023-02-26 09:23:48,347][00488] Fps is (10 sec: 3686.3, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 2420736. Throughput: 0: 958.6. Samples: 605344. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:23:48,357][00488] Avg episode reward: [(0, '19.151')] [2023-02-26 09:23:53,347][00488] Fps is (10 sec: 3687.1, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2441216. Throughput: 0: 974.5. Samples: 611380. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:23:53,349][00488] Avg episode reward: [(0, '17.912')] [2023-02-26 09:23:56,976][11054] Updated weights for policy 0, policy_version 600 (0.0016) [2023-02-26 09:23:58,347][00488] Fps is (10 sec: 3686.5, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 2457600. 
Throughput: 0: 987.2. Samples: 617048. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:23:58,354][00488] Avg episode reward: [(0, '17.281')] [2023-02-26 09:24:03,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3818.3). Total num frames: 2473984. Throughput: 0: 985.5. Samples: 619576. Policy #0 lag: (min: 0.0, avg: 1.6, max: 5.0) [2023-02-26 09:24:03,350][00488] Avg episode reward: [(0, '17.363')] [2023-02-26 09:24:08,353][00488] Fps is (10 sec: 3274.9, 60 sec: 3822.6, 300 sec: 3818.2). Total num frames: 2490368. Throughput: 0: 906.9. Samples: 623572. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:24:08,356][00488] Avg episode reward: [(0, '16.683')] [2023-02-26 09:24:12,463][11054] Updated weights for policy 0, policy_version 610 (0.0012) [2023-02-26 09:24:13,355][00488] Fps is (10 sec: 2865.0, 60 sec: 3685.9, 300 sec: 3776.6). Total num frames: 2502656. Throughput: 0: 840.4. Samples: 627516. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:24:13,357][00488] Avg episode reward: [(0, '15.460')] [2023-02-26 09:24:18,347][00488] Fps is (10 sec: 2049.2, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 2510848. Throughput: 0: 826.9. Samples: 629424. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:24:18,356][00488] Avg episode reward: [(0, '15.788')] [2023-02-26 09:24:23,347][00488] Fps is (10 sec: 2459.5, 60 sec: 3345.1, 300 sec: 3721.1). Total num frames: 2527232. Throughput: 0: 797.3. Samples: 633368. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:24:23,352][00488] Avg episode reward: [(0, '16.054')] [2023-02-26 09:24:26,379][11054] Updated weights for policy 0, policy_version 620 (0.0013) [2023-02-26 09:24:28,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3707.2). Total num frames: 2543616. Throughput: 0: 789.5. Samples: 638372. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:24:28,352][00488] Avg episode reward: [(0, '16.876')] [2023-02-26 09:24:33,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3721.1). Total num frames: 2568192. Throughput: 0: 804.7. Samples: 641556. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:24:33,350][00488] Avg episode reward: [(0, '16.922')] [2023-02-26 09:24:35,345][11054] Updated weights for policy 0, policy_version 630 (0.0017) [2023-02-26 09:24:38,347][00488] Fps is (10 sec: 4915.1, 60 sec: 3481.6, 300 sec: 3748.9). Total num frames: 2592768. Throughput: 0: 843.9. Samples: 649356. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:24:38,350][00488] Avg episode reward: [(0, '16.816')] [2023-02-26 09:24:43,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3481.7, 300 sec: 3748.9). Total num frames: 2613248. Throughput: 0: 864.5. Samples: 655952. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:24:43,354][00488] Avg episode reward: [(0, '18.187')] [2023-02-26 09:24:44,748][11054] Updated weights for policy 0, policy_version 640 (0.0012) [2023-02-26 09:24:48,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3735.0). Total num frames: 2629632. Throughput: 0: 867.5. Samples: 658612. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:24:48,350][00488] Avg episode reward: [(0, '18.115')] [2023-02-26 09:24:53,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 2646016. Throughput: 0: 887.8. Samples: 663520. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:24:53,353][00488] Avg episode reward: [(0, '18.763')] [2023-02-26 09:24:56,837][11054] Updated weights for policy 0, policy_version 650 (0.0012) [2023-02-26 09:24:58,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3693.3). Total num frames: 2662400. Throughput: 0: 914.0. Samples: 668640. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:24:58,349][00488] Avg episode reward: [(0, '20.299')] [2023-02-26 09:24:58,356][11034] Saving new best policy, reward=20.299! [2023-02-26 09:25:03,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 2682880. Throughput: 0: 924.4. Samples: 671020. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:25:03,352][00488] Avg episode reward: [(0, '21.768')] [2023-02-26 09:25:03,368][11034] Saving new best policy, reward=21.768! [2023-02-26 09:25:07,314][11054] Updated weights for policy 0, policy_version 660 (0.0018) [2023-02-26 09:25:08,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3618.5, 300 sec: 3735.0). Total num frames: 2707456. Throughput: 0: 982.1. Samples: 677564. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:25:08,349][00488] Avg episode reward: [(0, '20.610')] [2023-02-26 09:25:13,347][00488] Fps is (10 sec: 4915.0, 60 sec: 3823.4, 300 sec: 3762.8). Total num frames: 2732032. Throughput: 0: 1042.7. Samples: 685296. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:25:13,349][00488] Avg episode reward: [(0, '19.985')] [2023-02-26 09:25:15,867][11054] Updated weights for policy 0, policy_version 670 (0.0017) [2023-02-26 09:25:18,347][00488] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3776.7). Total num frames: 2752512. Throughput: 0: 1034.7. Samples: 688116. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:25:18,355][00488] Avg episode reward: [(0, '19.436')] [2023-02-26 09:25:23,347][00488] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3735.0). Total num frames: 2768896. Throughput: 0: 973.2. Samples: 693152. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:25:23,354][00488] Avg episode reward: [(0, '18.411')] [2023-02-26 09:25:28,055][11054] Updated weights for policy 0, policy_version 680 (0.0016) [2023-02-26 09:25:28,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3721.1). 
Total num frames: 2785280. Throughput: 0: 936.4. Samples: 698088. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:25:28,354][00488] Avg episode reward: [(0, '19.118')] [2023-02-26 09:25:33,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 2801664. Throughput: 0: 935.3. Samples: 700700. Policy #0 lag: (min: 0.0, avg: 1.6, max: 5.0) [2023-02-26 09:25:33,350][00488] Avg episode reward: [(0, '20.175')] [2023-02-26 09:25:38,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3707.2). Total num frames: 2818048. Throughput: 0: 941.1. Samples: 705868. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:25:38,355][00488] Avg episode reward: [(0, '21.238')] [2023-02-26 09:25:39,729][11054] Updated weights for policy 0, policy_version 690 (0.0028) [2023-02-26 09:25:43,347][00488] Fps is (10 sec: 4505.5, 60 sec: 3891.2, 300 sec: 3748.9). Total num frames: 2846720. Throughput: 0: 989.8. Samples: 713180. Policy #0 lag: (min: 0.0, avg: 2.4, max: 6.0) [2023-02-26 09:25:43,350][00488] Avg episode reward: [(0, '20.317')] [2023-02-26 09:25:43,362][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000695_2846720.pth... [2023-02-26 09:25:43,542][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000472_1933312.pth [2023-02-26 09:25:47,421][11054] Updated weights for policy 0, policy_version 700 (0.0012) [2023-02-26 09:25:48,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 2867200. Throughput: 0: 1021.3. Samples: 716980. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:25:48,355][00488] Avg episode reward: [(0, '20.116')] [2023-02-26 09:25:53,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 2883584. Throughput: 0: 1006.2. Samples: 722844. 
Policy #0 lag: (min: 0.0, avg: 1.5, max: 5.0) [2023-02-26 09:25:53,353][00488] Avg episode reward: [(0, '20.402')] [2023-02-26 09:25:58,347][00488] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 2904064. Throughput: 0: 951.7. Samples: 728124. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:25:58,352][00488] Avg episode reward: [(0, '19.880')] [2023-02-26 09:25:59,070][11054] Updated weights for policy 0, policy_version 710 (0.0012) [2023-02-26 09:26:03,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3959.4, 300 sec: 3790.5). Total num frames: 2920448. Throughput: 0: 945.7. Samples: 730672. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:26:03,351][00488] Avg episode reward: [(0, '19.125')] [2023-02-26 09:26:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 2936832. Throughput: 0: 950.5. Samples: 735924. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:26:08,353][00488] Avg episode reward: [(0, '20.264')] [2023-02-26 09:26:10,858][11054] Updated weights for policy 0, policy_version 720 (0.0016) [2023-02-26 09:26:13,347][00488] Fps is (10 sec: 3686.5, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 2957312. Throughput: 0: 972.3. Samples: 741840. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:26:13,349][00488] Avg episode reward: [(0, '20.163')] [2023-02-26 09:26:18,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2981888. Throughput: 0: 997.1. Samples: 745568. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:26:18,350][00488] Avg episode reward: [(0, '20.881')] [2023-02-26 09:26:19,220][11054] Updated weights for policy 0, policy_version 730 (0.0012) [2023-02-26 09:26:23,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 3006464. Throughput: 0: 1045.4. Samples: 752912. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:26:23,350][00488] Avg episode reward: [(0, '21.903')] [2023-02-26 09:26:23,367][11034] Saving new best policy, reward=21.903! [2023-02-26 09:26:28,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3873.8). Total num frames: 3022848. Throughput: 0: 997.1. Samples: 758048. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:26:28,353][00488] Avg episode reward: [(0, '23.159')] [2023-02-26 09:26:28,356][11034] Saving new best policy, reward=23.159! [2023-02-26 09:26:30,217][11054] Updated weights for policy 0, policy_version 740 (0.0018) [2023-02-26 09:26:33,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3039232. Throughput: 0: 969.0. Samples: 760584. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:26:33,350][00488] Avg episode reward: [(0, '24.105')] [2023-02-26 09:26:33,375][11034] Saving new best policy, reward=24.105! [2023-02-26 09:26:38,347][00488] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3832.2). Total num frames: 3059712. Throughput: 0: 957.4. Samples: 765928. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:26:38,357][00488] Avg episode reward: [(0, '24.240')] [2023-02-26 09:26:38,359][11034] Saving new best policy, reward=24.240! [2023-02-26 09:26:41,931][11054] Updated weights for policy 0, policy_version 750 (0.0012) [2023-02-26 09:26:43,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 3076096. Throughput: 0: 953.9. Samples: 771048. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:26:43,353][00488] Avg episode reward: [(0, '23.843')] [2023-02-26 09:26:48,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3096576. Throughput: 0: 964.4. Samples: 774068. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:26:48,354][00488] Avg episode reward: [(0, '22.247')] [2023-02-26 09:26:50,977][11054] Updated weights for policy 0, policy_version 760 (0.0012) [2023-02-26 09:26:53,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 3121152. Throughput: 0: 1017.4. Samples: 781708. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:26:53,353][00488] Avg episode reward: [(0, '20.967')] [2023-02-26 09:26:58,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3860.0). Total num frames: 3141632. Throughput: 0: 1023.4. Samples: 787892. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:26:58,350][00488] Avg episode reward: [(0, '21.195')] [2023-02-26 09:27:01,765][11054] Updated weights for policy 0, policy_version 770 (0.0012) [2023-02-26 09:27:03,347][00488] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3153920. Throughput: 0: 987.1. Samples: 789988. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:27:03,350][00488] Avg episode reward: [(0, '21.152')] [2023-02-26 09:27:08,349][00488] Fps is (10 sec: 2866.5, 60 sec: 3891.0, 300 sec: 3818.3). Total num frames: 3170304. Throughput: 0: 913.0. Samples: 794000. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:27:08,354][00488] Avg episode reward: [(0, '21.729')] [2023-02-26 09:27:13,347][00488] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3762.8). Total num frames: 3182592. Throughput: 0: 889.4. Samples: 798072. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:27:13,350][00488] Avg episode reward: [(0, '20.900')] [2023-02-26 09:27:17,142][11054] Updated weights for policy 0, policy_version 780 (0.0022) [2023-02-26 09:27:18,347][00488] Fps is (10 sec: 2458.2, 60 sec: 3549.9, 300 sec: 3748.9). Total num frames: 3194880. Throughput: 0: 878.0. Samples: 800092. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:27:18,350][00488] Avg episode reward: [(0, '20.539')] [2023-02-26 09:27:23,350][00488] Fps is (10 sec: 2457.0, 60 sec: 3344.9, 300 sec: 3735.0). Total num frames: 3207168. Throughput: 0: 846.6. Samples: 804028. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:27:23,352][00488] Avg episode reward: [(0, '20.746')] [2023-02-26 09:27:28,348][00488] Fps is (10 sec: 2867.0, 60 sec: 3345.0, 300 sec: 3721.1). Total num frames: 3223552. Throughput: 0: 820.8. Samples: 807984. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:27:28,352][00488] Avg episode reward: [(0, '21.011')] [2023-02-26 09:27:30,756][11054] Updated weights for policy 0, policy_version 790 (0.0013) [2023-02-26 09:27:33,347][00488] Fps is (10 sec: 4097.1, 60 sec: 3481.6, 300 sec: 3735.0). Total num frames: 3248128. Throughput: 0: 830.8. Samples: 811452. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:27:33,350][00488] Avg episode reward: [(0, '21.803')] [2023-02-26 09:27:38,350][00488] Fps is (10 sec: 4914.1, 60 sec: 3549.7, 300 sec: 3762.7). Total num frames: 3272704. Throughput: 0: 831.1. Samples: 819108. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:27:38,356][00488] Avg episode reward: [(0, '23.337')] [2023-02-26 09:27:38,923][11054] Updated weights for policy 0, policy_version 800 (0.0012) [2023-02-26 09:27:43,347][00488] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 3289088. Throughput: 0: 817.1. Samples: 824660. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:27:43,351][00488] Avg episode reward: [(0, '22.409')] [2023-02-26 09:27:43,370][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000803_3289088.pth... 
[2023-02-26 09:27:43,746][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000587_2404352.pth [2023-02-26 09:27:48,347][00488] Fps is (10 sec: 3277.7, 60 sec: 3481.6, 300 sec: 3707.2). Total num frames: 3305472. Throughput: 0: 825.9. Samples: 827152. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:27:48,351][00488] Avg episode reward: [(0, '22.575')] [2023-02-26 09:27:50,950][11054] Updated weights for policy 0, policy_version 810 (0.0013) [2023-02-26 09:27:53,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3707.2). Total num frames: 3321856. Throughput: 0: 853.6. Samples: 832408. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:27:53,356][00488] Avg episode reward: [(0, '20.804')] [2023-02-26 09:27:58,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3721.1). Total num frames: 3342336. Throughput: 0: 876.4. Samples: 837508. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:27:58,351][00488] Avg episode reward: [(0, '20.577')] [2023-02-26 09:28:02,518][11054] Updated weights for policy 0, policy_version 820 (0.0015) [2023-02-26 09:28:03,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3721.1). Total num frames: 3358720. Throughput: 0: 889.9. Samples: 840136. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:28:03,349][00488] Avg episode reward: [(0, '20.931')] [2023-02-26 09:28:08,347][00488] Fps is (10 sec: 4505.6, 60 sec: 3618.3, 300 sec: 3748.9). Total num frames: 3387392. Throughput: 0: 972.5. Samples: 847788. Policy #0 lag: (min: 0.0, avg: 2.4, max: 4.0) [2023-02-26 09:28:08,354][00488] Avg episode reward: [(0, '20.875')] [2023-02-26 09:28:10,429][11054] Updated weights for policy 0, policy_version 830 (0.0012) [2023-02-26 09:28:13,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3754.7, 300 sec: 3748.9). Total num frames: 3407872. Throughput: 0: 1043.6. Samples: 854944. 
Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:28:13,355][00488] Avg episode reward: [(0, '21.222')] [2023-02-26 09:28:18,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3735.0). Total num frames: 3428352. Throughput: 0: 1026.6. Samples: 857648. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:28:18,355][00488] Avg episode reward: [(0, '22.440')] [2023-02-26 09:28:21,670][11054] Updated weights for policy 0, policy_version 840 (0.0018) [2023-02-26 09:28:23,348][00488] Fps is (10 sec: 3686.1, 60 sec: 3959.6, 300 sec: 3721.1). Total num frames: 3444736. Throughput: 0: 972.8. Samples: 862880. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:28:23,357][00488] Avg episode reward: [(0, '21.344')] [2023-02-26 09:28:28,349][00488] Fps is (10 sec: 3276.3, 60 sec: 3959.4, 300 sec: 3707.2). Total num frames: 3461120. Throughput: 0: 966.4. Samples: 868148. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:28:28,352][00488] Avg episode reward: [(0, '23.189')] [2023-02-26 09:28:33,286][11054] Updated weights for policy 0, policy_version 850 (0.0017) [2023-02-26 09:28:33,347][00488] Fps is (10 sec: 3686.7, 60 sec: 3891.2, 300 sec: 3721.1). Total num frames: 3481600. Throughput: 0: 967.5. Samples: 870688. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:28:33,352][00488] Avg episode reward: [(0, '23.178')] [2023-02-26 09:28:38,347][00488] Fps is (10 sec: 4096.7, 60 sec: 3823.1, 300 sec: 3721.1). Total num frames: 3502080. Throughput: 0: 984.0. Samples: 876688. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:28:38,350][00488] Avg episode reward: [(0, '23.503')] [2023-02-26 09:28:41,886][11054] Updated weights for policy 0, policy_version 860 (0.0012) [2023-02-26 09:28:43,347][00488] Fps is (10 sec: 4915.2, 60 sec: 4027.7, 300 sec: 3762.8). Total num frames: 3530752. Throughput: 0: 1042.3. Samples: 884412. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:28:43,350][00488] Avg episode reward: [(0, '23.579')] [2023-02-26 09:28:48,347][00488] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3762.8). Total num frames: 3551232. Throughput: 0: 1063.3. Samples: 887984. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:28:48,354][00488] Avg episode reward: [(0, '23.556')] [2023-02-26 09:28:51,778][11054] Updated weights for policy 0, policy_version 870 (0.0012) [2023-02-26 09:28:53,347][00488] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3762.8). Total num frames: 3567616. Throughput: 0: 1009.5. Samples: 893216. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:28:53,350][00488] Avg episode reward: [(0, '22.454')] [2023-02-26 09:28:58,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3762.8). Total num frames: 3584000. Throughput: 0: 966.3. Samples: 898428. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:28:58,352][00488] Avg episode reward: [(0, '22.847')] [2023-02-26 09:29:03,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3762.8). Total num frames: 3600384. Throughput: 0: 963.2. Samples: 900992. Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:29:03,351][00488] Avg episode reward: [(0, '23.982')] [2023-02-26 09:29:04,076][11054] Updated weights for policy 0, policy_version 880 (0.0012) [2023-02-26 09:29:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3776.8). Total num frames: 3616768. Throughput: 0: 960.6. Samples: 906104. Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:29:08,351][00488] Avg episode reward: [(0, '24.358')] [2023-02-26 09:29:08,357][11034] Saving new best policy, reward=24.358! [2023-02-26 09:29:13,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3641344. Throughput: 0: 1002.5. Samples: 913260. 
Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:29:13,350][00488] Avg episode reward: [(0, '24.133')] [2023-02-26 09:29:13,379][11054] Updated weights for policy 0, policy_version 890 (0.0022) [2023-02-26 09:29:18,347][00488] Fps is (10 sec: 5324.8, 60 sec: 4027.7, 300 sec: 3873.8). Total num frames: 3670016. Throughput: 0: 1033.6. Samples: 917200. Policy #0 lag: (min: 0.0, avg: 2.2, max: 6.0) [2023-02-26 09:29:18,350][00488] Avg episode reward: [(0, '22.177')] [2023-02-26 09:29:22,020][11054] Updated weights for policy 0, policy_version 900 (0.0016) [2023-02-26 09:29:23,347][00488] Fps is (10 sec: 4505.4, 60 sec: 4027.8, 300 sec: 3873.8). Total num frames: 3686400. Throughput: 0: 1043.0. Samples: 923624. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:29:23,353][00488] Avg episode reward: [(0, '20.464')] [2023-02-26 09:29:28,347][00488] Fps is (10 sec: 3686.3, 60 sec: 4096.1, 300 sec: 3860.0). Total num frames: 3706880. Throughput: 0: 991.7. Samples: 929040. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:29:28,353][00488] Avg episode reward: [(0, '21.339')] [2023-02-26 09:29:33,347][00488] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3832.2). Total num frames: 3723264. Throughput: 0: 966.9. Samples: 931496. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) [2023-02-26 09:29:33,354][00488] Avg episode reward: [(0, '21.284')] [2023-02-26 09:29:33,612][11054] Updated weights for policy 0, policy_version 910 (0.0018) [2023-02-26 09:29:38,349][00488] Fps is (10 sec: 3276.3, 60 sec: 3959.3, 300 sec: 3818.3). Total num frames: 3739648. Throughput: 0: 967.3. Samples: 936744. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:29:38,352][00488] Avg episode reward: [(0, '20.808')] [2023-02-26 09:29:43,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3764224. Throughput: 0: 972.5. Samples: 942192. 
Policy #0 lag: (min: 0.0, avg: 1.7, max: 5.0) [2023-02-26 09:29:43,355][00488] Avg episode reward: [(0, '21.595')] [2023-02-26 09:29:43,371][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000919_3764224.pth... [2023-02-26 09:29:43,568][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000695_2846720.pth [2023-02-26 09:29:44,473][11054] Updated weights for policy 0, policy_version 920 (0.0012) [2023-02-26 09:29:48,347][00488] Fps is (10 sec: 4506.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 3784704. Throughput: 0: 999.9. Samples: 945988. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:29:48,349][00488] Avg episode reward: [(0, '22.451')] [2023-02-26 09:29:52,568][11054] Updated weights for policy 0, policy_version 930 (0.0012) [2023-02-26 09:29:53,347][00488] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 3901.6). Total num frames: 3813376. Throughput: 0: 1062.1. Samples: 953900. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:29:53,359][00488] Avg episode reward: [(0, '22.114')] [2023-02-26 09:29:58,347][00488] Fps is (10 sec: 4505.5, 60 sec: 4096.0, 300 sec: 3887.7). Total num frames: 3829760. Throughput: 0: 1019.9. Samples: 959156. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:29:58,351][00488] Avg episode reward: [(0, '21.726')] [2023-02-26 09:30:03,351][00488] Fps is (10 sec: 2866.2, 60 sec: 4027.5, 300 sec: 3846.0). Total num frames: 3842048. Throughput: 0: 986.8. Samples: 961608. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:30:03,353][00488] Avg episode reward: [(0, '22.352')] [2023-02-26 09:30:05,514][11054] Updated weights for policy 0, policy_version 940 (0.0012) [2023-02-26 09:30:08,347][00488] Fps is (10 sec: 2048.0, 60 sec: 3891.2, 300 sec: 3790.5). Total num frames: 3850240. Throughput: 0: 928.9. Samples: 965424. 
Policy #0 lag: (min: 0.0, avg: 2.3, max: 4.0) [2023-02-26 09:30:08,354][00488] Avg episode reward: [(0, '22.112')] [2023-02-26 09:30:13,347][00488] Fps is (10 sec: 2458.4, 60 sec: 3754.7, 300 sec: 3776.6). Total num frames: 3866624. Throughput: 0: 892.8. Samples: 969216. Policy #0 lag: (min: 0.0, avg: 2.2, max: 4.0) [2023-02-26 09:30:13,350][00488] Avg episode reward: [(0, '22.559')] [2023-02-26 09:30:18,348][00488] Fps is (10 sec: 2867.1, 60 sec: 3481.6, 300 sec: 3762.8). Total num frames: 3878912. Throughput: 0: 880.3. Samples: 971112. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:30:18,351][00488] Avg episode reward: [(0, '22.234')] [2023-02-26 09:30:21,842][11054] Updated weights for policy 0, policy_version 950 (0.0013) [2023-02-26 09:30:23,347][00488] Fps is (10 sec: 2867.2, 60 sec: 3481.6, 300 sec: 3762.8). Total num frames: 3895296. Throughput: 0: 847.7. Samples: 974888. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:30:23,355][00488] Avg episode reward: [(0, '22.897')] [2023-02-26 09:30:28,347][00488] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3762.8). Total num frames: 3911680. Throughput: 0: 842.2. Samples: 980092. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:30:28,350][00488] Avg episode reward: [(0, '21.736')] [2023-02-26 09:30:32,736][11054] Updated weights for policy 0, policy_version 960 (0.0020) [2023-02-26 09:30:33,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3776.7). Total num frames: 3932160. Throughput: 0: 823.7. Samples: 983056. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:30:33,355][00488] Avg episode reward: [(0, '22.672')] [2023-02-26 09:30:38,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3735.0). Total num frames: 3948544. Throughput: 0: 777.0. Samples: 988864. 
Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:30:38,352][00488] Avg episode reward: [(0, '22.405')] [2023-02-26 09:30:43,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3721.1). Total num frames: 3964928. Throughput: 0: 778.2. Samples: 994176. Policy #0 lag: (min: 0.0, avg: 2.4, max: 5.0) [2023-02-26 09:30:43,353][00488] Avg episode reward: [(0, '21.692')] [2023-02-26 09:30:44,216][11054] Updated weights for policy 0, policy_version 970 (0.0014) [2023-02-26 09:30:48,347][00488] Fps is (10 sec: 3686.5, 60 sec: 3345.1, 300 sec: 3735.0). Total num frames: 3985408. Throughput: 0: 781.8. Samples: 996788. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:30:48,354][00488] Avg episode reward: [(0, '21.542')] [2023-02-26 09:30:53,350][00488] Fps is (10 sec: 4094.9, 60 sec: 3208.4, 300 sec: 3735.0). Total num frames: 4005888. Throughput: 0: 813.1. Samples: 1002016. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:30:53,353][00488] Avg episode reward: [(0, '21.458')] [2023-02-26 09:30:56,100][11054] Updated weights for policy 0, policy_version 980 (0.0012) [2023-02-26 09:30:58,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3208.5, 300 sec: 3735.0). Total num frames: 4022272. Throughput: 0: 865.8. Samples: 1008176. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:30:58,350][00488] Avg episode reward: [(0, '22.044')] [2023-02-26 09:31:03,347][00488] Fps is (10 sec: 4097.1, 60 sec: 3413.5, 300 sec: 3762.8). Total num frames: 4046848. Throughput: 0: 906.9. Samples: 1011924. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:31:03,353][00488] Avg episode reward: [(0, '23.682')] [2023-02-26 09:31:03,978][11054] Updated weights for policy 0, policy_version 990 (0.0012) [2023-02-26 09:31:08,351][00488] Fps is (10 sec: 4913.5, 60 sec: 3686.2, 300 sec: 3776.6). Total num frames: 4071424. Throughput: 0: 978.2. Samples: 1018908. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:31:08,353][00488] Avg episode reward: [(0, '25.808')] [2023-02-26 09:31:08,360][11034] Saving new best policy, reward=25.808! [2023-02-26 09:31:13,351][00488] Fps is (10 sec: 4094.4, 60 sec: 3686.2, 300 sec: 3748.8). Total num frames: 4087808. Throughput: 0: 976.4. Samples: 1024032. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:31:13,354][00488] Avg episode reward: [(0, '26.271')] [2023-02-26 09:31:13,368][11034] Saving new best policy, reward=26.271! [2023-02-26 09:31:14,837][11054] Updated weights for policy 0, policy_version 1000 (0.0012) [2023-02-26 09:31:18,347][00488] Fps is (10 sec: 3277.9, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4104192. Throughput: 0: 969.2. Samples: 1026668. Policy #0 lag: (min: 0.0, avg: 1.7, max: 4.0) [2023-02-26 09:31:18,350][00488] Avg episode reward: [(0, '26.076')] [2023-02-26 09:31:23,347][00488] Fps is (10 sec: 3278.1, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4120576. Throughput: 0: 953.4. Samples: 1031768. Policy #0 lag: (min: 0.0, avg: 2.4, max: 4.0) [2023-02-26 09:31:23,350][00488] Avg episode reward: [(0, '25.902')] [2023-02-26 09:31:26,655][11054] Updated weights for policy 0, policy_version 1010 (0.0016) [2023-02-26 09:31:28,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3721.1). Total num frames: 4136960. Throughput: 0: 950.1. Samples: 1036932. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:31:28,354][00488] Avg episode reward: [(0, '24.943')] [2023-02-26 09:31:33,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3707.2). Total num frames: 4153344. Throughput: 0: 947.1. Samples: 1039408. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:31:33,350][00488] Avg episode reward: [(0, '24.938')] [2023-02-26 09:31:37,915][11054] Updated weights for policy 0, policy_version 1020 (0.0017) [2023-02-26 09:31:38,347][00488] Fps is (10 sec: 4095.9, 60 sec: 3822.9, 300 sec: 3735.0). 
Total num frames: 4177920. Throughput: 0: 956.8. Samples: 1045068. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:31:38,350][00488] Avg episode reward: [(0, '23.128')] [2023-02-26 09:31:43,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3721.1). Total num frames: 4194304. Throughput: 0: 956.4. Samples: 1051216. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:31:43,350][00488] Avg episode reward: [(0, '22.727')] [2023-02-26 09:31:43,371][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001024_4194304.pth... [2023-02-26 09:31:43,788][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000803_3289088.pth [2023-02-26 09:31:48,347][00488] Fps is (10 sec: 3276.9, 60 sec: 3754.7, 300 sec: 3693.3). Total num frames: 4210688. Throughput: 0: 927.4. Samples: 1053656. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:31:48,352][00488] Avg episode reward: [(0, '23.059')] [2023-02-26 09:31:49,827][11054] Updated weights for policy 0, policy_version 1030 (0.0015) [2023-02-26 09:31:53,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.6, 300 sec: 3679.5). Total num frames: 4227072. Throughput: 0: 891.0. Samples: 1059000. Policy #0 lag: (min: 0.0, avg: 1.6, max: 4.0) [2023-02-26 09:31:53,354][00488] Avg episode reward: [(0, '23.190')] [2023-02-26 09:31:58,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 4243456. Throughput: 0: 894.7. Samples: 1064292. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:31:58,354][00488] Avg episode reward: [(0, '24.826')] [2023-02-26 09:32:01,003][11054] Updated weights for policy 0, policy_version 1040 (0.0021) [2023-02-26 09:32:03,359][00488] Fps is (10 sec: 3682.2, 60 sec: 3617.4, 300 sec: 3707.1). Total num frames: 4263936. Throughput: 0: 891.9. Samples: 1066812. 
Policy #0 lag: (min: 0.0, avg: 1.9, max: 5.0) [2023-02-26 09:32:03,362][00488] Avg episode reward: [(0, '24.158')] [2023-02-26 09:32:08,349][00488] Fps is (10 sec: 4914.0, 60 sec: 3686.5, 300 sec: 3762.7). Total num frames: 4292608. Throughput: 0: 926.9. Samples: 1073480. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:32:08,360][00488] Avg episode reward: [(0, '24.639')] [2023-02-26 09:32:10,167][11054] Updated weights for policy 0, policy_version 1050 (0.0012) [2023-02-26 09:32:13,347][00488] Fps is (10 sec: 4920.8, 60 sec: 3754.9, 300 sec: 3790.5). Total num frames: 4313088. Throughput: 0: 981.5. Samples: 1081100. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:32:13,354][00488] Avg episode reward: [(0, '25.088')] [2023-02-26 09:32:18,347][00488] Fps is (10 sec: 4096.9, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 4333568. Throughput: 0: 986.4. Samples: 1083796. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:32:18,354][00488] Avg episode reward: [(0, '25.890')] [2023-02-26 09:32:19,858][11054] Updated weights for policy 0, policy_version 1060 (0.0012) [2023-02-26 09:32:23,347][00488] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 4349952. Throughput: 0: 977.4. Samples: 1089052. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:32:23,352][00488] Avg episode reward: [(0, '26.502')] [2023-02-26 09:32:23,362][11034] Saving new best policy, reward=26.502! [2023-02-26 09:32:28,347][00488] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 4366336. Throughput: 0: 956.0. Samples: 1094236. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:32:28,357][00488] Avg episode reward: [(0, '24.854')] [2023-02-26 09:32:31,970][11054] Updated weights for policy 0, policy_version 1070 (0.0019) [2023-02-26 09:32:33,347][00488] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 3776.7). Total num frames: 4386816. Throughput: 0: 964.9. Samples: 1097076. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:32:33,351][00488] Avg episode reward: [(0, '25.825')] [2023-02-26 09:32:38,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 4407296. Throughput: 0: 960.6. Samples: 1102228. Policy #0 lag: (min: 0.0, avg: 1.9, max: 4.0) [2023-02-26 09:32:38,349][00488] Avg episode reward: [(0, '25.362')] [2023-02-26 09:32:41,253][11054] Updated weights for policy 0, policy_version 1080 (0.0012) [2023-02-26 09:32:43,349][00488] Fps is (10 sec: 4504.9, 60 sec: 3959.4, 300 sec: 3818.3). Total num frames: 4431872. Throughput: 0: 1019.7. Samples: 1110180. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:32:43,357][00488] Avg episode reward: [(0, '25.859')] [2023-02-26 09:32:48,349][00488] Fps is (10 sec: 4914.3, 60 sec: 4095.9, 300 sec: 3846.1). Total num frames: 4456448. Throughput: 0: 1049.2. Samples: 1114016. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:32:48,351][00488] Avg episode reward: [(0, '25.827')] [2023-02-26 09:32:50,409][11054] Updated weights for policy 0, policy_version 1090 (0.0012) [2023-02-26 09:32:53,347][00488] Fps is (10 sec: 4096.7, 60 sec: 4096.0, 300 sec: 3832.2). Total num frames: 4472832. Throughput: 0: 1028.1. Samples: 1119740. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:32:53,350][00488] Avg episode reward: [(0, '25.841')] [2023-02-26 09:32:58,347][00488] Fps is (10 sec: 3277.4, 60 sec: 4096.0, 300 sec: 3832.2). Total num frames: 4489216. Throughput: 0: 972.4. Samples: 1124856. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2023-02-26 09:32:58,349][00488] Avg episode reward: [(0, '23.086')] [2023-02-26 09:33:01,637][11054] Updated weights for policy 0, policy_version 1100 (0.0012) [2023-02-26 09:33:03,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4028.5, 300 sec: 3790.5). Total num frames: 4505600. Throughput: 0: 968.7. Samples: 1127388. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:33:03,355][00488] Avg episode reward: [(0, '23.867')] [2023-02-26 09:33:08,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3776.7). Total num frames: 4521984. Throughput: 0: 966.0. Samples: 1132520. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:33:08,350][00488] Avg episode reward: [(0, '24.109')] [2023-02-26 09:33:13,349][00488] Fps is (10 sec: 3276.1, 60 sec: 3754.5, 300 sec: 3762.7). Total num frames: 4538368. Throughput: 0: 940.0. Samples: 1136540. Policy #0 lag: (min: 0.0, avg: 2.3, max: 5.0) [2023-02-26 09:33:13,353][00488] Avg episode reward: [(0, '23.307')] [2023-02-26 09:33:15,825][11054] Updated weights for policy 0, policy_version 1110 (0.0021) [2023-02-26 09:33:18,348][00488] Fps is (10 sec: 3276.5, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 4554752. Throughput: 0: 936.1. Samples: 1139200. Policy #0 lag: (min: 0.0, avg: 1.8, max: 5.0) [2023-02-26 09:33:18,354][00488] Avg episode reward: [(0, '22.772')] [2023-02-26 09:33:23,348][00488] Fps is (10 sec: 3277.3, 60 sec: 3686.4, 300 sec: 3762.8). Total num frames: 4571136. Throughput: 0: 941.1. Samples: 1144580. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:33:23,359][00488] Avg episode reward: [(0, '24.847')] [2023-02-26 09:33:27,895][11054] Updated weights for policy 0, policy_version 1120 (0.0016) [2023-02-26 09:33:28,351][00488] Fps is (10 sec: 3275.8, 60 sec: 3686.2, 300 sec: 3748.8). Total num frames: 4587520. Throughput: 0: 854.2. Samples: 1148620. Policy #0 lag: (min: 0.0, avg: 2.5, max: 5.0) [2023-02-26 09:33:28,353][00488] Avg episode reward: [(0, '24.756')] [2023-02-26 09:33:33,347][00488] Fps is (10 sec: 2867.4, 60 sec: 3549.9, 300 sec: 3721.1). Total num frames: 4599808. Throughput: 0: 815.0. Samples: 1150688. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:33:33,351][00488] Avg episode reward: [(0, '25.788')] [2023-02-26 09:33:38,347][00488] Fps is (10 sec: 2868.3, 60 sec: 3481.6, 300 sec: 3679.5). Total num frames: 4616192. Throughput: 0: 784.6. Samples: 1155048. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) [2023-02-26 09:33:38,351][00488] Avg episode reward: [(0, '25.807')] [2023-02-26 09:33:42,370][11054] Updated weights for policy 0, policy_version 1130 (0.0016) [2023-02-26 09:33:43,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3345.2, 300 sec: 3665.6). Total num frames: 4632576. Throughput: 0: 781.8. Samples: 1160036. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:33:43,355][00488] Avg episode reward: [(0, '26.420')] [2023-02-26 09:33:43,370][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001131_4632576.pth... [2023-02-26 09:33:43,723][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000919_3764224.pth [2023-02-26 09:33:48,347][00488] Fps is (10 sec: 3276.7, 60 sec: 3208.6, 300 sec: 3665.6). Total num frames: 4648960. Throughput: 0: 780.8. Samples: 1162524. Policy #0 lag: (min: 0.0, avg: 1.4, max: 4.0) [2023-02-26 09:33:48,355][00488] Avg episode reward: [(0, '25.763')] [2023-02-26 09:33:52,433][11054] Updated weights for policy 0, policy_version 1140 (0.0013) [2023-02-26 09:33:53,347][00488] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3679.5). Total num frames: 4669440. Throughput: 0: 801.6. Samples: 1168592. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:33:53,349][00488] Avg episode reward: [(0, '25.550')] [2023-02-26 09:33:58,347][00488] Fps is (10 sec: 4915.3, 60 sec: 3481.6, 300 sec: 3721.1). Total num frames: 4698112. Throughput: 0: 883.2. Samples: 1176284. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:33:58,355][00488] Avg episode reward: [(0, '24.119')] [2023-02-26 09:34:00,827][11054] Updated weights for policy 0, policy_version 1150 (0.0012) [2023-02-26 09:34:03,347][00488] Fps is (10 sec: 4915.2, 60 sec: 3549.9, 300 sec: 3735.0). Total num frames: 4718592. Throughput: 0: 901.4. Samples: 1179764. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:34:03,354][00488] Avg episode reward: [(0, '24.002')] [2023-02-26 09:34:08,352][00488] Fps is (10 sec: 3684.6, 60 sec: 3549.6, 300 sec: 3707.2). Total num frames: 4734976. Throughput: 0: 898.8. Samples: 1185028. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:34:08,355][00488] Avg episode reward: [(0, '23.313')] [2023-02-26 09:34:11,740][11054] Updated weights for policy 0, policy_version 1160 (0.0019) [2023-02-26 09:34:13,347][00488] Fps is (10 sec: 3276.8, 60 sec: 3550.0, 300 sec: 3665.6). Total num frames: 4751360. Throughput: 0: 930.4. Samples: 1190484. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:34:13,354][00488] Avg episode reward: [(0, '22.647')] [2023-02-26 09:34:18,347][00488] Fps is (10 sec: 3688.2, 60 sec: 3618.2, 300 sec: 3679.5). Total num frames: 4771840. Throughput: 0: 944.5. Samples: 1193192. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:34:18,351][00488] Avg episode reward: [(0, '21.962')] [2023-02-26 09:34:23,347][00488] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 4784128. Throughput: 0: 963.0. Samples: 1198384. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:34:23,353][00488] Avg episode reward: [(0, '21.246')] [2023-02-26 09:34:24,473][11054] Updated weights for policy 0, policy_version 1170 (0.0012) [2023-02-26 09:34:28,347][00488] Fps is (10 sec: 4096.0, 60 sec: 3754.9, 300 sec: 3693.3). Total num frames: 4812800. Throughput: 0: 1010.8. Samples: 1205524. 
Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:34:28,349][00488] Avg episode reward: [(0, '20.972')] [2023-02-26 09:34:31,677][11054] Updated weights for policy 0, policy_version 1180 (0.0012) [2023-02-26 09:34:33,347][00488] Fps is (10 sec: 5324.8, 60 sec: 3959.4, 300 sec: 3721.1). Total num frames: 4837376. Throughput: 0: 1043.7. Samples: 1209492. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:34:33,350][00488] Avg episode reward: [(0, '21.482')] [2023-02-26 09:34:38,347][00488] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3707.2). Total num frames: 4857856. Throughput: 0: 1046.5. Samples: 1215684. Policy #0 lag: (min: 0.0, avg: 2.2, max: 5.0) [2023-02-26 09:34:38,352][00488] Avg episode reward: [(0, '22.014')] [2023-02-26 09:34:42,757][11054] Updated weights for policy 0, policy_version 1190 (0.0027) [2023-02-26 09:34:43,347][00488] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3693.3). Total num frames: 4874240. Throughput: 0: 991.1. Samples: 1220884. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:34:43,354][00488] Avg episode reward: [(0, '22.299')] [2023-02-26 09:34:48,347][00488] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3651.7). Total num frames: 4890624. Throughput: 0: 973.8. Samples: 1223584. Policy #0 lag: (min: 0.0, avg: 2.1, max: 4.0) [2023-02-26 09:34:48,352][00488] Avg episode reward: [(0, '24.190')] [2023-02-26 09:34:53,347][00488] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 3665.6). Total num frames: 4911104. Throughput: 0: 974.7. Samples: 1228884. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:34:53,349][00488] Avg episode reward: [(0, '26.750')] [2023-02-26 09:34:53,365][11034] Saving new best policy, reward=26.750! [2023-02-26 09:34:54,416][11054] Updated weights for policy 0, policy_version 1200 (0.0017) [2023-02-26 09:34:58,351][00488] Fps is (10 sec: 3685.0, 60 sec: 3822.7, 300 sec: 3679.5). Total num frames: 4927488. Throughput: 0: 972.5. Samples: 1234248. 
Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:34:58,356][00488] Avg episode reward: [(0, '27.777')] [2023-02-26 09:34:58,442][11034] Saving new best policy, reward=27.777! [2023-02-26 09:35:03,291][11054] Updated weights for policy 0, policy_version 1210 (0.0014) [2023-02-26 09:35:03,347][00488] Fps is (10 sec: 4505.5, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 4956160. Throughput: 0: 998.3. Samples: 1238116. Policy #0 lag: (min: 0.0, avg: 2.0, max: 5.0) [2023-02-26 09:35:03,354][00488] Avg episode reward: [(0, '27.485')] [2023-02-26 09:35:08,347][00488] Fps is (10 sec: 5326.8, 60 sec: 4096.3, 300 sec: 3776.7). Total num frames: 4980736. Throughput: 0: 1056.0. Samples: 1245904. Policy #0 lag: (min: 0.0, avg: 2.1, max: 5.0) [2023-02-26 09:35:08,366][00488] Avg episode reward: [(0, '26.338')] [2023-02-26 09:35:12,842][11054] Updated weights for policy 0, policy_version 1220 (0.0018) [2023-02-26 09:35:13,347][00488] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3790.5). Total num frames: 4997120. Throughput: 0: 1017.5. Samples: 1251312. Policy #0 lag: (min: 0.0, avg: 2.0, max: 4.0) [2023-02-26 09:35:13,352][00488] Avg episode reward: [(0, '25.827')] [2023-02-26 09:35:15,677][11034] Stopping Batcher_0... [2023-02-26 09:35:15,678][11034] Loop batcher_evt_loop terminating... [2023-02-26 09:35:15,682][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... [2023-02-26 09:35:15,690][00488] Component Batcher_0 stopped! [2023-02-26 09:35:15,755][11054] Weights refcount: 2 0 [2023-02-26 09:35:15,771][11054] Stopping InferenceWorker_p0-w0... [2023-02-26 09:35:15,772][11054] Loop inference_proc0-0_evt_loop terminating... [2023-02-26 09:35:15,772][00488] Component InferenceWorker_p0-w0 stopped! [2023-02-26 09:35:15,983][11069] Stopping RolloutWorker_w9... [2023-02-26 09:35:15,984][11069] Loop rollout_proc9_evt_loop terminating... [2023-02-26 09:35:15,985][00488] Component RolloutWorker_w9 stopped! 
[2023-02-26 09:35:15,994][11061] Stopping RolloutWorker_w5... [2023-02-26 09:35:15,994][00488] Component RolloutWorker_w5 stopped! [2023-02-26 09:35:15,998][11063] Stopping RolloutWorker_w7... [2023-02-26 09:35:15,999][00488] Component RolloutWorker_w7 stopped! [2023-02-26 09:35:15,994][11061] Loop rollout_proc5_evt_loop terminating... [2023-02-26 09:35:16,015][11059] Stopping RolloutWorker_w3... [2023-02-26 09:35:16,016][11059] Loop rollout_proc3_evt_loop terminating... [2023-02-26 09:35:16,019][11063] Loop rollout_proc7_evt_loop terminating... [2023-02-26 09:35:16,016][00488] Component RolloutWorker_w3 stopped! [2023-02-26 09:35:16,045][11075] Stopping RolloutWorker_w15... [2023-02-26 09:35:16,046][11075] Loop rollout_proc15_evt_loop terminating... [2023-02-26 09:35:16,046][00488] Component RolloutWorker_w15 stopped! [2023-02-26 09:35:16,050][11071] Stopping RolloutWorker_w11... [2023-02-26 09:35:16,051][11071] Loop rollout_proc11_evt_loop terminating... [2023-02-26 09:35:16,050][00488] Component RolloutWorker_w11 stopped! [2023-02-26 09:35:16,055][11073] Stopping RolloutWorker_w13... [2023-02-26 09:35:16,055][11073] Loop rollout_proc13_evt_loop terminating... [2023-02-26 09:35:16,058][00488] Component RolloutWorker_w13 stopped! [2023-02-26 09:35:16,066][11056] Stopping RolloutWorker_w1... [2023-02-26 09:35:16,070][11056] Loop rollout_proc1_evt_loop terminating... [2023-02-26 09:35:16,067][00488] Component RolloutWorker_w1 stopped! [2023-02-26 09:35:16,196][11034] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001024_4194304.pth [2023-02-26 09:35:16,246][11074] Stopping RolloutWorker_w14... [2023-02-26 09:35:16,249][11074] Loop rollout_proc14_evt_loop terminating... [2023-02-26 09:35:16,246][00488] Component RolloutWorker_w14 stopped! [2023-02-26 09:35:16,254][11062] Stopping RolloutWorker_w6... [2023-02-26 09:35:16,254][11062] Loop rollout_proc6_evt_loop terminating... 
[2023-02-26 09:35:16,254][00488] Component RolloutWorker_w6 stopped! [2023-02-26 09:35:16,264][11034] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... [2023-02-26 09:35:16,270][11072] Stopping RolloutWorker_w12... [2023-02-26 09:35:16,270][00488] Component RolloutWorker_w12 stopped! [2023-02-26 09:35:16,270][11072] Loop rollout_proc12_evt_loop terminating... [2023-02-26 09:35:16,310][11070] Stopping RolloutWorker_w10... [2023-02-26 09:35:16,314][11070] Loop rollout_proc10_evt_loop terminating... [2023-02-26 09:35:16,310][00488] Component RolloutWorker_w10 stopped! [2023-02-26 09:35:16,388][00488] Component RolloutWorker_w4 stopped! [2023-02-26 09:35:16,404][11060] Stopping RolloutWorker_w4... [2023-02-26 09:35:16,404][11060] Loop rollout_proc4_evt_loop terminating... [2023-02-26 09:35:16,412][00488] Component RolloutWorker_w2 stopped! [2023-02-26 09:35:16,416][11058] Stopping RolloutWorker_w2... [2023-02-26 09:35:16,417][11058] Loop rollout_proc2_evt_loop terminating... [2023-02-26 09:35:16,439][00488] Component RolloutWorker_w8 stopped! [2023-02-26 09:35:16,441][11068] Stopping RolloutWorker_w8... [2023-02-26 09:35:16,447][11068] Loop rollout_proc8_evt_loop terminating... [2023-02-26 09:35:16,463][00488] Component RolloutWorker_w0 stopped! [2023-02-26 09:35:16,465][11057] Stopping RolloutWorker_w0... [2023-02-26 09:35:16,469][11057] Loop rollout_proc0_evt_loop terminating... [2023-02-26 09:35:16,656][00488] Component LearnerWorker_p0 stopped! [2023-02-26 09:35:16,667][00488] Waiting for process learner_proc0 to stop... [2023-02-26 09:35:16,680][11034] Stopping LearnerWorker_p0... [2023-02-26 09:35:16,680][11034] Loop learner_proc0_evt_loop terminating... [2023-02-26 09:35:21,011][00488] Waiting for process inference_proc0-0 to join... [2023-02-26 09:35:21,313][00488] Waiting for process rollout_proc0 to join... [2023-02-26 09:35:22,179][00488] Waiting for process rollout_proc1 to join... 
[2023-02-26 09:35:22,180][00488] Waiting for process rollout_proc2 to join... [2023-02-26 09:35:22,183][00488] Waiting for process rollout_proc3 to join... [2023-02-26 09:35:22,188][00488] Waiting for process rollout_proc4 to join... [2023-02-26 09:35:22,190][00488] Waiting for process rollout_proc5 to join... [2023-02-26 09:35:22,192][00488] Waiting for process rollout_proc6 to join... [2023-02-26 09:35:22,194][00488] Waiting for process rollout_proc7 to join... [2023-02-26 09:35:22,196][00488] Waiting for process rollout_proc8 to join... [2023-02-26 09:35:22,201][00488] Waiting for process rollout_proc9 to join... [2023-02-26 09:35:22,203][00488] Waiting for process rollout_proc10 to join... [2023-02-26 09:35:22,205][00488] Waiting for process rollout_proc11 to join... [2023-02-26 09:35:22,208][00488] Waiting for process rollout_proc12 to join... [2023-02-26 09:35:22,210][00488] Waiting for process rollout_proc13 to join... [2023-02-26 09:35:22,222][00488] Waiting for process rollout_proc14 to join... [2023-02-26 09:35:22,223][00488] Waiting for process rollout_proc15 to join... 
[2023-02-26 09:35:22,228][00488] Batcher 0 profile tree view:
batching: 38.5047, releasing_batches: 0.0316
[2023-02-26 09:35:22,229][00488] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0033
wait_policy_total: 1094.5583
update_model: 5.0461
weight_update: 0.0018
one_step: 0.0030
handle_policy_step: 245.6419
deserialize: 11.1576, stack: 1.3673, obs_to_device_normalize: 56.0200, forward: 109.7127, send_messages: 14.6599
prepare_outputs: 42.0203
to_cpu: 26.4702
[2023-02-26 09:35:22,231][00488] Learner 0 profile tree view:
misc: 0.0068, prepare_batch: 22.5721
train: 102.1283
epoch_init: 0.0176, minibatch_init: 0.0141, losses_postprocess: 0.8838, kl_divergence: 0.8917, after_optimizer: 36.3564
calculate_losses: 38.6596
losses_init: 0.0045, forward_head: 2.7728, bptt_initial: 23.9530, tail: 1.7370, advantages_returns: 0.4457, losses: 6.1301
bptt: 3.0901
bptt_forward_core: 2.9862
update: 24.4332
clip: 2.2829
[2023-02-26 09:35:22,232][00488] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.2688, enqueue_policy_requests: 36.7800, env_step: 1224.8023, overhead: 22.4128, complete_rollouts: 2.0045
save_policy_outputs: 23.0625
split_output_tensors: 10.7173
[2023-02-26 09:35:22,238][00488] RolloutWorker_w15 profile tree view:
wait_for_trajectories: 0.4043, enqueue_policy_requests: 34.8736, env_step: 1227.0864, overhead: 21.8513, complete_rollouts: 1.6868
save_policy_outputs: 22.9854
split_output_tensors: 10.7368
[2023-02-26 09:35:22,245][00488] Loop Runner_EvtLoop terminating...
[2023-02-26 09:35:22,246][00488] Runner profile tree view:
main_loop: 1413.8728
[2023-02-26 09:35:22,248][00488] Collected {0: 5005312}, FPS: 3540.1
[2023-02-26 09:35:28,060][00488] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-26 09:35:28,062][00488] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-26 09:35:28,065][00488] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-26 09:35:28,067][00488] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-26 09:35:28,068][00488] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 09:35:28,070][00488] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-26 09:35:28,071][00488] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 09:35:28,072][00488] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-26 09:35:28,074][00488] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2023-02-26 09:35:28,075][00488] Adding new argument 'hf_repository'=None that is not in the saved config file! [2023-02-26 09:35:28,076][00488] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-26 09:35:28,083][00488] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-26 09:35:28,087][00488] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-26 09:35:28,089][00488] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2023-02-26 09:35:28,090][00488] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-26 09:35:28,118][00488] Doom resolution: 160x120, resize resolution: (128, 72) [2023-02-26 09:35:28,120][00488] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 09:35:28,131][00488] RunningMeanStd input shape: (1,) [2023-02-26 09:35:28,180][00488] ConvEncoder: input_channels=3 [2023-02-26 09:35:29,045][00488] Conv encoder output size: 512 [2023-02-26 09:35:29,047][00488] Policy head output size: 512 [2023-02-26 09:35:32,327][00488] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... [2023-02-26 09:35:34,181][00488] Num frames 100... 
[2023-02-26 09:35:34,289][00488] Num frames 200... [2023-02-26 09:35:34,406][00488] Num frames 300... [2023-02-26 09:35:34,518][00488] Num frames 400... [2023-02-26 09:35:34,640][00488] Num frames 500... [2023-02-26 09:35:34,760][00488] Num frames 600... [2023-02-26 09:35:34,871][00488] Num frames 700... [2023-02-26 09:35:34,932][00488] Avg episode rewards: #0: 14.040, true rewards: #0: 7.040 [2023-02-26 09:35:34,933][00488] Avg episode reward: 14.040, avg true_objective: 7.040 [2023-02-26 09:35:35,044][00488] Num frames 800... [2023-02-26 09:35:35,159][00488] Num frames 900... [2023-02-26 09:35:35,276][00488] Num frames 1000... [2023-02-26 09:35:35,387][00488] Num frames 1100... [2023-02-26 09:35:35,499][00488] Num frames 1200... [2023-02-26 09:35:35,614][00488] Num frames 1300... [2023-02-26 09:35:35,724][00488] Num frames 1400... [2023-02-26 09:35:35,840][00488] Num frames 1500... [2023-02-26 09:35:35,951][00488] Num frames 1600... [2023-02-26 09:35:36,065][00488] Num frames 1700... [2023-02-26 09:35:36,198][00488] Num frames 1800... [2023-02-26 09:35:36,317][00488] Num frames 1900... [2023-02-26 09:35:36,449][00488] Num frames 2000... [2023-02-26 09:35:36,571][00488] Num frames 2100... [2023-02-26 09:35:36,683][00488] Num frames 2200... [2023-02-26 09:35:36,811][00488] Num frames 2300... [2023-02-26 09:35:36,925][00488] Num frames 2400... [2023-02-26 09:35:37,041][00488] Num frames 2500... [2023-02-26 09:35:37,152][00488] Num frames 2600... [2023-02-26 09:35:37,280][00488] Avg episode rewards: #0: 32.810, true rewards: #0: 13.310 [2023-02-26 09:35:37,282][00488] Avg episode reward: 32.810, avg true_objective: 13.310 [2023-02-26 09:35:37,325][00488] Num frames 2700... [2023-02-26 09:35:37,444][00488] Num frames 2800... [2023-02-26 09:35:37,554][00488] Num frames 2900... [2023-02-26 09:35:37,663][00488] Num frames 3000... [2023-02-26 09:35:37,773][00488] Num frames 3100... 
[2023-02-26 09:35:37,914][00488] Avg episode rewards: #0: 24.246, true rewards: #0: 10.580 [2023-02-26 09:35:37,915][00488] Avg episode reward: 24.246, avg true_objective: 10.580 [2023-02-26 09:35:37,949][00488] Num frames 3200... [2023-02-26 09:35:38,060][00488] Num frames 3300... [2023-02-26 09:35:38,176][00488] Num frames 3400... [2023-02-26 09:35:38,287][00488] Num frames 3500... [2023-02-26 09:35:38,403][00488] Num frames 3600... [2023-02-26 09:35:38,516][00488] Num frames 3700... [2023-02-26 09:35:38,635][00488] Num frames 3800... [2023-02-26 09:35:38,748][00488] Num frames 3900... [2023-02-26 09:35:38,866][00488] Num frames 4000... [2023-02-26 09:35:38,977][00488] Num frames 4100... [2023-02-26 09:35:39,092][00488] Num frames 4200... [2023-02-26 09:35:39,215][00488] Num frames 4300... [2023-02-26 09:35:39,332][00488] Num frames 4400... [2023-02-26 09:35:39,446][00488] Num frames 4500... [2023-02-26 09:35:39,558][00488] Num frames 4600... [2023-02-26 09:35:39,675][00488] Num frames 4700... [2023-02-26 09:35:39,786][00488] Num frames 4800... [2023-02-26 09:35:39,904][00488] Num frames 4900... [2023-02-26 09:35:40,017][00488] Num frames 5000... [2023-02-26 09:35:40,133][00488] Num frames 5100... [2023-02-26 09:35:40,249][00488] Num frames 5200... [2023-02-26 09:35:40,386][00488] Avg episode rewards: #0: 31.935, true rewards: #0: 13.185 [2023-02-26 09:35:40,387][00488] Avg episode reward: 31.935, avg true_objective: 13.185 [2023-02-26 09:35:40,421][00488] Num frames 5300... [2023-02-26 09:35:40,538][00488] Num frames 5400... [2023-02-26 09:35:40,654][00488] Num frames 5500... [2023-02-26 09:35:40,767][00488] Num frames 5600... [2023-02-26 09:35:40,883][00488] Num frames 5700... [2023-02-26 09:35:40,994][00488] Num frames 5800... [2023-02-26 09:35:41,110][00488] Num frames 5900... [2023-02-26 09:35:41,220][00488] Num frames 6000... [2023-02-26 09:35:41,333][00488] Num frames 6100... [2023-02-26 09:35:41,453][00488] Num frames 6200... 
[2023-02-26 09:35:41,565][00488] Num frames 6300... [2023-02-26 09:35:41,679][00488] Num frames 6400... [2023-02-26 09:35:41,796][00488] Num frames 6500... [2023-02-26 09:35:41,911][00488] Num frames 6600... [2023-02-26 09:35:42,023][00488] Num frames 6700... [2023-02-26 09:35:42,140][00488] Num frames 6800... [2023-02-26 09:35:42,252][00488] Num frames 6900... [2023-02-26 09:35:42,364][00488] Num frames 7000... [2023-02-26 09:35:42,482][00488] Num frames 7100... [2023-02-26 09:35:42,593][00488] Num frames 7200... [2023-02-26 09:35:42,707][00488] Num frames 7300... [2023-02-26 09:35:42,774][00488] Avg episode rewards: #0: 35.817, true rewards: #0: 14.618 [2023-02-26 09:35:42,775][00488] Avg episode reward: 35.817, avg true_objective: 14.618 [2023-02-26 09:35:42,883][00488] Num frames 7400... [2023-02-26 09:35:42,998][00488] Num frames 7500... [2023-02-26 09:35:43,109][00488] Num frames 7600... [2023-02-26 09:35:43,218][00488] Num frames 7700... [2023-02-26 09:35:43,327][00488] Num frames 7800... [2023-02-26 09:35:43,444][00488] Num frames 7900... [2023-02-26 09:35:43,555][00488] Num frames 8000... [2023-02-26 09:35:43,695][00488] Num frames 8100... [2023-02-26 09:35:43,864][00488] Num frames 8200... [2023-02-26 09:35:44,038][00488] Avg episode rewards: #0: 33.951, true rewards: #0: 13.785 [2023-02-26 09:35:44,040][00488] Avg episode reward: 33.951, avg true_objective: 13.785 [2023-02-26 09:35:44,088][00488] Num frames 8300... [2023-02-26 09:35:44,241][00488] Num frames 8400... [2023-02-26 09:35:44,399][00488] Num frames 8500... [2023-02-26 09:35:44,560][00488] Num frames 8600... [2023-02-26 09:35:44,719][00488] Num frames 8700... [2023-02-26 09:35:44,872][00488] Num frames 8800... [2023-02-26 09:35:45,033][00488] Num frames 8900... [2023-02-26 09:35:45,192][00488] Num frames 9000... [2023-02-26 09:35:45,366][00488] Num frames 9100... [2023-02-26 09:35:45,546][00488] Num frames 9200... [2023-02-26 09:35:45,706][00488] Num frames 9300... 
[2023-02-26 09:35:45,862][00488] Num frames 9400... [2023-02-26 09:35:46,023][00488] Num frames 9500... [2023-02-26 09:35:46,185][00488] Num frames 9600... [2023-02-26 09:35:46,346][00488] Num frames 9700... [2023-02-26 09:35:46,551][00488] Avg episode rewards: #0: 33.842, true rewards: #0: 13.986 [2023-02-26 09:35:46,554][00488] Avg episode reward: 33.842, avg true_objective: 13.986 [2023-02-26 09:35:46,576][00488] Num frames 9800... [2023-02-26 09:35:46,762][00488] Num frames 9900... [2023-02-26 09:35:46,926][00488] Num frames 10000... [2023-02-26 09:35:47,090][00488] Num frames 10100... [2023-02-26 09:35:47,211][00488] Num frames 10200... [2023-02-26 09:35:47,326][00488] Num frames 10300... [2023-02-26 09:35:47,437][00488] Num frames 10400... [2023-02-26 09:35:47,554][00488] Num frames 10500... [2023-02-26 09:35:47,673][00488] Avg episode rewards: #0: 31.197, true rewards: #0: 13.198 [2023-02-26 09:35:47,675][00488] Avg episode reward: 31.197, avg true_objective: 13.198 [2023-02-26 09:35:47,723][00488] Num frames 10600... [2023-02-26 09:35:47,831][00488] Num frames 10700... [2023-02-26 09:35:47,948][00488] Num frames 10800... [2023-02-26 09:35:48,057][00488] Num frames 10900... [2023-02-26 09:35:48,177][00488] Num frames 11000... [2023-02-26 09:35:48,244][00488] Avg episode rewards: #0: 28.340, true rewards: #0: 12.229 [2023-02-26 09:35:48,245][00488] Avg episode reward: 28.340, avg true_objective: 12.229 [2023-02-26 09:35:48,351][00488] Num frames 11100... [2023-02-26 09:35:48,462][00488] Num frames 11200... [2023-02-26 09:35:48,620][00488] Avg episode rewards: #0: 25.994, true rewards: #0: 11.294 [2023-02-26 09:35:48,622][00488] Avg episode reward: 25.994, avg true_objective: 11.294 [2023-02-26 09:36:58,780][00488] Replay video saved to /content/train_dir/default_experiment/replay.mp4! 
[2023-02-26 09:38:45,176][00488] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json [2023-02-26 09:38:45,178][00488] Overriding arg 'num_workers' with value 1 passed from command line [2023-02-26 09:38:45,179][00488] Adding new argument 'no_render'=True that is not in the saved config file! [2023-02-26 09:38:45,180][00488] Adding new argument 'save_video'=True that is not in the saved config file! [2023-02-26 09:38:45,187][00488] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2023-02-26 09:38:45,188][00488] Adding new argument 'video_name'=None that is not in the saved config file! [2023-02-26 09:38:45,190][00488] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2023-02-26 09:38:45,192][00488] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! [2023-02-26 09:38:45,193][00488] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2023-02-26 09:38:45,195][00488] Adding new argument 'hf_repository'='Brain22/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! [2023-02-26 09:38:45,197][00488] Adding new argument 'policy_index'=0 that is not in the saved config file! [2023-02-26 09:38:45,198][00488] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2023-02-26 09:38:45,200][00488] Adding new argument 'train_script'=None that is not in the saved config file! [2023-02-26 09:38:45,202][00488] Adding new argument 'enjoy_script'=None that is not in the saved config file! 
[2023-02-26 09:38:45,204][00488] Using frameskip 1 and render_action_repeat=4 for evaluation [2023-02-26 09:38:45,230][00488] RunningMeanStd input shape: (3, 72, 128) [2023-02-26 09:38:45,233][00488] RunningMeanStd input shape: (1,) [2023-02-26 09:38:45,249][00488] ConvEncoder: input_channels=3 [2023-02-26 09:38:45,284][00488] Conv encoder output size: 512 [2023-02-26 09:38:45,285][00488] Policy head output size: 512 [2023-02-26 09:38:45,305][00488] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... [2023-02-26 09:38:45,751][00488] Num frames 100... [2023-02-26 09:38:45,867][00488] Num frames 200... [2023-02-26 09:38:45,978][00488] Num frames 300... [2023-02-26 09:38:46,102][00488] Num frames 400... [2023-02-26 09:38:46,212][00488] Num frames 500... [2023-02-26 09:38:46,325][00488] Num frames 600... [2023-02-26 09:38:46,435][00488] Num frames 700... [2023-02-26 09:38:46,551][00488] Num frames 800... [2023-02-26 09:38:46,670][00488] Num frames 900... [2023-02-26 09:38:46,782][00488] Num frames 1000... [2023-02-26 09:38:46,894][00488] Num frames 1100... [2023-02-26 09:38:47,006][00488] Num frames 1200... [2023-02-26 09:38:47,118][00488] Num frames 1300... [2023-02-26 09:38:47,231][00488] Num frames 1400... [2023-02-26 09:38:47,340][00488] Num frames 1500... [2023-02-26 09:38:47,453][00488] Num frames 1600... [2023-02-26 09:38:47,576][00488] Num frames 1700... [2023-02-26 09:38:47,692][00488] Avg episode rewards: #0: 48.549, true rewards: #0: 17.550 [2023-02-26 09:38:47,693][00488] Avg episode reward: 48.549, avg true_objective: 17.550 [2023-02-26 09:38:47,753][00488] Num frames 1800... [2023-02-26 09:38:47,867][00488] Num frames 1900... [2023-02-26 09:38:47,980][00488] Num frames 2000... [2023-02-26 09:38:48,095][00488] Num frames 2100... [2023-02-26 09:38:48,202][00488] Num frames 2200... [2023-02-26 09:38:48,311][00488] Num frames 2300... 
[2023-02-26 09:38:48,475][00488] Avg episode rewards: #0: 29.975, true rewards: #0: 11.975 [2023-02-26 09:38:48,477][00488] Avg episode reward: 29.975, avg true_objective: 11.975 [2023-02-26 09:38:48,489][00488] Num frames 2400... [2023-02-26 09:38:48,613][00488] Num frames 2500... [2023-02-26 09:38:48,723][00488] Num frames 2600... [2023-02-26 09:38:48,841][00488] Num frames 2700... [2023-02-26 09:38:48,953][00488] Num frames 2800... [2023-02-26 09:38:49,065][00488] Num frames 2900... [2023-02-26 09:38:49,187][00488] Num frames 3000... [2023-02-26 09:38:49,309][00488] Num frames 3100... [2023-02-26 09:38:49,477][00488] Num frames 3200... [2023-02-26 09:38:49,637][00488] Num frames 3300... [2023-02-26 09:38:49,791][00488] Num frames 3400... [2023-02-26 09:38:49,961][00488] Num frames 3500... [2023-02-26 09:38:50,122][00488] Num frames 3600... [2023-02-26 09:38:50,296][00488] Avg episode rewards: #0: 30.587, true rewards: #0: 12.253 [2023-02-26 09:38:50,299][00488] Avg episode reward: 30.587, avg true_objective: 12.253 [2023-02-26 09:38:50,339][00488] Num frames 3700... [2023-02-26 09:38:50,499][00488] Num frames 3800... [2023-02-26 09:38:50,658][00488] Num frames 3900... [2023-02-26 09:38:50,810][00488] Num frames 4000... [2023-02-26 09:38:50,967][00488] Avg episode rewards: #0: 23.900, true rewards: #0: 10.150 [2023-02-26 09:38:50,973][00488] Avg episode reward: 23.900, avg true_objective: 10.150 [2023-02-26 09:38:51,052][00488] Num frames 4100... [2023-02-26 09:38:51,215][00488] Num frames 4200... [2023-02-26 09:38:51,367][00488] Num frames 4300... [2023-02-26 09:38:51,532][00488] Num frames 4400... [2023-02-26 09:38:51,705][00488] Num frames 4500... [2023-02-26 09:38:51,865][00488] Num frames 4600... [2023-02-26 09:38:52,026][00488] Num frames 4700... [2023-02-26 09:38:52,188][00488] Num frames 4800... [2023-02-26 09:38:52,351][00488] Num frames 4900... [2023-02-26 09:38:52,511][00488] Num frames 5000... [2023-02-26 09:38:52,672][00488] Num frames 5100... 
[2023-02-26 09:38:52,814][00488] Num frames 5200... [2023-02-26 09:38:52,884][00488] Avg episode rewards: #0: 24.824, true rewards: #0: 10.424 [2023-02-26 09:38:52,885][00488] Avg episode reward: 24.824, avg true_objective: 10.424 [2023-02-26 09:38:52,983][00488] Num frames 5300... [2023-02-26 09:38:53,094][00488] Num frames 5400... [2023-02-26 09:38:53,208][00488] Num frames 5500... [2023-02-26 09:38:53,316][00488] Num frames 5600... [2023-02-26 09:38:53,425][00488] Num frames 5700... [2023-02-26 09:38:53,536][00488] Num frames 5800... [2023-02-26 09:38:53,647][00488] Num frames 5900... [2023-02-26 09:38:53,763][00488] Num frames 6000... [2023-02-26 09:38:53,874][00488] Num frames 6100... [2023-02-26 09:38:53,983][00488] Num frames 6200... [2023-02-26 09:38:54,113][00488] Num frames 6300... [2023-02-26 09:38:54,225][00488] Num frames 6400... [2023-02-26 09:38:54,336][00488] Num frames 6500... [2023-02-26 09:38:54,447][00488] Num frames 6600... [2023-02-26 09:38:54,557][00488] Num frames 6700... [2023-02-26 09:38:54,672][00488] Num frames 6800... [2023-02-26 09:38:54,789][00488] Num frames 6900... [2023-02-26 09:38:54,901][00488] Num frames 7000... [2023-02-26 09:38:55,014][00488] Num frames 7100... [2023-02-26 09:38:55,126][00488] Num frames 7200... [2023-02-26 09:38:55,238][00488] Num frames 7300... [2023-02-26 09:38:55,308][00488] Avg episode rewards: #0: 31.020, true rewards: #0: 12.187 [2023-02-26 09:38:55,310][00488] Avg episode reward: 31.020, avg true_objective: 12.187 [2023-02-26 09:38:55,409][00488] Num frames 7400... [2023-02-26 09:38:55,527][00488] Num frames 7500... [2023-02-26 09:38:55,639][00488] Num frames 7600... [2023-02-26 09:38:55,750][00488] Num frames 7700... [2023-02-26 09:38:55,866][00488] Num frames 7800... [2023-02-26 09:38:55,976][00488] Num frames 7900... [2023-02-26 09:38:56,088][00488] Num frames 8000... [2023-02-26 09:38:56,200][00488] Num frames 8100... [2023-02-26 09:38:56,310][00488] Num frames 8200... 
[2023-02-26 09:38:56,419][00488] Num frames 8300... [2023-02-26 09:38:56,531][00488] Num frames 8400... [2023-02-26 09:38:56,644][00488] Num frames 8500... [2023-02-26 09:38:56,761][00488] Num frames 8600... [2023-02-26 09:38:56,879][00488] Num frames 8700... [2023-02-26 09:38:56,989][00488] Num frames 8800... [2023-02-26 09:38:57,113][00488] Num frames 8900... [2023-02-26 09:38:57,229][00488] Num frames 9000... [2023-02-26 09:38:57,343][00488] Num frames 9100... [2023-02-26 09:38:57,463][00488] Num frames 9200... [2023-02-26 09:38:57,573][00488] Num frames 9300... [2023-02-26 09:38:57,685][00488] Num frames 9400... [2023-02-26 09:38:57,754][00488] Avg episode rewards: #0: 35.588, true rewards: #0: 13.446 [2023-02-26 09:38:57,756][00488] Avg episode reward: 35.588, avg true_objective: 13.446 [2023-02-26 09:38:57,868][00488] Num frames 9500... [2023-02-26 09:38:57,979][00488] Num frames 9600... [2023-02-26 09:38:58,090][00488] Num frames 9700... [2023-02-26 09:38:58,202][00488] Num frames 9800... [2023-02-26 09:38:58,311][00488] Num frames 9900... [2023-02-26 09:38:58,426][00488] Num frames 10000... [2023-02-26 09:38:58,540][00488] Num frames 10100... [2023-02-26 09:38:58,652][00488] Num frames 10200... [2023-02-26 09:38:58,765][00488] Num frames 10300... [2023-02-26 09:38:58,882][00488] Num frames 10400... [2023-02-26 09:38:58,991][00488] Num frames 10500... [2023-02-26 09:38:59,111][00488] Num frames 10600... [2023-02-26 09:38:59,246][00488] Avg episode rewards: #0: 34.837, true rewards: #0: 13.338 [2023-02-26 09:38:59,248][00488] Avg episode reward: 34.837, avg true_objective: 13.338 [2023-02-26 09:38:59,286][00488] Num frames 10700... [2023-02-26 09:38:59,403][00488] Num frames 10800... [2023-02-26 09:38:59,518][00488] Num frames 10900... [2023-02-26 09:38:59,627][00488] Num frames 11000... [2023-02-26 09:38:59,736][00488] Num frames 11100... [2023-02-26 09:38:59,857][00488] Num frames 11200... [2023-02-26 09:38:59,969][00488] Num frames 11300... 
[2023-02-26 09:39:00,081][00488] Num frames 11400... [2023-02-26 09:39:00,196][00488] Num frames 11500... [2023-02-26 09:39:00,307][00488] Num frames 11600... [2023-02-26 09:39:00,423][00488] Num frames 11700... [2023-02-26 09:39:00,538][00488] Num frames 11800... [2023-02-26 09:39:00,652][00488] Num frames 11900... [2023-02-26 09:39:00,770][00488] Num frames 12000... [2023-02-26 09:39:00,890][00488] Num frames 12100... [2023-02-26 09:39:01,004][00488] Num frames 12200... [2023-02-26 09:39:01,120][00488] Num frames 12300... [2023-02-26 09:39:01,242][00488] Num frames 12400... [2023-02-26 09:39:01,356][00488] Num frames 12500... [2023-02-26 09:39:01,471][00488] Num frames 12600... [2023-02-26 09:39:01,583][00488] Num frames 12700... [2023-02-26 09:39:01,717][00488] Avg episode rewards: #0: 37.077, true rewards: #0: 14.189 [2023-02-26 09:39:01,719][00488] Avg episode reward: 37.077, avg true_objective: 14.189 [2023-02-26 09:39:01,755][00488] Num frames 12800... [2023-02-26 09:39:01,866][00488] Num frames 12900... [2023-02-26 09:39:01,984][00488] Num frames 13000... [2023-02-26 09:39:02,096][00488] Num frames 13100... [2023-02-26 09:39:02,216][00488] Num frames 13200... [2023-02-26 09:39:02,329][00488] Num frames 13300... [2023-02-26 09:39:02,438][00488] Num frames 13400... [2023-02-26 09:39:02,556][00488] Num frames 13500... [2023-02-26 09:39:02,666][00488] Num frames 13600... [2023-02-26 09:39:02,785][00488] Num frames 13700... [2023-02-26 09:39:02,948][00488] Num frames 13800... [2023-02-26 09:39:03,098][00488] Avg episode rewards: #0: 35.658, true rewards: #0: 13.858 [2023-02-26 09:39:03,101][00488] Avg episode reward: 35.658, avg true_objective: 13.858 [2023-02-26 09:40:29,286][00488] Replay video saved to /content/train_dir/default_experiment/replay.mp4!