[2023-02-23 15:11:45,445][00417] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 15:11:45,449][00417] Rollout worker 0 uses device cpu
[2023-02-23 15:11:45,450][00417] Rollout worker 1 uses device cpu
[2023-02-23 15:11:45,452][00417] Rollout worker 2 uses device cpu
[2023-02-23 15:11:45,453][00417] Rollout worker 3 uses device cpu
[2023-02-23 15:11:45,454][00417] Rollout worker 4 uses device cpu
[2023-02-23 15:11:45,456][00417] Rollout worker 5 uses device cpu
[2023-02-23 15:11:45,457][00417] Rollout worker 6 uses device cpu
[2023-02-23 15:11:45,458][00417] Rollout worker 7 uses device cpu
[2023-02-23 15:11:45,655][00417] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:11:45,660][00417] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 15:11:45,691][00417] Starting all processes...
[2023-02-23 15:11:45,693][00417] Starting process learner_proc0
[2023-02-23 15:11:45,751][00417] Starting all processes...
[2023-02-23 15:11:45,761][00417] Starting process inference_proc0-0
[2023-02-23 15:11:45,762][00417] Starting process rollout_proc0
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc1
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc2
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc3
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc4
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc5
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc6
[2023-02-23 15:11:45,764][00417] Starting process rollout_proc7
[2023-02-23 15:11:52,642][00417] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 417], exiting...
[2023-02-23 15:11:52,649][00417] Runner profile tree view:
main_loop: 6.9587
[2023-02-23 15:11:52,660][00417] Collected {}, FPS: 0.0
[2023-02-23 15:11:53,956][00417] Environment doom_basic already registered, overwriting...
[2023-02-23 15:11:53,960][00417] Environment doom_two_colors_easy already registered, overwriting...
[2023-02-23 15:11:53,961][00417] Environment doom_two_colors_hard already registered, overwriting...
[2023-02-23 15:11:53,965][00417] Environment doom_dm already registered, overwriting...
[2023-02-23 15:11:53,967][00417] Environment doom_dwango5 already registered, overwriting...
[2023-02-23 15:11:53,969][00417] Environment doom_my_way_home_flat_actions already registered, overwriting...
[2023-02-23 15:11:53,970][00417] Environment doom_defend_the_center_flat_actions already registered, overwriting...
[2023-02-23 15:11:53,971][00417] Environment doom_my_way_home already registered, overwriting...
[2023-02-23 15:11:53,973][00417] Environment doom_deadly_corridor already registered, overwriting...
[2023-02-23 15:11:53,976][00417] Environment doom_defend_the_center already registered, overwriting...
[2023-02-23 15:11:53,978][00417] Environment doom_defend_the_line already registered, overwriting...
[2023-02-23 15:11:53,990][00417] Environment doom_health_gathering already registered, overwriting...
[2023-02-23 15:11:53,991][00417] Environment doom_health_gathering_supreme already registered, overwriting...
[2023-02-23 15:11:53,996][00417] Environment doom_battle already registered, overwriting...
[2023-02-23 15:11:53,998][00417] Environment doom_battle2 already registered, overwriting...
[2023-02-23 15:11:54,002][00417] Environment doom_duel_bots already registered, overwriting...
[2023-02-23 15:11:54,006][00417] Environment doom_deathmatch_bots already registered, overwriting...
[2023-02-23 15:11:54,008][00417] Environment doom_duel already registered, overwriting...
[2023-02-23 15:11:54,009][00417] Environment doom_deathmatch_full already registered, overwriting...
[2023-02-23 15:11:54,010][00417] Environment doom_benchmark already registered, overwriting...
[2023-02-23 15:11:54,012][00417] register_encoder_factory: <function make_vizdoom_encoder at 0x7fe011dce4c0>
[2023-02-23 15:11:54,027][00417] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 15:11:54,035][00417] Experiment dir /content/train_dir/default_experiment already exists!
[2023-02-23 15:11:54,038][00417] Resuming existing experiment from /content/train_dir/default_experiment...
[2023-02-23 15:11:54,040][00417] Weights and Biases integration disabled
[2023-02-23 15:11:54,046][00417] Environment var CUDA_VISIBLE_DEVICES is 0

[2023-02-23 15:12:24,016][11317] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 15:12:24,022][11317] Rollout worker 0 uses device cpu
[2023-02-23 15:12:24,024][11317] Rollout worker 1 uses device cpu
[2023-02-23 15:12:24,027][11317] Rollout worker 2 uses device cpu
[2023-02-23 15:12:24,030][11317] Rollout worker 3 uses device cpu
[2023-02-23 15:12:24,031][11317] Rollout worker 4 uses device cpu
[2023-02-23 15:12:24,032][11317] Rollout worker 5 uses device cpu
[2023-02-23 15:12:24,034][11317] Rollout worker 6 uses device cpu
[2023-02-23 15:12:24,035][11317] Rollout worker 7 uses device cpu
[2023-02-23 15:12:24,170][11317] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:12:24,172][11317] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 15:12:24,203][11317] Starting all processes...
[2023-02-23 15:12:24,204][11317] Starting process learner_proc0
[2023-02-23 15:12:24,258][11317] Starting all processes...
[2023-02-23 15:12:24,267][11317] Starting process inference_proc0-0
[2023-02-23 15:12:24,268][11317] Starting process rollout_proc0
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc1
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc2
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc3
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc4
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc5
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc6
[2023-02-23 15:12:24,270][11317] Starting process rollout_proc7
[2023-02-23 15:12:34,932][11456] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:12:34,933][11456] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 15:12:35,401][11477] Worker 6 uses CPU cores [0]
[2023-02-23 15:12:35,499][11473] Worker 3 uses CPU cores [1]
[2023-02-23 15:12:35,783][11471] Worker 0 uses CPU cores [0]
[2023-02-23 15:12:35,932][11478] Worker 7 uses CPU cores [1]
[2023-02-23 15:12:36,002][11476] Worker 5 uses CPU cores [1]
[2023-02-23 15:12:36,007][11470] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:12:36,008][11470] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 15:12:36,086][11475] Worker 2 uses CPU cores [0]
[2023-02-23 15:12:36,141][11474] Worker 4 uses CPU cores [0]
[2023-02-23 15:12:36,230][11472] Worker 1 uses CPU cores [1]
[2023-02-23 15:12:36,335][11470] Num visible devices: 1
[2023-02-23 15:12:36,335][11456] Num visible devices: 1
[2023-02-23 15:12:36,353][11456] Starting seed is not provided
[2023-02-23 15:12:36,353][11456] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:12:36,354][11456] Initializing actor-critic model on device cuda:0
[2023-02-23 15:12:36,355][11456] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:12:36,357][11456] RunningMeanStd input shape: (1,)
[2023-02-23 15:12:36,368][11456] ConvEncoder: input_channels=3
[2023-02-23 15:12:36,657][11456] Conv encoder output size: 512
[2023-02-23 15:12:36,658][11456] Policy head output size: 512
[2023-02-23 15:12:36,709][11456] Created Actor Critic model with architecture:
[2023-02-23 15:12:36,709][11456] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-23 15:12:44,128][11456] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 15:12:44,129][11456] No checkpoints found
[2023-02-23 15:12:44,129][11456] Did not load from checkpoint, starting from scratch!
[2023-02-23 15:12:44,129][11456] Initialized policy 0 weights for model version 0
[2023-02-23 15:12:44,134][11456] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:12:44,145][11456] LearnerWorker_p0 finished initialization!
[2023-02-23 15:12:44,167][11317] Heartbeat connected on LearnerWorker_p0
[2023-02-23 15:12:44,220][11317] Heartbeat connected on RolloutWorker_w0
[2023-02-23 15:12:44,224][11317] Heartbeat connected on RolloutWorker_w2
[2023-02-23 15:12:44,227][11317] Heartbeat connected on RolloutWorker_w4
[2023-02-23 15:12:44,231][11317] Heartbeat connected on RolloutWorker_w6
[2023-02-23 15:12:44,269][11317] Heartbeat connected on RolloutWorker_w1
[2023-02-23 15:12:44,270][11317] Heartbeat connected on RolloutWorker_w7
[2023-02-23 15:12:44,271][11317] Heartbeat connected on RolloutWorker_w5
[2023-02-23 15:12:44,273][11317] Heartbeat connected on RolloutWorker_w3
[2023-02-23 15:12:44,329][11317] Heartbeat connected on Batcher_0
[2023-02-23 15:12:44,384][11470] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:12:44,385][11470] RunningMeanStd input shape: (1,)
[2023-02-23 15:12:44,399][11470] ConvEncoder: input_channels=3
[2023-02-23 15:12:44,501][11470] Conv encoder output size: 512
[2023-02-23 15:12:44,502][11470] Policy head output size: 512
[2023-02-23 15:12:45,912][11317] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:12:47,348][11317] Inference worker 0-0 is ready!
[2023-02-23 15:12:47,351][11317] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 15:12:47,352][11317] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 15:12:47,522][11476] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,533][11477] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,539][11474] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,542][11471] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,544][11472] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,573][11478] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,571][11475] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:47,619][11473] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:12:48,452][11478] Decorrelating experience for 0 frames...
[2023-02-23 15:12:48,896][11474] Decorrelating experience for 0 frames...
[2023-02-23 15:12:48,901][11477] Decorrelating experience for 0 frames...
[2023-02-23 15:12:48,918][11475] Decorrelating experience for 0 frames...
[2023-02-23 15:12:49,499][11478] Decorrelating experience for 32 frames...
[2023-02-23 15:12:50,055][11471] Decorrelating experience for 0 frames...
[2023-02-23 15:12:50,058][11474] Decorrelating experience for 32 frames...
[2023-02-23 15:12:50,068][11475] Decorrelating experience for 32 frames...
[2023-02-23 15:12:50,188][11476] Decorrelating experience for 0 frames...
[2023-02-23 15:12:50,801][11478] Decorrelating experience for 64 frames...
[2023-02-23 15:12:50,912][11317] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:12:51,088][11473] Decorrelating experience for 0 frames...
[2023-02-23 15:12:51,210][11471] Decorrelating experience for 32 frames...
[2023-02-23 15:12:51,213][11477] Decorrelating experience for 32 frames...
[2023-02-23 15:12:51,407][11474] Decorrelating experience for 64 frames...
[2023-02-23 15:12:52,040][11478] Decorrelating experience for 96 frames...
[2023-02-23 15:12:52,124][11477] Decorrelating experience for 64 frames...
[2023-02-23 15:12:52,196][11474] Decorrelating experience for 96 frames...
[2023-02-23 15:12:52,393][11473] Decorrelating experience for 32 frames...
[2023-02-23 15:12:52,408][11472] Decorrelating experience for 0 frames...
[2023-02-23 15:12:52,542][11476] Decorrelating experience for 32 frames...
[2023-02-23 15:12:53,507][11471] Decorrelating experience for 64 frames...
[2023-02-23 15:12:53,538][11477] Decorrelating experience for 96 frames...
[2023-02-23 15:12:53,556][11472] Decorrelating experience for 32 frames...
[2023-02-23 15:12:53,733][11475] Decorrelating experience for 64 frames...
[2023-02-23 15:12:53,793][11473] Decorrelating experience for 64 frames...
[2023-02-23 15:12:54,498][11475] Decorrelating experience for 96 frames...
[2023-02-23 15:12:54,497][11471] Decorrelating experience for 96 frames...
[2023-02-23 15:12:54,795][11476] Decorrelating experience for 64 frames...
[2023-02-23 15:12:54,984][11473] Decorrelating experience for 96 frames...
[2023-02-23 15:12:55,421][11472] Decorrelating experience for 64 frames...
[2023-02-23 15:12:55,630][11476] Decorrelating experience for 96 frames...
[2023-02-23 15:12:55,912][11317] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:12:56,187][11472] Decorrelating experience for 96 frames...
[2023-02-23 15:12:58,653][11317] Keyboard interrupt detected in the event loop EvtLoop [Runner_EvtLoop, process=main process 11317], exiting...
[2023-02-23 15:12:58,660][11456] Stopping Batcher_0...
[2023-02-23 15:12:58,661][11456] Loop batcher_evt_loop terminating...
[2023-02-23 15:12:58,659][11317] Runner profile tree view:
main_loop: 34.4556
[2023-02-23 15:12:58,663][11317] Collected {0: 0}, FPS: 0.0
[2023-02-23 15:12:58,691][11471] EvtLoop [rollout_proc0_evt_loop, process=rollout_proc0] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance0'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,642][11475] EvtLoop [rollout_proc2_evt_loop, process=rollout_proc2] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance2'), args=(0, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,728][11470] Weights refcount: 2 0
[2023-02-23 15:12:58,727][11478] EvtLoop [rollout_proc7_evt_loop, process=rollout_proc7] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance7'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,706][11474] EvtLoop [rollout_proc4_evt_loop, process=rollout_proc4] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance4'), args=(0, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,735][11470] Stopping InferenceWorker_p0-w0...
[2023-02-23 15:12:58,736][11470] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 15:12:58,737][11471] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc0_evt_loop
[2023-02-23 15:12:58,694][11472] EvtLoop [rollout_proc1_evt_loop, process=rollout_proc1] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance1'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,739][11472] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc1_evt_loop
[2023-02-23 15:12:58,730][11475] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc2_evt_loop
[2023-02-23 15:12:58,681][11476] EvtLoop [rollout_proc5_evt_loop, process=rollout_proc5] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance5'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,755][11476] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc5_evt_loop
[2023-02-23 15:12:58,731][11474] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc4_evt_loop
[2023-02-23 15:12:58,739][11477] EvtLoop [rollout_proc6_evt_loop, process=rollout_proc6] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance6'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,787][11477] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc6_evt_loop
[2023-02-23 15:12:58,697][11473] EvtLoop [rollout_proc3_evt_loop, process=rollout_proc3] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance3'), args=(1, 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/signal_slot/signal_slot.py", line 355, in _process_signal
    slot_callable(*args)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts
    complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts
    new_obs, rewards, terminated, truncated, infos = e.step(actions)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 129, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/algo/utils/make_env.py", line 115, in step
    obs, rew, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/scenario_wrappers/gathering_reward_shaping.py", line 33, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 384, in step
    observation, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sample_factory/envs/env_wrappers.py", line 88, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/gym/core.py", line 319, in step
    return self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/wrappers/multiplayer_stats.py", line 54, in step
    obs, reward, terminated, truncated, info = self.env.step(action)
  File "/usr/local/lib/python3.8/dist-packages/sf_examples/vizdoom/doom/doom_gym.py", line 452, in step
    reward = self.game.make_action(actions_flattened, self.skip_frames)
vizdoom.vizdoom.SignalException: Signal SIGINT received. ViZDoom instance has been closed.
[2023-02-23 15:12:58,790][11473] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc3_evt_loop
[2023-02-23 15:12:58,728][11478] Unhandled exception Signal SIGINT received. ViZDoom instance has been closed. in evt loop rollout_proc7_evt_loop
[2023-02-23 15:13:03,759][11456] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000001_4096.pth...
[2023-02-23 15:13:03,826][11456] Stopping LearnerWorker_p0...
[2023-02-23 15:13:03,826][11456] Loop learner_proc0_evt_loop terminating...
[2023-02-23 15:17:18,614][14252] Saving configuration to /content/train_dir/default_experiment/config.json...
[2023-02-23 15:17:18,617][14252] Rollout worker 0 uses device cpu
[2023-02-23 15:17:18,621][14252] Rollout worker 1 uses device cpu
[2023-02-23 15:17:18,624][14252] Rollout worker 2 uses device cpu
[2023-02-23 15:17:18,626][14252] Rollout worker 3 uses device cpu
[2023-02-23 15:17:18,630][14252] Rollout worker 4 uses device cpu
[2023-02-23 15:17:18,635][14252] Rollout worker 5 uses device cpu
[2023-02-23 15:17:18,636][14252] Rollout worker 6 uses device cpu
[2023-02-23 15:17:18,639][14252] Rollout worker 7 uses device cpu
[2023-02-23 15:17:18,791][14252] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:17:18,793][14252] InferenceWorker_p0-w0: min num requests: 2
[2023-02-23 15:17:18,827][14252] Starting all processes...
[2023-02-23 15:17:18,830][14252] Starting process learner_proc0
[2023-02-23 15:17:18,890][14252] Starting all processes...
[2023-02-23 15:17:18,901][14252] Starting process inference_proc0-0
[2023-02-23 15:17:18,902][14252] Starting process rollout_proc0
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc1
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc2
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc3
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc4
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc5
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc6
[2023-02-23 15:17:18,904][14252] Starting process rollout_proc7
[2023-02-23 15:17:30,288][15287] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:17:30,292][15287] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-02-23 15:17:30,461][15302] Worker 0 uses CPU cores [0]
[2023-02-23 15:17:30,721][15308] Worker 6 uses CPU cores [0]
[2023-02-23 15:17:30,725][15306] Worker 4 uses CPU cores [0]
[2023-02-23 15:17:30,737][15301] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:17:30,738][15301] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-02-23 15:17:30,876][15309] Worker 7 uses CPU cores [1]
[2023-02-23 15:17:30,880][15303] Worker 1 uses CPU cores [1]
[2023-02-23 15:17:30,907][15307] Worker 5 uses CPU cores [1]
[2023-02-23 15:17:30,924][15305] Worker 3 uses CPU cores [1]
[2023-02-23 15:17:30,944][15304] Worker 2 uses CPU cores [0]
[2023-02-23 15:17:31,382][15301] Num visible devices: 1
[2023-02-23 15:17:31,381][15287] Num visible devices: 1
[2023-02-23 15:17:31,391][15287] Starting seed is not provided
[2023-02-23 15:17:31,391][15287] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:17:31,392][15287] Initializing actor-critic model on device cuda:0
[2023-02-23 15:17:31,393][15287] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:17:31,394][15287] RunningMeanStd input shape: (1,)
[2023-02-23 15:17:31,411][15287] ConvEncoder: input_channels=3
[2023-02-23 15:17:31,542][15287] Conv encoder output size: 512
[2023-02-23 15:17:31,543][15287] Policy head output size: 512
[2023-02-23 15:17:31,558][15287] Created Actor Critic model with architecture:
[2023-02-23 15:17:31,558][15287] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2023-02-23 15:17:33,992][15287] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-02-23 15:17:33,993][15287] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000001_4096.pth...
[2023-02-23 15:17:34,025][15287] Loading model from checkpoint
[2023-02-23 15:17:34,029][15287] Loaded experiment state at self.train_step=1, self.env_steps=4096
[2023-02-23 15:17:34,030][15287] Initialized policy 0 weights for model version 1
[2023-02-23 15:17:34,034][15287] LearnerWorker_p0 finished initialization!
[2023-02-23 15:17:34,035][15287] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-02-23 15:17:34,242][15301] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:17:34,245][15301] RunningMeanStd input shape: (1,)
[2023-02-23 15:17:34,257][15301] ConvEncoder: input_channels=3
[2023-02-23 15:17:34,354][15301] Conv encoder output size: 512
[2023-02-23 15:17:34,355][15301] Policy head output size: 512
[2023-02-23 15:17:35,449][14252] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4096. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:17:36,551][14252] Inference worker 0-0 is ready!
[2023-02-23 15:17:36,553][14252] All inference workers are ready! Signal rollout workers to start!
[2023-02-23 15:17:36,649][15309] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,651][15303] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,657][15306] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,654][15302] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,659][15304] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,658][15308] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,663][15305] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:36,655][15307] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:17:37,451][15309] Decorrelating experience for 0 frames...
[2023-02-23 15:17:37,452][15307] Decorrelating experience for 0 frames...
[2023-02-23 15:17:37,819][15307] Decorrelating experience for 32 frames...
[2023-02-23 15:17:38,042][15306] Decorrelating experience for 0 frames...
[2023-02-23 15:17:38,044][15308] Decorrelating experience for 0 frames...
[2023-02-23 15:17:38,047][15302] Decorrelating experience for 0 frames...
[2023-02-23 15:17:38,048][15304] Decorrelating experience for 0 frames...
[2023-02-23 15:17:38,405][15309] Decorrelating experience for 32 frames...
[2023-02-23 15:17:38,783][14252] Heartbeat connected on Batcher_0
[2023-02-23 15:17:38,787][14252] Heartbeat connected on LearnerWorker_p0
[2023-02-23 15:17:38,822][14252] Heartbeat connected on InferenceWorker_p0-w0
[2023-02-23 15:17:39,155][15303] Decorrelating experience for 0 frames...
[2023-02-23 15:17:39,665][15302] Decorrelating experience for 32 frames...
[2023-02-23 15:17:39,672][15308] Decorrelating experience for 32 frames...
[2023-02-23 15:17:39,670][15304] Decorrelating experience for 32 frames...
[2023-02-23 15:17:39,676][15306] Decorrelating experience for 32 frames...
[2023-02-23 15:17:40,139][15303] Decorrelating experience for 32 frames...
[2023-02-23 15:17:40,450][14252] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4096. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:17:41,001][15305] Decorrelating experience for 0 frames...
[2023-02-23 15:17:41,462][15303] Decorrelating experience for 64 frames...
[2023-02-23 15:17:41,960][15302] Decorrelating experience for 64 frames...
[2023-02-23 15:17:41,958][15306] Decorrelating experience for 64 frames...
[2023-02-23 15:17:41,962][15304] Decorrelating experience for 64 frames...
[2023-02-23 15:17:42,216][15308] Decorrelating experience for 64 frames...
[2023-02-23 15:17:42,242][15305] Decorrelating experience for 32 frames...
[2023-02-23 15:17:42,278][15307] Decorrelating experience for 64 frames...
[2023-02-23 15:17:43,536][15306] Decorrelating experience for 96 frames...
[2023-02-23 15:17:43,538][15302] Decorrelating experience for 96 frames...
[2023-02-23 15:17:43,753][15309] Decorrelating experience for 64 frames...
[2023-02-23 15:17:43,805][14252] Heartbeat connected on RolloutWorker_w4
[2023-02-23 15:17:43,822][14252] Heartbeat connected on RolloutWorker_w0
[2023-02-23 15:17:43,910][15308] Decorrelating experience for 96 frames...
[2023-02-23 15:17:44,078][15307] Decorrelating experience for 96 frames...
[2023-02-23 15:17:44,318][14252] Heartbeat connected on RolloutWorker_w6
[2023-02-23 15:17:44,678][14252] Heartbeat connected on RolloutWorker_w5
[2023-02-23 15:17:45,160][15304] Decorrelating experience for 96 frames...
[2023-02-23 15:17:45,449][14252] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4096. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:17:45,466][14252] Heartbeat connected on RolloutWorker_w2
[2023-02-23 15:17:45,659][15305] Decorrelating experience for 64 frames...
[2023-02-23 15:17:47,474][15303] Decorrelating experience for 96 frames...
[2023-02-23 15:17:47,790][14252] Heartbeat connected on RolloutWorker_w1
[2023-02-23 15:17:48,103][15309] Decorrelating experience for 96 frames...
[2023-02-23 15:17:48,381][15305] Decorrelating experience for 96 frames...
[2023-02-23 15:17:48,654][14252] Heartbeat connected on RolloutWorker_w7
[2023-02-23 15:17:48,795][14252] Heartbeat connected on RolloutWorker_w3
[2023-02-23 15:17:49,508][15287] Signal inference workers to stop experience collection...
[2023-02-23 15:17:49,514][15301] InferenceWorker_p0-w0: stopping experience collection
[2023-02-23 15:17:50,448][14252] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4096. Throughput: 0: 120.7. Samples: 1810. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-02-23 15:17:50,452][14252] Avg episode reward: [(0, '2.505')]
[2023-02-23 15:17:50,759][15287] Signal inference workers to resume experience collection...
[2023-02-23 15:17:50,760][15301] InferenceWorker_p0-w0: resuming experience collection
[2023-02-23 15:17:55,448][14252] Fps is (10 sec: 2457.6, 60 sec: 1228.8, 300 sec: 1228.8). Total num frames: 28672. Throughput: 0: 292.4. Samples: 5848. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:17:55,457][14252] Avg episode reward: [(0, '3.360')]
[2023-02-23 15:17:59,742][15301] Updated weights for policy 0, policy_version 11 (0.0017)
[2023-02-23 15:18:00,448][14252] Fps is (10 sec: 4096.0, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 45056. Throughput: 0: 358.5. Samples: 8962. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 15:18:00,451][14252] Avg episode reward: [(0, '3.907')]
[2023-02-23 15:18:05,449][14252] Fps is (10 sec: 2867.2, 60 sec: 1774.9, 300 sec: 1774.9). Total num frames: 57344. Throughput: 0: 445.6. Samples: 13368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:18:05,455][14252] Avg episode reward: [(0, '4.337')]
[2023-02-23 15:18:10,448][14252] Fps is (10 sec: 3686.4, 60 sec: 2223.6, 300 sec: 2223.6). Total num frames: 81920. Throughput: 0: 540.3. Samples: 18910. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:18:10,454][14252] Avg episode reward: [(0, '4.496')]
[2023-02-23 15:18:11,415][15301] Updated weights for policy 0, policy_version 21 (0.0021)
[2023-02-23 15:18:15,448][14252] Fps is (10 sec: 4505.6, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 102400. Throughput: 0: 560.9. Samples: 22434. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:18:15,450][14252] Avg episode reward: [(0, '4.377')]
[2023-02-23 15:18:15,465][15287] Saving new best policy, reward=4.377!
[2023-02-23 15:18:20,448][14252] Fps is (10 sec: 3686.4, 60 sec: 2548.6, 300 sec: 2548.6). Total num frames: 118784. Throughput: 0: 635.5. Samples: 28596. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:18:20,452][14252] Avg episode reward: [(0, '4.542')]
[2023-02-23 15:18:20,456][15287] Saving new best policy, reward=4.542!
[2023-02-23 15:18:22,247][15301] Updated weights for policy 0, policy_version 31 (0.0019)
[2023-02-23 15:18:25,449][14252] Fps is (10 sec: 3276.8, 60 sec: 2621.4, 300 sec: 2621.4). Total num frames: 135168. Throughput: 0: 733.0. Samples: 32982. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:18:25,456][14252] Avg episode reward: [(0, '4.660')]
[2023-02-23 15:18:25,467][15287] Saving new best policy, reward=4.660!
[2023-02-23 15:18:30,448][14252] Fps is (10 sec: 3686.4, 60 sec: 2755.5, 300 sec: 2755.5). Total num frames: 155648. Throughput: 0: 785.5. Samples: 35346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:18:30,450][14252] Avg episode reward: [(0, '4.470')]
[2023-02-23 15:18:32,874][15301] Updated weights for policy 0, policy_version 41 (0.0017)
[2023-02-23 15:18:35,450][14252] Fps is (10 sec: 4095.5, 60 sec: 2867.1, 300 sec: 2867.1). Total num frames: 176128. Throughput: 0: 900.6. Samples: 42340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:18:35,455][14252] Avg episode reward: [(0, '4.446')]
[2023-02-23 15:18:40,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3208.6, 300 sec: 2961.7). Total num frames: 196608. Throughput: 0: 941.7. Samples: 48226. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:18:40,454][14252] Avg episode reward: [(0, '4.401')]
[2023-02-23 15:18:44,334][15301] Updated weights for policy 0, policy_version 51 (0.0016)
[2023-02-23 15:18:45,448][14252] Fps is (10 sec: 3277.2, 60 sec: 3413.3, 300 sec: 2925.7). Total num frames: 208896. Throughput: 0: 922.0. Samples: 50452. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:18:45,452][14252] Avg episode reward: [(0, '4.464')]
[2023-02-23 15:18:50,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3003.7). Total num frames: 229376. Throughput: 0: 938.1. Samples: 55584. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:18:50,451][14252] Avg episode reward: [(0, '4.460')]
[2023-02-23 15:18:54,044][15301] Updated weights for policy 0, policy_version 61 (0.0013)
[2023-02-23 15:18:55,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3754.7, 300 sec: 3123.2). Total num frames: 253952. Throughput: 0: 970.6. Samples: 62588. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:18:55,451][14252] Avg episode reward: [(0, '4.315')]
[2023-02-23 15:19:00,454][14252] Fps is (10 sec: 4093.7, 60 sec: 3754.3, 300 sec: 3132.0). Total num frames: 270336. Throughput: 0: 963.3. Samples: 65786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:19:00,457][14252] Avg episode reward: [(0, '4.543')]
[2023-02-23 15:19:05,449][14252] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3140.3). Total num frames: 286720. Throughput: 0: 924.1. Samples: 70180. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:19:05,451][14252] Avg episode reward: [(0, '4.613')]
[2023-02-23 15:19:06,599][15301] Updated weights for policy 0, policy_version 71 (0.0030)
[2023-02-23 15:19:10,448][14252] Fps is (10 sec: 3688.4, 60 sec: 3754.7, 300 sec: 3190.6). Total num frames: 307200. Throughput: 0: 952.5. Samples: 75846. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:19:10,457][14252] Avg episode reward: [(0, '4.490')]
[2023-02-23 15:19:15,448][14252] Fps is (10 sec: 4096.2, 60 sec: 3754.7, 300 sec: 3235.8). Total num frames: 327680. Throughput: 0: 978.3. Samples: 79370. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:19:15,456][14252] Avg episode reward: [(0, '4.392')]
[2023-02-23 15:19:15,487][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000081_331776.pth...
[2023-02-23 15:19:15,494][15301] Updated weights for policy 0, policy_version 81 (0.0023)
[2023-02-23 15:19:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 348160. Throughput: 0: 955.7. Samples: 85346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:19:20,453][14252] Avg episode reward: [(0, '4.379')]
[2023-02-23 15:19:25,449][14252] Fps is (10 sec: 3276.6, 60 sec: 3754.6, 300 sec: 3239.6). Total num frames: 360448. Throughput: 0: 921.1. Samples: 89678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:19:25,453][14252] Avg episode reward: [(0, '4.315')]
[2023-02-23 15:19:28,099][15301] Updated weights for policy 0, policy_version 91 (0.0019)
[2023-02-23 15:19:30,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3276.8). Total num frames: 380928. Throughput: 0: 930.5. Samples: 92324. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:19:30,450][14252] Avg episode reward: [(0, '4.384')]
[2023-02-23 15:19:35,448][14252] Fps is (10 sec: 4505.8, 60 sec: 3823.0, 300 sec: 3345.1). Total num frames: 405504. Throughput: 0: 970.1. Samples: 99238. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:19:35,456][14252] Avg episode reward: [(0, '4.644')]
[2023-02-23 15:19:36,821][15301] Updated weights for policy 0, policy_version 101 (0.0018)
[2023-02-23 15:19:40,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3342.3). Total num frames: 421888. Throughput: 0: 942.9. Samples: 105020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:19:40,457][14252] Avg episode reward: [(0, '4.666')]
[2023-02-23 15:19:40,462][15287] Saving new best policy, reward=4.666!
[2023-02-23 15:19:45,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3339.8). Total num frames: 438272. Throughput: 0: 917.1. Samples: 107050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:19:45,462][14252] Avg episode reward: [(0, '4.482')]
[2023-02-23 15:19:49,313][15301] Updated weights for policy 0, policy_version 111 (0.0030)
[2023-02-23 15:19:50,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3367.8). Total num frames: 458752. Throughput: 0: 936.6. Samples: 112326. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:19:50,456][14252] Avg episode reward: [(0, '4.636')]
[2023-02-23 15:19:55,449][14252] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3393.8). Total num frames: 479232. Throughput: 0: 966.6. Samples: 119344. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:19:55,454][14252] Avg episode reward: [(0, '4.722')]
[2023-02-23 15:19:55,545][15287] Saving new best policy, reward=4.722!
[2023-02-23 15:19:59,069][15301] Updated weights for policy 0, policy_version 121 (0.0014)
[2023-02-23 15:20:00,453][14252] Fps is (10 sec: 3684.7, 60 sec: 3754.7, 300 sec: 3389.7). Total num frames: 495616. Throughput: 0: 956.6. Samples: 122420. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:20:00,458][14252] Avg episode reward: [(0, '4.685')]
[2023-02-23 15:20:05,450][14252] Fps is (10 sec: 3276.3, 60 sec: 3754.6, 300 sec: 3386.0). Total num frames: 512000. Throughput: 0: 921.8. Samples: 126828. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:05,454][14252] Avg episode reward: [(0, '4.660')]
[2023-02-23 15:20:10,448][14252] Fps is (10 sec: 3688.1, 60 sec: 3754.7, 300 sec: 3408.9). Total num frames: 532480. Throughput: 0: 946.9. Samples: 132288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:10,451][14252] Avg episode reward: [(0, '4.488')]
[2023-02-23 15:20:10,760][15301] Updated weights for policy 0, policy_version 131 (0.0024)
[2023-02-23 15:20:15,448][14252] Fps is (10 sec: 4506.3, 60 sec: 3822.9, 300 sec: 3456.0). Total num frames: 557056. Throughput: 0: 965.3. Samples: 135764. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:20:15,451][14252] Avg episode reward: [(0, '4.815')]
[2023-02-23 15:20:15,474][15287] Saving new best policy, reward=4.815!
[2023-02-23 15:20:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3450.6). Total num frames: 573440. Throughput: 0: 950.0. Samples: 141986. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:20:20,451][14252] Avg episode reward: [(0, '4.941')]
[2023-02-23 15:20:20,454][15287] Saving new best policy, reward=4.941!
[2023-02-23 15:20:21,275][15301] Updated weights for policy 0, policy_version 141 (0.0018)
[2023-02-23 15:20:25,449][14252] Fps is (10 sec: 2867.2, 60 sec: 3754.7, 300 sec: 3421.4). Total num frames: 585728. Throughput: 0: 917.9. Samples: 146324. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:25,452][14252] Avg episode reward: [(0, '4.751')]
[2023-02-23 15:20:30,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3464.0). Total num frames: 610304. Throughput: 0: 930.0. Samples: 148902. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:30,451][14252] Avg episode reward: [(0, '4.574')]
[2023-02-23 15:20:31,992][15301] Updated weights for policy 0, policy_version 151 (0.0021)
[2023-02-23 15:20:35,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3754.7, 300 sec: 3481.6). Total num frames: 630784. Throughput: 0: 967.4. Samples: 155860. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:20:35,451][14252] Avg episode reward: [(0, '4.500')]
[2023-02-23 15:20:40,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3498.2). Total num frames: 651264. Throughput: 0: 943.9. Samples: 161818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:20:40,455][14252] Avg episode reward: [(0, '4.748')]
[2023-02-23 15:20:43,004][15301] Updated weights for policy 0, policy_version 161 (0.0013)
[2023-02-23 15:20:45,449][14252] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3470.8). Total num frames: 663552. Throughput: 0: 924.8. Samples: 164032. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:20:45,460][14252] Avg episode reward: [(0, '4.819')]
[2023-02-23 15:20:50,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3754.7, 300 sec: 3486.9). Total num frames: 684032. Throughput: 0: 942.2. Samples: 169224. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:50,454][14252] Avg episode reward: [(0, '4.486')]
[2023-02-23 15:20:53,186][15301] Updated weights for policy 0, policy_version 171 (0.0013)
[2023-02-23 15:20:55,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3522.6). Total num frames: 708608. Throughput: 0: 976.7. Samples: 176238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:20:55,453][14252] Avg episode reward: [(0, '4.421')]
[2023-02-23 15:21:00,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3891.5, 300 sec: 3536.5). Total num frames: 729088. Throughput: 0: 973.1. Samples: 179552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:21:00,455][14252] Avg episode reward: [(0, '4.558')]
[2023-02-23 15:21:04,848][15301] Updated weights for policy 0, policy_version 181 (0.0018)
[2023-02-23 15:21:05,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3823.0, 300 sec: 3510.9). Total num frames: 741376. Throughput: 0: 931.0. Samples: 183882. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:21:05,452][14252] Avg episode reward: [(0, '4.558')]
[2023-02-23 15:21:10,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3524.5). Total num frames: 761856. Throughput: 0: 960.4. Samples: 189544. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:21:10,458][14252] Avg episode reward: [(0, '4.696')]
[2023-02-23 15:21:14,337][15301] Updated weights for policy 0, policy_version 191 (0.0012)
[2023-02-23 15:21:15,449][14252] Fps is (10 sec: 4505.5, 60 sec: 3822.9, 300 sec: 3556.1). Total num frames: 786432. Throughput: 0: 982.3. Samples: 193106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:21:15,452][14252] Avg episode reward: [(0, '4.928')]
[2023-02-23 15:21:15,461][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth...
[2023-02-23 15:21:15,589][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000001_4096.pth
[2023-02-23 15:21:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3549.9). Total num frames: 802816. Throughput: 0: 966.8. Samples: 199368. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:21:20,453][14252] Avg episode reward: [(0, '5.181')]
[2023-02-23 15:21:20,455][15287] Saving new best policy, reward=5.181!
[2023-02-23 15:21:25,448][14252] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3526.1). Total num frames: 815104. Throughput: 0: 929.9. Samples: 203662. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:21:25,457][14252] Avg episode reward: [(0, '4.964')]
[2023-02-23 15:21:26,696][15301] Updated weights for policy 0, policy_version 201 (0.0020)
[2023-02-23 15:21:30,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3555.7). Total num frames: 839680. Throughput: 0: 939.1. Samples: 206292. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:21:30,451][14252] Avg episode reward: [(0, '4.747')]
[2023-02-23 15:21:35,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3566.9). Total num frames: 860160. Throughput: 0: 982.7. Samples: 213444. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:21:35,456][14252] Avg episode reward: [(0, '4.807')]
[2023-02-23 15:21:35,552][15301] Updated weights for policy 0, policy_version 211 (0.0012)
[2023-02-23 15:21:40,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3577.7). Total num frames: 880640. Throughput: 0: 954.5. Samples: 219192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:21:40,451][14252] Avg episode reward: [(0, '4.880')]
[2023-02-23 15:21:45,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3555.3). Total num frames: 892928. Throughput: 0: 931.2. Samples: 221454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:21:45,454][14252] Avg episode reward: [(0, '5.105')]
[2023-02-23 15:21:47,710][15301] Updated weights for policy 0, policy_version 221 (0.0016)
[2023-02-23 15:21:50,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3582.0). Total num frames: 917504. Throughput: 0: 954.9. Samples: 226854. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:21:50,451][14252] Avg episode reward: [(0, '5.186')]
[2023-02-23 15:21:50,457][15287] Saving new best policy, reward=5.186!
[2023-02-23 15:21:55,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3591.9). Total num frames: 937984. Throughput: 0: 983.4. Samples: 233798. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:21:55,451][14252] Avg episode reward: [(0, '4.596')]
[2023-02-23 15:21:56,708][15301] Updated weights for policy 0, policy_version 231 (0.0017)
[2023-02-23 15:22:00,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3585.9). Total num frames: 954368. Throughput: 0: 971.5. Samples: 236824. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:22:00,457][14252] Avg episode reward: [(0, '4.756')]
[2023-02-23 15:22:05,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3580.2). Total num frames: 970752. Throughput: 0: 931.7. Samples: 241296. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:22:05,450][14252] Avg episode reward: [(0, '5.073')]
[2023-02-23 15:22:08,917][15301] Updated weights for policy 0, policy_version 241 (0.0023)
[2023-02-23 15:22:10,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3589.6). Total num frames: 991232. Throughput: 0: 967.2. Samples: 247186. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:22:10,451][14252] Avg episode reward: [(0, '5.340')]
[2023-02-23 15:22:10,456][15287] Saving new best policy, reward=5.340!
[2023-02-23 15:22:15,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3613.3). Total num frames: 1015808. Throughput: 0: 986.2. Samples: 250670. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:22:15,457][14252] Avg episode reward: [(0, '5.479')]
[2023-02-23 15:22:15,471][15287] Saving new best policy, reward=5.479!
[2023-02-23 15:22:18,280][15301] Updated weights for policy 0, policy_version 251 (0.0023)
[2023-02-23 15:22:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3607.4). Total num frames: 1032192. Throughput: 0: 961.9. Samples: 256728. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:22:20,455][14252] Avg episode reward: [(0, '5.678')]
[2023-02-23 15:22:20,458][15287] Saving new best policy, reward=5.678!
[2023-02-23 15:22:25,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3601.7). Total num frames: 1048576. Throughput: 0: 933.2. Samples: 261184. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:22:25,453][14252] Avg episode reward: [(0, '5.678')]
[2023-02-23 15:22:29,863][15301] Updated weights for policy 0, policy_version 261 (0.0023)
[2023-02-23 15:22:30,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3610.0). Total num frames: 1069056. Throughput: 0: 944.0. Samples: 263934. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:22:30,454][14252] Avg episode reward: [(0, '5.838')]
[2023-02-23 15:22:30,458][15287] Saving new best policy, reward=5.838!
[2023-02-23 15:22:35,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3693.4). Total num frames: 1093632. Throughput: 0: 981.0. Samples: 270998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:22:35,452][14252] Avg episode reward: [(0, '5.844')]
[2023-02-23 15:22:35,464][15287] Saving new best policy, reward=5.844!
[2023-02-23 15:22:40,181][15301] Updated weights for policy 0, policy_version 271 (0.0013)
[2023-02-23 15:22:40,450][14252] Fps is (10 sec: 4095.3, 60 sec: 3822.8, 300 sec: 3748.9). Total num frames: 1110016. Throughput: 0: 951.4. Samples: 276614. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:22:40,453][14252] Avg episode reward: [(0, '6.024')]
[2023-02-23 15:22:40,454][15287] Saving new best policy, reward=6.024!
[2023-02-23 15:22:45,449][14252] Fps is (10 sec: 2867.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1122304. Throughput: 0: 933.9. Samples: 278848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 15:22:45,451][14252] Avg episode reward: [(0, '6.245')]
[2023-02-23 15:22:45,470][15287] Saving new best policy, reward=6.245!
[2023-02-23 15:22:50,448][14252] Fps is (10 sec: 3687.0, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1146880. Throughput: 0: 955.4. Samples: 284288. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:22:50,455][14252] Avg episode reward: [(0, '6.428')]
[2023-02-23 15:22:50,464][15287] Saving new best policy, reward=6.428!
[2023-02-23 15:22:51,176][15301] Updated weights for policy 0, policy_version 281 (0.0027)
[2023-02-23 15:22:55,448][14252] Fps is (10 sec: 4915.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1171456. Throughput: 0: 982.8. Samples: 291414. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:22:55,455][14252] Avg episode reward: [(0, '6.566')]
[2023-02-23 15:22:55,469][15287] Saving new best policy, reward=6.566!
[2023-02-23 15:23:00,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 1187840. Throughput: 0: 972.0. Samples: 294408. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 15:23:00,451][14252] Avg episode reward: [(0, '7.006')]
[2023-02-23 15:23:00,456][15287] Saving new best policy, reward=7.006!
[2023-02-23 15:23:01,600][15301] Updated weights for policy 0, policy_version 291 (0.0015)
[2023-02-23 15:23:05,449][14252] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3790.5). Total num frames: 1200128. Throughput: 0: 936.4. Samples: 298866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:23:05,451][14252] Avg episode reward: [(0, '6.681')]
[2023-02-23 15:23:10,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3804.4). Total num frames: 1224704. Throughput: 0: 971.1. Samples: 304882. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:23:10,456][14252] Avg episode reward: [(0, '6.374')]
[2023-02-23 15:23:12,034][15301] Updated weights for policy 0, policy_version 301 (0.0026)
[2023-02-23 15:23:15,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1245184. Throughput: 0: 988.4. Samples: 308414. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:23:15,451][14252] Avg episode reward: [(0, '6.767')]
[2023-02-23 15:23:15,514][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000305_1249280.pth...
[2023-02-23 15:23:15,628][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000081_331776.pth
[2023-02-23 15:23:20,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1261568. Throughput: 0: 964.5. Samples: 314400. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:23:20,451][14252] Avg episode reward: [(0, '6.916')]
[2023-02-23 15:23:23,496][15301] Updated weights for policy 0, policy_version 311 (0.0014)
[2023-02-23 15:23:25,449][14252] Fps is (10 sec: 3276.6, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 1277952. Throughput: 0: 939.8. Samples: 318906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:23:25,452][14252] Avg episode reward: [(0, '7.100')]
[2023-02-23 15:23:25,465][15287] Saving new best policy, reward=7.100!
[2023-02-23 15:23:30,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1302528. Throughput: 0: 956.9. Samples: 321908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:23:30,451][14252] Avg episode reward: [(0, '7.083')]
[2023-02-23 15:23:32,938][15301] Updated weights for policy 0, policy_version 321 (0.0023)
[2023-02-23 15:23:35,448][14252] Fps is (10 sec: 4505.9, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1323008. Throughput: 0: 995.4. Samples: 329080. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:23:35,451][14252] Avg episode reward: [(0, '7.584')]
[2023-02-23 15:23:35,533][15287] Saving new best policy, reward=7.584!
[2023-02-23 15:23:40,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 1339392. Throughput: 0: 956.9. Samples: 334474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:23:40,455][14252] Avg episode reward: [(0, '8.474')]
[2023-02-23 15:23:40,458][15287] Saving new best policy, reward=8.474!
[2023-02-23 15:23:44,802][15301] Updated weights for policy 0, policy_version 331 (0.0028)
[2023-02-23 15:23:45,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1355776. Throughput: 0: 940.1. Samples: 336714. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:23:45,455][14252] Avg episode reward: [(0, '8.647')]
[2023-02-23 15:23:45,482][15287] Saving new best policy, reward=8.647!
[2023-02-23 15:23:50,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1380352. Throughput: 0: 971.2. Samples: 342570. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:23:50,454][14252] Avg episode reward: [(0, '9.196')]
[2023-02-23 15:23:50,457][15287] Saving new best policy, reward=9.196!
[2023-02-23 15:23:53,902][15301] Updated weights for policy 0, policy_version 341 (0.0012)
[2023-02-23 15:23:55,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3822.9, 300 sec: 3832.3). Total num frames: 1400832. Throughput: 0: 994.2. Samples: 349620. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:23:55,457][14252] Avg episode reward: [(0, '8.705')]
[2023-02-23 15:24:00,449][14252] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1417216. Throughput: 0: 977.7. Samples: 352412. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:24:00,455][14252] Avg episode reward: [(0, '8.476')]
[2023-02-23 15:24:05,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 1433600. Throughput: 0: 946.8. Samples: 357008. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:24:05,453][14252] Avg episode reward: [(0, '9.109')]
[2023-02-23 15:24:06,173][15301] Updated weights for policy 0, policy_version 351 (0.0013)
[2023-02-23 15:24:10,448][14252] Fps is (10 sec: 4096.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 1458176. Throughput: 0: 983.8. Samples: 363176. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:24:10,453][14252] Avg episode reward: [(0, '9.138')]
[2023-02-23 15:24:14,602][15301] Updated weights for policy 0, policy_version 361 (0.0021)
[2023-02-23 15:24:15,448][14252] Fps is (10 sec: 4915.3, 60 sec: 3959.5, 300 sec: 3846.1). Total num frames: 1482752. Throughput: 0: 998.2. Samples: 366828. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:24:15,457][14252] Avg episode reward: [(0, '10.321')]
[2023-02-23 15:24:15,465][15287] Saving new best policy, reward=10.321!
[2023-02-23 15:24:20,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1495040. Throughput: 0: 969.5. Samples: 372706. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:24:20,451][14252] Avg episode reward: [(0, '10.876')]
[2023-02-23 15:24:20,454][15287] Saving new best policy, reward=10.876!
[2023-02-23 15:24:25,450][14252] Fps is (10 sec: 2866.7, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 1511424. Throughput: 0: 946.6. Samples: 377074. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:24:25,455][14252] Avg episode reward: [(0, '11.253')]
[2023-02-23 15:24:25,469][15287] Saving new best policy, reward=11.253!
[2023-02-23 15:24:27,050][15301] Updated weights for policy 0, policy_version 371 (0.0013)
[2023-02-23 15:24:30,450][14252] Fps is (10 sec: 4095.3, 60 sec: 3891.1, 300 sec: 3832.2). Total num frames: 1536000. Throughput: 0: 965.0. Samples: 380140. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:24:30,453][14252] Avg episode reward: [(0, '11.627')]
[2023-02-23 15:24:30,461][15287] Saving new best policy, reward=11.627!
[2023-02-23 15:24:35,448][14252] Fps is (10 sec: 4506.3, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1556480. Throughput: 0: 987.9. Samples: 387024. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:24:35,451][14252] Avg episode reward: [(0, '11.964')]
[2023-02-23 15:24:35,464][15287] Saving new best policy, reward=11.964!
[2023-02-23 15:24:35,855][15301] Updated weights for policy 0, policy_version 381 (0.0014)
[2023-02-23 15:24:40,448][14252] Fps is (10 sec: 3687.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1572864. Throughput: 0: 948.9. Samples: 392322. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:24:40,453][14252] Avg episode reward: [(0, '11.292')]
[2023-02-23 15:24:45,449][14252] Fps is (10 sec: 2867.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 1585152. Throughput: 0: 936.9. Samples: 394572. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:24:45,456][14252] Avg episode reward: [(0, '11.170')]
[2023-02-23 15:24:48,088][15301] Updated weights for policy 0, policy_version 391 (0.0022)
[2023-02-23 15:24:50,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1609728. Throughput: 0: 965.6. Samples: 400458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:24:50,453][14252] Avg episode reward: [(0, '9.943')]
[2023-02-23 15:24:55,448][14252] Fps is (10 sec: 4915.5, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1634304. Throughput: 0: 985.8. Samples: 407536. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:24:55,451][14252] Avg episode reward: [(0, '9.766')]
[2023-02-23 15:24:57,504][15301] Updated weights for policy 0, policy_version 401 (0.0013)
[2023-02-23 15:25:00,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1650688. Throughput: 0: 963.8. Samples: 410198. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:25:00,453][14252] Avg episode reward: [(0, '9.388')]
[2023-02-23 15:25:05,449][14252] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1662976. Throughput: 0: 934.5. Samples: 414758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:25:05,453][14252] Avg episode reward: [(0, '9.105')]
[2023-02-23 15:25:09,057][15301] Updated weights for policy 0, policy_version 411 (0.0019)
[2023-02-23 15:25:10,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1687552. Throughput: 0: 976.4. Samples: 421012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:25:10,455][14252] Avg episode reward: [(0, '9.705')]
[2023-02-23 15:25:15,448][14252] Fps is (10 sec: 4915.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 1712128. Throughput: 0: 987.2. Samples: 424560. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:25:15,451][14252] Avg episode reward: [(0, '10.413')]
[2023-02-23 15:25:15,465][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000418_1712128.pth...
[2023-02-23 15:25:15,618][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000192_786432.pth
[2023-02-23 15:25:19,024][15301] Updated weights for policy 0, policy_version 421 (0.0012)
[2023-02-23 15:25:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 1728512. Throughput: 0: 961.3. Samples: 430282. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:25:20,453][14252] Avg episode reward: [(0, '10.961')]
[2023-02-23 15:25:25,449][14252] Fps is (10 sec: 2867.1, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 1740800. Throughput: 0: 942.9. Samples: 434752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:25:25,451][14252] Avg episode reward: [(0, '11.855')]
[2023-02-23 15:25:30,021][15301] Updated weights for policy 0, policy_version 431 (0.0024)
[2023-02-23 15:25:30,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 1765376. Throughput: 0: 965.7. Samples: 438030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:25:30,451][14252] Avg episode reward: [(0, '12.125')]
[2023-02-23 15:25:30,457][15287] Saving new best policy, reward=12.125!
[2023-02-23 15:25:35,448][14252] Fps is (10 sec: 4915.4, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 1789952. Throughput: 0: 991.4. Samples: 445070. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:25:35,455][14252] Avg episode reward: [(0, '12.379')]
[2023-02-23 15:25:35,470][15287] Saving new best policy, reward=12.379!
[2023-02-23 15:25:40,450][14252] Fps is (10 sec: 3685.8, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 1802240. Throughput: 0: 946.7. Samples: 450140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:25:40,454][14252] Avg episode reward: [(0, '12.229')]
[2023-02-23 15:25:40,982][15301] Updated weights for policy 0, policy_version 441 (0.0014)
[2023-02-23 15:25:45,448][14252] Fps is (10 sec: 2867.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1818624. Throughput: 0: 935.8. Samples: 452310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:25:45,451][14252] Avg episode reward: [(0, '12.514')]
[2023-02-23 15:25:45,462][15287] Saving new best policy, reward=12.514!
[2023-02-23 15:25:50,448][14252] Fps is (10 sec: 4096.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1843200. Throughput: 0: 968.1. Samples: 458324. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:25:50,456][14252] Avg episode reward: [(0, '12.573')]
[2023-02-23 15:25:50,462][15287] Saving new best policy, reward=12.573!
[2023-02-23 15:25:51,224][15301] Updated weights for policy 0, policy_version 451 (0.0012)
[2023-02-23 15:25:55,450][14252] Fps is (10 sec: 4504.9, 60 sec: 3822.8, 300 sec: 3846.1). Total num frames: 1863680. Throughput: 0: 984.9. Samples: 465332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:25:55,457][14252] Avg episode reward: [(0, '13.030')]
[2023-02-23 15:25:55,474][15287] Saving new best policy, reward=13.030!
[2023-02-23 15:26:00,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 1880064. Throughput: 0: 958.8. Samples: 467706. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:00,452][14252] Avg episode reward: [(0, '14.522')]
[2023-02-23 15:26:00,460][15287] Saving new best policy, reward=14.522!
[2023-02-23 15:26:02,815][15301] Updated weights for policy 0, policy_version 461 (0.0023)
[2023-02-23 15:26:05,448][14252] Fps is (10 sec: 3277.3, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1896448. Throughput: 0: 932.2. Samples: 472230. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:05,453][14252] Avg episode reward: [(0, '14.546')]
[2023-02-23 15:26:05,475][15287] Saving new best policy, reward=14.546!
[2023-02-23 15:26:10,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 1916928. Throughput: 0: 974.8. Samples: 478616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:26:10,454][14252] Avg episode reward: [(0, '15.032')]
[2023-02-23 15:26:10,459][15287] Saving new best policy, reward=15.032!
[2023-02-23 15:26:12,328][15301] Updated weights for policy 0, policy_version 471 (0.0018)
[2023-02-23 15:26:15,453][14252] Fps is (10 sec: 4503.7, 60 sec: 3822.7, 300 sec: 3859.9). Total num frames: 1941504. Throughput: 0: 979.2. Samples: 482096. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:15,460][14252] Avg episode reward: [(0, '17.052')]
[2023-02-23 15:26:15,479][15287] Saving new best policy, reward=17.052!
[2023-02-23 15:26:20,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3873.8). Total num frames: 1957888. Throughput: 0: 948.0. Samples: 487732. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:20,451][14252] Avg episode reward: [(0, '17.260')]
[2023-02-23 15:26:20,459][15287] Saving new best policy, reward=17.260!
[2023-02-23 15:26:24,350][15301] Updated weights for policy 0, policy_version 481 (0.0019)
[2023-02-23 15:26:25,448][14252] Fps is (10 sec: 3278.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 1974272. Throughput: 0: 935.9. Samples: 492256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:25,451][14252] Avg episode reward: [(0, '18.028')]
[2023-02-23 15:26:25,461][15287] Saving new best policy, reward=18.028!
[2023-02-23 15:26:30,454][14252] Fps is (10 sec: 3684.4, 60 sec: 3822.6, 300 sec: 3846.0). Total num frames: 1994752. Throughput: 0: 959.8. Samples: 495508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:30,456][14252] Avg episode reward: [(0, '19.636')]
[2023-02-23 15:26:30,468][15287] Saving new best policy, reward=19.636!
[2023-02-23 15:26:33,402][15301] Updated weights for policy 0, policy_version 491 (0.0012)
[2023-02-23 15:26:35,451][14252] Fps is (10 sec: 4504.5, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 2019328. Throughput: 0: 983.1. Samples: 502564. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:26:35,460][14252] Avg episode reward: [(0, '18.011')]
[2023-02-23 15:26:40,449][14252] Fps is (10 sec: 3688.4, 60 sec: 3823.0, 300 sec: 3860.0). Total num frames: 2031616. Throughput: 0: 940.5. Samples: 507654. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:26:40,452][14252] Avg episode reward: [(0, '18.535')]
[2023-02-23 15:26:45,448][14252] Fps is (10 sec: 2867.9, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2048000. Throughput: 0: 935.8. Samples: 509816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:26:45,453][14252] Avg episode reward: [(0, '18.757')]
[2023-02-23 15:26:45,709][15301] Updated weights for policy 0, policy_version 501 (0.0012)
[2023-02-23 15:26:50,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2072576. Throughput: 0: 973.5. Samples: 516036. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:26:50,451][14252] Avg episode reward: [(0, '18.635')]
[2023-02-23 15:26:54,273][15301] Updated weights for policy 0, policy_version 511 (0.0013)
[2023-02-23 15:26:55,448][14252] Fps is (10 sec: 4915.2, 60 sec: 3891.3, 300 sec: 3873.8). Total num frames: 2097152. Throughput: 0: 991.3. Samples: 523224. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:26:55,455][14252] Avg episode reward: [(0, '19.653')]
[2023-02-23 15:26:55,470][15287] Saving new best policy, reward=19.653!
[2023-02-23 15:27:00,450][14252] Fps is (10 sec: 3685.7, 60 sec: 3822.8, 300 sec: 3859.9). Total num frames: 2109440. Throughput: 0: 965.1. Samples: 525524. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:27:00,453][14252] Avg episode reward: [(0, '20.280')]
[2023-02-23 15:27:00,456][15287] Saving new best policy, reward=20.280!
[2023-02-23 15:27:05,448][14252] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2125824. Throughput: 0: 938.9. Samples: 529984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:27:05,451][14252] Avg episode reward: [(0, '21.781')]
[2023-02-23 15:27:05,465][15287] Saving new best policy, reward=21.781!
[2023-02-23 15:27:06,671][15301] Updated weights for policy 0, policy_version 521 (0.0029)
[2023-02-23 15:27:10,448][14252] Fps is (10 sec: 4096.8, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2150400. Throughput: 0: 984.7. Samples: 536568. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:27:10,451][14252] Avg episode reward: [(0, '20.821')]
[2023-02-23 15:27:15,447][15301] Updated weights for policy 0, policy_version 531 (0.0012)
[2023-02-23 15:27:15,452][14252] Fps is (10 sec: 4913.5, 60 sec: 3891.2, 300 sec: 3873.8). Total num frames: 2174976. Throughput: 0: 990.5. Samples: 540080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:27:15,466][14252] Avg episode reward: [(0, '20.319')]
[2023-02-23 15:27:15,480][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000531_2174976.pth...
[2023-02-23 15:27:15,653][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000305_1249280.pth
[2023-02-23 15:27:20,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2187264. Throughput: 0: 950.8. Samples: 545350. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:27:20,455][14252] Avg episode reward: [(0, '20.089')]
[2023-02-23 15:27:25,448][14252] Fps is (10 sec: 2868.2, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2203648. Throughput: 0: 939.9. Samples: 549950. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:27:25,450][14252] Avg episode reward: [(0, '19.768')]
[2023-02-23 15:27:27,542][15301] Updated weights for policy 0, policy_version 541 (0.0016)
[2023-02-23 15:27:30,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.6, 300 sec: 3846.1). Total num frames: 2228224. Throughput: 0: 969.6. Samples: 553448. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:27:30,456][14252] Avg episode reward: [(0, '18.928')]
[2023-02-23 15:27:35,454][14252] Fps is (10 sec: 4503.1, 60 sec: 3822.7, 300 sec: 3859.9). Total num frames: 2248704. Throughput: 0: 992.8. Samples: 560718. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:27:35,457][14252] Avg episode reward: [(0, '19.602')]
[2023-02-23 15:27:37,019][15301] Updated weights for policy 0, policy_version 551 (0.0020)
[2023-02-23 15:27:40,456][14252] Fps is (10 sec: 3683.6, 60 sec: 3890.7, 300 sec: 3873.8). Total num frames: 2265088. Throughput: 0: 942.1. Samples: 565628. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:27:40,459][14252] Avg episode reward: [(0, '21.022')]
[2023-02-23 15:27:45,448][14252] Fps is (10 sec: 3278.6, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2281472. Throughput: 0: 938.9. Samples: 567772. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-23 15:27:45,454][14252] Avg episode reward: [(0, '22.111')]
[2023-02-23 15:27:45,468][15287] Saving new best policy, reward=22.111!
[2023-02-23 15:27:48,526][15301] Updated weights for policy 0, policy_version 561 (0.0018)
[2023-02-23 15:27:50,448][14252] Fps is (10 sec: 4099.2, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2306048. Throughput: 0: 977.5. Samples: 573970. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:27:50,451][14252] Avg episode reward: [(0, '21.952')]
[2023-02-23 15:27:55,449][14252] Fps is (10 sec: 4505.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2326528. Throughput: 0: 993.4. Samples: 581272. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:27:55,456][14252] Avg episode reward: [(0, '21.804')]
[2023-02-23 15:27:58,374][15301] Updated weights for policy 0, policy_version 571 (0.0022)
[2023-02-23 15:28:00,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 3873.8). Total num frames: 2342912. Throughput: 0: 967.0. Samples: 583592. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:28:00,451][14252] Avg episode reward: [(0, '20.670')]
[2023-02-23 15:28:05,448][14252] Fps is (10 sec: 3277.0, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 2359296. Throughput: 0: 948.4. Samples: 588026. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:28:05,457][14252] Avg episode reward: [(0, '19.990')]
[2023-02-23 15:28:09,455][15301] Updated weights for policy 0, policy_version 581 (0.0015)
[2023-02-23 15:28:10,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3860.0). Total num frames: 2383872. Throughput: 0: 991.6. Samples: 594570. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:28:10,451][14252] Avg episode reward: [(0, '19.808')]
[2023-02-23 15:28:15,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3823.2, 300 sec: 3873.8). Total num frames: 2404352. Throughput: 0: 991.6. Samples: 598070. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:28:15,456][14252] Avg episode reward: [(0, '19.816')]
[2023-02-23 15:28:20,227][15301] Updated weights for policy 0, policy_version 591 (0.0016)
[2023-02-23 15:28:20,450][14252] Fps is (10 sec: 3685.8, 60 sec: 3891.1, 300 sec: 3873.8). Total num frames: 2420736. Throughput: 0: 949.5. Samples: 603440. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:28:20,453][14252] Avg episode reward: [(0, '20.385')]
[2023-02-23 15:28:25,448][14252] Fps is (10 sec: 2867.2, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2433024. Throughput: 0: 937.4. Samples: 607802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:28:25,452][14252] Avg episode reward: [(0, '20.841')]
[2023-02-23 15:28:30,448][14252] Fps is (10 sec: 3687.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2457600. Throughput: 0: 964.9. Samples: 611194. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:28:30,451][14252] Avg episode reward: [(0, '21.531')]
[2023-02-23 15:28:31,070][15301] Updated weights for policy 0, policy_version 601 (0.0030)
[2023-02-23 15:28:35,449][14252] Fps is (10 sec: 4505.6, 60 sec: 3823.3, 300 sec: 3860.0). Total num frames: 2478080. Throughput: 0: 980.6. Samples: 618096. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:28:35,455][14252] Avg episode reward: [(0, '22.390')]
[2023-02-23 15:28:35,467][15287] Saving new best policy, reward=22.390!
[2023-02-23 15:28:40,451][14252] Fps is (10 sec: 3685.5, 60 sec: 3823.3, 300 sec: 3859.9). Total num frames: 2494464. Throughput: 0: 922.8. Samples: 622800. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 15:28:40,456][14252] Avg episode reward: [(0, '20.951')]
[2023-02-23 15:28:43,122][15301] Updated weights for policy 0, policy_version 611 (0.0014)
[2023-02-23 15:28:45,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2510848. Throughput: 0: 920.8. Samples: 625028. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:28:45,454][14252] Avg episode reward: [(0, '20.423')]
[2023-02-23 15:28:50,448][14252] Fps is (10 sec: 4097.0, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2535424. Throughput: 0: 968.4. Samples: 631604. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:28:50,455][14252] Avg episode reward: [(0, '20.206')]
[2023-02-23 15:28:51,833][15301] Updated weights for policy 0, policy_version 621 (0.0012)
[2023-02-23 15:28:55,450][14252] Fps is (10 sec: 4505.2, 60 sec: 3822.9, 300 sec: 3859.9). Total num frames: 2555904. Throughput: 0: 981.5. Samples: 638738. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 15:28:55,454][14252] Avg episode reward: [(0, '21.306')]
[2023-02-23 15:29:00,449][14252] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2572288. Throughput: 0: 953.4. Samples: 640974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:29:00,450][14252] Avg episode reward: [(0, '21.304')]
[2023-02-23 15:29:04,052][15301] Updated weights for policy 0, policy_version 631 (0.0014)
[2023-02-23 15:29:05,448][14252] Fps is (10 sec: 3277.2, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2588672. Throughput: 0: 933.2. Samples: 645434. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:29:05,452][14252] Avg episode reward: [(0, '21.609')]
[2023-02-23 15:29:10,451][14252] Fps is (10 sec: 4095.1, 60 sec: 3822.8, 300 sec: 3832.2). Total num frames: 2613248. Throughput: 0: 987.8. Samples: 652256. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:10,457][14252] Avg episode reward: [(0, '22.252')]
[2023-02-23 15:29:12,691][15301] Updated weights for policy 0, policy_version 641 (0.0011)
[2023-02-23 15:29:15,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2633728. Throughput: 0: 989.3. Samples: 655712. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:15,456][14252] Avg episode reward: [(0, '23.526')]
[2023-02-23 15:29:15,465][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000643_2633728.pth...
[2023-02-23 15:29:15,603][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000418_1712128.pth
[2023-02-23 15:29:15,634][15287] Saving new best policy, reward=23.526!
[2023-02-23 15:29:20,448][14252] Fps is (10 sec: 3277.6, 60 sec: 3754.8, 300 sec: 3846.1). Total num frames: 2646016. Throughput: 0: 944.0. Samples: 660574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:20,453][14252] Avg episode reward: [(0, '23.802')]
[2023-02-23 15:29:20,459][15287] Saving new best policy, reward=23.802!
[2023-02-23 15:29:25,228][15301] Updated weights for policy 0, policy_version 651 (0.0021)
[2023-02-23 15:29:25,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2666496. Throughput: 0: 944.5. Samples: 665298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:25,456][14252] Avg episode reward: [(0, '22.421')]
[2023-02-23 15:29:30,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2686976. Throughput: 0: 973.8. Samples: 668848. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:29:30,451][14252] Avg episode reward: [(0, '22.582')]
[2023-02-23 15:29:34,058][15301] Updated weights for policy 0, policy_version 661 (0.0020)
[2023-02-23 15:29:35,451][14252] Fps is (10 sec: 4095.0, 60 sec: 3822.8, 300 sec: 3846.0). Total num frames: 2707456. Throughput: 0: 981.7. Samples: 675782. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:35,454][14252] Avg episode reward: [(0, '22.472')]
[2023-02-23 15:29:40,452][14252] Fps is (10 sec: 3685.2, 60 sec: 3822.9, 300 sec: 3859.9). Total num frames: 2723840. Throughput: 0: 926.2. Samples: 680420. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:29:40,456][14252] Avg episode reward: [(0, '24.322')]
[2023-02-23 15:29:40,460][15287] Saving new best policy, reward=24.322!
[2023-02-23 15:29:45,448][14252] Fps is (10 sec: 3277.6, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 2740224. Throughput: 0: 925.8. Samples: 682636. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:29:45,451][14252] Avg episode reward: [(0, '23.811')]
[2023-02-23 15:29:46,475][15301] Updated weights for policy 0, policy_version 671 (0.0032)
[2023-02-23 15:29:50,448][14252] Fps is (10 sec: 4097.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2764800. Throughput: 0: 973.0. Samples: 689220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:29:50,455][14252] Avg episode reward: [(0, '23.797')]
[2023-02-23 15:29:55,449][14252] Fps is (10 sec: 4505.4, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 2785280. Throughput: 0: 970.2. Samples: 695914. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:29:55,455][14252] Avg episode reward: [(0, '24.199')]
[2023-02-23 15:29:56,023][15301] Updated weights for policy 0, policy_version 681 (0.0030)
[2023-02-23 15:30:00,449][14252] Fps is (10 sec: 3686.3, 60 sec: 3822.9, 300 sec: 3860.0). Total num frames: 2801664. Throughput: 0: 941.3. Samples: 698072. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:00,451][14252] Avg episode reward: [(0, '24.201')]
[2023-02-23 15:30:05,448][14252] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2818048. Throughput: 0: 930.7. Samples: 702456. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:30:05,453][14252] Avg episode reward: [(0, '24.388')]
[2023-02-23 15:30:05,467][15287] Saving new best policy, reward=24.388!
[2023-02-23 15:30:07,738][15301] Updated weights for policy 0, policy_version 691 (0.0025)
[2023-02-23 15:30:10,448][14252] Fps is (10 sec: 4096.1, 60 sec: 3823.1, 300 sec: 3832.2). Total num frames: 2842624. Throughput: 0: 979.4. Samples: 709370. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:30:10,451][14252] Avg episode reward: [(0, '22.743')]
[2023-02-23 15:30:15,451][14252] Fps is (10 sec: 4504.6, 60 sec: 3822.8, 300 sec: 3846.0). Total num frames: 2863104. Throughput: 0: 980.0. Samples: 712950. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:15,455][14252] Avg episode reward: [(0, '23.450')]
[2023-02-23 15:30:18,076][15301] Updated weights for policy 0, policy_version 701 (0.0032)
[2023-02-23 15:30:20,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 2875392. Throughput: 0: 934.9. Samples: 717850. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:20,451][14252] Avg episode reward: [(0, '24.661')]
[2023-02-23 15:30:20,456][15287] Saving new best policy, reward=24.661!
[2023-02-23 15:30:25,448][14252] Fps is (10 sec: 3277.5, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2895872. Throughput: 0: 944.2. Samples: 722904. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:30:25,453][14252] Avg episode reward: [(0, '24.530')]
[2023-02-23 15:30:28,640][15301] Updated weights for policy 0, policy_version 711 (0.0011)
[2023-02-23 15:30:30,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2920448. Throughput: 0: 973.7. Samples: 726454. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:30,451][14252] Avg episode reward: [(0, '27.185')]
[2023-02-23 15:30:30,455][15287] Saving new best policy, reward=27.185!
[2023-02-23 15:30:35,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3823.1, 300 sec: 3846.1). Total num frames: 2936832. Throughput: 0: 981.6. Samples: 733392. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:35,455][14252] Avg episode reward: [(0, '27.592')]
[2023-02-23 15:30:35,493][15287] Saving new best policy, reward=27.592!
[2023-02-23 15:30:39,585][15301] Updated weights for policy 0, policy_version 721 (0.0016)
[2023-02-23 15:30:40,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3823.1, 300 sec: 3846.1). Total num frames: 2953216. Throughput: 0: 930.3. Samples: 737778. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:30:40,452][14252] Avg episode reward: [(0, '27.149')]
[2023-02-23 15:30:45,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 2973696. Throughput: 0: 931.2. Samples: 739974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:30:45,452][14252] Avg episode reward: [(0, '27.865')]
[2023-02-23 15:30:45,462][15287] Saving new best policy, reward=27.865!
[2023-02-23 15:30:49,778][15301] Updated weights for policy 0, policy_version 731 (0.0032)
[2023-02-23 15:30:50,449][14252] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 2994176. Throughput: 0: 982.8. Samples: 746684. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:30:50,452][14252] Avg episode reward: [(0, '28.729')]
[2023-02-23 15:30:50,453][15287] Saving new best policy, reward=28.729!
[2023-02-23 15:30:55,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3846.1). Total num frames: 3014656. Throughput: 0: 969.8. Samples: 753012. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:30:55,452][14252] Avg episode reward: [(0, '29.258')]
[2023-02-23 15:30:55,469][15287] Saving new best policy, reward=29.258!
[2023-02-23 15:31:00,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3026944. Throughput: 0: 936.8. Samples: 755106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:31:00,455][14252] Avg episode reward: [(0, '28.948')]
[2023-02-23 15:31:02,064][15301] Updated weights for policy 0, policy_version 741 (0.0014)
[2023-02-23 15:31:05,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3047424. Throughput: 0: 926.9. Samples: 759562. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:31:05,457][14252] Avg episode reward: [(0, '27.728')]
[2023-02-23 15:31:10,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3818.4). Total num frames: 3067904. Throughput: 0: 965.7. Samples: 766362. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:31:10,451][14252] Avg episode reward: [(0, '28.506')]
[2023-02-23 15:31:11,612][15301] Updated weights for policy 0, policy_version 751 (0.0025)
[2023-02-23 15:31:15,450][14252] Fps is (10 sec: 4095.3, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3088384. Throughput: 0: 962.4. Samples: 769762. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:31:15,453][14252] Avg episode reward: [(0, '28.257')]
[2023-02-23 15:31:15,474][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000754_3088384.pth...
[2023-02-23 15:31:15,639][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000531_2174976.pth
[2023-02-23 15:31:20,450][14252] Fps is (10 sec: 3276.3, 60 sec: 3754.6, 300 sec: 3818.3). Total num frames: 3100672. Throughput: 0: 908.9. Samples: 774292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:31:20,455][14252] Avg episode reward: [(0, '27.264')]
[2023-02-23 15:31:24,111][15301] Updated weights for policy 0, policy_version 761 (0.0011)
[2023-02-23 15:31:25,448][14252] Fps is (10 sec: 3277.4, 60 sec: 3754.7, 300 sec: 3818.4). Total num frames: 3121152. Throughput: 0: 925.1. Samples: 779406. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:31:25,456][14252] Avg episode reward: [(0, '25.810')]
[2023-02-23 15:31:30,448][14252] Fps is (10 sec: 4506.3, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3145728. Throughput: 0: 954.8. Samples: 782938. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:31:30,455][14252] Avg episode reward: [(0, '23.947')]
[2023-02-23 15:31:32,821][15301] Updated weights for policy 0, policy_version 771 (0.0014)
[2023-02-23 15:31:35,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3846.1). Total num frames: 3166208. Throughput: 0: 959.1. Samples: 789844. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:31:35,455][14252] Avg episode reward: [(0, '24.050')]
[2023-02-23 15:31:40,452][14252] Fps is (10 sec: 3275.7, 60 sec: 3754.5, 300 sec: 3832.1). Total num frames: 3178496. Throughput: 0: 914.3. Samples: 794158. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:31:40,456][14252] Avg episode reward: [(0, '22.190')]
[2023-02-23 15:31:45,096][15301] Updated weights for policy 0, policy_version 781 (0.0022)
[2023-02-23 15:31:45,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3818.3). Total num frames: 3198976. Throughput: 0: 918.2. Samples: 796424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:31:45,456][14252] Avg episode reward: [(0, '21.929')]
[2023-02-23 15:31:50,448][14252] Fps is (10 sec: 4507.2, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3223552. Throughput: 0: 973.0. Samples: 803346. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:31:50,451][14252] Avg episode reward: [(0, '22.969')]
[2023-02-23 15:31:54,283][15301] Updated weights for policy 0, policy_version 791 (0.0021)
[2023-02-23 15:31:55,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3832.2). Total num frames: 3239936. Throughput: 0: 964.1. Samples: 809746. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:31:55,451][14252] Avg episode reward: [(0, '23.241')]
[2023-02-23 15:32:00,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3256320. Throughput: 0: 938.6. Samples: 811996. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:32:00,454][14252] Avg episode reward: [(0, '24.888')]
[2023-02-23 15:32:05,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3276800. Throughput: 0: 944.9. Samples: 816812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:32:05,451][14252] Avg episode reward: [(0, '25.580')]
[2023-02-23 15:32:06,185][15301] Updated weights for policy 0, policy_version 801 (0.0020)
[2023-02-23 15:32:10,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3818.4). Total num frames: 3301376. Throughput: 0: 987.2. Samples: 823830. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:32:10,456][14252] Avg episode reward: [(0, '25.716')]
[2023-02-23 15:32:15,450][14252] Fps is (10 sec: 4095.3, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3317760. Throughput: 0: 987.1. Samples: 827358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:32:15,457][14252] Avg episode reward: [(0, '26.350')]
[2023-02-23 15:32:16,106][15301] Updated weights for policy 0, policy_version 811 (0.0027)
[2023-02-23 15:32:20,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3891.3, 300 sec: 3832.2). Total num frames: 3334144. Throughput: 0: 933.9. Samples: 831870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:32:20,452][14252] Avg episode reward: [(0, '26.334')]
[2023-02-23 15:32:25,448][14252] Fps is (10 sec: 3277.4, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3350528. Throughput: 0: 954.7. Samples: 837114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:32:25,457][14252] Avg episode reward: [(0, '25.119')]
[2023-02-23 15:32:27,252][15301] Updated weights for policy 0, policy_version 821 (0.0012)
[2023-02-23 15:32:30,448][14252] Fps is (10 sec: 4096.1, 60 sec: 3822.9, 300 sec: 3818.4). Total num frames: 3375104. Throughput: 0: 981.6. Samples: 840598. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:32:30,456][14252] Avg episode reward: [(0, '24.194')]
[2023-02-23 15:32:35,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3832.3). Total num frames: 3395584. Throughput: 0: 978.2. Samples: 847364. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:32:35,453][14252] Avg episode reward: [(0, '25.447')]
[2023-02-23 15:32:37,720][15301] Updated weights for policy 0, policy_version 831 (0.0014)
[2023-02-23 15:32:40,449][14252] Fps is (10 sec: 3686.1, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 3411968. Throughput: 0: 934.3. Samples: 851790. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:32:40,456][14252] Avg episode reward: [(0, '26.321')]
[2023-02-23 15:32:45,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3428352. Throughput: 0: 936.4. Samples: 854136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:32:45,452][14252] Avg episode reward: [(0, '24.337')]
[2023-02-23 15:32:48,291][15301] Updated weights for policy 0, policy_version 841 (0.0015)
[2023-02-23 15:32:50,448][14252] Fps is (10 sec: 4096.3, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3452928. Throughput: 0: 983.6. Samples: 861074. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:32:50,450][14252] Avg episode reward: [(0, '24.045')]
[2023-02-23 15:32:55,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3473408. Throughput: 0: 968.7. Samples: 867420. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:32:55,453][14252] Avg episode reward: [(0, '25.343')]
[2023-02-23 15:32:59,351][15301] Updated weights for policy 0, policy_version 851 (0.0029)
[2023-02-23 15:33:00,452][14252] Fps is (10 sec: 3275.6, 60 sec: 3822.7, 300 sec: 3818.3). Total num frames: 3485696. Throughput: 0: 939.6. Samples: 869642. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:33:00,456][14252] Avg episode reward: [(0, '26.033')]
[2023-02-23 15:33:05,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3804.4). Total num frames: 3506176. Throughput: 0: 949.2. Samples: 874586. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:33:05,454][14252] Avg episode reward: [(0, '26.062')]
[2023-02-23 15:33:09,335][15301] Updated weights for policy 0, policy_version 861 (0.0012)
[2023-02-23 15:33:10,448][14252] Fps is (10 sec: 4507.2, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3530752. Throughput: 0: 989.8. Samples: 881656. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:33:10,457][14252] Avg episode reward: [(0, '26.435')]
[2023-02-23 15:33:15,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3823.0, 300 sec: 3818.3). Total num frames: 3547136. Throughput: 0: 991.8. Samples: 885228. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:33:15,455][14252] Avg episode reward: [(0, '26.231')]
[2023-02-23 15:33:15,468][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000866_3547136.pth...
[2023-02-23 15:33:15,617][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000643_2633728.pth
[2023-02-23 15:33:20,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3563520. Throughput: 0: 938.1. Samples: 889578. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:33:20,451][14252] Avg episode reward: [(0, '26.717')]
[2023-02-23 15:33:21,036][15301] Updated weights for policy 0, policy_version 871 (0.0025)
[2023-02-23 15:33:25,450][14252] Fps is (10 sec: 3685.8, 60 sec: 3891.1, 300 sec: 3818.3). Total num frames: 3584000. Throughput: 0: 963.4. Samples: 895142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:33:25,456][14252] Avg episode reward: [(0, '27.168')]
[2023-02-23 15:33:30,210][15301] Updated weights for policy 0, policy_version 881 (0.0014)
[2023-02-23 15:33:30,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3608576. Throughput: 0: 989.2. Samples: 898648. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:33:30,451][14252] Avg episode reward: [(0, '25.425')]
[2023-02-23 15:33:35,454][14252] Fps is (10 sec: 4094.4, 60 sec: 3822.6, 300 sec: 3832.1). Total num frames: 3624960. Throughput: 0: 980.3. Samples: 905192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:33:35,457][14252] Avg episode reward: [(0, '25.969')]
[2023-02-23 15:33:40,449][14252] Fps is (10 sec: 3276.7, 60 sec: 3823.0, 300 sec: 3832.2). Total num frames: 3641344. Throughput: 0: 937.8. Samples: 909622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:33:40,454][14252] Avg episode reward: [(0, '25.230')]
[2023-02-23 15:33:42,687][15301] Updated weights for policy 0, policy_version 891 (0.0033)
[2023-02-23 15:33:45,448][14252] Fps is (10 sec: 3688.5, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3661824. Throughput: 0: 941.1. Samples: 911990. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2023-02-23 15:33:45,454][14252] Avg episode reward: [(0, '26.890')]
[2023-02-23 15:33:50,448][14252] Fps is (10 sec: 4505.7, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3686400. Throughput: 0: 988.5. Samples: 919070. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:33:50,451][14252] Avg episode reward: [(0, '26.763')]
[2023-02-23 15:33:51,092][15301] Updated weights for policy 0, policy_version 901 (0.0021)
[2023-02-23 15:33:55,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3702784. Throughput: 0: 967.2. Samples: 925182. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:33:55,451][14252] Avg episode reward: [(0, '26.948')]
[2023-02-23 15:34:00,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3891.4, 300 sec: 3832.2). Total num frames: 3719168. Throughput: 0: 936.0. Samples: 927348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:34:00,455][14252] Avg episode reward: [(0, '27.841')]
[2023-02-23 15:34:03,601][15301] Updated weights for policy 0, policy_version 911 (0.0015)
[2023-02-23 15:34:05,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3818.3). Total num frames: 3739648. Throughput: 0: 953.7. Samples: 932496. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2023-02-23 15:34:05,456][14252] Avg episode reward: [(0, '27.685')]
[2023-02-23 15:34:10,448][14252] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3760128. Throughput: 0: 987.0. Samples: 939554. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2023-02-23 15:34:10,451][14252] Avg episode reward: [(0, '26.982')]
[2023-02-23 15:34:12,451][15301] Updated weights for policy 0, policy_version 921 (0.0017)
[2023-02-23 15:34:15,449][14252] Fps is (10 sec: 4095.9, 60 sec: 3891.2, 300 sec: 3846.1). Total num frames: 3780608. Throughput: 0: 980.8. Samples: 942782. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:34:15,459][14252] Avg episode reward: [(0, '26.515')]
[2023-02-23 15:34:20,448][14252] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3792896. Throughput: 0: 933.3. Samples: 947184. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:34:20,451][14252] Avg episode reward: [(0, '26.085')]
[2023-02-23 15:34:24,585][15301] Updated weights for policy 0, policy_version 931 (0.0036)
[2023-02-23 15:34:25,449][14252] Fps is (10 sec: 3686.1, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3817472. Throughput: 0: 963.3. Samples: 952970. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:34:25,456][14252] Avg episode reward: [(0, '24.158')]
[2023-02-23 15:34:30,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3837952. Throughput: 0: 988.1. Samples: 956454. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2023-02-23 15:34:30,457][14252] Avg episode reward: [(0, '24.301')]
[2023-02-23 15:34:34,118][15301] Updated weights for policy 0, policy_version 941 (0.0013)
[2023-02-23 15:34:35,448][14252] Fps is (10 sec: 3686.7, 60 sec: 3823.3, 300 sec: 3832.2). Total num frames: 3854336. Throughput: 0: 967.0. Samples: 962586. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:34:35,455][14252] Avg episode reward: [(0, '24.058')]
[2023-02-23 15:34:40,452][14252] Fps is (10 sec: 3275.5, 60 sec: 3822.7, 300 sec: 3832.1). Total num frames: 3870720. Throughput: 0: 928.2. Samples: 966954. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2023-02-23 15:34:40,455][14252] Avg episode reward: [(0, '24.679')]
[2023-02-23 15:34:45,448][14252] Fps is (10 sec: 3686.4, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3891200. Throughput: 0: 940.4. Samples: 969668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2023-02-23 15:34:45,453][14252] Avg episode reward: [(0, '25.359')]
[2023-02-23 15:34:45,763][15301] Updated weights for policy 0, policy_version 951 (0.0018)
[2023-02-23 15:34:50,449][14252] Fps is (10 sec: 4507.2, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3915776. Throughput: 0: 984.2. Samples: 976786. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2023-02-23 15:34:50,456][14252] Avg episode reward: [(0, '25.452')]
[2023-02-23 15:34:55,450][14252] Fps is (10 sec: 4095.4, 60 sec: 3822.8, 300 sec: 3832.2). Total num frames: 3932160. Throughput: 0: 955.3. Samples: 982546. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2023-02-23 15:34:55,457][14252] Avg episode reward: [(0, '26.563')]
[2023-02-23 15:34:55,784][15301] Updated weights for policy 0, policy_version 961 (0.0017)
[2023-02-23 15:35:00,448][14252] Fps is (10 sec: 3276.9, 60 sec: 3822.9, 300 sec: 3832.2). Total num frames: 3948544. Throughput: 0: 932.4. Samples: 984740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2023-02-23 15:35:00,451][14252] Avg episode reward: [(0, '26.253')]
[2023-02-23 15:35:05,448][14252] Fps is (10 sec: 3687.0, 60 sec: 3822.9, 300 sec: 3818.3). Total num frames: 3969024. Throughput: 0: 956.2. Samples: 990212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:35:05,455][14252] Avg episode reward: [(0, '25.363')]
[2023-02-23 15:35:06,690][15301] Updated weights for policy 0, policy_version 971 (0.0029)
[2023-02-23 15:35:10,448][14252] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3832.2). Total num frames: 3993600. Throughput: 0: 983.5. Samples: 997226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2023-02-23 15:35:10,454][14252] Avg episode reward: [(0, '25.358')]
[2023-02-23 15:35:13,676][15287] Stopping Batcher_0...
[2023-02-23 15:35:13,677][15287] Loop batcher_evt_loop terminating...
[2023-02-23 15:35:13,678][14252] Component Batcher_0 stopped!
[2023-02-23 15:35:13,679][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 15:35:13,761][15301] Weights refcount: 2 0
[2023-02-23 15:35:13,767][14252] Component RolloutWorker_w1 stopped!
[2023-02-23 15:35:13,771][15303] Stopping RolloutWorker_w1...
[2023-02-23 15:35:13,778][14252] Component InferenceWorker_p0-w0 stopped!
[2023-02-23 15:35:13,780][15301] Stopping InferenceWorker_p0-w0...
[2023-02-23 15:35:13,784][15301] Loop inference_proc0-0_evt_loop terminating...
[2023-02-23 15:35:13,786][15303] Loop rollout_proc1_evt_loop terminating...
[2023-02-23 15:35:13,816][14252] Component RolloutWorker_w5 stopped!
[2023-02-23 15:35:13,824][15307] Stopping RolloutWorker_w5...
[2023-02-23 15:35:13,824][15307] Loop rollout_proc5_evt_loop terminating...
[2023-02-23 15:35:13,842][15302] Stopping RolloutWorker_w0...
[2023-02-23 15:35:13,843][15302] Loop rollout_proc0_evt_loop terminating...
[2023-02-23 15:35:13,839][14252] Component RolloutWorker_w3 stopped!
[2023-02-23 15:35:13,844][14252] Component RolloutWorker_w0 stopped!
[2023-02-23 15:35:13,849][15305] Stopping RolloutWorker_w3...
[2023-02-23 15:35:13,850][15305] Loop rollout_proc3_evt_loop terminating...
[2023-02-23 15:35:13,865][14252] Component RolloutWorker_w4 stopped!
[2023-02-23 15:35:13,865][15306] Stopping RolloutWorker_w4...
[2023-02-23 15:35:13,874][15306] Loop rollout_proc4_evt_loop terminating...
[2023-02-23 15:35:13,883][15309] Stopping RolloutWorker_w7...
[2023-02-23 15:35:13,884][15309] Loop rollout_proc7_evt_loop terminating...
[2023-02-23 15:35:13,886][15308] Stopping RolloutWorker_w6...
[2023-02-23 15:35:13,892][15304] Stopping RolloutWorker_w2...
[2023-02-23 15:35:13,892][15304] Loop rollout_proc2_evt_loop terminating...
[2023-02-23 15:35:13,883][14252] Component RolloutWorker_w7 stopped!
[2023-02-23 15:35:13,896][14252] Component RolloutWorker_w6 stopped!
[2023-02-23 15:35:13,897][14252] Component RolloutWorker_w2 stopped!
[2023-02-23 15:35:13,887][15308] Loop rollout_proc6_evt_loop terminating...
[2023-02-23 15:35:13,953][15287] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000754_3088384.pth
[2023-02-23 15:35:13,973][15287] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 15:35:14,190][14252] Component LearnerWorker_p0 stopped!
[2023-02-23 15:35:14,192][14252] Waiting for process learner_proc0 to stop...
[2023-02-23 15:35:14,194][15287] Stopping LearnerWorker_p0...
[2023-02-23 15:35:14,197][15287] Loop learner_proc0_evt_loop terminating...
[2023-02-23 15:35:16,320][14252] Waiting for process inference_proc0-0 to join...
[2023-02-23 15:35:17,175][14252] Waiting for process rollout_proc0 to join...
[2023-02-23 15:35:18,023][14252] Waiting for process rollout_proc1 to join...
[2023-02-23 15:35:18,028][14252] Waiting for process rollout_proc2 to join...
[2023-02-23 15:35:18,030][14252] Waiting for process rollout_proc3 to join...
[2023-02-23 15:35:18,032][14252] Waiting for process rollout_proc4 to join...
[2023-02-23 15:35:18,033][14252] Waiting for process rollout_proc5 to join...
[2023-02-23 15:35:18,035][14252] Waiting for process rollout_proc6 to join...
[2023-02-23 15:35:18,036][14252] Waiting for process rollout_proc7 to join...
[2023-02-23 15:35:18,037][14252] Batcher 0 profile tree view:
batching: 25.4059, releasing_batches: 0.0342
[2023-02-23 15:35:18,038][14252] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0099
  wait_policy_total: 514.9866
update_model: 7.3590
  weight_update: 0.0018
one_step: 0.0021
  handle_policy_step: 494.7788
    deserialize: 14.7049, stack: 2.9275, obs_to_device_normalize: 112.7425, forward: 234.8439, send_messages: 25.3925
    prepare_outputs: 79.0665
      to_cpu: 48.5471
[2023-02-23 15:35:18,040][14252] Learner 0 profile tree view:
misc: 0.0053, prepare_batch: 15.2338
train: 79.3040
  epoch_init: 0.0061, minibatch_init: 0.0097, losses_postprocess: 0.6127, kl_divergence: 0.5486, after_optimizer: 3.8811
  calculate_losses: 26.1793
    losses_init: 0.0033, forward_head: 1.7274, bptt_initial: 17.3315, tail: 1.0478, advantages_returns: 0.2927, losses: 3.4118
    bptt: 2.0135
      bptt_forward_core: 1.9285
  update: 47.2798
    clip: 1.3897
[2023-02-23 15:35:18,042][14252] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3290, enqueue_policy_requests: 136.3886, env_step: 798.8956, overhead: 19.2139, complete_rollouts: 6.7458
save_policy_outputs: 19.2426
  split_output_tensors: 9.4486
[2023-02-23 15:35:18,043][14252] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3455, enqueue_policy_requests: 138.5361, env_step: 793.1275, overhead: 19.8811, complete_rollouts: 7.3809
save_policy_outputs: 19.5137
  split_output_tensors: 9.4712
[2023-02-23 15:35:18,045][14252] Loop Runner_EvtLoop terminating...
[2023-02-23 15:35:18,046][14252] Runner profile tree view:
main_loop: 1079.2194
[2023-02-23 15:35:18,048][14252] Collected {0: 4005888}, FPS: 3708.0
[2023-02-23 15:35:55,139][14252] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 15:35:55,144][14252] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 15:35:55,149][14252] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 15:35:55,151][14252] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 15:35:55,155][14252] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 15:35:55,157][14252] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 15:35:55,162][14252] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 15:35:55,163][14252] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 15:35:55,164][14252] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2023-02-23 15:35:55,165][14252] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2023-02-23 15:35:55,167][14252] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 15:35:55,171][14252] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 15:35:55,172][14252] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 15:35:55,173][14252] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 15:35:55,174][14252] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 15:35:55,214][14252] Doom resolution: 160x120, resize resolution: (128, 72)
[2023-02-23 15:35:55,217][14252] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:35:55,219][14252] RunningMeanStd input shape: (1,)
[2023-02-23 15:35:55,237][14252] ConvEncoder: input_channels=3
[2023-02-23 15:35:55,942][14252] Conv encoder output size: 512
[2023-02-23 15:35:55,944][14252] Policy head output size: 512
[2023-02-23 15:35:58,321][14252] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 15:35:59,597][14252] Num frames 100...
[2023-02-23 15:35:59,721][14252] Num frames 200...
[2023-02-23 15:35:59,836][14252] Num frames 300...
[2023-02-23 15:35:59,956][14252] Num frames 400...
[2023-02-23 15:36:00,061][14252] Avg episode rewards: #0: 11.430, true rewards: #0: 4.430
[2023-02-23 15:36:00,063][14252] Avg episode reward: 11.430, avg true_objective: 4.430
[2023-02-23 15:36:00,127][14252] Num frames 500...
[2023-02-23 15:36:00,246][14252] Num frames 600...
[2023-02-23 15:36:00,364][14252] Num frames 700...
[2023-02-23 15:36:00,484][14252] Num frames 800...
[2023-02-23 15:36:00,603][14252] Num frames 900...
[2023-02-23 15:36:00,725][14252] Num frames 1000...
[2023-02-23 15:36:00,856][14252] Num frames 1100...
[2023-02-23 15:36:00,972][14252] Num frames 1200...
[2023-02-23 15:36:01,087][14252] Num frames 1300...
[2023-02-23 15:36:01,198][14252] Num frames 1400...
[2023-02-23 15:36:01,311][14252] Num frames 1500...
[2023-02-23 15:36:01,429][14252] Num frames 1600...
[2023-02-23 15:36:01,544][14252] Num frames 1700...
[2023-02-23 15:36:01,661][14252] Num frames 1800...
[2023-02-23 15:36:01,784][14252] Num frames 1900...
[2023-02-23 15:36:01,892][14252] Avg episode rewards: #0: 22.235, true rewards: #0: 9.735
[2023-02-23 15:36:01,894][14252] Avg episode reward: 22.235, avg true_objective: 9.735
[2023-02-23 15:36:01,961][14252] Num frames 2000...
[2023-02-23 15:36:02,076][14252] Num frames 2100...
[2023-02-23 15:36:02,194][14252] Num frames 2200...
[2023-02-23 15:36:02,317][14252] Num frames 2300...
[2023-02-23 15:36:02,443][14252] Num frames 2400...
[2023-02-23 15:36:02,562][14252] Num frames 2500...
[2023-02-23 15:36:02,679][14252] Num frames 2600...
[2023-02-23 15:36:02,804][14252] Num frames 2700...
[2023-02-23 15:36:02,926][14252] Num frames 2800...
[2023-02-23 15:36:03,002][14252] Avg episode rewards: #0: 20.703, true rewards: #0: 9.370
[2023-02-23 15:36:03,004][14252] Avg episode reward: 20.703, avg true_objective: 9.370
[2023-02-23 15:36:03,105][14252] Num frames 2900...
[2023-02-23 15:36:03,217][14252] Num frames 3000...
[2023-02-23 15:36:03,336][14252] Num frames 3100...
[2023-02-23 15:36:03,448][14252] Num frames 3200...
[2023-02-23 15:36:03,557][14252] Num frames 3300...
[2023-02-23 15:36:03,676][14252] Num frames 3400...
[2023-02-23 15:36:03,802][14252] Num frames 3500...
[2023-02-23 15:36:03,920][14252] Num frames 3600...
[2023-02-23 15:36:04,040][14252] Num frames 3700...
[2023-02-23 15:36:04,161][14252] Num frames 3800...
[2023-02-23 15:36:04,270][14252] Num frames 3900...
[2023-02-23 15:36:04,386][14252] Num frames 4000...
[2023-02-23 15:36:04,507][14252] Num frames 4100...
[2023-02-23 15:36:04,565][14252] Avg episode rewards: #0: 23.253, true rewards: #0: 10.252
[2023-02-23 15:36:04,567][14252] Avg episode reward: 23.253, avg true_objective: 10.252
[2023-02-23 15:36:04,684][14252] Num frames 4200...
[2023-02-23 15:36:04,801][14252] Num frames 4300...
[2023-02-23 15:36:04,914][14252] Num frames 4400...
[2023-02-23 15:36:05,030][14252] Avg episode rewards: #0: 20.106, true rewards: #0: 8.906
[2023-02-23 15:36:05,032][14252] Avg episode reward: 20.106, avg true_objective: 8.906
[2023-02-23 15:36:05,091][14252] Num frames 4500...
[2023-02-23 15:36:05,202][14252] Num frames 4600...
[2023-02-23 15:36:05,360][14252] Num frames 4700...
[2023-02-23 15:36:05,522][14252] Num frames 4800...
[2023-02-23 15:36:05,680][14252] Num frames 4900...
[2023-02-23 15:36:05,850][14252] Num frames 5000...
[2023-02-23 15:36:06,015][14252] Num frames 5100...
[2023-02-23 15:36:06,175][14252] Num frames 5200...
[2023-02-23 15:36:06,342][14252] Num frames 5300...
[2023-02-23 15:36:06,500][14252] Num frames 5400...
[2023-02-23 15:36:06,656][14252] Num frames 5500...
[2023-02-23 15:36:06,812][14252] Num frames 5600...
[2023-02-23 15:36:06,979][14252] Num frames 5700...
[2023-02-23 15:36:07,141][14252] Num frames 5800...
[2023-02-23 15:36:07,309][14252] Num frames 5900...
[2023-02-23 15:36:07,468][14252] Num frames 6000...
[2023-02-23 15:36:07,637][14252] Num frames 6100...
[2023-02-23 15:36:07,803][14252] Num frames 6200...
[2023-02-23 15:36:07,969][14252] Num frames 6300...
[2023-02-23 15:36:08,136][14252] Num frames 6400...
[2023-02-23 15:36:08,310][14252] Num frames 6500...
[2023-02-23 15:36:08,460][14252] Avg episode rewards: #0: 25.755, true rewards: #0: 10.922
[2023-02-23 15:36:08,463][14252] Avg episode reward: 25.755, avg true_objective: 10.922
[2023-02-23 15:36:08,554][14252] Num frames 6600...
[2023-02-23 15:36:08,731][14252] Num frames 6700...
[2023-02-23 15:36:08,880][14252] Num frames 6800...
[2023-02-23 15:36:09,002][14252] Num frames 6900...
[2023-02-23 15:36:09,127][14252] Num frames 7000...
[2023-02-23 15:36:09,240][14252] Num frames 7100...
[2023-02-23 15:36:09,361][14252] Num frames 7200...
[2023-02-23 15:36:09,463][14252] Avg episode rewards: #0: 24.201, true rewards: #0: 10.344
[2023-02-23 15:36:09,464][14252] Avg episode reward: 24.201, avg true_objective: 10.344
[2023-02-23 15:36:09,533][14252] Num frames 7300...
[2023-02-23 15:36:09,650][14252] Num frames 7400...
[2023-02-23 15:36:09,766][14252] Num frames 7500...
[2023-02-23 15:36:09,877][14252] Num frames 7600...
[2023-02-23 15:36:09,998][14252] Num frames 7700...
[2023-02-23 15:36:10,120][14252] Num frames 7800...
[2023-02-23 15:36:10,239][14252] Num frames 7900...
[2023-02-23 15:36:10,364][14252] Num frames 8000...
[2023-02-23 15:36:10,487][14252] Num frames 8100...
[2023-02-23 15:36:10,611][14252] Num frames 8200...
[2023-02-23 15:36:10,734][14252] Num frames 8300...
[2023-02-23 15:36:10,867][14252] Num frames 8400...
[2023-02-23 15:36:10,995][14252] Num frames 8500...
[2023-02-23 15:36:11,107][14252] Avg episode rewards: #0: 25.434, true rewards: #0: 10.684
[2023-02-23 15:36:11,108][14252] Avg episode reward: 25.434, avg true_objective: 10.684
[2023-02-23 15:36:11,176][14252] Num frames 8600...
[2023-02-23 15:36:11,292][14252] Num frames 8700...
[2023-02-23 15:36:11,412][14252] Num frames 8800...
[2023-02-23 15:36:11,528][14252] Num frames 8900...
[2023-02-23 15:36:11,629][14252] Avg episode rewards: #0: 23.368, true rewards: #0: 9.923
[2023-02-23 15:36:11,632][14252] Avg episode reward: 23.368, avg true_objective: 9.923
[2023-02-23 15:36:11,711][14252] Num frames 9000...
[2023-02-23 15:36:11,822][14252] Num frames 9100...
[2023-02-23 15:36:11,935][14252] Num frames 9200...
[2023-02-23 15:36:12,056][14252] Num frames 9300...
[2023-02-23 15:36:12,172][14252] Num frames 9400...
[2023-02-23 15:36:12,285][14252] Num frames 9500...
[2023-02-23 15:36:12,406][14252] Num frames 9600...
[2023-02-23 15:36:12,527][14252] Num frames 9700...
[2023-02-23 15:36:12,604][14252] Avg episode rewards: #0: 22.512, true rewards: #0: 9.712
[2023-02-23 15:36:12,605][14252] Avg episode reward: 22.512, avg true_objective: 9.712
[2023-02-23 15:37:09,989][14252] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2023-02-23 15:39:08,475][14252] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2023-02-23 15:39:08,478][14252] Overriding arg 'num_workers' with value 1 passed from command line
[2023-02-23 15:39:08,483][14252] Adding new argument 'no_render'=True that is not in the saved config file!
[2023-02-23 15:39:08,486][14252] Adding new argument 'save_video'=True that is not in the saved config file!
[2023-02-23 15:39:08,487][14252] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2023-02-23 15:39:08,491][14252] Adding new argument 'video_name'=None that is not in the saved config file!
[2023-02-23 15:39:08,492][14252] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2023-02-23 15:39:08,493][14252] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2023-02-23 15:39:08,494][14252] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2023-02-23 15:39:08,495][14252] Adding new argument 'hf_repository'='SuburbanLion/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2023-02-23 15:39:08,499][14252] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2023-02-23 15:39:08,500][14252] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2023-02-23 15:39:08,501][14252] Adding new argument 'train_script'=None that is not in the saved config file!
[2023-02-23 15:39:08,503][14252] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2023-02-23 15:39:08,504][14252] Using frameskip 1 and render_action_repeat=4 for evaluation
[2023-02-23 15:39:08,541][14252] RunningMeanStd input shape: (3, 72, 128)
[2023-02-23 15:39:08,544][14252] RunningMeanStd input shape: (1,)
[2023-02-23 15:39:08,562][14252] ConvEncoder: input_channels=3
[2023-02-23 15:39:08,622][14252] Conv encoder output size: 512
[2023-02-23 15:39:08,624][14252] Policy head output size: 512
[2023-02-23 15:39:08,651][14252] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2023-02-23 15:39:09,349][14252] Num frames 100...
[2023-02-23 15:39:09,512][14252] Num frames 200...
[2023-02-23 15:39:09,678][14252] Num frames 300...
[2023-02-23 15:39:09,834][14252] Num frames 400...
[2023-02-23 15:39:09,990][14252] Num frames 500...
[2023-02-23 15:39:10,147][14252] Num frames 600...
[2023-02-23 15:39:10,313][14252] Num frames 700...
[2023-02-23 15:39:10,469][14252] Num frames 800...
[2023-02-23 15:39:10,631][14252] Num frames 900...
[2023-02-23 15:39:10,795][14252] Num frames 1000...
[2023-02-23 15:39:10,968][14252] Num frames 1100...
[2023-02-23 15:39:11,134][14252] Num frames 1200...
[2023-02-23 15:39:11,305][14252] Num frames 1300...
[2023-02-23 15:39:11,397][14252] Avg episode rewards: #0: 32.210, true rewards: #0: 13.210
[2023-02-23 15:39:11,398][14252] Avg episode reward: 32.210, avg true_objective: 13.210
[2023-02-23 15:39:11,524][14252] Num frames 1400...
[2023-02-23 15:39:11,700][14252] Num frames 1500...
[2023-02-23 15:39:11,847][14252] Num frames 1600...
[2023-02-23 15:39:11,963][14252] Num frames 1700...
[2023-02-23 15:39:12,079][14252] Num frames 1800...
[2023-02-23 15:39:12,248][14252] Avg episode rewards: #0: 20.985, true rewards: #0: 9.485
[2023-02-23 15:39:12,251][14252] Avg episode reward: 20.985, avg true_objective: 9.485
[2023-02-23 15:39:12,258][14252] Num frames 1900...
[2023-02-23 15:39:12,384][14252] Num frames 2000...
[2023-02-23 15:39:12,501][14252] Num frames 2100...
[2023-02-23 15:39:12,621][14252] Num frames 2200...
[2023-02-23 15:39:12,737][14252] Num frames 2300...
[2023-02-23 15:39:12,848][14252] Num frames 2400...
[2023-02-23 15:39:12,965][14252] Num frames 2500...
[2023-02-23 15:39:13,082][14252] Num frames 2600...
[2023-02-23 15:39:13,195][14252] Num frames 2700...
[2023-02-23 15:39:13,317][14252] Num frames 2800...
[2023-02-23 15:39:13,461][14252] Num frames 2900...
[2023-02-23 15:39:13,575][14252] Num frames 3000...
[2023-02-23 15:39:13,685][14252] Num frames 3100...
[2023-02-23 15:39:13,805][14252] Num frames 3200...
[2023-02-23 15:39:13,917][14252] Num frames 3300...
[2023-02-23 15:39:14,028][14252] Num frames 3400...
[2023-02-23 15:39:14,140][14252] Num frames 3500...
[2023-02-23 15:39:14,299][14252] Avg episode rewards: #0: 28.643, true rewards: #0: 11.977
[2023-02-23 15:39:14,301][14252] Avg episode reward: 28.643, avg true_objective: 11.977
[2023-02-23 15:39:14,313][14252] Num frames 3600...
[2023-02-23 15:39:14,438][14252] Num frames 3700...
[2023-02-23 15:39:14,558][14252] Num frames 3800...
[2023-02-23 15:39:14,672][14252] Num frames 3900...
[2023-02-23 15:39:14,790][14252] Num frames 4000...
[2023-02-23 15:39:14,908][14252] Num frames 4100...
[2023-02-23 15:39:15,026][14252] Num frames 4200...
[2023-02-23 15:39:15,148][14252] Num frames 4300...
[2023-02-23 15:39:15,267][14252] Num frames 4400...
[2023-02-23 15:39:15,385][14252] Avg episode rewards: #0: 26.392, true rewards: #0: 11.142
[2023-02-23 15:39:15,387][14252] Avg episode reward: 26.392, avg true_objective: 11.142
[2023-02-23 15:39:15,439][14252] Num frames 4500...
[2023-02-23 15:39:15,555][14252] Num frames 4600...
[2023-02-23 15:39:15,666][14252] Num frames 4700...
[2023-02-23 15:39:15,788][14252] Num frames 4800...
[2023-02-23 15:39:15,905][14252] Num frames 4900...
[2023-02-23 15:39:16,003][14252] Avg episode rewards: #0: 23.274, true rewards: #0: 9.874
[2023-02-23 15:39:16,008][14252] Avg episode reward: 23.274, avg true_objective: 9.874
[2023-02-23 15:39:16,081][14252] Num frames 5000...
[2023-02-23 15:39:16,193][14252] Num frames 5100...
[2023-02-23 15:39:16,310][14252] Num frames 5200...
[2023-02-23 15:39:16,434][14252] Num frames 5300...
[2023-02-23 15:39:16,555][14252] Num frames 5400...
[2023-02-23 15:39:16,677][14252] Num frames 5500...
[2023-02-23 15:39:16,794][14252] Num frames 5600...
[2023-02-23 15:39:16,912][14252] Num frames 5700...
[2023-02-23 15:39:17,025][14252] Num frames 5800...
[2023-02-23 15:39:17,144][14252] Num frames 5900...
[2023-02-23 15:39:17,263][14252] Num frames 6000...
[2023-02-23 15:39:17,380][14252] Num frames 6100...
[2023-02-23 15:39:17,500][14252] Num frames 6200...
[2023-02-23 15:39:17,626][14252] Num frames 6300...
[2023-02-23 15:39:17,751][14252] Num frames 6400...
[2023-02-23 15:39:17,862][14252] Num frames 6500...
[2023-02-23 15:39:17,962][14252] Avg episode rewards: #0: 25.728, true rewards: #0: 10.895
[2023-02-23 15:39:17,964][14252] Avg episode reward: 25.728, avg true_objective: 10.895
[2023-02-23 15:39:18,037][14252] Num frames 6600...
[2023-02-23 15:39:18,148][14252] Num frames 6700...
[2023-02-23 15:39:18,259][14252] Num frames 6800...
[2023-02-23 15:39:18,373][14252] Num frames 6900...
[2023-02-23 15:39:18,495][14252] Num frames 7000...
[2023-02-23 15:39:18,615][14252] Num frames 7100...
[2023-02-23 15:39:18,730][14252] Num frames 7200...
[2023-02-23 15:39:18,845][14252] Num frames 7300...
[2023-02-23 15:39:18,981][14252] Num frames 7400...
[2023-02-23 15:39:19,094][14252] Num frames 7500...
[2023-02-23 15:39:19,205][14252] Num frames 7600...
[2023-02-23 15:39:19,330][14252] Num frames 7700...
[2023-02-23 15:39:19,452][14252] Num frames 7800...
[2023-02-23 15:39:19,577][14252] Num frames 7900...
[2023-02-23 15:39:19,701][14252] Num frames 8000...
[2023-02-23 15:39:19,822][14252] Num frames 8100...
[2023-02-23 15:39:19,935][14252] Num frames 8200...
[2023-02-23 15:39:20,050][14252] Num frames 8300...
[2023-02-23 15:39:20,162][14252] Num frames 8400...
[2023-02-23 15:39:20,276][14252] Num frames 8500...
[2023-02-23 15:39:20,392][14252] Num frames 8600...
[2023-02-23 15:39:20,491][14252] Avg episode rewards: #0: 30.053, true rewards: #0: 12.339
[2023-02-23 15:39:20,493][14252] Avg episode reward: 30.053, avg true_objective: 12.339
[2023-02-23 15:39:20,576][14252] Num frames 8700...
[2023-02-23 15:39:20,688][14252] Num frames 8800...
[2023-02-23 15:39:20,812][14252] Num frames 8900...
[2023-02-23 15:39:20,933][14252] Num frames 9000...
[2023-02-23 15:39:21,051][14252] Num frames 9100...
[2023-02-23 15:39:21,171][14252] Num frames 9200...
[2023-02-23 15:39:21,243][14252] Avg episode rewards: #0: 27.516, true rewards: #0: 11.516
[2023-02-23 15:39:21,245][14252] Avg episode reward: 27.516, avg true_objective: 11.516
[2023-02-23 15:39:21,350][14252] Num frames 9300...
[2023-02-23 15:39:21,474][14252] Num frames 9400...
[2023-02-23 15:39:21,592][14252] Num frames 9500...
[2023-02-23 15:39:21,715][14252] Num frames 9600...
[2023-02-23 15:39:21,853][14252] Num frames 9700...
[2023-02-23 15:39:22,017][14252] Num frames 9800...
[2023-02-23 15:39:22,179][14252] Num frames 9900...
[2023-02-23 15:39:22,359][14252] Avg episode rewards: #0: 26.308, true rewards: #0: 11.086
[2023-02-23 15:39:22,362][14252] Avg episode reward: 26.308, avg true_objective: 11.086
[2023-02-23 15:39:22,417][14252] Num frames 10000...
[2023-02-23 15:39:22,583][14252] Num frames 10100...
[2023-02-23 15:39:22,756][14252] Num frames 10200...
[2023-02-23 15:39:22,919][14252] Num frames 10300...
[2023-02-23 15:39:23,082][14252] Num frames 10400...
[2023-02-23 15:39:23,243][14252] Num frames 10500...
[2023-02-23 15:39:23,406][14252] Num frames 10600...
[2023-02-23 15:39:23,567][14252] Num frames 10700...
[2023-02-23 15:39:23,738][14252] Num frames 10800...
[2023-02-23 15:39:23,910][14252] Num frames 10900...
[2023-02-23 15:39:24,081][14252] Num frames 11000...
[2023-02-23 15:39:24,253][14252] Num frames 11100...
[2023-02-23 15:39:24,426][14252] Num frames 11200...
[2023-02-23 15:39:24,606][14252] Num frames 11300...
[2023-02-23 15:39:24,775][14252] Num frames 11400...
[2023-02-23 15:39:24,941][14252] Num frames 11500...
[2023-02-23 15:39:25,104][14252] Num frames 11600...
[2023-02-23 15:39:25,268][14252] Num frames 11700...
[2023-02-23 15:39:25,420][14252] Num frames 11800...
[2023-02-23 15:39:25,545][14252] Num frames 11900...
[2023-02-23 15:39:25,675][14252] Num frames 12000...
[2023-02-23 15:39:25,735][14252] Avg episode rewards: #0: 29.402, true rewards: #0: 12.002
[2023-02-23 15:39:25,736][14252] Avg episode reward: 29.402, avg true_objective: 12.002
[2023-02-23 15:40:37,283][14252] Replay video saved to /content/train_dir/default_experiment/replay.mp4!