appo-mujoco-swimmer / sf_log.txt
[2023-09-21 10:31:52,075][130331] Saving configuration to ./train_dir/Swimmer/config.json...
[2023-09-21 10:31:52,141][130331] Rollout worker 0 uses device cpu
[2023-09-21 10:31:52,142][130331] Rollout worker 1 uses device cpu
[2023-09-21 10:31:52,143][130331] Rollout worker 2 uses device cpu
[2023-09-21 10:31:52,143][130331] Rollout worker 3 uses device cpu
[2023-09-21 10:31:52,143][130331] Rollout worker 4 uses device cpu
[2023-09-21 10:31:52,144][130331] Rollout worker 5 uses device cpu
[2023-09-21 10:31:52,144][130331] Rollout worker 6 uses device cpu
[2023-09-21 10:31:52,145][130331] Rollout worker 7 uses device cpu
[2023-09-21 10:31:52,145][130331] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-09-21 10:31:52,193][130331] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-21 10:31:52,193][130331] InferenceWorker_p0-w0: min num requests: 1
[2023-09-21 10:31:52,196][130331] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-21 10:31:52,197][130331] InferenceWorker_p1-w0: min num requests: 1
[2023-09-21 10:31:52,227][130331] Starting all processes...
[2023-09-21 10:31:52,228][130331] Starting process learner_proc0
[2023-09-21 10:31:52,230][130331] Starting process learner_proc1
[2023-09-21 10:31:52,277][130331] Starting all processes...
[2023-09-21 10:31:52,307][130331] Starting process inference_proc0-0
[2023-09-21 10:31:52,311][130331] Starting process inference_proc1-0
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc0
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc1
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc2
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc3
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc4
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc5
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc6
[2023-09-21 10:31:52,311][130331] Starting process rollout_proc7
[2023-09-21 10:31:54,167][130980] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-21 10:31:54,167][130980] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-21 10:31:54,185][130980] Num visible devices: 1
[2023-09-21 10:31:54,203][130980] Starting seed is not provided
[2023-09-21 10:31:54,203][130980] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-21 10:31:54,203][130980] Initializing actor-critic model on device cuda:0
[2023-09-21 10:31:54,204][130980] RunningMeanStd input shape: (8,)
[2023-09-21 10:31:54,204][130980] RunningMeanStd input shape: (1,)
[2023-09-21 10:31:54,222][00307] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-21 10:31:54,222][00325] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-21 10:31:54,246][131067] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-21 10:31:54,246][131067] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-21 10:31:54,247][130980] Created Actor Critic model with architecture:
[2023-09-21 10:31:54,247][00318] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-09-21 10:31:54,247][130980] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): MlpEncoder(
        (mlp_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=Tanh)
          (2): RecursiveScriptModule(original_name=Linear)
          (3): RecursiveScriptModule(original_name=Tanh)
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev(
    (distribution_linear): Linear(in_features=64, out_features=2, bias=True)
  )
)
[2023-09-21 10:31:54,257][130981] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-21 10:31:54,257][130981] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-09-21 10:31:54,278][00330] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-21 10:31:54,280][00304] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-21 10:31:54,286][00317] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-21 10:31:54,289][131067] Num visible devices: 1
[2023-09-21 10:31:54,295][130981] Num visible devices: 1
[2023-09-21 10:31:54,296][00302] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-21 10:31:54,296][00302] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
[2023-09-21 10:31:54,318][00302] Num visible devices: 1
[2023-09-21 10:31:54,361][130981] Starting seed is not provided
[2023-09-21 10:31:54,362][130981] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-21 10:31:54,362][130981] Initializing actor-critic model on device cuda:0
[2023-09-21 10:31:54,362][130981] RunningMeanStd input shape: (8,)
[2023-09-21 10:31:54,363][130981] RunningMeanStd input shape: (1,)
[2023-09-21 10:31:54,414][130981] Created Actor Critic model with architecture:
[2023-09-21 10:31:54,414][130981] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): MlpEncoder(
        (mlp_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=Tanh)
          (2): RecursiveScriptModule(original_name=Linear)
          (3): RecursiveScriptModule(original_name=Tanh)
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=64, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationContinuousNonAdaptiveStddev(
    (distribution_linear): Linear(in_features=64, out_features=2, bias=True)
  )
)
[2023-09-21 10:31:54,455][00301] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-21 10:31:54,488][00314] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-21 10:31:54,830][130980] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-21 10:31:54,830][130980] No checkpoints found
[2023-09-21 10:31:54,830][130980] Did not load from checkpoint, starting from scratch!
[2023-09-21 10:31:54,831][130980] Initialized policy 0 weights for model version 0
[2023-09-21 10:31:54,832][130980] LearnerWorker_p0 finished initialization!
[2023-09-21 10:31:54,833][130980] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-21 10:31:54,995][130981] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-21 10:31:54,996][130981] No checkpoints found
[2023-09-21 10:31:54,996][130981] Did not load from checkpoint, starting from scratch!
[2023-09-21 10:31:54,996][130981] Initialized policy 1 weights for model version 0
[2023-09-21 10:31:54,998][130981] LearnerWorker_p1 finished initialization!
[2023-09-21 10:31:54,998][130981] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-21 10:31:55,388][131067] RunningMeanStd input shape: (8,)
[2023-09-21 10:31:55,389][131067] RunningMeanStd input shape: (1,)
[2023-09-21 10:31:55,420][130331] Inference worker 0-0 is ready!
[2023-09-21 10:31:55,527][00302] RunningMeanStd input shape: (8,)
[2023-09-21 10:31:55,528][00302] RunningMeanStd input shape: (1,)
[2023-09-21 10:31:55,560][130331] Inference worker 1-0 is ready!
[2023-09-21 10:31:55,560][130331] All inference workers are ready! Signal rollout workers to start!
[2023-09-21 10:31:55,655][00307] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,655][00307] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,656][00318] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,656][00318] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,657][00325] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,658][00325] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,661][00314] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,662][00314] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,662][00301] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,663][00301] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,666][00304] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,667][00307] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,667][00304] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,668][00318] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,669][00325] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,674][00314] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,674][00301] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,685][00304] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,689][00307] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,690][00318] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,691][00325] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,694][00330] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,695][00330] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,696][00314] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,697][00301] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,702][00317] Decorrelating experience for 0 frames...
[2023-09-21 10:31:55,703][00317] Decorrelating experience for 64 frames...
[2023-09-21 10:31:55,706][00330] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,714][00317] Decorrelating experience for 128 frames...
[2023-09-21 10:31:55,720][00304] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,728][00330] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,737][00318] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,737][00325] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,737][00317] Decorrelating experience for 192 frames...
[2023-09-21 10:31:55,738][00307] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,743][00301] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,745][00314] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,776][00304] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,778][00330] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,779][00325] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,781][00307] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,783][00317] Decorrelating experience for 256 frames...
[2023-09-21 10:31:55,785][00301] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,787][00318] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,788][00314] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,820][00304] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,822][00330] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,826][00317] Decorrelating experience for 320 frames...
[2023-09-21 10:31:55,831][00325] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,834][00307] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,839][00301] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,840][00318] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,843][00314] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,874][00304] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,874][00330] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,879][00317] Decorrelating experience for 384 frames...
[2023-09-21 10:31:55,896][00325] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,898][00307] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,904][00301] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,906][00318] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,909][00314] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,939][00330] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,941][00304] Decorrelating experience for 448 frames...
[2023-09-21 10:31:55,944][00317] Decorrelating experience for 448 frames...
[2023-09-21 10:31:58,503][130331] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 8192. Throughput: 0: nan, 1: nan. Samples: 6536. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:32:03,503][130331] Fps is (10 sec: 9830.0, 60 sec: 9830.0, 300 sec: 9830.0). Total num frames: 57344. Throughput: 0: 2358.7, 1: 3891.8. Samples: 37790. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:32:03,504][130331] Avg episode reward: [(0, '20.406'), (1, '23.376')]
[2023-09-21 10:32:03,507][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000056_28672.pth...
[2023-09-21 10:32:03,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000056_28672.pth...
[2023-09-21 10:32:05,216][131067] Updated weights for policy 0, policy_version 80 (0.0012)
[2023-09-21 10:32:05,216][00302] Updated weights for policy 1, policy_version 80 (0.0011)
[2023-09-21 10:32:08,502][130331] Fps is (10 sec: 11468.9, 60 sec: 11468.9, 300 sec: 11468.9). Total num frames: 122880. Throughput: 0: 4846.2, 1: 5618.5. Samples: 111182. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:32:08,503][130331] Avg episode reward: [(0, '17.519'), (1, '21.925')]
[2023-09-21 10:32:11,621][00302] Updated weights for policy 1, policy_version 160 (0.0016)
[2023-09-21 10:32:11,622][131067] Updated weights for policy 0, policy_version 160 (0.0015)
[2023-09-21 10:32:12,179][130331] Heartbeat connected on Batcher_0
[2023-09-21 10:32:12,182][130331] Heartbeat connected on LearnerWorker_p0
[2023-09-21 10:32:12,186][130331] Heartbeat connected on Batcher_1
[2023-09-21 10:32:12,189][130331] Heartbeat connected on LearnerWorker_p1
[2023-09-21 10:32:12,195][130331] Heartbeat connected on InferenceWorker_p0-w0
[2023-09-21 10:32:12,199][130331] Heartbeat connected on InferenceWorker_p1-w0
[2023-09-21 10:32:12,200][130331] Heartbeat connected on RolloutWorker_w0
[2023-09-21 10:32:12,203][130331] Heartbeat connected on RolloutWorker_w1
[2023-09-21 10:32:12,206][130331] Heartbeat connected on RolloutWorker_w2
[2023-09-21 10:32:12,208][130331] Heartbeat connected on RolloutWorker_w3
[2023-09-21 10:32:12,211][130331] Heartbeat connected on RolloutWorker_w4
[2023-09-21 10:32:12,213][130331] Heartbeat connected on RolloutWorker_w5
[2023-09-21 10:32:12,219][130331] Heartbeat connected on RolloutWorker_w6
[2023-09-21 10:32:12,225][130331] Heartbeat connected on RolloutWorker_w7
[2023-09-21 10:32:13,503][130331] Fps is (10 sec: 12288.1, 60 sec: 11468.7, 300 sec: 11468.7). Total num frames: 180224. Throughput: 0: 5829.6, 1: 5025.6. Samples: 169364. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:32:13,504][130331] Avg episode reward: [(0, '19.352'), (1, '23.006')]
[2023-09-21 10:32:18,040][131067] Updated weights for policy 0, policy_version 240 (0.0014)
[2023-09-21 10:32:18,040][00302] Updated weights for policy 1, policy_version 240 (0.0015)
[2023-09-21 10:32:18,503][130331] Fps is (10 sec: 12287.6, 60 sec: 11878.2, 300 sec: 11878.2). Total num frames: 245760. Throughput: 0: 5286.1, 1: 5675.4. Samples: 225770. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:32:18,504][130331] Avg episode reward: [(0, '22.656'), (1, '24.883')]
[2023-09-21 10:32:18,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000240_122880.pth...
[2023-09-21 10:32:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000240_122880.pth...
[2023-09-21 10:32:18,516][130981] Saving new best policy, reward=24.883!
[2023-09-21 10:32:18,517][130980] Saving new best policy, reward=22.656!
[2023-09-21 10:32:23,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12124.0, 300 sec: 12124.0). Total num frames: 311296. Throughput: 0: 5774.0, 1: 6063.3. Samples: 302472. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:32:23,504][130331] Avg episode reward: [(0, '27.459'), (1, '28.453')]
[2023-09-21 10:32:23,506][130980] Saving new best policy, reward=27.459!
[2023-09-21 10:32:23,506][130981] Saving new best policy, reward=28.453!
[2023-09-21 10:32:24,287][00302] Updated weights for policy 1, policy_version 320 (0.0016)
[2023-09-21 10:32:24,287][131067] Updated weights for policy 0, policy_version 320 (0.0014)
[2023-09-21 10:32:28,503][130331] Fps is (10 sec: 13107.5, 60 sec: 12288.0, 300 sec: 12288.0). Total num frames: 376832. Throughput: 0: 6085.5, 1: 5735.5. Samples: 361164. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:32:28,504][130331] Avg episode reward: [(0, '29.814'), (1, '29.809')]
[2023-09-21 10:32:28,505][130980] Saving new best policy, reward=29.814!
[2023-09-21 10:32:28,505][130981] Saving new best policy, reward=29.809!
[2023-09-21 10:32:30,976][131067] Updated weights for policy 0, policy_version 400 (0.0013)
[2023-09-21 10:32:30,977][00302] Updated weights for policy 1, policy_version 400 (0.0014)
[2023-09-21 10:32:33,503][130331] Fps is (10 sec: 13107.5, 60 sec: 12405.0, 300 sec: 12405.0). Total num frames: 442368. Throughput: 0: 5768.5, 1: 5968.5. Samples: 417330. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:32:33,503][130331] Avg episode reward: [(0, '30.710'), (1, '30.441')]
[2023-09-21 10:32:33,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000432_221184.pth...
[2023-09-21 10:32:33,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000432_221184.pth...
[2023-09-21 10:32:33,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000056_28672.pth
[2023-09-21 10:32:33,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000056_28672.pth
[2023-09-21 10:32:33,519][130980] Saving new best policy, reward=30.710!
[2023-09-21 10:32:33,519][130981] Saving new best policy, reward=30.441!
[2023-09-21 10:32:36,993][131067] Updated weights for policy 0, policy_version 480 (0.0013)
[2023-09-21 10:32:36,994][00302] Updated weights for policy 1, policy_version 480 (0.0014)
[2023-09-21 10:32:38,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12492.8, 300 sec: 12492.8). Total num frames: 507904. Throughput: 0: 6078.5, 1: 6247.4. Samples: 499570. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:32:38,503][130331] Avg episode reward: [(0, '30.865'), (1, '30.123')]
[2023-09-21 10:32:38,504][130980] Saving new best policy, reward=30.865!
[2023-09-21 10:32:43,083][00302] Updated weights for policy 1, policy_version 560 (0.0016)
[2023-09-21 10:32:43,083][131067] Updated weights for policy 0, policy_version 560 (0.0012)
[2023-09-21 10:32:43,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12561.1, 300 sec: 12561.1). Total num frames: 573440. Throughput: 0: 6290.5, 1: 6002.7. Samples: 559732. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:32:43,503][130331] Avg episode reward: [(0, '32.542'), (1, '31.562')]
[2023-09-21 10:32:43,504][130981] Saving new best policy, reward=31.562!
[2023-09-21 10:32:43,504][130980] Saving new best policy, reward=32.542!
[2023-09-21 10:32:48,502][130331] Fps is (10 sec: 13107.2, 60 sec: 12615.7, 300 sec: 12615.7). Total num frames: 638976. Throughput: 0: 6450.8, 1: 6453.3. Samples: 618472. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:32:48,503][130331] Avg episode reward: [(0, '33.636'), (1, '34.157')]
[2023-09-21 10:32:48,507][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000624_319488.pth...
[2023-09-21 10:32:48,507][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000624_319488.pth...
[2023-09-21 10:32:48,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000240_122880.pth
[2023-09-21 10:32:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000240_122880.pth
[2023-09-21 10:32:48,516][130980] Saving new best policy, reward=33.636!
[2023-09-21 10:32:48,516][130981] Saving new best policy, reward=34.157!
[2023-09-21 10:32:49,311][00302] Updated weights for policy 1, policy_version 640 (0.0014)
[2023-09-21 10:32:49,311][131067] Updated weights for policy 0, policy_version 640 (0.0014)
[2023-09-21 10:32:53,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12660.3, 300 sec: 12660.3). Total num frames: 704512. Throughput: 0: 6510.9, 1: 6491.8. Samples: 696306. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:32:53,504][130331] Avg episode reward: [(0, '35.493'), (1, '36.756')]
[2023-09-21 10:32:53,505][130980] Saving new best policy, reward=35.493!
[2023-09-21 10:32:53,505][130981] Saving new best policy, reward=36.756!
[2023-09-21 10:32:55,648][131067] Updated weights for policy 0, policy_version 720 (0.0015)
[2023-09-21 10:32:55,648][00302] Updated weights for policy 1, policy_version 720 (0.0015)
[2023-09-21 10:32:58,503][130331] Fps is (10 sec: 13926.3, 60 sec: 12834.1, 300 sec: 12834.1). Total num frames: 778240. Throughput: 0: 6558.7, 1: 6515.2. Samples: 757688. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:32:58,503][130331] Avg episode reward: [(0, '37.279'), (1, '38.414')]
[2023-09-21 10:32:58,504][130980] Saving new best policy, reward=37.279!
[2023-09-21 10:32:58,504][130981] Saving new best policy, reward=38.414!
[2023-09-21 10:33:01,574][131067] Updated weights for policy 0, policy_version 800 (0.0013)
[2023-09-21 10:33:01,575][00302] Updated weights for policy 1, policy_version 800 (0.0014)
[2023-09-21 10:33:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 12729.1). Total num frames: 835584. Throughput: 0: 6602.4, 1: 6580.5. Samples: 818998. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:33:03,504][130331] Avg episode reward: [(0, '40.157'), (1, '39.803')]
[2023-09-21 10:33:03,520][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000824_421888.pth...
[2023-09-21 10:33:03,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000432_221184.pth
[2023-09-21 10:33:03,523][130981] Saving new best policy, reward=39.803!
[2023-09-21 10:33:03,532][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000824_421888.pth...
[2023-09-21 10:33:03,535][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000432_221184.pth
[2023-09-21 10:33:03,535][130980] Saving new best policy, reward=40.157!
[2023-09-21 10:33:08,115][00302] Updated weights for policy 1, policy_version 880 (0.0012)
[2023-09-21 10:33:08,115][131067] Updated weights for policy 0, policy_version 880 (0.0013)
[2023-09-21 10:33:08,502][130331] Fps is (10 sec: 12288.2, 60 sec: 12970.7, 300 sec: 12756.1). Total num frames: 901120. Throughput: 0: 6569.2, 1: 6573.4. Samples: 893882. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:08,503][130331] Avg episode reward: [(0, '40.387'), (1, '41.338')]
[2023-09-21 10:33:08,504][130980] Saving new best policy, reward=40.387!
[2023-09-21 10:33:08,504][130981] Saving new best policy, reward=41.338!
[2023-09-21 10:33:13,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12779.5). Total num frames: 966656. Throughput: 0: 6566.3, 1: 6552.1. Samples: 951492. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:33:13,503][130331] Avg episode reward: [(0, '44.775'), (1, '41.176')]
[2023-09-21 10:33:13,504][130980] Saving new best policy, reward=44.775!
[2023-09-21 10:33:14,492][131067] Updated weights for policy 0, policy_version 960 (0.0012)
[2023-09-21 10:33:14,493][00302] Updated weights for policy 1, policy_version 960 (0.0012)
[2023-09-21 10:33:18,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 12800.0). Total num frames: 1032192. Throughput: 0: 6563.9, 1: 6563.4. Samples: 1008058. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:33:18,503][130331] Avg episode reward: [(0, '52.105'), (1, '40.737')]
[2023-09-21 10:33:18,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001008_516096.pth...
[2023-09-21 10:33:18,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001008_516096.pth...
[2023-09-21 10:33:18,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000624_319488.pth
[2023-09-21 10:33:18,513][130980] Saving new best policy, reward=52.105!
[2023-09-21 10:33:18,512][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000624_319488.pth
[2023-09-21 10:33:20,731][00302] Updated weights for policy 1, policy_version 1040 (0.0015)
[2023-09-21 10:33:20,731][131067] Updated weights for policy 0, policy_version 1040 (0.0015)
[2023-09-21 10:33:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12818.1). Total num frames: 1097728. Throughput: 0: 6538.0, 1: 6549.0. Samples: 1088486. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:23,504][130331] Avg episode reward: [(0, '59.577'), (1, '40.107')]
[2023-09-21 10:33:23,505][130980] Saving new best policy, reward=59.577!
[2023-09-21 10:33:26,985][00302] Updated weights for policy 1, policy_version 1120 (0.0014)
[2023-09-21 10:33:26,985][131067] Updated weights for policy 0, policy_version 1120 (0.0015)
[2023-09-21 10:33:28,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13107.1, 300 sec: 12834.1). Total num frames: 1163264. Throughput: 0: 6509.3, 1: 6537.1. Samples: 1146824. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:33:28,504][130331] Avg episode reward: [(0, '63.692'), (1, '38.449')]
[2023-09-21 10:33:28,505][130980] Saving new best policy, reward=63.692!
[2023-09-21 10:33:33,295][131067] Updated weights for policy 0, policy_version 1200 (0.0013)
[2023-09-21 10:33:33,295][00302] Updated weights for policy 1, policy_version 1200 (0.0016)
[2023-09-21 10:33:33,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 12848.5). Total num frames: 1228800. Throughput: 0: 6535.5, 1: 6534.9. Samples: 1206644. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:33,504][130331] Avg episode reward: [(0, '67.516'), (1, '37.699')]
[2023-09-21 10:33:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001200_614400.pth...
[2023-09-21 10:33:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001200_614400.pth...
[2023-09-21 10:33:33,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000000824_421888.pth
[2023-09-21 10:33:33,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000000824_421888.pth
[2023-09-21 10:33:33,521][130980] Saving new best policy, reward=67.516!
[2023-09-21 10:33:38,502][130331] Fps is (10 sec: 12288.5, 60 sec: 12970.7, 300 sec: 12779.5). Total num frames: 1286144. Throughput: 0: 6510.8, 1: 6524.0. Samples: 1282868. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:33:38,503][130331] Avg episode reward: [(0, '72.598'), (1, '36.057')]
[2023-09-21 10:33:38,523][130980] Saving new best policy, reward=72.598!
[2023-09-21 10:33:39,922][131067] Updated weights for policy 0, policy_version 1280 (0.0016)
[2023-09-21 10:33:39,923][00302] Updated weights for policy 1, policy_version 1280 (0.0013)
[2023-09-21 10:33:43,502][130331] Fps is (10 sec: 12288.3, 60 sec: 12970.7, 300 sec: 12795.1). Total num frames: 1351680. Throughput: 0: 6432.1, 1: 6465.0. Samples: 1338054. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:43,503][130331] Avg episode reward: [(0, '78.140'), (1, '36.746')]
[2023-09-21 10:33:43,504][130980] Saving new best policy, reward=78.140!
[2023-09-21 10:33:46,435][131067] Updated weights for policy 0, policy_version 1360 (0.0011)
[2023-09-21 10:33:46,436][00302] Updated weights for policy 1, policy_version 1360 (0.0015)
[2023-09-21 10:33:48,503][130331] Fps is (10 sec: 13106.8, 60 sec: 12970.6, 300 sec: 12809.3). Total num frames: 1417216. Throughput: 0: 6377.5, 1: 6394.1. Samples: 1393720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:48,504][130331] Avg episode reward: [(0, '80.259'), (1, '36.998')]
[2023-09-21 10:33:48,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001384_708608.pth...
[2023-09-21 10:33:48,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001384_708608.pth...
[2023-09-21 10:33:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001008_516096.pth
[2023-09-21 10:33:48,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001008_516096.pth
[2023-09-21 10:33:48,517][130980] Saving new best policy, reward=80.259!
[2023-09-21 10:33:52,805][131067] Updated weights for policy 0, policy_version 1440 (0.0014)
[2023-09-21 10:33:52,805][00302] Updated weights for policy 1, policy_version 1440 (0.0012)
[2023-09-21 10:33:53,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12970.7, 300 sec: 12822.2). Total num frames: 1482752. Throughput: 0: 6416.7, 1: 6424.4. Samples: 1471736. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:33:53,504][130331] Avg episode reward: [(0, '79.871'), (1, '37.986')]
[2023-09-21 10:33:58,502][130331] Fps is (10 sec: 13107.6, 60 sec: 12834.2, 300 sec: 12834.1). Total num frames: 1548288. Throughput: 0: 6436.2, 1: 6411.3. Samples: 1529632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:33:58,503][130331] Avg episode reward: [(0, '79.646'), (1, '36.642')]
[2023-09-21 10:33:59,022][131067] Updated weights for policy 0, policy_version 1520 (0.0013)
[2023-09-21 10:33:59,022][00302] Updated weights for policy 1, policy_version 1520 (0.0014)
[2023-09-21 10:34:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 12845.0). Total num frames: 1613824. Throughput: 0: 6463.4, 1: 6473.5. Samples: 1590220. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:34:03,504][130331] Avg episode reward: [(0, '77.283'), (1, '37.069')]
[2023-09-21 10:34:03,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001576_806912.pth...
[2023-09-21 10:34:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001576_806912.pth...
[2023-09-21 10:34:03,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001200_614400.pth
[2023-09-21 10:34:03,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001200_614400.pth
[2023-09-21 10:34:05,114][00302] Updated weights for policy 1, policy_version 1600 (0.0016)
[2023-09-21 10:34:05,114][131067] Updated weights for policy 0, policy_version 1600 (0.0016)
[2023-09-21 10:34:08,502][130331] Fps is (10 sec: 13107.2, 60 sec: 12970.7, 300 sec: 12855.1). Total num frames: 1679360. Throughput: 0: 6482.3, 1: 6484.7. Samples: 1672000. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:34:08,503][130331] Avg episode reward: [(0, '78.806'), (1, '38.884')]
[2023-09-21 10:34:11,416][131067] Updated weights for policy 0, policy_version 1680 (0.0013)
[2023-09-21 10:34:11,417][00302] Updated weights for policy 1, policy_version 1680 (0.0015)
[2023-09-21 10:34:13,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12970.6, 300 sec: 12864.5). Total num frames: 1744896. Throughput: 0: 6459.1, 1: 6476.2. Samples: 1728908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:34:13,504][130331] Avg episode reward: [(0, '80.726'), (1, '42.724')]
[2023-09-21 10:34:13,505][130981] Saving new best policy, reward=42.724!
[2023-09-21 10:34:13,505][130980] Saving new best policy, reward=80.726!
[2023-09-21 10:34:17,979][131067] Updated weights for policy 0, policy_version 1760 (0.0012)
[2023-09-21 10:34:17,981][00302] Updated weights for policy 1, policy_version 1760 (0.0018)
[2023-09-21 10:34:18,502][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.2, 300 sec: 12814.6). Total num frames: 1802240. Throughput: 0: 6423.9, 1: 6422.2. Samples: 1784714. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:34:18,503][130331] Avg episode reward: [(0, '86.120'), (1, '44.575')]
[2023-09-21 10:34:18,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001760_901120.pth...
[2023-09-21 10:34:18,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001760_901120.pth...
[2023-09-21 10:34:18,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001384_708608.pth
[2023-09-21 10:34:18,513][130980] Saving new best policy, reward=86.120!
[2023-09-21 10:34:18,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001384_708608.pth
[2023-09-21 10:34:18,514][130981] Saving new best policy, reward=44.575!
[2023-09-21 10:34:23,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.1, 300 sec: 12824.7). Total num frames: 1867776. Throughput: 0: 6433.1, 1: 6436.8. Samples: 1862018. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:34:23,504][130331] Avg episode reward: [(0, '89.627'), (1, '43.576')]
[2023-09-21 10:34:23,563][130980] Saving new best policy, reward=89.627!
[2023-09-21 10:34:24,216][00302] Updated weights for policy 1, policy_version 1840 (0.0014)
[2023-09-21 10:34:24,216][131067] Updated weights for policy 0, policy_version 1840 (0.0013)
[2023-09-21 10:34:28,503][130331] Fps is (10 sec: 13106.9, 60 sec: 12834.2, 300 sec: 12834.1). Total num frames: 1933312. Throughput: 0: 6478.8, 1: 6468.6. Samples: 1920686. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:34:28,504][130331] Avg episode reward: [(0, '90.351'), (1, '38.683')]
[2023-09-21 10:34:28,505][130980] Saving new best policy, reward=90.351!
[2023-09-21 10:34:30,364][131067] Updated weights for policy 0, policy_version 1920 (0.0012)
[2023-09-21 10:34:30,364][00302] Updated weights for policy 1, policy_version 1920 (0.0013)
[2023-09-21 10:34:33,503][130331] Fps is (10 sec: 13926.0, 60 sec: 12970.6, 300 sec: 12895.8). Total num frames: 2007040. Throughput: 0: 6524.5, 1: 6527.3. Samples: 1981058. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:34:33,504][130331] Avg episode reward: [(0, '90.423'), (1, '31.712')]
[2023-09-21 10:34:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001960_1003520.pth...
[2023-09-21 10:34:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001960_1003520.pth...
[2023-09-21 10:34:33,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001576_806912.pth
[2023-09-21 10:34:33,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001576_806912.pth
[2023-09-21 10:34:33,519][130980] Saving new best policy, reward=90.423!
[2023-09-21 10:34:36,810][131067] Updated weights for policy 0, policy_version 2000 (0.0014)
[2023-09-21 10:34:36,810][00302] Updated weights for policy 1, policy_version 2000 (0.0016)
[2023-09-21 10:34:38,502][130331] Fps is (10 sec: 13107.5, 60 sec: 12970.7, 300 sec: 12851.2). Total num frames: 2064384. Throughput: 0: 6503.8, 1: 6484.1. Samples: 2056186. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:34:38,503][130331] Avg episode reward: [(0, '91.416'), (1, '26.294')]
[2023-09-21 10:34:38,504][130980] Saving new best policy, reward=91.416!
[2023-09-21 10:34:43,319][131067] Updated weights for policy 0, policy_version 2080 (0.0013)
[2023-09-21 10:34:43,319][00302] Updated weights for policy 1, policy_version 2080 (0.0014)
[2023-09-21 10:34:43,503][130331] Fps is (10 sec: 12288.4, 60 sec: 12970.6, 300 sec: 12858.9). Total num frames: 2129920. Throughput: 0: 6482.5, 1: 6512.7. Samples: 2114418. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:34:43,504][130331] Avg episode reward: [(0, '94.220'), (1, '22.669')]
[2023-09-21 10:34:43,505][130980] Saving new best policy, reward=94.220!
[2023-09-21 10:34:48,503][130331] Fps is (10 sec: 12287.8, 60 sec: 12834.2, 300 sec: 12818.1). Total num frames: 2187264. Throughput: 0: 6426.0, 1: 6426.1. Samples: 2168564. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:34:48,503][130331] Avg episode reward: [(0, '99.105'), (1, '19.191')]
[2023-09-21 10:34:48,548][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002144_1097728.pth...
[2023-09-21 10:34:48,550][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002144_1097728.pth...
[2023-09-21 10:34:48,552][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001760_901120.pth
[2023-09-21 10:34:48,553][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001760_901120.pth
[2023-09-21 10:34:48,553][130980] Saving new best policy, reward=99.105!
[2023-09-21 10:34:49,802][131067] Updated weights for policy 0, policy_version 2160 (0.0013)
[2023-09-21 10:34:49,803][00302] Updated weights for policy 1, policy_version 2160 (0.0015)
[2023-09-21 10:34:53,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.1, 300 sec: 12826.3). Total num frames: 2252800. Throughput: 0: 6371.5, 1: 6365.2. Samples: 2245152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:34:53,504][130331] Avg episode reward: [(0, '103.534'), (1, '14.251')]
[2023-09-21 10:34:53,505][130980] Saving new best policy, reward=103.534!
[2023-09-21 10:34:56,217][00302] Updated weights for policy 1, policy_version 2240 (0.0014)
[2023-09-21 10:34:56,218][131067] Updated weights for policy 0, policy_version 2240 (0.0016)
[2023-09-21 10:34:58,502][130331] Fps is (10 sec: 13107.5, 60 sec: 12834.1, 300 sec: 12834.1). Total num frames: 2318336. Throughput: 0: 6420.9, 1: 6380.0. Samples: 2304946. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:34:58,503][130331] Avg episode reward: [(0, '107.916'), (1, '7.871')]
[2023-09-21 10:34:58,504][130980] Saving new best policy, reward=107.916!
[2023-09-21 10:35:02,458][00302] Updated weights for policy 1, policy_version 2320 (0.0013)
[2023-09-21 10:35:02,459][131067] Updated weights for policy 0, policy_version 2320 (0.0010)
[2023-09-21 10:35:03,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12834.2, 300 sec: 12841.5). Total num frames: 2383872. Throughput: 0: 6433.9, 1: 6437.7. Samples: 2363940. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:35:03,503][130331] Avg episode reward: [(0, '112.262'), (1, '2.952')]
[2023-09-21 10:35:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002328_1191936.pth...
[2023-09-21 10:35:03,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002328_1191936.pth...
[2023-09-21 10:35:03,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000001960_1003520.pth
[2023-09-21 10:35:03,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000001960_1003520.pth
[2023-09-21 10:35:03,520][130980] Saving new best policy, reward=112.262!
[2023-09-21 10:35:08,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12834.1, 300 sec: 12848.5). Total num frames: 2449408. Throughput: 0: 6459.3, 1: 6455.5. Samples: 2443180. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:35:08,503][130331] Avg episode reward: [(0, '117.119'), (1, '-0.275')]
[2023-09-21 10:35:08,504][130980] Saving new best policy, reward=117.119!
[2023-09-21 10:35:08,709][131067] Updated weights for policy 0, policy_version 2400 (0.0011)
[2023-09-21 10:35:08,709][00302] Updated weights for policy 1, policy_version 2400 (0.0014)
[2023-09-21 10:35:13,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12834.2, 300 sec: 12855.1). Total num frames: 2514944. Throughput: 0: 6396.9, 1: 6455.3. Samples: 2499030. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:35:13,503][130331] Avg episode reward: [(0, '123.140'), (1, '-1.024')]
[2023-09-21 10:35:13,504][130980] Saving new best policy, reward=123.140!
[2023-09-21 10:35:15,291][00302] Updated weights for policy 1, policy_version 2480 (0.0012)
[2023-09-21 10:35:15,292][131067] Updated weights for policy 0, policy_version 2480 (0.0015)
[2023-09-21 10:35:18,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12834.1, 300 sec: 12820.5). Total num frames: 2572288. Throughput: 0: 6387.1, 1: 6374.4. Samples: 2555322. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:35:18,503][130331] Avg episode reward: [(0, '125.928'), (1, '-1.251')]
[2023-09-21 10:35:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002512_1286144.pth...
[2023-09-21 10:35:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002512_1286144.pth...
[2023-09-21 10:35:18,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002144_1097728.pth
[2023-09-21 10:35:18,516][130980] Saving new best policy, reward=125.928!
[2023-09-21 10:35:18,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002144_1097728.pth
[2023-09-21 10:35:21,854][131067] Updated weights for policy 0, policy_version 2560 (0.0013)
[2023-09-21 10:35:21,856][00302] Updated weights for policy 1, policy_version 2560 (0.0015)
[2023-09-21 10:35:23,503][130331] Fps is (10 sec: 12287.8, 60 sec: 12834.1, 300 sec: 12827.5). Total num frames: 2637824. Throughput: 0: 6375.5, 1: 6385.1. Samples: 2630416. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:35:23,504][130331] Avg episode reward: [(0, '127.466'), (1, '-1.119')]
[2023-09-21 10:35:23,505][130980] Saving new best policy, reward=127.466!
[2023-09-21 10:35:28,003][131067] Updated weights for policy 0, policy_version 2640 (0.0011)
[2023-09-21 10:35:28,004][00302] Updated weights for policy 1, policy_version 2640 (0.0017)
[2023-09-21 10:35:28,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12834.2, 300 sec: 12834.1). Total num frames: 2703360. Throughput: 0: 6417.6, 1: 6359.0. Samples: 2689362. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:35:28,503][130331] Avg episode reward: [(0, '127.387'), (1, '-1.198')]
[2023-09-21 10:35:33,502][130331] Fps is (10 sec: 13107.4, 60 sec: 12697.7, 300 sec: 12840.5). Total num frames: 2768896. Throughput: 0: 6436.6, 1: 6441.0. Samples: 2748056. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:35:33,503][130331] Avg episode reward: [(0, '128.210'), (1, '-1.493')]
[2023-09-21 10:35:33,507][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002704_1384448.pth...
[2023-09-21 10:35:33,507][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002704_1384448.pth...
[2023-09-21 10:35:33,511][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002328_1191936.pth
[2023-09-21 10:35:33,512][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002328_1191936.pth
[2023-09-21 10:35:33,512][130980] Saving new best policy, reward=128.210!
[2023-09-21 10:35:34,391][00302] Updated weights for policy 1, policy_version 2720 (0.0015)
[2023-09-21 10:35:34,391][131067] Updated weights for policy 0, policy_version 2720 (0.0011)
[2023-09-21 10:35:38,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 12846.5). Total num frames: 2834432. Throughput: 0: 6441.8, 1: 6448.2. Samples: 2825202. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:35:38,504][130331] Avg episode reward: [(0, '128.090'), (1, '-2.055')]
[2023-09-21 10:35:40,844][131067] Updated weights for policy 0, policy_version 2800 (0.0014)
[2023-09-21 10:35:40,844][00302] Updated weights for policy 1, policy_version 2800 (0.0016)
[2023-09-21 10:35:43,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 12852.3). Total num frames: 2899968. Throughput: 0: 6413.2, 1: 6434.9. Samples: 2883114. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:35:43,504][130331] Avg episode reward: [(0, '126.477'), (1, '-2.278')]
[2023-09-21 10:35:47,122][131067] Updated weights for policy 0, policy_version 2880 (0.0015)
[2023-09-21 10:35:47,122][00302] Updated weights for policy 1, policy_version 2880 (0.0013)
[2023-09-21 10:35:48,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12970.7, 300 sec: 12857.9). Total num frames: 2965504. Throughput: 0: 6426.3, 1: 6412.4. Samples: 2941680. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:35:48,503][130331] Avg episode reward: [(0, '123.904'), (1, '-1.908')]
[2023-09-21 10:35:48,507][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002896_1482752.pth...
[2023-09-21 10:35:48,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002896_1482752.pth...
[2023-09-21 10:35:48,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002512_1286144.pth
[2023-09-21 10:35:48,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002512_1286144.pth
[2023-09-21 10:35:53,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12834.2, 300 sec: 12828.3). Total num frames: 3022848. Throughput: 0: 6393.5, 1: 6397.9. Samples: 3018794. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:35:53,504][130331] Avg episode reward: [(0, '121.333'), (1, '-1.097')]
[2023-09-21 10:35:53,508][00302] Updated weights for policy 1, policy_version 2960 (0.0014)
[2023-09-21 10:35:53,509][131067] Updated weights for policy 0, policy_version 2960 (0.0016)
[2023-09-21 10:35:58,502][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.1, 300 sec: 12834.1). Total num frames: 3088384. Throughput: 0: 6417.4, 1: 6372.2. Samples: 3074560. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:35:58,503][130331] Avg episode reward: [(0, '116.035'), (1, '-0.949')]
[2023-09-21 10:35:59,885][00302] Updated weights for policy 1, policy_version 3040 (0.0014)
[2023-09-21 10:35:59,885][131067] Updated weights for policy 0, policy_version 3040 (0.0014)
[2023-09-21 10:36:03,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 12839.7). Total num frames: 3153920. Throughput: 0: 6449.7, 1: 6460.1. Samples: 3136266. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:36:03,504][130331] Avg episode reward: [(0, '110.073'), (1, '-1.405')]
[2023-09-21 10:36:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003080_1576960.pth...
[2023-09-21 10:36:03,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003080_1576960.pth...
[2023-09-21 10:36:03,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002704_1384448.pth
[2023-09-21 10:36:03,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002704_1384448.pth
[2023-09-21 10:36:06,169][131067] Updated weights for policy 0, policy_version 3120 (0.0014)
[2023-09-21 10:36:06,169][00302] Updated weights for policy 1, policy_version 3120 (0.0013)
[2023-09-21 10:36:08,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 12845.1). Total num frames: 3219456. Throughput: 0: 6479.2, 1: 6490.8. Samples: 3214070. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:08,504][130331] Avg episode reward: [(0, '108.174'), (1, '-2.359')]
[2023-09-21 10:36:12,453][131067] Updated weights for policy 0, policy_version 3200 (0.0013)
[2023-09-21 10:36:12,454][00302] Updated weights for policy 1, policy_version 3200 (0.0014)
[2023-09-21 10:36:13,502][130331] Fps is (10 sec: 13107.6, 60 sec: 12834.1, 300 sec: 12850.2). Total num frames: 3284992. Throughput: 0: 6448.4, 1: 6511.5. Samples: 3272554. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:36:13,503][130331] Avg episode reward: [(0, '110.100'), (1, '-2.555')]
[2023-09-21 10:36:18,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.6, 300 sec: 12855.1). Total num frames: 3350528. Throughput: 0: 6464.2, 1: 6460.2. Samples: 3329660. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:36:18,504][130331] Avg episode reward: [(0, '114.407'), (1, '-2.034')]
[2023-09-21 10:36:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003272_1675264.pth...
[2023-09-21 10:36:18,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003272_1675264.pth...
[2023-09-21 10:36:18,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000002896_1482752.pth
[2023-09-21 10:36:18,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000002896_1482752.pth
[2023-09-21 10:36:18,682][131067] Updated weights for policy 0, policy_version 3280 (0.0012)
[2023-09-21 10:36:18,682][00302] Updated weights for policy 1, policy_version 3280 (0.0016)
[2023-09-21 10:36:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 12859.9). Total num frames: 3416064. Throughput: 0: 6475.2, 1: 6465.6. Samples: 3407536. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:36:23,503][130331] Avg episode reward: [(0, '116.233'), (1, '-1.737')]
[2023-09-21 10:36:25,049][131067] Updated weights for policy 0, policy_version 3360 (0.0015)
[2023-09-21 10:36:25,049][00302] Updated weights for policy 1, policy_version 3360 (0.0016)
[2023-09-21 10:36:28,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.6, 300 sec: 12864.5). Total num frames: 3481600. Throughput: 0: 6476.2, 1: 6485.6. Samples: 3466398. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:28,504][130331] Avg episode reward: [(0, '116.358'), (1, '-1.871')]
[2023-09-21 10:36:31,401][131067] Updated weights for policy 0, policy_version 3440 (0.0015)
[2023-09-21 10:36:31,401][00302] Updated weights for policy 1, policy_version 3440 (0.0014)
[2023-09-21 10:36:33,503][130331] Fps is (10 sec: 13106.9, 60 sec: 12970.6, 300 sec: 12868.9). Total num frames: 3547136. Throughput: 0: 6464.3, 1: 6469.5. Samples: 3523704. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:33,504][130331] Avg episode reward: [(0, '114.788'), (1, '-3.190')]
[2023-09-21 10:36:33,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003464_1773568.pth...
[2023-09-21 10:36:33,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003464_1773568.pth...
[2023-09-21 10:36:33,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003080_1576960.pth
[2023-09-21 10:36:33,520][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003080_1576960.pth
[2023-09-21 10:36:38,071][131067] Updated weights for policy 0, policy_version 3520 (0.0017)
[2023-09-21 10:36:38,071][00302] Updated weights for policy 1, policy_version 3520 (0.0012)
[2023-09-21 10:36:38,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.1, 300 sec: 12843.9). Total num frames: 3604480. Throughput: 0: 6429.0, 1: 6420.3. Samples: 3597014. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:38,504][130331] Avg episode reward: [(0, '114.735'), (1, '-3.580')]
[2023-09-21 10:36:43,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12834.1, 300 sec: 12848.5). Total num frames: 3670016. Throughput: 0: 6431.7, 1: 6447.0. Samples: 3654106. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:43,504][130331] Avg episode reward: [(0, '115.560'), (1, '-4.242')]
[2023-09-21 10:36:44,564][131067] Updated weights for policy 0, policy_version 3600 (0.0012)
[2023-09-21 10:36:44,564][00302] Updated weights for policy 1, policy_version 3600 (0.0015)
[2023-09-21 10:36:48,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12834.1, 300 sec: 12853.0). Total num frames: 3735552. Throughput: 0: 6381.2, 1: 6374.2. Samples: 3710258. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:36:48,504][130331] Avg episode reward: [(0, '117.703'), (1, '-4.967')]
[2023-09-21 10:36:48,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003648_1867776.pth...
[2023-09-21 10:36:48,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003648_1867776.pth...
[2023-09-21 10:36:48,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003272_1675264.pth
[2023-09-21 10:36:48,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003272_1675264.pth
[2023-09-21 10:36:50,928][131067] Updated weights for policy 0, policy_version 3680 (0.0012)
[2023-09-21 10:36:50,929][00302] Updated weights for policy 1, policy_version 3680 (0.0015)
[2023-09-21 10:36:53,503][130331] Fps is (10 sec: 12288.2, 60 sec: 12834.1, 300 sec: 12829.5). Total num frames: 3792896. Throughput: 0: 6364.2, 1: 6355.3. Samples: 3786448. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:53,503][130331] Avg episode reward: [(0, '118.451'), (1, '-5.784')]
[2023-09-21 10:36:57,811][00302] Updated weights for policy 1, policy_version 3760 (0.0011)
[2023-09-21 10:36:57,812][131067] Updated weights for policy 0, policy_version 3760 (0.0013)
[2023-09-21 10:36:58,502][130331] Fps is (10 sec: 11469.0, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 3850240. Throughput: 0: 6315.4, 1: 6327.5. Samples: 3841482. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:36:58,503][130331] Avg episode reward: [(0, '119.655'), (1, '-6.642')]
[2023-09-21 10:37:03,503][130331] Fps is (10 sec: 12287.8, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 3915776. Throughput: 0: 6312.8, 1: 6312.5. Samples: 3897798. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:03,504][130331] Avg episode reward: [(0, '121.861'), (1, '-6.974')]
[2023-09-21 10:37:03,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003824_1957888.pth...
[2023-09-21 10:37:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003824_1957888.pth...
[2023-09-21 10:37:03,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003464_1773568.pth
[2023-09-21 10:37:03,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003464_1773568.pth
[2023-09-21 10:37:04,201][00302] Updated weights for policy 1, policy_version 3840 (0.0014)
[2023-09-21 10:37:04,201][131067] Updated weights for policy 0, policy_version 3840 (0.0013)
[2023-09-21 10:37:08,503][130331] Fps is (10 sec: 13926.3, 60 sec: 12834.2, 300 sec: 12912.8). Total num frames: 3989504. Throughput: 0: 6333.1, 1: 6351.4. Samples: 3978336. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:37:08,503][130331] Avg episode reward: [(0, '124.412'), (1, '-6.769')]
[2023-09-21 10:37:10,438][131067] Updated weights for policy 0, policy_version 3920 (0.0007)
[2023-09-21 10:37:10,439][00302] Updated weights for policy 1, policy_version 3920 (0.0013)
[2023-09-21 10:37:13,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12697.6, 300 sec: 12885.1). Total num frames: 4046848. Throughput: 0: 6340.5, 1: 6305.4. Samples: 4035462. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:13,503][130331] Avg episode reward: [(0, '127.222'), (1, '-7.141')]
[2023-09-21 10:37:16,608][131067] Updated weights for policy 0, policy_version 4000 (0.0014)
[2023-09-21 10:37:16,609][00302] Updated weights for policy 1, policy_version 4000 (0.0015)
[2023-09-21 10:37:18,510][130331] Fps is (10 sec: 13097.8, 60 sec: 12832.6, 300 sec: 12912.5). Total num frames: 4120576. Throughput: 0: 6359.1, 1: 6347.6. Samples: 4095596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:18,511][130331] Avg episode reward: [(0, '128.131'), (1, '-6.833')]
[2023-09-21 10:37:18,518][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004024_2060288.pth...
[2023-09-21 10:37:18,519][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004024_2060288.pth...
[2023-09-21 10:37:18,525][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003648_1867776.pth
[2023-09-21 10:37:18,526][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003648_1867776.pth
[2023-09-21 10:37:22,950][131067] Updated weights for policy 0, policy_version 4080 (0.0012)
[2023-09-21 10:37:22,951][00302] Updated weights for policy 1, policy_version 4080 (0.0015)
[2023-09-21 10:37:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12697.6, 300 sec: 12885.0). Total num frames: 4177920. Throughput: 0: 6381.9, 1: 6390.7. Samples: 4171778. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:23,504][130331] Avg episode reward: [(0, '128.666'), (1, '-7.439')]
[2023-09-21 10:37:23,512][130980] Saving new best policy, reward=128.666!
[2023-09-21 10:37:28,503][130331] Fps is (10 sec: 13116.4, 60 sec: 12834.1, 300 sec: 12912.8). Total num frames: 4251648. Throughput: 0: 6437.3, 1: 6418.5. Samples: 4232620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:28,504][130331] Avg episode reward: [(0, '128.096'), (1, '-7.339')]
[2023-09-21 10:37:29,109][131067] Updated weights for policy 0, policy_version 4160 (0.0011)
[2023-09-21 10:37:29,110][00302] Updated weights for policy 1, policy_version 4160 (0.0014)
[2023-09-21 10:37:33,503][130331] Fps is (10 sec: 13926.3, 60 sec: 12834.1, 300 sec: 12912.8). Total num frames: 4317184. Throughput: 0: 6466.4, 1: 6462.6. Samples: 4292066. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:37:33,504][130331] Avg episode reward: [(0, '123.982'), (1, '-7.694')]
[2023-09-21 10:37:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004216_2158592.pth...
[2023-09-21 10:37:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004216_2158592.pth...
[2023-09-21 10:37:33,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000003824_1957888.pth
[2023-09-21 10:37:33,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000003824_1957888.pth
[2023-09-21 10:37:35,473][00302] Updated weights for policy 1, policy_version 4240 (0.0015)
[2023-09-21 10:37:35,473][131067] Updated weights for policy 0, policy_version 4240 (0.0013)
[2023-09-21 10:37:38,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12834.1, 300 sec: 12885.0). Total num frames: 4374528. Throughput: 0: 6453.9, 1: 6451.8. Samples: 4367206. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:37:38,504][130331] Avg episode reward: [(0, '120.380'), (1, '-8.052')]
[2023-09-21 10:37:41,833][00302] Updated weights for policy 1, policy_version 4320 (0.0013)
[2023-09-21 10:37:41,833][131067] Updated weights for policy 0, policy_version 4320 (0.0015)
[2023-09-21 10:37:43,502][130331] Fps is (10 sec: 12288.3, 60 sec: 12834.2, 300 sec: 12885.0). Total num frames: 4440064. Throughput: 0: 6539.3, 1: 6473.6. Samples: 4427062. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:37:43,503][130331] Avg episode reward: [(0, '116.063'), (1, '-8.358')]
[2023-09-21 10:37:47,888][131067] Updated weights for policy 0, policy_version 4400 (0.0013)
[2023-09-21 10:37:47,889][00302] Updated weights for policy 1, policy_version 4400 (0.0014)
[2023-09-21 10:37:48,502][130331] Fps is (10 sec: 13107.5, 60 sec: 12834.2, 300 sec: 12885.1). Total num frames: 4505600. Throughput: 0: 6549.9, 1: 6554.4. Samples: 4487490. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:37:48,503][130331] Avg episode reward: [(0, '117.691'), (1, '-8.453')]
[2023-09-21 10:37:48,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004400_2252800.pth...
[2023-09-21 10:37:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004400_2252800.pth...
[2023-09-21 10:37:48,513][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004024_2060288.pth
[2023-09-21 10:37:48,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004024_2060288.pth
[2023-09-21 10:37:53,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13107.2, 300 sec: 12885.0). Total num frames: 4579328. Throughput: 0: 6545.6, 1: 6542.6. Samples: 4567306. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:37:53,504][130331] Avg episode reward: [(0, '121.033'), (1, '-9.061')]
[2023-09-21 10:37:54,074][131067] Updated weights for policy 0, policy_version 4480 (0.0013)
[2023-09-21 10:37:54,074][00302] Updated weights for policy 1, policy_version 4480 (0.0014)
[2023-09-21 10:37:58,502][130331] Fps is (10 sec: 13926.4, 60 sec: 13243.7, 300 sec: 12912.8). Total num frames: 4644864. Throughput: 0: 6555.3, 1: 6586.4. Samples: 4626836. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:37:58,503][130331] Avg episode reward: [(0, '125.521'), (1, '-9.855')]
[2023-09-21 10:38:00,391][00302] Updated weights for policy 1, policy_version 4560 (0.0015)
[2023-09-21 10:38:00,392][131067] Updated weights for policy 0, policy_version 4560 (0.0011)
[2023-09-21 10:38:03,503][130331] Fps is (10 sec: 12288.0, 60 sec: 13107.2, 300 sec: 12885.0). Total num frames: 4702208. Throughput: 0: 6540.9, 1: 6554.0. Samples: 4684776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:03,504][130331] Avg episode reward: [(0, '130.502'), (1, '-10.262')]
[2023-09-21 10:38:03,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004592_2351104.pth...
[2023-09-21 10:38:03,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004592_2351104.pth...
[2023-09-21 10:38:03,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004216_2158592.pth
[2023-09-21 10:38:03,522][130980] Saving new best policy, reward=130.502!
[2023-09-21 10:38:03,525][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004216_2158592.pth
[2023-09-21 10:38:06,724][131067] Updated weights for policy 0, policy_version 4640 (0.0015)
[2023-09-21 10:38:06,724][00302] Updated weights for policy 1, policy_version 4640 (0.0012)
[2023-09-21 10:38:08,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12970.7, 300 sec: 12885.0). Total num frames: 4767744. Throughput: 0: 6549.5, 1: 6553.0. Samples: 4761392. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:08,503][130331] Avg episode reward: [(0, '136.342'), (1, '-10.359')]
[2023-09-21 10:38:08,504][130980] Saving new best policy, reward=136.342!
[2023-09-21 10:38:13,109][00302] Updated weights for policy 1, policy_version 4720 (0.0014)
[2023-09-21 10:38:13,109][131067] Updated weights for policy 0, policy_version 4720 (0.0011)
[2023-09-21 10:38:13,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 12885.0). Total num frames: 4833280. Throughput: 0: 6523.8, 1: 6523.4. Samples: 4819742. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:13,503][130331] Avg episode reward: [(0, '139.192'), (1, '-9.822')]
[2023-09-21 10:38:13,504][130980] Saving new best policy, reward=139.192!
[2023-09-21 10:38:18,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12972.2, 300 sec: 12885.0). Total num frames: 4898816. Throughput: 0: 6491.4, 1: 6507.2. Samples: 4877002. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:38:18,503][130331] Avg episode reward: [(0, '137.190'), (1, '-9.702')]
[2023-09-21 10:38:18,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004784_2449408.pth...
[2023-09-21 10:38:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004784_2449408.pth...
[2023-09-21 10:38:18,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004400_2252800.pth
[2023-09-21 10:38:18,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004400_2252800.pth
[2023-09-21 10:38:19,587][00302] Updated weights for policy 1, policy_version 4800 (0.0016)
[2023-09-21 10:38:19,587][131067] Updated weights for policy 0, policy_version 4800 (0.0013)
[2023-09-21 10:38:23,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12885.1). Total num frames: 4964352. Throughput: 0: 6531.7, 1: 6534.5. Samples: 4955184. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:23,504][130331] Avg episode reward: [(0, '136.528'), (1, '-9.452')]
[2023-09-21 10:38:25,901][00302] Updated weights for policy 1, policy_version 4880 (0.0014)
[2023-09-21 10:38:25,902][131067] Updated weights for policy 0, policy_version 4880 (0.0015)
[2023-09-21 10:38:28,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12834.2, 300 sec: 12857.3). Total num frames: 5021696. Throughput: 0: 6472.5, 1: 6512.5. Samples: 5011390. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:38:28,503][130331] Avg episode reward: [(0, '140.908'), (1, '-9.658')]
[2023-09-21 10:38:28,546][130980] Saving new best policy, reward=140.908!
[2023-09-21 10:38:32,363][131067] Updated weights for policy 0, policy_version 4960 (0.0013)
[2023-09-21 10:38:32,364][00302] Updated weights for policy 1, policy_version 4960 (0.0015)
[2023-09-21 10:38:33,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12834.1, 300 sec: 12885.0). Total num frames: 5087232. Throughput: 0: 6461.9, 1: 6457.6. Samples: 5068870. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:33,504][130331] Avg episode reward: [(0, '150.469'), (1, '-9.949')]
[2023-09-21 10:38:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004968_2543616.pth...
[2023-09-21 10:38:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004968_2543616.pth...
[2023-09-21 10:38:33,520][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004592_2351104.pth
[2023-09-21 10:38:33,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004592_2351104.pth
[2023-09-21 10:38:33,521][130980] Saving new best policy, reward=150.469!
[2023-09-21 10:38:38,502][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.7, 300 sec: 12885.0). Total num frames: 5152768. Throughput: 0: 6399.7, 1: 6399.4. Samples: 5143262. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:38:38,503][130331] Avg episode reward: [(0, '156.320'), (1, '-10.162')]
[2023-09-21 10:38:38,504][130980] Saving new best policy, reward=156.320!
[2023-09-21 10:38:38,979][131067] Updated weights for policy 0, policy_version 5040 (0.0014)
[2023-09-21 10:38:38,980][00302] Updated weights for policy 1, policy_version 5040 (0.0016)
[2023-09-21 10:38:43,503][130331] Fps is (10 sec: 12288.2, 60 sec: 12834.1, 300 sec: 12857.3). Total num frames: 5210112. Throughput: 0: 6368.4, 1: 6361.4. Samples: 5199678. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:38:43,503][130331] Avg episode reward: [(0, '158.415'), (1, '-11.279')]
[2023-09-21 10:38:43,549][130980] Saving new best policy, reward=158.415!
[2023-09-21 10:38:45,507][131067] Updated weights for policy 0, policy_version 5120 (0.0013)
[2023-09-21 10:38:45,508][00302] Updated weights for policy 1, policy_version 5120 (0.0014)
[2023-09-21 10:38:48,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12834.1, 300 sec: 12857.3). Total num frames: 5275648. Throughput: 0: 6353.3, 1: 6359.4. Samples: 5256844. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:38:48,503][130331] Avg episode reward: [(0, '159.694'), (1, '-11.664')]
[2023-09-21 10:38:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005152_2637824.pth...
[2023-09-21 10:38:48,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005152_2637824.pth...
[2023-09-21 10:38:48,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004784_2449408.pth
[2023-09-21 10:38:48,516][130980] Saving new best policy, reward=159.694!
[2023-09-21 10:38:48,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004784_2449408.pth
[2023-09-21 10:38:51,834][131067] Updated weights for policy 0, policy_version 5200 (0.0013)
[2023-09-21 10:38:51,834][00302] Updated weights for policy 1, policy_version 5200 (0.0015)
[2023-09-21 10:38:53,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 5341184. Throughput: 0: 6366.0, 1: 6359.1. Samples: 5334026. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:38:53,504][130331] Avg episode reward: [(0, '156.615'), (1, '-11.835')]
[2023-09-21 10:38:58,041][131067] Updated weights for policy 0, policy_version 5280 (0.0012)
[2023-09-21 10:38:58,041][00302] Updated weights for policy 1, policy_version 5280 (0.0015)
[2023-09-21 10:38:58,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 5406720. Throughput: 0: 6375.9, 1: 6371.0. Samples: 5393354. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:38:58,503][130331] Avg episode reward: [(0, '152.534'), (1, '-11.480')]
[2023-09-21 10:39:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12834.1, 300 sec: 12857.3). Total num frames: 5472256. Throughput: 0: 6408.7, 1: 6407.3. Samples: 5453726. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:39:03,504][130331] Avg episode reward: [(0, '147.178'), (1, '-11.085')]
[2023-09-21 10:39:03,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005344_2736128.pth...
[2023-09-21 10:39:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005344_2736128.pth...
[2023-09-21 10:39:03,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000004968_2543616.pth
[2023-09-21 10:39:03,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000004968_2543616.pth
[2023-09-21 10:39:04,246][131067] Updated weights for policy 0, policy_version 5360 (0.0012)
[2023-09-21 10:39:04,246][00302] Updated weights for policy 1, policy_version 5360 (0.0012)
[2023-09-21 10:39:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12834.1, 300 sec: 12857.3). Total num frames: 5537792. Throughput: 0: 6393.6, 1: 6394.9. Samples: 5530668. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:39:08,503][130331] Avg episode reward: [(0, '148.515'), (1, '-11.057')]
[2023-09-21 10:39:10,886][131067] Updated weights for policy 0, policy_version 5440 (0.0010)
[2023-09-21 10:39:10,887][00302] Updated weights for policy 1, policy_version 5440 (0.0015)
[2023-09-21 10:39:13,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 5595136. Throughput: 0: 6376.4, 1: 6377.3. Samples: 5585308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:39:13,504][130331] Avg episode reward: [(0, '154.868'), (1, '-11.132')]
[2023-09-21 10:39:17,318][00302] Updated weights for policy 1, policy_version 5520 (0.0010)
[2023-09-21 10:39:17,319][131067] Updated weights for policy 0, policy_version 5520 (0.0015)
[2023-09-21 10:39:18,502][130331] Fps is (10 sec: 12288.1, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 5660672. Throughput: 0: 6370.5, 1: 6378.3. Samples: 5642560. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:39:18,503][130331] Avg episode reward: [(0, '159.343'), (1, '-10.508')]
[2023-09-21 10:39:18,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005528_2830336.pth...
[2023-09-21 10:39:18,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005528_2830336.pth...
[2023-09-21 10:39:18,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005152_2637824.pth
[2023-09-21 10:39:18,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005152_2637824.pth
[2023-09-21 10:39:23,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12697.6, 300 sec: 12857.3). Total num frames: 5726208. Throughput: 0: 6411.2, 1: 6409.5. Samples: 5720196. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:39:23,503][130331] Avg episode reward: [(0, '161.354'), (1, '-4.985')]
[2023-09-21 10:39:23,504][130980] Saving new best policy, reward=161.354!
[2023-09-21 10:39:23,734][00302] Updated weights for policy 1, policy_version 5600 (0.0014)
[2023-09-21 10:39:23,734][131067] Updated weights for policy 0, policy_version 5600 (0.0017)
[2023-09-21 10:39:28,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12697.6, 300 sec: 12801.8). Total num frames: 5783552. Throughput: 0: 6355.9, 1: 6392.3. Samples: 5773346. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:39:28,503][130331] Avg episode reward: [(0, '163.169'), (1, '5.542')]
[2023-09-21 10:39:28,504][130980] Saving new best policy, reward=163.169!
[2023-09-21 10:39:30,719][00302] Updated weights for policy 1, policy_version 5680 (0.0016)
[2023-09-21 10:39:30,719][131067] Updated weights for policy 0, policy_version 5680 (0.0017)
[2023-09-21 10:39:33,503][130331] Fps is (10 sec: 12287.5, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 5849088. Throughput: 0: 6326.0, 1: 6318.2. Samples: 5825836. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:39:33,504][130331] Avg episode reward: [(0, '168.341'), (1, '24.243')]
[2023-09-21 10:39:33,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005712_2924544.pth...
[2023-09-21 10:39:33,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005712_2924544.pth...
[2023-09-21 10:39:33,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005344_2736128.pth
[2023-09-21 10:39:33,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005344_2736128.pth
[2023-09-21 10:39:33,520][130980] Saving new best policy, reward=168.341!
[2023-09-21 10:39:37,029][131067] Updated weights for policy 0, policy_version 5760 (0.0010)
[2023-09-21 10:39:37,029][00302] Updated weights for policy 1, policy_version 5760 (0.0014)
[2023-09-21 10:39:38,502][130331] Fps is (10 sec: 13107.3, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 5914624. Throughput: 0: 6337.9, 1: 6338.8. Samples: 5904478. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:39:38,503][130331] Avg episode reward: [(0, '174.177'), (1, '38.576')]
[2023-09-21 10:39:38,504][130980] Saving new best policy, reward=174.177!
[2023-09-21 10:39:43,468][131067] Updated weights for policy 0, policy_version 5840 (0.0014)
[2023-09-21 10:39:43,468][00302] Updated weights for policy 1, policy_version 5840 (0.0013)
[2023-09-21 10:39:43,503][130331] Fps is (10 sec: 13107.5, 60 sec: 12834.1, 300 sec: 12857.3). Total num frames: 5980160. Throughput: 0: 6305.3, 1: 6312.5. Samples: 5961156. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:39:43,504][130331] Avg episode reward: [(0, '179.931'), (1, '45.038')]
[2023-09-21 10:39:43,505][130980] Saving new best policy, reward=179.931!
[2023-09-21 10:39:43,505][130981] Saving new best policy, reward=45.038!
[2023-09-21 10:39:48,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 6037504. Throughput: 0: 6274.1, 1: 6270.4. Samples: 6018226. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:39:48,503][130331] Avg episode reward: [(0, '185.771'), (1, '43.882')]
[2023-09-21 10:39:48,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005896_3018752.pth...
[2023-09-21 10:39:48,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005896_3018752.pth...
[2023-09-21 10:39:48,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005528_2830336.pth
[2023-09-21 10:39:48,512][130980] Saving new best policy, reward=185.771!
[2023-09-21 10:39:48,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005528_2830336.pth
[2023-09-21 10:39:50,123][131067] Updated weights for policy 0, policy_version 5920 (0.0013)
[2023-09-21 10:39:50,127][00302] Updated weights for policy 1, policy_version 5920 (0.0014)
[2023-09-21 10:39:53,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 6103040. Throughput: 0: 6248.5, 1: 6252.1. Samples: 6093198. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:39:53,504][130331] Avg episode reward: [(0, '188.050'), (1, '40.988')]
[2023-09-21 10:39:53,505][130980] Saving new best policy, reward=188.050!
[2023-09-21 10:39:56,445][131067] Updated weights for policy 0, policy_version 6000 (0.0012)
[2023-09-21 10:39:56,445][00302] Updated weights for policy 1, policy_version 6000 (0.0012)
[2023-09-21 10:39:58,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 6168576. Throughput: 0: 6300.1, 1: 6284.6. Samples: 6151620. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:39:58,503][130331] Avg episode reward: [(0, '189.113'), (1, '38.981')]
[2023-09-21 10:39:58,504][130980] Saving new best policy, reward=189.113!
[2023-09-21 10:40:02,789][00302] Updated weights for policy 1, policy_version 6080 (0.0009)
[2023-09-21 10:40:02,790][131067] Updated weights for policy 0, policy_version 6080 (0.0013)
[2023-09-21 10:40:03,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 6234112. Throughput: 0: 6314.5, 1: 6305.8. Samples: 6210476. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:03,504][130331] Avg episode reward: [(0, '185.150'), (1, '38.356')]
[2023-09-21 10:40:03,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006088_3117056.pth...
[2023-09-21 10:40:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006088_3117056.pth...
[2023-09-21 10:40:03,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005712_2924544.pth
[2023-09-21 10:40:03,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005712_2924544.pth
[2023-09-21 10:40:08,503][130331] Fps is (10 sec: 12287.8, 60 sec: 12561.0, 300 sec: 12801.7). Total num frames: 6291456. Throughput: 0: 6282.0, 1: 6282.4. Samples: 6285596. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:08,504][130331] Avg episode reward: [(0, '183.413'), (1, '40.359')]
[2023-09-21 10:40:09,224][131067] Updated weights for policy 0, policy_version 6160 (0.0010)
[2023-09-21 10:40:09,225][00302] Updated weights for policy 1, policy_version 6160 (0.0012)
[2023-09-21 10:40:13,502][130331] Fps is (10 sec: 12288.5, 60 sec: 12697.6, 300 sec: 12829.5). Total num frames: 6356992. Throughput: 0: 6349.1, 1: 6307.9. Samples: 6342908. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:13,503][130331] Avg episode reward: [(0, '183.205'), (1, '43.093')]
[2023-09-21 10:40:15,872][00302] Updated weights for policy 1, policy_version 6240 (0.0013)
[2023-09-21 10:40:15,873][131067] Updated weights for policy 0, policy_version 6240 (0.0014)
[2023-09-21 10:40:18,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12561.0, 300 sec: 12801.7). Total num frames: 6414336. Throughput: 0: 6354.9, 1: 6347.4. Samples: 6397434. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:18,504][130331] Avg episode reward: [(0, '185.439'), (1, '46.222')]
[2023-09-21 10:40:18,515][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006264_3207168.pth...
[2023-09-21 10:40:18,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006264_3207168.pth...
[2023-09-21 10:40:18,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000005896_3018752.pth
[2023-09-21 10:40:18,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000005896_3018752.pth
[2023-09-21 10:40:18,519][130981] Saving new best policy, reward=46.222!
[2023-09-21 10:40:22,296][131067] Updated weights for policy 0, policy_version 6320 (0.0013)
[2023-09-21 10:40:22,297][00302] Updated weights for policy 1, policy_version 6320 (0.0015)
[2023-09-21 10:40:23,503][130331] Fps is (10 sec: 12287.8, 60 sec: 12561.1, 300 sec: 12801.7). Total num frames: 6479872. Throughput: 0: 6334.2, 1: 6338.5. Samples: 6474748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:23,503][130331] Avg episode reward: [(0, '184.625'), (1, '46.216')]
[2023-09-21 10:40:28,412][00302] Updated weights for policy 1, policy_version 6400 (0.0015)
[2023-09-21 10:40:28,412][131067] Updated weights for policy 0, policy_version 6400 (0.0011)
[2023-09-21 10:40:28,503][130331] Fps is (10 sec: 13926.6, 60 sec: 12834.1, 300 sec: 12829.5). Total num frames: 6553600. Throughput: 0: 6372.7, 1: 6365.0. Samples: 6534352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:28,503][130331] Avg episode reward: [(0, '190.243'), (1, '46.077')]
[2023-09-21 10:40:28,504][130980] Saving new best policy, reward=190.243!
[2023-09-21 10:40:33,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12697.6, 300 sec: 12801.7). Total num frames: 6610944. Throughput: 0: 6404.7, 1: 6392.9. Samples: 6594124. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:33,504][130331] Avg episode reward: [(0, '198.449'), (1, '45.107')]
[2023-09-21 10:40:33,540][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006464_3309568.pth...
[2023-09-21 10:40:33,546][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006088_3117056.pth
[2023-09-21 10:40:33,547][130980] Saving new best policy, reward=198.449!
[2023-09-21 10:40:33,547][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006464_3309568.pth...
[2023-09-21 10:40:33,550][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006088_3117056.pth
[2023-09-21 10:40:34,758][131067] Updated weights for policy 0, policy_version 6480 (0.0013)
[2023-09-21 10:40:34,759][00302] Updated weights for policy 1, policy_version 6480 (0.0013)
[2023-09-21 10:40:38,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12697.6, 300 sec: 12801.7). Total num frames: 6676480. Throughput: 0: 6425.8, 1: 6420.6. Samples: 6671284. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:40:38,503][130331] Avg episode reward: [(0, '209.565'), (1, '45.116')]
[2023-09-21 10:40:38,542][130980] Saving new best policy, reward=209.565!
[2023-09-21 10:40:40,942][00302] Updated weights for policy 1, policy_version 6560 (0.0012)
[2023-09-21 10:40:40,943][131067] Updated weights for policy 0, policy_version 6560 (0.0013)
[2023-09-21 10:40:43,503][130331] Fps is (10 sec: 13926.7, 60 sec: 12834.2, 300 sec: 12829.5). Total num frames: 6750208. Throughput: 0: 6452.0, 1: 6442.0. Samples: 6731850. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:40:43,503][130331] Avg episode reward: [(0, '215.069'), (1, '46.472')]
[2023-09-21 10:40:43,504][130980] Saving new best policy, reward=215.069!
[2023-09-21 10:40:43,504][130981] Saving new best policy, reward=46.472!
[2023-09-21 10:40:47,213][131067] Updated weights for policy 0, policy_version 6640 (0.0013)
[2023-09-21 10:40:47,213][00302] Updated weights for policy 1, policy_version 6640 (0.0011)
[2023-09-21 10:40:48,502][130331] Fps is (10 sec: 13107.2, 60 sec: 12834.1, 300 sec: 12829.5). Total num frames: 6807552. Throughput: 0: 6450.0, 1: 6442.5. Samples: 6790632. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:48,503][130331] Avg episode reward: [(0, '218.030'), (1, '48.768')]
[2023-09-21 10:40:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006648_3403776.pth...
[2023-09-21 10:40:48,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006648_3403776.pth...
[2023-09-21 10:40:48,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006264_3207168.pth
[2023-09-21 10:40:48,512][130980] Saving new best policy, reward=218.030!
[2023-09-21 10:40:48,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006264_3207168.pth
[2023-09-21 10:40:48,515][130981] Saving new best policy, reward=48.768!
[2023-09-21 10:40:53,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12834.2, 300 sec: 12829.5). Total num frames: 6873088. Throughput: 0: 6458.1, 1: 6456.0. Samples: 6866728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:53,503][130331] Avg episode reward: [(0, '217.652'), (1, '49.085')]
[2023-09-21 10:40:53,504][130981] Saving new best policy, reward=49.085!
[2023-09-21 10:40:53,698][131067] Updated weights for policy 0, policy_version 6720 (0.0016)
[2023-09-21 10:40:53,698][00302] Updated weights for policy 1, policy_version 6720 (0.0016)
[2023-09-21 10:40:58,502][130331] Fps is (10 sec: 13107.2, 60 sec: 12834.1, 300 sec: 12829.5). Total num frames: 6938624. Throughput: 0: 6480.4, 1: 6462.5. Samples: 6925338. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:40:58,503][130331] Avg episode reward: [(0, '214.333'), (1, '49.824')]
[2023-09-21 10:40:58,504][130981] Saving new best policy, reward=49.824!
[2023-09-21 10:40:59,823][131067] Updated weights for policy 0, policy_version 6800 (0.0013)
[2023-09-21 10:40:59,823][00302] Updated weights for policy 1, policy_version 6800 (0.0014)
[2023-09-21 10:41:03,503][130331] Fps is (10 sec: 13926.3, 60 sec: 12970.7, 300 sec: 12857.3). Total num frames: 7012352. Throughput: 0: 6557.5, 1: 6553.6. Samples: 6987432. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:03,503][130331] Avg episode reward: [(0, '213.315'), (1, '49.968')]
[2023-09-21 10:41:03,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006848_3506176.pth...
[2023-09-21 10:41:03,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006848_3506176.pth...
[2023-09-21 10:41:03,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006464_3309568.pth
[2023-09-21 10:41:03,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006464_3309568.pth
[2023-09-21 10:41:03,518][130981] Saving new best policy, reward=49.968!
[2023-09-21 10:41:05,984][131067] Updated weights for policy 0, policy_version 6880 (0.0013)
[2023-09-21 10:41:05,985][00302] Updated weights for policy 1, policy_version 6880 (0.0014)
[2023-09-21 10:41:08,503][130331] Fps is (10 sec: 13925.9, 60 sec: 13107.2, 300 sec: 12857.3). Total num frames: 7077888. Throughput: 0: 6584.5, 1: 6583.2. Samples: 7067300. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:08,504][130331] Avg episode reward: [(0, '219.236'), (1, '53.302')]
[2023-09-21 10:41:08,506][130980] Saving new best policy, reward=219.236!
[2023-09-21 10:41:08,506][130981] Saving new best policy, reward=53.302!
[2023-09-21 10:41:12,308][131067] Updated weights for policy 0, policy_version 6960 (0.0011)
[2023-09-21 10:41:12,308][00302] Updated weights for policy 1, policy_version 6960 (0.0015)
[2023-09-21 10:41:13,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 12857.3). Total num frames: 7143424. Throughput: 0: 6551.8, 1: 6553.1. Samples: 7124072. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:41:13,504][130331] Avg episode reward: [(0, '226.392'), (1, '57.289')]
[2023-09-21 10:41:13,505][130980] Saving new best policy, reward=226.392!
[2023-09-21 10:41:13,505][130981] Saving new best policy, reward=57.289!
[2023-09-21 10:41:18,482][00302] Updated weights for policy 1, policy_version 7040 (0.0015)
[2023-09-21 10:41:18,482][131067] Updated weights for policy 0, policy_version 7040 (0.0012)
[2023-09-21 10:41:18,505][130331] Fps is (10 sec: 13103.8, 60 sec: 13243.1, 300 sec: 12857.1). Total num frames: 7208960. Throughput: 0: 6530.8, 1: 6545.7. Samples: 7182600. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:18,506][130331] Avg episode reward: [(0, '227.091'), (1, '59.805')]
[2023-09-21 10:41:18,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007040_3604480.pth...
[2023-09-21 10:41:18,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007040_3604480.pth...
[2023-09-21 10:41:18,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006648_3403776.pth
[2023-09-21 10:41:18,519][130981] Saving new best policy, reward=59.805!
[2023-09-21 10:41:18,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006648_3403776.pth
[2023-09-21 10:41:18,520][130980] Saving new best policy, reward=227.091!
[2023-09-21 10:41:23,502][130331] Fps is (10 sec: 12288.2, 60 sec: 13107.2, 300 sec: 12829.5). Total num frames: 7266304. Throughput: 0: 6557.7, 1: 6564.7. Samples: 7261792. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:41:23,503][130331] Avg episode reward: [(0, '219.895'), (1, '58.582')]
[2023-09-21 10:41:24,727][131067] Updated weights for policy 0, policy_version 7120 (0.0014)
[2023-09-21 10:41:24,727][00302] Updated weights for policy 1, policy_version 7120 (0.0011)
[2023-09-21 10:41:28,503][130331] Fps is (10 sec: 13111.0, 60 sec: 13107.2, 300 sec: 12857.3). Total num frames: 7340032. Throughput: 0: 6557.8, 1: 6555.7. Samples: 7321958. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:41:28,503][130331] Avg episode reward: [(0, '212.218'), (1, '53.964')]
[2023-09-21 10:41:30,780][131067] Updated weights for policy 0, policy_version 7200 (0.0013)
[2023-09-21 10:41:30,780][00302] Updated weights for policy 1, policy_version 7200 (0.0015)
[2023-09-21 10:41:33,503][130331] Fps is (10 sec: 13926.0, 60 sec: 13243.7, 300 sec: 12885.0). Total num frames: 7405568. Throughput: 0: 6571.3, 1: 6580.7. Samples: 7382474. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:41:33,504][130331] Avg episode reward: [(0, '199.833'), (1, '50.189')]
[2023-09-21 10:41:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007232_3702784.pth...
[2023-09-21 10:41:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007232_3702784.pth...
[2023-09-21 10:41:33,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000006848_3506176.pth
[2023-09-21 10:41:33,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000006848_3506176.pth
[2023-09-21 10:41:36,677][131067] Updated weights for policy 0, policy_version 7280 (0.0014)
[2023-09-21 10:41:36,677][00302] Updated weights for policy 1, policy_version 7280 (0.0013)
[2023-09-21 10:41:38,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13243.7, 300 sec: 12885.0). Total num frames: 7471104. Throughput: 0: 6653.0, 1: 6659.9. Samples: 7465810. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:38,504][130331] Avg episode reward: [(0, '189.788'), (1, '48.886')]
[2023-09-21 10:41:42,838][131067] Updated weights for policy 0, policy_version 7360 (0.0015)
[2023-09-21 10:41:42,839][00302] Updated weights for policy 1, policy_version 7360 (0.0013)
[2023-09-21 10:41:43,503][130331] Fps is (10 sec: 13926.5, 60 sec: 13243.7, 300 sec: 12912.8). Total num frames: 7544832. Throughput: 0: 6670.1, 1: 6678.1. Samples: 7526010. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:41:43,504][130331] Avg episode reward: [(0, '176.734'), (1, '48.754')]
[2023-09-21 10:41:48,503][130331] Fps is (10 sec: 13926.3, 60 sec: 13380.2, 300 sec: 12940.6). Total num frames: 7610368. Throughput: 0: 6653.7, 1: 6664.0. Samples: 7586728. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:48,504][130331] Avg episode reward: [(0, '170.621'), (1, '50.557')]
[2023-09-21 10:41:48,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007432_3805184.pth...
[2023-09-21 10:41:48,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007432_3805184.pth...
[2023-09-21 10:41:48,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007040_3604480.pth
[2023-09-21 10:41:48,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007040_3604480.pth
[2023-09-21 10:41:48,886][00302] Updated weights for policy 1, policy_version 7440 (0.0015)
[2023-09-21 10:41:48,886][131067] Updated weights for policy 0, policy_version 7440 (0.0015)
[2023-09-21 10:41:53,502][130331] Fps is (10 sec: 13107.5, 60 sec: 13380.3, 300 sec: 12968.4). Total num frames: 7675904. Throughput: 0: 6664.0, 1: 6658.6. Samples: 7666810. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:53,503][130331] Avg episode reward: [(0, '166.810'), (1, '52.041')]
[2023-09-21 10:41:54,914][131067] Updated weights for policy 0, policy_version 7520 (0.0014)
[2023-09-21 10:41:54,914][00302] Updated weights for policy 1, policy_version 7520 (0.0016)
[2023-09-21 10:41:58,503][130331] Fps is (10 sec: 13106.8, 60 sec: 13380.1, 300 sec: 12968.3). Total num frames: 7741440. Throughput: 0: 6720.0, 1: 6711.9. Samples: 7728514. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:41:58,504][130331] Avg episode reward: [(0, '165.259'), (1, '51.943')]
[2023-09-21 10:42:01,247][131067] Updated weights for policy 0, policy_version 7600 (0.0015)
[2023-09-21 10:42:01,247][00302] Updated weights for policy 1, policy_version 7600 (0.0015)
[2023-09-21 10:42:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 12940.6). Total num frames: 7806976. Throughput: 0: 6705.8, 1: 6707.3. Samples: 7786154. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:03,503][130331] Avg episode reward: [(0, '165.270'), (1, '50.171')]
[2023-09-21 10:42:03,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007624_3903488.pth...
[2023-09-21 10:42:03,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007624_3903488.pth...
[2023-09-21 10:42:03,513][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007232_3702784.pth
[2023-09-21 10:42:03,513][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007232_3702784.pth
[2023-09-21 10:42:07,159][131067] Updated weights for policy 0, policy_version 7680 (0.0013)
[2023-09-21 10:42:07,159][00302] Updated weights for policy 1, policy_version 7680 (0.0014)
[2023-09-21 10:42:08,503][130331] Fps is (10 sec: 13927.1, 60 sec: 13380.3, 300 sec: 12996.1). Total num frames: 7880704. Throughput: 0: 6760.2, 1: 6760.5. Samples: 7870226. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:08,503][130331] Avg episode reward: [(0, '163.975'), (1, '49.621')]
[2023-09-21 10:42:13,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 12940.9). Total num frames: 7938048. Throughput: 0: 6702.1, 1: 6741.4. Samples: 7926916. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:42:13,504][130331] Avg episode reward: [(0, '160.226'), (1, '49.729')]
[2023-09-21 10:42:13,640][00302] Updated weights for policy 1, policy_version 7760 (0.0012)
[2023-09-21 10:42:13,641][131067] Updated weights for policy 0, policy_version 7760 (0.0012)
[2023-09-21 10:42:18,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13380.9, 300 sec: 12996.1). Total num frames: 8011776. Throughput: 0: 6718.0, 1: 6709.3. Samples: 7986700. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:42:18,503][130331] Avg episode reward: [(0, '157.609'), (1, '53.903')]
[2023-09-21 10:42:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007824_4005888.pth...
[2023-09-21 10:42:18,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007824_4005888.pth...
[2023-09-21 10:42:18,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007432_3805184.pth
[2023-09-21 10:42:18,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007432_3805184.pth
[2023-09-21 10:42:19,785][00302] Updated weights for policy 1, policy_version 7840 (0.0009)
[2023-09-21 10:42:19,785][131067] Updated weights for policy 0, policy_version 7840 (0.0015)
[2023-09-21 10:42:23,503][130331] Fps is (10 sec: 13926.4, 60 sec: 13516.8, 300 sec: 12968.4). Total num frames: 8077312. Throughput: 0: 6662.0, 1: 6661.7. Samples: 8065376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:23,504][130331] Avg episode reward: [(0, '154.830'), (1, '55.909')]
[2023-09-21 10:42:26,034][00302] Updated weights for policy 1, policy_version 7920 (0.0012)
[2023-09-21 10:42:26,034][131067] Updated weights for policy 0, policy_version 7920 (0.0015)
[2023-09-21 10:42:28,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13380.3, 300 sec: 12968.4). Total num frames: 8142848. Throughput: 0: 6642.8, 1: 6640.9. Samples: 8123772. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:28,503][130331] Avg episode reward: [(0, '155.023'), (1, '58.992')]
[2023-09-21 10:42:32,163][131067] Updated weights for policy 0, policy_version 8000 (0.0015)
[2023-09-21 10:42:32,164][00302] Updated weights for policy 1, policy_version 8000 (0.0013)
[2023-09-21 10:42:33,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13380.3, 300 sec: 12996.1). Total num frames: 8208384. Throughput: 0: 6642.5, 1: 6627.8. Samples: 8183890. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:42:33,503][130331] Avg episode reward: [(0, '150.660'), (1, '55.725')]
[2023-09-21 10:42:33,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008016_4104192.pth...
[2023-09-21 10:42:33,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008016_4104192.pth...
[2023-09-21 10:42:33,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007624_3903488.pth
[2023-09-21 10:42:33,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007624_3903488.pth
[2023-09-21 10:42:38,387][00302] Updated weights for policy 1, policy_version 8080 (0.0012)
[2023-09-21 10:42:38,387][131067] Updated weights for policy 0, policy_version 8080 (0.0016)
[2023-09-21 10:42:38,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13380.3, 300 sec: 12996.1). Total num frames: 8273920. Throughput: 0: 6638.2, 1: 6638.7. Samples: 8264272. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:38,503][130331] Avg episode reward: [(0, '147.741'), (1, '56.279')]
[2023-09-21 10:42:43,503][130331] Fps is (10 sec: 12288.0, 60 sec: 13107.2, 300 sec: 12968.3). Total num frames: 8331264. Throughput: 0: 6541.0, 1: 6578.6. Samples: 8318890. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:43,504][130331] Avg episode reward: [(0, '142.258'), (1, '53.751')]
[2023-09-21 10:42:45,049][131067] Updated weights for policy 0, policy_version 8160 (0.0011)
[2023-09-21 10:42:45,050][00302] Updated weights for policy 1, policy_version 8160 (0.0014)
[2023-09-21 10:42:48,503][130331] Fps is (10 sec: 12287.9, 60 sec: 13107.2, 300 sec: 12940.6). Total num frames: 8396800. Throughput: 0: 6533.3, 1: 6526.1. Samples: 8373830. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:48,503][130331] Avg episode reward: [(0, '135.504'), (1, '54.095')]
[2023-09-21 10:42:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008200_4198400.pth...
[2023-09-21 10:42:48,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008200_4198400.pth...
[2023-09-21 10:42:48,513][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000007824_4005888.pth
[2023-09-21 10:42:48,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000007824_4005888.pth
[2023-09-21 10:42:51,580][131067] Updated weights for policy 0, policy_version 8240 (0.0009)
[2023-09-21 10:42:51,581][00302] Updated weights for policy 1, policy_version 8240 (0.0015)
[2023-09-21 10:42:53,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.1, 300 sec: 12940.6). Total num frames: 8462336. Throughput: 0: 6445.4, 1: 6445.2. Samples: 8450306. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:53,504][130331] Avg episode reward: [(0, '126.985'), (1, '52.539')]
[2023-09-21 10:42:57,566][131067] Updated weights for policy 0, policy_version 8320 (0.0012)
[2023-09-21 10:42:57,566][00302] Updated weights for policy 1, policy_version 8320 (0.0014)
[2023-09-21 10:42:58,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 12968.4). Total num frames: 8527872. Throughput: 0: 6536.3, 1: 6482.8. Samples: 8512776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:42:58,503][130331] Avg episode reward: [(0, '120.408'), (1, '52.845')]
[2023-09-21 10:43:03,503][130331] Fps is (10 sec: 13106.8, 60 sec: 13107.1, 300 sec: 12968.3). Total num frames: 8593408. Throughput: 0: 6474.4, 1: 6473.8. Samples: 8569376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:03,504][130331] Avg episode reward: [(0, '117.518'), (1, '53.364')]
[2023-09-21 10:43:03,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008392_4296704.pth...
[2023-09-21 10:43:03,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008392_4296704.pth...
[2023-09-21 10:43:03,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008016_4104192.pth
[2023-09-21 10:43:03,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008016_4104192.pth
[2023-09-21 10:43:03,984][131067] Updated weights for policy 0, policy_version 8400 (0.0013)
[2023-09-21 10:43:03,984][00302] Updated weights for policy 1, policy_version 8400 (0.0014)
[2023-09-21 10:43:08,502][130331] Fps is (10 sec: 13107.4, 60 sec: 12970.7, 300 sec: 12968.4). Total num frames: 8658944. Throughput: 0: 6473.1, 1: 6471.5. Samples: 8647878. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:08,503][130331] Avg episode reward: [(0, '117.071'), (1, '51.884')]
[2023-09-21 10:43:10,417][131067] Updated weights for policy 0, policy_version 8480 (0.0012)
[2023-09-21 10:43:10,417][00302] Updated weights for policy 1, policy_version 8480 (0.0014)
[2023-09-21 10:43:13,503][130331] Fps is (10 sec: 12288.4, 60 sec: 12970.7, 300 sec: 12940.6). Total num frames: 8716288. Throughput: 0: 6456.3, 1: 6453.7. Samples: 8704724. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:43:13,504][130331] Avg episode reward: [(0, '118.778'), (1, '51.851')]
[2023-09-21 10:43:16,614][00302] Updated weights for policy 1, policy_version 8560 (0.0014)
[2023-09-21 10:43:16,615][131067] Updated weights for policy 0, policy_version 8560 (0.0014)
[2023-09-21 10:43:18,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12970.7, 300 sec: 12968.4). Total num frames: 8790016. Throughput: 0: 6451.9, 1: 6459.4. Samples: 8764898. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:43:18,503][130331] Avg episode reward: [(0, '118.466'), (1, '50.661')]
[2023-09-21 10:43:18,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008584_4395008.pth...
[2023-09-21 10:43:18,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008584_4395008.pth...
[2023-09-21 10:43:18,512][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008200_4198400.pth
[2023-09-21 10:43:18,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008200_4198400.pth
[2023-09-21 10:43:22,785][131067] Updated weights for policy 0, policy_version 8640 (0.0013)
[2023-09-21 10:43:22,786][00302] Updated weights for policy 1, policy_version 8640 (0.0013)
[2023-09-21 10:43:23,503][130331] Fps is (10 sec: 13926.4, 60 sec: 12970.7, 300 sec: 12996.1). Total num frames: 8855552. Throughput: 0: 6446.2, 1: 6450.3. Samples: 8844612. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:43:23,504][130331] Avg episode reward: [(0, '120.305'), (1, '49.921')]
[2023-09-21 10:43:28,503][130331] Fps is (10 sec: 12287.9, 60 sec: 12834.1, 300 sec: 12968.4). Total num frames: 8912896. Throughput: 0: 6490.3, 1: 6460.7. Samples: 8901684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:28,504][130331] Avg episode reward: [(0, '120.832'), (1, '49.322')]
[2023-09-21 10:43:29,169][131067] Updated weights for policy 0, policy_version 8720 (0.0012)
[2023-09-21 10:43:29,170][00302] Updated weights for policy 1, policy_version 8720 (0.0013)
[2023-09-21 10:43:33,503][130331] Fps is (10 sec: 12288.2, 60 sec: 12834.2, 300 sec: 12968.4). Total num frames: 8978432. Throughput: 0: 6533.7, 1: 6526.3. Samples: 8961530. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:33,503][130331] Avg episode reward: [(0, '122.236'), (1, '47.269')]
[2023-09-21 10:43:33,554][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008776_4493312.pth...
[2023-09-21 10:43:33,557][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008392_4296704.pth
[2023-09-21 10:43:33,563][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008776_4493312.pth...
[2023-09-21 10:43:33,567][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008392_4296704.pth
[2023-09-21 10:43:35,367][00302] Updated weights for policy 1, policy_version 8800 (0.0011)
[2023-09-21 10:43:35,368][131067] Updated weights for policy 0, policy_version 8800 (0.0010)
[2023-09-21 10:43:38,502][130331] Fps is (10 sec: 13107.5, 60 sec: 12834.2, 300 sec: 12996.1). Total num frames: 9043968. Throughput: 0: 6543.1, 1: 6536.6. Samples: 9038890. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:43:38,503][130331] Avg episode reward: [(0, '123.050'), (1, '47.353')]
[2023-09-21 10:43:41,778][131067] Updated weights for policy 0, policy_version 8880 (0.0014)
[2023-09-21 10:43:41,779][00302] Updated weights for policy 1, policy_version 8880 (0.0015)
[2023-09-21 10:43:43,502][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.7, 300 sec: 12996.1). Total num frames: 9109504. Throughput: 0: 6486.8, 1: 6495.6. Samples: 9096982. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:43:43,503][130331] Avg episode reward: [(0, '123.362'), (1, '44.298')]
[2023-09-21 10:43:47,760][131067] Updated weights for policy 0, policy_version 8960 (0.0011)
[2023-09-21 10:43:47,760][00302] Updated weights for policy 1, policy_version 8960 (0.0014)
[2023-09-21 10:43:48,503][130331] Fps is (10 sec: 13925.9, 60 sec: 13107.2, 300 sec: 13023.9). Total num frames: 9183232. Throughput: 0: 6547.1, 1: 6541.6. Samples: 9158366. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:43:48,504][130331] Avg episode reward: [(0, '122.658'), (1, '43.700')]
[2023-09-21 10:43:48,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008968_4591616.pth...
[2023-09-21 10:43:48,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008968_4591616.pth...
[2023-09-21 10:43:48,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008584_4395008.pth
[2023-09-21 10:43:48,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008584_4395008.pth
[2023-09-21 10:43:53,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13107.2, 300 sec: 13023.9). Total num frames: 9248768. Throughput: 0: 6584.8, 1: 6572.8. Samples: 9239972. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:53,504][130331] Avg episode reward: [(0, '122.505'), (1, '41.776')]
[2023-09-21 10:43:53,782][131067] Updated weights for policy 0, policy_version 9040 (0.0015)
[2023-09-21 10:43:53,782][00302] Updated weights for policy 1, policy_version 9040 (0.0015)
[2023-09-21 10:43:58,503][130331] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13023.9). Total num frames: 9314304. Throughput: 0: 6599.0, 1: 6606.3. Samples: 9298962. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:43:58,503][130331] Avg episode reward: [(0, '125.072'), (1, '41.473')]
[2023-09-21 10:44:00,021][00302] Updated weights for policy 1, policy_version 9120 (0.0013)
[2023-09-21 10:44:00,022][131067] Updated weights for policy 0, policy_version 9120 (0.0014)
[2023-09-21 10:44:03,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.3, 300 sec: 13023.9). Total num frames: 9379840. Throughput: 0: 6581.4, 1: 6595.3. Samples: 9357850. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:03,504][130331] Avg episode reward: [(0, '125.590'), (1, '39.113')]
[2023-09-21 10:44:03,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009160_4689920.pth...
[2023-09-21 10:44:03,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009160_4689920.pth...
[2023-09-21 10:44:03,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008776_4493312.pth
[2023-09-21 10:44:03,524][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008776_4493312.pth
[2023-09-21 10:44:06,218][131067] Updated weights for policy 0, policy_version 9200 (0.0015)
[2023-09-21 10:44:06,218][00302] Updated weights for policy 1, policy_version 9200 (0.0014)
[2023-09-21 10:44:08,502][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13051.7). Total num frames: 9445376. Throughput: 0: 6603.3, 1: 6598.8. Samples: 9438702. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:08,503][130331] Avg episode reward: [(0, '122.564'), (1, '36.388')]
[2023-09-21 10:44:12,515][131067] Updated weights for policy 0, policy_version 9280 (0.0016)
[2023-09-21 10:44:12,515][00302] Updated weights for policy 1, policy_version 9280 (0.0014)
[2023-09-21 10:44:13,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13051.7). Total num frames: 9510912. Throughput: 0: 6597.3, 1: 6617.6. Samples: 9496354. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:13,504][130331] Avg episode reward: [(0, '116.843'), (1, '35.998')]
[2023-09-21 10:44:18,502][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13051.7). Total num frames: 9576448. Throughput: 0: 6570.4, 1: 6582.7. Samples: 9553418. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:18,503][130331] Avg episode reward: [(0, '112.657'), (1, '35.793')]
[2023-09-21 10:44:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009352_4788224.pth...
[2023-09-21 10:44:18,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009352_4788224.pth...
[2023-09-21 10:44:18,514][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000008968_4591616.pth
[2023-09-21 10:44:18,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000008968_4591616.pth
[2023-09-21 10:44:18,804][00302] Updated weights for policy 1, policy_version 9360 (0.0013)
[2023-09-21 10:44:18,805][131067] Updated weights for policy 0, policy_version 9360 (0.0015)
[2023-09-21 10:44:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 9641984. Throughput: 0: 6619.9, 1: 6606.0. Samples: 9634058. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:23,504][130331] Avg episode reward: [(0, '111.900'), (1, '34.924')]
[2023-09-21 10:44:25,001][131067] Updated weights for policy 0, policy_version 9440 (0.0014)
[2023-09-21 10:44:25,003][00302] Updated weights for policy 1, policy_version 9440 (0.0015)
[2023-09-21 10:44:28,503][130331] Fps is (10 sec: 13106.7, 60 sec: 13243.7, 300 sec: 13079.4). Total num frames: 9707520. Throughput: 0: 6627.9, 1: 6625.2. Samples: 9693376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:28,504][130331] Avg episode reward: [(0, '109.391'), (1, '33.738')]
[2023-09-21 10:44:31,002][131067] Updated weights for policy 0, policy_version 9520 (0.0013)
[2023-09-21 10:44:31,002][00302] Updated weights for policy 1, policy_version 9520 (0.0011)
[2023-09-21 10:44:33,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13079.4). Total num frames: 9773056. Throughput: 0: 6639.3, 1: 6644.6. Samples: 9756142. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:44:33,504][130331] Avg episode reward: [(0, '105.398'), (1, '35.598')]
[2023-09-21 10:44:33,534][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009552_4890624.pth...
[2023-09-21 10:44:33,537][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009160_4689920.pth
[2023-09-21 10:44:33,546][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009552_4890624.pth...
[2023-09-21 10:44:33,551][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009160_4689920.pth
[2023-09-21 10:44:37,310][131067] Updated weights for policy 0, policy_version 9600 (0.0009)
[2023-09-21 10:44:37,312][00302] Updated weights for policy 1, policy_version 9600 (0.0016)
[2023-09-21 10:44:38,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13243.7, 300 sec: 13079.4). Total num frames: 9838592. Throughput: 0: 6582.7, 1: 6590.0. Samples: 9832748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:38,504][130331] Avg episode reward: [(0, '98.672'), (1, '40.431')]
[2023-09-21 10:44:43,502][130331] Fps is (10 sec: 13107.6, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 9904128. Throughput: 0: 6594.7, 1: 6592.1. Samples: 9892366. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:43,503][130331] Avg episode reward: [(0, '81.929'), (1, '41.689')]
[2023-09-21 10:44:43,567][131067] Updated weights for policy 0, policy_version 9680 (0.0013)
[2023-09-21 10:44:43,568][00302] Updated weights for policy 1, policy_version 9680 (0.0016)
[2023-09-21 10:44:48,503][130331] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13107.2). Total num frames: 9969664. Throughput: 0: 6601.2, 1: 6599.6. Samples: 9951882. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:44:48,503][130331] Avg episode reward: [(0, '67.784'), (1, '41.911')]
[2023-09-21 10:44:48,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009736_4984832.pth...
[2023-09-21 10:44:48,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009736_4984832.pth...
[2023-09-21 10:44:48,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009352_4788224.pth
[2023-09-21 10:44:48,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009352_4788224.pth
[2023-09-21 10:44:49,799][131067] Updated weights for policy 0, policy_version 9760 (0.0014)
[2023-09-21 10:44:49,799][00302] Updated weights for policy 1, policy_version 9760 (0.0015)
[2023-09-21 10:44:53,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 10043392. Throughput: 0: 6584.8, 1: 6590.8. Samples: 10031604. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:53,504][130331] Avg episode reward: [(0, '59.344'), (1, '40.063')]
[2023-09-21 10:44:55,958][131067] Updated weights for policy 0, policy_version 9840 (0.0013)
[2023-09-21 10:44:55,958][00302] Updated weights for policy 1, policy_version 9840 (0.0015)
[2023-09-21 10:44:58,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 10108928. Throughput: 0: 6633.0, 1: 6589.7. Samples: 10091376. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:44:58,504][130331] Avg episode reward: [(0, '56.387'), (1, '41.775')]
[2023-09-21 10:45:01,960][131067] Updated weights for policy 0, policy_version 9920 (0.0014)
[2023-09-21 10:45:01,960][00302] Updated weights for policy 1, policy_version 9920 (0.0014)
[2023-09-21 10:45:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13162.7). Total num frames: 10174464. Throughput: 0: 6649.5, 1: 6646.0. Samples: 10151720. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:45:03,504][130331] Avg episode reward: [(0, '55.104'), (1, '44.001')]
[2023-09-21 10:45:03,514][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009936_5087232.pth...
[2023-09-21 10:45:03,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009936_5087232.pth...
[2023-09-21 10:45:03,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009552_4890624.pth
[2023-09-21 10:45:03,522][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009552_4890624.pth
[2023-09-21 10:45:07,972][00302] Updated weights for policy 1, policy_version 10000 (0.0014)
[2023-09-21 10:45:07,972][131067] Updated weights for policy 0, policy_version 10000 (0.0014)
[2023-09-21 10:45:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13162.7). Total num frames: 10240000. Throughput: 0: 6650.7, 1: 6672.0. Samples: 10233580. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:45:08,504][130331] Avg episode reward: [(0, '52.689'), (1, '47.059')]
[2023-09-21 10:45:13,503][130331] Fps is (10 sec: 13926.6, 60 sec: 13380.3, 300 sec: 13218.3). Total num frames: 10313728. Throughput: 0: 6677.0, 1: 6666.8. Samples: 10293846. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:45:13,504][130331] Avg episode reward: [(0, '54.000'), (1, '46.535')]
[2023-09-21 10:45:14,107][131067] Updated weights for policy 0, policy_version 10080 (0.0012)
[2023-09-21 10:45:14,108][00302] Updated weights for policy 1, policy_version 10080 (0.0013)
[2023-09-21 10:45:18,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13190.5). Total num frames: 10371072. Throughput: 0: 6649.9, 1: 6645.3. Samples: 10354426. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:45:18,504][130331] Avg episode reward: [(0, '57.047'), (1, '44.263')]
[2023-09-21 10:45:18,546][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010136_5189632.pth...
[2023-09-21 10:45:18,549][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009736_4984832.pth
[2023-09-21 10:45:18,561][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010136_5189632.pth...
[2023-09-21 10:45:18,567][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009736_4984832.pth
[2023-09-21 10:45:20,321][131067] Updated weights for policy 0, policy_version 10160 (0.0015)
[2023-09-21 10:45:20,321][00302] Updated weights for policy 1, policy_version 10160 (0.0014)
[2023-09-21 10:45:23,503][130331] Fps is (10 sec: 12287.9, 60 sec: 13243.7, 300 sec: 13162.7). Total num frames: 10436608. Throughput: 0: 6658.9, 1: 6664.2. Samples: 10432288. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:45:23,504][130331] Avg episode reward: [(0, '61.332'), (1, '41.603')]
[2023-09-21 10:45:26,694][131067] Updated weights for policy 0, policy_version 10240 (0.0012)
[2023-09-21 10:45:26,694][00302] Updated weights for policy 1, policy_version 10240 (0.0013)
[2023-09-21 10:45:28,502][130331] Fps is (10 sec: 13107.6, 60 sec: 13243.8, 300 sec: 13190.5). Total num frames: 10502144. Throughput: 0: 6626.3, 1: 6647.8. Samples: 10489702. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:45:28,503][130331] Avg episode reward: [(0, '65.689'), (1, '37.276')]
[2023-09-21 10:45:33,036][131067] Updated weights for policy 0, policy_version 10320 (0.0012)
[2023-09-21 10:45:33,037][00302] Updated weights for policy 1, policy_version 10320 (0.0015)
[2023-09-21 10:45:33,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13243.8, 300 sec: 13190.5). Total num frames: 10567680. Throughput: 0: 6625.2, 1: 6620.8. Samples: 10547952. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:45:33,503][130331] Avg episode reward: [(0, '71.417'), (1, '34.806')]
[2023-09-21 10:45:33,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010320_5283840.pth...
[2023-09-21 10:45:33,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010320_5283840.pth...
[2023-09-21 10:45:33,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000009936_5087232.pth
[2023-09-21 10:45:33,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000009936_5087232.pth
[2023-09-21 10:45:38,502][130331] Fps is (10 sec: 13926.4, 60 sec: 13380.3, 300 sec: 13190.5). Total num frames: 10641408. Throughput: 0: 6651.9, 1: 6652.1. Samples: 10630282. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:45:38,503][130331] Avg episode reward: [(0, '78.226'), (1, '33.408')]
[2023-09-21 10:45:39,017][131067] Updated weights for policy 0, policy_version 10400 (0.0010)
[2023-09-21 10:45:39,018][00302] Updated weights for policy 1, policy_version 10400 (0.0014)
[2023-09-21 10:45:43,502][130331] Fps is (10 sec: 13926.4, 60 sec: 13380.3, 300 sec: 13218.3). Total num frames: 10706944. Throughput: 0: 6624.5, 1: 6666.7. Samples: 10689480. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:45:43,503][130331] Avg episode reward: [(0, '83.894'), (1, '34.739')]
[2023-09-21 10:45:45,182][131067] Updated weights for policy 0, policy_version 10480 (0.0011)
[2023-09-21 10:45:45,184][00302] Updated weights for policy 1, policy_version 10480 (0.0012)
[2023-09-21 10:45:48,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13380.2, 300 sec: 13218.3). Total num frames: 10772480. Throughput: 0: 6642.1, 1: 6642.1. Samples: 10749506. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:45:48,503][130331] Avg episode reward: [(0, '90.409'), (1, '32.352')]
[2023-09-21 10:45:48,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010520_5386240.pth...
[2023-09-21 10:45:48,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010520_5386240.pth...
[2023-09-21 10:45:48,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010136_5189632.pth
[2023-09-21 10:45:48,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010136_5189632.pth
[2023-09-21 10:45:51,511][00302] Updated weights for policy 1, policy_version 10560 (0.0013)
[2023-09-21 10:45:51,511][131067] Updated weights for policy 0, policy_version 10560 (0.0011)
[2023-09-21 10:45:53,503][130331] Fps is (10 sec: 12287.6, 60 sec: 13107.2, 300 sec: 13190.5). Total num frames: 10829824. Throughput: 0: 6582.0, 1: 6574.3. Samples: 10825614. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:45:53,504][130331] Avg episode reward: [(0, '95.210'), (1, '30.396')]
[2023-09-21 10:45:58,131][131067] Updated weights for policy 0, policy_version 10640 (0.0014)
[2023-09-21 10:45:58,132][00302] Updated weights for policy 1, policy_version 10640 (0.0015)
[2023-09-21 10:45:58,503][130331] Fps is (10 sec: 12288.1, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 10895360. Throughput: 0: 6513.0, 1: 6540.9. Samples: 10881270. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:45:58,503][130331] Avg episode reward: [(0, '97.869'), (1, '30.762')]
[2023-09-21 10:46:03,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 10960896. Throughput: 0: 6499.6, 1: 6511.0. Samples: 10939904. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:46:03,504][130331] Avg episode reward: [(0, '98.555'), (1, '34.538')]
[2023-09-21 10:46:03,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010704_5480448.pth...
[2023-09-21 10:46:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010704_5480448.pth...
[2023-09-21 10:46:03,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010320_5283840.pth
[2023-09-21 10:46:03,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010320_5283840.pth
[2023-09-21 10:46:04,518][00302] Updated weights for policy 1, policy_version 10720 (0.0014)
[2023-09-21 10:46:04,519][131067] Updated weights for policy 0, policy_version 10720 (0.0013)
[2023-09-21 10:46:08,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 11018240. Throughput: 0: 6456.4, 1: 6450.9. Samples: 11013116. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:46:08,503][130331] Avg episode reward: [(0, '97.779'), (1, '38.552')]
[2023-09-21 10:46:11,020][131067] Updated weights for policy 0, policy_version 10800 (0.0014)
[2023-09-21 10:46:11,021][00302] Updated weights for policy 1, policy_version 10800 (0.0011)
[2023-09-21 10:46:13,502][130331] Fps is (10 sec: 12288.4, 60 sec: 12834.2, 300 sec: 13135.1). Total num frames: 11083776. Throughput: 0: 6488.8, 1: 6472.8. Samples: 11072974. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:46:13,503][130331] Avg episode reward: [(0, '97.332'), (1, '40.053')]
[2023-09-21 10:46:17,175][00302] Updated weights for policy 1, policy_version 10880 (0.0011)
[2023-09-21 10:46:17,175][131067] Updated weights for policy 0, policy_version 10880 (0.0010)
[2023-09-21 10:46:18,503][130331] Fps is (10 sec: 13926.2, 60 sec: 13107.2, 300 sec: 13190.5). Total num frames: 11157504. Throughput: 0: 6506.2, 1: 6490.4. Samples: 11132800. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:46:18,503][130331] Avg episode reward: [(0, '98.465'), (1, '40.454')]
[2023-09-21 10:46:18,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010896_5578752.pth...
[2023-09-21 10:46:18,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010896_5578752.pth...
[2023-09-21 10:46:18,513][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010520_5386240.pth
[2023-09-21 10:46:18,514][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010520_5386240.pth
[2023-09-21 10:46:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 11214848. Throughput: 0: 6419.1, 1: 6414.0. Samples: 11207776. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:46:23,503][130331] Avg episode reward: [(0, '97.364'), (1, '42.346')]
[2023-09-21 10:46:23,728][131067] Updated weights for policy 0, policy_version 10960 (0.0015)
[2023-09-21 10:46:23,728][00302] Updated weights for policy 1, policy_version 10960 (0.0015)
[2023-09-21 10:46:28,503][130331] Fps is (10 sec: 12288.2, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 11280384. Throughput: 0: 6390.1, 1: 6392.4. Samples: 11264694. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:46:28,503][130331] Avg episode reward: [(0, '97.185'), (1, '44.324')]
[2023-09-21 10:46:30,076][00302] Updated weights for policy 1, policy_version 11040 (0.0013)
[2023-09-21 10:46:30,077][131067] Updated weights for policy 0, policy_version 11040 (0.0013)
[2023-09-21 10:46:33,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.6, 300 sec: 13135.0). Total num frames: 11345920. Throughput: 0: 6381.3, 1: 6388.8. Samples: 11324160. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:46:33,503][130331] Avg episode reward: [(0, '91.706'), (1, '47.068')]
[2023-09-21 10:46:33,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011080_5672960.pth...
[2023-09-21 10:46:33,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011080_5672960.pth...
[2023-09-21 10:46:33,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010704_5480448.pth
[2023-09-21 10:46:33,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010704_5480448.pth
[2023-09-21 10:46:36,176][00302] Updated weights for policy 1, policy_version 11120 (0.0014)
[2023-09-21 10:46:36,176][131067] Updated weights for policy 0, policy_version 11120 (0.0015)
[2023-09-21 10:46:38,502][130331] Fps is (10 sec: 13107.2, 60 sec: 12834.1, 300 sec: 13107.2). Total num frames: 11411456. Throughput: 0: 6427.7, 1: 6416.2. Samples: 11403588. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:46:38,503][130331] Avg episode reward: [(0, '82.096'), (1, '48.171')]
[2023-09-21 10:46:42,539][00302] Updated weights for policy 1, policy_version 11200 (0.0015)
[2023-09-21 10:46:42,540][131067] Updated weights for policy 0, policy_version 11200 (0.0012)
[2023-09-21 10:46:43,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 13107.2). Total num frames: 11476992. Throughput: 0: 6464.4, 1: 6460.6. Samples: 11462898. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:46:43,504][130331] Avg episode reward: [(0, '69.868'), (1, '49.982')]
[2023-09-21 10:46:48,503][130331] Fps is (10 sec: 13106.8, 60 sec: 12834.1, 300 sec: 13107.2). Total num frames: 11542528. Throughput: 0: 6459.9, 1: 6471.2. Samples: 11521804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:46:48,504][130331] Avg episode reward: [(0, '62.598'), (1, '52.613')]
[2023-09-21 10:46:48,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011272_5771264.pth...
[2023-09-21 10:46:48,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011272_5771264.pth...
[2023-09-21 10:46:48,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000010896_5578752.pth
[2023-09-21 10:46:48,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000010896_5578752.pth
[2023-09-21 10:46:48,586][131067] Updated weights for policy 0, policy_version 11280 (0.0015)
[2023-09-21 10:46:48,587][00302] Updated weights for policy 1, policy_version 11280 (0.0017)
[2023-09-21 10:46:53,502][130331] Fps is (10 sec: 13107.6, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 11608064. Throughput: 0: 6573.3, 1: 6577.4. Samples: 11604898. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:46:53,503][130331] Avg episode reward: [(0, '62.063'), (1, '53.927')]
[2023-09-21 10:46:54,740][131067] Updated weights for policy 0, policy_version 11360 (0.0014)
[2023-09-21 10:46:54,740][00302] Updated weights for policy 1, policy_version 11360 (0.0014)
[2023-09-21 10:46:58,503][130331] Fps is (10 sec: 13926.8, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 11681792. Throughput: 0: 6559.6, 1: 6558.8. Samples: 11663304. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:46:58,503][130331] Avg episode reward: [(0, '64.387'), (1, '53.861')]
[2023-09-21 10:47:00,798][00302] Updated weights for policy 1, policy_version 11440 (0.0013)
[2023-09-21 10:47:00,798][131067] Updated weights for policy 0, policy_version 11440 (0.0015)
[2023-09-21 10:47:03,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 11747328. Throughput: 0: 6557.9, 1: 6569.3. Samples: 11723524. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:03,504][130331] Avg episode reward: [(0, '71.011'), (1, '51.072')]
[2023-09-21 10:47:03,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011472_5873664.pth...
[2023-09-21 10:47:03,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011472_5873664.pth...
[2023-09-21 10:47:03,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011080_5672960.pth
[2023-09-21 10:47:03,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011080_5672960.pth
[2023-09-21 10:47:06,985][00302] Updated weights for policy 1, policy_version 11520 (0.0012)
[2023-09-21 10:47:06,986][131067] Updated weights for policy 0, policy_version 11520 (0.0014)
[2023-09-21 10:47:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 11812864. Throughput: 0: 6620.0, 1: 6618.9. Samples: 11803530. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:08,503][130331] Avg episode reward: [(0, '76.750'), (1, '48.653')]
[2023-09-21 10:47:13,214][00302] Updated weights for policy 1, policy_version 11600 (0.0011)
[2023-09-21 10:47:13,214][131067] Updated weights for policy 0, policy_version 11600 (0.0016)
[2023-09-21 10:47:13,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 11878400. Throughput: 0: 6654.3, 1: 6618.7. Samples: 11861984. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:13,504][130331] Avg episode reward: [(0, '81.130'), (1, '45.547')]
[2023-09-21 10:47:18,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 11943936. Throughput: 0: 6629.4, 1: 6628.2. Samples: 11920756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:18,504][130331] Avg episode reward: [(0, '83.330'), (1, '43.470')]
[2023-09-21 10:47:18,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011664_5971968.pth...
[2023-09-21 10:47:18,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011664_5971968.pth...
[2023-09-21 10:47:18,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011272_5771264.pth
[2023-09-21 10:47:18,526][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011272_5771264.pth
[2023-09-21 10:47:19,510][131067] Updated weights for policy 0, policy_version 11680 (0.0014)
[2023-09-21 10:47:19,510][00302] Updated weights for policy 1, policy_version 11680 (0.0013)
[2023-09-21 10:47:23,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 12009472. Throughput: 0: 6645.7, 1: 6650.3. Samples: 12001910. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:23,504][130331] Avg episode reward: [(0, '87.504'), (1, '44.312')]
[2023-09-21 10:47:25,542][00302] Updated weights for policy 1, policy_version 11760 (0.0013)
[2023-09-21 10:47:25,543][131067] Updated weights for policy 0, policy_version 11760 (0.0012)
[2023-09-21 10:47:28,502][130331] Fps is (10 sec: 13107.6, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 12075008. Throughput: 0: 6631.9, 1: 6647.6. Samples: 12060474. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:28,503][130331] Avg episode reward: [(0, '92.664'), (1, '44.424')]
[2023-09-21 10:47:32,028][131067] Updated weights for policy 0, policy_version 11840 (0.0013)
[2023-09-21 10:47:32,028][00302] Updated weights for policy 1, policy_version 11840 (0.0015)
[2023-09-21 10:47:33,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 12140544. Throughput: 0: 6617.6, 1: 6611.8. Samples: 12117130. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:33,504][130331] Avg episode reward: [(0, '98.836'), (1, '43.554')]
[2023-09-21 10:47:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011856_6070272.pth...
[2023-09-21 10:47:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011856_6070272.pth...
[2023-09-21 10:47:33,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011472_5873664.pth
[2023-09-21 10:47:33,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011472_5873664.pth
[2023-09-21 10:47:38,068][131067] Updated weights for policy 0, policy_version 11920 (0.0012)
[2023-09-21 10:47:38,069][00302] Updated weights for policy 1, policy_version 11920 (0.0014)
[2023-09-21 10:47:38,502][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 12206080. Throughput: 0: 6602.4, 1: 6602.8. Samples: 12199128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:38,503][130331] Avg episode reward: [(0, '103.467'), (1, '41.996')]
[2023-09-21 10:47:43,503][130331] Fps is (10 sec: 13107.6, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 12271616. Throughput: 0: 6590.6, 1: 6591.2. Samples: 12256482. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:47:43,503][130331] Avg episode reward: [(0, '108.837'), (1, '41.985')]
[2023-09-21 10:47:44,435][131067] Updated weights for policy 0, policy_version 12000 (0.0013)
[2023-09-21 10:47:44,435][00302] Updated weights for policy 1, policy_version 12000 (0.0014)
[2023-09-21 10:47:48,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 12337152. Throughput: 0: 6585.0, 1: 6594.3. Samples: 12316590. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:48,503][130331] Avg episode reward: [(0, '113.415'), (1, '41.375')]
[2023-09-21 10:47:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012048_6168576.pth...
[2023-09-21 10:47:48,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012048_6168576.pth...
[2023-09-21 10:47:48,513][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011664_5971968.pth
[2023-09-21 10:47:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011664_5971968.pth
[2023-09-21 10:47:50,489][00302] Updated weights for policy 1, policy_version 12080 (0.0010)
[2023-09-21 10:47:50,491][131067] Updated weights for policy 0, policy_version 12080 (0.0012)
[2023-09-21 10:47:53,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 12402688. Throughput: 0: 6580.8, 1: 6585.1. Samples: 12395996. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:53,503][130331] Avg episode reward: [(0, '116.415'), (1, '41.819')]
[2023-09-21 10:47:56,721][00302] Updated weights for policy 1, policy_version 12160 (0.0013)
[2023-09-21 10:47:56,722][131067] Updated weights for policy 0, policy_version 12160 (0.0014)
[2023-09-21 10:47:58,502][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 12468224. Throughput: 0: 6573.1, 1: 6598.8. Samples: 12454716. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:47:58,503][130331] Avg episode reward: [(0, '119.027'), (1, '42.768')]
[2023-09-21 10:48:03,076][131067] Updated weights for policy 0, policy_version 12240 (0.0015)
[2023-09-21 10:48:03,076][00302] Updated weights for policy 1, policy_version 12240 (0.0015)
[2023-09-21 10:48:03,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 12533760. Throughput: 0: 6579.7, 1: 6582.6. Samples: 12513062. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:48:03,504][130331] Avg episode reward: [(0, '116.873'), (1, '46.090')]
[2023-09-21 10:48:03,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012240_6266880.pth...
[2023-09-21 10:48:03,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012240_6266880.pth...
[2023-09-21 10:48:03,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000011856_6070272.pth
[2023-09-21 10:48:03,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000011856_6070272.pth
[2023-09-21 10:48:08,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 12599296. Throughput: 0: 6559.3, 1: 6573.8. Samples: 12592894. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:08,503][130331] Avg episode reward: [(0, '111.116'), (1, '47.594')]
[2023-09-21 10:48:09,208][131067] Updated weights for policy 0, policy_version 12320 (0.0012)
[2023-09-21 10:48:09,209][00302] Updated weights for policy 1, policy_version 12320 (0.0014)
[2023-09-21 10:48:13,503][130331] Fps is (10 sec: 13926.5, 60 sec: 13243.7, 300 sec: 13162.7). Total num frames: 12673024. Throughput: 0: 6605.3, 1: 6574.3. Samples: 12653558. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:13,504][130331] Avg episode reward: [(0, '95.039'), (1, '49.160')]
[2023-09-21 10:48:15,295][131067] Updated weights for policy 0, policy_version 12400 (0.0013)
[2023-09-21 10:48:15,296][00302] Updated weights for policy 1, policy_version 12400 (0.0013)
[2023-09-21 10:48:18,502][130331] Fps is (10 sec: 13926.5, 60 sec: 13243.8, 300 sec: 13162.7). Total num frames: 12738560. Throughput: 0: 6641.4, 1: 6622.8. Samples: 12714014. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:18,503][130331] Avg episode reward: [(0, '81.280'), (1, '50.163')]
[2023-09-21 10:48:18,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012440_6369280.pth...
[2023-09-21 10:48:18,508][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012440_6369280.pth...
[2023-09-21 10:48:18,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012048_6168576.pth
[2023-09-21 10:48:18,512][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012048_6168576.pth
[2023-09-21 10:48:21,661][131067] Updated weights for policy 0, policy_version 12480 (0.0013)
[2023-09-21 10:48:21,661][00302] Updated weights for policy 1, policy_version 12480 (0.0014)
[2023-09-21 10:48:23,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13190.5). Total num frames: 12804096. Throughput: 0: 6578.9, 1: 6580.6. Samples: 12791308. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:23,504][130331] Avg episode reward: [(0, '68.154'), (1, '50.722')]
[2023-09-21 10:48:27,973][00302] Updated weights for policy 1, policy_version 12560 (0.0014)
[2023-09-21 10:48:27,975][131067] Updated weights for policy 0, policy_version 12560 (0.0016)
[2023-09-21 10:48:28,503][130331] Fps is (10 sec: 12288.0, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 12861440. Throughput: 0: 6578.6, 1: 6618.1. Samples: 12850334. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:28,503][130331] Avg episode reward: [(0, '63.905'), (1, '50.862')]
[2023-09-21 10:48:33,503][130331] Fps is (10 sec: 12287.9, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 12926976. Throughput: 0: 6555.5, 1: 6551.3. Samples: 12906396. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:33,504][130331] Avg episode reward: [(0, '66.543'), (1, '52.234')]
[2023-09-21 10:48:33,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012624_6463488.pth...
[2023-09-21 10:48:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012624_6463488.pth...
[2023-09-21 10:48:33,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012240_6266880.pth
[2023-09-21 10:48:33,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012240_6266880.pth
[2023-09-21 10:48:34,381][131067] Updated weights for policy 0, policy_version 12640 (0.0014)
[2023-09-21 10:48:34,381][00302] Updated weights for policy 1, policy_version 12640 (0.0013)
[2023-09-21 10:48:38,502][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 12992512. Throughput: 0: 6535.2, 1: 6524.8. Samples: 12983698. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:48:38,503][130331] Avg episode reward: [(0, '69.224'), (1, '54.505')]
[2023-09-21 10:48:40,936][131067] Updated weights for policy 0, policy_version 12720 (0.0010)
[2023-09-21 10:48:40,937][00302] Updated weights for policy 1, policy_version 12720 (0.0017)
[2023-09-21 10:48:43,503][130331] Fps is (10 sec: 12288.1, 60 sec: 12970.6, 300 sec: 13107.2). Total num frames: 13049856. Throughput: 0: 6484.5, 1: 6505.5. Samples: 13039266. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:48:43,504][130331] Avg episode reward: [(0, '72.046'), (1, '56.539')]
[2023-09-21 10:48:47,432][131067] Updated weights for policy 0, policy_version 12800 (0.0010)
[2023-09-21 10:48:47,432][00302] Updated weights for policy 1, policy_version 12800 (0.0014)
[2023-09-21 10:48:48,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 13115392. Throughput: 0: 6503.3, 1: 6491.1. Samples: 13097804. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:48,503][130331] Avg episode reward: [(0, '78.052'), (1, '58.850')]
[2023-09-21 10:48:48,508][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012808_6557696.pth...
[2023-09-21 10:48:48,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012808_6557696.pth...
[2023-09-21 10:48:48,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012440_6369280.pth
[2023-09-21 10:48:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012440_6369280.pth
[2023-09-21 10:48:53,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12970.6, 300 sec: 13107.2). Total num frames: 13180928. Throughput: 0: 6417.7, 1: 6409.8. Samples: 13170132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:48:53,504][130331] Avg episode reward: [(0, '83.624'), (1, '61.591')]
[2023-09-21 10:48:53,505][130981] Saving new best policy, reward=61.591!
[2023-09-21 10:48:54,057][131067] Updated weights for policy 0, policy_version 12880 (0.0014)
[2023-09-21 10:48:54,057][00302] Updated weights for policy 1, policy_version 12880 (0.0016)
[2023-09-21 10:48:58,503][130331] Fps is (10 sec: 13106.9, 60 sec: 12970.6, 300 sec: 13107.2). Total num frames: 13246464. Throughput: 0: 6377.8, 1: 6413.9. Samples: 13229188. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:48:58,504][130331] Avg episode reward: [(0, '90.779'), (1, '64.483')]
[2023-09-21 10:48:58,506][130981] Saving new best policy, reward=64.483!
[2023-09-21 10:49:00,243][131067] Updated weights for policy 0, policy_version 12960 (0.0012)
[2023-09-21 10:49:00,243][00302] Updated weights for policy 1, policy_version 12960 (0.0013)
[2023-09-21 10:49:03,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 13312000. Throughput: 0: 6360.3, 1: 6369.2. Samples: 13286842. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:49:03,504][130331] Avg episode reward: [(0, '92.775'), (1, '66.374')]
[2023-09-21 10:49:03,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013000_6656000.pth...
[2023-09-21 10:49:03,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013000_6656000.pth...
[2023-09-21 10:49:03,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012624_6463488.pth
[2023-09-21 10:49:03,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012624_6463488.pth
[2023-09-21 10:49:03,522][130981] Saving new best policy, reward=66.374!
[2023-09-21 10:49:06,352][00302] Updated weights for policy 1, policy_version 13040 (0.0013)
[2023-09-21 10:49:06,352][131067] Updated weights for policy 0, policy_version 13040 (0.0014)
[2023-09-21 10:49:08,503][130331] Fps is (10 sec: 13107.5, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 13377536. Throughput: 0: 6426.2, 1: 6410.0. Samples: 13368936. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:08,503][130331] Avg episode reward: [(0, '96.862'), (1, '67.459')]
[2023-09-21 10:49:08,504][130981] Saving new best policy, reward=67.459!
[2023-09-21 10:49:12,520][131067] Updated weights for policy 0, policy_version 13120 (0.0013)
[2023-09-21 10:49:12,520][00302] Updated weights for policy 1, policy_version 13120 (0.0013)
[2023-09-21 10:49:13,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12834.1, 300 sec: 13107.2). Total num frames: 13443072. Throughput: 0: 6456.9, 1: 6408.1. Samples: 13429260. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:13,504][130331] Avg episode reward: [(0, '102.007'), (1, '68.008')]
[2023-09-21 10:49:13,505][130981] Saving new best policy, reward=68.008!
[2023-09-21 10:49:18,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12834.1, 300 sec: 13107.2). Total num frames: 13508608. Throughput: 0: 6492.9, 1: 6494.1. Samples: 13490814. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:18,503][130331] Avg episode reward: [(0, '104.894'), (1, '67.505')]
[2023-09-21 10:49:18,553][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013200_6758400.pth...
[2023-09-21 10:49:18,556][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000012808_6557696.pth
[2023-09-21 10:49:18,559][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013200_6758400.pth...
[2023-09-21 10:49:18,559][00302] Updated weights for policy 1, policy_version 13200 (0.0010)
[2023-09-21 10:49:18,560][131067] Updated weights for policy 0, policy_version 13200 (0.0013)
[2023-09-21 10:49:18,562][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000012808_6557696.pth
[2023-09-21 10:49:23,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12834.2, 300 sec: 13107.2). Total num frames: 13574144. Throughput: 0: 6477.6, 1: 6484.4. Samples: 13566990. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:23,503][130331] Avg episode reward: [(0, '109.774'), (1, '67.561')]
[2023-09-21 10:49:24,724][00302] Updated weights for policy 1, policy_version 13280 (0.0014)
[2023-09-21 10:49:24,725][131067] Updated weights for policy 0, policy_version 13280 (0.0014)
[2023-09-21 10:49:28,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 13639680. Throughput: 0: 6549.3, 1: 6534.8. Samples: 13628048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:28,503][130331] Avg episode reward: [(0, '115.126'), (1, '68.429')]
[2023-09-21 10:49:28,504][130981] Saving new best policy, reward=68.429!
[2023-09-21 10:49:30,928][131067] Updated weights for policy 0, policy_version 13360 (0.0012)
[2023-09-21 10:49:30,928][00302] Updated weights for policy 1, policy_version 13360 (0.0015)
[2023-09-21 10:49:33,503][130331] Fps is (10 sec: 13926.2, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 13713408. Throughput: 0: 6575.0, 1: 6563.8. Samples: 13689048. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:33,503][130331] Avg episode reward: [(0, '123.626'), (1, '69.634')]
[2023-09-21 10:49:33,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013392_6856704.pth...
[2023-09-21 10:49:33,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013392_6856704.pth...
[2023-09-21 10:49:33,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013000_6656000.pth
[2023-09-21 10:49:33,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013000_6656000.pth
[2023-09-21 10:49:33,518][130981] Saving new best policy, reward=69.634!
[2023-09-21 10:49:37,159][131067] Updated weights for policy 0, policy_version 13440 (0.0011)
[2023-09-21 10:49:37,159][00302] Updated weights for policy 1, policy_version 13440 (0.0011)
[2023-09-21 10:49:38,502][130331] Fps is (10 sec: 13926.5, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 13778944. Throughput: 0: 6647.2, 1: 6653.5. Samples: 13768664. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:38,503][130331] Avg episode reward: [(0, '130.419'), (1, '70.218')]
[2023-09-21 10:49:38,504][130981] Saving new best policy, reward=70.218!
[2023-09-21 10:49:43,503][130331] Fps is (10 sec: 12288.1, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 13836288. Throughput: 0: 6636.7, 1: 6612.1. Samples: 13825384. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:43,503][130331] Avg episode reward: [(0, '134.130'), (1, '70.856')]
[2023-09-21 10:49:43,525][130981] Saving new best policy, reward=70.856!
[2023-09-21 10:49:43,533][00302] Updated weights for policy 1, policy_version 13520 (0.0015)
[2023-09-21 10:49:43,534][131067] Updated weights for policy 0, policy_version 13520 (0.0013)
[2023-09-21 10:49:48,503][130331] Fps is (10 sec: 12287.7, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 13901824. Throughput: 0: 6618.2, 1: 6632.6. Samples: 13883128. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:48,503][130331] Avg episode reward: [(0, '138.164'), (1, '71.312')]
[2023-09-21 10:49:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013576_6950912.pth...
[2023-09-21 10:49:48,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013576_6950912.pth...
[2023-09-21 10:49:48,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013200_6758400.pth
[2023-09-21 10:49:48,514][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013200_6758400.pth
[2023-09-21 10:49:48,514][130981] Saving new best policy, reward=71.312!
[2023-09-21 10:49:49,811][131067] Updated weights for policy 0, policy_version 13600 (0.0011)
[2023-09-21 10:49:49,811][00302] Updated weights for policy 1, policy_version 13600 (0.0014)
[2023-09-21 10:49:53,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 13967360. Throughput: 0: 6564.1, 1: 6571.1. Samples: 13960022. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:53,503][130331] Avg episode reward: [(0, '139.133'), (1, '72.290')]
[2023-09-21 10:49:53,504][130981] Saving new best policy, reward=72.290!
[2023-09-21 10:49:56,346][00302] Updated weights for policy 1, policy_version 13680 (0.0015)
[2023-09-21 10:49:56,347][131067] Updated weights for policy 0, policy_version 13680 (0.0014)
[2023-09-21 10:49:58,502][130331] Fps is (10 sec: 13107.6, 60 sec: 13107.3, 300 sec: 13079.4). Total num frames: 14032896. Throughput: 0: 6559.8, 1: 6556.5. Samples: 14019488. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:49:58,503][130331] Avg episode reward: [(0, '136.821'), (1, '72.983')]
[2023-09-21 10:49:58,504][130981] Saving new best policy, reward=72.983!
[2023-09-21 10:50:02,371][131067] Updated weights for policy 0, policy_version 13760 (0.0014)
[2023-09-21 10:50:02,372][00302] Updated weights for policy 1, policy_version 13760 (0.0013)
[2023-09-21 10:50:03,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 14098432. Throughput: 0: 6550.6, 1: 6549.7. Samples: 14080326. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:50:03,503][130331] Avg episode reward: [(0, '129.861'), (1, '73.994')]
[2023-09-21 10:50:03,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013768_7049216.pth...
[2023-09-21 10:50:03,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013768_7049216.pth...
[2023-09-21 10:50:03,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013392_6856704.pth
[2023-09-21 10:50:03,515][130981] Saving new best policy, reward=73.994!
[2023-09-21 10:50:03,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013392_6856704.pth
[2023-09-21 10:50:08,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13051.7). Total num frames: 14163968. Throughput: 0: 6578.6, 1: 6586.7. Samples: 14159426. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:50:08,503][130331] Avg episode reward: [(0, '124.451'), (1, '74.828')]
[2023-09-21 10:50:08,510][130981] Saving new best policy, reward=74.828!
[2023-09-21 10:50:08,513][00302] Updated weights for policy 1, policy_version 13840 (0.0015)
[2023-09-21 10:50:08,514][131067] Updated weights for policy 0, policy_version 13840 (0.0011)
[2023-09-21 10:50:13,503][130331] Fps is (10 sec: 13926.3, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 14237696. Throughput: 0: 6568.4, 1: 6559.4. Samples: 14218802. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:50:13,504][130331] Avg episode reward: [(0, '119.586'), (1, '75.652')]
[2023-09-21 10:50:13,505][130981] Saving new best policy, reward=75.652!
[2023-09-21 10:50:14,652][131067] Updated weights for policy 0, policy_version 13920 (0.0007)
[2023-09-21 10:50:14,652][00302] Updated weights for policy 1, policy_version 13920 (0.0012)
[2023-09-21 10:50:18,503][130331] Fps is (10 sec: 13926.1, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 14303232. Throughput: 0: 6549.9, 1: 6548.2. Samples: 14278466. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:50:18,504][130331] Avg episode reward: [(0, '115.914'), (1, '75.731')]
[2023-09-21 10:50:18,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013968_7151616.pth...
[2023-09-21 10:50:18,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013968_7151616.pth...
[2023-09-21 10:50:18,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013576_6950912.pth
[2023-09-21 10:50:18,520][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013576_6950912.pth
[2023-09-21 10:50:18,520][130981] Saving new best policy, reward=75.731!
[2023-09-21 10:50:20,837][131067] Updated weights for policy 0, policy_version 14000 (0.0013)
[2023-09-21 10:50:20,837][00302] Updated weights for policy 1, policy_version 14000 (0.0012)
[2023-09-21 10:50:23,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 14368768. Throughput: 0: 6569.6, 1: 6562.3. Samples: 14359600. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:50:23,503][130331] Avg episode reward: [(0, '113.046'), (1, '75.113')]
[2023-09-21 10:50:26,961][131067] Updated weights for policy 0, policy_version 14080 (0.0011)
[2023-09-21 10:50:26,962][00302] Updated weights for policy 1, policy_version 14080 (0.0015)
[2023-09-21 10:50:28,502][130331] Fps is (10 sec: 13107.6, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 14434304. Throughput: 0: 6580.6, 1: 6611.6. Samples: 14419030. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:28,503][130331] Avg episode reward: [(0, '111.592'), (1, '73.197')]
[2023-09-21 10:50:33,137][00302] Updated weights for policy 1, policy_version 14160 (0.0014)
[2023-09-21 10:50:33,137][131067] Updated weights for policy 0, policy_version 14160 (0.0014)
[2023-09-21 10:50:33,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 14499840. Throughput: 0: 6594.4, 1: 6591.4. Samples: 14476488. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:33,504][130331] Avg episode reward: [(0, '113.217'), (1, '70.735')]
[2023-09-21 10:50:33,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014160_7249920.pth...
[2023-09-21 10:50:33,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014160_7249920.pth...
[2023-09-21 10:50:33,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013768_7049216.pth
[2023-09-21 10:50:33,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013768_7049216.pth
[2023-09-21 10:50:38,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 14565376. Throughput: 0: 6643.9, 1: 6635.0. Samples: 14557570. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:38,504][130331] Avg episode reward: [(0, '111.017'), (1, '67.020')]
[2023-09-21 10:50:39,432][131067] Updated weights for policy 0, policy_version 14240 (0.0016)
[2023-09-21 10:50:39,433][00302] Updated weights for policy 1, policy_version 14240 (0.0015)
[2023-09-21 10:50:43,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13079.4). Total num frames: 14630912. Throughput: 0: 6608.4, 1: 6641.9. Samples: 14615752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:43,504][130331] Avg episode reward: [(0, '111.692'), (1, '64.918')]
[2023-09-21 10:50:45,734][00302] Updated weights for policy 1, policy_version 14320 (0.0014)
[2023-09-21 10:50:45,736][131067] Updated weights for policy 0, policy_version 14320 (0.0015)
[2023-09-21 10:50:48,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.8, 300 sec: 13107.2). Total num frames: 14696448. Throughput: 0: 6584.3, 1: 6584.4. Samples: 14672918. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:48,503][130331] Avg episode reward: [(0, '116.814'), (1, '61.189')]
[2023-09-21 10:50:48,509][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014352_7348224.pth...
[2023-09-21 10:50:48,509][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014352_7348224.pth...
[2023-09-21 10:50:48,512][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000013968_7151616.pth
[2023-09-21 10:50:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000013968_7151616.pth
[2023-09-21 10:50:52,240][131067] Updated weights for policy 0, policy_version 14400 (0.0014)
[2023-09-21 10:50:52,240][00302] Updated weights for policy 1, policy_version 14400 (0.0014)
[2023-09-21 10:50:53,503][130331] Fps is (10 sec: 12287.9, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 14753792. Throughput: 0: 6559.2, 1: 6556.6. Samples: 14749640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:50:53,504][130331] Avg episode reward: [(0, '126.956'), (1, '61.450')]
[2023-09-21 10:50:58,463][131067] Updated weights for policy 0, policy_version 14480 (0.0011)
[2023-09-21 10:50:58,464][00302] Updated weights for policy 1, policy_version 14480 (0.0014)
[2023-09-21 10:50:58,502][130331] Fps is (10 sec: 13107.4, 60 sec: 13243.7, 300 sec: 13107.2). Total num frames: 14827520. Throughput: 0: 6553.7, 1: 6528.1. Samples: 14807478. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:50:58,503][130331] Avg episode reward: [(0, '135.846'), (1, '57.658')]
[2023-09-21 10:51:03,503][130331] Fps is (10 sec: 13926.6, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 14893056. Throughput: 0: 6555.5, 1: 6556.1. Samples: 14868484. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:51:03,503][130331] Avg episode reward: [(0, '139.002'), (1, '54.447')]
[2023-09-21 10:51:03,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014544_7446528.pth...
[2023-09-21 10:51:03,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014544_7446528.pth...
[2023-09-21 10:51:03,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014160_7249920.pth
[2023-09-21 10:51:03,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014160_7249920.pth
[2023-09-21 10:51:04,592][131067] Updated weights for policy 0, policy_version 14560 (0.0012)
[2023-09-21 10:51:04,592][00302] Updated weights for policy 1, policy_version 14560 (0.0014)
[2023-09-21 10:51:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 14958592. Throughput: 0: 6574.2, 1: 6569.5. Samples: 14951068. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:08,503][130331] Avg episode reward: [(0, '140.259'), (1, '51.169')]
[2023-09-21 10:51:10,662][00302] Updated weights for policy 1, policy_version 14640 (0.0013)
[2023-09-21 10:51:10,663][131067] Updated weights for policy 0, policy_version 14640 (0.0013)
[2023-09-21 10:51:13,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 15024128. Throughput: 0: 6559.6, 1: 6553.1. Samples: 15009102. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:13,504][130331] Avg episode reward: [(0, '141.002'), (1, '57.096')]
[2023-09-21 10:51:17,108][131067] Updated weights for policy 0, policy_version 14720 (0.0014)
[2023-09-21 10:51:17,108][00302] Updated weights for policy 1, policy_version 14720 (0.0013)
[2023-09-21 10:51:18,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 15089664. Throughput: 0: 6552.8, 1: 6541.9. Samples: 15065754. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:18,504][130331] Avg episode reward: [(0, '142.972'), (1, '61.364')]
[2023-09-21 10:51:18,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014736_7544832.pth...
[2023-09-21 10:51:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014736_7544832.pth...
[2023-09-21 10:51:18,513][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014352_7348224.pth
[2023-09-21 10:51:18,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014352_7348224.pth
[2023-09-21 10:51:23,160][131067] Updated weights for policy 0, policy_version 14800 (0.0013)
[2023-09-21 10:51:23,161][00302] Updated weights for policy 1, policy_version 14800 (0.0014)
[2023-09-21 10:51:23,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 15155200. Throughput: 0: 6541.7, 1: 6545.7. Samples: 15146506. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:23,504][130331] Avg episode reward: [(0, '147.119'), (1, '67.464')]
[2023-09-21 10:51:28,503][130331] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 15220736. Throughput: 0: 6525.4, 1: 6551.0. Samples: 15204188. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:28,503][130331] Avg episode reward: [(0, '150.027'), (1, '67.368')]
[2023-09-21 10:51:29,658][131067] Updated weights for policy 0, policy_version 14880 (0.0012)
[2023-09-21 10:51:29,658][00302] Updated weights for policy 1, policy_version 14880 (0.0013)
[2023-09-21 10:51:33,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 15286272. Throughput: 0: 6552.9, 1: 6545.5. Samples: 15262344. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:51:33,503][130331] Avg episode reward: [(0, '158.771'), (1, '66.506')]
[2023-09-21 10:51:33,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014928_7643136.pth...
[2023-09-21 10:51:33,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014928_7643136.pth...
[2023-09-21 10:51:33,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014544_7446528.pth
[2023-09-21 10:51:33,528][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014544_7446528.pth
[2023-09-21 10:51:35,918][131067] Updated weights for policy 0, policy_version 14960 (0.0013)
[2023-09-21 10:51:35,918][00302] Updated weights for policy 1, policy_version 14960 (0.0014)
[2023-09-21 10:51:38,502][130331] Fps is (10 sec: 12288.1, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 15343616. Throughput: 0: 6537.8, 1: 6532.2. Samples: 15337788. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:51:38,503][130331] Avg episode reward: [(0, '162.518'), (1, '65.125')]
[2023-09-21 10:51:42,321][131067] Updated weights for policy 0, policy_version 15040 (0.0015)
[2023-09-21 10:51:42,321][00302] Updated weights for policy 1, policy_version 15040 (0.0014)
[2023-09-21 10:51:43,503][130331] Fps is (10 sec: 12288.0, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 15409152. Throughput: 0: 6534.2, 1: 6572.4. Samples: 15397278. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:43,503][130331] Avg episode reward: [(0, '167.619'), (1, '61.044')]
[2023-09-21 10:51:48,503][130331] Fps is (10 sec: 13106.8, 60 sec: 12970.6, 300 sec: 13107.2). Total num frames: 15474688. Throughput: 0: 6511.5, 1: 6531.8. Samples: 15455436. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:51:48,504][130331] Avg episode reward: [(0, '169.843'), (1, '58.644')]
[2023-09-21 10:51:48,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015112_7737344.pth...
[2023-09-21 10:51:48,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015112_7737344.pth...
[2023-09-21 10:51:48,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014736_7544832.pth
[2023-09-21 10:51:48,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014736_7544832.pth
[2023-09-21 10:51:48,716][00302] Updated weights for policy 1, policy_version 15120 (0.0012)
[2023-09-21 10:51:48,716][131067] Updated weights for policy 0, policy_version 15120 (0.0012)
[2023-09-21 10:51:53,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 15540224. Throughput: 0: 6461.6, 1: 6456.1. Samples: 15532364. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:51:53,504][130331] Avg episode reward: [(0, '173.276'), (1, '57.719')]
[2023-09-21 10:51:54,895][131067] Updated weights for policy 0, policy_version 15200 (0.0013)
[2023-09-21 10:51:54,896][00302] Updated weights for policy 1, policy_version 15200 (0.0014)
[2023-09-21 10:51:58,503][130331] Fps is (10 sec: 13107.5, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15605760. Throughput: 0: 6474.7, 1: 6466.7. Samples: 15591462. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:51:58,503][130331] Avg episode reward: [(0, '176.901'), (1, '59.278')]
[2023-09-21 10:52:01,523][131067] Updated weights for policy 0, policy_version 15280 (0.0013)
[2023-09-21 10:52:01,524][00302] Updated weights for policy 1, policy_version 15280 (0.0014)
[2023-09-21 10:52:03,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15671296. Throughput: 0: 6461.3, 1: 6453.0. Samples: 15646894. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:52:03,503][130331] Avg episode reward: [(0, '178.545'), (1, '63.661')]
[2023-09-21 10:52:03,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015304_7835648.pth...
[2023-09-21 10:52:03,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015304_7835648.pth...
[2023-09-21 10:52:03,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000014928_7643136.pth
[2023-09-21 10:52:03,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000014928_7643136.pth
[2023-09-21 10:52:07,552][131067] Updated weights for policy 0, policy_version 15360 (0.0015)
[2023-09-21 10:52:07,552][00302] Updated weights for policy 1, policy_version 15360 (0.0016)
[2023-09-21 10:52:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15736832. Throughput: 0: 6470.0, 1: 6462.7. Samples: 15728476. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:52:08,503][130331] Avg episode reward: [(0, '177.006'), (1, '70.415')]
[2023-09-21 10:52:13,502][130331] Fps is (10 sec: 13107.4, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15802368. Throughput: 0: 6508.0, 1: 6465.1. Samples: 15787978. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:52:13,503][130331] Avg episode reward: [(0, '173.738'), (1, '74.394')]
[2023-09-21 10:52:13,838][131067] Updated weights for policy 0, policy_version 15440 (0.0015)
[2023-09-21 10:52:13,838][00302] Updated weights for policy 1, policy_version 15440 (0.0016)
[2023-09-21 10:52:18,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15867904. Throughput: 0: 6463.0, 1: 6462.3. Samples: 15843980. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:52:18,504][130331] Avg episode reward: [(0, '169.439'), (1, '77.634')]
[2023-09-21 10:52:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015496_7933952.pth...
[2023-09-21 10:52:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015496_7933952.pth...
[2023-09-21 10:52:18,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015112_7737344.pth
[2023-09-21 10:52:18,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015112_7737344.pth
[2023-09-21 10:52:18,517][130981] Saving new best policy, reward=77.634!
[2023-09-21 10:52:20,169][00302] Updated weights for policy 1, policy_version 15520 (0.0009)
[2023-09-21 10:52:20,170][131067] Updated weights for policy 0, policy_version 15520 (0.0014)
[2023-09-21 10:52:23,503][130331] Fps is (10 sec: 13106.9, 60 sec: 12970.7, 300 sec: 13079.4). Total num frames: 15933440. Throughput: 0: 6500.2, 1: 6508.2. Samples: 15923172. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-21 10:52:23,504][130331] Avg episode reward: [(0, '165.614'), (1, '77.876')]
[2023-09-21 10:52:23,505][130981] Saving new best policy, reward=77.876!
[2023-09-21 10:52:26,471][00302] Updated weights for policy 1, policy_version 15600 (0.0013)
[2023-09-21 10:52:26,472][131067] Updated weights for policy 0, policy_version 15600 (0.0015)
[2023-09-21 10:52:28,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.6, 300 sec: 13079.4). Total num frames: 15998976. Throughput: 0: 6498.1, 1: 6495.3. Samples: 15981982. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:28,504][130331] Avg episode reward: [(0, '160.660'), (1, '78.227')]
[2023-09-21 10:52:28,505][130981] Saving new best policy, reward=78.227!
[2023-09-21 10:52:32,551][131067] Updated weights for policy 0, policy_version 15680 (0.0014)
[2023-09-21 10:52:32,551][00302] Updated weights for policy 1, policy_version 15680 (0.0015)
[2023-09-21 10:52:33,503][130331] Fps is (10 sec: 13107.0, 60 sec: 12970.6, 300 sec: 13079.4). Total num frames: 16064512. Throughput: 0: 6530.5, 1: 6528.5. Samples: 16043090. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:33,504][130331] Avg episode reward: [(0, '157.086'), (1, '78.582')]
[2023-09-21 10:52:33,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015688_8032256.pth...
[2023-09-21 10:52:33,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015688_8032256.pth...
[2023-09-21 10:52:33,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015304_7835648.pth
[2023-09-21 10:52:33,516][130981] Saving new best policy, reward=78.582!
[2023-09-21 10:52:33,519][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015304_7835648.pth
[2023-09-21 10:52:38,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.1, 300 sec: 13079.4). Total num frames: 16130048. Throughput: 0: 6554.6, 1: 6562.2. Samples: 16122620. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:38,504][130331] Avg episode reward: [(0, '154.772'), (1, '79.192')]
[2023-09-21 10:52:38,505][130981] Saving new best policy, reward=79.192!
[2023-09-21 10:52:38,676][131067] Updated weights for policy 0, policy_version 15760 (0.0014)
[2023-09-21 10:52:38,676][00302] Updated weights for policy 1, policy_version 15760 (0.0015)
[2023-09-21 10:52:43,503][130331] Fps is (10 sec: 13107.6, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 16195584. Throughput: 0: 6594.8, 1: 6562.8. Samples: 16183558. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:43,503][130331] Avg episode reward: [(0, '152.222'), (1, '79.728')]
[2023-09-21 10:52:43,538][130981] Saving new best policy, reward=79.728!
[2023-09-21 10:52:44,816][131067] Updated weights for policy 0, policy_version 15840 (0.0014)
[2023-09-21 10:52:44,817][00302] Updated weights for policy 1, policy_version 15840 (0.0015)
[2023-09-21 10:52:48,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 16261120. Throughput: 0: 6597.2, 1: 6611.0. Samples: 16241266. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:48,504][130331] Avg episode reward: [(0, '151.872'), (1, '80.194')]
[2023-09-21 10:52:48,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015880_8130560.pth...
[2023-09-21 10:52:48,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015880_8130560.pth...
[2023-09-21 10:52:48,520][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015496_7933952.pth
[2023-09-21 10:52:48,521][130981] Saving new best policy, reward=80.194!
[2023-09-21 10:52:48,524][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015496_7933952.pth
[2023-09-21 10:52:51,430][00302] Updated weights for policy 1, policy_version 15920 (0.0015)
[2023-09-21 10:52:51,430][131067] Updated weights for policy 0, policy_version 15920 (0.0017)
[2023-09-21 10:52:53,502][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 16326656. Throughput: 0: 6526.0, 1: 6546.7. Samples: 16316748. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:53,503][130331] Avg episode reward: [(0, '151.434'), (1, '80.606')]
[2023-09-21 10:52:53,504][130981] Saving new best policy, reward=80.606!
[2023-09-21 10:52:57,589][00302] Updated weights for policy 1, policy_version 16000 (0.0013)
[2023-09-21 10:52:57,589][131067] Updated weights for policy 0, policy_version 16000 (0.0015)
[2023-09-21 10:52:58,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13079.4). Total num frames: 16392192. Throughput: 0: 6539.1, 1: 6546.9. Samples: 16376848. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:52:58,504][130331] Avg episode reward: [(0, '151.048'), (1, '81.302')]
[2023-09-21 10:52:58,507][130981] Saving new best policy, reward=81.302!
[2023-09-21 10:53:03,503][130331] Fps is (10 sec: 13106.7, 60 sec: 13107.1, 300 sec: 13079.4). Total num frames: 16457728. Throughput: 0: 6564.1, 1: 6571.9. Samples: 16435104. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:53:03,504][130331] Avg episode reward: [(0, '151.768'), (1, '82.452')]
[2023-09-21 10:53:03,515][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016072_8228864.pth...
[2023-09-21 10:53:03,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016072_8228864.pth...
[2023-09-21 10:53:03,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015688_8032256.pth
[2023-09-21 10:53:03,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015688_8032256.pth
[2023-09-21 10:53:03,524][130981] Saving new best policy, reward=82.452!
[2023-09-21 10:53:03,992][131067] Updated weights for policy 0, policy_version 16080 (0.0015)
[2023-09-21 10:53:03,992][00302] Updated weights for policy 1, policy_version 16080 (0.0015)
[2023-09-21 10:53:08,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13051.7). Total num frames: 16523264. Throughput: 0: 6586.8, 1: 6563.1. Samples: 16514916. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:53:08,504][130331] Avg episode reward: [(0, '152.727'), (1, '83.062')]
[2023-09-21 10:53:08,506][130981] Saving new best policy, reward=83.062!
[2023-09-21 10:53:10,011][00302] Updated weights for policy 1, policy_version 16160 (0.0013)
[2023-09-21 10:53:10,011][131067] Updated weights for policy 0, policy_version 16160 (0.0010)
[2023-09-21 10:53:13,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.1, 300 sec: 13051.7). Total num frames: 16588800. Throughput: 0: 6591.3, 1: 6583.3. Samples: 16574840. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:53:13,504][130331] Avg episode reward: [(0, '154.899'), (1, '82.919')]
[2023-09-21 10:53:16,194][00302] Updated weights for policy 1, policy_version 16240 (0.0012)
[2023-09-21 10:53:16,194][131067] Updated weights for policy 0, policy_version 16240 (0.0012)
[2023-09-21 10:53:18,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13051.7). Total num frames: 16654336. Throughput: 0: 6568.5, 1: 6567.4. Samples: 16634204. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:53:18,504][130331] Avg episode reward: [(0, '155.191'), (1, '82.547')]
[2023-09-21 10:53:18,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016264_8327168.pth...
[2023-09-21 10:53:18,514][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016264_8327168.pth...
[2023-09-21 10:53:18,518][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000015880_8130560.pth
[2023-09-21 10:53:18,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000015880_8130560.pth
[2023-09-21 10:53:22,350][131067] Updated weights for policy 0, policy_version 16320 (0.0013)
[2023-09-21 10:53:22,351][00302] Updated weights for policy 1, policy_version 16320 (0.0015)
[2023-09-21 10:53:23,502][130331] Fps is (10 sec: 13107.5, 60 sec: 13107.3, 300 sec: 13079.4). Total num frames: 16719872. Throughput: 0: 6567.4, 1: 6579.4. Samples: 16714224. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:53:23,503][130331] Avg episode reward: [(0, '155.535'), (1, '82.778')]
[2023-09-21 10:53:28,493][00302] Updated weights for policy 1, policy_version 16400 (0.0012)
[2023-09-21 10:53:28,494][131067] Updated weights for policy 0, policy_version 16400 (0.0012)
[2023-09-21 10:53:28,503][130331] Fps is (10 sec: 13926.7, 60 sec: 13243.8, 300 sec: 13107.2). Total num frames: 16793600. Throughput: 0: 6559.7, 1: 6549.2. Samples: 16773458. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:53:28,503][130331] Avg episode reward: [(0, '153.172'), (1, '82.989')]
[2023-09-21 10:53:33,503][130331] Fps is (10 sec: 13926.0, 60 sec: 13243.8, 300 sec: 13107.2). Total num frames: 16859136. Throughput: 0: 6588.5, 1: 6582.1. Samples: 16833938. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:53:33,504][130331] Avg episode reward: [(0, '152.743'), (1, '83.080')]
[2023-09-21 10:53:33,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016464_8429568.pth...
[2023-09-21 10:53:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016464_8429568.pth...
[2023-09-21 10:53:33,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016072_8228864.pth
[2023-09-21 10:53:33,518][130981] Saving new best policy, reward=83.080!
[2023-09-21 10:53:33,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016072_8228864.pth
[2023-09-21 10:53:34,470][00302] Updated weights for policy 1, policy_version 16480 (0.0011)
[2023-09-21 10:53:34,471][131067] Updated weights for policy 0, policy_version 16480 (0.0012)
[2023-09-21 10:53:38,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 16924672. Throughput: 0: 6677.4, 1: 6674.4. Samples: 16917578. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:53:38,503][130331] Avg episode reward: [(0, '151.908'), (1, '83.610')]
[2023-09-21 10:53:38,504][130981] Saving new best policy, reward=83.610!
[2023-09-21 10:53:40,421][00302] Updated weights for policy 1, policy_version 16560 (0.0013)
[2023-09-21 10:53:40,422][131067] Updated weights for policy 0, policy_version 16560 (0.0013)
[2023-09-21 10:53:43,503][130331] Fps is (10 sec: 13926.6, 60 sec: 13380.2, 300 sec: 13162.7). Total num frames: 16998400. Throughput: 0: 6702.8, 1: 6675.6. Samples: 16978876. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:53:43,504][130331] Avg episode reward: [(0, '151.571'), (1, '84.211')]
[2023-09-21 10:53:43,505][130981] Saving new best policy, reward=84.211!
[2023-09-21 10:53:46,669][131067] Updated weights for policy 0, policy_version 16640 (0.0012)
[2023-09-21 10:53:46,670][00302] Updated weights for policy 1, policy_version 16640 (0.0015)
[2023-09-21 10:53:48,503][130331] Fps is (10 sec: 13926.4, 60 sec: 13380.3, 300 sec: 13162.7). Total num frames: 17063936. Throughput: 0: 6700.1, 1: 6702.4. Samples: 17038216. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:53:48,503][130331] Avg episode reward: [(0, '152.416'), (1, '84.533')]
[2023-09-21 10:53:48,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016664_8531968.pth...
[2023-09-21 10:53:48,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016664_8531968.pth...
[2023-09-21 10:53:48,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016264_8327168.pth
[2023-09-21 10:53:48,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016264_8327168.pth
[2023-09-21 10:53:48,515][130981] Saving new best policy, reward=84.533!
[2023-09-21 10:53:52,933][131067] Updated weights for policy 0, policy_version 16720 (0.0011)
[2023-09-21 10:53:52,934][00302] Updated weights for policy 1, policy_version 16720 (0.0014)
[2023-09-21 10:53:53,503][130331] Fps is (10 sec: 12288.0, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 17121280. Throughput: 0: 6670.5, 1: 6690.2. Samples: 17116150. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:53:53,504][130331] Avg episode reward: [(0, '150.822'), (1, '84.614')]
[2023-09-21 10:53:53,517][130981] Saving new best policy, reward=84.614!
[2023-09-21 10:53:58,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13380.3, 300 sec: 13162.7). Total num frames: 17195008. Throughput: 0: 6682.5, 1: 6704.7. Samples: 17177264. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:53:58,504][130331] Avg episode reward: [(0, '150.184'), (1, '84.982')]
[2023-09-21 10:53:58,505][130981] Saving new best policy, reward=84.982!
[2023-09-21 10:53:59,024][131067] Updated weights for policy 0, policy_version 16800 (0.0012)
[2023-09-21 10:53:59,024][00302] Updated weights for policy 1, policy_version 16800 (0.0014)
[2023-09-21 10:54:03,502][130331] Fps is (10 sec: 13926.7, 60 sec: 13380.4, 300 sec: 13162.7). Total num frames: 17260544. Throughput: 0: 6682.6, 1: 6675.2. Samples: 17235304. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:03,503][130331] Avg episode reward: [(0, '149.100'), (1, '84.584')]
[2023-09-21 10:54:03,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016856_8630272.pth...
[2023-09-21 10:54:03,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016856_8630272.pth...
[2023-09-21 10:54:03,515][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016464_8429568.pth
[2023-09-21 10:54:03,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016464_8429568.pth
[2023-09-21 10:54:05,326][00302] Updated weights for policy 1, policy_version 16880 (0.0014)
[2023-09-21 10:54:05,327][131067] Updated weights for policy 0, policy_version 16880 (0.0016)
[2023-09-21 10:54:08,503][130331] Fps is (10 sec: 12288.0, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 17317888. Throughput: 0: 6660.0, 1: 6657.0. Samples: 17313494. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:08,504][130331] Avg episode reward: [(0, '150.146'), (1, '84.342')]
[2023-09-21 10:54:11,857][131067] Updated weights for policy 0, policy_version 16960 (0.0013)
[2023-09-21 10:54:11,857][00302] Updated weights for policy 1, policy_version 16960 (0.0014)
[2023-09-21 10:54:13,502][130331] Fps is (10 sec: 12288.0, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 17383424. Throughput: 0: 6605.6, 1: 6641.1. Samples: 17369558. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:13,503][130331] Avg episode reward: [(0, '149.192'), (1, '83.818')]
[2023-09-21 10:54:18,399][131067] Updated weights for policy 0, policy_version 17040 (0.0011)
[2023-09-21 10:54:18,401][00302] Updated weights for policy 1, policy_version 17040 (0.0014)
[2023-09-21 10:54:18,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 17448960. Throughput: 0: 6582.6, 1: 6589.6. Samples: 17426684. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:18,503][130331] Avg episode reward: [(0, '149.376'), (1, '83.931')]
[2023-09-21 10:54:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017040_8724480.pth...
[2023-09-21 10:54:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017040_8724480.pth...
[2023-09-21 10:54:18,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016664_8531968.pth
[2023-09-21 10:54:18,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016664_8531968.pth
[2023-09-21 10:54:23,502][130331] Fps is (10 sec: 13107.2, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 17514496. Throughput: 0: 6480.0, 1: 6483.2. Samples: 17500918. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:23,503][130331] Avg episode reward: [(0, '149.332'), (1, '84.178')]
[2023-09-21 10:54:24,711][131067] Updated weights for policy 0, policy_version 17120 (0.0011)
[2023-09-21 10:54:24,712][00302] Updated weights for policy 1, policy_version 17120 (0.0015)
[2023-09-21 10:54:28,502][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 17580032. Throughput: 0: 6463.3, 1: 6475.0. Samples: 17561098. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:28,503][130331] Avg episode reward: [(0, '150.213'), (1, '84.990')]
[2023-09-21 10:54:28,504][130981] Saving new best policy, reward=84.990!
[2023-09-21 10:54:30,972][00302] Updated weights for policy 1, policy_version 17200 (0.0016)
[2023-09-21 10:54:30,972][131067] Updated weights for policy 0, policy_version 17200 (0.0012)
[2023-09-21 10:54:33,503][130331] Fps is (10 sec: 13106.8, 60 sec: 13107.2, 300 sec: 13107.2). Total num frames: 17645568. Throughput: 0: 6478.6, 1: 6464.4. Samples: 17620654. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:33,504][130331] Avg episode reward: [(0, '149.183'), (1, '85.403')]
[2023-09-21 10:54:33,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017232_8822784.pth...
[2023-09-21 10:54:33,515][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017232_8822784.pth...
[2023-09-21 10:54:33,522][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000016856_8630272.pth
[2023-09-21 10:54:33,523][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000016856_8630272.pth
[2023-09-21 10:54:33,524][130981] Saving new best policy, reward=85.403!
[2023-09-21 10:54:36,871][00302] Updated weights for policy 1, policy_version 17280 (0.0013)
[2023-09-21 10:54:36,871][131067] Updated weights for policy 0, policy_version 17280 (0.0015)
[2023-09-21 10:54:38,503][130331] Fps is (10 sec: 13106.9, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 17711104. Throughput: 0: 6532.7, 1: 6533.6. Samples: 17704132. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:38,504][130331] Avg episode reward: [(0, '149.469'), (1, '85.508')]
[2023-09-21 10:54:38,505][130981] Saving new best policy, reward=85.508!
[2023-09-21 10:54:42,920][131067] Updated weights for policy 0, policy_version 17360 (0.0014)
[2023-09-21 10:54:42,920][00302] Updated weights for policy 1, policy_version 17360 (0.0015)
[2023-09-21 10:54:43,503][130331] Fps is (10 sec: 13926.6, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 17784832. Throughput: 0: 6552.9, 1: 6522.9. Samples: 17765676. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:43,504][130331] Avg episode reward: [(0, '150.419'), (1, '84.870')]
[2023-09-21 10:54:48,503][130331] Fps is (10 sec: 13107.1, 60 sec: 12970.6, 300 sec: 13135.0). Total num frames: 17842176. Throughput: 0: 6557.2, 1: 6553.6. Samples: 17825290. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:54:48,504][130331] Avg episode reward: [(0, '150.151'), (1, '85.219')]
[2023-09-21 10:54:48,548][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017432_8925184.pth...
[2023-09-21 10:54:48,548][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017432_8925184.pth...
[2023-09-21 10:54:48,551][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017040_8724480.pth
[2023-09-21 10:54:48,552][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017040_8724480.pth
[2023-09-21 10:54:49,166][00302] Updated weights for policy 1, policy_version 17440 (0.0011)
[2023-09-21 10:54:49,167][131067] Updated weights for policy 0, policy_version 17440 (0.0015)
[2023-09-21 10:54:53,503][130331] Fps is (10 sec: 12697.6, 60 sec: 13175.5, 300 sec: 13148.8). Total num frames: 17911808. Throughput: 0: 6550.9, 1: 6553.2. Samples: 17903178. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:54:53,504][130331] Avg episode reward: [(0, '149.126'), (1, '86.340')]
[2023-09-21 10:54:53,505][130981] Saving new best policy, reward=86.340!
[2023-09-21 10:54:55,245][00302] Updated weights for policy 1, policy_version 17520 (0.0012)
[2023-09-21 10:54:55,245][131067] Updated weights for policy 0, policy_version 17520 (0.0016)
[2023-09-21 10:54:58,503][130331] Fps is (10 sec: 13926.7, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 17981440. Throughput: 0: 6628.9, 1: 6607.0. Samples: 17965174. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:54:58,503][130331] Avg episode reward: [(0, '147.625'), (1, '87.622')]
[2023-09-21 10:54:58,504][130981] Saving new best policy, reward=87.622!
[2023-09-21 10:55:01,271][00302] Updated weights for policy 1, policy_version 17600 (0.0014)
[2023-09-21 10:55:01,272][131067] Updated weights for policy 0, policy_version 17600 (0.0012)
[2023-09-21 10:55:03,502][130331] Fps is (10 sec: 13517.0, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 18046976. Throughput: 0: 6655.6, 1: 6660.0. Samples: 18025884. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:55:03,503][130331] Avg episode reward: [(0, '147.351'), (1, '88.096')]
[2023-09-21 10:55:03,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017624_9023488.pth...
[2023-09-21 10:55:03,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017624_9023488.pth...
[2023-09-21 10:55:03,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017232_8822784.pth
[2023-09-21 10:55:03,521][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017232_8822784.pth
[2023-09-21 10:55:03,522][130981] Saving new best policy, reward=88.096!
[2023-09-21 10:55:07,423][00302] Updated weights for policy 1, policy_version 17680 (0.0015)
[2023-09-21 10:55:07,424][131067] Updated weights for policy 0, policy_version 17680 (0.0014)
[2023-09-21 10:55:08,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18112512. Throughput: 0: 6718.2, 1: 6711.2. Samples: 18105244. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:55:08,504][130331] Avg episode reward: [(0, '145.596'), (1, '88.699')]
[2023-09-21 10:55:08,505][130981] Saving new best policy, reward=88.699!
[2023-09-21 10:55:13,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18178048. Throughput: 0: 6673.9, 1: 6696.0. Samples: 18162744. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-21 10:55:13,503][130331] Avg episode reward: [(0, '144.061'), (1, '90.414')]
[2023-09-21 10:55:13,504][130981] Saving new best policy, reward=90.414!
[2023-09-21 10:55:13,834][131067] Updated weights for policy 0, policy_version 17760 (0.0015)
[2023-09-21 10:55:13,834][00302] Updated weights for policy 1, policy_version 17760 (0.0015)
[2023-09-21 10:55:18,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18243584. Throughput: 0: 6653.6, 1: 6668.3. Samples: 18220140. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:18,503][130331] Avg episode reward: [(0, '143.288'), (1, '91.141')]
[2023-09-21 10:55:18,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017816_9121792.pth...
[2023-09-21 10:55:18,510][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017816_9121792.pth...
[2023-09-21 10:55:18,513][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017432_8925184.pth
[2023-09-21 10:55:18,515][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017432_8925184.pth
[2023-09-21 10:55:18,516][130981] Saving new best policy, reward=91.141!
[2023-09-21 10:55:20,159][131067] Updated weights for policy 0, policy_version 17840 (0.0013)
[2023-09-21 10:55:20,159][00302] Updated weights for policy 1, policy_version 17840 (0.0015)
[2023-09-21 10:55:23,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18309120. Throughput: 0: 6593.3, 1: 6588.8. Samples: 18297324. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:23,504][130331] Avg episode reward: [(0, '143.705'), (1, '92.992')]
[2023-09-21 10:55:23,505][130981] Saving new best policy, reward=92.992!
[2023-09-21 10:55:26,452][00302] Updated weights for policy 1, policy_version 17920 (0.0015)
[2023-09-21 10:55:26,452][131067] Updated weights for policy 0, policy_version 17920 (0.0015)
[2023-09-21 10:55:28,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18374656. Throughput: 0: 6584.0, 1: 6581.2. Samples: 18358114. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:28,504][130331] Avg episode reward: [(0, '142.966'), (1, '89.918')]
[2023-09-21 10:55:32,765][131067] Updated weights for policy 0, policy_version 18000 (0.0012)
[2023-09-21 10:55:32,765][00302] Updated weights for policy 1, policy_version 18000 (0.0016)
[2023-09-21 10:55:33,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13243.7, 300 sec: 13135.0). Total num frames: 18440192. Throughput: 0: 6568.9, 1: 6582.0. Samples: 18417078. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:33,504][130331] Avg episode reward: [(0, '141.717'), (1, '87.003')]
[2023-09-21 10:55:33,514][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018008_9220096.pth...
[2023-09-21 10:55:33,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018008_9220096.pth...
[2023-09-21 10:55:33,522][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017624_9023488.pth
[2023-09-21 10:55:33,522][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017624_9023488.pth
[2023-09-21 10:55:38,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13243.8, 300 sec: 13135.0). Total num frames: 18505728. Throughput: 0: 6584.5, 1: 6577.4. Samples: 18495464. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:38,503][130331] Avg episode reward: [(0, '141.415'), (1, '82.051')]
[2023-09-21 10:55:39,106][131067] Updated weights for policy 0, policy_version 18080 (0.0013)
[2023-09-21 10:55:39,107][00302] Updated weights for policy 1, policy_version 18080 (0.0016)
[2023-09-21 10:55:43,502][130331] Fps is (10 sec: 12288.3, 60 sec: 12970.7, 300 sec: 13107.2). Total num frames: 18563072. Throughput: 0: 6515.6, 1: 6515.6. Samples: 18551576. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:55:43,503][130331] Avg episode reward: [(0, '143.349'), (1, '78.677')]
[2023-09-21 10:55:45,467][00302] Updated weights for policy 1, policy_version 18160 (0.0014)
[2023-09-21 10:55:45,468][131067] Updated weights for policy 0, policy_version 18160 (0.0014)
[2023-09-21 10:55:48,503][130331] Fps is (10 sec: 12287.7, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 18628608. Throughput: 0: 6473.6, 1: 6468.9. Samples: 18608302. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:55:48,504][130331] Avg episode reward: [(0, '145.312'), (1, '76.609')]
[2023-09-21 10:55:48,526][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018200_9318400.pth...
[2023-09-21 10:55:48,527][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018200_9318400.pth...
[2023-09-21 10:55:48,531][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000017816_9121792.pth
[2023-09-21 10:55:48,539][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000017816_9121792.pth
[2023-09-21 10:55:51,552][00302] Updated weights for policy 1, policy_version 18240 (0.0014)
[2023-09-21 10:55:51,552][131067] Updated weights for policy 0, policy_version 18240 (0.0012)
[2023-09-21 10:55:53,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13038.9, 300 sec: 13107.2). Total num frames: 18694144. Throughput: 0: 6494.8, 1: 6499.9. Samples: 18690006. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:55:53,504][130331] Avg episode reward: [(0, '147.688'), (1, '73.000')]
[2023-09-21 10:55:57,793][131067] Updated weights for policy 0, policy_version 18320 (0.0017)
[2023-09-21 10:55:57,794][00302] Updated weights for policy 1, policy_version 18320 (0.0016)
[2023-09-21 10:55:58,503][130331] Fps is (10 sec: 13926.7, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 18767872. Throughput: 0: 6527.6, 1: 6482.8. Samples: 18748216. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:55:58,503][130331] Avg episode reward: [(0, '148.766'), (1, '74.062')]
[2023-09-21 10:56:03,503][130331] Fps is (10 sec: 13926.3, 60 sec: 13107.1, 300 sec: 13135.0). Total num frames: 18833408. Throughput: 0: 6555.8, 1: 6558.6. Samples: 18810294. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-21 10:56:03,504][130331] Avg episode reward: [(0, '147.768'), (1, '75.140')]
[2023-09-21 10:56:03,514][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018392_9416704.pth...
[2023-09-21 10:56:03,514][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018392_9416704.pth...
[2023-09-21 10:56:03,526][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018008_9220096.pth
[2023-09-21 10:56:03,526][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018008_9220096.pth
[2023-09-21 10:56:03,812][00302] Updated weights for policy 1, policy_version 18400 (0.0012)
[2023-09-21 10:56:03,813][131067] Updated weights for policy 0, policy_version 18400 (0.0015)
[2023-09-21 10:56:08,503][130331] Fps is (10 sec: 13107.1, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 18898944. Throughput: 0: 6579.3, 1: 6582.0. Samples: 18889584. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:56:08,504][130331] Avg episode reward: [(0, '146.337'), (1, '78.283')]
[2023-09-21 10:56:10,122][00302] Updated weights for policy 1, policy_version 18480 (0.0012)
[2023-09-21 10:56:10,123][131067] Updated weights for policy 0, policy_version 18480 (0.0014)
[2023-09-21 10:56:13,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 18964480. Throughput: 0: 6533.7, 1: 6570.8. Samples: 18947816. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:56:13,504][130331] Avg episode reward: [(0, '141.130'), (1, '74.565')]
[2023-09-21 10:56:16,423][00302] Updated weights for policy 1, policy_version 18560 (0.0013)
[2023-09-21 10:56:16,423][131067] Updated weights for policy 0, policy_version 18560 (0.0014)
[2023-09-21 10:56:18,503][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 19030016. Throughput: 0: 6551.0, 1: 6550.7. Samples: 19006652. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:56:18,503][130331] Avg episode reward: [(0, '137.099'), (1, '69.617')]
[2023-09-21 10:56:18,511][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018584_9515008.pth...
[2023-09-21 10:56:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018584_9515008.pth...
[2023-09-21 10:56:18,516][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018200_9318400.pth
[2023-09-21 10:56:18,516][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018200_9318400.pth
[2023-09-21 10:56:22,481][00302] Updated weights for policy 1, policy_version 18640 (0.0012)
[2023-09-21 10:56:22,481][131067] Updated weights for policy 0, policy_version 18640 (0.0010)
[2023-09-21 10:56:23,502][130331] Fps is (10 sec: 13107.4, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 19095552. Throughput: 0: 6587.2, 1: 6581.8. Samples: 19088068. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:56:23,503][130331] Avg episode reward: [(0, '132.828'), (1, '64.817')]
[2023-09-21 10:56:28,503][130331] Fps is (10 sec: 13107.3, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 19161088. Throughput: 0: 6645.5, 1: 6628.6. Samples: 19148910. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:56:28,503][130331] Avg episode reward: [(0, '129.164'), (1, '62.703')]
[2023-09-21 10:56:28,537][00302] Updated weights for policy 1, policy_version 18720 (0.0012)
[2023-09-21 10:56:28,538][131067] Updated weights for policy 0, policy_version 18720 (0.0013)
[2023-09-21 10:56:33,503][130331] Fps is (10 sec: 13926.0, 60 sec: 13243.7, 300 sec: 13190.5). Total num frames: 19234816. Throughput: 0: 6671.3, 1: 6675.3. Samples: 19208896. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-21 10:56:33,504][130331] Avg episode reward: [(0, '126.096'), (1, '60.934')]
[2023-09-21 10:56:33,513][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018784_9617408.pth...
[2023-09-21 10:56:33,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018784_9617408.pth...
[2023-09-21 10:56:33,518][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018392_9416704.pth
[2023-09-21 10:56:33,521][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018392_9416704.pth
[2023-09-21 10:56:34,776][00302] Updated weights for policy 1, policy_version 18800 (0.0015)
[2023-09-21 10:56:34,776][131067] Updated weights for policy 0, policy_version 18800 (0.0015)
[2023-09-21 10:56:38,503][130331] Fps is (10 sec: 13107.0, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 19292160. Throughput: 0: 6622.6, 1: 6619.1. Samples: 19285882. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:56:38,504][130331] Avg episode reward: [(0, '120.952'), (1, '59.010')]
[2023-09-21 10:56:41,226][131067] Updated weights for policy 0, policy_version 18880 (0.0014)
[2023-09-21 10:56:41,226][00302] Updated weights for policy 1, policy_version 18880 (0.0009)
[2023-09-21 10:56:43,503][130331] Fps is (10 sec: 12288.2, 60 sec: 13243.7, 300 sec: 13162.7). Total num frames: 19357696. Throughput: 0: 6580.1, 1: 6631.1. Samples: 19342722. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:56:43,504][130331] Avg episode reward: [(0, '122.056'), (1, '57.174')]
[2023-09-21 10:56:48,018][131067] Updated weights for policy 0, policy_version 18960 (0.0011)
[2023-09-21 10:56:48,018][00302] Updated weights for policy 1, policy_version 18960 (0.0015)
[2023-09-21 10:56:48,503][130331] Fps is (10 sec: 12287.8, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 19415040. Throughput: 0: 6525.2, 1: 6521.6. Samples: 19397400. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:56:48,504][130331] Avg episode reward: [(0, '125.143'), (1, '55.112')]
[2023-09-21 10:56:48,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018960_9707520.pth...
[2023-09-21 10:56:48,512][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018960_9707520.pth...
[2023-09-21 10:56:48,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018584_9515008.pth
[2023-09-21 10:56:48,517][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018584_9515008.pth
[2023-09-21 10:56:53,503][130331] Fps is (10 sec: 12287.9, 60 sec: 13107.2, 300 sec: 13135.0). Total num frames: 19480576. Throughput: 0: 6487.6, 1: 6478.8. Samples: 19473074. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:56:53,504][130331] Avg episode reward: [(0, '126.975'), (1, '52.770')]
[2023-09-21 10:56:54,388][00302] Updated weights for policy 1, policy_version 19040 (0.0014)
[2023-09-21 10:56:54,389][131067] Updated weights for policy 0, policy_version 19040 (0.0013)
[2023-09-21 10:56:58,503][130331] Fps is (10 sec: 13107.4, 60 sec: 12970.6, 300 sec: 13135.0). Total num frames: 19546112. Throughput: 0: 6496.3, 1: 6471.6. Samples: 19531370. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-21 10:56:58,504][130331] Avg episode reward: [(0, '127.835'), (1, '49.589')]
[2023-09-21 10:57:00,731][131067] Updated weights for policy 0, policy_version 19120 (0.0013)
[2023-09-21 10:57:00,731][00302] Updated weights for policy 1, policy_version 19120 (0.0014)
[2023-09-21 10:57:03,503][130331] Fps is (10 sec: 13107.2, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 19611648. Throughput: 0: 6462.6, 1: 6456.9. Samples: 19588028. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:57:03,504][130331] Avg episode reward: [(0, '127.035'), (1, '44.639')]
[2023-09-21 10:57:03,512][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000019152_9805824.pth...
[2023-09-21 10:57:03,513][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000019152_9805824.pth...
[2023-09-21 10:57:03,519][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018784_9617408.pth
[2023-09-21 10:57:03,520][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018784_9617408.pth
[2023-09-21 10:57:06,890][131067] Updated weights for policy 0, policy_version 19200 (0.0012)
[2023-09-21 10:57:06,891][00302] Updated weights for policy 1, policy_version 19200 (0.0015)
[2023-09-21 10:57:08,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 19677184. Throughput: 0: 6462.6, 1: 6461.3. Samples: 19669644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:57:08,504][130331] Avg episode reward: [(0, '125.917'), (1, '40.212')]
[2023-09-21 10:57:12,988][00302] Updated weights for policy 1, policy_version 19280 (0.0013)
[2023-09-21 10:57:12,988][131067] Updated weights for policy 0, policy_version 19280 (0.0014)
[2023-09-21 10:57:13,503][130331] Fps is (10 sec: 13107.3, 60 sec: 12970.7, 300 sec: 13135.0). Total num frames: 19742720. Throughput: 0: 6457.3, 1: 6459.7. Samples: 19730176. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-21 10:57:13,504][130331] Avg episode reward: [(0, '125.469'), (1, '43.105')]
[2023-09-21 10:57:18,503][130331] Fps is (10 sec: 13926.5, 60 sec: 13107.2, 300 sec: 13162.7). Total num frames: 19816448. Throughput: 0: 6488.2, 1: 6470.3. Samples: 19792028. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:57:18,503][130331] Avg episode reward: [(0, '121.469'), (1, '46.326')]
[2023-09-21 10:57:18,510][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000019352_9908224.pth...
[2023-09-21 10:57:18,511][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000019352_9908224.pth...
[2023-09-21 10:57:18,517][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000018960_9707520.pth
[2023-09-21 10:57:18,520][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000018960_9707520.pth
[2023-09-21 10:57:18,948][00302] Updated weights for policy 1, policy_version 19360 (0.0014)
[2023-09-21 10:57:18,950][131067] Updated weights for policy 0, policy_version 19360 (0.0015)
[2023-09-21 10:57:23,503][130331] Fps is (10 sec: 13926.2, 60 sec: 13107.1, 300 sec: 13162.7). Total num frames: 19881984. Throughput: 0: 6518.1, 1: 6520.4. Samples: 19872618. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-21 10:57:23,504][130331] Avg episode reward: [(0, '118.010'), (1, '49.223')]
[2023-09-21 10:57:25,145][131067] Updated weights for policy 0, policy_version 19440 (0.0013)
[2023-09-21 10:57:25,146][00302] Updated weights for policy 1, policy_version 19440 (0.0013)
[2023-09-21 10:57:28,503][130331] Fps is (10 sec: 13107.2, 60 sec: 13107.2, 300 sec: 13162.8). Total num frames: 19947520. Throughput: 0: 6559.3, 1: 6535.7. Samples: 19932000. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-21 10:57:28,503][130331] Avg episode reward: [(0, '114.648'), (1, '46.819')]
[2023-09-21 10:57:31,391][131067] Updated weights for policy 0, policy_version 19520 (0.0015)
[2023-09-21 10:57:31,391][00302] Updated weights for policy 1, policy_version 19520 (0.0015)
[2023-09-21 10:57:32,628][130980] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000005
[2023-09-21 10:57:33,266][130980] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
[2023-09-21 10:57:33,266][130981] Early stopping after 2 epochs (8 sgd steps), loss delta 0.0000000
[2023-09-21 10:57:33,268][00307] Stopping RolloutWorker_w2...
[2023-09-21 10:57:33,268][00307] Loop rollout_proc2_evt_loop terminating...
[2023-09-21 10:57:33,268][00317] Stopping RolloutWorker_w3...
[2023-09-21 10:57:33,268][00318] Stopping RolloutWorker_w4...
[2023-09-21 10:57:33,268][00325] Stopping RolloutWorker_w6...
[2023-09-21 10:57:33,268][00304] Stopping RolloutWorker_w1...
[2023-09-21 10:57:33,268][00314] Stopping RolloutWorker_w5...
[2023-09-21 10:57:33,268][00301] Stopping RolloutWorker_w0...
[2023-09-21 10:57:33,268][00330] Stopping RolloutWorker_w7...
[2023-09-21 10:57:33,268][130331] Component RolloutWorker_w3 stopped!
[2023-09-21 10:57:33,269][130980] Stopping Batcher_0...
[2023-09-21 10:57:33,269][130981] Stopping Batcher_1...
[2023-09-21 10:57:33,269][00325] Loop rollout_proc6_evt_loop terminating...
[2023-09-21 10:57:33,269][00317] Loop rollout_proc3_evt_loop terminating...
[2023-09-21 10:57:33,269][00318] Loop rollout_proc4_evt_loop terminating...
[2023-09-21 10:57:33,269][00314] Loop rollout_proc5_evt_loop terminating...
[2023-09-21 10:57:33,269][00304] Loop rollout_proc1_evt_loop terminating...
[2023-09-21 10:57:33,269][130980] Loop batcher_evt_loop terminating...
[2023-09-21 10:57:33,269][00301] Loop rollout_proc0_evt_loop terminating...
[2023-09-21 10:57:33,269][00330] Loop rollout_proc7_evt_loop terminating...
[2023-09-21 10:57:33,270][130331] Component RolloutWorker_w2 stopped!
[2023-09-21 10:57:33,270][130981] Loop batcher_evt_loop terminating...
[2023-09-21 10:57:33,270][130331] Component RolloutWorker_w4 stopped!
[2023-09-21 10:57:33,271][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000019544_10006528.pth...
[2023-09-21 10:57:33,271][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000019544_10006528.pth...
[2023-09-21 10:57:33,271][130331] Component RolloutWorker_w1 stopped!
[2023-09-21 10:57:33,271][130331] Component RolloutWorker_w5 stopped!
[2023-09-21 10:57:33,272][130331] Component RolloutWorker_w6 stopped!
[2023-09-21 10:57:33,272][130331] Component RolloutWorker_w0 stopped!
[2023-09-21 10:57:33,273][130331] Component RolloutWorker_w7 stopped!
[2023-09-21 10:57:33,273][130331] Component Batcher_0 stopped!
[2023-09-21 10:57:33,274][130331] Component Batcher_1 stopped!
[2023-09-21 10:57:33,278][130981] Removing ./train_dir/Swimmer/checkpoint_p1/checkpoint_000019152_9805824.pth
[2023-09-21 10:57:33,279][130981] Saving ./train_dir/Swimmer/checkpoint_p1/checkpoint_000019544_10006528.pth...
[2023-09-21 10:57:33,281][130980] Removing ./train_dir/Swimmer/checkpoint_p0/checkpoint_000019152_9805824.pth
[2023-09-21 10:57:33,282][130980] Saving ./train_dir/Swimmer/checkpoint_p0/checkpoint_000019544_10006528.pth...
[2023-09-21 10:57:33,283][130981] Stopping LearnerWorker_p1...
[2023-09-21 10:57:33,283][130981] Loop learner_proc1_evt_loop terminating...
[2023-09-21 10:57:33,283][130331] Component LearnerWorker_p1 stopped!
[2023-09-21 10:57:33,287][130980] Stopping LearnerWorker_p0...
[2023-09-21 10:57:33,287][130980] Loop learner_proc0_evt_loop terminating...
[2023-09-21 10:57:33,287][130331] Component LearnerWorker_p0 stopped!
[2023-09-21 10:57:33,313][131067] Weights refcount: 2 0
[2023-09-21 10:57:33,314][131067] Stopping InferenceWorker_p0-w0...
[2023-09-21 10:57:33,314][131067] Loop inference_proc0-0_evt_loop terminating...
[2023-09-21 10:57:33,314][00302] Weights refcount: 2 0
[2023-09-21 10:57:33,314][130331] Component InferenceWorker_p0-w0 stopped!
[2023-09-21 10:57:33,315][00302] Stopping InferenceWorker_p1-w0...
[2023-09-21 10:57:33,315][00302] Loop inference_proc1-0_evt_loop terminating...
[2023-09-21 10:57:33,315][130331] Component InferenceWorker_p1-w0 stopped!
[2023-09-21 10:57:33,316][130331] Waiting for process learner_proc0 to stop...
[2023-09-21 10:57:33,938][130331] Waiting for process learner_proc1 to stop...
[2023-09-21 10:57:33,938][130331] Waiting for process inference_proc0-0 to join...
[2023-09-21 10:57:33,939][130331] Waiting for process inference_proc1-0 to join...
[2023-09-21 10:57:33,940][130331] Waiting for process rollout_proc0 to join...
[2023-09-21 10:57:33,940][130331] Waiting for process rollout_proc1 to join...
[2023-09-21 10:57:33,941][130331] Waiting for process rollout_proc2 to join...
[2023-09-21 10:57:33,942][130331] Waiting for process rollout_proc3 to join...
[2023-09-21 10:57:33,942][130331] Waiting for process rollout_proc4 to join...
[2023-09-21 10:57:33,943][130331] Waiting for process rollout_proc5 to join...
[2023-09-21 10:57:33,943][130331] Waiting for process rollout_proc6 to join...
[2023-09-21 10:57:33,944][130331] Waiting for process rollout_proc7 to join...
[2023-09-21 10:57:33,944][130331] Batcher 0 profile tree view:
batching: 39.9176, releasing_batches: 3.4096
[2023-09-21 10:57:33,945][130331] Batcher 1 profile tree view:
batching: 40.6154, releasing_batches: 3.4508
[2023-09-21 10:57:33,945][130331] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0052
  wait_policy_total: 206.0553
update_model: 19.5310
  weight_update: 0.0015
one_step: 0.0012
  handle_policy_step: 1221.9984
    deserialize: 33.0595, stack: 7.6936, obs_to_device_normalize: 247.8765, forward: 613.9300, send_messages: 95.8447
    prepare_outputs: 154.2824
      to_cpu: 78.8096
[2023-09-21 10:57:33,946][130331] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0052
  wait_policy_total: 205.1068
update_model: 19.4940
  weight_update: 0.0015
one_step: 0.0013
  handle_policy_step: 1223.4305
    deserialize: 32.9334, stack: 7.4602, obs_to_device_normalize: 249.7174, forward: 615.2011, send_messages: 95.1498
    prepare_outputs: 153.0661
      to_cpu: 78.5194
[2023-09-21 10:57:33,947][130331] Learner 0 profile tree view:
misc: 0.0147, prepare_batch: 21.9928
train: 106.8356
  epoch_init: 0.0626, minibatch_init: 1.7432, losses_postprocess: 2.8800, kl_divergence: 1.3421, after_optimizer: 1.6095
  calculate_losses: 31.6577
    losses_init: 0.0579, forward_head: 3.6138, bptt_initial: 0.2083, bptt: 0.2040, tail: 12.0280, advantages_returns: 1.6200, losses: 12.0190
  update: 65.3196
    clip: 8.2916
[2023-09-21 10:57:33,947][130331] Learner 1 profile tree view:
misc: 0.0165, prepare_batch: 21.6833
train: 106.1829
  epoch_init: 0.0623, minibatch_init: 1.7200, losses_postprocess: 2.8544, kl_divergence: 1.3371, after_optimizer: 1.6213
  calculate_losses: 31.4701
    losses_init: 0.0572, forward_head: 3.5693, bptt_initial: 0.2067, bptt: 0.2234, tail: 11.8549, advantages_returns: 1.6116, losses: 12.0224
  update: 64.8904
    clip: 8.1980
[2023-09-21 10:57:33,948][130331] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 1.6344, enqueue_policy_requests: 74.6746, complete_rollouts: 2.5239, env_step: 452.4023, overhead: 98.7678
save_policy_outputs: 174.0335
  split_output_tensors: 60.2270
[2023-09-21 10:57:33,948][130331] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 1.5810, enqueue_policy_requests: 72.0793, complete_rollouts: 2.4101, env_step: 434.6806, overhead: 94.3722
save_policy_outputs: 165.4400
  split_output_tensors: 58.2534
[2023-09-21 10:57:33,948][130331] Loop Runner_EvtLoop terminating...
[2023-09-21 10:57:33,949][130331] Runner profile tree view:
main_loop: 1541.7214
[2023-09-21 10:57:33,949][130331] Collected {0: 10006528, 1: 10006528}, FPS: 12981.0